1
Ameli A, Peña-Castillo L, Usefi H. Assessing the reproducibility of machine-learning-based biomarker discovery in Parkinson's disease. Comput Biol Med 2024; 174:108407. [PMID: 38603902 DOI: 10.1016/j.compbiomed.2024.108407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 03/21/2024] [Accepted: 04/01/2024] [Indexed: 04/13/2024]
Abstract
Feature selection and machine learning algorithms can be used to analyze single nucleotide polymorphism (SNP) data and identify potential disease biomarkers. Reproducibility of identified biomarkers is critical for them to be useful for clinical research; however, genotyping platforms and selection criteria for individuals to be genotyped affect the reproducibility of identified biomarkers. To assess biomarker reproducibility, we collected five SNP datasets from the database of Genotypes and Phenotypes (dbGaP) and explored several data integration strategies. While combining datasets can lead to a reduction in classification accuracy, it has the potential to improve the reproducibility of potential biomarkers. We evaluated the agreement among different strategies in terms of the SNPs that were identified as potential Parkinson's disease (PD) biomarkers. Our findings indicate that, on average, 93% of the SNPs identified in a single dataset fail to be identified in other datasets. However, through dataset integration, this lack of replication is reduced to 62%. We discovered fifty SNPs that were identified at least twice, which could potentially serve as novel PD biomarkers. These SNPs are indirectly linked to PD in the literature but have not been directly associated with PD before. These findings open up new potential avenues of investigation.
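The replication figures quoted in this abstract (the share of SNPs selected in one dataset that are not selected in any other, and the set of SNPs selected at least twice) can be illustrated with a small set-overlap calculation. This is only a sketch under stated assumptions: the function names, the dataset labels, and the rs identifiers are hypothetical, and the authors' actual pipeline and replication criterion are not described here.

```python
from __future__ import annotations

from collections import Counter


def unreplicated_fraction(selected: dict[str, set[str]]) -> dict[str, float]:
    """For each dataset, return the fraction of its selected SNPs that
    were not selected in any of the other datasets."""
    result = {}
    for name, snps in selected.items():
        # Union of the SNPs selected in every other dataset.
        others = set().union(*(s for n, s in selected.items() if n != name))
        result[name] = len(snps - others) / len(snps) if snps else 0.0
    return result


def snps_selected_at_least(selected: dict[str, set[str]], k: int = 2) -> set[str]:
    """Return the SNPs that were selected in at least k datasets."""
    counts = Counter(snp for snps in selected.values() for snp in snps)
    return {snp for snp, c in counts.items() if c >= k}


if __name__ == "__main__":
    # Hypothetical selections; dataset names and rs IDs are placeholders.
    selected = {
        "A": {"rs1", "rs2", "rs3", "rs4"},
        "B": {"rs2", "rs5"},
        "C": {"rs6"},
    }
    print(unreplicated_fraction(selected))
    print(snps_selected_at_least(selected, k=2))
```

Averaging the per-dataset fractions gives a single summary number comparable in spirit to the percentages above, although the exact averaging used in the study is an assumption here.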
Affiliation(s)
- Ali Ameli
  - Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada
- Lourdes Peña-Castillo
  - Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada
  - Department of Biology, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada
- Hamid Usefi
  - Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada
  - Department of Mathematics and Statistics, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada
2
Longjohn MN, Hudson JABJ, Peña-Castillo L, Cormier RPJ, Hannay B, Chacko S, Lewis SM, Moorehead PC, Christian SL. Extracellular vesicle small RNA cargo discriminates non-cancer donors from pediatric B-lymphoblastic leukemia patients. Front Oncol 2023; 13:1272883. [PMID: 38023151 PMCID: PMC10679349 DOI: 10.3389/fonc.2023.1272883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 10/19/2023] [Indexed: 12/01/2023] Open
Abstract
Pediatric B-acute lymphoblastic leukemia (B-ALL) is a disease of abnormally growing B lymphoblasts. Here we hypothesized that extracellular vesicles (EVs), which are nanosized particles released by all cells (including cancer cells), could be used to monitor B-ALL severity and progression by sampling plasma instead of bone marrow. EVs are especially attractive as they are present throughout the circulation regardless of the location of the originating cell. First, we used nanoparticle tracking analysis to compare EVs between non-cancer donor (NCD) and B-ALL blood plasma; we found that B-ALL plasma contains more EVs than NCD plasma. We then isolated EVs from NCD and pediatric B-ALL peripheral blood plasma using a synthetic peptide-based isolation technique (Vn96), which is clinically amenable and isolates a broad spectrum of EVs. RNA-seq analysis of small RNAs contained within the isolated EVs revealed a signature of differentially packaged and exclusively packaged RNAs that distinguish NCD from B-ALL. The plasma EVs contain a heterogeneous mixture of miRNAs and fragments of long non-coding RNA (lncRNA) and messenger RNA (mRNA). Transcripts packaged in B-ALL EVs include those involved in negative cell cycle regulation, suggesting that B-ALL cells may use EVs to discard gene sequences that control growth. In contrast, NCD EVs carry sequences representative of multiple organs, including brain, muscle, and epithelial cells. This signature could potentially be used to monitor B-ALL disease burden in pediatric B-ALL patients via blood draws instead of invasive bone marrow aspirates.
Affiliation(s)
- Modeline N. Longjohn
  - Department of Biochemistry, Memorial University of Newfoundland, St. John’s, NL, Canada
  - Beatrice Hunter Cancer Research Institute, Halifax, NS, Canada
- Jo-Anna B. J. Hudson
  - Discipline of Pediatrics, Memorial University of Newfoundland, St. John’s, NL, Canada
- Lourdes Peña-Castillo
  - Department of Biology, Memorial University of Newfoundland, St. John’s, NL, Canada
  - Department of Computer Science, Memorial University of Newfoundland, St. John’s, NL, Canada
- Simi Chacko
  - Atlantic Cancer Research Institute, Moncton, NB, Canada
- Stephen M. Lewis
  - Beatrice Hunter Cancer Research Institute, Halifax, NS, Canada
  - Atlantic Cancer Research Institute, Moncton, NB, Canada
  - Department of Chemistry & Biochemistry, Université de Moncton, Moncton, NB, Canada
- Paul C. Moorehead
  - Discipline of Pediatrics, Memorial University of Newfoundland, St. John’s, NL, Canada
- Sherri L. Christian
  - Department of Biochemistry, Memorial University of Newfoundland, St. John’s, NL, Canada
  - Beatrice Hunter Cancer Research Institute, Halifax, NS, Canada
3
Tavakoli Y, Peña-Castillo L, Soares A. A Study on the Geometric and Kinematic Descriptors of Trajectories in the Classification of Ship Types. Sensors (Basel) 2022; 22:5588. [PMID: 35898098 PMCID: PMC9329964 DOI: 10.3390/s22155588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 07/15/2022] [Accepted: 07/20/2022] [Indexed: 06/15/2023]
Abstract
The classification of ships based on their trajectory descriptors is a common practice that is helpful in various contexts, such as maritime security and traffic management. For the most part, the descriptors are either geometric, which capture the shape of a ship's trajectory, or kinematic, which capture the motion properties of a ship's movement. Understanding the implications of the type of descriptor that is used in classification is important for feature engineering and model interpretation. However, this matter has not yet been deeply studied. This article contributes to feature engineering within this field by introducing proper similarity measures between the descriptors and defining sound benchmark classifiers, based on which we compared the predictive performance of geometric and kinematic descriptors. The performance profiles of geometric and kinematic descriptors, along with several standard tools in interpretable machine learning, helped us to provide an account of how different ships differ in movement. Our results indicated that the predictive performance of geometric and kinematic descriptors varied greatly, depending on the classification problem at hand. We also showed that the movement of certain ship classes solely differed geometrically while some other classes differed kinematically and that this difference could be formulated in simple terms. On the other hand, the movement characteristics of some other ship classes could not be delineated along these lines and were more complicated to express. Finally, this study verified the conjecture that the geometric-kinematic taxonomy could be further developed as a tool for more accessible feature selection.
4
Naskulwar K, Peña-Castillo L. sRNARFTarget: a fast machine-learning-based approach for transcriptome-wide sRNA target prediction. RNA Biol 2021; 19:44-54. [PMID: 34965197 PMCID: PMC8794260 DOI: 10.1080/15476286.2021.2012058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022] Open
Abstract
Bacterial small regulatory RNAs (sRNAs) are key regulators of gene expression in many processes related to adaptive responses. A multitude of sRNAs have been identified in many bacterial species; however, their function has yet to be elucidated. A key step in understanding sRNA function is to identify the mRNAs these sRNAs bind to. There are several computational methods for sRNA target prediction, and the most accurate one is CopraRNA, which is based on comparative genomics. However, species-specific sRNAs are quite common and CopraRNA cannot be used for these sRNAs. The most commonly used transcriptome-wide sRNA target prediction method and second-most-accurate method is IntaRNA. However, IntaRNA can take hours to run on a bacterial transcriptome. Here we present sRNARFTarget, a machine-learning-based method for transcriptome-wide sRNA target prediction applicable to any sRNA. We comparatively assessed the performance of sRNARFTarget, CopraRNA and IntaRNA in three bacterial species. Our results show that sRNARFTarget outperforms IntaRNA in terms of accuracy, ranking of true interacting pairs, and running time. However, CopraRNA substantially outperforms the other two programs in terms of accuracy. Thus, we suggest using CopraRNA when homologous sequences of the sRNA are available, and sRNARFTarget for transcriptome-wide prediction or for species-specific sRNAs. sRNARFTarget is available at https://github.com/BioinformaticsLabAtMUN/sRNARFTarget.
Affiliation(s)
- Kratika Naskulwar
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, Canada
- Lourdes Peña-Castillo
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, Canada
  - Department of Biology, Memorial University of Newfoundland, St. John's, Canada
5
Abstract
Promoters are genomic regions where the transcription machinery binds to initiate the transcription of specific genes. Computational tools for identifying bacterial promoters have been around for decades. However, most of these tools were designed to recognize promoters in one or a few bacterial species. Here, we present Promotech, a machine-learning-based method for promoter recognition in a wide range of bacterial species. We compare Promotech's performance with the performance of five other promoter prediction methods. Promotech outperforms these other programs in terms of area under the precision-recall curve (AUPRC) or precision at the same level of recall. Promotech is available at https://github.com/BioinformaticsLabAtMUN/PromoTech.
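For readers unfamiliar with the AUPRC metric named in this abstract, the sketch below computes average precision, one common estimator of the area under the precision-recall curve. It is an illustration only: the function name and the toy labels and scores are hypothetical, tied scores are not treated specially, and nothing here reproduces how Promotech itself was evaluated.

```python
from __future__ import annotations


def average_precision(labels: list[int], scores: list[float]) -> float:
    """Average precision: the mean of the precision values measured at the
    rank of each true positive, with items ranked by descending score."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidates from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision at this true positive
    return ap / total_pos


if __name__ == "__main__":
    # Toy example: 1 = true promoter, 0 = non-promoter (hypothetical data).
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.7, 0.1]
    print(average_precision(labels, scores))
```

Unlike accuracy, this quantity depends only on how well positives are ranked above negatives, which is why it is often preferred when positives are rare.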
Affiliation(s)
- Ruben Chevez-Guardado
  - Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, Newfoundland, A1C 5S7, Canada
- Lourdes Peña-Castillo
  - Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, Newfoundland, A1C 5S7, Canada
  - Department of Biology, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, Newfoundland, A1C 5S7, Canada
6
Pallegar P, Canuti M, Langille E, Peña-Castillo L, Lang AS. A Two-Component System Acquired by Horizontal Gene Transfer Modulates Gene Transfer and Motility via Cyclic Dimeric GMP. J Mol Biol 2020; 432:4840-4855. [PMID: 32634380 DOI: 10.1016/j.jmb.2020.07.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/24/2020] [Revised: 06/08/2020] [Accepted: 07/01/2020] [Indexed: 10/23/2022]
Abstract
Bis-(3'-5')-cyclic dimeric guanosine monophosphate (c-di-GMP) is an important intracellular signaling molecule that affects diverse physiological processes in bacteria. The intracellular levels of c-di-GMP are controlled by proteins acting as diguanylate cyclase (DGC) and phosphodiesterase (PDE) enzymes that synthesize and degrade c-di-GMP, respectively. In the alphaproteobacterium Rhodobacter capsulatus, flagellar motility and gene exchange via production of the gene transfer agent RcGTA are regulated by c-di-GMP. One of the R. capsulatus proteins involved in this regulation is Rcc00620, which contains an N-terminal two-component system response regulator receiver (REC) domain and C-terminal DGC and PDE domains. We demonstrate that the enzymatic activity of Rcc00620 is regulated through the phosphorylation status of its REC domain, which is controlled by a cognate histidine kinase protein, Rcc00621. In this system, the phosphorylated form of Rcc00620 is active as a PDE enzyme and stimulates gene transfer and motility. In addition, we discovered that the rcc00620 and rcc00621 genes are present in only one lineage within the genus Rhodobacter and were acquired via horizontal gene transfer from a distantly related alphaproteobacterium in the order Sphingomonadales. Therefore, a horizontally acquired regulatory system regulates gene transfer in the recipient organism.
Affiliation(s)
- Purvikalyan Pallegar
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL A1B 3X9, Canada
- Marta Canuti
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL A1B 3X9, Canada
- Evan Langille
  - Department of Chemistry, Memorial University of Newfoundland, St. John's, NL A1B 3X7, Canada
- Lourdes Peña-Castillo
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL A1B 3X9, Canada
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, NL A1B 3X5, Canada
- Andrew S Lang
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL A1B 3X9, Canada
7
Nartey MN, Peña-Castillo L, LeGrow M, Doré J, Bhattacharya S, Darby-King A, Carew SJ, Yuan Q, Harley CW, McLean JH. Learning-induced mRNA alterations in olfactory bulb mitral cells in neonatal rats. Learn Mem 2020; 27:209-221. [PMID: 32295841 PMCID: PMC7164515 DOI: 10.1101/lm.051177.119] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/09/2019] [Accepted: 02/11/2020] [Indexed: 12/20/2022]
Abstract
In the olfactory bulb, a cAMP/PKA/CREB-dependent form of learning occurs in the first week of life that provides a unique mammalian model for defining the epigenetic role of this evolutionarily ancient plasticity cascade. Odor preference learning in the week-old rat pup is rapidly induced by a 10-min pairing of odor and stroking. Memory is demonstrable at 24 h, but not 48 h, posttraining. Using this paradigm, pups that showed peppermint preference 30 min posttraining were sacrificed 20 min later for laser microdissection of odor-encoding mitral cells. Controls were given odor only. Microarray analysis revealed that 13 nonprotein-coding mRNAs linked to mRNA translation and splicing and 11 protein-coding mRNAs linked to transcription differed with odor preference training. MicroRNA23b, a translation inhibitor of multiple plasticity-related mRNAs, was down-regulated. Protein-coding transcription was up-regulated for Sec23b, Clic2, Rpp14, Dcbld1, Magee2, Mstn, Fam229b, RGD1566265, and Mgst2. Gng12 and Srcg1 mRNAs were down-regulated. Increases in Sec23b, Clic2, and Dcbld1 proteins were confirmed in mitral cells in situ at the same time point following training. The protein-coding changes are consistent with extracellular matrix remodeling and ryanodine receptor involvement in odor preference learning. A role for CREB and AP1 as triggers of memory-related mRNA regulation is supported. The small number of gene changes identified in the mitral cell input/output link for 24 h memory will facilitate investigation of the nature, and reversibility, of changes supporting temporally restricted long-term memory.
Affiliation(s)
- Michaelina N Nartey
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Lourdes Peña-Castillo
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, Newfoundland A1B3X5, Canada
- Megan LeGrow
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Jules Doré
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Sriya Bhattacharya
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Andrea Darby-King
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Samantha J Carew
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Qi Yuan
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
- Carolyn W Harley
  - Department of Psychology, Memorial University of Newfoundland, St. John's, Newfoundland A1B3X9, Canada
- John H McLean
  - Division of Biomedical Sciences, Memorial University of Newfoundland, St. John's, Newfoundland A1B3V6, Canada
8
McCuaig B, Peña-Castillo L, Dufour SC. Metagenomic analysis suggests broad metabolic potential in extracellular symbionts of the bivalve Thyasira cf. gouldi. Anim Microbiome 2020; 2:7. [PMID: 33499960 PMCID: PMC7807488 DOI: 10.1186/s42523-020-00025-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/04/2019] [Accepted: 02/20/2020] [Indexed: 11/26/2022] Open
Abstract
Background: Next-generation sequencing has opened new avenues for studying metabolic capabilities of bacteria that cannot be cultured. Here, we provide a metagenomic description of chemoautotrophic gammaproteobacterial symbionts associated with Thyasira cf. gouldi, a sediment-dwelling bivalve from the family Thyasiridae. Thyasirid symbionts differ from those of other bivalves by being extracellular, and recent work suggests that they are capable of living freely in the environment.
Results: Thyasira cf. gouldi symbionts appear to form mixed, non-clonal populations in the host, show no signs of genomic reduction and contain many genes that would only be useful outside the host, including flagellar and chemotaxis genes. The thyasirid symbionts may be capable of sulfur oxidation via both the sulfur oxidation and reverse dissimilatory sulfate reduction pathways, as observed in other bivalve symbionts. In addition, genes for hydrogen oxidation and dissimilatory nitrate reduction were found, suggesting varied metabolic capabilities under a range of redox conditions. The genes of the tricarboxylic acid cycle are also present, along with membrane bound sugar importer channels, suggesting that the bacteria may be mixotrophic.
Conclusions: In this study, we have generated the first thyasirid symbiont genomic resources. In Thyasira cf. gouldi, symbiont populations appear non-clonal and encode genes for a plethora of metabolic capabilities; future work should examine whether symbiont heterogeneity and metabolic breadth, which have been shown in some intracellular chemosymbionts, are signatures of extracellular chemosymbionts in bivalves.
Affiliation(s)
- Bonita McCuaig
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
- Lourdes Peña-Castillo
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, NL, Canada
- Suzanne C Dufour
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
9
Pallegar P, Peña-Castillo L, Langille E, Gomelsky M, Lang AS. Cyclic di-GMP-Mediated Regulation of Gene Transfer and Motility in Rhodobacter capsulatus. J Bacteriol 2020; 202:e00554-19. [PMID: 31659012 PMCID: PMC6941535 DOI: 10.1128/jb.00554-19] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/28/2019] [Accepted: 10/19/2019] [Indexed: 02/08/2023] Open
Abstract
Gene transfer agents (GTAs) are bacteriophage-like particles produced by several bacterial and archaeal lineages that contain small pieces of the producing cells' genomes that can be transferred to other cells in a process similar to transduction. One well-studied GTA is RcGTA, produced by the alphaproteobacterium Rhodobacter capsulatus. RcGTA gene expression is regulated by several cellular regulatory systems, including the CckA-ChpT-CtrA phosphorelay. The transcription of multiple other regulator-encoding genes is affected by the response regulator CtrA, including genes encoding putative enzymes involved in the synthesis and hydrolysis of the second messenger bis-(3'-5')-cyclic dimeric GMP (c-di-GMP). To investigate whether c-di-GMP signaling plays a role in RcGTA production, we disrupted the CtrA-affected genes potentially involved in this process. We found that disruption of four of these genes affected RcGTA gene expression and production. We performed site-directed mutagenesis of key catalytic residues in the GGDEF and EAL domains responsible for diguanylate cyclase (DGC) and c-di-GMP phosphodiesterase (PDE) activities and analyzed the functions of the wild-type and mutant proteins. We also measured RcGTA production in R. capsulatus strains where intracellular levels of c-di-GMP were altered by the expression of either a heterologous DGC or a heterologous PDE. This adds c-di-GMP signaling to the collection of cellular regulatory systems controlling gene transfer in this bacterium. Furthermore, the heterologous gene expression and the four gene disruptions had similar effects on R. capsulatus flagellar motility as found for gene transfer, and we conclude that c-di-GMP inhibits both RcGTA production and flagellar motility in R. capsulatus.
IMPORTANCE: Gene transfer agents (GTAs) are virus-like particles that move cellular DNA between cells. In the alphaproteobacterium Rhodobacter capsulatus, GTA production is affected by the activities of multiple cellular regulatory systems, to which we have now added signaling via the second messenger dinucleotide molecule bis-(3'-5')-cyclic dimeric GMP (c-di-GMP). Similar to the CtrA phosphorelay, c-di-GMP also affects R. capsulatus flagellar motility in addition to GTA production, with lower levels of intracellular c-di-GMP favoring increased flagellar motility and gene transfer. These findings further illustrate the interconnection of GTA production with global systems of regulation in R. capsulatus, providing additional support for the notion that the production of GTAs has been maintained in this and related bacteria because it provides a benefit to the producing organisms.
Affiliation(s)
- Purvikalyan Pallegar
  - Department of Biology, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Lourdes Peña-Castillo
  - Department of Biology, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Evan Langille
  - Department of Chemistry, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Mark Gomelsky
  - Department of Molecular Biology, University of Wyoming, Laramie, Wyoming, USA
- Andrew S Lang
  - Department of Biology, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
10
Abstract
Bacterial small RNAs (sRNAs) are involved in the control of several cellular processes. Hundreds of putative sRNAs have been identified in many bacterial species through RNA sequencing. The existence of putative sRNAs is usually validated by Northern blot analysis. However, the large number of novel putative sRNAs reported in the literature makes it impractical to validate each of them in the wet lab. In this work, we applied five machine learning approaches to construct twenty models to discriminate bona fide sRNAs from random genomic sequences in five bacterial species. Sequences were represented using seven features including free energy of their predicted secondary structure, their distances to the closest predicted promoter site and Rho-independent terminator, and their distance to the closest open reading frames (ORFs). To automatically calculate these features, we developed an sRNA Characterization Pipeline (sRNACharP). All seven features used in the classification task contributed positively to the performance of the predictive models. The best performing model obtained a median precision of 100% at 10% recall and of 64% at 40% recall across all five bacterial species, and it outperformed previously published approaches on two benchmark datasets in terms of precision and recall. Our results indicate that even though there is limited sRNA sequence conservation across different bacterial species, there are intrinsic features in the genomic context of sRNAs that are conserved across taxa. We show that these features are utilized by machine learning approaches to learn a species-independent model to prioritize bona fide bacterial sRNAs.
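The "precision at a given recall" figures reported in this abstract can be read as follows: rank candidates by model score, walk down the ranking until the desired share of true sRNAs has been recovered, and report the precision at that point. The sketch below shows that reading with hypothetical names and toy data; the exact thresholding convention used in the study is an assumption.

```python
from __future__ import annotations


def precision_at_recall(labels: list[int], scores: list[float], target_recall: float) -> float:
    """Precision at the first rank (by descending score) where recall
    reaches target_recall. Ties between scores are not treated specially."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        hits += label
        if hits / total_pos >= target_recall:
            return hits / rank
    # Unreachable when target_recall <= 1, kept for completeness.
    return hits / len(ranked)


if __name__ == "__main__":
    # Toy ranking: 1 = bona fide sRNA, 0 = random sequence (hypothetical data).
    labels = [1, 0, 1, 1, 0, 1]
    scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
    print(precision_at_recall(labels, scores, 0.25))
    print(precision_at_recall(labels, scores, 0.50))
```

Reporting precision at low recall levels emphasizes how trustworthy the top of the ranking is, which matches the stated goal of prioritizing candidates for wet-lab validation.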
Affiliation(s)
- Erik J J Eppenhof
  - Department of Artificial Intelligence, Radboud University Nijmegen, Nijmegen, Netherlands
- Lourdes Peña-Castillo
  - Department of Biology, Memorial University of Newfoundland, St. John's, Canada
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, Canada
11
Alam Z, Roncal J, Peña-Castillo L. Genetic variation associated with healthy traits and environmental conditions in Vaccinium vitis-idaea. BMC Genomics 2018; 19:4. [PMID: 29291734 PMCID: PMC5748963 DOI: 10.1186/s12864-017-4396-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/29/2017] [Accepted: 12/19/2017] [Indexed: 12/31/2022] Open
Abstract
Background: Lingonberry (Vaccinium vitis-idaea L.), one of the least studied fruit crops in the Ericaceae family, has seen a dramatic increase in worldwide demand due to its numerous health benefits. Genetic markers can facilitate the selection of berries with desirable climatic adaptations and agronomic and nutritional characteristics to improve cultivation programs. However, no genomic resources are available for this species.
Results: We used Genotyping-by-Sequencing (GBS) to analyze the genetic variation of 56 lingonberry samples from across Newfoundland and Labrador, Canada. To elucidate a potential adaptation to environmental conditions we searched for genotype-environment associations by applying three distinct approaches to screen the identified single nucleotide polymorphisms (SNPs) for correlation with six environmental variables. We also searched for an association between the identified SNPs and two phenotypic traits: the total phenolic content (TPC) and antioxidant capacity (AC) of fruit. We identified 1586 high-quality putative SNPs using the UNEAK pipeline available in TASSEL. We found 132 SNPs likely associated with at least one of the environmental or phenotypic variables. To obtain insights on the function of the genomic sequences containing the SNPs likely to be associated with the environmental or phenotypic variables, we performed a sequence-based functional annotation and identified homologous protein-coding sequences with functional roles related to abiotic stress response, pathogen defense, RNA metabolism, and, most interestingly, phenolic compound biosynthesis.
Conclusions: The putative SNPs discovered are the first genomic resource for lingonberry. This resource might prove useful in high-density quantitative trait locus analysis and association mapping. The identified candidate genes containing the SNPs need further studies on their potential role in local adaptation of lingonberry. Altogether, the present study provides new resources that can be used to breed for desirable traits in lingonberry.
Affiliation(s)
- Zobayer Alam
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL, A1B 3X9, Canada
- Julissa Roncal
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL, A1B 3X9, Canada
- Lourdes Peña-Castillo
  - Department of Biology, Memorial University of Newfoundland, St. John's, NL, A1B 3X9, Canada
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, NL, A1B 3X5, Canada
12
Ayre DC, Chute IC, Joy AP, Barnett DA, Hogan AM, Grüll MP, Peña-Castillo L, Lang AS, Lewis SM, Christian SL. CD24 induces changes to the surface receptors of B cell microvesicles with variable effects on their RNA and protein cargo. Sci Rep 2017; 7:8642. [PMID: 28819186 PMCID: PMC5561059 DOI: 10.1038/s41598-017-08094-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/20/2017] [Accepted: 07/07/2017] [Indexed: 12/20/2022] Open
Abstract
The CD24 cell surface receptor promotes apoptosis in developing B cells, and we recently found that it induces B cells to release plasma membrane-derived, CD24-bearing microvesicles (MVs). Here we have performed a systematic characterization of B cell MVs released from WEHI-231 B lymphoma cells in response to CD24 stimulation. We found that B cells constitutively release MVs of approximately 120 nm, and that CD24 induces an increase in phosphatidylserine-positive MV release. RNA cargo is composed predominantly of 5S rRNA, regardless of stimulation; however, CD24 causes a decrease in the incorporation of protein coding transcripts. The MV proteome is enriched with mitochondrial and metabolism-related proteins after CD24 stimulation; however, these changes were variable and could not be fully validated by Western blotting. CD24-bearing MVs carry Siglec-2, CD63, IgM, and, unexpectedly, Ter119, but not Siglec-G or MHC-II despite their presence on the cell surface. CD24 stimulation also induces changes in CD63 and IgM expression on MVs that are not mirrored by the changes in cell surface expression. Overall, the composition of these MVs suggests that they may be involved in releasing mitochondrial components in response to pro-apoptotic stress, with changes to the surface receptors potentially altering the cell type(s) that interact with the MVs.
Affiliation(s)
- D Craig Ayre
- Department of Biochemistry, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Ian C Chute
- Atlantic Cancer Research Institute, Moncton, New Brunswick, Canada
- Andrew P Joy
- Atlantic Cancer Research Institute, Moncton, New Brunswick, Canada
- David A Barnett
- Atlantic Cancer Research Institute, Moncton, New Brunswick, Canada
- Andrew M Hogan
- Department of Biochemistry, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Marc P Grüll
- Department of Biology, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Lourdes Peña-Castillo
- Department of Biology, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Department of Computer Science, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Andrew S Lang
- Department of Biology, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
- Stephen M Lewis
- Atlantic Cancer Research Institute, Moncton, New Brunswick, Canada
- Department of Microbiology & Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Biology, University of New Brunswick, Saint John, New Brunswick, Canada
- Department of Chemistry & Biochemistry, Université de Moncton, Moncton, New Brunswick, Canada
- Sherri L Christian
- Department of Biochemistry, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
|
13
|
Desai AP, Razeghin M, Meruvia-Pastor O, Peña-Castillo L. GeNET: a web application to explore and share Gene Co-expression Network Analysis data. PeerJ 2017; 5:e3678. [PMID: 28828272 PMCID: PMC5560228 DOI: 10.7717/peerj.3678] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/12/2017] [Accepted: 07/22/2017] [Indexed: 12/05/2022] Open
Abstract
Gene Co-expression Network Analysis (GCNA) is a popular approach to analyze a collection of gene expression profiles. GCNA yields an assignment of genes to gene co-expression modules, a list of gene sets statistically over-represented in these modules, and a gene-to-gene network. There are several computer programs for gene-to-gene network visualization, but these programs have limitations in terms of integrating all the data generated by a GCNA and making these data available online. To facilitate sharing and study of GCNA data, we developed GeNET. For researchers interested in sharing their GCNA data, GeNET provides a convenient interface to upload their data and automatically make it accessible to the public through an online server. For researchers interested in exploring GCNA data published by others, GeNET provides an intuitive online tool to interactively explore GCNA data by genes, gene sets or modules. In addition, GeNET allows users to download all or part of the published data for further computational analysis. To demonstrate the applicability of GeNET, we imported three published GCNA datasets, the largest of which consists of roughly 17,000 genes and 200 conditions. GeNET is available at bengi.cs.mun.ca/genet.
Affiliation(s)
- Amit P. Desai
- Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada
- Mehdi Razeghin
- Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada
- Oscar Meruvia-Pastor
- Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada
- Office of the Dean of Science, Memorial University of Newfoundland, St. John’s, Canada
- Lourdes Peña-Castillo
- Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada
- Department of Biology, Memorial University of Newfoundland, St. John’s, Canada
|
14
|
Grüll MP, Peña-Castillo L, Mulligan ME, Lang AS. Genome-wide identification and characterization of small RNAs in Rhodobacter capsulatus and identification of small RNAs affected by loss of the response regulator CtrA. RNA Biol 2017; 14:914-925. [PMID: 28296577 PMCID: PMC5546546 DOI: 10.1080/15476286.2017.1306175] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 10/30/2022] Open
Abstract
Small non-coding RNAs (sRNAs) are involved in the control of numerous cellular processes through various regulatory mechanisms, and in the past decade many studies have identified sRNAs in a multitude of bacterial species using RNA sequencing (RNA-seq). Here, we present the first genome-wide analysis of sRNA sequencing data in Rhodobacter capsulatus, a purple nonsulfur photosynthetic alphaproteobacterium. Using a recently developed bioinformatics approach, sRNA-Detect, we detected 422 putative sRNAs from R. capsulatus RNA-seq data. Based on their sequence similarity to sRNAs in an sRNA collection, consisting of published putative sRNAs from 23 additional bacterial species, and RNA databases, the sequences of 124 putative sRNAs were conserved in at least one other bacterial species, and 19 putative sRNAs were assigned a predicted function. We bioinformatically characterized all putative sRNAs and applied machine learning approaches to calculate the probability that a nucleotide sequence is a bona fide sRNA. The resulting quantitative model was able to correctly classify 95.2% of sequences in a validation set. We found that putative cis-targets for antisense and partially overlapping sRNAs were enriched with protein-coding genes involved in primary metabolic processes, photosynthesis, compound binding, and with genes forming part of macromolecular complexes. We performed differential expression analysis to compare the wild type strain to a mutant lacking the response regulator CtrA, an important regulator of gene expression in R. capsulatus, and identified 18 putative sRNAs with differing levels in the two strains. Finally, we validated the existence and expression patterns of four novel sRNAs by Northern blot analysis.
Affiliation(s)
- Marc P Grüll
- Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
- Lourdes Peña-Castillo
- Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
- Department of Computer Science, Memorial University of Newfoundland, St. John's, NL, Canada
- Martin E Mulligan
- Department of Biochemistry, Memorial University of Newfoundland, St. John's, NL, Canada
- Andrew S Lang
- Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
|
15
|
Ferguson NL, Peña-Castillo L, Moore MA, Bignell DRD, Tahlan K. Proteomics analysis of global regulatory cascades involved in clavulanic acid production and morphological development in Streptomyces clavuligerus. J Ind Microbiol Biotechnol 2016; 43:537-55. [DOI: 10.1007/s10295-016-1733-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 11/15/2015] [Accepted: 01/02/2016] [Indexed: 12/11/2022]
Abstract
The genus Streptomyces comprises bacteria that undergo a complex developmental life cycle and produce many metabolites of importance to industry and medicine. Streptomyces clavuligerus produces the β-lactamase inhibitor clavulanic acid, which is used in combination with β-lactam antibiotics to treat certain β-lactam resistant bacterial infections. Many aspects of how clavulanic acid production is globally regulated in S. clavuligerus still remain unknown. We conducted comparative proteomics analysis using the wild type strain of S. clavuligerus and two mutants (ΔbldA and ΔbldG), which are defective in global regulators and vary in their ability to produce clavulanic acid. Approximately 33.5% of the predicted S. clavuligerus proteome was detected and 192 known or putative regulatory proteins showed statistically differential expression levels in pairwise comparisons. Interestingly, the expression of many proteins whose corresponding genes contain TTA codons (predicted to require the bldA tRNA for translation) was unaffected in the bldA mutant.
Affiliation(s)
- Nicole L Ferguson
- Department of Biology, Memorial University of Newfoundland, St. John’s, NL A1B 3X9, Canada
- Lourdes Peña-Castillo
- Department of Biology, Memorial University of Newfoundland, St. John’s, NL A1B 3X9, Canada
- Department of Computer Science, Memorial University of Newfoundland, St. John’s, NL A1B 3X5, Canada
- Marcus A Moore
- Department of Biology, Memorial University of Newfoundland, St. John’s, NL A1B 3X9, Canada
- Dawn R D Bignell
- Department of Biology, Memorial University of Newfoundland, St. John’s, NL A1B 3X9, Canada
- Kapil Tahlan
- Department of Biology, Memorial University of Newfoundland, St. John’s, NL A1B 3X9, Canada
|
16
|
Meruvia-Pastor O, Patra P, Andres K, Twomey C, Peña-Castillo L. OMARC: An online multimedia application for training health care providers in the assessment of respiratory conditions. Int J Med Inform 2016; 89:15-24. [PMID: 26980355 DOI: 10.1016/j.ijmedinf.2016.02.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/07/2015] [Revised: 12/22/2015] [Accepted: 02/16/2016] [Indexed: 11/26/2022]
Abstract
OBJECTIVES: OMARC, a multimedia application designed to support the training of health care providers for the identification of common lung sounds heard in a patient's thorax as part of a health assessment, is described and its positive contribution to user learning is assessed. The main goal of OMARC is to effectively help health-care students become familiar with lung sounds as part of the assessment of respiratory conditions. In addition, the application must be easy to use and accessible to students and practitioners over the internet. SYSTEM DESCRIPTION: OMARC was developed using an online platform to facilitate access to users in remote locations. OMARC's unique contribution as an educational software tool is that it presents a narrative about normal and abnormal lung sounds using interactive multimedia and sample case studies designed by professional health-care providers and educators. Its interface consists of two distinct components: a sounds glossary and a rich multimedia interface which presents clinical case studies and provides access to lung sounds placed on a model of a human torso. OMARC's contents can be extended through the addition of sounds and case studies designed by health-care educators and professionals. VALIDATION AND RESULTS: To validate OMARC, determine its efficacy in improving learning, and capture user perceptions about it, we performed a pilot study with ten nursing students. Participants' performance was measured through an evaluation of their ability to identify several normal and adventitious/abnormal sounds before and after exposure to OMARC. Results indicate that participants are able to better identify different lung sounds, going from an average of 63% (S.D. 18.3%) in the pre-test evaluation to an average of 90% (S.D. 11.5%) after practising with OMARC. Furthermore, participants indicated in a user satisfaction questionnaire that they found the application helpful and easy to use, and that they would recommend it to other persons in their field. CONCLUSIONS: OMARC is an online multimedia application for training health care students in the assessment of respiratory conditions. The software integrates multimedia technology and health-care education concepts to facilitate learning, while being useful and easy to use. Results from a pilot study indicate that OMARC significantly helps to improve the capacity of the users to correctly identify lung sounds for different respiratory conditions. In addition, participants' opinions about OMARC were quite positive: users were likely to recommend the application to other persons in their field and found the application easy to use and helpful to better identify lung sounds.
Affiliation(s)
- Oscar Meruvia-Pastor
- Department of Computer Science, Faculty of Science, Memorial University of Newfoundland, St John's, NL, Canada
- Office of the Dean of Science, Faculty of Science, Memorial University of Newfoundland, St John's, NL, Canada
- Pranjal Patra
- Department of Computer Science, Faculty of Science, Memorial University of Newfoundland, St John's, NL, Canada
- Karen Andres
- GI/Hepatology South Health Campus, Calgary, AB, Canada
- Creina Twomey
- School of Nursing, Memorial University of Newfoundland, St John's, NL, Canada
- Lourdes Peña-Castillo
- Department of Computer Science, Faculty of Science, Memorial University of Newfoundland, St John's, NL, Canada
- Department of Biology, Faculty of Science, Memorial University of Newfoundland, St John's, NL, Canada
|
17
|
Peña-Castillo L, Grüell M, Mulligan ME, Lang AS. Detection of bacterial small transcripts from RNA-seq data: a comparative assessment. Pac Symp Biocomput 2016; 21:456-467. [PMID: 26776209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Small non-coding RNAs (sRNAs) are regulatory RNA molecules that have been identified in a multitude of bacterial species and shown to control numerous cellular processes through various regulatory mechanisms. In the last decade, next generation RNA sequencing (RNA-seq) has been used for the genome-wide detection of bacterial sRNAs. Here we describe sRNA-Detect, a novel approach to identify expressed small transcripts from prokaryotic RNA-seq data. Using RNA-seq data from three bacterial species and two sequencing platforms, we performed a comparative assessment of five computational approaches for the detection of small transcripts. We demonstrate that sRNA-Detect improves upon current standalone computational approaches for identifying novel small transcripts in bacteria.
Affiliation(s)
- Lourdes Peña-Castillo
- Department of Computer Science, Memorial University of Newfoundland, St. John's, NL, Canada
- Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada
|
18
|
Peña-Castillo L, Badis G. Systematic Determination of Transcription Factor DNA-Binding Specificities in Yeast. Methods Mol Biol 2015; 1361:203-25. [PMID: 26483024 DOI: 10.1007/978-1-4939-3079-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Abstract
Understanding how genes are regulated, decoding their "regulome", is one of the main challenges of the post-genomic era. Here, we describe the in vitro method we used to associate cis-regulatory sites with cognate trans-regulators by characterizing the DNA-binding specificity of the vast majority of yeast transcription factors using Protein Binding Microarrays. This approach can be applied to any given organism.
Affiliation(s)
- Lourdes Peña-Castillo
- Department of Biology, Memorial University of Newfoundland, St. John's, NL A1B 3X5, Canada
- Department of Computer Science, Memorial University of Newfoundland, St. John's, NL, Canada
- Gwenael Badis
- Institut Pasteur, Génétique des Interactions Macromoléculaires, Centre National de la Recherche Scientifique, Unité Mixte de Recherche 3525, Paris, 75724, France
|
19
|
Peña-Castillo L, Mercer RG, Gurinovich A, Callister SJ, Wright AT, Westbye AB, Beatty JT, Lang AS. Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides. BMC Genomics 2014; 15:730. [PMID: 25164283 PMCID: PMC4158056 DOI: 10.1186/1471-2164-15-730] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 01/22/2014] [Accepted: 08/21/2014] [Indexed: 01/05/2023] Open
Abstract
Background: The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-730) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lourdes Peña-Castillo
- Department of Biology, Memorial University of Newfoundland, St. John's, NL A1B 3X5, Canada
|
20
|
Fillingham J, Kainth P, Lambert JP, van Bakel H, Tsui K, Peña-Castillo L, Nislow C, Figeys D, Hughes TR, Greenblatt J, Andrews BJ. Two-Color Cell Array Screen Reveals Interdependent Roles for Histone Chaperones and a Chromatin Boundary Regulator in Histone Gene Repression. Mol Cell 2009; 35:340-51. [DOI: 10.1016/j.molcel.2009.06.023] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.5] [Received: 02/13/2009] [Revised: 05/12/2009] [Accepted: 06/08/2009] [Indexed: 01/01/2023]
|
21
|
Kainth P, Sassi HE, Peña-Castillo L, Chua G, Hughes TR, Andrews B. Comprehensive genetic analysis of transcription factor pathways using a dual reporter gene system in budding yeast. Methods 2009; 48:258-64. [PMID: 19269327 DOI: 10.1016/j.ymeth.2009.02.015] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 12/01/2008] [Revised: 02/24/2009] [Accepted: 02/25/2009] [Indexed: 11/25/2022] Open
Abstract
The development and application of genomic reagents and techniques has fuelled progress in our understanding of regulatory networks that control gene expression in eukaryotic cells. However, a full description of the network of regulator-gene interactions that determine global gene expression programs remains elusive and will require systematic genetic as well as biochemical assays. Here, we describe a functional genomics approach that combines reporter technology, genome-wide array-based reagents and high-throughput imaging to discover new regulators controlling gene expression patterns in Saccharomyces cerevisiae. Our strategy utilizes the synthetic genetic array (SGA) method to systematically introduce promoter-GFP (green fluorescent protein) reporter constructs along with a control promoter-RFP (red fluorescent protein) gene into the array of approximately 4500 viable yeast deletion mutants. Fluorescence intensities from each reporter are assayed from individual colonies arrayed on solid agar plates using a scanning fluorimager and the ratio of GFP to RFP intensity reveals deletion mutants that cause differential GFP expression. We are exploiting this screening approach to construct a detailed map describing the interplay of regulators controlling the eukaryotic cell cycle. The method is extensible to any transcription factor or signalling pathway for which an appropriate reporter gene can be devised.
Affiliation(s)
- Pinay Kainth
- Banting & Best Department of Medical Research, University of Toronto, 160 College Street, Toronto, Ont. M5S 3E1, Canada
|
22
|
Alleyne TM, Peña-Castillo L, Badis G, Talukder S, Berger MF, Gehrke AR, Philippakis AA, Bulyk ML, Morris QD, Hughes TR. Predicting the binding preference of transcription factors to individual DNA k-mers. Bioinformatics 2008; 25:1012-8. [PMID: 19088121 PMCID: PMC2666811 DOI: 10.1093/bioinformatics/btn645] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. RESULTS: We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.
Affiliation(s)
- Trevis M Alleyne
- Department of Molecular Genetics, Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada
|
23
|
Berger MF, Badis G, Gehrke AR, Talukder S, Philippakis AA, Peña-Castillo L, Alleyne TM, Mnaimneh S, Botvinnik OB, Chan ET, Khalid F, Zhang W, Newburger D, Jaeger SA, Morris QD, Bulyk ML, Hughes TR. Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences. Cell 2008; 133:1266-76. [PMID: 18585359 DOI: 10.1016/j.cell.2008.05.024] [Citation(s) in RCA: 480] [Impact Index Per Article: 30.0] [Received: 11/21/2007] [Revised: 03/10/2008] [Accepted: 05/12/2008] [Indexed: 12/29/2022]
Abstract
Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity and showing that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.
Affiliation(s)
- Michael F Berger
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
|
24
|
Peña-Castillo L, Tasan M, Myers CL, Lee H, Joshi T, Zhang C, Guan Y, Leone M, Pagnani A, Kim WK, Krumpelman C, Tian W, Obozinski G, Qi Y, Mostafavi S, Lin GN, Berriz GF, Gibbons FD, Lanckriet G, Qiu J, Grant C, Barutcuoglu Z, Hill DP, Warde-Farley D, Grouios C, Ray D, Blake JA, Deng M, Jordan MI, Noble WS, Morris Q, Klein-Seetharaman J, Bar-Joseph Z, Chen T, Sun F, Troyanskaya OG, Marcotte EM, Xu D, Hughes TR, Roth FP. A critical assessment of Mus musculus gene function prediction using integrated genomic evidence. Genome Biol 2008; 9 Suppl 1:S2. [PMID: 18613946 PMCID: PMC2447536 DOI: 10.1186/gb-2008-9-s1-s2] [Citation(s) in RCA: 197] [Impact Index Per Article: 12.3] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. RESULTS: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. CONCLUSION: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allow predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.
Affiliation(s)
- Lourdes Peña-Castillo
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S3E1, Canada
|
25
|
Lovegrove FE, Peña-Castillo L, Liles WC, Hughes TR, Kain KC. Plasmodium falciparum shows transcriptional versatility within the human host. Trends Parasitol 2008; 24:288-91. [PMID: 18538633 DOI: 10.1016/j.pt.2008.04.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 02/28/2008] [Revised: 04/15/2008] [Accepted: 04/16/2008] [Indexed: 11/16/2022]
Abstract
In a recent study published in Nature, Daily et al. profiled parasite gene expression in Plasmodium falciparum infections and identified three in vivo 'states' based on parasite transcription patterns. Despite similar host clinical features, two states displayed highly divergent gene expression, whereas the third was found in individuals with increased inflammatory markers. These findings suggest that parasites exist in different physiological states in vivo, providing an important foundation for future studies investigating how these states might contribute to malaria pathogenesis and outcome.
Affiliation(s)
- Fiona E Lovegrove
- McLaughlin-Rotman Centre for Global Health, McLaughlin Centre for Molecular Medicine, MaRS Centre, Toronto ON, Canada
|
26
|
Lovegrove FE, Gharib SA, Peña-Castillo L, Patel SN, Ruzinski JT, Hughes TR, Liles WC, Kain KC. Parasite burden and CD36-mediated sequestration are determinants of acute lung injury in an experimental malaria model. PLoS Pathog 2008; 4:e1000068. [PMID: 18483551 PMCID: PMC2364663 DOI: 10.1371/journal.ppat.1000068] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Received: 10/16/2007] [Accepted: 04/14/2008] [Indexed: 01/11/2023] Open
Abstract
Although acute lung injury (ALI) is a common complication of severe malaria, little is known about the underlying molecular basis of lung dysfunction. Animal models have provided powerful insights into the pathogenesis of severe malaria syndromes such as cerebral malaria (CM); however, no model of malaria-induced lung injury has been definitively established. This study used bronchoalveolar lavage (BAL), histopathology and gene expression analysis to examine the development of ALI in mice infected with Plasmodium berghei ANKA (PbA). BAL fluid of PbA-infected C57BL/6 mice revealed a significant increase in IgM and total protein prior to the development of CM, indicating disruption of the alveolar–capillary membrane barrier—the physiological hallmark of ALI. In contrast to sepsis-induced ALI, BAL fluid cell counts remained constant with no infiltration of neutrophils. Histopathology showed septal inflammation without cellular transmigration into the alveolar spaces. Microarray analysis of lung tissue from PbA-infected mice identified a significant up-regulation of expressed genes associated with the gene ontology categories of defense and immune response. Severity of malaria-induced ALI varied in a panel of inbred mouse strains, and development of ALI correlated with peripheral parasite burden but not CM susceptibility. Cd36−/− mice, which have decreased parasite lung sequestration, were relatively protected from ALI. In summary, parasite burden and CD36-mediated sequestration in the lung are primary determinants of ALI in experimental murine malaria. Furthermore, differential susceptibility of mouse strains to malaria-induced ALI and CM suggests that distinct genetic determinants may regulate susceptibility to these two important causes of malaria-associated morbidity and mortality.
Acute lung injury (ALI) and acute respiratory distress syndrome (ARDS) can occur in adult malaria infections with a case fatality rate of 70%–100%. ALI and ARDS are characterized by protein-rich fluid in the lungs, with reduced gas exchange, and in malaria, often accompany high parasite levels and severe or cerebral disease. In this work we have examined lung physiology, pathology and genomics in mouse malaria—Plasmodium berghei ANKA—to show that mice develop malaria-induced ALI. Infected mice have proteinaceous fluid in their lungs, have a migration of inflammatory cells from the blood into the lung walls, and express immune response–related genes. We also found that severity of ALI depended on high parasite levels, both overall and specifically in the lung tissue, but was not consistent with whether the mice developed cerebral malaria. ALI due to Plasmodium berghei ANKA infection models prominent characteristics of human malaria-associated ALI, and we have better defined this model of malaria ALI so it may be used to further explore disease mechanisms and eventual treatment.
Affiliation(s)
- Fiona E. Lovegrove
- Institute of Medical Science, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- McLaughlin-Rotman Centre for Global Health, McLaughlin Centre for Molecular Medicine, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Sina A. Gharib
- Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Lourdes Peña-Castillo
- Center for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Samir N. Patel
- McLaughlin-Rotman Centre for Global Health, McLaughlin Centre for Molecular Medicine, University Health Network, University of Toronto, Toronto, Ontario, Canada
- John T. Ruzinski
- Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Timothy R. Hughes
- McLaughlin-Rotman Centre for Global Health, McLaughlin Centre for Molecular Medicine, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Center for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Ontario, Canada
- W. Conrad Liles
- Institute of Medical Science, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- McLaughlin-Rotman Centre for Global Health, McLaughlin Centre for Molecular Medicine, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Division of Infectious Diseases, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- Kevin C. Kain
- Institute of Medical Science, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- McLaughlin-Rotman Centre for Global Health, McLaughlin Centre for Molecular Medicine, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Division of Infectious Diseases, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
27
Abstract
The yeast genetics community has embraced genomic biology, and there is a general understanding that obtaining a full encyclopedia of functions of the approximately 6000 genes is a worthwhile goal. The yeast literature comprises over 40,000 research papers, and the number of yeast researchers exceeds the number of genes. There are mutated and tagged alleles for virtually every gene, and hundreds of high-throughput data sets and computational analyses have been described. Why, then, are there >1000 genes still listed as uncharacterized on the Saccharomyces Genome Database, 10 years after sequencing the genome of this powerful model organism? Examination of the currently uncharacterized gene set suggests that while some are small or newly discovered, the vast majority were evident from the initial genome sequence. Most are present in multiple genomics data sets, which may provide clues to function. In addition, roughly half contain recognizable protein domains, and many of these suggest specific metabolic activities. Notably, the uncharacterized gene set is highly enriched for genes whose only homologs are in other fungi. Achieving a full catalog of yeast gene functions may require a greater focus on the life of yeast outside the laboratory.
Affiliation(s)
- Lourdes Peña-Castillo
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
28
Lovegrove FE, Peña-Castillo L, Mohammad N, Liles WC, Hughes TR, Kain KC. Simultaneous host and parasite expression profiling identifies tissue-specific transcriptional programs associated with susceptibility or resistance to experimental cerebral malaria. BMC Genomics 2006; 7:295. [PMID: 17118208 PMCID: PMC1664577 DOI: 10.1186/1471-2164-7-295] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.3] [Received: 10/10/2006] [Accepted: 11/22/2006] [Indexed: 11/30/2022] Open
Abstract
Background: The development and outcome of cerebral malaria (CM) reflects a complex interplay between parasite-expressed virulence factors and host response to infection. The murine CM model, Plasmodium berghei ANKA (PbA), which simulates many of the features of human CM, provides an excellent system to study this host/parasite interface. We designed "combination" microarrays that concurrently detect genome-wide transcripts of both PbA and mouse, and examined parasite and host transcriptional programs during infection of CM-susceptible (C57BL/6) and CM-resistant (BALB/c) mice.
Results: Analysis of expression data from brain, lung, liver, and spleen of PbA infected mice showed that both host and parasite gene expression can be examined using a single microarray, and parasite transcripts can be detected within whole organs at a time when peripheral blood parasitemia is low. Parasites display a unique transcriptional signature in each tissue, and lung appears to be a large reservoir for metabolically active parasites. In comparisons of susceptible versus resistant animals, both host and parasite display distinct, organ-specific transcriptional profiles. Differentially expressed mouse genes were related to humoral immune response, complement activation, or cell-cell interactions. PbA displayed differential expression of genes related to biosynthetic activities.
Conclusion: These data show that host and parasite gene expression profiles can be simultaneously analysed using a single "combination" microarray, and that both the mouse and malaria parasite display distinct tissue- and strain-specific responses during infection. This technology facilitates the dissection of host-pathogen interactions in experimental cerebral malaria and could be extended to other disease models.
Affiliation(s)
- Fiona E Lovegrove
- Institute of Medical Science, Department of Medicine, University of Toronto, Toronto, ON, Canada
- Lourdes Peña-Castillo
- Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Naveed Mohammad
- Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- W Conrad Liles
- Institute of Medical Science, Department of Medicine, University of Toronto, Toronto, ON, Canada
- McLaughlin-Rotman Centre, McLaughlin Centre for Molecular Medicine, UHN and University of Toronto, Toronto, ON, Canada
- Timothy R Hughes
- Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto, ON, Canada
- McLaughlin-Rotman Centre, McLaughlin Centre for Molecular Medicine, UHN and University of Toronto, Toronto, ON, Canada
- Kevin C Kain
- Institute of Medical Science, Department of Medicine, University of Toronto, Toronto, ON, Canada
- McLaughlin-Rotman Centre, McLaughlin Centre for Molecular Medicine, UHN and University of Toronto, Toronto, ON, Canada