Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Greene CS, Troyanskaya OG. Accurate evaluation and analysis of functional genomics data and methods. Ann N Y Acad Sci 2012;1260:95-100. [PMID: 22268703 DOI: 10.1111/j.1749-6632.2011.06383.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/27/2022]

Cited by Other Article(s)

Yunes JM, Babbitt PC. Effusion: prediction of protein function from sequence similarity networks. Bioinformatics 2019;35:442-451. [PMID: 30084920 PMCID: PMC6361244 DOI: 10.1093/bioinformatics/bty672] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/25/2018] [Revised: 07/24/2018] [Accepted: 07/30/2018] [Indexed: 12/26/2022]

Kacsoh BZ, Barton S, Jiang Y, Zhou N, Mooney SD, Friedberg I, Radivojac P, Greene CS, Bosco G. New Drosophila Long-Term Memory Genes Revealed by Assessing Computational Function Prediction Methods. G3 (Bethesda) 2019;9:251-267. [PMID: 30463884 PMCID: PMC6325913 DOI: 10.1534/g3.118.200867] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/11/2018] [Accepted: 11/20/2018] [Indexed: 01/26/2023]

Luecken MD, Page MJT, Crosby AJ, Mason S, Reinert G, Deane CM. CommWalker: correctly evaluating modules in molecular networks in light of annotation bias. Bioinformatics 2019;34:994-1000. [PMID: 29112702 PMCID: PMC5860269 DOI: 10.1093/bioinformatics/btx706] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/14/2016] [Accepted: 11/02/2017] [Indexed: 11/24/2022]

Enabling Precision Medicine through Integrative Network Models. J Mol Biol 2018;430:2913-2923. [DOI: 10.1016/j.jmb.2018.07.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/01/2018] [Revised: 06/15/2018] [Accepted: 07/03/2018] [Indexed: 11/17/2022]

Wang X, Cheng F, Rohlsen D, Bi C, Wang C, Xu Y, Wei S, Ye Q, Yin T, Ye N. Organellar genome assembly methods and comparative analysis of horticultural plants. Hortic Res 2018;5:3. [PMID: 29423233 PMCID: PMC5798811 DOI: 10.1038/s41438-017-0002-1] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 05/03/2017] [Revised: 11/20/2017] [Accepted: 11/26/2017] [Indexed: 05/31/2023]

Abstract

Although organellar genomes (including chloroplast and mitochondrial genomes) are smaller than nuclear genomes in size and gene number, organellar genomes are very important for the investigation of plant evolution and molecular ecology mechanisms. Few studies have focused on the organellar genomes of horticultural plants. Approximately 1193 chloroplast genomes and 199 mitochondrial genomes of land plants are available in the National Center for Biotechnology Information (NCBI), of which only 39 are from horticultural plants. In this paper, we report an innovative and efficient method for high-quality horticultural organellar genome assembly from next-generation sequencing (NGS) data. Sequencing reads were first assembled by Newbler, Amos, and Minimus software with default parameters. The remaining gaps were then filled through BLASTN search and PCR. The complete DNA sequence was corrected based on Illumina sequencing data using BWA (Burrows-Wheeler Alignment tool) software. The advantage of this approach is that there is no need to isolate organellar DNA from total DNA during sample preparation. Using this procedure, the complete mitochondrial and chloroplast genomes of an ornamental plant, Salix suchowensis, and a fruit tree, Ziziphus jujuba, were identified. This study shows that horticultural plants have similar mitochondrial and chloroplast sequence organization to other seed plants. Most horticultural plants demonstrate a slight bias toward A+T rich features in the mitochondrial genome. In addition, a phylogenetic analysis of 39 horticultural plants based on 15 protein-coding genes showed that some mitochondrial genes are horizontally transferred from chloroplast DNA. Our study will provide an important reference for organellar genome assembly in other horticultural plants. Furthermore, phylogenetic analysis of the organellar genomes of horticultural plants could accurately clarify the unanticipated relationships among these plants.

Kacsoh BZ, Greene CS, Bosco G. Machine Learning Analysis Identifies Drosophila Grunge/Atrophin as an Important Learning and Memory Gene Required for Memory Retention and Social Learning. G3 (Bethesda) 2017;7:3705-3718. [PMID: 28889104 PMCID: PMC5677163 DOI: 10.1534/g3.117.300172] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/29/2017] [Accepted: 08/07/2017] [Indexed: 12/12/2022]

Implicating candidate genes at GWAS signals by leveraging topologically associating domains. Eur J Hum Genet 2017;25:1286-1289. [PMID: 28792001 DOI: 10.1038/ejhg.2017.108] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/25/2017] [Revised: 05/02/2017] [Accepted: 06/13/2017] [Indexed: 12/29/2022]

Tan J, Doing G, Lewis KA, Price CE, Chen KM, Cady KC, Perchuk B, Laub MT, Hogan DA, Greene CS. Unsupervised Extraction of Stable Expression Signatures from Public Compendia with an Ensemble of Neural Networks. Cell Syst 2017;5:63-71.e6. [PMID: 28711280 PMCID: PMC5532071 DOI: 10.1016/j.cels.2017.06.003] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 12/02/2016] [Revised: 04/11/2017] [Accepted: 06/08/2017] [Indexed: 01/18/2023]

Greene CS, Himmelstein DS. Genetic Association-Guided Analysis of Gene Networks for the Study of Complex Traits. Circ Cardiovasc Genet 2017;9:179-84. [PMID: 27094199 DOI: 10.1161/circgenetics.115.001181] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/11/2015] [Accepted: 03/08/2016] [Indexed: 12/29/2022]

Verleyen W, Ballouz S, Gillis J. Positive and negative forms of replicability in gene network analysis. Bioinformatics 2015;32:1065-73. [PMID: 26668004 DOI: 10.1093/bioinformatics/btv734] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/13/2015] [Accepted: 12/09/2015] [Indexed: 02/07/2023]

Youngs N, Penfold-Brown D, Bonneau R, Shasha D. Negative example selection for protein function prediction: the NoGO database. PLoS Comput Biol 2014;10:e1003644. [PMID: 24922051 PMCID: PMC4055410 DOI: 10.1371/journal.pcbi.1003644] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 11/24/2013] [Accepted: 04/08/2014] [Indexed: 12/28/2022]

Abstract

Negative examples – genes that are known not to carry out a given protein function – are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text-classification, which are the current state of the art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).

Many machine learning methods have been applied to the task of predicting the biological function of proteins based on a variety of available data. The majority of these methods require negative examples: proteins that are known not to perform a function, in order to achieve meaningful predictions, but negative examples are often not available. In addition, past heuristic methods for negative example selection suffer from a high error rate. Here, we rigorously compare two novel algorithms against past heuristics, as well as some algorithms adapted from a similar task in text-classification. Through this comparison, performed on several different benchmarks, we demonstrate that our algorithms make significantly fewer mistakes when predicting negative examples. We also provide a database of negative examples for general use in machine learning for protein function prediction (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).

Penrod NM, Greene CS, Moore JH. Predicting targeted drug combinations based on Pareto optimal patterns of coexpression network connectivity. Genome Med 2014;6:33. [PMID: 24944582 PMCID: PMC4062052 DOI: 10.1186/gm550] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/30/2013] [Accepted: 04/22/2014] [Indexed: 01/05/2023]

Abstract

Background

Molecularly targeted drugs promise a safer and more effective treatment modality than conventional chemotherapy for cancer patients. However, tumors are dynamic systems that readily adapt to these agents activating alternative survival pathways as they evolve resistant phenotypes. Combination therapies can overcome resistance but finding the optimal combinations efficiently presents a formidable challenge. Here we introduce a new paradigm for the design of combination therapy treatment strategies that exploits the tumor adaptive process to identify context-dependent essential genes as druggable targets.

Methods

We have developed a framework to mine high-throughput transcriptomic data, based on differential coexpression and Pareto optimization, to investigate drug-induced tumor adaptation. We use this approach to identify tumor-essential genes as druggable candidates. We apply our method to a set of ER⁺ breast tumor samples, collected before (n = 58) and after (n = 60) neoadjuvant treatment with the aromatase inhibitor letrozole, to prioritize genes as targets for combination therapy with letrozole treatment. We validate letrozole-induced tumor adaptation through coexpression and pathway analyses in an independent data set (n = 18).

Results

We find pervasive differential coexpression between the untreated and letrozole-treated tumor samples as evidence of letrozole-induced tumor adaptation. Based on patterns of coexpression, we identify ten genes as potential candidates for combination therapy with letrozole including EPCAM, a letrozole-induced essential gene and a target to which drugs have already been developed as cancer therapeutics. Through replication, we validate six letrozole-induced coexpression relationships and confirm the epithelial-to-mesenchymal transition as a process that is upregulated in the residual tumor samples following letrozole treatment.

Conclusions

To derive the greatest benefit from molecularly targeted drugs it is critical to design combination treatment strategies rationally. Incorporating knowledge of the tumor adaptation process into the design provides an opportunity to match targeted drugs to the evolving tumor phenotype and surmount resistance.

Youngs N, Penfold-Brown D, Drew K, Shasha D, Bonneau R. Parametric Bayesian priors and better choice of negative examples improve protein function prediction. Bioinformatics 2013;29:1190-8. [PMID: 23511543 PMCID: PMC3634187 DOI: 10.1093/bioinformatics/btt110] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 12/21/2022]

Gillis J, Pavlidis P. Assessing identity, redundancy and confounds in Gene Ontology annotations over time. Bioinformatics 2013;29:476-82. [PMID: 23297035 DOI: 10.1093/bioinformatics/bts727] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Indexed: 11/14/2022]

Greene CS, Troyanskaya OG. Chapter 2: Data-driven view of disease biology. PLoS Comput Biol 2012;8:e1002816. [PMID: 23300408 PMCID: PMC3531282 DOI: 10.1371/journal.pcbi.1002816] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]