51
Kovatch P, Costa A, Giles Z, Fluder E, Cho HM, Mazurkova S. Big Omics Data Experience. SC ... CONFERENCE PROCEEDINGS. SC (CONFERENCE : SUPERCOMPUTING) 2015; 2015. [PMID: 30788464 DOI: 10.1145/2807591.2807595] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
Abstract
As personalized medicine becomes more integrated into healthcare, the rate at which human genomes are being sequenced is rising quickly together with a concomitant acceleration in compute and storage requirements. To achieve the most effective solution for genomic workloads without re-architecting the industry-standard software, we performed a rigorous analysis of usage statistics, benchmarks and available technologies to design a system for maximum throughput. We share our experiences designing a system optimized for the "Genome Analysis ToolKit (GATK) Best Practices" whole genome DNA and RNA pipeline based on an evaluation of compute, workload and I/O characteristics. The characteristics of genomic-based workloads are vastly different from those of traditional HPC workloads, requiring different configurations of the scheduler and the I/O subsystem to achieve reliability, performance and scalability. By understanding how our researchers and clinicians work, we were able to employ techniques not only to speed up their workflow yielding improved and repeatable performance, but also to make more efficient use of storage and compute resources.
Affiliation(s)
- Patricia Kovatch
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Anthony Costa
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Zachary Giles
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Eugene Fluder
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Hyung Min Cho
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Svetlana Mazurkova
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
52
Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays. Genetics 2015; 202:351-62. [PMID: 26510793 PMCID: PMC4701098 DOI: 10.1534/genetics.115.183483] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 10/05/2014] [Accepted: 10/28/2015] [Indexed: 01/06/2023] Open
Abstract
Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation.
53
Structural and Computational Biology in the Design of Immunogenic Vaccine Antigens. J Immunol Res 2015; 2015:156241. [PMID: 26526043 PMCID: PMC4615220 DOI: 10.1155/2015/156241] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 05/29/2015] [Accepted: 08/02/2015] [Indexed: 01/08/2023] Open
Abstract
Vaccination is historically one of the most important medical interventions for the prevention of infectious disease. Previously, vaccines were typically made of rather crude mixtures of inactivated or attenuated causative agents. However, over the last 10–20 years, several important technological and computational advances have enabled major progress in the discovery and design of potently immunogenic recombinant protein vaccine antigens. Here we discuss three key breakthrough approaches that have potentiated structural and computational vaccine design. Firstly, genomic sciences gave birth to the field of reverse vaccinology, which has enabled the rapid computational identification of potential vaccine antigens. Secondly, major advances in structural biology, experimental epitope mapping, and computational epitope prediction have yielded molecular insights into the immunogenic determinants defining protective antigens, enabling their rational optimization. Thirdly, and most recently, computational approaches have been used to convert this wealth of structural and immunological information into the design of improved vaccine antigens. This review aims to illustrate the growing power of combining sequencing, structural and computational approaches, and we discuss how this may drive the design of novel immunogens suitable for future vaccines urgently needed to increase the global prevention of infectious disease.
54
Gasc C, Ribière C, Parisot N, Beugnot R, Defois C, Petit-Biderre C, Boucher D, Peyretaillade E, Peyret P. Capturing prokaryotic dark matter genomes. Res Microbiol 2015; 166:814-30. [PMID: 26100932 DOI: 10.1016/j.resmic.2015.06.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 03/03/2015] [Revised: 06/02/2015] [Accepted: 06/03/2015] [Indexed: 11/18/2022]
Abstract
Prokaryotes are the most diverse and abundant cellular life forms on Earth. Most of them, identified by indirect molecular approaches, belong to microbial dark matter. The advent of metagenomic and single-cell genomic approaches has highlighted the metabolic capabilities of numerous members of this dark matter through genome reconstruction. Thus, linking functions back to the species has revolutionized our understanding of how ecosystem function is sustained by the microbial world. This review will present discoveries acquired through the illumination of prokaryotic dark matter genomes by these innovative approaches.
Affiliation(s)
- Cyrielle Gasc
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
- Céline Ribière
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
- Nicolas Parisot
  - Biologie Fonctionnelle Insectes et Interactions, UMR203 BF2I, INRA, INSA-Lyon, Université de Lyon, Villeurbanne, France.
- Réjane Beugnot
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
- Clémence Defois
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
- Corinne Petit-Biderre
  - Université Blaise Pascal, Laboratoire Microorganismes, Génome et Environnement, Centre National de la Recherche Scientifique (CNRS), Unité Mixte de Recherche (UMR) 6023, F-63171 Aubière, France.
- Delphine Boucher
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
- Eric Peyretaillade
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
- Pierre Peyret
  - Clermont Université, Université d'Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France.
55
Whelan NV, Kocot KM, Halanych KM. Employing Phylogenomics to Resolve the Relationships among Cnidarians, Ctenophores, Sponges, Placozoans, and Bilaterians. Integr Comp Biol 2015; 55:1084-95. [PMID: 25972566 DOI: 10.1093/icb/icv037] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022] Open
Abstract
Despite an explosion in the amount of sequence data, phylogenomics has failed to settle controversy regarding some critical nodes on the animal tree of life. Understanding relationships among Bilateria, Ctenophora, Cnidaria, Placozoa, and Porifera is essential for studying how complex traits such as neurons, muscles, and gastrulation have evolved. Recent studies have cast doubt on the historical viewpoint that sponges are sister to all other animal lineages with recent studies recovering ctenophores as sister. However, the ctenophore-sister hypothesis has been criticized as unrealistic and caused by systematic error. We review past phylogenomic studies and potential causes of systematic error in an effort to identify areas that can be improved in future studies. Increased sampling of taxa, less missing data, and a priori removal of sequences and taxa that may cause systematic error in phylogenomic inference will likely be the most fruitful areas of focus when assembling future datasets. Ultimately, we foresee metazoan relationships being resolved with higher support in the near future, and we caution against dismissing novel hypotheses merely because they conflict with historical viewpoints of animal evolution.
Affiliation(s)
- Nathan V Whelan
  - Department of Biological Sciences, Molette Biology Laboratory for Environmental and Climate Change Studies, Auburn University, 101 Life Sciences Building, Auburn, AL 36849, USA
- Kevin M Kocot
  - School of Biological Sciences, The University of Queensland, 325 Goddard Building, St Lucia, QLD 4101, Australia
- Kenneth M Halanych
  - Department of Biological Sciences, Molette Biology Laboratory for Environmental and Climate Change Studies, Auburn University, 101 Life Sciences Building, Auburn, AL 36849, USA
56
Land M, Hauser L, Jun SR, Nookaew I, Leuze MR, Ahn TH, Karpinets T, Lund O, Kora G, Wassenaar T, Poudel S, Ussery DW. Insights from 20 years of bacterial genome sequencing. Funct Integr Genomics 2015; 15:141-61. [PMID: 25722247 PMCID: PMC4361730 DOI: 10.1007/s10142-015-0433-4] [Citation(s) in RCA: 430] [Impact Index Per Article: 43.0] [Received: 01/19/2015] [Revised: 02/11/2015] [Accepted: 02/12/2015] [Indexed: 12/18/2022]
Abstract
Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
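The pan- and core-genome calculation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the function name `pan_and_core` and the toy strain data are hypothetical and are not taken from the cited review. The core genome is taken as the families present in every genome (set intersection) and the pan genome as the families present in at least one (set union).

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan_genome, core_genome) for a mapping of genome name -> gene families.

    Assumption: gene families were assigned beforehand (for example by
    clustering protein sequences); this sketch only does the set arithmetic.
    """
    if not genomes:
        return set(), set()
    family_sets = list(genomes.values())
    # Pan genome: every family seen in at least one genome.
    pan = set().union(*family_sets)
    # Core genome: families shared by all genomes.
    core = set(family_sets[0]).intersection(*family_sets[1:])
    return pan, core


# Hypothetical toy data: three strains, five gene families in total.
toy = {
    "strain_A": {"famA", "famB", "famC"},
    "strain_B": {"famA", "famB", "famD"},
    "strain_C": {"famA", "famB", "famE"},
}
pan, core = pan_and_core(toy)
print(len(pan), len(core))  # 5 2
```

With real data the core set typically shrinks and the pan set grows as more genomes are added, which is consistent with the gap the abstract reports between roughly 3100 core families and roughly 89,000 total families for E. coli.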
Affiliation(s)
- Miriam Land
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Loren Hauser
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Joint Institute for Biological Sciences, University of Tennessee, Knoxville, TN 37996 USA
  - Department of Microbiology, University of Tennessee, Knoxville, TN 37996 USA
- Se-Ran Jun
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Intawat Nookaew
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Michael R. Leuze
  - Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Tae-Hyuk Ahn
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Tatiana Karpinets
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Ole Lund
  - Center for Biological Sequence Analysis, Department of Systems Biology, The Technical University of Denmark, Kgs. Lyngby, 2800 Denmark
- Guruprased Kora
  - Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Trudy Wassenaar
  - Molecular Microbiology and Genomics Consultants, Tannenstr 7, 55576 Zotzenheim, Germany
- Suresh Poudel
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Genome Science and Technology, University of Tennessee, Knoxville, TN 37996 USA
- David W. Ussery
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Joint Institute for Biological Sciences, University of Tennessee, Knoxville, TN 37996 USA
  - Center for Biological Sequence Analysis, Department of Systems Biology, The Technical University of Denmark, Kgs. Lyngby, 2800 Denmark
  - Genome Science and Technology, University of Tennessee, Knoxville, TN 37996 USA
57
Chen EQ, Bai L, Gong DY, Tang H. Employment of digital gene expression profiling to identify potential pathogenic and therapeutic targets of fulminant hepatic failure. J Transl Med 2015; 13:22. [PMID: 25623171 PMCID: PMC4312436 DOI: 10.1186/s12967-015-0380-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 10/24/2014] [Accepted: 01/05/2015] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND Dysregulated cytokine metabolism and activity are crucial to the development of fulminant hepatic failure (FHF), and many different cytokines have been identified. However, the precise gene expression profile and the gene interactions associated with FHF are yet to be further elucidated. METHODS In this study, we detected the digital gene expression profile (DGEP) by high-throughput sequencing in normal and FHF mouse liver, and the candidate genes and potential targets for FHF therapy were verified. The FHF mouse model was induced by D-Galactosamine (GalN)/lipopolysaccharide (LPS). RESULTS In total, 12727 genes were detected, and 3551 differentially expressed genes (DEGs) were obtained from RNA-seq data in FHF mouse liver. In FHF mouse liver, many of those DEGs were identified as differentially expressed in metabolic process, biosynthetic process, response to stimulus and response to stress, etc. Similarly, pathway enrichment analysis in FHF mouse liver showed that many significant DEGs were also enriched in metabolic pathways, apoptosis, chemokine signaling pathways, etc. Considering the important role of nuclear factor-kappa B (NF-κB) in metabolic regulation and the delicate balance between cell survival and death, several DEGs involved in the NF-κB pathway were selected for experimental validation. As compared to normal control, NF-κBp65 and its inhibitory protein IκBα were both significantly increased, and NF-κB targeted genes including tumor necrosis factor α (TNFα), inducible nitric oxide synthase (iNOS), interleukin-1β, and chemokines CCL3 and CCL4 were also increased in hepatic tissues of FHF. In addition, after NF-κB was successfully pre-blocked, there was significant alteration of hepatic pathological damage and mortality in the FHF mouse model. CONCLUSIONS This study provides the global gene expression profile of FHF mouse liver, and demonstrates the possibility of the NF-κB gene as a potential therapeutic target for FHF.
Affiliation(s)
- En-Qiang Chen
  - Center of Infectious Diseases, West China Hospital of Sichuan University, No.37 Guo Xue Xiang, Wuhou District, Chengdu, 610041, People's Republic of China.
  - Division of Infectious Diseases, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610041, China.
- Lang Bai
  - Center of Infectious Diseases, West China Hospital of Sichuan University, No.37 Guo Xue Xiang, Wuhou District, Chengdu, 610041, People's Republic of China.
  - Division of Infectious Diseases, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610041, China.
- Dao-Yin Gong
  - Institute of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 610075, China.
- Hong Tang
  - Center of Infectious Diseases, West China Hospital of Sichuan University, No.37 Guo Xue Xiang, Wuhou District, Chengdu, 610041, People's Republic of China.
  - Division of Infectious Diseases, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610041, China.
58
Sim M, Kim J. Metagenome assembly through clustering of next-generation sequencing data using protein sequences. J Microbiol Methods 2015; 109:180-7. [PMID: 25572018 DOI: 10.1016/j.mimet.2015.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2014] [Revised: 01/03/2015] [Accepted: 01/03/2015] [Indexed: 11/16/2022]
Abstract
The study of environmental microbial communities, called metagenomics, has gained a lot of attention because of the recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the mode of their effect can be solved by investigating metagenomes. However, the difficulty of metagenomes, such as the combination of multiple microbes and different species abundance, makes metagenome assembly tasks more challenging. In this paper, we developed a new metagenome assembly method by utilizing protein sequences, in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. By using simulated NGS read sequences from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences, as a guide for the assembly, is promising.
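Step (ii) of the method described above, building a consensus through probabilistic choices from a read cluster, could look roughly like the following sketch. This is only an illustration under stated assumptions, not the implementation from the cited paper: it assumes the reads in a cluster are already aligned to equal length with `-` marking gaps, and it picks each consensus base with probability proportional to its count in that column. The function name `probabilistic_consensus` is hypothetical.

```python
import random
from collections import Counter
from typing import List, Optional


def probabilistic_consensus(reads: List[str],
                            rng: Optional[random.Random] = None) -> str:
    """Build one consensus string from pre-aligned reads of equal length.

    Assumption: at each column a base is sampled with probability
    proportional to how often it occurs there; gap characters are ignored
    unless the whole column is gaps.
    """
    rng = rng or random.Random()
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be pre-aligned to the same length")

    consensus = []
    for column in zip(*reads):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            consensus.append("-")  # column contains only gaps
            continue
        bases = sorted(counts)  # fixed order so a seeded rng is reproducible
        weights = [counts[base] for base in bases]
        consensus.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(consensus)


# Hypothetical cluster of three aligned reads; position 2 is ambiguous (C vs G).
cluster = ["AC-T", "AG-T", "ACGT"]
print(probabilistic_consensus(cluster, random.Random(0)))
```

Passing a seeded `random.Random` makes a run reproducible; a deterministic majority vote would be the simpler alternative when sampling is not wanted.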
Affiliation(s)
- Mikang Sim
  - Department of Animal Biotechnology, Konkuk University, Seoul 143-701, Republic of Korea.
- Jaebum Kim
  - Department of Animal Biotechnology, Konkuk University, Seoul 143-701, Republic of Korea.
59
Motooka D, Nakamura S, Hagiwara K, Nakaya T. Viral detection by high-throughput sequencing. Methods Mol Biol 2015; 1236:125-34. [PMID: 25287501 DOI: 10.1007/978-1-4939-1743-3_11] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
Abstract
We applied a high-throughput sequencing platform, Ion PGM, for viral detection in fecal samples from adult cows collected in Hokkaido, Japan. Random RT-PCR was performed to amplify RNA extracted from 0.25 ml of fecal specimens (N = 8), and more than 5 μg of cDNA was synthesized. Unbiased high-throughput sequencing using the 318 v2 semiconductor chip of these eight samples yielded 57-580 K (average: 270 K, after data analysis) reads in a single run. As a result, viral genome sequences were detected in each specimen. In addition to bacteriophage, mammal- and insect-derived viruses, partial genome sequences of plant, algal, and protozoal viruses were detected. Thus, this metagenomic analysis of fecal specimens could be useful to comprehensively understand viral populations of the intestine and food sources in animals.
Affiliation(s)
- Daisuke Motooka
  - Department of Infection Metagenomics, Research Institute for Microbial Disease, Osaka University, Osaka, Japan
60
Huang KC, Yang KC, Lin H, Tsao TTH, Lee SA. Transcriptome alterations of mitochondrial and coagulation function in schizophrenia by cortical sequencing analysis. BMC Genomics 2014; 15 Suppl 9:S6. [PMID: 25522158 PMCID: PMC4290619 DOI: 10.1186/1471-2164-15-s9-s6] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 01/31/2023] Open
Abstract
Background Transcriptome sequencing of brain samples provides detailed enrichment analysis of differential expression and genetic interactions for evaluation of mitochondrial and coagulation function of schizophrenia. It is implicated that schizophrenia genetic and protein interactions may give rise to biological dysfunction of energy metabolism and hemostasis. These findings may explain the biological mechanisms responsible for negative and withdraw symptoms of schizophrenia and antipsychotic-induced venous thromboembolism. We conducted a comparison of schizophrenic candidate genes from literature reviews and constructed the schizophrenia-mediator network (SCZMN) which consists of schizophrenic candidate genes and associated mediator genes by applying differential expression analysis to BA22 RNA-Seq brain data. The network was searched against pathway databases such as PID, Reactome, HumanCyc, and Cell-Map. The candidate complexes were identified by MCL clustering using CORUM for potential pathogenesis of schizophrenia. Results Published BA22 RNA-Seq brain data of 9 schizophrenic patients and 9 controls samples were analyzed. The differentially expressed genes in the BA22 brain samples of schizophrenia are proposed as schizophrenia candidate marker genes (SCZCGs). The genetic interactions between mitochondrial genes and many under-expressed SCZCGs indicate the genetic predisposition of mitochondria dysfunction in schizophrenia. The biological functions of SCZCGs, as listed in the Pathway Interaction Database (PID), indicate that these genes have roles in DNA binding transcription factor, signal and cancer-related pathways, coagulation and cell cycle regulation and differentiation pathways. In the query-query protein-protein interaction (QQPPI) network of SCZCGs, TP53, PRKACA, STAT3 and SP1 were identified as the central "hub" genes. Mitochondrial function was modulated by dopamine inhibition of respiratory complex I activity. 
The genetic interaction between mitochondria function and schizophrenia may be revealed by DRD2 linked to NDUFS7 through protein-protein interactions of FLNA and ARRB2. The biological mechanism of signaling pathway of coagulation cascade was illustrated by the PPI network of the SCZCGs and the coagulation-associated genes. The relationship between antipsychotic target genes (DRD2/3 and HTR2A) and coagulation factor genes (F3, F7 and F10) appeared to cascade the following hemostatic process implicating the bottleneck of coagulation genetic network by the bridging of actin-binding protein (FLNA). Conclusions It is implicated that the energy metabolism and hemostatic process have important roles in the pathogenesis for schizophrenia. The cross-talk of genetic interaction by these co-expressed genes and reached candidate genes may address the key network in disease pathology. The accuracy of candidate genes evaluated from different quantification tools could be improved by crosstalk analysis of overlapping genes in genetic networks.
61
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 2014; 9:e112963. [PMID: 25409509 PMCID: PMC4237348 DOI: 10.1371/journal.pone.0112963] [Citation(s) in RCA: 6044] [Impact Index Per Article: 549.5] [Received: 08/25/2014] [Accepted: 10/16/2014] [Indexed: 02/06/2023] Open
Abstract
Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3–5 Kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy as compared to state-of-the-art tools and is unique in its ability to accurately identify large sequence variants including duplications and resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.
Affiliation(s)
- Bruce J. Walker
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - * E-mail: (BJW); (AME)
- Thomas Abeel
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - VIB Department of Plant Systems Biology, Ghent University, Ghent, Belgium
- Terrance Shea
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Margaret Priest
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Amr Abouelliel
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Sharadha Sakthikumar
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Christina A. Cuomo
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Qiandong Zeng
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Jennifer Wortman
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Sarah K. Young
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Ashlee M. Earl
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - * E-mail: (BJW); (AME)
62
Abstract
Genomic information reported as haplotypes rather than genotypes will be increasingly important for personalized medicine. Current technologies generate diploid sequence data that is rarely resolved into its constituent haplotypes. Furthermore, paradigms for thinking about genomic information are based on interpreting genotypes rather than haplotypes. Nevertheless, haplotypes have historically been useful in contexts ranging from population genetics to disease-gene mapping efforts. The main approaches for phasing genomic sequence data are molecular haplotyping, genetic haplotyping, and population-based inference. Long-read sequencing technologies are enabling longer molecular haplotypes, and decreases in the cost of whole-genome sequencing are enabling the sequencing of whole-chromosome genetic haplotypes. Hybrid approaches combining high-throughput short-read assembly with strategic approaches that enable physical or virtual binning of reads into haplotypes are enabling multi-gene haplotypes to be generated from single individuals. These techniques can be further combined with genetic and population approaches. Here, we review advances in whole-genome haplotyping approaches and discuss the importance of haplotypes for genomic medicine. Clinical applications include diagnosis by recognition of compound heterozygosity and by phasing regulatory variation to coding variation. Haplotypes, which are more specific than less complex variants such as single nucleotide variants, also have applications in prognostics and diagnostics, in the analysis of tumors, and in typing tissue for transplantation. Future advances will include technological innovations, the application of standard metrics for evaluating haplotype quality, and the development of databases that link haplotypes to disease.
Affiliation(s)
- Gustavo Glusman
  - Institute for Systems Biology, Terry Avenue North, Seattle, WA 98109 USA
- Hannah C Cox
  - Institute for Systems Biology, Terry Avenue North, Seattle, WA 98109 USA
- Jared C Roach
  - Institute for Systems Biology, Terry Avenue North, Seattle, WA 98109 USA
63
Microbial communities associated with human decomposition and their potential use as postmortem clocks. Int J Legal Med 2014; 129:623-32. [DOI: 10.1007/s00414-014-1059-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 03/19/2014] [Accepted: 07/30/2014] [Indexed: 12/16/2022]
|
64
|
Abstract
Personalized medicine is the cornerstone of medical practice. It tailors treatments for specific conditions of an affected individual. The borders of personalized medicine are defined by limitations in technology and our understanding of biology, physiology and pathology of various conditions. Current advances in technology have provided physicians with the tools to investigate the molecular makeup of the disease. Translating these molecular make-ups to actionable targets has led to the development of small molecular inhibitors. Also, detailed understanding of genetic makeup has allowed us to develop prognostic markers, better known as companion diagnostics. Current attempts in the development of drug delivery systems offer the opportunity of delivering specific inhibitors to affected cells in an attempt to reduce the unwanted side effects of drugs.
Affiliation(s)
- Gayane Badalian-Very
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, 450 Brookline Ave, Boston, MA 02115, United States. Tel.: +1 617 513 7940; fax: +1 617 632 5998.
|
65
|
Wang J, Zhang KY, Liu SM, Sen S. Tumor-associated circulating microRNAs as biomarkers of cancer. Molecules 2014; 19:1912-1938. [PMID: 24518808 PMCID: PMC6271223 DOI: 10.3390/molecules19021912] [Citation(s) in RCA: 114] [Impact Index Per Article: 10.4] [Received: 12/10/2013] [Revised: 01/24/2014] [Accepted: 01/29/2014] [Indexed: 02/06/2023] Open
Abstract
MicroRNAs (miRNAs), the 17- to 25-nucleotide long noncoding RNAs that modulate the expression of mRNAs and proteins, have emerged as critical players in cancer initiation and progression processes. Deregulation of tissue miRNA expression levels associated with specific genetic alterations has been demonstrated in cancer, where miRNAs function either as oncogenes or as tumor-suppressor genes and are shed from cancer cells into circulation. The present review summarizes and evaluates recent advances in our understanding of the characteristics of tumor tissue miRNAs, circulating miRNAs, and the stability of miRNAs in tissues and their varying expression profiles in circulating tumor cells, and body fluids including blood plasma. These advances in knowledge have led to intense efforts towards discovery and validation of differentially expressing tumor-associated miRNAs as biomarkers and therapeutic targets of cancer. The development of tumor-specific miRNA signatures as cancer biomarkers detectable in malignant cells and body fluids should help with early detection and more effective therapeutic intervention for individual patients.
Affiliation(s)
- Jin Wang
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Ke-Yong Zhang
- Department of Orthopedics, Daye People's Hospital, Daye, Hubei 435100, China
| | - Song-Mei Liu
- Center for Gene Diagnosis, Zhongnan Hospital of Wuhan University, Wuhan, Hubei 430071, China
| | - Subrata Sen
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
|
66
|
Sequencing: The Next Generation—What Is the Role of Whole-Exome Sequencing in the Diagnosis of Familial Cardiovascular Diseases? Can J Cardiol 2014; 30:152-4. [DOI: 10.1016/j.cjca.2013.12.024] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/23/2013] [Accepted: 12/23/2013] [Indexed: 12/31/2022] Open
|
67
|
Challenges in the Next-Generation Sequencing Field. NEXT GENERATION SEQUENCING TECHNOLOGIES AND CHALLENGES IN SEQUENCE ASSEMBLY 2014. [DOI: 10.1007/978-1-4939-0715-1_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/13/2022]
|
68
|
El-Metwally S, Ouda OM, Helmy M. Approaches and Challenges of Next-Generation Sequence Assembly Stages. NEXT GENERATION SEQUENCING TECHNOLOGIES AND CHALLENGES IN SEQUENCE ASSEMBLY 2014. [DOI: 10.1007/978-1-4939-0715-1_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/12/2023]
|