1
|
Telenson AM, Hsieh RR, Cowen GJ, Sode EP, Kwon JM, Vo AH, Hadhazy M, Page PG, Rao NR, Pesce L, Demonbreun AR, Puckelwartz MJ, Savas JN, McNally EM. A novel, rapidly progressive ataxia due to a spontaneous Myo5a mutation in mice impairs transport proteins and alters mitochondria. FASEB J 2025; 39:e70423. [PMID: 40022605 DOI: 10.1096/fj.202402274r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2024] [Revised: 01/01/2025] [Accepted: 02/17/2025] [Indexed: 03/03/2025]
Abstract
Spontaneous mouse mutants have helped define genetic contributions to many phenotypes. Here we report a spontaneous Novel Ataxic Phenotype (NAP) in mice. Ataxia findings were evident at post-natal day 11 in NAP mice and rapidly worsened, resulting in preweaning lethality. Using genome sequencing and genome-wide mapping, we identified a 3' donor splice variant in exon 14 of Myo5a, encoding an actin-based motor protein. The variant in Myo5a (c.1752g>a) excises exon 14 and ablates expression of MYO5A protein, which is implicated in intracellular transport and in Griscelli syndrome type I in humans. NAP mice displayed expansion of PAX6-positive cells in the external granule layer of the cerebellum, and mass spectrometry analysis of cerebellar extracts uncovered differentially abundant proteins involved in short-range organelle transport, specifically proteins associated with early endosomes. Using cerebellar lysates and primary neurons, we provide evidence for an interaction between MYO5A and ANKFY1, a known effector for the endosomal protein RAB5A. We also found that neurons from NAP mice had elongated mitochondria, linking MYO5A to mitochondrial homeostasis. This allele provides new insight into Myo5a function in developmental neuropathology.
Affiliation(s)
- Alexander M Telenson
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Ryan R Hsieh
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Gabrielle J Cowen
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Eoin P Sode
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Jason M Kwon
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Andy H Vo
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Michele Hadhazy
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Patrick G Page
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Nalini R Rao
- Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Lorenzo Pesce
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Alexis R Demonbreun
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Megan J Puckelwartz
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Jeffrey N Savas
- Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Elizabeth M McNally
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Cardiology, Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Biochemistry and Molecular Biology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
|
2
|
Puckelwartz MJ, Pesce LL, Hernandez EJ, Webster G, Dellefave-Castillo LM, Russell MW, Geisler SS, Kearns SD, Karthik F, Etheridge SP, Monroe TO, Pottinger TD, Kannankeril PJ, Shoemaker MB, Fountain D, Roden DM, Faulkner M, MacLeod HM, Burns KM, Yandell M, Tristani-Firouzi M, George AL, McNally EM. The impact of damaging epilepsy and cardiac genetic variant burden in sudden death in the young. Genome Med 2024; 16:13. [PMID: 38229148 PMCID: PMC10792876 DOI: 10.1186/s13073-024-01284-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 01/03/2024] [Indexed: 01/18/2024] Open
Abstract
BACKGROUND Sudden unexpected death in children is a tragic event. Understanding the genetics of sudden death in the young (SDY) enables family counseling and cascade screening. The objective of this study was to characterize genetic variation in an SDY cohort using whole genome sequencing. METHODS The SDY Case Registry is a National Institutes of Health/Centers for Disease Control and Prevention surveillance effort to discern the prevalence, causes, and risk factors for SDY. The SDY Case Registry prospectively collected clinical data and DNA biospecimens from SDY cases < 20 years of age. SDY cases were collected from medical examiner and coroner offices spanning 13 US jurisdictions from 2015 to 2019. The cohort included 211 children (median age 0.33 year; range 0-20 years), determined to have died suddenly and unexpectedly and from whom DNA biospecimens for DNA extractions and next-of-kin consent were ascertained. A control cohort consisted of 211 randomly sampled, sex- and ancestry-matched individuals from the 1000 Genomes Project. Genetic variation was evaluated in epilepsy, cardiomyopathy, and arrhythmia genes in the SDY and control cohorts. American College of Medical Genetics/Genomics guidelines were used to classify variants as pathogenic or likely pathogenic. Additionally, pathogenic and likely pathogenic genetic variation was identified using a Bayesian-based artificial intelligence (AI) tool. RESULTS The SDY cohort was 43% European, 29% African, 3% Asian, 16% Hispanic, and 9% with mixed ancestries and 39% female. Six percent of the cohort was found to harbor a pathogenic or likely pathogenic genetic variant in an epilepsy, cardiomyopathy, or arrhythmia gene. The genomes of SDY cases, but not controls, were enriched for rare, potentially damaging variants in epilepsy, cardiomyopathy, and arrhythmia-related genes. A greater number of rare epilepsy genetic variants correlated with younger age at death. 
CONCLUSIONS While damaging cardiomyopathy and arrhythmia genes are recognized contributors to SDY, we also observed an enrichment in epilepsy-related genes in the SDY cohort and a correlation between rare epilepsy variation and younger age at death. These findings emphasize the importance of considering epilepsy genes when evaluating SDY.
Affiliation(s)
- Megan J Puckelwartz
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
| | - Lorenzo L Pesce
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | | | - Gregory Webster
- Division of Cardiology, Department of Pediatrics, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
| | | | - Mark W Russell
- Department of Pediatrics, University of Michigan, Ann Arbor, MI, USA
| | - Sarah S Geisler
- Department of Pediatrics, University of Michigan, Ann Arbor, MI, USA
| | - Samuel D Kearns
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Felix Karthik
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Susan P Etheridge
- Division of Pediatric Cardiology, University of Utah, Salt Lake City, UT, USA
| | - Tanner O Monroe
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Tess D Pottinger
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Prince J Kannankeril
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - M Benjamin Shoemaker
- Department of Medicine, Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Darlene Fountain
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Dan M Roden
- Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | | | - Kristin M Burns
- Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mark Yandell
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | | | - Alfred L George
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Elizabeth M McNally
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
|
3
|
Mujawar S, Patil G, Suthar S, Shendkar T, Gangadhar V. COVID-19 progression towards ARDS: a genome wide study reveals host factors underlying critical COVID-19. Genomics Inform 2023; 21:e16. [PMID: 37415451 PMCID: PMC10326536 DOI: 10.5808/gi.22080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 05/19/2023] [Accepted: 05/22/2023] [Indexed: 07/08/2023] Open
Abstract
Coronavirus disease 2019 (COVID-19) is a viral infection caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); the outbreak was declared a global pandemic in March 2020. The World Health Organization has recorded around 43.3 billion cases and 59.4 million casualties to date, posing a severe threat to global health. Severe COVID-19 indicates viral pneumonia caused by SARS-CoV-2 infection, which can have fatal consequences, including acute respiratory distress syndrome (ARDS). The purpose of this research is to better understand the COVID-19 and ARDS pathways and to identify relevant single nucleotide polymorphisms. To accomplish this, we retrieved over 100 patients' samples from the Sequence Read Archive, National Center for Biotechnology Information. These sequences were processed through the Galaxy server next generation sequencing pipeline for variant analysis and then visualized in the Integrative Genomics Viewer. Statistical analysis using t-tests and Bonferroni correction identified six major genes: DNAH7, CLUAP1, PPA2, PAPSS1, TLR4, and IFITM3. Furthermore, a complete understanding of the genomes of COVID-19-related ARDS will aid in the early identification and treatment of target proteins. Finally, the discovery of novel therapeutics based on the identified proteins can help slow the progression of ARDS and lower fatality rates.
Affiliation(s)
- Shama Mujawar
- MIT School of Bioengineering Sciences and Research, MIT-Art, Design and Technology University, Loni Kalbhor, Pune 412201, India
| | - Gayatri Patil
- MIT School of Bioengineering Sciences and Research, MIT-Art, Design and Technology University, Loni Kalbhor, Pune 412201, India
| | - Srushti Suthar
- MIT School of Bioengineering Sciences and Research, MIT-Art, Design and Technology University, Loni Kalbhor, Pune 412201, India
| | - Tanuja Shendkar
- MIT School of Bioengineering Sciences and Research, MIT-Art, Design and Technology University, Loni Kalbhor, Pune 412201, India
| | - Vaishnavi Gangadhar
- MIT School of Bioengineering Sciences and Research, MIT-Art, Design and Technology University, Loni Kalbhor, Pune 412201, India
|
4
|
Puckelwartz MJ, Pesce LL, Hernandez EJ, Webster G, Dellefave-Castillo LM, Russell MW, Geisler SS, Kearns SD, Etheridge FK, Etheridge SP, Monroe TO, Pottinger TD, Kannankeril PJ, Shoemaker MB, Fountain D, Roden DM, MacLeod H, Burns KM, Yandell M, Tristani-Firouzi M, George AL, McNally EM. The impact of damaging epilepsy and cardiac genetic variant burden in sudden death in the young. medRxiv 2023:2023.03.27.23287711. [PMID: 37034657 PMCID: PMC10081419 DOI: 10.1101/2023.03.27.23287711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Background Sudden unexpected death in children is a tragic event. Understanding the genetics of sudden death in the young (SDY) enables family counseling and cascade screening. The objective of this study was to characterize genetic variation in an SDY cohort using whole genome sequencing. Methods The SDY Case Registry is a National Institutes of Health/Centers for Disease Control surveillance effort to discern the prevalence, causes, and risk factors for SDY. The SDY Case Registry prospectively collected clinical data and DNA biospecimens from SDY cases <20 years of age. SDY cases were collected from medical examiner and coroner offices spanning 13 US jurisdictions from 2015-2019. The cohort included 211 children (mean age 1 year; range 0-20 years), determined to have died suddenly and unexpectedly and in whom DNA biospecimens and next-of-kin consent were ascertained. A control cohort consisted of 211 randomly sampled, sex- and ancestry-matched individuals from the 1000 Genomes Project. Genetic variation was evaluated in epilepsy, cardiomyopathy and arrhythmia genes in the SDY and control cohorts. American College of Medical Genetics/Genomics guidelines were used to classify variants as pathogenic or likely pathogenic. Additionally, genetic variation predicted to be damaging was identified using a Bayesian-based artificial intelligence (AI) tool. Results The SDY cohort was 42% European, 30% African, 17% Hispanic, and 11% with mixed ancestries, and 39% female. Six percent of the cohort was found to harbor a pathogenic or likely pathogenic genetic variant in an epilepsy, cardiomyopathy or arrhythmia gene. The genomes of SDY cases, but not controls, were enriched for rare, damaging variants in epilepsy, cardiomyopathy and arrhythmia-related genes. A greater number of rare epilepsy genetic variants correlated with younger age at death.
Conclusions While damaging cardiomyopathy and arrhythmia genes are recognized contributors to SDY, we also observed an enrichment in epilepsy-related genes in the SDY cohort, and a correlation between rare epilepsy variation and younger age at death. These findings emphasize the importance of considering epilepsy genes when evaluating SDY.
|
5
|
Bazrafshan S, Sibilia R, Girgla S, Viswanathan SK, Puckelwartz MJ, Sangha KS, Singh RR, Kakroo M, Jandarov R, Harris DM, Rubinstein J, Becker RC, McNally EM, Sadayappan S. South Asian-Specific MYBPC3 Δ25bp Deletion Carriers Display Hypercontraction and Impaired Diastolic Function Under Exercise Stress. Front Cardiovasc Med 2021; 8:766339. [PMID: 35004883 PMCID: PMC8733148 DOI: 10.3389/fcvm.2021.766339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2021] [Accepted: 11/22/2021] [Indexed: 11/13/2022] Open
Abstract
Background: A 25-base pair (25bp) intronic deletion in the MYBPC3 gene enriched in South Asians (SAs) is a risk allele for late-onset left ventricular (LV) dysfunction, hypertrophy, and heart failure (HF) with several forms of cardiomyopathy. However, the effect of this variant on exercise parameters has not been evaluated. Methods: As a pilot study, 10 asymptomatic SA carriers of the MYBPC3 Δ25bp variant (52.9 ± 2.14 years) and 10 age- and gender-matched non-carriers (NCs) (50.1 ± 2.7 years) were evaluated at baseline and under exercise stress conditions using bicycle exercise echocardiography and continuous cardiac monitoring. Results: Baseline echocardiography parameters were not different between the two groups. However, in response to exercise stress, the carriers of Δ25bp had significantly higher LV ejection fraction (%) (CI: 4.57 ± 1.93; p < 0.0001), LV outflow tract peak velocity (m/s) (CI: 0.19 ± 0.07; p < 0.0001), and higher aortic valve (AV) peak velocity (m/s) (CI: 0.103 ± 0.08; p = 0.01) in comparison to NCs, and E/A ratio, a marker of diastolic compliance, was significantly lower in Δ25bp carriers (CI: 0.107 ± 0.102; p = 0.038). Interestingly, LV end-diastolic diameter (LVIDdia) was augmented in NCs in response to stress, while it did not increase in Δ25bp carriers (CI: 0.239 ± 0.125; p = 0.0002). Further, stress-induced right ventricular systolic excursion velocity s' (m/s), as a marker of right ventricle function, increased similarly in both groups, but tricuspid annular plane systolic excursion increased more in carriers (slope: 0.008; p = 0.0001), suggesting right ventricle functional differences between the two groups. Conclusions: These data support that MYBPC3 Δ25bp is associated with LV hypercontraction under stress conditions with evidence of diastolic impairment.
Affiliation(s)
- Sholeh Bazrafshan
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Robert Sibilia
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Saavia Girgla
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Shiv Kumar Viswanathan
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Megan J. Puckelwartz
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
| | - Kiranpal S. Sangha
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Rohit R. Singh
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Mashhood Kakroo
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Roman Jandarov
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - David M. Harris
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Jack Rubinstein
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Richard C. Becker
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
| | - Elizabeth M. McNally
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
| | - Sakthivel Sadayappan
- Division of Cardiovascular Health and Disease, Department of Internal Medicine, Heart, Lung and Vascular Institute, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
|
6
|
Lu L, Chen H, Wang X, Zhao Y, Yao X, Xiong B, Deng Y, Zhao D. Genome-level diversification of eight ancient tea populations in the Guizhou and Yunnan regions identifies candidate genes for core agronomic traits. Horticulture Research 2021; 8:190. [PMID: 34376642 PMCID: PMC8355299 DOI: 10.1038/s41438-021-00617-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/02/2020] [Revised: 05/20/2021] [Accepted: 05/24/2021] [Indexed: 05/18/2023]
Abstract
The ancient tea plant, as a precious natural resource and source of tea plant genetic diversity, is of great value for studying the evolutionary mechanism, diversification, and domestication of plants. The overall genetic diversity among ancient tea plants and the genetic changes that occurred during natural selection remain poorly understood. Here, we report the genome resequencing of eight different groups consisting of 120 ancient tea plants: six groups from Guizhou Province and two groups from Yunnan Province. Based on the 8,082,370 identified high-quality SNPs, we constructed phylogenetic relationships, assessed population structure, and performed genome-wide association studies (GWAS). Our phylogenetic analysis showed that the 120 ancient tea plants were mainly clustered into three groups and five single branches, which is consistent with the results of principal component analysis (PCA). Ancient tea plants were further divided into seven subpopulations based on genetic structure analysis. Moreover, it was found that the variation in ancient tea plants was not reduced by pressure from the external natural environment or artificial breeding (nonsynonymous/synonymous = 1.05). By integrating GWAS, selection signals, and gene function prediction, four candidate genes were significantly associated with three leaf traits, and two candidate genes were significantly associated with plant type. These candidate genes can be used for further functional characterization and genetic improvement of tea plants.
Affiliation(s)
- Litang Lu
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
- College of Life Sciences and The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in the Mountainous Region (Ministry of Education), Institute of Agro-Bioengineering, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Hufang Chen
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
- College of Life Sciences and The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in the Mountainous Region (Ministry of Education), Institute of Agro-Bioengineering, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Xiaojing Wang
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Yichen Zhao
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
- College of Life Sciences and The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in the Mountainous Region (Ministry of Education), Institute of Agro-Bioengineering, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Xinzhuan Yao
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Biao Xiong
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Yanli Deng
- College of Tea Science, Guizhou University, Guiyang, 550025, People's Republic of China
| | - Degang Zhao
- College of Life Sciences and The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in the Mountainous Region (Ministry of Education), Institute of Agro-Bioengineering, Guizhou University, Guiyang, 550025, People's Republic of China.
- Guizhou Academy of Agricultural Sciences, Guiyang, 550025, People's Republic of China.
|
7
|
GPrimer: a fast GPU-based pipeline for primer design for qPCR experiments. BMC Bioinformatics 2021; 22:220. [PMID: 33926379 PMCID: PMC8082839 DOI: 10.1186/s12859-021-04133-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/17/2020] [Accepted: 04/14/2021] [Indexed: 11/10/2022] Open
Abstract
Background Design of valid, high-quality primers is essential for qPCR experiments. MRPrimer is a powerful pipeline based on MapReduce that combines primer design for target sequences with homology tests on off-target sequences. It takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB. Owing to the effectiveness of primers designed by MRPrimer in qPCR analysis, it has been widely used for developing online design tools and building primer databases. However, MRPrimer is too slow to cope with the exponentially growing sizes of sequence DBs and thus must be improved. Results We develop a fast GPU-based pipeline for primer design (GPrimer) that takes the same input and returns the same output as MRPrimer. MRPrimer consists of a total of seven MapReduce steps, two of which are very time-consuming. GPrimer significantly improves the speed of those two steps by exploiting the computational power of GPUs. In particular, it designs data structures for coalesced memory access on the GPU and for workload balancing among GPU threads, and it copies the data structures between main memory and GPU memory in a streaming fashion. For the human RefSeq DB, GPrimer achieves a speedup of 57 times for the entire pipeline and a speedup of 557 times for the most time-consuming step using a single machine with 4 GPUs, compared with MRPrimer running on a cluster of six machines. Conclusions We propose a GPU-based pipeline for primer design that takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB at once, without an additional step using BLAST-like tools. The software is available at https://github.com/qhtjrmin/GPrimer.git.
|
8
|
Puckelwartz MJ, Pesce LL, Dellefave‐Castillo LM, Wheeler MT, Pottinger TD, Robinson AC, Kearns SD, Gacita AM, Schoppen ZJ, Pan W, Kim G, Wilcox JE, Anderson AS, Ashley EA, Day SM, Cappola T, Dorn GW, McNally EM. Genomic Context Differs Between Human Dilated Cardiomyopathy and Hypertrophic Cardiomyopathy. J Am Heart Assoc 2021; 10:e019944. [PMID: 33764162 PMCID: PMC8174318 DOI: 10.1161/jaha.120.019944] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/26/2020] [Accepted: 02/17/2021] [Indexed: 12/20/2022]
Abstract
Background Inherited cardiomyopathies display variable penetrance and expression, and a component of phenotypic variation is genetically determined. To evaluate the genetic contribution to this variable expression, we compared protein coding variation in the genomes of those with hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM). Methods and Results Nonsynonymous single-nucleotide variants (nsSNVs) were ascertained using whole genome sequencing from familial cases of HCM (n=56) or DCM (n=70) and correlated with echocardiographic information. Focusing on nsSNVs in 102 genes linked to inherited cardiomyopathies, we correlated the number of nsSNVs per person with left ventricular measurements. Principal component analysis and generalized linear models were applied to identify the probability of cardiomyopathy type as it related to the number of nsSNVs in cardiomyopathy genes. The probability of having DCM significantly increased as the number of cardiomyopathy gene nsSNVs per person increased. The increase in nsSNVs in cardiomyopathy genes significantly associated with reduced left ventricular ejection fraction and increased left ventricular diameter for individuals carrying a DCM diagnosis, but not for those with HCM. Resampling was used to identify genes with aberrant cumulative allele frequencies, identifying potential modifier genes for cardiomyopathy. Conclusions Participants with DCM had more nsSNVs per person in cardiomyopathy genes than participants with HCM. The nsSNV burden in cardiomyopathy genes did not correlate with the probability or manifestation of left ventricular measures in HCM. These findings support the concept that increased variation in cardiomyopathy genes creates a genetic background that predisposes to DCM and increased disease severity.
Affiliation(s)
- Megan J. Puckelwartz
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
- Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL
- Department of Medicine/Cardiovascular Medicine, Stanford University, Stanford, CA
| | - Lorenzo L. Pesce
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | | | - Matthew T. Wheeler
- Department of Medicine/Cardiovascular Medicine, Stanford University, Stanford, CA
| | - Tess D. Pottinger
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - Avery C. Robinson
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - Samuel D. Kearns
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - Anthony M. Gacita
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - Zachary J. Schoppen
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - Wenyu Pan
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - Gene Kim
- Department of Medicine, University of Chicago, Chicago, IL
| | - Jane E. Wilcox
- Department of Medicine, Bluhm Cardiovascular Institute, Northwestern University, Chicago, IL
| | - Allen S. Anderson
- Department of Medicine, Bluhm Cardiovascular Institute, Northwestern University, Chicago, IL
| | - Euan A. Ashley
- Department of Medicine, Bluhm Cardiovascular Institute, Northwestern University, Chicago, IL
| | - Sharlene M. Day
- Department of Internal Medicine, The University of Michigan, Ann Arbor, MI
- Perelman School of Medicine, Division of Cardiovascular Medicine and Penn Cardiovascular Institute and Department of Medicine, University of Pennsylvania, Philadelphia, PA
| | - Thomas Cappola
- Perelman School of Medicine, Division of Cardiovascular Medicine and Penn Cardiovascular Institute and Department of Medicine, University of Pennsylvania, Philadelphia, PA
| | | | - Elizabeth M. McNally
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
- Department of Internal Medicine, The University of Michigan, Ann Arbor, MI
|
9
|
Jarlier F, Joly N, Fedy N, Magalhaes T, Sirotti L, Paganiban P, Martin F, McManus M, Hupé P. QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing. F1000Res 2020; 9:240. [PMID: 32913637 PMCID: PMC7429925 DOI: 10.12688/f1000research.22954.3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 10/05/2020] [Indexed: 11/20/2022] Open
Abstract
Life science has entered the so-called 'big data era' where biologists, clinicians and bioinformaticians are overwhelmed with high-throughput sequencing data. While they offer new insights to decipher the genome structure they also raise major challenges to use them for daily clinical practice care and diagnosis purposes as they are bigger and bigger. Therefore, we implemented a software to reduce the time to delivery for the alignment and the sorting of high-throughput sequencing data. Our solution is implemented using Message Passing Interface and is intended for high-performance computing architecture. The software scales linearly with respect to the size of the data and ensures a total reproducibility with the traditional tools. For example, a 300X whole genome can be aligned and sorted within less than 9 hours with 128 cores. The software offers significant speed-up using multi-cores and multi-nodes parallelization.
Affiliation(s)
- Frédéric Jarlier
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France
| | | | - Nicolas Fedy
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France.,Université Paris Descartes, Paris, F-75006, France
| | - Thomas Magalhaes
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France.,Université Paris Descartes, Paris, F-75006, France
| | - Leonor Sirotti
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France.,Université Paris Descartes, Paris, F-75006, France
| | - Paul Paganiban
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France.,Université Paris Descartes, Paris, F-75006, France
| | - Firmin Martin
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France.,Université Paris Descartes, Paris, F-75006, France
| | | | - Philippe Hupé
- Institut Curie, Paris, F-75005, France.,U900, Inserm, Paris, F-75005, France.,PSL Research University, Paris, France.,Mines Paris Tech, Fontainebleau, F-77305, France.,UMR144, CNRS, Paris, F-75005, France
| |
|
10
|
Pottinger TD, Puckelwartz MJ, Pesce LL, Robinson A, Kearns S, Pacheco JA, Rasmussen-Torvik LJ, Smith ME, Chisholm R, McNally EM. Pathogenic and Uncertain Genetic Variants Have Clinical Cardiac Correlates in Diverse Biobank Participants. J Am Heart Assoc 2020; 9:e013808. [PMID: 32009526 PMCID: PMC7033893 DOI: 10.1161/jaha.119.013808] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 01/09/2023]
Abstract
Background Genome sequencing coupled with electronic health record data can uncover medically important genetic variation. Interpretation of rare genetic variation and its role in mediating cardiovascular phenotypes is confounded by variants of uncertain significance. Methods and Results We analyzed the whole genome sequence of 900 racially and ethnically diverse biobank participants selected from a single US center. Participants were equally divided among European, African, Hispanic, and mixed races/ethnicities. We evaluated the American College of Medical Genetics and Genomics medically actionable list of 59 genes, focusing on the cardiac genes. Variation was interpreted using the most recent reports in ClinVar, a database of medically relevant human variation. We identified 19 individuals with pathogenic or likely pathogenic variants in cardiac actionable genes (2%) and found evidence of related clinical correlates in the electronic health record. Participants of African ancestry, compared with those of European ancestry, had more variants of uncertain significance in the medically actionable genes including the 30 cardiac actionable genes, even when normalized to total variant count per person. Longitudinal measures of left ventricle size from ≈400 biobank participants (1723 patient‐years) were correlated with genetic findings. The presence of ≥1 uncertain variant in the actionable cardiac genes and a cardiomyopathy diagnosis correlated with increased left ventricular internal diameter in diastole and in systole. In particular, MYBPC3 was identified as a gene with excess variants of uncertain significance. Conclusions These data indicate that a subset of uncertain genetic variants may confer risk and should not be considered benign.
Affiliation(s)
- Tess D Pottinger
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
| | - Megan J Puckelwartz
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
- Department of Pharmacology Northwestern University Feinberg School of Medicine Chicago IL
| | | | - Avery Robinson
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
| | - Samuel Kearns
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
| | - Jennifer A Pacheco
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
| | - Laura J Rasmussen-Torvik
- Department of Preventive Medicine Northwestern University Feinberg School of Medicine Chicago IL
| | - Maureen E Smith
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
| | - Rex Chisholm
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
- Department of Cell and Molecular Biology Northwestern University Feinberg School of Medicine Chicago IL
| | - Elizabeth M McNally
- Center for Genetic Medicine Northwestern University Feinberg School of Medicine Chicago IL
| |
|
11
|
Ito S, Yadome M, Nishiki T, Ishiduki S, Inoue H, Yamaguchi R, Miyano S. Virtual Grid Engine: a simulated grid engine environment for large-scale supercomputers. BMC Bioinformatics 2019; 20:591. [PMID: 31787090 PMCID: PMC6886159 DOI: 10.1186/s12859-019-3085-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Background Supercomputers have become indispensable infrastructures in science and industries. In particular, most state-of-the-art scientific results utilize massively parallel supercomputers ranked in TOP500. However, their use is still limited in the bioinformatics field due to the fundamental fact that the asynchronous parallel processing service of Grid Engine is not provided on them. To encourage the use of massively parallel supercomputers in bioinformatics, we developed middleware called Virtual Grid Engine, which enables software pipelines to automatically perform their tasks as MPI programs. Result We conducted basic tests to check the time required to assign jobs to workers by VGE. The results showed that the overhead of the employed algorithm was 246 microseconds and our software can manage thousands of jobs smoothly on the K computer. We also tried a practical test in the bioinformatics field. This test included two tasks, the split and BWA alignment of input FASTQ data. 25,055 nodes (2,000,440 cores) were used for this calculation and accomplished it in three hours. Conclusion We considered that there were four important requirements for this kind of software, non-privilege server program, multiple job handling, dependency control, and usability. We carefully designed and checked all requirements. And this software fulfilled all the requirements and achieved good performance in a large scale analysis.
Affiliation(s)
- Satoshi Ito
- The Institute of Medical Science, The University of Tokyo, Shirokanedai 4-6-1, Minato-ku, Tokyo, 108-8639, Japan
| | - Masaaki Yadome
- The Institute of Medical Science, The University of Tokyo, Shirokanedai 4-6-1, Minato-ku, Tokyo, 108-8639, Japan
| | - Tatsuo Nishiki
- Frontier Computing Center, Fujitsu Limited, Higashishinbashi 1-5-2, Minato-ku, Tokyo, 105-7123, Japan
| | - Shigeru Ishiduki
- Frontier Computing Center, Fujitsu Limited, Higashishinbashi 1-5-2, Minato-ku, Tokyo, 105-7123, Japan
| | - Hikaru Inoue
- Frontier Computing Center, Fujitsu Limited, Higashishinbashi 1-5-2, Minato-ku, Tokyo, 105-7123, Japan
| | - Rui Yamaguchi
- The Institute of Medical Science, The University of Tokyo, Shirokanedai 4-6-1, Minato-ku, Tokyo, 108-8639, Japan
| | - Satoru Miyano
- The Institute of Medical Science, The University of Tokyo, Shirokanedai 4-6-1, Minato-ku, Tokyo, 108-8639, Japan
| |
|
12
|
Viswanathan SK, Puckelwartz MJ, Mehta A, Ramachandra CJA, Jagadeesan A, Fritsche-Danielson R, Bhat RV, Wong P, Kandoi S, Schwanekamp JA, Kuffel G, Pesce LL, Zilliox MJ, Durai UNB, Verma RS, Molokie RE, Suresh DP, Khoury PR, Thomas A, Sanagala T, Tang HC, Becker RC, Knöll R, Shim W, McNally EM, Sadayappan S. Association of Cardiomyopathy With MYBPC3 D389V and MYBPC3Δ25bp Intronic Deletion in South Asian Descendants. JAMA Cardiol 2018; 3:481-488. [PMID: 29641836 DOI: 10.1001/jamacardio.2018.0618] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 01/21/2023]
Abstract
Importance The genetic variant MYBPC3Δ25bp occurs in 4% of South Asian descendants, with an estimated 100 million carriers worldwide. MYBPC3Δ25bp has been linked to cardiomyopathy and heart failure. However, the high prevalence of MYBPC3Δ25bp suggests that other stressors act in concert with MYBPC3Δ25bp. Objective To determine whether there are additional genetic factors that contribute to the cardiomyopathic expression of MYBPC3Δ25bp. Design, Setting, and Participants South Asian individuals living in the United States were screened for MYBPC3Δ25bp, and a subgroup was clinically evaluated using electrocardiograms and echocardiograms at Loyola University, Chicago, Illinois, between January 2015 and July 2016. Main Outcomes and Measures Next-generation sequencing of 174 cardiovascular disease genes was applied to identify additional modifying gene mutations and correlate genotype-phenotype parameters. Cardiomyocytes derived from human-induced pluripotent stem cells were established and examined to assess the role of MYBPC3Δ25bp. Results In this genotype-phenotype study, individuals of South Asian descent living in the United States from both sexes (36.23% female) with a mean population age of 48.92 years (range, 18-84 years) were recruited. Genetic screening of 2401 US South Asian individuals found an MYBPC3Δ25bp carrier frequency of 6%. A higher frequency of missense TTN variation was found in MYBPC3Δ25bp carriers compared with noncarriers, identifying distinct genetic backgrounds within the MYBPC3Δ25bp carrier group. Strikingly, 9.6% of MYBPC3Δ25bp carriers also had a novel MYBPC3 variant, D389V. Family studies documented D389V was in tandem on the same allele as MYBPC3Δ25bp, and D389V was only seen in the presence of MYBPC3Δ25bp. In contrast to MYBPC3Δ25bp, MYBPC3Δ25bp/D389V was associated with hyperdynamic left ventricular performance (mean [SEM] left ventricular ejection fraction, 66.7 [0.7%]; left ventricular fractional shortening, 36.6 [0.6%]; P < .03) and stem cell-derived cardiomyocytes exhibited cellular hypertrophy with abnormal Ca2+ transients. Conclusions and Relevance MYBPC3Δ25bp/D389V is associated with hyperdynamic features, which are an early finding in hypertrophic cardiomyopathy and thought to reflect an unfavorable energetic state. These findings support that a subset of MYBPC3Δ25bp carriers, those with D389V, account for the increased risk attributed to MYBPC3Δ25bp.
Affiliation(s)
- Shiv Kumar Viswanathan
- Heart, Lung and Vascular Institute, Department of Internal Medicine, University of Cincinnati, Cincinnati, Ohio.,Department of Cell and Molecular Physiology, Loyola University Chicago, Maywood, Illinois
| | | | - Ashish Mehta
- National Heart Research Institute Singapore.,Cardiovascular Academic Clinical Program, DUKE-NUS Medical School, Singapore.,PSC and Phenotyping Laboratory, Victor Chang Cardiac Research Institute, Sydney, Australia
| | | | | | - Regina Fritsche-Danielson
- Cardiovascular and Metabolic Disease Innovative Medicines and Early Development Unit, AstraZeneca Research and Development, Gothenburg, Sweden
| | - Ratan V Bhat
- Cardiovascular and Metabolic Disease Innovative Medicines and Early Development Unit, AstraZeneca Research and Development, Gothenburg, Sweden
| | - Philip Wong
- National Heart Research Institute Singapore.,Cardiovascular and Metabolic Disorders Program, DUKE-NUS Medical School, Singapore.,Department of Cardiology, National Heart Centre Singapore, Singapore
| | - Sangeetha Kandoi
- Heart, Lung and Vascular Institute, Department of Internal Medicine, University of Cincinnati, Cincinnati, Ohio.,Department of Cell and Molecular Physiology, Loyola University Chicago, Maywood, Illinois.,Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamilnadu, India
| | - Jennifer A Schwanekamp
- Heart, Lung and Vascular Institute, Department of Internal Medicine, University of Cincinnati, Cincinnati, Ohio
| | - Gina Kuffel
- Department of Public Health Sciences, Loyola University Chicago, Maywood, Illinois
| | - Lorenzo L Pesce
- Computation Institute, The University of Chicago, Chicago, Illinois
| | - Michael J Zilliox
- Department of Public Health Sciences, Loyola University Chicago, Maywood, Illinois
| | - U Nalla B Durai
- Division of Hematology and Oncology, University of Illinois at Chicago
| | - Rama Shanker Verma
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamilnadu, India
| | - Robert E Molokie
- Division of Hematology and Oncology, University of Illinois at Chicago
| | | | - Philip R Khoury
- Heart Institute, Cincinnati Children's Hospital, Cincinnati, Ohio
| | - Annie Thomas
- Marcella Niehoff School of Nursing, Loyola University Chicago, Maywood, Illinois
| | - Thriveni Sanagala
- Department of Cardiology and Echocardiography and Cardiographics, Loyola University Chicago, Maywood, Illinois
| | - Hak Chiaw Tang
- Department of Cardiology, National Heart Centre Singapore, Singapore
| | - Richard C Becker
- Heart, Lung and Vascular Institute, Department of Internal Medicine, University of Cincinnati, Cincinnati, Ohio
| | - Ralph Knöll
- Cardiovascular and Metabolic Disease Innovative Medicines and Early Development Unit, AstraZeneca Research and Development, Gothenburg, Sweden.,Integrated Cardio-Metabolic Centre, Myocardial Genetics, Karolinska Institutet, University Hospital, Heart and Vascular Theme, Stockholm, Sweden
| | - Winston Shim
- National Heart Research Institute Singapore.,Cardiovascular and Metabolic Disorders Program, DUKE-NUS Medical School, Singapore
| | - Elizabeth M McNally
- Center for Genetic Medicine, Northwestern University, Chicago, Illinois.,Associate Editor for Translational Science
| | - Sakthivel Sadayappan
- Heart, Lung and Vascular Institute, Department of Internal Medicine, University of Cincinnati, Cincinnati, Ohio.,Department of Cell and Molecular Physiology, Loyola University Chicago, Maywood, Illinois
| |
|
13
|
Shafiei H, Bakhtiarizadeh MR, Salehi A. Large‐scale potential RNA editing profiling in different adult chicken tissues. Anim Genet 2019; 50:460-474. [DOI: 10.1111/age.12818] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Accepted: 04/23/2019] [Indexed: 12/23/2022]
Affiliation(s)
- H. Shafiei
- Department of Animal and Poultry Science, College of Aburaihan, University of Tehran, Tehran 33916-53775, Iran
| | - M. R. Bakhtiarizadeh
- Department of Animal and Poultry Science, College of Aburaihan, University of Tehran, Tehran 33916-53775, Iran
| | - A. Salehi
- Department of Animal and Poultry Science, College of Aburaihan, University of Tehran, Tehran 33916-53775, Iran
| |
|
14
|
Ahmed AE, Heldenbrand J, Asmann Y, Fadlelmola FM, Katz DS, Kendig K, Kendzior MC, Li T, Ren Y, Rodriguez E, Weber MR, Wozniak JM, Zermeno J, Mainzer LS. Managing genomic variant calling workflows with Swift/T. PLoS One 2019; 14:e0211608. [PMID: 31287816 PMCID: PMC6615596 DOI: 10.1371/journal.pone.0211608] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/16/2019] [Accepted: 06/08/2019] [Indexed: 12/30/2022]
Abstract
Bioinformatics research is frequently performed using complex workflows with multiple steps, fans, merges, and conditionals. This complexity makes management of the workflow difficult on a computer cluster, especially when running in parallel on large batches of data: hundreds or thousands of samples at a time. Scientific workflow management systems could help with that. Many are now being proposed, but is there yet the “best” workflow management system for bioinformatics? Such a system would need to satisfy numerous, sometimes conflicting requirements: from ease of use, to seamless deployment at peta- and exa-scale, and portability to the cloud. We evaluated Swift/T as a candidate for such role by implementing a primary genomic variant calling workflow in the Swift/T language, focusing on workflow management, performance and scalability issues that arise from production-grade big data genomic analyses. In the process we introduced novel features into the language, which are now part of its open repository. Additionally, we formalized a set of design criteria for quality, robust, maintainable workflows that must function at-scale in a production setting, such as a large genomic sequencing facility or a major hospital system. The use of Swift/T conveys two key advantages. (1) It operates transparently in multiple cluster scheduling environments (PBS Torque, SLURM, Cray aprun environment, etc.), thus a single workflow is trivially portable across numerous clusters. (2) The leaf functions of Swift/T permit developers to easily swap executables in and out of the workflow, which makes it easy to maintain and to request resources optimal for each stage of the pipeline. While Swift/T’s data-level parallelism eliminates the need to code parallel analysis of multiple samples, it does make debugging more difficult, as is common for implicitly parallel code. Nonetheless, the language gives users a powerful and portable way to scale up analyses in many computing architectures. 
The code for our implementation of a variant calling workflow using Swift/T can be found on GitHub at https://github.com/ncsa/Swift-T-Variant-Calling, with full documentation provided at http://swift-t-variant-calling.readthedocs.io/en/latest/.
Affiliation(s)
- Azza E. Ahmed
- Centre for Bioinformatics & Systems Biology, Faculty of Science, University of Khartoum, Khartoum, Sudan
- Department of Electrical and Electronic Engineering, Faculty of Engineering, University of Khartoum, Khartoum, Sudan
| | - Jacob Heldenbrand
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Yan Asmann
- Department of Health Sciences Research, Mayo Clinic, Jacksonville, Florida, United States of America
| | - Faisal M. Fadlelmola
- Centre for Bioinformatics & Systems Biology, Faculty of Science, University of Khartoum, Khartoum, Sudan
| | - Daniel S. Katz
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Katherine Kendig
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Matthew C. Kendzior
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Tiffany Li
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Yingxue Ren
- Department of Health Sciences Research, Mayo Clinic, Jacksonville, Florida, United States of America
| | - Elliott Rodriguez
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Matthew R. Weber
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Justin M. Wozniak
- Argonne National Laboratory, Argonne, Illinois, United States of America
| | - Jennie Zermeno
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| | - Liudmila S. Mainzer
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
| |
|
15
|
Vo AH, Swaggart KA, Woo A, Gao QQ, Demonbreun AR, Fallon KS, Quattrocelli M, Hadhazy M, Page PGT, Chen Z, Eskin A, Squire K, Nelson SF, McNally EM. Dusp6 is a genetic modifier of growth through enhanced ERK activity. Hum Mol Genet 2019; 28:279-289. [PMID: 30289454 DOI: 10.1093/hmg/ddy349] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/11/2018] [Accepted: 09/26/2018] [Indexed: 12/21/2022]
Abstract
Like other single-gene disorders, muscular dystrophy displays a range of phenotypic heterogeneity even with the same primary mutation. Identifying genetic modifiers capable of altering the course of muscular dystrophy is one approach to deciphering gene-gene interactions that can be exploited for therapy development. To this end, we used an intercross strategy in mice to map modifiers of muscular dystrophy. We interrogated genes of interest in an interval on mouse chromosome 10 associated with body mass in muscular dystrophy as skeletal muscle contributes significantly to total body mass. Using whole-genome sequencing of the two parental mouse strains combined with deep RNA sequencing, we identified the Met62Ile substitution in the dual-specificity phosphatase 6 (Dusp6) gene from the DBA/2J (D2) mouse strain. DUSP6 is a broadly expressed dual-specificity phosphatase protein, which binds and dephosphorylates extracellular-signal-regulated kinase (ERK), leading to decreased ERK activity. We found that the Met62Ile substitution reduced the interaction between DUSP6 and ERK resulting in increased ERK phosphorylation and ERK activity. In dystrophic muscle, DUSP6 Met62Ile is strongly upregulated to counteract its reduced activity. We found that myoblasts from the D2 background were insensitive to a specific small molecule inhibitor of DUSP6, while myoblasts expressing the canonical DUSP6 displayed enhanced proliferation after exposure to DUSP6 inhibition. These data identify DUSP6 as an important regulator of ERK activity in the setting of muscle growth and muscular dystrophy.
Affiliation(s)
- Andy H Vo
- Committee on Development, Regeneration and Stem Cell Biology, The University of Chicago, Chicago, IL
| | | | - Anna Woo
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Quan Q Gao
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Alexis R Demonbreun
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Katherine S Fallon
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Mattia Quattrocelli
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Michele Hadhazy
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Patrick G T Page
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| | - Zugen Chen
- Departments of Human Genetics and Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Ascia Eskin
- Departments of Human Genetics and Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Kevin Squire
- Departments of Human Genetics and Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Stanley F Nelson
- Departments of Human Genetics and Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Elizabeth M McNally
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago IL
| |
|
16
|
Liu X, Luo X, Jiang C, Zhao H. Difficulties and challenges in the development of precision medicine. Clin Genet 2019; 95:569-574. [PMID: 30653655 DOI: 10.1111/cge.13511] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 09/28/2018] [Revised: 01/09/2019] [Accepted: 01/11/2019] [Indexed: 12/25/2022]
Abstract
The rapid development of precision medicine is introducing a new era of significance in medicine. However, attaining precision medicine is an ambitious goal that is bound to encounter some challenges. Here, we have put forward some difficulties or questions that should be addressed by the progress in this field. The proposed issues include the long road to precision medicine for all types of diseases as the unknown domains of the human genome hinder the development of precision medicine. The challenges in the acquisition and analysis of large amounts of omics data, including difficulties in the establishment of a library of biological samples and large-scale data analysis, as well as the challenges of informed consent and medical ethics in precision medicine, must be overcome to attain the goals of precision medicine. To date, precision medicine programs have accomplished many preliminary achievements and will help to drive a dramatic revolution in clinical practices for the medical community. Through these advances, the diagnosis and treatment of many diseases will achieve many breakthroughs. This project is just beginning and requires a great deal of time and money. Precision medicine also requires extensive collaboration. Ultimately, these difficulties can be overcome. We should realize that precision medicine is good for patients, but there is still a long way to go.
Affiliation(s)
- Xiaoqin Liu
- Department of Nephrology, Hongqi Hospital, Mudanjiang Medical University, Mudanjiang, People's Republic of China
| | - Xin Luo
- Department of Radiotherapy, The Second Hospital of PingLiang City, Second Affiliated Hospital of Gansu Medical College, Pingliang, People's Republic of China
| | - Chunyang Jiang
- Department of Thoracic Surgery, Tianjin Union Medical Center, Tianjin, People's Republic of China
| | - Hui Zhao
- Department of Thoracic Surgery, Tianjin Union Medical Center, Tianjin, People's Republic of China
| |
|
17
|
Maarala AI, Bzhalava Z, Dillner J, Heljanko K, Bzhalava D. ViraPipe: scalable parallel pipeline for viral metagenome analysis from next generation sequencing reads. Bioinformatics 2018; 34:928-935. [PMID: 29106455 DOI: 10.1093/bioinformatics/btx702] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/09/2017] [Accepted: 11/01/2017] [Indexed: 11/13/2022]
Abstract
Motivation Next Generation Sequencing (NGS) technology enables identification of microbial genomes from massive amount of human microbiomes more rapidly and cheaper than ever before. However, the traditional sequential genome analysis algorithms, tools, and platforms are inefficient for performing large-scale metagenomic studies on ever-growing sample data volumes. Currently, there is an urgent need for scalable analysis pipelines that enable harnessing all the power of parallel computation in computing clusters and in cloud computing environments. We propose ViraPipe, a scalable metagenome analysis pipeline that is able to analyze thousands of human microbiomes in parallel in tolerable time. The pipeline is tuned for analyzing viral metagenomes and the software is applicable for other metagenomic analyses as well. ViraPipe integrates parallel BWA-MEM read aligner, MegaHit De novo assembler, and BLAST and HMMER3 sequence search tools. We show the scalability of ViraPipe by running experiments on mining virus related genomes from NGS datasets in a distributed Spark computing cluster. Results ViraPipe analyses 768 human samples in 210 minutes on a Spark computing cluster comprising 23 nodes and 1288 cores in total. The speedup of ViraPipe executed on 23 nodes was 11x compared to the sequential analysis pipeline executed on a single node. The whole process includes parallel decompression, read interleaving, BWA-MEM read alignment, filtering and normalizing of non-human reads, De novo contigs assembling, and searching of sequences with BLAST and HMMER3 tools. Contact ilari.maarala@aalto.fi. Availability and implementation https://github.com/NGSeq/ViraPipe.
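The scaling figures quoted in this abstract can be converted into a parallel efficiency and a throughput with simple arithmetic. The Python sketch below is only an illustration: the function and variable names are hypothetical and not part of ViraPipe, and the efficiency figure assumes the sequential baseline ran on a single node comparable to the 23 cluster nodes.

```python
# Illustrative arithmetic only; these names are hypothetical, not ViraPipe APIs.


def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Fraction of ideal linear scaling achieved: speedup divided by node count."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes


def throughput(samples: int, minutes: float) -> float:
    """Samples processed per minute of wall-clock time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return samples / minutes


# Figures quoted in the abstract above.
NODES = 23
SPEEDUP = 11.0
SAMPLES = 768
MINUTES = 210

efficiency = parallel_efficiency(SPEEDUP, NODES)
rate = throughput(SAMPLES, MINUTES)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"throughput: {rate:.2f} samples/min")
```

Under these assumptions the reported 11x speedup on 23 nodes corresponds to a little under half of ideal linear scaling, and the run processes roughly 3.7 samples per minute.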
Affiliation(s)
- Altti Ilari Maarala
- Department of Computer Science, Aalto University, Espoo, Finland.,Helsinki Institute for Information Technology HIIT, Espoo, Finland
| | - Zurab Bzhalava
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Joakim Dillner
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Keijo Heljanko
- Department of Computer Science, Aalto University, Espoo, Finland.,Helsinki Institute for Information Technology HIIT, Espoo, Finland
| | - Davit Bzhalava
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
| |
|
18
|
Sona P, Hong JH, Lee S, Kim BJ, Hong WY, Jung J, Kim HN, Kim HL, Christopher D, Herviou L, Im YH, Lee KY, Kim TS, Jung J. Integrated genome sizing (IGS) approach for the parallelization of whole genome analysis. BMC Bioinformatics 2018; 19:462. [PMID: 30509173 PMCID: PMC6276166 DOI: 10.1186/s12859-018-2499-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2018] [Accepted: 11/16/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND The use of whole genome sequence has increased recently with rapid progression of next-generation sequencing (NGS) technologies. However, storing raw sequence reads to perform large-scale genome analysis poses hardware challenges. Despite advancement in genome analytic platforms, efficient approaches remain relevant especially as applied to the human genome. In this study, an Integrated Genome Sizing (IGS) approach is adopted to speed up multiple whole genome analysis in a high-performance computing (HPC) environment. The approach splits a genome (GRCh37) into 630 chunks (fragments) wherein multiple chunks can simultaneously be parallelized for sequence analyses across cohorts. RESULTS IGS was integrated on Maha-Fs (HPC) system, to provide the parallelization required to analyze 2504 whole genomes. Using a single reference pilot genome, NA12878, we compared the NGS process time between Maha-Fs (NFS SATA hard disk drive) and SGI-UV300 (solid state drive memory). It was observed that SGI-UV300 was faster, having 32.5 mins of process time, while that of the Maha-Fs was 55.2 mins. CONCLUSIONS The implementation of IGS can leverage the ability of HPC systems to analyze multiple genomes simultaneously. We believe this approach will accelerate research advancement in personalized genomic medicine. Our method is comparable to the fastest methods for sequence alignment.
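The chunk-and-parallelize idea described in this abstract can be sketched in a few lines of Python. This is not the IGS implementation: the abstract does not say how the 630 chunks are defined, so the sketch assumes fixed-size windows per contig, uses toy contig lengths rather than GRCh37, and stands in a placeholder `analyze_chunk` function for the real per-chunk analysis. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# (contig, start, end) with 0-based, half-open coordinates.
Chunk = Tuple[str, int, int]


def make_chunks(contig_lengths: Dict[str, int], chunk_size: int) -> List[Chunk]:
    """Split every contig into consecutive windows of at most chunk_size bases."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: List[Chunk] = []
    for contig, length in contig_lengths.items():
        for start in range(0, length, chunk_size):
            chunks.append((contig, start, min(start + chunk_size, length)))
    return chunks


def analyze_chunk(chunk: Chunk) -> int:
    """Placeholder for per-chunk analysis; here it only returns the chunk length."""
    _contig, start, end = chunk
    return end - start


# Toy contig lengths (assumption for illustration, not real genome sizes).
toy_lengths = {"chrA": 250, "chrB": 100}
chunks = make_chunks(toy_lengths, chunk_size=100)

# Chunks are independent, so they can be handed to a pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    covered = sum(pool.map(analyze_chunk, chunks))

print(f"{len(chunks)} chunks covering {covered} bases")
```

Because the windows never overlap and together span each contig, the per-chunk results can be merged without double counting; a production pipeline would typically dispatch chunks to cluster jobs rather than threads.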
Affiliation(s)
- Peter Sona
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Jong Hui Hong
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Sunho Lee
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Byong Joon Kim
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Woon-Young Hong
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Jongcheol Jung
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Han-Na Kim
- PGM21 (Personalized Genomic Medicine 21), Ewha Womans University Medical Center, 1071, Anyang Cheon-ro, Yangcheon-gu, Seoul, 158-710, Korea
- Hyung-Lae Kim
- PGM21 (Personalized Genomic Medicine 21), Ewha Womans University Medical Center, 1071, Anyang Cheon-ro, Yangcheon-gu, Seoul, 158-710, Korea
- David Christopher
- Bioinformatics Solutions, 900 N McCarthy Blvd., Milpitas, CA, 95035, USA
- Laurent Herviou
- Bioinformatics Solutions, 900 N McCarthy Blvd., Milpitas, CA, 95035, USA
- Young Hwan Im
- Bioinformatics Solutions, 900 N McCarthy Blvd., Milpitas, CA, 95035, USA
- Kwee-Yum Lee
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025; Faculty of Medicine, University of Queensland, QLD, Brisbane, 4072, Australia
- Tae Soon Kim
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025; Department of Clinical Medical Sciences, Seoul National University College of Medicine, 71 Ihwajang-gil, Jongno-gu, Seoul, 03087, South Korea
- Jongsun Jung
- Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
19
Barefield DY, Puckelwartz MJ, Kim EY, Wilsbacher LD, Vo AH, Waters EA, Earley JU, Hadhazy M, Dellefave-Castillo L, Pesce LL, McNally EM. Experimental Modeling Supports a Role for MyBP-HL as a Novel Myofilament Component in Arrhythmia and Dilated Cardiomyopathy. Circulation 2017; 136:1477-1491. [PMID: 28778945 DOI: 10.1161/circulationaha.117.028585] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 03/25/2017] [Accepted: 07/21/2017] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Cardiomyopathy and arrhythmias are under significant genetic influence. Here, we studied a family with dilated cardiomyopathy and associated conduction system disease in whom prior clinical cardiac gene panel testing was unrevealing.
METHODS: Whole-genome sequencing and induced pluripotent stem cells were used to examine a family with dilated cardiomyopathy and atrial and ventricular arrhythmias. We also characterized a mouse model with heterozygous and homozygous deletion of Mybphl.
RESULTS: Whole-genome sequencing identified a premature stop codon, R255X, in the MYBPHL gene encoding MyBP-HL (myosin-binding protein-H like), a novel member of the myosin-binding protein family. MYBPHL was found to have high atrial expression with low ventricular expression. We determined that MyBP-HL protein was myofilament associated in the atria, and truncated MyBP-HL protein failed to incorporate into the myofilament. Human cell modeling demonstrated reduced expression from the mutant MYBPHL allele. Echocardiography of Mybphl heterozygous and null mouse hearts showed a 36% reduction in fractional shortening and an increased diastolic ventricular chamber size. Atrial weight normalized to total heart weight was significantly increased in Mybphl heterozygous and null mice. Using a reporter system, we detected robust expression of Mybphl in the atria, as well as in discrete puncta throughout the right ventricular wall and septum. Telemetric electrocardiogram recordings in Mybphl mice revealed cardiac conduction system abnormalities with aberrant atrioventricular conduction and an increased rate of arrhythmia in heterozygous and null mice.
CONCLUSIONS: The findings of reduced ventricular function and conduction system defects in Mybphl mice support the view that MYBPHL truncations may increase risk for human arrhythmias and cardiomyopathy.
Affiliation(s)
- David Y Barefield, Megan J Puckelwartz, Ellis Y Kim, Lisa D Wilsbacher, Andy H Vo, Emily A Waters, Judy U Earley, Michele Hadhazy, Lisa Dellefave-Castillo, Lorenzo L Pesce, Elizabeth M McNally
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (D.Y.B., M.J.P., J.U.E., M.H., L.D.-C., E.M.M.); Molecular Pathogenesis and Molecular Medicine, University of Chicago, IL (E.Y.K.); Feinberg Cardiovascular Institute, Northwestern University Feinberg School of Medicine, Chicago, IL (L.D.W.); Committee on Development, Regeneration and Stem Cell Biology, University of Chicago, IL (A.H.V.); Northwestern University Center for Advanced Molecular Imaging, Evanston, IL (E.A.W.); and Computation Institute, University of Chicago, IL (L.L.P.)
20
Decap D, Reumers J, Herzeel C, Costanza P, Fostier J. Halvade-RNA: Parallel variant calling from transcriptomic data using MapReduce. PLoS One 2017; 12:e0174575. [PMID: 28358893 PMCID: PMC5373595 DOI: 10.1371/journal.pone.0174575] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/12/2016] [Accepted: 03/10/2017] [Indexed: 12/30/2022]
Abstract
Given the current cost-effectiveness of next-generation sequencing, the amount of DNA-seq and RNA-seq data generated is ever increasing. One of the primary objectives of NGS experiments is calling genetic variants. While highly accurate, most variant calling pipelines are not optimized to run efficiently on large data sets. However, as variant calling in genomic data has become common practice, several methods have been proposed to reduce runtime for DNA-seq analysis through the use of parallel computing. Determining the effectively expressed variants from transcriptomics (RNA-seq) data has only recently become possible, and as such does not yet benefit from efficiently parallelized workflows. We introduce Halvade-RNA, a parallel, multi-node RNA-seq variant calling pipeline based on the GATK Best Practices recommendations. Halvade-RNA makes use of the MapReduce programming model to create and manage parallel data streams on which multiple instances of existing tools such as STAR and GATK operate concurrently. Whereas the single-threaded processing of a typical RNA-seq sample requires ∼28h, Halvade-RNA reduces this runtime to ∼2h using a small cluster with two 20-core machines. Even on a single, multi-core workstation, Halvade-RNA can significantly reduce runtime compared to using multi-threading, thus providing for a more cost-effective processing of RNA-seq data. Halvade-RNA is written in Java and uses the Hadoop MapReduce 2.0 API. It supports a wide range of distributions of Hadoop, including Cloudera and Amazon EMR.
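The runtimes quoted in this abstract can be turned into a speedup and a parallel efficiency with a few lines of arithmetic. The sketch below assumes the approximate figures (about 28 h single-threaded, about 2 h on two 20-core machines) are taken at face value and that all 40 cores were used; the function names are illustrative only.

```python
def speedup(t_serial_h, t_parallel_h):
    """Ratio of the serial runtime to the parallel runtime."""
    return t_serial_h / t_parallel_h


def parallel_efficiency(t_serial_h, t_parallel_h, cores):
    """Speedup divided by the number of cores used (1.0 would be ideal scaling)."""
    return speedup(t_serial_h, t_parallel_h) / cores


# Approximate values from the abstract; 2 machines x 20 cores is an assumption
# about how many cores were actually in use.
s = speedup(28.0, 2.0)
e = parallel_efficiency(28.0, 2.0, cores=2 * 20)
print(f"speedup ~{s:.0f}x, efficiency ~{e:.0%}")
```

Under these assumptions the speedup is roughly 14x and the efficiency roughly 35% relative to a single-threaded baseline, which is a different baseline from the multi-threaded single-machine comparison the abstract also mentions.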
Affiliation(s)
- Dries Decap
- Department of Information Technology, IDLab, Ghent University - imec, Ghent, Belgium
- ExaScience Life Lab, Leuven, Belgium
- Joke Reumers
- Janssen Research & Development, a division of Janssen Pharmaceutica N.V., Beerse, Belgium
- ExaScience Life Lab, Leuven, Belgium
- Pascal Costanza
- Intel Corporation Belgium, Leuven, Belgium
- ExaScience Life Lab, Leuven, Belgium
- Jan Fostier
- Department of Information Technology, IDLab, Ghent University - imec, Ghent, Belgium
- ExaScience Life Lab, Leuven, Belgium
21
Evolvable Smartphone-Based Platforms for Point-of-Care In-Vitro Diagnostics Applications. Diagnostics (Basel) 2016; 6:diagnostics6030033. [PMID: 27598208 PMCID: PMC5039567 DOI: 10.3390/diagnostics6030033] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 07/27/2016] [Revised: 08/22/2016] [Accepted: 08/23/2016] [Indexed: 11/16/2022]
Abstract
The association of smart mobile devices and lab-on-chip technologies offers unprecedented opportunities for the emergence of direct-to-consumer in vitro medical diagnostics applications. Despite their clear transformative potential, obstacles remain to the large-scale disruption and long-lasting success of these systems in the consumer market. For instance, the increasing level of complexity of instrumented lab-on-chip devices, coupled to the sporadic nature of point-of-care testing, threatens the viability of a business model mainly relying on disposable/consumable lab-on-chips. We argued recently that system evolvability, defined as the design characteristic that facilitates more manageable transitions between system generations via the modification of an inherited design, can help remedy these limitations. In this paper, we discuss how platform-based design can constitute a formal entry point to the design and implementation of evolvable smart device/lab-on-chip systems. We present both a hardware/software design framework and the implementation details of a platform prototype enabling at this stage the interfacing of several lab-on-chip variants relying on current- or impedance-based biosensors. Our findings suggest that several change-enabling mechanisms implemented in the higher abstraction software layers of the system can promote evolvability, together with the design of change-absorbing hardware/software interfaces. Our platform architecture is based on a mobile software application programming interface coupled to a modular hardware accessory. It allows the specification of lab-on-chip operation and post-analytic functions at the mobile software layer. We demonstrate its potential by operating a simple lab-on-chip to carry out the detection of dopamine using various electroanalytical methods.
22
Hall JL, Ryan JJ, Bray BE, Brown C, Lanfear D, Newby LK, Relling MV, Risch NJ, Roden DM, Shaw SY, Tcheng JE, Tenenbaum J, Wang TN, Weintraub WS. Merging Electronic Health Record Data and Genomics for Cardiovascular Research: A Science Advisory From the American Heart Association. Circ Cardiovasc Genet 2016; 9:193-202. [PMID: 26976545 DOI: 10.1161/hcg.0000000000000029] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
The process of scientific discovery is rapidly evolving. The funding climate has influenced a favorable shift in scientific discovery toward the use of existing resources such as the electronic health record. The electronic health record enables long-term outlooks on human health and disease, in conjunction with multidimensional phenotypes that include laboratory data, images, vital signs, and other clinical information. Initial work has confirmed the utility of the electronic health record for understanding mechanisms and patterns of variability in disease susceptibility, disease evolution, and drug responses. The addition of biobanks and genomic data to the information contained in the electronic health record has been demonstrated. The purpose of this statement is to discuss the current challenges in and the potential for merging electronic health record data and genomics for cardiovascular research.
23
Kovatch P, Costa A, Giles Z, Fluder E, Cho HM, Mazurkova S. Big Omics Data Experience. SC ... CONFERENCE PROCEEDINGS. SC (CONFERENCE : SUPERCOMPUTING) 2015; 2015. [PMID: 30788464 DOI: 10.1145/2807591.2807595] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
Abstract
As personalized medicine becomes more integrated into healthcare, the rate at which human genomes are being sequenced is rising quickly together with a concomitant acceleration in compute and storage requirements. To achieve the most effective solution for genomic workloads without re-architecting the industry-standard software, we performed a rigorous analysis of usage statistics, benchmarks and available technologies to design a system for maximum throughput. We share our experiences designing a system optimized for the "Genome Analysis ToolKit (GATK) Best Practices" whole genome DNA and RNA pipeline based on an evaluation of compute, workload and I/O characteristics. The characteristics of genomic-based workloads are vastly different from those of traditional HPC workloads, requiring different configurations of the scheduler and the I/O subsystem to achieve reliability, performance and scalability. By understanding how our researchers and clinicians work, we were able to employ techniques not only to speed up their workflow yielding improved and repeatable performance, but also to make more efficient use of storage and compute resources.
Affiliation(s)
- Patricia Kovatch
- Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Anthony Costa
- Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Zachary Giles
- Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Eugene Fluder
- Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Hyung Min Cho
- Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
- Svetlana Mazurkova
- Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY 10029, 212-241-6500
24
Standish KA, Carland TM, Lockwood GK, Pfeiffer W, Tatineni M, Huang CC, Lamberth S, Cherkas Y, Brodmerkel C, Jaeger E, Smith L, Rajagopal G, Curran ME, Schork NJ. Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies. BMC Bioinformatics 2015; 16:304. [PMID: 26395405 PMCID: PMC4580299 DOI: 10.1186/s12859-015-0736-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 12/13/2014] [Accepted: 09/11/2015] [Indexed: 11/10/2022]
Abstract
MOTIVATION: Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost.
RESULTS: We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study.
CONCLUSIONS: We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging 'big data' problems in biomedical research brought on by the expansion of NGS technologies.
Affiliation(s)
- Kristopher A Standish
- Biomedical Sciences Graduate Program, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA; Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, 92092, CA, USA
- Tristan M Carland
- Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, 92092, CA, USA
- Glenn K Lockwood
- San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA
- Wayne Pfeiffer
- San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA
- Mahidhar Tatineni
- San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA
- C Chris Huang
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA
- Sarah Lamberth
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA
- Yauheniya Cherkas
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA
- Carrie Brodmerkel
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA
- Ed Jaeger
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA; R&D IT, Janssen R&D LLC, Springhouse, PA, USA
- Lance Smith
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA; R&D IT, Janssen R&D LLC, Springhouse, PA, USA
- Gunaretnam Rajagopal
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA; R&D IT, Janssen R&D LLC, Springhouse, PA, USA
- Mark E Curran
- Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA
- Nicholas J Schork
- Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, 92092, CA, USA
25
McNally E, Patterson K. Elizabeth McNally: A Muscular Approach. Circ Res 2015; 117:317-20. [PMID: 26227877 DOI: 10.1161/circresaha.115.307128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
26
Leveraging the power of high performance computing for next generation sequencing data analysis: tricks and twists from a high throughput exome workflow. PLoS One 2015; 10:e0126321. [PMID: 25942438 PMCID: PMC4420499 DOI: 10.1371/journal.pone.0126321] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 11/07/2014] [Accepted: 03/31/2015] [Indexed: 12/26/2022]
Abstract
Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in a rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files.
27
Decap D, Reumers J, Herzeel C, Costanza P, Fostier J. Halvade: scalable sequence analysis with MapReduce. Bioinformatics 2015; 31:2482-8. [PMID: 25819078 PMCID: PMC4514927 DOI: 10.1093/bioinformatics/btv179] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 11/28/2014] [Accepted: 03/23/2015] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Post-sequencing DNA analysis typically consists of read mapping followed by variant calling. Especially for whole genome sequencing, this computational step is very time-consuming, even when using multithreading on a multi-core machine.
RESULTS: We present Halvade, a framework that enables sequencing pipelines to be executed in parallel on a multi-node and/or multi-core compute infrastructure in a highly efficient manner. As an example, a DNA sequencing analysis pipeline for variant calling has been implemented according to the GATK Best Practices recommendations, supporting both whole genome and whole exome sequencing. Using a 15-node computer cluster with 360 CPU cores in total, Halvade processes the NA12878 dataset (human, 100 bp paired-end reads, 50× coverage) in <3 h with very high parallel efficiency. Even on a single, multi-core machine, Halvade attains a significant speedup compared with running the individual tools with multithreading.
Affiliation(s)
- Dries Decap
- Department of Information Technology, Ghent University - iMinds, Gaston Crommenlaan 8 bus 201, 9050 Ghent, Belgium; ExaScience Life Lab, Kapeldreef 75, 3001 Leuven, Belgium
- Joke Reumers
- ExaScience Life Lab, Kapeldreef 75, 3001 Leuven, Belgium; Janssen Research & Development, a division of Janssen Pharmaceutica N.V., 2340 Beerse, Belgium
- Charlotte Herzeel
- ExaScience Life Lab, Kapeldreef 75, 3001 Leuven, Belgium; Imec, Kapeldreef 75, 3001 Leuven, Belgium
- Pascal Costanza
- ExaScience Life Lab, Kapeldreef 75, 3001 Leuven, Belgium; Intel Corporation Belgium
- Jan Fostier
- Department of Information Technology, Ghent University - iMinds, Gaston Crommenlaan 8 bus 201, 9050 Ghent, Belgium; ExaScience Life Lab, Kapeldreef 75, 3001 Leuven, Belgium
28
Kelly BJ, Fitch JR, Hu Y, Corsmeier DJ, Zhong H, Wetzel AN, Nordquist RD, Newsom DL, White P. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics. Genome Biol 2015; 16:6. [PMID: 25600152 PMCID: PMC4333267 DOI: 10.1186/s13059-014-0577-x] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Received: 10/21/2014] [Accepted: 12/23/2014] [Indexed: 12/18/2022]
Abstract
While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, are complex to implement and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.
Affiliation(s)
- Peter White
- Center for Microbial Pathogenesis, The Research Institute at Nationwide Children's Hospital, 700 Children's Drive, Columbus 43205, OH, USA; Department of Pediatrics, College of Medicine, The Ohio State University, Columbus, Ohio, USA