1
Chan AP, Choi Y, Rangan A, Zhang G, Podder A, Berens M, Sharma S, Pirrotte P, Byron S, Duggan D, Schork NJ. Interrogating the Human Diplome: Computational Methods, Emerging Applications, and Challenges. Methods Mol Biol 2023; 2590:1-30. [PMID: 36335489 DOI: 10.1007/978-1-0716-2819-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2023]
Abstract
Human DNA sequencing protocols have revolutionized human biology, biomedical science, and clinical practice, but still have very important limitations. One limitation is that most protocols do not separate or assemble (i.e., "phase") the nucleotide content of each of the maternally and paternally derived chromosomal homologs making up the 22 autosomal pairs and the chromosomal pair making up the pseudo-autosomal region of the sex chromosomes. This has led to a dearth of studies and a consequent underappreciation of many phenomena of fundamental importance to basic and clinical genomic science. We discuss a few protocols for obtaining phase information as well as their limitations, including those that could be used in tumor phasing settings. We then describe a number of biological and clinical phenomena that require phase information. These include phenomena that require precise knowledge of the nucleotide sequence in a chromosomal segment from germline or somatic cells, such as DNA binding events, and insight into unique cis- vs. trans-acting functionally impactful variant combinations (for example, variants implicated in a phenotype governed by compound heterozygosity). In addition, we comment on the need for reliable and consensus-based diploid-context computational workflows for variant identification as well as the need for laboratory-based functional verification strategies for validating cis vs. trans effects of variant combinations. We also briefly describe available resources, example studies, and areas of further research, and ultimately argue that the science behind the study of human diploidy, referred to as "diplomics," which will be enabled by nucleotide-level resolution of phased genomes, is a logical next step in the analysis of human genome biology.
Affiliation(s)
- Agnes P Chan
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Yongwook Choi
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Aditya Rangan
  - Courant Institute of Mathematical Sciences at New York University, New York, NY, USA
- Guangfa Zhang
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Avijit Podder
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Michael Berens
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Sunil Sharma
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Patrick Pirrotte
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Sara Byron
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Dave Duggan
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Nicholas J Schork
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
2
Hoehe MR, Herwig R. Analysis of 1276 Haplotype-Resolved Genomes Allows Characterization of Cis- and Trans-Abundant Genes. Methods Mol Biol 2023; 2590:237-272. [PMID: 36335503 DOI: 10.1007/978-1-0716-2819-5_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Many methods for haplotyping have materialized, but their application on a significant scale has been rare to date. Here we summarize analyses that were carried out in 1092 genomes from the 1000 Genomes Consortium and validated in an unprecedented number of 184 PGP genomes that have been experimentally haplotype-resolved by application of the Long-Fragment Read (LFR) technology. These analyses provided first insights into the diplotypic nature of human genomes and its potential functional implications. Thus, protein-changing variants were not randomly distributed between the two homologues of 18,121 autosomal protein-coding genes but occurred significantly more frequently in cis than in trans configurations in virtually each of the 1276 phased genomes. This resulted in global cis/trans ratios of ~60:40, establishing "cis abundance" as a universal characteristic of diploid human genomes. This phenomenon was based on two different classes of genes, a larger one exhibiting cis configurations of protein-changing variants in excess, so-called "cis-abundant" genes, and a smaller one of "trans-abundant" genes. These two gene classes, which together constitute a common diplotypic exome, were further functionally distinguished by means of gene ontology (GO) and pathway enrichment analysis. Moreover, they were distinguishable in terms of their effects on the human interactome, where they constitute distinct cis and trans modules, as shown with network propagation on a large integrated protein-protein interaction network. These analyses, recently performed with updated database and analysis tools, further consolidated the characterization of cis- and trans-abundant genes while expanding previous results. In this chapter, we present the key results along with the materials and methods to motivate readers to investigate these findings independently and gain further insights into the diplotypic nature of genes and genomes.
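As a rough illustration of the cis/trans distinction summarized in this abstract (and not the authors' analysis pipeline), the sketch below classifies a pair of phased heterozygous variants as cis when both alternate alleles sit on the same homologue and as trans otherwise, then reports the cis fraction over a set of pairs. The VCF-style genotype strings (`0|1`, `1|0`), the function names `configuration` and `cis_fraction`, and the toy input are all assumptions made for this example.

```python
from collections import Counter


def configuration(gt_a: str, gt_b: str) -> str:
    """Classify two phased heterozygous genotypes as 'cis' or 'trans'.

    Assumes VCF-style phased genotype strings such as '0|1' or '1|0'
    (an assumption of this sketch, not something the abstract specifies).
    """
    hap_a = gt_a.split("|")
    hap_b = gt_b.split("|")
    for hap in (hap_a, hap_b):
        # Reject unphased ('0/1'), homozygous, or multi-allelic genotypes.
        if len(hap) != 2 or set(hap) != {"0", "1"}:
            raise ValueError("expected a phased heterozygous genotype such as '0|1'")
    # Same haplotype order means both alternate alleles share one homologue.
    return "cis" if hap_a == hap_b else "trans"


def cis_fraction(pairs) -> float:
    """Return the fraction of variant pairs that are in cis."""
    counts = Counter(configuration(a, b) for a, b in pairs)
    total = counts["cis"] + counts["trans"]
    return counts["cis"] / total if total else float("nan")


# Hypothetical toy input: five variant pairs, three cis and two trans.
toy_pairs = [
    ("0|1", "0|1"),
    ("1|0", "1|0"),
    ("0|1", "1|0"),
    ("1|0", "1|0"),
    ("0|1", "1|0"),
]
print(cis_fraction(toy_pairs))  # 0.6
```

The toy input is made up and only chosen so that the result resembles a 60:40 split; the per-gene and per-genome aggregation described in the chapter is more involved than this pairwise check.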
Affiliation(s)
- Margret R Hoehe
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Ralf Herwig
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
3
Leung AWS, Leung HCM, Wong CL, Zheng ZX, Lui WW, Luk HM, Lo IFM, Luo R, Lam TW. ECNano: A cost-effective workflow for target enrichment sequencing and accurate variant calling on 4800 clinically significant genes using a single MinION flowcell. BMC Med Genomics 2022; 15:43. [PMID: 35246132 PMCID: PMC8895767 DOI: 10.1186/s12920-022-01190-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/26/2021] [Accepted: 02/22/2022] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND The application of long-read sequencing using the Oxford Nanopore Technologies (ONT) MinION sequencer is getting more diverse in the medical field. Having a high sequencing error of ONT and limited throughput from a single MinION flowcell, however, limits its applicability for accurate variant detection. Medical exome sequencing (MES) targets clinically significant exon regions, allowing rapid and comprehensive screening of pathogenic variants. By applying MES with MinION sequencing, the technology can achieve a more uniform capture of the target regions, shorter turnaround time, and lower sequencing cost per sample.
METHOD We introduced a cost-effective optimized workflow, ECNano, comprising a wet-lab protocol and bioinformatics analysis, for accurate variant detection at 4800 clinically important genes and regions using a single MinION flowcell. The ECNano wet-lab protocol was optimized to perform long-read target enrichment and ONT library preparation to stably generate high-quality MES data with adequate coverage. The subsequent variant-calling workflow, Clair-ensemble, adopted a fast RNN-based variant caller, Clair, and was optimized for target enrichment data. To evaluate its performance and practicality, ECNano was tested on both reference DNA samples and patient samples.
RESULTS ECNano achieved deep on-target depth of coverage (DoC) at average > 100× and > 98% uniformity using one MinION flowcell. For accurate ONT variant calling, the generated reads sufficiently covered 98.9% of pathogenic positions listed in ClinVar, with 98.96% having at least 30× DoC. ECNano obtained an average read length of 1000 bp. The long reads of ECNano also covered the adjacent splice sites well, with 98.5% of positions having ≥ 30× DoC. Clair-ensemble achieved > 99% recall and accuracy for SNV calling. The whole workflow from wet-lab protocol to variant detection was completed within three days.
CONCLUSION We presented ECNano, an out-of-the-box workflow comprising (1) a wet-lab protocol for ONT target enrichment sequencing and (2) a downstream variant detection workflow, Clair-ensemble. The workflow is cost-effective, with a short turnaround time for high accuracy variant calling in 4800 clinically significant genes and regions using a single MinION flowcell. The long-read exon captured data has potential for further development, promoting the application of long-read sequencing in personalized disease treatment and risk prediction.
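The depth-of-coverage (DoC) figures quoted in the RESULTS can be read as simple summaries over per-position read depths. The sketch below is a minimal illustration under that assumption, not the ECNano or Clair-ensemble code: the function name `coverage_summary`, the `min_depth` parameter, and the example depths are hypothetical, and the list of depths is assumed to have been extracted beforehand for the target positions of interest.

```python
def coverage_summary(depths, min_depth=30):
    """Return (mean depth, fraction of positions with depth >= min_depth).

    `depths` is a non-empty sequence of per-position read depths; how it is
    obtained is outside the scope of this sketch.
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    mean_depth = sum(depths) / len(depths)
    # Count positions that reach the chosen depth threshold.
    covered = sum(1 for d in depths if d >= min_depth)
    return mean_depth, covered / len(depths)


# Hypothetical depths at four target positions.
mean_depth, frac_30x = coverage_summary([120, 95, 30, 29])
print(mean_depth, frac_30x)  # 68.5 0.75
```

The abstract does not define its uniformity metric, so that statistic is deliberately left out here.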
Affiliation(s)
- Amy Wing-Sze Leung
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Chak-Lim Wong
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Zhen-Xian Zheng
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Wui-Wang Lui
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Ho-Ming Luk
  - Department of Health, Clinical Genetic Service, Hong Kong, SAR, China
- Ivan Fai-Man Lo
  - Department of Health, Clinical Genetic Service, Hong Kong, SAR, China
- Ruibang Luo
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Tak-Wah Lam
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
4
Hall JE, Lawrence ES, Simonson TS, Fox K. Seq-ing Higher Ground: Functional Investigation of Adaptive Variation Associated With High-Altitude Adaptation. Front Genet 2020; 11:471. [PMID: 32528523 PMCID: PMC7247851 DOI: 10.3389/fgene.2020.00471] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/16/2020] [Accepted: 04/16/2020] [Indexed: 12/21/2022] [Open Access]
Abstract
Human populations at high altitude exhibit both unique physiological responses and strong genetic signatures of selection thought to compensate for the decreased availability of oxygen in each breath of air. With the increased availability of genomic information from Tibetans, Andeans, and Ethiopians, much progress has been made to elucidate genetic adaptations to chronic hypoxia that have occurred throughout hundreds of generations in these populations. In this perspectives piece, we discuss specific hypoxia-pathway variants that have been identified in high-altitude populations and methods for functional investigation, which may be used to determine the underlying causal factors that afford adaptation to high altitude.
Affiliation(s)
- James E. Hall
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, School of Medicine, University of California, San Diego, La Jolla, CA, United States
- Elijah S. Lawrence
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, School of Medicine, University of California, San Diego, La Jolla, CA, United States
- Tatum S. Simonson
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, School of Medicine, University of California, San Diego, La Jolla, CA, United States
- Keolu Fox
  - Department of Anthropology and Global Health, University of California, San Diego, La Jolla, CA, United States