1
Zhang JX, Yordanov B, Gaunt A, Wang MX, Dai P, Chen YJ, Zhang K, Fang JZ, Dalchau N, Li J, Phillips A, Zhang DY. A deep learning model for predicting next-generation sequencing depth from DNA sequence. Nat Commun 2021; 12:4387. [PMID: 34282137 PMCID: PMC8290051 DOI: 10.1038/s41467-021-24497-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/23/2020] [Accepted: 06/17/2021] [Indexed: 11/29/2022]
Abstract
Targeted high-throughput DNA sequencing is a primary approach for genomics and molecular diagnostics, and has more recently been used as a readout for DNA information storage. Oligonucleotide probes used to enrich gene loci of interest have different hybridization kinetics, resulting in non-uniform coverage that increases sequencing costs and decreases sequencing sensitivities. Here, we present a deep learning model (DLM) for predicting Next-Generation Sequencing (NGS) depth from DNA probe sequences. Our DLM includes a bidirectional recurrent neural network that takes as input both the DNA nucleotide identities and the calculated probability of each nucleotide being unpaired. We apply our DLM to three different NGS panels: a 39,145-plex panel for human single nucleotide polymorphisms (SNP), a 2000-plex panel for human long non-coding RNA (lncRNA), and a 7373-plex panel targeting non-human sequences for DNA information storage. In cross-validation, our DLM predicts sequencing depth to within a factor of 3 with 93% accuracy for the SNP panel, and 99% accuracy for the non-human panel. In independent testing, the DLM predicts the lncRNA panel with 89% accuracy when trained on the SNP panel. The same model is also effective at predicting the measured single-plex kinetic rate constants of DNA hybridization and strand displacement.
DNA probes used in next generation sequencing (NGS) have variable hybridisation kinetics, resulting in non-uniform coverage. Here, the authors develop a deep learning model to predict NGS depth using DNA probe sequences and apply it to human and non-human sequencing panels.
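One way to read the "within a factor of 3" accuracy quoted in this abstract is as the fraction of probes whose predicted and observed depths differ by at most a 3-fold ratio in either direction. The sketch below illustrates that reading only; the function name `within_fold_accuracy` and the sample depths are hypothetical, and the paper's exact definition may differ.

```python
def within_fold_accuracy(predicted, observed, fold=3.0):
    """Fraction of probes whose predicted depth is within `fold`-fold of the observed depth.

    Assumption: "within a factor of 3" means max(p, o) / min(p, o) <= 3.
    All depths must be strictly positive.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    hits = 0
    for p, o in zip(predicted, observed):
        if p <= 0 or o <= 0:
            raise ValueError("depths must be strictly positive")
        # Symmetric fold difference: the same test for over- and under-prediction.
        if max(p, o) / min(p, o) <= fold:
            hits += 1
    return hits / len(predicted)


# Hypothetical depths for four probes (illustrative values, not data from the paper).
predicted_depth = [100, 50, 1000, 10]
observed_depth = [120, 200, 900, 10]
accuracy = within_fold_accuracy(predicted_depth, observed_depth)
```

Here the second probe is under-predicted 4-fold, so it falls outside the 3-fold window and three of the four probes count as correct.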
Affiliation(s)
- Jinny X Zhang
- Department of Bioengineering, Rice University, Houston, TX, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- Boyan Yordanov
- Microsoft Research, Cambridge, UK; Scientific Technologies, London, UK
- Michael X Wang
- Department of Bioengineering, Rice University, Houston, TX, USA
- Peng Dai
- Department of Bioengineering, Rice University, Houston, TX, USA
- Kerou Zhang
- Department of Bioengineering, Rice University, Houston, TX, USA
- John Z Fang
- Department of Bioengineering, Rice University, Houston, TX, USA
- Jiaming Li
- Department of Bioengineering, Rice University, Houston, TX, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- David Yu Zhang
- Department of Bioengineering, Rice University, Houston, TX, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
2
Chadwick BP. Characterization of chromatin at structurally abnormal inactive X chromosomes reveals potential evidence of a rare hybrid active and inactive isodicentric X chromosome. Chromosome Res 2019; 28:155-169. [PMID: 31776830 DOI: 10.1007/s10577-019-09621-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/22/2019] [Revised: 11/12/2019] [Accepted: 11/14/2019] [Indexed: 02/07/2023]
Abstract
X chromosome structural abnormalities are relatively common in Turner syndrome patients, in particular X isochromosomes. Reports over the last five decades examining asynchronous DNA replication between the normal X and isochromosome have clearly established that the structurally abnormal chromosome is the inactive X chromosome (Xi). Here, the organization of chromatin at a deleted X chromosome, an Xq isochromosome, and two isodicentric chromosomes was examined. Consistent with previous differential staining methods, at interphase, the X isochromosome and isodicentric X chromosomes frequently formed bipartite Barr bodies, observed by fluorescence microscopy using numerous independent bona fide markers of Xi heterochromatin. At metaphase, with the exception of the pseudoautosomal region and the duplicated locus of the macrosatellite DXZ4 (if present on the abnormal X chromosome based on break points), euchromatin markers were absent from the Xi, whereas histone variant macroH2A formed reproducible banded mirror-image chromosomes. Unexpectedly, the isodicentric chromosome in 46,X,idic(X)(q28) cells, which carry a near full-length q-arm-to-q-arm fused chromosome, showed at interphase very rare instances of Xi chromatin bodies that were separated by large distances in the nucleus. Further examination using immunofluorescence and FISH supports the possibility that these rare cells may represent ones in which one half of the isodicentric chromosome is active and the other half is inactive.
Affiliation(s)
- Brian P Chadwick
- Department of Biological Science, Florida State University, 319 Stadium Drive, King 3076, Tallahassee, FL, 32306-4295, USA.
4
Dalton P, Coppin B, James R, Skuse D, Jacobs P. Three patients with a 45,X/46,X,psu dic(Xp) karyotype. J Med Genet 1998; 35:519-24. [PMID: 9643298 PMCID: PMC1051351 DOI: 10.1136/jmg.35.6.519] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Indexed: 11/04/2022]
Abstract
Few cases of isochromosomes for the short arm of the X have been reported and all are dicentric with variable portions of the long arms interposed between the two centromeres. This paper reports three cases of complete short arm duplication of one X chromosome in unrelated female patients. All patients also have a 45,X cell line and present with some characteristic features of Turner syndrome. We used conventional cytogenetics, in situ hybridisation, and molecular genetics to describe all three structurally abnormal chromosomes and the parental origin of two of them. We briefly discuss the "inactivation enhancement" theory; however, any genotype-phenotype correlation is complicated by the presence of the 45,X cell line.
Affiliation(s)
- P Dalton
- Wessex Regional Genetics Laboratory, Salisbury District Hospital, Wiltshire, UK
5
Earnshaw WC, Migeon BR. Three related centromere proteins are absent from the inactive centromere of a stable isodicentric chromosome. Chromosoma 1985; 92:290-6. [PMID: 2994966 DOI: 10.1007/bf00329812] [Citation(s) in RCA: 195] [Impact Index Per Article: 5.0] [Indexed: 01/03/2023]
Abstract
We developed an aqueous spreading procedure that permits simultaneous analysis of human chromosomes by Q-banding and indirect immunofluorescence. Using this methodology and anticentromere antibodies from an autoimmune patient we compared the active and inactive centromeres of an isodicentric X chromosome. We show that a family of structurally related human centromere proteins (CENP-A, CENP-B, and CENP-C) is detectable only at the active centromere. These antigens therefore may be regarded both as morphological and functional markers for active centromeres.