1
Pan W, Niu H, Luo S, Chen L, Wu ZS. Intelligent Reconfiguration-Promoted Cellular Internalization of Core-Shell DNA Nanoprobe Equipped with Successive Dual Stimuli-Responsive Protective Satellites for Amplification Fluorescence Imaging of Tumor Cells. Small (Weinheim an der Bergstrasse, Germany) 2024:e2311388. [PMID: 38282377 DOI: 10.1002/smll.202311388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Indexed: 01/30/2024]
Abstract
Although DNA probes have attracted increasing interest for precise tumor cell identification by imaging intracellular biomarkers, the requirement for commercial transfection reagents, limited targeting ligands, and/or non-biocompatible inorganic nanostructures has hampered clinical translation. To circumvent these shortcomings, a reconfigurable ES-NC (Na+-dependent DNAzyme (E)-based substrate (S) cleavage core/shell DNA nanocluster (NC)) is assembled entirely from DNA strands for precise imaging of cancerous cells in a successive dual-stimuli-responsive manner. This nanoprobe is composed of a strung DNA tetrahedral satellites-based protective (DTP) shell, a parallelly aligned target-responsive sensing (PTS) interlayer, and a hydrophobic cholesterol-packed innermost layer (HCI core). Tetrahedral axial rotation-activated reconfiguration of the DTP shell promotes the exposure of interior hydrophobic moieties, enabling cholesterol-mediated cellular internalization without auxiliary elements. Within cells, over-expressed glutathione triggers the disassembly of the DTP protective shell (first stimulus), facilitating the target-stimulated signal transduction/amplification process (second stimulus). Target miRNA-21 is detected down to 10.6 fM without interference from coexisting miRNAs. Compared with its transfection reagent-mediated counterpart, ES-NC displays a higher imaging ability, resists nuclease degradation, and causes no detectable damage to healthy cells. The blind test demonstrates that ES-NC is suitable for distinguishing cancerous cells from healthy cells, indicating a promising tool for early diagnosis and prediction of cancer.
Affiliation(s)
- Wenhao Pan
- Key Laboratory of Laboratory Medicine, Ministry of Education of China, and Zhejiang Provincial Key Laboratory of Medical Genetics, School of Laboratory Medicine and Life Science, Wenzhou Medical University, Wenzhou, 325035, China
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108, China
- Huimin Niu
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108, China
- Fujian Key Laboratory of Aptamers Technology, The 900th Hospital of Joint Logistics Support Force, Fuzhou, 350025, China
- Shasha Luo
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108, China
- Linhuan Chen
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108, China
- Zai-Sheng Wu
- Key Laboratory of Laboratory Medicine, Ministry of Education of China, and Zhejiang Provincial Key Laboratory of Medical Genetics, School of Laboratory Medicine and Life Science, Wenzhou Medical University, Wenzhou, 325035, China
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108, China
2
Fedorova L, Crossley ER, Mulyar OA, Qiu S, Freeman R, Fedorov A. Profound Non-Randomness in Dinucleotide Arrangements within Ultra-Conserved Non-Coding Elements and the Human Genome. Biology 2023; 12:1125. [PMID: 37627009 PMCID: PMC10452674 DOI: 10.3390/biology12081125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 08/09/2023] [Accepted: 08/11/2023] [Indexed: 08/27/2023]
Abstract
Long human ultra-conserved non-coding elements (UCNEs) do not have any sequence similarity to each other or other characteristics that make them unalterable during vertebrate evolution. We hypothesized that UCNEs have unique dinucleotide (DN) composition and arrangements compared to the rest of the genome. A total of 4272 human UCNE sequences were analyzed computationally and compared with the whole genomes of human, chicken, zebrafish, and fly. Statistical analysis was performed to assess the non-randomness in DN spacing arrangements within the entire human genome and within UCNEs. Significant non-randomness in DN spacing arrangements was observed in the entire human genome. Additionally, UCNEs exhibited distinct patterns in DN arrangements compared to the rest of the genome. Approximately 83% of all DN pairs within UCNEs showed significant (>10%) non-random genomic arrangements at short distances (2-6 nucleotides) relative to each other. At the extremes, DN spacing distances deviated by up to 40% from expected values, and these deviations were frequently associated with GpC, CpG, ApT, and GpG/CpC dinucleotides. The described peculiarities in DN arrangements have persisted for hundreds of millions of years in vertebrates. These distinctive patterns may suggest that UCNEs have specific DNA conformations.
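To make the idea of non-random dinucleotide spacing concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes that "distance" means the offset between the start positions of two dinucleotides and that the expected count comes from a simple independence model based on overall dinucleotide frequencies; the function names `dinucleotide_count` and `spacing_ratio` are hypothetical.

```python
def dinucleotide_count(seq, dn):
    """Count overlapping occurrences of the dinucleotide dn in seq."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dn)


def spacing_ratio(seq, dn_a, dn_b, distance):
    """Observed / expected count of dn_b starting `distance` bases after dn_a.

    Assumption: expected counts follow an independence model in which each
    dinucleotide occurs at its overall frequency in the sequence.
    """
    seq = seq.upper()
    # Number of start positions where both dinucleotides fit in the sequence.
    n_windows = len(seq) - distance - 1
    if n_windows <= 0:
        raise ValueError("sequence too short for this distance")
    observed = sum(
        1
        for i in range(n_windows)
        if seq[i:i + 2] == dn_a and seq[i + distance:i + distance + 2] == dn_b
    )
    n_dn = len(seq) - 1  # total number of dinucleotide positions
    freq_a = dinucleotide_count(seq, dn_a) / n_dn
    freq_b = dinucleotide_count(seq, dn_b) / n_dn
    expected = freq_a * freq_b * n_windows
    return observed / expected if expected > 0 else float("nan")


if __name__ == "__main__":
    toy = "ACGT" * 10  # strongly periodic toy sequence, for illustration only
    for d in range(2, 7):
        print(d, spacing_ratio(toy, "AC", "GT", d))
```

A ratio above 1 means the pair occurs at that spacing more often than the independence model predicts, and a ratio below 1 means less often. A real analysis would also need a significance test and a background model suited to the genome in question.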
Affiliation(s)
- Larisa Fedorova
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Emily R. Crossley
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Oleh A. Mulyar
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Shuhao Qiu
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
- Ryan Freeman
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Alexei Fedorov
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
3
Kim S, Yuan JB, Woods WS, Newton DA, Perez-Pinera P, Song JS. Chromatin structure and context-dependent sequence features control prime editing efficiency. Front Genet 2023; 14:1222112. [PMID: 37456665 PMCID: PMC10344898 DOI: 10.3389/fgene.2023.1222112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 06/16/2023] [Indexed: 07/18/2023]
Abstract
Prime editing (PE) is a highly versatile CRISPR-Cas9 genome editing technique. The current constructs, however, have variable efficiency and may require laborious experimental optimization. This study presents statistical models for learning the salient epigenomic and sequence features of target sites modulating the editing efficiency and provides guidelines for designing optimal PEs. We found that both regional constitutive heterochromatin and local nucleosome occlusion of target sites impede editing, while position-specific G/C nucleotides in the primer-binding site (PBS) and reverse transcription (RT) template regions of PE guide RNA (pegRNA) yield high editing efficiency, especially for short PBS designs. The presence of G/C nucleotides was most critical immediately 5' to the protospacer adjacent motif (PAM) site for all designs. The effects of different last templated nucleotides were quantified and observed to depend on the length of both PBS and RT templates. Our models found AGG to be the preferred PAM and detected a guanine nucleotide four bases downstream of the PAM to facilitate editing, suggesting a hitherto-unrecognized interaction with Cas9. A neural network interpretation method based on nonextensive statistical mechanics further revealed multi-nucleotide preferences, indicating dependency among several bases across pegRNA. Our work clarifies previous conflicting observations and uncovers context-dependent features important for optimizing PE designs.
Affiliation(s)
- Somang Kim
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Jimmy B. Yuan
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Wendy S. Woods
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Destry A. Newton
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Pablo Perez-Pinera
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Department of Biomedical and Translational Sciences, Carle-Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Jun S. Song
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Center for Theoretical Physics, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, United States
- Department of Statistics, Harvard University, Cambridge, MA, United States
4
Ji D, Zhao J, Liu Y, Wei D. Electrical Nanobiosensors for Nucleic Acid Based Diagnostics. J Phys Chem Lett 2023; 14:4084-4095. [PMID: 37125726 DOI: 10.1021/acs.jpclett.3c00495] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/05/2023]
Abstract
Recent advances in nanotechnology have driven successive generations of nucleic acid sensors. Among the various sensing technologies, the electrical nanobiosensor is regarded as one of the most promising routes to rapid, precise, point-of-care nucleic acid based diagnostics. In this Perspective, we introduce recent progress in electrical nanobiosensors for nucleic acid detection. First, the strategies for improving detection performance are summarized, including chemical amplification and electrical amplification. Then, the detection mechanisms of electrical nanobiosensors, such as electrochemical biosensors, field-effect transistors, and photoelectric enhanced biosensors, are illustrated. Their applications in cancer screening, pathogen detection, gene sequencing, and genetic disease diagnosis are also introduced. Finally, challenges and future prospects in clinical application are discussed.
Affiliation(s)
- Daizong Ji
- State Key Laboratory of Molecular Engineering of Polymers, Fudan University, Shanghai 200433, China
- Department of Macromolecular Science, Fudan University, Shanghai 200433, China
- Laboratory of Molecular Materials and Devices, Fudan University, Shanghai 200433, China
- Junhong Zhao
- State Key Laboratory of Molecular Engineering of Polymers, Fudan University, Shanghai 200433, China
- Department of Macromolecular Science, Fudan University, Shanghai 200433, China
- Laboratory of Molecular Materials and Devices, Fudan University, Shanghai 200433, China
- Yunqi Liu
- Laboratory of Molecular Materials and Devices, Fudan University, Shanghai 200433, China
- Institute of Chemistry, Chinese Academy of Science, Beijing 100190, China
- Dacheng Wei
- State Key Laboratory of Molecular Engineering of Polymers, Fudan University, Shanghai 200433, China
- Department of Macromolecular Science, Fudan University, Shanghai 200433, China
- Laboratory of Molecular Materials and Devices, Fudan University, Shanghai 200433, China
5
Kim S, Yuan JB, Woods WS, Newton DA, Perez-Pinera P, Song JS. Chromatin structure and context-dependent sequence features control prime editing efficiency. bioRxiv: the preprint server for biology 2023:2023.04.15.536944. [PMID: 37162994 PMCID: PMC10168420 DOI: 10.1101/2023.04.15.536944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Prime editor (PE) is a highly versatile CRISPR-Cas9 genome editing technique. The current constructs, however, have variable efficiency and may require laborious experimental optimization. This study presents statistical models for learning the salient epigenomic and sequence features of target sites modulating the editing efficiency and provides guidelines for designing optimal PEs. We found that both regional constitutive heterochromatin and local nucleosome occlusion of target sites impede editing, while position-specific G/C nucleotides in the primer binding site (PBS) and reverse transcription (RT) template regions of PE guide-RNA (pegRNA) yield high editing efficiency, especially for short PBS designs. The presence of G/C nucleotides was most critical immediately 5' to the protospacer adjacent motif (PAM) site for all designs. The effects of different last templated nucleotides were quantified and seen to depend on both PBS and RT template lengths. Our models found AGG to be the preferred PAM and detected a guanine nucleotide four bases downstream of PAM to facilitate editing, suggesting a hitherto-unrecognized interaction with Cas9. A neural network interpretation method based on nonextensive statistical mechanics further revealed multi-nucleotide preferences, indicating dependency among several bases across pegRNA. Our work clarifies previous conflicting observations and uncovers context-dependent features important for optimizing PE designs.
6
Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, Fropf R, McAnany C, Gagneur J, Kundaje A, Zeitlinger J. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet 2021; 53:354-366. [PMID: 33603233 PMCID: PMC8812996 DOI: 10.1038/s41588-021-00782-6] [Citation(s) in RCA: 203] [Impact Index Per Article: 67.7] [Received: 07/19/2020] [Accepted: 01/07/2021] [Indexed: 01/30/2023]
Abstract
The arrangement (syntax) of transcription factor (TF) binding motifs is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution chromatin immunoprecipitation (ChIP)-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using clustered regularly interspaced short palindromic repeat (CRISPR)-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.
Affiliation(s)
- Žiga Avsec
- Department of Informatics, Technical University of Munich, Garching, Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Munich, Germany
- Currently at DeepMind, London, UK
- Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Avanti Shrikumar
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Sabrina Krueger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Amr Alexandari
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Khyati Dalal
- Stowers Institute for Medical Research, Kansas City, MO, USA
- The University of Kansas Medical Center, Kansas City, KS, USA
- Robin Fropf
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Charles McAnany
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany
- Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
- Corresponding author
- Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- The University of Kansas Medical Center, Kansas City, KS, USA
- Corresponding author
7
DNA mechanics and its biological impact. J Mol Biol 2021; 433:166861. [PMID: 33539885 DOI: 10.1016/j.jmb.2021.166861] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 11/09/2020] [Revised: 01/26/2021] [Accepted: 01/27/2021] [Indexed: 02/06/2023]
Abstract
Almost all nucleoprotein interactions and DNA manipulation events involve mechanical deformations of DNA. Extraordinary progress in single-molecule, structural, and computational methods has characterized the average mechanical properties of DNA, such as bendability and torsional rigidity, at high resolution. Further, the advent of sequencing technology has permitted high-throughput measurement of how such mechanical properties vary with sequence and epigenetic modifications along genomes. We review these recent technological advancements and discuss how they have contributed to the emerging idea that variations in the mechanical properties of DNA play a fundamental, genome-wide role in regulating diverse processes involved in chromatin organization.
8
Abstract
Mechanical deformations of DNA such as bending are ubiquitous and implicated in diverse cellular functions [1]. However, the lack of high-throughput tools to directly measure the mechanical properties of DNA limits our understanding of whether and how DNA sequences modulate DNA mechanics and associated chromatin transactions genome-wide. We developed an assay called loop-seq to measure the intrinsic cyclizability of DNA (a proxy for DNA bendability) in high throughput. We measured the intrinsic cyclizabilities of 270,806 50 bp DNA fragments that span the entire length of S. cerevisiae chromosome V and other genomic regions, and also include random sequences. We discovered sequence-encoded regions of unusually low bendability upstream of Transcription Start Sites (TSSs). These regions disfavor the sharp DNA bending required for nucleosome formation and are co-centric with known Nucleosome Depleted Regions (NDRs). We show biochemically that low bendability of linker DNA located about 40 bp away from a nucleosome edge inhibits nucleosome sliding into the linker by the chromatin remodeler INO80. The observation explains how INO80 can create promoter-proximal nucleosomal arrays in the absence of any other factors [2] by reading the DNA mechanical landscape. We show that chromosome wide, nucleosomes are characterized by high DNA bendability near dyads and low bendability near the linkers. This contrast increases for nucleosomes deeper into gene bodies, suggesting that DNA mechanics plays a previously unappreciated role in organizing nucleosomes far from the TSS, where nucleosome remodelers predominate. Importantly, random substitution of synonymous codons does not preserve this contrast, suggesting that the evolution of codon choice has been impacted by selective pressure to preserve sequence-encoded mechanical modulations along genes. We also provide evidence that transcription through the TSS-proximal nucleosomes is impacted by local DNA mechanics. Overall, this first genome-scale map of DNA mechanics hints at a 'mechanical code' with broad functional implications.
9
Finnegan AI, Kim S, Jin H, Gapinske M, Woods WS, Perez-Pinera P, Song JS. Epigenetic engineering of yeast reveals dynamic molecular adaptation to methylation stress and genetic modulators of specific DNMT3 family members. Nucleic Acids Res 2020; 48:4081-4099. [PMID: 32187373 PMCID: PMC7192628 DOI: 10.1093/nar/gkaa161] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/29/2019] [Revised: 02/16/2020] [Accepted: 03/13/2020] [Indexed: 12/21/2022]
Abstract
Cytosine methylation is a ubiquitous modification in mammalian DNA generated and maintained by several DNA methyltransferases (DNMTs) with partially overlapping functions and genomic targets. To systematically dissect the factors specifying each DNMT's activity, we engineered combinatorial knock-in of human DNMT genes in Komagataella phaffii, a yeast species lacking endogenous DNA methylation. Time-course expression measurements captured dynamic network-level adaptation of cells to DNMT3B1-induced DNA methylation stress and showed that coordinately modulating the availability of S-adenosyl methionine (SAM), the essential metabolite for DNMT-catalyzed methylation, is an evolutionarily conserved epigenetic stress response, also implicated in several human diseases. Convolutional neural networks trained on genome-wide CpG-methylation data learned distinct sequence preferences of DNMT3 family members. A simulated annealing interpretation method resolved these preferences into individual flanking nucleotides and periodic poly(A) tracts that rotationally position highly methylated cytosines relative to phased nucleosomes. Furthermore, the nucleosome repeat length defined the spatial unit of methylation spreading. Gene methylation patterns were similar to those in mammals, and hypo- and hypermethylation were predictive of increased and decreased transcription relative to control, respectively, in the absence of mammalian readers of DNA methylation. Introducing controlled epigenetic perturbations in yeast thus enabled characterization of fundamental genomic features directing specific DNMT3 proteins.
Affiliation(s)
- Alex I Finnegan
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Somang Kim
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Hu Jin
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Michael Gapinske
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Wendy S Woods
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Pablo Perez-Pinera
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Department of Biomedical and Translational Sciences, Carle-Illinois College of Medicine, University of Illinois, Urbana, IL 61801, USA
- Cancer Center at Illinois, University of Illinois, Urbana, IL 61801, USA
- Jun S Song
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Cancer Center at Illinois, University of Illinois, Urbana, IL 61801, USA
10
Schnepf M, Ludwig C, Bandilla P, Ceolin S, Unnerstall U, Jung C, Gaul U. Sensitive Automated Measurement of Histone-DNA Affinities in Nucleosomes. iScience 2020; 23:100824. [PMID: 31982782 PMCID: PMC6994541 DOI: 10.1016/j.isci.2020.100824] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/03/2019] [Revised: 12/12/2019] [Accepted: 01/06/2020] [Indexed: 11/06/2022]
Abstract
The DNA of eukaryotes is wrapped around histone octamers to form nucleosomes. Although it is well established that the DNA sequence significantly influences nucleosome formation, its precise contribution has remained controversial, partially owing to the lack of quantitative affinity data. Here, we present a method to measure DNA-histone binding free energies at medium throughput and with high sensitivity. Competitive nucleosome formation is achieved through automation, and a modified epifluorescence microscope is used to rapidly and accurately measure the fractions of bound/unbound DNA based on fluorescence anisotropy. The procedure allows us to obtain full titration curves with high reproducibility. We applied this technique to measure the histone-DNA affinities for 47 DNA sequences and analyzed how the affinities correlate with relevant DNA sequence features. We found that the GC content has a significant impact on nucleosome-forming preferences, but 10 bp dinucleotide periodicities and the presence of poly(dA:dT) stretches do not.
- Robotics permits full titration series to measure histone-DNA binding affinities
- Fluorescence anisotropy used as a fast, sensitive readout of bound/unbound DNA
- Free energies span three orders of magnitude, less for naturally occurring sequences
- GC content is a major determinant of measured binding free energies
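Two of the sequence features named in this abstract, GC content and poly(dA:dT) stretches, are simple to compute. The Python sketch below is illustrative only and is not taken from the paper; it assumes a poly(dA:dT) stretch can be approximated as a homopolymer run of A or of T on the given strand, and the function names are hypothetical.

```python
import re


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def longest_poly_da_dt(seq):
    """Length of the longest run of consecutive A's or consecutive T's.

    Assumption: a poly(dA:dT) stretch is treated as a homopolymer run of A
    (or of T, i.e. the same tract read from the opposite strand).
    """
    runs = re.findall(r"A+|T+", seq.upper())
    return max((len(run) for run in runs), default=0)


if __name__ == "__main__":
    example = "GGAAAAACCTTT"  # toy sequence, not from the study
    print(gc_content(example), longest_poly_da_dt(example))
```

In a study like this one, such per-sequence features would then be correlated with the measured binding free energies; that regression step is omitted here.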
Affiliation(s)
- Max Schnepf
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Claudia Ludwig
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Peter Bandilla
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Stefano Ceolin
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Ulrich Unnerstall
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Christophe Jung
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Ulrike Gaul
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
11
Somatic and Germline Mutation Periodicity Follow the Orientation of the DNA Minor Groove around Nucleosomes. Cell 2019; 175:1074-1087.e18. [PMID: 30388444 DOI: 10.1016/j.cell.2018.10.004] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 06/12/2018] [Revised: 08/27/2018] [Accepted: 10/01/2018] [Indexed: 12/11/2022]
Abstract
Mutation rates along the genome are highly variable and influenced by several chromatin features. Here, we addressed how nucleosomes, the most pervasive chromatin structure in eukaryotes, affect the generation of mutations. We discovered that within nucleosomes, the somatic mutation rate across several tumor cohorts exhibits a strong 10 base pair (bp) periodicity. This periodic pattern tracks the alternation of the DNA minor groove facing toward and away from the histones. The strength and phase of the mutation rate periodicity are determined by the mutational processes active in tumors. We uncovered similar periodic patterns in the genetic variation among human and Arabidopsis populations, also detectable in their divergence from close species, indicating that the same principles underlie germline and somatic mutation rates. We propose that differential DNA damage and repair processes dependent on the minor groove orientation in nucleosome-bound DNA contribute to the 10-bp periodicity in AT/CG content in eukaryotic genomes.
12
Skutkova H, Maderankova D, Sedlar K, Jugas R, Vitek M. A degeneration-reducing criterion for optimal digital mapping of genetic codes. Comput Struct Biotechnol J 2019; 17:406-414. [PMID: 30984363 PMCID: PMC6444178 DOI: 10.1016/j.csbj.2019.03.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/17/2018] [Revised: 02/07/2019] [Accepted: 03/15/2019] [Indexed: 01/08/2023]
Abstract
Bioinformatics may seem to be a scientific field that primarily processes large string datasets, as nucleotides and amino acids are represented with dedicated characters. On the other hand, many computational tasks in bioinformatics are mathematical problems that can be understood as operations on digits. In fact, many computational tasks are solved this way in the background. One of the most widely used digital representations maps nucleotides and amino acids to the integers 0–3 and 0–20, respectively. The limitation of this mapping appears when the digital signal of nucleotides has to be translated into a digital signal of amino acids, because the genetic code is degenerate. This causes non-monotonicities in the mapping function. Although a map for reducing this undesirable effect has already been proposed, it is defined theoretically and for standard genetic codes only. In this study, we derived a novel optimality criterion for reducing the influence of degeneration by utilizing a large dataset of real sequences with various genetic codes. As a result, we proposed a new robust global optimal map suitable for any genetic code as well as specialized optimal maps for particular genetic codes.
- Optimization of 1D numerical representation for DNA to protein translation.
- Reducing genetic code degeneracy in numerical representation of DNA sequences.
- More robust numerical conversion used for genomic-proteomic analysis.
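The integer mapping described in this abstract can be illustrated with a short Python sketch. It is not the optimized map proposed in the paper. The nucleotide order T, C, A, G and the alphabetical amino acid order (with the stop symbol last) are arbitrary assumptions chosen for illustration, and the function names are hypothetical. The codon table is the standard genetic code written in T, C, A, G order.

```python
BASES = "TCAG"  # assumed nucleotide-to-integer map: T=0, C=1, A=2, G=3
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Assumed amino-acid-to-integer map: 20 residues alphabetically, stop = 20.
AA_ORDER = "ACDEFGHIKLMNPQRSTVWY*"


def codon_index(codon):
    """Read a codon as a base-4 number in the range 0..63."""
    first, second, third = (BASES.index(base) for base in codon.upper())
    return 16 * first + 4 * second + third


def amino_acid_signal(dna):
    """Translate in-frame codons of dna into amino acid integers (0..20)."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [
        AA_ORDER.index(STANDARD_CODE[codon_index(dna[i:i + 3])])
        for i in range(0, usable, 3)
    ]


def count_descents():
    """Count adjacent codon indices whose amino acid integer decreases."""
    values = [AA_ORDER.index(aa) for aa in STANDARD_CODE]
    return sum(1 for a, b in zip(values, values[1:]) if b < a)


if __name__ == "__main__":
    print(amino_acid_signal("ATGTGGTAA"))  # codons for Met, Trp, stop
    print(count_descents())
```

Every descent counted by `count_descents` is a place where a larger codon number maps to a smaller amino acid number, which is one simple way to see the non-monotonic behaviour the abstract attributes to degeneracy. Different choices of `BASES` and `AA_ORDER` change that count, which is the kind of freedom an optimized map can exploit.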
Affiliation(s)
- Helena Skutkova
- Department of Biomedical Engineering, Brno University of Technology, Technicka 12, 616 00 Brno, Czech Republic
| | - Denisa Maderankova
- Department of Biomedical Engineering, Brno University of Technology, Technicka 12, 616 00 Brno, Czech republic
| | - Karel Sedlar
- Department of Biomedical Engineering, Brno University of Technology, Technicka 12, 616 00 Brno, Czech republic
| | - Robin Jugas
- Department of Biomedical Engineering, Brno University of Technology, Technicka 12, 616 00 Brno, Czech republic
| | - Martin Vitek
- Department of Biomedical Engineering, Brno University of Technology, Technicka 12, 616 00 Brno, Czech republic
| |
Collapse
|
13
|
Abstract
Nucleosomes form the fundamental building blocks of eukaryotic chromatin, and previous attempts to understand the principles governing their genome-wide distribution have spurred much interest and debate in biology. In particular, the precise role of DNA sequence in shaping local chromatin structure has been controversial. This paper rigorously quantifies the contribution of hitherto-debated sequence features, including G+C content, 10.5 bp periodicity, and poly(dA:dT) tracts, to three distinct aspects of the genome-wide nucleosome landscape: occupancy, translational positioning, and rotational positioning. Our computational framework simultaneously learns nucleosome number and nucleosome-positioning energy from genome-wide nucleosome maps. In contrast to previous studies, our model can predict both in vitro and in vivo nucleosome maps in Saccharomyces cerevisiae. We find that although G+C content is the primary determinant of MNase-derived nucleosome occupancy, MNase digestion biases may substantially influence this GC dependence. By contrast, poly(dA:dT) tracts are seen to deter nucleosome formation, regardless of the experimental method used. We further show that the 10.5 bp nucleotide periodicity facilitates rotational but not translational positioning. Applying our method to in vivo nucleosome maps demonstrates that, for a subset of genes, the regularly spaced nucleosome arrays observed around transcription start sites can be partially recapitulated by DNA sequence alone. Finally, in vivo nucleosome occupancy derived from MNase-seq experiments around transcription termination sites can be mostly explained by the genomic sequence. Implications of these results and potential extensions of the proposed computational framework are discussed.
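The three sequence features named in this abstract can each be computed from a raw DNA string. The sketch below is only an illustration of what such features look like; it is not the authors' model. The function names, the minimum tract length of 5, and the choice of AA/TT/TA dinucleotides for the periodicity signal are assumptions introduced here.

```python
import cmath
import re


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def poly_da_dt_tracts(seq, min_len=5):
    """Half-open (start, end) spans of A-only or T-only runs of at least min_len bases."""
    pattern = r"A{%d,}|T{%d,}" % (min_len, min_len)
    return [(m.start(), m.end()) for m in re.finditer(pattern, seq.upper())]


def periodicity_amplitude(seq, dinucs=("AA", "TT", "TA"), period=10.5):
    """Normalized Fourier amplitude of dinucleotide occurrences at the given period."""
    seq = seq.upper()
    n_positions = len(seq) - 1
    if n_positions <= 0:
        return 0.0
    total = sum(
        cmath.exp(-2j * cmath.pi * i / period)
        for i in range(n_positions)
        if seq[i:i + 2] in dinucs
    )
    return abs(total) / n_positions


if __name__ == "__main__":
    example = "GCAAAAAGCTTTTTTGC"
    print(gc_content(example), poly_da_dt_tracts(example), periodicity_amplitude(example))
```

The amplitude lies between 0 and 1 by construction, so values from sequences of different lengths can be compared directly.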
Affiliation(s)
- Hu Jin
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Alex I. Finnegan
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Jun S. Song
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, IL 61801
|
14
|
|
15
|
Finnegan A, Song JS. Maximum entropy methods for extracting the learned features of deep neural networks. PLoS Comput Biol 2017; 13:e1005836. [PMID: 29084280 PMCID: PMC5679649 DOI: 10.1371/journal.pcbi.1005836] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 04/18/2017] [Revised: 11/09/2017] [Accepted: 10/23/2017] [Indexed: 11/19/2022] Open
Abstract
New architectures of multilayer artificial neural networks and new methods for training them are rapidly revolutionizing the application of machine learning in diverse fields, including business, social science, physical sciences, and biology. Interpreting deep neural networks, however, currently remains elusive, and a critical challenge lies in understanding which meaningful features a network is actually learning. We present a general method for interpreting deep neural networks and extracting network-learned features from input data. We describe our algorithm in the context of biological sequence analysis. Our approach, based on ideas from statistical physics, samples from the maximum entropy distribution over possible sequences, anchored at an input sequence and subject to constraints implied by the empirical function learned by a network. Using our framework, we demonstrate that local transcription factor binding motifs can be identified from a network trained on ChIP-seq data and that nucleosome positioning signals are indeed learned by a network trained on chemical cleavage nucleosome maps. Imposing a further constraint on the maximum entropy distribution also allows us to probe whether a network is learning global sequence features, such as the high GC content in nucleosome-rich regions. This work thus provides valuable mathematical tools for interpreting and extracting learned features from feed-forward neural networks.
Deep learning is a state-of-the-art reformulation of artificial neural networks, which have a long history of development. It can perform superbly well in diverse automated classification and prediction problems, including handwriting recognition, image identification, and biological pattern recognition. Its modern success can be attributed to improved training algorithms, clever network architecture, rapid explosion of available data, and advanced computing power, all of which have allowed a great expansion in the number of unknown parameters to be estimated by the model. These parameters, however, are so intricately connected through highly nonlinear functions that it has been difficult to interpret which essential features of given data a deep neural network actually uses for its excellent performance. We address this problem by using ideas from statistical physics to sample new unseen data that are likely to behave similarly to original data points when passed through the trained network. This synthetic data cloud around each original data point retains informative features while averaging out nonessential ones, ultimately allowing us to extract important network-learned features from the original data set and thus improving the human interpretability of deep learning methods. We demonstrate how our method can be applied to biological sequence analysis.
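The idea of sampling a cloud of sequences around an input, described in this abstract, can be sketched with a Metropolis sampler. This is a toy illustration under stated assumptions and not the authors' algorithm: `toy_network` is a hypothetical stand-in for a trained network (it simply returns the GC fraction), the constraint is taken to be the absolute deviation of that scalar output from its value at the anchor sequence, and `beta` plays the role of the Lagrange multiplier that sets how tightly samples stay near the anchor.

```python
import math
import random

ALPHABET = "ACGT"


def toy_network(seq):
    """Hypothetical stand-in for a trained network: returns the GC fraction."""
    return sum(ch in "GC" for ch in seq) / len(seq)


def sample_maxent(x0, f, beta=50.0, n_steps=2000, seed=0):
    """Metropolis sampling from p(x) proportional to exp(-beta * |f(x) - f(x0)|)."""
    rng = random.Random(seed)
    target = f(x0)
    current = list(x0)
    energy = 0.0  # the chain starts at the anchor, where the deviation is zero
    samples = []
    for _ in range(n_steps):
        pos = rng.randrange(len(current))
        old_base = current[pos]
        current[pos] = rng.choice(ALPHABET)  # symmetric single-base proposal
        delta = abs(f("".join(current)) - target) - energy
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            energy += delta  # accept the proposed sequence
        else:
            current[pos] = old_base  # reject and restore the previous base
        samples.append("".join(current))
    return samples


if __name__ == "__main__":
    anchor = "ACGTGGCCATATGCGCAATT"
    cloud = sample_maxent(anchor, toy_network)
    print(cloud[-1], toy_network(cloud[-1]))
```

Positions that vary freely across the sampled cloud are ones the function does not depend on, while features that stay fixed are candidates for what the function has learned; with this toy function only the overall GC fraction is preserved.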
Affiliation(s)
- Alex Finnegan
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, Illinois, United States of America
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois, United States of America
- Jun S. Song
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, Illinois, United States of America
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois, United States of America
|
16
|
Voong LN, Xi L, Wang JP, Wang X. Genome-wide Mapping of the Nucleosome Landscape by Micrococcal Nuclease and Chemical Mapping. Trends Genet 2017; 33:495-507. [PMID: 28693826 DOI: 10.1016/j.tig.2017.05.007] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/02/2017] [Revised: 05/10/2017] [Accepted: 05/30/2017] [Indexed: 12/30/2022]
Abstract
Nucleosomes regulate the transcriptional output of the genome by occluding the underlying DNA sequences from the DNA-binding proteins that must act on them. Knowledge of the precise locations of nucleosomes in the genome is thus essential to understanding how transcription is regulated. Current nucleosome-mapping strategies involve digesting chromatin with nucleases or chemical cleavage, followed by high-throughput sequencing. In this review, we compare the traditional micrococcal nuclease (MNase)-based approach with a chemical cleavage strategy and discuss the important insights each has uncovered about the role of nucleosomes in shaping transcriptional processes.
Affiliation(s)
- Lilien N Voong
- Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- Liqun Xi
- Department of Statistics, Northwestern University, Evanston, IL 60208, USA
- Ji-Ping Wang
- Department of Statistics, Northwestern University, Evanston, IL 60208, USA
- Xiaozhong Wang
- Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
|
17
|
Logie C, Stunnenberg HG. Epigenetic memory: A macrophage perspective. Semin Immunol 2016; 28:359-67. [DOI: 10.1016/j.smim.2016.06.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 02/09/2016] [Revised: 06/16/2016] [Accepted: 06/23/2016] [Indexed: 01/02/2023]
|
18
|
Multiplexing Genetic and Nucleosome Positioning Codes: A Computational Approach. PLoS One 2016; 11:e0156905. [PMID: 27272176 PMCID: PMC4896621 DOI: 10.1371/journal.pone.0156905] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 12/22/2015] [Accepted: 05/20/2016] [Indexed: 11/19/2022] Open
Abstract
Eukaryotic DNA is strongly bent inside its fundamental packaging units, the nucleosomes. It is known that their positions are strongly influenced by the mechanical properties of the underlying DNA sequence. Here we discuss the possibility that these mechanical properties and the concomitant nucleosome positions are not just a side product of the given DNA sequence, e.g. that of the genes, but that a mechanical evolution of DNA molecules might have taken place. We first demonstrate the possibility of multiplexing classical and mechanical genetic information using a computational nucleosome model. In a second step, we give evidence for genome-wide multiplexing in Saccharomyces cerevisiae and Schizosaccharomyces pombe. This suggests that the exact positions of nucleosomes play crucial roles in chromatin function.
|