1
|
Sukys A, Grima R. Cell-cycle dependence of bursty gene expression: insights from fitting mechanistic models to single-cell RNA-seq data. Nucleic Acids Res 2025; 53:gkaf295. [PMID: 40240003 PMCID: PMC12000877 DOI: 10.1093/nar/gkaf295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2024] [Revised: 03/22/2025] [Accepted: 03/28/2025] [Indexed: 04/18/2025] Open
Abstract
Bursty gene expression is characterized by two intuitive parameters, burst frequency and burst size, the cell-cycle dependence of which has not been extensively profiled at the transcriptome level. In this study, we estimate the burst parameters per allele in the G1 and G2/M cell-cycle phases for thousands of mouse genes by fitting mechanistic models of gene expression to messenger RNA count data, obtained by sequencing of single cells whose cell-cycle position has been inferred using a deep-learning method. We find that upon DNA replication, the median burst frequency approximately halves, while the burst size remains mostly unchanged. Genome-wide distributions of the burst parameter ratios between the G2/M and G1 phases are broad, indicating substantial heterogeneity in transcriptional regulation. We also observe a significant negative correlation between the burst frequency and size ratios, suggesting that regulatory processes do not independently control the burst parameters. We show that to accurately estimate the burst parameter ratios, mechanistic models must explicitly account for gene copy number variation and extrinsic noise due to the coupling of transcription to cell age across the cell cycle, but corrections for technical noise due to imperfect capture of RNA molecules in sequencing experiments are less critical.
Collapse
Affiliation(s)
- Augustinas Sukys
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
- The Alan Turing Institute, London NW1 2DB, United Kingdom
- School of BioSciences, University of Melbourne, Parkville, Victoria 3052, Australia
| | - Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
| |
Collapse
|
2
|
Sullivan D, Hjörleifsson K, Swarna N, Oakes C, Holley G, Melsted P, Pachter L. Accurate quantification of nascent and mature RNAs from single-cell and single-nucleus RNA-seq. Nucleic Acids Res 2025; 53:gkae1137. [PMID: 39657125 PMCID: PMC11724275 DOI: 10.1093/nar/gkae1137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2024] [Revised: 10/28/2024] [Accepted: 12/05/2024] [Indexed: 12/14/2024] Open
Abstract
In single-cell and single-nucleus RNA sequencing (RNA-seq), the coexistence of nascent (unprocessed) and mature (processed) messenger RNA (mRNA) poses challenges in accurate read mapping and the interpretation of count matrices. The traditional transcriptome reference, defining the "region of interest" in bulk RNA-seq, restricts its focus to mature mRNA transcripts. This restriction leads to two problems: reads originating outside of the "region of interest" are prone to mismapping within this region, and additionally, such external reads cannot be matched to specific transcript targets. Expanding the "region of interest" to encompass both nascent and mature mRNA transcript targets provides a more comprehensive framework for RNA-seq analysis. Here, we introduce the concept of distinguishing flanking k-mers (DFKs) to improve mapping of sequencing reads. We have developed an algorithm to identify DFKs, which serve as a sophisticated "background filter", enhancing the accuracy of mRNA quantification. This dual strategy of an expanded region of interest coupled with the use of DFKs enhances the precision in quantifying both mature and nascent mRNA molecules, as well as in delineating reads of ambiguous status.
Collapse
Affiliation(s)
- Delaney K Sullivan
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, 885 Tiverton Drive, Los Angeles, CA 90095, USA
| | - Kristján Eldjárn Hjörleifsson
- Department of Computing and Mathematical Sciences, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
| | - Nikhila P Swarna
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
| | - Conrad Oakes
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
| | - Guillaume Holley
- deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavík, Iceland
| | - Páll Melsted
- deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavík, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Sæmundargata 2, 102 Reykjavík, Iceland
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
- Department of Computing and Mathematical Sciences, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
| |
Collapse
|
3
|
Fang M, Gorin G, Pachter L. Trajectory inference from single-cell genomics data with a process time model. PLoS Comput Biol 2025; 21:e1012752. [PMID: 39836699 PMCID: PMC11760028 DOI: 10.1371/journal.pcbi.1012752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2024] [Revised: 01/24/2025] [Accepted: 12/25/2024] [Indexed: 01/23/2025] Open
Abstract
Single-cell transcriptomics experiments provide gene expression snapshots of heterogeneous cell populations across cell states. These snapshots have been used to infer trajectories and dynamic information even without intensive, time-series data by ordering cells according to gene expression similarity. However, while single-cell snapshots sometimes offer valuable insights into dynamic processes, current methods for ordering cells are limited by descriptive notions of "pseudotime" that lack intrinsic physical meaning. Instead of pseudotime, we propose inference of "process time" via a principled modeling approach to formulating trajectories and inferring latent variables corresponding to timing of cells subject to a biophysical process. Our implementation of this approach, called Chronocell, provides a biophysical formulation of trajectories built on cell state transitions. The Chronocell model is identifiable, making parameter inference meaningful. Furthermore, Chronocell can interpolate between trajectory inference, when cell states lie on a continuum, and clustering, when cells cluster into discrete states. By using a variety of datasets ranging from cluster-like to continuous, we show that Chronocell enables us to assess the suitability of datasets and reveals distinct cellular distributions along process time that are consistent with biological process times. We also compare our parameter estimates of degradation rates to those derived from metabolic labeling datasets, thereby showcasing the biophysical utility of Chronocell. Nevertheless, based on performance characterization on simulations, we find that process time inference can be challenging, highlighting the importance of dataset quality and careful model assessment.
Collapse
Affiliation(s)
- Meichen Fang
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
| | - Gennady Gorin
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, United States of America
| |
Collapse
|
4
|
Gao Y, Liu Q. Delineating cell types with transcriptional kinetics. NATURE COMPUTATIONAL SCIENCE 2024; 4:657-658. [PMID: 39317761 DOI: 10.1038/s43588-024-00691-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/26/2024]
Affiliation(s)
- Yicheng Gao
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, China
| | - Qi Liu
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China.
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, China.
| |
Collapse
|