Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ono Y, Asai K, Hamada M. PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores. Bioinformatics 2021;37:589-595. [PMID: 32976553 PMCID: PMC8097687 DOI: 10.1093/bioinformatics/btaa835] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 06/22/2020] [Revised: 08/20/2020] [Accepted: 09/11/2020] [Indexed: 12/21/2022]

Cited by Other Article(s)

Harary Y, Snapir P, Tov SS, Kruphman C, Rechef E, Jahshan Z, Garzon E, Yavits L. GCOC: A Genome Classifier-On-Chip Based on Similarity Search Content Addressable Memory. IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS 2025;19:484-495. [PMID: 39196751 DOI: 10.1109/tbcas.2024.3449788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/30/2024]

Marini S, Barquero A, Wadhwani AA, Bian J, Ruiz J, Boucher C, Prosperi M. OCTOPUS: Disk-based, Multiplatform, Mobile-friendly Metagenomics Classifier. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2025;2024:798-807. [PMID: 40417475 PMCID: PMC12099329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2025]

Ni K, Yu G, Zheng Z, Lu Y, Poe D, Chen Y, Sanborn M, Wang Z, Zhou S, Zhan X, Wang W, Xing J. LivecellX: A Scalable Deep Learning Framework for Single-Cell Object-Oriented Analysis in Live-Cell Imaging. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.23.639532. [PMID: 40060645 PMCID: PMC11888277 DOI: 10.1101/2025.02.23.639532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2025]

Gao R, Hu H, Jiang Z, Cao S, Wang G, Zhao Y, Jiang T. SVHunter: long-read-based structural variation detection through the transformer model. Brief Bioinform 2025;26:bbaf203. [PMID: 40341921 PMCID: PMC12062572 DOI: 10.1093/bib/bbaf203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Revised: 03/31/2025] [Accepted: 04/15/2025] [Indexed: 05/11/2025]

Depuydt L, Ahmed OY, Fostier J, Langmead B, Gagie T. Run-length compressed metagenomic read classification with SMEM-finding and tagging. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.25.640119. [PMID: 40060500 PMCID: PMC11888359 DOI: 10.1101/2025.02.25.640119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/15/2025]

Kong T, Wang Y, Liu B. xRead: a coverage-guided approach for scalable construction of read overlapping graph. Gigascience 2025;14:giaf007. [PMID: 39960665 PMCID: PMC11831799 DOI: 10.1093/gigascience/giaf007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 11/29/2024] [Accepted: 01/10/2025] [Indexed: 02/20/2025]

Zakeri M, Brown NK, Ahmed OY, Gagie T, Langmead B. Movi: A fast and cache-efficient full-text pangenome index. iScience 2024;27:111464. [PMID: 39758981 PMCID: PMC11696632 DOI: 10.1016/j.isci.2024.111464] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/07/2024] [Revised: 10/11/2024] [Accepted: 11/20/2024] [Indexed: 01/07/2025]

Liu Y, Li Y, Chen E, Xu J, Zhang W, Zeng X, Luo X. Repeat and haplotype aware error correction in nanopore sequencing reads with DeChat. Commun Biol 2024;7:1678. [PMID: 39702496 DOI: 10.1038/s42003-024-07376-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/28/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024]

Fuhrmann L, Langer B, Topolsky I, Beerenwinkel N. VILOCA: sequencing quality-aware viral haplotype reconstruction and mutation calling for short-read and long-read data. NAR Genom Bioinform 2024;6:lqae152. [PMID: 39633724 PMCID: PMC11616694 DOI: 10.1093/nargab/lqae152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Revised: 09/15/2024] [Accepted: 10/25/2024] [Indexed: 12/07/2024]

Luo J, Wang J, Wei J, Yan C, Luo H. DeepHapNet: a haplotype assembly method based on RetNet and deep spectral clustering. Brief Bioinform 2024;26:bbae656. [PMID: 39690881 DOI: 10.1093/bib/bbae656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 10/18/2024] [Accepted: 12/05/2024] [Indexed: 12/19/2024]

Zong P, Deng W, Liu J, Ruan J. TSTA: thread and SIMD-based trapezoidal pairwise/multiple sequence-alignment method. GIGABYTE 2024;2024:gigabyte141. [PMID: 39539520 PMCID: PMC11558659 DOI: 10.46471/gigabyte.141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Accepted: 11/01/2024] [Indexed: 11/16/2024]

Gao X, Liu K, Luo S, Tang M, Liu N, Jiang C, Fang J, Li S, Hou Y, Guo C, Qu K. Comparative analysis of methodologies for detecting extrachromosomal circular DNA. Nat Commun 2024;15:9208. [PMID: 39448595 PMCID: PMC11502736 DOI: 10.1038/s41467-024-53496-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 10/14/2024] [Indexed: 10/26/2024]

Affiliation(s)

Xuyuan Gao Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Ke Liu Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Songwen Luo Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Meifang Tang Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
Nianping Liu Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Chen Jiang Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
Jingwen Fang Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China HanGene Biotech, Xiaoshan Innovation Polis, Hangzhou, Zhejiang, China
Shouzhen Li Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Yanbing Hou Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
Chuang Guo Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China. School of Pharmacy, Bengbu Medical University, Bengbu, China. Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.
Kun Qu Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China. Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China. School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China.

Giurgiu M, Wittstruck N, Rodriguez-Fos E, Chamorro González R, Brückner L, Krienelke-Szymansky A, Helmsauer K, Hartebrodt A, Euskirchen P, Koche RP, Haase K, Reinert K, Henssen AG. Reconstructing extrachromosomal DNA structural heterogeneity from long-read sequencing data using Decoil. Genome Res 2024;34:1355-1364. [PMID: 39111816 PMCID: PMC11529853 DOI: 10.1101/gr.279123.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 07/29/2024] [Indexed: 08/23/2024]

Affiliation(s)

Mădălina Giurgiu Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany Freie Universität Berlin, 14195 Berlin, Germany
Nadine Wittstruck Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
Elias Rodriguez-Fos Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
Rocío Chamorro González Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany Max Delbrück Center for Molecular Medicine, 13125 Berlin, Germany
Lotte Brückner Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany Max Delbrück Center for Molecular Medicine, 13125 Berlin, Germany
Annabell Krienelke-Szymansky Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
Konstantin Helmsauer Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
Anne Hartebrodt Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
Philipp Euskirchen German Cancer Consortium (DKTK), partner site Berlin, a partnership between DKFZ and Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany Department of Neuropathology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, 13353 Berlin, Germany
Richard P Koche Center for Epigenetics Research, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
Kerstin Haase Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
Knut Reinert Freie Universität Berlin, 14195 Berlin, Germany
Anton G Henssen Department of Pediatric Oncology and Hematology, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Experimental and Clinical Research Center of the Max Delbrück Center and Charité Berlin, 13125 Berlin, Germany Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany Max Delbrück Center for Molecular Medicine, 13125 Berlin, Germany

Chanin RB, West PT, Wirbel J, Gill MO, Green GZM, Park RM, Enright N, Miklos AM, Hickey AS, Brooks EF, Lum KK, Cristea IM, Bhatt AS. Intragenic DNA inversions expand bacterial coding capacity. Nature 2024;634:234-242. [PMID: 39322669 DOI: 10.1038/s41586-024-07970-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]

Baudeau T, Sahlin K. Improved sub-genomic RNA prediction with the ARTIC protocol. Nucleic Acids Res 2024;52:e82. [PMID: 39149898 PMCID: PMC11417393 DOI: 10.1093/nar/gkae687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 07/18/2024] [Accepted: 07/25/2024] [Indexed: 08/17/2024]

Huang Y, Gao Y, Ly K, Lin L, Lambooij JP, King EG, Janssen A, Wei KHC, Lee YCG. Varying recombination landscapes between individuals are driven by polymorphic transposable elements. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.17.613564. [PMID: 39345575 PMCID: PMC11429682 DOI: 10.1101/2024.09.17.613564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]

Abstract

Meiotic recombination is a prominent force shaping genome evolution, and understanding the causes for varying recombination landscapes within and between species has remained a central, though challenging, question. Recombination rates are widely observed to negatively associate with the abundance of transposable elements (TEs), selfish genetic elements that move between genomic locations. While such associations are usually interpreted as recombination influencing the efficacy of selection at removing TEs, accumulating findings suggest that TEs could instead be the cause rather than the consequence. To test this prediction, we formally investigated the influence of polymorphic, putatively active TEs on recombination rates. We developed and benchmarked a novel approach that uses PacBio long-read sequencing to efficiently, accurately, and cost-effectively identify crossovers (COs), a key recombination product, among large numbers of pooled recombinant individuals. By applying this approach to Drosophila strains with distinct TE insertion profiles, we found that polymorphic TEs, especially RNA-based TEs and TEs with local enrichment of repressive marks, reduce the occurrence of COs. Such an effect leads to different CO frequencies between homologous sequences with and without TEs, contributing to varying CO maps between individuals. The suppressive effect of TEs on CO is further supported by two orthogonal approaches: analyzing the distributions of COs in panels of recombinant inbred lines in relation to TE polymorphism and applying marker-assisted estimations of CO frequencies to isogenic strains with and without transgenically inserted TEs. Our investigations reveal how the constantly changing mobilome can actively modify recombination landscapes, shaping genome evolution within and between species.

Marini S, Barquero A, Wadhwani AA, Bian J, Ruiz J, Boucher C, Prosperi M. OCTOPUS: Disk-based, Multiplatform, Mobile-friendly Metagenomics Classifier. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.15.585215. [PMID: 38559026 PMCID: PMC10979967 DOI: 10.1101/2024.03.15.585215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]

Liu C, Wu P, Wu X, Zhao X, Chen F, Cheng X, Zhu H, Wang O, Xu M. AsmMix: an efficient haplotype-resolved hybrid de novo genome assembling pipeline. Front Genet 2024;15:1421565. [PMID: 39130747 PMCID: PMC11310137 DOI: 10.3389/fgene.2024.1421565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 07/05/2024] [Indexed: 08/13/2024]

Shao H, Ruan J. BSAlign: A Library for Nucleotide Sequence Alignment. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024;22:qzae025. [PMID: 39209796 PMCID: PMC12016559 DOI: 10.1093/gpbjnl/qzae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 03/03/2024] [Accepted: 03/12/2024] [Indexed: 09/04/2024]

Gamaarachchi H, Ferguson JM, Samarakoon H, Liyanage K, Deveson IW. Simulation of nanopore sequencing signal data with tunable parameters. Genome Res 2024;34:778-783. [PMID: 38692839 PMCID: PMC11216307 DOI: 10.1101/gr.278730.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 04/24/2024] [Indexed: 05/03/2024]

Affiliation(s)

Hasindu Gamaarachchi School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia; Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia Australia
James M Ferguson Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia Australia
Hiruna Samarakoon School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia Australia
Kisaru Liyanage School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia Australia
Ira W Deveson Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia Australia St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales 2052, Australia

Hämälä T, Moore C, Cowan L, Carlile M, Gopaulchan D, Brandrud MK, Birkeland S, Loose M, Kolář F, Koch MA, Yant L. Impact of whole-genome duplications on structural variant evolution in Cochlearia. Nat Commun 2024;15:5377. [PMID: 38918389 PMCID: PMC11199601 DOI: 10.1038/s41467-024-49679-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 06/14/2024] [Indexed: 06/27/2024]

Li X, Chen K, Shao M. Efficient Seeding for Error-Prone Sequences with SubseqHash2. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.30.596711. [PMID: 38895288 PMCID: PMC11185578 DOI: 10.1101/2024.05.30.596711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]

Hu H, Gao R, Gao W, Gao B, Jiang Z, Zhou M, Wang G, Jiang T. SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies. Brief Bioinform 2024;25:bbae336. [PMID: 38980375 PMCID: PMC11232458 DOI: 10.1093/bib/bbae336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 06/03/2024] [Accepted: 06/27/2024] [Indexed: 07/10/2024]

Wang W, Li Y, Ko S, Feng N, Zhang M, Liu JJ, Zheng S, Ren B, Yu YP, Luo JH, Tseng GC, Liu S. IFDlong: an isoform and fusion detector for accurate annotation and quantification of long-read RNA-seq data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.11.593690. [PMID: 38798496 PMCID: PMC11118288 DOI: 10.1101/2024.05.11.593690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]

Abstract

Advancements in long-read transcriptome sequencing (long-RNA-seq) technology have revolutionized the study of isoform diversity. These full-length transcripts enhance the detection of various transcriptome structural variations, including novel isoforms, alternative splicing events, and fusion transcripts. By shifting the open reading frame or altering gene expressions, studies have proved that these transcript alterations can serve as crucial biomarkers for disease diagnosis and therapeutic targets. In this project, we proposed IFDlong, a bioinformatics and biostatistics tool to detect isoform and fusion transcripts using bulk or single-cell long-RNA-seq data. Specifically, the software performed gene and isoform annotation for each long-read, defined novel isoforms, quantified isoform expression by a novel expectation-maximization algorithm, and profiled the fusion transcripts. For evaluation, IFDlong pipeline achieved overall the best performance when compared with several existing tools in large-scale simulation studies. In both isoform and fusion transcript quantification, IFDlong is able to reach more than 0.8 Spearman's correlation with the truth, and more than 0.9 cosine similarity when distinguishing multiple alternative splicing events. In novel isoform simulation, IFDlong can successfully balance the sensitivity (higher than 90%) and specificity (higher than 90%). Furthermore, IFDlong has proved its accuracy and robustness in diverse in-house and public datasets on healthy tissues, cell lines and multiple types of diseases. Besides bulk long-RNA-seq, IFDlong pipeline has proved its compatibility to single-cell long-RNA-seq data. This new software may hold promise for significant impact on long-read transcriptome analysis. The IFDlong software is available at https://github.com/wenjiaking/IFDlong.

Affiliation(s)

Wenjia Wang Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA
Yuzhen Li Department of Surgery, School of Medicine, University of Pittsburgh, Pittsburgh, PA
Sungjin Ko Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA
Ning Feng Department of Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA
Manling Zhang Department of Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA
Jia-Jun Liu Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA
Songyang Zheng Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA
Baoguo Ren Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA
Yan P. Yu Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA
Jian-Hua Luo Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA Hillman Cancer Center, University of Pittsburgh Medical Center, Pittsburgh, PA
George C. Tseng Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA
Silvia Liu Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA Hillman Cancer Center, University of Pittsburgh Medical Center, Pittsburgh, PA Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA

Su Y, Yu Z, Jin S, Ai Z, Yuan R, Chen X, Xue Z, Guo Y, Chen D, Liang H, Liu Z, Liu W. Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data. Nat Commun 2024;15:3972. [PMID: 38730241 PMCID: PMC11087464 DOI: 10.1038/s41467-024-48117-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 04/19/2024] [Indexed: 05/12/2024]

Affiliation(s)

Yaqi Su Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
Zhejian Yu Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Siqian Jin Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Zhipeng Ai Division of Human Reproduction and Developmental Genetics, Women's Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310006, Zhejiang, China
Ruihong Yuan Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Xinyi Chen Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Ziwei Xue Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Yixin Guo Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Di Chen Center for Reproductive Medicine of the Second Affiliated Hospital Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China Centre for Regeneration and Cell Therapy of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Hongqing Liang Division of Human Reproduction and Developmental Genetics, Women's Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310006, Zhejiang, China
Zuozhu Liu Zhejiang University-Angel Align Inc. R&D Center for Intelligent Healthcare, Zhejiang University-University of Illinois at Urbana-Champaign Institute (ZJU-UIUC Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
Wanlu Liu Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China. Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China. Future Health Laboratory, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314100, China. Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Zhejiang University, Hangzhou, 310058, Zhejiang, China.

Schulz T, Medvedev P. ESKEMAP: exact sketch-based read mapping. Algorithms Mol Biol 2024;19:19. [PMID: 38704605 PMCID: PMC11069465 DOI: 10.1186/s13015-024-00261-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 03/19/2024] [Indexed: 05/06/2024] Open

Deng WJ, Li QQ, Shuai HN, Wu RX, Niu SF, Wang QH, Miao BB. Whole-Genome Sequencing Analyses Reveal the Evolution Mechanisms of Typical Biological Features of Decapterus maruadsi. Animals (Basel) 2024;14:1202. [PMID: 38672351 PMCID: PMC11047736 DOI: 10.3390/ani14081202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 04/11/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] Open

Holmes MJ, Mahjour B, Castro CP, Farnum GA, Diehl AG, Boyle AP. HaplotagLR: An efficient and configurable utility for haplotagging long reads. PLoS One 2024;19:e0298688. [PMID: 38478504 PMCID: PMC10936807 DOI: 10.1371/journal.pone.0298688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 01/30/2024] [Indexed: 03/17/2024] Open

Jahshan Z, Yavits L. ViTAL: Vision TrAnsformer based Low coverage SARS-CoV-2 lineage assignment. Bioinformatics 2024;40:btae093. [PMID: 38374486 PMCID: PMC10913383 DOI: 10.1093/bioinformatics/btae093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 02/04/2024] [Accepted: 02/18/2024] [Indexed: 02/21/2024] Open

Zakeri M, Brown NK, Ahmed OY, Gagie T, Langmead B. Movi: a fast and cache-efficient full-text pangenome index. bioRxiv 2024:2023.11.04.565615. [PMID: 37961660 PMCID: PMC10635132 DOI: 10.1101/2023.11.04.565615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]

Ding L, Wu S, Hou Z, Li A, Xu Y, Feng H, Pan W, Ruan J. Improving error-correcting capability in DNA digital storage via soft-decision decoding. Natl Sci Rev 2024;11:nwad229. [PMID: 38213525 PMCID: PMC10776348 DOI: 10.1093/nsr/nwad229] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/20/2023] [Revised: 08/03/2023] [Accepted: 08/15/2023] [Indexed: 01/13/2024] Open

Affiliation(s)

Lulu Ding Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
Shigang Wu Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
Zhihao Hou Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China Guangdong Provincial Key Laboratory of Plant Molecular Breeding, State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, South China Agricultural University, Guangzhou 510642, China
Alun Li Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
Yaping Xu Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
Hu Feng Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
Weihua Pan Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
Jue Ruan Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China

Rajput J, Chandra G, Jain C. Co-linear chaining on pangenome graphs. Algorithms Mol Biol 2024;19:4. [PMID: 38279113 PMCID: PMC11288099 DOI: 10.1186/s13015-024-00250-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 01/02/2024] [Indexed: 01/28/2024] Open

Wei ZG, Zhang XD, Fan XG, Qian Y, Liu F, Wu FX. pathMap: a path-based mapping tool for long noisy reads with high sensitivity. Brief Bioinform 2024;25:bbae107. [PMID: 38517696 PMCID: PMC10959152 DOI: 10.1093/bib/bbae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 12/25/2023] [Accepted: 02/28/2024] [Indexed: 03/24/2024] Open

Chu J, Rong J, Feng X, Li H. ntsm: an alignment-free, ultra-low-coverage, sequencing technology agnostic, intraspecies sample comparison tool for sample swap detection. Gigascience 2024;13:giae024. [PMID: 38832466 PMCID: PMC11148594 DOI: 10.1093/gigascience/giae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 02/13/2024] [Accepted: 04/30/2024] [Indexed: 06/05/2024] Open

Constantinides B, Hunt M, Crook DW. Hostile: accurate decontamination of microbial host sequences. Bioinformatics 2023;39:btad728. [PMID: 38039142 PMCID: PMC10749771 DOI: 10.1093/bioinformatics/btad728] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/27/2023] [Revised: 11/11/2023] [Accepted: 11/29/2023] [Indexed: 12/03/2023] Open

Wei ZG, Bu PY, Zhang XD, Liu F, Qian Y, Wu FX. invMap: a sensitive mapping tool for long noisy reads with inversion structural variants. Bioinformatics 2023;39:btad726. [PMID: 38058196 PMCID: PMC11320709 DOI: 10.1093/bioinformatics/btad726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 11/02/2023] [Accepted: 12/05/2023] [Indexed: 12/08/2023] Open

Magi A, Mattei G, Mingrino A, Caprioli C, Ronchini C, Frigè G, Semeraro R, Baragli M, Bolognini D, Colombo E, Mazzarella L, Pelicci PG. GASOLINE: detecting germline and somatic structural variants from long-reads data. Sci Rep 2023;13:20817. [PMID: 38012350 PMCID: PMC10682169 DOI: 10.1038/s41598-023-48285-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 11/24/2023] [Indexed: 11/29/2023] Open

Chandra G, Jain C. Gap-Sensitive Colinear Chaining Algorithms for Acyclic Pangenome Graphs. J Comput Biol 2023;30:1182-1197. [PMID: 37902967 DOI: 10.1089/cmb.2023.0186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023] Open

Guo Y, Feng X, Li H. Evaluation of haplotype-aware long-read error correction with hifieval. Bioinformatics 2023;39:btad631. [PMID: 37851384 PMCID: PMC10612404 DOI: 10.1093/bioinformatics/btad631] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/05/2023] [Revised: 09/18/2023] [Accepted: 10/17/2023] [Indexed: 10/19/2023] Open

Zhang Y, Lu HW, Ruan J. GAEP: a comprehensive genome assembly evaluating pipeline. J Genet Genomics 2023;50:747-754. [PMID: 37245652 DOI: 10.1016/j.jgg.2023.05.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/09/2023] [Revised: 05/19/2023] [Accepted: 05/23/2023] [Indexed: 05/30/2023]

Poszewiecka B, Gogolewski K, Karolak JA, Stankiewicz P, Gambin A. PhaseDancer: a novel targeted assembler of segmental duplications unravels the complexity of the human chromosome 2 fusion going from 48 to 46 chromosomes in hominin evolution. Genome Biol 2023;24:205. [PMID: 37697406 PMCID: PMC10496407 DOI: 10.1186/s13059-023-03022-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 07/25/2023] [Indexed: 09/13/2023] Open

Ayad LAK, Chikhi R, Pissis SP. Seedability: optimizing alignment parameters for sensitive sequence comparison. Bioinform Adv 2023;3:vbad108. [PMID: 37621456 PMCID: PMC10444664 DOI: 10.1093/bioadv/vbad108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 08/02/2023] [Accepted: 08/10/2023] [Indexed: 08/26/2023]

Yang X, Wang X, Zou Y, Zhang S, Xia M, Fu L, Vollger MR, Chen NC, Taylor DJ, Harvey WT, Logsdon GA, Meng D, Shi J, McCoy RC, Schatz MC, Li W, Eichler EE, Lu Q, Mao Y. Characterization of large-scale genomic differences in the first complete human genome. Genome Biol 2023;24:157. [PMID: 37403156 PMCID: PMC10320979 DOI: 10.1186/s13059-023-02995-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/14/2022] [Accepted: 06/23/2023] [Indexed: 07/06/2023] Open

Affiliation(s)

Xiangyu Yang Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Xuankai Wang Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Yawen Zou Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Shilong Zhang Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Manying Xia Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Lianting Fu Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Mitchell R Vollger Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Nae-Chyun Chen Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Dylan J Taylor Department of Biology, Johns Hopkins University, Baltimore, MD, USA
William T Harvey Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Glennis A Logsdon Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Dan Meng Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Junfeng Shi Shanghai Engineering Research Center of Advanced Dental Technology and Materials, Shanghai, China Shanghai Key Laboratory of Stomatology, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Rajiv C McCoy Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Michael C Schatz Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Weidong Li Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Evan E Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
Qing Lu Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Yafei Mao Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China. Shanghai Key Laboratory of Stomatology, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Ahmed O, Rossi M, Boucher C, Langmead B. Efficient taxa identification using a pangenome index. Genome Res 2023;33:1069-1077. [PMID: 37258301 PMCID: PMC10538492 DOI: 10.1101/gr.277642.123] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/04/2023] [Accepted: 05/22/2023] [Indexed: 06/02/2023]

Li X, Shi Q, Chen K, Shao M. Seeding with minimized subsequence. Bioinformatics 2023;39:i232-i241. [PMID: 37387132 DOI: 10.1093/bioinformatics/btad218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] Open

Abstract

MOTIVATION

Modern methods for computation-intensive tasks in sequence analysis (e.g. read mapping, sequence alignment, genome assembly, etc.) often first transform each sequence into a list of short, regular-length seeds so that compact data structures and efficient algorithms can be employed to handle the ever-growing large-scale data. Seeding methods using kmers (substrings of length k) have gained tremendous success in processing sequencing data with low mutation/error rates. However, they are much less effective for sequencing data with high error rates as kmers cannot tolerate errors.

RESULTS

We propose SubseqHash, a strategy that uses subsequences, rather than substrings, as seeds. Formally, SubseqHash maps a string of length n to its smallest subsequence of length k, k < n, according to a given order over all length-k strings. Finding the smallest subsequence of a string by enumeration is impractical as the number of subsequences grows exponentially. To overcome this barrier, we propose a novel algorithmic framework that consists of a specifically designed order (termed ABC order) and an algorithm that computes the minimized subsequence under an ABC order in polynomial time. We first show that the ABC order exhibits the desired property and the probability of hash collision using the ABC order is close to the Jaccard index. We then show that SubseqHash overwhelmingly outperforms the substring-based seeding methods in producing high-quality seed-matches for three critical applications: read mapping, sequence alignment, and overlap detection. SubseqHash presents a major algorithmic breakthrough for tackling the high error rates and we expect it to be widely adapted for long-reads analysis.

AVAILABILITY AND IMPLEMENTATION

SubseqHash is freely available at https://github.com/Shao-Group/subseqhash.
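
As a toy illustration of the minimized-subsequence idea described in the abstract above, the following Python sketch maps a string to its smallest length-k subsequence by brute-force enumeration. It is not the SubseqHash algorithm: plain lexicographic order stands in for the ABC order, the enumeration is exponential (the very cost the abstract says SubseqHash avoids), and the function name `min_subsequence` is hypothetical.

```python
from itertools import combinations


def min_subsequence(s: str, k: int) -> str:
    """Return the smallest length-k subsequence of s.

    Assumption: "smallest" means plain lexicographic order here,
    used only as a stand-in for the ABC order named in the abstract.
    """
    if not 0 < k <= len(s):
        raise ValueError("k must satisfy 0 < k <= len(s)")
    # Enumerate every choice of k positions (kept in original order)
    # and take the minimum of the resulting strings.
    return min(
        "".join(s[i] for i in positions)
        for positions in combinations(range(len(s)), k)
    )


# Two reads that differ by a single substitution (G -> T at index 2).
read_a = "ACGTACGT"
read_b = "ACTTACGT"

seed_a = min_subsequence(read_a, 4)
seed_b = min_subsequence(read_b, 4)
print(seed_a, seed_b, seed_a == seed_b)
```

In this example the substituted base is not part of the minimized subsequence, so both reads produce the same seed; a k-mer spanning index 2 would no longer match.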

Guo Y, Feng X, Li H. Evaluation of haplotype-aware long-read error correction with hifieval. bioRxiv 2023:2023.06.05.543788. [PMID: 37333189 PMCID: PMC10274712 DOI: 10.1101/2023.06.05.543788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]

Ahmed OY, Rossi M, Gagie T, Boucher C, Langmead B. SPUMONI 2: improved classification using a pangenome index of minimizer digests. Genome Biol 2023;24:122. [PMID: 37202771 PMCID: PMC10197461 DOI: 10.1186/s13059-023-02958-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 05/03/2023] [Indexed: 05/20/2023] Open

Chamorro González R, Conrad T, Stöber MC, Xu R, Giurgiu M, Rodriguez-Fos E, Kasack K, Brückner L, van Leen E, Helmsauer K, Dorado Garcia H, Stefanova ME, Hung KL, Bei Y, Schmelz K, Lodrini M, Mundlos S, Chang HY, Deubzer HE, Sauer S, Eggert A, Schulte JH, Schwarz RF, Haase K, Koche RP, Henssen AG. Parallel sequencing of extrachromosomal circular DNAs and transcriptomes in single cancer cells. Nat Genet 2023;55:880-890. [PMID: 37142849 PMCID: PMC10181933 DOI: 10.1038/s41588-023-01386-y] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 12/20/2021] [Accepted: 03/28/2023] [Indexed: 05/06/2023]

Affiliation(s)

Rocío Chamorro González Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Thomas Conrad Genomics Technology Platform, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
Maja C Stöber Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany Charité-Universitätsmedizin Berlin, Berlin, Germany Faculty of Life Science, Humboldt-Universität zu Berlin, Berlin, Germany
Robin Xu Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Mădălina Giurgiu Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany Freie Universität Berlin, Berlin, Germany
Elias Rodriguez-Fos Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Katharina Kasack Fraunhofer Institute for Cell Therapy and Immunology, Branch Bioanalytics and Bioprocesses IZI-BB, Potsdam, Germany
Lotte Brückner Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany
Eric van Leen Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Konstantin Helmsauer Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Heathcliff Dorado Garcia Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Maria E Stefanova RG Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
King L Hung Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, CA, USA
Yi Bei Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany
Karin Schmelz Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
Marco Lodrini Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
Stefan Mundlos RG Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany Berlin-Brandenburg Center for Regenerative Therapies, Charité-Universitätsmedizin Berlin, Berlin, Germany
Howard Y Chang Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, CA, USA Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
Hedwig E Deubzer Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany German Cancer Consortium, partner site Berlin, and German Cancer Research Center, Heidelberg, Germany Berlin Institute of Health, Berlin, Germany
Sascha Sauer Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
Angelika Eggert Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany German Cancer Consortium, partner site Berlin, and German Cancer Research Center, Heidelberg, Germany Berlin Institute of Health, Berlin, Germany
Johannes H Schulte Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany German Cancer Consortium, partner site Berlin, and German Cancer Research Center, Heidelberg, Germany Berlin Institute of Health, Berlin, Germany
Roland F Schwarz Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany Institute for Computational Cancer Biology, Center for Integrated Oncology, Cancer Research Center Cologne Essen Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
Kerstin Haase Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany German Cancer Consortium, partner site Berlin, and German Cancer Research Center, Heidelberg, Germany
Richard P Koche Center for Epigenetics Research, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Anton G Henssen Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany. Experimental and Clinical Research Center of the MDC and Charité Berlin, Berlin, Germany. Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany. German Cancer Consortium, partner site Berlin, and German Cancer Research Center, Heidelberg, Germany.

Popic V, Rohlicek C, Cunial F, Hajirasouliha I, Meleshko D, Garimella K, Maheshwari A. Cue: a deep-learning framework for structural variant discovery and genotyping. Nat Methods 2023;20:559-568. [PMID: 36959322 PMCID: PMC10152467 DOI: 10.1038/s41592-023-01799-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 04/30/2022] [Accepted: 01/29/2023] [Indexed: 03/25/2023]

Firtina C, Park J, Alser M, Kim JS, Cali D, Shahroodi T, Ghiasi N, Singh G, Kanellopoulos K, Alkan C, Mutlu O. BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis. NAR Genom Bioinform 2023;5:lqad004. [PMID: 36685727 PMCID: PMC9853099 DOI: 10.1093/nargab/lqad004] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/29/2022] [Revised: 12/16/2022] [Accepted: 01/10/2023] [Indexed: 01/22/2023] Open