Reference Citation Analysis

For: Kille B, Garrison E, Treangen TJ, Phillippy AM. Minmers are a generalization of minimizers that enable unbiased local Jaccard estimation. Bioinformatics 2023;39:btad512. [PMID: 37603771 PMCID: PMC10505501 DOI: 10.1093/bioinformatics/btad512] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/18/2023] [Revised: 07/19/2023] [Accepted: 08/18/2023] [Indexed: 08/23/2023] Open

Cited by Other Article(s)

Yoo D, Rhie A, Hebbar P, Antonacci F, Logsdon GA, Solar SJ, Antipov D, Pickett BD, Safonova Y, Montinaro F, Luo Y, Malukiewicz J, Storer JM, Lin J, Sequeira AN, Mangan RJ, Hickey G, Monfort Anez G, Balachandran P, Bankevich A, Beck CR, Biddanda A, Borchers M, Bouffard GG, Brannan E, Brooks SY, Carbone L, Carrel L, Chan AP, Crawford J, Diekhans M, Engelbrecht E, Feschotte C, Formenti G, Garcia GH, de Gennaro L, Gilbert D, Green RE, Guarracino A, Gupta I, Haddad D, Han J, Harris RS, Hartley GA, Harvey WT, Hiller M, Hoekzema K, Houck ML, Jeong H, Kamali K, Kellis M, Kille B, Lee C, Lee Y, Lees W, Lewis AP, Li Q, Loftus M, Loh YHE, Loucks H, Ma J, Mao Y, Martinez JFI, Masterson P, McCoy RC, McGrath B, McKinney S, Meyer BS, Miga KH, Mohanty SK, Munson KM, Pal K, Pennell M, Pevzner PA, Porubsky D, Potapova T, Ringeling FR, Rocha JL, Ryder OA, Sacco S, Saha S, Sasaki T, Schatz MC, Schork NJ, Shanks C, Smeds L, Son DR, Steiner C, Sweeten AP, Tassia MG, Thibaud-Nissen F, Torres-González E, Trivedi M, Wei W, Wertz J, Yang M, Zhang P, Zhang S, Zhang Y, Zhang Z, Zhao SA, Zhu Y, Jarvis ED, Gerton JL, Rivas-González I, Paten B, Szpiech ZA, Huber CD, Lenz TL, Konkel MK, Yi SV, Canzar S, Watson CT, Sudmant PH, Molloy E, Garrison E, Lowe CB, Ventura M, O'Neill RJ, Koren S, Makova KD, Phillippy AM, Eichler EE. Complete sequencing of ape genomes. Nature 2025;641:401-418. [PMID: 40205052 PMCID: PMC12058530 DOI: 10.1038/s41586-025-08816-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Accepted: 02/19/2025] [Indexed: 04/11/2025]

Affiliation(s)

DongAhn Yoo Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Arang Rhie Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Prajna Hebbar UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Francesca Antonacci Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
Glennis A Logsdon Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
Steven J Solar Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Dmitry Antipov Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Brandon D Pickett Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Yana Safonova Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA, USA
Francesco Montinaro Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy Institute of Genomics, University of Tartu, Tartu, Estonia
Yanting Luo Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
Joanna Malukiewicz Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany German Primate Center, Primate Genetics Laboratory, Goettingen, Germany
Jessica M Storer Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
Jiadong Lin Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Abigail N Sequeira Department of Biology, Penn State University, University Park, PA, USA
Riley J Mangan Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA The Broad Institute of MIT and Harvard, Cambridge, MA, USA Genetics Training Program, Harvard Medical School, Boston, MA, USA
Glenn Hickey UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Graciela Monfort Anez Stowers Institute for Medical Research, Kansas City, MO, USA
Parithi Balachandran The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Anton Bankevich Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA, USA
Christine R Beck Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA
Arjun Biddanda Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Matthew Borchers Stowers Institute for Medical Research, Kansas City, MO, USA
Gerard G Bouffard NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Emry Brannan Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
Shelise Y Brooks NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Lucia Carbone Department of Medicine, KCVI, Oregon Health Sciences University, Portland, OR, USA Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA
Laura Carrel PSU Medical School, Penn State University School of Medicine, Hershey, PA, USA
Agnes P Chan The Translational Genomics Research Institute, City of Hope National Medical Center, Phoenix, AZ, USA
Juyun Crawford NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Mark Diekhans UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Eric Engelbrecht Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
Cedric Feschotte Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
Giulio Formenti Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
Gage H Garcia Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Luciana de Gennaro Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
David Gilbert San Diego Biomedical Research Institute, San Diego, CA, USA
Richard E Green Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
Andrea Guarracino Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
Ishaan Gupta Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
Diana Haddad National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Junmin Han Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Robert S Harris Department of Biology, Penn State University, University Park, PA, USA
Gabrielle A Hartley Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
William T Harvey Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Michael Hiller LOEWE Centre for Translational Biodiversity Genomics, Frankfurt, Germany Senckenberg Research Institute, Frankfurt, Germany Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Frankfurt, Germany
Kendra Hoekzema Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Marlys L Houck San Diego Zoo Wildlife Alliance, Escondido, CA, USA
Hyeonsoo Jeong Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Kaivan Kamali Department of Biology, Penn State University, University Park, PA, USA
Manolis Kellis Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA The Broad Institute of MIT and Harvard, Cambridge, MA, USA
Bryce Kille Department of Computer Science, Rice University, Houston, TX, USA
Chul Lee Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
Youngho Lee Laboratory of Bioinformatics and Population Genetics, Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
William Lees Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
Alexandra P Lewis Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Qiuhui Li Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Mark Loftus Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA Center for Human Genetics, Clemson University, Greenwood, SC, USA
Yong Hwee Eddie Loh Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
Hailey Loucks UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Jian Ma Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Yafei Mao Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China Shanghai Jiao Tong University Chongqing Research Institute, Chongqing, China
Juan F I Martinez Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA, USA
Patrick Masterson National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Rajiv C McCoy Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Barbara McGrath Department of Biology, Penn State University, University Park, PA, USA
Sean McKinney Stowers Institute for Medical Research, Kansas City, MO, USA
Britta S Meyer Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
Karen H Miga UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Saswat K Mohanty Department of Biology, Penn State University, University Park, PA, USA
Katherine M Munson Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Karol Pal Department of Biology, Penn State University, University Park, PA, USA
Matt Pennell Department of Computational Biology, Cornell University, Ithaca, NY, USA
Pavel A Pevzner Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
David Porubsky Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Tamara Potapova Stowers Institute for Medical Research, Kansas City, MO, USA
Francisca R Ringeling Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
Joana L Rocha Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
Oliver A Ryder San Diego Zoo Wildlife Alliance, Escondido, CA, USA
Samuel Sacco Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
Swati Saha Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
Takayo Sasaki San Diego Biomedical Research Institute, San Diego, CA, USA
Michael C Schatz Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Nicholas J Schork The Translational Genomics Research Institute, City of Hope National Medical Center, Phoenix, AZ, USA
Cole Shanks UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Linnéa Smeds Department of Biology, Penn State University, University Park, PA, USA
Dongmin R Son Department of Ecology, Evolution and Marine Biology, Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
Cynthia Steiner San Diego Zoo Wildlife Alliance, Escondido, CA, USA
Alexander P Sweeten Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Michael G Tassia Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Françoise Thibaud-Nissen National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Edmundo Torres-González Department of Biology, Penn State University, University Park, PA, USA
Mihir Trivedi Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Wenjie Wei School of Life Sciences, Westlake University, Hangzhou, China National Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
Julie Wertz Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Muyu Yang Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Panpan Zhang Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
Shilong Zhang Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Yang Zhang Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Zhenmiao Zhang Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
Sarah A Zhao Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
Yixin Zhu Department of Computational Biology, Cornell University, Ithaca, NY, USA
Erich D Jarvis Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA Howard Hughes Medical Institute, Chevy Chase, MD, USA
Jennifer L Gerton Stowers Institute for Medical Research, Kansas City, MO, USA
Iker Rivas-González Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Benedict Paten UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Zachary A Szpiech Department of Biology, Penn State University, University Park, PA, USA
Christian D Huber Department of Biology, Penn State University, University Park, PA, USA
Tobias L Lenz Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
Miriam K Konkel Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA Center for Human Genetics, Clemson University, Greenwood, SC, USA
Soojin V Yi Department of Ecology, Evolution and Marine Biology, Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA Department of Molecular, Cellular and Developmental Biology, Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
Stefan Canzar Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
Corey T Watson Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
Peter H Sudmant Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
Erin Molloy Department of Computer Science, University of Maryland, College Park, MD, USA
Erik Garrison Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
Craig B Lowe Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
Mario Ventura Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
Rachel J O'Neill Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
Sergey Koren Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Kateryna D Makova Department of Biology, Penn State University, University Park, PA, USA
Adam M Phillippy Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Evan E Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Howard Hughes Medical Institute, Chevy Chase, MD, USA

Groot Koerkamp R, Liu D, Pibiri GE. The open-closed mod-minimizer algorithm. Algorithms Mol Biol 2025;20:4. [PMID: 40098006 PMCID: PMC11912762 DOI: 10.1186/s13015-025-00270-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Accepted: 01/28/2025] [Indexed: 03/19/2025] Open

Abstract

Sampling algorithms that deterministically select a subset of k-mers are an important building block in bioinformatics applications. For example, they are used to index large textual collections, like DNA, and to compare sequences quickly. In such applications, a sampling algorithm is required to select one k-mer out of every window of w consecutive k-mers. The folklore and most used scheme is the random minimizer that selects the smallest k-mer in the window according to some random order. This scheme is remarkably simple and versatile, and has a density (expected fraction of selected k-mers) of 2/(w + 1). In practice, lower density leads to faster methods and smaller indexes, and it turns out that the random minimizer is not the best one can do. Indeed, some schemes are known to approach optimal density 1/w when k → ∞, like the recently introduced mod-minimizer (Groot Koerkamp and Pibiri, WABI 2024). In this work, we study methods that achieve low density when k ≤ w. In this small-k regime, a practical method with provably better density than the random minimizer is the miniception (Zheng et al., Bioinformatics 2021). This method can be elegantly described as sampling the smallest closed syncmer (Edgar, PeerJ 2021) in the window according to some random order. We show that extending the miniception to prefer sampling open syncmers yields much better density. This new method, the open-closed minimizer, offers improved density for small k ≤ w while being as fast to compute as the random minimizer. Compared to methods based on decycling sets, which achieve very low density in the small-k regime, our method has comparable density while being computationally simpler and intuitive. Furthermore, we extend the mod-minimizer to improve density of any scheme that works well for small k to also work well when k > w is large. We hence obtain the open-closed mod-minimizer, a practical method that improves over the mod-minimizer for all k.
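As an illustration of the baseline scheme the abstract describes, the following is a minimal sketch of a random minimizer, not the authors' implementation and not the open-closed method itself. The function names `random_minimizer_positions` and `density`, the seeded per-k-mer random rank, and the leftmost tie-breaking rule are all assumptions made for this example. On a random sequence where repeated k-mers are rare, the measured density should come out close to 2/(w + 1).

```python
import random


def random_minimizer_positions(seq, k, w, seed=0):
    """Positions of k-mers sampled by a random minimizer (illustrative sketch).

    Every window of w consecutive k-mers contributes its smallest k-mer
    under a pseudo-random order; ties go to the leftmost position.
    """
    rng = random.Random(seed)
    order = {}  # hypothetical random order: one random rank per distinct k-mer

    def rank(kmer):
        if kmer not in order:
            order[kmer] = rng.random()
        return order[kmer]

    n_kmers = len(seq) - k + 1
    ranks = [rank(seq[i:i + k]) for i in range(n_kmers)]

    sampled = set()
    for start in range(n_kmers - w + 1):
        window = range(start, start + w)
        # Select the smallest-ranked k-mer in this window.
        sampled.add(min(window, key=lambda i: (ranks[i], i)))
    return sorted(sampled)


def density(seq, k, w, seed=0):
    """Fraction of k-mers in seq that are sampled."""
    n_kmers = len(seq) - k + 1
    return len(random_minimizer_positions(seq, k, w, seed)) / n_kmers


# Usage: compare the empirical density with 2 / (w + 1) on random DNA.
_rng = random.Random(42)
_dna = "".join(_rng.choice("ACGT") for _ in range(20000))
print(density(_dna, k=15, w=10), 2 / (10 + 1))
```

This quadratic-per-window loop is chosen for readability; a practical implementation would typically use a sliding-window minimum and a hash function instead of a stored rank table.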

Santoro DF, Marconi G, Capomaccio S, Bocchini M, Anderson AW, Finotti A, Confalonieri M, Albertini E, Rosellini D. Polyploidization-driven transcriptomic dynamics in Medicago sativa neotetraploids: mRNA, smRNA and allele-specific gene expression. BMC Plant Biology 2025;25:108. [PMID: 39856624 PMCID: PMC11763150 DOI: 10.1186/s12870-025-06090-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Accepted: 01/09/2025] [Indexed: 01/27/2025]

Abstract

Whole genome duplication (WGD) is a powerful evolutionary mechanism in plants. Autopolyploids have been comparatively less studied than allopolyploids, with sexual autopolyploidization receiving even less attention. In this work, we studied the transcriptomes of neotetraploids (2n = 4x = 32) obtained by crossing two diploid (2n = 2x = 16) plants of Medicago sativa that produce a significant percentage of either 2n eggs or pollen. Diploid progeny from the same cross allowed us to separate the transcriptional outcomes of hybridization from those of WGD. This material can help to elucidate events at the base of the domestication of cultivated 4x alfalfa, the world's most important leguminous forage. Three 2x and three 4x progeny plants and 2x parental plants were used for this study. The RNA-seq data revealed that WGD did not dramatically affect the transcription of leaf protein-coding genes. The two parental genotypes did not contribute equally to the progeny transcriptomes, and genome-wide expression level dominance of the male parent was observed. A large majority of the genes whose expression level changed due to WGD presented increased expression, indicating that the 4x state requires the upregulation of approximately 2.66% of the protein-coding genes. Overall, we estimated that 3.63% of the protein-coding genes were transcriptionally affected by WGD and may contribute to the phenotypic novelty of the neotetraploid plants. Pathway analysis suggested that WGD could affect secondary metabolite biosynthesis, which in turn may influence forage quality. We found four times as many transcription factor genes among the polyploidization-affected genes as among those affected only by hybridization. Several of these belong to classes involved in stress response. Small RNA-seq revealed that very few miRNAs were significantly associated with WGD, but they target several hundred genes, and their role in the WGD response may be relevant. Integrated network analysis led to the identification of putative miRNA:mRNA interactions potentially involved in transcriptome reprogramming. Allele-specific expression analysis indicated that parent-of-origin bias was not a significant outcome of WGD, but we found that parentally biased RNA editing may be a significant source of variation in neopolyploids.

Janssen A, Gibson P, Bravo A, de Bakker V, Slager J, Veening JW. PneumoBrowse 2: an integrated visual platform for curated genome annotation and multiomics data analysis of Streptococcus pneumoniae. Nucleic Acids Res 2025;53:D839-D851. [PMID: 39436044 PMCID: PMC11701578 DOI: 10.1093/nar/gkae923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Revised: 09/30/2024] [Accepted: 10/04/2024] [Indexed: 10/23/2024] Open

Kille B, Groot Koerkamp R, McAdams D, Liu A, Treangen TJ. A near-tight lower bound on the density of forward sampling schemes. Bioinformatics 2024;41:btae736. [PMID: 39666942 PMCID: PMC11676336 DOI: 10.1093/bioinformatics/btae736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2024] [Revised: 11/16/2024] [Accepted: 12/10/2024] [Indexed: 12/14/2024] Open

Kille B, Koerkamp RG, McAdams D, Liu A, Treangen TJ. A near-tight lower bound on the density of forward sampling schemes. bioRxiv 2024:2024.09.06.611668. [PMID: 39605515 PMCID: PMC11601301 DOI: 10.1101/2024.09.06.611668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]

Marçais G, Elder CS, Kingsford C. k-nonical space: sketching with reverse complements. Bioinformatics 2024;40:btae629. [PMID: 39432565 PMCID: PMC11549021 DOI: 10.1093/bioinformatics/btae629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 10/01/2024] [Accepted: 10/17/2024] [Indexed: 10/23/2024] Open

Ndiaye M, Prieto-Baños S, Fitzgerald LM, Yazdizadeh Kharrazi A, Oreshkov S, Dessimoz C, Sedlazeck FJ, Glover N, Majidian S. When less is more: sketching with minimizers in genomics. Genome Biol 2024;25:270. [PMID: 39402664 PMCID: PMC11472564 DOI: 10.1186/s13059-024-03414-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 10/01/2024] [Indexed: 10/19/2024] Open

Yoo D, Rhie A, Hebbar P, Antonacci F, Logsdon GA, Solar SJ, Antipov D, Pickett BD, Safonova Y, Montinaro F, Luo Y, Malukiewicz J, Storer JM, Lin J, Sequeira AN, Mangan RJ, Hickey G, Anez GM, Balachandran P, Bankevich A, Beck CR, Biddanda A, Borchers M, Bouffard GG, Brannan E, Brooks SY, Carbone L, Carrel L, Chan AP, Crawford J, Diekhans M, Engelbrecht E, Feschotte C, Formenti G, Garcia GH, de Gennaro L, Gilbert D, Green RE, Guarracino A, Gupta I, Haddad D, Han J, Harris RS, Hartley GA, Harvey WT, Hiller M, Hoekzema K, Houck ML, Jeong H, Kamali K, Kellis M, Kille B, Lee C, Lee Y, Lees W, Lewis AP, Li Q, Loftus M, Loh YHE, Loucks H, Ma J, Mao Y, Martinez JFI, Masterson P, McCoy RC, McGrath B, McKinney S, Meyer BS, Miga KH, Mohanty SK, Munson KM, Pal K, Pennell M, Pevzner PA, Porubsky D, Potapova T, Ringeling FR, Rocha JL, Ryder OA, Sacco S, Saha S, Sasaki T, Schatz MC, Schork NJ, Shanks C, Smeds L, Son DR, Steiner C, Sweeten AP, Tassia MG, Thibaud-Nissen F, Torres-González E, Trivedi M, Wei W, Wertz J, Yang M, Zhang P, Zhang S, Zhang Y, Zhang Z, Zhao SA, Zhu Y, Jarvis ED, Gerton JL, Rivas-González I, Paten B, Szpiech ZA, Huber CD, Lenz TL, Konkel MK, Yi SV, Canzar S, Watson CT, Sudmant PH, Molloy E, Garrison E, Lowe CB, Ventura M, O’Neill RJ, Koren S, Makova KD, Phillippy AM, Eichler EE. Complete sequencing of ape genomes. bioRxiv 2024:2024.07.31.605654. [PMID: 39131277 PMCID: PMC11312596 DOI: 10.1101/2024.07.31.605654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]

Affiliation(s)

DongAhn Yoo Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Arang Rhie Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Prajna Hebbar UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Francesca Antonacci Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, 70124, Italy
Glennis A. Logsdon Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19103, USA
Steven J. Solar Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Dmitry Antipov Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Brandon D. Pickett Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Yana Safonova Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA 16801, USA
Francesco Montinaro Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, 70124, Italy Institute of Genomics, University of Tartu, Tartu, Estonia
Yanting Luo Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
Joanna Malukiewicz Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, 20146 Hamburg, Germany
Jessica M. Storer Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
Jiadong Lin Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Abigail N. Sequeira Department of Biology, Penn State University, University Park, PA 16802, USA
Riley J. Mangan Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Genetics Training Program, Harvard Medical School, Boston, MA 02115, USA
Glenn Hickey UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Graciela Monfort Anez Stowers Institute for Medical Research, Kansas City, MO 64110, USA
Parithi Balachandran The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Anton Bankevich Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA 16801, USA
Christine R. Beck Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA
Arjun Biddanda Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
Matthew Borchers Stowers Institute for Medical Research, Kansas City, MO 64110, USA
Gerard G. Bouffard NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Emry Brannan Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
Shelise Y. Brooks NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Lucia Carbone Department of Medicine, KCVI, Oregon Health Sciences University, Portland, OR, USA Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA
Laura Carrel PSU Medical School, Penn State University School of Medicine, Hershey, PA, USA
Agnes P. Chan The Translational Genomics Research Institute, a part of the City of Hope National Medical Center, Phoenix, AZ, USA
Juyun Crawford NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Mark Diekhans UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Eric Engelbrecht Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
Cedric Feschotte Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
Giulio Formenti Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10021, USA
Gage H. Garcia Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Luciana de Gennaro Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, 70124, Italy
David Gilbert San Diego Biomedical Research Institute, San Diego, CA, USA
Richard E. Green University of California Santa Cruz, Santa Cruz, CA, USA
Andrea Guarracino Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
Ishaan Gupta Department of Computer Science and Engineering, University of California San Diego, CA, USA
Diana Haddad National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Junmin Han Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Robert S. Harris Department of Biology, Penn State University, University Park, PA 16802, USA
Gabrielle A. Hartley Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
William T. Harvey Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Michael Hiller LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Research Institute, Goethe University, Frankfurt, Germany
Kendra Hoekzema Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Marlys L. Houck San Diego Zoo Wildlife Alliance, Escondido, CA, 92027-7000, USA
Hyeonsoo Jeong Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
Kaivan Kamali Department of Biology, Penn State University, University Park, PA 16802, USA
Manolis Kellis Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Bryce Kille Department of Computer Science, Rice University, Houston, TX 77005, USA
Chul Lee Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
Youngho Lee Laboratory of bioinformatics and population genetics, Interdisciplinary program in bioinformatics, Seoul National University, Republic of Korea
William Lees Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
Alexandra P. Lewis Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Qiuhui Li Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
Mark Loftus Department of Genetics & Biochemistry, Clemson University, Clemson, SC, USA Center for Human Genetics, Clemson University, Greenwood, SC, USA
Yong Hwee Eddie Loh Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
Hailey Loucks UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Jian Ma Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, PA, USA
Yafei Mao Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, Zhejiang, China Shanghai Jiao Tong University Chongqing Research Institute, Chongqing, China
Juan F. I. Martinez Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA 16801, USA
Patrick Masterson National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Rajiv C. McCoy Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
Barbara McGrath Department of Biology, Penn State University, University Park, PA 16802, USA
Sean McKinney Stowers Institute for Medical Research, Kansas City, MO 64110, USA
Britta S. Meyer Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, 20146 Hamburg, Germany
Karen H. Miga UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Saswat K. Mohanty Department of Biology, Penn State University, University Park, PA 16802, USA
Katherine M. Munson Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Karol Pal Department of Biology, Penn State University, University Park, PA 16802, USA
Matt Pennell Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
Pavel A. Pevzner Department of Computer Science and Engineering, University of California San Diego, CA, USA
David Porubsky Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Tamara Potapova Stowers Institute for Medical Research, Kansas City, MO 64110, USA
Francisca R. Ringeling Faculty of Informatics and Data Science, University of Regensburg, 93053 Regensburg, Germany
Joana L. Rocha Department of Integrative Biology, University of California, Berkeley, Berkeley, USA
Oliver A. Ryder San Diego Zoo Wildlife Alliance, Escondido, CA, 92027-7000, USA
Samuel Sacco University of California Santa Cruz, Santa Cruz, CA, USA
Swati Saha Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
Takayo Sasaki San Diego Biomedical Research Institute, San Diego, CA, USA
Michael C. Schatz Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
Nicholas J. Schork The Translational Genomics Research Institute, a part of the City of Hope National Medical Center, Phoenix, AZ, USA
Cole Shanks UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Linnéa Smeds Department of Biology, Penn State University, University Park, PA 16802, USA
Dongmin R. Son Department of Ecology, Evolution and Marine Biology, Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
Cynthia Steiner San Diego Zoo Wildlife Alliance, Escondido, CA, 92027-7000, USA
Alexander P. Sweeten Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Michael G. Tassia Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
Françoise Thibaud-Nissen National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Edmundo Torres-González Department of Biology, Penn State University, University Park, PA 16802, USA
Mihir Trivedi Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
Wenjie Wei School of Life Sciences, Westlake University, Hangzhou 310024, China National Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, 430070, Wuhan, China
Julie Wertz Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Muyu Yang Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, PA, USA
Panpan Zhang Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
Shilong Zhang Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
Yang Zhang Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, PA, USA
Zhenmiao Zhang Department of Computer Science and Engineering, University of California San Diego, CA, USA
Sarah A. Zhao Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Yixin Zhu Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
Erich D. Jarvis Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA Howard Hughes Medical Institute, Chevy Chase, MD, USA
Jennifer L. Gerton Stowers Institute for Medical Research, Kansas City, MO 64110, USA
Iker Rivas-González Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Benedict Paten UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95060, USA
Zachary A. Szpiech Department of Biology, Penn State University, University Park, PA 16802, USA
Christian D. Huber Department of Biology, Penn State University, University Park, PA 16802, USA
Tobias L. Lenz Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, 20146 Hamburg, Germany
Miriam K. Konkel Department of Genetics & Biochemistry, Clemson University, Clemson, SC, USA Center for Human Genetics, Clemson University, Greenwood, SC, USA
Soojin V. Yi Department of Ecology, Evolution and Marine Biology, Department of Molecular, Cellular and Developmental Biology, Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
Stefan Canzar Faculty of Informatics and Data Science, University of Regensburg, 93053 Regensburg, Germany
Corey T. Watson Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
Peter H. Sudmant Department of Integrative Biology, University of California, Berkeley, Berkeley, USA Center for Computational Biology, University of California, Berkeley, Berkeley, USA
Erin Molloy Department of Computer Science, University of Maryland, College Park, MD 20742, USA
Erik Garrison Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
Craig B. Lowe Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
Mario Ventura Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, 70124, Italy
Rachel J. O’Neill Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA Departments of Molecular and Cell Biology, UConn Storrs, CT, USA
Sergey Koren Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Kateryna D. Makova Department of Biology, Penn State University, University Park, PA 16802, USA
Adam M. Phillippy Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
Evan E. Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA

Li H, Durbin R. Genome assembly in the telomere-to-telomere era. Nat Rev Genet 2024;25:658-670. [PMID: 38649458 DOI: 10.1038/s41576-024-00718-w] [Citation(s) in RCA: 50] [Impact Index Per Article: 50.0] [Accepted: 02/27/2024] [Indexed: 04/25/2024]

Sweeten AP, Schatz MC, Phillippy AM. ModDotPlot-rapid and interactive visualization of tandem repeats. Bioinformatics 2024;40:btae493. [PMID: 39110522 PMCID: PMC11321072 DOI: 10.1093/bioinformatics/btae493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 07/02/2024] [Accepted: 08/05/2024] [Indexed: 08/15/2024] Open

Marçais G, DeBlasio D, Kingsford C. Sketching Methods with Small Window Guarantee Using Minimum Decycling Sets. J Comput Biol 2024;31:597-615. [PMID: 38980804 PMCID: PMC11304339 DOI: 10.1089/cmb.2024.0544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2024] Open

Abstract

Most sequence sketching methods work by selecting specific k-mers from sequences so that the similarity between two sequences can be estimated using only the sketches. Because estimating sequence similarity is much faster using sketches than using sequence alignment, sketching methods are used to reduce the computational requirements of computational biology software. Applications using sketches often rely on properties of the k-mer selection procedure to ensure that using a sketch does not degrade the quality of the results compared with using sequence alignment. Two important examples of such properties are locality and window guarantees, the latter of which ensures that no long region of the sequence goes unrepresented in the sketch. A sketching method with a window guarantee, implicitly or explicitly, corresponds to a decycling set of the de Bruijn graph, which is a set of unavoidable k-mers. Any long enough sequence, by definition, must contain a k-mer from any decycling set (hence, the unavoidable property). Conversely, a decycling set also defines a sketching method by choosing the k-mers from the set as representatives. Although current methods use one of a small number of sketching method families, the space of decycling sets is much larger and largely unexplored. Finding decycling sets with desirable characteristics (e.g., small remaining path length) is a promising approach to discovering new sketching methods with improved performance (e.g., with small window guarantee). The Minimum Decycling Sets (MDSs) are of particular interest because of their minimum size. Only two algorithms, by Mykkeltveit and Champarnaud, are previously known to generate two particular MDSs, although there are typically a vast number of alternative MDSs. We provide a simple method to enumerate MDSs. This method allows one to explore the space of MDSs and to find MDSs optimized for desirable properties. 
We give evidence that the Mykkeltveit sets are close to optimal regarding one particular property, the remaining path length. A number of conjectures and computational and theoretical evidence to support them are presented. Code available at https://github.com/Kingsford-Group/mdsscope.
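
The definition of a decycling set used in the abstract above can be illustrated with a small sketch. This is not the authors' code (their implementation is in the linked mdsscope repository); the function names `remaining_graph` and `is_decycling_set` and the binary alphabet are assumptions made for illustration. The sketch builds the order-k de Bruijn graph, deletes a candidate set of k-mers, and checks whether any cycle survives.

```python
from itertools import product


def remaining_graph(removed, k, alphabet="01"):
    """Adjacency lists of the order-k de Bruijn graph after deleting `removed`.

    Nodes are k-mers; there is an edge u -> v when the last k-1 symbols
    of u equal the first k-1 symbols of v.
    """
    nodes = {"".join(p) for p in product(alphabet, repeat=k)} - set(removed)
    return {u: [u[1:] + c for c in alphabet if u[1:] + c in nodes] for u in nodes}


def is_decycling_set(removed, k, alphabet="01"):
    """Return True if deleting `removed` leaves an acyclic graph.

    Uses Kahn's topological sort: the graph is acyclic exactly when
    every remaining node can be peeled off at in-degree zero.
    """
    graph = remaining_graph(removed, k, alphabet)
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    stack = [u for u, d in indegree.items() if d == 0]
    peeled = 0
    while stack:
        u = stack.pop()
        peeled += 1
        for v in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                stack.append(v)
    return peeled == len(graph)


# Hypothetical candidates for binary 3-mers.
print(is_decycling_set({"000", "111", "010", "011"}, 3))  # hits every cycle
print(is_decycling_set({"000", "111", "010"}, 3))         # 011 -> 110 -> 101 -> 011 survives
```

For binary 3-mers, the four-element set in the first call intersects every cycle, so any sufficiently long binary sequence must contain one of its k-mers. Dropping `011` leaves the cycle through `011`, `110`, and `101`, so the second set gives no window guarantee.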

Sweeten AP, Schatz MC, Phillippy AM. ModDotPlot-Rapid and interactive visualization of complex repeats. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.15.589623. [PMID: 38712106 PMCID: PMC11071298 DOI: 10.1101/2024.04.15.589623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]