1
Pusadkar V, Azad RK. Benchmarking Metagenomic Classifiers on Simulated Ancient and Modern Metagenomic Data. Microorganisms 2023; 11:2478. [PMID: 37894136 PMCID: PMC10609333 DOI: 10.3390/microorganisms11102478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 09/28/2023] [Accepted: 09/29/2023] [Indexed: 10/29/2023] Open
Abstract
Taxonomic profiling of ancient metagenomic samples is challenging due to the accumulation of specific damage patterns on DNA over time. Although a number of methods for metagenome profiling have been developed, most of them have been assessed on modern metagenomes or simulated metagenomes mimicking modern metagenomes. Further, a comparative assessment of metagenome profilers on simulated metagenomes representing a spectrum of degradation depth, from the extremity of ancient (most degraded) to current or modern (not degraded) metagenomes, has not yet been performed. To understand the strengths and weaknesses of different metagenome profilers, we performed their comprehensive evaluation on simulated metagenomes representing human dental calculus microbiome, with the level of DNA damage successively raised to mimic modern to ancient metagenomes. All classes of profilers, namely, DNA-to-DNA, DNA-to-protein, and DNA-to-marker comparison-based profilers were evaluated on metagenomes with varying levels of damage simulating deamination, fragmentation, and contamination. Our results revealed that, compared to deamination and fragmentation, human and environmental contamination of ancient DNA (with modern DNA) has the most pronounced effect on the performance of each profiler. Further, the DNA-to-DNA (e.g., Kraken2, Bracken) and DNA-to-marker (e.g., MetaPhlAn4) based profiling approaches showed complementary strengths, which can be leveraged to elevate the state-of-the-art of ancient metagenome profiling.
Affiliation(s)
- Vaidehi Pusadkar
  - Department of Biological Sciences, University of North Texas, Denton, TX 76203, USA
  - BioDiscovery Institute, University of North Texas, Denton, TX 76203, USA
- Rajeev K. Azad
  - Department of Biological Sciences, University of North Texas, Denton, TX 76203, USA
  - BioDiscovery Institute, University of North Texas, Denton, TX 76203, USA
  - Department of Mathematics, University of North Texas, Denton, TX 76203, USA
2
Settlecowski AE, Marks BD, Manthey JD. Library preparation method and DNA source influence endogenous DNA recovery from 100-year-old avian museum specimens. Ecol Evol 2023; 13:e10407. [PMID: 37565027 PMCID: PMC10410627 DOI: 10.1002/ece3.10407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 06/28/2023] [Accepted: 07/21/2023] [Indexed: 08/12/2023] Open
Abstract
Museum specimens collected prior to cryogenic tissue storage are increasingly being used as genetic resources, and though high-throughput sequencing is becoming more cost-efficient, whole genome sequencing (WGS) of historical DNA (hDNA) remains inefficient and costly due to its short fragment sizes and high loads of exogenous DNA, among other factors. It is also unclear how sequencing efficiency is influenced by DNA sources. We aimed to identify the most efficient method and DNA source for collecting WGS data from avian museum specimens. We analyzed low-coverage WGS from 60 DNA libraries prepared from four American Robin (Turdus migratorius) and four Abyssinian Thrush (Turdus abyssinicus) specimens collected in the 1920s. We compared DNA source (toepad versus incision-line skin clip) and three library preparation methods: (1) double-stranded DNA (dsDNA), single tube (KAPA); (2) single-stranded DNA (ssDNA), multi-tube (IDT); and (3) ssDNA, single tube (Claret Bioscience). We found that the ssDNA, multi-tube method resulted in significantly greater endogenous DNA content, average read length, and sequencing efficiency than the other tested methods. We also tested whether a predigestion step reduced exogenous DNA in libraries from one specimen per species and found promising results that warrant further study. The ~10% increase in average sequencing efficiency of the best-performing method over a commonly implemented dsDNA library preparation method has the potential to significantly increase WGS coverage of hDNA from bird specimens. Future work should evaluate the threshold for specimen age at which these results hold and how the combination of library preparation method and DNA source influence WGS in other taxa.
Affiliation(s)
- Amie E. Settlecowski
  - Bird Collection, Gantz Family Collections Center, The Field Museum, Chicago, Illinois, USA
- Ben D. Marks
  - Bird Collection, Gantz Family Collections Center, The Field Museum, Chicago, Illinois, USA
- Joseph D. Manthey
  - Department of Biological Sciences, Texas Tech University, Lubbock, Texas, USA
3
Bernstein JM, Ruane S. Maximizing Molecular Data From Low-Quality Fluid-Preserved Specimens in Natural History Collections. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.893088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open
Abstract
Over the past decade, museum genomics studies have focused on obtaining DNA of sufficient quality and quantity for sequencing from fluid-preserved natural history specimens, primarily to be used in systematic studies. While these studies have opened windows to evolutionary and biodiversity knowledge of many species worldwide, published works often focus on the success of these DNA sequencing efforts, which is undoubtedly less common than obtaining minimal or sometimes no DNA or unusable sequence data from specimens in natural history collections. Here, we attempt to obtain and sequence DNA extracts from 115 fresh and 41 degraded samples of homalopsid snakes, as well as from two degraded samples of a poorly known snake, Hydrablabes periops. Hydrablabes has been suggested to belong to at least two different families (Natricidae and Homalopsidae) and with no fresh tissues known to be available, intractable museum specimens currently provide the only opportunity to determine this snake’s taxonomic affinity. Although our aim was to generate a target-capture dataset for these samples, to be included in a broader phylogenetic study, results were less than ideal due to large amounts of missing data, especially using the same downstream methods as with standard, high-quality samples. However, rather than discount results entirely, we used mapping methods with references and pseudoreferences, along with phylogenetic analyses, to maximize any usable molecular data from our sequencing efforts, identify the taxonomic affinity of H. periops, and compare sequencing success between fresh and degraded tissue samples. This resulted in largely complete mitochondrial genomes for five specimens and hundreds to thousands of nuclear loci (ultra-conserved loci, anchored-hybrid enrichment loci, and a variety of loci frequently used in squamate phylogenetic studies) from fluid-preserved snakes, including a specimen of H. periops from the Field Museum of Natural History collection. 
We combined our H. periops data with previously published genomic and Sanger-sequenced datasets to confirm the familial designation of this taxon, reject previous taxonomic hypotheses, and make biogeographic inferences for Hydrablabes. A second H. periops specimen, despite being seemingly similar for initial raw sequencing results and after being put through the same protocols, resulted in little usable molecular data. We discuss the successes and failures of using different pipelines and methods to maximize the products from these data and provide expectations for others who are looking to use DNA sequencing efforts on specimens that likely have degraded DNA.
Life Science Identifier (Hydrablabes periops): urn:lsid:zoobank.org:pub:F2AA44E2-D2EF-4747-972A-652C34C2C09D.
4
Irestedt M, Thörn F, Müller IA, Jønsson KA, Ericson PGP, Blom MPK. A guide to avian museomics: Insights gained from resequencing hundreds of avian study skins. Mol Ecol Resour 2022; 22:2672-2684. [PMID: 35661418 PMCID: PMC9542604 DOI: 10.1111/1755-0998.13660] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/15/2021] [Revised: 04/25/2022] [Accepted: 05/23/2022] [Indexed: 11/30/2022]
Abstract
Biological specimens in natural history collections constitute a massive repository of genetic information. Many specimens have been collected in areas in which they no longer exist or in areas where present‐day collecting is not possible. There are also specimens in collections representing populations or species that have gone extinct. Furthermore, species or populations may have been sampled throughout an extensive time period, which is particularly valuable for studies of genetic change through time. With the advent of high‐throughput sequencing, natural history museum resources have become accessible for genomic research. Consequently, these unique resources are increasingly being used across many fields of natural history. In this paper, we summarize our experiences of resequencing hundreds of genomes from historical avian museum specimens. We publish the protocols we have used and discuss the entire workflow from sampling and laboratory procedures, to the bioinformatic processing of historical specimen data.
Affiliation(s)
- Martin Irestedt
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05, Stockholm, Sweden
- Filip Thörn
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
- Ingo A Müller
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
- Knud A Jønsson
  - Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, Copenhagen, Denmark
- Per G P Ericson
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05, Stockholm, Sweden
- Mozes P K Blom
  - Museum für Naturkunde, Leibniz Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
5
Hahn EE, Alexander MR, Grealy A, Stiller J, Gardiner DM, Holleley CE. Unlocking inaccessible historical genomes preserved in formalin. Mol Ecol Resour 2021; 22:2130-2147. [PMID: 34549888 DOI: 10.1111/1755-0998.13505] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/29/2021] [Revised: 09/08/2021] [Accepted: 09/10/2021] [Indexed: 11/27/2022]
Abstract
Museum specimens represent an unparalleled record of historical genomic data. However, the widespread practice of formalin preservation has thus far impeded genomic analysis of a large proportion of specimens. Limited DNA sequencing from formalin-preserved specimens has yielded low genomic coverage with unpredictable success. We set out to refine sample processing methods and to identify specimen characteristics predictive of sequencing success. With a set of taxonomically diverse specimens collected between 1962 and 2006 and ranging in preservation quality, we compared the efficacy of several end-to-end whole genome sequencing workflows alongside a k-mer-based trimming-free read alignment approach to maximize mapping of endogenous sequence. We recovered complete mitochondrial genomes and up to 3× nuclear genome coverage from formalin-preserved tissues. Hot alkaline lysis coupled with phenol-chloroform extraction out-performed proteinase K digestion in recovering DNA, while library preparation method had little impact on sequencing success. The strongest predictor of DNA yield was overall specimen condition, which additively interacts with preservation conditions to accelerate DNA degradation. Here, we demonstrate a significant advance in capability beyond limited recovery of a small number of loci via PCR or target-capture sequencing. To facilitate strategic selection of suitable specimens for genomic sequencing, we present a decision-making framework that utilizes independent and nondestructive assessment criteria. Sequencing of formalin-preserved specimens will contribute to a greater understanding of temporal trends in genetic adaptation, including those associated with a changing climate. Our work enhances the value of museum collections worldwide by unlocking genomes of specimens that have been disregarded as a valid molecular resource.
Affiliation(s)
- Erin E Hahn
  - National Research Collections Australia, Commonwealth Scientific Industrial Research Organisation, Canberra, ACT, Australia
- Marina R Alexander
  - National Research Collections Australia, Commonwealth Scientific Industrial Research Organisation, Canberra, ACT, Australia
- Alicia Grealy
  - National Research Collections Australia, Commonwealth Scientific Industrial Research Organisation, Canberra, ACT, Australia
- Jiri Stiller
  - Agriculture and Food, Commonwealth Scientific Industrial Research Organisation, St Lucia, Qld, Australia
- Donald M Gardiner
  - Agriculture and Food, Commonwealth Scientific Industrial Research Organisation, St Lucia, Qld, Australia
- Clare E Holleley
  - National Research Collections Australia, Commonwealth Scientific Industrial Research Organisation, Canberra, ACT, Australia