1
Foster NR, Taylor D, Hoogewerff J, Aberle MG, de Caritat P, Roffey P, Edwards R, Malik A, Waycott M, Young JM. The secret hidden in dust: Assessing the potential to use biological and chemical properties of the airborne fraction of soil for provenance assignment and forensic casework. Forensic Sci Int Genet 2023; 67:102931. [PMID: 37659257 DOI: 10.1016/j.fsigen.2023.102931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 08/19/2023] [Accepted: 08/21/2023] [Indexed: 09/04/2023]
Abstract
The airborne fraction of soil (dust) is both ubiquitous in nature and contains localised biological and chemical signatures, making it a potential medium for forensic intelligence. Metabarcoding of dust can yield biological communities unique to the site of interest; similarly, geochemical analyses can uncover elements and minerals within dust that can be matched to a geographic location. Combining these analyses presents multiple lines of evidence as to the origin of dust collected from items of interest. In this work, we investigated whether bacterial and fungal communities in dust change through time and whether they are comparable to soil samples of the same site. We integrated dust metabarcoding into a framework amenable to forensic casework (i.e., using calibrated log-likelihood ratios) to predict the origin of dust samples using models constructed from both dust samples and soil samples from the same site. Furthermore, we tested whether both metabarcoding and geochemical/mineralogical analyses could be conducted on a single swabbed sample, for situations where sampling is limited. We found both analyses could generate results from a single swabbed sample and found biological and chemical signatures unique to sites. However, we did find significant variation within sites, which did not always correlate with time but was a random effect of sampling. This variation within sites was not greater than between sites and so did not influence site discrimination. When modelling bacterial and fungal diversity using calibrated log-likelihood ratios, we found samples were correctly predicted using dust 67% and 56% of the time, and using soil 56% and 22% of the time, for bacterial and fungal communities, respectively. Incorrect predictions were related to within-site variability, highlighting limitations to assigning dust provenance using metabarcoding of soil.
Affiliation(s)
- Nicole R Foster
  - College of Science and Engineering, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia
- Duncan Taylor
  - College of Science and Engineering, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia; Forensic Science SA, GPO Box 2790, Adelaide, SA 5001, Australia
- Jurian Hoogewerff
  - National Centre for Forensic Studies, University of Canberra, Bruce, Australian Capital Territory 2617, Australia
- Michael G Aberle
  - National Centre for Forensic Studies, University of Canberra, Bruce, Australian Capital Territory 2617, Australia
- Patrice de Caritat
  - National Centre for Forensic Studies, University of Canberra, Bruce, Australian Capital Territory 2617, Australia; Geoscience Australia, GPO Box 378, Canberra, Australian Capital Territory 2601, Australia
- Paul Roffey
  - National Centre for Forensic Studies, University of Canberra, Bruce, Australian Capital Territory 2617, Australia; Australian Federal Police, GPO Box 401, Canberra, Australian Capital Territory 2601, Australia
- Robert Edwards
  - College of Science and Engineering, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia
- Arif Malik
  - School of Biological Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
- Michelle Waycott
  - School of Biological Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
- Jennifer M Young
  - College of Science and Engineering, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia
2
Geographic source estimation using airborne plant environmental DNA in dust. Sci Rep 2021; 11:16238. [PMID: 34376726 PMCID: PMC8355115 DOI: 10.1038/s41598-021-95702-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/14/2021] [Accepted: 07/22/2021] [Indexed: 02/07/2023] Open
Abstract
Information obtained from the analysis of dust, particularly biological particles such as pollen, plant parts, and fungal spores, has great utility in forensic geolocation. As an alternative to manual microscopic analysis of dust components, we developed a pipeline that utilizes the airborne plant environmental DNA (eDNA) in settled dust to estimate geographic origin. Metabarcoding of settled airborne eDNA was used to identify plant species whose geographic distributions were then derived from occurrence records in the USGS Biodiversity in Service of Our Nation (BISON) database. The distributions for all plant species identified in a sample were used to generate a probabilistic estimate of the sample source. With settled dust collected at four U.S. sites over a 15-month period, we demonstrated positive regional geolocation (within 600 km² of the collection point) with 47.6% (20 of 42) of the samples analyzed. Attribution accuracy and resolution were dependent on the number of plant species identified in a dust sample, which was greatly affected by the season of collection. In dust samples that yielded a minimum of 20 identified plant species, positive regional attribution was achieved with 66.7% (16 of 24 samples). For broader demonstration, citizen-collected dust samples from 31 diverse U.S. sites were analyzed, and trace plant eDNA provided relevant regional attribution information on provenance in 32.2% of samples. This showed that analysis of airborne plant eDNA in settled dust can provide an accurate estimate of regional provenance within the U.S., and relevant forensic information, for a substantial fraction of samples analyzed.
3
Phan NN, Chattopadhyay A, Lee TT, Yin HI, Lu TP, Lai LC, Hwa HL, Tsai MH, Chuang EY. High-performance deep learning pipeline predicts individuals in mixtures of DNA using sequencing data. Brief Bioinform 2021; 22:6345217. [PMID: 34368845 DOI: 10.1093/bib/bbab283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2021] [Revised: 06/20/2021] [Accepted: 07/03/2021] [Indexed: 11/14/2022] Open
Abstract
In this study, we proposed a deep learning (DL) model for classifying individuals from mixtures of DNA samples using 27 short tandem repeats and 94 single nucleotide polymorphisms obtained through a massively parallel sequencing protocol. The model was trained/tested/validated with sequenced data from 6 individuals and then evaluated using mixtures from forensic DNA samples. The model successfully identified both the major and the minor contributors with 100% accuracy for 90 DNA mixtures that were manually prepared by mixing sequence reads of 3 individuals at different ratios. Furthermore, the model identified 100% of the major contributors and 50-80% of the minor contributors in 20 two-sample external-mixed-samples at ratios of 1:39 and 1:9, respectively. To further demonstrate the versatility and applicability of the pipeline, we tested it on whole exome sequence data to classify subtypes of 20 breast cancer patients and achieved an area under the curve of 0.85. Overall, we present, for the first time, a complete pipeline, including sequencing data processing steps and DL steps, that is applicable across different NGS platforms. We also introduce a sliding window approach to overcome the sequence length variation problem of sequencing data, and demonstrate that it improves the model performance dramatically.
Affiliation(s)
- Nam Nhut Phan
  - Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan; Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan
- Amrita Chattopadhyay
  - Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan
- Tsui-Ting Lee
  - Department and Graduate Institute of Forensic Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan
- Hsiang-I Yin
  - Department and Graduate Institute of Forensic Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan
- Tzu-Pin Lu
  - Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei 10055, Taiwan
- Liang-Chuan Lai
  - Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Graduate Institute of Physiology, College of Medicine, National Taiwan University, Taipei 10051, Taiwan
- Hsiao-Lin Hwa
  - Department and Graduate Institute of Forensic Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan
- Mong-Hsun Tsai
  - Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Institute of Biotechnology, National Taiwan University, Taipei 10672, Taiwan; Center of Biotechnology, National Taiwan University, Taipei 10672, Taiwan
- Eric Y Chuang
  - Graduate Institute of Biomedical Electronics and Bioinformatics, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan; Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Master Program for Biomedical Engineering, China Medical University, Taichung 110122, Taiwan
4
Giampaoli S, De Vittori E, Barni F, Anselmo A, Rinaldi T, Baldi M, Miranda KC, Liao A, Brami D, Frajese GV, Berti A. DNA metabarcoding of forensic mycological samples. Egyptian Journal of Forensic Sciences 2021. [DOI: 10.1186/s41935-021-00221-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
Background
DNA metabarcoding and massively parallel sequencing are valuable molecular tools for the characterization of environmental samples. In forensic sciences, the analysis of a sample's fungal population can be highly informative for the estimation of post-mortem interval, the ascertainment of deposition time, the identification of the cause of death, or the location of buried corpses. Unfortunately, metabarcoding data analysis often requires strong bioinformatic capabilities that are not widely available in forensic laboratories.
Results
The present paper describes the adoption of a user-friendly cloud-based application for the identification of fungi in typical forensic samples. The samples were also analyzed through the QIIME pipeline, yielding 88% concordance in the top genus classification results.
Conclusions
The availability of a user-friendly application that can be run without command line activities will increase the popularity of metabarcoding fungal analysis in forensic samples.
5
Young JM, Linacre A. Massively parallel sequencing is unlocking the potential of environmental trace evidence. Forensic Sci Int Genet 2020; 50:102393. [PMID: 33157385 DOI: 10.1016/j.fsigen.2020.102393] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/01/2020] [Revised: 09/03/2020] [Accepted: 09/07/2020] [Indexed: 01/16/2023]
Abstract
Massively parallel sequencing (MPS) has revolutionised the field of genomics, enabling substantial advances in human DNA profiling. Further, the advent of MPS now allows biological signatures to be obtained from complex DNA mixtures and trace amounts of low biomass samples. Environmental samples serve as ideal forms of contact trace evidence, as detection at a scene can establish a link between a suspect, location and victim. Many studies have applied MPS technology to characterise the biodiversity within high biomass environmental samples (such as soil and water) to address questions related to ecology, conservation, climate change and human health. However, translation of these tools to forensic science remains in its infancy, due in part to the merging of traditional forensic ecology practices with unfamiliar DNA technologies and complex datasets. In addition, people and objects also carry low biomass environmental signals, which have recently been shown to reflect a specific individual or location. The sensitivity and decreasing cost of MPS are now unlocking the power of both high and low biomass environmental DNA (eDNA) samples as useful sources of genetic information in forensic science. This paper discusses the potential of eDNA for forensic science by reviewing the most explored applications that are leading the integration of this technology into the field. We introduce novel areas of forensic ecology that could also benefit from these tools, with a focus on linking a suspect to a scene or establishing provenance of an unknown sample, and discuss the current limitations and validation recommendations to achieve translation of eDNA into casework.
Affiliation(s)
- J M Young
  - College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- A Linacre
  - College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia