1
Benschop CCG, van der Gaag KJ, de Vreede J, Backx AJ, de Leeuw RH, Zuñiga S, Hoogenboom J, de Knijff P, Sijen T. Application of a probabilistic genotyping software to MPS mixture STR data is supported by similar trends in LRs compared with CE data. Forensic Sci Int Genet 2021; 52:102489. [PMID: 33677249 DOI: 10.1016/j.fsigen.2021.102489] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/14/2020] [Revised: 02/03/2021] [Accepted: 02/24/2021] [Indexed: 02/06/2023]
Abstract
The interpretation of short tandem repeat (STR) profiles can be challenging when, for example, alleles are masked due to allele sharing among contributors and/or when they are subject to drop-out, for instance from sample degradation. Mixture interpretation can be improved by increasing the number of STRs and/or loci with a higher discriminatory power. Both capillary electrophoresis (CE, 6-dye) and massively parallel sequencing (MPS) provide a platform for analysing relatively large numbers of autosomal STRs. In addition, MPS enables distinguishing between sequence variants, resulting in enlarged discriminatory power. Also, MPS allows for small amplicon sizes for all loci as spacing is not an issue, which is beneficial with degraded DNA. Altogether, MPS has the potential to increase the weights of evidence for true contributors to (complex) DNA profiles. In this study, likelihood ratio (LR) calculations were performed using STR profiles obtained with two different MPS systems and analysed using different settings: 1) MPS PowerSeq™ Auto System profiles analysed using FDSTools equipped with optimized settings such as noise correction, 2) ForenSeq™ DNA Signature Prep Kit profiles analysed using the default settings in the Universal Analysis Software (UAS), and 3) ForenSeq™ DNA Signature Prep Kit profiles analysed using FDSTools empirically adapted to cope with one-directional reads and provisional, basic settings. The LR calculations used genotyping data for two- to four-person mixtures varying for mixture proportion, level of drop-out and allele sharing and were generated with the continuous model EuroForMix. The LR results for the over 2000 sets of propositions were affected by the variation for the number of markers and analysis settings used in the three approaches. 
Nevertheless, trends for true and non-contributors, effects of replicates, assigned number of contributors, and model validation results were comparable for the three MPS approaches and alike the trends known for CE data. Based on this analogy, we regard the probabilistic interpretation of MPS STR data fit for forensic DNA casework. In addition, guidelines were derived on when to apply LR calculations to MPS autosomal STR data and report the corresponding results.
Affiliation(s)
- Corina C G Benschop
  - Division of Biological Traces, Netherlands Forensic Institute, The Hague, The Netherlands.
- Jennifer de Vreede
  - Division of Biological Traces, Netherlands Forensic Institute, The Hague, The Netherlands.
- Anouk J Backx
  - Division of Biological Traces, Netherlands Forensic Institute, The Hague, The Netherlands.
- Rick H de Leeuw
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.
- Sofia Zuñiga
  - Division of Biological Traces, Netherlands Forensic Institute, The Hague, The Netherlands.
- Jerry Hoogenboom
  - Division of Biological Traces, Netherlands Forensic Institute, The Hague, The Netherlands.
- Peter de Knijff
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.
- Titia Sijen
  - Division of Biological Traces, Netherlands Forensic Institute, The Hague, The Netherlands; University of Amsterdam, Swammerdam Institute for Life Sciences, Amsterdam, The Netherlands.
2
Arindrarto W, Borràs DM, de Groen RAL, van den Berg RR, Locher IJ, van Diessen SAME, van der Holst R, van der Meijden ED, Honders MW, de Leeuw RH, Verlaat W, Jedema I, Kroes WGM, Knijnenburg J, van Wezel T, Vermaat JSP, Valk PJM, Janssen B, de Knijff P, van Bergen CAM, van den Akker EB, 't Hoen PAC, Kiełbasa SM, Laros JFJ, Griffioen M, Veelken H. Comprehensive diagnostics of acute myeloid leukemia by whole transcriptome RNA sequencing. Leukemia 2020; 35:47-61. [PMID: 32127641 PMCID: PMC7787979 DOI: 10.1038/s41375-020-0762-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 02/12/2019] [Revised: 01/17/2020] [Accepted: 02/12/2020] [Indexed: 01/12/2023]
Abstract
Acute myeloid leukemia (AML) is caused by genetic aberrations that also govern the prognosis of patients and guide risk-adapted and targeted therapy. Genetic aberrations in AML are structurally diverse and currently detected by different diagnostic assays. This study sought to establish whole transcriptome RNA sequencing as single, comprehensive, and flexible platform for AML diagnostics. We developed HAMLET (Human AML Expedited Transcriptomics) as bioinformatics pipeline for simultaneous detection of fusion genes, small variants, tandem duplications, and gene expression with all information assembled in an annotated, user-friendly output file. Whole transcriptome RNA sequencing was performed on 100 AML cases and HAMLET results were validated by reference assays and targeted resequencing. The data showed that HAMLET accurately detected all fusion genes and overexpression of EVI1 irrespective of 3q26 aberrations. In addition, small variants in 13 genes that are often mutated in AML were called with 99.2% sensitivity and 100% specificity, and tandem duplications in FLT3 and KMT2A were detected by a novel algorithm based on soft-clipped reads with 100% sensitivity and 97.1% specificity. In conclusion, HAMLET has the potential to provide accurate comprehensive diagnostic information relevant for AML classification, risk assessment and targeted therapy on a single technology platform.
Affiliation(s)
- Wibowo Arindrarto
  - Center for Computational Biology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands; Department of Human Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Daniel M Borràs
  - GenomeScan B.V, 2333 BZ, Leiden, The Netherlands; Department of Chemical Cell Biology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Ruben A L de Groen
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Redmar R van den Berg
  - Department of Human Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Irene J Locher
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Rosalie van der Holst
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- M Willy Honders
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Rick H de Leeuw
  - Forensic Laboratory for DNA Research, Department of Human Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Wina Verlaat
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Inge Jedema
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Wilma G M Kroes
  - Department of Clinical Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Jeroen Knijnenburg
  - Department of Clinical Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Tom van Wezel
  - Department of Pathology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Joost S P Vermaat
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Peter J M Valk
  - Department of Hematology, Erasmus University Medical Center, 3015CN, Rotterdam, The Netherlands.
- Bart Janssen
  - GenomeScan B.V, 2333 BZ, Leiden, The Netherlands.
- Peter de Knijff
  - Forensic Laboratory for DNA Research, Department of Human Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Erik B van den Akker
  - Center for Computational Biology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands; The Delft Bioinformatics Lab, Delft University of Technology, 2628CD, Delft, The Netherlands; Section of Molecular Epidemiology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Peter A C 't Hoen
  - Department of Human Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands; The Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, 6525 GA, Nijmegen, The Netherlands.
- Szymon M Kiełbasa
  - Center for Computational Biology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Jeroen F J Laros
  - Department of Human Genetics, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Marieke Griffioen
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
- Hendrik Veelken
  - Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.
3
Khachatryan L, de Leeuw RH, Kraakman MEM, Pappas N, Te Raa M, Mei H, de Knijff P, Laros JFJ. Taxonomic classification and abundance estimation using 16S and WGS-A comparison using controlled reference samples. Forensic Sci Int Genet 2020; 46:102257. [PMID: 32058299 DOI: 10.1016/j.fsigen.2020.102257] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/09/2019] [Revised: 12/30/2019] [Accepted: 01/27/2020] [Indexed: 12/30/2022]
Abstract
The assessment of microbiome biodiversity is the most common application of metagenomics. While 16S sequencing remains standard procedure for taxonomic profiling of metagenomic data, a growing number of studies have clearly demonstrated biases associated with this method. By using Whole Genome Shotgun sequencing (WGS) metagenomics, most of the known restrictions associated with 16S data are alleviated. However, due to the computationally intensive data analyses and higher sequencing costs, WGS based metagenomics remains a less popular option. Selecting the experiment type that provides a comprehensive, yet manageable amount of information is a challenge encountered in many metagenomics studies. In this work, we created a series of artificial bacterial mixes, each with a different distribution of skin-associated microbial species. These mixes were used to estimate the resolution of two different metagenomic experiments - 16S and WGS - and to evaluate several different bioinformatics approaches for taxonomic read classification. In all test cases, WGS approaches provide much more accurate results, in terms of taxa prediction and abundance estimation, in comparison to those of 16S. Furthermore, we demonstrate that a 16S dataset, analysed using different state of the art techniques and reference databases, can produce widely different results. In light of the fact that most forensic metagenomic analysis are still performed using 16S data, our results are especially important.
Affiliation(s)
- Lusine Khachatryan
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
- Rick H de Leeuw
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
- Margriet E M Kraakman
  - Department of Medical Microbiology, Leiden University Medical Center, Leiden, the Netherlands.
- Nikos Pappas
  - Sequencing Analysis Support Core, Leiden University Medical Center, Leiden, the Netherlands.
- Marije Te Raa
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
- Hailiang Mei
  - Sequencing Analysis Support Core, Leiden University Medical Center, Leiden, the Netherlands.
- Peter de Knijff
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
- Jeroen F J Laros
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands; Department of Clinical Genetics, Leiden University Medical Center, Leiden, the Netherlands.
4
de Leeuw RH, Garnier D, Kroon RMJM, Horlings CGC, de Meijer E, Buermans H, van Engelen BGM, de Knijff P, Raz V. Diagnostics of short tandem repeat expansion variants using massively parallel sequencing and componential tools. Eur J Hum Genet 2018; 27:400-407. [PMID: 30455479 PMCID: PMC6460572 DOI: 10.1038/s41431-018-0302-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/04/2018] [Revised: 10/17/2018] [Accepted: 10/25/2018] [Indexed: 11/09/2022]
Abstract
Short tandem repeats (STRs) are scattered throughout the human genome. Some STRs, like trinucleotide repeat expansion (TRE) variants, cause hereditable disorders. Unambiguous molecular diagnostics of TRE disorders is hampered by current technical limitations imposed by traditional PCR and DNA sequencing methods. Here we report a novel pipeline for TRE variant diagnosis employing the massively parallel sequencing (MPS) combined with an opensource software package (FDSTools), which together are designed to distinguish true STR sequences from STR sequencing artifacts. We show that this approach can improve TRE diagnosis, such as Oculopharyngeal muscular dystrophy (OPMD). OPMD is caused by a trinucleotide expansion in the PABPN1 gene. A short GCN expansion, (GCN[10]), coding for a 10 alanine repeat is not pathogenic, but an alanine expansion is pathogenic. Applying this novel procedure in a Dutch OPMD patient cohort, we found expansion variants from GCN[11] to GCN[16], with the GCN[16] as the most abundant variant. The repeat expansion length did not correlate with clinical features. However, symptom severity was found to correlate with age and with the initial affected muscles, suggesting that aging and muscle-specific factors can play a role in modulating OPMD.
Affiliation(s)
- Rick H de Leeuw
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.
- Dominique Garnier
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.
- Rosemarie M J M Kroon
  - Department of Rehabilitation, Radboud University Medical Centre, Nijmegen, The Netherlands.
- Corinne G C Horlings
  - Department of Neurology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Centre, Nijmegen, The Netherlands.
- Emile de Meijer
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.
- Henk Buermans
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.
- Baziel G M van Engelen
  - Department of Neurology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Centre, Nijmegen, The Netherlands.
- Peter de Knijff
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.
- Vered Raz
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.
5
van der Gaag KJ, de Leeuw RH, Laros JFJ, den Dunnen JT, de Knijff P. Short hypervariable microhaplotypes: A novel set of very short high discriminating power loci without stutter artefacts. Forensic Sci Int Genet 2018; 35:169-175. [PMID: 29852469 DOI: 10.1016/j.fsigen.2018.05.008] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 02/28/2018] [Revised: 05/03/2018] [Accepted: 05/16/2018] [Indexed: 12/12/2022]
Abstract
Since two decades, short tandem repeats (STRs) are the preferred markers for human identification, routinely analysed by fragment length analysis. Here we present a novel set of short hypervariable autosomal microhaplotypes (MH) that have four or more SNPs in a span of less than 70 nucleotides (nt). These MHs display a discriminating power approaching that of STRs and provide a powerful alternative for the analysis of forensic samples that are problematic when the STR fragment size range exceeds the integrity range of severely degraded DNA or when multiple donors contribute to an evidentiary stain and STR stutter artefacts complicate profile interpretation. MH typing was developed using the power of massively parallel sequencing (MPS) enabling new powerful, fast and efficient SNP-based approaches. MH candidates were obtained from queries in data of the 1000 Genomes, and Genome of the Netherlands (GoNL) projects. Wet-lab analysis of 276 globally dispersed samples and 97 samples of nine large CEPH families assisted locus selection and corroboration of informative value. We infer that MHs represent an alternative marker type with good discriminating power per locus (allowing the use of a limited number of loci), small amplicon sizes and absence of stutter artefacts that can be especially helpful when unbalanced mixed samples are submitted for human identification.
Affiliation(s)
- Kristiaan J van der Gaag
  - Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands; Division of Biological Traces, Netherlands Forensic Institute, Laan van Ypenburg 6, 2497GB, The Hague, The Netherlands.
- Rick H de Leeuw
  - Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands.
- Jeroen F J Laros
  - Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands.
- Johan T den Dunnen
  - Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands; Department of Clinical Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands.
- Peter de Knijff
  - Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands.
6
Anvar SY, van der Gaag KJ, van der Heijden JWF, Veltrop MHAM, Vossen RHAM, de Leeuw RH, Breukel C, Buermans HPJ, Verbeek JS, de Knijff P, den Dunnen JT, Laros JFJ. TSSV: a tool for characterization of complex allelic variants in pure and mixed genomes. Bioinformatics 2014; 30:1651-9. [DOI: 10.1093/bioinformatics/btu068] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 12/30/2022]