Konar A, Choudhury O, Bullis R, Fiedler L, Kruser JM, Stephens MT, Gailing O, Schlarbaum S, Coggeshall MV, Staton ME, Carlson JE, Emrich S, Romero-Severson J. High-quality genetic mapping with ddRADseq in the non-model tree Quercus rubra. BMC Genomics 2017; 18:417. [PMID: 28558688; PMCID: PMC5450186; DOI: 10.1186/s12864-017-3765-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 07/19/2016] [Accepted: 05/04/2017] [Indexed: 11/10/2022]
Abstract
Background: Restriction site associated DNA sequencing (RADseq) has the potential to be a broadly applicable, low-cost approach for high-quality genetic linkage mapping in forest trees lacking a reference genome. The statistical inference of linear order must be as accurate as possible for the correct ordering of sequence scaffolds and contigs to chromosomal locations. Accurate maps also facilitate the discovery of chromosome segments containing allelic variants conferring resistance to the biotic and abiotic stresses that threaten forest trees worldwide. We used ddRADseq for genetic mapping in the tree Quercus rubra, with an approach optimized to produce a high-quality map. Our study design also enabled us to model the results we would have obtained with less depth of coverage.
Results: Our sequencing design produced a high sequencing depth in the parents (248×) and a moderate sequencing depth (15×) in the progeny. The digital normalization method of generating a de novo reference and the SAMtools SNP variant caller yielded the most SNP calls (78,725). The major drivers of map inflation were multiple SNPs located within the same sequence (77% of SNPs called). The highest quality map was generated with a low level of missing data (5%) and a genome-wide threshold of 0.025 for deviation from Mendelian expectation. The final map included 849 SNP markers (1.8% of the 78,725 SNPs called). Downsampling the individual FASTQ files to model lower depth of coverage revealed that sequencing the progeny using 96 samples per lane would have yielded too few SNP markers to generate a map, even if we had sequenced the parents at depth 248×.
Conclusions: The ddRADseq technology produced enough high-quality SNP markers to make a moderately dense, high-quality map. The success of this project was due to high depth of coverage of the parents, moderate depth of coverage of the progeny, a good framework map, an optimized bioinformatics pipeline, and rigorous premapping filters. The ddRADseq approach is useful for the construction of high-quality genetic maps in organisms lacking a reference genome if the parents and progeny are sequenced at sufficient depth. Technical improvements in reduced representation sequencing (RRS) approaches are needed to reduce the amount of missing data.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3765-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Arpita Konar
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Olivia Choudhury
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Rebecca Bullis
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Lauren Fiedler
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Melissa T Stephens
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Oliver Gailing
  - School of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI, 49931, USA
- Scott Schlarbaum
  - Department of Forestry, Wildlife and Fisheries, University of Tennessee, Knoxville, TN, 37996, USA
- Mark V Coggeshall
  - School of Natural Resources, University of Missouri-Columbia, Columbia, MO, 65211, USA; Hardwood Tree Improvement and Regeneration Center, USDA Forest Service Northern Research Station, West Lafayette, IN, 47907, USA
- Margaret E Staton
  - Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, 37996, USA
- John E Carlson
  - Department of Ecosystem Science and Management, Penn State, University Park, State College, PA, 16802, USA
- Scott Emrich
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Jeanne Romero-Severson
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA