Galià-Camps C, Pegueroles C, Turon X, Carreras C, Pascual M. Genome composition and GC content influence loci distribution in reduced representation genomic studies. BMC Genomics 2024; 25:410. [PMID: 38664648 PMCID: PMC11046876 DOI: 10.1186/s12864-024-10312-3]
[Received: 01/08/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024]
Abstract
BACKGROUND
Genomic architecture is a key evolutionary trait for living organisms. Because multiple complex adaptive and neutral forces impose evolutionary pressures on genomes, genomic features vary widely. However, this variability, and the extent to which genomic content determines the distribution of recovered loci in reduced representation sequencing studies, remain largely unexplored.
RESULTS
Using 80 genome assemblies, we observed that whereas plants primarily increase their genome size by expanding their intergenic regions, animals expand both intergenic and intronic regions, although the expansion patterns differ between deuterostomes and protostomes. Loci mapping to introns, exons, and intergenic regions, obtained by in silico digestion using 2b-enzymes, are positively correlated with the percentage of these regions in the corresponding genomes, suggesting that loci distribution mostly mirrors the genomic architecture of the selected taxon. However, exonic regions showed a significant enrichment of loci in all groups regardless of the enzyme used. Moreover, when using selective adaptors to obtain a secondarily reduced loci dataset, the percentage and distribution of retained loci also varied. Adaptors with G/C terminals recovered a lower percentage of selected loci, with a further enrichment of exonic regions, while adaptors with A/T terminals retained a higher percentage of loci and selected slightly more intronic regions than expected.
CONCLUSIONS
Our results highlight how genome composition, genome GC content, RAD enzyme choice and use of base-selective adaptors influence reduced genome representation techniques. This is important to acknowledge in population and conservation genomic studies, as it determines the abundance and distribution of loci.
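The in silico workflow summarized in the abstract (digest a genome with a 2b-enzyme, assign each locus to a genomic category, then apply base-selective adaptors) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The recognition motif `GCANNNNNNTGC`, the flank length of 12 bases, the midpoint rule for assigning categories, and the function names `digest`, `classify`, `category_shares`, and `retained` are all assumptions made for illustration. Selective adaptors are reduced here to a check on the first and last base of each fragment.

```python
import re
from collections import Counter

# Assumed recognition motif for illustration (N = any base). It is palindromic,
# so scanning one strand is enough; a non-palindromic motif would also need a
# scan of the reverse complement.
SITE = "GCA" + "N" * 6 + "TGC"
FLANK = 12  # hypothetical number of bases kept on each side of the site


def motif_to_regex(motif):
    """Compile a motif with N wildcards; the lookahead allows overlapping sites."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")


def digest(seq, motif=SITE, flank=FLANK):
    """Return (start, end, fragment) for every site whose flanks fit in seq."""
    seq = seq.upper()
    loci = []
    for match in motif_to_regex(motif).finditer(seq):
        start = match.start(1) - flank
        end = match.end(1) + flank
        if start >= 0 and end <= len(seq):
            loci.append((start, end, seq[start:end]))
    return loci


def classify(start, end, features):
    """Assign a locus to the first feature containing its midpoint.

    features is a list of (start, end, category) half-open intervals;
    anything not covered is treated as intergenic.
    """
    mid = (start + end) // 2
    for f_start, f_end, category in features:
        if f_start <= mid < f_end:
            return category
    return "intergenic"


def category_shares(loci, features):
    """Fraction of loci falling in each category."""
    counts = Counter(classify(s, e, features) for s, e, _ in loci)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


def retained(fragment, allowed):
    """Simplified selective-adaptor filter: both terminal bases must be allowed."""
    return fragment[0] in allowed and fragment[-1] in allowed


if __name__ == "__main__":
    toy = "A" * 12 + "GCA" + "A" * 6 + "TGC" + "T" * 12
    loci = digest(toy)
    print(category_shares(loci, [(10, 30, "exon")]))
    print([retained(frag, "AT") for _, _, frag in loci])
```

Comparing the output of `category_shares` with the percentage of each region in the genome annotation is the kind of check the abstract describes. Filtering with `retained(fragment, "GC")` versus `retained(fragment, "AT")` mimics, in simplified form, the contrast between G/C-terminal and A/T-terminal adaptors.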