1. Yu L, Fei C, Wang D, Huang R, Xuan W, Guo C, Jing L, Meng W, Yi L, Zhang H, Zhang J. Genome-wide identification, evolution and expression profiles analysis of bHLH gene family in Castanea mollissima. Front Genet 2023; 14:1193953. [PMID: 37252667 PMCID: PMC10213225 DOI: 10.3389/fgene.2023.1193953] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/26/2023] [Accepted: 05/05/2023] [Indexed: 05/31/2023] Open
Abstract
The basic helix-loop-helix (bHLH) transcription factors (TFs) gene family is an important gene family in plants, and participates in regulation of plant apical meristem growth, metabolic regulation and stress resistance. However, its characteristics and potential functions have not been studied in chestnut (Castanea mollissima), an important nut with high ecological and economic value. In the present study, 94 CmbHLHs were identified in chestnut genome, of which 88 were unevenly distributed on chromosomes, and other six were located on five unanchored scaffolds. Almost all CmbHLH proteins were predicted in the nucleus, and subcellular localization demonstrated the correctness of the above predictions. Based on the phylogenetic analysis, all of the CmbHLH genes were divided into 19 subgroups with distinct features. Abundant cis-acting regulatory elements related to endosperm expression, meristem expression, and responses to gibberellin (GA) and auxin were identified in the upstream sequences of CmbHLH genes. This indicates that these genes may have potential functions in the morphogenesis of chestnut. Comparative genome analysis showed that dispersed duplication was the main driving force for the expansion of the CmbHLH gene family inferred to have evolved through purifying selection. Transcriptome analysis and qRT-PCR experiments showed that the expression patterns of CmbHLHs were different in different chestnut tissues, and revealed some members may have potential functions in chestnut buds, nuts, fertile/abortive ovules development. The results from this study will be helpful to understand the characteristics and potential functions of the bHLH gene family in chestnut.
Affiliation(s)
- Liyang Yu
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Cao Fei
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
  - Hebei Key Laboratory of Horticultural Germplasm Excavation and Innovative Utilization, Qinhuangdao, Hebei, China
- Dongsheng Wang
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Ruimin Huang
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Wang Xuan
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Chunlei Guo
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Liu Jing
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Wang Meng
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Lu Yi
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Haie Zhang
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
- Jingzheng Zhang
  - Engineering Research Center of Chestnut Industry Technology, Ministry of Education, Hebei Normal University of Science and Technology, Qinhuangdao, Hebei, China
  - Hebei Collaborative Innovation Center of Chestnut Industry, Qinhuangdao, Hebei, China
  - Hebei Key Laboratory of Horticultural Germplasm Excavation and Innovative Utilization, Qinhuangdao, Hebei, China
2. Jing X, Xu L, Huai X, Zhang H, Zhao F, Qiao Y. Genome-Wide Identification and Characterization of Argonaute, Dicer-like and RNA-Dependent RNA Polymerase Gene Families and Their Expression Analyses in Fragaria spp. Genes (Basel) 2023; 14:121. [PMID: 36672862 PMCID: PMC9859564 DOI: 10.3390/genes14010121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/21/2022] [Revised: 12/19/2022] [Accepted: 12/29/2022] [Indexed: 01/04/2023] Open
Abstract
In the growth and development of plants, some non-coding small RNAs (sRNAs) not only mediate RNA interference at the post-transcriptional level, but also play an important regulatory role in chromatin modification at the transcriptional level. In these processes, the protein factors Argonaute (AGO), Dicer-like (DCL), and RNA-dependent RNA polymerase (RDR) play very important roles in the synthesis of sRNAs respectively. Though they have been identified in many plants, the information about these gene families in strawberry was poorly understood. In this study, using a genome-wide analysis and a phylogenetic approach, 13 AGO, six DCL, and nine RDR genes were identified in diploid strawberry Fragaria vesca. We also identified 33 AGO, 18 DCL, and 28 RDR genes in octoploid strawberry Fragaria × ananassa, studied the expression patterns of these genes in various tissues and developmental stages of strawberry, and researched the response of these genes to some hormones, finding that almost all genes respond to the five hormone stresses. This study is the first report of a genome-wide analysis of AGO, DCL, and RDR gene families in Fragaria spp., in which we provide basic genomic information and expression patterns for these genes. Additionally, this study provides a basis for further research on the functions of these genes and some evidence for the evolution between diploid and octoploid strawberries.
Affiliation(s)
- Xiaotong Jing
  - Laboratory of Fruit Crop Biotechnology, College of Horticulture, Nanjing Agricultural University, No. 1 Weigang, Nanjing 210095, China
- Linlin Xu
  - Institute of Pomology, Jiangsu Academy of Agricultural Sciences/Jiangsu Key Laboratory for Horticultural Crop Genetic Improvement, Nanjing 210014, China
- Xinjia Huai
  - Laboratory of Fruit Crop Biotechnology, College of Horticulture, Nanjing Agricultural University, No. 1 Weigang, Nanjing 210095, China
- Hong Zhang
  - Laboratory of Fruit Crop Biotechnology, College of Horticulture, Nanjing Agricultural University, No. 1 Weigang, Nanjing 210095, China
- Fengli Zhao
  - Laboratory of Fruit Crop Biotechnology, College of Horticulture, Nanjing Agricultural University, No. 1 Weigang, Nanjing 210095, China
- Yushan Qiao
  - Laboratory of Fruit Crop Biotechnology, College of Horticulture, Nanjing Agricultural University, No. 1 Weigang, Nanjing 210095, China
  - Institute of Pomology, Jiangsu Academy of Agricultural Sciences/Jiangsu Key Laboratory for Horticultural Crop Genetic Improvement, Nanjing 210014, China
  - Correspondence:
3. Bai J, Song MJ, Gao J, Li G. Whole genome duplication and dispersed duplication characterize the evolution of the plant PINOID gene family across plant species. Gene 2022; 829:146494. [PMID: 35447241 DOI: 10.1016/j.gene.2022.146494] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/23/2021] [Revised: 04/05/2022] [Accepted: 04/14/2022] [Indexed: 11/16/2022]
Abstract
PINOID is a kinase belonging to the AGCVIII family, which regulates the polar distribution of PIN proteins and plays an important role in plant geotropism. However, the origin and evolutionary history of this gene family is not fully known. In this study, we identified 79 similar sequences across 17 plant species genomes (PINOID, D6PK, PINOID2, "hypothetical kinase"). Our results show that the AGCVIII kinase family may have originated from related "Hypothetical Kinases" that come out sister to the rest of the gene family members. These kinases differentiated their functions are found in different plant classes: D6PK in moss and PINOID and PINOID2 evolving in angiosperms including the pioneer plant Amborella trichopoda. Our study investigates the evolution of PINOID kinases from a phylogenetic perspective giving us insight into how this important plant signal transduction network switch evolved to play a fundamental and important function in plant growth and development. We highlight the importance of whole genome duplications and dispersed duplications as opposed to tandem duplications in the evolution of this gene family.
Affiliation(s)
- Jiangshan Bai
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Michael J Song
  - Department of Biology, California State University East Bay, Hayward, CA, United States of America
- Jian Gao
  - State Key Laboratory of Pharmaceutical Biotechnology, Department of Biotechnology and Pharmaceutical Sciences, School of Life Sciences, Nanjing University, Nanjing, China
- Guiting Li
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
4. Chu C, Zhao B, Park PJ, Lee EA. Identification and Genotyping of Transposable Element Insertions From Genome Sequencing Data. Curr Protoc Hum Genet 2021; 107:e102. [PMID: 32662945 DOI: 10.1002/cphg.102] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
Abstract
Transposable element (TE) mobilization is a significant source of genomic variation and has been associated with various human diseases. The exponential growth of population-scale whole-genome sequencing and rapid innovations in long-read sequencing technologies provide unprecedented opportunities to study TE insertions and their functional impact in human health and disease. Identifying TE insertions, however, is challenging due to the repetitive nature of the TE sequences. Here, we review computational approaches to detecting and genotyping TE insertions using short- and long-read sequencing and discuss the strengths and weaknesses of different approaches. © 2020 Wiley Periodicals LLC.
Affiliation(s)
- Chong Chu
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Boxun Zhao
  - Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts
  - Department of Pediatrics, Harvard Medical School, Boston, Massachusetts
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts
- Peter J Park
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Eunjung Alice Lee
  - Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts
  - Department of Pediatrics, Harvard Medical School, Boston, Massachusetts
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts
5. Orozco-Arias S, Tobon-Orozco N, Piña JS, Jiménez-Varón CF, Tabares-Soto R, Guyot R. TIP_finder: An HPC Software to Detect Transposable Element Insertion Polymorphisms in Large Genomic Datasets. Biology (Basel) 2020; 9:E281. [PMID: 32917036 PMCID: PMC7563458 DOI: 10.3390/biology9090281] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/27/2020] [Revised: 09/01/2020] [Accepted: 09/07/2020] [Indexed: 12/12/2022]
Abstract
Transposable elements (TEs) are non-static genomic units capable of moving indistinctly from one chromosomal location to another. Their insertion polymorphisms may cause beneficial mutations, such as the creation of new gene function, or deleterious in eukaryotes, e.g., different types of cancer in humans. A particular type of TE called LTR-retrotransposons comprises almost 8% of the human genome. Among LTR retrotransposons, human endogenous retroviruses (HERVs) bear structural and functional similarities to retroviruses. Several tools allow the detection of transposon insertion polymorphisms (TIPs) but fail to efficiently analyze large genomes or large datasets. Here, we developed a computational tool, named TIP_finder, able to detect mobile element insertions in very large genomes, through high-performance computing (HPC) and parallel programming, using the inference of discordant read pair analysis. TIP_finder inputs are (i) short pair reads such as those obtained by Illumina, (ii) a chromosome-level reference genome sequence, and (iii) a database of consensus TE sequences. The HPC strategy we propose adds scalability and provides a useful tool to analyze huge genomic datasets in a decent running time. TIP_finder accelerates the detection of transposon insertion polymorphisms (TIPs) by up to 55 times in breast cancer datasets and 46 times in cancer-free datasets compared to the fastest available algorithms. TIP_finder applies a validated strategy to find TIPs, accelerates the process through HPC, and addresses the issues of runtime for large-scale analyses in the post-genomic era. TIP_finder version 1.0 is available at https://github.com/simonorozcoarias/TIP_finder.
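The discordant read-pair idea mentioned in this abstract can be illustrated with a minimal sketch. It is not TIP_finder's implementation: the `ReadPair` record, the `call_insertions` function, and the `window` and `min_support` defaults are all hypothetical names and values chosen for illustration. The sketch assumes each pair has already been aligned, so that one mate is anchored in the reference genome and the other either maps concordantly or hits a TE consensus sequence.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical, simplified record for one aligned read pair."""
    chrom: str               # chromosome of the mate anchored in the reference
    pos: int                 # position of the anchored mate
    mate_te: Optional[str]   # TE family hit by the other mate, None if concordant


Call = Tuple[str, int, int, str, int]  # (chrom, start, end, TE family, support)


def _flush(cluster: List[ReadPair], min_support: int, calls: List[Call]) -> None:
    # Report a cluster only if enough discordant pairs support it.
    if len(cluster) >= min_support:
        first, last = cluster[0], cluster[-1]
        calls.append((first.chrom, first.pos, last.pos, first.mate_te, len(cluster)))


def call_insertions(pairs: List[ReadPair], window: int = 500,
                    min_support: int = 3) -> List[Call]:
    """Cluster discordant pairs whose anchors lie within `window` bp of each other."""
    discordant = sorted(
        (p for p in pairs if p.mate_te is not None),
        key=lambda p: (p.chrom, p.mate_te, p.pos),
    )
    calls: List[Call] = []
    cluster: List[ReadPair] = []
    for p in discordant:
        same_group = bool(cluster) and (p.chrom, p.mate_te) == (
            cluster[-1].chrom, cluster[-1].mate_te)
        if same_group and p.pos - cluster[-1].pos <= window:
            cluster.append(p)
        else:
            _flush(cluster, min_support, calls)
            cluster = [p]
    _flush(cluster, min_support, calls)
    return calls
```

Sorting by chromosome, TE family, and position keeps the clustering to a single linear pass, which is one plausible reason this kind of step lends itself to splitting across workers per chromosome; whether TIP_finder parallelizes in this way is not stated in the abstract.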
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170002, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales 170002, Colombia
- Nicolas Tobon-Orozco
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170002, Colombia
- Johan S. Piña
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170002, Colombia
- Reinel Tabares-Soto
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales 170002, Colombia
- Romain Guyot
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales 170002, Colombia
  - Institut de Recherche pour le Développement (IRD), CIRAD, Université de Montpellier, 34394 Montpellier, France
6. Wang X, Zhang Y, Wang L, Pan Z, He S, Gao Q, Chen B, Gong W, Du X. Casparian strip membrane domain proteins in Gossypium arboreum: genome-wide identification and negative regulation of lateral root growth. BMC Genomics 2020; 21:340. [PMID: 32366264 PMCID: PMC7199351 DOI: 10.1186/s12864-020-6723-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/19/2019] [Accepted: 04/06/2020] [Indexed: 11/28/2022] Open
Abstract
Background: Root systems are critical for plant growth and development. The Casparian strip in root systems is involved in stress resistance and maintaining homeostasis. Casparian strip membrane domain proteins (CASPs) are responsible for the formation of Casparian strips. Results: To investigate the function of CASPs in cotton, we identified and characterized 48, 54, 91 and 94 CASPs from Gossypium arboreum, Gossypium raimondii, Gossypium barbadense and Gossypium hirsutum, respectively, at the genome-wide level. However, only 29 common homologous CASP genes were detected in the four Gossypium species. A collinearity analysis revealed that whole genome duplication (WGD) was the primary reason for the expansion of the genes of the CASP family in the four cotton species. However, dispersed duplication could also contribute to the expansion of the GaCASPs gene family in the ancestors of G. arboreum. Phylogenetic analysis was used to cluster a total of 85 CASP genes from G. arboreum and Arabidopsis into six distinct groups, while the genetic structure and motifs of CASPs were conserved in the same group. Most GaCASPs were expressed in diverse tissues, with the exception of that five GaCASPs (Ga08G0113, Ga08G0114, Ga08G0116, Ga08G0117 and Ga08G0118) that were highly expressed in root tissues. Analyses of the tissue and subcellular localization suggested that GaCASP27 genes (Ga08G0117) are membrane protein genes located in the root. In the GaCASP27 silenced plants and the Arabidopsis mutants, the lateral root number significantly increased. Furthermore, GaMYB36, which is related to root development was found to regulate lateral root growth by targeting GaCASP27. Conclusions: This study provides a fundamental understanding of the CASP gene family in cotton and demonstrates the regulatory role of GaCASP27 on lateral root growth and development.
Affiliation(s)
- Xiaoyang Wang
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Yuanming Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Liyuan Wang
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Zhaoe Pan
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Shoupu He
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Qiong Gao
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Baojun Chen
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Wenfang Gong
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Ministry of Education, Changsha, 410004, China
- Xiongming Du
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
7. Xie T, Zeng L, Chen X, Rong H, Wu J, Batley J, Jiang J, Wang Y. Genome-Wide Analysis of the Lateral Organ Boundaries Domain Gene Family in Brassica Napus. Genes (Basel) 2020; 11:280. [PMID: 32155746 PMCID: PMC7140802 DOI: 10.3390/genes11030280] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/05/2020] [Revised: 03/03/2020] [Accepted: 03/04/2020] [Indexed: 02/08/2023] Open
Abstract
The plant specific LATERAL ORGAN BOUNDARIES (LOB)-domain (LBD) proteins belong to a family of transcription factors that play important roles in plant growth and development, as well as in responses to various stresses. However, a comprehensive study of LBDs in Brassica napus has not yet been reported. In the present study, 126 BnLBD genes were identified in B. napus genome using bioinformatics analyses. The 126 BnLBDs were phylogenetically classified into two groups and nine subgroups. Evolutionary analysis indicated that whole genome duplication (WGD) and segmental duplication played important roles in the expansion of the BnLBD gene family. On the basis of the RNA-seq analyses, we identified BnLBD genes with tissue or developmental specific expression patterns. Through cis-acting element analysis and hormone treatment, we identified 19 BnLBD genes with putative functions in plant response to abscisic acid (ABA) treatment. This study provides a comprehensive understanding on the origin and evolutionary history of LBDs in B. napus, and will be helpful in further functional characterisation of BnLBDs.
Affiliation(s)
- Tao Xie
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
- Lei Zeng
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
- Xin Chen
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
- Hao Rong
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
- Jingjing Wu
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
- Jacqueline Batley
  - School of Biological Sciences, University of Western Australia, Perth, WA 6009, Australia
- Jinjin Jiang
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
  - Correspondence: Tel.: +86-514-87997303
- Youping Wang
  - Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou 225009, China
8. Rajaby R, Sung WK. TranSurVeyor: an improved database-free algorithm for finding non-reference transpositions in high-throughput sequencing data. Nucleic Acids Res 2019; 46:e122. [PMID: 30137425 PMCID: PMC6237741 DOI: 10.1093/nar/gky685] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/09/2018] [Accepted: 07/19/2018] [Indexed: 01/21/2023] Open
Abstract
Transpositions transfer DNA segments between different loci within a genome; in particular, when a transposition is found in a sample but not in a reference genome, it is called a non-reference transposition. They are important structural variations that have clinical impact. Transpositions can be called by analyzing second generation high-throughput sequencing datasets. Current methods follow either a database-based or a database-free approach. Database-based methods require a database of transposable elements. Some of them have good specificity; however this approach cannot detect novel transpositions, and it requires a good database of transposable elements, which is not yet available for many species. Database-free methods perform de novo calling of transpositions, but their accuracy is low. We observe that this is due to the misalignment of the reads; since reads are short and the human genome has many repeats, false alignments create false positive predictions while missing alignments reduce the true positive rate. This paper proposes new techniques to improve database-free non-reference transposition calling: first, we propose a realignment strategy called one-end remapping that corrects the alignments of reads in interspersed repeats; second, we propose a SNV-aware filter that removes some incorrectly aligned reads. By combining these two techniques and other techniques like clustering and positive-to-negative ratio filter, our proposed transposition caller TranSurVeyor shows at least 3.1-fold improvement in terms of F1-score over existing database-free methods. More importantly, even though TranSurVeyor does not use databases of prior information, its performance is at least as good as existing database-based methods such as MELT, Mobster and Retroseq. We also illustrate that TranSurVeyor can discover transpositions that are not known in the current database.
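For readers unfamiliar with the F1-score used in this abstract as the comparison metric, the following sketch shows the standard definition: the harmonic mean of precision and recall. The function name `f1_score` and the counts in the usage line are illustrative assumptions and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1-score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # fraction of calls that are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # fraction of real events that were called
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
example = f1_score(tp=80, fp=20, fn=20)
```

Because the harmonic mean is pulled toward the smaller of the two values, a caller cannot score well by trading many false positives for recall, which matches the abstract's concern that false alignments create false positives while missing alignments reduce the true positive rate.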
Affiliation(s)
- Ramesh Rajaby
  - School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, 28 Medical Drive, 117456, Singapore
- Wing-Kin Sung
  - School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore
  - Genome Institute of Singapore, 60 Biopolis Street, Genome, 138672, Singapore
9. Bae J, Lee KW, Islam MN, Yim HS, Park H, Rho M. iMGEins: detecting novel mobile genetic elements inserted in individual genomes. BMC Genomics 2018; 19:944. [PMID: 30563451 PMCID: PMC6299635 DOI: 10.1186/s12864-018-5290-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/03/2018] [Accepted: 11/20/2018] [Indexed: 11/10/2022] Open
Abstract
Background: Recent advances in sequencing technology have allowed us to investigate personal genomes to find structural variations, which have been studied extensively to identify their association with the physiology of diseases such as cancer. In particular, mobile genetic elements (MGEs) are one of the major constituents of the human genomes, and cause genome instability by insertion, mutation, and rearrangement. Result: We have developed a new program, iMGEins, to identify such novel MGEs by using sequencing reads of individual genomes, and to explore the breakpoints with the supporting reads and MGEs detected. iMGEins is the first MGE detection program that integrates three algorithmic components: discordant read-pair mapping, split-read mapping, and insertion sequence assembly. Our evaluation results showed its outstanding performance in detecting novel MGEs from simulated genomes, as well as real personal genomes. In detail, the average recall and precision rates of iMGEins are 96.67 and 100%, respectively, which are the highest among the programs compared. In the testing with real human genomes of the NA12878 sample, iMGEins shows the highest accuracy in detecting MGEs within 20 bp proximity of the breakpoints annotated. Conclusion: In order to study the dynamics of MGEs in individual genomes, iMGEins was developed to accurately detect breakpoints and report inserted MGEs. Compared with other programs, iMGEins has valuable features of identifying novel MGEs and assembling the MGEs inserted.
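The evaluation criterion described in this abstract, counting a call as correct when it falls within 20 bp of an annotated breakpoint, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' evaluation script: the function name `evaluate_breakpoints`, the one-to-one greedy matching, and the restriction to a single chromosome are all assumptions.

```python
from typing import Iterable, Tuple


def evaluate_breakpoints(predicted: Iterable[int], annotated: Iterable[int],
                         tolerance: int = 20) -> Tuple[float, float]:
    """Return (precision, recall) for breakpoint calls on one chromosome.

    A predicted breakpoint counts as a true positive when an unused annotated
    breakpoint lies within `tolerance` bp; each annotation is matched at most once.
    """
    pred = sorted(predicted)
    truth = sorted(annotated)
    i = j = matched = 0
    # Two-pointer sweep over both sorted lists.
    while i < len(pred) and j < len(truth):
        if abs(pred[i] - truth[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif pred[i] < truth[j]:
            i += 1  # prediction with no annotation nearby: false positive
        else:
            j += 1  # annotation with no prediction nearby: false negative
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(truth) if truth else 0.0
    return precision, recall
```

With hypothetical inputs such as predictions at 1005, 5010 and 9000 and annotations at 1000, 5000, 7000 and 12000, two calls match, giving a precision of 2/3 and a recall of 1/2.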
Affiliation(s)
- Junwoo Bae
  - Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea
- Kyeong Won Lee
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
- Mohammad Nazrul Islam
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
  - Department of Marine Biotechnology, Korea University of Science and Technology, Daejeon, Korea
  - Department of Biotechnology, Sher-e-Bangla Agricultural University, Dhaka, 1207, Bangladesh
- Hyung-Soon Yim
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
  - Department of Marine Biotechnology, Korea University of Science and Technology, Daejeon, Korea
- Heejin Park
  - Department of Computer Science and Engineering, Hanyang University, Seoul, Korea
  - Department of Biomedical Informatics, Hanyang University, Seoul, Korea
- Mina Rho
  - Department of Computer Science and Engineering, Hanyang University, Seoul, Korea
  - Department of Biomedical Informatics, Hanyang University, Seoul, Korea
10. Serrato-Capuchina A, Matute DR. The Role of Transposable Elements in Speciation. Genes (Basel) 2018; 9:E254. [PMID: 29762547 PMCID: PMC5977194 DOI: 10.3390/genes9050254] [Citation(s) in RCA: 118] [Impact Index Per Article: 16.9] [Received: 01/30/2018] [Revised: 04/26/2018] [Accepted: 04/26/2018] [Indexed: 01/20/2023] Open
Abstract
Understanding the phenotypic and molecular mechanisms that contribute to genetic diversity between and within species is fundamental in studying the evolution of species. In particular, identifying the interspecific differences that lead to the reduction or even cessation of gene flow between nascent species is one of the main goals of speciation genetic research. Transposable elements (TEs) are DNA sequences with the ability to move within genomes. TEs are ubiquitous throughout eukaryotic genomes and have been shown to alter regulatory networks, gene expression, and to rearrange genomes as a result of their transposition. However, no systematic effort has evaluated the role of TEs in speciation. We compiled the evidence for TEs as potential causes of reproductive isolation across a diversity of taxa. We find that TEs are often associated with hybrid defects that might preclude the fusion between species, but that the involvement of TEs in other barriers to gene flow different from postzygotic isolation is still relatively unknown. Finally, we list a series of guides and research avenues to disentangle the effects of TEs on the origin of new species.
Affiliation(s)
- Antonio Serrato-Capuchina
  - Biology Department, Genome Sciences Building, University of North Carolina, Chapel Hill, NC 27514, USA
- Daniel R Matute
  - Biology Department, Genome Sciences Building, University of North Carolina, Chapel Hill, NC 27514, USA
11. Ewing AD. Transposable element detection from whole genome sequence data. Mob DNA 2015; 6:24. [PMID: 26719777 PMCID: PMC4696183 DOI: 10.1186/s13100-015-0055-3] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Received: 09/15/2015] [Accepted: 12/21/2015] [Indexed: 11/25/2022] Open
Abstract
The number of software tools available for detecting transposable element insertions from whole genome sequence data has been increasing steadily throughout the last ~5 years. Some of these methods have unique features suiting them for particular use cases, but in general they follow one or more of a common set of approaches. Here, detection and filtering approaches are reviewed in the light of transposable element biology and the current state of whole genome sequencing. We demonstrate that the current state-of-the-art methods still do not produce highly concordant results and provide resources to assist future development in transposable element detection methods.
Affiliation(s)
- Adam D Ewing
  - Mater Research Institute - University of Queensland, 37 Kent St Level 4, Woolloongabba, QLD 4102 Australia