1
Ni Y, Zhang X, Li J, Lu Q, Chen H, Ma B, Liu C. Genetic diversity of Coffea arabica L. mitochondrial genomes caused by repeat-mediated recombination and RNA editing. Front Plant Sci 2023; 14:1261012. [PMID: 37885664 PMCID: PMC10598636 DOI: 10.3389/fpls.2023.1261012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 09/25/2023] [Indexed: 10/28/2023]
Abstract
Background: Coffea arabica L. is one of the most important crops and is widely cultivated in 70 countries across Asia, Africa, and Latin America. Mitochondria are essential organelles that play critical roles in cellular respiration, metabolism, and differentiation. The nuclear and chloroplast genomes of C. arabica have been reported, but its mitochondrial genome has not. Here, we sequenced and characterized its mitochondrial genome to maximize the potential of its genomes for evolutionary studies, molecular breeding, and molecular marker development.
Results: We sequenced the total DNA of C. arabica using Illumina and Nanopore platforms and assembled the mitochondrial genome with a hybrid strategy using Unicycler. The mitochondrial genome comprises two circular chromosomes of 867,678 bp and 153,529 bp, encoding 40 protein-coding genes, 26 tRNA genes, and three rRNA genes. We also detected 270 simple sequence repeats and 34 tandem repeats in the mitochondrial genome. A self-to-self similarity comparison using BLASTn found 515 high-scoring sequence pairs (HSPs), three of which were shown to mediate recombination by mapping long reads. Furthermore, we predicted 472 RNA editing sites using deep-mt with a convolutional neural network model. We then validated 90 randomly selected RNA editing events by PCR amplification and Sanger sequencing; the majority were non-synonymous substitutions and only three were synonymous. These findings provide valuable insights into the genetic characteristics of the C. arabica mitochondrial genome, which can be helpful for future studies on coffee breeding and mitochondrial genome evolution.
Conclusion: Our study sheds new light on the evolution of C. arabica organelle genomes and their potential use in genetic breeding, providing valuable data for developing molecular markers that can improve crop productivity and quality. Furthermore, the discovery of RNA editing events in the mitochondrial genome of C. arabica offers insights into the regulation of gene expression in this species, contributing to a better understanding of coffee genetics and evolution.
Affiliation(s)
- Chang Liu
  - Center for Bioinformatics, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China
2
Guerra-Guimarães L, Pinheiro C, Oliveira ASF, Mira-Jover A, Valverde J, Guedes FADF, Azevedo H, Várzea V, Muñoz Pajares AJ. The chloroplast protein HCF164 is predicted to be associated with Coffea SH9 resistance factor against Hemileia vastatrix. Sci Rep 2023; 13:16019. [PMID: 37749157 PMCID: PMC10520047 DOI: 10.1038/s41598-023-41950-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 09/04/2023] [Indexed: 09/27/2023]
Abstract
To explore the connection between the chloroplast and coffee resistance factors, designated SH1 to SH9, whole genomic DNA of 42 coffee genotypes was sequenced, and entire chloroplast genomes were assembled de novo. The chloroplast phylogenetic haplotype network clustered individuals by species rather than by SH factor. However, for the first time, it allowed the molecular validation of Coffea arabica as the maternal parent of the spontaneous hybrid "Híbrido de Timor". Individual reads were also aligned to the C. arabica reference genome to relate SH factors to chloroplast metabolism, and an in-silico analysis of 132 selected nuclear-encoded chloroplast proteins was performed. The nuclear-encoded thioredoxin-like membrane protein HCF164 enabled the discrimination of individuals with and without the SH9 factor, owing to specific DNA variants linked to chromosome 7c (from the C. canephora-derived sub-genome). The absence of both the thioredoxin domain and the redox-active disulphide center in the HCF164 protein, observed in SH9 individuals, raises the possibility of implications for redox regulation. For the first time, the identification of specific DNA variants of chloroplast proteins allows individuals to be discriminated according to their SH profile. This study introduces an unexplored strategy for identifying proteins/genes associated with SH factors and candidate targets of H. vastatrix effectors, thereby creating new perspectives for coffee breeding programs.
Affiliation(s)
- Leonor Guerra-Guimarães
  - CIFC - Centro de Investigação das Ferrugens do Cafeeiro, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017, Lisboa, Portugal
  - LEAF - Linking Landscape, Environment, Agriculture and Food Research Center, Associated Laboratory TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017, Lisboa, Portugal
- Carla Pinheiro
  - UCIBIO Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2829-516, Caparica, Portugal
  - Associate Laboratory i4HB Institute for Health and Bioeconomy, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2829-516, Caparica, Portugal
- Ana Sofia F Oliveira
  - Center for Computational Chemistry, School of Chemistry, University of Bristol, University Walk, Bristol, BS8 1TS, UK
- Andrea Mira-Jover
  - Departamento de Genética, Universidad de Granada, 18071, Granada, Spain
  - Área de Ecología, Departamento de Biología Aplicada, Universidad Miguel Hernández, Elche, Spain
- Javier Valverde
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Campus de Vairão, 4485-661, Vairão, Portugal
  - Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 26, 41092, Sevilla, Spain
- Fernanda A de F Guedes
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Campus de Vairão, 4485-661, Vairão, Portugal
- Herlander Azevedo
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Campus de Vairão, 4485-661, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade Do Porto, 4099-002, Porto, Portugal
- Vitor Várzea
  - CIFC - Centro de Investigação das Ferrugens do Cafeeiro, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017, Lisboa, Portugal
  - LEAF - Linking Landscape, Environment, Agriculture and Food Research Center, Associated Laboratory TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017, Lisboa, Portugal
- Antonio Jesús Muñoz Pajares
  - Departamento de Genética, Universidad de Granada, 18071, Granada, Spain
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Campus de Vairão, 4485-661, Vairão, Portugal
  - Research Unit Modeling Nature, Universidad de Granada, 18071, Granada, Spain
3
Kim M, Xi H, Park J. Genome-wide comparative analyses of GATA transcription factors among 19 Arabidopsis ecotype genomes: Intraspecific characteristics of GATA transcription factors. PLoS One 2021; 16:e0252181. [PMID: 34038437 PMCID: PMC8153473 DOI: 10.1371/journal.pone.0252181] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/09/2020] [Accepted: 05/11/2021] [Indexed: 12/30/2022]
Abstract
GATA transcription factors (TFs) are widespread eukaryotic regulators whose DNA-binding domain is a class IV zinc finger motif (CX2CX17-20CX2C) followed by a basic region. Because genome sequencing has become inexpensive, multiple strains of individual species have been sequenced; for example, the Plant Genome Database (http://www.plantgenome.info/) holds 2,174 plant genomes from 713 plant species. We therefore carried out a genome-wide investigation of GATA TFs in 19 Arabidopsis thaliana genomes, using the pipeline of the GATA database (http://gata.genefamily.info/), to understand the intraspecific features of Arabidopsis GATA TFs. The numbers of GATA genes and GATA TFs in each A. thaliana genome range from 29 to 30 and from 39 to 42, respectively. Four cases in which the pattern of alternative splicing forms of GATA genes differs among the 19 A. thaliana genomes were identified. In the alignment of GATA domain amino acid sequences, 22 of 2,195 amino acids (1.002%) vary across the 19 ecotype genomes. In addition, a maximum of four different amino acid sequences per GATA domain was identified, indicating that these position-specific amino acid variations may give rise to intraspecific functional variation. Among 15 functionally characterized GATA genes, only five display amino acid variations across ecotypes of A. thaliana, implying variation in their biological roles across natural isolates of A. thaliana. PCA of 28 characteristics of GATA genes reveals four groups, the same as those defined by the number of GATA genes. Topologies of bootstrapped phylogenetic trees of Arabidopsis chloroplasts and common GATA genes are mostly incongruent. Moreover, no relationship was found between geographical distribution and phylogenetic relationships. Our results indicate that intraspecific variations of GATA TFs in A. thaliana are conserved and evolutionarily neutral across the 19 ecotypes, which is consistent with the fact that GATA TFs are among the main regulators controlling essential mechanisms such as seed germination and hypocotyl elongation.
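The zinc finger motif quoted in the abstract (CX2CX17-20CX2C) can be read as a pattern over a protein sequence. The following is only a minimal sketch of that reading as a regular expression; it is not the GATA database pipeline, the function name is hypothetical, and the toy sequence is made up purely to exercise the pattern.

```python
import re

# Class IV zinc finger motif as written in the abstract: C-X2-C-X17-20-C-X2-C,
# where X stands for any amino acid. The basic region that follows the motif
# is not modelled here.
GATA_ZINC_FINGER = re.compile(r"C.{2}C.{17,20}C.{2}C")


def find_gata_zinc_fingers(protein_seq):
    """Return (start, matched_text) for each non-overlapping motif match.

    Hypothetical helper for illustration only.
    """
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in GATA_ZINC_FINGER.finditer(seq)]


# Toy sequence (not a real protein): an 18-residue spacer between the cysteine pairs.
toy = "MA" + "C" + "AA" + "C" + "G" * 18 + "C" + "AA" + "C" + "KR"
hits = find_gata_zinc_fingers(toy)
print(hits)
```

A spacer shorter than 17 or longer than 20 residues would not match, which is the property that distinguishes this motif class in the pattern above.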
Affiliation(s)
- Mangi Kim
  - InfoBoss Inc., Gangnam-gu, Seoul, Republic of Korea
  - InfoBoss Research Center, Gangnam-gu, Seoul, Republic of Korea
- Hong Xi
  - InfoBoss Inc., Gangnam-gu, Seoul, Republic of Korea
  - InfoBoss Research Center, Gangnam-gu, Seoul, Republic of Korea
- Jongsun Park
  - InfoBoss Inc., Gangnam-gu, Seoul, Republic of Korea
  - InfoBoss Research Center, Gangnam-gu, Seoul, Republic of Korea
4
Park J, Xi H, Kim Y. The Complete Chloroplast Genome of Arabidopsis thaliana Isolated in Korea (Brassicaceae): An Investigation of Intraspecific Variations of the Chloroplast Genome of Korean A. thaliana. Int J Genomics 2020; 2020:3236461. [PMID: 32964010 PMCID: PMC7492873 DOI: 10.1155/2020/3236461] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/16/2020] [Revised: 08/02/2020] [Accepted: 08/17/2020] [Indexed: 01/18/2023]
Abstract
Arabidopsis thaliana (L.) Heynh. is a model organism of plant molecular biology. More than 1,700 whole genomes have been sequenced, but no Korean isolate has been sequenced thus far, even though many A. thaliana isolates from Japan and China have been. To understand the genetic background of a Korean natural A. thaliana isolate (named 180404IB4), we present its complete chloroplast genome, which is 154,464 bp long and has four subregions: a large single copy (LSC) region of 85,164 bp and a small single copy (SSC) region of 17,781 bp, separated by inverted repeat (IR) regions of 26,257 bp. The genome contains 130 genes (85 protein-coding genes, eight rRNAs, and 37 tRNAs). Fifty single nucleotide polymorphisms (SNPs) and 14 insertions and deletions (INDELs) were identified between 180404IB4 and Col0. In addition, 101 SSRs and 42 extended SSRs were identified in the Korean A. thaliana chloroplast genome, a number similar to that in the other five chloroplast genomes, with sequence variations occurring preferentially in SSR regions. A nucleotide diversity analysis revealed two highly variable regions in A. thaliana chloroplast genomes. Phylogenetic trees including three more chloroplast genomes of East Asian natural isolates show that the Korean and Chinese natural isolates cluster together, whereas the two Japanese isolates do not, suggesting the need for additional investigation of the chloroplast genomes of East Asian isolates.
Affiliation(s)
- Jongsun Park
  - InfoBoss Inc., 301 Room, 670, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
  - InfoBoss Research Center, 301 Room, 670, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
- Hong Xi
  - InfoBoss Inc., 301 Room, 670, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
  - InfoBoss Research Center, 301 Room, 670, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
- Yongsung Kim
  - InfoBoss Inc., 301 Room, 670, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
  - InfoBoss Research Center, 301 Room, 670, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
5
Kim Y, Park J, Chung Y. The comparison of the complete chloroplast genome of Suaeda japonica Makino presenting different external morphology (Amaranthaceae). Mitochondrial DNA B Resour 2020. [DOI: 10.1080/23802359.2020.1715867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Yongsung Kim
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Jongsun Park
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Youngjae Chung
  - Department of Biology, Shingyeong University, Gyeonggi-do, Republic of Korea
6
Park J, Kim Y, Xi H, Heo KI, Min J, Woo J, Lee D, Seo Y, Kim YH. The complete chloroplast genomes of two cold hardness coffee trees, Coffea arabica L. (Rubiaceae). Mitochondrial DNA B Resour 2020. [DOI: 10.1080/23802359.2020.1715883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Jongsun Park
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Yongsung Kim
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Hong Xi
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Kyoung-In Heo
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Juhyeon Min
  - InfoBoss Co., Ltd, Seoul, Republic of Korea
  - InfoBoss Research Center, Seoul, Republic of Korea
- Jongwook Woo
  - Stronghold Technology, Inc, Seoul, Republic of Korea
- Dukgou Lee
  - Stronghold Technology, Inc, Seoul, Republic of Korea
- Youmi Seo
  - Stronghold Technology, Inc, Seoul, Republic of Korea
7
Kim Y, Yi JS, Min J, Xi H, Kim DY, Son J, Park J, Jeon JI. The complete chloroplast genome of Aconitum coreanum (H. Lév.) Rapaics (Ranunculaceae). Mitochondrial DNA B Resour 2019; 4:3404-3406. [PMID: 33366014 PMCID: PMC7707295 DOI: 10.1080/23802359.2019.1674213] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/09/2019] [Accepted: 09/23/2019] [Indexed: 11/29/2022]
Abstract
Aconitum coreanum (H. Lév.) Rapaics, listed in the Korean Red List, is a medicinal herb. We present its complete chloroplast genome, which is 157,024 bp long and has four subregions: a large single-copy region of 87,637 bp and a small single-copy region of 16,901 bp, separated by two 26,243 bp inverted repeat regions. The genome contains 132 genes (86 protein-coding genes, 8 rRNAs, and 37 tRNAs). The overall GC content of the chloroplast genome is 38.0%. Phylogenetic trees show that A. coreanum occupies a basal position in the subgenus Aconitum clade and that the two A. coreanum isolates from the midwestern and eastern regions of Korea cluster together.
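The four-subregion layout described in the abstract implies that the total length equals the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below, with a hypothetical helper name, simply checks the quoted figures against that relationship.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length (bp) of a chloroplast genome with one LSC, one SSC and two IR copies.

    Hypothetical helper for illustration only.
    """
    return lsc + ssc + 2 * ir


# Lengths quoted in the abstract above, in bp.
reported_total = 157_024
computed_total = quadripartite_length(lsc=87_637, ssc=16_901, ir=26_243)

# True when the subregion lengths account for the whole reported genome.
print(computed_total == reported_total)
```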
Affiliation(s)
- Yongsung Kim
  - InfoBoss Co., Ltd., Seoul, The Republic of Korea
  - InfoBoss Research Center, Seoul, The Republic of Korea
- Jae-Sun Yi
  - Shingu Botanic Garden, Seongnam-si, The Republic of Korea
  - Department of Environmental Horticulture, University of Seoul, Seoul, The Republic of Korea
- Juhyeon Min
  - InfoBoss Co., Ltd., Seoul, The Republic of Korea
  - InfoBoss Research Center, Seoul, The Republic of Korea
- Hong Xi
  - InfoBoss Co., Ltd., Seoul, The Republic of Korea
  - InfoBoss Research Center, Seoul, The Republic of Korea
- Da Yeon Kim
  - Shingu Botanic Garden, Seongnam-si, The Republic of Korea
  - Department of Environmental Horticulture, University of Seoul, Seoul, The Republic of Korea
- Janghyuk Son
  - InfoBoss Co., Ltd., Seoul, The Republic of Korea
  - InfoBoss Research Center, Seoul, The Republic of Korea
- Jongsun Park
  - InfoBoss Co., Ltd., Seoul, The Republic of Korea
  - InfoBoss Research Center, Seoul, The Republic of Korea
- Jeong-Ill Jeon
  - Shingu Botanic Garden, Seongnam-si, The Republic of Korea
  - Department of Horticulture Design, Shingu College, Seongnam-si, The Republic of Korea