1
Pires GP, Fioresi VS, Canal D, Canal DC, Fernandes M, Brustolini OJB, de Avelar Carpinetti P, Ferreira A, da Silva Ferreira MF. Effects of trimer repeats on Psidium guajava L. gene expression and prospection of functional microsatellite markers. Sci Rep 2024; 14:9811. [PMID: 38684872; PMCID: PMC11059378; DOI: 10.1038/s41598-024-60417-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 04/23/2024] [Indexed: 05/02/2024] [Open Access]
Abstract
Most research on trinucleotide repeats (TRs) focuses on human diseases, with few studies addressing the impact of TR expansions on plant gene expression. This work investigates TRs' effect on global gene expression in Psidium guajava L., a plant species with widespread distribution and significant relevance in the food, pharmacology, and economics sectors. We analyzed TR-containing coding sequences in 1,107 transcripts from 2,256 genes across root, shoot, young leaf, old leaf, and flower bud tissues of the Brazilian guava cultivars Cortibel RM and Paluma. Structural analysis revealed TR sequences with small repeat numbers (5-9) starting with cytosine or guanine or containing these bases. Functional annotation indicated TR-containing genes' involvement in cellular structures and processes (especially cell membranes and signal recognition), stress response, and resistance. Gene expression analysis showed significant variation, with a subset of highly expressed genes in both cultivars. Differential expression highlighted numerous down-regulated genes in Cortibel RM tissues, but not in Paluma, suggesting interplay between tissues and cultivars. Among 72 differentially expressed genes with TRs, 24 form miRNAs, 13 encode transcription factors, and 11 are associated with transposable elements. In addition, a set of 20 SSR-annotated, transcribed, and differentially expressed genes with TRs was selected as phenotypic markers for Psidium guajava and, potentially, for closely related species as well.
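As a rough illustration of what locating TRs with small repeat numbers in a coding sequence can look like, here is a minimal Python sketch. It is not the pipeline used in the study: the function name `find_trinucleotide_repeats`, the `min_repeats` default of 5, and the choice to skip homopolymer units such as AAA are all assumptions made for this example.

```python
import re


def find_trinucleotide_repeats(seq, min_repeats=5):
    """Return (start, unit, copies) for perfect tandem repeats of a 3-base unit.

    Assumptions for this sketch: only perfect repeats are reported, the
    sequence uses the plain A/C/G/T alphabet, and homopolymer units
    (AAA, CCC, GGG, TTT) are skipped because they are mononucleotide runs.
    """
    # One 3-base unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # mononucleotide run, not a trinucleotide repeat
        hits.append((match.start(), unit, len(match.group(0)) // 3))
    return hits


if __name__ == "__main__":
    toy_cds = "TT" + "CAG" * 6 + "TT"
    print(find_trinucleotide_repeats(toy_cds))
```

Because the regular expression scans left to right, the reported unit is whichever reading of the repeat is met first (for example CAG rather than AGC), so units may need to be normalised before motifs are compared across genes.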
Affiliation(s)
- Giovanna Pinto Pires
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Vinicius Sartori Fioresi
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Drielli Canal
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Dener Cezati Canal
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Miquéias Fernandes
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Otávio José Bernardes Brustolini
  - Laboratório Nacional de Computação Científica (LNCC). Av. Getulio Vargas, 333, Petrópolis, Rio de Janeiro, Quitandinha, 25651-076, Brazil
- Paola de Avelar Carpinetti
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Adésio Ferreira
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil
- Marcia Flores da Silva Ferreira
  - Centro de Ciências Agrárias e Engenharias, Departamento de Agronomia, Universidade Federal Do Espírito Santo, Alto Universitário, s/n, Alegre, ES, 29500-000, Brazil.
2
Luteran EM, Paukstelis PJ. The parallel-stranded d(CGA) duplex is a highly predictable structural motif with two conformationally distinct strands. Acta Crystallogr D Struct Biol 2022; 78:299-309. [PMID: 35234144; PMCID: PMC8900823; DOI: 10.1107/s2059798322000304] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/12/2021] [Accepted: 01/10/2022] [Indexed: 11/10/2022] [Open Access]
Abstract
DNA can adopt noncanonical structures that have important biological functions while also providing structural diversity for applications in nanotechnology. Here, the crystal structures of two oligonucleotides composed of d(CGA) triplet repeats in the parallel-stranded duplex form are described. The structure determination of four unique d(CGA)-based parallel-stranded duplexes across two crystal structures has allowed the structural parameters of d(CGA) triplets in the parallel-stranded duplex form to be characterized and established. These results show that d(CGA) units are highly uniform, but that each strand in the duplex is structurally unique and has a distinct role in accommodating structural asymmetries induced by the C-CH+ base pair.
Affiliation(s)
- Emily M. Luteran
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
- Paul J. Paukstelis
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
3
Katsumata K, Ichikawa Y, Fuse T, Kurumizaka H, Yanagida A, Urano T, Kato H, Shimizu M. Sequence-dependent nucleosome formation in trinucleotide repeats evaluated by in vivo chemical mapping. Biochem Biophys Res Commun 2021; 556:179-184. [PMID: 33839413; DOI: 10.1016/j.bbrc.2021.03.155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2021] [Accepted: 03/28/2021] [Indexed: 11/18/2022]
Abstract
Trinucleotide repeat sequences (TRSs), consisting of 10 unique classes of repeats in DNA, are members of microsatellites and abundantly and non-randomly distributed in many eukaryotic genomes. The lengths of TRSs are mutable, and the expansions of several TRSs are implicated in hereditary neurological diseases. However, the underlying causes of the biased distribution and the dynamic properties of TRSs in the genome remain elusive. Here, we examined the effects of TRSs on nucleosome formation in vivo by histone H4-S47C site-directed chemical cleavages, using well-defined yeast minichromosomes in which each of the ten TRS classes resided in the central region of a positioned nucleosome. We showed that (AAT)12 and (ACT)12 act as strong nucleosome-promoting sequences, while (AGG)12 and (CCG)12 act as nucleosome-excluding sequences in vivo. The local histone binding affinity scores support the idea that nucleosome formation in TRSs, except for (AGG)12, is mainly determined by the affinity for the histone octamers. Overall, our study presents a framework for understanding the nucleosome-forming abilities of TRSs.
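The count of 10 unique repeat classes follows from treating a motif, its cyclic rotations, and the rotations of its reverse complement as the same double-stranded repeat. The sketch below checks that count by brute force; the function names and the choice of the alphabetically smallest member as class label are conventions assumed for this example, not taken from the paper.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif written 5'->3'."""
    return motif.translate(COMPLEMENT)[::-1]


def canonical_class(motif):
    """Alphabetically smallest string among all rotations of the motif
    and all rotations of its reverse complement."""
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


def trinucleotide_repeat_classes():
    """Class labels for all trinucleotide motifs, excluding homopolymers."""
    motifs = ("".join(p) for p in product("ACGT", repeat=3))
    return sorted({canonical_class(m) for m in motifs if len(set(m)) > 1})


if __name__ == "__main__":
    classes = trinucleotide_repeat_classes()
    print(len(classes), classes)
```

Under this convention the (CCG)12 repeat named in the abstract is already in canonical form, while a repeat written as (CGG)n or (GCC)n maps to the same CCG class.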
Affiliation(s)
- Koji Katsumata
  - Department of Chemistry, Graduate School of Science and Engineering, Meisei University, 2-1-1 Hodokubo, Hino, Tokyo, 191-8506, Japan
- Yuichi Ichikawa
  - Division of Cancer Biology, The Cancer Institute of JFCR, 3-8-31 Ariake, Koto-ku, Tokyo, 135-8550, Japan
- Tomohiro Fuse
  - Department of Chemistry, Graduate School of Science and Engineering, Meisei University, 2-1-1 Hodokubo, Hino, Tokyo, 191-8506, Japan
- Hitoshi Kurumizaka
  - Laboratory of Chromatin Structure and Function, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan
- Akio Yanagida
  - School of Pharmacy, Tokyo University of Pharmacy and Life Sciences, 1432-1 Horinouchi, Hachioji, Tokyo, 192-0392, Japan
- Takeshi Urano
  - Department of Biochemistry, Shimane University School of Medicine, 89-1 Enya-cho, Izumo, Shimane, 693-8501, Japan
- Hiroaki Kato
  - Department of Biochemistry, Shimane University School of Medicine, 89-1 Enya-cho, Izumo, Shimane, 693-8501, Japan
- Mitsuhiro Shimizu
  - Department of Chemistry, Graduate School of Science and Engineering, Meisei University, 2-1-1 Hodokubo, Hino, Tokyo, 191-8506, Japan.
4
Li TT, Tang B, Bai X, Wang XL, Luo XN, Yan HB, Zhu HF, Jia H, Liu XL, Liu MY. Development of genome-wide polymorphic microsatellite markers for Trichinella spiralis. Parasit Vectors 2020; 13:58. [PMID: 32046770; PMCID: PMC7014596; DOI: 10.1186/s13071-020-3929-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/31/2019] [Accepted: 02/03/2020] [Indexed: 02/05/2023] [Open Access]
Abstract
Background: Trichinella nematodes are globally distributed food-borne pathogens, among which Trichinella spiralis is the most common species in China. Microsatellites are a powerful tool in population genetics and phylogeographic analysis. However, only a few microsatellite markers have been reported in T. spiralis. Thus, there is a need to develop and validate genome-wide microsatellite markers for T. spiralis. Methods: Microsatellites were selected from shotgun genomic sequences using the MIcroSAtellite identification tool (MISA). The identified markers were validated in 12 isolates of T. spiralis in China. Results: A total of 93,140 microsatellites were identified by MISA from 9267 contigs in T. spiralis genome sequences, of which 16 polymorphic loci were selected for validation by PCR with single larvae from 12 isolates of T. spiralis in China. There were 7–19 alleles per locus (average 11.25 alleles per locus). The observed heterozygosity (HO) and expected heterozygosity (HE) ranged from 0.325 to 0.750 and 0.737 to 0.918, respectively. The polymorphism information content (PIC) ranged from 0.719 to 0.978 (average 0.826). Among the 16 loci, markers for 10 loci could be amplified from all 12 international standard strains of Trichinella spp. Conclusions: Sixteen highly polymorphic markers were selected and validated for T. spiralis. Primary phylogenetic analysis showed that these markers might serve as a useful tool for genetic studies of Trichinella parasites.
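For readers unfamiliar with the statistics quoted above, the sketch below computes HO, HE and PIC for one locus from diploid genotypes using the standard textbook definitions (HE = 1 − Σp², and the PIC formula of Botstein et al.). It is an illustration only: the genotype data are invented, the function names are hypothetical, and no small-sample correction is applied, so it need not match the software or settings used in the study.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(freqs):
    """HE = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    homozygous = sum(p * p for p in freqs)
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for four larvae at a single locus.
    genotypes = [(100, 100), (100, 104), (104, 108), (108, 108)]
    freqs = allele_frequencies(genotypes).values()
    print(observed_heterozygosity(genotypes))
    print(expected_heterozygosity(freqs))
    print(polymorphism_information_content(freqs))
```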
Affiliation(s)
- Ting-Ting Li
  - Key Laboratory of Zoonosis Research, Ministry of Education, Institute of Zoonosis, College of Veterinary Medicine, Jilin University, Changchun, 130062, Jilin, People's Republic of China
- Bin Tang
  - Key Laboratory of Zoonosis Research, Ministry of Education, Institute of Zoonosis, College of Veterinary Medicine, Jilin University, Changchun, 130062, Jilin, People's Republic of China
- Xue Bai
  - Key Laboratory of Zoonosis Research, Ministry of Education, Institute of Zoonosis, College of Veterinary Medicine, Jilin University, Changchun, 130062, Jilin, People's Republic of China
- Xue-Lin Wang
  - Key Laboratory of Zoonosis Research, Ministry of Education, Institute of Zoonosis, College of Veterinary Medicine, Jilin University, Changchun, 130062, Jilin, People's Republic of China
- Xue-Nong Luo
  - State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, CAAS, Lanzhou, 730046, Gansu, People's Republic of China
- Hong-Bin Yan
  - State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, CAAS, Lanzhou, 730046, Gansu, People's Republic of China
- Hong-Fei Zhu
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
- Hong Jia
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, People's Republic of China
- Xiao-Lei Liu
  - Key Laboratory of Zoonosis Research, Ministry of Education, Institute of Zoonosis, College of Veterinary Medicine, Jilin University, Changchun, 130062, Jilin, People's Republic of China.
- Ming-Yuan Liu
  - Key Laboratory of Zoonosis Research, Ministry of Education, Institute of Zoonosis, College of Veterinary Medicine, Jilin University, Changchun, 130062, Jilin, People's Republic of China.
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou, Jiangsu, People's Republic of China.
5
Shimada MK, Sanbonmatsu R, Yamaguchi-Kabata Y, Yamasaki C, Suzuki Y, Chakraborty R, Gojobori T, Imanishi T. Selection pressure on human STR loci and its relevance in repeat expansion disease. Mol Genet Genomics 2016; 291:1851-69. [PMID: 27290643; DOI: 10.1007/s00438-016-1219-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 11/07/2015] [Accepted: 05/21/2016] [Indexed: 12/30/2022]
Abstract
Short Tandem Repeats (STRs) comprise repeats of one to several base pairs. Because of the high mutability due to strand slippage during DNA synthesis, rapid evolutionary change in the number of repeating units directly shapes the range of repeat-number variation according to selection pressure. However, the remaining questions include: Why are STRs causing repeat expansion diseases maintained in the human population; and why are these limited to neurodegenerative diseases? By evaluating the genome-wide selection pressure on STRs using the database we constructed, we identified two different patterns of relationship in repeat-number polymorphisms between DNA and amino-acid sequences, although both patterns are evolutionary consequences of avoiding the formation of harmful long STRs. First, a mixture of degenerate codons is represented in poly-proline (poly-P) repeats. Second, long poly-glutamine (poly-Q) repeats are favored at the protein level; however, at the DNA level, STRs encoding long poly-Qs are frequently divided by synonymous SNPs. Furthermore, significant enrichments of apoptosis and neurodevelopment were biological processes found specifically in genes encoding poly-Qs with repeat polymorphism. This suggests the existence of a specific molecular function for polymorphic and/or long poly-Q stretches. Given that the poly-Qs causing expansion diseases were longer than other poly-Qs, even in healthy subjects, our results indicate that the evolutionary benefits of long and/or polymorphic poly-Q stretches outweigh the risks of long CAG repeats predisposing to pathological hyper-expansions. Molecular pathways in neurodevelopment requiring long and polymorphic poly-Q stretches may provide a clue to understanding why poly-Q expansion diseases are limited to neurodegenerative diseases.
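The point that long poly-Q stretches are often split at the DNA level by synonymous changes can be made concrete with a toy example: CAG and CAA both encode glutamine, so a single CAA inside a CAG tract leaves the poly-Q run intact while interrupting the pure CAG repeat. The sketch below is illustrative only; the sequence is invented, the helper name is hypothetical, and reading frame 0 is assumed.

```python
GLN_CODONS = {"CAG", "CAA"}  # both codons encode glutamine (Q)


def longest_codon_run(cds, codons):
    """Longest run of consecutive in-frame codons drawn from `codons`.

    Assumes `cds` is an upper-case coding sequence read in frame 0.
    """
    best = run = 0
    for i in range(0, len(cds) - 2, 3):
        run = run + 1 if cds[i:i + 3] in codons else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Hypothetical CDS: ten glutamine codons, one of them the synonymous CAA.
    toy_cds = "ATG" + "CAG" * 5 + "CAA" + "CAG" * 4 + "TAA"
    print("poly-Q length (codons):", longest_codon_run(toy_cds, GLN_CODONS))
    print("longest pure CAG run:", longest_codon_run(toy_cds, {"CAG"}))
```

In this toy sequence the protein carries ten consecutive glutamines, yet the longest uninterrupted CAG repeat is only five units long.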
Affiliation(s)
- Makoto K Shimada
  - Institute for Comprehensive Medical Science, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi, 470-1192, Japan.
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan.
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-4-32 Aomi, Koto-ku, Tokyo, 135-8073, Japan.
- Ryoko Sanbonmatsu
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-4-32 Aomi, Koto-ku, Tokyo, 135-8073, Japan
- Yumi Yamaguchi-Kabata
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, 980-8573, Japan
- Chisato Yamasaki
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-4-32 Aomi, Koto-ku, Tokyo, 135-8073, Japan
- Yoshiyuki Suzuki
  - Graduate School of Natural Sciences, Nagoya City University, 1 Yamanohata, Mizuho-cho, Mizuho-ku, Nagoya, Aichi, 467-8501, Japan
- Ranajit Chakraborty
  - Health Science Center, University of North Texas, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
- Takashi Gojobori
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Ibn Al-Haytham Building (West), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Tadashi Imanishi
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Department of Molecular Life Science, Tokai University School of Medicine, 143 Shimokasuya, Isehara, Kanagawa, 259-1193, Japan
6
Rosandić M, Paar V, Glunčić M. Fundamental role of start/stop regulators in whole DNA and new trinucleotide classification. Gene 2013; 531:184-90. [PMID: 24042127; DOI: 10.1016/j.gene.2013.09.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/28/2013] [Revised: 08/31/2013] [Accepted: 09/05/2013] [Indexed: 10/26/2022]
Abstract
The origin and logic of the genetic code are two of the greatest mysteries of the life sciences. Analyzing DNA sequences, we showed that the start/stop trinucleotides have broader importance than just marking the start and stop of exons in coding DNA. On this basis, we introduce a new classification of trinucleotides and show that all A+T rich trinucleotides consisting of three different nucleotides arise from start-ATG, stop-TGA and stop-TAG using their complement, reverse complement and reverse transformations. Due to the same transformations during generations of crossing-over, they can switch from one form to the other. By a direct process, the start-ATG and stop-TAG can irreversibly transform into stop-TAA. By transformation into A+T rich trinucleotides and 16/32 C+G rich trinucleotides, they can lose the start/stop function and take the role of a sense codon in a reversible way. The remaining 16 C+G trinucleotides cannot directly transform into start/stop trinucleotides and thus remain a firm skeleton for structuring the C+G rich DNA. We showed that start/stops strongly enrich the A+T rich noncoding DNA through frequently extended forms. From the evolutionary viewpoint, the start/stops are the chief creators of the prevailing A+T rich noncoding DNA, and of the more stable coding DNA. We propose that start/stops have a basic role as "seeds" in the trinucleotide evolution of noncoding and coding sequences and lead to asymmetry between A+T and C+G rich DNA. By dynamical transformations during evolution, they enabled pronounced phylogenetic broadness while keeping the regulator function.
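The claim about ATG, TGA and TAG can be checked by direct enumeration. The sketch below assumes one particular reading of the abstract: that "A+T rich trinucleotides consisting of three different nucleotides" means the 12 trinucleotides containing A, T and exactly one of C or G. The function and variable names are hypothetical.

```python
from itertools import permutations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transformations(trinucleotide):
    """The trinucleotide together with its complement, reverse,
    and reverse complement."""
    comp = trinucleotide.translate(COMPLEMENT)
    return {trinucleotide, comp, trinucleotide[::-1], comp[::-1]}


# Everything reachable from start-ATG, stop-TGA and stop-TAG.
generated = set().union(*(transformations(t) for t in ("ATG", "TGA", "TAG")))

# Assumed reading: A, T and one of C/G, in any order (2 x 3! = 12 trinucleotides).
at_rich_three_different = {
    "".join(p) for third in "CG" for p in permutations("AT" + third)
}

if __name__ == "__main__":
    print(sorted(generated))
    print(generated == at_rich_three_different)
```

Under that reading the two sets coincide: each of the three seeds contributes four distinct trinucleotides and the three groups do not overlap.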
Affiliation(s)
- Marija Rosandić
  - Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia
7
Meglécz E, Nève G, Biffin E, Gardner MG. Breakdown of phylogenetic signal: a survey of microsatellite densities in 454 shotgun sequences from 154 non model eukaryote species. PLoS One 2012; 7:e40861. [PMID: 22815847; PMCID: PMC3397955; DOI: 10.1371/journal.pone.0040861] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 03/19/2012] [Accepted: 06/14/2012] [Indexed: 11/19/2022] [Open Access]
Abstract
Microsatellites are ubiquitous in Eukaryotic genomes. A more complete understanding of their origin and spread can be gained from a comparison of their distribution within a phylogenetic context. Although information for model species is accumulating rapidly, it is insufficient due to a lack of species depth, thus intragroup variation is necessarily ignored. As such, apparent differences between groups may be overinflated and generalizations cannot be inferred until an analysis of the variation that exists within groups has been conducted. In this study, we examined microsatellite coverage and motif patterns from 454 shotgun sequences of 154 Eukaryote species from eight distantly related phyla (Cnidaria, Arthropoda, Onychophora, Bryozoa, Mollusca, Echinodermata, Chordata and Streptophyta) to test if a consistent phylogenetic pattern emerges from the microsatellite composition of these species. It is clear from our results that data from model species provide incomplete information regarding the existing microsatellite variability within the Eukaryotes. A very strong heterogeneity of microsatellite composition was found within most phyla, classes and even orders. Autocorrelation analyses indicated that while microsatellite contents of species within clades more recent than 200 Mya tend to be similar, the autocorrelation breaks down and becomes negative or non-significant with increasing divergence time. Therefore, the age of the taxon seems to be a primary factor in degrading the phylogenetic pattern present among related groups. The most recent classes or orders of Chordates still retain the pattern of their common ancestor. However, within older groups, such as classes of Arthropods, the phylogenetic pattern has been scrambled by the long independent evolution of the lineages.
Affiliation(s)
- Emese Meglécz
  - IMBE UMR 7263 CNRS IRD, Aix-Marseille University, Marseille, France.
8
Sahu J, Sarmah R, Dehury B, Sarma K, Sahoo S, Sahu M, Barooah M, Modi MK, Sen P. Mining for SSRs and FDMs from expressed sequence tags of Camellia sinensis. Bioinformation 2012; 8:260-6. [PMID: 22493533; PMCID: PMC3321235; DOI: 10.6026/97320630008260] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 03/16/2012] [Accepted: 03/21/2012] [Indexed: 11/23/2022] [Open Access]
Abstract
Simple Sequence Repeats (SSRs) developed from Expressed Sequence Tags (ESTs), known as EST-SSRs, are the most widely used and a potentially valuable source of gene-based markers owing to their high levels of cross-taxon portability and their rapid and less expensive development. The EST sequence information in publicly available databases is increasing at a fast rate. The emerging computational approach provides a better alternative for developing SSR markers from ESTs than conventional methods. In the present study, 12,851 EST sequences of Camellia sinensis, downloaded from the National Center for Biotechnology Information (NCBI), were mined for the development of microsatellites. After preprocessing and assembly of these sequences using various computational tools, 6148 non-redundant EST sequences (4779 singletons and 1369 contigs) were found. Out of the total 3822.68 kb of sequence examined, 1636 (26.61%) EST sequences containing 2371 SSRs were detected, with a density of 1 SSR/1.61 kb, leading to the development of 245 primer pairs. These mined EST-SSR markers will help further the study of variability, mapping, and evolutionary relationships in Camellia sinensis. In addition, these SSRs can also be applied in various studies across species.
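The summary figures in this abstract can be reproduced from the counts it reports. The short Python check below uses only numbers stated above; the variable names are chosen for this example.

```python
# Counts reported in the abstract.
singletons = 4779
contigs = 1369
total_sequence_kb = 3822.68
ests_with_ssr = 1636
ssr_count = 2371

# Non-redundant EST set = singletons + contigs.
nonredundant_ests = singletons + contigs

# Share of non-redundant ESTs that contain at least one SSR.
percent_with_ssr = 100 * ests_with_ssr / nonredundant_ests

# Average spacing: kilobases of sequence examined per SSR found.
kb_per_ssr = total_sequence_kb / ssr_count

if __name__ == "__main__":
    print(nonredundant_ests)           # 6148
    print(round(percent_with_ssr, 2))  # 26.61
    print(round(kb_per_ssr, 2))        # 1.61
```

Note that 2371 SSRs in 1636 ESTs means some ESTs carry more than one SSR, which is why the percentage is computed per EST and the density per kilobase.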
Affiliation(s)
- Jagajjit Sahu
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Ranjan Sarmah
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Budheswar Dehury
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Kishore Sarma
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Smita Sahoo
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Mousumi Sahu
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Madhumita Barooah
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Mahendra Kumar Modi
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Priyabrata Sen
  - Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
9
Hamarsheh O, Amro A. Characterization of simple sequence repeats (SSRs) from Phlebotomus papatasi (Diptera: Psychodidae) expressed sequence tags (ESTs). Parasit Vectors 2011; 4:189. [PMID: 21958493; PMCID: PMC3191335; DOI: 10.1186/1756-3305-4-189] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/01/2011] [Accepted: 09/29/2011] [Indexed: 10/31/2022] [Open Access]
Abstract
BACKGROUND: Phlebotomus papatasi is a natural vector of Leishmania major, which causes cutaneous leishmaniasis in many countries. Simple sequence repeats (SSRs), or microsatellites, are common in eukaryotic genomes and are short, repeated nucleotide sequence elements arrayed in tandem and flanked by non-repetitive regions. The enrichment methods used previously for finding new microsatellite loci in sand flies remain laborious and time consuming; in silico mining, which includes retrieval and screening of microsatellites from large amounts of sequence data in sequence databases using microsatellite search tools, can yield many new candidate markers. RESULTS: Simple sequence repeats (SSRs) were characterized in P. papatasi expressed sequence tags (ESTs) derived from a public database, the National Center for Biotechnology Information (NCBI). A total of 42,784 sequences were mined, and 1,499 SSRs were identified with a frequency of 3.5% and an average density of 15.55 kb per SSR. Dinucleotide motifs were the most common SSRs, accounting for 67%, followed by tri-, tetra-, and penta-nucleotide repeats, accounting for 31.1%, 1.5%, and 0.1%, respectively. The length of microsatellites varied from 5 to 16 repeats. The dinucleotide types AG and CT have the highest frequency. Dinucleotide SSR-ESTs are relatively biased toward an excess of (AX)n repeats and a low GC base content. Forty primer pairs were designed based on motif lengths for further experimental validation. CONCLUSION: The first large-scale survey of SSRs derived from P. papatasi is presented; the dinucleotide SSRs identified are more frequent than other types. EST data mining is an effective strategy to identify functional microsatellites in P. papatasi.
Affiliation(s)
- Omar Hamarsheh
  - Department of Biological Sciences, Faculty of Science and Technology, Al-Quds University, PO Box 51000, Jerusalem, Palestine.
10
Krzyzosiak WJ, Sobczak K, Wojciechowska M, Fiszer A, Mykowska A, Kozlowski P. Triplet repeat RNA structure and its role as pathogenic agent and therapeutic target. Nucleic Acids Res 2011; 40:11-26. [PMID: 21908410; PMCID: PMC3245940; DOI: 10.1093/nar/gkr729] [Citation(s) in RCA: 134] [Impact Index Per Article: 9.6] [Indexed: 11/14/2022] [Open Access]
Abstract
This review presents detailed information about the structure of triplet repeat RNA and addresses the simple sequence repeats of normal and expanded lengths in the context of the physiological and pathogenic roles played in human cells. First, we discuss the occurrence and frequency of various trinucleotide repeats in transcripts and classify them according to the propensity to form RNA structures of different architectures and stabilities. We show that repeats capable of forming hairpin structures are overrepresented in exons, which implies that they may have important functions. We further describe long triplet repeat RNA as a pathogenic agent by presenting human neurological diseases caused by triplet repeat expansions in which mutant RNA gains a toxic function. Prominent examples of these diseases include myotonic dystrophy type 1 and fragile X-associated tremor ataxia syndrome, which are triggered by mutant CUG and CGG repeats, respectively. In addition, we discuss RNA-mediated pathogenesis in polyglutamine disorders such as Huntington's disease and spinocerebellar ataxia type 3, in which expanded CAG repeats may act as an auxiliary toxic agent. Finally, triplet repeat RNA is presented as a therapeutic target. We describe various concepts and approaches aimed at the selective inhibition of mutant transcript activity in experimental therapies developed for repeat-associated diseases.
Affiliation(s)
- Wlodzimierz J Krzyzosiak
  - Laboratory of Cancer Genetics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland.
11
Pan X, Liao Y, Liu Y, Chang P, Liao L, Yang L, Li H. Transcription of AAT•ATT triplet repeats in Escherichia coli is silenced by H-NS and IS1E transposition. PLoS One 2010; 5:e14271. [PMID: 21151567; PMCID: PMC3000339; DOI: 10.1371/journal.pone.0014271] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/23/2010] [Accepted: 11/15/2010] [Indexed: 11/18/2022] [Open Access]
Abstract
Background: The trinucleotide repeats AAT•ATT are simple DNA sequences that potentially form different types of non-B DNA secondary structures and cause genomic instabilities in vivo. Methodology and Principal Findings: The molecular mechanism underlying the maintenance of a 24-triplet AAT•ATT repeat was examined in E. coli by cloning the repeats into the EcoRI site in plasmid pUC18 and into the attB site on the E. coli genome. Either the AAT or the ATT strand acted as the lagging strand template in a replication fork. Propagation of the repeats in either orientation on plasmids did not affect colony morphology when triplet repeat transcription using the lacZ promoter was repressed either by supplementing LacIQ in trans or by adding glucose to the medium. In contrast, transparent colonies were formed by inducing transcription of the repeats, suggesting that transcription of AAT•ATT repeats was toxic to cell growth. Meanwhile, significant IS1E transposition events were observed into the triplet repeats region proximal to the promoter side, into the promoter region of the lacZ gene, and into the AAT•ATT region itself. Transposition reversed the transparent colony phenotype back into healthy, convex colonies. In contrast, transcription of an 8-triplet AAT•ATT repeat in either orientation on plasmids did not produce significant changes in cell morphology and did not promote IS1E transposition events. We further found that a role of IS1E transposition into plasmids was to inhibit transcription through the repeats, which was influenced by the presence of the H-NS protein, but not of its paralogue StpA. Conclusions and Significance: Our findings thus suggest that the longer AAT•ATT triplet repeats in E. coli become vulnerable after transcription. H-NS and the IS1E transposition it facilitates can silence transcription of long triplet repeats and preserve cell growth and survival.
Affiliation(s)
- Xuefeng Pan
  - School of Life Science, Beijing Institute of Technology, Beijing, China.
12
|
Castagnone-Sereno P, Danchin EGJ, Deleury E, Guillemaud T, Malausa T, Abad P. Genome-wide survey and analysis of microsatellites in nematodes, with a focus on the plant-parasitic species Meloidogyne incognita. BMC Genomics 2010; 11:598. [PMID: 20973953 PMCID: PMC3091743 DOI: 10.1186/1471-2164-11-598] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 06/29/2010] [Accepted: 10/25/2010] [Indexed: 11/13/2022] Open
Abstract
Background Microsatellites are the most popular source of molecular markers for studying population genetic variation in eukaryotes. However, few data are currently available about their genomic distribution and abundance across the phylum Nematoda. The recent completion of the genomes of several nematode species, including Meloidogyne incognita, a major agricultural pest worldwide, now opens the way for a comparative survey and analysis of microsatellites in these organisms. Results Using MsatFinder, the total numbers of 1-6 bp perfect microsatellites detected in the complete genomes of five nematode species (Brugia malayi, Caenorhabditis elegans, M. hapla, M. incognita, Pristionchus pacificus) ranged from 2,842 to 61,547, and covered from 0.09 to 1.20% of the nematode genomes. Under our search criteria, the most common repeat motifs for each length class varied among the nematode species considered, with no obvious relation to the AT-richness of their genomes. Overall, (AT)n, (AG)n and (CT)n were the three most frequent dinucleotide microsatellite motifs found in the five genomes considered. Except for two motifs in P. pacificus, all the most frequent trinucleotide motifs were AT-rich, with (AAT)n and (ATT)n being the only ones common to the five nematode species. Particular attention was paid to the microsatellite content of the plant-parasitic species M. incognita. In this species, a repertoire of 4,880 microsatellite loci was identified, of which 2,183 appeared suitable for designing markers for population genetic studies. Interestingly, 1,094 microsatellites were identified in 801 predicted protein-coding regions, 99% of them being trinucleotides. When compared against the InterPro domain database, 497 of these CDS were successfully annotated, and further assigned to Gene Ontology terms.
Conclusions Contrasting patterns of microsatellite abundance and diversity were characterized in five nematode genomes, even in the case of two closely related Meloidogyne species. 2,245 di- to hexanucleotide loci were identified in the genome of M. incognita, providing adequate material for the future development of a wide range of microsatellite markers in this major plant parasite.
|
13
|
Ellison CK, Shaw KL. Mining non-model genomic libraries for microsatellites: BAC versus EST libraries and the generation of allelic richness. BMC Genomics 2010; 11:428. [PMID: 20624300 PMCID: PMC2996956 DOI: 10.1186/1471-2164-11-428] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 01/25/2010] [Accepted: 07/12/2010] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Simple sequence repeats (SSRs) are tandemly repeated sequence motifs common in genomic nucleotide sequence that often harbor significant variation in repeat number. Frequently used as molecular markers, SSRs are increasingly identified via in silico approaches. Two common classes of genomic resources that can be mined are bacterial artificial chromosome (BAC) libraries and expressed sequence tag (EST) libraries. RESULTS 288 SSR loci were screened in the rapidly radiating Hawaiian swordtail cricket genus Laupala. SSRs were more densely distributed and contained longer repeat structures in BAC library-derived sequence than in EST library-derived sequence, although neither repeat density nor length was exceptionally elevated despite the relatively large genome size of Laupala. A non-random distribution favoring AT-rich SSRs was observed. Allelic diversity of SSRs was positively correlated with repeat length and was generally higher in AT-rich repeat motifs. CONCLUSION The first large-scale survey of Orthopteran SSR allelic diversity is presented. Selection contributes more strongly to the size and density distributions of SSR loci derived from EST library sequence than from BAC library sequence, although all SSRs likely are subject to similar physical and structural constraints, such as slippage of DNA replication machinery, that may generate increased allelic diversity in AT-rich sequence motifs. Although in silico approaches work well for SSR locus identification in both EST and BAC libraries, BAC library sequence and AT-rich repeat motifs are generally superior SSR development resources for most applications.
Affiliation(s)
- Kerry L Shaw
- Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14850, USA
|
14
|
Rouchka EC. Database of exact tandem repeats in the Zebrafish genome. BMC Genomics 2010; 11:347. [PMID: 20515480 PMCID: PMC2901318 DOI: 10.1186/1471-2164-11-347] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/10/2009] [Accepted: 06/01/2010] [Indexed: 11/23/2022] Open
Abstract
Background Sequencing of the approximately 1.7 billion bases of the zebrafish genome is currently underway. To date, few high resolution genetic maps exist for the zebrafish genome, based mainly on single nucleotide polymorphisms (SNPs) and short microsatellite repeats. The desire to construct a higher resolution genetic map led to the construction of a database of tandemly repeating elements within the zebrafish Zv8 assembly. Description Exact tandem repeats with a repeat length of at least three bases and a copy number of at least 10 were reported. Repeats with a total length of 250 or fewer bases and their flanking regions were masked for known vertebrate repeats. Optimal primer pairs were computationally designed in the regions flanking the detected repeats. This database of exact tandem repeats can then be used as a resource by molecular biologists with interests in experimentally testing VNTRs within a zebrafish population. Conclusions A total of 116,915 repeats with a base length of at least three nucleotides were detected. The longest of these was a 54-base repeat with fourteen tandem copies. A significant number of repeats with a base length of 18, 24, 27 and 30 were detected, many with potentially novel proline-rich coding regions. Detection of exact tandem repeats in the zebrafish genome leads to a wealth of information regarding potential polymorphic sites for VNTRs. The association of many of these repeats with potentially novel yet similar coding regions yields an exciting potential for disease associated genes. A web interface for querying repeats is available at http://bioinformatics.louisville.edu/zebrafish/. This portal allows users to search for repeats of a selected base size from any valid specified region within the 25 linkage groups.
Affiliation(s)
- Eric C Rouchka
- Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Duthie Center, Room 208, Louisville, KY, USA.
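The reporting criteria in the abstract above (exact tandem repeats, repeat unit of at least three bases, at least 10 copies) can be illustrated with a short regular-expression scan. This is only a minimal sketch under stated assumptions, not the method used to build the database: the function names, the `max_unit` cap, and the restriction to the A/C/G/T alphabet are all hypothetical choices made for the example.

```python
import re


def is_primitive(unit):
    """True if unit is not itself a repetition of a shorter unit (e.g. not 'CAGCAG')."""
    return unit not in (unit + unit)[1:-1]


def find_exact_tandem_repeats(seq, min_unit=3, max_unit=6, min_copies=10):
    """Return (start, unit, copies) for exact tandem repeats found in seq.

    min_unit and min_copies follow the thresholds quoted in the abstract;
    max_unit is an assumption made only to keep the example small.
    """
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A unit of k bases followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // k))
    return hits


example = "TT" + "CAG" * 12 + "TT"
print(find_exact_tandem_repeats(example))
```

The primitive-unit check keeps a (CAG)n tract from being reported a second time as a (CAGCAG)n tract; homopolymer runs are dropped by the same test.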
|
15
|
Kozlowski P, de Mezer M, Krzyzosiak WJ. Trinucleotide repeats in human genome and exome. Nucleic Acids Res 2010; 38:4027-39. [PMID: 20215431 PMCID: PMC2896521 DOI: 10.1093/nar/gkq127] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.8] [Indexed: 11/24/2022] Open
Abstract
Trinucleotide repeats (TNRs) are of interest in genetics because they are used as markers for tracing genotype–phenotype relations and because they are directly involved in numerous human genetic diseases. In this study, we searched the human genome reference sequence and annotated exons (exome) for the presence of uninterrupted triplet repeat tracts composed of six or more repeated units. A list of 32 448 TNRs and 878 TNR-containing genes was generated and is provided herein. We found that some triplet repeats, specifically CNG, are overrepresented, while CTT, ATC, AAC and AAT are underrepresented in exons. This observation suggests that the occurrence of TNRs in exons is not random, but undergoes positive or negative selective pressure. Additionally, TNR types strongly determine their localization in mRNA sections (ORF, UTRs). Most genes containing exon-overrepresented TNRs are associated with gene ontology-defined functions. Surprisingly, many groups of genes that contain TNR types coding for different homo-amino acid tracts associate with the same transcription-related GO categories. We propose that TNRs have potential to be functional genetic elements and that their variation may be involved in the regulation of many common phenotypes; as such, TNR polymorphisms should be considered a priority in association studies.
Affiliation(s)
- Piotr Kozlowski
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland.
|
16
|
Sgaramella V, Astolfi PA. Somatic genome variations interact with environment, genome and epigenome in the determination of the phenotype: A paradigm shift in genomics? DNA Repair (Amst) 2010; 9:470-3. [PMID: 20153268 DOI: 10.1016/j.dnarep.2009.11.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 10/29/2009] [Accepted: 11/03/2009] [Indexed: 01/18/2023]
|
17
|
Pemberton TJ, Sandefur CI, Jakobsson M, Rosenberg NA. Sequence determinants of human microsatellite variability. BMC Genomics 2009; 10:612. [PMID: 20015383 PMCID: PMC2806349 DOI: 10.1186/1471-2164-10-612] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 05/07/2009] [Accepted: 12/16/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microsatellite loci are frequently used in genomic studies of DNA sequence repeats and in population studies of genetic variability. To investigate the effect of sequence properties of microsatellites on their level of variability we have analyzed genotypes at 627 microsatellite loci in 1,048 worldwide individuals from the HGDP-CEPH cell line panel together with the DNA sequences of these microsatellites in the human RefSeq database. RESULTS Calibrating PCR fragment lengths in individual genotypes by using the RefSeq sequence enabled us to infer repeat number in the HGDP-CEPH dataset and to calculate the mean number of repeats (as opposed to the mean PCR fragment length), under the assumption that differences in PCR fragment length reflect differences in the numbers of repeats in the embedded repeat sequences. We find the mean and maximum numbers of repeats across individuals to be positively correlated with heterozygosity. The size and composition of the repeat unit of a microsatellite are also important factors in predicting heterozygosity, with tetra-nucleotide repeat units high in G/C content leading to higher heterozygosity. Finally, we find that microsatellites containing more separate sets of repeated motifs generally have higher heterozygosity. CONCLUSIONS These results suggest that sequence properties of microsatellites have a significant impact in determining the features of human microsatellite variability.
Affiliation(s)
- Trevor J Pemberton
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA.
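The calibration step described in the abstract above, inferring repeat number from PCR fragment length by reference to a sequenced allele, comes down to simple arithmetic once one assumes that all length differences are whole or fractional repeat units. The sketch below is illustrative only; the function name, argument names, and the numbers in the usage lines are hypothetical and are not taken from the HGDP-CEPH data.

```python
from statistics import mean


def infer_repeat_number(fragment_len, ref_fragment_len, ref_repeats, unit_len):
    """Estimate repeat count for one allele.

    Assumes (as the abstract does) that any difference between the observed
    PCR fragment length and the reference fragment length is due solely to a
    change in the number of repeat units of size unit_len.
    """
    return ref_repeats + (fragment_len - ref_fragment_len) / unit_len


# Hypothetical tetranucleotide locus: the reference allele is 200 bp and
# carries 15 repeats.
observed_fragments = [200, 204, 212]
repeat_counts = [infer_repeat_number(f, 200, 15, 4) for f in observed_fragments]
print(repeat_counts, mean(repeat_counts))
```

Working in repeat numbers rather than raw fragment lengths is what makes loci with different flanking-sequence lengths comparable.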
|
18
|
Richard GF, Kerrest A, Dujon B. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes. Microbiol Mol Biol Rev 2008; 72:686-727. [PMID: 19052325 PMCID: PMC2593564 DOI: 10.1128/mmbr.00011-08] [Citation(s) in RCA: 343] [Impact Index Per Article: 20.2] [Indexed: 12/20/2022] Open
Abstract
Repeated elements can be widely abundant in eukaryotic genomes, composing more than 50% of the human genome, for example. It is possible to classify repeated sequences into two large families, "tandem repeats" and "dispersed repeats." Each of these two families can itself be divided into subfamilies. Dispersed repeats contain transposons, tRNA genes, and gene paralogues, whereas tandem repeats contain gene tandems, ribosomal DNA repeat arrays, and satellite DNA, itself subdivided into satellites, minisatellites, and microsatellites. Remarkably, the molecular mechanisms that create and propagate dispersed and tandem repeats are specific to each class and usually do not overlap. In the present review, we have chosen in the first section to describe the nature and distribution of dispersed and tandem repeats in eukaryotic genomes in the light of complete (or nearly complete) available genome sequences. In the second part, we focus on the molecular mechanisms responsible for the fast evolution of two specific classes of tandem repeats: minisatellites and microsatellites. Given that a growing number of human neurological disorders involve the expansion of a particular class of microsatellites, called trinucleotide repeats, a large part of the recent experimental work on microsatellites has focused on these particular repeats, and thus we also review the current knowledge in this area. Finally, we propose a unified definition for mini- and microsatellites that takes into account their biological properties and try to point out new directions that should be explored in the near future on our road to understanding the genetics of repeated sequences.
Affiliation(s)
- Guy-Franck Richard
- Institut Pasteur, Unité de Génétique Moléculaire des Levures, CNRS, URA2171, Université Pierre et Marie Curie, UFR927, 25 rue du Dr. Roux, F-75015, Paris, France.
|
19
|
Soragni E, Herman D, Dent SYR, Gottesfeld JM, Wells RD, Napierala M. Long intronic GAA*TTC repeats induce epigenetic changes and reporter gene silencing in a molecular model of Friedreich ataxia. Nucleic Acids Res 2008; 36:6056-65. [PMID: 18820300 PMCID: PMC2577344 DOI: 10.1093/nar/gkn604] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.9] [Received: 08/01/2008] [Revised: 09/05/2008] [Accepted: 09/05/2008] [Indexed: 12/25/2022] Open
Abstract
Friedreich ataxia (FRDA) is caused by hyperexpansion of GAA*TTC repeats located in the first intron of the FXN gene, which inhibits transcription, leading to frataxin deficiency. The FXN gene is an excellent target for therapeutic intervention since (i) 98% of patients carry the same type of mutation, (ii) the mutation is intronic, thus leaving the FXN coding sequence unaffected and (iii) heterozygous GAA*TTC expansion carriers with an approximately 50% decrease in frataxin are asymptomatic. The discovery of therapeutic strategies for FRDA is hampered by a lack of appropriate molecular models of the disease. Herein, we present the development of a new cell line as a molecular model of FRDA by inserting 560 GAA*TTC repeats into an intron of a GFP reporter minigene. The GFP_(GAA*TTC)(560) minigene recapitulates the molecular hallmarks of the mutated FXN gene, i.e. inhibition of transcription of the reporter gene, decreased levels of the reporter protein and hypoacetylation and hypermethylation of histones in the vicinity of the repeats. Additionally, selected histone deacetylase inhibitors, known to stimulate FXN gene expression, increase the expression of the GFP_(GAA*TTC)(560) reporter. This FRDA model can be adapted to high-throughput analyses in a search for new therapeutics for the disease.
Affiliation(s)
- E. Soragni, D. Herman, S. Y. R. Dent, J. M. Gottesfeld, R. D. Wells, M. Napierala
- Center for Genome Research, Institute of Biosciences and Technology, Texas A&M Health Science Center, 2121 West Holcombe Blvd., Houston, TX, 77030, The Scripps Research Institute, Department of Molecular Biology, 10550 North Torrey Pines Road, La Jolla, CA, 92037 and University of Texas M. D. Anderson Cancer Center, Department of Biochemistry and Molecular Biology and Center for Cancer Epigenetics, 1515 Holcombe Blvd., Houston, TX, 77030, USA
|
20
|
Ruiz-Herrera A, Robinson TJ. Evolutionary plasticity and cancer breakpoints in human chromosome 3. Bioessays 2008; 30:1126-37. [DOI: 10.1002/bies.20829] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/05/2022]
|
21
|
Tan EC, Li H. Characterization of frequencies and distribution of single nucleotide insertions/deletions in the human genome. Gene 2006; 376:268-80. [PMID: 16781088 DOI: 10.1016/j.gene.2006.04.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/09/2005] [Revised: 04/07/2006] [Accepted: 04/12/2006] [Indexed: 11/19/2022]
Abstract
Most of the studies on single nucleotide variations are on substitutions rather than insertions/deletions. In this study, we examined the distribution and characteristics of single nucleotide insertions/deletions (SNindels), using data available from dbSNP for all the human chromosomes. There are almost 300,000 SNindels in the database, of which only 0.8% are validated. They occur at the frequency of 0.887 per 10 kb on average for the whole genome, or approximately 1 for every 11,274 bp. More than half occur in regions with mononucleotide repeats, the longest of which is 47 bases. Overall the mononucleotide repeats involving C and G are much shorter than those for A and T. About 12% are surrounded by palindromes. There is a general correlation between chromosome size and total number for each chromosome. Inter-chromosomal variation in density ranges from 0.6 to 21.7 per kilobase. The overall spectrum shows a very high proportion of SNindels of types -/A and -/T at over 81%. The proportion of -/A and -/T SNindels for each chromosome is correlated to its AT content. Less than half of the SNindels are within or near known genes and even fewer (<0.183%) in coding regions, and more than 1.4% of -/C and -/G are in coding regions, compared to 0.2% for -/A and -/T types. SNindels of -/A and -/T types make up 80% of those found within untranslated regions but less than 40% of those within coding regions. A separate analysis using the subset of 2324 validated SNindels showed a slightly lower AT bias of 74%; SNindels not within mononucleotide repeats showed an even lower AT bias of 58%. Density of validated SNindels is 0.007/10 kb overall and 90% are found within or near genes. Among all chromosomes, Y has the lowest numbers and densities for all SNindels, validated SNindels, and SNindels not within repeats.
Affiliation(s)
- Ene-Choo Tan
- KK Research Centre, KK Women's and Children's Hospital, 100 Bukit Timah Road, Singapore 229899, Singapore.
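The headline frequencies in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below only reuses figures quoted in the abstract; the total of 300,000 SNindels is an assumption standing in for the abstract's "almost 300,000", so the derived percentages are approximate, and the helper name is hypothetical.

```python
def bp_per_event(events_per_10kb):
    """Convert a density given per 10 kb into the average spacing in bp."""
    return 10_000 / events_per_10kb


total_snindels = 300_000   # assumption: the abstract says "almost 300,000"
validated_snindels = 2324  # validated subset reported in the abstract
density_per_10kb = 0.887   # genome-wide density reported in the abstract

spacing_bp = bp_per_event(density_per_10kb)
validated_fraction = validated_snindels / total_snindels
validated_density = density_per_10kb * validated_fraction

print(round(spacing_bp))                   # average spacing in bp
print(round(100 * validated_fraction, 1))  # percent validated
print(round(validated_density, 3))         # validated SNindels per 10 kb
```

Under that assumption, the three derived values agree with the spacing of about 11,274 bp, the 0.8% validated share, and the validated density of 0.007 per 10 kb stated in the abstract.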
|
22
|
Sgaramella V, Salamini F. Gene paucity, genome instability, clonal development: has an individual genome the potential to encode for more than one brain? DNA Repair (Amst) 2006; 5:531-3. [PMID: 16621729 DOI: 10.1016/j.dnarep.2006.03.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/07/2006] [Accepted: 03/07/2006] [Indexed: 11/26/2022]
|
23
|
Clark RM, Bhaskar SS, Miyahara M, Dalgliesh GL, Bidichandani SI. Expansion of GAA trinucleotide repeats in mammals. Genomics 2005; 87:57-67. [PMID: 16316739 DOI: 10.1016/j.ygeno.2005.09.006] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Received: 05/19/2005] [Revised: 09/04/2005] [Accepted: 09/07/2005] [Indexed: 01/29/2023]
Abstract
We have previously shown that GAA trinucleotide repeats have undergone significant expansion in the human genome. Here we present the analysis of the length distribution of all 10 nonredundant trinucleotide repeat motifs in 20 complete eukaryotic genomes (6 mammalian, 2 nonmammalian vertebrates, 4 arthropods, 4 fungi, and 1 each of nematode, amoebozoa, alveolate, and plant), which showed that the abundance of large expansions of GAA trinucleotide repeats is specific to mammals. Analysis of human-chimpanzee-gorilla orthologs revealed that loci with large expansions are species-specific and have occurred after divergence from the common ancestor. PCR analysis of human controls revealed large expansions at multiple human (GAA)(30+) loci; nine loci showed expanded alleles containing >65 triplets, analogous to disease-causing expansions in Friedreich ataxia, including two that are in introns of genes of unknown function. The abundance of long GAA trinucleotide repeat tracts in mammalian genomes represents a significant mutation potential and source of interindividual variability.
Affiliation(s)
- Rhonda M Clark
- Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA
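The figure of "10 nonredundant trinucleotide repeat motifs" in the abstract above follows from treating a motif, its cyclic shifts, and the shifts of its reverse complement as the same repeat, since a tandem array of GAA is also an array of AAG and, on the opposite strand, of TTC. The sketch below enumerates these classes; the function names and the choice of the lexicographically smallest member as the class label are assumptions made for illustration, not a convention taken from the paper.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """All cyclic shifts of a motif, e.g. GAA -> GAA, AAG, AGA."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical(motif):
    """Lexicographically smallest member of the motif's equivalence class."""
    return min(rotations(motif) + rotations(reverse_complement(motif)))


def nonredundant_trinucleotide_motifs():
    """Enumerate one label per class, skipping homopolymers such as AAA."""
    classes = set()
    for bases in product("ACGT", repeat=3):
        motif = "".join(bases)
        if len(set(motif)) > 1:  # AAA, CCC, GGG, TTT are mononucleotide repeats
            classes.add(canonical(motif))
    return sorted(classes)


print(nonredundant_trinucleotide_motifs())
print(canonical("GAA"), canonical("CTG"))
```

With this labelling, GAA and TTC both map to AAG, and CAG and CTG both map to AGC, which is why strand and phase do not inflate the motif count.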
|
24
|
Braun BR, van het Hoog M, d'Enfert C, Martchenko M, Dungan J, Kuo A, Inglis DO, Uhl MA, Hogues H, Berriman M, Lorenz M, Levitin A, Oberholzer U, Bachewich C, Harcus D, Marcil A, Dignard D, Iouk T, Zito R, Frangeul L, Tekaia F, Rutherford K, Wang E, Munro CA, Bates S, Gow NA, Hoyer LL, Köhler G, Morschhäuser J, Newport G, Znaidi S, Raymond M, Turcotte B, Sherlock G, Costanzo M, Ihmels J, Berman J, Sanglard D, Agabian N, Mitchell AP, Johnson AD, Whiteway M, Nantel A. A human-curated annotation of the Candida albicans genome. PLoS Genet 2005; 1:36-57. [PMID: 16103911 PMCID: PMC1183520 DOI: 10.1371/journal.pgen.0010001] [Citation(s) in RCA: 242] [Impact Index Per Article: 12.1] [Received: 01/10/2005] [Accepted: 03/14/2005] [Indexed: 11/24/2022] Open
Abstract
Recent sequencing and assembly of the genome for the fungal pathogen Candida albicans used simple automated procedures for the identification of putative genes. We have reviewed the entire assembly, both by hand and with additional bioinformatic resources, to accurately map and describe 6,354 genes and to identify 246 genes whose original database entries contained sequencing errors (or possibly mutations) that affect their reading frame. Comparison with other fungal genomes permitted the identification of numerous fungus-specific genes that might be targeted for antifungal therapy. We also observed that, compared to other fungi, the protein-coding sequences in the C. albicans genome are especially rich in short sequence repeats. Finally, our improved annotation permitted a detailed analysis of several multigene families, and comparative genomic studies showed that C. albicans has a far greater catabolic range, encoding respiratory Complex 1, several novel oxidoreductases and ketone body degrading enzymes, malonyl-CoA and enoyl-CoA carriers, several novel amino acid degrading enzymes, a variety of secreted catabolic lipases and proteases, and numerous transporters to assimilate the resulting nutrients. The results of these efforts will ensure that the Candida research community has uniform and comprehensive genomic information for medical research as well as for future diagnostic and therapeutic applications.
Affiliation(s)
- Burkhard R Braun
- Department of Microbiology and Immunology, University of California, San Francisco, California, United States of America
- Marco van het Hoog
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Christophe d'Enfert
- Unité Postulante Biologie et Pathogénicité Fongiques, INRA USC 2019, Institut Pasteur, Paris, France
- Mikhail Martchenko
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Jan Dungan
- Department of Stomatology, University of California, San Francisco, California, United States of America
- Alan Kuo
- Department of Stomatology, University of California, San Francisco, California, United States of America
- Diane O Inglis
- Department of Microbiology and Immunology, University of California, San Francisco, California, United States of America
- M. Andrew Uhl
- Department of Microbiology and Immunology, University of California, San Francisco, California, United States of America
- Hervé Hogues
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Michael Lorenz
- Department of Microbiology and Molecular Genetics, University of Texas-Houston Medical School, Houston, Texas, United States of America
- Anastasia Levitin
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Ursula Oberholzer
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Catherine Bachewich
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Doreen Harcus
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Anne Marcil
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Daniel Dignard
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Tatiana Iouk
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Rosa Zito
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Lionel Frangeul
- Plate-Forme Intégration et Analyse Génomique, Institut Pasteur, Paris, France
- Fredj Tekaia
- Unité de Génétique Moléculaire des Levures, Institut Pasteur, Paris, France
- Edwin Wang
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- Carol A Munro
- School of Medical Sciences, University of Aberdeen, Institute of Medical Sciences, Foresterhill, Aberdeen, United Kingdom
- Steve Bates
- School of Medical Sciences, University of Aberdeen, Institute of Medical Sciences, Foresterhill, Aberdeen, United Kingdom
- Neil A Gow
- School of Medical Sciences, University of Aberdeen, Institute of Medical Sciences, Foresterhill, Aberdeen, United Kingdom
- Lois L Hoyer
- Department of Veterinary Pathobiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Gerwald Köhler
- Department of Stomatology, University of California, San Francisco, California, United States of America
- Joachim Morschhäuser
- Institut für Molekulare Infektionsbiologie, Universität Wurzburg, Wurzburg, Germany
- George Newport
- Department of Stomatology, University of California, San Francisco, California, United States of America
- Sadri Znaidi
- Institut de Recherches Cliniques de Montreal, Montreal, Quebec, Canada
- Martine Raymond
- Institut de Recherches Cliniques de Montreal, Montreal, Quebec, Canada
- Bernard Turcotte
- Department of Medicine, Royal Victoria Hospital, McGill University, Montreal, Quebec, Canada
- Gavin Sherlock
- Department of Genetics, Stanford University School of Medicine, Palo Alto, California, United States of America
- Maria Costanzo
- Department of Genetics, Stanford University School of Medicine, Palo Alto, California, United States of America
- Jan Ihmels
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Judith Berman
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota, United States of America
- Dominique Sanglard
- Institute of Microbiology, University Hospital Lausanne, Lausanne, Switzerland
- Nina Agabian
- Department of Stomatology, University of California, San Francisco, California, United States of America
- Aaron P Mitchell
- Department of Microbiology and Institute of Cancer Research, Columbia University, New York, New York, United States of America
- Alexander D Johnson
- Department of Microbiology and Immunology, University of California, San Francisco, California, United States of America
- Malcolm Whiteway
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
- André Nantel
- Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, Canada
|
25
|
Jackson SM, Whitworth AJ, Greene JC, Libby RT, Baccam SL, Pallanck LJ, La Spada AR. A SCA7 CAG/CTG repeat expansion is stable in Drosophila melanogaster despite modulation of genomic context and gene dosage. Gene 2005; 347:35-41. [PMID: 15715978 DOI: 10.1016/j.gene.2004.12.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 09/13/2004] [Revised: 11/13/2004] [Accepted: 12/06/2004] [Indexed: 10/25/2022]
Abstract
CAG and CTG repeat expansions are the cause of at least a dozen inherited neurological disorders. In these so-called "dynamic mutation" diseases, the expanded repeats display dramatic genetic instability, changing in size when transmitted through the germline and within somatic tissues. As the molecular basis of the repeat instability process remains poorly understood, modeling of repeat instability in model organisms has provided some insights into potentially involved factors, implicating especially replication and repair pathways. Studies in mice have also shown that the genomic context of the repeat sequence is required for CAG/CTG repeat instability in the case of spinocerebellar ataxia type 7 (SCA7), one of the most unstable of all CAG/CTG repeat disease loci. While most studies of repeat instability have taken a candidate gene approach, unbiased screens for factors involved in trinucleotide repeat instability have been lacking. We therefore attempted to use Drosophila melanogaster to model expanded CAG repeat instability by creating transgenic flies carrying trinucleotide repeat expansions, deriving flies with SCA7 CAG90 repeats in cDNA and genomic context. We found that SCA7 CAG90 repeats are stable in Drosophila, regardless of context. To screen for genes whose reduced function might destabilize expanded CAG repeat tracts in Drosophila, we crossed the SCA7 CAG90 repeat flies with various deficiency stocks, including lines lacking genes encoding the orthologues of flap endonuclease-1, PCNA, and MutS. In all cases, perfect repeat stability was preserved, suggesting that Drosophila may not be a suitable system for determining the molecular basis of SCA7 CAG repeat instability.
Affiliation(s)
- Stephen M Jackson
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
26
Nichol Edamura K, Leonard MR, Pearson CE. Role of replication and CpG methylation in fragile X syndrome CGG deletions in primate cells. Am J Hum Genet 2005; 76:302-11. [PMID: 15625623 PMCID: PMC1196375 DOI: 10.1086/427928] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Received: 10/19/2004] [Accepted: 12/08/2004] [Indexed: 01/22/2023] Open
Abstract
Instability of the fragile X CGG repeat involves both maternally derived expansions and deletions in the gametes of full-mutation males. It has also been suggested that the absence of aberrant CpG methylation may enhance repeat deletions through an unknown process. The effects of CGG tract length, DNA replication direction, location of replication initiation, and CpG methylation on CGG stability were investigated using an SV40 primate replication system. Replication-dependent deletions with 53 CGG repeats were observed when replication was initiated proximal to the repeat, with CGG as the lagging-strand template. When we initiated replication further from the repeat, while maintaining CGG as the lagging-strand template or using CCG as the lagging-strand template, significant instability was not observed. CpG methylation of the unstable template stabilized the repeat, decreasing both the frequency and the magnitude of deletion events. Furthermore, CpG methylation reduced the efficiency of replication for all templates. Interestingly, replication forks displayed no evidence of a block at the CGG repeat tract, regardless of replication direction or CpG methylation status. Templates with 20 CGG repeats were stable under all circumstances. These results reveal that CGG deletions occur during replication and are sensitive to replication-fork dynamics, tract length, and CpG methylation.
Affiliation(s)
- Kerrie Nichol Edamura
- Program of Genetics and Genomic Biology, The Hospital for Sick Children, and Program of Molecular and Medical Genetics, University of Toronto, Toronto
- Michelle R. Leonard
- Program of Genetics and Genomic Biology, The Hospital for Sick Children, and Program of Molecular and Medical Genetics, University of Toronto, Toronto
- Christopher E. Pearson
- Program of Genetics and Genomic Biology, The Hospital for Sick Children, and Program of Molecular and Medical Genetics, University of Toronto, Toronto
27
Carrasco N, Buzin Y, Tyson E, Halpert E, Huang Z. Selenium derivatization and crystallization of DNA and RNA oligonucleotides for X-ray crystallography using multiple anomalous dispersion. Nucleic Acids Res 2004; 32:1638-46. [PMID: 15007109 PMCID: PMC390325 DOI: 10.1093/nar/gkh325] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022] Open
Abstract
We report here the solid-phase synthesis of RNA and DNA oligonucleotides containing the 2'-selenium functionality for X-ray crystallography using multiwavelength anomalous dispersion. We have synthesized the novel 2'-methylseleno cytidine phosphoramidite and improved the accessibility of the 2'-methylseleno uridine phosphoramidite for the large-scale synthesis of many selenium-derivatized DNAs and RNAs. The yields of coupling these Se-nucleoside phosphoramidites into DNA or RNA oligonucleotides were over 99% when 5-(benzylmercapto)-1H-tetrazole was used as the coupling reagent. The UV melting study of A-form dsDNAs indicated that the 2'-selenium derivatization had no effect on the stability of the duplexes with the 3'-endo sugar pucker. Thus, the stems of functional RNA molecules with the same 3'-endo sugar pucker appear to be the ideal sites for the selenium derivatization with 2'-Se-C and 2'-Se-U. Crystallization of the selenium-derivatized oligonucleotides is also reported here. The results demonstrate that this 2'-selenium functionality is suitable for RNA and A-form DNA derivatization in X-ray crystallography.
Affiliation(s)
- Nicolas Carrasco
- Department of Chemistry, Brooklyn College, and Program of Biochemistry and Chemistry, The Graduate School, The City University of New York, 2900 Bedford Avenue, Brooklyn, NY 11210, USA