1
Fabbri MC, Tiezzi F, Crovetti A, Maltecca C, Bozzi R. Investigation of cosmopolitan and local Italian beef cattle breeds uncover common patterns of heterozygosity. Animal 2024; 18:101142. [PMID: 38636149 DOI: 10.1016/j.animal.2024.101142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 03/15/2024] [Accepted: 03/18/2024] [Indexed: 04/20/2024]
Abstract
The analysis of livestock heterozygosity is less common than the study of homozygous patterns. Heterozygous-Rich Regions (HRRs) may harbor significant loci for functional traits such as immune response, survival rate, and fertility. For this reason, this study was conducted to investigate and characterize the heterozygosity patterns of four beef cattle breeds, which included two cosmopolitan breeds (Limousine and Charolaise) and two local breeds (Sarda and Sardo Bruna). Our analysis identified regions with a high degree of heterozygosity using a consecutive runs approach, Tajima's D test, nucleotide diversity estimation, and the Hardy-Weinberg equilibrium test. These regions exhibited recurrent heterozygosity peaks and were consistently found on specific chromosomes across all breeds, specifically autosomes 15, 16, 20, and 23. The cosmopolitan and Sardo Bruna breeds also displayed peaks on autosomes 2 and 21, respectively. Thirty-five top runs shared by more than 25% of the populations were identified. These genomic fragments encompassed 18 genes, two of which are directly linked to male fertility, while four are associated with lactation. Two other genes play roles in survival and immune response. Our study also detected a region related to growth and carcass traits in the Limousine breed. Our analysis of heterozygosity-rich regions revealed particular segments of the cattle genome linked to various functional traits. It appears that balancing selection is occurring in specific regions within the four examined breeds, and unexpectedly, these regions are common across cosmopolitan and local breeds. The genes identified hold potential for applications in breeding programs and conservation studies to investigate the phenotypes associated with these heterozygous genotypes. In addition, Tajima's D test, nucleotide diversity, and the Hardy-Weinberg equilibrium test confirmed the presence of the heterozygous fragments found with the Heterozygous-Rich Regions analysis.
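The Hardy-Weinberg check named in this abstract can be illustrated with a minimal sketch. It assumes a single biallelic SNP described by three genotype counts; the function name `heterozygosity_excess` and the example counts are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
def heterozygosity_excess(n_aa: int, n_ab: int, n_bb: int) -> dict:
    """Compare observed and Hardy-Weinberg expected heterozygosity at one SNP.

    Assumption: a biallelic locus with genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Allele frequency of A from genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    h_obs = n_ab / n       # observed heterozygosity
    h_exp = 2.0 * p * q    # expected heterozygosity under HWE

    # Chi-square goodness-of-fit statistic over the three genotype classes.
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2.0 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    return {"p": p, "h_obs": h_obs, "h_exp": h_exp, "chi2": chi2}


# Hypothetical counts with a marked excess of heterozygotes.
result = heterozygosity_excess(n_aa=10, n_ab=80, n_bb=10)
print(result)
```

An observed heterozygosity well above the expected value, together with a large chi-square statistic, is the kind of signal that would flag a candidate heterozygosity-rich site in this simplified setting.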
Affiliation(s)
- M C Fabbri
- Dipartimento di Scienze e Tecnologie Agrarie, Alimentari, Ambientali e Forestali, Università di Firenze, Firenze, Italy
- F Tiezzi
- Dipartimento di Scienze e Tecnologie Agrarie, Alimentari, Ambientali e Forestali, Università di Firenze, Firenze, Italy
- A Crovetti
- Dipartimento di Scienze e Tecnologie Agrarie, Alimentari, Ambientali e Forestali, Università di Firenze, Firenze, Italy
- C Maltecca
- Dipartimento di Scienze e Tecnologie Agrarie, Alimentari, Ambientali e Forestali, Università di Firenze, Firenze, Italy; Department of Animal Science, North Carolina State University, Raleigh, NC 27695, United States
- R Bozzi
- Dipartimento di Scienze e Tecnologie Agrarie, Alimentari, Ambientali e Forestali, Università di Firenze, Firenze, Italy
2
Villegas LI, Ferretti L, Wiehe T, Waldvogel A, Schiffer PH. Parthenogenomics: Insights on mutation rates and nucleotide diversity in parthenogenetic Panagrolaimus nematodes. Ecol Evol 2024; 14:e10831. [PMID: 38192904 PMCID: PMC10771965 DOI: 10.1002/ece3.10831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 12/11/2023] [Accepted: 12/13/2023] [Indexed: 01/10/2024]
Abstract
Asexual reproduction is assumed to lead to the accumulation of deleterious mutations and to reduced heterozygosity due to the absence of recombination. Panagrolaimid nematode species display different modes of reproduction: sexual reproduction with distinct males and females, asexual reproduction through parthenogenesis in the genus Panagrolaimus, and hermaphroditism in Propanagrolaimus. Here, we compared genomic features of free-living nematodes in populations and species isolated from geographically distant regions to study diversity and genome-wide differentiation under different modes of reproduction. First, we estimated genome-wide spontaneous mutation rates in a triploid parthenogenetic Panagrolaimus and a diploid hermaphroditic Propanagrolaimus via long-term mutation accumulation lines. Second, we calculated population genetic parameters, including nucleotide diversity and the fixation index (FST), between populations of asexually and sexually reproducing nematodes. Third, we used phylogenetic network methods on sexually and asexually reproducing Panagrolaimus populations to understand the evolutionary relationships between them. The estimated mutation rate was slightly lower for the asexual population, as expected for taxa with this reproductive mode. Natural polyploid asexual populations revealed higher nucleotide diversity. Despite their common ancestor, a gene network revealed a high level of genetic differentiation among asexual populations. The elevated heterozygosity found in the triploid parthenogens could be explained by the third genome copy. Given their tendency toward lower mutation rates, it can be hypothesized that this is part of the mechanism to evade Muller's ratchet. Our findings in parthenogenetic triploid nematode populations seem to challenge common expectations of evolution under asexuality.
Affiliation(s)
- Thomas Wiehe
- Institute for Genetics, University of Cologne, Köln, Germany
3
Yu H, Ma L, Zhao Y, Naren G, Wu H, Sun Y, Wu L, Zhang L. Characterization of nuclear DNA diversity in an individual Leymus chinensis. Frontiers in Plant Science 2023; 14:1157145. [PMID: 37346123 PMCID: PMC10280068 DOI: 10.3389/fpls.2023.1157145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 04/13/2023] [Indexed: 06/23/2023]
Abstract
Intraorganismal genetic heterogeneity (IGH) exists when an individual organism harbors more than one genotype among its cells. In general, intercellular DNA diversity occurs at a very low frequency and cannot be directly detected by DNA sequencing from bulk tissue. In this study, based on Sanger and high-throughput sequencing, different species, different organs, different DNA segments and a single cell were employed to characterize nucleotide mutations in Leymus chinensis. The results demonstrated that 1) the nuclear DNA showed excessive genetic heterogeneity among cells of an individual leaf or seed but the chloroplast genes remained consistent; 2) a high density of SNPs was found in the variants of the unique DNA sequence, and the similar SNP profile shared between the leaf and seed suggested that nucleotide mutation followed a certain rule and was not random; and 3) the mutation rate decreased from the genomic DNA sequence to the corresponding protein sequence. Our results suggested that Leymus chinensis seemed to consist of a collection of cells with different genetic backgrounds.
4
Konopiński MK. Average weighted nucleotide diversity is more precise than pixy in estimating the true value of π from sequence sets containing missing data. Mol Ecol Resour 2023; 23:348-354. [PMID: 36031871 DOI: 10.1111/1755-0998.13707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Revised: 06/03/2022] [Accepted: 07/27/2022] [Indexed: 01/04/2023]
Abstract
Nucleotide diversity remains an important statistic in population genetic/genomic studies. Although recent advances in massive sequencing make generating sequence data sets cheaper and faster, currently used technologies often introduce substantial amounts of missing nucleotides in their output. A novel method of estimating π from data sets containing missing data, pixy, has also recently been proposed. In this study, the pixy estimator, πpixy, was compared to average weighted nucleotide diversity, πW. The estimators were tested both on sequences simulated in fastsimcoal and on real sequence sets. Both sets were modified by random insertion of missing nucleotides. Weighted nucleotide diversity performed better in all pairwise comparisons. It was characterized by a smaller error and a narrower distribution of the results. πpixy tends to overestimate the nucleotide diversity when both the proportion of missing data and the level of variation are low. Of the two estimators, only πW estimated the true nucleotide diversity in a part of the simulations. A simple formula for estimating πW allows for easy integration of the estimator in packages such as pixy, which would allow obtaining more precise estimates of nucleotide diversity either in a sliding window or for discrete genomic regions.
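To make the missing-data problem concrete, the following is a minimal sketch of a pixy-style estimate: differences and valid comparisons are counted per site, skipping missing calls, and the two totals are summed across sites before dividing. The abstract does not give the formula for πW, so that estimator is not reproduced here. The function name `pi_pixy_style`, the `"N"` missing-data symbol, and the toy alignment are assumptions made for illustration.

```python
from itertools import combinations

MISSING = "N"  # assumed symbol for a missing nucleotide


def pi_pixy_style(sequences: list[str]) -> float:
    """Per-site nucleotide diversity that skips comparisons involving missing data.

    Numerator: pairwise differences summed over all sites.
    Denominator: pairwise comparisons between non-missing calls, summed over all sites.
    """
    if not sequences:
        raise ValueError("no sequences supplied")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    total_diffs = 0
    total_comparisons = 0
    for site in range(length):
        # Keep only the observed (non-missing) calls at this site.
        calls = [s[site] for s in sequences if s[site] != MISSING]
        for a, b in combinations(calls, 2):
            total_comparisons += 1
            if a != b:
                total_diffs += 1

    if total_comparisons == 0:
        return float("nan")
    return total_diffs / total_comparisons


# Toy alignment: three haploid sequences, one missing call.
print(pi_pixy_style(["ACGT", "ACGA", "ANGT"]))
```

In the toy alignment there are 10 valid comparisons and 2 differences, so the estimate is 0.2. Dividing by the full alignment length times all pairs instead would treat the missing call as a match and bias the estimate downward.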
5
Wang H, Wang Q, Tan X, Wang J, Zhang J, Zheng M, Zhao G, Wen J. Estimation of genetic variability and identification of regions under selection based on runs of homozygosity in Beijing-You Chickens. Poult Sci 2022; 102:102342. [PMID: 36470032 PMCID: PMC9719870 DOI: 10.1016/j.psj.2022.102342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Revised: 11/08/2022] [Accepted: 11/09/2022] [Indexed: 11/16/2022]
Abstract
The genetic composition of populations is the result of a long-term process of selection and adaptation to specific environments and ecosystems. Runs of homozygosity (ROHs) are homozygous segments of the genome where the 2 haplotypes inherited from the parents are identical. The detection of ROHs can be used to describe the genetic variability and quantify the level of inbreeding in an individual. Here, we investigated the occurrence and distribution of ROHs in 40 Beijing-You Chickens from the random breeding population (BJY_C) and 40 Beijing-You Chickens from the intramuscular fat (IMF) selection population (BJY_S). Principal component analysis (PCA) and maximum likelihood (ML) analyses showed that BJY_C was completely separated from BJY_S. The nucleotide diversity of BJY_C was higher than that of BJY_S, and the LD of BJY_C decayed faster. A total of 7,101 ROHs were identified in BJY_C and 9,273 in BJY_S. The ROH-based inbreeding estimate (FROH) of BJY_C was 0.079, which was significantly lower than that of BJY_S (FROH = 0.114). These results agreed with the estimates of the inbreeding coefficients calculated based on homozygosity (FHOM), the correlation between uniting gametes (FUNI), and the genomic relationship matrix (FGRM). Additionally, the distribution and number of ROH islands in chromosomes of BJY_C and BJY_S were significantly different. The ROH islands of BJY_S that included genes associated with lipid metabolism and fat deposition, such as CIDEA and S1PR1, were absent in BJY_C. However, GPR161, a candidate gene for the formation of the unique five-finger trait in Beijing-You chickens, was detected in both populations. Our findings contributed to the understanding of the genetic diversity of random or artificially selected populations, and allowed the accurate monitoring of population inbreeding using genomic information, as well as the detection of genomic regions that affect traits under selection.
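The ROH-based inbreeding coefficient reported above is commonly defined as the fraction of the autosomal genome covered by ROH segments. The sketch below assumes that standard definition; the abstract does not state which software or thresholds the authors used. The function name `f_roh`, the half-open coordinate convention, and the example segments are hypothetical.

```python
def f_roh(roh_segments: list[tuple[int, int]], autosome_length_bp: int) -> float:
    """F_ROH = total length of ROH segments / length of the autosomal genome covered.

    Assumptions: segments are (start, end) in base pairs, half-open,
    non-overlapping, and all belong to a single individual.
    """
    if autosome_length_bp <= 0:
        raise ValueError("autosome length must be positive")
    total_roh_bp = 0
    for start, end in roh_segments:
        if end < start:
            raise ValueError(f"segment end precedes start: {(start, end)}")
        total_roh_bp += end - start
    return total_roh_bp / autosome_length_bp


# Hypothetical individual: 3 Mb of ROH on a 100 Mb autosomal genome.
segments = [(0, 1_000_000), (5_000_000, 7_000_000)]
print(f_roh(segments, autosome_length_bp=100_000_000))  # 0.03
```

Population-level values such as the 0.079 and 0.114 quoted in the abstract would, under this definition, be averages of the per-individual coefficient.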
Affiliation(s)
- Hailong Wang
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Qiao Wang
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Xiaodong Tan
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Jie Wang
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Jin Zhang
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Maiqing Zheng
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Guiping Zhao
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
- Jie Wen
- Chinese Academy of Agricultural Science, State Key Laboratory of Animal Nutrition, Beijing 100193, China
6
Lu F, Sossin A, Abell N, Montgomery SB, He Z. Deep learning-assisted genome-wide characterization of massively parallel reporter assays. Nucleic Acids Res 2022; 50:11442-11454. [PMID: 36350674 PMCID: PMC9723615 DOI: 10.1093/nar/gkac990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Revised: 10/04/2022] [Accepted: 10/19/2022] [Indexed: 11/10/2022]
Abstract
The massively parallel reporter assay (MPRA) is a high-throughput method that enables the study of the regulatory activities of tens of thousands of DNA oligonucleotides in a single experiment. While MPRA experiments have grown in popularity, their small sample sizes compared to the scale of the human genome limit our understanding of the regulatory effects they detect. To address this, we develop a deep learning model, MpraNet, to distinguish potential MPRA targets from the background genome. This model achieves high discriminative performance (AUROC = 0.85) at differentiating MPRA positives from a set of control variants that mimic the background genome when applied to the lymphoblastoid cell line. We observe that existing functional scores represent very distinct functional effects, and most of them fail to characterize the regulatory effect that MPRA detects. Using MpraNet, we predict potential MPRA functional variants across the genome and identify the distributions of MPRA effect relative to other characteristics of genetic variation, including allele frequency, alternative functional annotations specified by FAVOR, and phenome-wide associations. We also observed that the predicted MPRA positives are not uniformly distributed across the genome; instead, they are clumped together in active regions comprising 9.95% of the genome and inactive regions comprising 89.07% of the genome. Furthermore, we propose our model as a screen to filter MPRA experiment candidates at genome-wide scale, enabling future experiments to be more cost-efficient by increasing precision relative to that observed from previous MPRAs.
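The AUROC figure quoted above can be read as the probability that a randomly chosen positive receives a higher score than a randomly chosen control. The sketch below computes that quantity directly from its pairwise definition. It is not the MpraNet model or its evaluation code, and the function name `auroc` and the example scores are hypothetical.

```python
def auroc(positive_scores: list[float], negative_scores: list[float]) -> float:
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition.

    Counts the fraction of positive/negative pairs in which the positive
    scores higher, with ties counted as half a win.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for 3 positives and 3 controls.
print(auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))  # 8 of 9 pairs ranked correctly
```

The quadratic loop is chosen for readability; for large variant sets a rank-based computation gives the same value in O(n log n).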
Affiliation(s)
- Nathan Abell
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Stephen B Montgomery
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Zihuai He
- To whom correspondence should be addressed. Tel: +1 718 869 4929;
7
Rahmani RS, Decap D, Fostier J, Marchal K. BLSSpeller to discover novel regulatory motifs in maize. DNA Res 2022; 29:6651838. [PMID: 35904558 PMCID: PMC9358016 DOI: 10.1093/dnares/dsac029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Indexed: 11/13/2022]
Abstract
With the decreasing cost of sequencing and availability of larger numbers of sequenced genomes, comparative genomics is becoming increasingly attractive to complement experimental techniques for the task of transcription factor (TF) binding site identification. In this study, we redesigned BLSSpeller, a motif discovery algorithm, to cope with larger sequence datasets. BLSSpeller was used to identify novel motifs in Zea mays in a comparative genomics setting with 16 monocot lineages. We discovered 61 motifs of which 20 matched previously described motif models in Arabidopsis. In addition, novel, yet uncharacterized motifs were detected, several of which are supported by available sequence-based and/or functional data. Instances of the predicted motifs were enriched around transcription start sites and contained signatures of selection. Moreover, the enrichment of the predicted motif instances in open chromatin and TF binding sites indicates their functionality, supported by the fact that genes carrying instances of these motifs were often found to be co-expressed and/or enriched in similar GO functions. Overall, our study unveiled several novel candidate motifs that might help our understanding of the genotype to phenotype association in crops.
Affiliation(s)
- Razgar Seyed Rahmani
- Department of Plant Biotechnology and Bioinformatics, Ghent University , Gent, Belgium
- Department of Information Technology, IDLab, Ghent University—imec , Gent, Belgium
- Dries Decap
- Department of Information Technology, IDLab, Ghent University—imec , Gent, Belgium
- Jan Fostier
- Department of Information Technology, IDLab, Ghent University—imec , Gent, Belgium
- Kathleen Marchal
- Department of Plant Biotechnology and Bioinformatics, Ghent University , Gent, Belgium
- Department of Information Technology, IDLab, Ghent University—imec , Gent, Belgium
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria , Pretoria, South Africa
8
Genome-Wide Prediction of Transcription Start Sites in Conifers. Int J Mol Sci 2022; 23:ijms23031735. [PMID: 35163661 PMCID: PMC8836283 DOI: 10.3390/ijms23031735] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/30/2021] [Revised: 01/30/2022] [Accepted: 02/01/2022] [Indexed: 02/04/2023]
Abstract
The identification of promoters is an essential step in the genome annotation process, providing a framework for gene regulatory networks and their role in transcription regulation. Despite considerable advances in the high-throughput determination of transcription start sites (TSSs) and transcription factor binding sites (TFBSs), experimental methods are still time-consuming and expensive. Instead, several computational approaches have been developed to provide fast and reliable means for predicting the location of TSSs and regulatory motifs on a genome-wide scale. Numerous studies have been carried out on the regulatory elements of mammalian genomes, but plant promoters, especially in gymnosperms, have been left out of the limelight and, therefore, have been poorly investigated. The aim of this study was to enhance and expand the existing genome annotations using computational approaches for genome-wide prediction of TSSs in the four conifer species: loblolly pine, white spruce, Norway spruce, and Siberian larch. Our pipeline will be useful for TSS predictions in other genomes, especially for draft assemblies, where reliable TSS predictions are not usually available. We also explored some of the features of the nucleotide composition of the predicted promoters and compared the GC properties of conifer genes with model monocot and dicot plants. Here, we demonstrate that even incomplete genome assemblies and partial annotations can be a reliable starting point for TSS annotation. The results of the TSS prediction in four conifer species have been deposited in the Persephone genome browser, which allows smooth visualization and is optimized for large data sets. This work provides the initial basis for future experimental validation and the study of the regulatory regions to understand gene regulation in gymnosperms.
9
Maung TZ, Chu SH, Park YJ. Functional Haplotypes and Evolutionary Insight into the Granule-Bound Starch Synthase II (GBSSII) Gene in Korean Rice Accessions (KRICE_CORE). Foods 2021; 10:2359. [PMID: 34681408 PMCID: PMC8535093 DOI: 10.3390/foods10102359] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/19/2021] [Revised: 09/28/2021] [Accepted: 09/30/2021] [Indexed: 12/30/2022]
Abstract
Granule-bound starch synthase 2 (GBSSII), a paralogous isoform of GBSSI, carries out amylose biosynthesis in rice. Unlike GBSSI, it mainly functions in transient organs, such as leaves. Despite many reports on the starch gene family, little is known about the genetics and genomics of GBSSII. Haplotype analysis was conducted to unveil genetic variations (SNPs and InDels) of GBSSII (OS07G0412100) and to gain evolutionary insight through genetic diversity, population genetic structure, and phylogenetic analyses using the KRICE_CORE set (475 rice accessions). Thirty nonsynonymous SNPs (nsSNPs) were detected across the diverse GBSSII coding regions, representing 38 haplotypes, including 13 cultivated, 21 wild, and 4 mixed (a combination of cultivated and wild) varieties. The cultivated haplotypes (C_1-C_13) contained more nsSNPs across the GBSSII genomic region than the wild varieties. Nucleotide diversity analysis highlighted the higher diversity values of the cultivated varieties (weedy = 0.0102, landrace = 0.0093, and bred = 0.0066) than the wild group (0.0045). The cultivated varieties exhibited no reduction in diversity during domestication. Diversity reduction in the japonica and the wild groups was evidenced by the negative Tajima's D values under purifying selection, suggesting the domestication signatures of GBSSII; however, balancing selection was indicated by positive Tajima's D values in indica. Principal component analysis and population genetics analyses estimated the ambiguous evolutionary relationships among the cultivated and wild rice groups, indicating highly diverse structural features of the rice accessions within the GBSSII genomic region. FST analysis differentiated most of the classified populations in a range of greater FST values. Our findings provide evolutionary insights into GBSSII and, consequently, a molecular breeding program can be implemented to select desired traits using these diverse nonsynonymous (functional) alleles.
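The sign of Tajima's D carries the interpretation used in this abstract: negative values indicate an excess of rare variants, positive values an excess of intermediate-frequency variants. A minimal sketch of the standard statistic (Tajima 1989) follows. It assumes aligned haploid sequences with no missing data and at least four samples; the function name `tajimas_d` and the toy alignments are hypothetical and unrelated to the KRICE_CORE data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences: list[str]) -> float:
    """Tajima's D for aligned sequences without missing data (assumption)."""
    n = len(sequences)
    if n < 4:
        # For n < 4 the variance constants vanish and D is undefined.
        raise ValueError("at least 4 sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating sites.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if seg_sites == 0:
        return float("nan")

    # pi: mean number of pairwise differences (summed over sites, not per site).
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(sequences, 2)
    ) / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Toy alignments: all variants are singletons (D < 0) versus
# two balanced haplotypes at intermediate frequency (D > 0).
print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

The two toy cases mirror the contrast drawn in the abstract between groups with negative D and the indica group with positive D, although real analyses also need to account for demography before attributing the sign to selection.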
Affiliation(s)
- Thant Zin Maung
- Department of Plant Resources, College of Industrial Science, Kongju National University, Yesan 32439, Korea
- Sang-Ho Chu
- Center of Crop Breeding on Omics and Artificial Intelligence, Kongju National University, Yesan 32439, Korea
- Yong-Jin Park
- Department of Plant Resources, College of Industrial Science, Kongju National University, Yesan 32439, Korea
- Center of Crop Breeding on Omics and Artificial Intelligence, Kongju National University, Yesan 32439, Korea
10
Yengo L, Yang J, Keller MC, Goddard ME, Wray NR, Visscher PM. Genomic partitioning of inbreeding depression in humans. Am J Hum Genet 2021; 108:1488-1501. [PMID: 34214457 DOI: 10.1016/j.ajhg.2021.06.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/21/2020] [Accepted: 06/01/2021] [Indexed: 02/05/2023]
Abstract
Across species, offspring of related individuals often exhibit significant reduction in fitness-related traits, known as inbreeding depression (ID), yet the genetic and molecular basis for ID remains elusive. Here, we develop a method to quantify enrichment of ID within specific genomic annotations and apply it to human data. We analyzed the phenomes and genomes of ∼350,000 unrelated participants of the UK Biobank and found, on average over 11 traits, significant enrichment of ID within genomic regions with high recombination rates (>21-fold; p < 10^-5), with conserved function across species (>19-fold; p < 10^-4), and within regulatory elements such as DNase I hypersensitive sites (∼5-fold; p = 8.9 × 10^-7). We also quantified enrichment of ID within trait-associated regions and found suggestive evidence that genomic regions contributing to additive genetic variance in the population are enriched for ID signal. We find strong correlations between functional enrichment of SNP-based heritability and that of ID (r = 0.8, standard error: 0.1). These findings provide empirical evidence that ID is most likely due to many partially recessive deleterious alleles in low linkage disequilibrium regions of the genome. Our study suggests that functional characterization of ID may further elucidate the genetic architectures and biological mechanisms underlying complex traits and diseases.
11
Zhan S, Griswold C, Lukens L. Zea mays RNA-seq estimated transcript abundances are strongly affected by read mapping bias. BMC Genomics 2021; 22:285. [PMID: 33874908 PMCID: PMC8056621 DOI: 10.1186/s12864-021-07577-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/10/2020] [Accepted: 03/30/2021] [Indexed: 11/27/2022]
Abstract
Background: Genetic variation for gene expression is a source of phenotypic variation for natural and agricultural species. The common approach to map and to quantify gene expression from genetically distinct individuals is to assign their RNA-seq reads to a single reference genome. However, RNA-seq reads from alleles dissimilar to this reference genome may fail to map correctly, causing transcript levels to be underestimated. Presently, the extent of this mapping problem is not clear, particularly in highly diverse species. We investigated whether mapping bias occurred and whether chromosomal features were associated with mapping bias. Zea mays presents a model species to assess these questions, given it has genotypically distinct and well-studied genetic lines.
Results: In Zea mays, the inbred B73 genome is the standard reference genome and template for RNA-seq read assignments. In the absence of mapping bias, B73 and a second inbred line, Mo17, would each have an approximately equal number of regulatory alleles that increase gene expression. Remarkably, Mo17 had 2–4 times fewer such positively acting alleles than did B73 when RNA-seq reads were aligned to the B73 reference genome. Reciprocally, over one-half of the B73 alleles that increased gene expression were not detected when reads were aligned to the Mo17 genome template. Genes at dissimilar chromosomal ends were strongly affected by mapping bias, and genes at more similar pericentromeric regions were less affected. Biased transcript estimates were higher in untranslated regions and lower in splice junctions. Bias occurred across software and alignment parameters.
Conclusions: Mapping bias very strongly affects gene transcript abundance estimates in maize, and bias varies across chromosomal features. Individual genome or transcriptome templates are likely necessary for accurate transcript estimation across genetically variable individuals in maize and other species.
Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07577-3.
Affiliation(s)
- Shuhua Zhan
- Department of Plant Agriculture, University of Guelph, Guelph, Ontario, Canada
- Cortland Griswold
- Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
- Lewis Lukens
- Department of Plant Agriculture, University of Guelph, Guelph, Ontario, Canada
12
Allemailem KS, Almatroudi A, Alrumaihi F, Makki Almansour N, Aldakheel FM, Rather RA, Afroze D, Rah B. Single nucleotide polymorphisms (SNPs) in prostate cancer: its implications in diagnostics and therapeutics. Am J Transl Res 2021; 13:3868-3889. [PMID: 34017579 PMCID: PMC8129253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2020] [Accepted: 03/09/2021] [Indexed: 06/12/2023]
Abstract
Prostate cancer is one of the most frequently diagnosed malignancies in developed countries and approximately 248,530 new cases of prostate cancer are likely to be diagnosed in the United States in 2021. During the late 1990s and 2000s, the prostate cancer-related death rate has decreased by 4% per year on average because of advancements in prostate-specific antigen (PSA) testing. However, the non-specificity of PSA to distinguish between benign and malignant forms of cancer is a major concern in the management of prostate cancer. Despite other risk factors in the pathogenesis of prostate cancer, recent advancement in molecular genetics suggests that genetic heredity plays a crucial role in prostate carcinogenesis. Approximately, 60% of heritability and more than 100 well-recognized single-nucleotide-polymorphisms (SNPs) have been found to be associated with prostate cancer and constitute a major risk factor in the development of prostate cancer. Recent findings revealed that a low to moderate effect on the progression of prostate cancer of individual SNPs was observed compared to a strong progressive effect when SNPs were in combination. Here, in this review, we made an attempt to critically analyze the role of SNPs and associated genes in the development of prostate cancer and their implications in diagnostics and therapeutics. A better understanding of the role of SNPs in prostate cancer susceptibility may improve risk prediction, enhance fine-mapping, and furnish new insights into the underlying pathophysiology of prostate cancer.
Affiliation(s)
- Khaled S Allemailem
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
- Ahmad Almatroudi
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
- Faris Alrumaihi
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
- Nahlah Makki Almansour
- Department of Biology, College of Science, University of Hafr Al Batin, Hafr Al Batin, Saudi Arabia
- Fahad M Aldakheel
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud University, Riyadh, Saudi Arabia
- Prince Sattam Chair for Epidemiology and Public Health Research, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Rafiq Ahmad Rather
- Advanced Centre for Human Genetics, Sher-i-Kashmir Institute of Medical Science, Srinagar, Jammu and Kashmir, India
- Dil Afroze
- Advanced Centre for Human Genetics, Sher-i-Kashmir Institute of Medical Science, Srinagar, Jammu and Kashmir, India
- Bilal Rah
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
13
Lin YL, Wu DH, Wu CC, Huang YF. Explore the genetics of weedy traits using rice 3K database. Botanical Studies 2021; 62:2. [PMID: 33432466 PMCID: PMC7801593 DOI: 10.1186/s40529-020-00309-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/12/2020] [Accepted: 12/29/2020] [Indexed: 06/12/2023]
Abstract
BACKGROUND Weedy rice, a conspecific weedy counterpart of cultivated rice (Oryza sativa L.), has been problematic in rice-production areas worldwide. Although the origin of some weedy traits is beginning to be understood for some rice-growing regions, an overall assessment of weedy trait-related loci is not yet available. Meanwhile, advances in sequencing technologies, together with community efforts, have made a large amount of genomic data publicly available. Given the availability of public data and the need for "weedy" allele mining for better management of weedy rice, the objective of the present study was to explore the genetic architecture of weedy traits based on publicly available data, mainly from the 3000 Rice Genome Project (3K-RGP). RESULTS Based on the results of population structure analysis, we selected 1378 individuals from four sub-populations (aus, indica, temperate japonica, tropical japonica) without admixed genomic composition for genome-wide association analysis (GWAS). Five traits were investigated: awn color, seed shattering, seed threshability, seed coat color, and seedling height. GWAS was conducted for each sub-population × trait combination and identified 66 population-specific trait-associated SNPs. Eleven significant SNPs fell within an annotated gene and four other SNPs were close to a putative candidate gene (± 25 kb). SNPs located in or close to Rc were particularly predictive of seed coat color, and our results showed that different sub-populations required different SNPs for better seed coat color prediction. We compared the 3K-RGP data to a publicly available weedy rice dataset. The allele frequency profiles, the phenotype-genotype segregation of target SNPs, and the GWAS results for the presence and absence of awns diverged between the two datasets.
CONCLUSIONS The genotypes of trait-associated SNPs identified in this study, especially those located in or close to Rc, can be developed into diagnostic SNPs to trace the origin of weedy traits occurring in the field. The differing results from the two publicly available datasets used in this study emphasize the importance of laboratory experiments to confirm allele mining results based on publicly available data.
Affiliation(s)
- Yu-Lan Lin
- Department of Agronomy, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd, Da'an Dist., Taipei, 10617, Taiwan
| | - Dong-Hong Wu
- Taiwan Agricultural Research Institute, Council of Agriculture, Executive Yuan, No. 189, Zhongzheng Rd, Wufeng Dist, Taichung City, 41362, Taiwan
| | - Cheng-Chieh Wu
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 11529, Taiwan
- Institute of Plant Science, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd, Da'an Dist., Taipei, 10617, Taiwan
| | - Yung-Fen Huang
- Department of Agronomy, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd, Da'an Dist., Taipei, 10617, Taiwan.
| |
|
14
|
Pachganov S, Murtazalieva K, Zarubin A, Taran T, Chartier D, Tatarinova TV. Prediction of Rice Transcription Start Sites Using TransPrise: A Novel Machine Learning Approach. Methods Mol Biol 2021; 2238:261-274. [PMID: 33471337 DOI: 10.1007/978-1-0716-1068-8_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
As the interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites (TSS). In this paper, we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of TransPrise with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer with a graphics processing unit, the run time of TransPrise is 250 min on a 374 Mb genome. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all the necessary packages, models, and code, as well as the source code of the TransPrise algorithm, are available at http://compubioverne.group/. The source code is ready to use and can be customized to predict TSS in any eukaryotic organism.
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
| | | | - Alexei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
| | | | - Duane Chartier
- International Center for Art Intelligence, Inc, Los Angeles, CA, USA
| | - Tatiana V Tatarinova
- Vavilov Institute of General Genetics, Moscow, Russia.
- Department of Biology, University of La Verne, La Verne, CA, USA.
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.
- Siberian Federal University, Krasnoyarsk, Russia.
| |
|
15
|
Adiba M, Das T, Paul A, Das A, Chakraborty S, Hosen MI, Nabi AN. In silico characterization of coding and non-coding SNPs of the androgen receptor gene. Informatics in Medicine Unlocked 2021; 24:100556. [DOI: 10.1016/j.imu.2021.100556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/07/2023] Open
|
16
|
Cheng YH, Liu CFJ, Yu YH, Jhou YT, Fujishima M, Tsai IJ, Leu JY. Genome plasticity in Paramecium bursaria revealed by population genomics. BMC Biol 2020; 18:180. [PMID: 33250052 PMCID: PMC7702705 DOI: 10.1186/s12915-020-00912-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/22/2019] [Accepted: 10/29/2020] [Indexed: 11/25/2022] Open
Abstract
Background Ciliates are an ancient and diverse eukaryotic group found in various environments. A unique feature of ciliates is their nuclear dimorphism, by which two types of nuclei, the diploid germline micronucleus (MIC) and the polyploid somatic macronucleus (MAC), are present in the same cytoplasm and serve different functions. During each sexual cycle, ciliates develop a new macronucleus in which newly fused genomes are extensively rearranged to generate functional minichromosomes. Interestingly, each ciliate species seems to have its own way of processing genomes, providing a diversity of resources for studying genome plasticity and its regulation. Here, we sequenced and analyzed the macronuclear genome of different strains of Paramecium bursaria, a highly divergent species of the genus Paramecium which can stably establish endosymbioses with green algae. Results We assembled a high-quality macronuclear genome of P. bursaria and further refined genome annotation by comparing population genomic data. We identified several species-specific expansions in protein families and gene lineages that are potentially associated with endosymbiosis. Moreover, we observed an intensive chromosome breakage pattern that occurred during or shortly after sexual reproduction and contributed to highly variable gene dosage throughout the genome. However, patterns of copy number variation were highly correlated among genetically divergent strains, suggesting that copy number is adjusted by some regulatory mechanisms or natural selection. Further analysis showed that genes with low copy number variation among populations tended to function in basic cellular pathways, whereas highly variable genes were enriched in environmental response pathways. Conclusions We report programmed DNA rearrangements in the P. bursaria macronuclear genome that allow cells to adjust gene copy number globally according to individual gene functions. Our results suggest that large-scale gene copy number variation may represent an ancient mechanism for cells to adapt to different environments. Supplementary information The online version contains supplementary material available at 10.1186/s12915-020-00912-2.
Affiliation(s)
- Yu-Hsuan Cheng
- Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, 106, Taiwan; Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
| | - Chien-Fu Jeff Liu
- Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
| | - Yen-Hsin Yu
- Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
| | - Yu-Ting Jhou
- Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
| | - Masahiro Fujishima
- Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi, 753-8512, Japan
| | - Isheng Jason Tsai
- Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, 106, Taiwan; Biodiversity Research Center, Academia Sinica, Taipei, 115, Taiwan
| | - Jun-Yi Leu
- Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, 106, Taiwan; Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
| |
|
17
|
Sarpan N, Taranenko E, Ooi SE, Low ETL, Espinoza A, Tatarinova TV, Ong-Abdullah M. DNA methylation changes in clonally propagated oil palm. Plant Cell Reports 2020; 39:1219-1233. [PMID: 32591850 DOI: 10.1007/s00299-020-02561-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/27/2020] [Accepted: 06/17/2020] [Indexed: 06/11/2023]
Abstract
Several hypomethylated sites within the Karma region of EgDEF1 and hotspot regions in chromosomes 1, 2, 3, and 5 may be associated with mantling. One of the main challenges faced by the oil palm industry is fruit abnormalities, such as the "mantled" phenotype that can lead to reduced yields. This clonal abnormality is an epigenetic phenomenon and has been linked to the hypomethylation of a transposable element within the EgDEF1 gene. To understand the epigenome changes in clones, methylomes of clonal oil palms were compared to methylomes of seedling-derived oil palms. Whole-genome bisulfite sequencing data from seedlings, normal clones, and mantled clones were analyzed to determine and compare the context-specific DNA methylomes. In seedlings, coding and regulatory regions are generally hypomethylated while introns and repeats are extensively methylated. Genes with a low number of guanines and cytosines in the third position of codons (GC3-poor genes) were increasingly methylated towards their 3' region, while GC3-rich genes remain demethylated, similar to patterns in other eukaryotic species. Predicted promoter regions were generally hypomethylated in seedlings. In clones, CG, CHG, and CHH methylation levels generally decreased in functionally important regions, such as promoters, 5' UTRs, and coding regions. Although random regions were found to be hypomethylated in clonal genomes, hypomethylation of certain hotspot regions may be associated with the clonal mantling phenotype. Our findings, therefore, suggest that other hypomethylated CHG sites within the Karma region of EgDEF1, together with hypomethylated hotspot regions in chromosomes 1, 2, 3, and 5, are associated with mantling.
Affiliation(s)
- Norashikin Sarpan
- Advanced Biotechnology and Breeding Centre, Malaysian Palm Oil Board, 6 Persiaran Institusi, Bandar Baru Bangi, 43000, Kajang, Selangor, Malaysia
| | - Elizaveta Taranenko
- Department of Biology, University of La Verne, La Verne, CA, USA
- Department of Fundamental Biology and Biotechnology, Siberian Federal University, 660074, Krasnoyarsk, Russia
| | - Siew-Eng Ooi
- Advanced Biotechnology and Breeding Centre, Malaysian Palm Oil Board, 6 Persiaran Institusi, Bandar Baru Bangi, 43000, Kajang, Selangor, Malaysia
| | - Eng-Ti Leslie Low
- Advanced Biotechnology and Breeding Centre, Malaysian Palm Oil Board, 6 Persiaran Institusi, Bandar Baru Bangi, 43000, Kajang, Selangor, Malaysia
| | | | - Tatiana V Tatarinova
- Department of Biology, University of La Verne, La Verne, CA, USA.
- Department of Fundamental Biology and Biotechnology, Siberian Federal University, 660074, Krasnoyarsk, Russia.
- Vavilov Institute for General Genetics, Moscow, Russia.
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.
| | - Meilina Ong-Abdullah
- Advanced Biotechnology and Breeding Centre, Malaysian Palm Oil Board, 6 Persiaran Institusi, Bandar Baru Bangi, 43000, Kajang, Selangor, Malaysia.
| |
|
18
|
Dybus A, Yu YH, Proskura W, Lanckriet R, Cheng YH. Association of Sequence Variants in the CKM (Creatine Kinase, M-Type) Gene with Racing Performance of Homing Pigeons. Russ J Genet 2020. [DOI: 10.1134/s1022795420080025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
19
|
Matić S, Tabone G, Garibaldi A, Gullino ML. Alternaria Leaf Spot Caused by Alternaria Species: An Emerging Problem on Ornamental Plants in Italy. Plant Disease 2020; 104:2275-2287. [PMID: 32584157 DOI: 10.1094/pdis-02-20-0399-re] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 06/11/2023]
Abstract
Serious outbreaks of Alternaria leaf spot and plant decay have recently been recorded on several ornamental plants in the Biella Province (Northern Italy). Twenty-two fungal isolates were obtained from Alternaria infected plant tissues from 13 ornamental hosts. All the isolates were identified morphologically as small-spored Alternaria species. Multilocus sequence typing, carried out by means of ITS, rpb2, tef1, endoPG, Alt a 1, and OPA10-2, assigned 19 isolates as Alternaria alternata, two isolates as belonging to the Alternaria arborescens species complex, and one isolate as an unknown Alternaria sp. Haplotype analyses of ornamental and reference A. alternata isolates from 12 countries identified 14 OPA10-2 and 11 endoPG haplotypes showing a relatively high haplotype diversity. A lack of host specialization or geographic distribution was observed. The host range of the studied A. alternata isolates expanded in cross-pathogenicity assays, and more aggressiveness was frequently observed on the experimental plants than on the host plants from which the fungal isolates were originally isolated. High disease severity, population expansion, intraspecies diversity, and increased range of experimental hosts were seen in the emergence of Alternaria disease on ornamentals. More epidemiological and molecular studies should be performed to better understand these diseases, taking into consideration factors such as seed transmission and ongoing climate changes.
Affiliation(s)
- Slavica Matić
- AGROINNOVA - Centre of Competence for the Innovation in the Agro-environmental Sector, Università di Torino, 10095 Grugliasco (TO), Italy
- Dept. Agricultural, Forestry and Food Sciences (DISAFA), Università di Torino, 10095 Grugliasco (TO), Italy
| | - Giulia Tabone
- AGROINNOVA - Centre of Competence for the Innovation in the Agro-environmental Sector, Università di Torino, 10095 Grugliasco (TO), Italy
| | - Angelo Garibaldi
- AGROINNOVA - Centre of Competence for the Innovation in the Agro-environmental Sector, Università di Torino, 10095 Grugliasco (TO), Italy
| | - Maria Lodovica Gullino
- AGROINNOVA - Centre of Competence for the Innovation in the Agro-environmental Sector, Università di Torino, 10095 Grugliasco (TO), Italy
- Dept. Agricultural, Forestry and Food Sciences (DISAFA), Università di Torino, 10095 Grugliasco (TO), Italy
| |
|
20
|
Tello J, Torres-Pérez R, Flutre T, Grimplet J, Ibáñez J. VviUCC1 Nucleotide Diversity, Linkage Disequilibrium and Association with Rachis Architecture Traits in Grapevine. Genes (Basel) 2020; 11:E598. [PMID: 32485819 PMCID: PMC7348735 DOI: 10.3390/genes11060598] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/06/2020] [Revised: 05/25/2020] [Accepted: 05/27/2020] [Indexed: 11/25/2022] Open
Abstract
Cluster compactness is a trait with high agronomic relevance, affecting crop yield and grape composition. Rachis architecture is a major component of cluster compactness determinism, and is a target trait toward the breeding of grapevine varieties less susceptible to pests and diseases. Although its genetic basis is scarcely understood, a preliminary result indicated a possible involvement of the VviUCC1 gene. The aim of this study was to characterize the VviUCC1 gene in grapevine and to test the association between the natural variation observed for a series of rachis architecture traits and the polymorphisms detected in the VviUCC1 sequence. This gene encodes an uclacyanin plant-specific cell-wall protein involved in fiber formation and/or lignification processes. A high nucleotide diversity in the VviUCC1 gene promoter and coding regions was observed, but no critical effects were predicted in the protein domains, indicating a high level of conservation of its function in the cultivated grapevine. After correcting statistical models for genetic stratification and linkage disequilibrium effects, marker-trait association results revealed a series of single nucleotide polymorphisms (SNPs) significantly associated with cluster compactness and rachis traits variation. Two of them (Y-984 and K-88) affected two common cis-transcriptional regulatory elements, suggesting an effect on phenotype via gene expression regulation. This work reinforces the interest of further studies aiming to reveal the functional effect of the detected VviUCC1 variants on grapevine rachis architecture.
Affiliation(s)
- Javier Tello
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain
| | - Rafael Torres-Pérez
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain
- Servicio de Bioinformática para Genómica y Proteómica (BioinfoGP), Centro Nacional de Biotecnología (CNB-CSIC), 28049 Madrid, Spain
| | - Timothée Flutre
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE-Le Moulon, 91190 Gif-sur-Yvette, France
| | - Jérôme Grimplet
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain
- Unidad de Hortofruticultura, Centro de Investigación y Tecnología Agroalimentaria de Aragón (CITA), 50059 Zaragoza, Spain
- Instituto Agroalimentario de Aragón-IA2 (CITA-Universidad de Zaragoza), 50059 Zaragoza, Spain
| | - Javier Ibáñez
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain
| |
|
21
|
Golicz AA, Bhalla PL, Edwards D, Singh MB. Rice 3D chromatin structure correlates with sequence variation and meiotic recombination rate. Commun Biol 2020; 3:235. [PMID: 32398676 PMCID: PMC7217851 DOI: 10.1038/s42003-020-0932-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/29/2019] [Accepted: 03/31/2020] [Indexed: 11/30/2022] Open
Abstract
Genomes of many eukaryotic species have a defined three-dimensional architecture critical for cellular processes. They are partitioned into topologically associated domains (TADs), defined as regions of high chromatin inter-connectivity. While TADs are not a prominent feature of A. thaliana genome organization, they have been reported for other plants including rice, maize, tomato and cotton, for which TAD formation appears to be linked to transcription and chromatin epigenetic status. Here we show that in the rice genome, sequence variation and meiotic recombination rate correlate with the 3D genome structure. TADs display increased SNP and SV density and higher recombination rate compared to inter-TAD regions. We associate the observed differences with the TAD epigenetic landscape, TE composition and an increased incidence of meiotic crossovers. Golicz et al. report an increase in single nucleotide polymorphisms and structural variations across and within Topologically Associated Domains (TADs) in the rice genome, which differs from the pattern observed in the human genome. They show that this may be due to epigenetic modifications, transposable element composition, and meiotic crossovers in the TAD regions.
Affiliation(s)
- Agnieszka A Golicz
- School of Agriculture and Food, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia.
| | - Prem L Bhalla
- School of Agriculture and Food, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, 6009, Australia
| | - Mohan B Singh
- School of Agriculture and Food, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
| |
|
22
|
Mustafa MI, Murshed NS, Abdelmoneim AH, Abdelmageed MI, Elfadol NM, Makhawi AM. Extensive In Silico Analysis of ATL1 Gene: Discovered Five Mutations That May Cause Hereditary Spastic Paraplegia Type 3A. Scientifica 2020; 2020:8329286. [PMID: 32322428 PMCID: PMC7140133 DOI: 10.1155/2020/8329286] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/24/2019] [Revised: 01/31/2020] [Accepted: 02/21/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND Hereditary spastic paraplegia type 3A (SPG3A) is an inherited neurodegenerative disease and a type of hereditary spastic paraplegia (HSP). It is the second most frequent type of HSP and is characterized by progressive bilateral and mostly symmetric spasticity and weakness of the legs. SPG3A gene mutations and the phenotype-genotype correlations have not yet been recognized. The aim of this work was to categorize the most damaging SNPs in the ATL1 gene and to predict their impact at the functional and structural levels using several computational analysis tools. METHODS The raw data for the ATL1 gene were retrieved from the dbSNP database and then run through numerous computational analysis tools. Additionally, we submitted the six deleterious outcomes common to the previous functional analysis tools to I-Mutant 3.0 and MUpro, respectively, to investigate their effect at the structural level. The 3D structure of ATL1 was predicted by RaptorX and modeled using UCSF Chimera to compare the differences between the native and the mutant amino acids. RESULTS Five nsSNPs out of 249 were classified as the most deleterious (rs746927118, rs979765709, rs119476049, rs864622269, and rs1242753115). CONCLUSIONS In this study, the impact of nsSNPs in the ATL1 gene was investigated by various in silico tools, which revealed that five nsSNPs (V67F, T120I, R217Q, R495W, and G504E) are deleterious SNPs that have a functional impact on the ATL1 protein and, therefore, can be used as genomic biomarkers, specifically before 4 years of age; they may also play a key role in pharmacogenomics by evaluating drug response for this disabling disease.
Affiliation(s)
| | - Naseem S. Murshed
- Department of Microbiology, International University of Africa, Khartoum, Sudan
| | | | | | - Nafisa M. Elfadol
- Department of Microbiology, National Ribat University, Khartoum, Sudan
| | | |
|
23
|
Prathiviraj R, Chellapandi P. Modeling a global regulatory network of Methanothermobacter thermautotrophicus strain ∆H. ACTA ACUST UNITED AC 2020. [DOI: 10.1007/s13721-020-0223-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022]
|
24
|
Vondras AM, Minio A, Blanco-Ulate B, Figueroa-Balderas R, Penn MA, Zhou Y, Seymour D, Ye Z, Liang D, Espinoza LK, Anderson MM, Walker MA, Gaut B, Cantu D. The genomic diversification of grapevine clones. BMC Genomics 2019; 20:972. [PMID: 31830913 PMCID: PMC6907202 DOI: 10.1186/s12864-019-6211-2] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 04/23/2019] [Accepted: 10/22/2019] [Indexed: 12/14/2022] Open
Abstract
Background Vegetatively propagated clones accumulate somatic mutations. The purpose of this study was to better appreciate clone diversity and involved defining the nature of somatic mutations throughout the genome. Fifteen Zinfandel winegrape clone genomes were sequenced and compared to one another using a highly contiguous genome reference produced from one of the clones, Zinfandel 03. Results Though most heterozygous variants were shared, somatic mutations accumulated in individual and subsets of clones. Overall, heterozygous mutations were most frequent in intergenic space and more frequent in introns than exons. A significantly larger percentage of CpG, CHG, and CHH sites in repetitive intergenic space experienced transition mutations than in genic and non-repetitive intergenic spaces, likely because of higher levels of methylation in the region and because methylated cytosines often spontaneously deaminate. Of the minority of mutations that occurred in exons, larger proportions of these were putatively deleterious when they occurred in relatively few clones. Conclusions These data support three major conclusions. First, repetitive intergenic space is a major driver of clone genome diversification. Second, clones accumulate putatively deleterious mutations. Third, the data suggest selection against deleterious variants in coding regions or some mechanism by which mutations are less frequent in coding than noncoding regions of the genome.
Affiliation(s)
- Amanda M Vondras
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Andrea Minio
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Barbara Blanco-Ulate
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA; Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Rosa Figueroa-Balderas
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Michael A Penn
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Yongfeng Zhou
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92617, USA
| | - Danelle Seymour
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92617, USA
| | - Zirou Ye
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Dingren Liang
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Lucero K Espinoza
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Michael M Anderson
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - M Andrew Walker
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
| | - Brandon Gaut
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92617, USA
| | - Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA.
| |
|
25
|
Discovery of Functional SNPs via Genome-Wide Exploration of Malaysian Pigmented Rice Varieties. Int J Genomics 2019; 2019:4168045. [PMID: 31687375 PMCID: PMC6811786 DOI: 10.1155/2019/4168045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/01/2019] [Revised: 08/01/2019] [Accepted: 08/19/2019] [Indexed: 01/30/2023] Open
Abstract
Recently, rice breeding programs have shown increased interest in pigmented rice varieties because of their benefits to human health. However, information on the genetic variation of pigmented rice varieties is still scarce, and this variation remains largely unexplored. Hence, we performed genome-wide SNP analysis using genome resequencing of four Malaysian pigmented rice varieties, comprising two black and two red rice varieties. The genomes of the four pigmented varieties were mapped against the Nipponbare reference genome sequence, and 1.9 million SNPs were discovered. Of these, 622 SNPs with polymorphic sites were identified in 258 protein-coding genes related to metabolism, stress response, and transporters. Comparative analysis of the 622 SNPs with polymorphic sites against six rice SNP datasets from the Ensembl Plants variation database was performed, and 70 SNPs were identified as novel. Analysis of SNPs in the flavonoid biosynthetic genes revealed 40 nonsynonymous SNPs, which have potential as molecular markers for rice seed colour identification. The SNPs highlighted in this study provide valuable genomic resources for rice breeding programs aimed at the genetic improvement of pigmented rice varieties.
|
26
|
Pachganov S, Murtazalieva K, Zarubin A, Sokolov D, Chartier DR, Tatarinova TV. TransPrise: a novel machine learning approach for eukaryotic promoter prediction. PeerJ 2019; 7:e7990. [PMID: 31695967 PMCID: PMC6827441 DOI: 10.7717/peerj.7990] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/09/2019] [Accepted: 10/04/2019] [Indexed: 02/01/2023] Open
Abstract
As interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites (TSS). In this paper we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. Our pipeline consists of two parts: a binary classifier runs first and, if a sequence is classified as TSS-containing, a regression step follows in which the precise location of the TSS is identified. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of the TransPrise classification and regression models with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer equipped with a graphics processing unit, the run time of TransPrise is 250 minutes on a 374 Mb genome. The Matthews correlation coefficient value for TransPrise is 0.79, more than two times larger than the 0.31 for TSSPlant classification models. This represents a high level of prediction accuracy. Additionally, the mean absolute error for the regression model is 29.19 nt, allowing for accurate prediction of TSS location. TransPrise was also tested in Homo sapiens, where the mean absolute error of the regression model was 47.986 nt. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all necessary packages, models, and code, as well as the source code of the TransPrise algorithm, are available at http://compubioverne.group/. The source code is ready to use and customizable to predict TSS in any eukaryotic organism.
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
- Khalimat Murtazalieva
- Vavilov Institute for General Genetics, Moscow, Russia; Institute of Bioinformatics, Moscow, Russia
- Aleksei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
- Duane R Chartier
- International Center for Art Intelligence, Inc., Los Angeles, CA, United States of America
- Tatiana V Tatarinova
- Vavilov Institute for General Genetics, Moscow, Russia; Department of Biology, University of La Verne, La Verne, CA, United States of America; A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Siberian Federal University, Krasnoyarsk, Russia
|
27
|
Mustafa MI, Mohammed ZO, Murshed NS, Elfadol NM, Abdelmoneim AH, Hassan MA. In Silico Genetics Revealing 5 Mutations in CEBPA Gene Associated With Acute Myeloid Leukemia. Cancer Inform 2019; 18:1176935119870817. [PMID: 31621694 PMCID: PMC6777061 DOI: 10.1177/1176935119870817] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/13/2019] [Accepted: 07/30/2019] [Indexed: 12/11/2022] Open
Abstract
Background: Acute myeloid leukemia (AML) is an extremely heterogeneous malignant disorder and has been reported as one of the main causes of death in children. The objective of this work was to classify the most deleterious mutations in CCAAT/enhancer-binding protein-alpha (CEBPA) and to predict their influence on the functional, structural, and expression levels using various bioinformatics analysis tools. Methods: The single nucleotide polymorphisms (SNPs) were retrieved from the National Center for Biotechnology Information (NCBI) database and submitted to various functional analysis tools to predict the influence of each SNP. This was followed by structural analysis of the modeled protein and prediction of the mutation effect on energy stability; the most damaging mutations were chosen for additional investigation with the Mutation3D, Project HOPE, ConSurf, BioEdit, and UCSF Chimera tools. Results: A total of 5 mutations out of 248 were likely to be responsible for structural and functional variations in the CEBPA protein. Among 350 SNPs in the 3′-untranslated region (3′-UTR) of the CEBPA gene, about 11 SNPs were predicted. Among these 11 SNPs, 65 alleles disrupted a conserved miRNA site and 22 derived alleles created a new miRNA site. Conclusions: In this study, the impact of functional mutations in the CEBPA gene was investigated through different bioinformatics analysis techniques, which determined that R339W, R288P, N292S, N292T, and D63N are pathogenic mutations with a possible functional and structural influence; they could therefore be used as genetic biomarkers and may assist in genetic studies, with special consideration of the large heterogeneity of AML.
Affiliation(s)
- Mujahed I Mustafa
- Department of Biotechnology, Africa City of Technology, Khartoum North, Sudan
- Zainab O Mohammed
- Department of Haematology, Ribat University Hospital, Khartoum, Sudan
- Naseem S Murshed
- Department of Biotechnology, Africa City of Technology, Khartoum North, Sudan
- Nafisa M Elfadol
- Department of Biotechnology, Africa City of Technology, Khartoum North, Sudan
- Mohamed A Hassan
- Department of Biotechnology, Africa City of Technology, Khartoum North, Sudan
|
28
|
Li M, Stragliati L, Bellini E, Ricci A, Saba A, Sanità di Toppi L, Varotto C. Evolution and functional differentiation of recently diverged phytochelatin synthase genes from Arundo donax L. J Exp Bot 2019; 70:5391-5405. [PMID: 31145784 PMCID: PMC6793451 DOI: 10.1093/jxb/erz266] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/21/2018] [Accepted: 05/24/2019] [Indexed: 05/15/2023]
Abstract
Phytochelatin synthases (PCSs) play pivotal roles in the detoxification of heavy metals and metalloids in plants; however, little information on the evolution of recently duplicated PCS genes in plant species is available. Here we characterize the evolution and functional differentiation of three PCS genes from the giant reed (Arundo donax L.), a biomass/bioenergy crop with remarkable resistance to cadmium and other heavy metals. Phylogenetic reconstruction with PCS genes from fully sequenced monocotyledonous genomes indicated that the three A. donax PCSs, namely AdPCS1-3, form a monophyletic clade. The AdPCS1-3 genes were expressed at low levels in many A. donax organs and displayed different levels of cadmium-responsive expression in roots. Overexpression of AdPCS1-3 in Arabidopsis thaliana and yeast reproduced the phenotype of functional PCS genes. Mass spectrometry analyses confirmed that AdPCS1-3 are all functional enzymes, but with significant differences in the amount of the phytochelatins synthesized. Moreover, heterogeneous evolutionary rates characterized the AdPCS1-3 genes, indicative of relaxed natural selection. These results highlight the elevated functional differentiation of A. donax PCS genes from both a transcriptional and an enzymatic point of view, providing evidence of the high evolvability of PCS genes and of plant responsiveness to heavy metal stress.
Affiliation(s)
- Mingai Li
- Department of Biodiversity and Molecular Ecology, Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige (TN), Italy
- Luca Stragliati
- Dipartimento di Scienze Chimiche, della Vita e della Sostenibilità Ambientale, Università degli studi di Parma, Parco Area delle Scienze, Parma, Italy
- Erika Bellini
- Dipartimento di Biologia, Università di Pisa, Pisa, Italy
- Ada Ricci
- Dipartimento di Scienze Chimiche, della Vita e della Sostenibilità Ambientale, Università degli studi di Parma, Parco Area delle Scienze, Parma, Italy
- Alessandro Saba
- Dipartimento di Patologia Chirurgica, Medica, Molecolare e dell’Area Critica, Università di Pisa, Pisa, Italy
- Claudio Varotto
- Department of Biodiversity and Molecular Ecology, Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige (TN), Italy
|
29
|
Arslan M, Devisetty UK, Porsch M, Große I, Müller JA, Michalski SG. RNA-Seq analysis of soft rush (Juncus effusus): transcriptome sequencing, de novo assembly, annotation, and polymorphism identification. BMC Genomics 2019; 20:489. [PMID: 31195970 PMCID: PMC6567414 DOI: 10.1186/s12864-019-5886-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/23/2018] [Accepted: 06/05/2019] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Juncus effusus L. (family: Juncaceae; order: Poales) is a helophytic rush that grows in temperate damp or wet terrestrial habitats and has an almost cosmopolitan distribution. The species has been studied intensively with respect to its interaction with co-occurring plants as well as with microbes involved in major biogeochemical cycles. J. effusus has biotechnological value as a component of constructed wetlands, where the plant has been employed in phytoremediation of contaminated water. Its genome has not been sequenced. RESULTS In this study we carried out functional annotation and polymorphism analysis of de novo assembled RNA-Seq data from 18 genotypes using 249 million paired-end Illumina HiSeq reads and 2.8 million 454 Titanium reads. The assembly comprised 158,591 contigs with a mean contig length of 780 bp. The assembly was annotated using the dammit! annotation pipeline, which queries the databases OrthoDB, Pfam-A, and Rfam, and runs BUSCO (Benchmarking Universal Single-Copy Orthologs). In total, 111,567 contigs (70.3%) were annotated with functional descriptions, assigned gene ontology terms, and conserved protein domains, which resulted in 30,932 non-redundant gene sequences. Results of the BUSCO and KEGG pathway analyses for J. effusus were similar to those for the well-studied Poales members Oryza sativa and Sorghum bicolor. A total of 566,433 polymorphisms were identified in transcribed regions, with an average frequency of 1 polymorphism in every 171 bases. CONCLUSIONS The transcriptome assembly was of high quality and genome coverage was sufficient for global analyses. This annotated knowledge resource can be utilized for future gene expression analysis, genomic feature comparisons, genotyping, primer design, and functional genomics in J. effusus.
Affiliation(s)
- Muhammad Arslan
- Department Environmental Biotechnology, Helmholtz Centre for Environmental Research - UFZ, Permoserstr. 15, Leipzig, Germany; Institute for Biology V (Environmental Research), RWTH Aachen University, Templergraben 55, 52062, Aachen, Germany
- Martin Porsch
- Institute of Computer Science, Martin-Luther-University Halle-Wittenberg, Von-Seckendorff-Platz 1, 06120, Halle (Saale), Germany; Core Facility Deep Sequencing, Martin-Luther-University Halle-Wittenberg, Magdeburger Str. 2, 06112, Halle (Saale), Germany
- Ivo Große
- Institute of Computer Science, Martin-Luther-University Halle-Wittenberg, Von-Seckendorff-Platz 1, 06120, Halle (Saale), Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103, Leipzig, Germany
- Jochen A Müller
- Department Environmental Biotechnology, Helmholtz Centre for Environmental Research - UFZ, Permoserstr. 15, Leipzig, Germany
- Stefan G Michalski
- Department of Community Ecology, Helmholtz Centre for Environmental Research - UFZ, Theodor-Lieser-Str. 4, 06120, Halle (Saale), Germany
|
30
|
Fuentes RR, Chebotarov D, Duitama J, Smith S, De la Hoz JF, Mohiyuddin M, Wing RA, McNally KL, Tatarinova T, Grigoriev A, Mauleon R, Alexandrov N. Structural variants in 3000 rice genomes. Genome Res 2019; 29:870-880. [PMID: 30992303 PMCID: PMC6499320 DOI: 10.1101/gr.241240.118] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Received: 06/29/2018] [Accepted: 03/11/2019] [Indexed: 12/24/2022]
Abstract
Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5′ UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.
Affiliation(s)
- Roven Rommel Fuentes
- International Rice Research Institute, Laguna 4031, Philippines; Bioinformatics Group, Wageningen University and Research, 6708 PB Wageningen, the Netherlands
- Jorge Duitama
- Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá 111711, Colombia; Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia
- Sean Smith
- Biology Department, Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA
- Juan Fernando De la Hoz
- Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia
- Rod A Wing
- International Rice Research Institute, Laguna 4031, Philippines; Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721, USA; King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
- Tatiana Tatarinova
- Department of Biology, University of La Verne, La Verne, California 91750, USA; Vavilov Institute of General Genetics, Moscow 119333, Russia; A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127051, Russia; Laboratory of Forest Genomics, Siberian Federal University, Krasnoyarsk 660041, Russia
- Andrey Grigoriev
- Biology Department, Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA
- Ramil Mauleon
- International Rice Research Institute, Laguna 4031, Philippines
|
31
|
Neininger K, Marschall T, Helms V. SNP and indel frequencies at transcription start sites and at canonical and alternative translation initiation sites in the human genome. PLoS One 2019; 14:e0214816. [PMID: 30978217 PMCID: PMC6461226 DOI: 10.1371/journal.pone.0214816] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/26/2018] [Accepted: 03/20/2019] [Indexed: 11/30/2022] Open
Abstract
Single-nucleotide polymorphisms (SNPs) are the most common form of genetic variation in humans and drive phenotypic variation. Due to evolutionary conservation, SNPs and indels (insertions and deletions) are depleted in functionally important sequence elements. Recently, population-scale sequencing efforts such as the 1000 Genomes Project and the Genome of the Netherlands Project have catalogued large numbers of sequence variants. Here, we present a systematic analysis of the polymorphisms reported by these two projects in different coding and non-coding genomic elements of the human genome (intergenic regions, CpG islands, promoters, 5′ UTRs, coding exons, 3′ UTRs, introns, and intragenic regions). Furthermore, we were especially interested in the distribution of SNPs and indels in the direct vicinity of the transcription start site (TSS) and translation start site (CSS). We discovered an enrichment of the dinucleotides CpG and CpA and an accumulation of SNPs at base position −1 relative to the TSS that involved primarily CpG and CpA dinucleotides. Genes having a CpG dinucleotide at TSS position −1 were enriched in the functional GO terms “Phosphoprotein”, “Alternative splicing”, and “Protein binding”. Focusing on the CSS, we compared SNP patterns in the flanking regions of canonical and alternative AUG and near-cognate start sites, where we considered alternative starts previously identified by experimental ribosome profiling. We observed similar conservation patterns of canonical and alternative translation start sites, which underlines the importance of alternative translation mechanisms for cellular function.
Affiliation(s)
- Kerstin Neininger
- Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
- Graduate School of Computer Science, Saarland University, 66123 Saarbrücken, Germany
- Tobias Marschall
- Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
- Max Planck Institute for Informatics, 66123 Saarbrücken, Germany
- Volkhard Helms
- Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
|
32
|
Rajkumar MS, Garg R, Jain M. Genome-wide discovery of DNA polymorphisms among chickpea cultivars with contrasting seed size/weight and their functional relevance. Sci Rep 2018; 8:16795. [PMID: 30429540 PMCID: PMC6235875 DOI: 10.1038/s41598-018-35140-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/13/2017] [Accepted: 10/31/2018] [Indexed: 12/16/2022] Open
Abstract
Seed size/weight is a major agronomic trait that determines crop productivity in legumes. To understand the genetic basis of seed size determination, we sought to identify DNA polymorphisms between two small-seeded (Himchana 1 and Pusa 362) and two large-seeded (JGK 3 and PG 0515) chickpea cultivars via whole genome resequencing. We identified a total of 75535 single nucleotide polymorphisms (SNPs), 6486 insertions and deletions (InDels), 1938 multi-nucleotide polymorphisms (MNPs) and 5025 complex variants between the two small-seeded and two large-seeded chickpea cultivars. Our analysis revealed 814, 244 and 72 seed-specific genes harboring promoter, non-synonymous and large-effect DNA polymorphisms, respectively. Gene ontology analysis revealed enrichment of cell growth and division related terms in these genes. Among them, at least 22 genes, including those associated with quantitative trait loci, involved in cell growth and division, or encoding transcription factors, harbored promoter and/or large-effect/non-synonymous DNA polymorphisms. These also showed higher expression at late-embryogenesis and/or mid-maturation stages of seed development in the large-seeded cultivar, suggesting a role in seed size/weight determination in chickpea. Altogether, this study provides a valuable resource for large-scale genotyping applications and a few putative candidate genes that might play a crucial role in governing seed size/weight in chickpea.
Affiliation(s)
- Mohan Singh Rajkumar
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
- Rohini Garg
- Department of Life Sciences, School of Natural Sciences, Shiv Nadar University, Gautam Buddha Nagar, Uttar Pradesh, 201314, India
- Mukesh Jain
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India; National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi, 110067, India
|
33
|
Investigation of the Perilipin 5 gene expression and association study of its sequence polymorphism with meat and carcass quality traits in different pig breeds. Animal 2018; 12:1135-1143. [DOI: 10.1017/s1751731117002804] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022] Open
|
34
|
Triska M, Solovyev V, Baranova A, Kel A, Tatarinova TV. Nucleotide patterns aiding in prediction of eukaryotic promoters. PLoS One 2017; 12:e0187243. [PMID: 29141011 PMCID: PMC5687710 DOI: 10.1371/journal.pone.0187243] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/02/2017] [Accepted: 09/05/2017] [Indexed: 01/09/2023] Open
Abstract
Computational analysis of promoters is hindered by the complexity of their architecture. In less studied genomes with complex organization, false positive promoter predictions are common. Accurate identification of transcription start sites (TSSs) and core promoter regions remains an unsolved problem. In this paper, we present a comprehensive analysis of genomic features associated with promoters and show that models driven by probabilistic integrative algorithms allow accurate classification of DNA sequences into “promoters” and “non-promoters” even in the absence of full-length cDNA sequences. These models may be built upon maps of the distributions of sequence polymorphisms, RNA sequencing reads on genomic DNA, methylated nucleotides, and transcription factor binding sites, as well as relative frequencies of nucleotides and their combinations. Positional clustering of binding sites shows that the cells of Oryza sativa utilize three distinct classes of transcription factors (TFs): those that bind preferentially to the [-500,0] region (188 “promoter-specific” TFs), those that bind preferentially to the [0,500] region (282 “5′ UTR-specific” TFs), and 207 “promiscuous” TFs with little or no location preference with respect to the TSS. For the most informative motifs, positional preferences are conserved between dicots and monocots.
Affiliation(s)
- Martin Triska
- Children’s Hospital Los Angeles, University of Southern California, Los Angeles, CA, United States of America
- Faculty of Advanced Technology, University of South Wales, Pontypridd, Wales, United Kingdom
- Ancha Baranova
- School of Systems Biology, George Mason University, Fairfax, VA, United States of America
- Research Centre for Medical Genetics, Moscow, Russia
- Alexander Kel
- geneXplain GmbH, Wolfenbuettel, Germany
- Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia
- Tatiana V. Tatarinova
- School of Systems Biology, George Mason University, Fairfax, VA, United States of America
- Department of Biology, Division of Natural Sciences, University of La Verne, La Verne, CA, United States of America
- Bioinformatics Center, AA Kharkevich Institute for Information Transmission Problems RAS, Moscow, Russia
- Vavilov’s Institute for General Genetics, Moscow, Russia
|
35
|
Thakur Z, Saini V, Arya P, Kumar A, Mehta PK. Computational insights into promoter architecture of toxin-antitoxin systems of Mycobacterium tuberculosis. Gene 2017; 641:161-171. [PMID: 29066303 DOI: 10.1016/j.gene.2017.10.054] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/11/2017] [Revised: 09/27/2017] [Accepted: 10/16/2017] [Indexed: 12/16/2022]
Abstract
Toxin-antitoxin (TA) systems are two-component genetic modules widespread in many bacterial genomes, including that of Mycobacterium tuberculosis (Mtb). TA systems play a significant role in biofilm formation, antibiotic tolerance and persistence of the pathogen inside host cells. Deciphering the regulatory motifs of Mtb TA systems is the first essential step toward understanding their transcriptional regulation. In this study, two in silico approaches, knowledge-based motif discovery and de novo motif discovery, were used to identify the regulatory motifs of 79 Mtb TA systems. The knowledge-based approach was used to design a Perl-based tool, Mtb-sig-miner, available at https://github.com/zoozeal/Mtb-sig-miner, which successfully detected sigma (σ) factor specific regulatory motifs in the promoter regions of Mtb TA modules. Manual curation of the Mtb-sig-miner output hits revealed that the majority of them possessed a σB regulatory motif in their promoter region. The de novo approach, on the other hand, identified a novel conserved motif [(T/A)(G/T)NTA(G/C)(C/A)AT(C/A)] within the promoter region of 14 Mtb TA systems. Molecular docking studies were used to validate this motif as the conserved core region of the operator sequence of the corresponding TA system: the strong binding of the respective antitoxin/toxin to the motif supported its role as the core of the operator sequence. These findings provide computational insight into the transcriptional regulation of Mtb TA systems.
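As an illustration of how a degenerate motif written in this bracketed notation can be searched for in a sequence, the sketch below converts (T/A)(G/T)NTA(G/C)(C/A)AT(C/A) into a regular expression and scans the forward strand. This is not the Mtb-sig-miner implementation (which is Perl-based); the function names and the test sequence are hypothetical, and N is assumed to mean any of A, C, G, or T.

```python
import re

# Motif exactly as written in the abstract.
MOTIF = "(T/A)(G/T)NTA(G/C)(C/A)AT(C/A)"


def motif_to_regex(motif: str) -> str:
    """Translate '(X/Y)' alternatives into '[XY]' classes and 'N' into '[ACGT]'."""
    pattern = re.sub(
        r"\(([ACGT](?:/[ACGT])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    # Character classes built above contain only A, C, G, T, so replacing
    # the remaining 'N' symbols cannot corrupt them.
    return pattern.replace("N", "[ACGT]")


def find_motif(sequence: str, motif: str = MOTIF):
    """Return (0-based start, matched text) for every hit, including overlapping ones."""
    # A lookahead with a capture group lets finditer report overlapping matches.
    lookahead = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]


# Hypothetical promoter fragment used only to exercise the functions.
hits = find_motif("GGTGCTAGCATCAA")
print(motif_to_regex(MOTIF))
print(hits)
```

Only the forward strand is scanned here; a fuller search would also scan the reverse complement of the sequence.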
Affiliation(s)
- Zoozeal Thakur
- Centre for Biotechnology, Maharshi Dayanand University, Rohtak, 124001, Haryana, India
- Vandana Saini
- Toxicology & Computational Biology Group, Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, 124001, Haryana, India
- Preeti Arya
- National Agri-Food Biotechnology Institute, Sector 81, S.A.S Nagar, Mohali, Punjab 140306, India
- Ajit Kumar
- Toxicology & Computational Biology Group, Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, 124001, Haryana, India
- Promod K Mehta
- Centre for Biotechnology, Maharshi Dayanand University, Rohtak, 124001, Haryana, India
|