1
|
Historical routes for diversification of domesticated chickpea inferred from landrace genomics. Mol Biol Evol 2023:7158554. [PMID: 37159511 DOI: 10.1093/molbev/msad110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 04/03/2023] [Accepted: 04/11/2023] [Indexed: 05/11/2023] Open
Abstract
According to archaeological records, chickpea (Cicer arietinum) was first domesticated in the Fertile Crescent about 10,000 years BP. Its subsequent diversification in the Middle East, South Asia, Ethiopia, and the Western Mediterranean, however, remains obscure and cannot be resolved using only archaeological and historical evidence. Moreover, chickpea has two market types, 'desi' and 'kabuli', whose geographic origin is a matter of debate. To decipher chickpea history, we took genetic data from 421 chickpea landraces unaffected by the green revolution and tested complex historical hypotheses of chickpea migration and admixture on two hierarchical spatial levels: within and between major regions of cultivation. For chickpea migration within regions, we developed popdisp, a Bayesian model of population dispersal from a regional representative center towards the sampling sites that considers geographical proximities between sites. This method confirmed that chickpea spread within each geographical region along optimal geographical routes rather than by simple diffusion, and it estimated representative allele frequencies for each region. For chickpea migration between regions, we developed another model, migadmi, that takes allele frequencies of populations and evaluates multiple and nested admixture events. Applying this model to desi populations, we found both Indian and Middle Eastern traces in Ethiopian chickpea, suggesting the presence of a seaway from South Asia to Ethiopia. As for the origin of kabuli chickpeas, we found significant evidence for their origin in Turkey rather than Central Asia.
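To make the idea of evaluating an admixture event from allele frequencies concrete, here is a minimal sketch. It is not the migadmi model described in the abstract: it assumes a single admixture event, a simple linear mixture of two source populations, and ordinary least squares. The function name and the example frequencies are hypothetical.

```python
def estimate_admixture_weight(source_a, source_b, admixed):
    """Least-squares estimate of w in: admixed ~= w * source_a + (1 - w) * source_b.

    Each argument is a list of allele frequencies (one value per SNP).
    This is an illustrative linear-mixture assumption, not the migadmi model.
    """
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b, p_m in zip(source_a, source_b, admixed):
        diff = p_a - p_b
        numerator += (p_m - p_b) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("source populations have identical allele frequencies")
    # Clamp to the valid range of a mixing proportion.
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical allele frequencies at four SNPs in two source regions.
region_a = [0.10, 0.80, 0.50, 0.30]
region_b = [0.60, 0.20, 0.40, 0.90]
# A synthetic admixed population built with 30% ancestry from region_a.
mixed = [0.3 * a + 0.7 * b for a, b in zip(region_a, region_b)]
weight = estimate_admixture_weight(region_a, region_b, mixed)
```

With noise-free synthetic input the estimate recovers the mixing proportion used to build `mixed`; real landrace data would also need a model of drift and sampling error, which this sketch omits.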
|
2
|
Microevolution, speciation and macroevolution in rhizobia: Genomic mechanisms and selective patterns. FRONTIERS IN PLANT SCIENCE 2022; 13:1026943. [PMID: 36388581 PMCID: PMC9640933 DOI: 10.3389/fpls.2022.1026943] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/24/2022] [Accepted: 10/06/2022] [Indexed: 06/16/2023]
Abstract
Nodule bacteria (rhizobia), N2-fixing symbionts of leguminous plants, represent an excellent model to study the fundamental issues of evolutionary biology, including the tradeoff between microevolution, speciation, and macroevolution, which remains poorly understood for free-living organisms. Taxonomically, rhizobia are extremely diverse: they are represented by nearly a dozen families of α-proteobacteria (Rhizobiales) and by some β-proteobacteria. Their genomes are composed of core parts, including house-keeping genes (hkg), and of accessory parts, including symbiotically specialized (sym) genes. In multipartite genomes of evolutionarily advanced fast-growing species (Rhizobiaceae), sym genes are clustered on extra-chromosomal replicons (megaplasmids, chromids), facilitating gene transfer in plant-associated microbial communities. In this review, we demonstrate that in rhizobia, microevolution and speciation involve different genomic and ecological mechanisms: the first is based on the diversification of sym genes occurring under the impacts of host-induced natural selection (including its disruptive, frequency-dependent and group forms); the second is based on the diversification of hkgs under the impacts of unknown factors. By contrast, macroevolution represents the polyphyletic origin of super-species taxa, which are dependent on the transfer of sym genes from rhizobia to various soil-borne bacteria. Since the expression of newly acquired sym genes on foreign genomic backgrounds is usually restricted, conversion of the resulting recombinants into novel rhizobial species involves post-transfer genetic changes. They are presumably supported by host-induced selective processes resulting in the sequential derepression of nod genes responsible for nodulation and of nif/fix genes responsible for symbiotic N2 fixation.
|
3
|
Be aware of the allele-specific bias and compositional effects in multi-template PCR. PeerJ 2022; 10:e13888. [PMID: 36061756 PMCID: PMC9438772 DOI: 10.7717/peerj.13888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2020] [Accepted: 07/21/2022] [Indexed: 01/19/2023] Open
Abstract
High-throughput sequencing of amplicon libraries is the most widespread and one of the most effective ways to study the taxonomic structure of microbial communities, despite the growing accessibility of whole metagenome sequencing. Due to the targeted amplification, the method provides unparalleled resolution of communities, but at the same time perturbs the initial community structure, thereby reducing data robustness and compromising downstream analyses. Experimental research on the perturbations is largely limited to comparative studies of different PCR protocols without considering other sources of experimental variation related to characteristics of the initial microbial composition itself. Here we analyse these sources and demonstrate how dramatically they affect the relative abundances of taxa during the PCR cycles. We developed a mathematical model of PCR amplification assuming heterogeneity of amplification efficiencies and considering the compositional nature of the data. We designed an experiment (five consecutive amplicon cycles, 22-26, with 12 replicates for one real human stool microbial sample) and estimated the dynamics of the microbial community in line with the model. We found high heterogeneity in the amplicon efficiencies of taxa, which leads to non-linear and substantial (up to fivefold) changes in relative abundances during PCR. The analysis of possible sources of heterogeneity revealed a significant association between amplicon efficiencies and the energy of secondary structures of the DNA templates. The result of our work highlights non-trivial changes in the dynamics of real-life microbial communities due to their compositional nature. The observed effects apply not only to amplicon libraries, but also to any studies of metagenome dynamics.
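The compositional effect described in this abstract can be illustrated with a small sketch. It assumes the textbook exponential amplification model, in which a template with per-cycle efficiency e grows by a factor (1 + e) each cycle; this is not necessarily the authors' exact model, and the taxa, starting abundances, and efficiencies below are hypothetical.

```python
def relative_abundances(initial, efficiencies, cycles):
    """Relative abundances after `cycles` PCR cycles.

    Assumes each taxon i grows as initial[i] * (1 + efficiencies[i]) ** cycles
    (an idealized exponential model) and then renormalizes to proportions,
    which is what makes the data compositional.
    """
    amplified = [n * (1.0 + e) ** cycles for n, e in zip(initial, efficiencies)]
    total = sum(amplified)
    return [a / total for a in amplified]


# Two hypothetical taxa that start at equal abundance but amplify
# with slightly different per-cycle efficiencies.
start = [1.0, 1.0]
eff = [1.00, 0.90]
before = relative_abundances(start, eff, 0)
after = relative_abundances(start, eff, 25)
```

Because proportions must sum to one, the gain of the more efficient template is necessarily a loss for the other, even though both templates grow in absolute terms.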
|
4
|
Heterogeneity of the GFP fitness landscape and data-driven protein design. eLife 2022; 11:75842. [PMID: 35510622 PMCID: PMC9119679 DOI: 10.7554/elife.75842] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/25/2021] [Accepted: 03/25/2022] [Indexed: 11/24/2022] Open
Abstract
Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited due to the scarcity of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design; instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work provides insights for the practical application of fitness landscape heterogeneity in protein engineering.
|
5
|
Human prostate cancer bone metastases have an actionable immunosuppressive microenvironment. Cancer Cell 2021; 39:1464-1478.e8. [PMID: 34719426 PMCID: PMC8578470 DOI: 10.1016/j.ccell.2021.09.005] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Received: 01/04/2021] [Revised: 07/15/2021] [Accepted: 09/14/2021] [Indexed: 02/06/2023]
Abstract
Bone metastases are devastating complications of cancer. They are particularly common in prostate cancer (PCa), represent incurable disease, and are refractory to immunotherapy. We seek to define distinct features of the bone marrow (BM) microenvironment by analyzing single cells from bone metastatic prostate tumors, involved BM, uninvolved BM, and BM from cancer-free, orthopedic patients, and healthy individuals. Metastatic PCa is associated with multifaceted immune distortion, specifically exhaustion of distinct T cell subsets and the appearance of macrophages with states specific to PCa bone metastases. The chemokine CCL20 is notably overexpressed by myeloid cells, as is its cognate CCR6 receptor on T cells. Disruption of the CCL20-CCR6 axis in mice with syngeneic PCa bone metastases restores T cell reactivity and significantly prolongs animal survival. Comparative high-resolution analysis of PCa bone metastases shows a targeted approach for relieving local immunosuppression for therapeutic effect.
|
6
|
Towards Understanding Afghanistan Pea Symbiotic Phenotype Through the Molecular Modeling of the Interaction Between LykX-Sym10 Receptor Heterodimer and Nod Factors. FRONTIERS IN PLANT SCIENCE 2021; 12:642591. [PMID: 34025691 PMCID: PMC8138044 DOI: 10.3389/fpls.2021.642591] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/16/2020] [Accepted: 04/13/2021] [Indexed: 05/06/2023]
Abstract
The difference in symbiotic specificity between peas of Afghanistan and European phenotypes was investigated using molecular modeling. Considering segregating amino acid polymorphism, we examined interactions of pea LykX-Sym10 receptor heterodimers with four forms of Nodulation factor (NF) that varied in natural decorations (acetylation and length of the glucosamine chain). First, we showed the stability of the LykX-Sym10 dimer during molecular dynamics (MD) in solvent and in the presence of a membrane. Then, four NFs were separately docked to one European and two Afghanistan dimers, and the results of these interactions were in line with corresponding pea symbiotic phenotypes. The European variant of the LykX-Sym10 dimer effectively interacts with both acetylated and non-acetylated forms of NF, while the Afghanistan variants successfully interact with the acetylated form only. We additionally demonstrated that the length of the NF glucosamine chain contributes to controlling the effectiveness of the symbiotic interaction. The obtained results support a recent hypothesis that the LykX gene is a suitable candidate for the unidentified Sym2 allele, the determinant of pea specificity toward Rhizobium leguminosarum bv. viciae strains producing NFs with or without an acetylation decoration. The developed modeling methodology demonstrated its power in multiple searches for genetic determinants, when experimental detection of such determinants has proven extremely difficult.
|
7
|
Abstract IA23: Impact of metastatic prostate cancer on human bone marrow. Cancer Res 2020. [DOI: 10.1158/1538-7445.tumhet2020-ia23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Cancer-related mortality due to solid tumor malignancies is overwhelmingly due to the development and progression of metastases. In advanced prostate cancer, metastases most often involve the bone and generally represent incurable disease. It remains unclear what aspects of the bone marrow microenvironment make it hospitable to metastatic dissemination. Similarly, the impact of the metastatic tumor on hematopoiesis and the marrow immune response is poorly understood. We took advantage of rare spinal cord decompression surgeries to profile marrow and metastatic tumors from men with advanced prostate cancer at single-cell resolution. Our analysis contrasts the cellular composition and transcriptional states in matched samples of tumor and liquid bone marrow collected at adjacent vertebral body levels, as well as bone marrow of orthopedic patients without malignancy. Metastatic prostate cancer was associated with hematopoietic suppression and multifaceted immune distortion. There was exhaustion of specific T cell subsets, appearance of inflammatory monocytes and macrophages, and alteration of cytokine profiles. Computational analysis showed association between the presence of specific myeloid subsets and the level of T lymphocyte dysfunction in the tumor fraction. We screened for potential signaling axes that may underlie this interaction. Among them was chemokine CCL20, notably overexpressed by myeloid cells, as was its cognate CCR6 receptor, expressed on T cells. We developed a syngeneic mouse model of bone-metastatic prostate cancer to explore this observation, and demonstrated that disruption of the CCL20-CCR6 axis from either side resulted in significant prolongation of survival. Our results further indicated that this dual overexpression was associated with repressed immune responses. 
Overall, comparative high-resolution analysis of bone marrow reveals distinct alterations associated with prostate cancer bone metastases that may be amenable to therapeutic targeting with the goal of altering cancer progression.
Citation Format: Ninib Baryawno, Youmna Kfoury, Nicolas Severe, Shenglin Mei, Karin Gustafsson, Taghreed Hirz, Thomas Brouse, Elizabeth W. Scadden, Anna A. Igolkina, Bryan D. Choi, Nikolas Barkas, John H. Shin, Philip J. Saylor, David T. Scadden, David B. Sykes, Peter V. Kharchenko, as part of the Boston Bone Metastasis Consortium. Impact of metastatic prostate cancer on human bone marrow [abstract]. In: Proceedings of the AACR Virtual Special Conference on Tumor Heterogeneity: From Single Cells to Clinical Impact; 2020 Sep 17-18. Philadelphia (PA): AACR; Cancer Res 2020;80(21 Suppl):Abstract nr IA23.
|
8
|
Multi-trait multi-locus SEM model discriminates SNPs of different effects. BMC Genomics 2020; 21:490. [PMID: 32723302 PMCID: PMC7385891 DOI: 10.1186/s12864-020-06833-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/16/2019] [Accepted: 06/16/2020] [Indexed: 11/21/2022] Open
Abstract
Background: There is a plethora of methods for genome-wide association studies. However, only a few of them may be classified as multi-trait and multi-locus, i.e. as considering the influence of multiple genetic variants on several correlated phenotypes. Results: We propose a multi-trait multi-locus model which employs structural equation modeling (SEM) to describe complex associations between SNPs and traits: multi-trait multi-locus SEM (mtmlSEM). The structure of our model makes it possible to discriminate pleiotropic and single-trait SNPs of direct and indirect effect. We also propose an automatic procedure to construct the model using factor analysis and the maximum likelihood method. For estimating a large number of parameters in the model, we performed Bayesian inference and implemented Gibbs sampling. An important feature of the model is that it correctly copes with non-normally distributed variables, such as some traits and variants. Conclusions: We applied the model to Vavilov’s collection of 404 chickpea (Cicer arietinum L.) accessions with 20-fold cross-validation. We analyzed 16 phenotypic traits which we organized into five groups and found around 230 SNPs associated with traits, 60 of which were of pleiotropic effect. The model demonstrated high accuracy in predicting trait values.
|
9
|
H3K4me3, H3K9ac, H3K27ac, H3K27me3 and H3K9me3 Histone Tags Suggest Distinct Regulatory Evolution of Open and Condensed Chromatin Landmarks. Cells 2019; 8:cells8091034. [PMID: 31491936 PMCID: PMC6770625 DOI: 10.3390/cells8091034] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 07/29/2019] [Revised: 08/28/2019] [Accepted: 09/03/2019] [Indexed: 12/12/2022] Open
Abstract
Background: Transposons are selfish genetic elements that self-reproduce in host DNA. They were active during evolutionary history and now occupy almost half of mammalian genomes. Close insertions of transposons reshaped structure and regulation of many genes considerably. Co-evolution of transposons and host DNA frequently results in the formation of new regulatory regions. Previously we published a concept that the proportion of functional features held by transposons positively correlates with the rate of regulatory evolution of the respective genes. Methods: We ranked human genes and molecular pathways according to their regulatory evolution rates based on high throughput genome-wide data on five histone modifications (H3K4me3, H3K9ac, H3K27ac, H3K27me3, H3K9me3) linked with transposons for five human cell lines. Results: Based on the total of approximately 1.5 million histone tags, we ranked regulatory evolution rates for 25075 human genes and 3121 molecular pathways and identified groups of molecular processes that showed signs of either fast or slow regulatory evolution. However, histone tags showed different regulatory patterns and formed two distinct clusters: promoter/active chromatin tags (H3K4me3, H3K9ac, H3K27ac) vs. heterochromatin tags (H3K27me3, H3K9me3). Conclusion: In humans, transposon-linked histone marks evolved in a coordinated way depending on their functional roles.
|
10
|
Matching population diversity of rhizobial nodA and legume NFR5 genes in plant-microbe symbiosis. Ecol Evol 2019; 9:10377-10386. [PMID: 31624556 PMCID: PMC6787799 DOI: 10.1002/ece3.5556] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/19/2018] [Revised: 07/07/2019] [Accepted: 07/15/2019] [Indexed: 12/31/2022] Open
Abstract
We hypothesized that population diversities of partners in nitrogen-fixing rhizobium-legume symbiosis can be matched for "interplaying" genes. We tested this hypothesis using data on nucleotide polymorphism of symbiotic genes encoding two components of the plant-bacteria signaling system: (a) the rhizobial nodA acyltransferase involved in the fatty acid tail decoration of the Nod factor (signaling molecule); (b) the plant NFR5 receptor required for Nod factor binding. We collected three wild-growing legume species together with soil samples adjacent to the roots from one large 25-year fallow: Vicia sativa, Lathyrus pratensis, and Trifolium hybridum nodulated by one of the two Rhizobium leguminosarum biovars (viciae and trifolii). For each plant species, we prepared three pools for DNA extraction and further sequencing: the plant pool (30 individual plants), the nodule pool (90 nodules), and the soil pool (30 samples). We observed the following statistically significant results: (a) a monotonic relationship between the diversity in the plant NFR5 gene pools and the nodule rhizobial nodA gene pools; (b) higher topological similarity of the NFR5 gene tree with the nodA gene tree of the nodule pool than with the nodA gene tree of the soil pool. Both nonsynonymous diversity and Tajima's D were increased in the nodule pools compared with the soil pools, consistent with relaxation of negative selection and/or admixture of balancing selection. We propose that the observed genetic concordance between NFR5 gene pools and nodule nodA gene pools arises from the selection of particular genotypes of the nodA gene by the host plant.
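For readers unfamiliar with Tajima's D, the statistic mentioned in this abstract, the following sketch computes it from a small alignment using the standard formula (Tajima, 1989). It assumes equal-length, gap-free sequences and is not the pipeline used in the study, which worked with pooled sequencing data; the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    if n < 3:
        raise ValueError("at least three sequences are needed")
    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignment with only rare variants (singletons): D is negative.
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
# Toy alignment with intermediate-frequency variants: D is positive.
d_balanced = tajimas_d(["AA", "AA", "TT", "TT"])
```

An excess of intermediate-frequency variants pushes D upward, which is why an increase in D is read as compatible with relaxed negative selection or balancing selection.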
|
11
|
Analysis of Gene Expression Variance in Schizophrenia Using Structural Equation Modeling. Front Mol Neurosci 2018; 11:192. [PMID: 29942251 PMCID: PMC6004421 DOI: 10.3389/fnmol.2018.00192] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 01/31/2018] [Accepted: 05/15/2018] [Indexed: 01/02/2023] Open
Abstract
Schizophrenia (SCZ) is a psychiatric disorder of unknown etiology. There is evidence suggesting that aberrations in neurodevelopment are a significant attribute of schizophrenia pathogenesis and progression. To identify biologically relevant molecular abnormalities affecting neurodevelopment in SCZ, we used cultured neural progenitor cells derived from olfactory neuroepithelium (CNON cells). Here, we tested the hypothesis that variance in gene expression differs between individuals from SCZ and control groups. In CNON cells, variance in gene expression was significantly higher in SCZ samples in comparison with control samples. Variance in gene expression was enriched in five molecular pathways: serine biosynthesis, PI3K-Akt, MAPK, neurotrophin and focal adhesion. More than 14% of variance in disease status was explained within the logistic regression model (C-value = 0.70) by predictors accounting for gene expression in 69 genes from these five pathways. Structural equation modeling (SEM) was applied to explore how the structure of these five pathways was altered between SCZ patients and controls. Four out of five pathways showed differences in the estimated relationships among genes: between KRAS and NF1, and KRAS and SOS1 in the MAPK pathway; between PSPH and SHMT2 in serine biosynthesis; between AKT3 and TSC2 in the PI3K-Akt signaling pathway; and between CRK and RAPGEF1 in the focal adhesion pathway. Our analysis provides evidence that variance in gene expression is an important characteristic of SCZ, and that SEM is a promising method for uncovering altered relationships between specific genes, thus suggesting affected gene regulation associated with the disease. We identified altered gene-gene interactions in pathways enriched for genes with increased variance in expression in SCZ. These pathways and loci were previously implicated in SCZ, providing further support for the hypothesis that gene expression variance plays an important role in the etiology of SCZ.
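The C-value reported for the logistic regression model is presumably the concordance statistic (equivalent to the area under the ROC curve for a binary outcome). A minimal sketch of how that statistic is computed from predicted probabilities follows; the function name and the scores are hypothetical and are not data from the study.

```python
def c_statistic(case_scores, control_scores):
    """Concordance statistic for a binary classifier.

    Fraction of (case, control) pairs in which the case receives the
    higher predicted score; tied pairs count as one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities of disease status.
cases = [0.9, 0.4]
controls = [0.4, 0.1]
c_value = c_statistic(cases, controls)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.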
|
12
|
A Pipeline for Classifying Deleterious Coding Mutations in Agricultural Plants. FRONTIERS IN PLANT SCIENCE 2018; 9:1734. [PMID: 30546376 PMCID: PMC6279870 DOI: 10.3389/fpls.2018.01734] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/18/2018] [Accepted: 11/08/2018] [Indexed: 05/18/2023]
Abstract
The impact of deleterious variation on both plant fitness and crop productivity is not completely understood and is a topic of active debate. Deleterious mutations in plants have been predicted solely using sequence conservation methods rather than function-based classifiers due to the lack of well-annotated mutational datasets in these organisms. Here, we developed a machine learning classifier based on a dataset of deleterious and neutral mutations in Arabidopsis thaliana by extracting 18 informative features that discriminate deleterious mutations from neutral ones, including 9 novel features not used in previous studies. We examined linear SVM, Gaussian SVM, and Random Forest classifiers, with the latter performing best. Random Forest classifiers exhibited a markedly higher accuracy than the popular PolyPhen-2 tool on the Arabidopsis dataset. Additionally, we tested whether the Random Forest, trained on the Arabidopsis dataset, accurately predicts deleterious mutations in Oryza sativa and Pisum sativum and observed satisfactory levels of accuracy (87% and 93%, respectively), higher than those obtained with PolyPhen-2. Application of transfer learning in classifiers did not improve their performance. To additionally test the performance of the Random Forest classifier across different angiosperm species, we applied it to annotate deleterious mutations in Cicer arietinum and validated them using population frequency data. Overall, we devised a classifier with the potential to improve the annotation of putative functional mutations in QTL and GWAS hit regions, as well as the evolutionary analysis of the proliferation of deleterious mutations during plant domestication, thus optimizing breeding improvement and the development of new cultivars.
|
13
|
[Characteristics of Natural Selection in Populations of Nodule Bacteria (Rhizobium leguminosarum) Interacting With Different Host Plants]. GENETIKA 2015; 51:1108-1116. [PMID: 27169225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Using high-throughput sequencing of the nodA gene, we studied the population dynamics of Rhizobium leguminosarum (bv. viciae, bv. trifolii) in rhizospheric and nodular subpopulations associated with leguminous plants representing different cross-inoculation groups (Vicia sativa, Lathyrus pratensis of the vetch/vetchling/pea group and Trifolium hybridum of the clover group). The "rhizosphere-nodules" transitions result in either an increase or a decrease in the frequencies of 10 of the 23 operational taxonomic units (OTUs) (which were identified with 95% similarity) depending on the symbiotic specificity and phylogenetic positions of the OTUs. Statistical and bioinformatic analyses of the population structures suggest that the type of natural selection responsible for these changes may be diversifying at the whole-population level and frequency-dependent at the OTU-specific level, ensuring the divergent evolution of rhizobia interacting with different host species.
|