1
|
Gawriljuk VO, Godoy AS, Oerlemans R, Welker LAT, Hirsch AKH, Groves MR. Cryo-EM structure of 1-deoxy-D-xylulose 5-phosphate synthase DXPS from Plasmodium falciparum reveals a distinct N-terminal domain. Nat Commun 2024; 15:6642. [PMID: 39103329 PMCID: PMC11300867 DOI: 10.1038/s41467-024-50671-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 07/17/2024] [Indexed: 08/07/2024] Open
Abstract
Plasmodium falciparum is the main causative agent of malaria, a deadly disease that mainly affects children under five years old. Artemisinin-based combination therapies have been pivotal in controlling the disease, but resistance has arisen in various regions, increasing the risk of treatment failure. The non-mevalonate pathway is essential for the isoprenoid synthesis in Plasmodium and provides several under-explored targets to be used in the discovery of new antimalarials. 1-deoxy-D-xylulose-5-phosphate synthase (DXPS) is the first and rate-limiting enzyme of the pathway. Despite its importance, there are no structures available for any Plasmodium spp., due to the complex sequence which contains large regions of high disorder, making crystallisation a difficult task. In this manuscript, we use cryo-electron microscopy to solve the P. falciparum DXPS structure at a final resolution of 2.42 Å. Overall, the structure resembles other DXPS enzymes but includes a distinct N-terminal domain exclusive to the Plasmodium genus. Mutational studies show that destabilization of the cap domain interface negatively impacts protein stability and activity. Additionally, a density for the co-factor thiamine diphosphate is found in the active site. Our work highlights the potential of cryo-EM to obtain structures of P. falciparum proteins that are unfeasible by means of crystallography.
Affiliation(s)
- Victor O Gawriljuk
- Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
| | - Andre S Godoy
- Sao Carlos Institute of Physics, University of Sao Paulo, Av. Joao Dagnone, 1100 - Jardim Santa Angelina, Sao Carlos, 13563-120, Brazil
| | - Rick Oerlemans
- Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
| | - Luise A T Welker
- Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
| | - Anna K H Hirsch
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) - Helmholtz Centre for Infection Research (HZI), Campus Building E8.1, 66123, Saarbrücken, Germany
- Saarland University, Department of Pharmacy, Campus Building E8.1, 66123, Saarbrücken, Germany
| | - Matthew R Groves
- Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands.
| |
|
2
|
Zhang C, Forsdyke DR. Potential Achilles heels of SARS-CoV-2 are best displayed by the base order-dependent component of RNA folding energy. Comput Biol Chem 2021; 94:107570. [PMID: 34500325 PMCID: PMC8410225 DOI: 10.1016/j.compbiolchem.2021.107570] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2021] [Revised: 08/29/2021] [Accepted: 08/30/2021] [Indexed: 11/29/2022]
Abstract
The base order-dependent component of folding energy has revealed a highly conserved region in HIV-1 genomes that associates with RNA structure. This corresponds to a packaging signal that is recognized by the nucleocapsid domain of the Gag polyprotein. Long viewed as a potential HIV-1 "Achilles heel," the signal can be targeted by a new antiviral compound. Although SARS-CoV-2 differs in many respects from HIV-1, the same technology displays regions with a high base order-dependent folding energy component, which are also highly conserved. This indicates structural invariance (SI) sustained by natural selection. While the regions are often also protein-encoding (e.g., NSP3, ORF3a), we suggest that their nucleic acid level functions can be considered potential "Achilles heels" for SARS-CoV-2, perhaps susceptible to therapies like those envisaged for AIDS. The ribosomal frameshifting element scored well, but higher SI scores were obtained in other regions, including those encoding NSP13 and the nucleocapsid (N) protein.
Affiliation(s)
- Chiyu Zhang
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
| | - Donald R Forsdyke
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario K7L3N6, Canada.
| |
|
3
|
Cappannini A, Forcelloni S, Giansanti A. Evolutionary pressures and codon bias in low complexity regions of plasmodia. Genetica 2021; 149:217-237. [PMID: 34254217 DOI: 10.1007/s10709-021-00126-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2020] [Accepted: 06/30/2021] [Indexed: 11/25/2022]
Abstract
The biological meaning of low complexity regions in the proteins of Plasmodium species is a topic of discussion in evolutionary biology. There is a debate between selectionists and neutralists, who either attribute or do not attribute an effect of low-complexity regions on the fitness of these parasites, respectively. In this work, we comparatively study 22 Plasmodium species to understand whether their low complexity regions undergo a neutral or, rather, a selective and species-dependent evolution. The focus is on the connection between the codon repertoire of the genetic coding sequences and the occurrence of low complexity regions in the corresponding proteins. The first part of the work concerns the correlation between the length of plasmodial proteins and their propensity at embedding low complexity regions. Relative synonymous codon usage, entropy, and other indicators reveal that the incidence of low complexity regions and their codon bias is species-specific and subject to selective evolutionary pressure. We also observed that protein length, a relaxed selective pressure, and a broad repertoire of codons in proteins, are strongly correlated with the occurrence of low complexity regions. Overall, it seems plausible that the codon bias of low-complexity regions contributes to functional innovation and codon bias enhancement of proteins on which Plasmodium species rest as successful evolutionary parasites.
Affiliation(s)
- Andrea Cappannini
- Department of Physics, Sapienza, University of Rome, P.le A. Moro 5, 00185, Roma, Italy.
| | - Sergio Forcelloni
- Max Planck Institute of Biochemistry, 82152, Martinsried, Germany; Department of Chemistry, Technical University of Munich, 85748, Garching, Germany
| | - Andrea Giansanti
- Department of Physics, Sapienza, University of Rome, P.le A. Moro 5, 00185, Roma, Italy; Istituto Nazionale di Fisica Nucleare, INFN, Roma1 section, 00185, Roma, Italy
| |
|
4
|
Neutralism versus selectionism: Chargaff's second parity rule, revisited. Genetica 2021; 149:81-88. [PMID: 33880685 PMCID: PMC8057000 DOI: 10.1007/s10709-021-00119-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/04/2021] [Accepted: 04/09/2021] [Indexed: 11/03/2022]
Abstract
Of Chargaff's four "rules" on DNA base frequencies, the functional interpretation of his second parity rule (PR2) is the most contentious. Thermophile base compositions (GC%) were taken by Galtier and Lobry (1997) as favoring Sueoka's neutral PR2 hypothesis over Forsdyke's selective PR2 hypothesis, namely that mutations improving local within-species recombination efficiency had generated a genome-wide potential for the strands of duplex DNA to separate and initiate recombination through the "kissing" of the tips of stem-loops. However, following Chargaff's GC rule, base composition mainly reflects a species-specific, genome-wide, evolutionary pressure. GC% could not have consistently followed the dictates of temperature, since it plays fundamental roles in both sustaining species integrity and, through primarily neutral genome-wide mutation, fostering speciation. Evidence for a local within-species recombination-initiating role of base order was obtained with a novel technology that masked the contribution of base composition to nucleic acid folding energy. Forsdyke's results were consistent with his PR2 hypothesis, appeared to resolve some root problems in biology and provided a theoretical underpinning for alignment-free taxonomic analyses using relative oligonucleotide frequencies (k-mer analysis). Moreover, consistent with Chargaff's cluster rule, discovery of the thermoadaptive role of the "purine-loading" of open reading frames made less tenable the Galtier-Lobry anti-selectionist arguments.
|
5
|
Wang Y, Yang HJ, Harrison PM. The relationship between protein domains and homopeptides in the Plasmodium falciparum proteome. PeerJ 2020; 8:e9940. [PMID: 33062426 PMCID: PMC7534687 DOI: 10.7717/peerj.9940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2020] [Accepted: 08/24/2020] [Indexed: 12/03/2022] Open
Abstract
The proteome of the malaria parasite Plasmodium falciparum is notable for the pervasive occurrence of homopeptides or low-complexity regions (i.e., regions that are made from a small subset of amino-acid residue types). The most prevalent of these are made from residues encoded by adenine/thymidine (AT)-rich codons, in particular asparagine. We examined homopeptide occurrences within protein domains in P. falciparum. Homopeptide enrichments occur for hydrophobic (e.g., valine), or small residues (alanine or glycine) in short spans (<5 residues), but these enrichments disappear for longer lengths. We observe that short asparagine homopeptides (<10 residues long) have a dramatic relative depletion inside protein domains, indicating some selective constraint to keep them from forming. We surmise that this is possibly linked to co-translational protein folding, although there are specific protein domains that are enriched in longer asparagine homopeptides (≥10 residues) indicating a functional linkage for specific poly-asparagine tracts. Top gene ontology functional category enrichments for homopeptides associated with diverse protein domains include “vesicle-mediated transport”, and “DNA-directed 5′-3′ RNA polymerase activity”, with various categories linked to “binding” evidencing significant homopeptide depletions. Also, in general homopeptides are substantially enriched in the parts of protein domains that are near/in IDRs. The implications of these findings are discussed.
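As a rough illustration of the length thresholds mentioned in this abstract (<5, <10 and ≥10 residues), the sketch below scans a protein sequence for uninterrupted runs of a single residue. It assumes a homopeptide is a perfect single-residue run; the function name `find_homopeptides`, the `min_len` parameter and the toy sequence are hypothetical and are not taken from the paper, whose own definition and pipeline may differ.

```python
import re

def find_homopeptides(protein: str, min_len: int = 5):
    """Return (residue, start, length) for every perfect single-residue run
    of at least `min_len` residues. Assumption: no mismatches are tolerated
    inside a run; the cited study may use a different definition."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # ([A-Z]) captures one residue; \1{n,} requires n or more further copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein.upper())]

# Toy sequence (invented): a 12-residue asparagine tract and a short valine run.
toy = "MK" + "N" * 12 + "LVVVAGGK"
long_runs = find_homopeptides(toy, min_len=10)   # only the poly-N tract qualifies
short_runs = find_homopeptides(toy, min_len=3)   # poly-N tract and the VVV run
```

Intersecting the reported start positions with domain boundaries (for example from a domain annotation table) would then give counts inside and outside domains; that step is left out here because the annotation source is not specified in the abstract.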
|
6
|
MacRaild CA, Seow J, Das SC, Norton RS. Disordered epitopes as peptide vaccines. Pept Sci (Hoboken) 2018; 110:e24067. [PMID: 32328540 PMCID: PMC7167742 DOI: 10.1002/pep2.24067] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 12/19/2017] [Revised: 03/08/2018] [Accepted: 03/09/2018] [Indexed: 01/23/2023]
Abstract
The development of clinically useful peptide-based vaccines remains a long-standing goal. This review highlights that intrinsically disordered protein antigens, which lack an ordered three-dimensional structure, represent excellent starting points for the development of such vaccines. Disordered proteins represent an important class of antigen in a wide range of human pathogens, and, contrary to widespread belief, they are frequently targets of protective antibody responses. Importantly, disordered epitopes appear invariably to be linear epitopes, rendering them ideally suited to incorporation into a peptide vaccine. Nonetheless, the conformational properties of disordered antigens, and hence their recognition by antibodies, frequently depend on the interactions they make and the context in which they are presented to the immune system. These effects must be considered in the design of an effective vaccine. Here we discuss these issues and propose design principles that may facilitate the development of peptide vaccines targeting disordered antigens.
Affiliation(s)
- Christopher A. MacRaild
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville 3052, Australia
| | - Jeffrey Seow
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville 3052, Australia
| | - Sreedam C. Das
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville 3052, Australia
| | - Raymond S. Norton
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville 3052, Australia
| |
|
7
|
Abstract
The genomic architecture of organisms, including nucleotide composition, can be highly variable, even among closely-related species. To better understand the causes leading to structural variation in genomes, information on distinct and diverse genomic features is needed. Malaria parasites are known for encompassing a wide range of genomic GC-content and it has long been thought that Plasmodium falciparum, the virulent malaria parasite of humans, has the most AT-biased eukaryotic genome. Here, I perform comparative genomic analyses of the most AT-rich eukaryotes sequenced to date, and show that the avian malaria parasites Plasmodium gallinaceum, P. ashfordi, and P. relictum have the most extreme coding sequences in terms of AT-bias. Their mean GC-content is 21.21, 21.22 and 21.60 %, respectively, which is considerably lower than the transcriptome of P. falciparum (23.79 %) and other eukaryotes. This information enables a better understanding of genome evolution and raises the question of how certain organisms are able to prosper despite severe compositional constraints.
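The GC-content figures quoted in this abstract are percentages of G and C among the bases of coding sequences. A minimal sketch of that arithmetic follows; the helper names `gc_content` and `mean_gc` and the toy sequences are hypothetical, and averaging per-sequence values (rather than pooling all bases) is an assumption, since the abstract does not say how the means were computed.

```python
def gc_content(seq: str) -> float:
    """GC content of one nucleotide sequence, as a percentage.
    Ambiguous symbols (e.g. N) are ignored rather than counted."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)

def mean_gc(seqs) -> float:
    """Unweighted mean of per-sequence GC percentages (assumed averaging scheme)."""
    values = [gc_content(s) for s in seqs]
    return sum(values) / len(values)

# Toy AT-rich coding sequences, invented for illustration only.
toy_cds = ["ATGAAATTTAAATAA", "ATGGCAAATAATTAA"]
toy_mean = mean_gc(toy_cds)
```

A length-weighted (pooled) mean would give more influence to long genes; which of the two is appropriate depends on the question being asked.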
|
8
|
Pancsa R, Tompa P. Coding Regions of Intrinsic Disorder Accommodate Parallel Functions. Trends Biochem Sci 2016; 41:898-906. [DOI: 10.1016/j.tibs.2016.08.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 06/19/2016] [Revised: 08/16/2016] [Accepted: 08/19/2016] [Indexed: 02/01/2023]
|
9
|
Battistuzzi FU, Schneider KA, Spencer MK, Fisher D, Chaudhry S, Escalante AA. Profiles of low complexity regions in Apicomplexa. BMC Evol Biol 2016; 16:47. [PMID: 26923229 PMCID: PMC4770516 DOI: 10.1186/s12862-016-0625-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 09/12/2015] [Accepted: 02/17/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Low complexity regions (LCRs) are a ubiquitous feature in genomes and yet their evolutionary history and functional roles are unclear. Previous studies have shown contrasting evidence in favor of both neutral and selective mechanisms of evolution for different sets of LCRs, suggesting that modes of identification of these regions may play a role in our ability to discern their evolutionary history. To further investigate this issue, we used a multiple threshold approach to identify species-specific profiles of proteome complexity and, by comparing properties of these sets, determine the influence that starting parameters have on evolutionary inferences. RESULTS: We find that, although qualitatively similar, quantitatively each species has a unique LCR profile which represents the frequency of these regions within each genome. Inferences based on these profiles are more accurate in comparative analyses of genome complexity, as they allow the relative complexity of multiple genomes to be determined, as well as the type of repetitiveness that is most common in each. Based on the multiple threshold LCR sets obtained, we identified predominant evolutionary mechanisms at different complexity levels, which show neutral mechanisms acting on highly repetitive LCRs (e.g., homopolymers) and selective forces becoming more important as heterogeneity of the LCRs increases. CONCLUSIONS: Our results show how inferences based on LCRs are influenced by the parameters used to identify these regions. Sets of LCRs are heterogeneous aggregates of regions that include homo- and heteropolymers and, as such, evolve according to different mechanisms. LCR profiles provide a new way to investigate genome complexity across species and to determine the driving mechanism of their evolution.
Affiliation(s)
| | - Kristan A Schneider
- Department of MNI, University of Applied Sciences Mittweida, Mittweida, Germany.
| | - Matthew K Spencer
- Department of Geology and Physics, Lake Superior State University, Sault Ste. Marie, MI, USA.
| | - David Fisher
- David Eccles School of Business, University of Utah, Salt Lake City, UT, USA.
| | - Sophia Chaudhry
- Department of Biological Sciences, Oakland University, Rochester, MI, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA.
| | - Ananias A Escalante
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA.
| |
|
10
|
Forsdyke DR. Complexity. Evol Bioinform Online 2016. [DOI: 10.1007/978-3-319-28755-3_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022] Open
|
11
|
Forsdyke DR. Doctor-scientist-patients who barketh not: the quantified self-movement and crowd-sourcing research. J Eval Clin Pract 2015. [PMID: 26201555 DOI: 10.1111/jep.12425] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/27/2022]
Affiliation(s)
- Donald R Forsdyke
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Canada
| |
|
12
|
Liu L, Richard J, Kim S, Wojcik EJ. Small molecule screen for candidate antimalarials targeting Plasmodium Kinesin-5. J Biol Chem 2014; 289:16601-14. [PMID: 24737313 DOI: 10.1074/jbc.m114.551408] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022] Open
Abstract
Plasmodium falciparum and P. vivax are responsible for the majority of malaria infections worldwide, resulting in over a million deaths annually. Malaria parasites now show measured resistance to all currently utilized drugs. Novel antimalarial drugs are urgently needed. The Plasmodium Kinesin-5 mechanoenzyme is a suitable "next generation" target. Discovered via small molecule screen experiments, the human Kinesin-5 has multiple allosteric sites that are "druggable." One site in particular, unique in its sequence divergence across all homologs in the superfamily and even within the same family, exhibits exquisite drug specificity. We propose that Plasmodium Kinesin-5 shares this allosteric site and likewise can be targeted to uncover inhibitors with high specificity. To test this idea, we performed a screen for inhibitors selective for Plasmodium Kinesin-5 ATPase activity in parallel with human Kinesin-5. Our screen of nearly 2000 compounds successfully identified compounds that selectively inhibit both P. vivax and P. falciparum Kinesin-5 motor domains but, as anticipated, do not impact human Kinesin-5 activity. Of note is a candidate drug that did not biochemically compete with the ATP substrate for the conserved active site or disrupt the microtubule-binding site. Together, our experiments identified MMV666693 as a selective allosteric inhibitor of Plasmodium Kinesin-5; this is the first identified protein target for the Medicines for Malaria Venture validated collection of parasite proliferation inhibitors. This work demonstrates that chemical screens against human kinesins are adaptable to homologs in disease organisms and, as such, extendable to strategies to combat infectious disease.
Affiliation(s)
- Liqiong Liu
- From the Department of Biochemistry and Molecular Biology, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112
| | - Jessica Richard
- From the Department of Biochemistry and Molecular Biology, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112
| | - Sunyoung Kim
- From the Department of Biochemistry and Molecular Biology, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112
| | - Edward J Wojcik
- From the Department of Biochemistry and Molecular Biology, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112
| |
|
13
|
Forsdyke DR. Implications of HIV RNA structure for recombination, speciation, and the neutralism-selectionism controversy. Microbes Infect 2013; 16:96-103. [PMID: 24211872 DOI: 10.1016/j.micinf.2013.10.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 08/22/2013] [Revised: 10/24/2013] [Accepted: 10/24/2013] [Indexed: 11/29/2022]
Abstract
The conflict between the needs to encode both a protein (impaired by non-synonymous mutation), and nucleic acid structure (impaired by synonymous or non-synonymous mutation), can sometimes be resolved in favour of the nucleic acid because its structure is critical for a selectively advantageous genome-wide activity--recombination. However, above a sequence difference threshold, recombination is impaired. It may then be advantageous for new species to arise. Building on the work of Grantham and others critical of the neutralist viewpoint, heuristic support for this hypothesis emerged from studies of the base composition and structure of retroviral genomes. The extreme enrichment in the purine A of the RNA of human immunodeficiency virus (HIV-1), parallels the mild purine-loading of the RNAs of most organisms, for which there is an adaptive explanation--immune evasion. However, human T cell leukaemia virus (HTLV-1), with the potential to invade the same host cell, shows extreme enrichment in the pyrimidine C. Assuming the low GC% HIV and the high GC% HTLV-1 to share a common ancestor, it was postulated that differences in GC% had arisen to prevent homologous recombination between these emerging lentiviral species. Sympatrically isolated by this intracellular reproductive barrier, prototypic HIV-1 seized the AU-rich (low GC%) high ground (thus committing to purine A rather than purine G). Prototypic HTLV-1 forwent this advantage and evolved an independent evolutionary strategy--similar to that of the GC%-rich Epstein-Barr virus--profound latency maintained by transcription of one purine-rich mRNA. The evidence supporting these interpretations is reviewed.
Affiliation(s)
- Donald R Forsdyke
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON K7L3N6, Canada.
| |
|
14
|
Filisetti D, Théobald-Dietrich A, Mahmoudi N, Rudinger-Thirion J, Candolfi E, Frugier M. Aminoacylation of Plasmodium falciparum tRNA(Asn) and insights in the synthesis of asparagine repeats. J Biol Chem 2013; 288:36361-71. [PMID: 24196969 DOI: 10.1074/jbc.m113.522896] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022] Open
Abstract
Genome sequencing revealed an extreme AT-rich genome and a profusion of asparagine repeats associated with low complexity regions (LCRs) in proteins of the malarial parasite Plasmodium falciparum. Despite their abundance, the function of these LCRs remains unclear. Because they occur in almost all families of plasmodial proteins, the occurrence of LCRs cannot be associated with any specific metabolic pathway; yet their accumulation must have given selective advantages to the parasite. Translation of these asparagine-rich LCRs demands extraordinarily high amounts of asparaginylated tRNA(Asn). However, unlike other organisms, Plasmodium codon bias is not correlated to tRNA gene copy number. Here, we studied tRNA(Asn) accumulation as well as the catalytic capacities of the asparaginyl-tRNA synthetase of the parasite in vitro. We observed that asparaginylation in this parasite can be considered standard, which is expected to limit the availability of asparaginylated tRNA(Asn) in the cell and, in turn, slow down the ribosomal translation rate when decoding asparagine repeats. This observation strengthens our earlier hypothesis considering that asparagine rich sequences act as "tRNA sponges" and help cotranslational folding of parasite proteins. However, it also raises many questions about the mechanistic aspects of the synthesis of asparagine repeats and about their implications in the global control of protein expression throughout Plasmodium life cycle.
Affiliation(s)
- Denis Filisetti
- From the Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, Institut de Biologie Moléculaire et Cellulaire, 15 rue René Descartes, 67084 Strasbourg cedex, France and
| | | | | | | | | | | |
|
15
|
|
16
|
Abstract
Among species within a phylogenetic group, genomic GC% values can cover a wide range that is particularly evident at third codon positions. However, among genes within a genome, genic GC% values can also cover a wide range that is, again, particularly evident at third codon positions. Individual genes and genomes each have a "homostabilizing propensity" to adopt a relatively uniform GC%. Each gene (a "microisochore") occupies a discrete GC% niche of relatively uniform base composition amongst its fellow genes, which can collectively span a wide GC% range. Homostabilization serves to recombinationally isolate both genome sectors (facilitating gene duplication and differentiation) and genomes (facilitating genome duplication and differentiation; e.g., speciation). Although they may sometimes be in conflict, the individualities of genomes, and of genes within those genomes, are separately sustained by a common mechanism, uniformity of GC%. The protection against inadvertent recombination afforded by GC% differentiation is, in the general case, a prerequisite for phenotypic differentiation.
Affiliation(s)
- D. R. FORSDYKE
- Department of Biochemistry, Queen's University, Kingston, Ontario K7L3N6, Canada
| |
|
17
|
Abstract
To detect positive Darwinian selection it is thought essential to compare two sequences. Despite its defects, "the comparative method rules." However, genes evolving rapidly under positive selection conflict more with internal forces (the genome phenotype) than genes evolving slowly under negative selection. In particular, there is conflict with stem-loop potential. The conflict between protein-encoding potential (primary information) and stem-loop potential (secondary information) permits detection of positive selection in a single sequence. The degree to which secondary information is compromised provides a measure of the speed of transmission of primary information. Thus, the sovereignty of the comparative method is challenged not only by its own defects, but also by the availability of a single-sequence method. However, while of limited utility for positive selection, the comparative method casts new light on Darwin's great question — the origin of species. Comparison of rates of synonymous and non-synonymous mutation suggests that branching into new species begins with synonymous mutations.
Affiliation(s)
- DONALD R. FORSDYKE
- Department of Biochemistry, Queen's University, Kingston, Ontario K7L3N6, Canada
| |
|
18
|
Protein-based signatures of functional evolution in Plasmodium falciparum. BMC Evol Biol 2011; 11:257. [PMID: 21917172 PMCID: PMC3197514 DOI: 10.1186/1471-2148-11-257] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/24/2011] [Accepted: 09/14/2011] [Indexed: 02/06/2023] Open
Abstract
Background: It has been known for over a decade that Plasmodium falciparum proteins are enriched in non-globular domains of unknown function. The potential for these regions of protein sequence to undergo high levels of genetic drift provides a fundamental challenge to attempts to identify the molecular basis of adaptive change in malaria parasites. Results: Evolutionary comparisons were undertaken using a set of forty P. falciparum metabolic enzyme genes, both within the hominid malaria clade (P. reichenowi) and across the genus (P. chabaudi). All genes contained coding elements highly conserved across the genus, but there were also a large number of regions of weakly or non-aligning coding sequence. These displayed remarkable levels of non-synonymous fixed differences within the hominid malaria clade indicating near complete release from purifying selection (dN/dS ratio at residues non-aligning across genus: 0.64, dN/dS ratio at residues identical across genus: 0.03). Regions of low conservation also possessed high levels of hydrophilicity, a marker of non-globularity. The propensity for such regions to act as potent sources of non-synonymous genetic drift within extant P. falciparum isolates was confirmed at chromosomal regions containing genes known to mediate drug resistance in field isolates, where 150 of 153 amino acid variants were located in poorly conserved regions. In contrast, all 22 amino acid variants associated with drug resistance were restricted to highly conserved regions. Additional mutations associated with laboratory-selected drug resistance, such as those in PfATPase4 selected by spiroindolone, were similarly restricted while mutations in another calcium ATPase (PfSERCA, a gene proposed to mediate artemisinin resistance) that reach significant frequencies in field isolates were located exclusively in poorly conserved regions consistent with genetic drift.
Conclusion: Coding sequences of malaria parasites contain prospectively definable domains subject to neutral or nearly neutral evolution on a scale that appears unrivalled in biology. This distinct evolutionary landscape has potential to confound analytical methods developed for other genera. Against this tide of genetic drift, polymorphisms mediating functional change stand out to such an extent that evolutionary context provides a useful signal for identifying the molecular basis of drug resistance in malaria parasites, a finding that is of relevance to both genome-wide and candidate gene studies in this genus.
|
19
|
Haerty W, Golding GB. Increased polymorphism near low-complexity sequences across the genomes of Plasmodium falciparum isolates. Genome Biol Evol 2011; 3:539-50. [PMID: 21602572 PMCID: PMC3140889 DOI: 10.1093/gbe/evr045] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Open
Abstract
Low-complexity regions (LCRs) within protein sequences are often considered to evolve neutrally, even though recent studies have reported evidence for selection acting on some of them. Because of their widespread distribution among eukaryotic genomes and the potentially deleterious effect of expansion/contraction of some of them in humans, low-complexity sequences are of major interest, and numerous studies have attempted to describe their dynamics between genomes as well as the factors correlated with their variation and to assess their selective value. However, due to the scarcity of individual genomes within a species, most of the analyses so far have been performed at the species level with the implicit assumption that the variation both in composition and size within species is too small relative to the between-species divergence to affect the conclusions of the analysis. Here we used the available genomes of 14 Plasmodium falciparum isolates to assess the relationship between low-complexity sequence variation and factors such as nucleotide polymorphism across strains, sequence composition, and protein expression. We report that more than half of the 7,711 low-complexity sequences found within aligned coding sequences are variable in size among strains. Across strains, we observed an increasing density of polymorphic sites toward the LCR boundaries. This observation strongly suggests the joint effects of lowered selective constraints on low-complexity sequences and a mutagenic effect of these simple sequences.
Affiliation(s)
- Wilfried Haerty
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
20
Tian X, Strassmann JE, Queller DC. Genome nucleotide composition shapes variation in simple sequence repeats. Mol Biol Evol 2010; 28:899-909. [PMID: 20943830 DOI: 10.1093/molbev/msq266] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 12/30/2022] Open
Abstract
Simple sequence repeats (SSRs) or microsatellites are a common component of genomes but vary greatly across species in their abundance. We tested the hypothesis that this variation is due in part to AT/GC content of genomes, with genomes biased toward either high AT or high GC generating more short random repeats that are long enough to enhance expansion through slippage during replication. To test this hypothesis, we identified repeats with perfect tandem iterations of 1-6 bp from 25 protists with complete or near-complete genome sequences. As expected, the density and the frequency are highly related to genome AT content, with excellent fits to quadratic regressions with minima near a 50% AT content and rising toward both extremes. Within species, the same trends hold, except the limited variation in AT content within each species places each mainly on the descending (GC rich), middle, or ascending (AT rich) part of the curve. The base usages of repeat motifs are also significantly correlated with genome nucleotide compositions: Percentages of AT-rich motifs rise with the increase of genome AT content but vice versa for GC-rich subgroups. Amino acid homopolymer repeats also show the expected quadratic relationship, with higher abundance in species with AT content biased in either direction. Our results show that genome nucleotide composition explains up to half of the variance in the abundance and motif constitution of SSRs.
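The repeat-detection step mentioned in this abstract (perfect tandem iterations of 1-6 bp motifs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `min_repeats` threshold of 3 and the toy sequence are all assumptions made for illustration, since the abstract does not state the minimum repeat count used.

```python
import re

def find_perfect_ssrs(seq, max_motif=6, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    min_repeats is a hypothetical threshold, not taken from the abstract.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-bp motif followed by at least (min_repeats - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA" when k = 2).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

def at_content(seq):
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

toy = "GGCATATATATCCAAAAAGG"  # invented example sequence
ssrs = find_perfect_ssrs(toy)
```

On the toy sequence this reports an (AT)4 repeat and an (A)5 homopolymer run. Relating repeat density to `at_content` across genomes, as the abstract describes, would be a separate regression step that is not shown here.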
Affiliation(s)
- Xiangjun Tian
- Department of Ecology and Evolutionary Biology, Rice University, USA
21
Zilversmit MM, Volkman SK, DePristo MA, Wirth DF, Awadalla P, Hartl DL. Low-complexity regions in Plasmodium falciparum: missing links in the evolution of an extreme genome. Mol Biol Evol 2010; 27:2198-209. [PMID: 20427419 PMCID: PMC2922621 DOI: 10.1093/molbev/msq108] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 02/06/2023] Open
Abstract
Over the past decade, attempts to explain the unusual size and prevalence of low-complexity regions (LCRs) in the proteins of the human malaria parasite Plasmodium falciparum have used both neutral and adaptive models. This past research has offered conflicting explanations for LCR characteristics and their role in, and influence on, the evolution of genome structure. Here we show that P. falciparum LCRs (PfLCRs) are not a single phenomenon, but rather consist of at least three distinct types of sequence, and this heterogeneity is the source of the conflict in the literature. Using molecular and population genetics, we show that these families of PfLCRs are evolving by different mechanisms. One of these families, named here the HighGC family, is of particular interest because these LCRs act as recombination hotspots, both in genes under positive selection for high levels of diversity which can be created by recombination (antigens) and those likely to be evolving neutrally or under negative selection (metabolic enzymes). We discuss how the discovery of these distinct species of PfLCRs helps to resolve previous contradictory studies on LCRs in malaria and contributes to our understanding of the evolution of the parasite's unusual genome.
Affiliation(s)
- Martine M Zilversmit
- Department of Organismic and Evolutionary Biology, Harvard University, Boston, MA, USA.
22
Frugier M, Bour T, Ayach M, Santos MAS, Rudinger-Thirion J, Théobald-Dietrich A, Pizzi E. Low Complexity Regions behave as tRNA sponges to help co-translational folding of plasmodial proteins. FEBS Lett 2009; 584:448-54. [PMID: 19900443 DOI: 10.1016/j.febslet.2009.11.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 09/30/2009] [Revised: 11/02/2009] [Accepted: 11/03/2009] [Indexed: 10/20/2022]
Abstract
In most organisms, the information necessary to specify the native 3D-structures of proteins is encoded in the corresponding mRNA sequences. Translational accuracy and efficiency are coupled and sequences that are slowly translated play an essential role in the concomitant folding of protein domains. Here, we suggest that the well-known mechanisms for the regulation of translational efficiency, which involve mRNA structure and/or asymmetric tRNA abundance, do not apply to all organisms. We propose that Plasmodium, the parasite responsible for malaria, uses an alternative strategy to slow down ribosomal speed and avoid multidomain protein misfolding during translation. In our model, the abundant Low Complexity Regions present in Plasmodium proteins replace the codon preferences, which influence the assembly of protein secondary structures.
Affiliation(s)
- Magali Frugier
- Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, IBMC, 15 rue René Descartes, 67084 Strasbourg Cedex, France.
23
Scherrer and Jost’s symposium: the gene concept in 2008. Theory Biosci 2009; 128:157-61. [DOI: 10.1007/s12064-009-0071-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/16/2008] [Accepted: 02/03/2009] [Indexed: 10/20/2022]
24
Wells GA, Müller IB, Wrenger C, Louw AI. The activity of Plasmodium falciparum arginase is mediated by a novel inter-monomer salt-bridge between Glu295-Arg404. FEBS J 2009; 276:3517-30. [PMID: 19456858 DOI: 10.1111/j.1742-4658.2009.07073.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
A recent study implicated a role for Plasmodium falciparum arginase in the systemic depletion of arginine levels, which in turn has been associated with human cerebral malaria pathogenesis. Arginase (EC 3.5.3.1) is a multimeric metallo-protein that catalyses the hydrolysis of arginine to ornithine and urea by means of a binuclear spin-coupled Mn(2+) cluster in the active site. A previous report indicated that P. falciparum arginase has a strong dependency between trimer formation, enzyme activity and metal co-ordination. Mutations that abolished Mn(2+) binding also caused dissociation of the trimer; conversely, mutations that abolished trimer formation resulted in inactive monomers. By contrast, the monomers of mammalian (and therefore host) arginase are also active. P. falciparum arginase thus appears to be an obligate trimer and interfering with trimer formation may therefore serve as an alternative route to enzyme inhibition. In the present study, the mechanism of the metal dependency was explored by means of homology modelling and molecular dynamics. When the active site metals are removed, loss of structural integrity is observed. This is reflected by a larger equilibration rmsd for the protein when the active site metal is removed and some loss of secondary structure. Furthermore, modelling revealed the existence of a novel inter-monomer salt-bridge between Glu295 and Arg404, which was shown to be associated with the metal dependency. Mutational studies not only confirmed the importance of this salt-bridge in trimer formation, but also provided evidence for the independence of P. falciparum arginase activity on trimer formation.
Affiliation(s)
- Gordon A Wells
- Department of Biochemistry, University of Pretoria, South Africa
25
Microsatellites that violate Chargaff's second parity rule have base order-dependent asymmetries in the folding energies of complementary DNA strands and may not drive speciation. J Theor Biol 2008; 254:168-77. [DOI: 10.1016/j.jtbi.2008.05.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 04/07/2008] [Revised: 05/16/2008] [Accepted: 05/16/2008] [Indexed: 11/21/2022]
26
Brick K, Pizzi E. A novel series of compositionally biased substitution matrices for comparing Plasmodium proteins. BMC Bioinformatics 2008; 9:236. [PMID: 18485187 PMCID: PMC2408606 DOI: 10.1186/1471-2105-9-236] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 02/07/2008] [Accepted: 05/16/2008] [Indexed: 11/15/2022] Open
Abstract
Background The most common substitution matrices currently used (BLOSUM and PAM) are based on protein sequences with average amino acid distributions; thus they do not represent a fully accurate substitution model for proteins characterized by a biased amino acid composition. This problem has been addressed recently by adjusting existing matrices; however, to date, no empirical approach has been taken to build matrices which offer a substitution model for comparing proteins sharing an amino acid compositional bias. Here, we present a novel procedure to construct series of symmetrical substitution matrices to align proteins from similarly biased Plasmodium proteomes. Results We generated substitution matrices by selecting from the BLOCKS database those multiple alignments with a compositional bias similar to that of P. falciparum and P. yoelii proteins. A novel 'fuzzy' clustering method was adopted to group sequences within these alignments, showing that this method retains more complete information on the amino acid substitutions when compared to hierarchical clustering. We assessed the performance against the BLOSUM62 series and showed that the usage of our matrices results in an improvement in the performance of BLAST database searches, greatly reducing the number of false positive hits. We then demonstrated applications of the use of novel matrices to improve the annotation of homologs between the two Plasmodium species and to classify members of the P. falciparum RIFIN/STEVOR family. Conclusion We confirmed that in the case of compositionally biased proteins, standard BLOSUM matrices are not suited for optimal alignments, and specific substitution matrices are required. In addition, we showed that the usage of these matrices leads to a reduction of false positive hits, facilitating the automatic annotation process.
Affiliation(s)
- Kevin Brick
- Dipartimento di Malattie Infettive, Parassitarie ed Immunomediate - Istituto Superiore di Sanità, Viale Regina Elena, 299 00161 Roma, Italy.
27
Aly ASI, Mikolajczak SA, Rivera HS, Camargo N, Jacobs-Lorena V, Labaied M, Coppens I, Kappe SHI. Targeted deletion of SAP1 abolishes the expression of infectivity factors necessary for successful malaria parasite liver infection. Mol Microbiol 2008; 69:152-63. [PMID: 18466298 PMCID: PMC2615191 DOI: 10.1111/j.1365-2958.2008.06271.x] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.0] [Indexed: 11/26/2022]
Abstract
Malaria parasite sporozoites prepare for transmission to a mammalian host by upregulation of UIS (Upregulated in Infectious Sporozoites) genes. A number of UIS gene products are essential for the establishment of the intrahepatocytic niche. However, the factors that regulate the expression of genes involved in gain of infectivity for the liver are unknown. Herein, we show that a conserved Plasmodium sporozoite low-complexity asparagine-rich protein, SAP1 (Sporozoite Asparagine-rich Protein 1), has an essential role in malaria parasite liver infection. Targeted deletion of SAP1 in the rodent malaria parasite Plasmodium yoelii generated mutant parasites that traverse and invade hepatocytes normally but cannot initiate liver-stage development in vitro and in vivo. Moreover, immunizations with Pysap1(−) sporozoites confer long-lasting sterile protection against wild-type sporozoite infection. Strikingly, lack of SAP1 abolished expression of essential UIS genes including UIS3, UIS4 and P52 but not the constitutively expressed genes encoding, among others, sporozoite proteins CSP and TRAP. SAP1 localization to the cell interior but not the nucleus of sporozoites suggests its involvement in a post-transcriptional mechanism of gene expression control. These findings demonstrate that SAP1 is essential for liver infection possibly by functioning as a selective regulator controlling the expression of infectivity-associated parasite effector genes.
Affiliation(s)
- Ahmed S I Aly
- Seattle Biomedical Research Institute, Seattle, WA 98109, USA
28
Dávalos LM, Perkins SL. Saturation and base composition bias explain phylogenomic conflict in Plasmodium. Genomics 2008; 91:433-42. [DOI: 10.1016/j.ygeno.2008.01.006] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Received: 08/06/2007] [Revised: 01/09/2008] [Accepted: 01/12/2008] [Indexed: 10/22/2022]
29
Zhang CY, Wei JF, Wu JS, Xu WR, Sun X, He SH. Evaluation of FORS-D analysis: a comparison with the statistically significant stem-loop potential. Biochem Genet 2007; 46:29-40. [PMID: 17955360 DOI: 10.1007/s10528-007-9126-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/11/2007] [Accepted: 05/26/2007] [Indexed: 11/28/2022]
Abstract
The stem-loop potential of a nucleic acid segment (expressed as a FONS value) decomposes into base composition-dependent and base order-dependent components. The latter, expressed as a FORS-D value, is derived by subtracting the value of the base composition-dependent component (FORS-M) from the FONS value. FORS-D analysis is the use of FORS-D values to estimate the potential of local base order to contribute to a stem-loop structure, and it has been used to investigate the relationship between stem-loop structure and other selective pressures on genomes. In the present study, we evaluated the reliability of FORS-D analysis by comparing it with statistically significant stem-loop potential, another robust method developed by Le and Maizel for examining stem-loop structure. We found that FORS-M values calculated using 10 randomized sequences are as reliable as those calculated using 100 randomized sequences. The resulting FORS-D values have a trend and distribution similar to those of statistically significant stem-loop potential, implying that FORS-D analysis is as reliable as the latter in measuring the distribution of base order-dependent stem-loop potential. Since the calculation of the FORS-M values is time consuming, the integrated program Bodslp developed by us will become a convenient tool for large-scale FORS-D analysis. The results also suggest that for some purposes the online program SigStb developed by Le and Maizel may be used as an alternative tool for FORS-D analysis.
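The decomposition described in this abstract can be written compactly. The symbols below are introduced here for illustration and do not appear in the source: $E(\cdot)$ is the computed folding energy of a sequence window, $s$ is the natural sequence, and $s^{(1)}, \dots, s^{(n)}$ are $n$ base-shuffled versions of $s$ (the abstract reports that $n = 10$ is as reliable as $n = 100$). FORS-M is taken to be the mean energy over the shuffled versions, which is an assumption consistent with the abstract rather than a statement from it.

```latex
\[
\text{FONS} = E(s), \qquad
\text{FORS-M} = \frac{1}{n}\sum_{i=1}^{n} E\bigl(s^{(i)}\bigr), \qquad
\text{FORS-D} = \text{FONS} - \text{FORS-M}.
\]
```

Under this reading, a FORS-D value well below zero would indicate that base order contributes to stem-loop potential beyond what base composition alone provides.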
Affiliation(s)
- Chi-Yu Zhang
- Department of Biochemistry and Molecular Biology, Jiangsu University School of Medical Technology, Zhenjiang, Jiangsu, P.R. China.
30
Forsdyke DR. Calculation of folding energies of single-stranded nucleic acid sequences: conceptual issues. J Theor Biol 2007; 248:745-53. [PMID: 17698086 DOI: 10.1016/j.jtbi.2007.07.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 05/30/2007] [Revised: 07/05/2007] [Accepted: 07/09/2007] [Indexed: 12/16/2022]
Abstract
The stability of a folded single-stranded nucleic acid depends on the composition and order of its constituent bases and may be assessed by taking into account the pairing energies of its constituent dinucleotides. To assess the possible biological significance of a computed structure, Maizel and coworkers in the 1980s compared the energy of folding of a natural single-stranded RNA sequence with the energies of several versions of the same sequence produced by shuffling base order. However, in the 2000s many took as self-evident the view that shuffling at the mononucleotide level (single bases) was conceptually wrong and should be replaced by shuffling at the level of dinucleotides (retaining pairs of adjacent bases). Folding energies then became indistinguishable from those of corresponding shuffled sequences and doubt was cast on the importance of secondary structures. Nevertheless, some continued productively to employ the single base shuffling approach, the justification for which is the topic of this paper. Because dinucleotide pairing energies are needed to calculate structure, it does not follow that shuffling should not disrupt dinucleotides. Base shuffling allows determination of the relative contributions of base composition and base order to total folding energy. The potential for secondary structure arises from pressures acting at both DNA and RNA levels, and is abundant throughout genomes, with a probable primary role in recombination. Within a gene the potential can often be accommodated, and base order and composition work together (values have the same negative sign) in contributing to total folding energy. But sometimes protein-coding pressure on base order conflicts with the pressure for secondary structure and the values have opposite signs. Total folding energy can be deemed of potential biological significance when the average of several readings is significantly less than zero.
Affiliation(s)
- Donald R Forsdyke
- Department of Biochemistry, Queen's University, Kingston, Ontario, Canada K7L3N6.
31
Chen X, Chong CR, Shi L, Yoshimoto T, Sullivan DJ, Liu JO. Inhibitors of Plasmodium falciparum methionine aminopeptidase 1b possess antimalarial activity. Proc Natl Acad Sci U S A 2006; 103:14548-53. [PMID: 16983082 PMCID: PMC1599997 DOI: 10.1073/pnas.0604101103] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022] Open
Abstract
With >1 million deaths annually, mostly among children in sub-Saharan Africa, malaria poses one of the most critical challenges in medicine today. Although introduction of the artemisinin class of antimalarial drugs has offered a temporary solution to the problem of drug resistance, new antimalarial drugs are needed to ensure effective control of the disease in the future. Herein, we have investigated members of the methionine aminopeptidase family as potential antimalarial targets. The Plasmodium falciparum methionine aminopeptidase 1b (PfMetAP1b), one of four MetAP proteins encoded in the P. falciparum genome, was cloned, overexpressed, purified, and used to screen a 175,000-compound library for inhibitors. A family of structurally related inhibitors containing a 2-(2-pyridinyl)-pyrimidine core was identified. Structure/activity studies led to the identification of a potent PfMetAP1b inhibitor, XC11, with an IC(50) of 112 nM. XC11 was highly selective for PfMetAP1b and did not exhibit significant cytotoxicity against primary human fibroblasts. Most importantly, XC11 inhibited the proliferation of P. falciparum strains 3D7 [chloroquine (CQ)-sensitive] and Dd2 (multidrug-resistant) in vitro and is active in mouse malaria models for both CQ-sensitive and CQ-resistant strains. These results suggest that PfMetAP1b is a promising target and XC11 is an important lead compound for the development of novel antimalarial drugs.
Affiliation(s)
- Xiaochun Chen
- Departments of *Pharmacology and Molecular Sciences and
- Lirong Shi
- The Malaria Research Institute, W. Harry Feinstone Department of Molecular Microbiology and Immunology, The Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205; and
- Tadashi Yoshimoto
- School of Pharmaceutical Sciences, Nagasaki University, Nagasaki 852-8521, Japan
- David J. Sullivan
- The Malaria Research Institute, W. Harry Feinstone Department of Molecular Microbiology and Immunology, The Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205; and
- Jun O. Liu
- Departments of *Pharmacology and Molecular Sciences and
- Oncology and
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205
32
DePristo MA, Zilversmit MM, Hartl DL. On the abundance, amino acid composition, and evolutionary dynamics of low-complexity regions in proteins. Gene 2006; 378:19-30. [PMID: 16806741 DOI: 10.1016/j.gene.2006.03.023] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.8] [Received: 01/24/2006] [Revised: 03/24/2006] [Accepted: 03/27/2006] [Indexed: 11/21/2022]
Abstract
Protein sequences frequently contain regions composed of a reduced number of amino acids. Despite their presence in about half of all proteins and their unusual prevalence in the malaria parasite Plasmodium falciparum, the function and evolution of such low-complexity regions (LCRs) remain unclear. Here we show that LCR abundance and amino acid composition depend largely, but not exclusively, on genomic A+T content and obey power-law growth dynamics. Further, our results indicate that LCRs are analogous to microsatellites in that DNA replication slippage and unequal crossover recombination are important molecular mechanisms for LCR expansion. We support this hypothesis by demonstrating that the size of LCR insertions/deletions among orthologous genes depends upon length. Moreover, we show that LCRs enable intra-exonic recombination in a key family of cell-surface antigens in P. falciparum and thus likely facilitate the generation of antigenic diversity. We conclude with a mechanistic model for LCR evolution that links the pattern of LCRs within P. falciparum to its high genomic A+T content and recombination rate.
Affiliation(s)
- Mark A DePristo
- Department of Organismic and Evolutionary Biology, Hartl Lab, 16 Divinity Street, Harvard University, Cambridge, MA 02138, USA.
33
Complexity. Evol Bioinform Online 2006. [DOI: 10.1007/978-0-387-33419-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
34
Lee SJ, Mortimer JR, Forsdyke DR. Genomic conflict settled in favour of the species rather than the gene at extreme GC percentage values. Appl Bioinformatics 2005; 3:219-28. [PMID: 15702952 DOI: 10.2165/00822942-200403040-00003] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/02/2022]
Abstract
Wada and colleagues have shown that, whether prokaryotic or eukaryotic, each gene has a "homostabilising propensity" to adopt a relatively uniform GC percentage (GC%). Accordingly, each gene can be viewed as a "microisochore" occupying a discrete GC% niche of relatively uniform base composition amongst its fellow genes. Although first, second and third codon positions usually differ in GC%, each position tends to maintain a uniform, gene-specific GC% value. Thus, within a genome, genic GC% values can cover a wide range. This is most evident at third codon positions, which are least constrained by amino acid encoding needs. In 1991, Wada and colleagues further noted that, within a phylogenetic group, genomic GC% values can also cover a wide range. This is again most evident at third codon positions. Thus, the dispersion of GC% values among genes within a genome matches the dispersion of GC% values among genomes within a phylogenetic group. Wada described the context-independence of plots of different codon position GC% values against total GC% as a "universal" characteristic. Several studies relate this to recombination. We have confirmed that third codon positions usually relate more to the genes that contain them than to the species. However, in genomes with extreme GC% values (low or high), third codon positions tend to maintain a constant GC%, thus relating more to the species than to the genes that contain them. Genes in an extreme-GC% genome collectively span a smaller GC% range, and mainly rely on first and second codon positions for differentiation as "microisochores". Our results are consistent with the view that differences in GC% serve to recombinationally isolate both genome sectors (facilitating gene duplication) and genomes (facilitating genome duplication, e.g. speciation). In intermediate-GC% genomes, conflict between the needs of the species and the needs of individual genes within that species is minimal. 
However, in extreme-GC% genomes there is a conflict, which is settled in favour of the species (i.e. group selection) rather than in favour of the gene (genic selection).
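The quantity at the centre of this abstract, GC% computed separately at first, second and third codon positions, can be sketched in a few lines of Python. The function name and the toy coding sequence are hypothetical, and the sketch assumes an in-frame coding sequence with no ambiguity codes; it is an illustration, not the authors' software.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) percentages for an in-frame coding sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    result = []
    for pos in range(3):
        # Every third base, starting at codon position pos (0-based).
        bases = cds[pos:usable:3]
        gc = sum(base in "GC" for base in bases)
        result.append(100.0 * gc / len(bases) if bases else 0.0)
    return tuple(result)

# Toy example: codons ATG GCC AAA TTT (invented sequence).
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAATTT")
```

Comparing the spread of third-position values across the genes of one genome with the spread across genomes would be the next step in reproducing the kind of comparison the abstract describes; that step is not shown.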
Affiliation(s)
- Shang-Jung Lee
- Genetics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
35
Rayment JH, Forsdyke DR. Amino acids as placeholders: base-composition pressures on protein length in malaria parasites and prokaryotes. Appl Bioinformatics 2005; 4:117-30. [PMID: 16128613 DOI: 10.2165/00822942-200504020-00005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/02/2022]
Abstract
BACKGROUND The composition and sequence of amino acids in a protein may serve the underlying needs of the nucleic acids that encode the protein (the genome phenotype). In extreme form, amino acids become mere placeholders inserted between functional segments or domains, and--apart from increasing protein length--playing no role in the specific function or structure of a protein (the conventional phenotype). METHODS We studied the genomes of two malarial parasites and 521 prokaryotes (144 complete) that differ widely in GC% and optimum growth temperature, comparing the base compositions of the protein coding regions and corresponding lengths (kilobases). RESULTS Malarial parasites show distinctive responses to base-compositional pressures that increase as protein lengths increase. A low-GC% species (Plasmodium falciparum) is likely to have more placeholder amino acids than an intermediate-GC% species (P. vivax), so that homologous proteins are longer. In prokaryotes, GC% is generally greater and AG% is generally less in open reading frames (ORFs) encoding long proteins. The increased GC% in long ORFs increases as species' GC% increases, and decreases as species' AG% increases. In low- and intermediate-GC% prokaryotic species, increases in ORF GC% as encoded proteins increase in length are largely accounted for by the base compositions of first and second (amino acid-determining) codon positions. In high-GC% prokaryotic species, first and third (non-amino acid-determining) codon positions play this role. CONCLUSION In low- and intermediate-GC% prokaryotes, placeholder amino acids are likely to be well defined, corresponding to codons enriched in G and/or C at first and second positions. In high-GC% prokaryotes, placeholder amino acids are likely to be less well defined. 
Increases in ORF GC% as encoded proteins increase in length are greater in mesophiles than in thermophiles, which are constrained from increasing protein lengths in response to base-composition pressures.
Affiliation(s)
- Jonathan H Rayment
- Department of Biochemistry, Queen's University, Kingston, Ontario, Canada
36
Jean L, Withers-Martinez C, Hackett F, Blackman MJ. Unique insertions within Plasmodium falciparum subtilisin-like protease-1 are crucial for enzyme maturation and activity. Mol Biochem Parasitol 2005; 144:187-97. [PMID: 16183148 DOI: 10.1016/j.molbiopara.2005.07.008] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 05/26/2005] [Revised: 07/14/2005] [Accepted: 07/27/2005] [Indexed: 11/22/2022]
Abstract
Parasite serine proteases play essential roles in the asexual erythrocytic life cycle of the malaria parasite. The timing and location of expression of Plasmodium falciparum subtilisin-like protease-1 (PfSUB-1) are consistent with a role in erythrocyte invasion. Maturation of PfSUB-1 involves two autocatalytic processing events in which an 82 kDa precursor is converted to a 54 kDa form, followed by further cleavage to produce a 47 kDa form. Here we have compared PfSUB-1 with a number of Plasmodium orthologues and the most closely related bacterial subtilase sequences and find that, like many malarial proteins, PfSUB-1 possesses both low and high complexity insertions. The latter take the form of six surface-associated strands or loops which are conserved in all SUB-1 orthologues but not present in any other subtilase. Several mutants of PfSUB-1 with deletions of all, or part, of each of the six loop insertions were produced in an insect cell expression system. Aside from loop III, which was dispensable, individual deletion of the loop insertions revealed a role in protein maturation and/or stability. Specific substitutions within loop II inhibited maturation and enzyme activity. Mutations in loops V and VI specifically inhibited the second step of autocatalytic maturation providing evidence that the two processing steps have distinct structural requirements and that conversion to p47 is not a prerequisite for proteolytic activity in trans.
Affiliation(s)
- Létitia Jean
- Division of Parasitology, National Institute for Medical Research, Mill Hill, London NW7 1AA, UK

37
Chanda I, Pan A, Dutta C. Proteome composition in Plasmodium falciparum: higher usage of GC-rich nonsynonymous codons in highly expressed genes. J Mol Evol 2005; 61:513-23. [PMID: 16044241 DOI: 10.1007/s00239-005-0023-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 01/27/2005] [Accepted: 04/19/2005] [Indexed: 10/25/2022]
Abstract
The parasite Plasmodium falciparum, responsible for the most deadly form of human malaria, has one of the most AT-rich genomes sequenced so far and is known to possess many atypical characteristics. Using multivariate statistical approaches, the present study analyzes the amino acid usage pattern in 5038 annotated protein-coding sequences in P. falciparum clone 3D7. The amino acid composition of individual proteins, though dominated by the directional mutational pressure, exhibits wide variation across the proteome. The Asn content, expression level, mean molecular weight, hydropathy, and aromaticity are found to be the major sources of variation in amino acid usage. At all stages of development, frequencies of residues encoded by GC-rich codons such as Gly, Ala, Arg, and Pro increase significantly in the products of the highly expressed genes. Investigation of nucleotide substitution patterns in P. falciparum and other Plasmodium species reveals that the nonsynonymous sites of highly expressed genes are more conserved than those of the lowly expressed ones, though for synonymous sites, the reverse is true. The highly expressed genes are, therefore, expected to be closer to their putative ancestral state in amino acid composition, and a plausible reason for their sequences being GC-rich at nonsynonymous codon positions could be that their ancestral state was less AT-biased. Negative correlation of the expression level of proteins with respective molecular weights supports the notion that P. falciparum, in spite of its intracellular parasitic lifestyle, follows the principle of cost minimization.
Affiliation(s)
- Ipsita Chanda
- Human Genetics & Genomics Group, Indian Institute of Chemical Biology, Kolkata 700032, India

38
Singh GP, Chandra BR, Bhattacharya A, Akhouri RR, Singh SK, Sharma A. Hyper-expansion of asparagines correlates with an abundance of proteins with prion-like domains in Plasmodium falciparum. Mol Biochem Parasitol 2005; 137:307-19. [PMID: 15383301 DOI: 10.1016/j.molbiopara.2004.05.016] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.6] [Received: 01/28/2004] [Revised: 05/04/2004] [Accepted: 05/14/2004] [Indexed: 11/20/2022]
Abstract
Plasmodium falciparum encodes approximately 5300 proteins of which approximately 35% have repeats of amino acids, significantly higher than in other fully sequenced eukaryotes. The proportion of proteins with amino acid homorepeats varies from 4 to 54% amongst different functional classes of proteins. These homorepeats are dominated by asparagines, which are selected over lysines despite equivalent AT codon content. Surprisingly, asparagine repeats are absent from the variant surface antigen protein families of PfEMP1s, Stevors and Rifins. The PfEMP1 protein family is instead rich in recurrences of glutamates, similar to human cell surface proteins. Structural mapping of homorepeats suggests that these segments are likely to form surface exposed structures that protrude from the main protein cores. We also found an abundance of asparagine-rich prion-like domains in P. falciparum, significantly larger than in any other eukaryote. Domains rich in glutamines and asparagines have an innate predisposition to form self-propagating amyloid fibers, which are involved both in prion-based inheritance and in human neurodegenerative disorders. Nearly 24% (1302 polypeptides) of P. falciparum proteins contain prion-forming or prion-inducing domains, in comparison to Drosophila (approximately 3.4%) which to date showed the highest number of prion-like proteins. The unexpected properties of P. falciparum revealed here open new avenues for investigating parasite biology.
Affiliation(s)
- Gajinder Pal Singh
- Malaria Research Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110 067, India

39
Paz A, Mester D, Baca I, Nevo E, Korol A. Adaptive role of increased frequency of polypurine tracts in mRNA sequences of thermophilic prokaryotes. Proc Natl Acad Sci U S A 2004; 101:2951-6. [PMID: 14973185 PMCID: PMC365726 DOI: 10.1073/pnas.0308594100] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022] Open
Abstract
The mechanism of an organism's adaptation to high temperatures has been investigated intensively in recent years. It was suggested that the macromolecules of thermophilic microorganisms (especially proteins) have structural features that enhance their thermostability. We compared mRNA sequences of 72 fully sequenced prokaryotic proteomes (14 thermophilic and 58 mesophilic species). Although the differences between the percentage of adenine plus guanine content of whole mRNAs of different prokaryotic species are much lower than those of guanine plus cytosine content, the purine-pyrimidine (R/Y) ratio within thermophile mRNAs is significantly higher than that of the mesophiles. The first and third codon positions of both thermophiles and mesophiles are purine-biased, with the bias more pronounced in the thermophiles. Thermophile mRNAs that display the highest R/Y ratio (1.43-1.69) are those of the ribosomal proteins, histone-like proteins, DNA-dependent RNA polymerase subunits, and heat-shock proteins. Within mesophilic prokaryotes and five eukaryotic species, the R/Y ratio of the mRNAs of heat-shock proteins is higher than their average over the coding part of the genome. Polypurine tracts (R)n (with n ≥ 5) are much more abundant within the thermophile mRNAs compared with mesophiles. Between two sequential pure-purinic codons of thermophile mRNAs, there is a rather strong tendency for the occurrence of adenine but not guanine tracts. The data suggest that mixed adenine·guanine and polyadenine tracts in mRNAs increase the thermostability beyond the contribution of amino acids encoded by purine tracts, which highlights the importance of ecological stress in the evolution of genome architecture.
Affiliation(s)
- Arnon Paz
- Institute of Evolution, Haifa University, Mount Carmel, Haifa 31905, Israel
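
The two sequence statistics named in the Paz et al. abstract, the purine-pyrimidine (R/Y) ratio and polypurine tracts of at least five consecutive purines, can be illustrated with a minimal Python sketch. The function names `ry_ratio` and `polypurine_tracts` and the example sequence are hypothetical, chosen for illustration only; this is not the authors' pipeline, and it assumes an unambiguous A/C/G/U(T) alphabet.

```python
import re


def ry_ratio(seq: str) -> float:
    """Ratio of purines (A+G) to pyrimidines (C+U/T) in a sequence."""
    s = seq.upper()
    purines = s.count("A") + s.count("G")
    pyrimidines = s.count("C") + s.count("U") + s.count("T")
    if pyrimidines == 0:
        raise ValueError("sequence contains no pyrimidines")
    return purines / pyrimidines


def polypurine_tracts(seq: str, min_len: int = 5) -> list[str]:
    """Maximal runs of consecutive purines at least min_len long."""
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [m.group(0) for m in pattern.finditer(seq.upper())]


# Made-up example sequence, not taken from any of the cited genomes.
example = "AAGGACUUGAGAGAACAG"
print(ry_ratio(example))
print(polypurine_tracts(example))
```

The default `min_len=5` mirrors the tract-length threshold stated in the abstract; how the authors handled ambiguous bases or tract boundaries is not given there, so the regular expression is only one plausible reading.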

40
Abstract
It has recently been shown that many proteins are unfolded in their functional state. In addition, a large number of stretches of protein sequences are predicted to be unfolded. It has been argued that the high frequency of occurrence of these predicted unfolded sequences indicates that the majority of these sequences must also be functional. These sequences tend to be of low complexity. It is well established that certain types of low-complexity sequences are genetically unstable, and are prone to expand in the genome. It is possible, therefore, that in addition to these well-characterised functional unfolded proteins, there are a large number of unfolded proteins that are non-functional. Analogous to 'junk DNA' these protein sequences may arise due to physical characteristics of DNA. Their high frequency may reflect, therefore, the high probability of expansion in the genome. Such 'junk proteins' would not be advantageous, and may be mildly deleterious to the cell.
Affiliation(s)
- Simon C Lovell
- School of Biological Science, University of Manchester, 2.205 Stopford Building, Oxford Rd, Manchester M13 9PT, UK.

41
Lambros RJ, Mortimer JR, Forsdyke DR. Optimum growth temperature and the base composition of open reading frames in prokaryotes. Extremophiles 2003; 7:443-50. [PMID: 14666404 DOI: 10.1007/s00792-003-0353-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.6] [Received: 02/12/2003] [Accepted: 06/20/2003] [Indexed: 11/27/2022]
Abstract
The purine-loading index (PLI) is the difference between the numbers of purines (A+G) and pyrimidines (T+C) per kilobase of single-stranded nucleic acid. By purine-loading their mRNAs organisms may minimize unnecessary RNA-RNA interactions and prevent inadvertent formation of "self" double-stranded RNA. Since RNA-RNA interactions have a strong entropy-driven component, this need to minimize should increase as temperature increases. Consistent with this, we report for 550 prokaryotic species that optimum growth temperature is related to the average PLI of open reading frames. With increasing temperature prokaryotes tend to acquire base A and lose base C, while keeping bases T and G relatively constant. Accordingly, while the PLI increases, the (G+C)% decreases. The previously observed positive correlation between (G+C)% and optimum growth temperature, which applies to RNA species whose structure is of major importance for their function (ribosomal and transfer RNAs) does not apply to mRNAs, and hence is unlikely to apply generally to genomic DNA.
Affiliation(s)
- R J Lambros
- Department of Biochemistry, Queen's University, Kingston, Ontario K7L3N6, Canada
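
The purine-loading index defined in the Lambros et al. abstract (purines minus pyrimidines, per kilobase) can be written out as a short Python sketch. The function name `purine_loading_index` is hypothetical, and normalising by the number of unambiguous bases counted is an assumption made here; the abstract does not say how ambiguous symbols were treated.

```python
def purine_loading_index(seq: str) -> float:
    """(A+G) minus (T+C), scaled to a per-kilobase value.

    Assumption: U is treated as T, and any other symbol (e.g. N)
    is ignored when computing the sequence length.
    """
    s = seq.upper().replace("U", "T")
    purines = s.count("A") + s.count("G")
    pyrimidines = s.count("T") + s.count("C")
    total = purines + pyrimidines
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T/U bases")
    return 1000.0 * (purines - pyrimidines) / total


# Made-up 10-base example: 8 purines, 2 pyrimidines.
print(purine_loading_index("AAGGAAGGCT"))  # 600.0
```

A positive value indicates a purine-loaded strand, a value of zero indicates equal numbers of purines and pyrimidines, and a negative value indicates pyrimidine loading.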