1
|
Multiple toxins and a protease contribute to the aphid-killing ability of Pseudomonas fluorescens PpR24. Environ Microbiol 2024; 26:e16604. [PMID: 38561900 DOI: 10.1111/1462-2920.16604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 02/23/2024] [Indexed: 04/04/2024]
Abstract
Aphids are globally important pests causing damage to a broad range of crops. Due to insecticide resistance, there is an urgent need to develop alternative control strategies. In our previous work, we found that Pseudomonas fluorescens PpR24 can orally infect and kill the insecticide-resistant green-peach aphid (Myzus persicae). However, the genetic basis of the insecticidal capability of PpR24 remains unclear. Genome sequencing of PpR24 confirmed the presence of various insecticidal toxins such as Tc (toxin complexes), Rhs (rearrangement hotspot) elements, and other insect-killing proteases. Upon infection of aphids with PpR24, RNA-Seq analysis revealed that 193 aphid genes were differentially expressed, with down-regulation of 16 detoxification genes. In addition, 1325 PpR24 genes (542 upregulated and 783 downregulated) were subject to differential expression, including genes responsible for secondary metabolite biosynthesis, the iron-restriction response, oxidative stress resistance, and virulence factors. Single and double deletion of candidate virulence genes encoding a secreted protease (AprX) and four toxin components (two TcA-like; one TcB-like; one TcC-like insecticidal toxins) showed that all five genes contribute significantly to aphid killing, particularly AprX. This comprehensive host-pathogen transcriptomic analysis provides novel insight into the molecular basis of bacteria-mediated aphid mortality and the potential of PpR24 as an effective biocontrol agent.
|
2
|
The complete genome assemblies of 19 insect pests of worldwide importance to agriculture. PESTICIDE BIOCHEMISTRY AND PHYSIOLOGY 2023; 191:105339. [PMID: 36963921 DOI: 10.1016/j.pestbp.2023.105339] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/07/2022] [Revised: 01/04/2023] [Accepted: 01/10/2023] [Indexed: 06/18/2023]
Abstract
There are many insect pests worldwide that damage agricultural crops and reduce yield either by direct feeding or by the transmission of plant diseases. To date, control of pest insects has been achieved largely by applying synthetic insecticides. However, insecticide use can be seriously impacted by legislation that limits their use or by the evolution of resistance in the target pest. Thus, there is a move towards less use of insecticides and increased adoption of integrated pest management strategies using a wide range of non-chemical and chemical control methods. For good pest control there is a need to understand the mode of action and selectivity of insecticides, the life cycles of the pests and their biology and behaviours, all of which can benefit from good quality genome data. Here we present the complete assembled (chromosome level) genomes (incl. mtDNA) of 19 insect pests, Agriotes lineatus (click beetle/wireworm), Aphis gossypii (melon/cotton aphid), Bemisia tabaci (cotton whitefly), Brassicogethes aeneus (pollen beetle), Ceutorhynchus obstrictus (seedpod weevil), Chilo suppressalis (striped rice stem borer), Chrysodeixis includens (soybean looper), Diabrotica balteata (cucumber beetle), Diatraea saccharalis (sugar cane borer), Nezara viridula (green stink bug), Nilaparvata lugens (brown plant hopper), Phaedon cochleariae (mustard beetle), Phyllotreta striolata (striped flea beetle), Psylliodes chrysocephala (cabbage stem flea beetle), Spodoptera exigua (beet army worm), Spodoptera littoralis (cotton leaf worm), Diabrotica virgifera (western corn root worm), Euschistus heros (brown stink bug) and Phyllotreta cruciferae (crucifer flea beetle). For the first 15 of these we also present the annotation of genes encoding potential xenobiotic detoxification enzymes. This public resource will aid in the elucidation and monitoring of resistance mechanisms, the development of highly selective chemistry and potential techniques to disrupt behaviour in a way that limits the effect of the pests.
|
3
|
Gene-based mapping of trehalose biosynthetic pathway genes reveals association with source- and sink-related yield traits in a spring wheat panel. Food Energy Secur 2021; 10:e292. [PMID: 34594548 PMCID: PMC8459250 DOI: 10.1002/fes3.292] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Revised: 04/12/2021] [Accepted: 04/12/2021] [Indexed: 12/11/2022] Open
Abstract
Trehalose 6‐phosphate (T6P) signalling regulates carbon use and allocation and is a target to improve crop yields. However, the specific contributions of trehalose phosphate synthase (TPS) and trehalose phosphate phosphatase (TPP) genes to source‐ and sink‐related traits remain largely unknown. We used enrichment capture sequencing on TPS and TPP genes to estimate and partition the genetic variation of yield‐related traits in a spring wheat (Triticum aestivum) breeding panel specifically built to capture the diversity across the 75,000 CIMMYT wheat cultivar collection. Twelve phenotypes were correlated to variation in TPS and TPP genes, including plant height and biomass (source) and spikelets per spike, spike growth and grain filling traits (sink), which showed indications of both positive and negative gene selection. Individual genes explained proportions of heritability for biomass and grain‐related traits. Three TPS1 homologues were particularly significant for trait variation. Epistatic interactions were found within and between the TPS and TPP gene families for both plant height and grain‐related traits. Gene‐based prediction improved predictive ability for grain weight when gene effects were combined with the whole‐genome markers. Our study has generated a wealth of information on natural variation of TPS and TPP genes related to yield potential. In line with other studies, it confirms the role of T6P in resource allocation and in affecting traits such as grain number and size. This opens up the possibility of harnessing natural genetic variation more widely to better understand the contribution of native genes to yield traits for incorporation into breeding programmes.
|
4
|
The Wheat GENIE3 Network Provides Biologically-Relevant Information in Polyploid Wheat. G3 (BETHESDA, MD.) 2020; 10:3675-3686. [PMID: 32747342 PMCID: PMC7534433 DOI: 10.1534/g3.120.401436] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/07/2020] [Accepted: 08/01/2020] [Indexed: 11/18/2022]
Abstract
Gene regulatory networks are powerful tools which facilitate hypothesis generation and candidate gene discovery. However, the extent to which the network predictions are biologically relevant is often unclear. Recently a GENIE3 network which predicted targets of wheat transcription factors was produced. Here we used an independent RNA-Seq dataset to test the predictions of the wheat GENIE3 network for the senescence-regulating transcription factor NAM-A1 (TraesCS6A02G108300). We re-analyzed the RNA-Seq data against the RefSeqv1.0 genome and identified a set of differentially expressed genes (DEGs) between the wild-type and nam-a1 mutant which recapitulated the known role of NAM-A1 in senescence and nutrient remobilisation. We found that the GENIE3-predicted target genes of NAM-A1 overlap significantly with the DEGs, more than would be expected by chance. Based on high levels of overlap between GENIE3-predicted target genes and the DEGs, we identified candidate senescence regulators. We then explored genome-wide trends in the network related to polyploidy and found that only homeologous transcription factors are likely to share predicted targets in common. However, homeologs which vary in expression levels across tissues are less likely to share predicted targets than those that do not, suggesting that they may be more likely to act in distinct pathways. This work demonstrates that the wheat GENIE3 network can provide biologically-relevant predictions of transcription factor targets, which can be used for candidate gene prediction and for global analyses of transcription factor function. The GENIE3 network has now been integrated into the KnetMiner web application, facilitating its use in future studies.
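One common way to ask whether predicted targets overlap a DEG set "more than would be expected by chance" is a one-sided hypergeometric test. The abstract does not say which test the authors used, so the sketch below is only an illustration under that assumption; the function name `hypergeom_sf` and all counts in the example are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_sf(overlap, population, n_predicted, n_deg):
    """Return P(X >= overlap) under a hypergeometric null.

    population  : total number of genes considered (hypothetical input)
    n_predicted : number of network-predicted target genes
    n_deg       : number of differentially expressed genes
    overlap     : observed number of genes in both sets
    """
    upper = min(n_predicted, n_deg)
    total = comb(population, n_deg)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(n_predicted, k) * comb(population - n_predicted, n_deg - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the function.
    p_value = hypergeom_sf(overlap=3, population=10, n_predicted=3, n_deg=4)
    print(f"P(overlap >= 3) = {p_value:.4f}")
```

A small p-value here would indicate that an overlap of this size is unlikely if the DEGs were drawn at random from the gene population; the choice of background population strongly affects the result.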
|
5
|
PHI-base: the pathogen-host interactions database. Nucleic Acids Res 2020; 48:D613-D620. [PMID: 31733065 PMCID: PMC7145647 DOI: 10.1093/nar/gkz904] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Received: 09/17/2019] [Revised: 10/01/2019] [Accepted: 11/14/2019] [Indexed: 11/21/2022] Open
Abstract
The pathogen–host interactions database (PHI-base) is available at www.phi-base.org. PHI-base contains expertly curated molecular and biological information on genes proven to affect the outcome of pathogen–host interactions reported in peer reviewed research articles. PHI-base also curates literature describing specific gene alterations that did not affect the disease interaction phenotype, in order to provide complete datasets for comparative purposes. Viruses are not included, due to their extensive coverage in other databases. In this article, we describe the increased data content of PHI-base, plus new database features and further integration with complementary databases. The release of PHI-base version 4.8 (September 2019) contains 3454 manually curated references, and provides information on 6780 genes from 268 pathogens, tested on 210 hosts in 13,801 interactions. Prokaryotic and eukaryotic pathogens are represented in almost equal numbers. Host species consist of approximately 60% plants (split 50:50 between cereal and non-cereal plants), and 40% other species of medical and/or environmental importance. The information available on pathogen effectors has risen by more than a third, and the number of entries for pathogens that infect crop species of global importance has dramatically increased in this release. We also briefly describe the future direction of the PHI-base project, and some existing problems with the PHI-base curation process.
|
6
|
Epigenetic patterns within the haplotype phased fig (Ficus carica L.) genome. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 102:600-614. [PMID: 31808196 DOI: 10.1111/tpj.14635] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 06/22/2019] [Revised: 11/13/2019] [Accepted: 11/26/2019] [Indexed: 05/22/2023]
Abstract
Due to DNA heterozygosity and repeat content, assembly of non-model plant genomes is challenging. Herein, we report a high-quality genome reference of one of the oldest known domesticated species, fig (Ficus carica L.), using Pacific Biosciences single-molecule, real-time sequencing. The fig genome is ~333 Mbp in size, of which 80% has been anchored to 13 chromosomes. Genome-wide analysis of N6-methyladenine and N4-methylcytosine revealed high methylation levels in both genes and transposable elements, and a prevalence of methylated over non-methylated genes. Furthermore, the characterization of N6-methyladenine sites led to the identification of ANHGA, a species-specific motif, which is prevalent for both genes and transposable elements. Finally, exploiting the contiguity of the 13 pseudomolecules, we identified 13 putative centromeric regions. The high-quality reference genome and the characterization of methylation profiles provide an important resource for both fig breeding and for fundamental research into the relationship between epigenetic changes and phenotype, using fig as a model species.
|
7
|
A roadmap for gene functional characterisation in crops with large genomes: Lessons from polyploid wheat. eLife 2020; 9:e55646. [PMID: 32208137 PMCID: PMC7093151 DOI: 10.7554/elife.55646] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 02/03/2020] [Accepted: 03/12/2020] [Indexed: 02/04/2023] Open
Abstract
Understanding the function of genes within staple crops will accelerate crop improvement by allowing targeted breeding approaches. Despite their importance, a lack of genomic information and resources has hindered the functional characterisation of genes in major crops. The recent release of high-quality reference sequences for these crops underpins a suite of genetic and genomic resources that support basic research and breeding. For wheat, these include gene model annotations, expression atlases and gene networks that provide information about putative function. Sequenced mutant populations, improved transformation protocols and structured natural populations provide rapid methods to study gene function directly. We highlight a case study exemplifying how to integrate these resources. This review provides a helpful guide for plant scientists, especially those expanding into crop research, to capitalise on the discoveries made in Arabidopsis and other plants. This will accelerate the improvement of crops of vital importance for food and nutrition security.
|
8
|
A Co-Expression Network in Hexaploid Wheat Reveals Mostly Balanced Expression and Lack of Significant Gene Loss of Homeologous Meiotic Genes Upon Polyploidization. FRONTIERS IN PLANT SCIENCE 2019; 10:1325. [PMID: 31681395 PMCID: PMC6813927 DOI: 10.3389/fpls.2019.01325] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/25/2019] [Accepted: 09/24/2019] [Indexed: 05/05/2023]
Abstract
Polyploidization has played an important role in plant evolution. However, upon polyploidization, the process of meiosis must adapt to ensure the proper segregation of increased numbers of chromosomes to produce balanced gametes. It has been suggested that meiotic gene (MG) duplicates return to a single copy following whole genome duplication to stabilize the polyploid genome. Therefore, upon the polyploidization of wheat, a hexaploid species with three related (homeologous) genomes, the stabilization process may have involved rapid changes in content and expression of MGs on homeologous chromosomes (homeologs). To examine this hypothesis, sets of candidate MGs were identified in wheat using co-expression network analysis and orthology informed approaches. In total, 130 RNA-Seq samples from a range of tissues including wheat meiotic anthers were used to define co-expressed modules of genes. Three modules were significantly correlated with meiotic tissue samples but not with other tissue types. These modules were enriched for GO terms related to cell cycle, DNA replication, and chromatin modification and contained orthologs of known MGs. Overall, 74.4% of genes within these meiosis-related modules had three homeologous copies which was similar to other tissue-related modules. Amongst wheat MGs identified by orthology, rather than co-expression, the majority (93.7%) were either retained in hexaploid wheat at the same number of copies (78.4%) or increased in copy number (15.3%) compared to ancestral wheat species. Furthermore, genes within meiosis-related modules showed more balanced expression levels between homeologs than genes in non-meiosis-related modules. Taken together, our results do not support extensive gene loss nor changes in homeolog expression of MGs upon wheat polyploidization. The construction of the MG co-expression network allowed identification of hub genes and provided key targets for future studies.
|
9
|
Basic LEUCINE ZIPPER TRANSCRIPTION FACTOR67 Transactivates DELAY OF GERMINATION1 to Establish Primary Seed Dormancy in Arabidopsis. THE PLANT CELL 2019; 31:1276-1288. [PMID: 30962396 PMCID: PMC6588305 DOI: 10.1105/tpc.18.00892] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 11/26/2018] [Revised: 03/15/2019] [Accepted: 04/05/2019] [Indexed: 05/18/2023]
Abstract
Seed dormancy governs the timing of germination, one of the most important developmental transitions in a plant's life cycle. The DELAY OF GERMINATION1 (DOG1) gene is a key regulator of seed dormancy and a major quantitative trait locus in Arabidopsis (Arabidopsis thaliana). DOG1 expression is under tight developmental and environmental regulation, but the transcription factors involved are not known. Here we show that basic LEUCINE ZIPPER TRANSCRIPTION FACTOR67 (bZIP67) acts downstream of the central regulator of seed development, LEAFY COTYLEDON1, to transactivate DOG1 during maturation and help to establish primary dormancy. We show that bZIP67 overexpression enhances dormancy and that bZIP67 protein (but not transcript) abundance is increased in seeds matured in cool conditions, providing a mechanism to explain how temperature regulates DOG1 expression. We also show that natural allelic variation in the DOG1 promoter affects bZIP67-dependent transactivation, providing a mechanism to explain ecotypic differences in seed dormancy that are controlled by the DOG1 locus.
|
10
|
Natural variation in acyl editing is a determinant of seed storage oil composition. Sci Rep 2018; 8:17346. [PMID: 30478395 PMCID: PMC6255774 DOI: 10.1038/s41598-018-35136-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/03/2018] [Accepted: 10/26/2018] [Indexed: 01/09/2023] Open
Abstract
Seeds exhibit wide variation in the fatty acid composition of their storage oil. However, the genetic basis of this variation is only partially understood. Here we have used a multi-parent advanced generation inter-cross (MAGIC) population to study the genetic control of fatty acid chain length in Arabidopsis thaliana seed oil. We mapped four quantitative trait loci (QTL) for the quantity of the major very long chain fatty acid species 11-eicosenoic acid (20:1), using multiple QTL modelling. Surprisingly, the main-effect QTL does not coincide with FATTY ACID ELONGASE 1 and a parallel genome wide association study suggested that LYSOPHOSPHATIDYLCHOLINE ACYLTRANSFERASE 2 (LPCAT2) is a candidate for this QTL. Regression analysis also suggested that LPCAT2 expression and 20:1 content in seeds of the 19 MAGIC founder accessions are related. LPCAT is a key component of the Lands cycle; an acyl editing pathway that enables acyl-exchange between the acyl-Coenzyme A and phosphatidylcholine precursor pools used for microsomal fatty acid elongation and desaturation, respectively. We Mendelianised the main-effect QTL using biparental chromosome segment substitution lines and carried out complementation tests to show that a single cis-acting polymorphism in the LPCAT2 promoter causes the variation in seed 20:1 content, by altering the LPCAT2 expression level and total LPCAT activity in developing siliques. Our work establishes that oilseed species exhibit natural variation in the enzymic capacity for acyl editing and this contributes to the genetic control of storage oil composition.
|
11
|
Abstract
KnetMaps is a BioJS component for the interactive visualization of biological knowledge networks. It is well suited for applications that need to visualise complementary, connected and content-rich data in a single view in order to help users to traverse pathways linking entities of interest, for example to go from genotype to phenotype. KnetMaps loads data in JSON format, visualizes the structure and content of knowledge networks using lightweight JavaScript libraries, and supports interactive touch gestures. KnetMaps uses effective visualization techniques to prevent information overload and to allow researchers to progressively build their knowledge.
|
12
|
Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach. J Integr Bioinform 2018; 15:/j/jib.ahead-of-print/jib-2018-0023/jib-2018-0023.xml. [PMID: 30085931 PMCID: PMC6340125 DOI: 10.1515/jib-2018-0023] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/16/2018] [Accepted: 06/07/2018] [Indexed: 01/01/2023] Open
Abstract
The speed and accuracy of new scientific discoveries – be it by humans or artificial intelligence – depend on the quality of the underlying data and on the technology to connect, search and share the data efficiently. In recent years, we have seen the rise of graph databases and semi-formal data models such as knowledge graphs to facilitate software approaches to scientific discovery. These approaches extend work based on formalised models, such as the Semantic Web. In this paper, we present our developments to connect, search and share data about genome-scale knowledge networks (GSKN). We have developed a simple application ontology based on OWL/RDF with mappings to standard schemas. We are employing the ontology to power data access services like resolvable URIs, SPARQL endpoints, JSON-LD web APIs and Neo4j-based knowledge graphs. We demonstrate how the proposed ontology and graph databases considerably improve search and access to interoperable and reusable biological knowledge (i.e. the FAIRness data principles).
|
13
|
The Role of Trehalose 6-Phosphate in Crop Yield and Resilience. PLANT PHYSIOLOGY 2018; 177:12-23. [PMID: 29592862 PMCID: PMC5933140 DOI: 10.1104/pp.17.01634] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Received: 01/26/2018] [Accepted: 03/19/2018] [Indexed: 05/19/2023]
Abstract
T6P can be targeted through genetic and chemical methods for crop yield improvements in different environments through the effect of T6P on carbon allocation and biosynthetic pathways.
|
14
|
Transcriptome changes induced by arbuscular mycorrhizal fungi in sunflower (Helianthus annuus L.) roots. Sci Rep 2018; 8:4. [PMID: 29311719 PMCID: PMC5758643 DOI: 10.1038/s41598-017-18445-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Received: 09/08/2017] [Accepted: 12/08/2017] [Indexed: 01/11/2023] Open
Abstract
Arbuscular mycorrhizal (AM) fungi are essential elements of soil fertility, plant nutrition and productivity, facilitating soil mineral nutrient uptake. Helianthus annuus is a non-model, widely cultivated species. Here we used an RNA-seq approach for evaluating gene expression variation at early and late stages of mycorrhizal establishment in sunflower roots colonized by the arbuscular fungus Rhizoglomus irregulare. mRNA was isolated from roots of plantlets at 4 and 16 days after inoculation with the fungus. cDNA libraries were built and sequenced with Illumina technology. Differential expression analysis was performed between control and inoculated plants. Overall, 726 differentially expressed genes (DEGs) between inoculated and control plants were retrieved. The number of up-regulated DEGs greatly exceeded the number of down-regulated DEGs and this difference increased in later stages of colonization. Several DEGs were specifically involved in known mycorrhizal processes, such as membrane transport, cell wall shaping, and others. We also found previously unidentified mycorrhizal-induced transcripts. The most important DEGs were carefully described in order to hypothesize their roles in AM symbiosis. Our data make a valuable contribution to deciphering biological processes related to beneficial fungi and plant symbiosis, adding an Asteraceae, non-model species for future comparative functional genomics studies.
|
15
|
Spatiotemporal expression patterns of wheat amino acid transporters reveal their putative roles in nitrogen transport and responses to abiotic stress. Sci Rep 2017. [PMID: 28710348 DOI: 10.1038/s41598-017-04473-4473] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/24/2023] Open
Abstract
Amino acid transporters have roles in amino acid uptake from soil, long-distance transport, remobilization from vegetative tissues and accumulation in grain. Critically, the majority of wheat grain nitrogen is derived from amino acids remobilized from vegetative organs. However, no systematic analysis of wheat AAT genes has been reported to date. Here, 283 full length wheat AAT genes representing 100 distinct groups of homeologs were identified and curated by selectively consolidating IWGSC CSSv2 and TGACv1 Triticum aestivum genome assemblies and reassembling or mapping of IWGSC CSS chromosome sorted reads to fill any gaps. Gene expression profiling was performed using public RNA-seq data from root, leaf, stem, spike, grain and grain cells (transfer cell (TC), aleurone cell (AL), and starchy endosperm (SE)). AATs highly expressed in roots are good candidates for amino acid uptake from soil whilst AATs highly expressed in senescing leaves and stems may be involved in translocation to grain. AATs in TC (TaAAP2 and TaAAP19) and SE (TaAAP13) may play important roles in determining grain protein content and grain yield. The expression levels of AAT homeologs showed unequal contributions in response to abiotic stresses and development, which may aid wheat adaptation to a wide range of environments.
|
16
|
Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes. J Integr Bioinform 2017; 14:/j/jib.ahead-of-print/jib-2016-0002/jib-2016-0002.xml. [PMID: 28609292 PMCID: PMC6042805 DOI: 10.1515/jib-2016-0002] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/12/2017] [Accepted: 02/16/2017] [Indexed: 02/06/2023] Open
Abstract
Genetics and “omics” studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.
|
17
|
Genome Wide Analysis of Fatty Acid Desaturation and Its Response to Temperature. PLANT PHYSIOLOGY 2017; 173:1594-1605. [PMID: 28108698 PMCID: PMC5338679 DOI: 10.1104/pp.16.01907] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 12/15/2016] [Accepted: 01/20/2017] [Indexed: 05/22/2023]
Abstract
Plants modify the polyunsaturated fatty acid content of their membrane and storage lipids in order to adapt to changes in temperature. In developing seeds, this response is largely controlled by the activities of the microsomal ω-6 and ω-3 fatty acid desaturases, FAD2 and FAD3. Although temperature regulation of desaturation has been studied at the molecular and biochemical levels, the genetic control of this trait is poorly understood. Here, we have characterized the response of Arabidopsis (Arabidopsis thaliana) seed lipids to variation in ambient temperature and found that heat inhibits both ω-6 and ω-3 desaturation in phosphatidylcholine, leading to a proportional change in triacylglycerol composition. Analysis of the 19 parental accessions of the multiparent advanced generation intercross (MAGIC) population showed that significant natural variation exists in the temperature responsiveness of ω-6 desaturation. A combination of quantitative trait locus (QTL) analysis and genome-wide association studies (GWAS) using the MAGIC population suggests that ω-6 desaturation is largely controlled by cis-acting sequence variants in the FAD2 5' untranslated region intron that determine the expression level of the gene. However, the temperature responsiveness of ω-6 desaturation is controlled by a separate QTL on chromosome 2. The identity of this locus is unknown, but genome-wide association studies identified potentially causal sequence variants within ∼40 genes in an ∼450-kb region of the QTL.
|
18
|
Developing integrated crop knowledge networks to advance candidate gene discovery. Appl Transl Genom 2016; 11:18-26. [PMID: 28018846 PMCID: PMC5167366 DOI: 10.1016/j.atg.2016.10.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 06/03/2016] [Accepted: 10/24/2016] [Indexed: 12/03/2022]
Abstract
The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
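The idea of linking a phenotype to a gene through a knowledge network can be illustrated with a toy graph search. The sketch below is not the Ondex or KnetMiner implementation; the edge list, the relation labels (`orthologous_to`, `associated_with`, `encodes`) and the function names are hypothetical, chosen only to mirror the seed size / barley WRKY / TTG2 example mentioned in the abstract.

```python
from collections import deque

# Hypothetical (source, relation, target) triples standing in for
# integrated datasets; real knowledge networks are far larger.
EDGES = [
    ("barley WRKY gene", "orthologous_to", "TTG2 (Arabidopsis)"),
    ("TTG2 (Arabidopsis)", "associated_with", "seed size phenotype"),
    ("barley WRKY gene", "encodes", "WRKY transcription factor"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (source, relation, target) triples."""
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
        graph.setdefault(target, []).append((relation, source))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


if __name__ == "__main__":
    graph = build_graph(EDGES)
    print(find_path(graph, "barley WRKY gene", "seed size phenotype"))
```

Treating the network as undirected keeps the search simple; a production system would presumably weight or filter edges by evidence type, which this sketch does not attempt.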
|
19
|
Insecticide resistance mediated by an exon skipping event. Mol Ecol 2016; 25:5692-5704. [PMID: 27748560 PMCID: PMC5111602 DOI: 10.1111/mec.13882] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 06/29/2016] [Revised: 09/05/2016] [Accepted: 10/05/2016] [Indexed: 12/31/2022]
Abstract
Many genes increase coding capacity by alternate exon usage. The gene encoding the insect nicotinic acetylcholine receptor (nAChR) α6 subunit, target of the bio‐insecticide spinosad, is one example of this and expands protein diversity via alternative splicing of mutually exclusive exons. Here, we show that spinosad resistance in the tomato leaf miner, Tuta absoluta is associated with aberrant regulation of splicing of Taα6 resulting in a novel form of insecticide resistance mediated by exon skipping. Sequencing of the α6 subunit cDNA from spinosad selected and unselected strains of T. absoluta revealed all Taα6 transcripts of the selected strain were devoid of exon 3, with comparison of genomic DNA and mRNA revealing this is a result of exon skipping. Exon skipping cosegregated with spinosad resistance in survival bioassays, and functional characterization of this alteration using modified human nAChR α7, a model of insect α6, demonstrated that exon 3 is essential for receptor function and hence spinosad sensitivity. DNA and RNA sequencing analyses suggested that exon skipping did not result from genetic alterations in intronic or exonic cis‐regulatory elements, but rather was associated with a single epigenetic modification downstream of exon 3a, and quantitative changes in the expression of trans‐acting proteins that have known roles in the regulation of alternative splicing. Our results demonstrate that the intrinsic capacity of the α6 gene to generate transcript diversity via alternative splicing can be readily exploited during the evolution of resistance and identifies exon skipping as a molecular alteration conferring insecticide resistance.
|
20
|
Abstract
Background Accurate genome assembly and gene model annotation are critical for comparative species and gene functional analyses. Here we present the completed genome sequence and annotation of the reference strain PH-1 of Fusarium graminearum, the causal agent of head scab disease of small grain cereals which threatens global food security. Completion was achieved by combining (a) the BROAD Sanger sequenced draft, with (b) the gene predictions from Munich Information Services for Protein Sequences (MIPS) v3.2, with (c) de novo whole-genome shotgun re-sequencing, (d) re-annotation of the gene models using RNA-seq evidence and Fgenesh, Snap, GeneMark and Augustus prediction algorithms, followed by (e) manual curation. Results We have comprehensively completed the genomic 36,563,796 bp sequence by replacing unknown bases, placing supercontigs within their correct loci, correcting assembly errors, and inserting new sequences which include for the first time complete AT rich sequences such as centromere sequences, subtelomeric regions and the telomeres. Each of the four F. graminearum chromosomes was found to be submetacentric with respect to centromere positioning. The position of a potential neocentromere was also defined. A preferentially higher frequency of genetic recombination was observed at the end of the longer arm of each chromosome. Within the genome 1529 gene models have been modified and 412 new gene models predicted, with a total gene call of 14,164. The re-annotation impacts upon 69 entries held within the Pathogen-Host Interactions database (PHI-base) which stores information on genes for which mutant phenotypes in pathogen-host interactions have been experimentally tested, of which 59 are putative transcription factors, 8 kinases, 1 ATP citrate lyase (ACL1), and 1 syntaxin-like SNARE gene (GzSYN1). Although the completed F. graminearum genome contains very few transposon sequences, a previously unrecognised and potentially active gypsy-type long-terminal-repeat (LTR) retrotransposon was identified. In addition, each of the sub-telomeres and centromeres contained either an LTR or MarCry-1_FO element. The full content of the proposed ancient chromosome fusion sites has also been revealed and investigated. Regions with high recombination previously noted to be rich in secretome encoding genes were also found to be rich in tRNA sequences. This study has identified 741 F. graminearum species-specific genes and provides the first complete genome assembly for a Sordariomycetes species. Conclusions This fully completed F. graminearum PH-1 genome and manually curated annotation, available at Ensembl Fungi, provides the optimum resource to perform interspecies comparative analyses and gene function studies. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1756-1) contains supplementary material, which is available to authorized users.
|
21
|
Whole-genome analysis of Fusarium graminearum insertional mutants identifies virulence associated genes and unmasks untagged chromosomal deletions. BMC Genomics 2015; 16:261. [PMID: 25881124 PMCID: PMC4404607 DOI: 10.1186/s12864-015-1412-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/04/2014] [Accepted: 02/27/2015] [Indexed: 12/24/2022] Open
Abstract
Background Identifying pathogen virulence genes required to cause disease is crucial to understand the mechanisms underlying the pathogenic process. Plasmid insertion mutagenesis of fungal protoplasts is frequently used for this purpose in filamentous ascomycetes. Post transformation, the mutant population is screened for loss of virulence to a specific plant or animal host. Identifying the insertion event has previously met with varying degrees of success, from a cleanly disrupted gene with minimal deletion of nucleotides at the insertion point to multiple-copy insertion events and large deletions of chromosomal regions. Currently, extensive mutant collections exist in laboratories globally where it was hitherto impossible to identify all the affected genes. Results We used a whole-genome sequencing (WGS) approach using Illumina HiSeq 2000 technology to investigate DNA tag insertion points and chromosomal deletion events in mutagenised, reduced virulence F. graminearum isolates identified in disease tests on wheat (Triticum aestivum). We developed the FindInsertSeq workflow to localise the DNA tag insertions to the nucleotide level. The workflow was tested using four mutants showing evidence of single and multi-copy insertions in DNA blot analysis. FindInsertSeq was able to identify both single and multi-copy concatenation insertion sites. By comparing sequencing coverage, unexpected molecular recombination events such as large tagged and untagged chromosomal deletions, and DNA amplification were observed in three of the analysed mutants. A random data sampling approach revealed the minimum genome coverage required to survey the F. graminearum genome for alterations. Conclusions This study demonstrates that whole-genome re-sequencing to 22-fold genome coverage is an efficient tool to characterise single and multi-copy insertion mutants in the filamentous ascomycete Fusarium graminearum. In some cases insertion events are accompanied by large untagged chromosomal deletions while in other cases a straightforward insertion event could be confirmed. The FindInsertSeq analysis workflow presented in this study enables researchers to efficiently characterise insertion and deletion mutants. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1412-9) contains supplementary material, which is available to authorized users.
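The coverage comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the FindInsertSeq workflow; it only assumes that per-base read depth is available as a list of integers and that a deletion shows up as windows whose mean depth falls far below the genome-wide median. The function name, window size and threshold are hypothetical choices made for the illustration.

```python
from statistics import median


def low_coverage_windows(depth, window=1000, fraction=0.1):
    """Return (start, end) windows whose mean depth is below
    `fraction` times the median depth of the whole sequence.

    Hypothetical helper for illustration; `window` and `fraction`
    are arbitrary defaults, not values from the paper.
    """
    baseline = median(depth)
    flagged = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        mean_depth = sum(chunk) / len(chunk)
        if mean_depth < fraction * baseline:
            flagged.append((start, start + len(chunk)))
    return flagged


# Toy depth profile: 22x coverage with a 2 kb region that has no reads.
toy_depth = [22] * 3000 + [0] * 2000 + [22] * 3000
candidate_deletions = low_coverage_windows(toy_depth)
print(candidate_deletions)
```

Adjacent flagged windows would normally be merged into one candidate deletion before inspection; the sketch leaves them separate to keep the logic visible.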
|
22
|
Transcriptome and metabolite profiling of the infection cycle of Zymoseptoria tritici on wheat reveals a biphasic interaction with plant immunity involving differential pathogen chromosomal contributions and a variation on the hemibiotrophic lifestyle definition. PLANT PHYSIOLOGY 2015; 167:1158-85. [PMID: 25596183 PMCID: PMC4348787 DOI: 10.1104/pp.114.255927] [Citation(s) in RCA: 180] [Impact Index Per Article: 20.0] [Received: 12/20/2014] [Accepted: 01/16/2015] [Indexed: 05/17/2023]
Abstract
The hemibiotrophic fungus Zymoseptoria tritici causes Septoria tritici blotch disease of wheat (Triticum aestivum). Pathogen reproduction on wheat occurs without cell penetration, suggesting that dynamic and intimate intercellular communication occurs between fungus and plant throughout the disease cycle. We used deep RNA sequencing and metabolomics to investigate the physiology of plant and pathogen throughout an asexual reproductive cycle of Z. tritici on wheat leaves. Over 3,000 pathogen genes, more than 7,000 wheat genes, and more than 300 metabolites were differentially regulated. Intriguingly, individual fungal chromosomes contributed unequally to the overall gene expression changes. Early transcriptional down-regulation of putative host defense genes was detected in inoculated leaves. There was little evidence for fungal nutrient acquisition from the plant throughout symptomless colonization by Z. tritici, which may instead be utilizing lipid and fatty acid stores for growth. However, the fungus then subsequently manipulated specific plant carbohydrates, including fructan metabolites, during the switch to necrotrophic growth and reproduction. This switch coincided with increased expression of jasmonic acid biosynthesis genes and large-scale activation of other plant defense responses. Fungal genes encoding putative secondary metabolite clusters and secreted effector proteins were identified with distinct infection phase-specific expression patterns, although functional analysis suggested that many have overlapping/redundant functions in virulence. The pathogenic lifestyle of Z. tritici on wheat revealed through this study, involving initial defense suppression by a slow-growing extracellular and nutritionally limited pathogen followed by defense (hyper) activation during reproduction, reveals a subtle modification of the conceptual definition of hemibiotrophic plant infection.
|
23
|
Abstract
Summary Information Retrieval (IR) plays a central role in the exploration and interpretation of integrated biological datasets that represent the heterogeneous ecosystem of life sciences. Here, keyword-based query systems are popular user interfaces. In turn, to a large extent, the query phrases used determine the quality of the search result and the effort a scientist has to invest for query refinement. In this context, computer aided query expansion and suggestion is one of the most challenging tasks for life science information systems. Existing query front-ends support aspects like spelling correction, query refinement or query expansion. However, the majority of the front-ends only make limited use of enhanced IR algorithms to implement comprehensive and computer aided query refinement workflows. In this work, we present the design of a multi-stage query suggestion workflow and its implementation in the life science IR system LAILAPS. The presented workflow includes enhanced tokenisation, word breaking, spelling correction, query expansion and query suggestion ranking. A spelling correction benchmark with 5,401 queries and manually selected use cases for query expansion demonstrate the performance of the implemented workflow and its advantages compared with state-of-the-art systems.
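A multi-stage workflow of this kind can be sketched in a few lines. The sketch below is an illustration only and does not reproduce the LAILAPS implementation: the vocabulary, the synonym table and all function names are hypothetical, spelling correction is delegated to Python's standard `difflib`, word breaking is omitted, and ranking is reduced to listing the corrected query before its expansions.

```python
import difflib
import re

# Hypothetical controlled vocabulary and synonym table (illustration only).
VOCABULARY = ["barley", "wheat", "drought", "resistance", "kinase"]
SYNONYMS = {"drought": ["water deficit"], "barley": ["Hordeum vulgare"]}


def tokenise(query):
    """Lower-case the query and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def correct(token):
    """Replace a token by its closest vocabulary term, if one is similar enough."""
    matches = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else token


def suggest(query):
    """Return the corrected query followed by one expansion per known synonym."""
    tokens = [correct(t) for t in tokenise(query)]
    suggestions = [" ".join(tokens)]
    for i, token in enumerate(tokens):
        for synonym in SYNONYMS.get(token, []):
            suggestions.append(" ".join(tokens[:i] + [synonym] + tokens[i + 1:]))
    return suggestions


suggestions = suggest("Barly drougt")
print(suggestions)
```

A production system would score each suggestion, for example by expected hit count, instead of relying on insertion order.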
|
24
|
Secondary cell wall composition and candidate gene expression in developing willow (Salix purpurea) stems. PLANTA 2014; 239:1041-53. [PMID: 24504696 PMCID: PMC3997797 DOI: 10.1007/s00425-014-2034-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 09/04/2013] [Accepted: 01/21/2014] [Indexed: 05/17/2023]
Abstract
The properties of the secondary cell wall (SCW) in willow largely determine the suitability of willow biomass feedstock for potential bioenergy and biofuel applications. SCW development has been little studied in willow and it is not known how willow compares with model species, particularly the closely related genus Populus. To address this and relate SCW synthesis to candidate genes in willow, a tractable bud culture-derived system was developed in Salix purpurea, and cell wall composition and RNA-Seq transcriptome were followed in stems during early development. A large increase in SCW deposition in the period 0-2 weeks after transfer to soil was characterised by a big increase in xylan content, but no change in the frequency of substitution of xylan with glucuronic acid, and increased abundance of putative transcripts for synthesis of SCW cellulose, xylan and lignin. Histochemical staining and immunolabeling revealed that increased deposition of lignin and xylan was associated with xylem, xylem fibre cells and phloem fibre cells. Transcripts orthologous to those encoding xylan synthase components IRX9 and IRX10 and xylan glucuronyl transferase GUX1 in Arabidopsis were co-expressed, and showed the same spatial pattern of expression revealed by in situ hybridisation at four developmental stages, with abundant expression in proto-xylem, xylem fibre and ray parenchyma cells and some expression in phloem fibre cells. The results show a close similarity with SCW development in Populus species, but also give novel information on the relationship between spatial and temporal variation in xylan-related transcripts and xylan composition.
|
25
|
Abstract
SUMMARY Ondex Web is a new web-based implementation of the network visualization and exploration tools from the Ondex data integration platform. New features such as context-sensitive menus and annotation tools provide users with intuitive ways to explore and manipulate the appearance of heterogeneous biological networks. Ondex Web is open source, written in Java and can be easily embedded into Web sites as an applet. Ondex Web supports loading data from a variety of network formats, such as XGMML, NWB, Pajek and OXL. AVAILABILITY AND IMPLEMENTATION http://ondex.rothamsted.ac.uk/OndexWeb.
|
26
|
Assessing the functional coherence of modules found in multiple-evidence networks from Arabidopsis. BMC Bioinformatics 2011; 12:203. [PMID: 21612636 PMCID: PMC3118170 DOI: 10.1186/1471-2105-12-203] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 08/20/2010] [Accepted: 05/25/2011] [Indexed: 12/18/2022] Open
Abstract
Background Combining multiple evidence-types from different information sources has the potential to reveal new relationships in biological systems. The integrated information can be represented as a relationship network, and clustering the network can suggest possible functional modules. The value of such modules for gaining insight into the underlying biological processes depends on their functional coherence. The challenges that we wish to address are to define and quantify the functional coherence of modules in relationship networks, so that they can be used to infer function of as yet unannotated proteins, to discover previously unknown roles of proteins in diseases as well as for better understanding of the regulation and interrelationship between different elements of complex biological systems. Results We have defined the functional coherence of modules with respect to the Gene Ontology (GO) by considering two complementary aspects: (i) the fragmentation of the GO functional categories into the different modules and (ii) the most representative functions of the modules. We have proposed a set of metrics to evaluate these two aspects and demonstrated their utility in Arabidopsis thaliana. We selected 2355 proteins for which experimentally established protein-protein interaction (PPI) data were available. From these we have constructed five relationship networks, four based on single types of data: PPI, co-expression, co-occurrence of protein names in scientific literature abstracts and sequence similarity and a fifth one combining these four evidence types. The ability of these networks to suggest biologically meaningful grouping of proteins was explored by applying Markov clustering and then by measuring the functional coherence of the clusters. Conclusions Relationship networks integrating multiple evidence-types are biologically informative and allow more proteins to be assigned to a putative functional module. Using additional evidence types concentrates the functional annotations in a smaller number of modules without unduly compromising their consistency. These results indicate that integration of more data sources improves the ability to uncover functional association between proteins, both by allowing more proteins to be linked and producing a network where modular structure more closely reflects the hierarchy in the gene ontology.
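The two aspects of functional coherence named in the abstract can be made concrete with a small sketch. The metrics below are simplified stand-ins, not the ones proposed in the paper: fragmentation is taken as the number of modules over which a functional category is spread, and the representative function of a module as its most frequent annotation. The function names, the module assignment and the annotation labels are all hypothetical.

```python
from collections import Counter, defaultdict


def fragmentation(modules, annotations):
    """Map each annotation term to the number of modules that contain
    at least one protein carrying that term (lower means less fragmented)."""
    term_modules = defaultdict(set)
    for module_id, proteins in modules.items():
        for protein in proteins:
            for term in annotations.get(protein, ()):
                term_modules[term].add(module_id)
    return {term: len(ids) for term, ids in term_modules.items()}


def representative_term(proteins, annotations):
    """Return the most frequent annotation term among a module's proteins,
    or None if none of the proteins is annotated."""
    counts = Counter(t for p in proteins for t in annotations.get(p, ()))
    return counts.most_common(1)[0][0] if counts else None


# Toy clustering result and toy annotations (hypothetical labels).
modules = {"m1": ["p1", "p2"], "m2": ["p3", "p4"]}
annotations = {
    "p1": ["termA"],
    "p2": ["termA"],
    "p3": ["termA", "termB"],
    "p4": ["termB"],
}

spread = fragmentation(modules, annotations)
print(spread)
print({m: representative_term(ps, annotations) for m, ps in modules.items()})
```

In this toy case `termA` is split across both modules while `termB` is confined to one, which is the kind of contrast such a metric is meant to expose.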
|
27
|
Gaining confidence in cross-species annotation transfer: from simple molecular function to complex phenotypic traits. ASPECTS OF APPLIED BIOLOGY 2011; 107:79-87. [PMID: 22319070 PMCID: PMC3272443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Cross-species annotation transfer is a widely used approach for transferring information about simple molecular functions or pathways from one protein in one species to its ortholog in another species. In crop species, the phenotypic traits of interest, such as grain yield, are very complex and are often related to multiple biological processes and systems. It is still unclear to what extent the high level annotations describing phenotypic traits can also be reliably transferred across species. In this work, we have developed a procedure to measure precisely the transferability of these functional annotations from one species to another and demonstrate its application to Arabidopsis and several crop species. This comparative analysis is a step towards assigning higher level biological function to genes and gene networks as part of the wider genotype to phenotype challenge.
|
28
|
Enhancing Data Integration with Text Analysis to Find Proteins Implicated in Plant Stress Response. J Integr Bioinform 2010. [DOI: 10.1515/jib-2010-121] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022] Open
Abstract
Summary High throughput genomic studies can identify large numbers of potential candidate genes, which must be interpreted and filtered by investigators to select the best ones for further analysis. Prioritization is generally based on evidence that supports the role of a gene product in the biological process being investigated. The two most important bodies of information providing such evidence are bioinformatics databases and the scientific literature. In this paper we present an extension to the Ondex data integration framework that uses text mining techniques over Medline abstracts as a method for accessing both these bodies of evidence in a consistent way. In an example use case, we apply our method to create a knowledge base of Arabidopsis proteins implicated in plant stress response and use various scoring metrics to identify key protein-stress associations. In conclusion, we show that the additional text mining features are able to highlight proteins using the scientific literature that would not have been seen using data integration alone. Ondex is an open-source software project and can be downloaded, together with the text mining features described here, from www.ondex.org.
|
29
|
Abstract
Summary The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database AraCyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.
|