1
Tourdot E, Martin PGP, Maza E, Mauxion JP, Djari A, Gévaudant F, Chevalier C, Pirrello J, Gonzalez N. Ploidy-specific transcriptomes shed light on the heterogeneous identity and metabolism of developing tomato pericarp cells. The Plant Journal: for Cell and Molecular Biology 2024; 118:997-1015. [PMID: 38281284 DOI: 10.1111/tpj.16646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 11/20/2023] [Accepted: 01/05/2024] [Indexed: 01/30/2024]
Abstract
Endoreduplication, during which cells increase their DNA content through successive rounds of full genome replication without cell division, is the major source of endopolyploidy in higher plants. Endoreduplication plays pivotal roles in plant growth and development and is associated with the activation of specific transcriptional programmes that are characteristic of each cell type, thereby defining their identity. In plants, endoreduplication is found in numerous organs and cell types, especially in agronomically valuable ones, such as the fleshy fruit (pericarp) of tomato presenting high ploidy levels. We used the tomato pericarp tissue as a model system to explore the transcriptomes associated with endoreduplication progression during fruit growth. We confirmed that expression globally scales with ploidy level and identified sets of differentially expressed genes presenting only developmental-specific, only ploidy-specific expression patterns or profiles resulting from an additive effect of ploidy and development. When comparing ploidy levels at a specific developmental stage, we found that non-endoreduplicated cells are defined by cell division state and cuticle synthesis while endoreduplicated cells are mainly defined by their metabolic activity changing rapidly over time. By combining this dataset with publicly available spatiotemporal pericarp expression data, we proposed a map describing the distribution of ploidy levels within the pericarp. These transcriptome-based predictions were validated by quantifying ploidy levels within the pericarp tissue. This in situ ploidy quantification revealed the dynamic progression of endoreduplication and its cell layer specificity during early fruit development. In summary, the study sheds light on the complex relationship between endoreduplication, cell differentiation and gene expression patterns in the tomato pericarp.
Affiliation(s)
- Edouard Tourdot
  - Université de Bordeaux, INRAE, UMR1332 Biologie du Fruit et Pathologie, F-33882, Villenave d'Ornon, France
- Pascal G P Martin
  - Université de Bordeaux, INRAE, UMR1332 Biologie du Fruit et Pathologie, F-33882, Villenave d'Ornon, France
- Elie Maza
  - Laboratoire de Recherche en Sciences Végétales-Génomique et Biotechnologie des Fruits-UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, F-31326, Castanet-Tolosan, France
- Jean-Philippe Mauxion
  - Université de Bordeaux, INRAE, UMR1332 Biologie du Fruit et Pathologie, F-33882, Villenave d'Ornon, France
- Anis Djari
  - Laboratoire de Recherche en Sciences Végétales-Génomique et Biotechnologie des Fruits-UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, F-31326, Castanet-Tolosan, France
- Frédéric Gévaudant
  - Université de Bordeaux, INRAE, UMR1332 Biologie du Fruit et Pathologie, F-33882, Villenave d'Ornon, France
- Christian Chevalier
  - Université de Bordeaux, INRAE, UMR1332 Biologie du Fruit et Pathologie, F-33882, Villenave d'Ornon, France
- Julien Pirrello
  - Laboratoire de Recherche en Sciences Végétales-Génomique et Biotechnologie des Fruits-UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, F-31326, Castanet-Tolosan, France
- Nathalie Gonzalez
  - Université de Bordeaux, INRAE, UMR1332 Biologie du Fruit et Pathologie, F-33882, Villenave d'Ornon, France
2
Brooks TG, Lahens NF, Mrčela A, Grant GR. Challenges and best practices in omics benchmarking. Nat Rev Genet 2024; 25:326-339. [PMID: 38216661 DOI: 10.1038/s41576-023-00679-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2023] [Indexed: 01/14/2024]
Abstract
Technological advances enabling massively parallel measurement of biological features - such as microarrays, high-throughput sequencing and mass spectrometry - have ushered in the omics era, now in its third decade. The resulting complex landscape of analytical methods has naturally fostered the growth of an omics benchmarking industry. Benchmarking refers to the process of objectively comparing and evaluating the performance of different computational or analytical techniques when processing and analysing large-scale biological data sets, such as transcriptomics, proteomics and metabolomics. With thousands of omics benchmarking studies published over the past 25 years, the field has matured to the point where the foundations of benchmarking have been established and well described. However, generating meaningful benchmarking data and properly evaluating performance in this complex domain remains challenging. In this Review, we highlight some common oversights and pitfalls in omics benchmarking. We also establish a methodology to bring the issues that can be addressed into focus and to be transparent about those that cannot: this takes the form of a spreadsheet template of guidelines for comprehensive reporting, intended to accompany publications. In addition, a survey of recent developments in benchmarking is provided as well as specific guidance for commonly encountered difficulties.
Affiliation(s)
- Thomas G Brooks
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Nicholas F Lahens
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Antonijo Mrčela
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Gregory R Grant
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA.
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA.
3
Hsu CY, Chang CJ, Liu Q, Shyr Y. scKWARN: Kernel-weighted-average robust normalization for single-cell RNA-seq data. Bioinformatics 2024; 40:btae008. [PMID: 38237908 PMCID: PMC10868328 DOI: 10.1093/bioinformatics/btae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 12/05/2023] [Accepted: 01/04/2024] [Indexed: 02/09/2024] [Open Access]
Abstract
MOTIVATION: Single-cell RNA-seq normalization is an essential step to correct unwanted biases caused by sequencing depth, capture efficiency, dropout, and other technical factors. Existing normalization methods primarily reduce biases arising from sequencing depth by modeling the count-depth relationship and/or assuming a specific distribution for read counts. However, these methods may lead to over- or under-correction due to the presence of technical biases beyond sequencing depth and the restrictive assumptions on models and distributions. RESULTS: We present scKWARN, a Kernel Weighted Average Robust Normalization designed to correct known or hidden technical confounders without assuming specific data distributions or count-depth relationships. scKWARN generates a pseudo expression profile for each cell by borrowing information from its fuzzy technical neighbors through a kernel smoother. It then compares this profile against the reference derived from cells with the same bimodality patterns to determine the normalization factor. As demonstrated in both simulated and real datasets, scKWARN outperforms existing methods in removing a variety of technical biases while preserving true biological heterogeneity. AVAILABILITY AND IMPLEMENTATION: scKWARN is freely available at https://github.com/cyhsuTN/scKWARN.
Affiliation(s)
- Chih-Yuan Hsu
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Chia-Jung Chang
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Department of Biomedical Engineering, National Cheng Kung University, Tainan 701, Taiwan
- Qi Liu
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Yu Shyr
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
  - Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN 37203, United States
4
Bollier N, Micol-Ponce R, Dakdaki A, Maza E, Zouine M, Djari A, Bouzayen M, Chevalier C, Delmas F, Gonzalez N, Hernould M. Various tomato cultivars display contrasting morphological and molecular responses to a chronic heat stress. Frontiers in Plant Science 2023; 14:1278608. [PMID: 37965003 PMCID: PMC10642206 DOI: 10.3389/fpls.2023.1278608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 10/06/2023] [Indexed: 11/16/2023]
Abstract
Climate change is one of the biggest threats that human society currently needs to face. Heat waves associated with global warming negatively affect plant growth and development and will increase in intensity and frequency in the coming years. Tomato is one of the most produced and consumed fruits in the world, but remarkable yield losses occur every year due to the sensitivity of many cultivars to heat stress (HS). New insights into how tomato plants respond to HS will contribute to the development of cultivars with high yields under harsh temperature conditions. In this study, the analysis of microsporogenesis and pollen germination rate of eleven tomato cultivars after exposure to a chronic HS revealed differences between genotypes. Pollen development was delayed and/or desynchronized by HS, depending on the cultivar considered. In addition, pollen germination was abolished by HS in all but two cultivars. The transcriptome of floral buds at two developmental stages (tetrad and pollen floral buds) of five cultivars revealed common and specific molecular responses implemented by tomato cultivars to cope with chronic HS. These data provide valuable insights into the diversity of the genetic response of floral buds from different cultivars to HS and may contribute to the development of future climate-resilient tomato varieties.
Affiliation(s)
- N. Bollier
  - INRAE, Université de Bordeaux, BFP, Bordeaux, France
- A. Dakdaki
  - INRAE, Université de Bordeaux, BFP, Bordeaux, France
- E. Maza
  - Laboratoire de Recherche en Sciences Végétales, Université de Toulouse, CNRS, UPS, Toulouse INP, Toulouse, France
- M. Zouine
  - Laboratoire de Recherche en Sciences Végétales, Université de Toulouse, CNRS, UPS, Toulouse INP, Toulouse, France
- A. Djari
  - Laboratoire de Recherche en Sciences Végétales, Université de Toulouse, CNRS, UPS, Toulouse INP, Toulouse, France
- M. Bouzayen
  - Laboratoire de Recherche en Sciences Végétales, Université de Toulouse, CNRS, UPS, Toulouse INP, Toulouse, France
- C. Chevalier
  - INRAE, Université de Bordeaux, BFP, Bordeaux, France
- F. Delmas
  - INRAE, Université de Bordeaux, BFP, Bordeaux, France
- N. Gonzalez
  - INRAE, Université de Bordeaux, BFP, Bordeaux, France
- M. Hernould
  - INRAE, Université de Bordeaux, BFP, Bordeaux, France
5
O'Connell GC. Variability in donor leukocyte counts confound the use of common RNA sequencing data normalization strategies in transcriptomic biomarker studies performed with whole blood. Sci Rep 2023; 13:15514. [PMID: 37726353 PMCID: PMC10509252 DOI: 10.1038/s41598-023-41443-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 08/26/2023] [Indexed: 09/21/2023] [Open Access]
Abstract
Gene expression data generated from whole blood via next generation sequencing is frequently used in studies aimed at identifying mRNA-based biomarker panels with utility for diagnosis or monitoring of human disease. These investigations often employ data normalization techniques more typically used for analysis of data originating from solid tissues, which largely operate under the general assumption that specimens have similar transcriptome composition. However, this assumption may be violated when working with data generated from whole blood, which is more cellularly dynamic, leading to potential confounds. In this study, we used next generation sequencing in combination with flow cytometry to assess the influence of donor leukocyte counts on the transcriptional composition of whole blood specimens sampled from a cohort of 138 human subjects, and then examined the effect of four frequently used data normalization approaches on our ability to detect inter-specimen biological variance, using the flow cytometry data to benchmark each specimen's true cellular and molecular identity. Whole blood samples originating from donors with differing leukocyte counts exhibited dramatic differences in both genome-wide distributions of transcript abundance and gene-level expression patterns. Consequently, three of the normalization strategies we tested, namely median ratio (MRN), trimmed mean of M-values (TMM), and quantile normalization, noticeably masked the true biological structure of the data and impaired our ability to detect true inter-specimen differences in mRNA levels. The only strategy that improved our ability to detect true biological variance was simple scaling of read counts by sequencing depth, which, unlike the aforementioned approaches, makes no assumptions regarding transcriptome composition.
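To make the contrast in this abstract concrete, the sketch below places simple depth scaling next to a median-of-ratios size-factor calculation. It is a minimal illustration, not the study's pipeline: the function names, the counts-per-million scale, and the toy matrix are assumptions introduced here, and the median-of-ratios function follows the commonly described recipe rather than any specific package's implementation.

```python
import numpy as np

def counts_per_million(counts):
    """Depth scaling: divide each sample (column) by its total reads, times 1e6."""
    counts = np.asarray(counts, dtype=float)
    depth = counts.sum(axis=0)  # sequencing depth per sample
    return counts / depth * 1e6

def median_ratio_size_factors(counts):
    """Median-of-ratios size factors, using only genes with nonzero counts in every sample."""
    counts = np.asarray(counts, dtype=float)
    expressed = (counts > 0).all(axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene reference: log of the geometric mean across samples.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per-sample factor: median ratio to the reference across genes.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Hypothetical toy matrix (rows: genes, columns: samples);
# sample 2 is sample 1 sequenced twice as deep.
toy_counts = np.array([[10, 20], [30, 60], [60, 120]])
cpm = counts_per_million(toy_counts)
size_factors = median_ratio_size_factors(toy_counts)
```

On this toy matrix the two approaches agree because both samples have identical composition. The median-of-ratios factor relies on most genes being unchanged between samples, which is the kind of composition assumption the abstract reports as problematic for whole blood; depth scaling uses only each sample's total read count.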
Affiliation(s)
- Grant C O'Connell
  - Molecular Biomarker Core, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH, 44106-4904, USA.
  - School of Nursing, Case Western Reserve University, Cleveland, OH, USA.
6
Iatrou A, Gounari M, Sofou E, Zaragoza-Infante L, Markopoulos I, Sarrigeorgiou I, Petrakis G, Pechlivanis N, Roumeliotou-Dimou M, Panayiotidis P, Stamatopoulos B, Gkanidou M, Sandaltzopoulos R, Degano M, Koletsa T, Lymberi P, Psomopoulos F, Ghia P, Agathangelidis A, Chatzidimitriou A, Stamatopoulos K. N-Glycosylation of the Ig Receptors Shapes the Antigen Reactivity in Chronic Lymphocytic Leukemia Subset #201. Journal of Immunology (Baltimore, Md.: 1950) 2023; 211:743-754. [PMID: 37466373 DOI: 10.4049/jimmunol.2300330] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/12/2023] [Accepted: 07/04/2023] [Indexed: 07/20/2023]
Abstract
Subset #201 is a clinically indolent subgroup of patients with chronic lymphocytic leukemia defined by the expression of stereotyped, mutated IGHV4-34/IGLV1-44 BCR Ig. Subset #201 is characterized by recurrent somatic hypermutations (SHMs) that frequently lead to the creation and/or disruption of N-glycosylation sites within the Ig H and L chain variable domains. To understand the relevance of this observation, using next-generation sequencing, we studied how SHM shapes the subclonal architecture of the BCR Ig repertoire in subset #201, particularly focusing on changes in N-glycosylation sites. Moreover, we profiled the Ag reactivity of the clonotypic BCR Ig expressed as rmAbs. We found that almost all analyzed cases from subset #201 carry SHMs potentially affecting N-glycosylation at the clonal and/or subclonal level and obtained evidence for N-glycan occupancy in SHM-induced novel N-glycosylation sites. These particular SHMs impact (auto)antigen recognition, as indicated by differences in Ag reactivity between the authentic rmAbs and germline revertants of SHMs introducing novel N-glycosylation sites in experiments entailing 1) flow cytometry for binding to viable cells, 2) immunohistochemistry against various human tissues, 3) ELISA against microbial Ags, and 4) protein microarrays testing reactivity against multiple autoantigens. On these grounds, N-glycosylation appears as relevant for the natural history of at least a fraction of Ig-mutated chronic lymphocytic leukemia. Moreover, subset #201 emerges as a paradigmatic case for the role of affinity maturation in the evolution of Ag reactivity of the clonotypic BCR Ig.
Affiliation(s)
- Anastasia Iatrou
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
  - Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, Greece
- Maria Gounari
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Electra Sofou
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Laura Zaragoza-Infante
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Ioannis Markopoulos
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Ioannis Sarrigeorgiou
  - Immunology Laboratory, Immunology Department, Hellenic Pasteur Institute, Athens, Greece
- Georgios Petrakis
  - Pathology Department, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Nikolaos Pechlivanis
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Maria Roumeliotou-Dimou
  - Hematology Section of the First Department of Propedeutic Internal Medicine, Laikon University Hospital, Athens, Greece
- Panagiotis Panayiotidis
  - Hematology Section of the First Department of Propedeutic Internal Medicine, Laikon University Hospital, Athens, Greece
- Basile Stamatopoulos
  - Laboratory of Clinical Cell Therapy, Jules Bordet Institute, Free University of Brussels, Brussels, Belgium
- Maria Gkanidou
  - Blood Transfusion Department, G. Papanikolaou Hospital, Thessaloniki, Greece
- Rafael Sandaltzopoulos
  - Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, Greece
- Massimo Degano
  - Biocrystallography Unit, Division of Immunology, Transplantation, and Infectious Diseases, IRCCS Scientific Institute San Raffaele, Milan, Italy
- Triantafyllia Koletsa
  - Pathology Department, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Peggy Lymberi
  - Immunology Laboratory, Immunology Department, Hellenic Pasteur Institute, Athens, Greece
- Fotis Psomopoulos
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Paolo Ghia
  - Division of Experimental Oncology, IRCCS Scientific Institute San Raffaele, Milan, Italy
- Andreas Agathangelidis
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
  - Department of Biology, School of Science, National and Kapodistrian University of Athens, Athens, Greece
- Anastasia Chatzidimitriou
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
  - Department of Molecular Medicine and Surgery, Karolinska Institute, Stockholm, Sweden
- Kostas Stamatopoulos
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
  - Department of Molecular Medicine and Surgery, Karolinska Institute, Stockholm, Sweden
7
Teichman G, Cohen D, Ganon O, Dunsky N, Shani S, Gingold H, Rechavi O. RNAlysis: analyze your RNA sequencing data without writing a single line of code. BMC Biol 2023; 21:74. [PMID: 37024838 PMCID: PMC10080885 DOI: 10.1186/s12915-023-01574-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 03/17/2023] [Indexed: 04/08/2023] [Open Access]
Abstract
BACKGROUND: Among the major challenges in next-generation sequencing experiments are exploratory data analysis, interpreting trends, identifying potential targets/candidates, and visualizing the results clearly and intuitively. These hurdles are further heightened for researchers who are not experienced in writing computer code since most available analysis tools require programming skills. Even for proficient computational biologists, an efficient and replicable system is warranted to generate standardized results. RESULTS: We have developed RNAlysis, a modular Python-based analysis software for RNA sequencing data. RNAlysis allows users to build customized analysis pipelines suiting their specific research questions, going all the way from raw FASTQ files (adapter trimming, alignment, and feature counting), through exploratory data analysis and data visualization, clustering analysis, and gene set enrichment analysis. RNAlysis provides a friendly graphical user interface, allowing researchers to analyze data without writing code. We demonstrate the use of RNAlysis by analyzing RNA sequencing data from different studies using C. elegans nematodes. We note that the software applies equally to data obtained from any organism with an existing reference genome. CONCLUSIONS: RNAlysis is suitable for investigating various biological questions, allowing researchers to more accurately and reproducibly run comprehensive bioinformatic analyses. It functions as a gateway into RNA sequencing analysis for less computer-savvy researchers, but can also help experienced bioinformaticians make their analyses more robust and efficient, as it offers diverse tools, scalability, automation, and standardization between analyses.
Affiliation(s)
- Guy Teichman
  - Department of Neurobiology, Wise Faculty of Life Sciences and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel.
- Dror Cohen
  - Department of Neurobiology, Wise Faculty of Life Sciences and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Or Ganon
  - Department of Biology, Technion - Israel Institute of Technology, Haifa, Israel
- Netta Dunsky
  - Sagol Brain Institute, Sourasky Medical Center, Neurological Institute, Tel Aviv and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Shachar Shani
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Hila Gingold
  - Department of Neurobiology, Wise Faculty of Life Sciences and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Oded Rechavi
  - Department of Neurobiology, Wise Faculty of Life Sciences and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
8
Zhao H, Moberg KH, Veraksa A. Hippo pathway and Bonus control developmental cell fate decisions in the Drosophila eye. Dev Cell 2023; 58:416-434.e12. [PMID: 36868234 PMCID: PMC10023510 DOI: 10.1016/j.devcel.2023.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2021] [Revised: 08/10/2022] [Accepted: 02/06/2023] [Indexed: 03/05/2023]
Abstract
The canonical function of the Hippo signaling pathway is the regulation of organ growth. How this pathway controls cell-fate determination is less well understood. Here, we identify a function of the Hippo pathway in cell-fate decisions in the developing Drosophila eye, exerted through the interaction of Yorkie (Yki) with the transcriptional regulator Bonus (Bon), an ortholog of mammalian transcriptional intermediary factor 1/tripartite motif (TIF1/TRIM) family proteins. Instead of controlling tissue growth, Yki and Bon promote epidermal and antennal fates at the expense of the eye fate. Proteomic, transcriptomic, and genetic analyses reveal that Yki and Bon control these cell-fate decisions by recruiting transcriptional and post-transcriptional co-regulators and by repressing Notch target genes and activating epidermal differentiation genes. Our work expands the range of functions and regulatory mechanisms under Hippo pathway control.
Affiliation(s)
- Heya Zhao
  - Department of Biology, University of Massachusetts Boston, Boston, MA 02125, USA
- Kenneth H Moberg
  - Department of Cell Biology, Emory University School of Medicine, Atlanta, GA 30322, USA
- Alexey Veraksa
  - Department of Biology, University of Massachusetts Boston, Boston, MA 02125, USA.
9
Vinceti A, De Lucia RR, Cremaschi P, Perron U, Karakoc E, Mauri L, Fernandez C, Kluczynski KH, Anderson DS, Iorio F. An interactive web application for processing, correcting, and visualizing genome-wide pooled CRISPR-Cas9 screens. Cell Reports Methods 2023; 3:100373. [PMID: 36814834 PMCID: PMC9939378 DOI: 10.1016/j.crmeth.2022.100373] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/07/2022] [Revised: 10/06/2022] [Accepted: 12/07/2022] [Indexed: 01/24/2023]
Abstract
A limitation of pooled CRISPR-Cas9 screens is the high false-positive rate in detecting essential genes arising from copy-number-amplified genomic regions. To solve this issue, we previously developed CRISPRcleanR: a computational method implemented as an R/Python package and in a dockerized version. CRISPRcleanR detects and corrects biased responses to CRISPR-Cas9 targeting in an unsupervised fashion, accurately reducing false-positive signals while maintaining sensitivity in identifying relevant genetic dependencies. Here, we present CRISPRcleanR WebApp, a web application enabling access to CRISPRcleanR through an intuitive interface. CRISPRcleanR WebApp removes the complexity of R/Python language user interactions; provides user-friendly access to a complete analytical pipeline, not requiring any data pre-processing and generating gene-level summaries of essentiality with associated statistical scores; and offers a range of interactively explorable plots while supporting a more comprehensive range of CRISPR guide RNA libraries than the original package. CRISPRcleanR WebApp is available at https://crisprcleanr-webapp.fht.org/.
Affiliation(s)
- Alessandro Vinceti
  - Computational Biology Research Centre, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Riccardo Roberto De Lucia
  - Computational Biology Research Centre, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Paolo Cremaschi
  - Computational Biology Research Centre, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Umberto Perron
  - Computational Biology Research Centre, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Emre Karakoc
  - Cancer Dependency Map Analytics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Luca Mauri
  - ICT and Digitalisation, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Carlos Fernandez
  - ICT and Digitalisation, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Daniel Stephen Anderson
  - ICT and Digitalisation, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
- Francesco Iorio
  - Computational Biology Research Centre, Human Technopole, Viale Rita Levi-Montalcini, 1, 20157 Milano, Italy
  - Cancer Dependency Map Analytics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
10
Altay G, Zapardiel-Gonzalo J, Peters B. RNA-seq preprocessing and sample size considerations for gene network inference. bioRxiv: the preprint server for biology 2023:2023.01.02.522518. [PMID: 36711979 PMCID: PMC9881880 DOI: 10.1101/2023.01.02.522518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Background: Gene network inference (GNI) methods have the potential to reveal functional relationships between different genes and their products. Most GNI algorithms have been developed for microarray gene expression datasets and their application to RNA-seq data is relatively recent. As the characteristics of RNA-seq data are different from microarray data, it is an unanswered question what preprocessing methods for RNA-seq data should be applied prior to GNI to attain optimal performance, or what the required sample size for RNA-seq data is to obtain reliable GNI estimates. Results: We ran 9144 analyses of 7 different RNA-seq datasets to evaluate 300 different preprocessing combinations that include data transformations, normalizations and association estimators. We found that there was no single best-performing preprocessing combination but that there were several good ones. The performance varied widely across datasets, which emphasized the importance of choosing an appropriate preprocessing configuration before GNI. Two preprocessing combinations appeared promising in general: first, Log-2 TPM (transcripts per million) with variance-stabilizing transformation (VST) and the Pearson Correlation Coefficient (PCC) association estimator; second, raw RNA-seq count data with PCC. Along with these two, we also identified 18 other good preprocessing combinations. Any of these approaches might perform best on a given dataset. Therefore, the GNI performance of these approaches should be measured on any new dataset to select the best-performing one for it. In terms of the required biological sample size of RNA-seq data, we found that between 30 and 85 samples were required to generate reliable GNI estimates. Conclusions: This study provides practical recommendations on default choices for data preprocessing prior to GNI analysis of RNA-seq data to obtain optimal performance results.
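As a rough illustration of one preprocessing combination named in this abstract, the sketch below computes log2 TPM values and then a gene-by-gene Pearson correlation matrix as the association estimate. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the pseudocount of 1, the gene lengths, and the toy counts are hypothetical, and the variance-stabilizing transformation mentioned in the abstract is omitted.

```python
import numpy as np

def log2_tpm(counts, gene_lengths_bp, pseudocount=1.0):
    """Convert raw counts (genes x samples) to log2(TPM + pseudocount)."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    rate = counts / lengths_kb               # reads per kilobase
    tpm = rate / rate.sum(axis=0) * 1e6      # each sample sums to one million
    return np.log2(tpm + pseudocount)

def pearson_association(expr):
    """Gene-by-gene Pearson correlation; rows of `expr` are treated as genes."""
    return np.corrcoef(expr)

# Hypothetical toy data: 3 genes measured in 4 samples.
toy_counts = np.array([[100, 200, 300, 400],
                       [50, 110, 140, 210],
                       [400, 300, 200, 100]])
toy_lengths = np.array([1000, 2000, 1500])   # assumed gene lengths in base pairs
assoc = pearson_association(log2_tpm(toy_counts, toy_lengths))
```

The resulting symmetric matrix would typically be thresholded or passed to a network inference method. With as few samples as in this toy example the correlations are not meaningful, which is consistent with the abstract's finding that between 30 and 85 samples were needed for reliable estimates.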
Affiliation(s)
- Gökmen Altay
  - La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
- Bjoern Peters
  - La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
11
Chirinos X, Ying S, Rodrigues MA, Maza E, Djari A, Hu G, Liu M, Purgatto E, Fournier S, Regad F, Bouzayen M, Pirrello J. Transition to ripening in tomato requires hormone-controlled genetic reprogramming initiated in gel tissue. Plant Physiology 2023; 191:610-625. [PMID: 36200876 PMCID: PMC9806557 DOI: 10.1093/plphys/kiac464] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/30/2022] [Accepted: 09/16/2022] [Indexed: 06/16/2023]
Abstract
Ripening is the last stage of the developmental program in fleshy fruits. During this phase, fruits become edible and acquire their unique sensory qualities and post-harvest potential. Although our knowledge of the mechanisms that regulate fruit ripening has improved considerably over the past decades, the processes that trigger the transition to ripening remain poorly deciphered. While transcriptomic profiling of tomato (Solanum lycopersicum L.) fruit ripening to date has mainly focused on the changes occurring in pericarp tissues between the Mature Green and Breaker stages, our study addresses the changes between the Early Mature Green and Late Mature Green stages in the gel and pericarp separately. The data showed that the shift from an inability to initiate ripening to the capacity to undergo full ripening requires extensive transcriptomic reprogramming that takes place first in the locular tissues before extending to the pericarp. Genome-wide transcriptomic profiling revealed the wide diversity of transcription factor (TF) families engaged in the global reprogramming of gene expression and identified those specifically regulated at the Mature Green stage in the gel but not in the pericarp, thereby providing potential targets toward deciphering the initial factors and events that trigger the transition to ripening. The study also uncovered an extensive reformed homeostasis for most plant hormones, highlighting the multihormonal control of ripening initiation. Our data unveil the antagonistic roles of ethylene and auxin during the onset of ripening and show that auxin treatment delays fruit ripening via impairing the expression of genes required for System-2 autocatalytic ethylene production that is essential for climacteric ripening. 
This study unveils the detailed features of the transcriptomic reprogramming associated with the transition to ripening of tomato fruit and shows that the first changes occur in the locular gel before extending to pericarp and that a reformed auxin homeostasis is essential for the ripening to proceed.
Affiliation(s)
| | | | - Maria Aurineide Rodrigues
- Laboratoire de Recherche en Sciences Végétales—Génomique et Biotechnologie des Fruits—UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, Toulouse, France
- Université de Toulouse, INRAe/INP Toulouse, Génomique et Biotechnologie des Fruits—UMR990, Castanet-Tolosan, France
- Institute of Biosciences, Department of Botany, Universidade de São Paulo, São Paulo, 11461 Brazil
| | - Elie Maza
- Laboratoire de Recherche en Sciences Végétales—Génomique et Biotechnologie des Fruits—UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, Toulouse, France
- Université de Toulouse, INRAe/INP Toulouse, Génomique et Biotechnologie des Fruits—UMR990, Castanet-Tolosan, France
| | - Anis Djari
- Laboratoire de Recherche en Sciences Végétales—Génomique et Biotechnologie des Fruits—UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, Toulouse, France
- Université de Toulouse, INRAe/INP Toulouse, Génomique et Biotechnologie des Fruits—UMR990, Castanet-Tolosan, France
| | - Guojian Hu
- Laboratoire de Recherche en Sciences Végétales—Génomique et Biotechnologie des Fruits—UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, Toulouse, France
- Université de Toulouse, INRAe/INP Toulouse, Génomique et Biotechnologie des Fruits—UMR990, Castanet-Tolosan, France
| | - Mingchun Liu
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan 610065, China
| | - Eduardo Purgatto
- Departamento de Alimentos e Nutrição Experimental, Faculdade de Ciências Farmacêuticas, Universidade de São Paulo, São Paulo, SP, Brazil
| | - Sylvie Fournier
- Metatoul-AgromiX platform, LRSV, Université de Toulouse, CNRS, UPS, Toulouse INP, France
- MetaboHUB-MetaToul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, 31077, France
| | - Farid Regad
- Laboratoire de Recherche en Sciences Végétales—Génomique et Biotechnologie des Fruits—UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, Toulouse, France
- Université de Toulouse, INRAe/INP Toulouse, Génomique et Biotechnologie des Fruits—UMR990, Castanet-Tolosan, France
| | - Mondher Bouzayen
- Laboratoire de Recherche en Sciences Végétales—Génomique et Biotechnologie des Fruits—UMR5546, Université de Toulouse, CNRS, UPS, Toulouse-INP, Toulouse, France
- Université de Toulouse, INRAe/INP Toulouse, Génomique et Biotechnologie des Fruits—UMR990, Castanet-Tolosan, France
|
12
|
Escorcia-Rodríguez JM, Gaytan-Nuñez E, Hernandez-Benitez EM, Zorro-Aranda A, Tello-Palencia MA, Freyre-González JA. Improving gene regulatory network inference and assessment: The importance of using network structure. Front Genet 2023; 14:1143382. [PMID: 36926589 PMCID: PMC10012345 DOI: 10.3389/fgene.2023.1143382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 02/20/2023] [Indexed: 03/03/2023] Open
Abstract
Gene regulatory networks are graph models representing cellular transcription events. Networks are far from complete due to time and resource consumption for experimental validation and curation of the interactions. Previous assessments have shown the modest performance of the available network inference methods based on gene expression data. Here, we study several caveats on the inference of regulatory networks and methods assessment through the quality of the input data and gold standard, and the assessment approach with a focus on the global structure of the network. We used synthetic and biological data for the predictions and experimentally-validated biological networks as the gold standard (ground truth). Standard performance metrics and graph structural properties suggest that methods inferring co-expression networks should no longer be assessed equally with those inferring regulatory interactions. While methods inferring regulatory interactions perform better in global regulatory network inference than co-expression-based methods, the latter is better suited to infer function-specific regulons and co-regulation networks. When merging expression data, the size increase should outweigh the noise inclusion and graph structure should be considered when integrating the inferences. We conclude with guidelines to take advantage of inference methods and their assessment based on the applications and available expression datasets.
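The "standard performance metrics" mentioned in this abstract are not spelled out in the listing. As a minimal sketch, assuming edge-level precision, recall and F1 against a gold-standard edge set are among them, an assessment could look like the following. The function name, gene labels and edges are hypothetical and are not taken from the paper.

```python
def assess_edges(predicted, gold):
    """Precision, recall and F1 of predicted (regulator, target) edges
    measured against a gold-standard edge set (illustrative helper)."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)  # edges present in both sets
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


# Hypothetical toy networks: directed (regulator, target) pairs.
gold = {("tfA", "g1"), ("tfA", "g2"), ("tfB", "g3")}
predicted = {("tfA", "g1"), ("tfB", "g3"), ("g1", "g2"), ("tfB", "g1")}

precision, recall, f1 = assess_edges(predicted, gold)
print(precision, recall, f1)
```

Note that the edge ("g1", "g2") is a co-expression-style link between two targets of the same regulator. Under a regulatory gold standard it counts as a false positive, which illustrates why the abstract argues that co-expression methods should not be scored the same way as methods inferring regulatory interactions.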
Affiliation(s)
- Juan M Escorcia-Rodríguez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Estefani Gaytan-Nuñez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Ericka M Hernandez-Benitez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Andrea Zorro-Aranda
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Department of Chemical Engineering, Universidad de Antioquia, Medellín, Colombia
| | - Marco A Tello-Palencia
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Julio A Freyre-González
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
|
13
|
Transcriptomic data analysis of melanocytes and melanoma cell lines of LAT transporter genes for precise medicine. BIO-ALGORITHMS AND MED-SYSTEMS 2022. [DOI: 10.2478/bioal-2022-0086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
Background: Boron Neutron Capture Therapy (BNCT) is a two-step treatment that can be used in some types of cancers. It involves administering a compound containing boron atoms to the patient and irradiating the affected area of the body with a neutron beam. The success of the therapy depends mainly on the delivery of the boron isotope (10B) to the tumor using an appropriate boron carrier. One of the boron carriers used is boronophenylalanine (BPA). Therefore, in research on the use of boron carriers, it is also important to know the mechanisms of their uptake by cells. Aim: To study the expression of LAT family genes, which are responsible for the transport of BPA into cells, in two melanoma cell lines (high melanotic WM115 and low melanotic WM266-4) and in melanocytes (HEMa-Lp). Methods: To normalize data from the transcriptomic analysis, the ratio of the median method was used. This allowed the samples to be compared with each other. Comparison metrics included log-fold change (LFC) values. The heatmap of LFC values and the cluster map were created. These graphs show the similarities and differences between the samples. Results: Transcriptomic data show that in melanocytes, LFC for SLC7A5 (LAT1) and SLC3A2 (4F2hc) was higher than in melanoma cell lines, which corresponded with their melanin content. Conclusion: Our results indicate overexpression of BPA transporter genes in normal cells (melanocytes), which may suggest a higher level of these proteins in melanocytes than in less melanotic melanoma cells. Therefore, for BNCT, the use of BPA as the 10B carrier will require additional qualifying tests of amino acid transporter expression for patients and specific tumors to develop a personalized BNCT.
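The Methods sentence above names a median-based ratio normalization followed by log-fold change (LFC) values, without giving details. The sketch below assumes this refers to the widely used median-of-ratios scheme, in which each sample's size factor is the median ratio of its counts to a per-gene geometric mean. The count matrix, the pseudocount and all function names are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np


def median_of_ratios_size_factors(counts):
    """Per-sample size factors for a genes x samples matrix of raw counts.

    Assumption: median-of-ratios normalization. Genes with a zero in any
    sample are skipped because their geometric mean is zero.
    """
    counts = np.asarray(counts, dtype=float)
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)  # per-gene reference
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


def log2_fold_change(norm_a, norm_b, pseudocount=1.0):
    """LFC of condition A versus B; the pseudocount (an assumption) avoids log(0)."""
    return np.log2((norm_a + pseudocount) / (norm_b + pseudocount))


# Hypothetical counts: rows are genes, columns are two samples,
# where the second sample was sequenced twice as deeply as the first.
counts = np.array([[100, 200], [50, 100], [30, 60], [0, 5]])
size_factors = median_of_ratios_size_factors(counts)
normalized = counts / size_factors
lfc = log2_fold_change(normalized[:, 0], normalized[:, 1])
print(size_factors, lfc)
```

In this toy input the first three genes differ between samples only by sequencing depth, so their LFC becomes zero after normalization, while the fourth gene keeps a genuine difference.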
|
14
|
Patel T, Hammelman J, Aziz S, Jang S, Closser M, Michaels TL, Blum JA, Gifford DK, Wichterle H. Transcriptional dynamics of murine motor neuron maturation in vivo and in vitro. Nat Commun 2022; 13:5427. [PMID: 36109497 PMCID: PMC9477853 DOI: 10.1038/s41467-022-33022-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Accepted: 08/25/2022] [Indexed: 12/03/2022] Open
Abstract
Neurons born in the embryo can undergo a protracted period of maturation lasting well into postnatal life. How gene expression changes are regulated during maturation and whether they can be recapitulated in cultured neurons remains poorly understood. Here, we show that mouse motor neurons exhibit pervasive changes in gene expression and accessibility of associated regulatory regions from embryonic till juvenile age. While motifs of selector transcription factors, ISL1 and LHX3, are enriched in nascent regulatory regions, motifs of NFI factors, activity-dependent factors, and hormone receptors become more prominent in maturation-dependent enhancers. Notably, stem cell-derived motor neurons recapitulate ~40% of the maturation expression program in vitro, with neural activity playing only a modest role as a late-stage modulator. Thus, the genetic maturation program consists of a core hardwired subprogram that is correctly executed in vitro and an extrinsically-controlled subprogram that is dependent on the in vivo context of the maturing organism.
Affiliation(s)
- Tulsi Patel
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA.
| | - Jennifer Hammelman
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, 02139, USA
| | - Siaresh Aziz
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Sumin Jang
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Michael Closser
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Theodore L Michaels
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Jacob A Blum
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - David K Gifford
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, 02139, USA
| | - Hynek Wichterle
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA.
|
15
|
Athanasopoulou K, Adamopoulos PG, Daneva GN, Scorilas A. Decoding the concealed transcriptional signature of the apoptosis-related BCL2 antagonist/killer 1 (BAK1) gene in human malignancies. Apoptosis 2022; 27:869-882. [DOI: 10.1007/s10495-022-01753-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/11/2022] [Indexed: 11/29/2022]
|
16
|
Athanasopoulou K, Adamopoulos PG, Scorilas A. Structural characterization and expression analysis of novel MAPK1 transcript variants with the development of a multiplexed targeted nanopore sequencing approach. Int J Biochem Cell Biol 2022; 150:106272. [PMID: 35878809 DOI: 10.1016/j.biocel.2022.106272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 06/17/2022] [Accepted: 07/21/2022] [Indexed: 11/27/2022]
Abstract
Mitogen-activated protein kinases (MAPKs) represent a protein family firmly involved in many signaling cascades, regulating a vast spectrum of stimulated cellular processes. Studies have shown that alternatively spliced isoforms of MAPKs play a crucial role in determining the desired cell fate in response to specific stimulations. Although the implication of most MAPKs transcript variants in the MAPK signaling cascades has been clarified, the transcriptional profile of a pivotal member, MAPK1, has not been investigated for the existence of additional isoforms. In the current study we developed and implemented targeted long-read and short-read sequencing approaches to identify novel MAPK1 splice variants. The combination of nanopore sequencing and NGS enabled the implementation of a long-read polishing pipeline using error-rate correction algorithms, which empowered the high accuracy of the results and increased the sequencing efficiency. The utilized multiplexing option in the nanopore sequencing approach allowed not only the identification of novel MAPK1 mRNAs, but also elucidated their expression profile in multiple human malignancies and non-cancerous cell lines. Our study highlights for the first time the existence of ten previously undescribed MAPK1 mRNAs (MAPK1 v.3 - v.12) and evaluates their relative expression levels in comparison to the main MAPK1 v.1. The optimization and employment of qPCR assays revealed that MAPK1 v.3 - v.12 can be quantified in a wide spectrum of human cell lines with notable specificity. Finally, our findings suggest that the novel protein-coding mRNAs are highly expected to participate in the regulation of MAPK pathways, demonstrating differential localizations and functionalities.
Affiliation(s)
- Konstantina Athanasopoulou
- Department of Biochemistry and Molecular Biology, National and Kapodistrian University of Athens, Athens, Greece
| | - Panagiotis G Adamopoulos
- Department of Biochemistry and Molecular Biology, National and Kapodistrian University of Athens, Athens, Greece
| | - Andreas Scorilas
- Department of Biochemistry and Molecular Biology, National and Kapodistrian University of Athens, Athens, Greece.
|
17
|
Huang HC, Wu Y, Yang Q, Qin LX. PRECISION.array: An R Package for Benchmarking microRNA Array Data Normalization in the Context of Sample Classification. Front Genet 2022; 13:838679. [PMID: 35938023 PMCID: PMC9354575 DOI: 10.3389/fgene.2022.838679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2021] [Accepted: 06/10/2022] [Indexed: 11/13/2022] Open
Abstract
We present a new R package PRECISION.array for assessing the performance of data normalization methods in connection with methods for sample classification. It includes two microRNA microarray datasets for the same set of tumor samples, a re-sampling-based algorithm for simulating additional paired datasets under various designs of sample-to-array assignment and levels of signal-to-noise ratios, and a collection of numerical and graphical tools for method performance assessment. The package allows users to specify their own methods for normalization and classification, in addition to implementing three methods for training data normalization, seven methods for test data normalization, seven methods for classifier training, and two methods for classifier validation. It enables an objective and systematic evaluation of the operating characteristics of normalization and classification methods in microRNA microarrays. To our knowledge, this is the first such tool available. The R package can be downloaded freely at https://github.com/LXQin/PRECISION.array.
|
18
|
Charles S, Sreekumar J, Natarajan J. Transcriptomic meta-analysis reveals biomarker pairs and key pathways in Tetralogy of Fallot. J Bioinform Comput Biol 2022; 20:2240004. [DOI: 10.1142/s0219720022400042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
|
19
|
Nazmul Hasan M, Islam S, Bhuiyan FH, Arefin S, Hoque H, Azad Jewel N, Ghosh A, Prodhan SH. Genome wide analysis of the heavy-metal-associated (HMA) gene family in tomato and expression profiles under different stresses. Gene X 2022; 835:146664. [PMID: 35691406 DOI: 10.1016/j.gene.2022.146664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Revised: 05/24/2022] [Accepted: 06/06/2022] [Indexed: 11/04/2022] Open
Abstract
The heavy-metal-associated (HMA) family plays a major role in the transportation of metals. Although the genome sequence of tomato (Solanum lycopersicum) is available, its HMA gene family has not yet been studied. In this study, we identified 48 HMA genes and categorized them into Cu/Ag P1B-ATPase and Zn/Co/Cd/Pb P1B-ATPase sub-families according to their phylogenetic relationship with Arabidopsis and rice. The SlHMA genes were distributed throughout the 12 chromosomes. Analysis of gene structure, chromosomal position, and synteny revealed that segmental duplications contributed to their evolution. The high number of stress-related cis-elements found in the putative promoter regions indicates the involvement of SlHMAs in stress modulation pathways. RNA-seq data revealed that SlHMAs had divergent expression in different tissues and developmental stages, where members of the Cu/Ag P1B-ATPase subfamily were strongly expressed in the roots. RT-qPCR analysis of nine selected SlHMAs showed that most of the genes were up-regulated in response to heavy metals and moderately regulated in response to different abiotic stresses such as salt, drought, and cold.
Affiliation(s)
- Md Nazmul Hasan
- Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
| | - Shiful Islam
- Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh; Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Fahmid H Bhuiyan
- Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh; Plant Biotechnology Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
| | - Shahrear Arefin
- Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
| | - Hammadul Hoque
- Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh.
| | - Nurnabi Azad Jewel
- Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh.
| | - Ajit Ghosh
- Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh.
| | - Shamsul H Prodhan
- Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh.
|
20
|
Zou J, Düren Y, Qin LX. PRECISION.seq: An R Package for Benchmarking Depth Normalization in microRNA Sequencing. Front Genet 2022; 12:823431. [PMID: 35154266 PMCID: PMC8832140 DOI: 10.3389/fgene.2021.823431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2021] [Accepted: 12/31/2021] [Indexed: 11/13/2022] Open
Abstract
We present a new R package PRECISION.seq for assessing the performance of depth normalization in microRNA sequencing data. It provides a pair of microRNA sequencing data sets for the same set of tumor samples, additional pairs of data sets simulated by re-sampling under various patterns of differential expression, and a collection of numerical and graphical tools for assessing the performance of normalization methods. Users can easily assess their chosen normalization method and compare its performance to nine methods already included in the package. PRECISION.seq enables an objective and systematic evaluation of normalization methods in microRNA sequencing using realistically distributed and robustly benchmarked data under a wide range of differential expression patterns. To the best of our knowledge, this is the first such tool available. The data sets and source code of the R package can be found at https://github.com/LXQin/PRECISION.seq.
Affiliation(s)
- Jian Zou
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, United States
| | - Yannick Düren
- Department of Mathematics, Ruhr-University Bochum, Bochum, Germany
| | - Li-Xuan Qin
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, United States
- *Correspondence: Li-Xuan Qin,
|
21
|
Graf J, Cho S, McDonough E, Corwin A, Sood A, Lindner A, Salvucci M, Stachtea X, Van Schaeybroeck S, Dunne PD, Laurent-Puig P, Longley D, Prehn JHM, Ginty F. FLINO: a new method for immunofluorescence bioimage normalization. Bioinformatics 2022; 38:520-526. [PMID: 34601553 PMCID: PMC8723144 DOI: 10.1093/bioinformatics/btab686] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/13/2021] [Revised: 09/09/2021] [Accepted: 09/25/2021] [Indexed: 02/03/2023] Open
Abstract
Motivation: Multiplexed immunofluorescence bioimaging of single cells and their spatial organization in tissue holds great promise for the development of future precision diagnostics and therapeutics. Current multiplexing pipelines typically involve multiple rounds of immunofluorescence staining across multiple tissue slides. This introduces experimental batch effects that can hide underlying biological signal. It is important to have robust algorithms that can correct for the batch effects while not introducing biases into the data. Performance of data normalization methods can vary among different assay pipelines. To evaluate differences, it is critical to have a ground truth dataset that is representative of the assay. Results: A new immunoFLuorescence Image NOrmalization method is presented and evaluated against alternative methods and workflows. Multiround immunofluorescence staining of the same tissue with the nuclear dye DAPI was used to represent virtual slides and a ground truth. DAPI was restained on a given tissue slide producing multiple images of the same underlying structure but undergoing multiple representative tissue handling steps. This ground truth dataset was used to evaluate and compare multiple normalization methods including median, quantile, smooth quantile, median ratio normalization and trimmed mean of the M-values. These methods were applied in both an unbiased grid object and segmented cell object workflow to 24 multiplexed biomarkers. An upper quartile normalization of grid objects in log space was found to obtain almost equivalent performance to directly normalizing segmented cell objects by the middle quantile. The developed grid-based technique was then applied with on-slide controls for evaluation. Using five or fewer controls per slide can introduce biases into the data. Ten or more on-slide controls were able to robustly correct for batch effects.
Availability and implementation: The data underlying this article along with the FLINO R-scripts used to perform the evaluation of image normalization methods and workflows can be downloaded from https://github.com/GE-Bio/FLINO. Supplementary information: Supplementary data are available at Bioinformatics online.
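The abstract reports that an upper quartile normalization of grid objects in log space performed well, but gives no formula. The following is a minimal sketch of one plausible reading: intensities are log-transformed per slide, and each slide is shifted so that its 75th percentile matches a shared reference. The offset, the choice of reference (mean of the per-slide upper quartiles) and the slide data are assumptions, and this is not the FLINO R implementation.

```python
import numpy as np


def upper_quartile_normalize_log(intensities_by_slide, offset=1.0):
    """Align the log2-space upper quartile of every slide to a common value.

    intensities_by_slide maps a slide name to the intensities of its grid
    objects. The offset (an assumption) keeps log2 defined for zeros.
    """
    logged = {
        slide: np.log2(np.asarray(values, dtype=float) + offset)
        for slide, values in intensities_by_slide.items()
    }
    q75 = {slide: np.percentile(values, 75) for slide, values in logged.items()}
    reference = np.mean(list(q75.values()))  # shared target quartile
    return {slide: values - q75[slide] + reference for slide, values in logged.items()}


# Hypothetical slides: slide_2 is a uniformly brighter copy of slide_1,
# mimicking a multiplicative batch effect.
slides = {
    "slide_1": [1, 3, 7, 15, 31],
    "slide_2": [3, 7, 15, 31, 63],
}
normalized = upper_quartile_normalize_log(slides)
print(normalized["slide_1"], normalized["slide_2"])
```

A shift in log space corresponds to a multiplicative rescaling of the raw intensities, which is why a single per-slide quartile can remove a uniform brightness difference like the one in this toy input.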
Affiliation(s)
- John Graf
- To whom correspondence should be addressed.
| | - Sanghee Cho
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Elizabeth McDonough
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Alex Corwin
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Anup Sood
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Andreas Lindner
- Department of Physiology and Medical Physics, Centre of Systems Medicine, Royal College of Surgeons in Ireland University of Medicine and Health Sciences, 123 St. Stephen’s Green, Dublin 2, Ireland
| | - Manuela Salvucci
- Department of Physiology and Medical Physics, Centre of Systems Medicine, Royal College of Surgeons in Ireland University of Medicine and Health Sciences, 123 St. Stephen’s Green, Dublin 2, Ireland
| | - Xanthi Stachtea
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Sandra Van Schaeybroeck
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Philip D Dunne
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Pierre Laurent-Puig
- Department of Biology, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, 3 Av. Victoria, 75004 Paris, France
| | - Daniel Longley
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Jochen H M Prehn
- Department of Physiology and Medical Physics, Centre of Systems Medicine, Royal College of Surgeons in Ireland University of Medicine and Health Sciences, 123 St. Stephen’s Green, Dublin 2, Ireland
| | - Fiona Ginty
- To whom correspondence should be addressed.
|
22
|
Johnson KA, Krishnan A. Robust normalization and transformation techniques for constructing gene coexpression networks from RNA-seq data. Genome Biol 2022; 23:1. [PMID: 34980209 PMCID: PMC8721966 DOI: 10.1186/s13059-021-02568-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Received: 09/28/2020] [Accepted: 12/06/2021] [Indexed: 12/13/2022] Open
Abstract
Background: Constructing gene coexpression networks is a powerful approach for analyzing high-throughput gene expression data towards module identification, gene function prediction, and disease-gene prioritization. While optimal workflows for constructing coexpression networks, including good choices for data pre-processing, normalization, and network transformation, have been developed for microarray-based expression data, such well-tested choices do not exist for RNA-seq data. Almost all studies that compare data processing and normalization methods for RNA-seq focus on the end goal of determining differential gene expression. Results: Here, we present a comprehensive benchmarking and analysis of 36 different workflows, each with a unique set of normalization and network transformation methods, for constructing coexpression networks from RNA-seq datasets. We test these workflows on both large, homogenous datasets and small, heterogeneous datasets from various labs. We analyze the workflows in terms of aggregate performance, individual method choices, and the impact of multiple dataset experimental factors. Our results demonstrate that between-sample normalization has the biggest impact, with counts adjusted by size factors producing networks that most accurately recapitulate known tissue-naive and tissue-aware gene functional relationships. Conclusions: Based on this work, we provide concrete recommendations on robust procedures for building an accurate coexpression network from an RNA-seq dataset. In addition, researchers can examine all the results in great detail at https://krishnanlab.github.io/RNAseq_coexpression to make appropriate choices for coexpression analysis based on the experimental factors of their RNA-seq dataset.
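To make "counts adjusted by size factors" concrete, the sketch below divides each sample's counts by a per-sample size factor, applies a log transform, and takes the gene-by-gene Pearson correlation as the coexpression matrix. How the size factors are estimated, the log2(x + 1) transform and the use of Pearson correlation are assumptions made for illustration; the abstract does not specify them, and the benchmarked workflows may differ.

```python
import numpy as np


def coexpression_matrix(counts, size_factors):
    """Gene x gene Pearson correlation of size-factor-adjusted, log2 counts.

    counts is a genes x samples matrix; size_factors holds one value per
    sample and is assumed to come from any between-sample normalization.
    """
    counts = np.asarray(counts, dtype=float)
    adjusted = counts / np.asarray(size_factors, dtype=float)  # scales each sample column
    transformed = np.log2(adjusted + 1.0)
    return np.corrcoef(transformed)  # rows (genes) are treated as variables


# Hypothetical data: the last two samples were sequenced twice as deeply.
counts = np.array([
    [10, 20, 80, 160],   # gene rising across samples
    [12, 22, 76, 170],   # gene with a similar rising profile
    [90, 40, 40, 20],    # gene falling across samples
])
size_factors = np.array([1.0, 1.0, 2.0, 2.0])

corr = coexpression_matrix(counts, size_factors)
print(np.round(corr, 3))
```

Thresholding or otherwise transforming this correlation matrix would then yield the network itself; that step is left out here because the abstract treats network transformation as a separate workflow choice.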
Affiliation(s)
- Kayla A Johnson
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Arjun Krishnan
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.
|
23
|
Song J, Bian J, Xue N, Xu Y, Wu J. Inter-species mRNA transfer among green peach aphids, dodder parasites, and cucumber host plants. PLANT DIVERSITY 2022; 44:1-10. [PMID: 35281124 PMCID: PMC8897176 DOI: 10.1016/j.pld.2021.03.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/03/2021] [Accepted: 03/26/2021] [Indexed: 05/28/2023]
Abstract
mRNAs are transported within a plant through phloem. Aphids are phloem feeders and dodders (Cuscuta spp.) are parasites which establish phloem connections with host plants. It is unclear whether mRNAs are trafficked among aphids, dodders, and host plants when aphids feed on dodders, and whether aphid feeding affects mRNA transfer between dodders and hosts. We constructed a green peach aphid (GPA, Myzus persicae)-dodder (Cuscuta australis)-cucumber (Cucumis sativus) tritrophic system by infesting C. australis, which parasitized cucumber hosts, with GPAs. We found that GPA feeding activated defense-related phytohormonal and transcriptomic responses in both C. australis and cucumbers, and that large numbers of mRNAs were transferred between C. australis and cucumbers and between C. australis and GPAs; importantly, GPA feeding on C. australis greatly altered inter-species mobile mRNA profiles. Furthermore, three cucumber mRNAs and three GPA mRNAs could be detected in GPAs and cucumbers, respectively. Moreover, our statistical analysis indicated that mRNAs with high abundances and long transcript lengths are likely to be mobile. This study reveals the existence of inter-species and even inter-kingdom mRNA movement among insects, parasitic plants, and parasite hosts, and suggests complex regulation of mRNA trafficking.
Affiliation(s)
- Juan Song
- Department of Economic Plants and Biotechnology, Yunnan Key Laboratory for Wild Plant Resources, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jinge Bian
- Department of Economic Plants and Biotechnology, Yunnan Key Laboratory for Wild Plant Resources, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Na Xue
- Department of Economic Plants and Biotechnology, Yunnan Key Laboratory for Wild Plant Resources, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yuxing Xu
- Department of Economic Plants and Biotechnology, Yunnan Key Laboratory for Wild Plant Resources, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jianqiang Wu
- Department of Economic Plants and Biotechnology, Yunnan Key Laboratory for Wild Plant Resources, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
| |
Collapse
|
24
|
Tran DT, Might M. cdev: a ground-truth based measure to evaluate RNA-seq normalization performance. PeerJ 2021; 9:e12233. [PMID: 34707933 PMCID: PMC8496462 DOI: 10.7717/peerj.12233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2021] [Accepted: 09/09/2021] [Indexed: 11/28/2022] Open
Abstract
Normalization of RNA-seq data has been an active area of research since the problem was first recognized a decade ago. Despite the active development of new normalizers, their performance measures have been given little attention. To evaluate normalizers, researchers have been relying on ad hoc measures, most of which are either qualitative, potentially biased, or easily confounded by parametric choices of downstream analysis. We propose a metric called condition-number based deviation, or cdev, to quantify normalization success. cdev measures how much an expression matrix differs from another. If a ground truth normalization is given, cdev can then be used to evaluate the performance of normalizers. To establish experimental ground truth, we compiled an extensive set of public RNA-seq assays with external spike-ins. This data collection, together with cdev, provides a valuable toolset for benchmarking new and existing normalization methods.
Affiliation(s)
- Diem-Trang Tran
  - School of Computing, University of Utah, Salt Lake City, UT, United States of America
- Matthew Might
  - Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, United States of America
|
25
|
Osabe T, Shimizu K, Kadota K. Differential expression analysis using a model-based gene clustering algorithm for RNA-seq data. BMC Bioinformatics 2021; 22:511. [PMID: 34670485 PMCID: PMC8527798 DOI: 10.1186/s12859-021-04438-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/24/2020] [Accepted: 10/11/2021] [Indexed: 11/10/2022] Open
Abstract
Background: RNA-seq is a tool for measuring gene expression and is commonly used to identify differentially expressed genes (DEGs). Gene clustering is used to classify DEGs with similar expression patterns for the subsequent analyses of data from experiments such as time-courses or multi-group comparisons. However, gene clustering has rarely been used for analyzing simple two-group data or differential expression (DE). In this study, we report that a model-based clustering algorithm implemented in an R package, MBCluster.Seq, can also be used for DE analysis. Results: The input data originally used by MBCluster.Seq are DEGs, whereas the proposed method (called MBCdeg) uses all genes for the analysis. The method ranks genes overall by their posterior probability of being assigned to a cluster that displays a non-DEG pattern. We compared the performance of MBCdeg with conventional R packages such as edgeR, DESeq2, and TCC that are specialized for DE analysis using simulated and real data. Our results showed that MBCdeg outperformed other methods when the proportion of DEG (PDEG) was less than 50%. However, the DEG identification using MBCdeg was less consistent than with conventional methods. We compared the effects of different normalization algorithms using MBCdeg, and performed an analysis using MBCdeg in combination with a robust normalization algorithm (called DEGES) that was not implemented in MBCluster.Seq. The new analysis method showed greater stability than using the original MBCdeg with the default normalization algorithm. Conclusions: MBCdeg with DEGES normalization can be used in the identification of DEGs when the PDEG is relatively low. As the method is based on gene clustering, the DE result includes information on which expression pattern the gene belongs to. The new method may be useful for the analysis of time-course and multi-group data, where the classification of expression patterns is often required.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04438-4.
Affiliation(s)
- Takayuki Osabe
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
- Kentaro Shimizu
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
  - Interfaculty Initiative in Information Studies, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-0033, Japan
- Koji Kadota
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
  - Interfaculty Initiative in Information Studies, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-0033, Japan
|
26
|
Althiab-Almasaud R, Chen Y, Maza E, Djari A, Frasse P, Mollet JC, Mazars C, Jamet E, Chervin C. Ethylene signaling modulates tomato pollen tube growth through modifications of cell wall remodeling and calcium gradient. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2021; 107:893-908. [PMID: 34036648 DOI: 10.1111/tpj.15353] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/19/2021] [Revised: 05/14/2021] [Accepted: 05/18/2021] [Indexed: 06/12/2023]
Abstract
Ethylene modulates plant developmental processes including flower development. Previous studies have suggested ethylene participates in pollen tube (PT) elongation, and both ethylene production and perception seem critical at the time of fertilization. The full gene set regulated by ethylene during PT growth is unknown. To study this, we used various EThylene Receptor (ETR) tomato (Solanum lycopersicum) mutants: etr3-ko, a loss-of-function (LOF) mutant; and NR (NEVER RIPE), a gain-of-function (GOF) mutant. The etr3-ko PTs grew faster than wild-type (WT) PTs. Conversely, NR PT elongation was slower than in WT, and PTs displayed larger diameters. ETR mutations result in feedback control of ethylene production. Furthermore, ethylene treatment of germinating pollen grains increased PT length in etr-ko mutants and WT, but not in NR. Treatment with the ethylene perception inhibitor 1-methylcyclopropene decreased PT length in etr-ko mutants and WT, but had no effect on NR. This confirmed that ethylene regulates PT growth. The comparison of PT transcriptomes in LOF and GOF mutants, etr3-ko and NR, both harboring mutations of the ETR3 gene, revealed that ethylene perception has major impacts on cell wall- and calcium-related genes, as confirmed by microscopic observations showing a modified distribution of the methylesterified homogalacturonan pectic motif and of calcium load. Our results establish links between PT growth, ethylene, calcium, and cell wall metabolism, and also constitute a transcriptomic resource.
Affiliation(s)
- Rasha Althiab-Almasaud
  - Laboratoire de Génomique et Biotechnologie des Fruits, Université de Toulouse, Toulouse INP-ENSAT, INRAE, Auzeville-Tolosane, France
- Yi Chen
  - Laboratoire de Génomique et Biotechnologie des Fruits, Université de Toulouse, Toulouse INP-ENSAT, INRAE, Auzeville-Tolosane, France
  - College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, China
- Elie Maza
  - Laboratoire de Génomique et Biotechnologie des Fruits, Université de Toulouse, Toulouse INP-ENSAT, INRAE, Auzeville-Tolosane, France
- Anis Djari
  - Laboratoire de Génomique et Biotechnologie des Fruits, Université de Toulouse, Toulouse INP-ENSAT, INRAE, Auzeville-Tolosane, France
- Pierre Frasse
  - Laboratoire de Génomique et Biotechnologie des Fruits, Université de Toulouse, Toulouse INP-ENSAT, INRAE, Auzeville-Tolosane, France
- Jean-Claude Mollet
  - Laboratoire Glyco-MEV, SFR NORVEGE, Innovation Chimie Carnot, Normandie Univ, UniRouen, Rouen, France
- Christian Mazars
  - Laboratoire de Recherche en Sciences Végétales, Université de Toulouse, CNRS, UPS, Auzeville-Tolosane, France
- Elisabeth Jamet
  - Laboratoire de Recherche en Sciences Végétales, Université de Toulouse, CNRS, UPS, Auzeville-Tolosane, France
- Christian Chervin
  - Laboratoire de Génomique et Biotechnologie des Fruits, Université de Toulouse, Toulouse INP-ENSAT, INRAE, Auzeville-Tolosane, France
|
27
|
Yang J, Wang D, Yang Y, Yang W, Jin W, Niu X, Gong J. A systematic comparison of normalization methods for eQTL analysis. Brief Bioinform 2021; 22:6278608. [PMID: 34015824 DOI: 10.1093/bib/bbab193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2021] [Revised: 04/14/2021] [Accepted: 04/28/2021] [Indexed: 11/15/2022] Open
Abstract
Expression quantitative trait loci (eQTL) analysis has been widely used in interpreting disease-associated loci through correlating genetic variant loci with the expression of specific genes. RNA-sequencing (RNA-seq), which can quantify gene expression at the genome-wide level, is often used in eQTL identification. Since different normalization methods of gene expression have substantial impacts on RNA-seq downstream analysis, it is necessary to systematically compare the effects of these methods on eQTL identification. Here, by using RNA-seq and genotype data of four different cancers in The Cancer Genome Atlas (TCGA) database, we comprehensively evaluated the effect of eight commonly used normalization methods on eQTL identification. Our results showed that the application of different methods could cause 20-30% differences in the final results of eQTL identification. Among these methods, COUNT, Median of Ratio (MED) and Trimmed Mean of M-values (TMM) generated similar results for identifying eQTLs, while Fragments Per Kilobase Million (FPKM) or RANK produced results that diverged more from those of the other methods. Based on the accuracy and receiver operating characteristic (ROC) curve, the TMM method was found to be the optimal method for normalizing gene expression data in eQTL analysis. In addition, we also evaluated the performance of different pairwise combinations of these methods. Compared with single normalization methods, the combination of methods can not only identify more cis-eQTLs, but also improve the performance of the ROC curve. Overall, this study provides a comprehensive comparison of normalization methods for identifying eQTLs from RNA-seq data, and proposes some practical recommendations for diverse scenarios.
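Two of the normalization families named in this abstract can be illustrated with a minimal sketch. It assumes that "Median of Ratio" refers to the DESeq-style median-of-ratios size factor and that FPKM is computed against the per-sample total count; the study's own implementations are not given here, and the function names and toy counts below are hypothetical.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Per-sample size factors from a genes x samples list of raw counts.

    Assumption: DESeq-style median of ratios. Genes with a zero count in
    any sample are skipped so that the log geometric mean is defined.
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has nonzero counts in every sample")
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def fpkm(counts, gene_lengths_bp):
    """FPKM = count * 1e9 / (gene length in bp * total counts in the sample)."""
    n_samples = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * 1e9 / (length * totals[j]) for j in range(n_samples)]
            for row, length in zip(counts, gene_lengths_bp)]


# Hypothetical toy data: the second sample was sequenced twice as deeply.
toy_counts = [[10, 20], [100, 200], [50, 100]]
toy_lengths = [1000, 2000, 500]
size_factors = median_of_ratios_size_factors(toy_counts)
fpkm_values = fpkm(toy_counts, toy_lengths)
```

In the toy data every count in the second sample is exactly double the first, so the second size factor is twice the first, and the FPKM values of each gene are the same in both samples.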
Affiliation(s)
- Jiajun Yang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Dongyang Wang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Yanbo Yang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Wenqian Yang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Weiwei Jin
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Xiaohui Niu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Jing Gong
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
  - College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, P. R. China
|
28
|
Chung M, Bruno VM, Rasko DA, Cuomo CA, Muñoz JF, Livny J, Shetty AC, Mahurkar A, Dunning Hotopp JC. Best practices on the differential expression analysis of multi-species RNA-seq. Genome Biol 2021; 22:121. [PMID: 33926528 PMCID: PMC8082843 DOI: 10.1186/s13059-021-02337-8] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 10/21/2020] [Accepted: 04/01/2021] [Indexed: 02/07/2023] Open
Abstract
Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.
Affiliation(s)
- Matthew Chung
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Vincent M Bruno
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- David A Rasko
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Christina A Cuomo
  - Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA, 02142, USA
- José F Muñoz
  - Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA, 02142, USA
- Jonathan Livny
  - Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA, 02142, USA
- Amol C Shetty
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Anup Mahurkar
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Julie C Dunning Hotopp
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
  - Greenebaum Cancer Center, University of Maryland, Baltimore, MD, 21201, USA
|
29
|
Cui W, Xue H, Wei L, Jin J, Tian X, Wang Q. High heterogeneity undermines generalization of differential expression results in RNA-Seq analysis. Hum Genomics 2021; 15:7. [PMID: 33509298 PMCID: PMC7845028 DOI: 10.1186/s40246-021-00308-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/26/2020] [Accepted: 01/19/2021] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible. RESULTS: Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis. CONCLUSIONS: High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability and DE results should be interpreted cautiously unless soundly validated.
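One simple way to put a number on the reproducibility of DEG lists is the overlap between lists obtained from different replicate subsets. The sketch below uses the Jaccard index for that purpose; this is an illustrative choice, not necessarily the measure used in the study, and the function names and gene identifiers are hypothetical.

```python
from itertools import combinations


def jaccard_index(degs_a, degs_b):
    """Shared DEGs divided by all DEGs reported in either list."""
    a, b = set(degs_a), set(degs_b)
    if not a and not b:
        return 1.0  # two empty lists agree trivially
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(deg_lists):
    """Average Jaccard index over every pair of DEG lists."""
    pairs = list(combinations(deg_lists, 2))
    if not pairs:
        raise ValueError("need at least two DEG lists")
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)


# Hypothetical DEG calls from three random subsets of replicates.
subset_calls = [["g1", "g2"], ["g2", "g3"], ["g1", "g2"]]
reproducibility = mean_pairwise_jaccard(subset_calls)
```

A value near 1 means that different replicate subsets recover nearly the same genes; a value near 0 means the DEG list depends mostly on which samples were drawn.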
Affiliation(s)
- Weitong Cui
  - Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, 255300, China
- Huaru Xue
  - Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, 255300, China
- Lei Wei
  - Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, 255300, China
- Jinghua Jin
  - Environmental Protection Research Institute of Light Industry, Beijing, 100089, China
- Xuewen Tian
  - Shandong Sport University, Jinan, 250102, China
- Qinglu Wang
  - Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, 255300, China
|
30
|
Systematic comparison and assessment of RNA-seq procedures for gene expression quantitative analysis. Sci Rep 2020; 10:19737. [PMID: 33184454 PMCID: PMC7665074 DOI: 10.1038/s41598-020-76881-x] [Citation(s) in RCA: 85] [Impact Index Per Article: 21.3] [Received: 05/15/2020] [Accepted: 11/03/2020] [Indexed: 01/16/2023] Open
Abstract
RNA-seq is currently considered the most powerful, robust and adaptable technique for measuring gene expression and transcription activation at genome-wide level. As the analysis of RNA-seq data is complex, it has prompted a large amount of research on algorithms and methods. This has resulted in a substantial increase in the number of options available at each step of the analysis. Consequently, there is no clear consensus about the most appropriate algorithms and pipelines that should be used to analyse RNA-seq data. In the present study, 192 pipelines using alternative methods were applied to 18 samples from two human cell lines and the performance of the results was evaluated. Raw gene expression signal was quantified by non-parametric statistics to measure precision and accuracy. Differential gene expression performance was estimated by testing 17 differential expression methods. The procedures were validated by qRT-PCR in the same samples. This study weighs up the advantages and disadvantages of the tested algorithms and pipelines providing a comprehensive guide to the different methods and procedures applied to the analysis of RNA-seq data, both for the quantification of the raw expression signal and for the differential gene expression.
|
31
|
Tong L, Wu PY, Phan JH, Hassazadeh HR, Tong W, Wang MD. Impact of RNA-seq data analysis algorithms on gene expression estimation and downstream prediction. Sci Rep 2020; 10:17925. [PMID: 33087762 PMCID: PMC7578822 DOI: 10.1038/s41598-020-74567-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/22/2019] [Accepted: 08/27/2020] [Indexed: 11/23/2022] Open
Abstract
To use next-generation sequencing technology such as RNA-seq for medical and health applications, choosing proper analysis methods for biomarker identification remains a critical challenge for most users. The US Food and Drug Administration (FDA) has led the Sequencing Quality Control (SEQC) project to conduct a comprehensive investigation of 278 representative RNA-seq data analysis pipelines consisting of 13 sequence mapping, three quantification, and seven normalization methods. In this article, we focused on the impact of the joint effects of RNA-seq pipelines on gene expression estimation as well as the downstream prediction of disease outcomes. First, we developed and applied three metrics (i.e., accuracy, precision, and reliability) to quantitatively evaluate each pipeline's performance on gene expression estimation. We then investigated the correlation between the proposed metrics and the downstream prediction performance using two real-world cancer datasets (i.e., SEQC neuroblastoma dataset and the NIH/NCI TCGA lung adenocarcinoma dataset). We found that RNA-seq pipeline components jointly and significantly impacted the accuracy of gene expression estimation, and its impact was extended to the downstream prediction of these cancer outcomes. Specifically, RNA-seq pipelines that produced more accurate, precise, and reliable gene expression estimation tended to perform better in the prediction of disease outcome. In the end, we provided scenarios as guidelines for users to use these three metrics to select sensible RNA-seq pipelines for the improved accuracy, precision, and reliability of gene expression estimation, which lead to the improved downstream gene expression-based prediction of disease outcome.
Affiliation(s)
- Li Tong
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Po-Yen Wu
  - School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- John H Phan
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Hamid R Hassazadeh
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Weida Tong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- May D Wang
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
|
32
|
Zandi E, Ayatollahi Mehrgardi A, Esmailizadeh A. Mammary tissue transcriptomic analysis for construction of integrated regulatory networks involved in lactogenesis of Ovis aries. Genomics 2020; 112:4277-4287. [PMID: 32693106 DOI: 10.1016/j.ygeno.2020.07.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2020] [Revised: 06/19/2020] [Accepted: 07/13/2020] [Indexed: 10/23/2022]
Abstract
The mammary gland undergoes extensive changes between the onset of lactation and pregnancy. This remodeling involves functions such as lactation, which is controlled by numerous regulators and gene networks that are still not completely understood. MicroRNAs (miRNAs) are important non-coding gene regulators that control an extensive range of biological processes. Thus, exploring miRNA functions is important for resolving the complexity of gene regulation. The main purpose of the present study was to identify the integrated gene regulatory networks involved in the progress of lactation in the mammary gland. We analyzed ovine mammary tissue data sets that included expression profiles of mRNAs (genes) and miRNAs from six ewes on different days of lactation and under different nutritional treatments. We combined two types of information: module network inference from the expression matrices of mRNAs (RNA-seq data), miRNAs and transcription factors (TFs), and computational prediction of targets. To discover the regulatory function of the miRNAs, 134 modules were predicted from gene expression data, and 14 TFs and 20 miRNAs were allocated to these predicted modules. By applying this integrated computation-based method, 38 miRNA-module and 35 TF-module interactions were identified from ovine mammary tissue data during lactogenesis. Many of these modules were involved in lipid and protein metabolism, as well as steroid and vitamin biosynthesis, which would play key roles in mammary tissue and lactation development. These results present new information about the regulatory processes at the miRNA and TF levels throughout lactation.
Affiliation(s)
- Elmira Zandi
  - Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman, PB 76169-133, Iran
  - Young Researchers Society, Shahid Bahonar University of Kerman, PB 76169-133, Kerman, Iran
- Ahmad Ayatollahi Mehrgardi
  - Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman, PB 76169-133, Iran
- Ali Esmailizadeh
  - Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman, PB 76169-133, Iran
|
33
|
Dissanayake TK, Schäuble S, Mirhakkak MH, Wu WL, Ng ACK, Yip CCY, López AG, Wolf T, Yeung ML, Chan KH, Yuen KY, Panagiotou G, To KKW. Comparative Transcriptomic Analysis of Rhinovirus and Influenza Virus Infection. Front Microbiol 2020; 11:1580. [PMID: 32849329 PMCID: PMC7396524 DOI: 10.3389/fmicb.2020.01580] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/28/2020] [Accepted: 06/17/2020] [Indexed: 12/15/2022] Open
Abstract
Rhinovirus (RV) and influenza virus are the most frequently detected respiratory viruses among adult patients with community-acquired pneumonia. Previous clinical studies have identified major differences in the clinical presentations and inflammatory or immune response during these infections. A systematic transcriptomic analysis directly comparing influenza and RV is lacking. Here, we sought to compare the transcriptomic response to these viral infections. Human airway epithelial Calu-3 cells were infected with contemporary clinical isolates of RV, influenza A virus (IAV), or influenza B virus (IBV). Host gene expression was determined using RNA-seq. Differentially expressed genes (DEGs) with respect to mock-infected cells were identified using the overlapping gene-set of four different statistical models. Transcriptomic analysis showed that RV-infected cells have a more blunted host response with fewer DEGs than IAV- or IBV-infected cells. IFNL1 and CXCL10 were among the most upregulated DEGs during RV, IAV, and IBV infection. Other DEGs that were highly expressed for all three viruses were mainly genes related to type I or type III interferons (RSAD2, IDO1) and chemokines (CXCL11). Notably, ICAM5, a known receptor for enterovirus D68, was highly expressed during RV infection only. Gene Set Enrichment Analysis (GSEA) confirmed that pathways associated with interferon response, innate immunity, or regulation of inflammatory response were most perturbed for all three viruses. Network analysis showed that steroid-related pathways were enriched. Taken together, our data using contemporary virus strains suggest that genes related to interferons and chemokines dominated the host response associated with RV, IAV, and IBV infection. Several highly expressed genes, especially ICAM5, which is preferentially induced during RV infection, deserve further investigation.
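Taking "the overlapping gene-set of four different statistical models" amounts to a set intersection of the per-model DEG calls. The abstract does not name the four models, so the sketch below uses hypothetical model labels and placeholder gene identifiers rather than any result from the study.

```python
def consensus_degs(calls_by_model):
    """Return the genes called differentially expressed by every model.

    calls_by_model maps a model label to an iterable of gene identifiers.
    """
    gene_sets = [set(genes) for genes in calls_by_model.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical calls; model labels and gene names are placeholders.
calls = {
    "model_a": ["g1", "g2", "g3"],
    "model_b": ["g2", "g3"],
    "model_c": ["g3", "g2", "g9"],
    "model_d": ["g2", "g3", "g4"],
}
shared = consensus_degs(calls)
```

Requiring agreement across all models is conservative: a gene missed by any single model is dropped, which trades sensitivity for fewer method-specific calls.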
Collapse
Affiliation(s)
| | - Sascha Schäuble
- Systems Biology and Bioinformatics Unit, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
| | - Mohammad Hassan Mirhakkak
- Systems Biology and Bioinformatics Unit, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
| | - Wai-Lan Wu
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Anthony Chin-Ki Ng
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Cyril C Y Yip
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Albert García López
- Systems Biology and Bioinformatics Unit, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
| | - Thomas Wolf
- Systems Biology and Bioinformatics Unit, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
| | - Man-Lung Yeung
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.,State Key Laboratory for Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.,Department of Clinical Microbiology and Infection Control, The University of Hong Kong, Hong Kong, China.,Carol Yu Centre for Infection, The University of Hong Kong, Hong Kong, China
| | - Kwok-Hung Chan
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Kwok-Yung Yuen
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.,State Key Laboratory for Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.,Department of Clinical Microbiology and Infection Control, The University of Hong Kong, Hong Kong, China.,Carol Yu Centre for Infection, The University of Hong Kong, Hong Kong, China
| | - Gianni Panagiotou
- Systems Biology and Bioinformatics Unit, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany.,Systems Biology and Bioinformatics Group, School of Biological Sciences, Faculty of Sciences, The University of Hong Kong, Hong Kong, China
| | - Kelvin Kai-Wang To
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.,State Key Laboratory for Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.,Department of Clinical Microbiology and Infection Control, The University of Hong Kong, Hong Kong, China.,Carol Yu Centre for Infection, The University of Hong Kong, Hong Kong, China
|
34
|
Ramos TAR, Maracaja-Coutinho V, Ortega JM, do Rêgo TG. CORAZON: a web server for data normalization and unsupervised clustering based on expression profiles. BMC Res Notes 2020; 13:338. [PMID: 32665017 PMCID: PMC7359491 DOI: 10.1186/s13104-020-05171-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/17/2020] [Accepted: 07/03/2020] [Indexed: 01/12/2023] Open
Abstract
Objective Data normalization and clustering are mandatory steps in gene expression and downstream analyses, respectively. However, user-friendly implementations of these methodologies are available exclusively under expensive licensing agreements, or in stand-alone scripts developed, reflecting on a great obstacle for users with less computational skills. Results We developed an online tool called CORAZON (Correlations Analyses Zipper Online), which implements three unsupervised learning methods to cluster gene expression datasets in a friendly environment. It allows the usage of eight gene expression normalization/transformation methodologies and the attribute’s influence. The normalizations requiring the gene length only could be performed to RNA-seq, meanwhile the others can be used with microarray and/or NanoString data. Clustering methodologies performances were evaluated through five models with accuracies between 92 and 100%. We applied our tool to obtain functional insights of non-coding RNAs (ncRNAs) based on Gene Ontology enrichment of clusters in a dataset generated by the ENCODE project. The clusters where the majority of transcripts are coding genes were enriched in Cellular, Metabolic, Transports, and Systems Development categories. Meanwhile, the ncRNAs were enriched in the Detection of Stimulus, Sensory Perception, Immunological System, and Digestion categories. CORAZON source-code is freely available at https://gitlab.com/integrativebioinformatics/corazon and the web-server can be accessed at http://corazon.integrativebioinformatics.me.
Affiliation(s)
- Thaís A R Ramos
- Programa de Pós-Graduação em Bioinformática, Bioinformatics Multidisciplinary Environment (BioME), Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, Brazil; Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
| | - Vinicius Maracaja-Coutinho
- Programa de Pós-Graduação em Bioinformática, Bioinformatics Multidisciplinary Environment (BioME), Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, Brazil; Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile; Instituto Vandique, João Pessoa, Brazil.
| | - J Miguel Ortega
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil.
| | - Thaís G do Rêgo
- Programa de Pós-Graduação em Bioinformática, Bioinformatics Multidisciplinary Environment (BioME), Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, Brazil; Departamento de Informática, Centro de Informática, Universidade Federal da Paraíba, João Pessoa, Brazil.
|
35
|
De novo transcriptome sequencing and analysis of salt-, alkali-, and drought-responsive genes in Sophora alopecuroides. BMC Genomics 2020; 21:423. [PMID: 32576152 PMCID: PMC7310485 DOI: 10.1186/s12864-020-06823-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/20/2019] [Accepted: 06/12/2020] [Indexed: 02/06/2023] Open
Abstract
Background Salinity, alkalinity, and drought stress are the main abiotic stress factors affecting plant growth and development. Sophora alopecuroides L., a perennial leguminous herb in the genus Sophora, is a highly salt-tolerant sand-fixing pioneer species distributed mostly in Western Asia and northwestern China. Few studies have assessed responses to abiotic stress in S. alopecuroides. The transcriptome of the genes that confer stress-tolerance in this species has not previously been sequenced. Our objective was to sequence and analyze this transcriptome. Results Twelve cDNA libraries were constructed in triplicate from mRNA obtained from Sophora alopecuroides for the control and salt, alkali, and drought treatments. Using de novo assembly, 902,812 assembled unigenes were generated, with an average length of 294 bp. Based on similarity searches, 545,615 (60.43%) had at least one significant match in the Nr, Nt, Pfam, KOG/COG, Swiss-Prot, and GO databases. In addition, 1673 differentially expressed genes (DEGs) were obtained from the salt treatment, 8142 from the alkali treatment, and 17,479 from the drought treatment. A total of 11,936 transcription factor genes from 82 transcription factor families were functionally annotated under salt, alkali, and drought stress, these include MYB, bZIP, NAC and WRKY family members. DEGs were involved in the hormone signal transduction pathway, biosynthesis of secondary metabolites and antioxidant enzymes; this suggests that these pathways or processes may be involved in tolerance towards salt, alkali, and drought stress in S. alopecuroides. Conclusion Our study first reported transcriptome reference sequence data in Sophora alopecuroides, a non-model plant without a reference genome. We determined digital expression profile and discovered a broad survey of unigenes associated with salt, alkali, and drought stress which provide genomic resources available for Sophora alopecuroides.
|
36
|
Catalogue of stage-specific transcripts in Ixodes ricinus and their potential functions during the tick life-cycle. Parasit Vectors 2020; 13:311. [PMID: 32546252 PMCID: PMC7296661 DOI: 10.1186/s13071-020-04173-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/07/2020] [Accepted: 06/05/2020] [Indexed: 12/15/2022] Open
Abstract
Background The castor bean tick Ixodes ricinus is an important vector of several clinically important diseases, whose prevalence increases with accelerating global climate changes. Characterization of a tick life-cycle is thus of great importance. However, researchers mainly focus on specific organs of fed life stages, while early development of this tick species is largely neglected. Methods In an attempt to better understand the life-cycle of this widespread arthropod parasite, we sequenced the transcriptomes of four life stages (egg, larva, nymph and adult female), including unfed and partially blood-fed individuals. To enable a more reliable identification of transcripts and their comparison in all five transcriptome libraries, we validated an improved-fit set of five I. ricinus-specific reference genes for internal standard normalization of our transcriptomes. Then, we mapped biological functions to transcripts identified in different life stages (clusters) to elucidate life stage-specific processes. Finally, we drew conclusions from the functional enrichment of these clusters specifically assigned to each transcriptome, also in the context of recently published transcriptomic studies in ticks. Results We found that reproduction-related transcripts are present in both fed nymphs and fed females, underlining the poorly documented importance of ovaries as moulting regulators in ticks. Additionally, we identified transposase transcripts in tick eggs suggesting elevated transposition during embryogenesis, co-activated with factors driving developmental regulation of gene expression. Our findings also highlight the importance of the regulation of energetic metabolism in tick eggs during embryonic development and glutamate metabolism in nymphs. Conclusions Our study presents novel insights into stage-specific transcriptomes of I. ricinus and extends the current knowledge of this medically important pathogen, especially in the early phases of its development.
|
37
|
Sissaoui S, Yu J, Yan A, Li R, Yukselen O, Kucukural A, Zhu LJ, Lawson ND. Genomic Characterization of Endothelial Enhancers Reveals a Multifunctional Role for NR2F2 in Regulation of Arteriovenous Gene Expression. Circ Res 2020; 126:875-888. [PMID: 32065070 PMCID: PMC7212523 DOI: 10.1161/circresaha.119.316075] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 02/07/2023]
Abstract
RATIONALE Significant progress has revealed transcriptional inputs that underlie regulation of artery and vein endothelial cell fates. However, little is known concerning genome-wide regulation of this process. Therefore, such studies are warranted to address this gap. OBJECTIVE To identify and characterize artery- and vein-specific endothelial enhancers in the human genome, thereby gaining insights into mechanisms by which blood vessel identity is regulated. METHODS AND RESULTS Using chromatin immunoprecipitation and deep sequencing for markers of active chromatin in human arterial and venous endothelial cells, we identified several thousand artery- and vein-specific regulatory elements. Computational analysis revealed that NR2F2 (nuclear receptor subfamily 2, group F, member 2) sites were overrepresented in vein-specific enhancers, suggesting a direct role in promoting vein identity. Subsequent integration of chromatin immunoprecipitation and deep sequencing data sets with RNA sequencing revealed that NR2F2 regulated 3 distinct aspects related to arteriovenous identity. First, consistent with previous genetic observations, NR2F2 directly activated enhancer elements flanking cell cycle genes to drive their expression. Second, NR2F2 was essential to directly activate vein-specific enhancers and their associated genes. Our genomic approach further revealed that NR2F2 acts with ERG (ETS-related gene) at many of these sites to drive vein-specific gene expression. Finally, NR2F2 directly repressed only a small number of artery enhancers in venous cells to prevent their activation, including a distal element upstream of the artery-specific transcription factor, HEY2 (hes related family bHLH transcription factor with YRPW motif 2). In arterial endothelial cells, this enhancer was normally bound by ERG, which was also required for arterial HEY2 expression. 
By contrast, in venous endothelial cells, NR2F2 was bound to this site, together with ERG, and prevented its activation. CONCLUSIONS By leveraging a genome-wide approach, we revealed mechanistic insights into how NR2F2 functions in multiple roles to maintain venous identity. Importantly, characterization of its role at a crucial artery enhancer upstream of HEY2 established a novel mechanism by which artery-specific expression can be achieved.
Affiliation(s)
- Samir Sissaoui
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Jun Yu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Aimin Yan
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Rui Li
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Onur Yukselen
- Department of Bioinformatics Core, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Alper Kucukural
- Department of Bioinformatics Core, University of Massachusetts Medical School, Worcester, MA, 01605
- Department of Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Lihua Julie Zhu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, 01605
- Department of Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA, 01605
- Department of Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, 01605
| | - Nathan D. Lawson
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, 01605
|
38
|
Molecular signatures of aneuploidy-driven adaptive evolution. Nat Commun 2020; 11:588. [PMID: 32001709 PMCID: PMC6992709 DOI: 10.1038/s41467-019-13669-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/04/2019] [Accepted: 11/15/2019] [Indexed: 02/06/2023] Open
Abstract
Alteration of normal ploidy (aneuploidy) can have a number of opposing effects, such as unbalancing protein abundances and inhibiting cell growth but also accelerating genetic diversification and rapid adaptation. The interplay of these detrimental and beneficial effects remains puzzling. Here, to understand how cells develop tolerance to aneuploidy, we subject disomic (i.e. with an extra chromosome copy) strains of yeast to long-term experimental evolution under strong selection, by forcing disomy maintenance and daily population dilution. We characterize mutations, karyotype alterations and gene expression changes, and dissect the associated molecular strategies. Cells with different extra chromosomes accumulated mutations at distinct rates and displayed diverse adaptive events. They tended to evolve towards normal ploidy through chromosomal DNA loss and gene expression changes. We identify genes with recurrent mutations and altered expression in multiple lines, revealing a variant that improves growth under genotoxic stresses. These findings support rapid evolvability of disomic strains that can be used to characterize fitness effects of mutations under different stress conditions. Aneuploidy (abnormal chromosome number) can enable rapid adaptation to stress conditions, but it also entails fitness costs from gene imbalance. Here, the authors experimentally evolve yeast while forcing maintenance of aneuploidy to identify the mechanisms that promote tolerance of aneuploidy.
|
39
|
Hu X, Zhu L, Zhang Y, Xu L, Li N, Zhang X, Pan Y. Genome-wide identification of C2H2 zinc-finger genes and their expression patterns under heat stress in tomato (Solanum lycopersicum L.). PeerJ 2019; 7:e7929. [PMID: 31788352 PMCID: PMC6882421 DOI: 10.7717/peerj.7929] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/09/2019] [Accepted: 09/20/2019] [Indexed: 12/12/2022] Open
Abstract
The C2H2 zinc finger protein (C2H2-ZFP) transcription factor family regulates the expression of a wide variety of genes in response to various developmental processes or abiotic stresses; however, these proteins have not yet been comprehensively analyzed in tomato (Solanum lycopersicum). In this study, a total of 104 C2H2-ZFs were identified in an uneven distribution across the entire tomato genome, and include seven segmental duplication events. Based on their phylogenetic relationships, these genes were clustered into nine distinct categories analogous to those in Arabidopsis thaliana. High similarities were found between the exon–intron structures and conserved motifs of the genes within each group. Correspondingly, the expression patterns of the C2H2-ZF genes indicated that they function in different tissues and at different developmental stages. Additionally, quantitative real-time PCR (qRT-PCR) results demonstrated that the expression levels of 34 selected C2H2-ZFs are changed dramatically among the roots, stems, and leaves at different time points of a heat stress treatment, suggesting that the C2H2-ZFPs are extensively involved in the heat stress response but have potentially varying roles. These results form the basis for the further molecular and functional analysis of the C2H2-ZFPs, especially for those members that significantly varied under heat treatment, which may be targeted to improve the heat tolerance of tomato and other Solanaceae species.
Affiliation(s)
- Xin Hu
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
| | - Lili Zhu
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
| | - Yi Zhang
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
| | - Li Xu
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
| | - Na Li
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
| | - Xingguo Zhang
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
| | - Yu Pan
- Key Laboratory of Horticulture Science for Southern Mountainous Regions, Ministry of Education, College of Horticulture and Landscape Architecture, Southwest University, Chongqing, China; Academy of Agricultural Sciences, Southwest University, Chongqing, China
|
40
|
Sun X, Sun S, Yang S. An Efficient and Flexible Method for Deconvoluting Bulk RNA-Seq Data with Single-Cell RNA-Seq Data. Cells 2019; 8:E1161. [PMID: 31569701 PMCID: PMC6830085 DOI: 10.3390/cells8101161] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/24/2019] [Revised: 09/23/2019] [Accepted: 09/26/2019] [Indexed: 12/25/2022] Open
Abstract
Estimating cell type compositions for complex diseases is an important step to investigate the cellular heterogeneity for understanding disease etiology and potentially facilitate early disease diagnosis and prevention. Here, we developed a computationally statistical method, referring to Multi-Omics Matrix Factorization (MOMF), to estimate the cell-type compositions of bulk RNA sequencing (RNA-seq) data by leveraging cell type-specific gene expression levels from single-cell RNA sequencing (scRNA-seq) data. MOMF not only directly models the count nature of gene expression data, but also effectively accounts for the uncertainty of cell type-specific mean gene expression levels. We demonstrate the benefits of MOMF through three real data applications, i.e., Glioblastomas (GBM), colorectal cancer (CRC) and type II diabetes (T2D) studies. MOMF is able to accurately estimate disease-related cell type proportions, i.e., oligodendrocyte progenitor cells and macrophage cells, which are strongly associated with the survival of GBM and CRC, respectively.
Affiliation(s)
- Xifang Sun
- Department of Mathematics, School of Science, Xi'an Shiyou University, 710065 Xi'an, China.
| | - Shiquan Sun
- School of Computer Science, Northwestern Polytechnical University, 710072 Xi'an, China.
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
| | - Sheng Yang
- Department of Biostatistics, School of Public Health, Nanjing Medical University, 211166 Nanjing, China.
|
41
|
Masjedi S, Zwiebel LJ, Giorgio TD. Olfactory receptor gene abundance in invasive breast carcinoma. Sci Rep 2019; 9:13736. [PMID: 31551495 PMCID: PMC6760194 DOI: 10.1038/s41598-019-50085-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 07/11/2019] [Accepted: 08/31/2019] [Indexed: 12/12/2022] Open
Abstract
Expression of olfactory receptors (ORs) has been reported in many human tissues outside the nasal epithelium. ORs have been validated as biomarkers in prostate cancer. In breast cancer, however, the expression and role of OR genes remain understudied. We examined the significance of OR transcript abundance in a large invasive breast carcinoma population and identified two OR genes, OR2W3 and OR2B6 to be potentially correlated to breast cancer progression. 960 breast invasive tumors and 56 human breast cancer cell lines were assessed for OR gene expression and 21 OR genes were highly abundant among 198 cases. Our transcriptome analysis discovered three significantly abundant OR genes among three sub-populations of invasive breast carcinoma patients. OR2W3 was correlated with invasion genes and basal-like subtype whereas OR2B6 was correlated with proliferation genes and luminal A subtype. Analyzing the OR gene upregulation among breast cancer cell lines showed that OR2B6 and OR2W3 were abundant similar to invasive breast tumors. Our study suggests that specific OR genes may be correlated with breast cancer characteristics, making ORs potential new diagnostic, and/or treatment markers. This study suggests future directions for the exploration of a role for ORs in the mechanisms of breast cancer proliferation and progression.
Affiliation(s)
- Shirin Masjedi
- Department of Biomedical Engineering, Vanderbilt University, Nashville, USA
| | - Laurence J Zwiebel
- Department of Biological Sciences, Vanderbilt University, Nashville, USA
| | - Todd D Giorgio
- Department of Biomedical Engineering, Vanderbilt University, Nashville, USA.
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, USA.
|
42
|
Osabe T, Shimizu K, Kadota K. Accurate Classification of Differential Expression Patterns in a Bayesian Framework With Robust Normalization for Multi-Group RNA-Seq Count Data. Bioinform Biol Insights 2019; 13:1177932219860817. [PMID: 31312083 PMCID: PMC6614939 DOI: 10.1177/1177932219860817] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/20/2019] [Accepted: 06/10/2019] [Indexed: 12/13/2022] Open
Abstract
Empirical Bayes is a choice framework for differential expression (DE) analysis for multi-group RNA-seq count data. Its characteristic ability to compute posterior probabilities for predefined expression patterns allows users to assign the pattern with the highest value to the gene under consideration. However, current Bayesian methods such as baySeq and EBSeq can be improved, especially with respect to normalization. Two R packages (baySeq and EBSeq) with their default normalization settings and with other normalization methods (MRN and TCC) were compared using three-group simulation data and real count data. Our findings were as follows: (1) the Bayesian methods coupled with TCC normalization performed comparably or better than those with the default normalization settings under various simulation scenarios, (2) default DE pipelines provided in TCC that implements a generalized linear model framework was still superior to the Bayesian methods with TCC normalization when overall degree of DE was evaluated, and (3) baySeq with TCC was robust against different choices of possible expression patterns. In practice, we recommend using the default DE pipeline provided in TCC for obtaining overall gene ranking and then using the baySeq with TCC normalization for assigning the most plausible expression patterns to individual genes.
Affiliation(s)
- Takayuki Osabe
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
| | - Kentaro Shimizu
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan; Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Japan
| | - Koji Kadota
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan; Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Japan
|
43
|
Visger CJ, Wong GKS, Zhang Y, Soltis PS, Soltis DE. Divergent gene expression levels between diploid and autotetraploid Tolmiea relative to the total transcriptome, the cell, and biomass. AMERICAN JOURNAL OF BOTANY 2019; 106:280-291. [PMID: 30779448 DOI: 10.1002/ajb2.1239] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/12/2018] [Accepted: 12/03/2018] [Indexed: 05/28/2023]
Abstract
PREMISE OF THE STUDY Studies of gene expression and polyploidy are typically restricted to characterizing differences in transcript concentration. Using diploid and autotetraploid Tolmiea, we present an integrated approach for cross-ploidy comparisons that account for differences in transcriptome size and cell density and make multiple comparisons of transcript abundance. METHODS We use RNA spike-in standards in concert with cell size and density to identify and correct for differences in transcriptome size and compare levels of gene expression across multiple scales: per transcriptome, per cell, and per biomass. KEY RESULTS In total, ~17% of all loci were identified as differentially expressed (DEGs) between the diploid and autopolyploid species. The per-transcriptome normalization, the method researchers typically use, captured the fewest DEGs (58% of total DEGs) and failed to detect any DEGs not found by the alternative normalizations. When transcript abundance was normalized per biomass and per cell, ~66% and ~82% of the total DEGs were recovered, respectively. The discrepancy between per-transcriptome and per-cell recovery of DEGs occurs because per-transcriptome normalizations are concentration-based and therefore blind to differences in transcriptome size. CONCLUSIONS While each normalization enables valid comparisons at biologically relevant scales, a holistic comparison of multiple normalizations provides additional explanatory power not available from any single approach. Notably, autotetraploid loci tend to conserve diploid-like transcript abundance per biomass through increased gene expression per cell, and these loci are enriched for photosynthesis-related functions.
Affiliation(s)
- Clayton J Visger
- Department of Biological Sciences, California State University Sacramento, Sacramento, CA, 95819, USA
| | - Gane K-S Wong
- Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- Department of Medicine, University of Alberta, Edmonton, AB, T6G 2E1, Canada
- Beijing Genomics Institute-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yong Zhang
- Beijing Genomics Institute-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
- Shenzhen Hua Han Gene Co. Ltd., 7F Jian An Shan Hai Building, No. 8000, Shennan Road, Futian District, Shenzhen, 518040, China
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- Genetics Institute, University of Florida, Gainesville, FL, 32610, USA
- Biodiversity Institute, University of Florida, Gainesville, FL, 32611, USA
| | - Douglas E Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- Genetics Institute, University of Florida, Gainesville, FL, 32610, USA
- Biodiversity Institute, University of Florida, Gainesville, FL, 32611, USA
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA
|
44
|
Sbardella A, Weller M, Fonseca I, Stafuzza N, Bernardes P, e Silva F, da Silva M, Martins M, Munari D. RNA sequencing differential gene expression analysis of isolated perfused bovine udders experimentally inoculated with Streptococcus agalactiae. J Dairy Sci 2019; 102:1761-1767. [DOI: 10.3168/jds.2018-15516] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/08/2018] [Accepted: 10/28/2018] [Indexed: 11/19/2022]
|
45
|
Hybrid swarm intelligent redundancy relevance (RR) with convolution trained compositional pattern neural network expert system for diagnosis of diabetes. HEALTH AND TECHNOLOGY 2019. [DOI: 10.1007/s12553-018-00291-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
|
46
|
DEBrowser: interactive differential expression analysis and visualization tool for count data. BMC Genomics 2019; 20:6. [PMID: 30611200 PMCID: PMC6321710 DOI: 10.1186/s12864-018-5362-x] [Citation(s) in RCA: 143] [Impact Index Per Article: 28.6] [Received: 08/24/2018] [Accepted: 12/11/2018] [Indexed: 01/09/2023] Open
Abstract
Background Sequencing data has become a standard measure of diverse cellular activities. For example, gene expression is accurately measured by RNA sequencing (RNA-Seq) libraries, protein-DNA interactions are captured by chromatin immunoprecipitation sequencing (ChIP-Seq), protein-RNA interactions by crosslinking immunoprecipitation sequencing (CLIP-Seq) or RNA immunoprecipitation (RIP-Seq) sequencing, DNA accessibility by assay for transposase-accessible chromatin (ATAC-Seq), DNase or MNase sequencing libraries. The processing of these sequencing techniques involves library-specific approaches. However, in all cases, once the sequencing libraries are processed, the result is a count table specifying the estimated number of reads originating from each genomic locus. Differential analysis to determine which loci have different cellular activity under different conditions starts with the count table and iterates through a cycle of data assessment, preparation and analysis. Such complex analysis often relies on multiple programs and is therefore a challenge for those without programming skills. Results We developed DEBrowser as an R bioconductor project to interactively visualize every step of the differential analysis, without programming. The application provides a rich and interactive web based graphical user interface built on R’s shiny infrastructure. DEBrowser allows users to visualize data with various types of graphs that can be explored further by selecting and re-plotting any desired subset of data. Using the visualization approaches provided, users can determine and correct technical variations such as batch effects and sequencing depth that affect differential analysis. We show DEBrowser’s ease of use by reproducing the analysis of two previously published data sets. Conclusions DEBrowser is a flexible, intuitive, web-based analysis platform that enables an iterative and interactive analysis of count data without any requirement of programming knowledge. 
Electronic supplementary material The online version of this article (10.1186/s12864-018-5362-x) contains supplementary material, which is available to authorized users.
|
47
|
Expression analysis of RNA sequencing data from human neural and glial cell lines depends on technical replication and normalization methods. BMC Bioinformatics 2018; 19:412. [PMID: 30453873 PMCID: PMC6245503 DOI: 10.1186/s12859-018-2382-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Abstract
Background: The potential for astrocyte participation in central nervous system recovery is highlighted by in vitro experiments demonstrating their capacity to transdifferentiate into neurons. Understanding astrocyte plasticity could be advanced by comparing astrocytes with stem cells. RNA sequencing (RNA-Seq) is ideal for comparing differences across cell types. However, this novel multi-stage process has the potential to introduce unwanted technical variation at several points in the experimental workflow. Quantitative understanding of the contribution of experimental parameters to technical variation would facilitate the design of robust RNA-Seq experiments.
Results: RNA-Seq was used to achieve biological and technical objectives. The biological aspect compared gene expression between normal human fetal-derived astrocytes and human neural stem cells cultured in identical conditions. When a differential expression threshold of |log2 fold change| > 2 was applied to the data, no significant differences were observed. The technical component quantified variation arising from particular steps in the research pathway, and compared the ability of different normalization methods to reduce unwanted variance. To facilitate this objective, a liberal false discovery rate of 10% and a |log2 fold change| > 0.5 were used as the differential expression threshold. Data were normalized with RPKM, TMM, and UQS methods using JMP Genomics. The contributions of key replicable experimental parameters (cell lot; library preparation; flow cell) to variance in the data were evaluated using principal variance component analysis. Our analysis showed that, although the variance for every parameter is strongly influenced by the normalization method, the largest contributor to technical variance was library preparation. The ability to detect differentially expressed genes was also affected by normalization; differences were only detected in non-normalized and TMM-normalized data.
Conclusions: The similarity in gene expression between astrocytes and neural stem cells supports the potential for astrocytic transdifferentiation into neurons, and emphasizes the need to evaluate the therapeutic potential of astrocytes for central nervous system damage. The choice of normalization method influences the contributions to experimental variance as well as the outcomes of differential expression analysis. However, irrespective of normalization method, our findings illustrate that library preparation contributed the largest component of technical variance.
Electronic supplementary material The online version of this article (10.1186/s12859-018-2382-0) contains supplementary material, which is available to authorized users.
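The liberal threshold described in this abstract (false discovery rate of 10% together with |log2 fold change| > 0.5) can be sketched in a few lines of Python. This is only an illustration: the abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and the function names and toy values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here only as a common FDR procedure; the
    abstract does not state which procedure the authors applied.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that the running minimum
    # keeps the adjusted values monotone in the rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def call_de_genes(genes, log2fc, pvalues, fdr=0.10, min_abs_log2fc=0.5):
    """Keep genes passing both the FDR and the fold-change cut-offs."""
    padj = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2fc, padj)
        if q < fdr and abs(lfc) > min_abs_log2fc
    ]


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    genes = ["g1", "g2", "g3", "g4"]
    log2fc = [1.2, -0.8, 0.3, 2.5]
    pvalues = [0.001, 0.02, 0.01, 0.5]
    print(call_de_genes(genes, log2fc, pvalues))
```

With these toy inputs, g3 fails the fold-change cut-off and g4 fails the FDR cut-off, so only g1 and g2 are retained. Setting `min_abs_log2fc=2` would correspond to the stricter threshold mentioned earlier in the abstract.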
|
48
|
Evans C, Hardin J, Stoebel DM. Selecting between-sample RNA-Seq normalization methods from the perspective of their assumptions. Brief Bioinform 2018; 19:776-792. [PMID: 28334202 PMCID: PMC6171491 DOI: 10.1093/bib/bbx008] [Citation(s) in RCA: 169] [Impact Index Per Article: 28.2] [Received: 09/25/2016] [Revised: 01/06/2017] [Indexed: 11/13/2022]
Abstract
RNA-Seq is a widely used method for studying the behavior of genes under different biological conditions. An essential step in an RNA-Seq study is normalization, in which raw data are adjusted to account for factors that prevent direct comparison of expression measures. Errors in normalization can have a significant impact on downstream analysis, such as inflated false positives in differential expression analysis. An underemphasized feature of normalization is the assumptions on which the methods rely and how the validity of these assumptions can have a substantial impact on the performance of the methods. In this article, we explain how assumptions provide the link between raw RNA-Seq read counts and meaningful measures of gene expression. We examine normalization methods from the perspective of their assumptions, as an understanding of methodological assumptions is necessary for choosing methods appropriate for the data at hand. Furthermore, we discuss why normalization methods perform poorly when their assumptions are violated and how this causes problems in subsequent analysis. To analyze a biological experiment, researchers must select a normalization method with assumptions that are met and that produces a meaningful measure of expression for the given experiment.
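To make the point about assumptions concrete, here is a minimal Python sketch of two simple scalings, counts per million (CPM) and RPKM, applied to invented counts. The function names and numbers are illustrative assumptions and are not taken from the article. Both scalings divide by the library total, which implicitly assumes that total expression is comparable between the samples.

```python
def cpm(counts):
    """Counts per million: scale each count by the library total."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million mapped reads."""
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


if __name__ == "__main__":
    # Invented example: genes 1 to 3 have identical raw counts in both
    # samples, while gene 4 is strongly induced in sample B.
    sample_a = [100, 100, 100, 700]
    sample_b = [100, 100, 100, 1700]

    # Gene 1 appears halved in sample B after scaling, although its raw
    # count did not change: the equal-total assumption is violated.
    print(cpm(sample_a)[0], cpm(sample_b)[0])
```

In this toy case the induced gene inflates the library total of sample B, so every unchanged gene looks down-regulated after scaling. That is the kind of artefact the abstract warns about when a method's assumptions do not hold for the data at hand.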
Affiliation(s)
- Ciaran Evans
- Department of Statistics, Baker Hall, Carnegie Mellon University, Pittsburgh, PA, USA
|
49
|
Berenguer J, Lagerweij T, Zhao XW, Dusoswa S, van der Stoop P, Westerman B, de Gooijer MC, Zoetemelk M, Zomer A, Crommentuijn MHW, Wedekind LE, López-López À, Giovanazzi A, Bruch-Oms M, van der Meulen-Muileman IH, Reijmers RM, van Kuppevelt TH, García-Vallejo JJ, van Kooyk Y, Tannous BA, Wesseling P, Koppers-Lalic D, Vandertop WP, Noske DP, van Beusechem VW, van Rheenen J, Pegtel DM, van Tellingen O, Wurdinger T. Glycosylated extracellular vesicles released by glioblastoma cells are decorated by CCL18 allowing for cellular uptake via chemokine receptor CCR8. J Extracell Vesicles 2018; 7:1446660. [PMID: 29696074 PMCID: PMC5912193 DOI: 10.1080/20013078.2018.1446660] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 08/04/2017] [Accepted: 02/23/2018] [Indexed: 02/07/2023]
Abstract
Cancer cells release extracellular vesicles (EVs) that contain functional biomolecules such as RNA and proteins. EVs are transferred to recipient cancer cells and can promote tumour progression and therapy resistance. Through RNAi screening, we identified a novel EV uptake mechanism involving a triple interaction between the chemokine receptor CCR8 on the cells, glycans exposed on EVs and the soluble ligand CCL18. This ligand acts as bridging molecule, connecting EVs to cancer cells. We show that glioblastoma EVs promote cell proliferation and resistance to the alkylating agent temozolomide (TMZ). Using in vitro and in vivo stem-like glioblastoma models, we demonstrate that EV-induced phenotypes are neutralised by a small molecule CCR8 inhibitor, R243. Interference with chemokine receptors may offer therapeutic opportunities against EV-mediated cross-talk in glioblastoma.
Affiliation(s)
- Jordi Berenguer
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Tonny Lagerweij
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Xi Wen Zhao
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Sophie Dusoswa
- Department of Molecular Cell Biology and Immunology, VU University Medical Center, Amsterdam, The Netherlands
- Petra van der Stoop
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Bart Westerman
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Mark C de Gooijer
- Department of Bio-Pharmacy/Mouse Cancer Clinic, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Marloes Zoetemelk
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Anoek Zomer
- Cancer Genomics Netherlands, Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, The Netherlands
- Matheus H W Crommentuijn
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands; Department of Bio-Pharmacy/Mouse Cancer Clinic, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Program in Neuroscience, Harvard Medical School, Boston, MA, USA
- Laurine E Wedekind
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Àlan López-López
- Department of Physiological Sciences I, University of Barcelona, Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas, Barcelona, Spain
- Alberta Giovanazzi
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Marina Bruch-Oms
- Department of Molecular Cell Biology and Immunology, VU University Medical Center, Amsterdam, The Netherlands
- Rogier M Reijmers
- Department of Molecular Cell Biology and Immunology, VU University Medical Center, Amsterdam, The Netherlands
- Toin H van Kuppevelt
- Department of Matrix Biochemistry, Radboud University Medical Center, Nijmegen, The Netherlands
- Juan-Jesús García-Vallejo
- Department of Molecular Cell Biology and Immunology, VU University Medical Center, Amsterdam, The Netherlands
- Yvette van Kooyk
- Department of Molecular Cell Biology and Immunology, VU University Medical Center, Amsterdam, The Netherlands
- Bakhos A Tannous
- Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Program in Neuroscience, Harvard Medical School, Boston, MA, USA
- Pieter Wesseling
- Department of Pathology, VU University Medical Center, Amsterdam, The Netherlands; Department of Pathology, Princess Máxima Center for Pediatric Oncology and University Medical Center Utrecht, Utrecht, The Netherlands
- W Peter Vandertop
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- David P Noske
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands
- Victor W van Beusechem
- Department of Medical Oncology, VU University Medical Center, Amsterdam, The Netherlands
- Jacco van Rheenen
- Cancer Genomics Netherlands, Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, The Netherlands
- D Michiel Pegtel
- Department of Matrix Biochemistry, Radboud University Medical Center, Nijmegen, The Netherlands
- Olaf van Tellingen
- Department of Bio-Pharmacy/Mouse Cancer Clinic, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Thomas Wurdinger
- Department of Neurosurgery, VU University Medical Center, Amsterdam, The Netherlands; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Program in Neuroscience, Harvard Medical School, Boston, MA, USA
|
50
|
Lamarre S, Frasse P, Zouine M, Labourdette D, Sainderichin E, Hu G, Le Berre-Anton V, Bouzayen M, Maza E. Optimization of an RNA-Seq Differential Gene Expression Analysis Depending on Biological Replicate Number and Library Size. FRONTIERS IN PLANT SCIENCE 2018; 9:108. [PMID: 29491871 PMCID: PMC5817962 DOI: 10.3389/fpls.2018.00108] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 07/31/2017] [Accepted: 01/19/2018] [Indexed: 05/23/2023]
Abstract
RNA-Seq is a widely used technology that allows efficient genome-wide quantification of gene expression, for example for differential expression (DE) analysis. After a brief review of the main issues, methods and tools related to the DE analysis of RNA-Seq data, this article focuses on the impact of both the replicate number and library size in such analyses. While the main drawback of previous relevant studies is the lack of generality, we conducted both an analysis of a two-condition experiment (with eight biological replicates per condition) to compare the results with previous benchmark studies, and a meta-analysis of 17 experiments with up to 18 biological conditions, eight biological replicates and 100 million (M) reads per sample. As a global trend, we concluded that the replicate number has a larger impact than the library size on the power of the DE analysis, except for low-expressed genes, for which both parameters seem to have the same impact. Our study also provides new insights for practitioners aiming to enhance their experimental designs. For instance, by analyzing both the sensitivity and specificity of the DE analysis, we showed that the optimal threshold to control the false discovery rate (FDR) is approximately 2^-r, where r is the replicate number. Furthermore, we showed that the false positive rate (FPR) is rather well controlled by all three studied R packages: DESeq, DESeq2, and edgeR. We also analyzed the impact of both the replicate number and library size on gene ontology (GO) enrichment analysis. Interestingly, we concluded that increases in the replicate number and library size tend to enhance the sensitivity and specificity, respectively, of the GO analysis. Finally, we recommend that RNA-Seq practitioners produce a pilot data set to rigorously analyze the power of their experimental design, or use a public data set similar to the one they expect to obtain. For individuals working on tomato research, on the basis of the meta-analysis, we recommend at least four biological replicates per condition and 20 M reads per sample to be almost sure of obtaining about 1000 DE genes if they exist.
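As a small worked illustration of the FDR threshold rule stated in this abstract, the Python sketch below tabulates the value for several replicate numbers. It assumes the threshold is 2 raised to the power of minus r (an interpretation of the notation in the abstract), and the function name is hypothetical.

```python
def suggested_fdr_threshold(replicates):
    """Approximate FDR threshold, assuming the rule is 2 ** (-r)."""
    if replicates < 1:
        raise ValueError("replicate number must be at least 1")
    return 2.0 ** (-replicates)


if __name__ == "__main__":
    # Tabulate the threshold for a few replicate numbers per condition.
    for r in (2, 3, 4, 8):
        print(r, suggested_fdr_threshold(r))
```

Under this reading, the four replicates recommended for tomato experiments correspond to a threshold of 0.0625, and eight replicates to roughly 0.004, so the cut-off tightens quickly as replicates are added.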
Affiliation(s)
- Sophie Lamarre
- LISBP, Centre National de la Recherche Scientifique, INRA, INSA, Université de Toulouse, Toulouse, France
- Pierre Frasse
- GBF, Université de Toulouse, INRA, Castanet-Tolosan, France
- Mohamed Zouine
- GBF, Université de Toulouse, INRA, Castanet-Tolosan, France
- Delphine Labourdette
- LISBP, Centre National de la Recherche Scientifique, INRA, INSA, Université de Toulouse, Toulouse, France
- Guojian Hu
- GBF, Université de Toulouse, INRA, Castanet-Tolosan, France
- Véronique Le Berre-Anton
- LISBP, Centre National de la Recherche Scientifique, INRA, INSA, Université de Toulouse, Toulouse, France
- Elie Maza
- GBF, Université de Toulouse, INRA, Castanet-Tolosan, France
|