1
Al Ali F, Marr AK, Tatari-Calderone Z, Alfaki M, Toufiq M, Roelands J, Syed Ahamed Kabeer B, Bedognetti D, Marr N, Garand M, Rinchai D, Chaussabel D. Organizing training workshops on gene literature retrieval, profiling, and visualization for early career researchers. F1000Res 2023; 10:275. [PMID: 37448622 PMCID: PMC10336363 DOI: 10.12688/f1000research.36395.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/20/2023] [Indexed: 07/15/2023] Open
Abstract
Early-career researchers must acquire the skills necessary to effectively search and extract information from biomedical literature. This ability is crucial, for instance, for evaluating the novelty of experimental results and assessing potential publishing opportunities. Given the rapidly growing volume of publications in the field of biomedical research, new systematic approaches need to be devised and adopted for the retrieval and curation of literature relevant to a specific theme. In this context, we present a hands-on training curriculum aimed at retrieval, profiling, and visualization of literature associated with a given topic. The curriculum was implemented in a workshop in January 2021. Here we provide supporting material and step-by-step implementation guidelines with the ISG15 gene literature serving as an illustrative use case. Workshop participants can learn several skills, including: 1) building and troubleshooting PubMed queries in order to retrieve the literature associated with a gene of interest; 2) identifying key concepts relevant to given themes (such as cell types, diseases, and biological processes); 3) measuring the prevalence of these concepts in the gene literature; 4) extracting key information from relevant articles; and 5) developing a background section or summary on the basis of this information. Finally, trainees can learn to consolidate the structured information captured through this process for presentation via an interactive web application.
Affiliation(s)
- Davide Bedognetti
- Research Branch, Sidra Medicine, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
- Department of Internal Medicine and Medical Specialties, University of Genoa, Genoa, 16126, Italy
- Nico Marr
- Research Branch, Sidra Medicine, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
2
Al Ali F, Marr AK, Tatari-Calderone Z, Alfaki M, Toufiq M, Roelands J, Syed Ahamed Kabeer B, Bedognetti D, Marr N, Garand M, Rinchai D, Chaussabel D. Organizing gene literature retrieval, profiling, and visualization training workshops for early career researchers. F1000Res 2021. [DOI: 10.12688/f1000research.36395.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/19/2023] Open
Abstract
Developing the skills needed to effectively search and extract information from biomedical literature is essential for early-career researchers. It is, for instance, on this basis that the novelty of experimental results, and therefore publishing opportunities, can be evaluated. Given the unprecedented volume of publications in the field of biomedical research, new systematic approaches need to be devised and adopted for the retrieval and curation of literature relevant to a specific theme. Here we describe a hands-on training curriculum aimed at retrieval, profiling, and visualization of literature associated with a given topic. This curriculum was implemented in a workshop in January 2021. We provide supporting material and step-by-step implementation guidelines with the ISG15 gene literature serving as an illustrative use case. Through participation in such a workshop, trainees can learn: 1) to build and troubleshoot PubMed queries in order to retrieve the literature associated with a gene of interest; 2) to identify key concepts relevant to given themes (such as cell types, diseases, and biological processes); 3) to measure the prevalence of these concepts in the gene literature; 4) to extract key information from relevant articles, and 5) to develop a background section or summary on the basis of this information. Finally, trainees can learn to consolidate the structured information captured through this process for presentation via an interactive web application.
3
Roelands J, Garand M, Hinchcliff E, Ma Y, Shah P, Toufiq M, Alfaki M, Hendrickx W, Boughorbel S, Rinchai D, Jazaeri A, Bedognetti D, Chaussabel D. Long-Chain Acyl-CoA Synthetase 1 Role in Sepsis and Immunity: Perspectives From a Parallel Review of Public Transcriptome Datasets and of the Literature. Front Immunol 2019; 10:2410. [PMID: 31681299 PMCID: PMC6813721 DOI: 10.3389/fimmu.2019.02410] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/25/2019] [Accepted: 09/26/2019] [Indexed: 12/21/2022] Open
Abstract
A potential role for the long-chain acyl-CoA synthetase family member 1 (ACSL1) in the immunobiology of sepsis was explored during a hands-on training workshop. Participants first assessed the robustness of the potential gap in biomedical knowledge identified via an initial screen of public transcriptome data and of the literature associated with ACSL1. An increase in ACSL1 transcript abundance during sepsis was confirmed in several independent datasets. Querying the ACSL1 literature also confirmed the absence of reports associating ACSL1 with sepsis. Inferences drawn from both the literature (via indirect associations) and public transcriptome data (via correlation) point to the likely participation of ACSL1 and ACSL4, another family member, in inflammasome activation in neutrophils during sepsis. Furthermore, available clinical data indicate that levels of ACSL1 and ACSL4 induction were significantly higher in fatal cases of sepsis. This denotes potential translational relevance and is consistent with involvement in pathways driving potentially deleterious systemic inflammation. Finally, while ACSL1 expression was induced in blood in vitro by a wide range of pathogen-derived factors as well as TNF, induction of ACSL4 appeared restricted to flagellated bacteria and pathogen-derived TLR5 agonists and IFNG. Taken together, this joint review of public literature and omics data records points to two members of the acyl-CoA synthetase family potentially playing a role in inflammasome activation in neutrophils. The translational relevance of these observations in the context of sepsis and other inflammatory conditions remains to be investigated.
Affiliation(s)
- Jessica Roelands
- Sidra Medicine, Doha, Qatar; Department of Surgery, Leiden University Medical Center, Leiden, Netherlands
- Emily Hinchcliff
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Ying Ma
- Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Parin Shah
- Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Amir Jazaeri
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
4
Bougarn S, Boughorbel S, Chaussabel D, Marr N. A curated transcriptome dataset collection to investigate inborn errors of immunity. F1000Res 2019; 8:188. [PMID: 31559014 PMCID: PMC6749933 DOI: 10.12688/f1000research.18048.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Accepted: 08/28/2019] [Indexed: 01/10/2023] Open
Abstract
Primary immunodeficiencies (PIDs) are a heterogeneous group of inherited disorders, frequently caused by loss-of-function and less commonly by gain-of-function mutations, which can result in susceptibility to a broad or a very narrow range of infections but also in inflammatory, allergic or malignant diseases. Owing to the wide range in clinical manifestations and variability in penetrance and expressivity, there is an urgent need to better understand the underlying molecular, cellular and immunological phenotypes in PID patients in order to improve clinical diagnosis and management. Here we have compiled a manually curated collection of public transcriptome datasets mainly obtained from human whole blood, peripheral blood mononuclear cells (PBMCs) or fibroblasts of patients with PIDs and of control subjects for subsequent meta-analysis, query and interpretation. A total of eighteen (18) datasets derived from studies of PID patients were identified and retrieved from the NCBI Gene Expression Omnibus (GEO) database and loaded in GXB, a custom web application designed for interactive query and visualization of integrated large-scale data. The dataset collection includes samples from well characterized PID patients that were stimulated ex vivo under a variety of conditions to assess the molecular consequences of the underlying, naturally occurring gene defects on a genome-wide scale. Multiple sample groupings and rank lists were generated to facilitate comparisons of the transcriptional responses between different PID patients and control subjects. The GXB tool enables browsing of a single transcript across studies, thereby providing new perspectives on the role of a given molecule across biological systems and PID patients. This dataset collection is available at http://pid.gxbsidra.org/dm3/geneBrowser/list.
Affiliation(s)
- Salim Bougarn
- Systems Biology and Immunology, Sidra Medicine, Doha, Qatar
- Nico Marr
- Systems Biology and Immunology, Sidra Medicine, Doha, Qatar
5
Bougarn S, Boughorbel S, Chaussabel D, Marr N. A curated transcriptome dataset collection to investigate the blood transcriptional response to viral respiratory tract infection and vaccination. F1000Res 2019; 8:284. [PMID: 31231515 PMCID: PMC6567289 DOI: 10.12688/f1000research.18533.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Accepted: 03/07/2019] [Indexed: 12/13/2022] Open
Abstract
The human immune defense mechanisms and factors associated with good versus poor health outcomes following viral respiratory tract infections (VRTI), as well as correlates of protection following vaccination against respiratory viruses, remain incompletely understood. To shed further light on these mechanisms, a number of systems-scale studies have been conducted to measure transcriptional changes in blood leukocytes of either naturally or experimentally infected individuals, or in individuals post-vaccination. Here we make available in a public repository, for interpretation by research investigators, a collection of transcriptome datasets obtained from human whole blood and peripheral blood mononuclear cells (PBMC) to investigate the transcriptional responses following viral respiratory tract infection or vaccination against respiratory viruses. In total, 31 datasets associated with viral respiratory tract infections and related vaccination studies were identified and retrieved from the NCBI Gene Expression Omnibus (GEO) and loaded in a custom web application designed for interactive query and visualization of integrated large-scale data. Quality control checks, using relevant biological markers, were performed. Multiple sample groupings and rank lists were created to facilitate dataset query and interpretation. Via this interface, users can generate web links to customized graphical views, which may be subsequently inserted into manuscripts to report novel findings. The GXB tool enables browsing of a single gene across projects, providing new perspectives on the role of a given molecule across biological systems, both in diagnosis and prognosis following VRTI and in identifying new correlates of protection. This dataset collection is available at: http://vri1.gxbsidra.org/dm3/geneBrowser/list.
Affiliation(s)
- Salim Bougarn
- Systems Biology and Immunology Department, Sidra Medicine, Doha, Qatar
- Sabri Boughorbel
- Systems Biology and Immunology Department, Sidra Medicine, Doha, Qatar
- Damien Chaussabel
- Systems Biology and Immunology Department, Sidra Medicine, Doha, Qatar
- Nico Marr
- Systems Biology and Immunology Department, Sidra Medicine, Doha, Qatar
6
Bougarn S, Boughorbel S, Chaussabel D, Marr N. A curated transcriptome dataset collection to investigate inborn errors of immunity. F1000Res 2019; 8:188. [PMID: 31559014 PMCID: PMC6749933 DOI: 10.12688/f1000research.18048.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Accepted: 08/28/2019] [Indexed: 08/13/2023] Open
Abstract
Primary immunodeficiencies (PIDs) are a heterogeneous group of inherited disorders, frequently caused by loss-of-function and less commonly by gain-of-function mutations, which can result in susceptibility to a broad or a very narrow range of infections but also in inflammatory, allergic or malignant diseases. Owing to the wide range in clinical manifestations and variability in penetrance and expressivity, there is an urgent need to better understand the underlying molecular, cellular and immunological phenotypes in PID patients in order to improve clinical diagnosis and management. Here we have compiled a manually curated collection of public transcriptome datasets mainly obtained from human whole blood, peripheral blood mononuclear cells (PBMCs) or fibroblasts of patients with PIDs and of control subjects for subsequent meta-analysis, query and interpretation. A total of eighteen (18) datasets derived from studies of PID patients were identified and retrieved from the NCBI Gene Expression Omnibus (GEO) database and loaded in GXB, a custom web application designed for interactive query and visualization of integrated large-scale data. The dataset collection includes samples from well characterized PID patients that were stimulated ex vivo under a variety of conditions to assess the molecular consequences of the underlying, naturally occurring gene defects on a genome-wide scale. Multiple sample groupings and rank lists were generated to facilitate comparisons of the transcriptional responses between different PID patients and control subjects. The GXB tool enables browsing of a single transcript across studies, thereby providing new perspectives on the role of a given molecule across biological systems and PID patients. This dataset collection is available at http://pid.gxbsidra.org/dm3/geneBrowser/list.
Affiliation(s)
- Salim Bougarn
- Systems Biology and Immunology, Sidra Medicine, Doha, Qatar
- Nico Marr
- Systems Biology and Immunology, Sidra Medicine, Doha, Qatar
7
Huang SSY, Al Ali F, Boughorbel S, Toufiq M, Chaussabel D, Garand M. A curated collection of transcriptome datasets to investigate the molecular mechanisms of immunoglobulin E-mediated atopic diseases. Database (Oxford) 2019; 2019:baz066. [PMID: 31290545 PMCID: PMC6616200 DOI: 10.1093/database/baz066] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/20/2018] [Revised: 04/14/2019] [Accepted: 04/29/2019] [Indexed: 12/17/2022]
Abstract
The prevalence of allergies has reached ~20% of the population in developed countries, and sensitization rates to one or more allergens among school-age children are approaching 50%. However, the combination of the complexity of atopic allergy susceptibility/development and environmental factors has made identification of gene biomarkers challenging. The amount of publicly accessible transcriptomic data presents an unprecedented opportunity for mechanistic discoveries and validation of complex disease signatures across studies. However, this necessitates structured methodologies and visual tools for the interpretation of results. Here, we present a curated collection of transcriptomic datasets relevant to immunoglobulin E-mediated atopic diseases (ranging from allergies to primary immunodeficiencies). Thirty-three datasets from the Gene Expression Omnibus, encompassing 1860 transcriptome profiles, were made available on the Gene Expression Browser (GXB), an online and open-source web application that allows for the query, visualization and annotation of metadata. The thematic compositions, disease categories, sample number and platforms of the collection are described. Ranked gene lists and sample grouping are used to facilitate data visualization/interpretation and are available online via GXB (http://ige.gxbsidra.org/dm3/geneBrowser/list). Dataset validation using associated publications showed good concordance in GXB gene expression trend and fold-change.
Affiliation(s)
- Fatima Al Ali
- Sidra Medicine, Al Gharrafa Street Ar-Rayyan, Doha, Qatar
- Mathieu Garand
- Sidra Medicine, Al Gharrafa Street Ar-Rayyan, Doha, Qatar
8
Chaussabel D, Rinchai D. Using 'collective omics data' for biomedical research training. Immunology 2018; 155:18-23. [PMID: 29705995 DOI: 10.1111/imm.12944] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/09/2018] [Accepted: 04/11/2018] [Indexed: 12/13/2022] Open
Abstract
Systems-scale molecular profiling data accumulating in public repositories may constitute a useful resource for immunologists. It is for instance likely that information relevant to their chosen line of research be found among the more than 90,000 data series available in the NCBI Gene Expression Omnibus. Such 'collective omics data' may also be employed as source material for training purposes. This is the case when training curricula aim at the development of bioinformatics skills necessary for the analysis, interpretation or visualization of data generated on global scales. But 'collective omics data' may also be reused for training purposes to foster the development of the skills and 'mental habits' underpinning traditional reductionist science approaches. This review describes a small-scale initiative involving investigators, for the most part immunologists, having engaged in a range of training activities relying on 'collective omics data'.
9
Placental Up-Regulation of Leptin and ARMS2 is Associated with Growth Discordance in Monochorionic Diamniotic Twin Pregnancies. Twin Res Hum Genet 2017; 20:169-179. [DOI: 10.1017/thg.2017.11] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
Abstract
Fetal growth discordance is a relatively common complication of monochorionic diamniotic (MCDA) twin pregnancies and is caused by a combination of maternal and placental factors. The aim of the study was to survey placental gene expression patterns and identify genes associated with growth discordance. Clinical samples comprised eight growth-discordant MCDA twin placentas (31+3–34+4 weeks gestational age) and six growth-concordant twin placentas (31+2–37 weeks gestational age). Gene expression libraries were constructed from placental biopsy samples and analyzed by RNA-sequencing. The distribution and relative abundance of mRNA transcripts expressed in the smaller and larger placentas from growth-discordant and concordant MCDA twins were remarkably similar. However, leptin (LEP) and age-related maculopathy susceptibility 2 (ARMS2) mRNA levels were exclusively up-regulated in all of the eight smaller growth-discordant twin placentas. Quantitative real-time PCR of independent biopsy samples confirmed the levels of differential mRNA expression for both genes. Immunohistochemical analysis of tissue sections from matching twin placentas showed increased leptin expression in 5–10% of blood vessel cells of the smaller placenta and marginally higher levels of ARMS2 expression in the microvillous membrane of the smaller placenta. Based on these findings, we speculate that up-regulation of leptin and ARMS2 forms part of an important survival mechanism to compensate for placental growth discordance. Since leptin and ARMS2 are both expressed as soluble proteins, they may have clinical potential as measurable biomarkers for predicting the onset of growth discordance in MCDA twin pregnancies.
10
Rahman M, Boughorbel S, Presnell S, Quinn C, Cugno C, Chaussabel D, Marr N. A curated transcriptome dataset collection to investigate the functional programming of human hematopoietic cells in early life. F1000Res 2016; 5:414. [PMID: 27347375 PMCID: PMC4916988 DOI: 10.12688/f1000research.8375.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Accepted: 03/23/2016] [Indexed: 12/24/2022] Open
Abstract
Compendia of large-scale datasets made available in public repositories provide an opportunity to identify and fill gaps in biomedical knowledge. But first, these data need to be made readily accessible to research investigators for interpretation. Here we make available a collection of transcriptome datasets to investigate the functional programming of human hematopoietic cells in early life. Thirty-two datasets were retrieved from the NCBI Gene Expression Omnibus (GEO) and loaded in a custom web application called the Gene Expression Browser (GXB), which was designed for interactive query and visualization of integrated large-scale data. Quality control checks were performed. Multiple sample groupings and gene rank lists were created, allowing users to reveal age-related differences in transcriptome profiles and changes in the gene expression of neonatal hematopoietic cells in response to a variety of immune stimulators and modulators, as well as during cell differentiation. Available demographic, clinical, and cell phenotypic information can be overlaid with the gene expression data and used to sort samples. Web links to customized graphical views can be generated and subsequently inserted in manuscripts to report novel findings. GXB also enables browsing of a single gene across projects, thereby providing new perspectives on age- and developmental stage-specific expression of a given gene across the human hematopoietic system. This dataset collection is available at: http://developmentalimmunology.gxbsidra.org/dm3/geneBrowser/list.
Affiliation(s)
- Nico Marr
- Sidra Medical and Research Center, Doha, Qatar