1
Laczik M, Erdős E, Ozgyin L, Hevessy Z, Csősz É, Kalló G, Nagy T, Barta E, Póliska S, Szatmári I, Bálint BL. Extensive proteome and functional genomic profiling of variability between genetically identical human B-lymphoblastoid cells. Sci Data 2022; 9:763. [PMID: 36496436 PMCID: PMC9741606 DOI: 10.1038/s41597-022-01871-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 11/22/2022] [Indexed: 12/13/2022] Open
Abstract
In life-science research, isogenic B-lymphoblastoid cell lines (LCLs) are widely known and preferred for their genetic stability; they are often used, for example, to study mutations, where genetic stability is crucial. We have previously shown that phenotypic variability can be observed in isogenic LCLs. Isogenic LCLs present well-defined phenotypic differences at various levels, for example at the gene expression level or the chromatin level. Based on our investigations, the phenotypic variability of the isogenic LCLs is also accompanied by some genetic variation. We have developed a compendium of LCL datasets that present the phenotypic and genetic variability of five isogenic LCLs from a multiomic perspective. In this paper, we present additional datasets generated with next-generation sequencing techniques to provide genomic and transcriptomic profiles (WGS, RNA-seq, single-cell RNA-seq) and protein-DNA interactions (ChIP-seq), together with mass spectrometry and flow cytometry datasets to monitor changes in the proteome. We are sharing these datasets with the scientific community according to the FAIR principles for further investigations.
Affiliation(s)
- Miklós Laczik
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Edina Erdős
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Lilla Ozgyin
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Zsuzsanna Hevessy
- grid.7122.6, ISNI 0000 0001 1088 8582, Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Éva Csősz
- grid.7122.6, ISNI 0000 0001 1088 8582, Proteomics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Gergő Kalló
- grid.7122.6, ISNI 0000 0001 1088 8582, Proteomics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Tibor Nagy
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary; grid.129553.9, ISNI 0000 0001 1015 7851, Department of Genetics and Genomics, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert út 4, Gödöllő, H-2100 Hungary
- Endre Barta
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary; grid.129553.9, ISNI 0000 0001 1015 7851, Department of Genetics and Genomics, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert út 4, Gödöllő, H-2100 Hungary
- Szilárd Póliska
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- István Szatmári
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary; grid.7122.6, ISNI 0000 0001 1088 8582, Faculty of Pharmacy, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Bálint László Bálint
- grid.7122.6, ISNI 0000 0001 1088 8582, Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary; grid.11804.3c, ISNI 0000 0001 0942 9821, Department of Bioinformatics, Semmelweis University, Budapest, Tűzoltó utca 7-9., H-1094 Hungary
2
Zeng Z, Mao C, Vo A, Li X, Nugent JO, Khan SA, Clare SE, Luo Y. Deep learning for cancer type classification and driver gene identification. BMC Bioinformatics 2021; 22:491. [PMID: 34689757 PMCID: PMC8543824 DOI: 10.1186/s12859-021-04400-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/12/2021] [Accepted: 09/24/2021] [Indexed: 12/12/2022] Open
Abstract
Background: Genetic information is becoming more readily available and is increasingly being used to predict patient cancer types as well as their subtypes. Most classification methods thus far use somatic mutations as independent features for classification and are limited by study power. We aim to develop a novel method to effectively explore the landscape of genetic variants, including germline variants and small insertions and deletions, for cancer type prediction.
Results: We proposed DeepCues, a deep learning model that uses convolutional neural networks to derive features from raw cancer DNA sequencing data in an unbiased manner for disease classification and relevant gene discovery. Using raw whole-exome sequencing as features, germline variants and somatic mutations, including insertions and deletions, were interactively amalgamated for feature generation and cancer prediction. We applied DeepCues to a dataset from TCGA to classify seven different types of major cancers and obtained an overall accuracy of 77.6%. We compared DeepCues to conventional methods and demonstrated a significant overall improvement (p < 0.001). Strikingly, the top 20 breast cancer relevant genes identified using DeepCues had a 40% overlap with the top 20 known breast cancer driver genes.
Conclusion: Our results support DeepCues as a novel method to improve the representational resolution of DNA sequencing and its power in deriving features from raw sequences for cancer type prediction, as well as in discovering new cancer relevant genes.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04400-4.
Affiliation(s)
- Zexian Zeng
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive Room 11-189, Chicago, IL, 60611, USA; Department of Data Sciences, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Chengsheng Mao
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive Room 11-189, Chicago, IL, 60611, USA
- Andy Vo
- Committee on Developmental Biology and Regenerative Medicine, The University of Chicago, Chicago, IL, USA
- Janna Ore Nugent
- Research Computing Services, Northwestern University, Chicago, IL, USA
- Seema A Khan
- Department of Surgery, Feinberg School of Medicine, Northwestern University, NMH/Prentice Women's Hospital Room 4-420 250 E Superior, Chicago, IL, 60611, USA
- Susan E Clare
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Robert H Lurie Medical Research Center Room 4-113 250 E Superior, Chicago, IL, 60611, USA
- Yuan Luo
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive Room 11-189, Chicago, IL, 60611, USA
3
Chase Huizar C, Raphael I, Forsthuber TG. Genomic, proteomic, and systems biology approaches in biomarker discovery for multiple sclerosis. Cell Immunol 2020; 358:104219. [PMID: 33039896 PMCID: PMC7927152 DOI: 10.1016/j.cellimm.2020.104219] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/12/2020] [Revised: 09/13/2020] [Accepted: 09/16/2020] [Indexed: 12/12/2022]
Abstract
Multiple sclerosis (MS) is a neuroinflammatory disorder characterized by autoimmune-mediated inflammatory lesions in the CNS leading to myelin damage and axonal loss. MS is a heterogeneous disease with a variable and unpredictable disease course. Due to its complex nature, MS is difficult to diagnose, and responses to specific treatments may vary between individuals. Therefore, there is an indisputable need for biomarkers for early diagnosis, prediction of disease exacerbations, monitoring the progression of disease, and measuring responses to therapy. Genomic and proteomic studies have sought to understand the molecular basis of MS and find biomarker candidates. Advances in next-generation sequencing and mass spectrometry techniques have yielded an unprecedented amount of genomic and proteomic data; yet translation of the results into the clinic has been underwhelming. This has prompted the development of novel data science techniques for exploring these large datasets to identify biologically relevant relationships and ultimately point towards useful biomarkers. Herein we discuss optimization of omics study designs, advances in the generation of omics data, and systems biology approaches aimed at improving biomarker discovery and translation to the clinic for MS.
Affiliation(s)
- Carol Chase Huizar
- Department of Biology, University of Texas at San Antonio, San Antonio, TX, USA
- Itay Raphael
- Department of Neurological Surgery, University of Pittsburgh, UPMC Children's Hospital, Pittsburgh, PA, USA
- Thomas G Forsthuber
- Department of Biology, University of Texas at San Antonio, San Antonio, TX, USA