1
Zhang J, Wei X, Zhao C, Yang H. Protocol to infer and analyze miRNA sponge modules in heterogeneous data using miRSM 2.0. STAR Protoc 2024; 5:103317. [PMID: 39292559; PMCID: PMC11424997; DOI: 10.1016/j.xpro.2024.103317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Revised: 08/06/2024] [Accepted: 08/23/2024] [Indexed: 09/20/2024] Open
Abstract
MicroRNA (miRNA) sponges synergistically modulate physiological and pathological processes in the form of modules or clusters. Here, we present a protocol for inferring and analyzing miRNA sponge modules in heterogeneous data using the R package miRSM 2.0. We describe steps for identifying gene modules, inferring miRNA sponge modules at multi-sample and single-sample levels, and performing modular analysis. From the perspective of computational biology, miRSM 2.0 has the potential to advance our understanding of the role of miRNA sponges in diseases. For complete details on the use and execution of this protocol, please refer to Zhang et al.1,2,3.
Affiliation(s)
- Junpeng Zhang
- School of Engineering, Dali University, Yunnan 671003, China.
- Xuemei Wei
- School of Engineering, Dali University, Yunnan 671003, China
- Chunwen Zhao
- School of Engineering, Dali University, Yunnan 671003, China
- Haolin Yang
- School of Engineering, Dali University, Yunnan 671003, China
2
Baruah B, Dutta MP, Banerjee S, Bhattacharyya DK. EnsemBic: An effective ensemble of biclustering to identify potential biomarkers of esophageal squamous cell carcinoma. Comput Biol Chem 2024; 110:108090. [PMID: 38759483; DOI: 10.1016/j.compbiolchem.2024.108090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 03/28/2024] [Accepted: 04/29/2024] [Indexed: 05/19/2024]
Abstract
The development of functionally enriched and biologically competent biclustering algorithms is essential for extracting hidden information from massive biological datasets. This paper presents a novel biclustering ensemble called EnsemBic based on p-value, which calculates the functional similarity of genetic associations. To validate the effectiveness and robustness of EnsemBic, we implement it with three well-known biclustering techniques, viz. Laplace Prior, iBBiG, and xMotif, and compare them using different leading parameters. EnsemBic is observed to outperform its competing algorithms on several prominent functional and biological measures. Next, the biclusters obtained from EnsemBic are used to identify potential biomarkers of esophageal squamous cell carcinoma (ESCC) by exploring topological and biological relevance with reference to the elite genes obtained from GeneCards. Finally, we find that the genes F2RL3, APPL1, CALM1, IFNGR1, LPAR1, ANGPT2, ARPC2, CGN, CLDN7, ATP6V1C2, CEACAM1, FTL, PLAU, PSMB4, and EPHB2 carry both the topological and biological significance of previously established ESCC elite genes. Therefore, we declare the aforementioned genes as potential biomarkers of ESCC.
Affiliation(s)
- Bikash Baruah
- Dept. of Computer Science and Engineering, NIT Arunachal Pradesh, India
- Manash P Dutta
- Dept. of Computer Science & Information Technology, Cotton University, Guwahati, Assam, India.
- Dhruba K Bhattacharyya
- Dept. of Computer Science and Engineering, Tezpur University, School of Engineering, Tezpur, India
3
Liu F, Yang Y, Xu XS, Yuan M. MESBC: A novel mutually exclusive spectral biclustering method for cancer subtyping. Comput Biol Chem 2024; 109:108009. [PMID: 38219419; DOI: 10.1016/j.compbiolchem.2023.108009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 12/22/2023] [Accepted: 12/24/2023] [Indexed: 01/16/2024]
Abstract
Many soft biclustering algorithms have been developed and applied to various biological and biomedical data analyses. However, few mutually exclusive (hard) biclustering algorithms have been proposed, which could better identify disease or molecular subtypes with survival significance based on genomic or transcriptomic data. In this study, we developed a novel mutually exclusive spectral biclustering (MESBC) algorithm based on spectral method to detect mutually exclusive biclusters. MESBC simultaneously detects relevant features (genes) and corresponding conditions (patients) subgroups and, therefore, automatically uses the signature features for each subtype to perform the clustering. Extensive simulations revealed that MESBC provided superior accuracy in detecting pre-specified biclusters compared with the non-negative matrix factorization (NMF) and Dhillon's algorithm, particularly in very noisy data. Further analysis of the algorithm on real datasets obtained from the TCGA database showed that MESBC provided more accurate (i.e., smaller p-value) overall survival prediction in patients with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cancers when compared to the existing, gold-standard subtypes for lung cancers (integrative clustering). Furthermore, MESBC detected several genes with significant prognostic value in both LUAD and LUSC patients. External validation on an independent, unseen GEO dataset of LUAD showed that MESBC-derived clusters based on TCGA data still exhibited clear biclustering patterns and consistent, outstanding prognostic predictability, demonstrating robust generalizability of MESBC. Therefore, MESBC could potentially be used as a risk stratification tool to optimize the treatment for the patient, improve the selection of patients for clinical trials, and contribute to the development of novel therapeutic agents.
Affiliation(s)
- Fengrong Liu
- Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
- Yaning Yang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
- Min Yuan
- School of Public Health Administration, Anhui Medical University, Hefei 230032, China.
4
Chen H, Zhao L, Liu J, Zhou H, Wang X, Fang X, Xia X. Bioinformatic Analyzes of the Association Between Upregulated Expression of JUN Gene via APOBEC-Induced FLG Gene Mutation and Prognosis of Cervical Cancer. Front Med (Lausanne) 2022; 9:815450. [PMID: 35510248; PMCID: PMC9058067; DOI: 10.3389/fmed.2022.815450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 03/15/2022] [Indexed: 02/01/2023] Open
Abstract
Globally, cervical cancer (CC) is the most common malignant tumor of the female reproductive system, and its incidence is second only to that of breast cancer. Although screening and advanced treatment strategies have improved the rates of survival, some patients with CC still die due to metastasis and drug resistance. It is considered that cancer is driven by somatic mutations, such as single nucleotide, small insertions/deletions, copy number, and structural variations, as well as epigenetic changes. Previous studies have shown that cervical intraepithelial neoplasia is associated with copy number variants (CNVs) and/or mutations in cancer-related genes. Further, CC is also related to genetic mutations. The present study analyzed the data on somatic mutations of cervical squamous cell carcinoma (CESC) in the Cancer Genome Atlas database. It was evident that the apolipoprotein B mRNA editing enzyme catalytic polypeptide-like (APOBEC)-related mutation of the FLG gene can upregulate the expression of the JUN gene and ultimately lead to poor prognosis for patients with CC. Therefore, the findings of the current study provide a new direction for future treatment of CC.
Affiliation(s)
- Huan Chen
- Department of Obstetrics and Gynecology, The Second XIANGYA Hospital of Central South University, Changsha, China
- Liyun Zhao
- Department of Obstetrics and Gynecology, The Second XIANGYA Hospital of Central South University, Changsha, China
- Jiaqiang Liu
- Laboratory Medicine Center, Zhu Zhou Hospital Affiliated to Xiangya School of Medicine, Central South University (CSU), Zhuzhou, China
- Housheng Zhou
- Department of Obstetrics and Gynecology, Zhu Zhou Hospital Affiliated to Xiangya School of Medicine, CSU, Zhuzhou, China
- Xi Wang
- Department of Obstetrics and Gynecology, The Second XIANGYA Hospital of Central South University, Changsha, China
- Xiaoling Fang
- Department of Obstetrics and Gynecology, The Second XIANGYA Hospital of Central South University, Changsha, China
- Xiaomeng Xia
- Department of Obstetrics and Gynecology, The Second XIANGYA Hospital of Central South University, Changsha, China
- *Correspondence: Xiaomeng Xia
5
Baruah B, Dutta MP, Bhattacharyya DK. Identification of ESCC potential biomarkers using biclustering algorithms. Gene Reports 2022. [DOI: 10.1016/j.genrep.2022.101563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
6
PD_BiBIM: Biclustering-based biomarker identification in ESCC microarray data. J Biosci 2021. [DOI: 10.1007/s12038-021-00171-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
7
Li H, Li J, Zhang C, Zhang C, Wang H. TERT mutations correlate with higher TMB value and unique tumor microenvironment and may be a potential biomarker for anti-CTLA4 treatment. Cancer Med 2020; 9:7151-7160. [PMID: 32810393; PMCID: PMC7541140; DOI: 10.1002/cam4.3376] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 04/26/2020] [Revised: 07/25/2020] [Accepted: 07/28/2020] [Indexed: 12/16/2022] Open
Abstract
Immune checkpoint inhibitors (ICIs) have recently changed therapeutic paradigms for patients across multiple cancer types. However, current biomarkers cannot accurately predict responses to ICIs. Telomerase reverse transcriptase (TERT) mutations lead to an aberrant upregulation of TERT expression, and ultimately allow telomere maintenance, thus supporting immortalization of cancer cells. This study aimed to investigate whether the TERT mutation is a potential predictor of ICI treatment across all cancer types. TERT mutations positively correlated with a higher tumor mutational burden (TMB) value, neoantigen load, and tumor purity. Lymphocyte infiltration, macrophage regulation, interferon‐gamma (IFN‐γ) response, and transforming growth factor‐β (TGF‐β) response, which were representative immune‐expression signatures, all had higher signature scores in the TERT mutation group. Activated CD4 T cell, naïve B cell, activated dendritic cell, M0 macrophage, M1 macrophage, neutrophil, resting NK cell, and plasma cells all had relatively higher immune scores in the TERT mutation group, whereas Th series cells, memory B cell, resting mast cells, monocytes, and activated NK cells had lower immune scores. Notably, in the subgroup analysis of monotherapy and combination ICI treatment, only in the anti‐cytotoxic‐T‐lymphocyte‐associated antigen 4 (anti‐CTLA4) group did patients with TERT mutations have a better prognosis, especially for melanoma. Therefore, TERT mutations were closely related to a higher TMB value and unique tumor microenvironment, which may explain why TERT mutations may be a potential biomarker for anti‐CTLA4 treatment.
Affiliation(s)
- Huahua Li
- Department of Integrated Chinese and Western Medicine, Affiliated Cancer Hospital of Zhengzhou University and Henan Cancer Hospital, Zhengzhou, China
- Jia Li
- Department of Integrated Chinese and Western Medicine, Affiliated Cancer Hospital of Zhengzhou University and Henan Cancer Hospital, Zhengzhou, China
- Chenyue Zhang
- Department of Integrated Therapy, Fudan University Shanghai Cancer Center, Shanghai Medical College, Shanghai, China
- Chenxing Zhang
- Department of Nephrology, Shanghai Children's Medical Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Haiyong Wang
- Department of Internal Medicine-Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
8
Ghosh TS, Rampelli S, Jeffery IB, Santoro A, Neto M, Capri M, Giampieri E, Jennings A, Candela M, Turroni S, Zoetendal EG, Hermes GDA, Elodie C, Meunier N, Brugere CM, Pujos-Guillot E, Berendsen AM, De Groot LCPGM, Feskins EJM, Kaluza J, Pietruszka B, Bielak MJ, Comte B, Maijo-Ferre M, Nicoletti C, De Vos WM, Fairweather-Tait S, Cassidy A, Brigidi P, Franceschi C, O'Toole PW. Mediterranean diet intervention alters the gut microbiome in older people reducing frailty and improving health status: the NU-AGE 1-year dietary intervention across five European countries. Gut 2020; 69:1218-1228. [PMID: 32066625; PMCID: PMC7306987; DOI: 10.1136/gutjnl-2019-319654] [Citation(s) in RCA: 544] [Impact Index Per Article: 108.8] [Received: 08/16/2019] [Revised: 12/29/2019] [Accepted: 12/31/2019] [Indexed: 12/12/2022]
Abstract
OBJECTIVE: Ageing is accompanied by deterioration of multiple bodily functions and inflammation, which collectively contribute to frailty. We and others have shown that frailty co-varies with alterations in the gut microbiota in a manner accelerated by consumption of a restricted diversity diet. The Mediterranean diet (MedDiet) is associated with health. In the NU-AGE project, we investigated if a 1-year MedDiet intervention could alter the gut microbiota and reduce frailty. DESIGN: We profiled the gut microbiota in 612 non-frail or pre-frail subjects across five European countries (UK, France, Netherlands, Italy and Poland) before and after the administration of a 12-month long MedDiet intervention tailored to elderly subjects (NU-AGE diet). RESULTS: Adherence to the diet was associated with specific microbiome alterations. Taxa enriched by adherence to the diet were positively associated with several markers of lower frailty and improved cognitive function, and negatively associated with inflammatory markers including C-reactive protein and interleukin-17. Analysis of the inferred microbial metabolite profiles indicated that the diet-modulated microbiome change was associated with an increase in short/branch chained fatty acid production and lower production of secondary bile acids, p-cresols, ethanol and carbon dioxide. Microbiome ecosystem network analysis showed that the bacterial taxa that responded positively to the MedDiet intervention occupy keystone interaction positions, whereas frailty-associated taxa are peripheral in the networks. CONCLUSION: Collectively, our findings support the feasibility of improving the habitual diet to modulate the gut microbiota which in turn has the potential to promote healthier ageing.
Affiliation(s)
- Tarini Shankar Ghosh
- School of Microbiology, University College Cork, Cork, Ireland
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- Simone Rampelli
- Unit of Microbial Ecology of Health, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Ian B Jeffery
- School of Microbiology, University College Cork, Cork, Ireland
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- Aurelia Santoro
- Department of Experimental, Diagnostic and Speciality Medicine, Alma Mater Studiorum, University of Bologna, Bologna, Italy
- CIG Interdepartmental Centre "L Galvani", Alma Mater Studiorum, University of Bologna, Bologna, Italy
- Marta Neto
- School of Microbiology, University College Cork, Cork, Ireland
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- Miriam Capri
- Unit of Microbial Ecology of Health, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Enrico Giampieri
- Department of Experimental, Diagnostic and Speciality Medicine, Alma Mater Studiorum, University of Bologna, Bologna, Italy
- Amy Jennings
- Norwich Medical School, University of East Anglia Faculty of Medicine and Health Sciences, Norwich, Norfolk, UK
- Marco Candela
- Unit of Microbial Ecology of Health, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Silvia Turroni
- Unit of Microbial Ecology of Health, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Erwin G Zoetendal
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, Netherlands
- Gerben D A Hermes
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, Netherlands
- Caumon Elodie
- CRNH Auvergne, F-63000 Clermont-Ferrand, CHU Clermont-Ferrand, Clermont-Ferrand, France
- Nathalie Meunier
- CRNH Auvergne, F-63000 Clermont-Ferrand, CHU Clermont-Ferrand, Clermont-Ferrand, France
- Estelle Pujos-Guillot
- Plateforme d'Exploration du Métabolisme, MetaboHUB Clermont, Clermont-Ferrand, Université Clermont Auvergne, Clermont-Ferrand, Auvergne, France
- Agnes M Berendsen
- Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, Netherlands
- Lisette C P G M De Groot
- Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, Netherlands
- Edith J M Feskins
- Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, Netherlands
- Joanna Kaluza
- Department of Human Nutrition, Warsaw University of Life Sciences, Warszawa, Poland
- Barbara Pietruszka
- Department of Human Nutrition, Warsaw University of Life Sciences, Warszawa, Poland
- Blandine Comte
- Plateforme d'Exploration du Métabolisme, MetaboHUB Clermont, Clermont-Ferrand, Université Clermont Auvergne, Clermont-Ferrand, Auvergne, France
- Monica Maijo-Ferre
- Gut Health Institute Strategic Programme, Quadram Institute Bioscience, Norwich, Norfolk, UK
- Claudio Nicoletti
- Gut Health Institute Strategic Programme, Quadram Institute Bioscience, Norwich, Norfolk, UK
- Department of Experimental and Clinical Medicine, Section of Anatomy, University of Florence, Firenze, Toscana, Italy
- Willem M De Vos
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, Netherlands
- Human Microbiome Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Susan Fairweather-Tait
- Department of Nutrition and Preventive Medicine, Norwich Medical School, University of East Anglia, Norwich, Norfolk, UK
- Aedin Cassidy
- The Institute of Global Food Security, Queen's University Belfast, Belfast, UK
- Patrizia Brigidi
- Unit of Microbial Ecology of Health, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Claudio Franceschi
- Department of Experimental, Diagnostic and Speciality Medicine, Alma Mater Studiorum, University of Bologna, Bologna, Emilia-Romagna, Italy
- Department of Applied Mathematics, Institute of Information Technology, Mathematics and Mechanics (ITMM), Lobachevsky State University of Nizhny Novgorod-National Research University (UNN), Nizhny Novgorod, Russian Federation
- Paul W O'Toole
- School of Microbiology, University College Cork, Cork, Ireland
- APC Microbiome Ireland, University College Cork, Cork, Ireland
9
Xie J, Ma A, Fennell A, Ma Q, Zhao J. It is time to apply biclustering: a comprehensive review of biclustering applications in biological and biomedical data. Brief Bioinform 2020; 20:1449-1464. [PMID: 29490019; DOI: 10.1093/bib/bby014] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/15/2017] [Revised: 01/16/2018] [Indexed: 12/12/2022] Open
Abstract
Biclustering is a powerful data mining technique that allows clustering of rows and columns, simultaneously, in a matrix-format data set. It was first applied to gene expression data in 2000, aiming to identify co-expressed genes under a subset of all the conditions/samples. During the past 17 years, tens of biclustering algorithms and tools have been developed to enhance the ability to make sense out of large data sets generated in the wake of high-throughput omics technologies. These algorithms and tools have been applied to a wide variety of data types, including but not limited to, genomes, transcriptomes, exomes, epigenomes, phenomes and pharmacogenomes. However, there is still a considerable gap between biclustering methodology development and comprehensive data interpretation, mainly because of the lack of knowledge for the selection of appropriate biclustering tools and further supporting computational techniques in specific studies. Here, we first deliver a brief introduction to the existing biclustering algorithms and tools in public domain, and then systematically summarize the basic applications of biclustering for biological data and more advanced applications of biclustering for biomedical data. This review will assist researchers to effectively analyze their big data and generate valuable biological knowledge and novel insights with higher efficiency.
10
Pucher BM, Zeleznik OA, Thallinger GG. Comparison and evaluation of integrative methods for the analysis of multilevel omics data: a study based on simulated and experimental cancer data. Brief Bioinform 2020; 20:671-681. [PMID: 29688321; DOI: 10.1093/bib/bby027] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/05/2017] [Revised: 03/02/2018] [Indexed: 12/12/2022] Open
Abstract
Integrative analysis aims to identify the driving factors of a biological process by the joint exploration of data from multiple cellular levels. The volume of omics data produced is constantly increasing, and so too does the collection of tools for its analysis. Comparative studies assessing performance and the biological value of results, however, are rare but in great demand. We present a comprehensive comparison of three integrative analysis approaches, sparse canonical correlation analysis (sCCA), non-negative matrix factorization (NMF) and logic data mining MicroArray Logic Analyzer (MALA), by applying them to simulated and experimental omics data. We find that sCCA and NMF are able to identify differential features in simulated data, while the Logic Data Mining method, MALA, falls short. Applied to experimental data, we show that MALA performs best in terms of sample classification accuracy, and in general, the classification power of prioritized feature sets is high (97.1-99.5% accuracy). The proportion of features identified by at least one of the other methods, however, is approximately 60% for sCCA and NMF and nearly 30% for MALA, and the proportion of features jointly identified by all methods is only around 16%. Similarly, the congruence on functional levels (Gene Ontology, Reactome) is low. Furthermore, the agreement of identified feature sets with curated gene signatures relevant to the investigated disease is modest. We discuss possible reasons for the moderate overlap of identified feature sets with each other and with curated cancer signatures. The R code to create simulated data, results and figures is provided at https://github.com/ThallingerLab/IamComparison.
Affiliation(s)
- Bettina M Pucher
- Institute of Computational Biotechnology, Graz University of Technology, Petersgasse 14, 8010 Graz, Austria; Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010 Graz, Austria
- Oana A Zeleznik
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, 181 Longwood Ave, Boston MA 02115, USA
- Gerhard G Thallinger
- Institute of Computational Biotechnology, Graz University of Technology, Petersgasse 14, 8010 Graz, Austria; Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010 Graz, Austria
11
Orzechowski P, Pańszczyk A, Huang X, Moore JH. runibic: a Bioconductor package for parallel row-based biclustering of gene expression data. Bioinformatics 2018; 34:4302-4304. [PMID: 29939213; PMCID: PMC6289127; DOI: 10.1093/bioinformatics/bty512] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/15/2018] [Revised: 05/27/2018] [Accepted: 06/22/2018] [Indexed: 11/13/2022] Open
Abstract
Motivation: Biclustering is an unsupervised technique for simultaneously clustering the rows and columns of an input matrix. Among the many biclustering algorithms proposed, UniBic remains one of the most accurate methods developed so far. Results: In this paper we introduce a Bioconductor package called runibic with a parallel implementation of UniBic. For convenience, the algorithm was reimplemented, parallelized and wrapped within an R package called runibic. The package includes: (i) a parallel version that is a couple of times faster than the original sequential algorithm, (ii) much more efficient memory management, (iii) modularity, which allows new methods to be built on top of the provided one and (iv) integration with modern Bioconductor packages such as SummarizedExperiment, ExpressionSet and biclust. Availability and implementation: The package is implemented in R and is available from Bioconductor (starting from version 3.6) at the following URL http://bioconductor.org/packages/runibic with installation instructions and tutorial. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Patryk Orzechowski
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Automatics and Biomedical Engineering, AGH University of Science and Technology, Krakow, Poland
- Artur Pańszczyk
- Department of Automatics and Biomedical Engineering, AGH University of Science and Technology, Krakow, Poland
- Xiuzhen Huang
- Department of Computer Science, Arkansas State University, Jonesboro, AR, USA
- Jason H Moore
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
12
Thorsson V, Gibbs DL, Brown SD, Wolf D, Bortone DS, Ou Yang TH, Porta-Pardo E, Gao GF, Plaisier CL, Eddy JA, Ziv E, Culhane AC, Paull EO, Sivakumar IKA, Gentles AJ, Malhotra R, Farshidfar F, Colaprico A, Parker JS, Mose LE, Vo NS, Liu J, Liu Y, Rader J, Dhankani V, Reynolds SM, Bowlby R, Califano A, Cherniack AD, Anastassiou D, Bedognetti D, Mokrab Y, Newman AM, Rao A, Chen K, Krasnitz A, Hu H, Malta TM, Noushmehr H, Pedamallu CS, Bullman S, Ojesina AI, Lamb A, Zhou W, Shen H, Choueiri TK, Weinstein JN, Guinney J, Saltz J, Holt RA, Rabkin CS, Lazar AJ, Serody JS, Demicco EG, Disis ML, Vincent BG, Shmulevich I. The Immune Landscape of Cancer. Immunity 2018; 48:812-830.e14. [PMID: 29628290; PMCID: PMC5982584; DOI: 10.1016/j.immuni.2018.03.023] [Citation(s) in RCA: 3720] [Impact Index Per Article: 531.4] [Received: 07/21/2017] [Revised: 01/23/2018] [Accepted: 03/21/2018] [Indexed: 02/08/2023]
Abstract
We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes (wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant) characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tumors and the resulting data are intended to serve as a resource for future targeted studies to further advance the field.
Affiliation(s)
- Vésteinn Thorsson
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA.
| | - David L Gibbs
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA
| | - Scott D Brown
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC V5Z 4S6, Canada
| | - Denise Wolf
- University of California, San Francisco, Box 0808, 2340 Sutter Street, S433, San Francisco, CA 94115, USA
| | - Dante S Bortone
- Lineberger Comprehensive Cancer Center, Curriculum in Bioinformatics and Computational Biology, University of North Carolina, 125 Mason Farm Road, Chapel Hill, NC 27599-7295, USA
| | - Tai-Hsien Ou Yang
- Department of Systems Biology and Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
| | - Eduard Porta-Pardo
- Barcelona Supercomputing Centre, c/Jordi Girona, 29, 08034 Barcelona, Spain; SBP Medical Discovery Institute, La Jolla, CA 92037, USA
| | - Galen F Gao
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Christopher L Plaisier
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA; School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85281, USA
| | - James A Eddy
- Sage Bionetworks, 2901 Third Ave, Suite 330, Seattle, WA 98121, USA
| | - Elad Ziv
- Department of Medicine, Institute for Human Genetics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, 1450 3rd St, San Francisco, CA 94143, USA
| | - Aedin C Culhane
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Evan O Paull
- Irving Cancer Research Center, Room 913,1130 St. Nicholas Avenue, New York, NY 10032, USA
| | - I K Ashok Sivakumar
- Department of Computer Science, Institute for Computational Medicine; Johns Hopkins University, Baltimore, MD 21218, USA
| | - Andrew J Gentles
- Departments of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
| | | | - Farshad Farshidfar
- Department of Oncology, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Antonio Colaprico
- Universite libre de Bruxelles (ULB), Computer Science Department, Faculty of Sciences, Boulevard du Triomphe - CP212, 1050 Bruxelles, Belgium
| | - Joel S Parker
- Lineberger Comprehensive Cancer Center, Curriculum in Bioinformatics and Computational Biology, University of North Carolina, 125 Mason Farm Road, Chapel Hill, NC 27599-7295, USA
| | - Lisle E Mose
- Lineberger Comprehensive Cancer Center, Curriculum in Bioinformatics and Computational Biology, University of North Carolina, 125 Mason Farm Road, Chapel Hill, NC 27599-7295, USA
| | - Nam Sy Vo
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Jianfang Liu
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA 15963, USA
| | - Yuexin Liu
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Janet Rader
- Medical College of Wisconsin, 9200 Wisconsin Avenue, Milwaukee, WI 53226, USA
| | - Varsha Dhankani
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA
| | - Sheila M Reynolds
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA
| | - Reanne Bowlby
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC V5Z 4S6, Canada
| | - Andrea Califano
- Irving Cancer Research Center, Room 913, 1130 St. Nicholas Avenue, New York, NY 10032, USA
| | - Andrew D Cherniack
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Dimitris Anastassiou
- Department of Systems Biology and Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
| | - Davide Bedognetti
- Division of Translational Medicine, Research Branch, Sidra Medical and Research Center, PO Box 26999, Doha, Qatar
| | - Younes Mokrab
- Division of Translational Medicine, Research Branch, Sidra Medical and Research Center, PO Box 26999, Doha, Qatar
| | - Aaron M Newman
- Institute for Stem Cell Biology and Regenerative Medicine and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
| | - Arvind Rao
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Alexander Krasnitz
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
| | - Hai Hu
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA 15963, USA
| | - Tathiane M Malta
- Department of Neurosurgery, Henry Ford Hospital, Detroit, MI 48202, USA; Department of Genetics, Ribeirao Preto Medical School, University of São Paulo, São Paulo, Brazil
| | - Houtan Noushmehr
- Department of Neurosurgery, Henry Ford Hospital, Detroit, MI 48202, USA; Department of Genetics, Ribeirao Preto Medical School, University of São Paulo, São Paulo, Brazil
| | | | - Susan Bullman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | | | - Andrew Lamb
- Sage Bionetworks, 2901 Third Ave, Suite 330, Seattle, WA 98121, USA
| | - Wanding Zhou
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | - Hui Shen
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | - Toni K Choueiri
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - John N Weinstein
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Justin Guinney
- Sage Bionetworks, 2901 Third Ave, Suite 330, Seattle, WA 98121, USA
| | - Joel Saltz
- Department of Biomedical Informatics, Stony Brook Medicine, 100 Nicolls Rd, Stony Brook, NY 11794, USA
| | - Robert A Holt
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC V5Z 4S6, Canada
| | - Charles S Rabkin
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Dr., Bethesda, MD 20892, USA
| | - Alexander J Lazar
- Departments of Pathology, Genomics Medicine and Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd-Unit 85, Houston, TX 77030, USA
| | - Jonathan S Serody
- Department of Medicine and Microbiology and Lineberger Comprehensive Cancer Center, 125 Mason Farm Road, Chapel Hill, NC 27599-7295, USA
| | - Elizabeth G Demicco
- Mount Sinai Hospital, Department of Pathology and Laboratory Medicine, 600 University Ave., Toronto, ON M5G 1X5, Canada
| | - Mary L Disis
- UW Medicine Cancer Vaccine Institute, 850 Republican Street, Brotman Building, 2nd Floor, Room 221, Box 358050, University of Washington, Seattle, WA 98109-4714, USA
| | - Benjamin G Vincent
- Lineberger Comprehensive Cancer Center, Curriculum in Bioinformatics and Computational Biology, University of North Carolina, 125 Mason Farm Road, Chapel Hill, NC 27599-7295, USA.
| | - Ilya Shmulevich
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA.
| |
|
13
|
Mandal K, Sarmah R, Bhattacharyya DK. Biomarker Identification for Cancer Disease Using Biclustering Approach: An Empirical Study. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 16:490-509. [PMID: 29993834 DOI: 10.1109/tcbb.2018.2820695] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
This paper presents an exhaustive empirical study to identify biomarkers using two approaches, frequency-based and network-based, over seventeen different biclustering algorithms and six different cancer expression datasets. To systematically analyze the biclustering algorithms, we perform enrichment analysis, subtype identification and biomarker identification. Biclustering algorithms such as C&C, SAMBA and Plaid are useful to detect biomarkers by both approaches for all datasets except prostate cancer. We detect a total of 102 gene biomarkers using the frequency-based method, of which 19 are for blood cancer, 36 for lung cancer, 25 for colon cancer, 13 for multi-tissue cancer and 9 for prostate cancer. Using the network-based approach, we detect a total of 41 gene biomarkers, of which 15 are from the blood cancer, 12 from the lung cancer, 6 from the colon cancer, 7 from the multi-tissue cancer and 1 from the prostate cancer dataset. We further extend our network analysis over some biclusters and detect some gene biomarkers not detected earlier by either the frequency-based or the network-based approach. We expand our work on breast cancer miRNA expression data to evaluate the performance of the biclustering algorithms. We detect 19 breast cancer biomarkers by the frequency-based method and 5 by the network-based method for the miRNA dataset.
|
14
|
Alzahrani M, Kuwahara H, Wang W, Gao X. Gracob: a novel graph-based constant-column biclustering method for mining growth phenotype data. Bioinformatics 2018; 33:2523-2531. [PMID: 28379298 PMCID: PMC5870648 DOI: 10.1093/bioinformatics/btx199] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/02/2016] [Accepted: 04/03/2017] [Indexed: 11/24/2022] Open
Abstract
Motivation: Growth phenotype profiling of genome-wide gene-deletion strains over stress conditions can offer a clear picture that the essentiality of genes depends on environmental conditions. Systematically identifying groups of genes from such high-throughput data that share similar patterns of conditional essentiality and dispensability under various environmental conditions can elucidate how genetic interactions of the growth phenotype are regulated in response to the environment. Results: We first demonstrate that detecting such ‘co-fit’ gene groups can be cast as a less well-studied problem in biclustering, i.e. constant-column biclustering. Despite significant advances in biclustering techniques, very few were designed for mining in growth phenotype data. Here, we propose Gracob, a novel, efficient graph-based method that casts and solves the constant-column biclustering problem as a maximal clique finding problem in a multipartite graph. We compared Gracob with a large collection of widely used biclustering methods that cover different types of algorithms designed to detect different types of biclusters. Gracob showed superior performance on finding co-fit genes over all the existing methods on both a variety of synthetic data sets with a wide range of settings, and three real growth phenotype datasets for E. coli, proteobacteria and yeast. Availability and Implementation: Our program is freely available for download at http://sfb.kaust.edu.sa/Pages/Software.aspx. Supplementary information: Supplementary data are available at Bioinformatics online.
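To make the term "constant-column bicluster" in the abstract above concrete, the following is a minimal Python sketch that only checks whether a chosen set of rows (genes) and columns (conditions) forms such a bicluster, meaning each selected column takes a single value across the selected rows. It is not Gracob and does not perform the maximal clique search the abstract describes; the function name `is_constant_column_bicluster`, the `tol` parameter and the toy `fitness` matrix are all hypothetical and introduced purely for illustration.

```python
def is_constant_column_bicluster(matrix, rows, cols, tol=0.0):
    """Return True if every selected column is constant over the selected rows.

    matrix: list of rows (lists of numbers); rows/cols: index lists.
    tol: allowed spread within a column (hypothetical parameter, 0 = exact).
    """
    if not rows or not cols:
        return False  # an empty selection is not treated as a bicluster here
    for c in cols:
        values = [matrix[r][c] for r in rows]
        if max(values) - min(values) > tol:
            return False
    return True


# Toy matrix (invented values): rows are gene-deletion strains, columns are conditions.
fitness = [
    [1, 0, 1, 5],
    [1, 0, 1, 2],
    [0, 0, 1, 7],
    [1, 0, 1, 9],
]

# Strains 0, 1 and 3 behave identically within each of the conditions 0, 1 and 2.
print(is_constant_column_bicluster(fitness, rows=[0, 1, 3], cols=[0, 1, 2]))
# Adding strain 2 breaks the pattern in condition 0.
print(is_constant_column_bicluster(fitness, rows=[0, 1, 2], cols=[0, 1, 2]))
```

Note that the values may differ between columns (here 1, 0, 1); only the agreement of the rows within each column matters.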
Affiliation(s)
- Majed Alzahrani
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMCE) Division, Thuwal, 23955-6900, Saudi Arabia
| | - Hiroyuki Kuwahara
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMCE) Division, Thuwal, 23955-6900, Saudi Arabia
| | - Wei Wang
- Department of Computer Science, University of California, Los Angeles, CA 90095, USA
| | - Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMCE) Division, Thuwal, 23955-6900, Saudi Arabia
| |
|
15
|
Kléma J, Malinka F, Železný F. Semantic biclustering for finding local, interpretable and predictive expression patterns. BMC Genomics 2017. [PMID: 29513193 PMCID: PMC5657082 DOI: 10.1186/s12864-017-4132-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022] Open
Abstract
Background: One of the major challenges in the analysis of gene expression data is to identify local patterns composed of genes showing coherent expression across subsets of experimental conditions. Such patterns may provide an understanding of underlying biological processes related to these conditions. This understanding can further be improved by providing concise characterizations of the genes and situations delimiting the pattern. Results: We propose a method called semantic biclustering with the aim to detect interpretable rectangular patterns in binary data matrices. As usual in biclustering, we seek homogeneous submatrices; however, we also require that the included elements can be jointly described in terms of semantic annotations pertaining to both rows (genes) and columns (samples). To find such interpretable biclusters, we explore two strategies. The first endows an existing biclustering algorithm with the semantic ingredients. The other is based on rule and tree learning known from machine learning. Conclusions: The two alternatives are tested in experiments with two Drosophila melanogaster gene expression datasets. Both strategies are shown to detect sets of compact biclusters with semantic descriptions that also remain largely valid for unseen (testing) data. This desirable generalization aspect is more emphasized in the strategy stemming from conventional biclustering although this is traded off by the complexity of the descriptions (number of ontology terms employed), which, on the other hand, is lower for the alternative strategy.
Affiliation(s)
- Jiří Kléma
- Department of Computer Science, Czech Technical University in Prague, Karlovo náměstí 13, 121 35, Prague 2, Czech Republic.
| | - František Malinka
- Department of Computer Science, Czech Technical University in Prague, Karlovo náměstí 13, 121 35, Prague 2, Czech Republic
| | - Filip Železný
- Department of Computer Science, Czech Technical University in Prague, Karlovo náměstí 13, 121 35, Prague 2, Czech Republic
| |
|
16
|
Meng C, Zeleznik OA, Thallinger GG, Kuster B, Gholami AM, Culhane AC. Dimension reduction techniques for the integrative analysis of multi-omics data. Brief Bioinform 2016; 17:628-41. [PMID: 26969681 PMCID: PMC4945831 DOI: 10.1093/bib/bbv108] [Citation(s) in RCA: 210] [Impact Index Per Article: 23.3] [Received: 06/08/2015] [Revised: 10/26/2015] [Indexed: 01/16/2023] Open
Abstract
State-of-the-art next-generation sequencing, transcriptomics, proteomics and other high-throughput 'omics' technologies enable the efficient generation of large experimental data sets. These data may yield unprecedented knowledge about molecular pathways in cells and their role in disease. Dimension reduction approaches have been widely used in exploratory analysis of single omics data sets. This review will focus on dimension reduction approaches for simultaneous exploratory analyses of multiple data sets. These methods extract the linear relationships that best explain the correlated structure across data sets, the variability both within and between variables (or observations) and may highlight data issues such as batch effects or outliers. We explore dimension reduction techniques as one of the emerging approaches for data integration, and how these can be applied to increase our understanding of biological systems in normal physiological function and disease.
|
17
|
Lawlor N, Fabbri A, Guan P, George J, Karuturi RKM. multiClust: An R-package for Identifying Biologically Relevant Clusters in Cancer Transcriptome Profiles. Cancer Inform 2016; 15:103-14. [PMID: 27330269 PMCID: PMC4907340 DOI: 10.4137/cin.s38000] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/27/2015] [Revised: 03/28/2016] [Accepted: 04/03/2016] [Indexed: 12/26/2022] Open
Abstract
Clustering is carried out to identify patterns in transcriptomics profiles to determine clinically relevant subgroups of patients. Feature (gene) selection is a critical and an integral part of the process. Currently, there are many feature selection and clustering methods to identify the relevant genes and perform clustering of samples. However, choosing an appropriate methodology is difficult. In addition, extensive feature selection methods have not been supported by the available packages. Hence, we developed an integrative R-package called multiClust that allows researchers to experiment with the choice of combination of methods for gene selection and clustering with ease. Using multiClust, we identified the best performing clustering methodology in the context of clinical outcome. Our observations demonstrate that simple methods such as variance-based ranking perform well on the majority of data sets, provided that the appropriate number of genes is selected. However, different gene ranking and selection methods remain relevant as no methodology works for all studies.
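The "variance-based ranking" mentioned in the abstract above can be illustrated with a short Python sketch: rank genes by the variance of their expression across samples and keep the top few before clustering. This is an illustration of the general idea only, not the multiClust implementation (which is an R package); the function name `top_variance_genes`, the use of population variance and the toy expression values are assumptions made for this example.

```python
from statistics import pvariance


def top_variance_genes(expression, n_genes):
    """Return the n_genes gene names with the highest variance across samples.

    expression: dict mapping gene name -> list of expression values,
    one value per sample (hypothetical input layout).
    """
    ranked = sorted(expression, key=lambda g: pvariance(expression[g]), reverse=True)
    return ranked[:n_genes]


# Invented toy data: three genes measured in four samples.
expression = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly flat, low variance
    "geneB": [1.0, 9.0, 2.0, 8.0],  # strongly varying
    "geneC": [3.0, 4.0, 3.0, 4.0],  # moderately varying
}

print(top_variance_genes(expression, 2))
```

As the abstract cautions, how many genes to keep (`n_genes` here) matters as much as the ranking criterion itself.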
Affiliation(s)
- Nathan Lawlor
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Alec Fabbri
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT, USA
| | - Peiyong Guan
- Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
| | - Joshy George
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | |
|
18
|
Uzun A, Schuster J, McGonnigal B, Schorl C, Dewan A, Padbury J. Targeted Sequencing and Meta-Analysis of Preterm Birth. PLoS One 2016; 11:e0155021. [PMID: 27163930 PMCID: PMC4862658 DOI: 10.1371/journal.pone.0155021] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 11/26/2015] [Accepted: 04/22/2016] [Indexed: 01/01/2023] Open
Abstract
Understanding the genetic contribution(s) to the risk of preterm birth may lead to the development of interventions for treatment, prediction and prevention. Twin studies suggest heritability of preterm birth is 36-40%. Large epidemiological analyses support a primary maternal origin for recurrence of preterm birth, with little effect of paternal or fetal genetic factors. We exploited an "extreme phenotype" of preterm birth to leverage the likelihood of genetic discovery. We compared variants identified by targeted sequencing of women with 2-3 generations of preterm birth with term controls without history of preterm birth. We used a meta-genomic, bi-clustering algorithm to identify gene sets coordinately associated with preterm birth. We identified 33 genes including 217 variants from 5 modules that were significantly different between cases and controls. The most frequently identified and connected genes in the exome library were IGF1, ATM and IQGAP2. Likewise, SOS1, RAF1 and AKT3 were most frequent in the haplotype library. Additionally, SERPINB8, AZU1 and WASF3 showed significant differences in abundance of variants in the univariate comparison of cases and controls. The biological processes impacted by these gene sets included: cell motility, migration and locomotion; response to glucocorticoid stimulus; signal transduction; metabolic regulation and control of apoptosis.
Affiliation(s)
- Alper Uzun
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, Rhode Island, United States of America
- Brown Alpert Medical School, Providence, Rhode Island, United States of America
| | - Jessica Schuster
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, Rhode Island, United States of America
| | - Bethany McGonnigal
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, Rhode Island, United States of America
| | - Christoph Schorl
- Molecular Biology, Cell Biology & Biochemistry, Brown University, Providence, Rhode Island, United States of America
| | - Andrew Dewan
- Department of Epidemiology and Public Health, Yale University, New Haven, Connecticut, United States of America
| | - James Padbury
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, Rhode Island, United States of America
- Brown Alpert Medical School, Providence, Rhode Island, United States of America
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
| |
|
19
|
Affiliation(s)
- Axel Gandy
- Department of Mathematics, Imperial College London
| | - Georg Hahn
- Department of Mathematics, Imperial College London
| |
|
20
|
Composition and temporal stability of the gut microbiota in older persons. ISME Journal 2015; 10:170-82. [PMID: 26090993 PMCID: PMC4681863 DOI: 10.1038/ismej.2015.88] [Citation(s) in RCA: 276] [Impact Index Per Article: 27.6] [Received: 12/18/2014] [Revised: 04/01/2015] [Accepted: 04/24/2015] [Indexed: 12/21/2022]
Abstract
The composition and function of the human gut microbiota have been linked to health and disease. We previously identified correlations between habitual diet, microbiota composition gradients and health gradients in an unstratified cohort of 178 elderly subjects. To refine our understanding of diet–microbiota associations and differential taxon abundance, we adapted an iterative bi-clustering algorithm (iterative binary bi-clustering of gene sets (iBBiG)) and applied it to microbiota composition data from 732 faecal samples from 371 ELDERMET cohort subjects, including longitudinal samples. We thus identified distinctive microbiota configurations associated with ageing in both community and long-stay residential care elderly subjects. Mixed-taxa populations were identified that had clinically distinct associations. Microbiota temporal instability was observed in both community-dwelling and long-term care subjects, particularly in those with low initial microbiota diversity. However, the stability of the microbiota of subjects had little impact on the directional change of the microbiota as observed for long-stay subjects who display a gradual shift away from their initial microbiota. This was not observed in community-dwelling subjects. This directional change was associated with duration in long-stay. Changes in these bacterial populations represent the loss of the health-associated and youth-associated microbiota components and gain of an elderly associated microbiota. Interestingly, community-associated microbiota configurations were impacted more by the use of antibiotics than the microbiota of individuals in long-term care, as the community-associated microbiota showed more loss but also more recovery following antibiotic treatment. This improved definition of gut microbiota composition patterns in the elderly will better inform the design of dietary or antibiotic interventions targeting the gut microbiota.
|
21
|
Ahmed HA, Mahanta P, Bhattacharyya DK, Kalita JK. Shifting-and-Scaling Correlation Based Biclustering Algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:1239-1252. [PMID: 26357059 DOI: 10.1109/tcbb.2014.2323054] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
The existence of various types of correlations among the expressions of a group of biologically significant genes poses challenges in developing effective methods of gene expression data analysis. The initial focus of computational biologists was to work with only absolute and shifting correlations. However, researchers have found that the ability to handle shifting-and-scaling correlation enables them to extract more biologically relevant and interesting patterns from gene microarray data. In this paper, we introduce an effective shifting-and-scaling correlation measure named Shifting and Scaling Similarity (SSSim), which can detect highly correlated gene pairs in any gene expression data. We also introduce a technique named Intensive Correlation Search (ICS) biclustering algorithm, which uses SSSim to extract biologically significant biclusters from a gene expression data set. The technique performs satisfactorily with a number of benchmarked gene expression data sets when evaluated in terms of functional categories in Gene Ontology database.
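A "shifting-and-scaling" relationship between two expression profiles, as discussed in the abstract above, means one profile is roughly a scaled and offset copy of the other (y ≈ a·x + b). The abstract does not define SSSim, so the sketch below does not attempt to reproduce it; it uses the ordinary Pearson correlation coefficient as a stand-in to show why a correlation-style measure recognizes such pairs while a plain distance does not. All names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Invented profile of one gene over four conditions.
gene_x = [1.0, 3.0, 2.0, 5.0]
shifted = [v + 4.0 for v in gene_x]               # shifting only
shifted_scaled = [2.5 * v - 1.0 for v in gene_x]  # shifting and scaling

# The correlation stays at (numerically) 1 for both derived profiles,
# even though their Euclidean distance from gene_x is far from zero.
print(pearson(gene_x, shifted), euclidean(gene_x, shifted))
print(pearson(gene_x, shifted_scaled), euclidean(gene_x, shifted_scaled))
```

With a negative scaling factor the Pearson coefficient would be -1 instead, so a measure meant to capture such pairs would typically consider the magnitude of the correlation.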
|
22
|
Gandy A, Hahn G. MMCTest-A Safe Algorithm for Implementing Multiple Monte Carlo Tests. Scand Stat Theory Appl 2014. [DOI: 10.1111/sjos.12085] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/19/2023]
Affiliation(s)
- Axel Gandy
- Department of Mathematics; Imperial College London
| | - Georg Hahn
- Department of Mathematics; Imperial College London
| |
|
23
|
Jiang B, Liu JS, Bulyk ML. Bayesian hierarchical model of protein-binding microarray k-mer data reduces noise and identifies transcription factor subclasses and preferred k-mers. Bioinformatics 2013; 29:1390-8. [PMID: 23559638 DOI: 10.1093/bioinformatics/btt152] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 01/02/2023]
Abstract
MOTIVATION: Sequence-specific transcription factors (TFs) regulate the expression of their target genes through interactions with specific DNA-binding sites in the genome. Data on TF-DNA binding specificities are essential for understanding how regulatory specificity is achieved. RESULTS: Numerous studies have used universal protein-binding microarray (PBM) technology to determine the in vitro binding specificities of hundreds of TFs for all possible 8 bp sequences (8mers). We have developed a Bayesian analysis of variance (ANOVA) model that decomposes these 8mer data into background noise, TF familywise effects and effects due to the particular TF. Adjusting for background noise improves PBM data quality and concordance with in vivo TF binding data. Moreover, our model provides simultaneous identification of TF subclasses and their shared sequence preferences, and also of 8mers bound preferentially by individual members of TF subclasses. Such results may aid in deciphering cis-regulatory codes and determinants of protein-DNA binding specificity. AVAILABILITY AND IMPLEMENTATION: Source code, compiled code and R and Python scripts are available from http://thebrain.bwh.harvard.edu/hierarchicalANOVA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bo Jiang
- Department of Statistics, Harvard University, Cambridge, MA 02138, USA.
| | | | | |
|