1
Fong WJ, Tan HM, Garg R, Teh AL, Pan H, Gupta V, Krishna B, Chen ZH, Purwanto NY, Yap F, Tan KH, Chan KYJ, Chan SY, Goh N, Rane N, Tan ESE, Jiang Y, Han M, Meaney M, Wang D, Keppo J, Tan GCY. Comparing feature selection and machine learning approaches for predicting CYP2D6 methylation from genetic variation. Front Neuroinform 2024; 17:1244336. [PMID: 38449836 PMCID: PMC10915285 DOI: 10.3389/fninf.2023.1244336] [Received: 06/22/2023] [Accepted: 10/18/2023] [Indexed: 03/08/2024]
Abstract
Introduction Pharmacogenetics currently supports clinical decision-making on the basis of a limited number of variants in a few genes and may benefit paediatric prescribing, where there is a need for more precise dosing. Integrating genomic information such as methylation into pharmacogenetic models holds the potential to improve their accuracy and, consequently, prescribing decisions. Cytochrome P450 2D6 (CYP2D6) is a highly polymorphic gene conventionally associated with the metabolism of commonly used drugs and endogenous substrates. We thus sought to predict epigenetic loci from single nucleotide polymorphisms (SNPs) related to CYP2D6 in children from the GUSTO cohort.
Methods Buffy coat DNA methylation was quantified using the Illumina Infinium MethylationEPIC BeadChip. CpG sites associated with CYP2D6 were used as outcome variables in Linear Regression, Elastic Net and XGBoost models. We compared feature selection of SNPs from GWAS mQTLs, GTEx eQTLs and SNPs within 2 Mb of the CYP2D6 gene, and the impact of adding demographic data. The samples were split into training (75%) and test (25%) sets for validation. For the Elastic Net and XGBoost models, the hyperparameter search used 10-fold cross-validation. Root Mean Square Error and R-squared values were obtained to assess each model's performance. When GWAS was performed to determine SNPs associated with CpG sites, a total of 15 SNPs were identified, several of which appeared to influence multiple CpG sites.
Results Overall, Elastic Net models of genetic features appeared to perform marginally better than heritability estimates and substantially better than Linear Regression and XGBoost models. The addition of nongenetic features appeared to improve performance for some but not all feature sets and probes. The best feature set and Machine Learning (ML) approach differed substantially between CpG sites, and a number of top variables were identified for each model.
Discussion The development of SNP-based prediction models for CYP2D6 CpG methylation in Singaporean children of varying ethnicities in this study has clinical application. With further validation, these models may add to the set of tools available to improve precision medicine and pharmacogenetics-based dosing.
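The evaluation protocol described in the Methods (75%/25% train/test split, 10-fold cross-validation to choose the penalty, then RMSE and R-squared on the held-out set) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' pipeline: the genotypes, effect sizes, penalty grid, `l1_ratio` and all function names are assumptions made up for the example, and the data are simulated rather than taken from GUSTO. In practice a library implementation (for example scikit-learn's `ElasticNetCV`) would normally replace the hand-written coordinate-descent solver.

```python
import math
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha, l1_ratio=0.5, n_iter=50):
    """Coordinate descent on centred data for the objective
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    residual = [yi - y_mean for yi in y]
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    weights = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual (feature j added back).
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * weights[j]
            new_w = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Xc[i][j]
                weights[j] = new_w
    intercept = y_mean - sum(w * m for w, m in zip(weights, x_mean))
    return intercept, weights


def predict(model, X):
    intercept, weights = model
    return [intercept + sum(w * x for w, x in zip(weights, row)) for row in X]


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


def cv_rmse(X, y, alpha, k=10):
    """Mean validation RMSE over k folds (fold f holds out rows f, f+k, f+2k, ...)."""
    n = len(X)
    fold_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_rows = [i for i in range(n) if i not in held_out]
        val_rows = sorted(held_out)
        model = fit_elastic_net([X[i] for i in train_rows], [y[i] for i in train_rows], alpha)
        preds = predict(model, [X[i] for i in val_rows])
        fold_errors.append(rmse([y[i] for i in val_rows], preds))
    return sum(fold_errors) / k


# Simulated data (assumption): 12 SNPs coded 0/1/2, three of them affect methylation.
rng = random.Random(0)
n_samples, n_snps = 200, 12
X = [[sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_snps)] for _ in range(n_samples)]
true_effects = [0.08, -0.05, 0.03] + [0.0] * (n_snps - 3)
y = [0.5 + sum(b * g for b, g in zip(true_effects, row)) + rng.gauss(0, 0.02) for row in X]

# 75% training / 25% test split.
indices = list(range(n_samples))
rng.shuffle(indices)
cut = int(0.75 * n_samples)
X_train, y_train = [X[i] for i in indices[:cut]], [y[i] for i in indices[:cut]]
X_test, y_test = [X[i] for i in indices[cut:]], [y[i] for i in indices[cut:]]

# Choose the penalty strength by 10-fold cross-validation on the training set only.
alphas = [0.0005, 0.005, 0.05]
best_alpha = min(alphas, key=lambda a: cv_rmse(X_train, y_train, a))

model = fit_elastic_net(X_train, y_train, best_alpha)
y_pred = predict(model, X_test)
test_rmse = rmse(y_test, y_pred)
test_r2 = r_squared(y_test, y_pred)
print(f"alpha={best_alpha} test RMSE={test_rmse:.4f} test R^2={test_r2:.3f}")
```

Keeping the hyperparameter search inside the training set means the test-set RMSE and R-squared are not influenced by the choice of penalty, which is the point of holding out 25% of the samples.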
Affiliation(s)
- Wei Jing Fong
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Hong Ming Tan
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Rishabh Garg
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Ai Ling Teh
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Hong Pan
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Varsha Gupta
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Bernadus Krishna
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Zou Hui Chen
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Fabian Yap
  - KK Women's and Children's Hospital, Singapore, Singapore
- Kok Hian Tan
  - KK Women's and Children's Hospital, Singapore, Singapore
  - Duke NUS Medical School, Singapore, Singapore
- Kok Yen Jerry Chan
  - KK Women's and Children's Hospital, Singapore, Singapore
  - Duke NUS Medical School, Singapore, Singapore
- Shiao-Yng Chan
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - National University Hospital, Singapore, Singapore
- Nikita Rane
  - Institute of Mental Health, Singapore, Singapore
- Mei Han
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Michael Meaney
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Dennis Wang
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - National Heart and Lung Institute, Imperial College London, London, United Kingdom
- Jussi Keppo
  - Computational Biology, National University of Singapore, Singapore, Singapore
- Geoffrey Chern-Yee Tan
  - Computational Biology, National University of Singapore, Singapore, Singapore
  - Institute of Mental Health, Singapore, Singapore
2
Tan GCY, Wang Z, Tan ESE, Ong RJM, Ooi PE, Lee D, Rane N, Tey SYX, Chua SY, Goh N, Lam GW, Chakraborty A, Yew AKL, Ong SK, Kee JL, Lim XY, Hashim N, Lu SH, Meany M, Tolomeo S, Lee CA, Tan HM, Keppo J. Transdiagnostic clustering of self-schema from self-referential judgements identifies subtypes of healthy personality and depression. Front Neuroinform 2024; 17:1244347. [PMID: 38274390 PMCID: PMC10808829 DOI: 10.3389/fninf.2023.1244347] [Received: 06/22/2023] [Accepted: 11/06/2023] [Indexed: 01/27/2024]
Abstract
Introduction The heterogeneity of depressive and anxiety disorders complicates clinical management, as it may account for differences in trajectory and treatment response. Self-schemas, which can be determined by Self-Referential Judgements (SRJs), are heterogeneous yet stable. SRJs have been used to characterize personality in the general population and have been shown to be prognostic in depressive and anxiety disorders.
Methods In this study, we used SRJs from a Self-Referential Encoding Task (SRET) to identify clusters in a clinical sample of 119 patients recruited from the Institute of Mental Health presenting with depressive or anxiety symptoms and in a non-clinical sample of 115 healthy adults. The generated clusters were examined in terms of most endorsed words, cross-sample correspondence, association with depressive symptoms and the Depressive Experiences Questionnaire, and diagnostic category.
Results We identified a 5-cluster solution in each sample and a 7-cluster solution in the combined sample. When perturbed, metrics such as optimum cluster number, criterion value, likelihood, DBI and CHI remained stable, and cluster centers appeared stable when using BIC or ICL as criteria. Top endorsed words in clusters were meaningful across theoretical frameworks from personality, psychodynamic concepts of relatedness and self-definition, and valence in self-referential processing. The clinical clusters were labeled "Neurotic" (C1), "Extraverted" (C2), "Anxious to please" (C3), "Self-critical" (C4) and "Conscientious" (C5). The non-clinical clusters were labeled "Self-confident" (N1), "Low endorsement" (N2), "Non-neurotic" (N3), "Neurotic" (N4) and "High endorsement" (N5). The combined clusters were labeled "Self-confident" (NC1), "Externalising" (NC2), "Neurotic" (NC3), "Secure" (NC4), "Low endorsement" (NC5), "High endorsement" (NC6) and "Self-critical" (NC7). Cluster differences were observed in endorsement of positive and negative words, latency biases, recall biases, depressive symptoms, frequency of depressive disorders and self-criticism.
Discussion Overall, clusters endorsing more negative words tended to endorse fewer positive words, showed more negative biases in reaction time and recall, and reported more severe depressive symptoms, a higher frequency of depressive disorders and more self-criticism in the clinical population. SRJ-based clustering represents a novel transdiagnostic framework for subgrouping patients with depressive and anxiety symptoms that may support the future translation of the science of self-referential processing, personality and psychodynamic concepts of self-definition to clinical applications.
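Two of the cluster-validity metrics named in the Results can be illustrated with a short sketch, assuming DBI and CHI denote the Davies-Bouldin and Calinski-Harabasz indices (the abstract does not expand the abbreviations). The abstract also does not name the clustering algorithm, so plain k-means is swapped in here purely as a stand-in; the simulated yes/no word endorsements, the three latent profiles and all function names are hypothetical and are not the study's data or code.

```python
import math
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]


def k_means(points, k, rng, n_iter=25):
    """Plain Lloyd's algorithm; initial centres are k distinct observed points."""
    centers = rng.sample(sorted(set(points)), k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = centroid(members)
    return labels


def group(points, labels):
    """Collect the points of each non-empty cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return list(clusters.values())


def davies_bouldin(clusters):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centre distance; lower is better."""
    cents = [centroid(m) for m in clusters]
    scatter = [sum(math.sqrt(sq_dist(p, c)) for p in m) / len(m) for m, c in zip(clusters, cents)]
    k = len(clusters)
    worst = [
        max((scatter[i] + scatter[j]) / math.sqrt(sq_dist(cents[i], cents[j]))
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


def calinski_harabasz(clusters):
    """Between-cluster over within-cluster dispersion, each scaled by its degrees of freedom; higher is better."""
    all_points = [p for m in clusters for p in m]
    n, k = len(all_points), len(clusters)
    grand = centroid(all_points)
    cents = [centroid(m) for m in clusters]
    between = sum(len(m) * sq_dist(c, grand) for m, c in zip(clusters, cents))
    within = sum(sq_dist(p, c) for m, c in zip(clusters, cents) for p in m)
    return (between / (k - 1)) / (within / (n - k))


# Simulated endorsements (assumption): 6 positive then 6 negative words, 40 people per profile.
rng = random.Random(1)
profiles = [(0.9, 0.1), (0.2, 0.8), (0.1, 0.1)]  # P(endorse positive word), P(endorse negative word)
points = [
    tuple(int(rng.random() < (p_pos if w < 6 else p_neg)) for w in range(12))
    for p_pos, p_neg in profiles
    for _ in range(40)
]

# Score candidate cluster numbers with both indices.
scores = {}
for k in range(2, 6):
    clusters = group(points, k_means(points, k, rng))
    scores[k] = (davies_bouldin(clusters), calinski_harabasz(clusters))
    print(f"k={k} DBI={scores[k][0]:.3f} CHI={scores[k][1]:.1f}")
```

A stability check of the kind the abstract mentions would repeat this scoring on perturbed copies of the data (for example resampled participants) and compare how much the indices and the preferred cluster number move.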
Affiliation(s)
- Rachel Jing Min Ong
  - Faculty of Social Sciences, National University of Singapore, Singapore, Singapore
- Pei En Ooi
  - School of Biological Sciences, National Technological University, Singapore, Singapore
- Danan Lee
  - Yale-NUS College, Singapore, Singapore
- Nikita Rane
  - Institute of Mental Health, Singapore, Singapore
- Si Ying Chua
  - Institute of Mental Health, Singapore, Singapore
- Atlanta Chakraborty
  - Institute of Operations Research and Analytics, National University of Singapore, Singapore, Singapore
- Anthony Khye Loong Yew
  - Institute of Operations Research and Analytics, National University of Singapore, Singapore, Singapore
- Xin Ying Lim
  - Faculty of Social Sciences, National University of Singapore, Singapore, Singapore
- Nawal Hashim
  - Institute of Mental Health, Singapore, Singapore
- Michael Meany
  - Singapore Institute for Clinical Sciences, A*STAR, Singapore, Singapore
- Serenella Tolomeo
  - Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Hong Ming Tan
  - Institute of Operations Research and Analytics, National University of Singapore, Singapore, Singapore
- Jussi Keppo
  - Institute of Operations Research and Analytics, National University of Singapore, Singapore, Singapore