1
Alcázar Magaña A, Vaswani A, Brown KS, Jiang Y, Alam MN, Caruso M, Lak P, Cheong P, Gray NE, Quinn JF, Soumyanath A, Stevens JF, Maier CS. Integrating High-Resolution Mass Spectral Data, Bioassays and Computational Models to Annotate Bioactives in Botanical Extracts: Case Study Analysis of C. asiatica Extract Associates Dicaffeoylquinic Acids with Protection against Amyloid-β Toxicity. Molecules 2024; 29:838. [PMID: 38398590 PMCID: PMC10892090 DOI: 10.3390/molecules29040838] [Received: 12/30/2023] [Revised: 02/07/2024] [Accepted: 02/12/2024] [Indexed: 02/25/2024] [Open Access]
Abstract
Rapid screening of botanical extracts for the discovery of bioactive natural products was performed using a fractionation approach in conjunction with flow-injection high-resolution mass spectrometry for obtaining chemical fingerprints of each fraction, enabling the correlation of the relative abundance of molecular features (representing individual phytochemicals) with the read-outs of bioassays. We applied this strategy for discovering and identifying constituents of Centella asiatica (C. asiatica) that protect against Aβ cytotoxicity in vitro. C. asiatica has been associated with improving mental health and cognitive function, with potential use in Alzheimer's disease. Human neuroblastoma MC65 cells were exposed to subfractions of an aqueous extract of C. asiatica to evaluate the protective benefit derived from these subfractions against amyloid β-cytotoxicity. The % viability score of the cells exposed to each subfraction was used in conjunction with the intensity of the molecular features in two computational models, namely Elastic Net and selectivity ratio, to determine the relationship of the peak intensity of molecular features with % viability. Finally, the correlation of mass spectral features with MC65 protection and their abundance in different sub-fractions were visualized using GNPS molecular networking. Both computational methods unequivocally identified dicaffeoylquinic acids as providing strong protection against Aβ-toxicity in MC65 cells, in agreement with the protective effects observed for these compounds in previous preclinical model studies.
Affiliation(s)
- Armando Alcázar Magaña
  - Department of Chemistry, Oregon State University, Corvallis, OR 97331, USA
  - BENFRA Botanical Dietary Supplements Research Center, Oregon Health & Science University, Portland, OR 97239, USA
  - Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Ashish Vaswani
  - Department of Chemistry, Oregon State University, Corvallis, OR 97331, USA
- Kevin S. Brown
  - Department of Pharmaceutical Sciences, Oregon State University, Corvallis, OR 97331, USA
  - School of Chemical, Biological, and Environmental Engineering, Oregon State University, 116 Johnson Hall, 105 SW 26th Street, Corvallis, OR 97331, USA
- Yuan Jiang
  - Department of Statistics, Oregon State University, Corvallis, OR 97331, USA
- Md Nure Alam
  - Department of Chemistry, Oregon State University, Corvallis, OR 97331, USA
- Maya Caruso
  - Department of Neurology, Oregon Health & Science University, Portland, OR 97239, USA
- Parnian Lak
  - Department of Chemistry, Oregon State University, Corvallis, OR 97331, USA
- Paul Cheong
  - Department of Chemistry, Oregon State University, Corvallis, OR 97331, USA
- Nora E. Gray
  - BENFRA Botanical Dietary Supplements Research Center, Oregon Health & Science University, Portland, OR 97239, USA
  - Department of Neurology, Oregon Health & Science University, Portland, OR 97239, USA
- Joseph F. Quinn
  - Department of Neurology, Oregon Health & Science University, Portland, OR 97239, USA
  - Parkinson’s Disease Research Education and Clinical Care Center, Veterans’ Administration Portland Health Care System, Portland, OR 97239, USA
- Amala Soumyanath
  - BENFRA Botanical Dietary Supplements Research Center, Oregon Health & Science University, Portland, OR 97239, USA
  - Department of Neurology, Oregon Health & Science University, Portland, OR 97239, USA
- Jan F. Stevens
  - BENFRA Botanical Dietary Supplements Research Center, Oregon Health & Science University, Portland, OR 97239, USA
  - Department of Pharmaceutical Sciences, Oregon State University, Corvallis, OR 97331, USA
  - Linus Pauling Institute, Oregon State University, Corvallis, OR 97331, USA
- Claudia S. Maier
  - Department of Chemistry, Oregon State University, Corvallis, OR 97331, USA
  - BENFRA Botanical Dietary Supplements Research Center, Oregon Health & Science University, Portland, OR 97239, USA
  - Linus Pauling Institute, Oregon State University, Corvallis, OR 97331, USA
2
Liu X, Zhu B, Dai XW, Xu ZA, Li R, Qian Y, Lu YP, Zhang W, Liu Y, Zheng J. GBDT_KgluSite: An improved computational prediction model for lysine glutarylation sites based on feature fusion and GBDT classifier. BMC Genomics 2023; 24:765. [PMID: 38082413 PMCID: PMC10712101 DOI: 10.1186/s12864-023-09834-z] [Received: 04/02/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] [Open Access]
Abstract
BACKGROUND Lysine glutarylation (Kglu) is one of the most important post-translational modifications (PTMs) and plays significant roles in various cellular functions, including metabolism, mitochondrial processes, and translation. Accurate identification of Kglu sites is therefore important for elucidating protein molecular function. Because traditional biological experiments are time-consuming and expensive, computational Kglu site prediction is gaining more and more attention. RESULTS In this paper, we propose GBDT_KgluSite, a novel Kglu site prediction model based on a gradient boosting decision tree (GBDT) classifier and appropriate feature combinations, which achieved satisfactory performance. Specifically, seven features, including sequence-based, physicochemical property-based, structure-based, and evolutionary-derived features, were used to characterize proteins. NearMiss-3 and Elastic Net were applied to address data imbalance and feature redundancy, respectively. The experimental results show that GBDT_KgluSite has good robustness and generalization ability, with accuracy and AUC values of 93.73% and 98.14% on five-fold cross-validation and 90.11% and 96.75% on the independent test dataset, respectively. CONCLUSION GBDT_KgluSite is an effective computational method for identifying Kglu sites in protein sequences. It has good stability and generalization ability and could be useful for the identification of new Kglu sites in the future. The relevant code and dataset are available at https://github.com/flyinsky6/GBDT_KgluSite .
Affiliation(s)
- Xin Liu
  - School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Bao Zhu
  - Cancer Institute, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
  - Jiangsu Center for the Collaboration and Innovation of Cancer Biotherapy, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Xia-Wei Dai
  - School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Zhi-Ao Xu
  - School of Life Sciences, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Rui Li
  - School of Life Sciences, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Yuting Qian
  - Jiangsu Center for the Collaboration and Innovation of Cancer Biotherapy, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Ya-Ping Lu
  - School of Humanities and Arts, China University of Mining and Technology, Xuzhou, Jiangsu, 221116, China
- Wenqing Zhang
  - School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Yong Liu
  - Cancer Institute, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
  - Jiangsu Center for the Collaboration and Innovation of Cancer Biotherapy, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Junnian Zheng
  - Jiangsu Center for the Collaboration and Innovation of Cancer Biotherapy, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
  - Center of Clinical Oncology, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, 221002, China
3
Tanigawa Y, Kellis M. Power of inclusion: Enhancing polygenic prediction with admixed individuals. Am J Hum Genet 2023; 110:1888-1902. [PMID: 37890495 PMCID: PMC10645553 DOI: 10.1016/j.ajhg.2023.09.013] [Received: 01/23/2023] [Revised: 09/22/2023] [Accepted: 09/22/2023] [Indexed: 10/29/2023] [Open Access]
Abstract
Admixed individuals offer unique opportunities for addressing limited transferability in polygenic scores (PGSs), given the substantial trans-ancestry genetic correlation in many complex traits. However, they are rarely considered in PGS training, given the challenges in representing ancestry-matched linkage-disequilibrium reference panels for admixed individuals. Here we present inclusive PGS (iPGS), which captures ancestry-shared genetic effects by finding the exact solution for penalized regression on individual-level data and is thus naturally applicable to admixed individuals. We validate our approach in a simulation study across 33 configurations with varying heritability, polygenicity, and ancestry composition in the training set. When iPGS is applied to n = 237,055 ancestry-diverse individuals in the UK Biobank, it shows the greatest improvements in Africans by 48.9% on average across 60 quantitative traits and up to 50-fold improvements for some traits (neutrophil count, R² = 0.058) over the baseline model trained on the same number of European individuals. When we allowed iPGS to use n = 284,661 individuals, we observed an average improvement of 60.8% for African, 11.6% for South Asian, 7.3% for non-British White, 4.8% for White British, and 17.8% for the other individuals. We further developed iPGS+refit to jointly model the ancestry-shared and -dependent genetic effects when heterogeneous genetic associations were present. For neutrophil count, for example, iPGS+refit showed the highest predictive performance in the African group (R² = 0.115), which exceeds the best predictive performance for the White British group (R² = 0.090 in the iPGS model), even though only 1.49% of individuals used in the iPGS training are of African ancestry. Our results indicate the power of including diverse individuals for developing more equitable PGS models.
Affiliation(s)
- Yosuke Tanigawa
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Manolis Kellis
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
4
Peterson J, Gilbert-Gatty M, Ekström K, Hagesjö L, Bengtson A. Near-the-Line Steel Slag Analysis Using Laser-Induced Breakdown Spectroscopy: Traditional Univariate Versus Machine Learning Calibration Methods. Appl Spectrosc 2023; 77:907-914. [PMID: 36495069 DOI: 10.1177/00037028221144654] [Indexed: 06/17/2023]
Abstract
This work is focused on rapid quantitative analysis of slag in the steel industry for improved process control. The novel approach in this work is a direct comparison of two methods to calibrate and quantify spectral data from the slags. Calibration was first done with the most prevalent method in quantitative optical emission spectroscopy (OES) of solids, the univariate ratio method. The second method is an advanced multivariate analysis (MVA) algorithm termed Elastic Net, which allows several lines for each element to be included in the calibration functions. In both methods, the output is mass fraction ratios of the analyte element (or compound) to a matrix element (compound). The actual mass fractions of each compound are calculated by sum normalization, assuming the matrix makes up the difference up to 100%. The metric used to evaluate the accuracy of the methods is the parameter σrel, calculated as the root mean square (RMS) deviation from values obtained by X-ray fluorescence (XRF) divided by the average mass fraction of the compound, expressed in percent. Somewhat surprisingly, the main outcome of the comparison is that there is very little difference in the performance of the two methods. One exception is the analysis of MgO, where the elastic net gives significantly better accuracy. Presumably, this is due to the use of multiple lines for Mg to build the calibration function. This is very encouraging, since MgO is a major compound in most slags that needs to be determined accurately. It is suggested to improve accuracy further by means of separate calibrations for a limited number of slag types.
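The sum normalization and the σrel metric described in this abstract can be sketched in a few lines. This is only an illustration of one plausible reading of the abstract, not the authors' code: it assumes each output is a ratio of analyte mass fraction to matrix mass fraction, and that matrix plus analytes sum to 100%. The function names, the choice of CaO as matrix, and all numbers are hypothetical.

```python
from math import sqrt

def mass_fractions_from_ratios(ratios):
    """Convert analyte-to-matrix mass fraction ratios into mass fractions in percent.

    Assumption: the matrix compound and all listed analytes together make up 100 %.
    """
    total = 1.0 + sum(ratios.values())  # the matrix has ratio 1 to itself
    fractions = {name: 100.0 * r / total for name, r in ratios.items()}
    fractions["matrix"] = 100.0 / total
    return fractions

def sigma_rel(predicted, reference):
    """RMS deviation from the reference values divided by their mean, in percent."""
    n = len(reference)
    rms = sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)
    return 100.0 * rms / (sum(reference) / n)

# Hypothetical ratios relative to a CaO matrix (illustrative numbers only).
ratios = {"SiO2": 0.5, "MgO": 0.25, "Al2O3": 0.25}
fractions = mass_fractions_from_ratios(ratios)

# Hypothetical predicted vs. XRF reference mass fractions for one compound.
accuracy = sigma_rel([9.0, 11.0], [10.0, 10.0])
```

Dividing by the mean reference value makes σrel comparable between major and minor compounds, which is presumably why the abstract reports accuracy in this relative form.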
Affiliation(s)
- Jonas Peterson
  - Department of Analytical Process Monitoring, Swerim AB, Kista, Sweden
- Krister Ekström
  - Department of Analytical Process Monitoring, Swerim AB, Kista, Sweden
- Louise Hagesjö
  - Department of Analytical Process Monitoring, Swerim AB, Kista, Sweden
- Arne Bengtson
  - Department of Analytical Process Monitoring, Swerim AB, Kista, Sweden
5
Chakraborty M, Shakir Mahmud M, Gates TJ, Sinha S. Analysis and Prediction of Human Mobility in the United States during the Early Stages of the COVID-19 Pandemic using Regularized Linear Models. Transp Res Rec 2023; 2677:380-395. [PMID: 37153191 PMCID: PMC10149351 DOI: 10.1177/03611981211067794] [Indexed: 05/09/2023]
Abstract
Since the United States started grappling with the COVID-19 pandemic, with the highest number of confirmed cases and deaths in the world as of August 2020, most states have enforced travel restrictions, resulting in drastic reductions in mobility and travel. However, the long-term implications of this crisis for mobility remain uncertain. To this end, this study proposes an analytical framework that determines the most significant factors affecting human mobility in the United States during the early days of the pandemic. In particular, the study uses least absolute shrinkage and selection operator (LASSO) regularization to identify the most significant variables influencing human mobility and uses linear regularization algorithms, including ridge, LASSO, and elastic net modeling techniques, to predict human mobility. State-level data were obtained from various sources from January 1, 2020 to June 13, 2020. The entire data set was divided into a training and a test data set, and the variables selected by LASSO were used to train models with the linear regularization algorithms on the training data set. Finally, the prediction accuracy of the developed models was examined on the test data. The results indicate that several factors, including the number of new cases, social distancing, stay-at-home orders, domestic travel restrictions, mask-wearing policy, socioeconomic status, unemployment rate, transit mode share, percent of population working from home, and percent of older (60+ years) and African and Hispanic American populations, among others, significantly influence daily trips. Moreover, among all models, ridge regression performed best, with the lowest error, whereas both LASSO and elastic net performed better than the ordinary linear model.
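For orientation, the three regularized linear models named in this abstract can be written as special cases of one penalized least-squares problem. The parameterization below is a common textbook convention and is an assumption here; the study may scale the loss or the penalties differently. The symbols n (number of observations), λ ≥ 0 (penalty strength), and α ∈ [0, 1] (mixing weight) are not taken from the abstract.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda \left[ \alpha\,\lVert \beta \rVert_1
  + \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2 \right]
```

Setting α = 1 gives LASSO, α = 0 gives ridge regression, and 0 < α < 1 gives the elastic net. Only the L1 term can set coefficients exactly to zero, which is why LASSO is the natural choice for the variable-selection step described above.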
Affiliation(s)
- Meghna Chakraborty
  - Department of Civil and Environmental Engineering, Michigan State University, East Lansing, MI
- Md Shakir Mahmud
  - Department of Civil and Environmental Engineering, Michigan State University, East Lansing, MI
- Timothy J. Gates
  - Department of Civil and Environmental Engineering, Michigan State University, East Lansing, MI
6
Mineo L, Rodolico A, Spedicato GA, Aguglia A, Bolognesi S, Concerto C, Cuomo A, Goracci A, Serafini G, Maina G, Fagiolini A, Amore M, Aguglia E. Which mixed depression model? A comparison between DSM-5-defined mixed features and Koukopoulos' criteria. Bipolar Disord 2022; 24:530-538. [PMID: 34846773 DOI: 10.1111/bdi.13166] [Indexed: 11/26/2022]
Abstract
BACKGROUND The criteria of the Diagnostic and Statistical Manual of Mental Disorders 5th edition "with mixed features specifier" (DSM-5 MFS) are considered controversial since they include only typical manic symptoms. By contrast, Koukopoulos developed an alternative model of mixed depression (MxD) focusing primarily on the excitatory component. OBJECTIVE To compare DSM-5 MFS and Koukopoulos' MxD (KMxD) in terms of prevalence, associated clinical variables, and discriminative capacity for bipolar depression in patients with major depressive episode (MDE). METHODS A total of 300 patients with MDE (155 with major depressive disorder and 145 with bipolar disorder [BD]) were recruited. The discriminative capacity of DSM-5 MFS and KMxD criteria for BD was estimated using the area under the receiver operating characteristic curve (ROC_AUC). The clinical variables associated with these two diagnostic constructs were assessed by logistic regression. RESULTS A total of 44 and 165 patients met the DSM-5 MFS and KMxD criteria, respectively. The ROC_AUCs and their confidence intervals for BD according to DSM-5 MFS and KMxD were 77.0% (72.0%-82.1%) and 71.9% (66.2%-77.7%), respectively. The optimal thresholds (combining sensitivity and specificity measures) for BD diagnosis were ≥1 symptom (77%/68%) for DSM-5 MFS and ≥3 symptoms (78%/66%) for KMxD. However, at the DSM-5 MFS cut-off (≥3 symptoms), specificity (97%) increased at the expense of sensitivity (26%). CONCLUSIONS KMxD and DSM-5 MFS showed an overlapping discriminative capacity for bipolar depression. The current diagnostic threshold of DSM-5 MFS did not prove to be very inclusive compared with the greater diagnostic sensitivity of KMxD, which also yielded better association with clinical variables related to mixedness.
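The trade-off reported here, where a stricter symptom threshold raises specificity and lowers sensitivity, can be illustrated with a minimal sketch. The data below are invented for illustration and are not from the study; the function names are hypothetical. The AUC is computed in its rank form, as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Treat score >= threshold as a positive call; return (sensitivity, specificity)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """AUC as P(positive outscores negative), counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical symptom counts; label 1 = bipolar disorder, 0 = major depressive disorder.
symptom_counts = [0, 1, 3, 4, 0, 0, 1, 2]
diagnosis = [1, 1, 1, 1, 0, 0, 0, 0]

lenient = sensitivity_specificity(symptom_counts, diagnosis, threshold=1)
strict = sensitivity_specificity(symptom_counts, diagnosis, threshold=3)
auc = roc_auc(symptom_counts, diagnosis)
```

On this toy data, moving the cut-off from ≥1 to ≥3 symptoms lowers sensitivity and raises specificity while the AUC stays the same, because the AUC summarizes all thresholds at once.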
Affiliation(s)
- Ludovico Mineo
  - Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
- Alessandro Rodolico
  - Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
- Giorgio A Spedicato
  - Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milan, Italy
- Andrea Aguglia
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health, Section of Psychiatry, University of Genoa, Genoa, Italy
  - IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Simone Bolognesi
  - Department of Molecular Medicine, University of Siena, Siena, Italy
- Carmen Concerto
  - Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
- Alessandro Cuomo
  - Department of Molecular Medicine, University of Siena, Siena, Italy
- Arianna Goracci
  - Department of Molecular Medicine, University of Siena, Siena, Italy
- Gianluca Serafini
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health, Section of Psychiatry, University of Genoa, Genoa, Italy
  - IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Giuseppe Maina
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, University Hospital San Luigi Gonzaga, Turin, Italy
- Andrea Fagiolini
  - Department of Molecular Medicine, University of Siena, Siena, Italy
- Mario Amore
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health, Section of Psychiatry, University of Genoa, Genoa, Italy
  - IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Eugenio Aguglia
  - Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
7
Du J, Boss J, Han P, Beesley LJ, Kleinsasser M, Goutman SA, Batterman S, Feldman EL, Mukherjee B. Variable selection with multiply-imputed datasets: choosing between stacked and grouped methods. J Comput Graph Stat 2022; 31:1063-1075. [PMID: 36644406 PMCID: PMC9838615 DOI: 10.1080/10618600.2022.2035739] [Received: 06/09/2020] [Revised: 01/04/2022] [Accepted: 01/24/2022] [Indexed: 02/01/2023]
Abstract
Penalized regression methods are used in many biomedical applications for variable selection and simultaneous coefficient estimation. However, missing data complicates the implementation of these methods, particularly when missingness is handled using multiple imputation. Applying a variable selection algorithm on each imputed dataset will likely lead to different sets of selected predictors. This paper considers a general class of penalized objective functions which, by construction, force selection of the same variables across imputed datasets. By pooling objective functions across imputations, optimization is then performed jointly over all imputed datasets rather than separately for each dataset. We consider two objective function formulations that exist in the literature, which we will refer to as "stacked" and "grouped" objective functions. Building on existing work, we (a) derive and implement efficient cyclic coordinate descent and majorization-minimization optimization algorithms for continuous and binary outcome data, (b) incorporate adaptive shrinkage penalties, (c) compare these methods through simulation, and (d) develop an R package miselect. Simulations demonstrate that the "stacked" approaches are more computationally efficient and have better estimation and selection properties. We apply these methods to data from the University of Michigan ALS Patients Biorepository aiming to identify the association between environmental pollutants and ALS risk. Supplementary materials are available online.
Affiliation(s)
- Jiacong Du
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Jonathan Boss
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Peisong Han
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Lauren J Beesley
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Stuart Batterman
  - Department of Environmental Health Science, University of Michigan, Ann Arbor, MI
- Eva L Feldman
  - Department of Neurology, University of Michigan, Ann Arbor, MI
8
Xiang Y, Zou X, Shi H, Xu X, Wu C, Zhong W, Wang J, Zhou W, Zeng X, He M, Wang Y, Huang L, Wang X. Elastic Net Models Based on DNA Copy Number Variations Predicts Clinical Features, Expression Signatures, and Mutations in Lung Adenocarcinoma. Front Genet 2021; 12:668040. [PMID: 34135942 PMCID: PMC8202527 DOI: 10.3389/fgene.2021.668040] [Received: 02/16/2021] [Accepted: 04/26/2021] [Indexed: 11/30/2022] [Open Access]
Abstract
In precision medicine for lung adenocarcinoma, the identification and prediction of tumor phenotypes for specific biomolecular events have not yet been studied in depth. Earlier research has shed light on the close correlation between gene expression signatures and DNA copy number variations (CNVs), so the analysis of CNVs provides valuable information about molecular and phenotypic changes in tumorigenesis. In this study, we propose a comprehensive analysis combining genome-wide association analysis and an Elastic Net regression predictive model, focusing on predicting the levels of many gene expression signatures in lung adenocarcinoma based on DNA copy number features alone. Additionally, we predicted many other key phenotypes, including clinical features (pathological stage), gene mutations, and protein expression. These Elastic Net prediction methods can also be applied to other gene sets, thereby facilitating their use as biomarkers in monitoring therapy.
Affiliation(s)
- Yi Xiang
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Xiaohuan Zou
  - Department of Critical Care Medicine, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Huaqiu Shi
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Xueming Xu
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Caixia Wu
  - First Clinical Medical College, Gannan Medical University, Ganzhou, China
- Wenjuan Zhong
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Jinfeng Wang
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Wenting Zhou
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Xiaoli Zeng
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Miao He
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Ying Wang
  - First Clinical Medical College, Gannan Medical University, Ganzhou, China
- Li Huang
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Xiangcai Wang
  - Department of Oncology, The First Affiliated Hospital, Gannan Medical University, Ganzhou, China
9
Ghijs M, Vanbillemont B, Nicolaï N, De Beer T, Nopens I. Two-dimensional moisture content and size measurement of pharmaceutical granules after fluid bed drying using near-infrared chemical imaging. Int J Pharm 2021; 595:120069. [PMID: 33421586 DOI: 10.1016/j.ijpharm.2020.120069] [Received: 08/04/2020] [Revised: 11/02/2020] [Accepted: 11/06/2020] [Indexed: 10/22/2022]
Abstract
In pharmaceutical wet granulation, drying is a critical step in terms of energy and material consumption, whereas granule moisture content and size are important process outcomes that determine tabletting performance. The drying process is, however, very complex due to the multitude of interacting mechanisms on different scales. Building robust physical models of this process therefore requires detailed data. Current data collection methods only succeed in measuring the average moisture content of a size fraction of granules, whereas this property follows a distribution that, moreover, contains information on the drying patterns. Therefore, a measurement method is devised to simultaneously characterise the moisture content and size of individual pharmaceutical granules. A setup with near-infrared chemical imaging (NIR-CI) is used to capture an image of a number of granules, in which the absorbance spectra are used to derive the moisture content of the material and the size of the granules is estimated from the number of pixels containing pharmaceutical material. The quantification of moisture content based on absorption spectra is performed with two different regression methods, Partial Least Squares regression (PLSR) and Elastic Net Regression (ENR). The method is validated with particle size data for size determination, loss-on-drying (LOD) data of average moisture contents of granule samples and, finally, batch fluid bed experiments in which the results are compared to the most detailed method to date. The individual granule moisture contents confirmed again that granule size is an important factor in the drying process. The measurement method can be used to gain more detailed experimental insight into different fluidisation and particulate processes, which will allow robust process models to be built.
Affiliation(s)
- Michael Ghijs
  - BIOMATH, Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
  - Laboratory of Pharmaceutical Process Analytical Technology, Ghent University, Belgium
- Brecht Vanbillemont
  - Laboratory of Pharmaceutical Process Analytical Technology, Ghent University, Belgium
- Niels Nicolaï
  - BIOMATH, Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
  - Laboratory of Pharmaceutical Process Analytical Technology, Ghent University, Belgium
- Thomas De Beer
  - Laboratory of Pharmaceutical Process Analytical Technology, Ghent University, Belgium
- Ingmar Nopens
  - BIOMATH, Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
10
Kim MH, Banerjee S, Park SM, Pathak J. Improving risk prediction for depression via Elastic Net regression - Results from Korea National Health Insurance Services Data. AMIA Annu Symp Proc 2017; 2016:1860-1869. [PMID: 28269945 PMCID: PMC5333336] [Indexed: 06/06/2023]
Abstract
Depression, despite its high prevalence, remains severely under-diagnosed across the healthcare system. This demands the development of data-driven approaches that can help screen patients who are at high risk of depression. In this work, we develop depression risk prediction models that incorporate disease co-morbidities using logistic regression with Elastic Net. Using data from the one-million-person, twelve-year longitudinal cohort from the Korean National Health Insurance Services (KNHIS), our model achieved an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) of 0.7818, compared to a traditional logistic regression model without co-morbidity analysis (AUC of 0.6992). We also showed co-morbidity adjusted Odds Ratios (ORs), which may be a more accurate independent estimate for each predictor variable. In conclusion, inclusion of co-morbidity analysis improved the performance of depression risk prediction models.
Affiliation(s)
- Min-hyung Kim
  - Department of Healthcare Policy & Research, Weill Cornell Medical College, New York, NY, USA
- Samprit Banerjee
  - Department of Healthcare Policy & Research, Weill Cornell Medical College, New York, NY, USA
- Sang Min Park
  - Department of Family Medicine, Seoul National University College of Medicine, Seoul, Korea
- Jyotishman Pathak
  - Department of Healthcare Policy & Research, Weill Cornell Medical College, New York, NY, USA
11
Abstract
An important task in personalized medicine is to predict disease risk based on a person's genome, e.g. on a large number of single-nucleotide polymorphisms (SNPs). Genome-wide association studies (GWAS) make SNP and phenotype data available to researchers. A critical question for researchers is how to best predict disease risk. Penalized regression equipped with variable selection, such as LASSO and SCAD, is deemed to be promising in this setting. However, the sparsity assumption taken by the LASSO, SCAD and many other penalized regression techniques may not be applicable here: it is now hypothesized that many common diseases are associated with many SNPs with small to moderate effects. In this article, we use the GWAS data from the Wellcome Trust Case Control Consortium (WTCCC) to investigate the performance of various unpenalized and penalized regression approaches under true sparse or non-sparse models. We find that in general penalized regression outperformed unpenalized regression; SCAD, TLP and LASSO performed best for sparse models, while elastic net regression was the winner, followed by ridge, TLP and LASSO, for non-sparse models.
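The contrast between sparse and non-sparse penalties discussed in this abstract can be made concrete with a small, dependency-free sketch of cyclic coordinate descent for the elastic net. This is not the authors' implementation and does not cover SCAD or TLP. It assumes a common parameterization of the objective, (1/2n)‖y − Xβ‖² + λ[α‖β‖₁ + (1 − α)/2 ‖β‖²], with centred data, no intercept, and no all-zero columns. The function names and the toy data are hypothetical.

```python
def soft_threshold(value, cutoff):
    """Shrink value toward zero by cutoff; return 0.0 inside [-cutoff, cutoff]."""
    if value > cutoff:
        return value - cutoff
    if value < -cutoff:
        return value + cutoff
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=200):
    """Cyclic coordinate descent for the elastic net (alpha=1: LASSO, alpha=0: ridge).

    X is a list of rows, y a list of responses; both are assumed centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, kept up to date below
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(v * r for v, r in zip(col, residual)) / n + z * beta[j]
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                residual = [r - v * delta for r, v in zip(residual, col)]
                beta[j] = new
    return beta

# Toy data: one strong predictor (column 0) and one weak predictor (column 1).
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]

lasso = elastic_net(X, y, lam=0.1, alpha=1.0)
ridge = elastic_net(X, y, lam=0.1, alpha=0.0)
mixed = elastic_net(X, y, lam=0.1, alpha=0.5)
```

On this toy data LASSO sets the weak coefficient exactly to zero while ridge keeps a small non-zero value for it, which mirrors the sparsity assumption the abstract questions for traits influenced by many small effects.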
Affiliation(s)
- Erin Austin
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455
- Xiaotong Shen
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455