1. Infections in Biological and Targeted Synthetic Drug Use in Rheumatoid Arthritis: Where do We Stand? A Scoping Review and Meta-analysis. Rheumatol Ther 2023; 10:1147-1165. PMID: 37365454; PMCID: PMC10469142; DOI: 10.1007/s40744-023-00571-z.
Abstract
INTRODUCTION The advent of biological and targeted synthetic therapies has revolutionized rheumatoid arthritis (RA) treatment. However, this has come at the price of an increased risk of infections. The aim of this study was to present an integrated overview of both serious and non-serious infections, and to identify potential predictors of infection risk in RA patients using biological or targeted synthetic drugs. METHODS We systematically reviewed the available literature from PubMed and Cochrane and performed a multivariate meta-analysis with meta-regression on the reported infections. Randomized controlled trials and prospective and retrospective observational studies, including patient registry studies, were analyzed both combined and separately. We excluded studies focusing on viral infections only. RESULTS Infections were not reported in a standardized manner. Meta-analysis showed significant heterogeneity that persisted after forming subgroups by study design and follow-up duration. Overall, the pooled proportion of patients experiencing an infection during a study was 0.30 (95% CI 0.28-0.33) for any infection and 0.03 (95% CI 0.028-0.035) for serious infections only. We found no potential predictors that were consistent across all study subgroups. CONCLUSIONS The high heterogeneity and the inconsistency of potential predictors between studies show that we do not yet have a complete picture of infection risk in RA patients using biological or targeted synthetic drugs. In addition, non-serious infections outnumbered serious infections by a factor of 10, yet only a few studies have focused on their occurrence. Future studies should apply a uniform method of reporting infectious adverse events and should also address non-serious infections and their impact on treatment decisions and quality of life.
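The pooled proportions reported above come from a random-effects meta-analysis of study-level proportions. As a minimal sketch of how such pooling works (logit transform plus a DerSimonian-Laird estimate of between-study variance; the study counts below are made up for illustration and are not from the review):

```python
import math

def pool_proportions(events, totals):
    """Random-effects pooling of proportions on the logit scale
    (DerSimonian-Laird); an illustrative sketch, not the review's exact model."""
    # Logit-transform each study proportion; approximate variance = 1/e + 1/(n-e)
    ys = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    vs = [1.0 / e + 1.0 / (n - e) for e, n in zip(events, totals)]
    ws = [1.0 / v for v in vs]
    y_fixed = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
    # DerSimonian-Laird estimate of the between-study variance tau^2
    q = sum(w * (y - y_fixed) ** 2 for w, y in zip(ws, ys))
    c = sum(ws) - sum(w ** 2 for w in ws) / sum(ws)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c)
    # Random-effects weights, pooled logit, back-transformed to a proportion
    ws_re = [1.0 / (v + tau2) for v in vs]
    y_re = sum(w * y for w, y in zip(ws_re, ys)) / sum(ws_re)
    return 1.0 / (1.0 + math.exp(-y_re))

# Hypothetical studies: infections / patients
p = pool_proportions([30, 45, 12], [100, 150, 40])
```

With real data one would also report the confidence interval and heterogeneity statistics such as I², which is where the significant heterogeneity noted above would show up.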
2. Simultaneous clustering and variable selection: A novel algorithm and model selection procedure. Behav Res Methods 2023; 55:2157-2174. PMID: 36085542; PMCID: PMC10439051; DOI: 10.3758/s13428-022-01795-7.
Abstract
The growing availability of high-dimensional data sets offers behavioral scientists an unprecedented opportunity to integrate the information hidden in novel types of data (e.g., genetic data, social media data, GPS tracks) and thereby obtain a more detailed and comprehensive view of their research questions. In the context of clustering, analyzing a large set of variables could result in more accurate estimation of underlying subgroups, or even in the discovery of new ones. A unique challenge, however, is that high-dimensional data sets are likely to involve a substantial number of irrelevant variables. These irrelevant variables do not contribute to the separation of clusters and may mask cluster partitions. The current paper addresses this challenge by introducing a new clustering algorithm, called Cardinality K-means (CKM), and by proposing a novel model selection strategy. CKM performs simultaneous clustering and variable selection with high stability. In two simulation studies and an empirical demonstration with genetic data, CKM consistently outperformed competing methods in recovering cluster partitions and identifying signaling variables. Meanwhile, the novel model selection strategy determines the number of clusters based on a subset of variables that are most likely to be signaling variables. In a simulation study, this strategy was found to yield a more accurate estimate of the number of clusters than the conventional strategy that uses the full set of variables. The proposed CKM algorithm, together with the novel model selection strategy, has been implemented in a freely accessible R package.
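The alternating idea, clustering on a subset of variables while reselecting the variables that best separate the clusters, can be sketched as follows. This is our own minimal reconstruction for illustration, not the published CKM algorithm: the function name, the farthest-point initialisation, and the between-cluster-variance selection rule are all our simplifications.

```python
import numpy as np

def cardinality_kmeans(X, k, n_signal, n_iter=30):
    """Sketch of simultaneous clustering and variable selection: alternate
    (a) k-means assignment on the currently selected variables with
    (b) reselecting the n_signal variables with the largest between-cluster
    variance. The published CKM algorithm differs in its exact objective."""
    n, p = X.shape
    mask = np.ones(p, dtype=bool)
    # Deterministic farthest-point initialisation of the k centroids
    centroids = [X[np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1))]]
    while len(centroids) < k:
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # (a) assign each point to the nearest centroid on selected variables
        d = ((X[:, None, :] * mask - centroids[None, :, :] * mask) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        centroids = np.array([X[labels == c].mean(axis=0) for c in range(k)])
        # (b) keep the n_signal variables that best separate the clusters
        overall = X.mean(axis=0)
        between = sum((labels == c).sum() * (centroids[c] - overall) ** 2
                      for c in range(k))
        new_mask = np.zeros(p, dtype=bool)
        new_mask[np.argsort(between)[-n_signal:]] = True
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return labels, mask
```

With k and the cardinality n_signal fixed, the loop typically stabilises in a few iterations; choosing those two numbers well is precisely what the paper's model selection strategy addresses.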
3. Determinants and mediating mechanisms of quality of life and disease-specific symptoms among thyroid cancer patients: the design of the WaTCh study. Thyroid Res 2023; 16:23. PMID: 37424010; DOI: 10.1186/s13044-023-00165-5.
Abstract
BACKGROUND Thyroid cancer (TC) patients are understudied but appear to be at risk for poor physical and psychosocial outcomes. Knowledge of the course and determinants of these deteriorated outcomes is lacking, and little is known about mediating biological mechanisms. OBJECTIVES The WaTCh study aims to: (1) examine the course of physical and psychosocial outcomes; (2) examine the association of demographic, environmental, clinical, physiological, and personality characteristics with those outcomes (in other words, who is at risk?); and (3) reveal the association of mediating biological mechanisms (inflammation, kynurenine pathway) with poor physical and psychological outcomes (in other words, why is a person at risk?). DESIGN AND METHODS Newly diagnosed TC patients from 13 Dutch hospitals will be invited. Data collection will take place before treatment and at 6, 12, and 24 months after diagnosis. Sociodemographic and clinical information is available from the Netherlands Cancer Registry. Patients fill out validated questionnaires at each time point to assess quality of life, TC-specific symptoms, physical activity, anxiety, depression, health care use, and employment. Patients are asked to donate blood three times to assess inflammation and kynurenine pathway metabolites. Optionally, at each occasion, patients can use a weighing scale with a bioelectrical impedance analysis (BIA) system to assess body composition; register food intake in an online food diary; and wear an activity tracker to assess physical activity and sleep duration/quality. Representative Dutch normative data on the studied physical and psychosocial outcomes are already available. IMPACT WaTCh will reveal the course of physical and psychosocial outcomes among TC patients over time and answer the question of who is at risk for poor outcomes, and why. This knowledge can be used to provide personalized information, improve screening, develop tailored treatment strategies and supportive care, optimize outcomes, and ultimately increase the number of TC survivors who live in good health.
4. Neural responses to facial attractiveness: event-related potentials differentiate between salience and valence effects. Biol Psychol 2023; 179:108549. PMID: 37004907; DOI: 10.1016/j.biopsycho.2023.108549.
Abstract
We examined the neural correlates of facial attractiveness by presenting pictures of male or female faces (neutral expression) of low, intermediate, or high attractiveness to 48 male and female participants while recording their electroencephalogram (EEG). Subjective attractiveness ratings were used to determine the 10% highest, 10% middlemost, and 10% lowest rated faces for each individual participant, allowing high-contrast comparisons. These were then split into preferred and dispreferred gender categories. The ERP components P1, N1, P2, N2, early posterior negativity (EPN), P300, and late positive potential (LPP) (up until 3000 ms post-stimulus), as well as the face-specific N170, were analysed. A salience effect (attractive/unattractive > intermediate) in an early LPP interval (450-850 ms) and a long-lasting valence-related effect (attractive > unattractive) in a late LPP interval (1000-3000 ms) were elicited by faces of the preferred gender but not by faces of the dispreferred gender. Multivariate pattern analysis (MVPA) classification of whole-brain single-trial EEG patterns further confirmed these salience and valence effects. It is concluded that facial attractiveness elicits neural responses that are indicative of valenced experiences, but only if the faces are considered relevant. These experiences take time to develop and last well beyond the interval that is commonly explored.
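The MVPA step, decoding a condition label from single-trial patterns with cross-validation, can be sketched with a deliberately simple nearest-centroid classifier. The paper's actual pipeline (classifier, fold scheme, feature extraction) may differ; everything below is an illustrative stand-in operating on simulated features.

```python
import numpy as np

def decode_cv(trials, y, n_folds=5, seed=0):
    """Cross-validated nearest-centroid decoding of a condition label from
    single-trial patterns (e.g., flattened channel-by-time EEG features).
    A minimal stand-in for an MVPA pipeline, not the published one."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    order = rng.permutation(len(y))
    correct = 0
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        Xtr, ytr = trials[train_idx], y[train_idx]
        # One centroid per class, estimated on the training folds only
        cents = np.array([Xtr[ytr == c].mean(axis=0) for c in classes])
        d = ((trials[test_idx][:, None, :] - cents[None]) ** 2).sum(axis=2)
        correct += (classes[d.argmin(axis=1)] == y[test_idx]).sum()
    return correct / len(y)  # accuracy; 0.5 is chance for two balanced classes
```

Accuracies reliably above the chance level indicate that condition information (here, the salience or valence contrast) is present in the trial patterns.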
5. Stimulus material selection for the Dutch famous faces test for older adults. Front Med (Lausanne) 2023; 10:1124986. PMID: 37122325; PMCID: PMC10140445; DOI: 10.3389/fmed.2023.1124986.
Abstract
Worldwide, approximately 22% of all individuals aged 50 years and older are currently estimated to fall somewhere on the Alzheimer's disease (AD) continuum, which can be roughly divided into preclinical AD, mild cognitive impairment (MCI), and AD dementia. While episodic memory loss (among other aspects) is typically required for a diagnosis of AD dementia, MCI refers to cognitive impairment (including memory loss) that is worse than expected for the person's age but not severe enough to be classified as dementia. Preclinical AD, in contrast, can currently only be detected using biomarkers; clinical symptoms are not apparent on traditional neuropsychological tests. The main aim of the current paper was to explore the possibility of a test that could distinguish preclinical AD from normal aging. Recent scientific evidence suggests that the Famous Faces Test (FFT) can differentiate preclinical AD from normal aging up to 5 years before a clinical AD diagnosis. A problem with existing FFTs is the selection of stimulus material: faces famous in a specific country and a specific decade might not be equally famous to individuals in another country, or to people of different ages. The current article describes how famous faces were systematically selected, in five steps, for the Dutch older (60+) population. The goal was to design and develop nine parallel short versions of the FFT for Dutch older adults of equivalent mean difficulty. In future work, these parallel versions will be necessary for (a) cross-sectional comparison as well as subsequent longitudinal assessment of cognitively normal and clinical groups and (b) creating personalized norms for the normal aged controls that could be used to compare performance within individuals with clinical diagnoses. The field needs a simple cognitive test that can distinguish the earliest stages of the dementia continuum from normal aging.
6. Clustering of trauma patients based on longitudinal data and the application of machine learning to predict recovery. Sci Rep 2022; 12:16990. PMID: 36216874; PMCID: PMC9550811; DOI: 10.1038/s41598-022-21390-2.
Abstract
Predicting recovery after trauma is important to give patients a perspective on their estimated future health, to engage in shared decision making, and to target interventions to relevant patient groups. In the present study, several unsupervised techniques were employed to cluster patients based on longitudinal recovery profiles. Subsequently, these data-driven clusters were assessed on clinical validity by experts and used as targets in supervised machine learning models. We present a formalised analysis of the obtained clusters that evaluates (i) statistical and machine learning metrics and (ii) the clusters' clinical validity, using descriptive statistics and medical expertise. The cluster quality assessment revealed that clusters obtained with a Bayesian method (High Dimensional Supervised Classification and Clustering) or a deep Gaussian mixture model, combined with oversampling and a random forest for supervised learning of the cluster assignments, provided among the most clinically sensible partitionings of patients. Other methods that achieved higher classification accuracy suffered from cluster solutions with large majority classes or clinically less sensible classes. Models that used just physical outcomes, or a mix of physical and psychological outcomes, proved to be among the most sensible, suggesting that clustering on psychological outcomes alone yields recovery profiles that do not conform to known risk factors.
7. Decoding the neural responses to experiencing disgust and sadness. Brain Res 2022; 1793:148034. DOI: 10.1016/j.brainres.2022.148034.
8. Welcome message from the new editors-in-chief. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 2022. DOI: 10.5964/meth.8575.
9. Measuring Clinical, Biological, and Behavioral Variables to Elucidate Trajectories of Patient (Reported) Outcomes: The PROFILES Registry. J Natl Cancer Inst 2022; 114:800-807. PMID: 35201353; PMCID: PMC9194631; DOI: 10.1093/jnci/djac047.
Abstract
To take cancer survivorship research to the next level, it is important to gain insight into trajectories of changing patient (reported) outcomes and impaired recovery after cancer. This is needed because the number of survivors is increasing and a large proportion of them is confronted with changing health after treatment. Mechanistic research can facilitate the development of personalized risk-stratified follow-up care and tailored interventions to promote healthy cancer survivorship. We describe how these trajectories can be studied, taking the recently extended Dutch population-based PROFILES (Patient Reported Outcomes Following Initial treatment and Long term Evaluation of Survivorship) registry as an example. PROFILES combines longitudinal assessment of patient-reported outcomes with novel, ambulatory, and objective measures (e.g., activity trackers, blood draws, hair samples, online food diaries, online cognitive tests, weighing scales, online symptom assessment), and with cancer registry and pharmacy databases. Furthermore, we discuss methods to optimize the use of such a multidomain data collection, such as returning individual results to participants, which may not only improve patient empowerment but also long-term cohort retention. Advanced statistical methods are also needed to handle high-dimensional longitudinal data (with missing values) and to provide insight into trajectories of changing patient (reported) outcomes after cancer. Our coded data can be used by academic researchers around the world. Registries like PROFILES, which go beyond the boundaries of disciplines and institutions, will contribute to better predictions of who will experience changes and why. This is needed to prevent and mitigate long-term and late effects of cancer (treatment) and to identify new interventions to promote health.
10. A Guide for Sparse PCA: Model Comparison and Applications. Psychometrika 2021; 86:893-919. PMID: 34185214; PMCID: PMC8636462; DOI: 10.1007/s11336-021-09773-2.
Abstract
PCA is a popular tool for exploring and summarizing multivariate data, especially data consisting of many variables. PCA results, however, are often not simple to interpret, as each component is a linear combination of all the variables. To address this issue, numerous methods have been proposed to sparsify the coefficients in the components, including rotation and thresholding methods and, more recently, PCA methods subject to sparsity-inducing penalties or constraints. Here, we offer guidelines on how to choose among the different sparse PCA methods. The current literature lacks clear guidance on the properties and performance of the different sparse PCA methods, often relying on the misconception that the equivalence of the formulations of ordinary PCA also holds for sparse PCA. To guide potential users of sparse PCA methods, we first discuss several popular sparse PCA methods in terms of whether sparseness is imposed on the loadings or on the weights, the assumed model, and the optimization criterion used to impose sparseness. Second, using an extensive simulation study, we assess each of these methods by means of performance measures such as squared relative error, misidentification rate, and percentage of explained variance, for several data-generating models and conditions of the population model. Finally, two examples using empirical data are considered.
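To make the design choices concrete, here is one of the many formulations such guidelines cover: a rank-1 penalised-SVD variant in which an L1 soft-threshold is applied to the component vector within an alternating scheme. This is an illustrative sketch of a single variant, not a recommendation among the compared methods, and the function name is ours.

```python
import numpy as np

def sparse_pca_rank1(X, lam, n_iter=100):
    """Rank-1 sparse PCA in the penalised-SVD style: alternate updating the
    score vector u and the soft-thresholded component vector v. Whether
    sparseness sits on loadings or weights, and penalty vs. constraint,
    are exactly the choices a user must make; this is one L1 variant."""
    X = X - X.mean(axis=0)                 # column-centre the data
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    u, v = u[:, 0], vt[0]                  # warm start from ordinary PCA
    for _ in range(n_iter):
        w = X.T @ u
        v = np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)  # soft-threshold
        nv = np.linalg.norm(v)
        if nv == 0:
            break                          # lam too large: everything zeroed
        v /= nv
        u = X @ v
        u /= np.linalg.norm(u)
    return u, v                            # scores and sparse component
```

With a suitable penalty, the irrelevant variables get exactly zero coefficients, which is what makes the component interpretable.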
11. The COVID-19 outbreak increases maternal stress during pregnancy, but not the risk for postpartum depression. Arch Womens Ment Health 2021; 24:1037-1043. PMID: 33830373; PMCID: PMC8027291; DOI: 10.1007/s00737-021-01104-9.
Abstract
The COVID-19 pandemic affects society and may have a particular impact on the mental health of vulnerable groups, such as perinatal women. This prospective cohort study of 669 participating women in the Netherlands compared perinatal symptoms of depression and stress during and before the pandemic. After a pilot in 2018, recruitment started on 7 January 2019. Up until 1 March 2020 (before the pandemic), 401 women completed questionnaires during pregnancy, of whom 250 also completed the postpartum assessment. During the pandemic (1 March-14 May 2020), 268 women filled out at least one questionnaire during pregnancy and 59 completed the postpartum assessment. Pregnancy-specific stress increased significantly in women during the pandemic. We found no increase in depressive symptoms during pregnancy, nor an increase in the incidence of high levels of postpartum depressive symptoms during the pandemic. Clinicians should be aware of the potential for increased stress in pregnant women during the pandemic.
12. Multiple imputation of longitudinal categorical data through Bayesian mixture latent Markov models. J Appl Stat 2019; 47:1720-1738. PMID: 35707130; PMCID: PMC9041790; DOI: 10.1080/02664763.2019.1692794.
Abstract
Standard latent class modeling has recently been shown to provide a flexible tool for the multiple imputation (MI) of missing categorical covariates in cross-sectional studies. This article introduces an analogous tool for longitudinal studies: MI using Bayesian mixture Latent Markov (BMLM) models. Besides retaining the benefits of latent class models, i.e. respecting the (categorical) measurement scale of the variables and preserving possibly complex relationships between variables within a measurement occasion, the Markov dependence structure of the proposed BMLM model allows capturing lagged dependencies between adjacent time points, while the time-constant mixture structure allows capturing dependencies across all time points, as well as retrieving associations between time-varying and time-constant variables. The performance of the BMLM model for MI is evaluated by means of a simulation study and an empirical experiment, in which it is compared with complete case analysis and MICE. Results show good performance of the proposed method in retrieving the parameters of the analysis model. In contrast, competing methods could provide correct estimates only for some aspects of the data.
13.
Abstract
This article introduces a package developed for R (R Core Team, 2017) for performing an integrated analysis of multiple data blocks (i.e., linked data) coming from different sources. The methods in this package combine simultaneous component analysis (SCA) with structured selection of variables. The key feature of the package is that it makes it possible to (1) identify joint variation that is shared across all the data sources, as well as specific variation that is associated with one or a few of the data sources, and (2) flexibly estimate component matrices with predefined structures. Linked data occur in many disciplines (e.g., biomedical research, bioinformatics, chemometrics, finance, genomics, psychology, and sociology) and especially in multidisciplinary research. Hence, we expect the package to be useful in various fields.
14. Revealing the Joint Mechanisms in Traditional Data Linked With Big Data. Zeitschrift für Psychologie 2019; 226:212-231. PMID: 31523606; PMCID: PMC6736194; DOI: 10.1027/2151-2604/a000341.
Abstract
Recent technological advances have made it possible to study human behavior by linking novel types of data to more traditional types of psychological data, for example, linking psychological questionnaire data with genetic risk scores. Revealing the variables that are linked throughout these traditional and novel types of data gives crucial insight into the complex interplay between the multiple factors that determine human behavior, for example, the concerted action of genes and environment in the emergence of depression. Little or no theory is available on the link between such traditional and novel types of data, the latter usually consisting of a huge number of variables. The challenge is to select, in an automated way, those variables that are linked throughout the different blocks, and this eludes currently available methods for data analysis. To fill this methodological gap, we present a novel data integration method here.
15. Bayesian Latent Class Models for the Multiple Imputation of Categorical Data. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 2018. DOI: 10.1027/1614-2241/a000146.
Abstract
Latent class analysis has recently been proposed for the multiple imputation (MI) of missing categorical data, using either a standard frequentist approach or a nonparametric Bayesian model called the Dirichlet process mixture of multinomial distributions (DPMM). The main advantage of using a latent class (LC) model for multiple imputation is its flexibility: it can capture complex relationships in the data, provided the number of latent classes is large enough. However, the two existing approaches also have disadvantages. The frequentist approach is computationally demanding because it requires estimating many LC models: first, models with different numbers of classes must be estimated to determine the required number of classes, and subsequently the selected model is re-estimated for multiple bootstrap samples to take parameter uncertainty into account during the imputation stage. While the Bayesian Dirichlet process model performs model selection and handles parameter uncertainty automatically, it tends to use too few clusters during Gibbs sampling, leading to an underfitting model that yields invalid imputations. In this paper, we propose an alternative approach that combines the strengths of the two existing approaches: we use the standard Bayesian latent class model as the imputation model. We show how model selection can be performed prior to the imputation step using a single run of the Gibbs sampler and, moreover, how underfitting is prevented by using large values for the hyperparameters of the mixture weights. The results of two simulation studies and one real-data study indicate that, with a proper setting of the prior distributions, the Bayesian latent class model yields valid imputations and outperforms competing methods.
16. Obtaining insights from high-dimensional data: sparse principal covariates regression. BMC Bioinformatics 2018; 19:104. PMID: 29587627; PMCID: PMC5870402; DOI: 10.1186/s12859-018-2114-5.
Abstract
BACKGROUND Data analysis methods are usually subdivided into two distinct classes: methods for prediction and methods for exploration. In practice, however, there is often a need to learn from the data in both ways. For example, when predicting antibody titers a few weeks after vaccination from genome-wide mRNA transcription rates, mechanistic insights into the effect of vaccination on the immune system are also sought. Principal covariates regression (PCovR) is a method that combines both purposes. Yet it lacks insightful representations of the data, as these include all the variables. RESULTS Here, we propose a sparse extension of principal covariates regression such that the resulting solutions are based on an automatically selected subset of the variables. Our method is shown to outperform competing methods such as sparse principal components regression and sparse partial least squares in a simulation study. Furthermore, good performance of the method is illustrated on publicly available data comprising antibody titers and genome-wide transcription rates for subjects vaccinated against the flu: the genes selected by sparse PCovR are highly enriched for immune-related terms, and the method predicts the titers for an independent test sample well. In comparison, no significantly enriched terms were found for the genes selected by sparse partial least squares, and out-of-sample prediction was worse. CONCLUSIONS Sparse principal covariates regression is a promising and competitive tool for obtaining insights from high-dimensional data. AVAILABILITY The source code implementing the proposed method is available from GitHub, together with all scripts used to extract, pre-process, analyze, and post-process the data: https://github.com/katrijnvandeun/SPCovR.
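The non-sparse core of PCovR, components T = XW that simultaneously reconstruct X and predict y, can be sketched via a weighted concatenation of X and the X-projected outcome followed by an SVD. The sparse extension proposed in the paper additionally penalises the weights, which is omitted here; all variable names are ours and the weighting below is one simple choice, not the paper's exact estimator.

```python
import numpy as np

def pcovr(X, y, n_comp, alpha=0.5):
    """Plain (non-sparse) principal covariates regression sketch: find
    components that balance reconstructing X (weight alpha) against
    predicting y (weight 1 - alpha)."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    # Project y onto the column space of X (the fitted OLS values)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta
    # Weighted concatenation trades off reconstruction vs. prediction
    Z = np.hstack([np.sqrt(alpha) * X / np.linalg.norm(X),
                   np.sqrt(1 - alpha) * (y_hat / np.linalg.norm(y))[:, None]])
    u, s, vt = np.linalg.svd(Z, full_matrices=False)
    T = u[:, :n_comp] * s[:n_comp]               # component scores
    W, *_ = np.linalg.lstsq(X, T, rcond=None)    # weights: T = X W
    # Regressing y on the components gives the prediction rule
    py, *_ = np.linalg.lstsq(T, y, rcond=None)
    return W, py
```

Setting alpha near 1 recovers something close to plain PCA of X; alpha near 0 pushes the components towards pure prediction of y, which is the trade-off PCovR exposes.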
17.
Abstract
OBJECTIVE The detection of subgroups involved in qualitative treatment-subgroup interactions (i.e., for one subgroup of clients treatment A outperforms treatment B, whereas for another the reverse holds true) is crucial for personalized health care. In typical randomized controlled trials (RCTs), the combination of a lack of a priori hypotheses and a large number of possible moderators leaves current methods insufficient to detect subgroups involved in such interactions. A recently developed method, QUalitative INteraction Trees (QUINT), offers a solution. However, the paper in which QUINT was introduced is not easily accessible for non-methodologists. In this paper, we review the conceptual basis of QUINT in a nontechnical way and illustrate its relevance for psychological applications. METHOD We present a concise introduction to QUINT along with a summary of the available evidence on its performance. Subsequently, we subject RCT data on the effect of motivational interviewing in a treatment for substance abuse disorders to a reanalysis with QUINT. As outcome variables, we focus on measures of retention and substance use. RESULTS A qualitative treatment-subgroup interaction was found for retention. By contrast, no qualitative interaction was detected for substance use. CONCLUSIONS QUINT may lead to insightful and well-interpretable results with straightforward implications for personalized treatment assignment.
18. Unraveling affective dysregulation in borderline personality disorder: a theoretical model and empirical evidence. J Abnorm Psychol 2015; 124:186-198. PMID: 25603359; DOI: 10.1037/abn0000021.
Abstract
Although emotion dysregulation has consistently been conceptualized as a core problem of borderline personality disorder (BPD), a comprehensive, empirically and ecologically validated model that captures the exact types of dysregulation remains absent. In the present article, we combine insights from basic affective science and the biosocial theory of BPD to present a theoretical model that captures the most fundamental affective dynamical processes underlying BPD. The model stipulates that individuals with BPD are characterized by more negative affective home bases, higher levels of affective variability, and lower levels of attractor strength, that is, a slower return to baseline. Next, we empirically validate this proposal by statistically modeling data from three electronic diary studies on emotional responses to personally relevant stimuli in personally relevant environments, collected both from patients with BPD (N = 50, 42, and 43) and from healthy subjects (N = 50, 24, and 28). The results regarding negative affective home bases and heightened affective variability consistently confirmed our hypotheses across all three datasets. The findings regarding attractor strength (i.e., return to baseline) were less consistent and of smaller magnitude. The transdiagnostic nature of our approach may help to elucidate the common and distinctive mechanisms that underlie several different disorders characterized by affective dysregulation.
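The three quantities the model stipulates, home base, variability, and attractor strength, have a simple caricature in a first-order autoregressive model of a single person's affect series. The diary studies above are analysed with richer multilevel models; the sketch below only illustrates what each quantity means:

```python
import numpy as np

def affect_dynamics(x):
    """Fit the first-order autoregression x[t] = home + phi * (x[t-1] - home)
    + noise to one affect time series and report (home base, affective
    variability, attractor strength = 1 - phi, larger meaning a faster
    return to baseline). A caricature of the paper's models, not a copy."""
    x = np.asarray(x, dtype=float)
    x0, x1 = x[:-1], x[1:]
    x0c = x0 - x0.mean()
    phi = (x0c * (x1 - x1.mean())).sum() / (x0c ** 2).sum()  # OLS slope
    return x.mean(), x.std(), 1.0 - phi
```

In this caricature, a BPD-like profile would show a more negative home base, a larger variability, and a smaller attractor strength than a healthy-control profile.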
19. DISCO-SCA and properly applied GSVD as swinging methods to find common and distinctive processes. PLoS One 2012; 7:e37840. PMID: 22693578; PMCID: PMC3365060; DOI: 10.1371/journal.pone.0037840.
Abstract
Background In systems biology it is common to obtain for the same set of biological entities information from multiple sources. Examples include expression data for the same set of orthologous genes screened in different organisms and data on the same set of culture samples obtained with different high-throughput techniques. A major challenge is to find the important biological processes underlying the data and to disentangle therein processes common to all data sources and processes distinctive for a specific source. Recently, two promising simultaneous data integration methods have been proposed to attain this goal, namely generalized singular value decomposition (GSVD) and simultaneous component analysis with rotation to common and distinctive components (DISCO-SCA). Results Both theoretical analyses and applications to biologically relevant data show that: (1) straightforward applications of GSVD yield unsatisfactory results, (2) DISCO-SCA performs well, (3) provided proper pre-processing and algorithmic adaptations, GSVD reaches a performance level similar to that of DISCO-SCA, and (4) DISCO-SCA is directly generalizable to more than two data sources. The biological relevance of DISCO-SCA is illustrated with two applications. First, in a setting of comparative genomics, it is shown that DISCO-SCA recovers a common theme of cell cycle progression and a yeast-specific response to pheromones. The biological annotation was obtained by applying Gene Set Enrichment Analysis in an appropriate way. Second, in an application of DISCO-SCA to metabolomics data for Escherichia coli obtained with two different chemical analysis platforms, it is illustrated that the metabolites involved in some of the biological processes underlying the data are detected by one of the two platforms only; therefore, platforms for microbial metabolomics should be tailored to the biological question. 
Conclusions Both DISCO-SCA and properly applied GSVD are promising integrative methods for finding common and distinctive processes in multisource data. Open source code for both methods is provided.
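The common-versus-distinctive decomposition described above can be sketched in a few lines. The code below is a simplified stand-in for DISCO-SCA, not the published method: it skips the rotation-to-target step and instead classifies unrotated simultaneous components by comparing per-block loading norms, with the threshold `tol` an arbitrary assumption.

```python
import numpy as np

def sca_common_distinctive(X1, X2, n_comp, tol=0.5):
    """Simultaneous component analysis of two blocks sharing the sample mode,
    with a heuristic labeling of components as common or distinctive."""
    X = np.hstack([X1, X2])                      # concatenate over variables
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    T = U[:, :n_comp] * s[:n_comp]               # component scores
    P = Vt[:n_comp].T                            # loadings for both blocks
    P1, P2 = P[:X1.shape[1]], P[X1.shape[1]:]
    labels = []
    for k in range(n_comp):
        n1, n2 = np.linalg.norm(P1[:, k]), np.linalg.norm(P2[:, k])
        ratio = min(n1, n2) / max(n1, n2)        # balance of the two blocks
        labels.append("common" if ratio > tol else
                      "distinctive-1" if n1 > n2 else "distinctive-2")
    return T, P, labels
```

A component whose loadings are sizeable in both blocks is labeled common; one loading on essentially a single block is labeled distinctive for that block, which is the structure DISCO-SCA targets via rotation.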
Collapse
|
20
|
A flexible framework for sparse simultaneous component based data integration. BMC Bioinformatics 2011; 12:448. [PMID: 22085701 PMCID: PMC3283562 DOI: 10.1186/1471-2105-12-448] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2011] [Accepted: 11/15/2011] [Indexed: 12/05/2022] Open
Abstract
Background High throughput data are complex, and methods that reveal the structure underlying the data are most useful. Principal component analysis, frequently implemented as a singular value decomposition, is a popular technique in this respect. Nowadays the challenge is often to reveal structure in several sources of information (e.g., transcriptomics, proteomics) that are available for the same biological entities under study. Simultaneous component methods are most promising in this respect. However, the interpretation of the principal and simultaneous components is often daunting because contributions of each of the biomolecules (transcripts, proteins) have to be taken into account. Results We propose a sparse simultaneous component method that makes many of the parameters redundant by shrinking them to zero. It includes principal component analysis, sparse principal component analysis, and ordinary simultaneous component analysis as special cases. Several penalties can be tuned that account in different ways for the block structure present in the integrated data. This yields known sparse approaches such as the lasso, the ridge penalty, the elastic net, the group lasso, the sparse group lasso, and the elitist lasso. In addition, the algorithmic results can be easily transposed to the context of regression. Metabolomics data obtained with two measurement platforms for the same set of Escherichia coli samples are used to illustrate the proposed methodology and the properties of different penalties with respect to sparseness across and within data blocks. Conclusion Sparse simultaneous component analysis is a useful method for data integration: first, simultaneous analyses of multiple blocks offer advantages over sequential and separate analyses and, second, interpretation of the results is greatly facilitated by their sparseness. The approach offered is flexible and allows the block structure to be taken into account in different ways. As such, structures can be found that are exclusively tied to one data platform (group lasso approach) as well as structures that involve all data platforms (elitist lasso approach). Availability The additional file contains a MATLAB implementation of the sparse simultaneous component method.
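The core of such a method can be sketched as alternating least squares with a penalized loading update. The code below is a generic illustration with the plain elementwise lasso (soft-thresholding) only, not the paper's MATLAB implementation; the group and elitist lasso variants mentioned above would replace the thresholding operator with block-aware alternatives.

```python
import numpy as np

def soft_threshold(A, lam):
    # Elementwise lasso (soft-thresholding) operator
    return np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)

def sparse_sca(blocks, n_comp, lam, n_iter=100):
    """Sparse simultaneous component analysis by alternating least squares.

    blocks -- list of (samples x variables) arrays sharing the sample mode
    lam    -- lasso penalty; lam = 0 gives ordinary (non-sparse) SCA loadings
    """
    X = np.hstack(blocks)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    T = U[:, :n_comp]                        # orthonormal component scores
    for _ in range(n_iter):
        P = soft_threshold(X.T @ T, lam)     # penalized loading update
        U2, _, Vt2 = np.linalg.svd(X @ P, full_matrices=False)
        T = U2 @ Vt2                         # orthogonal Procrustes score update
    return T, P
```

Increasing `lam` drives more loadings exactly to zero, which is what makes the component interpretation easier, as argued in the abstract.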
Collapse
|
21
|
Tissue-specific disallowance of housekeeping genes: the other face of cell differentiation. Genome Res 2010; 21:95-105. [PMID: 21088282 DOI: 10.1101/gr.109173.110] [Citation(s) in RCA: 144] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
We report on a hitherto poorly characterized class of genes that are expressed in all tissues except one. Often, these genes have been classified as housekeeping genes, based on their nearly ubiquitous expression. However, the specific repression in one tissue defines a special class of "disallowed genes." In this paper, we used the intersection-union test to screen for such genes in a multi-tissue panel of genome-wide mRNA expression data. We propose that disallowed genes need to be repressed in the specific target tissue to ensure correct tissue function. We provide mechanistic data on this repression for two metabolic examples: exercise-induced inappropriate insulin release and interference with ketogenesis in the liver. Developmentally, this repression is established during tissue maturation in the early postnatal period and involves epigenetic changes in histone methylation. In addition, tissue-specific expression of microRNAs can further diminish these repressed mRNAs. Together, we provide a systematic analysis of tissue-specific repression of housekeeping genes, a phenomenon that has not previously been studied on a genome-wide basis and that, when perturbed, can lead to human disease.
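The intersection-union logic of such a screen is simple: a gene counts as disallowed in a target tissue only if every component test (expression higher in each other tissue than in the target) rejects, so the overall p-value is the maximum of the component p-values. The sketch below is a minimal illustration on replicate expression values; the one-sided Welch t-test, significance level, and data layout are assumptions, not the authors' exact pipeline.

```python
import numpy as np
from scipy import stats

def iut_disallowed(expr, target, alpha=0.05):
    """Intersection-union screen: call a gene 'disallowed' in `target` only if
    it is significantly higher expressed in EVERY other tissue.

    expr -- dict mapping tissue name -> array of replicate expression values
    Returns (p_iut, disallowed), where p_iut is the max component p-value.
    """
    base = np.asarray(expr[target], dtype=float)
    pvals = []
    for tissue, values in expr.items():
        if tissue == target:
            continue
        # Component H1: expression in this tissue exceeds the target tissue
        res = stats.ttest_ind(values, base, equal_var=False, alternative="greater")
        pvals.append(res.pvalue)
    p_iut = max(pvals)  # reject the union null only if all components reject
    return p_iut, p_iut < alpha
```

Taking the maximum p-value is what distinguishes the intersection-union test from an ordinary multiple-testing scan: a single non-rejecting tissue vetoes the disallowed call.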
Collapse
|
22
|
Integrating functional genomics data using maximum likelihood based simultaneous component analysis. BMC Bioinformatics 2009; 10:340. [PMID: 19835617 PMCID: PMC2771021 DOI: 10.1186/1471-2105-10-340] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2009] [Accepted: 10/16/2009] [Indexed: 12/02/2022] Open
Abstract
Background In contemporary biology, complex biological processes are increasingly studied by collecting and analyzing measurements of the same entities with different analytical platforms. Such data comprise a number of data blocks that are coupled via a common mode. The goal of collecting this type of data is to discover the biological mechanisms that underlie the behavior of the variables in the different data blocks. The simultaneous component analysis (SCA) family of data analysis methods is suited for this task. However, an SCA may be hampered by the data blocks being subject to different amounts of measurement error, or noise. To unveil the true mechanisms underlying the data, it can be fruitful to take noise heterogeneity into consideration in the data analysis. Maximum likelihood based SCA (MxLSCA-P) was developed for this purpose, and in a previous simulation study it outperformed normal SCA-P. That study, however, did not mimic typical functional genomics data sets in several respects, such as data blocks coupled via the experimental mode, more variables than experimental units, and medium to high correlations between variables. Here, we present a new simulation study in which the usefulness of MxLSCA-P compared to ordinary SCA-P is evaluated within a typical functional genomics setting. Subsequently, the performance of the two methods is evaluated by analysis of a real-life Escherichia coli metabolomics data set. Results In the simulation study, MxLSCA-P outperforms SCA-P in terms of recovery of the true underlying scores of the common mode and of the true values underlying the data entries. MxLSCA-P performed especially well when the simulated data blocks were subject to different noise levels. In the analysis of an E. coli metabolomics data set, MxLSCA-P provided a slightly better and more consistent interpretation. Conclusion MxLSCA-P is a promising addition to the SCA family. The analysis of coupled functional genomics data blocks can benefit from its ability to take different noise levels per data block into consideration, improving recovery of the true patterns underlying the data. Moreover, the maximum likelihood based approach underlying MxLSCA-P could be extended to provide custom-made solutions for specific problems encountered.
Collapse
|
23
|
A structured overview of simultaneous component based data integration. BMC Bioinformatics 2009; 10:246. [PMID: 19671149 PMCID: PMC2752463 DOI: 10.1186/1471-2105-10-246] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2009] [Accepted: 08/11/2009] [Indexed: 11/10/2022] Open
Abstract
Background Data integration is currently one of the main challenges in the biomedical sciences. Often different pieces of information are gathered on the same set of entities (e.g., tissues, culture samples, biomolecules) with the different pieces stemming, for example, from different measurement techniques. This implies that more and more data appear that consist of two or more data arrays that have a shared mode. An integrative analysis of such coupled data should be based on a simultaneous analysis of all data arrays. In this respect, the family of simultaneous component methods (e.g., SUM-PCA, unrestricted PCovR, MFA, STATIS, and SCA-P) is a natural choice. Yet, different simultaneous component methods may lead to quite different results. Results We offer a structured overview of simultaneous component methods that frames them in a principal components setting such that both the common core of the methods and the specific elements with regard to which they differ are highlighted. An overview of principles is given that may guide the data analyst in choosing an appropriate simultaneous component method. Several theoretical and practical issues are illustrated with an empirical example on metabolomics data for Escherichia coli as obtained with different analytical chemical measurement methods. Conclusion Of the aspects in which the simultaneous component methods differ, pre-processing and weighting are consequential. Especially, the type of weighting of the different matrices is essential for simultaneous component analysis. These types are shown to be linked to different specifications of the idea of a fair integration of the different coupled arrays.
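The abstract's key conclusion, that block weighting is what chiefly separates the simultaneous component methods, can be sketched as follows. This is an illustration of three common weighting schemes under stated assumptions, not code from the paper: no block weighting (SUM-PCA-style), dividing each block by its first singular value (MFA-style), and dividing by its Frobenius norm so every block contributes equal total variance.

```python
import numpy as np

def weighted_sca(blocks, scheme="mfa", n_comp=2):
    """Simultaneous component analysis of blocks sharing the sample (row) mode,
    with a selectable block-weighting scheme applied before decomposition."""
    weighted = []
    for X in blocks:
        Xc = X - X.mean(axis=0)                      # column-center each block
        if scheme == "mfa":      # MFA-style: equalize first singular values
            w = 1.0 / np.linalg.svd(Xc, compute_uv=False)[0]
        elif scheme == "frob":   # equal total variance per block
            w = 1.0 / np.linalg.norm(Xc)
        else:                    # "none": SUM-PCA-style, no block weighting
            w = 1.0
        weighted.append(w * Xc)
    X = np.hstack(weighted)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_comp] * s[:n_comp], Vt[:n_comp].T  # scores, loadings
```

Without weighting, a block with many variables or large variance dominates the leading components; the MFA-style and Frobenius-norm schemes are two different formalizations of a "fair" integration of the coupled arrays.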
Collapse
|
24
|
Joint mapping of genes and conditions via multidimensional unfolding analysis. BMC Bioinformatics 2007; 8:181. [PMID: 17550582 PMCID: PMC1904247 DOI: 10.1186/1471-2105-8-181] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2007] [Accepted: 06/05/2007] [Indexed: 11/13/2022] Open
Abstract
Background Microarray compendia profile the expression of genes across a number of experimental conditions. Such data compendia are useful not only for grouping genes and conditions based on their similarity in overall expression profiles but also for gaining information on more subtle relations between genes and conditions. Getting a clear visual overview of all these patterns in a single easy-to-grasp representation is a useful preliminary analysis step: we propose to use for this purpose an advanced exploratory method called multidimensional unfolding. Results We present a novel algorithm for multidimensional unfolding that overcomes both general problems and problems that are specific to the analysis of gene expression data sets. Applying the algorithm to two publicly available microarray compendia illustrates its power as a tool for exploratory data analysis: the unfolding analysis of a first data set resulted in a two-dimensional representation that clearly reveals temporal regulation patterns for the genes and a meaningful structure for the time points, while the analysis of a second data set showed the algorithm's ability to go beyond a mere identification of the genes that discriminate between different patient or tissue types. Conclusion Multidimensional unfolding offers a useful tool for preliminary exploration of microarray data: by relying on an easy-to-grasp low-dimensional geometric framework, relations among genes, among conditions, and between genes and conditions are simultaneously represented in an accessible way, which may reveal interesting patterns in the data. An additional advantage of the method is that it can be applied to the raw data without necessitating the choice of suitable genewise transformations of the data.
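The basic unfolding task, placing row points (genes) and column points (conditions) in one low-dimensional map so that inter-set distances match the observed dissimilarities, can be sketched with plain gradient descent on raw stress. This is a generic illustration, not the paper's algorithm (which is specifically designed to avoid degenerate solutions); the learning rate and iteration count are arbitrary assumptions.

```python
import numpy as np

def unfold(D, dim=2, lr=0.05, n_iter=3000, seed=0):
    """Metric multidimensional unfolding by gradient descent on raw stress.

    D -- (genes x conditions) dissimilarity matrix. Returns coordinates G for
    the row points and C for the column points whose inter-set Euclidean
    distances approximate D.
    """
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((D.shape[0], dim))
    C = rng.standard_normal((D.shape[1], dim))
    for _ in range(n_iter):
        diff = G[:, None, :] - C[None, :, :]           # pairwise vectors
        dist = np.linalg.norm(diff, axis=2) + 1e-9     # inter-set distances
        grad = ((dist - D) / dist)[:, :, None] * diff  # stress gradient terms
        G -= lr * grad.sum(axis=1) / D.shape[1]
        C += lr * grad.sum(axis=0) / D.shape[0]
    return G, C

def stress(D, G, C):
    """Raw stress: squared mismatch between modeled and given distances."""
    dist = np.linalg.norm(G[:, None, :] - C[None, :, :], axis=2)
    return np.sum((dist - D) ** 2)
```

Plotting `G` and `C` in the same plane gives the joint gene-condition map described in the abstract; proximity of a gene point to a condition point indicates a small dissimilarity.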
Collapse
|
25
|
Multidimensional Unfolding by Nonmetric Multidimensional Scaling of Spearman Distances in the Extended Permutation Polytope. Multivariate Behavioral Research 2007; 42:103-132. [PMID: 26821078 DOI: 10.1080/00273170701341167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
A multidimensional unfolding technique that is not prone to degenerate solutions and is based on multidimensional scaling of a complete data matrix is proposed: distance information about the unfolding data and about the distances both among judges and among objects is included in the complete matrix. The latter information is derived from the permutation polytope supplemented with the objects, called the preference sphere. In this sphere, distances are measured that are closely related to Spearman's rank correlation and that are comparable among each other so that an unconditional approach is reasonable. In two simulation studies, it is shown that the proposed technique leads to acceptable recovery of given preference structures. A major practical advantage of this unfolding technique is its relatively easy implementation in existing software for multidimensional scaling.
Collapse
|