1
ABCD_Harmonizer: An Open-source Tool for Mapping and Controlling for Scanner Induced Variance in the Adolescent Brain Cognitive Development Study. Neuroinformatics 2023; 21:323-337. [PMID: 36940062 PMCID: PMC10849121 DOI: 10.1007/s12021-023-09624-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 02/13/2023] [Indexed: 03/21/2023]
Abstract
Data from multisite magnetic resonance imaging (MRI) studies contain variance attributable to the scanner that can reduce statistical power and potentially bias results if not appropriately managed. The Adolescent Brain Cognitive Development (ABCD) study is an ongoing, longitudinal neuroimaging study acquiring data from over 11,000 children starting at 9-10 years of age. These scans are acquired on 29 different scanners of 5 different model types manufactured by 3 different vendors. Publicly available data from the ABCD study include structural MRI (sMRI) measures such as cortical thickness and diffusion MRI (dMRI) measures such as fractional anisotropy. In this work, we 1) quantify the variance attributable to scanner effects in the sMRI and dMRI datasets, 2) demonstrate the effectiveness of the data harmonization approach called ComBat to address scanner effects, and 3) present a simple, open-source tool for investigators to harmonize image features from the ABCD study. Scanner-induced variance was present in every image feature and varied in magnitude by feature type and brain location. For almost all features, scanner variance exceeded variability attributable to age and sex. ComBat harmonization was shown to effectively remove scanner-induced variance from all image features while preserving the biological variability in the data. Moreover, we show that for studies examining relatively small subsamples of the ABCD dataset, the use of ComBat-harmonized data provides more accurate estimates of effect sizes compared to controlling for scanner effects using ordinary least squares regression.
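To make the idea of removing scanner effects concrete, the following is a minimal sketch of a plain location-and-scale adjustment for a single image feature. It is not ComBat and not the ABCD_Harmonizer tool: ComBat additionally models biological covariates and shrinks the per-scanner estimates with empirical Bayes. The function name `harmonize_location_scale` and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def harmonize_location_scale(values, scanners):
    """Align per-scanner mean and spread of one feature (illustrative only).

    Simplification: no covariates and no empirical Bayes shrinkage, so this
    is NOT ComBat; it only shows the location/scale idea behind it.
    """
    grand_mean = mean(values)

    # Group the feature values by scanner label.
    by_scanner = {}
    for value, scanner in zip(values, scanners):
        by_scanner.setdefault(scanner, []).append(value)

    # Per-scanner mean and (population) standard deviation.
    stats = {s: (mean(vs), pstdev(vs)) for s, vs in by_scanner.items()}

    # Pooled spread of the within-scanner residuals.
    residuals = [v - stats[s][0] for v, s in zip(values, scanners)]
    pooled_sd = pstdev(residuals)

    harmonized = []
    for value, scanner in zip(values, scanners):
        scanner_mean, scanner_sd = stats[scanner]
        # Standardize within scanner; guard against a constant scanner group.
        z = (value - scanner_mean) / scanner_sd if scanner_sd > 0 else 0.0
        harmonized.append(grand_mean + pooled_sd * z)
    return harmonized


# Toy cortical-thickness-like values from two hypothetical scanners.
thickness = [2.0, 2.2, 2.4, 3.0, 3.4, 3.8]
scanner_ids = ["A", "A", "A", "B", "B", "B"]
adjusted = harmonize_location_scale(thickness, scanner_ids)
```

After the adjustment, both scanner groups share the same mean and spread. Note that a sketch like this would also remove genuine biological differences if they happened to be confounded with scanner, which is one reason covariate-aware methods are preferred in practice.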
2
Evaluation of functional MRI-based human brain parcellation: a review. J Neurophysiol 2022; 128:197-217. [PMID: 35675446 DOI: 10.1152/jn.00411.2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Brain parcellations play a crucial role in the analysis of brain imaging data sets, as they can significantly affect the outcome of the analysis. In recent years, several novel approaches for constructing MRI-based brain parcellations have been developed with promising results. In the absence of ground truth, several evaluation approaches have been used to evaluate currently available brain parcellations. In this article, we review and critique methods used for evaluating functional brain parcellations constructed using fMRI data sets. We also describe how some of these evaluation methods have been used to estimate the optimal parcellation granularity. We provide a critical discussion of the current approach to the problem of identifying the optimal brain parcellation that is suited for a given neuroimaging study. We argue that the criteria for an optimal brain parcellation must depend on the application the parcellation is intended for. We describe a teleological approach to the evaluation of brain parcellations, where brain parcellations are evaluated in different contexts and optimal brain parcellations for each context are identified separately. We conclude by discussing several directions for further research that would result in improved evaluation strategies.
3
Feature selection framework for functional connectome fingerprinting. Hum Brain Mapp 2021; 42:3717-3732. [PMID: 34076306 PMCID: PMC8288098 DOI: 10.1002/hbm.25379] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/07/2020] [Revised: 12/03/2020] [Accepted: 02/09/2021] [Indexed: 12/03/2022]
Abstract
The ability to uniquely characterize individual subjects based on their functional connectome (FC) is a key requirement for progress toward precision psychiatry. FC fingerprinting is increasingly studied in the neuroimaging community for this purpose, where a variety of approaches have been developed for effective FC fingerprinting. Recent independent studies showed that fingerprinting accuracy suffers at large sample sizes and when coarser parcellations are used for computing the FC. Quantifying this problem and understanding the reasons these factors impact fingerprinting accuracy is crucial to develop more accurate fingerprinting methods for large sample sizes. Part of the challenge in fingerprinting is that FC captures both generic and subject‐specific information. A systematic approach for identifying subject‐specific FC information is crucial for making progress in addressing the fingerprinting problem. In this study, we addressed three gaps in our understanding of the FC fingerprinting problem. First, we studied the joint effect of sample size and parcellation granularity. Second, we explained the reason for reduced fingerprinting accuracy with increased sample size and reduced parcellation granularity. To this end, we used a clustering quality metric from the data mining community. Third, we developed a general feature selection framework for systematically identifying resting‐state functional connectivity (RSFC) elements that capture information to uniquely identify subjects. In sum, we evaluated six different approaches from this framework by quantifying both subject‐specific fingerprinting accuracy and the decrease in accuracy with an increase in sample size to identify which approach improved quality metrics the most.
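As a rough illustration of what fingerprinting accuracy measures, the sketch below matches each subject's second-session connectome to the most correlated first-session connectome and reports the fraction of correct matches. This correlation-matching scheme is a common baseline in the fingerprinting literature and is an assumption here; the abstract does not state that it is the exact procedure used. The function names and toy vectors are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def identification_accuracy(session1, session2):
    """Fraction of session-2 connectomes matched to the right subject.

    Both arguments map a subject ID to a flat list of FC edge values
    (for example, the upper triangle of a correlation matrix).
    """
    correct = 0
    for subject, target in session2.items():
        # Predicted identity: the session-1 subject with the highest correlation.
        best = max(session1, key=lambda s: pearson(session1[s], target))
        correct += best == subject
    return correct / len(session2)


# Toy example with two subjects and four FC edges each.
scan_a = {"s1": [0.1, 0.5, 0.9, 0.3], "s2": [0.8, 0.2, 0.1, 0.7]}
scan_b = {"s1": [0.2, 0.4, 0.8, 0.3], "s2": [0.7, 0.3, 0.2, 0.6]}
accuracy = identification_accuracy(scan_a, scan_b)
```

With more subjects, the chance that some other subject's connectome correlates more strongly with the target grows, which is one intuitive reading of why accuracy drops at large sample sizes.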
4
Sensitization to peanut, egg or pets is associated with skin barrier dysfunction in children with atopic dermatitis. Clin Exp Allergy 2021; 51:666-673. [PMID: 33721370 DOI: 10.1111/cea.13866] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/08/2020] [Revised: 02/21/2021] [Accepted: 02/24/2021] [Indexed: 11/27/2022]
Abstract
BACKGROUND Children with atopic dermatitis (AD) are often sensitized to food and aeroallergens, but sensitization patterns have not been analysed with biologic measures of disease pathogenicity. OBJECTIVE We sought to define allergen sensitization grouping(s) using unbiased machine learning and determine their associations with skin filaggrin (FLG) and transepidermal water loss (TEWL) (assesses skin barrier integrity), S100A8 and S100A9 expression (assesses skin inflammation) and AD severity. METHODS We studied 400 children with AD in the Mechanisms of Progression from Atopic Dermatitis to Asthma in Children (MPAACH) cohort to identify groupings of food and aeroallergen sensitizations. MPAACH is a paediatric AD cohort, aged 1-2, recruited through hospital/community settings between 2016 and 2018. We analysed these groupings' associations with AD biomarkers: skin FLG, S100A8 and S100A9 expression, total IgE, TEWL and AD severity. RESULTS An unbiased machine learning approach revealed five allergen clusters. The most common cluster (N = 131), SPTPEP, had sensitization to peanut, egg and/or pets. Three low prevalence clusters, which included children with allergen sensitization other than peanut, egg or pets, were combined into SPTOTHER. SPTNEG included children with no sensitization(s). SPTPEP children had higher median non-lesional TEWL (16.9 g/m²/h) and IgE (90 kU/L) compared with SPTOTHER (8.8 g/m²/h and 24 kU/L; p = .01 and p < .001) and SPTNEG (9 g/m²/h and 26 kU/L; p = .003 and p < .001). SPTPEP children had lower median lesional (0.70) and non-lesional (1.09) FLG expression compared with SPTOTHER (lesional: 0.9; p = .047, non-lesional: 1.78; p = .01) and SPTNEG (lesional: 1.47; p < .001, non-lesional: 2.21; p < .001). There were no differences among groupings in S100A8 or S100A9 expression.
CONCLUSIONS AND CLINICAL RELEVANCE In this largely clinic-based cohort of young children with AD, allergic sensitization to peanut, egg, cat or dog was associated with more severe disease and skin barrier dysfunction but not markers of cutaneous inflammation. These data need replicating in a population-based cohort but may have important implications for understanding the interaction between AD and allergic sensitization.
5
A spatiotemporal analysis of opioid poisoning mortality in Ohio from 2010 to 2016. Sci Rep 2021; 11:4692. [PMID: 33633131 PMCID: PMC7907120 DOI: 10.1038/s41598-021-83544-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/31/2020] [Accepted: 01/27/2021] [Indexed: 11/09/2022]
Abstract
Opioid-related deaths have severely increased since 2000 in the United States. This crisis has been declared a public health emergency, and among the most affected states is Ohio. We used statewide vital statistic data from the Ohio Department of Health (ODH) and demographics data from the U.S. Census Bureau to analyze opioid-related mortality from 2010 to 2016. We focused on the characterization of the demographics from the population of opioid-related fatalities, spatiotemporal pattern analysis using Moran's statistics at the census-tract level, and comorbidity analysis using frequent itemset mining and association rule mining. We found higher rates of opioid-related deaths in white males aged 25-54 compared to the rest of Ohioans. Deaths tended to increasingly cluster around Cleveland, Columbus and Cincinnati and away from rural regions as time progressed. We also found relatively high co-occurrence of cardiovascular disease, anxiety or drug abuse history, with opioid-related mortality. Our results demonstrate that state-wide spatiotemporal and comorbidity analysis of the opioid epidemic could provide novel insights into how the demographic characteristics, spatiotemporal factors, and/or health conditions may be associated with opioid-related deaths in the state of Ohio.
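For readers unfamiliar with Moran's statistics, the sketch below computes the global Moran's I for values observed on a set of areal units. The abstract does not say which Moran variant or spatial weights the study used, so the binary adjacency weights and the four-tract toy layout here are assumptions made purely for illustration. The statistic is I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, where m is the mean of the values and W is the sum of all weights.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix.

    weights[i][j] is the spatial weight between units i and j (0 if they are
    not neighbours). Values near +1 indicate clustering of similar values,
    values near -1 indicate a checkerboard-like pattern.
    """
    n = len(values)
    mean_value = sum(values) / n
    deviations = [v - mean_value for v in values]

    weight_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    numerator = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    denominator = sum(d * d for d in deviations)
    return (n / weight_sum) * numerator / denominator


# Four hypothetical tracts arranged in a line; neighbours share a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered_rates = [1.0, 1.0, 5.0, 5.0]    # similar values sit next to each other
alternating_rates = [1.0, 5.0, 1.0, 5.0]  # neighbours always differ

i_clustered = morans_i(clustered_rates, chain)      # positive
i_alternating = morans_i(alternating_rates, chain)  # negative
```

In a real analysis the weights would come from census-tract geometry and significance would typically be assessed by permutation, neither of which is shown here.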
6
Progress in developing a hybrid deep learning algorithm for identifying and locating primary vertices. EPJ WEB OF CONFERENCES 2021. [DOI: 10.1051/epjconf/202125104012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
The locations of proton-proton collision points in LHC experiments are called primary vertices (PVs). Preliminary results of a hybrid deep learning algorithm for identifying and locating these, targeting the Run 3 incarnation of LHCb, have been described at conferences in 2019 and 2020. In the past year we have made significant progress in a variety of related areas. Using two newer Kernel Density Estimators (KDEs) as input feature sets improves the fidelity of the models, as does using full LHCb simulation rather than the “toy Monte Carlo” originally (and still) used to develop models. We have also built a deep learning model to calculate the KDEs from track information. Connecting a tracks-to-KDE model to a KDE-to-hists model used to find PVs provides a proof-of-concept that a single deep learning model can use track information to find PVs with high efficiency and high fidelity. We have studied a variety of models systematically to understand how variations in their architectures affect performance. While the studies reported here are specific to the LHCb geometry and operating conditions, the results suggest that the same approach could be used by the ATLAS and CMS experiments.
7
Resolving single-cell heterogeneity from hundreds of thousands of cells through sequential hybrid clustering and NMF. Bioinformatics 2020; 36:3773-3780. [PMID: 32207533 PMCID: PMC7320606 DOI: 10.1093/bioinformatics/btaa201] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 06/17/2019] [Revised: 02/20/2020] [Accepted: 03/19/2020] [Indexed: 12/13/2022]
Abstract
Motivation The rapid proliferation of single-cell RNA-sequencing (scRNA-Seq) technologies has spurred the development of diverse computational approaches to detect transcriptionally coherent populations. While the complexity of the algorithms for detecting heterogeneity has increased, most require significant user-tuning, are heavily reliant on dimension reduction techniques and are not scalable to ultra-large datasets. We previously described a multi-step algorithm, Iterative Clustering and Guide-gene Selection (ICGS), which applies intra-gene correlation and hybrid clustering to uniquely resolve novel transcriptionally coherent cell populations from an intuitive graphical user interface. Results We describe a new iteration of ICGS that outperforms state-of-the-art scRNA-Seq detection workflows when applied to well-established benchmarks. This approach combines multiple complementary subtype detection methods (HOPACH, sparse non-negative matrix factorization, cluster ‘fitness’, support vector machine) to resolve rare and common cell-states, while minimizing differences due to donor or batch effects. Using data from multiple cell atlases, we show that the PageRank algorithm effectively downsamples ultra-large scRNA-Seq datasets, without losing extremely rare or transcriptionally similar yet distinct cell types and while recovering novel transcriptionally distinct cell populations. We believe this new approach holds tremendous promise in reproducibly resolving hidden cell populations in complex datasets. Availability and implementation ICGS2 is implemented in Python. The source code and documentation are available at http://altanalyze.org. Supplementary information Supplementary data are available at Bioinformatics online.
8
Chronic Dysregulation of Cortical and Subcortical Metabolism After Experimental Traumatic Brain Injury. Mol Neurobiol 2019; 56:2908-2921. [PMID: 30069831 PMCID: PMC7584385 DOI: 10.1007/s12035-018-1276-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/12/2018] [Accepted: 07/23/2018] [Indexed: 02/03/2023]
Abstract
Traumatic brain injury (TBI) is a leading cause of death and long-term disability worldwide. Although chronic disability is common after TBI, effective treatments remain elusive and chronic TBI pathophysiology is not well understood. Early after TBI, brain metabolism is disrupted due to unregulated ion release, mitochondrial damage, and interruption of molecular trafficking. This metabolic disruption causes at least part of the TBI pathology. However, it is not clear how persistent or pervasive metabolic injury is at later stages of injury. Using untargeted 1H-NMR metabolomics, we examined ex vivo hippocampus, striatum, thalamus, frontal cortex, and brainstem tissue in a rat lateral fluid percussion model of chronic brain injury. We found altered tissue concentrations of metabolites in the hippocampus and thalamus consistent with dysregulation of energy metabolism and excitatory neurotransmission. Furthermore, differential correlation analysis provided additional evidence of metabolic dysregulation, most notably in brainstem and frontal cortex, suggesting that metabolic consequences of injury are persistent and widespread. Interestingly, the patterns of network changes were region-specific. The individual metabolic signatures after injury in different structures of the brain at rest may reflect different compensatory mechanisms engaged to meet variable metabolic demands across brain regions.
9
mCrave: Continuous Estimation of Craving During Smoking Cessation. PROCEEDINGS OF THE ... ACM INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING . UBICOMP (CONFERENCE) 2016; 2016:863-874. [PMID: 27990501 DOI: 10.1145/2971648.2971672] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 10/21/2022]
Abstract
Craving usually precedes a lapse for impulsive behaviors such as overeating, drinking, smoking, and drug use. Passive estimation of craving from sensor data in the natural environment can be used to assist users in coping with craving. In this paper, we take the first steps towards developing a computational model to estimate cigarette craving (during smoking abstinence) at the minute-level using mobile sensor data. We use 2,012 hours of sensor data and 1,812 craving self-reports from 61 participants in a smoking cessation study. To estimate craving, we first obtain a continuous measure of stress from sensor data. We find that during hours of day when craving is high, stress associated with self-reported high craving is greater than stress associated with low craving. We use this and other insights to develop feature functions, and encode them as pattern detectors in a Conditional Random Field (CRF) based model to infer craving probabilities.
10
Connectivity cluster analysis for discovering discriminative subnetworks in schizophrenia. Hum Brain Mapp 2014; 36:756-67. [PMID: 25394864 DOI: 10.1002/hbm.22662] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 01/09/2014] [Accepted: 10/07/2014] [Indexed: 11/10/2022]
Abstract
In this manuscript, we present connectivity cluster analysis (CoCA), a novel computational framework that takes advantage of structure of the brain networks to magnify reproducible signals and quash noise. Resting state functional Magnetic Resonance Imaging (fMRI) data that is used in estimating functional brain networks is often noisy, leading to reduced power and inconsistent findings across independent studies. There is a need for techniques that can unearth signals in noisy datasets, while addressing redundancy in the functional connections that are used for testing association. CoCA is a data driven approach that addresses the problems of redundancy and noise by first finding groups of region pairs that behave in a cohesive way across the subjects. These cohesive sets of functional connections are further tested for association with the disease. CoCA is applied in the context of patients with schizophrenia, a disorder characterized as a disconnectivity syndrome. Our results suggest that CoCA can find reproducible sets of functional connections that behave cohesively. Applying this technique, we found that the connectivity clusters joining thalamus to parietal, temporal, and visuoparietal regions are highly discriminative of schizophrenia patients as well as reproducible using retest data and replicable in an independent confirmatory sample.
11
Complex biomarker discovery in neuroimaging data: Finding a needle in a haystack. Neuroimage Clin 2013; 3:123-31. [PMID: 24179856 PMCID: PMC3791294 DOI: 10.1016/j.nicl.2013.07.004] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 03/08/2013] [Revised: 06/27/2013] [Accepted: 07/16/2013] [Indexed: 12/17/2022]
Abstract
Neuropsychiatric disorders such as schizophrenia, bipolar disorder and Alzheimer's disease are major public health problems. However, despite decades of research, we currently have no validated prognostic or diagnostic tests that can be applied at an individual patient level. Many neuropsychiatric diseases are due to a combination of alterations that occur in a human brain rather than the result of localized lesions. While there is hope that newer imaging technologies such as functional and anatomic connectivity MRI or molecular imaging may offer breakthroughs, the single biomarkers that are discovered using these datasets are limited by their inability to capture the heterogeneity and complexity of most multifactorial brain disorders. Recently, complex biomarkers have been explored to address this limitation using neuroimaging data. In this manuscript we consider the nature of complex biomarkers being investigated in the recent literature and present techniques to find such biomarkers that have been developed in related areas of data mining, statistics, machine learning and bioinformatics.
12
Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers. BMC Genomics 2013; 14:440. [PMID: 23822816 PMCID: PMC3703268 DOI: 10.1186/1471-2164-14-440] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 07/04/2012] [Accepted: 06/26/2013] [Indexed: 12/21/2022]
Abstract
BACKGROUND Many large-scale studies analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has been extended to provide a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method to integrate copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer. RESULTS We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancers, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in majority of the cancers. We also identified pathways that are only disrupted in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cycle, and differentiation for tumorigenesis. CONCLUSIONS In this work, we demonstrated that the network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancers, which are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/.
13
Neurometrics of intrinsic connectivity networks at rest using fMRI: retest reliability and cross-validation using a meta-level method. Neuroimage 2013; 76:236-51. [PMID: 23507379 DOI: 10.1016/j.neuroimage.2013.02.066] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 08/30/2012] [Revised: 02/19/2013] [Accepted: 02/24/2013] [Indexed: 01/02/2023]
Abstract
Functional images of the resting brain can be empirically parsed into intrinsic connectivity networks (ICNs) which closely resemble patterns of evoked task-based brain activity and which have a biological and genetic basis. Recently, ICNs have become popular for investigating brain functioning and brain-behavior relationships. However, the replicability and neurometrics of these networks are only beginning to be reported. Using a meta-level independent component analysis (ICA), we produced ICNs from three data sets collected from two samples of healthy adults. The ICNs from our data sets demonstrated robust and independent replication of 12 intrinsic networks that reflected 17 canonical, task-based, brain networks. We found within-subject reliability of ICNs was modest overall, but ranged from poor to good, and that voxels with the highest measured connectivity rarely had the highest reliability. Networks associated with executive functions, visuospatial reasoning, motor coordination, speech and audition, default mode, vision, and interoception showed moderate to high group-level reproducibility and replicability. However, only the first four of these networks also showed fair or better within-subject reliability over time. Our findings highlight the replicability of ICNs across data sets, the range of within-subject neurometrics across different networks, and the shared characteristics between resting and task-based networks.
14
Abstract
Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways.
15
Abstract
Genetic interactions provide a powerful perspective into gene function, but our knowledge of the specific mechanisms that give rise to these interactions is still relatively limited. The availability of a global genetic interaction map in Saccharomyces cerevisiae, covering ∼30% of all possible double mutant combinations, provides an unprecedented opportunity for an unbiased assessment of the native structure within genetic interaction networks and how it relates to gene function and modular organization. Toward this end, we developed a data mining approach to exhaustively discover all block structures within this network, which allowed for its complete modular decomposition. The resulting modular structures revealed the importance of the context of individual genetic interactions in their interpretation and revealed distinct trends among genetic interaction hubs as well as insights into the evolution of duplicate genes. Block membership also revealed a surprising degree of multifunctionality across the yeast genome and enabled a novel association of VIP1 and IPK1 with DNA replication and repair, which is supported by experimental evidence. Our modular decomposition also provided a basis for testing the between-pathway model of negative genetic interactions and within-pathway model of positive genetic interactions. While we find that most modular structures involving negative genetic interactions fit the between-pathway model, we found that current models for positive genetic interactions fail to explain 80% of the modular structures detected. We also find differences between the modular structures of essential and nonessential genes.
16
Genomic variation in myeloma: design, content, and initial application of the Bank On A Cure SNP Panel to detect associations with progression-free survival. BMC Med 2008; 6:26. [PMID: 18778477 PMCID: PMC2553089 DOI: 10.1186/1741-7015-6-26] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 07/29/2008] [Accepted: 09/08/2008] [Indexed: 01/07/2023]
Abstract
BACKGROUND We have engaged in an international program designated the Bank On A Cure, which has established DNA banks from multiple cooperative and institutional clinical trials, and a platform for examining the association of genetic variations with disease risk and outcomes in multiple myeloma. We describe the development and content of a novel custom SNP panel that contains 3404 SNPs in 983 genes, representing cellular functions and pathways that may influence disease severity at diagnosis, toxicity, progression or other treatment outcomes. A systematic search of national databases was used to identify non-synonymous coding SNPs and SNPs within transcriptional regulatory regions. To explore SNP associations with PFS we compared SNP profiles of short term (less than 1 year, n = 70) versus long term progression-free survivors (greater than 3 years, n = 73) in two phase III clinical trials. RESULTS Quality controls were established, demonstrating an accurate and robust screening panel for genetic variations, and some initial racial comparisons of allelic variation were done. A variety of analytical approaches, including machine learning tools for data mining and recursive partitioning analyses, demonstrated predictive value of the SNP panel in survival. While the entire SNP panel showed genotype predictive association with PFS, some SNP subsets were identified within drug response, cellular signaling and cell cycle genes. CONCLUSION A targeted gene approach was undertaken to develop an SNP panel that can test for associations with clinical outcomes in myeloma. The initial analysis provided some predictive power, demonstrating that genetic variations in the myeloma patient population may influence PFS.