1
Roberts AN. The Disease Loophole: Index Terms and Their Role in Disease Misclassification. The Journal of Medicine and Philosophy 2024; 49:178-194. [PMID: 38418099 DOI: 10.1093/jmp/jhae006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2024] Open
Abstract
The definitions of disease proffered by philosophers and medical actors typically require that a state of ill health be linked to some known bodily dysfunction before it is classified as a disease. I argue that such definitions of disease are not fully implementable in current medical discourse and practice. Adhering to the definitions would require that medical actors keep close track of the current state of knowledge on the causes and mechanisms of particular illnesses. Yet unaddressed problems in medical terminology can make this difficult to do. I show that unrecognized misuse of "heterogeneous," "biomarker," and other important health terms, which I call index terms, can misrepresent the current empirical evidence on illness pathophysiology, such that unvalidated illness constructs become mistaken for diseases. Thus, implementing common definitions of disease would require closing this "loophole" in medical discourse. I offer a simple rule that, if followed, could help do just that.
Affiliation(s)
- Alex N Roberts
- University of South Dakota, Vermillion, South Dakota, USA
2
Dridi N, Giremus A, Giovannelli JF, Truntzer C, Hadzagic M, Charrier JP, Gerfault L, Ducoroy P, Lacroix B, Grangeat P, Roy P. Bayesian inference for biomarker discovery in proteomics: an analytic solution. EURASIP Journal on Bioinformatics & Systems Biology 2017; 2017:9. [PMID: 28710702 PMCID: PMC5511129 DOI: 10.1186/s13637-017-0062-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/04/2016] [Accepted: 06/21/2017] [Indexed: 12/02/2022]
Abstract
This paper addresses the question of biomarker discovery in proteomics. Given clinical data regarding a list of proteins for a set of individuals, the problem is to extract a short subset of proteins whose concentrations indicate the biological status (healthy or pathological). In this paper, it is formulated as a specific instance of variable selection. The originality is that the proteins are not investigated one after the other; instead, the best partition between discriminant and non-discriminant proteins is sought directly. In this way, correlations between the proteins are intrinsically taken into account in the decision. The developed strategy is derived in a Bayesian setting, and the decision is optimal in the sense that it minimizes a global mean error. It is ultimately based on the posterior probabilities of the partitions. The main difficulty is to calculate these probabilities, since they are based on the so-called evidence, which requires marginalization of all the unknown model parameters. Two models are presented that relate the status to the protein concentrations, depending on whether the latter are biomarkers or not. The first model accounts for biological variabilities by assuming that the concentrations are Gaussian distributed with a mean and a covariance matrix that depend on the status only for the biomarkers. The second one is an extension that also takes into account the technical variabilities that may significantly impact the observed concentrations. The main contributions of the paper are: (1) a new Bayesian formulation of the biomarker selection problem, (2) the closed-form expression of the posterior probabilities in the noiseless case, and (3) a suitable approximated solution in the noisy case. The methods are numerically assessed and compared to the state-of-the-art methods (t test, LASSO, Bhattacharyya distance, FOHSIC) on synthetic and real data from proteins quantified in human serum by mass spectrometry in selected reaction monitoring mode.
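The final decision step described above can be illustrated with a small sketch. It assumes the log-evidence of each candidate partition has already been computed (the paper's closed-form expressions are not reproduced here). The numeric values, the uniform prior over partitions, and the names `partition_posteriors` and `log_evidence` are hypothetical and chosen only for illustration.

```python
import math

def partition_posteriors(log_evidence):
    """Turn per-partition log-evidence into posterior probabilities.

    Assumes a uniform prior over partitions (an illustrative assumption),
    so the posterior is proportional to the evidence. The log-sum-exp
    trick keeps the normalization numerically stable.
    """
    peak = max(log_evidence.values())
    log_norm = peak + math.log(
        sum(math.exp(value - peak) for value in log_evidence.values())
    )
    return {part: math.exp(value - log_norm) for part, value in log_evidence.items()}

# Hypothetical log-evidence for two proteins; each key marks which
# proteins are treated as discriminant (True) or non-discriminant (False).
log_evidence = {
    (False, False): -12.0,
    (True, False): -10.0,
    (False, True): -11.5,
    (True, True): -10.7,
}

posteriors = partition_posteriors(log_evidence)
best_partition = max(posteriors, key=posteriors.get)
print(best_partition, round(posteriors[best_partition], 3))
```

Selecting the most probable partition is one simple decision rule; the abstract states only that the decision is based on the posterior probabilities and minimizes a global mean error.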
Affiliation(s)
- Noura Dridi
- IMS (Univ. Bordeaux, CNRS, BINP), Talence, 33400 France
- National Engineering School of Gabes (ENIG), University of Gabes, Gabes, Tunisia
- Caroline Truntzer
- CLIPP, Pôle de Recherche Université de Bourgogne, Dijon, 21000 France
- Melita Hadzagic
- IMS (Univ. Bordeaux, CNRS, BINP), Talence, 33400 France
- NATO STO Centre for Maritime Research and Experimentation, La Spezia, 19126 Italy
- Laurent Gerfault
- Univ. Grenoble Alpes, Grenoble, F-38000 France
- CEA, LETI, MINATEC Campus, Grenoble, F-38054 France
- Patrick Ducoroy
- CLIPP, Pôle de Recherche Université de Bourgogne, Dijon, 21000 France
- Bruno Lacroix
- Technology Research Department, Innovation Unit, bioMérieux SA, Marcy l’Étoile, France
- Pierre Grangeat
- Univ. Grenoble Alpes, Grenoble, F-38000 France
- CEA, LETI, MINATEC Campus, Grenoble, F-38054 France
- Pascal Roy
- Service de Biostatistique - Bioinformatique, Hospices Civils de Lyon, Lyon, France
- CNRS UMR 5558, LBBE, Équipe Biostatistique Santé, Villeurbanne, France
- Université de Lyon, Université Claude Bernard Lyon 1, Lyon, France
- Pôle Rhône-Alpes de Bioinformatique, Université Claude Bernard - Lyon 1, Villeurbanne, 69622 France
3
Bhattacharjee M, Rajeevan MS, Sillanpää MJ. Prediction of complex human diseases from pathway-focused candidate markers by joint estimation of marker effects: case of chronic fatigue syndrome. Hum Genomics 2015; 9:8. [PMID: 26063326 PMCID: PMC4479222 DOI: 10.1186/s40246-015-0030-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2013] [Accepted: 05/28/2015] [Indexed: 11/16/2022] Open
Abstract
Background: The current practice of using only a few strongly associated genetic markers in regression models generally results in low power for predicting, or accounting for the heritability of, complex human traits.
Purpose: We illustrate a Bayesian principle of joint estimation of single nucleotide polymorphism (SNP) effects to improve prediction of phenotype status from pathway-focused sets of SNPs. Chronic fatigue syndrome (CFS), a complex disease of unknown etiology with no laboratory methods for diagnosis, was chosen to demonstrate the power of this Bayesian method. For CFS, such a genetic predictive model in combination with clinical evidence might lead to an earlier diagnosis than one based solely on clinical findings.
Methods: One of our goals is to model disease status using Bayesian statistics, which perform variable selection and parameter estimation simultaneously and can induce sparseness and smoothness of the SNP effects. Smoothness of the SNP effects is obtained by explicit modeling of their covariance structure.
Results: The Bayesian model achieved perfect goodness of fit when tested within the sampled data. Tenfold cross-validation resulted in 80% accuracy, one of the best so far for CFS in comparison to previous prediction models. Model reduction aspects were investigated in a computationally feasible manner. Additionally, genetic variation estimates provided by the model identified specific genetic markers for their biological role in the disease pathophysiology.
Conclusions: This proof-of-principle study provides a powerful approach combining Bayesian methods, SNPs representing multiple pathways and rigorous case ascertainment for accurate genetic risk prediction modeling of complex diseases like CFS and other chronic diseases.
Affiliation(s)
- Mangalathu S Rajeevan
- Division of High-Consequence Pathogens & Pathology, Centers for Disease Control and Prevention, Atlanta, 30333, USA.
- Mikko J Sillanpää
- Departments of Mathematical Sciences, Biocenter Oulu, University of Oulu, Oulu, FIN-90014, Finland.
4
Pikkuhookana P, Sillanpää MJ. Combined linkage disequilibrium and linkage mapping: Bayesian multilocus approach. Heredity (Edinb) 2013; 112:351-60. [PMID: 24253936 DOI: 10.1038/hdy.2013.111] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/22/2013] [Revised: 09/02/2013] [Accepted: 09/27/2013] [Indexed: 01/24/2023] Open
Abstract
Quantitative trait loci (QTL) affecting the phenotype of interest can be detected using linkage analysis (LA), linkage disequilibrium (LD) mapping or a combination of both (LDLA). The LA approach uses information from recombination events within the observed pedigree, and LD mapping uses information from the historical recombinations within the unobserved pedigree. We propose a Bayesian variable selection approach for combined LDLA analysis of single-nucleotide polymorphism (SNP) data. The novel approach uses both sources of information simultaneously, as is commonly done in plant and animal genetics, but it makes fewer assumptions about population demography than previous LDLA methods. This differs from approaches in human genetics, where LDLA methods use LA information conditional on LD information or the other way round. We argue that the multilocus LDLA model is more powerful for the detection of phenotype-genotype associations than single-locus LDLA analysis. To illustrate the performance of the Bayesian multilocus LDLA method, we analyzed simulation replicates based on real SNP genotype data from small three-generational CEPH families and compared the results with the commonly used quantitative transmission disequilibrium test (QTDT). This paper is intended to be conceptual in the sense that it is not meant to be a practical method for analyzing high-density SNP data, which is more common. Our aim was to test whether this approach can function in principle.
Affiliation(s)
- P Pikkuhookana
- [1] Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland [2] Department of Biology, University of Oulu, Oulu, Finland [3] Department of Mathematical Sciences, University of Oulu, Oulu, Finland [4] Biocenter Oulu, University of Oulu, Oulu, Finland
- M J Sillanpää
- [1] Department of Biology, University of Oulu, Oulu, Finland [2] Department of Mathematical Sciences, University of Oulu, Oulu, Finland [3] Biocenter Oulu, University of Oulu, Oulu, Finland
5
Ciregia F, Giusti L, Da Valle Y, Donadio E, Consensi A, Giacomelli C, Sernissi F, Scarpellini P, Maggi F, Lucacchini A, Bazzichi L. A multidisciplinary approach to study a couple of monozygotic twins discordant for the chronic fatigue syndrome: a focus on potential salivary biomarkers. J Transl Med 2013; 11:243. [PMID: 24088505 PMCID: PMC3850462 DOI: 10.1186/1479-5876-11-243] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 04/03/2013] [Accepted: 09/30/2013] [Indexed: 12/22/2022] Open
Abstract
Background: Chronic Fatigue Syndrome (CFS) is a severe, systemic illness characterized by persistent, debilitating and medically unexplained fatigue. The etiology and pathophysiology of CFS remain obscure, and diagnosis is formulated through the patient's history and exclusion of other medical causes. Therefore, the availability of biomarkers for CFS could be useful for clinical research. In the present study, we used a proteomic approach to evaluate the global changes in the salivary profile in a pair of monozygotic twins who were discordant for CFS. The aim was to evaluate differences in salivary protein expression in the CFS patient with respect to his healthy twin.
Methods: Saliva samples were submitted to two-dimensional electrophoresis (2DE). The gels were stained with Sypro, and the CFS subject and the healthy one were compared using the software Progenesis Same Spot, including an analysis of variance (ANOVA). The protein spots found with a ≥2-fold spot quantity change and p<0.05 were identified by nano-liquid chromatography electrospray ionization tandem mass spectrometry. To validate the expression changes found with 2DE for 5 proteins (14-3-3 protein zeta/delta, cyclophilin A, Cystatin-C, Protein S100-A7, and zinc-alpha-2-glycoprotein), we used western blot analysis. Moreover, differentially expressed proteins were functionally analyzed using the Ingenuity Pathways Analysis software to determine the predominant canonical pathways and the interaction network involved.
Results: The analysis of the protein profiles allowed us to find 13 proteins with a different expression in CFS with respect to the control. Nine spots were up-regulated in CFS and 4 down-regulated. These proteins belong to different functional classes, such as inflammatory response, immune system and metabolism. In particular, as shown by the pathway analysis, the network built with our proteins highlights the involvement of inflammatory response in CFS pathogenesis.
Conclusions: This study shows the presence of differentially expressed proteins in the saliva of the pair of monozygotic twins discordant for CFS, probably related to the disease. Consequently, we believe the proteomic approach could be useful both to define a panel of potential diagnostic biomarkers and to shed new light on the pathogenetic pathways of CFS.
Affiliation(s)
- Federica Ciregia
- Department of Pharmacy, University of Pisa, Via Bonanno 6, Pisa, 56126, Italy.
6
Association of active human herpesvirus-6, -7 and parvovirus b19 infection with clinical outcomes in patients with myalgic encephalomyelitis/chronic fatigue syndrome. Adv Virol 2012; 2012:205085. [PMID: 22927850 PMCID: PMC3426163 DOI: 10.1155/2012/205085] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 03/02/2012] [Revised: 06/21/2012] [Accepted: 06/28/2012] [Indexed: 11/17/2022] Open
Abstract
The frequency of active human herpesvirus-6 and -7 (HHV-6, HHV-7) and parvovirus B19 (B19) infection/coinfection and its association with the clinical course of ME/CFS were evaluated. 108 ME/CFS patients and 90 practically healthy persons were enrolled in the study. Viral genomic sequences were detected by PCR, virus-specific antibodies and cytokine levels by ELISA, and HHV-6 variants by restriction analysis. Active viral infection, including concurrent infection, was found in 64.8% (70/108) of patients and in 13.3% (12/90) of practically healthy persons. An increase in peripheral blood leukocyte DNA HHV-6 load as well as in proinflammatory cytokine levels was detected in patients during active viral infection. A definite relationship was observed between active betaherpesvirus infection and subfebrility, lymphadenopathy and malaise after exertion, and between active B19 infection and multijoint pain. Neuropsychological disturbances were detected in all patients. Symptoms occurred more frequently in patients with concurrent infection. The high rate of active HHV-6, HHV-7 and B19 infection/coinfection, with the simultaneous increase in plasma proinflammatory cytokine levels, as well as the association between active viral infection and distinctive types of clinical symptoms, shows the necessity of studying these viral infections simultaneously to identify possible subsets of ME/CFS.
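The group proportions quoted above can be checked with a few lines of arithmetic. The odds ratio in this sketch is not reported in the abstract; it is computed here only as an illustration from the stated counts, and the variable names are hypothetical.

```python
# Counts taken from the abstract: subjects with active viral infection.
patients_active, patients_total = 70, 108
controls_active, controls_total = 12, 90

# Proportion of each group with active infection, as a percentage.
pct_patients = 100 * patients_active / patients_total
pct_controls = 100 * controls_active / controls_total

# Illustrative odds ratio (not reported in the abstract):
# odds of active infection in patients divided by odds in controls.
odds_patients = patients_active / (patients_total - patients_active)
odds_controls = controls_active / (controls_total - controls_active)
odds_ratio = odds_patients / odds_controls

print(f"patients: {pct_patients:.1f}%  controls: {pct_controls:.1f}%")
print(f"odds ratio: {odds_ratio:.2f}")
```

The rounded percentages reproduce the 64.8% and 13.3% figures given in the abstract.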
7
Bhattacharjee M, Sillanpää MJ. A bayesian mixed regression based prediction of quantitative traits from molecular marker and gene expression data. PLoS One 2011; 6:e26959. [PMID: 22087238 PMCID: PMC3210128 DOI: 10.1371/journal.pone.0026959] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/10/2011] [Accepted: 10/07/2011] [Indexed: 11/19/2022] Open
Abstract
Both molecular marker and gene expression data were considered alone as well as jointly to serve as additive predictors for two pathogen-activity phenotypes in real recombinant inbred lines of soybean. For unobserved phenotype prediction, we used Bayesian hierarchical regression modeling, where the number of possible predictors in the model was controlled by the different selection strategies tested. Our initial findings were submitted to DREAM5 (the 5th Dialogue on Reverse Engineering Assessment and Methods challenge) and were judged to be the best in sub-challenge B3, wherein both functional genomic and genetic data were used to predict the phenotypes. In this work we improve upon this previous work by considering various predictor selection strategies, using cross-validation to measure the accuracy of in-data and out-data predictions. The results from various model choices indicate that, for these data, using both data types (namely functional genomic and genetic) simultaneously improves out-data prediction accuracy. Adequate goodness-of-fit can be easily achieved with more complex models for both phenotypes, since the number of potential predictors is large and the sample size is not small. We also studied gene-set enrichment (for the continuous phenotype) in the biological process in question and chromosomal enrichment of the gene set. The methodological contribution of this paper is the exploration of variable selection techniques to alleviate the problem of over-fitting. Different strategies based on the nature of the covariates were explored, and all methods were implemented under the Bayesian hierarchical modeling framework with indicator-based covariate selection. All the models based on a careful variable selection procedure were found to produce significant results according to a permutation test.
8
Lunn DJ, Wei C, Hovorka R. Fitting dynamic models with forcing functions: application to continuous glucose monitoring in insulin therapy. Stat Med 2011; 30:2234-50. [PMID: 21590789 PMCID: PMC3201840 DOI: 10.1002/sim.4254] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 07/14/2010] [Accepted: 03/07/2011] [Indexed: 11/16/2022]
Abstract
The artificial pancreas is an emerging technology to treat type 1 diabetes (T1D). It has the potential to revolutionize diabetes care and improve quality of life. The system requires extensive testing, however, to ensure that it is both effective and safe. Clinical studies are resource demanding, and so a principal aim is to develop an in silico population of subjects with T1D on which to conduct pre-clinical testing. This paper aims to reliably characterize the relationship between blood glucose and glucose measured by subcutaneous sensor as a major step towards this goal. Blood- and sensor-glucose are related through a dynamic model, specified in terms of differential equations. Such models can present special challenges for statistical inference, however. In this paper we make use of the BUGS software, which can accommodate a limited class of dynamic models, and it is in this context that we discuss such challenges. For example, we show how dynamic models involving forcing functions can be accommodated. To account for fluctuations away from the dynamic model that are apparent in the observed data, we assume an autoregressive structure for the residual error model. This leads to some identifiability issues but gives very good predictions of virtual data. Our approach is pragmatic and we propose a method to mitigate the consequences of such identifiability issues.
Affiliation(s)
- D J Lunn
- Medical Research Council Biostatistics Unit, Institute of Public Health, University Forvie Site, Cambridge, U.K.
9
Sillanpää MJ. Overview of techniques to account for confounding due to population stratification and cryptic relatedness in genomic data association analyses. Heredity (Edinb) 2010; 106:511-9. [PMID: 20628415 DOI: 10.1038/hdy.2010.91] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Indexed: 12/17/2022] Open
Abstract
Population-based genomic association analyses are more powerful than within-family analyses. However, population stratification (unknown or ignored origin of individuals from multiple source populations) and cryptic relatedness (unknown or ignored covariance between individuals because of their relatedness) are confounding factors in population-based genomic association analyses, which inflate the false-positive rate. As a consequence, false association signals may arise in genomic data association analyses for reasons other than true association between the tested genomic factor (marker genotype, gene or protein expression) and the study phenotype. It is therefore important to correct or account for these confounders in population-based genomic data association analyses. The common correction techniques for population stratification and cryptic relatedness problems are presented here in the phenotype-marker association analysis context, and comments on their suitability for other types of genomic association analyses (for example, phenotype-expression association) are also provided. Even though many of these techniques have originally been developed in the context of human genetics, most of them are also applicable to model organisms and breeding populations.
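The inflation of false positives described above can be made concrete with a short sketch. The abstract does not name the correction techniques it reviews; genomic control is used here only as one widely known example, and the function names and test statistics below are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (approx. 0.4549).
CHI2_1DF_MEDIAN = 0.4549

def genomic_inflation(chi2_stats):
    """Estimate the inflation factor lambda as the observed median
    association statistic divided by its expected value under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN

def genomic_control(chi2_stats):
    """Deflate every statistic by lambda; lambda is floored at 1 so that
    well-behaved statistics are never inflated."""
    lam = max(1.0, genomic_inflation(chi2_stats))
    return [stat / lam for stat in chi2_stats]

# Hypothetical 1-df chi-square statistics from a confounded scan.
observed = [0.1, 0.5, 0.9098, 2.0, 8.0]
lam = genomic_inflation(observed)
corrected = genomic_control(observed)
print(f"lambda = {lam:.2f}")
```

An inflation factor well above 1 suggests that stratification or cryptic relatedness is inflating the test statistics; dividing by it is a coarse, uniform correction and is only one of several possible approaches.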
Affiliation(s)
- M J Sillanpää
- Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.
10
11
Correcting for relatedness in Bayesian models for genomic data association analysis. Heredity (Edinb) 2009; 103:223-37. [PMID: 19455182 DOI: 10.1038/hdy.2009.56] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/09/2022] Open
Abstract
For small pedigrees, the issue of correcting for known or estimated relatedness structure in population-based Bayesian multilocus association analysis is considered. Two such relatedness corrections, [1] a random term arising from the infinite polygenic model and [2] a fixed covariate following the class D model of Bonney, are compared with the case of no correction using both simulated and real marker and gene-expression data from lymphoblastoid cell lines from four CEPH families. This comparison is performed with clinical quantitative trait locus (cQTL) models, that is, multilocus association models where marker data and expression levels of gene transcripts, as well as possible genotype × expression interaction terms, are jointly used to explain quantitative trait variation. We found that, regardless of whether the model includes a correction term, the cQTL models fit a few extra small-effect components (similar to finite polygenic models), which in itself serves as a relatedness correction. For small data and small heritability one may use the covariate model, which clearly outperforms the infinite polygenic model in small data examples.