1
Bour C, Ahne A, Aguayo G, Fischer A, Marcic D, Kayser P, Fagherazzi G. Global diabetes burden: analysis of regional differences to improve diabetes care. BMJ Open Diabetes Res Care 2022; 10(5):e003040. [PMID: 36307139 PMCID: PMC9621169 DOI: 10.1136/bmjdrc-2022-003040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 10/14/2022] [Indexed: 11/07/2022] Open
Abstract
INTRODUCTION The current evaluation processes of the burden of diabetes are incomplete and subject to bias. This study aimed to identify regional differences in the diabetes burden on a universal level from the perspective of people with diabetes. RESEARCH DESIGN AND METHODS We developed a worldwide online diabetes observatory based on 34 million diabetes-related tweets from 172 countries covering 41 languages, spanning from 2017 to 2021. After translating all tweets to English, we used machine learning algorithms to remove institutional tweets and jokes, geolocate users, identify topics of interest and quantify associated sentiments and emotions across the seven World Bank regions. RESULTS We identified four topics of interest for people with diabetes (PWD) in the Middle East and North Africa and another 18 topics in North America. Topics related to glycemic control and food are shared among six regions of the world. These topics were mainly associated with sadness (35% and 39% on average compared with levels of sadness in other topics). We also revealed several region-specific concerns (eg, insulin pricing in North America or the burden of daily diabetes management in Europe and Central Asia). CONCLUSIONS The needs and concerns of PWD vary significantly worldwide, and the burden of diabetes is perceived differently. Our results will support better integration of these regional differences into diabetes programs to improve patient-centric diabetes research and care, focused on the most relevant concerns to enhance personalized medicine and self-management of PWD.
Affiliation(s)
- Charline Bour
  - Department of Precision Health, Deep Digital Phenotyping Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
  - Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Adrian Ahne
  - Center for Research in Epidemiology and Population Health (CESP), INSERM, Villejuif (Paris), Île-de-France, France
- Gloria Aguayo
  - Department of Precision Health, Deep Digital Phenotyping Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
- Aurélie Fischer
  - Department of Precision Health, Deep Digital Phenotyping Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
- David Marcic
  - Department of Precision Health, Data Integration and Analysis Unit, Luxembourg Institute of Health, Strassen, Luxembourg
- Philippe Kayser
  - Department of Precision Health, Data Integration and Analysis Unit, Luxembourg Institute of Health, Strassen, Luxembourg
- Guy Fagherazzi
  - Department of Precision Health, Deep Digital Phenotyping Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
2
Ahne A, Khetan V, Tannier X, Rizvi MIH, Czernichow T, Orchard F, Bour C, Fano A, Fagherazzi G. Extraction of Explicit and Implicit Cause-Effect Relationships in Patient-Reported Diabetes-Related Tweets From 2017 to 2021: Deep Learning Approach. JMIR Med Inform 2022; 10:e37201. [PMID: 35852829 PMCID: PMC9346561 DOI: 10.2196/37201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/10/2022] [Revised: 05/17/2022] [Accepted: 06/04/2022] [Indexed: 11/25/2022] Open
Abstract
Background Intervening in and preventing diabetes distress requires an understanding of its causes and, in particular, from a patient’s perspective. Social media data provide direct access to how patients see and understand their disease and consequently show the causes of diabetes distress. Objective Leveraging machine learning methods, we aim to extract both explicit and implicit cause-effect relationships in patient-reported diabetes-related tweets and provide a methodology to better understand the opinions, feelings, and observations shared within the diabetes online community from a causality perspective. Methods More than 30 million diabetes-related tweets in English were collected between April 2017 and January 2021. Deep learning and natural language processing methods were applied to focus on tweets with personal and emotional content. A cause-effect tweet data set was manually labeled and used to train (1) a fine-tuned BERTweet model to detect causal sentences containing a causal relation and (2) a conditional random field model with Bidirectional Encoder Representations from Transformers (BERT)-based features to extract possible cause-effect associations. Causes and effects were clustered in a semisupervised approach and visualized in an interactive cause-effect network. Results Causal sentences were detected with a recall of 68% in an imbalanced data set. A conditional random field model with BERT-based features outperformed a fine-tuned BERT model for cause-effect detection with a macro recall of 68%. This led to 96,676 sentences with cause-effect relationships. “Diabetes” was identified as the central cluster followed by “death” and “insulin.” Insulin pricing–related causes were frequently associated with death. 
Conclusions A novel methodology was developed to detect causal sentences and identify both explicit and implicit, single and multiword cause, and the corresponding effect, as expressed in diabetes-related tweets leveraging BERT-based architectures and visualized as cause-effect network. Extracting causal associations in real life, patient-reported outcomes in social media data provide a useful complementary source of information in diabetes research.
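The macro recall quoted in the abstract above gives every class equal weight, which is why it is a common choice on an imbalanced data set. A minimal sketch of the metric follows; the function name and the toy labels are hypothetical illustrations and are not taken from the study.

```python
from collections import defaultdict


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)    # correctly predicted instances per true class
    totals = defaultdict(int)  # number of instances per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical, imbalanced toy labels (illustration only, not study data).
y_true = ["causal", "causal", "other", "other", "other", "other", "other", "other"]
y_pred = ["causal", "other", "other", "other", "other", "other", "other", "causal"]
score = macro_recall(y_true, y_pred)
```

In this toy case the minority class has a recall of 1/2 and the majority class 5/6, so the macro average is lower than the plain accuracy of 6/8; the rare class counts as much as the frequent one.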
Affiliation(s)
- Adrian Ahne
  - Center of Epidemiology and Population Health, Inserm, Hospital Gustave Roussy, Paris-Saclay University, Villejuif, France
  - Epiconcept Company, Paris, France
- Vivek Khetan
  - Accenture Labs, San Francisco, CA, United States
- Xavier Tannier
  - Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé, Inserm, University Sorbonne Paris Nord, Sorbonne University, Paris, France
- Charline Bour
  - Deep Digital Phenotyping Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Andrew Fano
  - Accenture Labs, San Francisco, CA, United States
- Guy Fagherazzi
  - Deep Digital Phenotyping Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
3
Ahne A, Fagherazzi G, Tannier X, Czernichow T, Orchard F. Improving Diabetes-Related Biomedical Literature Exploration in the Clinical Decision-making Process via Interactive Classification and Topic Discovery: Methodology Development Study. J Med Internet Res 2022; 24:e27434. [PMID: 35040795 PMCID: PMC8808347 DOI: 10.2196/27434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2021] [Revised: 04/06/2021] [Accepted: 11/10/2021] [Indexed: 11/30/2022] Open
Abstract
Background The amount of available textual health data such as scientific and biomedical literature is constantly growing and becoming more and more challenging for health professionals to properly summarize those data and practice evidence-based clinical decision making. Moreover, the exploration of unstructured health text data is challenging for professionals without computer science knowledge due to limited time, resources, and skills. Current tools to explore text data lack ease of use, require high computational efforts, and incorporate domain knowledge and focus on topics of interest with difficulty. Objective We developed a methodology able to explore and target topics of interest via an interactive user interface for health professionals with limited computer science knowledge. We aim to reach near state-of-the-art performance while reducing memory consumption, increasing scalability, and minimizing user interaction effort to improve the clinical decision-making process. The performance was evaluated on diabetes-related abstracts from PubMed. Methods The methodology consists of 4 parts: (1) a novel interpretable hierarchical clustering of documents where each node is defined by headwords (words that best represent the documents in the node), (2) an efficient classification system to target topics, (3) minimized user interaction effort through active learning, and (4) a visual user interface. We evaluated our approach on 50,911 diabetes-related abstracts providing a hierarchical Medical Subject Headings (MeSH) structure, a unique identifier for a topic. Hierarchical clustering performance was compared against the implementation in the machine learning library scikit-learn. 
On a subset of 2000 randomly chosen diabetes abstracts, our active learning strategy was compared against 3 other strategies: random selection of training instances, uncertainty sampling that chooses instances about which the model is most uncertain, and an expected gradient length strategy based on convolutional neural networks (CNNs). Results For the hierarchical clustering performance, we achieved an F1 score of 0.73 compared to 0.76 achieved by scikit-learn. Concerning active learning performance, after 200 chosen training samples based on these strategies, the weighted F1 score of all MeSH codes resulted in a satisfying 0.62 F1 score using our approach, 0.61 using the uncertainty strategy, 0.63 using the CNN, and 0.45 using the random strategy. Moreover, our methodology showed a constant low memory use with increased number of documents. Conclusions We proposed an easy-to-use tool for health professionals with limited computer science knowledge who combine their domain knowledge with topic exploration and target specific topics of interest while improving transparency. Furthermore, our approach is memory efficient and highly parallelizable, making it interesting for large Big Data sets. This approach can be used by health professionals to gain deep insights into biomedical literature to ultimately improve the evidence-based clinical decision making process.
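Of the baseline strategies named in the abstract above, uncertainty sampling is the easiest to illustrate. The sketch below uses the least-confidence variant, which is an assumption: the abstract does not say which uncertainty measure was used. The function name and the probability rows are hypothetical.

```python
def least_confident(probabilities, k):
    """Return the indices of the k unlabeled instances whose highest predicted
    class probability is lowest, i.e. those the model is least sure about."""
    confidence = [max(row) for row in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:k]


# Hypothetical class-probability rows for four unlabeled abstracts
# (illustration only, not study data).
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],
]
to_label = least_confident(pool, k=2)
```

Here the fourth and second rows are selected for labeling because their top probabilities (0.34 and 0.40) are the lowest in the pool; a random strategy would instead pick indices without looking at the model's predictions.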
Affiliation(s)
- Adrian Ahne
  - Exposome and Heredity team, Center of Epidemiology and Population Health, Hospital Gustave Roussy, Inserm, Paris-Saclay University, Villejuif, France
  - Epiconcept Company, Paris, France
- Guy Fagherazzi
  - Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Luxembourg, Luxembourg
- Xavier Tannier
  - Laboratoire d'Informatique Medicale et d'Ingenierie des Connaissances pour la e-Sante, Limics, Inserm, University Sorbonne Paris Nord, Sorbonne University, Paris, France
4
Bour C, Ahne A, Schmitz S, Perchoux C, Dessenne C, Fagherazzi G. The Use of Social Media for Health Research Purposes: Scoping Review. J Med Internet Res 2021; 23:e25736. [PMID: 34042593 PMCID: PMC8193478 DOI: 10.2196/25736] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/16/2020] [Revised: 01/15/2021] [Accepted: 03/18/2021] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND As social media are increasingly used worldwide, more and more scientists are relying on them for their health-related projects. However, social media features, methodologies, and ethical issues are unclear so far because, to our knowledge, there has been no overview of this relatively young field of research. OBJECTIVE This scoping review aimed to provide an evidence map of the different uses of social media for health research purposes, their fields of application, and their analysis methods. METHODS We followed the scoping review methodologies developed by Arksey and O'Malley and the Joanna Briggs Institute. After developing search strategies based on keywords (eg, social media, health research), comprehensive searches were conducted in the PubMed/MEDLINE and Web of Science databases. We limited the search strategies to documents written in English and published between January 1, 2005, and April 9, 2020. After removing duplicates, articles were screened at the title and abstract level and at the full text level by two independent reviewers. One reviewer extracted data, which were descriptively analyzed to map the available evidence. RESULTS After screening 1237 titles and abstracts and 407 full texts, 268 unique papers were included, dating from 2009 to 2020 with an average annual growth rate of 32.71% for the 2009-2019 period. Studies mainly came from the Americas (173/268, 64.6%, including 151 from the United States). Articles used machine learning or data mining techniques (60/268) to analyze the data, discussed opportunities and limitations of the use of social media for research (59/268), assessed the feasibility of recruitment strategies (45/268), or discussed ethical issues (16/268). Communicable (eg, influenza, 40/268) and then chronic (eg, cancer, 24/268) diseases were the two main areas of interest. 
CONCLUSIONS Since their early days, social media have been recognized as resources with high potential for health research purposes, yet the field is still suffering from strong heterogeneity in the methodologies used, which prevents the research from being compared and generalized. For the field to be fully recognized as a valid, complementary approach to more traditional health research study designs, there is now a need for more guidance by types of applications of social media for health research, both from a methodological and an ethical perspective. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) RR2-10.1136/bmjopen-2020-040671.
Affiliation(s)
- Charline Bour
  - Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Adrian Ahne
  - Inserm U1018, Center for Research in Epidemiology and Population Health (CESP), Paris Saclay University, Villejuif, France
  - Epiconcept, Paris, France
- Susanne Schmitz
  - Competence Centre for Methodology and Statistics, Luxembourg Institute of Health, Strassen, Luxembourg
- Camille Perchoux
  - Luxembourg Institute of Socio-Economic Research, Esch/Alzette, Luxembourg
- Coralie Dessenne
  - Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Guy Fagherazzi
  - Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
5
Bour C, Schmitz S, Ahne A, Perchoux C, Dessenne C, Fagherazzi G. Scoping review protocol on the use of social media for health research purposes. BMJ Open 2021; 11:e040671. [PMID: 33574143 PMCID: PMC7880087 DOI: 10.1136/bmjopen-2020-040671] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/20/2020] [Revised: 10/27/2020] [Accepted: 01/21/2021] [Indexed: 02/01/2023] Open
Abstract
INTRODUCTION More than one-third of the world population uses at least one form of social media. Since their advent in 2005, health-oriented research based on social media data has largely increased as discussions about health issues are broadly shared online and generate a large amount of health-related data. The objective of this scoping review is to provide an evidence map of the various uses of social media for health research purposes, their fields of applications and their analysis methods. METHODS AND ANALYSIS This scoping review will follow the Arksey and O'Malley methodological framework (2005) as well as the Joanna Briggs Institute Reviewer's manual. Relevant publications will be first searched on the PubMed/MEDLINE database and then on Web of Science. We will focus on literature published between January 2005 and April 2020. All articles related to the use of social media or networks for health-oriented research purposes will be included. A first search will be conducted with some keywords in order to identify relevant articles. After identifying the research strategy, a two-part study selection process will be systematically applied by two reviewers. The first part consists of screening titles and abstracts found, thanks to the search strategy, to define the eligibility of each article. In the second part, the full texts will be screened and only relevant articles will be kept. Data will finally be extracted, collated and charted to summarise all the relevant methods, outcomes and key findings in the articles. ETHICS AND DISSEMINATION This scoping review will provide an extensive overview of the use of social media for health research purposes. Opportunities as well as future ethical, methodological and technical challenges will also be discussed based on our findings to define a new research agenda. Results will be disseminated through a peer-reviewed publication.
Affiliation(s)
- Charline Bour
  - Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Susanne Schmitz
  - Department of Population Health, Competence Center for Methodology and Statistics, Luxembourg Institute of Health, Strassen, Luxembourg
- Adrian Ahne
  - Center for Research in Epidemiology and Population Health (CESP), Inserm U1018, Villejuif, France
  - Epiconcept, Paris, France
- Camille Perchoux
  - Urban Development and Mobility, Luxembourg Institute of Socio-Economic Research (LISER), Esch-sur-Alzette, Luxembourg
- Coralie Dessenne
  - Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Guy Fagherazzi
  - Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
6
Ahne A, Orchard F, Tannier X, Perchoux C, Balkau B, Pagoto S, Harding JL, Czernichow T, Fagherazzi G. Insulin pricing and other major diabetes-related concerns in the USA: a study of 46 407 tweets between 2017 and 2019. BMJ Open Diabetes Res Care 2020; 8(1):e001190. [PMID: 32503810 PMCID: PMC7282343 DOI: 10.1136/bmjdrc-2020-001190] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/15/2020] [Revised: 03/24/2020] [Accepted: 04/04/2020] [Indexed: 11/15/2022] Open
Abstract
INTRODUCTION Little research has been done to systematically evaluate concerns of people living with diabetes through social media, which has been a powerful tool for social change and to better understand perceptions around health-related issues. This study aims to identify key diabetes-related concerns in the USA and primary emotions associated with those concerns using information shared on Twitter. RESEARCH DESIGN AND METHODS A total of 11.7 million diabetes-related tweets in English were collected between April 2017 and July 2019. Machine learning methods were used to filter tweets with personal content, to geolocate (to the USA) and to identify clusters of tweets with emotional elements. A sentiment analysis was then applied to each cluster. RESULTS We identified 46 407 tweets with emotional elements in the USA from which 30 clusters were identified; 5 clusters (18% of tweets) were related to insulin pricing with both positive emotions (joy, love) referring to advocacy for affordable insulin and sadness emotions related to the frustration of insulin prices, 5 clusters (12% of tweets) to solidarity and support with a majority of joy and love emotions expressed. The most negative topics (10% of tweets) were related to diabetes distress (24% sadness, 27% anger, 21% fear elements), to diabetic and insulin shock (45% anger, 46% fear) and comorbidities (40% sadness). CONCLUSIONS Using social media data, we have been able to describe key diabetes-related concerns and their associated emotions. More specifically, we were able to highlight the real-world concerns of insulin pricing and its negative impact on mood. Using such data can be a useful addition to current measures that inform public decision making around topics of concern and burden among people with diabetes.
Affiliation(s)
- Adrian Ahne
  - Center for Research in Epidemiology and Population Health (CESP), INSERM, University Paris Saclay, Villejuif (Paris), Île-de-France, France
  - Epiconcept Company, Paris, France
- Xavier Tannier
  - LIMICS, INSERM U1142, Sorbonne University, Paris, Île-de-France, France
- Camille Perchoux
  - Luxembourg Institute of Socio-Economic Research, Esch-sur-Alzette, Luxembourg
- Beverley Balkau
  - Center for Research in Epidemiology and Population Health (CESP), INSERM, University Paris Saclay, Villejuif (Paris), Île-de-France, France
- Sherry Pagoto
  - Department of Allied Health Sciences, UConn Center for mHealth & Social Media, University of Connecticut, Storrs, Connecticut, USA
- Jessica Lee Harding
  - Division of Transplantation, Department of Surgery, Emory University School of Medicine, Emory University Hospital, Atlanta, Georgia, USA
- Guy Fagherazzi
  - Digital Epidemiology Hub, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
7
Ahne A, Müller-Derlich J, Merlos-Lange AM, Kanbay F, Wolf K, Lang BF. Two distinct mechanisms for deletion in mitochondrial DNA of Schizosaccharomyces pombe mutator strains. Slipped mispairing mediated by direct repeats and erroneous intron splicing. J Mol Biol 1988; 202:725-34. [PMID: 3172236 DOI: 10.1016/0022-2836(88)90553-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 0.9] [Indexed: 01/04/2023]
Abstract
Mutator strains of the fission yeast Schizosaccharomyces pombe produce mitochondrial respiratory deficient mutants at a high rate, and roughly 20% of these mutants carry deletions in the range of 50 to 1500 base-pairs. To elucidate the mechanism of deletion we have sequenced ten deletion mutants in the mosaic gene encoding apocytochrome b (cob) and three in the split gene coding for the first subunit of cytochrome c oxidase (cox1). Of 13 deletions, ten are correlated with the presence of direct repeats, which could promote deletions by slipped mispairing during DNA replication. In some of these mutants, the termini are located in possible DNA secondary structures. In three independently isolated mutants with identical deletions in the cob gene, the 5' deletion endpoint coincides with the 3' splice point of the intron, whereas the 3' endpoint of the deletion exhibits pronounced homology with the 5' splice point of the intron. This result suggests that these deletions might be initiated by erroneous RNA splicing.
Affiliation(s)
- A Ahne
  - Institut für Genetik und Mikrobiologie, Universität München, FRG