1
Gonçalves R, Blaauwendraad S, Avraam D, Beneíto A, Charles MA, Elhakeem A, Escribano J, Etienne L, García-Baquero Moneo G, Soares AG, de Groot J, Grote V, Gruszfeld D, Guerlich K, Guxens M, Heude B, Koletzko B, Lertxundi A, Lozano M, El Marroun H, McEachan R, Pinot de Moira A, Santorelli G, Strandberg-Larsen K, Tafflet M, Vainqueur C, Verduci E, Vrijheid M, Welten M, Wright J, Yang TC, Gaillard R, Jaddoe VW. Early-life growth and emotional, behavior and cognitive outcomes in childhood and adolescence in the EU child cohort network: individual participant data meta-analysis of over 109,000 individuals. The Lancet Regional Health. Europe 2025; 52:101247. [PMID: 40094119 PMCID: PMC11910110 DOI: 10.1016/j.lanepe.2025.101247] [Received: 11/12/2024] [Revised: 02/14/2025] [Accepted: 02/14/2025] [Indexed: 03/19/2025]
Abstract
Background: Fetal and infant development might be critical for cognitive outcomes and psychopathology later in life. We assessed the associations of birth characteristics and early life growth with behavior and cognitive outcomes from childhood to adolescence. Methods: We used harmonized data of 109,481 children from 8 European birth cohorts. Birth weight, gestational age, and body mass index (BMI) tertiles at the age of 2 years were used as the exposure variables. Outcomes included internalizing and externalizing problems and attention-deficit hyperactivity disorder (ADHD), autism spectrum disorder (ASD), and non-verbal intelligence quotient (non-verbal IQ) in childhood (4-10 years), early adolescence (11-16 years), and late adolescence (17-20 years). We used 1-stage individual participant data meta-analyses using generalized linear models. Findings: A one-week older gestational age was associated with lower scores for internalizing problems (difference -0·48 (95% CI: -0·59, -0·37)), externalizing problems (difference -0·34 (95% CI: -0·44, -0·23)), and ADHD symptoms (difference -0·38 (95% CI: -0·49, -0·27)), and with higher scores for non-verbal IQ (difference 0·65 (95% CI: 0·41, 0·89)). As compared to term birth, preterm birth was associated with higher internalizing problems (difference 3·43 (95% CI: 2·52, 4·33)) and externalizing problems (difference 2·31 (95% CI: 1·16, 3·46)), ADHD symptoms (difference 4·15 (95% CI: 3·15, 5·16)), ASD symptoms (difference 3·23 (95% CI: 0·37, 6·08)), and lower non-verbal IQ (difference -5·44 (95% CI: -7·44, -3·44)). Small size for gestational age at birth (SGA) in comparison with appropriate size for gestational age (AGA) was associated with higher ADHD symptoms (difference 4·88 (95% CI: 3·87, 5·90)) and lower non-verbal IQ (difference -7·02 (95% CI: -8·84, -5·21)). Large size for gestational age at birth was associated with lower ADHD symptoms (difference -1·09 (95% CI: -1·73, 0·45)) and higher non-verbal IQ (difference 2·47 (95% CI: 0·77, 4·18)). Explorative analyses showed that as compared to children with an appropriate size for gestational age at birth and a normal BMI at the age of 2 years, children born SGA who remained small at 2 years had the lowest non-verbal IQ score (difference -8·14 percentiles (95% CI: -11·89, -4·39)). Interpretation: Both fetal and early childhood growth are associated with emotional, behavioral and cognitive outcomes throughout childhood and adolescence. Compensatory infant growth might partly attenuate the adverse effects of suboptimal fetal growth. Future studies are needed to identify the potential for optimizing mental health outcomes in new generations by improving early-life growth. Funding: This project received funding from the European Union's Horizon 2020 research and innovation programme (LIFECYCLE, grant agreement No 733206, 2016; EUCAN-Connect grant agreement No 824989; ATHLETE, grant agreement No 874583).
Affiliation(s)
- Romy Gonçalves
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Department of Paediatrics, Sophia's Children's Hospital, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Sophia Blaauwendraad
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Department of Paediatrics, Sophia's Children's Hospital, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Demetris Avraam
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
| | - Andrea Beneíto
- Catalan Institute of Health-Camp de Tarragona, Tarragona, Spain
- Epidemiology and Environmental Health Joint Research Unit, FISABIO−Universitat Jaume I−Universitat de València, Valencia, Spain
| | - Marie-Aline Charles
- Université de Paris, Centre of Research in Epidemiology and Statistics, Inserm, Inrae, Paris, France
- Ined, Inserm, EFS Joint Unit Elfe, Paris, France
| | - Ahmed Elhakeem
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Joaquin Escribano
- Department of Paediatrics, Sant Joan Reus Hospital, University Rovira i Virgili, IISPV, Reus, Spain
| | - Louise Etienne
- Centre Hospitalier Chretien St. Vincent, Rocourt, Liège-Rocourt, Belgium
| | - Gonzalo García-Baquero Moneo
- Faculty of Biology, University of Salamanca, Salamanca, Spain
- Biogipuzkoa Health Research Institute, Donostia, Spain
| | - Ana Gonçalves Soares
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Jasmin de Groot
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Department of Paediatrics, Sophia's Children's Hospital, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Veit Grote
- Division of Metabolic and Nutritional Medicine, Department of Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, Munich, Germany
- German Center for Child and Adolescent Health, Munich, Germany
| | - Dariusz Gruszfeld
- Neonatal Department and Neonatal Intensive Care Unit, Children's Memorial Health Institute, Warsaw, Poland
| | - Kathrin Guerlich
- Division of Metabolic and Nutritional Medicine, Department of Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, Munich, Germany
- German Center for Child and Adolescent Health, Munich, Germany
| | - Monica Guxens
- ISGlobal, Barcelona, Spain
- ICREA, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Instituto de Salud Carlos III, Madrid, Spain
- Department of Child and Adolescent Psychiatry/Psychology, Erasmus MC, University Medical Centre, Rotterdam, the Netherlands
| | - Barbara Heude
- Université de Paris, Centre of Research in Epidemiology and Statistics, Inserm, Inrae, Paris, France
| | - Berthold Koletzko
- Division of Metabolic and Nutritional Medicine, Department of Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, Munich, Germany
- German Center for Child and Adolescent Health, Munich, Germany
| | - Aitana Lertxundi
- Biogipuzkoa Health Research Institute, Donostia, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Instituto de Salud Carlos III, Madrid, Spain
- Department of Preventive Medicine and Public Health, Faculty of Medicine, University of the Basque Country (UPV/EHU), Leioa, Spain
| | - Manuel Lozano
- Epidemiology and Environmental Health Joint Research Unit, FISABIO−Universitat Jaume I−Universitat de València, Valencia, Spain
- Preventive Medicine and Public Health, Food Sciences, Toxicology and Forensic Medicine Department, Universitat de València, Valencia, Spain
| | - Hanan El Marroun
- Department of Child and Adolescent Psychiatry/Psychology, Erasmus MC, University Medical Centre, Rotterdam, the Netherlands
- Department of Psychology, Education and Child Studies, Erasmus School of Social and Behavioural Science, Erasmus University Rotterdam, Rotterdam, the Netherlands
| | - Rosie McEachan
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, BD9 6RJ, United Kingdom
| | - Angela Pinot de Moira
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
- Imperial College London, United Kingdom
| | - Gillian Santorelli
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, BD9 6RJ, United Kingdom
| | | | - Muriel Tafflet
- Université de Paris, Centre of Research in Epidemiology and Statistics, Inserm, Inrae, Paris, France
| | - Chloe Vainqueur
- Université de Paris, Centre of Research in Epidemiology and Statistics, Inserm, Inrae, Paris, France
| | - Elvira Verduci
- Department of Paediatrics, Vittore Buzzi Children's Hospital, University of Milan, Milan, Italy
| | - Martine Vrijheid
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Instituto de Salud Carlos III, Madrid, Spain
| | - Marieke Welten
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Department of Paediatrics, Sophia's Children's Hospital, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - John Wright
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, BD9 6RJ, United Kingdom
| | - Tiffany C. Yang
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, BD9 6RJ, United Kingdom
| | - Romy Gaillard
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Department of Paediatrics, Sophia's Children's Hospital, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Vincent W.V. Jaddoe
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Department of Paediatrics, Sophia's Children's Hospital, Erasmus University Medical Center, Rotterdam, the Netherlands
2
Avraam D, Wilson RC, Aguirre Chan N, Banerjee S, Bishop TRP, Butters O, Cadman T, Cederkvist L, Duijts L, Escribà Montagut X, Garner H, Gonçalves G, González JR, Haakma S, Hartlev M, Hasenauer J, Huth M, Hyde E, Jaddoe VWV, Marcon Y, Mayrhofer MT, Molnar-Gabor F, Morgan AS, Murtagh M, Nestor M, Nybo Andersen AM, Parker S, Pinot de Moira A, Schwarz F, Strandberg-Larsen K, Swertz MA, Welten M, Wheater S, Burton P. DataSHIELD: mitigating disclosure risk in a multi-site federated analysis platform. Bioinformatics Advances 2025; 5:vbaf046. [PMID: 40191546 PMCID: PMC11968321 DOI: 10.1093/bioadv/vbaf046] [Received: 11/15/2024] [Revised: 02/13/2025] [Accepted: 03/05/2025] [Indexed: 04/09/2025]
Abstract
Motivation: The validity of epidemiologic findings can be increased using triangulation, i.e. comparison of findings across contexts, and by having sufficiently large amounts of relevant data to analyse. However, access to data is often constrained by practical considerations and by ethico-legal and data governance restrictions. Gaining access to such data can be time-consuming due to the governance requirements associated with data access requests to institutions in different jurisdictions. Results: DataSHIELD is a software solution that enables remote analysis without the need for data transfer (federated analysis). DataSHIELD is a scientifically mature, open-source data access and analysis platform aligned with the 'Five Safes' framework, the international framework governing safe research access to data. It allows real-time analysis while mitigating disclosure risk through an active multi-layer system of disclosure-preventing mechanisms. This combination of real-time remote statistical analysis, disclosure prevention mechanisms, and federation capabilities makes DataSHIELD a solution for addressing many of the technical and regulatory challenges in performing large-scale statistical analysis of health and biomedical data. This paper describes the key components that comprise the disclosure protection system of DataSHIELD. These broadly fall into three classes: (i) system protection elements, (ii) analysis protection elements, and (iii) governance protection elements. Availability and implementation: Information about the DataSHIELD software is available at https://datashield.org/ and https://github.com/datashield.
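The idea of analysis without data transfer, combined with a disclosure check, can be illustrated with a minimal Python sketch. It is not DataSHIELD code (DataSHIELD itself is R-based), and every name in it, including the `MIN_CELL_COUNT` threshold of 5, is a hypothetical choice made for illustration. Each site releases only a count and a sum, and refuses to release anything when it holds too few records.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical disclosure threshold for this sketch; not a DataSHIELD default.
MIN_CELL_COUNT = 5


@dataclass
class SiteSummary:
    """Aggregate statistics a site is willing to release."""
    n: int
    total: float


def site_summary(values: List[float],
                 min_count: int = MIN_CELL_COUNT) -> Optional[SiteSummary]:
    """Runs at the data site: return aggregates only, or None if too few records."""
    if len(values) < min_count:
        return None  # releasing a tiny group could disclose individual values
    return SiteSummary(n=len(values), total=sum(values))


def pooled_mean(summaries: List[Optional[SiteSummary]]) -> float:
    """Runs at the analyst's client: combine the released aggregates."""
    usable = [s for s in summaries if s is not None]
    if not usable:
        raise ValueError("no site returned a releasable summary")
    n = sum(s.n for s in usable)
    return sum(s.total for s in usable) / n


# Individual-level records never leave the three sites below.
site_a = [3.0, 4.0, 5.0, 6.0, 7.0]
site_b = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
site_c = [1.0, 2.0]  # below the threshold, so nothing is released

summaries = [site_summary(v) for v in (site_a, site_b, site_c)]
result = pooled_mean(summaries)
print(result)
```

In this sketch the threshold check stands in for an analysis protection element only; the system and governance protection elements named in the abstract are outside what a code fragment can show.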
Affiliation(s)
- Demetris Avraam
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, DK-1353, Denmark
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, L69 3GF, United Kingdom
| | - Rebecca C Wilson
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, L69 3GF, United Kingdom
| | - Noemi Aguirre Chan
- BioQuant, Faculty of Law, Heidelberg University, Heidelberg, 69120, Germany
| | - Soumya Banerjee
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, United Kingdom
| | - Tom R P Bishop
- MRC Epidemiology Unit, University of Cambridge, Cambridge, CB2 0QQ, United Kingdom
| | - Olly Butters
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, L69 3GF, United Kingdom
| | - Tim Cadman
- Department of Genetics, University of Groningen and University Medical Center Groningen, Groningen, 9713 AV, The Netherlands
- Barcelona Institute for Global Health (ISGlobal), Barcelona, 08003, Spain
| | - Luise Cederkvist
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, DK-1353, Denmark
| | - Liesbeth Duijts
- Department of Pediatrics, Division of Respiratory Medicine and Allergology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Department of Neonatal and Pediatric Intensive Care, Division of Neonatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | | | - Hugh Garner
- National Innovation Centre for Aging, Newcastle University, Newcastle upon Tyne, NE4 5TG, United Kingdom
| | - Gonçalo Gonçalves
- Human-Centered Computing and Information Science, INESC TEC, Porto, 4200-465, Portugal
| | - Juan R González
- Barcelona Institute for Global Health (ISGlobal), Barcelona, 08003, Spain
- Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública, Barcelona, 08003, Spain
| | - Sido Haakma
- Department of Genetics, University of Groningen and University Medical Center Groningen, Groningen, 9713 AV, The Netherlands
| | - Mette Hartlev
- Centre for Legal Studies in Welfare and Market, Faculty of Law, University of Copenhagen, Copenhagen, DK-2300, Denmark
| | - Jan Hasenauer
- Life and Medical Sciences (LIMES) Institute and Bonn Center for Mathematical Life Sciences, University of Bonn, Bonn, 53115, Germany
| | - Manuel Huth
- Life and Medical Sciences (LIMES) Institute and Bonn Center for Mathematical Life Sciences, University of Bonn, Bonn, 53115, Germany
| | - Eleanor Hyde
- Department of Genetics, University of Groningen and University Medical Center Groningen, Groningen, 9713 AV, The Netherlands
| | - Vincent W V Jaddoe
- Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | | | | | | | - Andrei Scott Morgan
- Elizabeth Garrett Anderson Institute for Women’s Health London, University College London, London, WC1E 6DE, United Kingdom
- Obstetric, Perinatal, Paediatric and Life Course Epidemiology Team (OPPaLE), Center for Research in Epidemiology and StatisticS (CRESS), Institut National pour la Santé et la Recherche Médicale (INSERM, French Institute for Health and Medical Research), Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement (INRAe), Paris Cité University, Paris, 75010, France
| | - Madeleine Murtagh
- School of Social and Political Sciences, University of Glasgow, Glasgow, G12 8RT, United Kingdom
| | - Marc Nestor
- BioQuant, Faculty of Law, Heidelberg University, Heidelberg, 69120, Germany
| | - Anne-Marie Nybo Andersen
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, DK-1353, Denmark
| | - Simon Parker
- BioQuant, Faculty of Law, Heidelberg University, Heidelberg, 69120, Germany
- German Human Genome-phenome Archive, DKFZ, Heidelberg, D-69120, Germany
| | - Angela Pinot de Moira
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, DK-1353, Denmark
- School of Public Health, Imperial College London, London, W12 0BZ, United Kingdom
| | - Florian Schwarz
- Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, 14558, Germany
| | - Katrine Strandberg-Larsen
- Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, DK-1353, Denmark
| | - Morris A Swertz
- Department of Genetics, University of Groningen and University Medical Center Groningen, Groningen, 9713 AV, The Netherlands
| | - Marieke Welten
- Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | - Stuart Wheater
- Arjuna Technologies, Newcastle upon Tyne, NE4 5TG, United Kingdom
| | - Paul Burton
- Population Health Sciences Institute, Newcastle University, Newcastle, NE2 4AX, United Kingdom
3
Cadman T, Slofstra MK, van der Geest MA, Avraam D, Bishop TRP, de Boer T, Duijts L, Haakma S, Hyde E, Jaddoe V, Karramass T, Kelpin F, Marcon Y, Pinot de Moira A, Postma D, Tolboom C, Veenstra RL, Wheater S, Welten M, Wilson RC, Zwart E, Swertz M. MOLGENIS Armadillo: a lightweight server for federated analysis using DataSHIELD. Bioinformatics 2024; 41:btae726. [PMID: 39673440 PMCID: PMC11734753 DOI: 10.1093/bioinformatics/btae726] [Received: 09/30/2024] [Revised: 11/21/2024] [Accepted: 11/29/2024] [Indexed: 12/16/2024]
Abstract
Summary: Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis, which involves remote data analysis without sharing individual-level data, is a promising alternative. One solution is DataSHIELD (https://datashield.org/), an open-source R-based implementation. To enable federated analysis, data owners need a user-friendly way to install the federated infrastructure and manage users and data. Here, we present MOLGENIS Armadillo: a lightweight server for federated analysis solutions such as DataSHIELD. Availability and implementation: Armadillo is implemented as a collection of three packages freely available under the open source licence LGPLv3: two R packages downloadable from the Comprehensive R Archive Network (CRAN) ("MolgenisArmadillo" and "DSMolgenisArmadillo") and one Java application ("ArmadilloService") as jar and docker images via GitHub (https://github.com/molgenis/molgenis-service-armadillo).
Affiliation(s)
- Tim Cadman
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Mariska K Slofstra
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Marije A van der Geest
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Demetris Avraam
- Department of Public Health, University of Copenhagen, Copenhagen, 1353, Denmark
| | - Tom R P Bishop
- Medical Research Council Epidemiology Unit, University of Cambridge School of Clinical Medicine, Cambridge, CB2 0QQ, United Kingdom
| | - Tommy de Boer
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Liesbeth Duijts
- Department of Neonatal and Pediatric Intensive Care, Division of Neonatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | - Sido Haakma
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Eleanor Hyde
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Vincent Jaddoe
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | - Tarik Karramass
- Department of Neonatal and Pediatric Intensive Care, Division of Neonatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | - Fleur Kelpin
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | | | - Angela Pinot de Moira
- Department of Epidemiology and Biostatistics, Imperial College London, London, W2 1PG, United Kingdom
| | - Dick Postma
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Clemens Tolboom
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Ruben L Veenstra
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Stuart Wheater
- Arjuna Technologies, Newcastle Helix, Urban Science Building, Newcastle upon Tyne, NE4 5TG, United Kingdom
| | - Marieke Welten
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
- Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, 3015 GD, The Netherlands
| | - Rebecca C Wilson
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, L69 3GF, United Kingdom
| | - Erik Zwart
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
| | - Morris Swertz
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, 9700 RB, The Netherlands
4
Escriba-Montagut X, Marcon Y, Anguita-Ruiz A, Avraam D, Urquiza J, Morgan AS, Wilson RC, Burton P, Gonzalez JR. Federated privacy-protected meta- and mega-omics data analysis in multi-center studies with a fully open-source analytic platform. PLoS Comput Biol 2024; 20:e1012626. [PMID: 39652598 DOI: 10.1371/journal.pcbi.1012626] [Received: 10/27/2023] [Revised: 12/19/2024] [Accepted: 11/10/2024] [Indexed: 12/21/2024]
Abstract
The importance of maintaining data privacy and complying with regulatory requirements is highlighted especially when sharing omic data between different research centers. This challenge is even more pronounced in the scenario where a multi-center effort for collaborative omics studies is necessary. OmicSHIELD is introduced as an open-source tool aimed at overcoming these challenges by enabling privacy-protected federated analysis of sensitive omic data. In order to ensure this, multiple security mechanisms have been included in the software. This innovative tool is capable of managing a wide range of omic data analyses specifically tailored to biomedical research. These include genome and epigenome wide association studies and differential gene expression analyses. OmicSHIELD is designed to support both meta- and mega-analysis, so that it offers a wide range of capabilities for different analysis designs. We present a series of use cases illustrating some examples of how the software addresses real-world analyses of omic data.
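The distinction between meta- and mega-analysis can be made concrete with a small sketch. In a meta-analysis each center fits its own model and only the estimates are combined; in a mega-analysis the records are analysed as one pooled dataset. The Python snippet below shows the meta-analysis side only, using generic inverse-variance fixed-effect pooling. It is a textbook formula, not OmicSHIELD's API, and the function name and the input numbers are invented for illustration.

```python
import math
from typing import List, Tuple


def fixed_effect_meta(estimates: List[float],
                      std_errors: List[float]) -> Tuple[float, float]:
    """Inverse-variance fixed-effect pooling of per-center effect estimates.

    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]  # precision of each center
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Two hypothetical centers reporting an effect estimate and its standard error.
pooled, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.2])
print(pooled, pooled_se)
```

The more precise center (smaller standard error) dominates the pooled estimate, which is the usual motivation for weighting by inverse variance.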
Affiliation(s)
- Xavier Escriba-Montagut
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | | | - Augusto Anguita-Ruiz
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Demetris Avraam
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, United Kingdom
| | - Jose Urquiza
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
| | - Andrei S Morgan
- Université Paris Cité, Centre of Research in Epidemiology and StatisticS (CRESS), Obstetrical Perinatal and Pediatric Epidemiology Research Team (EPOPé), INSERM, INRAE, F-75006, Paris, France
- Elizabeth Garrett Anderson Institute for Women's Health London, University College London, London, United Kingdom
| | - Rebecca C Wilson
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, United Kingdom
| | - Paul Burton
- Population Health Sciences Institute, Newcastle University, Newcastle, United Kingdom
| | - Juan R Gonzalez
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
5
Huth M, Garavito CA, Seep L, Cirera L, Saúte F, Sicuri E, Hasenauer J. Federated difference-in-differences with multiple time periods in DataSHIELD. iScience 2024; 27:111025. [PMID: 39498304 PMCID: PMC11532944 DOI: 10.1016/j.isci.2024.111025] [Received: 06/11/2024] [Revised: 08/28/2024] [Accepted: 09/20/2024] [Indexed: 11/07/2024]
Abstract
Difference-in-differences (DID) is a key tool for causal impact evaluation but faces challenges when applied to sensitive data restricted by privacy regulations. Obtaining consent can shrink sample sizes and reduce statistical power, limiting the analysis's effectiveness. Federated learning addresses these issues by sharing aggregated statistics rather than individual data, though advanced federated DID software is limited. We developed a federated version of the Callaway and Sant'Anna difference-in-differences (CSDID), integrated into the DataSHIELD platform, adhering to stringent privacy protocols. Our approach reproduces key estimates and standard errors while preserving confidentiality. Using simulated and real-world data from a malaria intervention in Mozambique, we demonstrate that federated estimates increase sample sizes, reduce estimation uncertainty, and enable analyses when data owners cannot share treated or untreated group data. Our work contributes to facilitating the evaluation of policy interventions or treatments across centers and borders.
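How aggregated statistics can stand in for individual data in a DID estimate is easiest to see in the canonical two-group, two-period case. The Python sketch below is an assumption-laden illustration, not the federated CSDID estimator described in the abstract: it ignores multiple time periods, covariates, and standard errors, and all names and numbers are hypothetical. Each site shares only a sum and a count per (group, period) cell, and a site may hold treated units only or untreated units only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Cell = Tuple[str, str]         # (group, period), e.g. ("treated", "pre")
Aggregate = Tuple[float, int]  # (sum of outcomes, number of observations)


def pool_cells(site_aggregates: List[Dict[Cell, Aggregate]]) -> Dict[Cell, float]:
    """Combine per-site sums and counts into one pooled mean per cell."""
    sums: Dict[Cell, float] = defaultdict(float)
    counts: Dict[Cell, int] = defaultdict(int)
    for site in site_aggregates:
        for cell, (total, n) in site.items():
            sums[cell] += total
            counts[cell] += n
    return {cell: sums[cell] / counts[cell] for cell in sums}


def did_estimate(means: Dict[Cell, float]) -> float:
    """Two-period DID: change in the treated group minus change in controls."""
    treated_change = means[("treated", "post")] - means[("treated", "pre")]
    control_change = means[("control", "post")] - means[("control", "pre")]
    return treated_change - control_change


# Hypothetical aggregates: site 1 holds only treated units, site 2 only controls.
site_1 = {("treated", "pre"): (20.0, 10), ("treated", "post"): (50.0, 10)}
site_2 = {("control", "pre"): (30.0, 15), ("control", "post"): (45.0, 15)}
site_3 = {
    ("treated", "pre"): (10.0, 5), ("treated", "post"): (25.0, 5),
    ("control", "pre"): (10.0, 5), ("control", "post"): (15.0, 5),
}

means = pool_cells([site_1, site_2, site_3])
effect = did_estimate(means)
print(effect)
```

Neither site 1 nor site 2 could compute a DID estimate alone, which mirrors the abstract's point that federation enables analyses when data owners cannot share treated or untreated group data.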
Affiliation(s)
- Manuel Huth
- Institute for Computational Biology, Helmholtz Munich - German Research Center for Environmental Health, Munich, Germany
- LIMES, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
| | | | - Lea Seep
- LIMES, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
| | | | - Francisco Saúte
- Centro de Investigação em Saúde de Manhiça, Manhiça, Mozambique
| | - Elisa Sicuri
- ISGlobal, Barcelona, Spain
- Centro de Investigação em Saúde de Manhiça, Manhiça, Mozambique
- LSE Health - Department of Health Policy, London School of Economics and Political Science, London, UK
- Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
| | - Jan Hasenauer
- Institute for Computational Biology, Helmholtz Munich - German Research Center for Environmental Health, Munich, Germany
- LIMES, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
6
Schmid K, Sehring J, Németh A, Harter PN, Weber KJ, Vengadeswaran A, Storf H, Seidemann C, Karki K, Fischer P, Dohmen H, Selignow C, von Deimling A, Grau S, Schröder U, Plate KH, Stein M, Uhl E, Acker T, Amsel D. DistSNE: Distributed computing and online visualization of DNA methylation-based central nervous system tumor classification. Brain Pathol 2024; 34:e13228. [PMID: 38012085 PMCID: PMC11007060 DOI: 10.1111/bpa.13228] [Received: 10/29/2023] [Accepted: 11/10/2023] [Indexed: 11/29/2023]
Abstract
The current state-of-the-art analysis of central nervous system (CNS) tumors through DNA methylation profiling relies on the tumor classifier developed by Capper and colleagues, which centrally harnesses DNA methylation data provided by users. Here, we present a distributed-computing-based approach for CNS tumor classification that achieves performance comparable to centralized systems while safeguarding privacy. We utilize the t-distributed stochastic neighbor embedding (t-SNE) model for dimensionality reduction and visualization of tumor classification results in two-dimensional graphs in a distributed approach across multiple sites (DistSNE). DistSNE provides an intuitive web interface (https://gin-tsne.med.uni-giessen.de) for user-friendly local data management and federated methylome-based tumor classification calculations for multiple collaborators in a DataSHIELD environment. The freely accessible web interface supports convenient data upload, result review, and summary report generation. Importantly, increasing sample size as achieved through distributed access to additional datasets allows DistSNE to improve cluster analysis and enhance predictive power. Collectively, DistSNE enables a simple and fast classification of CNS tumors using large-scale methylation data from distributed sources, while maintaining privacy and allowing easy and flexible network expansion to other institutes. This approach holds great potential for advancing human brain tumor classification and fostering collaborative precision medicine in neuro-oncology.
Affiliation(s)
- Kai Schmid
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
- Jannik Sehring
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
- Attila Németh
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
- Patrick N. Harter
  - Neurological Institute (Edinger Institute), University Hospital Frankfurt, Frankfurt, Germany
  - Present address: Center for Neuropathology and Prion Research, University Hospital of Munich, Munich, Germany
- Katharina J. Weber
  - Neurological Institute (Edinger Institute), University Hospital Frankfurt, Frankfurt, Germany
  - German Cancer Consortium (DKTK), Heidelberg, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Frankfurt Cancer Institute (FCI), Frankfurt, Germany
  - University Cancer Center (UCT) Frankfurt, Frankfurt, Germany
- Abishaa Vengadeswaran
  - Medical Informatics Group (MIG), Goethe University Frankfurt, University Hospital Frankfurt, Frankfurt am Main, Germany
- Holger Storf
  - Medical Informatics Group (MIG), Goethe University Frankfurt, University Hospital Frankfurt, Frankfurt am Main, Germany
- Kapil Karki
  - DIZ Marburg, Philipps University Marburg, Marburg, Germany
- Patrick Fischer
  - Institute for Medical Informatics, Justus‐Liebig University, Giessen, Germany
  - Department of Neuropathology, German Cancer Research Center (DKFZ), Universitätsklinikum Heidelberg, and CCU Neuropathology, Heidelberg, Germany
- Hildegard Dohmen
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
- Carmen Selignow
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
- Stefan Grau
  - Department of Neurosurgery, Hospital Fulda, Fulda, Germany
- Uwe Schröder
  - Department of Neurosurgery, MVZ Frankfurt/Oder, Frankfurt, Germany
- Karl H. Plate
  - Neurological Institute (Edinger Institute), University Hospital Frankfurt, Frankfurt, Germany
- Marco Stein
  - Department of Neurosurgery, University Hospital Giessen und Marburg Location Giessen, Giessen, Germany
- Eberhard Uhl
  - Department of Neurosurgery, University Hospital Giessen und Marburg Location Giessen, Giessen, Germany
- Till Acker
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
- Daniel Amsel
  - Institute of Neuropathology, Justus‐Liebig University Giessen, Giessen, Germany
|
7
|
Delfin C, Dragan I, Kuznetsov D, Tajes JF, Smit F, Coral DE, Farzaneh A, Haugg A, Hungele A, Niknejad A, Hall C, Jacobs D, Marek D, Fraser DP, Thuillier D, Ahmadizar F, Mehl F, Pattou F, Burdet F, Hawkes G, Arts ICW, Blanch J, Van Soest J, Fernández-Real JM, Boehl J, Fink K, van Greevenbroek MMJ, Kavousi M, Minten M, Prinz N, Ipsen N, Franks PW, Ramos R, Holl RW, Horban S, Duarte-Salles T, Tran VDT, Raverdy V, Leal Y, Lenart A, Pearson E, Sparsø T, Giordano GN, Ioannidis V, Soh K, Frayling TM, Le Roux CW, Ibberson M. A Federated Database for Obesity Research: An IMI-SOPHIA Study. Life (Basel) 2024; 14:262. [PMID: 38398771 PMCID: PMC10890572 DOI: 10.3390/life14020262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 01/12/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
Obesity is considered by many as a lifestyle choice rather than a chronic progressive disease. The Innovative Medicines Initiative (IMI) SOPHIA (Stratification of Obesity Phenotypes to Optimize Future Obesity Therapy) project is part of a momentum shift aiming to provide better tools for the stratification of people with obesity according to disease risk and treatment response. One of the challenges to achieving these goals is that many clinical cohorts are siloed, limiting the potential of combined data for biomarker discovery. In SOPHIA, we have addressed this challenge by setting up a federated database building on open-source DataSHIELD technology. The database currently federates 16 cohorts that are accessible via a central gateway. The database is multi-modal, including research studies, clinical trials, and routine health data, and is accessed using the R statistical programming environment where statistical and machine learning analyses can be performed at a distance without any disclosure of patient-level data. We demonstrate the use of the database by providing a proof-of-concept analysis, performing a federated linear model of BMI and systolic blood pressure, pooling all data from 16 studies virtually without any analyst seeing individual patient-level data. This analysis provided similar point estimates compared to a meta-analysis of the 16 individual studies. Our approach provides a benchmark for reproducible, safe federated analyses across multiple study types provided by multiple stakeholders.
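The idea of fitting one linear model across cohorts without moving patient-level rows can be illustrated with a minimal sketch. It assumes a single predictor (BMI) and a single outcome (systolic blood pressure), and it assumes each site releases only five aggregate totals. The names `SiteSummary`, `summarise_site`, and `pooled_fit` are hypothetical, the values are invented for illustration, and this is not the DataSHIELD API or the SOPHIA analysis code; real deployments add disclosure checks (for example minimum cell counts) that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregate totals a site is assumed to release instead of patient rows."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def summarise_site(x, y):
    """Runs locally at one cohort; only the five totals leave the site."""
    return SiteSummary(
        n=len(x),
        sum_x=sum(x),
        sum_y=sum(y),
        sum_xx=sum(v * v for v in x),
        sum_xy=sum(a * b for a, b in zip(x, y)),
    )


def pooled_fit(summaries):
    """Least-squares intercept and slope of y on x from summed site totals.

    Adding the totals across sites gives the same normal equations as
    stacking all rows, so the estimate equals the pooled-data fit.
    """
    n = sum(s.n for s in summaries)
    sx = sum(s.sum_x for s in summaries)
    sy = sum(s.sum_y for s in summaries)
    sxx = sum(s.sum_xx for s in summaries)
    sxy = sum(s.sum_xy for s in summaries)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Hypothetical BMI (x) and systolic blood pressure (y) values at two sites.
site_a = summarise_site([22.0, 27.0, 31.0], [112.0, 122.0, 130.0])
site_b = summarise_site([24.0, 35.0], [116.0, 138.0])
print(pooled_fit([site_a, site_b]))
```

A study-level meta-analysis would instead fit the model separately at each site and combine the per-site estimates, which is why the abstract reports the two approaches as giving similar, rather than identical, point estimates.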
Affiliation(s)
- Iulian Dragan
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Dmitry Kuznetsov
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Juan Fernandez Tajes
  - Genetic and Molecular Epidemiology Unit, Lund University Diabetes Centre, Department of Clinical Sciences, Clinical Research Centre (CRC), Lund University, Jan Waldenströmsgata 35, SE-20502 Malmö, Sweden
- Femke Smit
  - Maastricht Center for Systems Biology, Faculty of Science and Engineering, Maastricht University, Paul Henri Spaaklaan 1, 6229 EN Maastricht, The Netherlands
- Daniel E. Coral
  - Genetic and Molecular Epidemiology Unit, Lund University Diabetes Centre, Department of Clinical Sciences, Clinical Research Centre (CRC), Lund University, Jan Waldenströmsgata 35, SE-20502 Malmö, Sweden
- Ali Farzaneh
  - Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, 3000 CA Rotterdam, The Netherlands
- André Haugg
  - Global Biostatistics & Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, 88400 Biberach, Germany
- Andreas Hungele
  - Institute of Epidemiology and Medical Biometry, CAQM, University of Ulm, 89081 Ulm, Germany
  - German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
- Anne Niknejad
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Christopher Hall
  - Division of Population Health and Genomics, Ninewells Hospital and School of Medicine, University of Dundee, Dundee DD1 4HN, UK
- Daan Jacobs
  - Nederlandse Obesitas Kliniek, Huis Ter Heide, 3712 BA Utrecht, The Netherlands
- Diana Marek
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Diane P. Fraser
  - University of Exeter Medical School, University of Exeter, Exeter EX1 2LU, UK
- Dorothee Thuillier
  - Univ Lille, Inserm, CHU Lille, Pasteur Institute Lille, U1190 Translational Research for Diabetes, European Genomic Institute of Diabetes, 59000 Lille, France
- Fariba Ahmadizar
  - Data Science and Biostatistics Department, Julius Global Health, University Medical Center Utrecht, 3508 GA Utrecht, The Netherlands
- Florence Mehl
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Francois Pattou
  - Univ Lille, Inserm, CHU Lille, Pasteur Institute Lille, U1190 Translational Research for Diabetes, European Genomic Institute of Diabetes, 59000 Lille, France
- Frederic Burdet
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Gareth Hawkes
  - University of Exeter Medical School, University of Exeter, Exeter EX1 2LU, UK
- Ilja C. W. Arts
  - Maastricht Center for Systems Biology, Faculty of Science and Engineering, Maastricht University, Paul Henri Spaaklaan 1, 6229 EN Maastricht, The Netherlands
- Jordi Blanch
  - Fundació Institut Universitari per a la Recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), 08007 Barcelona, Spain
  - ISV-Girona Research Group, Research Unit in Primary Care, Primary Care Services, Catalan Institute of Health (ICS), 08908 Barcelona, Spain
- Johan Van Soest
  - Brightlands Institute for Smart Society (BISS), Faculty of Science and Engineering, Maastricht University, 6229 EN Maastricht, The Netherlands
  - Department of Radiation Oncology (Maastro), GROW-School for Oncology and Reproduction, Maastricht University Medical Center, 6229 EN Maastricht, The Netherlands
- José-Manuel Fernández-Real
  - Nutrition, Eumetabolism and Health Group, Institut d’Investigació Biomèdica de Girona (IDIBGI-CERCA), Av. França 30, 17007 Girona, Spain
  - Department of Medical Sciences, School of Medicine, University of Girona, 17003 Girona, Spain
  - CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Av. França, s/n, 17007 Girona, Spain
- Juergen Boehl
  - Global Biostatistics & Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, 88400 Biberach, Germany
- Katharina Fink
  - Institute of Epidemiology and Medical Biometry, CAQM, University of Ulm, 89081 Ulm, Germany
  - German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
- Marleen M. J. van Greevenbroek
  - Department of Internal Medicine and CARIM School of Cardiovascular Diseases, Maastricht University, 6229 EN Maastricht, The Netherlands
- Maryam Kavousi
  - Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, 3000 CA Rotterdam, The Netherlands
- Michiel Minten
  - Maastricht Center for Systems Biology, Faculty of Science and Engineering, Maastricht University, Paul Henri Spaaklaan 1, 6229 EN Maastricht, The Netherlands
- Nicole Prinz
  - Institute of Epidemiology and Medical Biometry, CAQM, University of Ulm, 89081 Ulm, Germany
  - German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
- Paul W. Franks
  - Genetic and Molecular Epidemiology Unit, Lund University Diabetes Centre, Department of Clinical Sciences, Clinical Research Centre (CRC), Lund University, Jan Waldenströmsgata 35, SE-20502 Malmö, Sweden
- Rafael Ramos
  - Fundació Institut Universitari per a la Recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), 08007 Barcelona, Spain
  - Department of Medical Sciences, School of Medicine, University of Girona, 17003 Girona, Spain
  - Department of Medical Informatics, Erasmus University Medical Center, 3000 CA Rotterdam, The Netherlands
  - Research in Vascular Health Group, Institut d’Investigació Biomèdica de Girona (IDIBGI-CERCA), Parc Hospitalari Martí i Julià, Edifici M2, 17190 Salt, Spain
- Reinhard W. Holl
  - Institute of Epidemiology and Medical Biometry, CAQM, University of Ulm, 89081 Ulm, Germany
  - German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
- Scott Horban
  - Division of Population Health and Genomics, Ninewells Hospital and School of Medicine, University of Dundee, Dundee DD1 4HN, UK
- Talita Duarte-Salles
  - Fundació Institut Universitari per a la Recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), 08007 Barcelona, Spain
  - Department of Medical Informatics, Erasmus University Medical Center, 3000 CA Rotterdam, The Netherlands
- Van Du T. Tran
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Violeta Raverdy
  - Univ Lille, Inserm, CHU Lille, Pasteur Institute Lille, U1190 Translational Research for Diabetes, European Genomic Institute of Diabetes, 59000 Lille, France
- Yenny Leal
  - Nutrition, Eumetabolism and Health Group, Institut d’Investigació Biomèdica de Girona (IDIBGI-CERCA), Av. França 30, 17007 Girona, Spain
  - Department of Medical Sciences, School of Medicine, University of Girona, 17003 Girona, Spain
  - CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Av. França, s/n, 17007 Girona, Spain
- Ewan Pearson
  - Division of Population Health and Genomics, Ninewells Hospital and School of Medicine, University of Dundee, Dundee DD1 4HN, UK
- Giuseppe N. Giordano
  - Genetic and Molecular Epidemiology Unit, Lund University Diabetes Centre, Department of Clinical Sciences, Clinical Research Centre (CRC), Lund University, Jan Waldenströmsgata 35, SE-20502 Malmö, Sweden
- Vassilios Ioannidis
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Keng Soh
  - Novo Nordisk A/S, 2860 Søborg, Denmark
- Timothy M. Frayling
  - University of Exeter Medical School, University of Exeter, Exeter EX1 2LU, UK
  - Department of Genetic Medicine and Development, Faculty of Medicine, University of Geneva, 1 Rue Michel-Servet, CH-1211 Geneva, Switzerland
- Carel W. Le Roux
  - Diabetes Complications Research Centre, University College Dublin, D04 V1W8 Dublin, Ireland
- Mark Ibberson
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
|
8
|
Akyüz K, Cano Abadía M, Goisauf M, Mayrhofer MT. Unlocking the potential of big data and AI in medicine: insights from biobanking. Front Med (Lausanne) 2024; 11:1336588. [PMID: 38357641 PMCID: PMC10864616 DOI: 10.3389/fmed.2024.1336588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 01/19/2024] [Indexed: 02/16/2024]
Abstract
Big data and artificial intelligence are key elements in the medical field as they are expected to improve accuracy and efficiency in diagnosis and treatment, particularly in identifying biomedically relevant patterns, facilitating progress towards individually tailored preventative and therapeutic interventions. These applications belong to current research practice that is data-intensive. While the combination of imaging, pathological, genomic, and clinical data is needed to train algorithms to realize the full potential of these technologies, biobanks often serve as crucial infrastructures for data-sharing and data flows. In this paper, we argue that the 'data turn' in the life sciences has increasingly re-structured major infrastructures, which often were created for biological samples and associated data, as predominantly data infrastructures. These have evolved and diversified over time in terms of tackling relevant issues such as harmonization and standardization, but also consent practices and risk assessment. In line with the datafication, an increased use of AI-based technologies marks the current developments at the forefront of the big data research in life science and medicine that engender new issues and concerns along with opportunities. At a time when secure health data environments, such as European Health Data Space, are in the making, we argue that such meta-infrastructures can benefit both from the experience and evolution of biobanking, but also the current state of affairs in AI in medicine, regarding good governance, the social aspects and practices, as well as critical thinking about data practices, which can contribute to trustworthiness of such meta-infrastructures.
Affiliation(s)
- Kaya Akyüz
  - Department of ELSI Services and Research, BBMRI-ERIC, Graz, Austria
|
9
|
Tomasoni D, Lombardo R, Lauria M. Strengths and limitations of non-disclosive data analysis: a comparison of breast cancer survival classifiers using VisualSHIELD. Front Genet 2024; 15:1270387. [PMID: 38348453 PMCID: PMC10859452 DOI: 10.3389/fgene.2024.1270387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 01/08/2024] [Indexed: 02/15/2024]
Abstract
Preserving data privacy is an important concern in the research use of patient data. The DataSHIELD suite enables privacy-aware advanced statistical analysis in a federated setting. Despite its many applications, it has a few open practical issues: the complexity of hosting a federated infrastructure, the performance penalty imposed by the privacy-preserving constraints, and the ease of use by non-technical users. In this work, we describe a case study in which we review different breast cancer classifiers and report our findings about the limits and advantages of such non-disclosive suite of tools in a realistic setting. Five independent gene expression datasets of breast cancer survival were downloaded from Gene Expression Omnibus (GEO) and pooled together through the federated infrastructure. Three previously published and two newly proposed 5-year cancer-free survival risk score classifiers were trained in a federated environment, and an additional reference classifier was trained with unconstrained data access. The performance of these six classifiers was systematically evaluated, and the results show that i) the published classifiers do not generalize well when applied to patient cohorts that differ from those used to develop them; ii) among the methods we tried, the classification using logistic regression worked better on average, closely followed by random forest; iii) the unconstrained version of the logistic regression classifier outperformed the federated version by 4% on average. Reproducibility of our experiments is ensured through the use of VisualSHIELD, an open-source tool that augments DataSHIELD with new functions, a standardized deployment procedure, and a simple graphical user interface.
Affiliation(s)
- Danilo Tomasoni
  - Fondazione the Microsoft Research–University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
- Mario Lauria
  - Fondazione the Microsoft Research–University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
  - Department of Mathematics, University of Trento, Povo, Italy
|
10
|
Dellacasa C, Ortali M, Rossi E, Abu Attieh H, Osmo T, Puskaric M, Rinaldi E, Prasser F, Stellmach C, Cataudella S, Agarwal B, Mata Naranjo J, Scipione G. An innovative technological infrastructure for managing SARS-CoV-2 data across different cohorts in compliance with General Data Protection Regulation. Digit Health 2024; 10:20552076241248922. [PMID: 38766364 PMCID: PMC11100396 DOI: 10.1177/20552076241248922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 04/04/2024] [Indexed: 05/22/2024]
Abstract
Background: The ORCHESTRA project, funded by the European Commission, aims to create a pan-European cohort built on existing and new large-scale population cohorts to help rapidly advance the knowledge related to the prevention of the SARS-CoV-2 infection and the management of COVID-19 and its long-term sequelae. The integration and analysis of the very heterogeneous health data pose the challenge of building an innovative technological infrastructure as the foundation of a dedicated framework for data management that should address regulatory requirements such as the General Data Protection Regulation (GDPR). Methods: The three participating European supercomputing centres (CINECA - Italy, CINES - France and HLRS - Germany) designed and deployed a dedicated infrastructure to fulfil the functional requirements for data management and to ensure the confidentiality/privacy, integrity, and security of sensitive biomedical data. Besides the technological issues, many methodological aspects have been considered: Berlin Institute of Health (BIH), Charité provided its expertise in data protection, information security, and data harmonisation/standardisation. Results: The resulting infrastructure is based on a multi-layer approach that integrates several security measures to ensure data protection. A centralised Data Collection Platform has been established in the Italian National Hub while, for the use cases in which data sharing is not possible due to privacy restrictions, a distributed approach for Federated Analysis has been considered. A Data Portal is available as a centralised point of access for non-sensitive data and results, according to findability, accessibility, interoperability, and reusability (FAIR) data principles. This technological infrastructure has been used to support significant data exchange between population cohorts and to publish important scientific results related to SARS-CoV-2. Conclusions: Considering the increasing demand for data usage in accordance with the requirements of the GDPR, the experience gained in the project and the infrastructure released for the ORCHESTRA project can act as a model to manage future public health threats. Other projects could benefit from the results achieved by ORCHESTRA by building upon the available standardisation of variables, design of the architecture, and process used for GDPR compliance.
Affiliation(s)
- Chiara Dellacasa
  - HPC Department, CINECA Consorzio Interuniversitario, Bologna, Italy
- Maurizio Ortali
  - HPC Department, CINECA Consorzio Interuniversitario, Bologna, Italy
- Elisa Rossi
  - HPC Department, CINECA Consorzio Interuniversitario, Bologna, Italy
- Hammam Abu Attieh
  - Berlin Institute of Health (BIH), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Thomas Osmo
  - Département Archivage et Services aux Données (DASD), Centre Informatique National de l'Enseignement Supérieur (CINES), Montpellier, France
- Miroslav Puskaric
  - High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, Stuttgart, Germany
- Eugenia Rinaldi
  - Berlin Institute of Health (BIH), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Fabian Prasser
  - Berlin Institute of Health (BIH), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Caroline Stellmach
  - Berlin Institute of Health (BIH), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Bhaskar Agarwal
  - HPC Department, CINECA Consorzio Interuniversitario, Bologna, Italy
|
11
|
Salvador E, Mazzi C, De Santis N, Bertoli G, Jonjić A, Coklo M, Majdan M, Peñalvo JL, Buonfrate D. Impact of domiciliary administration of NSAIDs on COVID-19 hospital outcomes: an unCoVer analysis. Front Pharmacol 2023; 14:1252800. [PMID: 37876733 PMCID: PMC10591104 DOI: 10.3389/fphar.2023.1252800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 09/25/2023] [Indexed: 10/26/2023]
Abstract
Background: Effective domiciliary treatment can be useful in the early phase of COVID-19 to limit disease progression and pressure on hospitals. There are discrepant data on the use of non-steroidal anti-inflammatory drugs (NSAIDs). The aim of this study was to evaluate whether the clinical outcome of patients who were hospitalized for COVID-19 is influenced by domiciliary treatment with NSAIDs. The secondary objective was to explore the association between other patient characteristics/therapies and outcome. Methods: A large dataset of COVID-19 patients was created in the context of a European Union-funded project (unCoVer). The primary outcome was explored using a study-level random-effects meta-analysis of binary outcomes (multivariate logistic regression models), adjusted for selected factors, including demographics and comorbidities. Results: 218 out of 1,144 patients reported use of NSAIDs before admission. No association between NSAID use and clinical outcome was found (unadj. OR: 0.96, 95% CI: 0.68-1.38). The model showed an independently increased risk of death with increasing age (OR 1.06; 95% CI 1.05-1.07) and male sex (OR 1.36; 95% CI 1.04-1.76). Conclusion: In our study, the domiciliary use of NSAIDs was not associated with clinical outcome in patients hospitalized with COVID-19. Older age and male sex were associated with an increased risk of death.
Affiliation(s)
- Elena Salvador
  - Department of Infectious Tropical Diseases and Microbiology, IRCCS Sacro Cuore Don Calabria Hospital, Negrar, Verona, Italy
- Cristina Mazzi
  - Clinical Research Unit, IRCCS Sacro Cuore Don Calabria Hospital, Negrar, Verona, Italy
- Nicoletta De Santis
  - Department of Infectious Tropical Diseases and Microbiology, IRCCS Sacro Cuore Don Calabria Hospital, Negrar, Verona, Italy
- Giulia Bertoli
  - Department of Infectious Tropical Diseases and Microbiology, IRCCS Sacro Cuore Don Calabria Hospital, Negrar, Verona, Italy
- Antonija Jonjić
  - Centre for Applied Bioanthropology, Institute for Anthropological Research, Zagreb, Croatia
- Miran Coklo
  - Centre for Applied Bioanthropology, Institute for Anthropological Research, Zagreb, Croatia
- Marek Majdan
  - Institute for Global Health and Epidemiology, Trnava University, Trnava, Slovakia
- José L. Peñalvo
  - Unit of Non-Communicable Diseases, Institute of Tropical Medicine, Antwerp, Belgium
  - Global Health Institute, University of Antwerp, Antwerp, Belgium
- Dora Buonfrate
  - Department of Infectious Tropical Diseases and Microbiology, IRCCS Sacro Cuore Don Calabria Hospital, Negrar, Verona, Italy
|
12
|
Huth M, Arruda J, Gusinow R, Contento L, Tacconelli E, Hasenauer J. Accessibility of covariance information creates vulnerability in Federated Learning frameworks. Bioinformatics 2023; 39:btad531. [PMID: 37647639 PMCID: PMC10516515 DOI: 10.1093/bioinformatics/btad531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 06/27/2023] [Accepted: 08/28/2023] [Indexed: 09/01/2023]
Abstract
Motivation: Federated Learning (FL) is gaining traction in various fields as it enables integrative data analysis without sharing sensitive data, such as in healthcare. However, the risk of data leakage caused by malicious attacks must be considered. In this study, we introduce a novel attack algorithm that relies on being able to compute sample means and sample covariances, and to construct known linearly independent vectors on the data owner side. Results: We show that these basic functionalities, which are available in several established FL frameworks, are sufficient to reconstruct privacy-protected data. Additionally, the attack algorithm is robust to defense strategies that involve adding random noise. We demonstrate the limitations of existing frameworks and propose potential defense strategies, analyzing the implications of using differential privacy. The novel insights presented in this study will aid in the improvement of FL frameworks. Availability and implementation: The code examples are provided at GitHub (https://github.com/manuhuth/Data-Leakage-From-Covariances.git). The CNSIM1 dataset, which we used in the manuscript, is available within the DSData R package (https://github.com/datashield/DSData/tree/main/data).
Affiliation(s)
- Manuel Huth
  - Institute of Computational Biology, Helmholtz Munich, Neuherberg 85764, Germany
  - Life and Medical Sciences Institute, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53115, Germany
- Jonas Arruda
  - Life and Medical Sciences Institute, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53115, Germany
- Roy Gusinow
  - Institute of Computational Biology, Helmholtz Munich, Neuherberg 85764, Germany
  - Life and Medical Sciences Institute, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53115, Germany
- Lorenzo Contento
  - Life and Medical Sciences Institute, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53115, Germany
- Evelina Tacconelli
  - Division of Infectious Diseases, Department of Diagnostics and Public Health, University of Verona, Verona 37124, Italy
- Jan Hasenauer
  - Life and Medical Sciences Institute, Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53115, Germany
|
13
|
Wolfien M, Ahmadi N, Fitzer K, Grummt S, Heine KL, Jung IC, Krefting D, Kühn A, Peng Y, Reinecke I, Scheel J, Schmidt T, Schmücker P, Schüttler C, Waltemath D, Zoch M, Sedlmayr M. Ten Topics to Get Started in Medical Informatics Research. J Med Internet Res 2023; 25:e45948. [PMID: 37486754 PMCID: PMC10407648 DOI: 10.2196/45948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 03/29/2023] [Accepted: 04/11/2023] [Indexed: 07/25/2023]
Abstract
The vast and heterogeneous data being constantly generated in clinics can provide great wealth for patients and research alike. The quickly evolving field of medical informatics research has contributed numerous concepts, algorithms, and standards to facilitate this development. However, these difficult relationships, complex terminologies, and multiple implementations can present obstacles for people who want to get active in the field. With a particular focus on medical informatics research conducted in Germany, we present in our Viewpoint a set of 10 important topics to improve the overall interdisciplinary communication between different stakeholders (eg, physicians, computational experts, experimentalists, students, patient representatives). This may lower the barriers to entry and offer a starting point for collaborations at different levels. The suggested topics are briefly introduced, then general best practice guidance is given, and further resources for in-depth reading or hands-on tutorials are recommended. In addition, the topics are set to cover current aspects and open research gaps of the medical informatics domain, including data regulations and concepts; data harmonization and processing; and data evaluation, visualization, and dissemination. In addition, we give an example on how these topics can be integrated in a medical informatics curriculum for higher education. By recognizing these topics, readers will be able to (1) set clinical and research data into the context of medical informatics, understanding what is possible to achieve with data or how data should be handled in terms of data privacy and storage; (2) distinguish current interoperability standards and obtain first insights into the processes leading to effective data transfer and analysis; and (3) value the use of newly developed technical approaches to utilize the full potential of clinical data.
Collapse
Affiliation(s)
- Markus Wolfien
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence, Dresden, Germany
- Najia Ahmadi
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Kai Fitzer
  - Core Unit Data Integration Center, University Medicine Greifswald, Greifswald, Germany
- Sophia Grummt
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Kilian-Ludwig Heine
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ian-C Jung
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Dagmar Krefting
  - Department of Medical Informatics, University Medical Center, Goettingen, Germany
- Andreas Kühn
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Yuan Peng
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ines Reinecke
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Julia Scheel
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
- Tobias Schmidt
  - Institute for Medical Informatics, University of Applied Sciences Mannheim, Mannheim, Germany
- Paul Schmücker
  - Institute for Medical Informatics, University of Applied Sciences Mannheim, Mannheim, Germany
- Christina Schüttler
  - Central Biobank Erlangen, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Dagmar Waltemath
  - Core Unit Data Integration Center, University Medicine Greifswald, Greifswald, Germany
  - Department of Medical Informatics, University Medicine Greifswald, Greifswald, Germany
- Michele Zoch
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Martin Sedlmayr
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence, Dresden, Germany
14
Safarlou CW, Jongsma KR, Vermeulen R, Bredenoord AL. The ethical aspects of exposome research: a systematic review. EXPOSOME 2023; 3:osad004. [PMID: 37745046 PMCID: PMC7615114 DOI: 10.1093/exposome/osad004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 09/26/2023]
Abstract
In recent years, exposome research has been put forward as the next frontier for the study of human health and disease. Exposome research entails the analysis of the totality of environmental exposures and their corresponding biological responses within the human body. Increasingly, this is operationalized by big-data approaches to map the effects of internal as well as external exposures using smart sensors and multiomics technologies. However, the ethical implications of exposome research are still only rarely discussed in the literature. Therefore, we conducted a systematic review of the academic literature regarding both the exposome and underlying research fields and approaches, to map the ethical aspects that are relevant to exposome research. We identify five ethical themes that are prominent in ethics discussions: the goals of exposome research, its standards, its tools, how it relates to study participants, and the consequences of its products. Furthermore, we provide a number of general principles for how future ethics research can best make use of our comprehensive overview of the ethical aspects of exposome research. Lastly, we highlight three aspects of exposome research that are most in need of ethical reflection: the actionability of its findings, the epidemiological or clinical norms applicable to exposome research, and the meaning and action-implications of bias.
Affiliation(s)
- Caspar W. Safarlou
  - Department of Global Public Health and Bioethics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Karin R. Jongsma
  - Department of Global Public Health and Bioethics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Roel Vermeulen
  - Department of Global Public Health and Bioethics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
  - Department of Population Health Sciences, Utrecht University, Utrecht, The Netherlands
- Annelien L. Bredenoord
  - Department of Global Public Health and Bioethics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
  - Erasmus School of Philosophy, Erasmus University Rotterdam, Rotterdam, The Netherlands
15
Palm J, Meineke FA, Przybilla J, Peschel T. "fhircrackr": An R Package Unlocking Fast Healthcare Interoperability Resources for Statistical Analysis. Appl Clin Inform 2023; 14:54-64. [PMID: 36696915 PMCID: PMC9876659 DOI: 10.1055/s-0042-1760436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
BACKGROUND The growing interest in the secondary use of electronic health record (EHR) data has increased the number of new data integration and data sharing infrastructures. The present work has been developed in the context of the German Medical Informatics Initiative, where 29 university hospitals agreed to the usage of the Health Level Seven Fast Healthcare Interoperability Resources (FHIR) standard for their newly established data integration centers. This standard is optimized to describe and exchange medical data but less suitable for standard statistical analysis which mostly requires tabular data formats. OBJECTIVES The objective of this work is to establish a tool that makes FHIR data accessible for standard statistical analysis by providing means to retrieve and transform data from a FHIR server. The tool should be implemented in a programming environment known to most data analysts and offer functions with variable degrees of flexibility and automation catering to users with different levels of FHIR expertise. METHODS We propose the fhircrackr framework, which allows downloading and flattening FHIR resources for data analysis. The framework supports different download and authentication protocols and gives the user full control over the data that is extracted from the FHIR resources and transformed into tables. We implemented it using the programming language R [1] and published it under the GPL-3 open source license. RESULTS The framework was successfully applied to both publicly available test data and real-world data from several ongoing studies. While the processing of larger real-world data sets puts a considerable burden on computation time and memory consumption, those challenges can be attenuated with a number of suitable measures like parallelization and temporary storage mechanisms. 
CONCLUSION The fhircrackr R package provides an open source solution within an environment that is familiar to most data scientists and helps overcome the practical challenges that still hamper the usage of EHR data for research.
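The flattening step that the abstract describes can be illustrated outside R. The following is a minimal Python sketch and not the fhircrackr API: it takes a FHIR search Bundle that has already been parsed from JSON and turns selected elements of each resource into table rows. The names `extract` and `flatten_bundle`, the column-to-path mapping, the separator, and the sample bundle are all hypothetical and chosen only for illustration.

```python
def extract(node, path):
    """Collect leaf values reachable from `node` along `path`, descending into lists.

    Paths are expected to end at a primitive element (string, number, boolean).
    """
    if isinstance(node, list):
        values = []
        for item in node:
            values.extend(extract(item, path))
        return values
    if not path:
        return [] if node is None else [str(node)]
    if isinstance(node, dict) and path[0] in node:
        return extract(node[path[0]], path[1:])
    return []


def flatten_bundle(bundle, resource_type, columns, sep=" | "):
    """Return one dict per resource of `resource_type`, one key per requested column."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != resource_type:
            continue
        row = {}
        for column, path in columns.items():
            values = extract(resource, path.split("."))
            # Repeated elements are joined into one cell; missing ones become None.
            row[column] = sep.join(values) if values else None
        rows.append(row)
    return rows


# Hypothetical sample data, not taken from any real server.
bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1", "gender": "female",
                      "name": [{"family": "Doe", "given": ["Jane", "A."]}]}},
        {"resource": {"resourceType": "Patient", "id": "p2", "gender": "male"}},
        {"resource": {"resourceType": "Observation", "id": "o1"}},
    ],
}
columns = {"id": "id", "gender": "gender", "given": "name.given"}
table = flatten_bundle(bundle, "Patient", columns)
```

Joining repeated elements into a single cell keeps one row per resource, which is convenient for statistical software; a real tool would also need to handle paging, authentication, and splitting such cells again.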
Affiliation(s)
- Julia Palm
  - Institute of Medical Statistics, Computer and Data Sciences, Jena University Hospital, Jena, Thüringen, Germany
- Frank A Meineke
  - Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- Jens Przybilla
  - Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
  - Clinical Trial Centre Leipzig, University of Leipzig, Leipzig, Germany
- Thomas Peschel
  - Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
16
Turner MC. What is next for occupational cancer epidemiology? Scand J Work Environ Health 2022; 48:591-597. [PMID: 36228312 PMCID: PMC10546614 DOI: 10.5271/sjweh.4067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
17
Escribà-Montagut X, Marcon Y, Avraam D, Banerjee S, Bishop TRP, Burton P, González JR. Software Application Profile: ShinyDataSHIELD—an R Shiny application to perform federated non-disclosive data analysis in multicohort studies. Int J Epidemiol 2022; 52:315-320. [PMCID: PMC9908040 DOI: 10.1093/ije/dyac201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2021] [Accepted: 10/10/2022] [Indexed: 07/13/2024]
Abstract
Motivation DataSHIELD is an open-source software infrastructure enabling the analysis of data distributed across multiple databases (federated data) without leaking individuals’ information (non-disclosive). It has applications in many scientific domains, ranging from biosciences to social sciences and including high-throughput genomic studies. R is the language used to interact with (and build) DataSHIELD. This creates difficulties for researchers who do not have experience writing R code or lack the time to learn how to use the DataSHIELD functions. To help new researchers use the DataSHIELD infrastructure and to improve the user-friendliness for experienced researchers, we present ShinyDataSHIELD. Implementation ShinyDataSHIELD is a web application with an R backend that serves as a graphical user interface (GUI) to the DataSHIELD infrastructure. General features The version of the application presented here includes modules to perform: (i) exploratory analysis through descriptive summary statistics and graphical representations (scatter plots, histograms, heatmaps and boxplots); (ii) statistical modelling (generalized linear fixed and mixed-effects models, survival analysis through Cox regression); (iii) genome-wide association studies (GWAS); and (iv) omic analysis (transcriptomics, epigenomics and multi-omic integration). Availability ShinyDataSHIELD is publicly hosted online [https://datashield-demo.obiba.org/ ], the source code and user guide are deposited on Zenodo DOI 10.5281/zenodo.6500323, freely available to non-commercial users under ‘Commons Clause’ License Condition v1.0. Docker images are also available [https://hub.docker.com/r/brgelab/shiny-data-shield ].
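The non-disclosive principle mentioned in the abstract, that only aggregate statistics leave each data site, can be sketched in a few lines. This Python sketch is an illustration under stated assumptions and does not use the DataSHIELD or ShinyDataSHIELD API: the functions `site_summary` and `pooled_mean`, the `MIN_COUNT` threshold of 5, and the sample values are all hypothetical.

```python
MIN_COUNT = 5  # hypothetical disclosure threshold, not a DataSHIELD default


def site_summary(values):
    """Runs at a data site: release only aggregates, never individual records."""
    n = len(values)
    if n < MIN_COUNT:
        # Refuse to answer when the group is small enough to risk disclosure.
        raise ValueError("too few records to release a summary")
    return {"n": n, "sum": float(sum(values))}


def pooled_mean(summaries):
    """Runs at the analyst's side: combine per-site aggregates into one mean."""
    total_n = sum(s["n"] for s in summaries)
    total_sum = sum(s["sum"] for s in summaries)
    return total_sum / total_n


# Hypothetical measurements held at two separate sites.
site_a = [21.0, 23.5, 22.0, 24.5, 25.0]
site_b = [30.0, 28.0, 29.0, 31.0, 27.0, 35.0]

summaries = [site_summary(site_a), site_summary(site_b)]
overall = pooled_mean(summaries)
```

The pooled mean equals the mean of the combined data because sums and counts add across sites; statistics without this property, such as medians or model coefficients, need iterative or approximate federated algorithms.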
Affiliation(s)
- Xavier Escribà-Montagut
  - Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Demetris Avraam
  - Population Health Sciences Institute, Newcastle University, Newcastle, UK
- Soumya Banerjee
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Tom R P Bishop
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Paul Burton
  - Population Health Sciences Institute, Newcastle University, Newcastle, UK
- Juan R González
  - Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública, Barcelona, Spain
18
Tacconelli E, Gorska A, Carrara E, Davis RJ, Bonten M, Friedrich AW, Glasner C, Goossens H, Hasenauer J, Abad JMH, Peñalvo JL, Sanchez-Niubo A, Sialm A, Scipione G, Soriano G, Yazdanpanah Y, Vorstenbosch E, Jaenisch T. Challenges of data sharing in European Covid-19 projects: A learning opportunity for advancing pandemic preparedness and response. Lancet Reg Health Eur 2022; 21:100467. [PMID: 35942201 PMCID: PMC9351292 DOI: 10.1016/j.lanepe.2022.100467] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/30/2022]
Abstract
The COVID-19 pandemic saw a massive investment into collaborative research projects with a focus on producing data to support public health decisions. We relay our direct experience of four projects funded under the Horizon2020 programme, namely ReCoDID, ORCHESTRA, unCoVer and SYNCHROS. The projects provide insight into the complexities of sharing patient level data from observational cohorts. We focus on compliance with the General Data Protection Regulation (GDPR) and ethics approvals when sharing data across national borders. We discuss procedures for data mapping; submission of new international codes to standards organisation; federated approach; and centralised data curation. Finally, we put forward recommendations for the development of guidelines for the application of GDPR in case of major public health threats; mandatory standards for data collection in funding frameworks; training and capacity building for data owners; cataloguing of international use of metadata standards; and dedicated funding for identified critical areas.
19
Kalia V, Belsky DW, Baccarelli AA, Miller GW. An exposomic framework to uncover environmental drivers of aging. EXPOSOME 2022; 2:osac002. [PMID: 35295547 PMCID: PMC8917275 DOI: 10.1093/exposome/osac002] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/27/2021] [Revised: 01/19/2022] [Accepted: 01/24/2022] [Indexed: 01/02/2023]
Abstract
The exposome, the environmental complement of the genome, is an omics level characterization of an individual's exposures. There is growing interest in uncovering the role of the environment in human health using an exposomic framework that provides a systematic and unbiased analysis of the non-genetic drivers of health and disease. Many environmental toxicants are associated with molecular hallmarks of aging. An exposomic framework has potential to advance understanding of these associations and how modifications to the environment can promote healthy aging in the population. However, few studies have used this framework to study biological aging. We provide an overview of approaches and challenges in using an exposomic framework to investigate environmental drivers of aging. While capturing exposures over a life course is a daunting and expensive task, the use of historical data can be a practical way to approach this research.
Affiliation(s)
- Vrinda Kalia
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
- Daniel W Belsky
  - Department of Epidemiology and Robert N. Butler Columbia Aging Center, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
- Andrea A Baccarelli
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
- Gary W Miller
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
20
Avraam D, Jones E, Burton P. A deterministic approach for protecting privacy in sensitive personal data. BMC Med Inform Decis Mak 2022; 22:24. [PMID: 35090447 PMCID: PMC8796499 DOI: 10.1186/s12911-022-01754-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Accepted: 01/09/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND Data privacy is one of the biggest challenges for any organisation which processes personal data, especially in the area of medical research where data include sensitive information about patients and study participants. Sharing of data is therefore problematic, which is at odds with the principle of open data that is so important to the advancement of society and science. Several statistical methods and computational tools have been developed to help data custodians and analysts overcome this challenge. METHODS In this paper, we propose a new deterministic approach for anonymising personal data. The method stratifies the underlying data by the categorical variables and re-distributes the continuous variables through a k nearest neighbours based algorithm. RESULTS We demonstrate the use of the deterministic anonymisation on real data, including data from a sample of Titanic passengers, and data from participants in the 1958 Birth Cohort. CONCLUSIONS The proposed procedure makes data re-identification difficult while minimising the loss of utility (by preserving the spatial properties of the underlying data); the latter means that informative statistical analysis can still be conducted.
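The two steps named in the Methods, stratifying by the categorical variables and redistributing the continuous variables with a k nearest neighbours rule, can be sketched as follows. This is a simplified Python illustration and not the authors' published algorithm: it assumes each continuous value is replaced by the mean of its k nearest values (itself included) within the same stratum, and it finds neighbours separately for each variable. The function name `anonymise`, the default `k`, and the sample records are hypothetical.

```python
from collections import defaultdict


def anonymise(records, categorical, continuous, k=3):
    """Return a copy of `records` with continuous values smoothed within strata."""
    # Step 1: group record indices by their combination of categorical values.
    strata = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(record[name] for name in categorical)
        strata[key].append(index)

    result = [dict(record) for record in records]

    # Step 2: within each stratum, replace every continuous value by the mean
    # of its k nearest original values. Ties are broken by value, so the
    # procedure is deterministic: the same input always gives the same output.
    for indices in strata.values():
        for name in continuous:
            originals = [records[i][name] for i in indices]
            for i in indices:
                x = records[i][name]
                nearest = sorted(originals, key=lambda v: (abs(v - x), v))[:k]
                result[i][name] = sum(nearest) / len(nearest)
    return result


# Hypothetical sample records.
records = [
    {"sex": "F", "age": 30.0},
    {"sex": "F", "age": 32.0},
    {"sex": "F", "age": 40.0},
    {"sex": "M", "age": 50.0},
    {"sex": "M", "age": 54.0},
]
masked = anonymise(records, categorical=["sex"], continuous=["age"], k=2)
```

Averaging over neighbours means that several records can share one output value, which is what makes re-identification harder; larger k gives more protection at the cost of more distortion. Strata with fewer than k records are averaged over all their members in this sketch.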
Affiliation(s)
- Demetris Avraam
  - Population Health Sciences Institute, Newcastle University, Newcastle, UK
  - Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Elinor Jones
  - Department of Statistical Science, University College London, London, UK
- Paul Burton
  - Population Health Sciences Institute, Newcastle University, Newcastle, UK
21
Gruendner J, Deppenwiese N, Folz M, Köhler T, Kroll B, Prokosch HU, Rosenau L, Rühle M, Scheidl MA, Schüttler C, Sedlmayr B, Twrdik A, Kiel A, Majeed RW. Architecture for a feasibility query portal for distributed COVID-19 Fast Healthcare Interoperability Resources (FHIR) patient data repositories: Design and Implementation Study (Preprint). JMIR Med Inform 2022; 10:e36709. [PMID: 35486893 PMCID: PMC9135115 DOI: 10.2196/36709] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/24/2022] [Revised: 03/16/2022] [Accepted: 04/11/2022] [Indexed: 12/04/2022]
Abstract
Background An essential step in any medical research project after identifying the research question is to determine if there are sufficient patients available for a study and where to find them. Pursuing digital feasibility queries on available patient data registries has proven to be an excellent way of reusing existing real-world data sources. To support multicentric research, these feasibility queries should be designed and implemented to run across multiple sites and securely access local data. Working across hospitals usually involves working with different data formats and vocabularies. Recently, the Fast Healthcare Interoperability Resources (FHIR) standard was developed by Health Level Seven to address this concern and describe patient data in a standardized format. The Medical Informatics Initiative in Germany has committed to this standard and created data integration centers, which convert existing data into the FHIR format at each hospital. This partially solves the interoperability problem; however, a distributed feasibility query platform for the FHIR standard is still missing. Objective This study described the design and implementation of the components involved in creating a cross-hospital feasibility query platform for researchers based on FHIR resources. This effort was part of a large COVID-19 data exchange platform and was designed to be scalable for a broad range of patient data. Methods We analyzed and designed the abstract components necessary for a distributed feasibility query. This included a user interface for creating the query, backend with an ontology and terminology service, middleware for query distribution, and FHIR feasibility query execution service. Results We implemented the components described in the Methods section. The resulting solution was distributed to 33 German university hospitals. 
The functionality of the comprehensive network infrastructure was demonstrated using a test data set based on the German Corona Consensus Data Set. A performance test using specifically created synthetic data revealed the applicability of our solution to data sets containing millions of FHIR resources. The solution can be easily deployed across hospitals and supports feasibility queries, combining multiple inclusion and exclusion criteria using standard Health Level Seven query languages such as Clinical Quality Language and FHIR Search. Developing a platform based on multiple microservices allowed us to create an extendable platform and support multiple Health Level Seven query languages and middleware components to allow integration with future directions of the Medical Informatics Initiative. Conclusions We designed and implemented a feasibility platform for distributed feasibility queries, which works directly on FHIR-formatted data and distributed it across 33 university hospitals in Germany. We showed that developing a feasibility platform directly on the FHIR standard is feasible.
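The core logic of a distributed feasibility query, combining inclusion and exclusion criteria locally and returning only counts, can be sketched as follows. This Python sketch is a toy illustration and does not reproduce the platform's components, Clinical Quality Language, or FHIR Search: the function names, the site names, and the criterion codes are hypothetical, and each patient is reduced to a set of codes.

```python
def matches(patient_codes, inclusion, exclusion):
    """A patient qualifies with every inclusion code and no exclusion code."""
    return inclusion <= patient_codes and not (exclusion & patient_codes)


def site_count(patients, inclusion, exclusion):
    """Runs inside one hospital: only the number of matching patients leaves."""
    return sum(1 for codes in patients.values()
               if matches(codes, inclusion, exclusion))


def feasibility(sites, inclusion, exclusion):
    """Runs in the middleware: send the query to each site and add up the counts."""
    per_site = {name: site_count(patients, inclusion, exclusion)
                for name, patients in sites.items()}
    return per_site, sum(per_site.values())


# Hypothetical local data; in practice each site would query its own repository.
sites = {
    "site_a": {
        "a1": {"covid", "diabetes"},
        "a2": {"covid"},
        "a3": {"covid", "diabetes", "pregnancy"},
    },
    "site_b": {
        "b1": {"diabetes"},
        "b2": {"covid", "diabetes"},
    },
}
per_site, total = feasibility(sites,
                              inclusion={"covid", "diabetes"},
                              exclusion={"pregnancy"})
```

Keeping the matching step inside each site means patient-level data never has to be transferred; the researcher sees only how many eligible patients exist and where.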
Affiliation(s)
- Julian Gruendner
  - Chair of Medical Informatics, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Noemi Deppenwiese
  - Center of Medical Information and Communication Technology, University Hospital Erlangen, Erlangen, Germany
- Michael Folz
  - Institute of Medical Informatics, Goethe University Frankfurt, Frankfurt am Main, Germany
- Thomas Köhler
  - Federated Information Systems, German Cancer Research Center, Heidelberg, Germany
- Björn Kroll
  - IT Center for Clinical Research, University of Lübeck, Lübeck, Germany
- Hans-Ulrich Prokosch
  - Chair of Medical Informatics, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Lorenz Rosenau
  - IT Center for Clinical Research, University of Lübeck, Lübeck, Germany
- Mathias Rühle
  - Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany
- Marc-Anton Scheidl
  - Chair of Medical Informatics, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Christina Schüttler
  - Chair of Medical Informatics, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Brita Sedlmayr
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Alexander Twrdik
  - Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany
- Alexander Kiel
  - Federated Information Systems, German Cancer Research Center, Heidelberg, Germany
  - Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany
- Raphael W Majeed
  - Institute for Medical Informatics, University Clinic Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
  - Universities of Giessen and Marburg Lung Center, German Centre For Lung Research, Justus-Liebig University Giessen, Giessen, Germany
22
Metaproteomics Approach and Pathway Modulation in Obesity and Diabetes: A Narrative Review. Nutrients 2021; 14:nu14010047. [PMID: 35010920 PMCID: PMC8746330 DOI: 10.3390/nu14010047] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/28/2021] [Revised: 12/20/2021] [Accepted: 12/21/2021] [Indexed: 12/14/2022]
Abstract
Low-grade inflammatory diseases revealed metabolic perturbations that have been linked to various phenotypes, including gut microbiota dysbiosis. In the last decade, metaproteomics has been used to investigate protein composition profiles at specific steps and in specific healthy/pathologic conditions. We applied a rigorous protocol that relied on PRISMA guidelines and filtering criteria to obtain an exhaustive study selection that finally resulted in a group of 10 studies, based on metaproteomics and that aim at investigating obesity and diabetes. This batch of studies was used to discuss specific microbial and human metaproteome alterations and metabolic patterns in subjects affected by diabetes (T1D and T2D) and obesity. We provided the main up- and down-regulated protein patterns in the inspected pathologies. Despite the available results, the evident paucity of metaproteomic data is to be considered as a limiting factor in drawing objective considerations. To date, ad hoc prepared metaproteomic databases collecting pathologic data and related metadata, together with standardized analysis protocols, are required to increase our knowledge on these widespread pathologies.
23
Pinart M, Dötsch A, Schlicht K, Laudes M, Bouwman J, Forslund SK, Pischon T, Nimptsch K. Gut Microbiome Composition in Obese and Non-Obese Persons: A Systematic Review and Meta-Analysis. Nutrients 2021; 14:nu14010012. [PMID: 35010887 PMCID: PMC8746372 DOI: 10.3390/nu14010012] [Citation(s) in RCA: 195] [Impact Index Per Article: 48.8] [Received: 11/30/2021] [Revised: 12/13/2021] [Accepted: 12/14/2021] [Indexed: 12/11/2022]
Abstract
Whether the gut microbiome in obesity is characterized by lower diversity and altered composition at the phylum or genus level may be more accurately investigated using high-throughput sequencing technologies. We conducted a systematic review in PubMed and Embase including 32 cross-sectional studies assessing the gut microbiome composition by high-throughput sequencing in obese and non-obese adults. A significantly lower alpha diversity (Shannon index) in obese versus non-obese adults was observed in nine out of 22 studies, and meta-analysis of seven studies revealed a non-significant mean difference (−0.06, 95% CI −0.24, 0.12, I² = 81%). At the phylum level, significantly more Firmicutes and fewer Bacteroidetes in obese versus non-obese adults were observed in six out of seventeen, and in four out of eighteen studies, respectively. Meta-analyses of six studies revealed significantly higher Firmicutes (5.50, 95% CI 0.27, 10.73, I² = 81%) and non-significantly lower Bacteroidetes (−4.79, 95% CI −10.77, 1.20, I² = 86%). At the genus level, lower relative proportions of Bifidobacterium and Eggerthella and higher Acidaminococcus, Anaerococcus, Catenibacterium, Dialister, Dorea, Escherichia-Shigella, Eubacterium, Fusobacterium, Megasphaera, Prevotella, Roseburia, Streptococcus, and Sutterella were found in obese versus non-obese adults. Although a proportion of studies found lower diversity and differences in gut microbiome composition in obese versus non-obese adults, the observed heterogeneity across studies precludes clear answers.
Affiliation(s)
- Mariona Pinart
  - Molecular Epidemiology Research Group, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Andreas Dötsch
  - Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut (MRI)—Federal Research Institute of Nutrition and Food, 76131 Karlsruhe, Germany
- Kristina Schlicht
  - Institute of Diabetes and Clinical Metabolic Research, University of Kiel, 24105 Kiel, Germany
- Matthias Laudes
  - Institute of Diabetes and Clinical Metabolic Research, University of Kiel, 24105 Kiel, Germany
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Internal Medicine 1, Kiel University, 24118 Kiel, Germany
- Jildau Bouwman
  - Microbiology and Systems Biology Group, Toegepast Natuurwetenschappelijk Onderzoek (TNO), Utrechtseweg 48, 3704 HE Zeist, The Netherlands
- Sofia K. Forslund
  - Experimental and Clinical Research Center, A Cooperation of Charité-Universitätsmedizin Berlin and Max Delbrück Center for Molecular Medicine, Lindenberger Weg 80, 13125 Berlin, Germany
  - Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, 10117 Berlin, Germany
  - Host-Microbiome Factors in Cardiovascular Disease Lab, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
  - Biobank Core Facility, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, 10178 Berlin, Germany
  - German Centre for Cardiovascular Research (DZHK), Partner Site Berlin, 10785 Berlin, Germany
- Tobias Pischon
  - Molecular Epidemiology Research Group, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
  - Biobank Core Facility, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, 10178 Berlin, Germany
  - German Centre for Cardiovascular Research (DZHK), Partner Site Berlin, 10785 Berlin, Germany
  - Biobank Technology Platform, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Katharina Nimptsch
  - Molecular Epidemiology Research Group, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
  - Correspondence: Tel.: +49-30-9046-4573
24
Vrijheid M, Basagaña X, Gonzalez JR, Jaddoe VWV, Jensen G, Keun HC, McEachan RRC, Porcel J, Siroux V, Swertz MA, Thomsen C, Aasvang GM, Andrušaitytė S, Angeli K, Avraam D, Ballester F, Burton P, Bustamante M, Casas M, Chatzi L, Chevrier C, Cingotti N, Conti D, Crépet A, Dadvand P, Duijts L, van Enckevort E, Esplugues A, Fossati S, Garlantezec R, Gómez Roig MD, Grazuleviciene R, Gützkow KB, Guxens M, Haakma S, Hessel EVS, Hoyles L, Hyde E, Klanova J, van Klaveren JD, Kortenkamp A, Le Brusquet L, Leenen I, Lertxundi A, Lertxundi N, Lionis C, Llop S, Lopez-Espinosa MJ, Lyon-Caen S, Maitre L, Mason D, Mathy S, Mazarico E, Nawrot T, Nieuwenhuijsen M, Ortiz R, Pedersen M, Perelló J, Pérez-Cruz M, Philippat C, Piler P, Pizzi C, Quentin J, Richiardi L, Rodriguez A, Roumeliotaki T, Sabin Capote JM, Santiago L, Santos S, Siskos AP, Strandberg-Larsen K, Stratakis N, Sunyer J, Tenenhaus A, Vafeiadi M, Wilson RC, Wright J, Yang T, Slama R. Advancing tools for human early lifecourse exposome research and translation (ATHLETE): Project overview. Environ Epidemiol 2021; 5:e166. [PMID: 34934888 PMCID: PMC8683140 DOI: 10.1097/ee9.0000000000000166] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/26/2020] [Accepted: 06/28/2021] [Indexed: 11/26/2022]
Abstract
Early life stages are vulnerable to environmental hazards and present important windows of opportunity for lifelong disease prevention. This makes early life a relevant starting point for exposome studies. The Advancing Tools for Human Early Lifecourse Exposome Research and Translation (ATHLETE) project aims to develop a toolbox of exposome tools and a Europe-wide exposome cohort that will be used to systematically quantify the effects of a wide range of community- and individual-level environmental risk factors on mental, cardiometabolic, and respiratory health outcomes and associated biological pathways, longitudinally from early pregnancy through to adolescence. Exposome tool and data development includes the following: (1) a findable, accessible, interoperable, reusable (FAIR) data infrastructure for early life exposome cohort data, including 16 prospective birth cohorts in 11 European countries; (2) targeted and nontargeted approaches to measure a wide range of environmental exposures (urban, chemical, physical, behavioral, social); (3) advanced statistical and toxicological strategies to analyze complex multidimensional exposome data; (4) estimation of associations between the exposome and early organ development, health trajectories, and biological (metagenomic, metabolomic, epigenetic, aging, and stress) pathways; (5) intervention strategies to improve early life urban and chemical exposomes, co-produced with local communities; and (6) child health impacts and associated costs related to the exposome. Data, tools, and results will be assembled in an openly accessible toolbox, which will provide great opportunities for researchers, policymakers, and other stakeholders, beyond the duration of the project. ATHLETE's results will help to better understand and prevent health damage from environmental exposures and their mixtures from the earliest parts of the life course onward.
Affiliation(s)
- Martine Vrijheid
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Corresponding Author. Address: ISGlobal, Institute for Global Health, C. Doctor Aiguader 88, 08003 Barcelona, Spain. E-mail: (M. Vrijheid)
- Xavier Basagaña
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Juan R. Gonzalez
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Vincent W. V. Jaddoe
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands
- Department of Pediatrics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Genon Jensen
- Health & Environment Alliance (HEAL), Brussels, Belgium
- Hector C. Keun
- Department of Surgery & Cancer and Department of Metabolism, Digestion & Reproduction, Imperial College London, London, United Kingdom
- Rosemary R. C. McEachan
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Joana Porcel
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Valerie Siroux
- University Grenoble Alpes, Inserm, CNRS, IAB (Institute for Advanced Biosciences) Joint Research Center, Team of Environmental Epidemiology Applied to Development and Respiratory Health, Grenoble, France
- Morris A. Swertz
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
- Cathrine Thomsen
- Department of Environmental Health, Norwegian Institute of Public Health, Oslo, Norway
- Gunn Marit Aasvang
- Department of Environmental Health, Norwegian Institute of Public Health, Oslo, Norway
- Sandra Andrušaitytė
- Department of Environmental Sciences, Vytautas Magnus University, Kaunas, Lithuania
- Karine Angeli
- French Agency for Food, Environmental and Occupational Health and Safety (ANSES), Risk Assessment Department, Maisons-Alfort, France
- Demetris Avraam
- Population Health Sciences Institute, Newcastle University, Newcastle, United Kingdom
- Ferran Ballester
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Epidemiology and Environmental Health Joint Research Unit, FISABIO-Universitat Jaume I-Universitat de València, València, Spain
- Faculty of Nursing and Chiropody, Universitat de València, Valencia, Spain
- Paul Burton
- Population Health Sciences Institute, Newcastle University, Newcastle, United Kingdom
- Mariona Bustamante
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Maribel Casas
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Leda Chatzi
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California
- Cécile Chevrier
- University Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail)—UMR_S 1085, Rennes, France
- David Conti
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California
- Amélie Crépet
- French Agency for Food, Environmental and Occupational Health and Safety (ANSES), Risk Assessment Department, Maisons-Alfort, France
- Payam Dadvand
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Liesbeth Duijts
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands
- Department of Pediatrics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Esther van Enckevort
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
- Ana Esplugues
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Epidemiology and Environmental Health Joint Research Unit, FISABIO-Universitat Jaume I-Universitat de València, València, Spain
- Faculty of Nursing and Chiropody, Universitat de València, Valencia, Spain
- Serena Fossati
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Ronan Garlantezec
- CHU de Rennes, University Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail)—UMR_S 1085, Rennes, France
- María Dolores Gómez Roig
- Institut de Recerca Sant Joan de Déu (IR-SJD), Barcelona, Spain
- Maternal and Child Health and Development Network II (SAMID II), Instituto de Salud Carlos III (ISCIII), Madrid, Spain
- BCNatal—Barcelona Center for Maternal Fetal and Neonatal Medicine, Hospital Sant Joan de Déu, Barcelona, Spain
- Kristine B. Gützkow
- Department of Environmental Health, Norwegian Institute of Public Health, Oslo, Norway
- Mònica Guxens
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Department of Child and Adolescence Psychiatry, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Sido Haakma
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
- Ellen V. S. Hessel
- National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Lesley Hoyles
- Department of Biosciences, Nottingham Trent University, Nottingham, United Kingdom
- Eleanor Hyde
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
- Jana Klanova
- RECETOX Centre, Faculty of Science, Masaryk University, Brno, Czech Republic
- Jacob D. van Klaveren
- National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Andreas Kortenkamp
- Brunel University London, College of Health, Medicine and Life Sciences, Uxbridge, United Kingdom
- Laurent Le Brusquet
- University Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes, Gif-sur-Yvette, France
- Ivonne Leenen
- Health & Environment Alliance (HEAL), Brussels, Belgium
- Aitana Lertxundi
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- University of Basque Country UPV/EHU, Basque Country, Bilbao, Spain
- Biodonostia, Research Health Institute, Donostia-San Sebastian, Spain
- Nerea Lertxundi
- University of Basque Country UPV/EHU, Basque Country, Bilbao, Spain
- Biodonostia, Research Health Institute, Donostia-San Sebastian, Spain
- Christos Lionis
- Department of Social Medicine, School of Medicine, University of Crete, Heraklion, Crete, Greece
- Sabrina Llop
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Epidemiology and Environmental Health Joint Research Unit, FISABIO-Universitat Jaume I-Universitat de València, València, Spain
- Maria-Jose Lopez-Espinosa
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Epidemiology and Environmental Health Joint Research Unit, FISABIO-Universitat Jaume I-Universitat de València, València, Spain
- Faculty of Nursing and Chiropody, Universitat de València, Valencia, Spain
- Sarah Lyon-Caen
- University Grenoble Alpes, Inserm, CNRS, IAB (Institute for Advanced Biosciences) Joint Research Center, Team of Environmental Epidemiology Applied to Development and Respiratory Health, Grenoble, France
- Lea Maitre
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Dan Mason
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Sandrine Mathy
- University Grenoble Alpes, CNRS, INRAE, Grenoble INP, GAEL, Grenoble, France
- Edurne Mazarico
- Institut de Recerca Sant Joan de Déu (IR-SJD), Barcelona, Spain
- Maternal and Child Health and Development Network II (SAMID II), Instituto de Salud Carlos III (ISCIII), Madrid, Spain
- BCNatal—Barcelona Center for Maternal Fetal and Neonatal Medicine, Hospital Sant Joan de Déu, Barcelona, Spain
- Tim Nawrot
- Centre for Environmental Sciences, Hasselt University, Hasselt, Belgium
- Centre for Health and Environment, Leuven University, Leuven, Belgium
- Mark Nieuwenhuijsen
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Rodney Ortiz
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Marie Pedersen
- Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Míriam Pérez-Cruz
- Institut de Recerca Sant Joan de Déu (IR-SJD), Barcelona, Spain
- Maternal and Child Health and Development Network II (SAMID II), Instituto de Salud Carlos III (ISCIII), Madrid, Spain
- BCNatal—Barcelona Center for Maternal Fetal and Neonatal Medicine, Hospital Sant Joan de Déu, Barcelona, Spain
- Claire Philippat
- University Grenoble Alpes, Inserm, CNRS, IAB (Institute for Advanced Biosciences) Joint Research Center, Team of Environmental Epidemiology Applied to Development and Respiratory Health, Grenoble, France
- Pavel Piler
- RECETOX Centre, Faculty of Science, Masaryk University, Brno, Czech Republic
- Costanza Pizzi
- Cancer Epidemiology Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Joane Quentin
- University Grenoble Alpes, Inserm, CNRS, IAB (Institute for Advanced Biosciences) Joint Research Center, Team of Environmental Epidemiology Applied to Development and Respiratory Health, Grenoble, France
- Lorenzo Richiardi
- Cancer Epidemiology Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Theano Roumeliotaki
- Department of Social Medicine, School of Medicine, University of Crete, Heraklion, Crete, Greece
- Susana Santos
- The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands
- Department of Pediatrics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Alexandros P. Siskos
- Department of Surgery & Cancer and Department of Metabolism, Digestion & Reproduction, Imperial College London, London, United Kingdom
- Nikos Stratakis
- ISGlobal, Barcelona, Spain
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California
- Jordi Sunyer
- ISGlobal, Barcelona, Spain
- Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Arthur Tenenhaus
- University Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes, Gif-sur-Yvette, France
- Marina Vafeiadi
- Department of Social Medicine, School of Medicine, University of Crete, Heraklion, Crete, Greece
- Rebecca C. Wilson
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, United Kingdom
- John Wright
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Tiffany Yang
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Remy Slama
- University Grenoble Alpes, Inserm, CNRS, IAB (Institute for Advanced Biosciences) Joint Research Center, Team of Environmental Epidemiology Applied to Development and Respiratory Health, Grenoble, France
25
Pinart M, Nimptsch K, Forslund SK, Schlicht K, Gueimonde M, Brigidi P, Turroni S, Ahrens W, Hebestreit A, Wolters M, Dötsch A, Nöthlings U, Oluwagbemigun K, Cuadrat RRC, Schulze MB, Standl M, Schloter M, De Angelis M, Iozzo P, Guzzardi MA, Vlaemynck G, Penders J, Jonkers DMAE, Stemmer M, Chiesa G, Cavalieri D, De Filippo C, Ercolini D, De Filippis F, Ribet D, Achamrah N, Tavolacci MP, Déchelotte P, Bouwman J, Laudes M, Pischon T. Identification and Characterization of Human Observational Studies in Nutritional Epidemiology on Gut Microbiomics for Joint Data Analysis. Nutrients 2021; 13:3292. [PMID: 34579168 PMCID: PMC8466729 DOI: 10.3390/nu13093292] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/22/2021] [Revised: 09/10/2021] [Accepted: 09/17/2021] [Indexed: 01/16/2023] Open
Abstract
In any research field, data access and data integration are major challenges that even large, well-established consortia face. Although data sharing initiatives are increasing, joint data analyses on nutrition and microbiomics in health and disease are still scarce. We aimed to identify observational studies with data on nutrition and gut microbiome composition from the Intestinal Microbiomics (INTIMIC) Knowledge Platform following the findable, accessible, interoperable, and reusable (FAIR) principles. An adapted template from the European Nutritional Phenotype Assessment and Data Sharing Initiative (ENPADASI) consortium was used to collect microbiome-specific information and other related factors. In total, 23 studies (17 longitudinal and 6 cross-sectional) were identified from Italy (7), Germany (6), Netherlands (3), Spain (2), Belgium (1), and France (1) or multiple countries (3). Of these, 21 studies collected information on both dietary intake (24 h dietary recall, food frequency questionnaire (FFQ), or Food Records) and gut microbiome. All studies collected stool samples. The most often used sequencing platform was Illumina MiSeq, and the preferred hypervariable regions of the 16S rRNA gene were V3-V4 or V4. The combination of datasets will allow for sufficiently powered investigations to increase the knowledge and understanding of the relationship between food and gut microbiome in health and disease.
Affiliation(s)
- Mariona Pinart
- Molecular Epidemiology Research Group, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; (M.P.); (T.P.)
- Katharina Nimptsch
- Molecular Epidemiology Research Group, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; (M.P.); (T.P.)
- Sofia K. Forslund
- Experimental and Clinical Research Center, A Cooperation of Charité-Universitätsmedizin Berlin and Max Delbrück Center for Molecular Medicine, Lindenberger Weg 80, 13125 Berlin, Germany
- Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, 10117 Berlin, Germany
- Host-Microbiome Factors in Cardiovascular Disease Lab, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Site Berlin, 10785 Berlin, Germany
- Berlin Institute of Health (BIH), 10178 Berlin, Germany
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Kristina Schlicht
- Institute of Diabetes and Clinical Metabolic Research, University of Kiel, 24105 Kiel, Germany; (K.S.); (M.L.)
- Miguel Gueimonde
- Department of Microbiology and Biochemistry of Dairy Products, IPLA-CSIC, 33300 Villaviciosa, Spain
- Diet, Microbiota and Health Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), 33011 Oviedo, Spain
- Patrizia Brigidi
- Department of Medical and Surgical Sciences, University of Bologna, Via Massarenti 9, 40138 Bologna, Italy
- Silvia Turroni
- Department of Pharmacy and Biotechnology, University of Bologna, Via Belmeloro 6, 40126 Bologna, Italy
- Wolfgang Ahrens
- Leibniz Institute for Prevention Research and Epidemiology-BIPS, 28359 Bremen, Germany; (W.A.); (A.H.); (M.W.)
- Institute of Statistics, Bremen University, 28359 Bremen, Germany
- Antje Hebestreit
- Leibniz Institute for Prevention Research and Epidemiology-BIPS, 28359 Bremen, Germany; (W.A.); (A.H.); (M.W.)
- Maike Wolters
- Leibniz Institute for Prevention Research and Epidemiology-BIPS, 28359 Bremen, Germany; (W.A.); (A.H.); (M.W.)
- Andreas Dötsch
- Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut (MRI)-Federal Research Institute of Nutrition and Food, 76131 Karlsruhe, Germany
- Ute Nöthlings
- Nutritional Epidemiology Unit, Institute of Nutrition and Food Sciences, University of Bonn, 53115 Bonn, Germany; (U.N.); (K.O.)
- Kolade Oluwagbemigun
- Nutritional Epidemiology Unit, Institute of Nutrition and Food Sciences, University of Bonn, 53115 Bonn, Germany; (U.N.); (K.O.)
- Rafael R. C. Cuadrat
- Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany; (R.R.C.C.); (M.B.S.)
- Matthias B. Schulze
- Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany; (R.R.C.C.); (M.B.S.)
- Institute of Nutritional Science, University of Potsdam, 14558 Potsdam, Germany
- German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
- Marie Standl
- Institute of Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Michael Schloter
- Research Unit for Comparative Microbiome Analysis, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Maria De Angelis
- Department of Soil, Plant and Food Sciences, University of Bari Aldo Moro, 70126 Bari, Italy
- Patricia Iozzo
- Institute of Clinical Physiology, National Research Council, Via Moruzzi 1, 56124 Pisa, Italy; (P.I.); (M.A.G.)
- Maria Angela Guzzardi
- Institute of Clinical Physiology, National Research Council, Via Moruzzi 1, 56124 Pisa, Italy; (P.I.); (M.A.G.)
- Geertrui Vlaemynck
- Department Technology and Food, Flanders Research Institute for Agriculture, Fisheries and Food, 9090 Melle, Belgium
- John Penders
- Department of Medical Microbiology, School of Nutrition and Translational Research in Metabolism (NUTRIM) and Care and Public Health Research Institute (CAPHRI), Maastricht University Medical Center, 6200 MD Maastricht, The Netherlands
- Daisy M. A. E. Jonkers
- Department of Internal Medicine, Division Gastroenterology-Hepatology, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University Medical Center, 6200 MD Maastricht, The Netherlands
- Maya Stemmer
- Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer-Sheva P.O. Box 653, Israel
- Giulia Chiesa
- Department of Pharmacological and Biomolecular Sciences, Università degli Studi di Milano, 20133 Milan, Italy
- Duccio Cavalieri
- Department of Biology, University of Florence, Via Madonna del Piano 6, 50019 Florence, Italy
- Carlotta De Filippo
- Institute of Agricultural Biology and Biotechnology National Research Council, Via Moruzzi 1, 56124 Pisa, Italy
- Danilo Ercolini
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici, Italy; (D.E.); (F.D.F.)
- Task Force on Microbiome Studies, University of Naples Federico II, 80134 Naples, Italy
- Francesca De Filippis
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici, Italy; (D.E.); (F.D.F.)
- Task Force on Microbiome Studies, University of Naples Federico II, 80134 Naples, Italy
- David Ribet
- INSERM UMR 1073 “Nutrition, Inflammation and Gut-Brain Axis Dysfunctions”, UNIROUEN, Normandie University, 76000 Rouen, France; (D.R.); (N.A.); (M.-P.T.); (P.D.)
- Najate Achamrah
- INSERM UMR 1073 “Nutrition, Inflammation and Gut-Brain Axis Dysfunctions”, UNIROUEN, Normandie University, 76000 Rouen, France; (D.R.); (N.A.); (M.-P.T.); (P.D.)
- Department of Nutrition, CHU Rouen, 76000 Rouen, France
- Marie-Pierre Tavolacci
- INSERM UMR 1073 “Nutrition, Inflammation and Gut-Brain Axis Dysfunctions”, UNIROUEN, Normandie University, 76000 Rouen, France; (D.R.); (N.A.); (M.-P.T.); (P.D.)
- INSERM CIC-CRB 1404, CHU Rouen, 76000 Rouen, France
- Pierre Déchelotte
- INSERM UMR 1073 “Nutrition, Inflammation and Gut-Brain Axis Dysfunctions”, UNIROUEN, Normandie University, 76000 Rouen, France; (D.R.); (N.A.); (M.-P.T.); (P.D.)
- Department of Nutrition, CHU Rouen, 76000 Rouen, France
- Jildau Bouwman
- Microbiology and Systems Biology Group, TNO, Utrechtseweg 48, 3704 HE Zeist, The Netherlands
- Matthias Laudes
- Institute of Diabetes and Clinical Metabolic Research, University of Kiel, 24105 Kiel, Germany; (K.S.); (M.L.)
- Tobias Pischon
- Molecular Epidemiology Research Group, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; (M.P.); (T.P.)
- Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, 10117 Berlin, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Site Berlin, 10785 Berlin, Germany
- Biobank Technology Platform, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Biobank Core Facility, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, 10178 Berlin, Germany