Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Muilu J, Peltonen L, Litton JE. The federated database – a basis for biobank-based post-genome studies, integrating phenome and genome data from 600 000 twin pairs in Europe. Eur J Hum Genet 2007;15:718-23. [PMID: 17487219 DOI: 10.1038/sj.ejhg.5201850] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]

Cited by Other Article(s)

Schröder M, Muller SH, Vradi E, Mielke J, Lim YM, Couvelard F, Mostert M, Koudstaal S, Eijkemans MJ, Gerlinger C. Sharing Medical Big Data While Preserving Patient Confidentiality in Innovative Medicines Initiative: A Summary and Case Report from BigData@Heart. BIG DATA 2023;11:399-407. [PMID: 37889577 PMCID: PMC10733752 DOI: 10.1089/big.2022.0178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2023]

Overview of Federated Facility to Harmonize, Analyze and Management of Missing Data in Cohorts. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9194103] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]

Teare HJA, de Masi F, Banasik K, Barnett A, Herrgard S, Jablonka B, Postma JWM, McDonald TJ, Forgie I, Chmura PJ, Rydzka EK, Gupta R, Brunak S, Pearson E, Kaye J. The governance structure for data access in the DIRECT consortium: an innovative medicines initiative (IMI) project. LIFE SCIENCES, SOCIETY AND POLICY 2018;14:20. [PMID: 30182269 PMCID: PMC6123336 DOI: 10.1186/s40504-018-0083-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/14/2018] [Accepted: 07/23/2018] [Indexed: 06/08/2023]

Affiliation(s)

Harriet J. A. Teare HeLEX Centre, University of Oxford, Ewert House, Banbury Road, Oxford, OX2 7DD UK Melbourne Law School, University of Melbourne, 185 Pelham Street, Carlton, VIC 3053 Australia
Federico de Masi Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark
Karina Banasik Translational Disease Systems Biology, NNF Center for Protein Research, University of Copenhagen, Faculty of Health and Medical Sciences, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
Anna Barnett Division of Molecular & Clinical Medicine, School of Medicine, University of Dundee, Ninewells Hospital & Medical School, Dundee, UK
Sanna Herrgard Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark
Bernd Jablonka Sanofi-Aventis Deutschland GmbH, Industriepark Höchst, 65926 Frankfurt, Germany
Jacqueline W. M. Postma Clinical Research Centre, Lund University Diabetes Centre, Box 50332, SE-202 13 Malmö, Sweden
Timothy J. McDonald Blood Sciences, Template A2, Royal Devon and Exeter Hospital, Barrack Road, Exeter, EX2 5DW UK
Ian Forgie Division of Molecular & Clinical Medicine, School of Medicine, University of Dundee, Ninewells Hospital & Medical School, Dundee, UK
Piotr J. Chmura Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark
Emil K. Rydzka Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark
Ramneek Gupta Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark
Soren Brunak Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark Translational Disease Systems Biology, NNF Center for Protein Research, University of Copenhagen, Faculty of Health and Medical Sciences, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
Ewan Pearson Division of Molecular & Clinical Medicine, School of Medicine, University of Dundee, Ninewells Hospital & Medical School, Dundee, UK
Jane Kaye HeLEX Centre, University of Oxford, Ewert House, Banbury Road, Oxford, OX2 7DD UK Melbourne Law School, University of Melbourne, 185 Pelham Street, Carlton, VIC 3053 Australia

Fortier I, Raina P, Van den Heuvel ER, Griffith LE, Craig C, Saliba M, Doiron D, Stolk RP, Knoppers BM, Ferretti V, Granda P, Burton P. Maelstrom Research guidelines for rigorous retrospective data harmonization. Int J Epidemiol 2017;46:103-105. [PMID: 27272186 PMCID: PMC5407152 DOI: 10.1093/ije/dyw075] [Citation(s) in RCA: 85] [Impact Index Per Article: 12.1] [Accepted: 03/16/2016] [Indexed: 12/26/2022]

Abstract

Background

It is widely accepted and acknowledged that data harmonization is crucial: in its absence, the co-analysis of major tranches of high quality extant data is liable to inefficiency or error. However, despite its widespread practice, no formalized/systematic guidelines exist to ensure high quality retrospective data harmonization.

Methods

To better understand real-world harmonization practices and facilitate development of formal guidelines, three interrelated initiatives were undertaken between 2006 and 2015. They included a phone survey with 34 major international research initiatives, a series of workshops with experts, and case studies applying the proposed guidelines.

Results

A wide range of projects use retrospective harmonization to support their research activities but even when appropriate approaches are used, the terminologies, procedures, technologies and methods adopted vary markedly. The generic guidelines outlined in this article delineate the essentials required and describe an interdependent step-by-step approach to harmonization: 0) define the research question, objectives and protocol; 1) assemble pre-existing knowledge and select studies; 2) define targeted variables and evaluate harmonization potential; 3) process data; 4) estimate quality of the harmonized dataset(s) generated; and 5) disseminate and preserve final harmonization products.

Conclusions

This manuscript provides guidelines aiming to encourage rigorous and effective approaches to harmonization which are comprehensively and transparently documented and straightforward to interpret and implement. This can be seen as a key step towards implementing guiding principles analogous to those that are well recognised as being essential in securing the foundational underpinning of systematic reviews and the meta-analysis of clinical trials.

Park HS, Cho H, Kim HS. Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea. Healthc Inform Res 2016;22:129-41. [PMID: 27200223 PMCID: PMC4871843 DOI: 10.4258/hir.2016.22.2.129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/22/2016] [Revised: 04/19/2016] [Accepted: 04/22/2016] [Indexed: 11/23/2022]

Abstract

Objectives

This study developed an integrated database for 15 regional biobanks that provides large quantities of high-quality bio-data to researchers to be used for the prevention of disease, for the development of personalized medicines, and in genetics studies.

Methods

We collected raw data, managed independently by 15 regional biobanks, for database modeling and analyzed and defined the metadata of the items. We also built a three-step (high, middle, and low) classification system for classifying the item concepts based on the metadata. To generate clear meanings of the items, clinical items were defined using the Systematized Nomenclature of Medicine Clinical Terms, and specimen items were defined using the Logical Observation Identifiers Names and Codes. To optimize database performance, we set up a multi-column index based on the classification system and the international standard code.

Results

As a result of subdividing 7,197,252 raw data items collected, we refined the metadata into 1,796 clinical items and 1,792 specimen items. The classification system consists of 15 high, 163 middle, and 3,588 low class items. International standard codes were linked to 69.9% of the clinical items and 71.7% of the specimen items. The database consists of 18 tables based on a table from MySQL Server 5.6. As a result of the performance evaluation, the multi-column index shortened query time by as much as nine times.

Conclusions

The database developed was based on an international standard terminology system, providing an infrastructure that can integrate the 7,197,252 raw data items managed by the 15 regional biobanks. In particular, it resolved the inevitable interoperability issues in the exchange of information among the biobanks, and provided a solution to the synonym problem, which arises when the same concept is expressed in a variety of ways.
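The multi-column index mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' schema or code: the abstract reports MySQL Server 5.6, whereas this example swaps in Python's built-in `sqlite3` module so that it runs without a server. The table name `item`, its columns, the index name `idx_item_class_code`, and the synthetic rows are all hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table of biobank items (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        " id INTEGER PRIMARY KEY,"
        " high_class TEXT, middle_class TEXT, low_class TEXT,"
        " standard_code TEXT, value TEXT)"
    )
    # Synthetic rows only; they do not reproduce any real biobank data.
    rows = [
        (f"H{i % 15}", f"M{i % 163}", f"L{i % 3588}", f"C{i % 500}", str(i))
        for i in range(20000)
    ]
    conn.executemany(
        "INSERT INTO item"
        " (high_class, middle_class, low_class, standard_code, value)"
        " VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return conn


QUERY = (
    "SELECT value FROM item"
    " WHERE high_class = ? AND middle_class = ? AND standard_code = ?"
)


def query_plan(conn: sqlite3.Connection, params: tuple) -> str:
    """Return SQLite's plan description for QUERY as a single string."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in plan)


conn = build_demo_db()
params = ("H3", "M3", "C3")

plan_before = query_plan(conn, params)  # full table scan expected

# One index spanning the classification columns and the standard code.
conn.execute(
    "CREATE INDEX idx_item_class_code"
    " ON item (high_class, middle_class, standard_code)"
)
plan_after = query_plan(conn, params)   # should now use the index

matches = conn.execute(QUERY, params).fetchall()
print(plan_before)
print(plan_after)
```

Comparing the two printed plans shows the query switching from a table scan to an index search. How much time this saves depends on the engine, data volume, and query mix, so the nine-fold figure reported in the abstract should not be expected from this toy example.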

Carter KW, Francis RW, Bresnahan M, Gissler M, Grønborg TK, Gross R, Gunnes N, Hammond G, Hornig M, Hultman CM, Huttunen J, Langridge A, Leonard H, Newman S, Parner ET, Petersson G, Reichenberg A, Sandin S, Schendel DE, Schalkwyk L, Sourander A, Steadman C, Stoltenberg C, Suominen A, Surén P, Susser E, Sylvester Vethanayagam A, Yusof Z. ViPAR: a software platform for the Virtual Pooling and Analysis of Research Data. Int J Epidemiol 2015;45:408-416. [PMID: 26452388 PMCID: PMC4864874 DOI: 10.1093/ije/dyv193] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Indexed: 11/23/2022]

Abstract

Background: Research studies exploring the determinants of disease require sufficient statistical power to detect meaningful effects. Sample size is often increased through centralized pooling of disparately located datasets, though ethical, privacy and data ownership issues can often hamper this process. Methods that facilitate the sharing of research data that are sympathetic with these issues and which allow flexible and detailed statistical analyses are therefore in critical need. We have created a software platform for the Virtual Pooling and Analysis of Research data (ViPAR), which employs free and open source methods to provide researchers with a web-based platform to analyse datasets housed in disparate locations.

Methods: Database federation permits controlled access to remotely located datasets from a central location. The Secure Shell protocol allows data to be securely exchanged between devices over an insecure network. ViPAR combines these free technologies into a solution that facilitates ‘virtual pooling’ where data can be temporarily pooled into computer memory and made available for analysis without the need for permanent central storage.

Results: Within the ViPAR infrastructure, remote sites manage their own harmonized research dataset in a database hosted at their site, while a central server hosts the data federation component and a secure analysis portal. When an analysis is initiated, requested data are retrieved from each remote site and virtually pooled at the central site. The data are then analysed by statistical software and, on completion, results of the analysis are returned to the user and the virtually pooled data are removed from memory.

Conclusions: ViPAR is a secure, flexible and powerful analysis platform built on open source technology that is currently in use by large international consortia, and is made publicly available at [ http://bioinformatics.childhealthresearch.org.au/software/vipar/ ].
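The "virtual pooling" idea described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not ViPAR's implementation: the remote sites are simulated with in-memory `sqlite3` databases, the Secure Shell transport and the federation layer are omitted, and the table `cohort`, its columns, and the helper names are hypothetical.

```python
import sqlite3
from statistics import mean


def make_site(rows):
    """Stand-in for a remote site's database of harmonized records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cohort (age REAL, outcome INTEGER)")
    conn.executemany("INSERT INTO cohort VALUES (?, ?)", rows)
    return conn


def virtual_pool_analysis(sites, analyse):
    """Fetch rows from every site, analyse them in memory, then discard them."""
    pooled = []
    for conn in sites:
        pooled.extend(
            conn.execute("SELECT age, outcome FROM cohort").fetchall()
        )
    try:
        return analyse(pooled)
    finally:
        # The pool lives only for the duration of this call and is
        # never written to permanent central storage.
        pooled.clear()


def mean_age_by_outcome(rows):
    """Toy analysis: mean age within each outcome group."""
    groups = {}
    for age, outcome in rows:
        groups.setdefault(outcome, []).append(age)
    return {outcome: mean(ages) for outcome, ages in groups.items()}


# Synthetic example data for two simulated sites.
sites = [
    make_site([(30.0, 0), (40.0, 1)]),
    make_site([(50.0, 0), (60.0, 1)]),
]
result = virtual_pool_analysis(sites, mean_age_by_outcome)
print(result)
```

Only the analysis result leaves the function; the row-level pool is cleared as soon as the analysis finishes, which mirrors the temporary, in-memory pooling the abstract describes.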

Affiliation(s)

Kim W Carter Telethon Kids Institute, University of Western Australia, Perth, WA, Australia
Richard W Francis Telethon Kids Institute, University of Western Australia, Perth, WA, Australia
M Bresnahan Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA, New York State Psychiatric Institute, New York, NY, USA
M Gissler National Institute for Health and Welfare, Helsinki, Finland, NHV Nordic School of Public Health, Gothenburg, Sweden
T K Grønborg Department of Public Health, University of Aarhus, Aarhus, Denmark
R Gross Division of Psychiatry, Sheba Medical Center, Tel Hashomer, Israel, Department of Epidemiology and Preventive Medicine, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv, Israel
N Gunnes Norwegian Institute of Public Health, Oslo, Norway
G Hammond Telethon Kids Institute, University of Western Australia, Perth, WA, Australia
M Hornig Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA, Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY, USA
C M Hultman Karolinska Institutet, Stockholm, Sweden
J Huttunen Turku University, Turku, Finland
A Langridge Telethon Kids Institute, University of Western Australia, Perth, WA, Australia
H Leonard Telethon Kids Institute, University of Western Australia, Perth, WA, Australia
S Newman Institute of Psychiatry, King's College London, London, UK
E T Parner Department of Public Health, University of Aarhus, Aarhus, Denmark
G Petersson Karolinska Institutet, Stockholm, Sweden
A Reichenberg Department of Psychosis Studies, Institute of Psychiatry, King's College London, London, UK, Departments of Preventative Medicine and Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
S Sandin Karolinska Institutet, Stockholm, Sweden
D E Schendel Department of Public Health, Section for Epidemiology, University of Aarhus, Aarhus, Denmark, Department of Economics and Business, National Centre for Register-based Research, University of Aarhus, Aarhus, Denmark, Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
L Schalkwyk Institute of Psychiatry, King's College London, London, UK
A Sourander Child Psychiatry Research Center, Department of Child Psychiatry, Turku University, Turku, Finland, Turku University Hospital, Turku, Finland
C Steadman Telethon Kids Institute, University of Western Australia, Perth, WA, Australia
C Stoltenberg Norwegian Institute of Public Health, Oslo, Norway, Department of Global Public Health and Primary Care, University of Bergen, Bergen, Norway
A Suominen Department of Child Psychiatry, Turku University, Turku, Finland
P Surén Norwegian Institute of Public Health, Oslo, Norway
E Susser Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA, New York State Psychiatric Institute, New York, NY, USA
A Sylvester Vethanayagam University of Aarhus, Aarhus, Denmark
Z Yusof Karolinska Institutet, Stockholm, Sweden

Data harmonization and federated analysis of population-based studies: the BioSHaRE project. Emerg Themes Epidemiol 2013;10:12. [PMID: 24257327 PMCID: PMC4175511 DOI: 10.1186/1742-7622-10-12] [Citation(s) in RCA: 90] [Impact Index Per Article: 8.2] [Received: 07/03/2013] [Accepted: 11/11/2013] [Indexed: 01/08/2023]

Schendel DE, Bresnahan M, Carter KW, Francis RW, Gissler M, Grønborg TK, Gross R, Gunnes N, Hornig M, Hultman CM, Langridge A, Lauritsen MB, Leonard H, Parner ET, Reichenberg A, Sandin S, Sourander A, Stoltenberg C, Suominen A, Surén P, Susser E. The International Collaboration for Autism Registry Epidemiology (iCARE): multinational registry-based investigations of autism risk factors and trends. J Autism Dev Disord 2013;43:2650-63. [PMID: 23563868 PMCID: PMC4512211 DOI: 10.1007/s10803-013-1815-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 10/27/2022]

Linkage of Data from Diverse Data Sources (LDS): A Data Combination Model Provides Clinical Data of Corresponding Specimens in Biobanking Information System. J Med Syst 2013;37:9975. [DOI: 10.1007/s10916-013-9975-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/30/2013] [Accepted: 08/29/2013] [Indexed: 11/26/2022]

Boomsma DI, Willemsen G, Vink JM, Bartels M, Groot P, Hottenga JJ, van Beijsterveldt CEMT, Stroet T, van Dijk R, Wertheim R, Visser M, van der Kleij F. Design and Implementation of a Twin-Family Database for Behavior Genetics and Genomics Studies. Twin Res Hum Genet 2012;11:342-8. [DOI: 10.1375/twin.11.3.342] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/05/2022]

Wichmann HE, Kuhn KA, Waldenberger M, Schmelcher D, Schuffenhauer S, Meitinger T, Wurst SHR, Lamla G, Fortier I, Burton PR, Peltonen L, Perola M, Metspalu A, Riegman P, Landegren U, Taussig MJ, Litton JE, Fransson MN, Eder J, Cambon-Thomsen A, Bovenberg J, Dagher G, van Ommen GJ, Griffith M, Yuille M, Zatloukal K. Comprehensive catalog of European biobanks. Nat Biotechnol 2011;29:795-7. [DOI: 10.1038/nbt.1958] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Indexed: 11/09/2022]

A Survey on Data Integration in Bioinformatics. 2011. [DOI: 10.1007/978-3-642-25483-3_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

Späth MB, Grimson J. Applying the archetype approach to the database of a biobank information management system. Int J Med Inform 2010;80:205-26. [PMID: 21131230 DOI: 10.1016/j.ijmedinf.2010.11.002] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 07/26/2010] [Revised: 11/01/2010] [Accepted: 11/02/2010] [Indexed: 11/17/2022]

Abstract

PURPOSE

The purpose of this study is to investigate the feasibility of applying the openEHR archetype approach to modelling the data in the database of an existing proprietary biobank information management system. A biobank information management system stores the clinical/phenotypic data of the sample donor and sample related information. The clinical/phenotypic data is potentially sourced from the donor's electronic health record (EHR). The study evaluates the reuse of openEHR archetypes that have been developed for the creation of an interoperable EHR in the context of biobanking, and proposes a new set of archetypes specifically for biobanks. The ultimate goal of the research is the development of an interoperable electronic biomedical research record (eBMRR) to support biomedical knowledge discovery.

METHODS

The database of the prostate cancer biobank of the Irish Prostate Cancer Research Consortium (PCRC), which supports the identification of novel biomarkers for prostate cancer, was taken as the basis for the modelling effort. First the database schema of the biobank was analyzed and reorganized into archetype-friendly concepts. Then, archetype repositories were searched for matching archetypes. Some existing archetypes were reused without change, some were modified or specialized, and new archetypes were developed where needed. The fields of the biobank database schema were then mapped to the elements in the archetypes. Finally, the archetypes were arranged into templates specifically to meet the requirements of the PCRC biobank.

RESULTS

A set of 47 archetypes was found to cover all the concepts used in the biobank. Of these, 29 (62%) were reused without change, 6 were modified and/or extended, 1 was specialized, and 11 were newly defined. These archetypes were arranged into 8 templates specifically required for this biobank. A number of issues were encountered in this research. Some arose from the immaturity of the archetype approach, such as immature modelling support tools, difficulties in defining high-quality archetypes and the problem of overlapping archetypes. In addition, the identification of suitable existing archetypes was time-consuming and many semantic conflicts were encountered during the process of mapping the PCRC BIMS database to existing archetypes. These include differences in the granularity of documentation, in metadata-level versus data-level modelling, in terminologies and vocabularies used, and in the amount of structure imposed on the information to be recorded. Furthermore, the current way of modelling the sample entity was found to be cumbersome in the sample-centric activity of biobanking.

CONCLUSIONS

The archetype approach is a promising approach to create a shareable eBMRR based on the study participant/donor for biobanks. Many archetypes originally developed for the EHR domain can be reused to model the clinical/phenotypic and sample information in the biobank context, which validates the genericity of these archetypes and their potential for reuse in the context of biomedical research. However, finding suitable archetypes in the repositories and establishing an exact mapping between the fields in the PCRC BIMS database and the elements of existing archetypes that have been designed for clinical practice can be challenging and time-consuming and involves resolving many common system integration conflicts. These may be attributable to differences in the requirements for information documentation between clinical practice and biobanking. This research also recognized the need for better support tools, modelling guidelines and best practice rules and reconfirmed the need for better domain knowledge governance. Furthermore, the authors propose that the establishment of an independent sample record with the sample as record subject should be investigated. The research presented in this paper is limited by the fact that the new archetypes developed during this research are based on a single biobank instance. These new archetypes may not be complete, representing only those subsets of items required by this particular database. Nevertheless, this exercise exposes some of the gaps that exist in the archetype modelling landscape and highlights the concepts that need to be modelled with archetypes to enable the development of an eBMRR.

Kim H, Yi BK, Kim IK, Kwak YS. Integrating Clinical Information in National Biobank of Korea. J Med Syst 2009;35:647-56. [DOI: 10.1007/s10916-009-9402-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/13/2009] [Accepted: 11/16/2009] [Indexed: 02/04/2023]

Baker EJ, Jay JJ, Philip VM, Zhang Y, Li Z, Kirova R, Langston MA, Chesler EJ. Ontological Discovery Environment: a system for integrating gene-phenotype associations. Genomics 2009;94:377-87. [PMID: 19733230 DOI: 10.1016/j.ygeno.2009.08.016] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 06/03/2009] [Revised: 08/19/2009] [Accepted: 08/27/2009] [Indexed: 10/20/2022]

Abstract

The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE's gene set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results. This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental systems or species domain.

Information Systems for Federated Biobanks. TRANSACTIONS ON LARGE-SCALE DATA- AND KNOWLEDGE-CENTERED SYSTEMS I 2009. [DOI: 10.1007/978-3-642-03722-1_7] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]

Eder J, Dabringer C, Schicho M, Stark K. Data Management for Federated Biobanks. 2009. [DOI: 10.1007/978-3-642-03573-9_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 03/19/2023]

Harrison JH, Aller RD. Regional and national health care data repositories. Clin Lab Med 2008;28:101-17, vii. [PMID: 18194721 DOI: 10.1016/j.cll.2007.10.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]

Siadaty MS, Harrison JH. Multi-database mining. Clin Lab Med 2008;28:73-82, vi. [PMID: 18194719 DOI: 10.1016/j.cll.2007.10.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]

Perola M, Sammalisto S, Hiekkalinna T, Martin NG, Visscher PM, Montgomery GW, Benyamin B, Harris JR, Boomsma D, Willemsen G, Hottenga JJ, Christensen K, Kyvik KO, Sørensen TIA, Pedersen NL, Magnusson PKE, Spector TD, Widen E, Silventoinen K, Kaprio J, Palotie A, Peltonen L. Combined genome scans for body stature in 6,602 European twins: evidence for common Caucasian loci. PLoS Genet 2007;3:e97. [PMID: 17559308 PMCID: PMC1892350 DOI: 10.1371/journal.pgen.0030097] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.8] [Received: 02/04/2007] [Accepted: 05/02/2007] [Indexed: 01/06/2023]

Abstract

Twin cohorts provide a unique advantage for investigations of the role of genetics and environment in the etiology of variation in common complex traits by reducing the variance due to environment, age, and cohort differences. The GenomEUtwin (http://www.genomeutwin.org) consortium consists of eight twin cohorts (Australian, Danish, Dutch, Finnish, Italian, Norwegian, Swedish, and United Kingdom) with the total resource of hundreds of thousands of twin pairs. We performed quantitative trait locus (QTL) analysis of one of the most heritable human complex traits, adult stature (body height) using genome-wide scans performed for 3,817 families (8,450 individuals) derived from twin cohorts from Australia, Denmark, Finland, Netherlands, Sweden, and United Kingdom with an approximate ten-centimorgan microsatellite marker map. The marker maps for different studies differed and they were combined and related to the sequence positions using software developed by us, which is publicly available (https://apps.bioinfo.helsinki.fi/software/cartographer.aspx). Variance component linkage analysis was performed with age, sex, and country of origin as covariates. The covariate adjusted heritability was 81% for stature in the pooled dataset. We found evidence for a major QTL for human stature on 8q21.3 (multipoint logarithm of the odds 3.28), and suggestive evidence for loci on Chromosomes X, 7, and 20. Some evidence of sex heterogeneity was found, however, no obvious female-specific QTLs emerged. Several cohorts contributed to the identified loci, suggesting an evolutionarily old genetic variant having effects on stature in European-based populations. To facilitate the genetic studies of stature we have also set up a website that lists all stature genome scans published and their most significant loci (http://www.genomeutwin.org/stature_gene_map.htm).

Twin cohorts provide a unique advantage for research of the role of genetics and environment behind common complex traits by reducing the variance due to environment, age, and cohort differences. The GenomEUtwin consortium consists of eight twin cohorts with the total resource of hundreds of thousands of twin pairs (http://www.genomeutwin.org). We performed quantitative family-based genetic linkage analysis for one of the most heritable human complex traits, adult stature (body height), using genome-wide scans derived from twin cohorts from Australia, Denmark, Finland, Netherlands, Sweden, and United Kingdom. Age, sex, and country were adjusted for in the data analyses. Human stature was found to be very heritable across all the cohorts and in the combined dataset. We found evidence for a shared genetic locus accounting for human stature on Chromosome 8, and suggestive evidence for loci on Chromosomes X, 7, and 20. Since twins from several countries contributed to the identified loci, an evolutionarily old genetic variant must influence stature in European-based populations. To facilitate the research in the field we have also set up a website that lists all stature genome scans published and their most significant loci (http://www.genomeutwin.org/stature_gene_map.htm).

Affiliation(s)

Markus Perola Department of Molecular Medicine, National Public Health Institute, Helsinki, Finland Faculty of Medicine, Department of Medical Genetics, University of Helsinki, Helsinki, Finland
Sampo Sammalisto Department of Molecular Medicine, National Public Health Institute, Helsinki, Finland
Tero Hiekkalinna Department of Molecular Medicine, National Public Health Institute, Helsinki, Finland
Nick G Martin Queensland Institute of Medical Research, Brisbane, Australia
Peter M Visscher Queensland Institute of Medical Research, Brisbane, Australia
Grant W Montgomery Queensland Institute of Medical Research, Brisbane, Australia
Beben Benyamin Queensland Institute of Medical Research, Brisbane, Australia
Jennifer R Harris Norwegian Institute of Public Health, Oslo, Norway
Dorret Boomsma Free University, Amsterdam, The Netherlands
Gonneke Willemsen Free University, Amsterdam, The Netherlands
Jouke-Jan Hottenga Free University, Amsterdam, The Netherlands
Kaare Christensen Department of Epidemiology, Institute of Public Health, University of Southern Denmark, Odense, Denmark
Kirsten Ohm Kyvik Department of Epidemiology, Institute of Public Health, University of Southern Denmark, Odense, Denmark
Thorkild I. A Sørensen The Institute of Preventive Medicine, Copenhagen, Denmark
Nancy L Pedersen Karolinska Institutet, Stockholm, Sweden
Patrik K. E Magnusson Karolinska Institutet, Stockholm, Sweden
Tim D Spector Kings College London, London, United Kingdom
Elisabeth Widen Finnish Genome Center, University of Helsinki, Helsinki, Finland
Karri Silventoinen Faculty of Medicine, Department of Public Health, University of Helsinki, Helsinki, Finland
Jaakko Kaprio Faculty of Medicine, Department of Public Health, University of Helsinki, Helsinki, Finland Department of Mental Health and Alcohol Research, National Public Health Institute, Helsinki, Finland
Aarno Palotie Finnish Genome Center, University of Helsinki, Helsinki, Finland
Leena Peltonen Department of Molecular Medicine, National Public Health Institute, Helsinki, Finland Faculty of Medicine, Department of Medical Genetics, University of Helsinki, Helsinki, Finland The Broad Institute, Massachusetts Institute of Technology, Boston, Massachusetts, United States of America
GenomEUtwin Project
