Reference Citation Analysis

For: Ohno-Machado L, Bafna V, Boxwala AA, Chapman BE, Chapman WW, Chaudhuri K, Day ME, Farcas C, Heintzman ND, Jiang X, Kim H, Kim J, Matheny ME, Resnic FS, Vinterbo SA. iDASH: integrating data for analysis, anonymization, and sharing. J Am Med Inform Assoc 2012;19:196-201. [PMID: 22081224 PMCID: PMC3277627 DOI: 10.1136/amiajnl-2011-000538] [Citation(s) in RCA: 115] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2011] [Accepted: 08/15/2011] [Indexed: 11/03/2022] Open

Cited by Other Article(s)

Kuo TT, Jiang X, Tang H, Wang X, Harmanci A, Kim M, Post K, Bu D, Bath T, Kim J, Liu W, Chen H, Ohno-Machado L. The evolving privacy and security concerns for genomic data analysis and sharing as observed from the iDASH competition. J Am Med Inform Assoc 2022;29:2182-2190. [PMID: 36164820 PMCID: PMC9667175 DOI: 10.1093/jamia/ocac165] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2022] [Revised: 08/25/2022] [Accepted: 09/13/2022] [Indexed: 01/11/2023] Open

Functional genomics data: privacy risk assessment and technological mitigation. Nat Rev Genet 2022;23:245-258. [PMID: 34759381 DOI: 10.1038/s41576-021-00428-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/18/2021] [Indexed: 12/15/2022]

Spini G, Mancini E, Attema T, Abspoel M, de Gier J, Fehr S, Veugen T, van Heesch M, Worm D, De Luca A, Cramer R, Sloot PM. New Approach to Privacy-Preserving Clinical Decision Support Systems for HIV Treatment. J Med Syst 2022;46:84. [PMID: 36261621 PMCID: PMC9581834 DOI: 10.1007/s10916-022-01851-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2020] [Revised: 08/09/2022] [Accepted: 08/16/2022] [Indexed: 01/04/2023]

Affiliation(s)

Gabriele Spini: Applied Cryptography and Quantum Algorithms, TNO, 96800, 2509 JE Postbus, The Hague, The Netherlands
Emiliano Mancini: Institute for Advanced Study, University of Amsterdam, Oude Turfmarkt 147, 1012 GC Amsterdam, The Netherlands; Department of Global Health, Amsterdam UMC, Location AMC, 1105 AZ Amsterdam, The Netherlands; Data Science Institute, Hasselt University, Diepenbeek, Belgium
Thomas Attema: Applied Cryptography and Quantum Algorithms, TNO, 96800, 2509 JE Postbus, The Hague, The Netherlands; Cryptology Group, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands; Mathematical Institute, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
Mark Abspoel: Cryptology Group, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands; Philips Research, High Tech Campus 34, 5656 AE Eindhoven, The Netherlands
Jan de Gier: Applied Cryptography and Quantum Algorithms, TNO, 96800, 2509 JE Postbus, The Hague, The Netherlands
Serge Fehr: Cryptology Group, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands; Mathematical Institute, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
Thijs Veugen: Applied Cryptography and Quantum Algorithms, TNO, 96800, 2509 JE Postbus, The Hague, The Netherlands; Cryptology Group, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
Maran van Heesch: Applied Cryptography and Quantum Algorithms, TNO, 96800, 2509 JE Postbus, The Hague, The Netherlands
Daniël Worm: Applied Cryptography and Quantum Algorithms, TNO, 96800, 2509 JE Postbus, The Hague, The Netherlands
Andrea De Luca: Department of Medical Biotechnologies, University of Siena and Siena University Hospital, Viale Mario Bracci 16, 53100 Siena, Italy
Ronald Cramer: Cryptology Group, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands; Mathematical Institute, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
Peter M.A. Sloot: Institute for Advanced Study, University of Amsterdam, Oude Turfmarkt 147, 1012 GC Amsterdam, The Netherlands; Complexity Institute, Nanyang Technological University, Academic Building North, Level 1 Section B Unit No. 7 (ABN-01B-07), 61 Nanyang Drive, 637335 Singapore, Singapore; Advanced Computing, ITMO University, Lomonosova street 9, 191002 Saint Petersburg, Russia

Buchlak QD, Esmaili N, Bennett C, Farrokhi F. Natural Language Processing Applications in the Clinical Neurosciences: A Machine Learning Augmented Systematic Review. ACTA NEUROCHIRURGICA. SUPPLEMENT 2022;134:277-289. [PMID: 34862552 DOI: 10.1007/978-3-030-85292-4_32] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]

Kuo TT, Bath T, Ma S, Pattengale N, Yang M, Cao Y, Hudson CM, Kim J, Post K, Xiong L, Ohno-Machado L. Benchmarking blockchain-based gene-drug interaction data sharing methods: A case study from the iDASH 2019 secure genome analysis competition blockchain track. Int J Med Inform 2021;154:104559. [PMID: 34474309 PMCID: PMC9933142 DOI: 10.1016/j.ijmedinf.2021.104559] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Revised: 07/24/2021] [Accepted: 07/27/2021] [Indexed: 01/11/2023]

Callahan A, Polony V, Posada JD, Banda JM, Gombar S, Shah NH. ACE: the Advanced Cohort Engine for searching longitudinal patient records. J Am Med Inform Assoc 2021;28:1468-1479. [PMID: 33712854 PMCID: PMC8279796 DOI: 10.1093/jamia/ocab027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2020] [Accepted: 02/23/2021] [Indexed: 01/02/2023] Open

Forsch N, Govil S, Perry JC, Hegde S, Young AA, Omens JH, McCulloch AD. Computational analysis of cardiac structure and function in congenital heart disease: Translating discoveries to clinical strategies. JOURNAL OF COMPUTATIONAL SCIENCE 2021;52:101211. [PMID: 34691293 PMCID: PMC8528218 DOI: 10.1016/j.jocs.2020.101211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]

Kuo TT, Gabriel RA, Cidambi KR, Ohno-Machado L. EXpectation Propagation LOgistic REgRession on permissioned blockCHAIN (ExplorerChain): decentralized online healthcare/genomics predictive model learning. J Am Med Inform Assoc 2021;27:747-756. [PMID: 32364235 PMCID: PMC7309256 DOI: 10.1093/jamia/ocaa023] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2019] [Revised: 02/11/2020] [Accepted: 02/24/2020] [Indexed: 11/19/2022] Open

Abstract

Objective

Predicting patient outcomes using healthcare/genomics data is an increasingly popular/important area. However, some diseases are rare and require data from multiple institutions to construct generalizable models. To address institutional data protection policies, many distributed methods keep the data local but rely on a central server for coordination, which introduces risks such as a single point of failure. We focus on providing an alternative based on a decentralized approach. We introduce the idea of using blockchain technology for this purpose, with a brief description of its potential advantages/disadvantages.

Materials and Methods

We explain how our proposed EXpectation Propagation LOgistic REgRession on Permissioned blockCHAIN (ExplorerChain) can achieve the same results when compared to a distributed model that uses a central server on 3 healthcare/genomic datasets, and what trade-offs need to be considered when using centralized/decentralized methods. We explain how the use of blockchain technology can help decrease some of the problems encountered in decentralized methods.

Results

We showed that the discrimination power of ExplorerChain can be statistically similar to that of its central server-based counterpart. While ExplorerChain inherited some benefits of blockchain, its running time was slightly longer.

Discussion

ExplorerChain has the same prerequisites as a distributed model with a centralized server for coordination. In a manner similar to secure multi-party computation strategies, it assumes that participating institutions are honest, but “curious.”

Conclusion

When evaluated on relatively small datasets, results suggest that ExplorerChain, which combines artificial intelligence and blockchain technologies, performs as well as a central server-based method, and may avoid some risks at the cost of efficiency.

Kuo TT, Kim J, Gabriel RA. Privacy-preserving model learning on a blockchain network-of-networks. J Am Med Inform Assoc 2021;27:343-354. [PMID: 31943009 PMCID: PMC7025358 DOI: 10.1093/jamia/ocz214] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2019] [Revised: 11/04/2019] [Accepted: 12/02/2019] [Indexed: 01/07/2023] Open

Abstract

Objective

To facilitate clinical/genomic/biomedical research, constructing generalizable predictive models using cross-institutional methods while protecting privacy is imperative. However, state-of-the-art methods assume a “flattened” topology, while real-world research networks may consist of “network-of-networks” which can imply practical issues including training on small data for rare diseases/conditions, prioritizing locally trained models, and maintaining models for each level of the hierarchy. In this study, we focus on developing a hierarchical approach to inherit the benefits of the privacy-preserving methods, retain the advantages of adopting blockchain, and address practical concerns on a research network-of-networks.

Materials and Methods

We propose a framework to combine level-wise model learning, blockchain-based model dissemination, and a novel hierarchical consensus algorithm for model ensemble. We developed an example implementation, HierarchicalChain (hierarchical privacy-preserving modeling on blockchain), evaluated it on 3 healthcare/genomic datasets, and compared its predictive correctness, number of learning iterations, and execution time with those of a state-of-the-art method designed for a flattened network topology.

Results

HierarchicalChain improves predictive correctness for small training datasets and provides correctness comparable to that of the competing method, with more learning iterations and similar per-iteration execution time. It inherits the benefits of privacy-preserving learning and the advantages of blockchain technology, and immutably records models for each level.

Discussion

HierarchicalChain is independent of the core privacy-preserving learning method, as well as of the underlying blockchain platform. Further studies are warranted for various types of network topology, complex data, and privacy concerns.

Conclusion

We demonstrated the potential of utilizing the information from the hierarchical network-of-networks topology to improve prediction.

Amirmahani F, Ebrahimi N, Molaei F, Faghihkhorasani F, Jamshidi Goharrizi K, Mirtaghi SM, Borjian‐Boroujeni M, Hamblin MR. Approaches for the integration of big data in translational medicine: single‐cell and computational methods. Ann N Y Acad Sci 2021;1493:3-28. [DOI: 10.1111/nyas.14544] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2020] [Revised: 10/31/2020] [Accepted: 11/12/2020] [Indexed: 12/11/2022]

Al-Ebbini L, Khabour OF, Alzoubi KH, Alkaraki AK. Biomedical Data Sharing Among Researchers: A Study from Jordan. J Multidiscip Healthc 2020;13:1669-1676. [PMID: 33262602 PMCID: PMC7695599 DOI: 10.2147/jmdh.s284294] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2020] [Accepted: 10/22/2020] [Indexed: 12/02/2022] Open

Geleijnse G, Chiang RCJ, Sieswerda M, Schuurman M, Lee KC, van Soest J, Dekker A, Lee WC, Verbeek XAAM. Prognostic factors analysis for oral cavity cancer survival in the Netherlands and Taiwan using a privacy-preserving federated infrastructure. Sci Rep 2020;10:20526. [PMID: 33239719 PMCID: PMC7688977 DOI: 10.1038/s41598-020-77476-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2020] [Accepted: 11/09/2020] [Indexed: 11/24/2022] Open

Kuo TT. The anatomy of a distributed predictive modeling framework: online learning, blockchain network, and consensus algorithm. JAMIA Open 2020;3:201-208. [PMID: 32734160 PMCID: PMC7382618 DOI: 10.1093/jamiaopen/ooaa017] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2020] [Revised: 04/21/2020] [Accepted: 04/29/2020] [Indexed: 11/23/2022] Open

Abstract

Objective

Cross-institutional distributed healthcare/genomic predictive modeling is an emerging technology that fulfills both the need of building a more generalizable model and of protecting patient data by only exchanging the models but not the patient data. In this article, the implementation details are presented for one specific blockchain-based approach, ExplorerChain, from a software development perspective. The healthcare/genomic use cases of myocardial infarction, cancer biomarker, and length of hospitalization after surgery are also described.

Materials and Methods

ExplorerChain’s 3 main technical components, including online machine learning, metadata of transaction, and the Proof-of-Information-Timed (PoINT) algorithm, are introduced in this study. Specifically, the 3 algorithms (ie, core, new network, and new site/data) are described in detail.

Results

ExplorerChain was implemented and its design details were illustrated, especially the development configurations in a practical setting. The system architecture and programming languages are also introduced. The code was released in an open source repository available at https://github.com/tsungtingkuo/explorerchain.

Discussion

The design considerations of the semi-trust assumption, data format normalization, and non-determinism were discussed. The limitations of the implementation include a fixed number of participating sites, limited join-or-leave capability during initialization, advanced privacy technology yet to be included, and the need for further investigation of ethical, legal, and social implications.

Conclusion

This study can serve as a reference for the researchers who would like to implement and even deploy blockchain technology. Furthermore, the off-the-shelf software can also serve as a cornerstone to accelerate the development and investigation of future healthcare/genomic blockchain studies.

Conboy C. Consent and Privacy in the Era of Precision Medicine and Biobanking Genomic Data. AMERICAN JOURNAL OF LAW & MEDICINE 2020;46:167-187. [PMID: 32659188 DOI: 10.1177/0098858820933493] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]

Kuo TT, Gabriel RA, Ohno-Machado L. Fair compute loads enabled by blockchain: sharing models by alternating client and server roles. J Am Med Inform Assoc 2020;26:392-403. [PMID: 30892656 PMCID: PMC7787356 DOI: 10.1093/jamia/ocy180] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2018] [Revised: 10/16/2018] [Accepted: 12/02/2018] [Indexed: 11/28/2022] Open

Abstract

Objective

Decentralized privacy-preserving predictive modeling enables multiple institutions to learn a more generalizable model on healthcare or genomic data by sharing the partially trained models instead of patient-level data, while avoiding risks such as a single point of control. State-of-the-art blockchain-based methods remove the “server” role but can be less accurate than models that rely on a server. Therefore, we aim to develop a general model sharing framework to preserve predictive correctness, mitigate the risks of a centralized architecture, and compute the models in a fair way.

Materials and Methods

We propose a framework that includes both “server” and “client” roles to preserve correctness. We adopt a blockchain network to obtain the benefits of decentralization, alternating the roles for each site to ensure computational fairness. We also developed GloreChain (Grid Binary LOgistic REgression on Permissioned BlockChain) as a concrete example, and compared it to a centralized algorithm on 3 healthcare or genomic datasets to evaluate predictive correctness, number of learning iterations, and execution time.

Results

GloreChain performs exactly the same as the centralized method in terms of correctness and number of iterations. It inherits the advantages of blockchain, at the cost of increased time to reach a consensus model.

Discussion

Our framework is general and flexible, and can also address intrinsic challenges of blockchain networks. Further investigations will focus on higher-dimensional datasets, additional use cases, privacy-preserving quality concerns, and ethical, legal, and social implications.

Conclusions

Our framework provides a promising potential for institutions to learn a predictive model based on healthcare or genomic data in a privacy-preserving and decentralized way.

Chehab K, Kalboussi A, Hadj Kacem A. Study of Healthcare Professionals’ Interaction in the Patient Records Based on Annotations. LECTURE NOTES IN COMPUTER SCIENCE 2020. [PMCID: PMC7313271 DOI: 10.1007/978-3-030-51517-1_28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Esmaeilzadeh P, Mirzaei T. The Potential of Blockchain Technology for Health Information Exchange: Experimental Study From Patients' Perspectives. J Med Internet Res 2019;21:e14184. [PMID: 31223119 PMCID: PMC6610459 DOI: 10.2196/14184] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2019] [Revised: 05/12/2019] [Accepted: 05/20/2019] [Indexed: 01/22/2023] Open

Abstract

Background

Nowadays, a number of mechanisms and tools are being used by health care organizations and physicians to electronically exchange the personal health information of patients. The main objectives of different methods of health information exchange (HIE) are to reduce health care costs, minimize medical errors, and improve the coordination of interorganizational information exchange across health care entities. The main challenges associated with the common HIE systems are privacy concerns, security risks, low visibility of system transparency, and lack of patient control. Blockchain technology is likely to disrupt the current information exchange models utilized in the health care industry.

Objective

Little is known about patients’ perceptions and attitudes toward the implementation of blockchain-enabled HIE networks, and it is still not clear if patients (as one of the main HIE stakeholders) are likely to opt in to the applications of this technology in HIE initiatives. Thus, this study aimed at exploring the core value of blockchain technology in the health care industry from health care consumers’ views.

Methods

To recognize the potential applications of blockchain technology in health care practices, we designed 16 information exchange scenarios for controlled Web-based experiments. Overall, 2013 respondents participated in 16 Web-based experiments. Each experiment described an information exchange condition characterized by 4 exchange mechanisms (ie, direct, lookup, patient-centered, and blockchain), 2 types of health information (ie, sensitive vs nonsensitive), and 2 types of privacy policy (weak vs strong).
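The 16 scenarios described above follow from crossing the three factors named in the abstract (4 exchange mechanisms × 2 information types × 2 privacy policies). A minimal sketch of that full factorial enumeration is shown here; the variable and key names are illustrative assumptions, not taken from the study's materials.

```python
from itertools import product

# Factor levels as listed in the abstract; the identifiers are hypothetical.
mechanisms = ["direct", "lookup", "patient-centered", "blockchain"]
information_types = ["sensitive", "nonsensitive"]
privacy_policies = ["weak", "strong"]


def build_scenarios():
    """Return every combination of mechanism, information type, and policy."""
    return [
        {"mechanism": m, "information": i, "policy": p}
        for m, i, p in product(mechanisms, information_types, privacy_policies)
    ]


scenarios = build_scenarios()
print(len(scenarios))  # 4 * 2 * 2 = 16
```

Each dictionary corresponds to one experimental condition; how respondents were assigned to conditions is not stated in the abstract and is not modeled here.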

Results

The findings show that there are significant differences in patients’ perceptions of various exchange mechanisms with regard to patient privacy concern, trust in competency and integrity, opt-in intention, and willingness to share information. Interestingly, participants hold a favorable attitude toward the implementation of blockchain-based exchange mechanisms for privacy protection, coordination, and information exchange purposes. This study proposed the potentials and limitations of a blockchain-based attempt in the HIE context.

Conclusions

The results of this research should be of interest to both academics and practitioners. The findings propose potential limitations of a blockchain-based HIE that should be addressed by health care organizations to exchange personal health information in a secure and private manner. This study can contribute to the research in the blockchain area and enrich the literature on the use of blockchain in HIE efforts. Practitioners can also identify how to leverage the benefit of blockchain to promote HIE initiatives nationwide.

Gu W, Yildirimman R, Van der Stuyft E, Verbeeck D, Herzinger S, Satagopam V, Barbosa-Silva A, Schneider R, Lange B, Lehrach H, Guo Y, Henderson D, Rowe A. Data and knowledge management in translational research: implementation of the eTRIKS platform for the IMI OncoTrack consortium. BMC Bioinformatics 2019;20:164. [PMID: 30935364 PMCID: PMC6444691 DOI: 10.1186/s12859-019-2748-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2018] [Accepted: 03/18/2019] [Indexed: 01/04/2023] Open

Abstract

Background

For large international research consortia, such as those funded by the European Union’s Horizon 2020 programme or the Innovative Medicines Initiative, good data coordination practices and tools are essential for the successful collection, organization and analysis of the resulting data. Research consortia are attempting ever more ambitious science to better understand disease, by leveraging technologies such as whole genome sequencing, proteomics, patient-derived biological models and computer-based systems biology simulations.

Results

The IMI eTRIKS consortium is charged with the task of developing an integrated knowledge management platform capable of supporting the complexity of the data generated by such research programmes. In this paper, using the example of the OncoTrack consortium, we describe a typical use case in translational medicine. The tranSMART knowledge management platform was implemented to support data from observational clinical cohorts, drug response data from cell culture models and drug response data from mouse xenograft tumour models. The high dimensional (omics) data from the molecular analyses of the corresponding biological materials were linked to these collections, so that users could browse and analyse these to derive candidate biomarkers.

Conclusions

In all these steps, data mapping, linking and preparation are handled automatically by the tranSMART integration platform. Therefore, researchers without specialist data handling skills can focus directly on the scientific questions, without spending undue effort on processing the data and data integration, which are otherwise a burden and the most time-consuming part of translational research data analysis.

Electronic supplementary material

The online version of this article (10.1186/s12859-019-2748-y) contains supplementary material, which is available to authorized users.

Yoshida K, Gruber S, Fireman BH, Toh S. Comparison of privacy-protecting analytic and data-sharing methods: A simulation study. Pharmacoepidemiol Drug Saf 2018;27:1034-1041. [PMID: 30022561 PMCID: PMC6135666 DOI: 10.1002/pds.4615] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2017] [Revised: 04/09/2018] [Accepted: 06/11/2018] [Indexed: 11/06/2022]

Abstract

PURPOSE

Privacy-protecting analytic and data-sharing methods that minimize the disclosure risk of sensitive information are increasingly important due to the growing interest in utilizing data across multiple sources. We conducted a simulation study to examine how avoiding sharing individual-level data in a distributed data network can affect analytic results.

METHODS

The base scenario had four sites of varying sizes with 5% outcome incidence, 50% treatment prevalence, and seven confounders. We varied treatment prevalence, outcome incidence, treatment effect, site size, number of sites, and covariate distribution. Confounding adjustment was conducted using propensity score or disease risk score. We compared analyses of three types of aggregate-level data requested from sites: risk-set, summary-table, or effect-estimate data (meta-analysis) with benchmark results of analysis of pooled individual-level data. We assessed bias and precision of hazard ratio estimates as well as the accuracy of standard error estimates.
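To illustrate the effect-estimate (meta-analysis) option, in which each site shares only a log hazard ratio and its standard error, the sketch below pools site-level estimates with fixed-effect inverse-variance weighting. The abstract does not state which pooling method was used, so the weighting scheme, the function name, and the four site values are assumptions made for illustration only.

```python
import math


def pool_fixed_effect(log_hrs, std_errors):
    """Inverse-variance (fixed-effect) pooling of site-level log hazard ratios.

    Returns the pooled log hazard ratio and its standard error.
    """
    if len(log_hrs) != len(std_errors) or not log_hrs:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical estimates from four sites (not data from the study).
site_log_hrs = [math.log(1.8), math.log(2.1), math.log(1.9), math.log(2.4)]
site_ses = [0.10, 0.15, 0.20, 0.30]

pooled_log_hr, pooled_se = pool_fixed_effect(site_log_hrs, site_ses)
print(f"pooled HR = {math.exp(pooled_log_hr):.2f}, SE(log HR) = {pooled_se:.3f}")
```

Only two numbers per site cross institutional boundaries in this approach, which is why it sits at the most aggregated end of the data-sharing options compared in the study.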

RESULTS

All the aggregate-level data-sharing approaches, regardless of confounding adjustment methods, successfully approximated pooled individual-level data analysis in most simulation scenarios. Meta-analysis showed minor bias when using inverse probability of treatment weights (IPTW) in infrequent exposure (5%), rare outcome (0.01%), and small site (5,000 patients) settings. SE estimates became less accurate for IPTW risk-set approach with less frequent exposure and for propensity score-matching meta-analysis approach with rare outcomes.

CONCLUSIONS

Overall, we found that we can avoid sharing individual-level data and obtain valid results in many settings, although care must be taken with meta-analysis approach in infrequent exposure and rare outcome scenarios, particularly when confounding adjustment is performed with IPTW.

Vaidya J, Shafiq B, Asani M, Adam N, Jiang X, Ohno-Machado L. A Scalable Privacy-preserving Data Generation Methodology for Exploratory Analysis. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018;2017:1695-1704. [PMID: 29854240 PMCID: PMC5977652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Dankar FK, Ptitsyn A, Dankar SK. The development of large-scale de-identified biomedical databases in the age of genomics-principles and challenges. Hum Genomics 2018;12:19. [PMID: 29636096 PMCID: PMC5894154 DOI: 10.1186/s40246-018-0147-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2017] [Accepted: 03/15/2018] [Indexed: 12/24/2022] Open

Simplifying research access to genomics and health data with Library Cards. Sci Data 2018. [PMID: 29537396 PMCID: PMC5851345 DOI: 10.1038/sdata.2018.39] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open

Kuo TT, Kim HE, Ohno-Machado L. Blockchain distributed ledger technologies for biomedical and health care applications. J Am Med Inform Assoc 2018;24:1211-1220. [PMID: 29016974 PMCID: PMC6080687 DOI: 10.1093/jamia/ocx068] [Citation(s) in RCA: 260] [Impact Index Per Article: 43.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2017] [Accepted: 06/30/2017] [Indexed: 11/16/2022] Open

Wei W, Ji Z, He Y, Zhang K, Ha Y, Li Q, Ohno-Machado L. Finding relevant biomedical datasets: the UC San Diego solution for the bioCADDIE Retrieval Challenge. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018;2018:4939515. [PMID: 29688374 PMCID: PMC5861401 DOI: 10.1093/database/bay017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/16/2017] [Accepted: 01/30/2018] [Indexed: 01/28/2023]

Wang Y, Wang L, Rastegar-Mojarad M, Moon S, Shen F, Afzal N, Liu S, Zeng Y, Mehrabi S, Sohn S, Liu H. Clinical information extraction applications: A literature review. J Biomed Inform 2018;77:34-49. [PMID: 29162496 PMCID: PMC5771858 DOI: 10.1016/j.jbi.2017.11.011] [Citation(s) in RCA: 316] [Impact Index Per Article: 52.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2017] [Revised: 11/01/2017] [Accepted: 11/17/2017] [Indexed: 12/24/2022]

Christoph J, Knell C, Bosserhoff A, Naschberger E, Stürzl M, Rübner M, Seuss H, Ruh M, Prokosch HU, Sedlmayr B. Usability and Suitability of the Omics-Integrating Analysis Platform tranSMART for Translational Research and Education. Appl Clin Inform 2017;8:1173-1183. [PMID: 29270954 DOI: 10.4338/aci-2017-05-ra-0085] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023] Open

Abstract

BACKGROUND

Platforms like tranSMART assist researchers in analyzing clinical and corresponding omics data. Usability is an important, yet often overlooked, factor affecting adoption and meaningful use. Analyses of the specific needs of translational researchers and considerations about the application of such platforms for education are rare.

OBJECTIVES

The aim of this study was to test whether tranSMART can be used in education and how well medical students and professional researchers can handle it; to identify which kind of translational researchers (in terms of skills, experienced limitations, and available data) can take advantage of tranSMART; and to evaluate the usability and to generate recommendations for improvements.

METHODS

An online test was completed by medical students (N = 109) and researchers (N = 26). The test comprised 13 tasks in the context of four typical research scenarios based on experimental and clinical data. A web questionnaire was provided to identify both the needs and the conditions of research as well as to evaluate the system's usability based on the "System Usability Scale" (SUS).

RESULTS

Students and researchers were able to handle tranSMART well and coped with most scenarios: cohort identification, data exploration, hypothesis generation, and hypothesis validation were answered with a rate of correctness between 82 and 100%. Of the total, 72.2% of the teaching researchers considered tranSMART suitable for their lessons and 84.6% of the researchers considered the platform useful for their daily work; 65.4% of the researchers named the nonavailability of a platform like tranSMART as a restriction on their research. The usability was rated "acceptable" with a SUS of 70.8.

CONCLUSION

tranSMART is potentially suitable for education purposes and fits most of the needs of translational researchers. Improvements are needed on the presentation of analysis results and on the guidance of users through the analysis, especially to ensure the compliance of the analysis with the requirements of statistical testing.

Martin-Sanchez FJ, Aguiar-Pulido V, Lopez-Campos GH, Peek N, Sacchi L. Secondary Use and Analysis of Big Data Collected for Patient Care. Yearb Med Inform 2017;26:28-37. [PMID: 28480474 PMCID: PMC6239231 DOI: 10.15265/iy-2017-008] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 01/29/2023]

Jagodnik KM, Koplev S, Jenkins SL, Ohno-Machado L, Paten B, Schurer SC, Dumontier M, Verborgh R, Bui A, Ping P, McKenna NJ, Madduri R, Pillai A, Ma'ayan A. Developing a framework for digital objects in the Big Data to Knowledge (BD2K) commons: Report from the Commons Framework Pilots workshop. J Biomed Inform 2017;71:49-57. [PMID: 28501646 DOI: 10.1016/j.jbi.2017.05.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/15/2016] [Revised: 05/01/2017] [Accepted: 05/08/2017] [Indexed: 12/11/2022]

Affiliation(s)

Kathleen M Jagodnik Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
Simon Koplev Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
Sherry L Jenkins Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
Lucila Ohno-Machado Health System Department of Biomedical Informatics, University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92083, USA; Health Services Research, San Diego Veterans Administration Health System, San Diego, CA 92083, USA
Benedict Paten UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High St., Santa Cruz, CA 95060, USA
Stephan C Schurer Department of Molecular and Cellular Pharmacology, University of Miami, 33146; 1120 NW 14th Street, CRB 650 (M-857), Miami, FL 33136, USA
Michel Dumontier Institute for Data Science, Universiteit Maastricht, Minderbroedersberg 4-6, 6211 LK Maastricht, Netherlands
Ruben Verborgh Ghent University - iMinds Research Foundation Flanders, St. Pietersnieuwstraat 33, 9000 Gent, Belgium
Alex Bui Department of Radiological Sciences, UCLA School of Medicine, Los Angeles, CA 90095, USA; Department of Bioengineering, UCLA Henri Samueli School of Engineering, Los Angeles, CA 90095, USA
Peipei Ping Departments of Physiology, Medicine, and Bioinformatics, UCLA School of Medicine, Los Angeles, CA 90095, USA
Neil J McKenna Department of Molecular and Cellular Biology, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
Ravi Madduri Department of Mathematics and Computer Science, Argonne National Laboratory, 9700 S. Cass Avenue, Argonne, IL 60439, USA
Ajay Pillai Division of Genome Sciences, National Human Genome Research Institute, National Institutes of Health, 31 Center Drive, MSC 2152, 9000 Rockville Pike, Bethesda, MD 20892, USA
Avi Ma'ayan Department of Pharmacological Sciences, BD2K-LINCS Data Coordination and Integration Center, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA.

DataSHIELD – New Directions and Dimensions. DATA SCIENCE JOURNAL 2017. [DOI: 10.5334/dsj-2017-021] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 11/20/2022]

Chen X, Fann YC, McAuliffe M, Vismer D, Yang R. Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation. JMIR Med Inform 2017;5:e2. [PMID: 28213343 PMCID: PMC5336604 DOI: 10.2196/medinform.5054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/19/2015] [Revised: 07/04/2016] [Accepted: 07/04/2016] [Indexed: 11/26/2022]

Abstract

Background

As one of the several effective solutions for personal privacy protection, a global unique identifier (GUID) is linked with hash codes that are generated from combinations of personally identifiable information (PII) by a one-way hash algorithm. On the GUID server, no PII is permitted to be stored, and only GUID and hash codes are allowed. The quality of PII entry is critical to the GUID system.

Objective

The goal of our study was to explore a method of checking questionable entry of PII in this context without using or sending any portion of PII while registering a subject.

Methods

According to the principle of the GUID system, all possible combination patterns of PII fields were analyzed and used to generate hash codes, which were stored on the GUID server. Based on the matching rules of the GUID system, an error-checking algorithm was developed using set theory to check PII entry errors. We selected 200,000 simulated individuals with randomly planted errors to evaluate the proposed algorithm. These errors were placed in the required PII fields or optional PII fields. The performance of the proposed algorithm was also tested in the registering system of study subjects.

Results

Of the 127,700 error-planted subjects, 114,464 (89.64%) can still be identified as the previously registered subject, and the remaining 13,236 (10.36%) are classified as new subjects. As expected, 100% of nonidentified subjects had errors within the required PII fields. The possibility that a subject is identified is related to the count and the type of incorrect PII fields. For all identified subjects, their errors can be found by the proposed algorithm. The scope of questionable PII fields is also associated with the count and the type of incorrect PII fields. The best situation is to precisely find the exact incorrect PII fields, and the worst situation is to shrink the questionable scope only to a set of 13 PII fields. In the application, the proposed algorithm can give a hint of questionable PII entry and perform as an effective tool.

Conclusions

The GUID system has high error tolerance and may correctly identify and associate a subject even with a few PII field errors. Correct data entry, especially for required PII fields, is critical to avoiding false splits. In the context of one-way hash transformation, the questionable input of PII may be identified by applying set theory operators based on the hash codes. The count and the type of incorrect PII fields play an important role in identifying a subject and locating questionable PII fields.
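
The idea of locating questionable fields from hash codes alone can be illustrated with a minimal sketch. This is not the authors' algorithm or the GUID system's actual matching rules: the field names, the choice of SHA-256, the combination size, and the helper names `hash_codes` and `questionable_fields` are all assumptions made for illustration. The sketch hashes every combination of fields, then uses set operations on the combinations whose hashes still match to narrow down which fields may have been mistyped.

```python
import hashlib
from itertools import combinations

# Hypothetical PII fields; the real GUID system defines its own set.
FIELDS = ("first_name", "last_name", "birth_date", "birth_city")


def hash_codes(record, size=3):
    """Map each field combination to a one-way hash of its normalized values."""
    codes = {}
    for combo in combinations(FIELDS, size):
        joined = "|".join(str(record[f]).strip().lower() for f in combo)
        codes[combo] = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return codes


def questionable_fields(stored_codes, new_record, size=3):
    """Return the fields not covered by any matching combination.

    Returns None when no combination matches, i.e. the entry would be
    treated as a new subject in this simplified model.
    """
    new_codes = hash_codes(new_record, size)
    matched = [set(c) for c, h in new_codes.items() if stored_codes.get(c) == h]
    if not matched:
        return None
    trusted = set().union(*matched)  # fields confirmed by at least one match
    return set(FIELDS) - trusted


original = {"first_name": "Ana", "last_name": "Silva",
            "birth_date": "1990-02-03", "birth_city": "Porto"}
stored = hash_codes(original)            # only hashes are kept server-side
typo = dict(original, birth_city="Potro")
print(questionable_fields(stored, typo))
```

In this toy setup a single mistyped field is isolated exactly, while two mistyped fields out of four leave no matching combination, which mirrors the abstract's point that the count of incorrect fields determines whether a subject can still be identified.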

Garvin JH, Kalsy M, Brandt C, Luther SL, Divita G, Coronado G, Redd D, Christensen C, Hill B, Kelly N, Treitler QZ. An Evolving Ecosystem for Natural Language Processing in Department of Veterans Affairs. J Med Syst 2017;41:32. [PMID: 28050745 DOI: 10.1007/s10916-016-0681-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/29/2016] [Accepted: 12/22/2016] [Indexed: 11/26/2022]

Affiliation(s)

Jennifer H Garvin IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; GRECC SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Division of Epidemiology, University of Utah School of Medicine, 295 Chipeta Way, Salt Lake City, UT, 84132, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA
Megha Kalsy IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA
Cynthia Brandt VA Connecticut Healthcare System, 950 Campbell Avenue, West Haven, CT, USA; Yale School of Medicine, 333 Cedar St., New Haven, CT, USA
Stephen L Luther James A Haley Veterans Hospital, 13000 Bruce B. Downs Blvd, Tampa, FL, USA
Guy Divita IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA
Gregory Coronado IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA
Doug Redd IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA; Department of Clinical Research and Leadership, George Washington University School of Medicine and Health Sciences, 2100 Pennsylvania Ave, NW, Washington, DC, 20037, USA
Carrie Christensen IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA
Brent Hill IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA
Natalie Kelly IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA
Qing Zeng Treitler IDEAS Center SLC VA Healthcare System, 500 Foothill Drive, Salt Lake City, UT, 84148, USA; Department of Biomedical Informatics, University of Utah School of Medicine, 421 Wakara Way, Ste. 140, Salt Lake City, UT, 84108, USA; Department of Clinical Research and Leadership, George Washington University School of Medicine and Health Sciences, 2100 Pennsylvania Ave, NW, Washington, DC, 20037, USA

Skolariki K, Avramouli A. The Use of Translational Research Platforms in Clinical and Biomedical Data Exploration. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2017;988:301-311. [PMID: 28971409 DOI: 10.1007/978-3-319-56246-9_25] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]

Satagopam V, Gu W, Eifes S, Gawron P, Ostaszewski M, Gebel S, Barbosa-Silva A, Balling R, Schneider R. Integration and Visualization of Translational Medicine Data for Better Understanding of Human Diseases. BIG DATA 2016;4:97-108. [PMID: 27441714 PMCID: PMC4932659 DOI: 10.1089/big.2015.0057] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 06/06/2023]

Abstract

Translational medicine is a domain turning results of basic life science research into new tools and methods in a clinical environment, for example, as new diagnostics or therapies. Nowadays, the process of translation is supported by large amounts of heterogeneous data ranging from medical data to a whole range of -omics data. It is not only a great opportunity but also a great challenge, as translational medicine big data is difficult to integrate and analyze, and requires the involvement of biomedical experts for the data processing. We show here that visualization and interoperable workflows, combining multiple complex steps, can address at least parts of the challenge. In this article, we present an integrated workflow for exploration, analysis, and interpretation of translational medicine data in the context of human health. Three Web services (tranSMART, a Galaxy Server, and a MINERVA platform) are combined into one big data pipeline. Native visualization capabilities enable the biomedical experts to get a comprehensive overview and control over separate steps of the workflow. The capabilities of tranSMART enable a flexible filtering of multidimensional integrated data sets to create subsets suitable for downstream processing. A Galaxy Server offers visually aided construction of analytical pipelines, with the use of existing or custom components. A MINERVA platform supports the exploration of health and disease-related mechanisms in a contextualized analytical visualization system. We demonstrate the utility of our workflow by illustrating its subsequent steps using an existing data set, for which we propose a filtering scheme, an analytical pipeline, and a corresponding visualization of analytical results. The workflow is available as a sandbox environment, where readers can work with the described setup themselves. Overall, our work shows how visualization and interfacing of big data processing services facilitate exploration, analysis, and interpretation of translational medicine data.

Rance B, Canuel V, Countouris H, Laurent-Puig P, Burgun A. Integrating Heterogeneous Biomedical Data for Cancer Research: the CARPEM infrastructure. Appl Clin Inform 2016;7:260-74. [PMID: 27437039 PMCID: PMC4941838 DOI: 10.4338/aci-2015-09-ra-0125] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 10/02/2015] [Accepted: 02/07/2016] [Indexed: 01/19/2023]

Doan S, Maehara CK, Chaparro JD, Lu S, Liu R, Graham A, Berry E, Hsu CN, Kanegaye JT, Lloyd DD, Ohno-Machado L, Burns JC, Tremoulet AH. Building a Natural Language Processing Tool to Identify Patients With High Clinical Suspicion for Kawasaki Disease from Emergency Department Notes. Acad Emerg Med 2016;23:628-36. [PMID: 26826020 DOI: 10.1111/acem.12925] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 08/04/2015] [Revised: 11/29/2015] [Accepted: 12/30/2015] [Indexed: 11/26/2022]

Quintana Y. Challenges to Implementation of Global Translational Collaboration Platforms. MOJ PROTEOMICS & BIOINFORMATICS 2016;2:65. [PMID: 26798845 PMCID: PMC4717481 DOI: 10.15406/mojpb.2015.02.00065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]

Tremoulet AH, Dutkowski J, Sato Y, Kanegaye JT, Ling XB, Burns JC. Novel data-mining approach identifies biomarkers for diagnosis of Kawasaki disease. Pediatr Res 2015;78:547-53. [PMID: 26237629 PMCID: PMC4628575 DOI: 10.1038/pr.2015.137] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 01/05/2015] [Accepted: 04/17/2015] [Indexed: 11/30/2022]

Lu CL, Wang S, Ji Z, Wu Y, Xiong L, Jiang X, Ohno-Machado L. WebDISCO: a web service for distributed cox model learning without patient-level data sharing. J Am Med Inform Assoc 2015;22:1212-9. [PMID: 26159465 PMCID: PMC5009917 DOI: 10.1093/jamia/ocv083] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 11/24/2014] [Revised: 05/16/2015] [Accepted: 05/26/2015] [Indexed: 11/14/2022]

Abstract

OBJECTIVE

The Cox proportional hazards model is a widely used method for analyzing survival data. To achieve sufficient statistical power in a survival analysis, it usually requires a large amount of data. Data sharing across institutions could be a potential workaround for providing this added power.

METHODS AND MATERIALS

The authors develop a web service for distributed Cox model learning (WebDISCO), which focuses on the proof-of-concept and algorithm development for federated survival analysis. The sensitive patient-level data can be processed locally and only the less-sensitive intermediate statistics are exchanged to build a global Cox model. Mathematical derivation shows that the proposed distributed algorithm is identical to the centralized Cox model.

RESULTS

The authors evaluated the proposed framework at the University of California, San Diego (UCSD), Emory, and Duke. The experimental results show that both distributed and centralized models result in near-identical model coefficients with differences in the range [Formula: see text] to [Formula: see text]. The results confirm the mathematical derivation and show that the implementation of the distributed model can achieve the same results as the centralized implementation.

LIMITATION

The proposed method serves as a proof of concept, in which a publicly available dataset was used to evaluate the performance. The authors do not intend to suggest that this method can resolve policy and engineering issues related to the federated use of institutional data, but it should serve as evidence of the technical feasibility of the proposed approach.

CONCLUSIONS

WebDISCO (Web-based Distributed Cox Regression Model; https://webdisco.ucsd-dbmi.org:8443/cox/) provides a proof-of-concept web service that implements a distributed algorithm to conduct distributed survival analysis without sharing patient level data.
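
To make the exchange of "intermediate statistics" concrete, here is a minimal sketch of a federated Cox fit. It is an illustration under stated assumptions, not the WebDISCO implementation: it handles a single covariate, uses the Breslow approximation for ties, runs a fixed number of Newton-Raphson steps, and treats the set of distinct event times as shareable. The function names `site_stats` and `fit` are hypothetical. Each site returns only per-event-time sums, and the coordinator adds them up before updating the coefficient.

```python
import math


def site_stats(data, beta, times):
    """Compute per-event-time aggregates at one site.

    data is a list of (time, event, x) rows that never leave the site;
    only the summed quantities below are returned to the coordinator.
    """
    out = []
    for t in times:
        s0 = s1 = s2 = 0.0
        for time, _, x in data:
            if time >= t:                      # subject still at risk at t
                w = math.exp(beta * x)
                s0 += w
                s1 += x * w
                s2 += x * x * w
        d = sum(1 for time, e, _ in data if e and time == t)   # events at t
        sx = sum(x for time, e, x in data if e and time == t)  # covariate sum
        out.append((s0, s1, s2, d, sx))
    return out


def fit(sites, iters=25):
    """Newton-Raphson on the Cox partial likelihood using summed site stats."""
    times = sorted({time for data in sites for time, e, _ in data if e})
    beta = 0.0
    for _ in range(iters):
        per_site = [site_stats(data, beta, times) for data in sites]
        grad = info = 0.0
        for i in range(len(times)):
            s0, s1, s2, d, sx = (sum(s[i][k] for s in per_site) for k in range(5))
            grad += sx - d * s1 / s0
            info += d * (s2 / s0 - (s1 / s0) ** 2)
        beta += grad / info
    return beta


site_a = [(5, 1, 1.0), (8, 1, 0.0), (12, 0, 1.0), (3, 1, 0.0)]
site_b = [(4, 1, 1.0), (9, 0, 0.0), (7, 1, 1.0), (10, 1, 0.0)]
print(fit([site_a, site_b]), fit([site_a + site_b]))
```

Because the risk-set sums are additive across sites, the two printed coefficients agree up to floating-point rounding, which is the equivalence between distributed and centralized fitting that the abstract reports.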

Noor AM, Holmberg L, Gillett C, Grigoriadis A. Big Data: the challenge for small research groups in the era of cancer genomics. Br J Cancer 2015;113:1405-12. [PMID: 26492224 PMCID: PMC4815885 DOI: 10.1038/bjc.2015.341] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 04/02/2015] [Revised: 08/04/2015] [Accepted: 08/09/2015] [Indexed: 01/06/2023]

Wang S, Zhang Y, Dai W, Lauter K, Kim M, Tang Y, Xiong H, Jiang X. HEALER: homomorphic computation of ExAct Logistic rEgRession for secure rare disease variants analysis in GWAS. Bioinformatics 2015;32:211-8. [PMID: 26446135 DOI: 10.1093/bioinformatics/btv563] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 05/19/2015] [Accepted: 09/22/2015] [Indexed: 01/06/2023]

Kho AN, Cashy JP, Jackson KL, Pah AR, Goel S, Boehnke J, Humphries JE, Kominers SD, Hota BN, Sims SA, Malin BA, French DD, Walunas TL, Meltzer DO, Kaleba EO, Jones RC, Galanter WL. Design and implementation of a privacy preserving electronic health record linkage tool in Chicago. J Am Med Inform Assoc 2015;22:1072-80. [PMID: 26104741 PMCID: PMC5009931 DOI: 10.1093/jamia/ocv038] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 09/11/2014] [Revised: 02/25/2015] [Accepted: 03/26/2015] [Indexed: 11/12/2022]

Affiliation(s)

Abel N Kho Department of Medicine, and Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
John P Cashy Department of Medicine, and Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA; Department of Veterans Affairs, Pittsburgh PA
Kathryn L Jackson Department of Medicine, and Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
Adam R Pah Department of Medicine, and Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
Satyender Goel Department of Medicine, and Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
Jörn Boehnke Department of Economics, University of Chicago, Chicago, IL, USA
John Eric Humphries Department of Economics, University of Chicago, Chicago, IL, USA
Scott Duke Kominers Society of Fellows Department of Economics, Business School, Program For Evolutionary Dynamics, and Center for Research on Computation and Society, Harvard University, Cambridge, MA, USA
Bala N Hota Department of Medicine, Rush University Medical Center, Chicago, IL, USA
Shannon A Sims Department of Medicine, Rush University Medical Center, Chicago, IL, USA
Bradley A Malin Department of Biomedical Informatics, School of Medicine, and Department of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University, Nashville, TN, USA
Dustin D French Center for Healthcare Studies and Department of Ophthalmology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
Theresa L Walunas Department of Medicine, and Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
David O Meltzer Department of Veterans Affairs, Pittsburgh PA
Erin O Kaleba Alliance of Chicago Community Health Services, Chicago, IL, USA
Roderick C Jones Formerly of Chicago Department of Public Health, currently at Ann and Robert H. Lurie Children's Hospital, Chicago, IL, USA
William L Galanter University of Illinois Hospital and Health Sciences System, Chicago, IL, USA

Dyke SOM, Cheung WA, Joly Y, Ammerpohl O, Lutsik P, Rothstein MA, Caron M, Busche S, Bourque G, Rönnblom L, Flicek P, Beck S, Hirst M, Stunnenberg H, Siebert R, Walter J, Pastinen T. Epigenome data release: a participant-centered approach to privacy protection. Genome Biol 2015;16:142. [PMID: 26185018 PMCID: PMC4504083 DOI: 10.1186/s13059-015-0723-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 05/04/2015] [Accepted: 07/09/2015] [Indexed: 11/10/2022]

Affiliation(s)

Stephanie O M Dyke Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC, H3A 0G1, Canada.
Warren A Cheung Department of Human Genetics, McGill University and Genome Quebec Innovation Centre, Montreal, QC, H3A 0G1, Canada
Yann Joly Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC, H3A 0G1, Canada
Ole Ammerpohl Institute of Human Genetics, University Hospital Schleswig-Holstein, Campus Kiel & Christian-Albrechts-University Kiel, 24105, Kiel, Germany
Pavlo Lutsik Saarland University, 66123, Saarbrücken, Germany
Mark A Rothstein Institute for Bioethics, Health Policy and Law, University of Louisville School of Medicine, Louisville, KY, 40202, USA
Maxime Caron Department of Human Genetics, McGill University and Genome Quebec Innovation Centre, Montreal, QC, H3A 0G1, Canada
Stephan Busche Department of Human Genetics, McGill University and Genome Quebec Innovation Centre, Montreal, QC, H3A 0G1, Canada
Guillaume Bourque Department of Human Genetics, McGill University and Genome Quebec Innovation Centre, Montreal, QC, H3A 0G1, Canada
Lars Rönnblom Department of Medical Sciences, Science for Life Laboratory, Uppsala University, SE-751 85, Uppsala, Sweden
Paul Flicek European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
Stephan Beck Medical Genomics, UCL Cancer Institute, University College London, London, WC1E 6BT, UK
Martin Hirst Centre for High-Throughput Biology, University of British Columbia and Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, V5Z 4S6, Canada
Henk Stunnenberg Department of Molecular Biology, RIMLS, Faculty of Science, Radboud University, 6500 HB, Nijmegen, The Netherlands
Reiner Siebert Institute of Human Genetics, University Hospital Schleswig-Holstein, Campus Kiel & Christian-Albrechts-University Kiel, 24105, Kiel, Germany
Jörn Walter Saarland University, 66123, Saarbrücken, Germany
Tomi Pastinen Department of Human Genetics, McGill University and Genome Quebec Innovation Centre, Montreal, QC, H3A 0G1, Canada.

Meeker D, Jiang X, Matheny ME, Farcas C, D'Arcy M, Pearlman L, Nookala L, Day ME, Kim KK, Kim H, Boxwala A, El-Kareh R, Kuo GM, Resnic FS, Kesselman C, Ohno-Machado L. A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research. J Am Med Inform Assoc 2015;22:1187-95. [PMID: 26142423 PMCID: PMC4639714 DOI: 10.1093/jamia/ocv017] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 05/02/2014] [Accepted: 02/18/2015] [Indexed: 11/29/2022]

Abstract

Background

Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner.

Objective

The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies.

Materials and Methods

Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network.

Results

The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws.

Discussion and Conclusion

Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks.

Belle A, Thiagarajan R, Soroushmehr SMR, Navidi F, Beard DA, Najarian K. Big Data Analytics in Healthcare. BIOMED RESEARCH INTERNATIONAL 2015;2015:370194. [PMID: 26229957 PMCID: PMC4503556 DOI: 10.1155/2015/370194] [Citation(s) in RCA: 261] [Impact Index Per Article: 29.0] [Received: 01/05/2015] [Revised: 05/26/2015] [Accepted: 06/16/2015] [Indexed: 02/06/2023]

Toga AW, Dinov ID. Sharing big biomedical data. JOURNAL OF BIG DATA 2015;2:7. [PMID: 26929900 PMCID: PMC4768816 DOI: 10.1186/s40537-015-0016-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/28/2015] [Accepted: 04/28/2015] [Indexed: 06/05/2023]

Gallego B, Walter SR, Day RO, Dunn AG, Sivaraman V, Shah N, Longhurst CA, Coiera E. Bringing cohort studies to the bedside: framework for a ‘green button’ to support clinical decision-making. J Comp Eff Res 2015;4:191-197. [DOI: 10.2217/cer.15.12] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 11/21/2022]

Medina García R, Torres Serrano E, Segrelles Quilis JD, Blanquer Espert I, Martí Bonmatí L, Almenar Cubells D. A systematic approach for using DICOM structured reports in clinical processes: focus on breast cancer. J Digit Imaging 2015;28:132-45. [PMID: 25200428 PMCID: PMC4359202 DOI: 10.1007/s10278-014-9728-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/24/2022]

An electronic medical record system with treatment recommendations based on patient similarity. J Med Syst 2015;39:55. [PMID: 25762458 DOI: 10.1007/s10916-015-0237-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 11/08/2014] [Accepted: 03/02/2015] [Indexed: 10/23/2022]

Doan S, Conway M, Phuong TM, Ohno-Machado L. Natural language processing in biomedicine: a unified system architecture overview. Methods Mol Biol 2015;1168:275-94. [PMID: 24870142 DOI: 10.1007/978-1-4939-0847-9_16] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 11/26/2022]

Jiang X, Zhao Y, Wang X, Malin B, Wang S, Ohno-Machado L, Tang H. A community assessment of privacy preserving techniques for human genomes. BMC Med Inform Decis Mak 2014;14 Suppl 1:S1. [PMID: 25521230 PMCID: PMC4290799 DOI: 10.1186/1472-6947-14-s1-s1] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 11/15/2022]