Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kim H, Golub GH, Park H. Missing value estimation for DNA microarray gene expression data: local least squares imputation. Bioinformatics 2004;21:187-98. [PMID: 15333461 DOI: 10.1093/bioinformatics/bth499] [Citation(s) in RCA: 198] [Impact Index Per Article: 9.4] [Indexed: 11/13/2022]

Cited by Other Article(s)

Organick L, Chen YJ, Dumas Ang S, Lopez R, Liu X, Strauss K, Ceze L. Probing the physical limits of reliable DNA data retrieval. Nat Commun 2020;11:616. [PMID: 32001691 PMCID: PMC6992699 DOI: 10.1038/s41467-020-14319-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 05/28/2019] [Accepted: 12/16/2019] [Indexed: 12/31/2022]

Similarity-learning information-fusion schemes for missing data imputation. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2019.06.013] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 01/16/2023]

Nikfalazar S, Yeh CH, Bedingfield S, Khorshidi HA. Missing data imputation using decision trees and fuzzy clustering with iterative learning. Knowl Inf Syst 2019. [DOI: 10.1007/s10115-019-01427-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/29/2022]

Du Y, Han G, Quan Y, Yu Z, Wong HS, Chen CLP, Zhang J. Exploiting Global Low-Rank Structure and Local Sparsity Nature for Tensor Completion. IEEE Transactions on Cybernetics 2019;49:3898-3910. [PMID: 30047919 DOI: 10.1109/tcyb.2018.2853122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]

Wang X, Shen S, Rasam SS, Qu J. MS1 ion current-based quantitative proteomics: A promising solution for reliable analysis of large biological cohorts. Mass Spectrometry Reviews 2019;38:461-482. [PMID: 30920002 PMCID: PMC6849792 DOI: 10.1002/mas.21595] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 10/19/2018] [Accepted: 02/28/2019] [Indexed: 05/04/2023]

ElGendy K, Malcomson FC, Bradburn DM, Mathers JC. Effects of bariatric surgery on DNA methylation in adults: a systematic review and meta-analysis. Surg Obes Relat Dis 2019;16:128-136. [PMID: 31708383 DOI: 10.1016/j.soard.2019.09.075] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/22/2019] [Revised: 09/24/2019] [Accepted: 09/27/2019] [Indexed: 01/06/2023]

Tang J, Fu J, Wang Y, Luo Y, Yang Q, Li B, Tu G, Hong J, Cui X, Chen Y, Yao L, Xue W, Zhu F. Simultaneous Improvement in the Precision, Accuracy, and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains. Mol Cell Proteomics 2019;18:1683-1699. [PMID: 31097671 PMCID: PMC6682996 DOI: 10.1074/mcp.ra118.001169] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 10/29/2018] [Revised: 04/28/2019] [Indexed: 12/13/2022]

Abstract

Label-free proteome quantification (LFQ) is a multistep workflow, collectively defined by quantification tools and subsequent data manipulation methods, that has been extensively applied in current biomedical, agricultural, and environmental studies. Despite recent advances, in-depth and high-quality quantification remains extremely challenging and requires the optimization of LFQs by comparatively evaluating their performance. However, the evaluation results using different criteria (precision, accuracy, and robustness) vary greatly, and the huge number of potential LFQs becomes one of the bottlenecks in comprehensively optimizing proteome quantification. In this study, a novel strategy was therefore proposed that enables the discovery of LFQs with simultaneously enhanced performance from thousands of workflows (integrating 18 quantification tools with 3,128 manipulation chains). First, the feasibility of achieving simultaneous improvement in the precision, accuracy, and robustness of LFQ was systematically assessed by collectively optimizing its multistep manipulation chains. Second, based on a variety of benchmark datasets acquired by various quantification measurements of different modes of acquisition, this novel strategy successfully identified a number of manipulation chains that simultaneously improved the performance across multiple criteria. Finally, to further enhance proteome quantification and discover the LFQs of optimal performance, an online tool (https://idrblab.org/anpela/) enabling collective performance assessment (from multiple perspectives) of the entire LFQ workflow was developed. This study confirmed the feasibility of achieving simultaneous improvement in precision, accuracy, and robustness. The novel strategy proposed and validated in this study, together with the online tool, might provide useful guidance for research fields requiring the mass-spectrometry-based LFQ technique.


Iwata M, Yuan L, Zhao Q, Tabei Y, Berenger F, Sawada R, Akiyoshi S, Hamano M, Yamanishi Y. Predicting drug-induced transcriptome responses of a wide range of human cell lines by a novel tensor-train decomposition algorithm. Bioinformatics 2019;35:i191-i199. [PMID: 31510663 PMCID: PMC6612872 DOI: 10.1093/bioinformatics/btz313] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022]

Välikangas T, Suomi T, Elo LL. A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation. Brief Bioinform 2019;19:1344-1355. [PMID: 28575146 PMCID: PMC6291797 DOI: 10.1093/bib/bbx054] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 03/03/2017] [Indexed: 01/15/2023]

Laishram A, Padmanabhan V. Discovery of user-item subgroups via genetic algorithm for effective prediction of ratings in collaborative filtering. Appl Intell 2019. [DOI: 10.1007/s10489-019-01495-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 01/05/2023]

de Campos LM, Cano A, Castellano JG, Moral S. Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions. Stat Appl Genet Mol Biol 2019;18:sagmb-2018-0042. [PMID: 31042646 DOI: 10.1515/sagmb-2018-0042] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]

A nifty collaborative analysis to predicting a novel tool (DRFLLS) for missing values estimation. Soft Comput 2019. [DOI: 10.1007/s00500-019-03972-x] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 10/27/2022]

Wong KY, Zeng D, Lin DY. Robust Score Tests With Missing Data in Genomics Studies. J Am Stat Assoc 2019;114:1778-1786. [PMID: 31920211 DOI: 10.1080/01621459.2018.1514304] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/16/2022]

Islam MS, Hoque MA, Islam MS, Ali M, Hossen MB, Binyamin M, Merican AF, Akazawa K, Kumar N, Sugimoto M. Mining Gene Expression Profile with Missing Values: An Integration of Kernel PCA and Robust Singular Values Decomposition. Curr Bioinform 2018. [DOI: 10.2174/1574893613666180413151654] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/25/2023]

Abstract

Background: Gene expression profiling and transcriptomics provide valuable information about the role of genes that are differentially expressed between two or more samples. It is always important and challenging to analyse high-throughput DNA microarray data with a number of missing values under various experimental conditions. Objectives: Graphical data visualizations of the expression of all genes in a particular cell provide holistic views of gene expression patterns, which improve our understanding of cellular systems under normal and pathological conditions. However, current visualization methods are sensitive to missing values, which are frequently observed in microarray-based gene expression profiling, potentially affecting the subsequent statistical analyses. Methods: In this study, we addressed the problem of missing values with respect to different imputation methods using the gene expression biplot (GE biplot), one of the most popular gene visualization techniques. The effects of missing values on mining differentially expressed genes in gene expression data were evaluated using four well-known imputation methods: Robust Singular Value Decomposition (Robust SVD), Column Average (CA), Column Median (CM), and K-nearest Neighbors (KNN). The Frobenius norm and absolute distances were used to measure the accuracy of the methods. Results: Three numerical experiments were performed using (i) simulated data, (ii) publicly available colon cancer data, and (iii) leukemia data to analyze the performance of each method. The results showed that CM and KNN performed better than Robust SVD and CA for identifying the index gene profile in the biplot visualization in both the simulation study and the colon cancer and leukemia microarray datasets. Conclusion: The impact of missing values on the GE biplot was smaller when the data matrix was imputed by KNN than by CM. This study concluded that KNN performed satisfactorily in generating a GE biplot in the presence of missing values in microarray data.

Lee JY, Styczynski MP. NS-kNN: a modified k-nearest neighbors approach for imputing metabolomics data. Metabolomics 2018;14:153. [PMID: 30830437 PMCID: PMC6532628 DOI: 10.1007/s11306-018-1451-8] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 08/06/2018] [Accepted: 11/15/2018] [Indexed: 01/28/2023]

Yan Y, Dai T, Yang M, Du X, Zhang Y, Zhang Y. Classifying Incomplete Gene-Expression Data: Ensemble Learning with Non-Pre-Imputation Feature Filtering and Best-First Search Technique. Int J Mol Sci 2018;19:ijms19113398. [PMID: 30380746 PMCID: PMC6274900 DOI: 10.3390/ijms19113398] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/11/2018] [Revised: 10/20/2018] [Accepted: 10/23/2018] [Indexed: 01/09/2023]

Effects of dietary interventions on DNA methylation in adult humans: systematic review and meta-analysis. Br J Nutr 2018;120:961-976. [DOI: 10.1017/s000711451800243x] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022]

Choi HS, Choe JY, Kim H, Han JW, Chi YK, Kim K, Hong J, Kim T, Kim TH, Yoon S, Kim KW. Deep learning based low-cost high-accuracy diagnostic framework for dementia using comprehensive neuropsychological assessment profiles. BMC Geriatr 2018;18:234. [PMID: 30285646 PMCID: PMC6171238 DOI: 10.1186/s12877-018-0915-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/23/2017] [Accepted: 09/10/2018] [Indexed: 01/04/2023]

Nguyen D, Stutz R, Schorr S, Lang S, Pfeffer S, Freeze HH, Förster F, Helms V, Dudek J, Zimmermann R. Proteomics reveals signal peptide features determining the client specificity in human TRAP-dependent ER protein import. Nat Commun 2018;9:3765. [PMID: 30217974 PMCID: PMC6138672 DOI: 10.1038/s41467-018-06188-z] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 09/01/2017] [Accepted: 08/23/2018] [Indexed: 12/22/2022]

Chen X, Chen C, Cai Y, Wang H, Ye Q. Kernel Sparse Representation with Hybrid Regularization for On-Road Traffic Sensor Data Imputation. Sensors 2018;18:s18092884. [PMID: 30200348 PMCID: PMC6163639 DOI: 10.3390/s18092884] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/01/2018] [Revised: 08/28/2018] [Accepted: 08/29/2018] [Indexed: 11/16/2022]

Chen X, Cai Y, Ye Q, Chen L, Li Z. Graph regularized local self-representation for missing value imputation with applications to on-road traffic sensor data. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.04.029] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022]

Urkup C, Bozkaya B, Salman FS. Customer mobility signatures and financial indicators as predictors in product recommendation. PLoS One 2018;13:e0201197. [PMID: 30052681 PMCID: PMC6063431 DOI: 10.1371/journal.pone.0201197] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/16/2017] [Accepted: 07/10/2018] [Indexed: 11/19/2022]

Gong W, Kwak IY, Pota P, Koyano-Nakagawa N, Garry DJ. DrImpute: imputing dropout events in single cell RNA sequencing data. BMC Bioinformatics 2018;19:220. [PMID: 29884114 PMCID: PMC5994079 DOI: 10.1186/s12859-018-2226-y] [Citation(s) in RCA: 187] [Impact Index Per Article: 26.7] [Received: 11/14/2017] [Accepted: 05/30/2018] [Indexed: 11/10/2022]

Severson KA, Monian B, Love JC, Braatz RD. A method for learning a sparse classifier in the presence of missing data for high-dimensional biological datasets. Bioinformatics 2018;33:2897-2905. [PMID: 28431087 DOI: 10.1093/bioinformatics/btx224] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/17/2016] [Accepted: 04/13/2017] [Indexed: 11/13/2022]

van Gennip Y, Hunter B, Ma A, Moyer D, de Vera R, Bertozzi AL. Unsupervised record matching with noisy and incomplete data. International Journal of Data Science and Analytics 2018. [DOI: 10.1007/s41060-018-0129-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]

Wang A, Chen Y, An N, Yang J, Li L, Jiang L. Microarray Missing Value Imputation: A Regularized Local Learning Method. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018;16:980-993. [PMID: 29994588 DOI: 10.1109/tcbb.2018.2810205] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 06/08/2023]

Aghdam R, Baghfalaki T, Khosravi P, Saberi Ansari E. The Ability of Different Imputation Methods to Preserve the Significant Genes and Pathways in Cancer. Genomics, Proteomics & Bioinformatics 2017;15:396-404. [PMID: 29247873 PMCID: PMC5828654 DOI: 10.1016/j.gpb.2017.08.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/08/2017] [Revised: 07/18/2017] [Accepted: 08/08/2017] [Indexed: 11/23/2022]

Taylor SL, Ruhaak LR, Kelly K, Weiss RH, Kim K. Effects of imputation on correlation: implications for analysis of mass spectrometry data from multiple biological matrices. Brief Bioinform 2017;18:312-320. [PMID: 26896791 DOI: 10.1093/bib/bbw010] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 10/08/2015] [Indexed: 11/14/2022]

Armina R, Mohd Zain A, Ali NA, Sallehuddin R. A Review On Missing Value Estimation Using Imputation Algorithm. J Phys Conf Ser 2017. [DOI: 10.1088/1742-6596/892/1/012004] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 02/03/2023]

Ensemble correlation-based low-rank matrix completion with applications to traffic data imputation. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.06.010] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 11/18/2022]

Wang X, Shojaie A, Zhang Y, Shelley D, Lampe PD, Levy L, Peters U, Potter JD, White E, Lampe JW. Exploratory plasma proteomic analysis in a randomized crossover trial of aspirin among healthy men and women. PLoS One 2017;12:e0178444. [PMID: 28542447 PMCID: PMC5444835 DOI: 10.1371/journal.pone.0178444] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/25/2016] [Accepted: 05/12/2017] [Indexed: 12/21/2022]

Park JG, Paul S, Briones N, Zeng J, Gillis K, Wallstrom G, LaBaer J, Amundson SA. Developing Human Radiation Biodosimetry Models: Testing Cross-Species Conversion Approaches Using an Ex Vivo Model System. Radiat Res 2017;187:708-721. [PMID: 28328310 DOI: 10.1667/rr14655.1] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Indexed: 01/01/2023]

Abstract

In the event of a large-scale radiation exposure, accurate and quick assessment of radiation dose received would be critical for triage and medical treatment of large numbers of potentially exposed individuals. Current methods of biodosimetry, such as the dicentric chromosome assay, are time consuming and require sophisticated equipment and highly trained personnel. Therefore, scalable biodosimetry approaches, including gene expression profiles in peripheral blood cells, are being investigated. Due to the limited availability of appropriate human samples, biodosimetry development has relied heavily on mouse models, which are not directly applicable to human response. Therefore, to explore the feasibility of using non-human primate (NHP) models to build and test a biodosimetry algorithm for use in humans, we irradiated ex vivo peripheral blood samples from both humans and rhesus macaques with doses of 0, 2, 5, 6 and 7 Gy, and compared the gene expression profiles 24 h later using Agilent human microarrays. Among the dose-responsive genes in humans and non-human primates, 52 genes showed highly correlated expression patterns between the species, and were enriched in p53/DNA damage response, apoptosis and cell cycle-related genes. When these interspecies-correlated genes were used to build biodosimetry models using NHP data, the mean prediction accuracy on non-human primate samples was about 90% within 1 Gy of delivered dose in leave-one-out cross-validation. However, tests on human samples suggested that human gene expression values may need to be adjusted prior to application of the NHP model. A "multi-gene" approach, utilizing all gene values for cross-species conversion and applying the converted values to the NHP biodosimetry models, gave a leave-one-out cross-validation prediction accuracy for human samples highly comparable (up to 94%) to that for non-human primates. Overall, this study demonstrates that a robust NHP biodosimetry model can be built using interspecies-correlated genes, and that, by using multiple regression-based cross-species conversion of expression values, absorbed dose in human samples can be accurately predicted by the NHP model.


Pati SK, Das AK. Missing value estimation for microarray data through cluster analysis. Knowl Inf Syst 2017. [DOI: 10.1007/s10115-017-1025-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 01/03/2023]

Yu Z, Li T, Horng SJ, Pan Y, Wang H, Jing Y. An Iterative Locally Auto-Weighted Least Squares Method for Microarray Missing Value Estimation. IEEE Trans Nanobioscience 2017;16:21-33. [PMID: 28114029 DOI: 10.1109/tnb.2016.2636243] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/08/2022]

Wu WS, Jhou MJ. MVIAeval: a web tool for comprehensively evaluating the performance of a new missing value imputation algorithm. BMC Bioinformatics 2017;18:31. [PMID: 28086746 PMCID: PMC5237319 DOI: 10.1186/s12859-016-1429-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/15/2016] [Accepted: 12/15/2016] [Indexed: 02/06/2023]

Abstract

BACKGROUND

Missing value imputation is important for microarray data analyses because microarray data with missing values would significantly degrade the performance of the downstream analyses. Although many microarray missing value imputation algorithms have been developed, an objective and comprehensive performance comparison framework is still lacking. To solve this problem, we previously proposed a framework which can perform a comprehensive performance comparison of different existing algorithms. Also the performance of a new algorithm can be evaluated by our performance comparison framework. However, constructing our framework is not an easy task for the interested researchers. To save researchers' time and efforts, here we present an easy-to-use web tool named MVIAeval (Missing Value Imputation Algorithm evaluator) which implements our performance comparison framework.

RESULTS

MVIAeval provides a user-friendly interface allowing users to upload the R code of their new algorithm and select (i) the test datasets among 20 benchmark microarray (time series and non-time series) datasets, (ii) the compared algorithms among 12 existing algorithms, (iii) the performance indices from three existing ones, (iv) the comprehensive performance scores from two possible choices, and (v) the number of simulation runs. The comprehensive performance comparison results are then generated and shown as both figures and tables.

CONCLUSIONS

MVIAeval is a useful tool for researchers to easily conduct a comprehensive and objective performance evaluation of their newly developed missing value imputation algorithm for microarray data or any data which can be represented as a matrix form (e.g. NGS data or proteomics data). Thus, MVIAeval will greatly expedite the progress in the research of missing value imputation algorithms.


Pietrocola F, Demont Y, Castoldi F, Enot D, Durand S, Semeraro M, Baracco EE, Pol J, Bravo-San Pedro JM, Bordenave C, Levesque S, Humeau J, Chery A, Métivier D, Madeo F, Maiuri MC, Kroemer G. Metabolic effects of fasting on human and mouse blood in vivo. Autophagy 2017;13:567-578. [PMID: 28059587 DOI: 10.1080/15548627.2016.1271513] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Indexed: 01/22/2023]

Affiliation(s)

a Equipe 11 labellisée Ligue contre le Cancer, Centre de Recherche des Cordeliers, INSERM U 1138, Paris, France
b Université Paris Descartes, Sorbonne Paris Cité, Paris, France
c Université Pierre et Marie Curie, Paris, France
d Metabolomics and Cell Biology Platforms, Gustave Roussy Comprehensive Cancer Institute, Villejuif, France
e Centre d'Investigation Clinique-Unité de Recherche Clinique Paris Centre Necker-Cochin, Assistance Publique Hôpitaux de Paris, France
f Sotio a.c., Prague, Czech Republic
g Institute of Molecular Biosciences, NAWI Graz, University of Graz, Graz, Austria
h BioTechMed-Graz, Graz, Austria
i Pôle de Biologie, Hôpital Européen Georges Pompidou, AP-HP, Paris, France
j Karolinska Institute, Department of Women's and Children's Health, Karolinska University Hospital, Stockholm, Sweden

Federico Pietrocola: a, b, c, d
Yohann Demont: a, b, c
Francesca Castoldi: a, b, c, d, f
David Enot: a, d
Sylvère Durand: a, d
Michaela Semeraro: a, e
Elisa Elena Baracco: a, b, c, d
Jonathan Pol: a, b, c, d
Jose Manuel Bravo-San Pedro: a, b, c, d
Chloé Bordenave: a, d
Sarah Levesque: a, b, c, d
Juliette Humeau: a, b, c, d
Alexis Chery: a, d
Didier Métivier: a, b, c
Frank Madeo: g, h
M Chiara Maiuri: a, b, c, d
Guido Kroemer: a, b, c, d, i, j


Gibert K, Sànchez-Marrè M, Izquierdo J. A survey on pre-processing techniques: Relevant issues in the context of environmental data mining. AI Commun 2016. [DOI: 10.3233/aic-160710] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022]

Amiri M, Jensen R. Missing data imputation using fuzzy-rough methods. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.04.015] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 10/21/2022]

Cai T, Cai TT, Zhang A. Structured Matrix Completion with Applications to Genomic Data Integration. J Am Stat Assoc 2016;111:621-633. [PMID: 28042188 PMCID: PMC5198844 DOI: 10.1080/01621459.2015.1021005] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 04/01/2014] [Revised: 01/01/2015] [Indexed: 10/23/2022]

Chen Y, Wang A, Ding H, Que X, Li Y, An N, Jiang L. A global learning with local preservation method for microarray data imputation. Comput Biol Med 2016;77:76-89. [PMID: 27522236 DOI: 10.1016/j.compbiomed.2016.08.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/13/2016] [Revised: 08/04/2016] [Accepted: 08/04/2016] [Indexed: 12/28/2022]

Lin D, Zhang J, Li J, Xu C, Deng HW, Wang YP. An integrative imputation method based on multi-omics datasets. BMC Bioinformatics 2016;17:247. [PMID: 27329642 PMCID: PMC4915152 DOI: 10.1186/s12859-016-1122-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 09/30/2015] [Accepted: 06/05/2016] [Indexed: 12/26/2022] Open

Kapur A, Marwah K, Alterovitz G. Gene expression prediction using low-rank matrix completion. BMC Bioinformatics 2016;17:243. [PMID: 27317252 PMCID: PMC4912738 DOI: 10.1186/s12859-016-1106-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 11/20/2015] [Accepted: 05/28/2016] [Indexed: 11/25/2022] Open

McCoin CS, Piccolo BD, Knotts TA, Matern D, Vockley J, Gillingham MB, Adams SH. Unique plasma metabolomic signatures of individuals with inherited disorders of long-chain fatty acid oxidation. J Inherit Metab Dis 2016;39:399-408. [PMID: 26907176 PMCID: PMC4851894 DOI: 10.1007/s10545-016-9915-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/31/2015] [Revised: 01/09/2016] [Accepted: 01/22/2016] [Indexed: 01/29/2023]

Zhang G, Huang KC, Xu Z, Tzeng JY, Conneely KN, Guan W, Kang J, Li Y. Across-Platform Imputation of DNA Methylation Levels Incorporating Nonlocal Information Using Penalized Functional Regression. Genet Epidemiol 2016;40:333-40. [PMID: 27061717 PMCID: PMC4862742 DOI: 10.1002/gepi.21969] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/17/2015] [Revised: 02/03/2016] [Accepted: 02/18/2016] [Indexed: 12/28/2022]

Affiliation(s)

Guosheng Zhang Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America; Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, North Carolina, United States of America; Department of Statistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
Kuan-Chieh Huang Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
Zheng Xu Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America; Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America; Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina, United States of America
Jung-Ying Tzeng Department of Statistics, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
Karen N Conneely Department of Human Genetics, School of Medicine, Emory University, Atlanta, Georgia, United States of America
Weihua Guan Division of Biostatistics, School of Public Health, University of Minnesota, Minnesota, United States of America
Jian Kang Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
Yun Li Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America; Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, North Carolina, United States of America; Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America; Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina, United States of America

Yang Y, Xu Z, Song D. Missing value imputation for microRNA expression data by using a GO-based similarity measure. BMC Bioinformatics 2016;17 Suppl 1:10. [PMID: 26818962 PMCID: PMC4895707 DOI: 10.1186/s12859-015-0853-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 01/01/2023] Open

Abstract

BACKGROUND

Missing values are common in microarray data profiles. Rather than discarding genes or samples with incomplete expression levels, missing values should be properly imputed for accurate data analysis. Imputation methods can be roughly categorized as expression level-based and domain knowledge-based. Methods of the first type rely only on expression data, without the help of external data sources, while methods of the second type incorporate available domain knowledge into expression data to improve imputation accuracy. In recent years, microRNA (miRNA) microarrays have been widely developed and used to identify miRNA biomarkers in studies of complex human diseases. Like mRNA profiles, miRNA expression profiles with missing values can be treated with existing imputation methods. However, domain knowledge-based methods are hard to apply because miRNAs lack direct functional annotation. With the rapid accumulation of miRNA microarray data, there is a growing need for domain knowledge-based imputation algorithms specific to miRNA expression profiles to improve the quality of miRNA data analysis.

RESULTS

We connect miRNAs with domain knowledge of Gene Ontology (GO) via their target genes, and define miRNA functional similarity based on the semantic similarity of GO terms in GO graphs. A new measure combining miRNA functional similarity and expression similarity is used in the imputation of missing values. The new measure is tested on two miRNA microarray datasets from breast cancer research and achieves improved performance compared with the expression-based method on both datasets.

CONCLUSIONS

The experimental results demonstrate that biological domain knowledge can benefit the estimation of missing values in miRNA profiles as well as mRNA profiles. In particular, functional similarity defined by the GO terms annotated for the target genes of miRNAs can provide useful complementary information to the expression-based method and improve the imputation accuracy of miRNA array data. Our method and data are available to the public upon request.
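
The abstract above describes a measure that combines functional similarity with expression similarity, but it does not give the formula. The following is only a minimal sketch of one way such a combination could drive a nearest-neighbor imputation. The convex weighting `alpha`, the distance-based `expression_similarity`, the precomputed `func_sim` matrix, and the function names are all assumptions made for illustration, not the authors' method.

```python
import math


def expression_similarity(a, b):
    """Similarity in (0, 1] from the RMS distance over columns observed in both profiles.

    Missing values are represented as None. This particular form is an assumption.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    dist = math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))
    return 1.0 / (1.0 + dist)


def impute_entry(data, func_sim, row, col, k=2, alpha=0.5):
    """Estimate data[row][col] from the k most similar rows that observe column col.

    func_sim is a hypothetical precomputed matrix of functional similarities
    (for example, derived from GO terms); alpha is an assumed mixing weight.
    """
    scored = []
    for other, profile in enumerate(data):
        if other == row or profile[col] is None:
            continue
        combined = (alpha * func_sim[row][other]
                    + (1.0 - alpha) * expression_similarity(data[row], profile))
        scored.append((combined, profile[col]))
    # Keep the k neighbors with the highest combined similarity.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    total = sum(weight for weight, _ in top)
    if total == 0.0:
        return None  # no usable neighbor
    # Similarity-weighted average of the neighbors' observed values.
    return sum(weight * value for weight, value in top) / total
```

Setting `alpha=0` reduces this sketch to a purely expression-based neighbor search, which is the kind of baseline the abstract compares against.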

Improved methods for the imputation of missing data by nearest neighbor methods. Comput Stat Data Anal 2015. [DOI: 10.1016/j.csda.2015.04.009] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Indexed: 01/08/2023]

Gao L, Pei G, Chen L, Zhang W. A global network-based protocol for functional inference of hypothetical proteins in Synechocystis sp. PCC 6803. J Microbiol Methods 2015;116:44-52. [DOI: 10.1016/j.mimet.2015.06.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/19/2015] [Revised: 06/24/2015] [Accepted: 06/25/2015] [Indexed: 01/15/2023]

Automatic instance selection via locality constrained sparse representation for missing value estimation. Knowl Based Syst 2015. [DOI: 10.1016/j.knosys.2015.05.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]

Haakensen VD, Steinfeld I, Saldova R, Shehni AA, Kifer I, Naume B, Rudd PM, Børresen-Dale AL, Yakhini Z. Serum N-glycan analysis in breast cancer patients--Relation to tumour biology and clinical outcome. Mol Oncol 2015;10:59-72. [PMID: 26321095 DOI: 10.1016/j.molonc.2015.08.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 12/19/2014] [Revised: 08/02/2015] [Accepted: 08/03/2015] [Indexed: 12/13/2022] Open

100

Li H, Zhao C, Shao F, Li GZ, Wang X. A hybrid imputation approach for microarray missing value estimation. BMC Genomics 2015;16 Suppl 9:S1. [PMID: 26330180 PMCID: PMC4547405 DOI: 10.1186/1471-2164-16-s9-s1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Missing data is an inevitable phenomenon in gene expression microarray experiments due to instrument failure or human error. It has a negative impact on the performance of downstream analysis. Technically, most existing approaches suffer from this prevalent problem. Imputation is one of the most frequently used methods for processing missing data, and research on estimating missing values has advanced considerably. The remaining challenge is to improve imputation accuracy for data with a large missing rate.

METHODS

In this paper, inspired by the idea of collaborative training, we propose a novel hybrid imputation method called Recursive Mutual Imputation (RMI). Specifically, RMI exploits global correlation information and local structure in the data, captured by two popular methods, Bayesian Principal Component Analysis (BPCA) and Local Least Squares (LLS), respectively. The mutual strategy is implemented by sharing the estimated data sequences at each recursive step. We also order the imputation by the number of missing entries in the target gene. Finally, a weight-based integration method is used in the final assembling step.

RESULTS

We evaluate RMI against three state-of-the-art algorithms (BPCA, LLS, and Iterated Local Least Squares imputation (ItrLLS)) on four publicly available microarray datasets. Experimental results demonstrate that RMI significantly outperforms the comparison methods in terms of Normalized Root Mean Square Error (NRMSE), especially for datasets with large missing rates and fewer complete genes.

CONCLUSIONS

Our proposed hybrid imputation approach incorporates both global and local information of microarray genes and achieves lower NRMSE values than any single approach alone. This study also highlights the need to consider the order in which missing entries are imputed.
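
The abstract reports results as NRMSE without defining it. As a rough sketch, one convention common in the microarray imputation literature divides the root mean square error over the artificially removed entries by the standard deviation of their true values. The abstract does not state which normalization the authors used, so that choice, and the function name `nrmse`, are assumptions.

```python
import math


def nrmse(true_values, imputed_values):
    """Root mean square error normalized by the standard deviation of the true values.

    Both arguments list the same held-out entries in the same order.
    The normalization (population standard deviation) is an assumed convention.
    """
    n = len(true_values)
    if n != len(imputed_values) or n < 2:
        raise ValueError("need two equal-length sequences with at least two entries")
    mse = sum((t - e) ** 2 for t, e in zip(true_values, imputed_values)) / n
    mean_true = sum(true_values) / n
    var_true = sum((t - mean_true) ** 2 for t in true_values) / n
    if var_true == 0.0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse / var_true)
```

Under this convention a perfect imputation scores 0, and filling every entry with the mean of the true values scores 1, so values below 1 indicate an estimate that beats mean substitution.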
