101
Premchand B, Toe KK, Wang C, Shaikh S, Libedinsky C, Ang KK, So RQ. Decoding movement direction from cortical microelectrode recordings using an LSTM-based neural network. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2020; 2020:3007-3010. [PMID: 33018638 DOI: 10.1109/embc44109.2020.9175593] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
Brain-machine interfaces (BMIs) allow individuals to communicate with computers using neural signals, and Kalman filters (KFs) are the prevailing method for decoding movement directions from these signals. In this paper, we implemented a multi-layer long short-term memory (LSTM)-based artificial neural network (ANN) for decoding BMI neural signals. We collected motor cortical neural signals from a nonhuman primate (NHP) implanted with a microelectrode array (MEA) while it performed a directional joystick task. Next, we compared the LSTM model against the prevailing KF model in decoding the joystick trajectories from the neural signals. The results showed that the LSTM model yielded significantly higher decoding accuracy, measured by mean correlation coefficient (0.84, p < 10^-7), than the KF model (0.72). In addition, a principal component analysis (PCA)-based dimensionality reduction technique slightly reduced accuracy for both the LSTM (0.80) and KF (0.70) models but greatly reduced the computational complexity. The results showed that the LSTM decoding model holds promise to improve decoding in BMIs for paralyzed humans.
102
Hsu LL, Culhane AC. Impact of Data Preprocessing on Integrative Matrix Factorization of Single Cell Data. Front Oncol 2020; 10:973. [PMID: 32656082 PMCID: PMC7324639 DOI: 10.3389/fonc.2020.00973] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/20/2020] [Accepted: 05/18/2020] [Indexed: 01/04/2023] Open
Abstract
Integrative, single-cell analyses may provide unprecedented insights into cellular and spatial diversity of the tumor microenvironment. The sparsity, noise, and high dimensionality of these data present unique challenges. Whilst approaches for integrating single-cell data are emerging and are far from being standardized, most data integration, cell clustering, cell trajectory, and analysis pipelines employ a dimension reduction step, frequently principal component analysis (PCA), a matrix factorization method that is relatively fast, and can easily scale to large datasets when used with sparse-matrix representations. In this review, we provide a guide to PCA and related methods. We describe the relationship between PCA and singular value decomposition, the difference between PCA of a correlation and covariance matrix, the impact of scaling, log-transforming, and standardization, and how to recognize a horseshoe or arch effect in a PCA. We describe canonical correlation analysis (CCA), a popular matrix factorization approach for the integration of single-cell data from different platforms or studies. We discuss alternatives to CCA and why additional preprocessing or weighting datasets within the joint decomposition should be considered.
Affiliation(s)
- Lauren L Hsu
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Division of Biostatistics and Computational Biology, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, United States
- Aedin C Culhane
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Division of Biostatistics and Computational Biology, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, United States
103
Kalia V, Walker DI, Krasnodemski KM, Jones DP, Miller GW, Kioumourtzoglou MA. Unsupervised dimensionality reduction for exposome research. Current Opinion in Environmental Science & Health 2020; 15:32-38. [PMID: 32905218 PMCID: PMC7467332 DOI: 10.1016/j.coesh.2020.05.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 05/07/2023]
Abstract
Understanding the effect of the environment on human health has benefited from progress made in measuring the exposome. High resolution mass spectrometry (HRMS) has made it possible to measure small molecules across a large dynamic range, allowing researchers to study the role of low abundance environmental toxicants in causing human disease. HRMS data have a high dimensional structure (number of predictors >> number of observations), generating information on the abundance of many chemical features (predictors) which may be highly correlated. Unsupervised dimension reduction techniques can allow dimensionality reduction of the various features into components that capture the essence of the variability in the exposome dataset. We illustrate and discuss the relevance of three different unsupervised dimension reduction techniques: principal component analysis, factor analysis, and non-negative matrix factorization. We focus on the utility of each method in understanding the relationship between the exposome and a disease outcome and describe their strengths and limitations. While the utility of these methods is context specific, it remains important to focus on the interpretability of results from each method.
Affiliation(s)
- Vrinda Kalia
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032
- Douglas I. Walker
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Katherine M. Krasnodemski
- Division of Pulmonary, Allergy and Critical Medicine, Department of Medicine, School of Medicine, Emory University, Atlanta, GA 30322
- Dean P. Jones
- Division of Pulmonary, Allergy and Critical Medicine, Department of Medicine, School of Medicine, Emory University, Atlanta, GA 30322
- Gary W. Miller
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032
104
Ovchinnikova S, Anders S. Exploring dimension-reduced embeddings with Sleepwalk. Genome Res 2020; 30:749-756. [PMID: 32430339 PMCID: PMC7263188 DOI: 10.1101/gr.251447.119] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/12/2019] [Accepted: 05/01/2020] [Indexed: 01/24/2023]
Abstract
Dimension-reduction methods, such as t-SNE or UMAP, are widely used when exploring high-dimensional data describing many entities, for example, RNA-seq data for many single cells. However, dimension reduction is commonly prone to introducing artifacts, and we hence need means to see where a dimension-reduced embedding is a faithful representation of the local neighborhood and where it is not. We present Sleepwalk, a simple but powerful tool that allows the user to interactively explore an embedding, using color to depict original or any other distances from all points to the cell under the mouse cursor. We show how this approach not only highlights distortions but also reveals otherwise hidden characteristics of the data, and how Sleepwalk's comparative modes help integrate multisample data and understand differences between embedding and preprocessing methods. Sleepwalk is a versatile and intuitive tool that unlocks the full power of dimension reduction and will be of value not only in single-cell RNA-seq but also in any other area with matrix-shaped big data.
Affiliation(s)
- Svetlana Ovchinnikova
- Center for Molecular Biology of the University of Heidelberg (ZMBH), 69120 Heidelberg, Germany
- Simon Anders
- Center for Molecular Biology of the University of Heidelberg (ZMBH), 69120 Heidelberg, Germany
105
Yamada R, Okada D, Wang J, Basak T, Koyama S. Interpretation of omics data analyses. J Hum Genet 2020; 66:93-102. [PMID: 32385339 PMCID: PMC7728595 DOI: 10.1038/s10038-020-0763-5] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 03/09/2020] [Revised: 03/25/2020] [Accepted: 03/28/2020] [Indexed: 11/22/2022]
Abstract
Omics studies attempt to extract meaningful messages from large-scale and high-dimensional data sets by treating the data sets as a whole. The concept of treating data sets as a whole is important in every step of the data-handling procedures: the pre-processing step of data records, the step of statistical analyses and machine learning, translation of the outputs into human natural perceptions, and acceptance of the messages with uncertainty. For pre-processing, methods for controlling data quality and batch effects are discussed. For the main analyses, the approaches are divided into two types and their basic concepts are discussed. The first type is the evaluation of many items individually, followed by interpretation of individual items in the context of multiple testing and combination. The second type is the extraction of fewer important aspects from the whole data records. The outputs of the main analyses are translated into natural languages with techniques such as annotation and ontology. The other technique for making the outputs perceptible is visualization. At the end of this review, one of the most important issues in the interpretation of omics data analyses is discussed. Omics studies have a large amount of information in their data sets, and every approach reveals only a very restricted aspect of the whole data sets. The understandable messages from these studies have unavoidable uncertainty.
Affiliation(s)
- Ryo Yamada
- Unit of Statistical Genetics, Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Nanbusogo-Kenkyu-To-1, 5F, 53 Syogoin-Kawaramachi, Sakyo-ku, Kyoto, 606-8507, Japan.
- Daigo Okada
- Unit of Statistical Genetics, Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Nanbusogo-Kenkyu-To-1, 5F, 53 Syogoin-Kawaramachi, Sakyo-ku, Kyoto, 606-8507, Japan
- Juan Wang
- Unit of Statistical Genetics, Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Nanbusogo-Kenkyu-To-1, 5F, 53 Syogoin-Kawaramachi, Sakyo-ku, Kyoto, 606-8507, Japan
- Tapati Basak
- Unit of Statistical Genetics, Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Nanbusogo-Kenkyu-To-1, 5F, 53 Syogoin-Kawaramachi, Sakyo-ku, Kyoto, 606-8507, Japan
- Satoshi Koyama
- Unit of Statistical Genetics, Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Nanbusogo-Kenkyu-To-1, 5F, 53 Syogoin-Kawaramachi, Sakyo-ku, Kyoto, 606-8507, Japan
106
Badillo S, Banfai B, Birzele F, Davydov II, Hutchinson L, Kam‐Thong T, Siebourg‐Polster J, Steiert B, Zhang JD. An Introduction to Machine Learning. Clin Pharmacol Ther 2020; 107:871-885. [PMID: 32128792 PMCID: PMC7189875 DOI: 10.1002/cpt.1796] [Citation(s) in RCA: 152] [Impact Index Per Article: 30.4] [Received: 10/08/2019] [Accepted: 01/15/2020] [Indexed: 12/16/2022]
Abstract
In the last few years, machine learning (ML) and artificial intelligence have seen a new wave of publicity fueled by the huge and ever-increasing amount of data and computational power as well as the discovery of improved learning algorithms. However, the idea of a computer learning some abstract concept from data and applying it to yet unseen situations is not new and has been around at least since the 1950s. Many of these basic principles are very familiar to the pharmacometrics and clinical pharmacology community. In this paper, we want to introduce the foundational ideas of ML to this community such that readers obtain the essential tools they need to understand publications on the topic. Although we will not go into the fine details and theoretical background, we aim to point readers to relevant literature and put applications of ML in molecular biology as well as the fields of pharmacometrics and clinical pharmacology into perspective.
Affiliation(s)
- Solveig Badillo
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Balazs Banfai
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Fabian Birzele
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Iakov I. Davydov
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Lucy Hutchinson
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Tony Kam‐Thong
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Juliane Siebourg‐Polster
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Bernhard Steiert
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
- Jitao David Zhang
- Pharmaceutical Sciences, Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, Basel, Switzerland
107
Chiu YJ, Hsieh YH, Huang YH. Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells. BMC Med Genomics 2019; 12:169. [PMID: 31856824 PMCID: PMC6923925 DOI: 10.1186/s12920-019-0613-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/26/2019] [Accepted: 10/31/2019] [Indexed: 01/07/2023] Open
Abstract
Background: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods for deconvoluting bulk gene expression profiles to predict cell composition have been created. However, available methods were usually developed along with a set of reference gene expression profiles consisting of imbalanced replicates across different cell types. Therefore, the objective of this study was to create a new deconvolution method equipped with a new set of reference gene expression profiles that incorporate more microarray replicates of the immune cells that have been frequently implicated in the poor prognosis of cancers, such as T helper cells, regulatory T cells and macrophage M1/M2 cells.
Methods: Our deconvolution method was developed by choosing ε-support vector regression (ε-SVR) as the core algorithm assigned with a loss function subject to the L1-norm penalty. To construct the reference gene expression signature matrix for regression, a subset of differentially expressed genes were chosen from 148 microarray-based gene expression profiles for 9 types of immune cells by using ANOVA and minimizing condition number. Agreement analyses including mean absolute percentage errors and Bland-Altman plots were carried out to compare the performances of our method and CIBERSORT.
Results: In silico cell mixtures, simulated bulk tissues, and real human samples with known immune-cell fractions were used as the test datasets for benchmarking. Our method outperformed CIBERSORT in the benchmarks using in silico breast tissue-immune cell mixtures in the proportions of 30:70 and 50:50, and in the benchmark using 164 human PBMC samples. Our results suggest that the performance of our method was at least comparable to that of a state-of-the-art tool, CIBERSORT.
Conclusions: We developed a new cell composition deconvolution method and the implementation was entirely based on the publicly available R and Python packages. In addition, we compiled a new set of reference gene expression profiles, which might allow for a more robust prediction of the immune cell fractions from the expression profiles of cell mixtures. The source code of our method could be downloaded from https://github.com/holiday01/deconvolution-to-estimate-immune-cell-subsets.
Affiliation(s)
- Yen-Jung Chiu
- Institute of Biomedical Informatics, National Yang-Ming University, No.155, Sec. 2, Li-Nong St., Beitou Dist, Taipei, 11221, Taiwan
- Yi-Hsuan Hsieh
- Institute of Biomedical Informatics, National Yang-Ming University, No.155, Sec. 2, Li-Nong St., Beitou Dist, Taipei, 11221, Taiwan
- Yen-Hua Huang
- Institute of Biomedical Informatics, National Yang-Ming University, No.155, Sec. 2, Li-Nong St., Beitou Dist, Taipei, 11221, Taiwan; Centre for Systems and Synthetic Biology, National Yang-Ming University, Taipei, 11221, Taiwan
108
Kumral D, Şansal F, Cesnaite E, Mahjoory K, Al E, Gaebler M, Nikulin VV, Villringer A. BOLD and EEG signal variability at rest differently relate to aging in the human brain. Neuroimage 2019; 207:116373. [PMID: 31759114 DOI: 10.1016/j.neuroimage.2019.116373] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 08/23/2019] [Revised: 10/17/2019] [Accepted: 11/17/2019] [Indexed: 01/22/2023] Open
Abstract
Variability of neural activity is regarded as a crucial feature of healthy brain function, and several neuroimaging approaches have been employed to assess it noninvasively. Studies on the variability of both evoked brain response and spontaneous brain signals have shown remarkable changes with aging but it is unclear if the different measures of brain signal variability - identified with either hemodynamic or electrophysiological methods - reflect the same underlying physiology. In this study, we aimed to explore age differences of spontaneous brain signal variability with two different imaging modalities (EEG, fMRI) in healthy younger (25 ± 3 years, N = 135) and older (67 ± 4 years, N = 54) adults. Consistent with the previous studies, we found lower blood oxygenation level dependent (BOLD) variability in the older subjects as well as less signal variability in the amplitude of low-frequency oscillations (1-12 Hz), measured in source space. These age-related reductions were mostly observed in the areas that overlap with the default mode network. Moreover, age-related increases of variability in the amplitude of beta-band frequency EEG oscillations (15-25 Hz) were seen predominantly in temporal brain regions. There were significant sex differences in EEG signal variability in various brain regions while no significant sex differences were observed in BOLD signal variability. Bivariate and multivariate correlation analyses revealed no significant associations between EEG- and fMRI-based variability measures. In summary, we show that both BOLD and EEG signal variability reflect aging-related processes but are likely to be dominated by different physiological origins, which relate differentially to age and sex.
Affiliation(s)
- D Kumral
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; MindBrainBody Institute at the Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany.
- F Şansal
- International Graduate Program Medical Neurosciences, Charité-Universitätsmedizin, Berlin, Germany; Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- E Cesnaite
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- K Mahjoory
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Institute for Biomagnetism and Biosignalanalysis, University of Muenster, Muenster, Germany
- E Al
- MindBrainBody Institute at the Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany; Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- M Gaebler
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; MindBrainBody Institute at the Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
- V V Nikulin
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Neurophysics Group, Department of Neurology, Campus Benjamin Franklin, Charité Universitätsmedizin Berlin, Berlin, Germany; Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, National Research University Higher School of Economics, Moscow, Russia
- A Villringer
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; MindBrainBody Institute at the Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany; Center for Stroke Research Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany; Department of Cognitive Neurology, University Hospital Leipzig, Leipzig, Germany
109
Popal H, Wang Y, Olson IR. A Guide to Representational Similarity Analysis for Social Neuroscience. Soc Cogn Affect Neurosci 2019; 14:1243-1253. [PMID: 31989169 PMCID: PMC7057283 DOI: 10.1093/scan/nsz099] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Received: 01/10/2019] [Revised: 10/13/2019] [Accepted: 10/22/2019] [Indexed: 01/04/2023] Open
Abstract
Representational similarity analysis (RSA) is a computational technique that uses pairwise comparisons of stimuli to reveal their representation in higher-order space. In the context of neuroimaging, mass-univariate analyses and other multivariate analyses can provide information on what and where information is represented but have limitations in their ability to address how information is represented. Social neuroscience is a field that can particularly benefit from incorporating RSA techniques to explore hypotheses regarding the representation of multidimensional data, how representations can predict behavior, how representations differ between groups and how multimodal data can be compared to inform theories. The goal of this paper is to provide a practical as well as theoretical guide to implementing RSA in social neuroscience studies.
Affiliation(s)
- Haroon Popal
- Department of Psychology, Temple University, Philadelphia, PA
- Ingrid R Olson
- Department of Psychology, Temple University, Philadelphia, PA