1
Abed-Esfahani P, Darwin BC, Howard D, Wang N, Kim E, Lerch J, French L. Evaluation of deep convolutional neural networks for in situ hybridization gene expression image representation. PLoS One 2022; 17:e0262717. PMID: 35073334; PMCID: PMC8786163; DOI: 10.1371/journal.pone.0262717.
Abstract
High resolution in situ hybridization (ISH) images of the brain capture spatial gene expression at cellular resolution. These spatial profiles are key to understanding brain organization at the molecular level. Previously, manual qualitative scoring and informatics pipelines have been applied to ISH images to determine expression intensity and pattern. To better capture the complex patterns of gene expression in the human cerebral cortex, we applied a machine learning approach. We propose gene re-identification as a contrastive learning task to compute representations of ISH images. We train our model on an ISH dataset of ~1,000 genes obtained from postmortem samples from 42 individuals. This model reaches a gene re-identification rate of 38.3%, a 13x improvement over random chance. We find that the learned embeddings predict expression intensity and pattern. To test generalization, we generated embeddings in a second dataset that assayed the expression of 78 genes in 53 individuals. In this set of images, 60.2% of genes are re-identified, suggesting the model is robust. Importantly, this dataset assayed expression in individuals diagnosed with schizophrenia. Gene and donor-specific embeddings from the model predict schizophrenia diagnosis at levels similar to that reached with demographic information. Mutations in the most discriminative gene, Sodium Voltage-Gated Channel Beta Subunit 4 (SCN4B), may help understand cardiovascular associations with schizophrenia and its treatment. We have publicly released our source code, embeddings, and models to spur further application to spatial transcriptomics. In summary, we propose and evaluate gene re-identification as a machine learning task to represent ISH gene expression images.
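The gene re-identification rate described in this abstract can be illustrated with a minimal nearest-neighbour sketch (this is not the authors' code; the function and variable names are hypothetical): each query embedding is matched to its most similar gallery embedding by cosine similarity, and a hit is counted when the match comes from the same gene.

```python
import numpy as np

def reidentification_rate(query_emb, gallery_emb, query_genes, gallery_genes):
    """Fraction of query embeddings whose nearest gallery embedding
    (by cosine similarity) comes from the same gene."""
    # L2-normalise rows so the dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    sims = q @ g.T                 # (n_query, n_gallery) similarity matrix
    nearest = sims.argmax(axis=1)  # index of best-matching gallery image
    hits = np.array(query_genes) == np.array(gallery_genes)[nearest]
    return hits.mean()
```

The abstract's 38.3% and 60.2% figures are rates of this general kind computed over held-out images; the exact pairing protocol is defined in the paper itself.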
Affiliation(s)
- Pegah Abed-Esfahani
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, Canada
- Derek Howard
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, Canada
- Nick Wang
- Mouse Imaging Centre, Hospital for Sick Children, Toronto, Canada
- Ethan Kim
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, Canada
- Institute for Medical Science, University of Toronto, Toronto, Canada
- Jason Lerch
- Mouse Imaging Centre, Hospital for Sick Children, Toronto, Canada
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, United Kingdom
- Leon French
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, Canada
- Institute for Medical Science, University of Toronto, Toronto, Canada
- Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, Canada
- Department of Psychiatry, University of Toronto, Toronto, Canada
2
Cohen I, David E(O), Netanyahu NS. Supervised and Unsupervised End-to-End Deep Learning for Gene Ontology Classification of Neural In Situ Hybridization Images. Entropy 2019; 21:e21030221. PMID: 33266936; PMCID: PMC7514702; DOI: 10.3390/e21030221.
Abstract
In recent years, large datasets of high-resolution mammalian neural images have become available, which has prompted active research on the analysis of gene expression data. Traditional image processing methods are typically applied to learn functional representations of genes based on their expression in these brain images. In this paper, we describe a novel end-to-end deep learning-based method for generating compact, translation-invariant representations of in situ hybridization (ISH) images. In contrast to traditional image processing methods, our method relies instead on deep convolutional denoising autoencoders (CDAE) to process raw pixel inputs and generate the desired compact image representations. We provide an in-depth description of our deep learning-based approach and present extensive experimental results, demonstrating that representations extracted by CDAE can help learn features of functional gene ontology categories for highly accurate classification. Our method improves the previous state-of-the-art classification rate (Liscovitch, et al.) from an average AUC of 0.92 to 0.997, a 96% reduction in error rate. Furthermore, the representation vectors generated by our method are more compact than those of previous state-of-the-art methods, allowing for a more efficient high-level representation of images. These results are obtained with significantly downsampled images compared to the original high-resolution ones, further underscoring the robustness of the proposed method.
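The quoted 96% error-rate reduction follows from treating 1 − AUC as the error rate; a quick check of the arithmetic:

```python
def error_rate_reduction(auc_before, auc_after):
    """Relative reduction in error, taking 1 - AUC as the error rate."""
    return 1.0 - (1.0 - auc_after) / (1.0 - auc_before)

print(f"{error_rate_reduction(0.92, 0.997):.1%}")  # → 96.2%
```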
Affiliation(s)
- Ido Cohen
- Department of Computer Science, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Eli (Omid) David
- Department of Computer Science, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Nathan S. Netanyahu
- Department of Computer Science, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Gonda Brain Research Center, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Center for Automation Research, UMIACS, University of Maryland at College Park, College Park, MD 20742, USA
3
Mahfouz A, Huisman SMH, Lelieveldt BPF, Reinders MJT. Brain transcriptome atlases: a computational perspective. Brain Struct Funct 2017; 222:1557-1580. PMID: 27909802; PMCID: PMC5406417; DOI: 10.1007/s00429-016-1338-2.
Abstract
The immense complexity of the mammalian brain is largely reflected in the underlying molecular signatures of its billions of cells. Brain transcriptome atlases provide valuable insights into gene expression patterns across different brain areas throughout the course of development. Such atlases allow researchers to probe the molecular mechanisms that define neuronal identities, neuroanatomy, and patterns of connectivity. Despite the immense effort put into generating such atlases, an even greater effort is needed to develop methods for probing the resulting high-dimensional multivariate data in order to answer fundamental questions in neuroscience. We provide a comprehensive overview of the various computational methods used to analyze brain transcriptome atlases.
Affiliation(s)
- Ahmed Mahfouz
- Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Laboratory, Delft University of Technology, Delft, The Netherlands
- Sjoerd M H Huisman
- Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Laboratory, Delft University of Technology, Delft, The Netherlands
- Boudewijn P F Lelieveldt
- Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Laboratory, Delft University of Technology, Delft, The Netherlands
- Marcel J T Reinders
- Delft Bioinformatics Laboratory, Delft University of Technology, Delft, The Netherlands
4
Zeng T, Li R, Mukkamala R, Ye J, Ji S. Deep convolutional neural networks for annotating gene expression patterns in the mouse brain. BMC Bioinformatics 2015; 16:147. PMID: 25948335; PMCID: PMC4432953; DOI: 10.1186/s12859-015-0553-9.
Abstract
BACKGROUND Profiling gene expression in brain structures at various spatial and temporal scales is essential to understanding how genes regulate the development of brain structures. The Allen Developing Mouse Brain Atlas provides high-resolution 3-D in situ hybridization (ISH) gene expression patterns at multiple developing stages of the mouse brain. Currently, the ISH images are annotated with anatomical terms manually. In this paper, we propose a computational approach to annotate gene expression pattern images in the mouse brain at various structural levels over the course of development. RESULTS We applied a deep convolutional neural network, trained on a large set of natural images, to extract features from the ISH images of the developing mouse brain. As a baseline representation, we applied invariant image feature descriptors to capture local statistics from ISH images and used the bag-of-words approach to build image-level representations. Both types of features from multiple ISH image sections of the entire brain were then combined to build 3-D, brain-wide gene expression representations. We employed regularized learning methods for discriminating gene expression patterns in different brain structures. Results show that our approach of using the convolutional model as a feature extractor achieved superior performance in annotating gene expression patterns at multiple levels of brain structures across four developmental ages. Overall, we achieved an average AUC of 0.894 ± 0.014, compared with 0.820 ± 0.046 yielded by the bag-of-words approach. CONCLUSIONS A deep convolutional neural network model trained on natural image sets and applied to gene expression pattern annotation tasks yielded superior performance, demonstrating that its transfer learning property is applicable to such biological image sets.
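The bag-of-words baseline described in this abstract can be sketched as follows (a minimal illustration under assumed inputs, not the paper's implementation; the names are hypothetical): each local descriptor is quantized to its nearest codeword, and the image is represented by the normalized histogram of codeword counts.

```python
import numpy as np

def bag_of_words(descriptors, codebook):
    """Quantize each local descriptor to its nearest codeword and
    return the normalized histogram of codeword counts."""
    # squared Euclidean distance from every descriptor to every codeword
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # nearest-codeword index per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()   # image-level representation
```

In the paper, representations of this kind were built per section and combined across sections into the brain-wide representation fed to the regularized classifiers.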
Affiliation(s)
- Tao Zeng
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Rongjian Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Ravi Mukkamala
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Jieping Ye
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
- Shuiwang Ji
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
5
Coelho LP, Pato C, Friães A, Neumann A, von Köckritz-Blickwede M, Ramirez M, Carriço JA. Automatic determination of NET (neutrophil extracellular traps) coverage in fluorescent microscopy images. Bioinformatics 2015; 31:2364-70. PMID: 25792554; DOI: 10.1093/bioinformatics/btv156.
Abstract
MOTIVATION Neutrophil extracellular traps (NETs) are believed to be essential in controlling several bacterial pathogens. Quantification of NETs in vitro is an important tool in studies aiming to clarify the biological and chemical factors contributing to NET production, stabilization and degradation. This estimation can be performed on the basis of fluorescent microscopy images using appropriate labelings. In this context, it is desirable to automate the analysis to eliminate both the tedious process of manual annotation and possible operator-specific biases. RESULTS We propose a framework for the automated determination of NET content, based on visually annotated images which are used to train a supervised machine-learning method. We derive several methods in this framework. The best results are obtained by combining them into a single prediction. The overall Q² of the combined method is 93%. By having two experts label part of the image set, we were able to compare the performance of the algorithms to the human inter-operator variability. We find that the two operators exhibited a very high correlation in their overall assessment of the NET coverage area in the images (R² is 97%), although there were consistent differences in labeling at the pixel level (Q², which unlike R² does not correct for additive and multiplicative biases, was only 89%). AVAILABILITY AND IMPLEMENTATION Open source software (under the MIT license) is available at https://github.com/luispedro/Coelho2015_NetsDetermination for both reproducibility and application to new data.
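The contrast the abstract draws between R² and Q² can be illustrated numerically: an affine (additive plus multiplicative) bias in the predictions leaves the squared Pearson correlation untouched but lowers the uncorrected agreement. This is a minimal sketch of the distinction, not the paper's evaluation code:

```python
import numpy as np

def q2(pred, truth):
    """Agreement without bias correction: any systematic offset hurts."""
    return 1.0 - ((truth - pred) ** 2).sum() / ((truth - truth.mean()) ** 2).sum()

def r2(pred, truth):
    """Squared Pearson correlation: blind to additive/multiplicative bias."""
    return np.corrcoef(pred, truth)[0, 1] ** 2

truth = np.array([0.1, 0.4, 0.5, 0.9])
biased = 0.8 * truth + 0.1  # multiplicative and additive bias, same ranking
print(r2(biased, truth), q2(biased, truth))  # R² stays at 1; Q² falls below it
```

This is why the two operators could agree almost perfectly on overall coverage (R² of 97%) while showing a lower pixel-level Q² of 89%.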
Affiliation(s)
- Luis Pedro Coelho
- Unidade de Biofísica e Expressão Genética, Instituto de Medicina Molecular and
- Catarina Pato
- Unidade de Biofísica e Expressão Genética, Instituto de Medicina Molecular and
- Ana Friães
- Unidade de Biofísica e Expressão Genética, Instituto de Medicina Molecular and
- Ariane Neumann
- Unidade de Biofísica e Expressão Genética, Instituto de Medicina Molecular and
- Mário Ramirez
- Unidade de Biofísica e Expressão Genética, Instituto de Medicina Molecular and
- João André Carriço
- Unidade de Biofísica e Expressão Genética, Instituto de Medicina Molecular and
6
Li R, Zhang W, Ji S. Automated identification of cell-type-specific genes in the mouse brain by image computing of expression patterns. BMC Bioinformatics 2014; 15:209. PMID: 24947138; PMCID: PMC4078975; DOI: 10.1186/1471-2105-15-209.
Abstract
Background Differential gene expression patterns in cells of the mammalian brain result in the morphological, connectional, and functional diversity of cells. A wide variety of studies have shown that certain genes are expressed only in specific cell-types. Analysis of cell-type-specific gene expression patterns can provide insights into the relationship between genes, connectivity, brain regions, and cell-types. However, automated methods for identifying cell-type-specific genes are lacking to date. Results Here, we describe a set of computational methods for identifying cell-type-specific genes in the mouse brain by automated image computing of in situ hybridization (ISH) expression patterns. We applied invariant image feature descriptors to capture local gene expression information from cellular-resolution ISH images. We then built image-level representations by applying vector quantization on the image descriptors. We employed regularized learning methods for classifying genes specifically expressed in different brain cell-types. These methods can also rank image features based on their discriminative power. We used a data set of 2,872 genes from the Allen Brain Atlas in the experiments. Results showed that our methods are predictive of cell-type-specificity of genes. Our classifiers achieved AUC values of approximately 87% when the enrichment level is set to 20. In addition, we showed that the highly-ranked image features captured the relationship between cell-types. Conclusions Overall, our results showed that automated image computing methods could potentially be used to identify cell-type-specific genes in the mouse brain.
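The AUC values reported in this abstract can be computed with the rank-based (Mann-Whitney) definition of AUC: the probability that a randomly chosen positive example outscores a randomly chosen negative one. A minimal sketch, not the authors' implementation:

```python
def auc(scores, labels):
    """Rank-based (Mann-Whitney) AUC: probability that a randomly chosen
    positive outscores a randomly chosen negative, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC near 87%, as reported here, means a gene truly specific to a cell-type is ranked above a non-specific gene roughly 87% of the time.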
Affiliation(s)
- Shuiwang Ji
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
7
Zaldivar A, Krichmar JL. Allen Brain Atlas-Driven Visualizations: a web-based gene expression energy visualization tool. Front Neuroinform 2014; 8:51. PMID: 24904397; PMCID: PMC4033128; DOI: 10.3389/fninf.2014.00051.
Abstract
The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers its own search engine and software for researchers to view its growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of its tools limit the number of genes and brain structures researchers can view at once. To complement that work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization experiments. With this web application, researchers can immediately obtain and survey large amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analyses. In the future, we hope to extend ABADV across multiple data resources.
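The gene-by-structure tables behind the kind of heat maps ABADV renders can be sketched as a simple pivot of (gene, structure, energy) records. This is an illustrative reconstruction, not ABADV's code, and the gene and structure names in the test data are hypothetical:

```python
def expression_matrix(records):
    """Pivot (gene, structure, energy) records into a dict-of-dicts
    heat-map table: matrix[gene][structure] -> mean expression energy."""
    sums, counts = {}, {}
    for gene, structure, energy in records:
        key = (gene, structure)
        sums[key] = sums.get(key, 0.0) + energy
        counts[key] = counts.get(key, 0) + 1
    matrix = {}
    for (gene, structure), total in sums.items():
        matrix.setdefault(gene, {})[structure] = total / counts[(gene, structure)]
    return matrix
```

Averaging over repeated (gene, structure) records stands in for the aggregation over replicate experiments that any such visualization tool must perform before plotting.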
Affiliation(s)
- Andrew Zaldivar
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, USA
- Jeffrey L Krichmar
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, USA
- Department of Computer Science, University of California, Irvine, Irvine, CA, USA