Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Coelho LP, Kangas JD, Naik AW, Osuna-Highley E, Glory-Afshar E, Fuhrman M, Simha R, Berget PB, Jarvik JW, Murphy RF. Determining the subcellular location of new proteins from microscope images using local features. Bioinformatics 2013;29:2343-9. [PMID: 23836142 DOI: 10.1093/bioinformatics/btt392] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Cited by Other Article(s)

Zou K, Wang S, Wang Z, Zhang Z, Yang F. HAR_Locator: a novel protein subcellular location prediction model of immunohistochemistry images based on hybrid attention modules and residual units. Front Mol Biosci 2023;10:1171429. [PMID: 37664182 PMCID: PMC10470064 DOI: 10.3389/fmolb.2023.1171429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Accepted: 08/04/2023] [Indexed: 09/05/2023] Open

Abstract

Introduction: Proteins located in subcellular compartments have played an indispensable role in the physiological function of eukaryotic organisms. The pattern of protein subcellular localization is conducive to understanding the mechanism and function of proteins, contributing to investigating pathological changes of cells, and providing technical support for targeted drug research on human diseases. Automated systems based on featurization or representation learning and classifier design have attracted interest in predicting the subcellular location of proteins due to a considerable rise in proteins. However, large-scale, fine-grained protein microscopic images are prone to trapping and losing feature information in the general deep learning models, and the shallow features derived from statistical methods have weak supervision abilities.

Methods: In this work, a novel model called HAR_Locator was developed to predict the subcellular location of proteins by concatenating multi-view abstract features and shallow features, whose advanced advantages are summarized in the following three protocols. Firstly, to get discriminative abstract feature information on protein subcellular location, an abstract feature extractor called HARnet based on Hybrid Attention modules and Residual units was proposed to relieve gradient dispersion and focus on protein-target regions. Secondly, it not only improves the supervision ability of image information but also enhances the generalization ability of the HAR_Locator through concatenating abstract features and shallow features. Finally, a multi-category multi-classifier decision system based on an Artificial Neural Network (ANN) was introduced to obtain the final output results of samples by fitting the most representative result from five subset predictors.

Results: To evaluate the model, a collection of 6,778 immunohistochemistry (IHC) images from the Human Protein Atlas (HPA) database was used to present experimental results, and the accuracy, precision, and recall evaluation indicators were significantly increased to 84.73%, 84.77%, and 84.70%, respectively, compared with baseline predictors.

Wang RH, Luo T, Zhang HL, Du PF. PLA-GNN: Computational inference of protein subcellular location alterations under drug treatments with deep graph neural networks. Comput Biol Med 2023;157:106775. [PMID: 36921458 DOI: 10.1016/j.compbiomed.2023.106775] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2023] [Revised: 02/21/2023] [Accepted: 03/09/2023] [Indexed: 03/12/2023]

Mou M, Pan Z, Lu M, Sun H, Wang Y, Luo Y, Zhu F. Application of Machine Learning in Spatial Proteomics. J Chem Inf Model 2022;62:5875-5895. [PMID: 36378082 DOI: 10.1021/acs.jcim.2c01161] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Gonschior H, Schmied C, Van der Veen RE, Eichhorst J, Himmerkus N, Piontek J, Günzel D, Bleich M, Furuse M, Haucke V, Lehmann M. Nanoscale segregation of channel and barrier claudins enables paracellular ion flux. Nat Commun 2022;13:4985. [PMID: 36008380 PMCID: PMC9411157 DOI: 10.1038/s41467-022-32533-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2021] [Accepted: 08/04/2022] [Indexed: 11/09/2022] Open

Nanni L, Paci M, Brahnam S, Lumini A. Feature transforms for image data augmentation. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07645-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Abstract

A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers which are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with different approaches. We build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, with three based on FT, RT, and DCT, proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produce classifiers that not only compete competitively against the state-of-the-art but often surpass the best approaches reported in the literature.

Nanni L, Brahnam S, Paci M, Ghidoni S. Comparison of Different Convolutional Neural Network Activation Functions and Methods for Building Ensembles for Small to Midsize Medical Data Sets. SENSORS (BASEL, SWITZERLAND) 2022;22:s22166129. [PMID: 36015898 PMCID: PMC9415767 DOI: 10.3390/s22166129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 08/09/2022] [Accepted: 08/12/2022] [Indexed: 05/08/2023]

Wang G, Xue MQ, Shen HB, Xu YY. Learning protein subcellular localization multi-view patterns from heterogeneous data of imaging, sequence and networks. Brief Bioinform 2022;23:6499983. [PMID: 35018423 DOI: 10.1093/bib/bbab539] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Revised: 11/03/2021] [Accepted: 11/20/2021] [Indexed: 11/13/2022] Open

Ullah M, Han K, Hadi F, Xu J, Song J, Yu DJ. PScL-HDeep: image-based prediction of protein subcellular location in human tissue using ensemble learning of handcrafted and deep learned features with two-layer feature selection. Brief Bioinform 2021;22:bbab278. [PMID: 34337652 PMCID: PMC8574991 DOI: 10.1093/bib/bbab278] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2021] [Revised: 06/30/2021] [Accepted: 07/01/2021] [Indexed: 01/17/2023] Open

Abstract

Protein subcellular localization plays a crucial role in characterizing the function of proteins and understanding various cellular processes. Therefore, accurate identification of protein subcellular location is an important yet challenging task. Numerous computational methods have been proposed to predict the subcellular location of proteins. However, most existing methods have limited capability in terms of the overall accuracy, time consumption and generalization power. To address these problems, in this study, we developed a novel computational approach based on human protein atlas (HPA) data, referred to as PScL-HDeep, for accurate and efficient image-based prediction of protein subcellular location in human tissues. We extracted different handcrafted and deep learned (by employing pretrained deep learning model) features from different viewpoints of the image. The step-wise discriminant analysis (SDA) algorithm was applied to generate the optimal feature set from each original raw feature set. To further obtain a more informative feature subset, support vector machine-based recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) feature selection algorithm was applied to the integrated feature set. Finally, the classification models, namely support vector machine with radial basis function (SVM-RBF) and support vector machine with linear kernel (SVM-LNR), were learned on the final selected feature set. To evaluate the performance of the proposed method, a new gold standard benchmark training dataset was constructed from the HPA databank. PScL-HDeep achieved the maximum performance on 10-fold cross validation test on this dataset and showed a better efficacy over existing predictors. Furthermore, we also illustrated the generalization ability of the proposed method by conducting a stringent independent validation test.

Maurya R, Pathak VK, Dutta MK. Deep learning based microscopic cell images classification framework using multi-level ensemble. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021;211:106445. [PMID: 34627021 DOI: 10.1016/j.cmpb.2021.106445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/01/2021] [Accepted: 09/26/2021] [Indexed: 06/13/2023]

Abstract

BACKGROUND AND OBJECTIVES

Advancement of the ultra-fast microscopic images acquisition and generation techniques give rise to the automated artificial intelligence (AI)-based microscopic images classification systems. The earlier cell classification systems classify the cell images of a specific type captured using a specific microscopy technique, therefore the motivation behind the present study is to develop a generic framework that can be used for the classification of cell images of multiple types captured using a variety of microscopic techniques.

METHODS

The proposed framework for microscopic cell images classification is based on the transfer learning-based multi-level ensemble approach. The ensemble is made by training the same base model with different optimisation methods and different learning rates. An important contribution of the proposed framework lies in its ability to capture different granularities of features extracted from multiple scales of an input microscopic cell image. The base learners used in the proposed ensemble encapsulates the aggregation of low-level coarse features and high-level semantic features, thus, represent the different granular microscopic cell image features present at different scales of input cell images. The batch normalisation layer has been added to the base models for the fast convergence in the proposed ensemble for microscopic cell images classification.

RESULTS

The general applicability of the proposed framework for microscopic cell image classification has been tested with five different public datasets. The proposed method has outperformed the experimental results obtained in several other similar works.

CONCLUSIONS

The proposed framework for microscopic cell classification outperforms the other state-of-the-art classification methods in the same domain with a comparatively lesser amount of training data.

Chen J, Hou J, Wong KC. Categorical Matrix Completion With Active Learning for High-Throughput Screening. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:2261-2270. [PMID: 32203025 DOI: 10.1109/tcbb.2020.2982142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Maurya R, Pathak VK, Burget R, Dutta MK. Automated detection of bioimages using novel deep feature fusion algorithm and effective high-dimensional feature selection approach. Comput Biol Med 2021;137:104862. [PMID: 34534793 DOI: 10.1016/j.compbiomed.2021.104862] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2021] [Revised: 08/26/2021] [Accepted: 09/07/2021] [Indexed: 11/30/2022]

Christopher JA, Stadler C, Martin CE, Morgenstern M, Pan Y, Betsinger CN, Rattray DG, Mahdessian D, Gingras AC, Warscheid B, Lehtiö J, Cristea IM, Foster LJ, Emili A, Lilley KS. Subcellular proteomics. NATURE REVIEWS. METHODS PRIMERS 2021;1:32. [PMID: 34549195 PMCID: PMC8451152 DOI: 10.1038/s43586-021-00029-y] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 03/15/2021] [Indexed: 12/11/2022]

Affiliation(s)

Josie A. Christopher Department of Biochemistry, University of Cambridge, Cambridge, UK Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, Cambridge, UK
Charlotte Stadler Department of Protein Sciences, Karolinska Institutet, Science for Life Laboratory, Solna, Sweden
Claire E. Martin Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
Marcel Morgenstern Institute of Biology II, Biochemistry and Functional Proteomics, Faculty of Biology, University of Freiburg, Freiburg, Germany
Yanbo Pan Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, Sweden
Cora N. Betsinger Department of Molecular Biology, Princeton University, Princeton, NJ, USA
David G. Rattray Department of Biochemistry & Molecular Biology, Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
Diana Mahdessian Department of Protein Sciences, Karolinska Institutet, Science for Life Laboratory, Solna, Sweden
Anne-Claude Gingras Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Bettina Warscheid Institute of Biology II, Biochemistry and Functional Proteomics, Faculty of Biology, University of Freiburg, Freiburg, Germany BIOSS and CIBSS Signaling Research Centers, University of Freiburg, Freiburg, Germany
Janne Lehtiö Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, Sweden
Ileana M. Cristea Department of Molecular Biology, Princeton University, Princeton, NJ, USA
Leonard J. Foster Department of Biochemistry & Molecular Biology, Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
Andrew Emili Center for Network Systems Biology, Boston University, Boston, MA, USA
Kathryn S. Lilley Department of Biochemistry, University of Cambridge, Cambridge, UK Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, Cambridge, UK

Veschini L, Sailem H, Malani D, Pietiäinen V, Stojiljkovic A, Wiseman E, Danovi D. High-Content Imaging to Phenotype Human Primary and iPSC-Derived Cells. Methods Mol Biol 2021;2185:423-445. [PMID: 33165865 DOI: 10.1007/978-1-0716-0810-4_27] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Xu YY, Zhou H, Murphy RF, Shen HB. Consistency and variation of protein subcellular location annotations. Proteins 2020;89:242-250. [PMID: 32935893 DOI: 10.1002/prot.26010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2020] [Revised: 07/09/2020] [Accepted: 09/13/2020] [Indexed: 11/09/2022]

Lundberg E, Borner GHH. Spatial proteomics: a powerful discovery tool for cell biology. Nat Rev Mol Cell Biol 2020;20:285-302. [PMID: 30659282 DOI: 10.1038/s41580-018-0094-y] [Citation(s) in RCA: 343] [Impact Index Per Article: 68.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

Yang F, Liu Y, Wang Y, Yin Z, Yang Z. MIC_Locator: a novel image-based protein subcellular location multi-label prediction model based on multi-scale monogenic signal representation and intensity encoding strategy. BMC Bioinformatics 2019;20:522. [PMID: 31655541 PMCID: PMC6815465 DOI: 10.1186/s12859-019-3136-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2019] [Accepted: 10/09/2019] [Indexed: 12/20/2022] Open

Abstract

Background

Protein subcellular localization plays a crucial role in understanding cell function. Proteins need to be in the right place at the right time, and combine with the corresponding molecules to fulfill their functions. Furthermore, prediction of protein subcellular location not only should be a guiding role in drug design and development due to potential molecular targets but also be an essential role in genome annotation. Taking the current status of image-based protein subcellular localization as an example, there are three common drawbacks, i.e., obsolete datasets without updating label information, stereotypical feature descriptor on spatial domain or grey level, and single-function prediction algorithm’s limited capacity of handling single-label database.

Results

In this paper, a novel human protein subcellular localization prediction model MIC_Locator is proposed. Firstly, the latest datasets are collected and collated as our benchmark dataset instead of obsolete data while training prediction model. Secondly, Fourier transformation, Riesz transformation, Log-Gabor filter and intensity coding strategy are employed to obtain frequency feature based on three components of monogenic signal with different frequency scales. Thirdly, a chained prediction model is proposed to handle multi-label instead of single-label datasets. The experiment results showed that the MIC_Locator can achieve 60.56% subset accuracy and outperform the existing majority of prediction models, and the frequency feature and intensity coding strategy can be conducive to improving the classification accuracy.

Conclusions

Our results demonstrate that the frequency feature is more beneficial for improving the performance of model compared to features extracted from spatial domain, and the MIC_Locator proposed in this paper can speed up validation of protein annotation, knowledge of protein function and proteomics research.

Xiang S, Liang Q, Hu Y, Tang P, Coppola G, Zhang D, Sun W. AMC-Net: Asymmetric and multi-scale convolutional neural network for multi-label HPA classification. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2019;178:275-287. [PMID: 31416555 DOI: 10.1016/j.cmpb.2019.07.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/20/2019] [Revised: 06/20/2019] [Accepted: 07/06/2019] [Indexed: 06/10/2023]

Affiliation(s)

Shao Xiang College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing, Hunan University, Changsha 410082, China; National Engineering Laboratory for Robot Vision Perception and Control technologies, Hunan University, Changsha 410082, China
Qiaokang Liang College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing, Hunan University, Changsha 410082, China; National Engineering Laboratory for Robot Vision Perception and Control technologies, Hunan University, Changsha 410082, China.
Yucheng Hu College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing, Hunan University, Changsha 410082, China; National Engineering Laboratory for Robot Vision Perception and Control technologies, Hunan University, Changsha 410082, China
Pen Tang College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing, Hunan University, Changsha 410082, China; National Engineering Laboratory for Robot Vision Perception and Control technologies, Hunan University, Changsha 410082, China
Gianmarc Coppola Faculty of Engineering and Applied Science, University of Ontario Institute of Technology, Oshawa, Ontario, L1H 7K4, Canada
Dan Zhang Department of Mechanical Engineering, York University, Toronto, ON M3J 1P3, Canada
Wei Sun College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing, Hunan University, Changsha 410082, China; National Engineering Laboratory for Robot Vision Perception and Control technologies, Hunan University, Changsha 410082, China

Deep learning is combined with massive-scale citizen science to improve large-scale image classification. Nat Biotechnol 2018;36:820-828. [PMID: 30125267 DOI: 10.1038/nbt.4225] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2017] [Accepted: 07/19/2018] [Indexed: 01/11/2023]

Godinez WJ, Hossain I, Lazic SE, Davies JW, Zhang X. A multi-scale convolutional neural network for phenotyping high-content cellular images. Bioinformatics 2018;33:2010-2019. [PMID: 28203779 DOI: 10.1093/bioinformatics/btx069] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2016] [Accepted: 02/13/2017] [Indexed: 12/27/2022] Open

Lin D, Sun L, Toh KA, Zhang JB, Lin Z. Biomedical image classification based on a cascade of an SVM with a reject option and subspace analysis. Comput Biol Med 2018;96:128-140. [PMID: 29567484 DOI: 10.1016/j.compbiomed.2018.03.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2017] [Revised: 03/07/2018] [Accepted: 03/07/2018] [Indexed: 11/26/2022]

Nanni L, Brahnam S, Ghidoni S, Lumini A. Bioimage Classification with Handcrafted and Learned Features. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018;16:874-885. [PMID: 29994096 DOI: 10.1109/tcbb.2018.2821127] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Data-analysis strategies for image-based cell profiling. Nat Methods 2017;14:849-863. [PMID: 28858338 PMCID: PMC6871000 DOI: 10.1038/nmeth.4397] [Citation(s) in RCA: 436] [Impact Index Per Article: 54.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2016] [Accepted: 07/28/2017] [Indexed: 12/16/2022]

Song Y, Li Q, Huang H, Feng D, Chen M, Cai W. Low Dimensional Representation of Fisher Vectors for Microscopy Image Classification. IEEE TRANSACTIONS ON MEDICAL IMAGING 2017;36:1636-1649. [PMID: 28358678 DOI: 10.1109/tmi.2017.2687466] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Song Y, Li Q, Zhang F, Huang H, Feng D, Wang Y, Chen M, Cai W. Dual discriminative local coding for tissue aging analysis. Med Image Anal 2017;38:65-76. [PMID: 28282641 DOI: 10.1016/j.media.2016.10.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2016] [Revised: 07/12/2016] [Accepted: 10/05/2016] [Indexed: 11/26/2022]

An Overview of data science uses in bioimage informatics. Methods 2017;115:110-118. [DOI: 10.1016/j.ymeth.2016.12.014] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2016] [Revised: 12/09/2016] [Accepted: 12/30/2016] [Indexed: 01/17/2023] Open

Song Y, Cai W, Huang H, Feng D, Wang Y, Chen M. Bioimage classification with subcategory discriminant transform of high dimensional visual descriptors. BMC Bioinformatics 2016;17:465. [PMID: 27852213 PMCID: PMC5112644 DOI: 10.1186/s12859-016-1318-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2016] [Accepted: 11/01/2016] [Indexed: 11/10/2022] Open

Bougen-Zhukov N, Loh SY, Lee HK, Loo LH. Large-scale image-based screening and profiling of cellular phenotypes. Cytometry A 2016;91:115-125. [PMID: 27434125 DOI: 10.1002/cyto.a.22909] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]

Fricker MD, Moger J, Littlejohn GR, Deeks MJ. Making microscopy count: quantitative light microscopy of dynamic processes in living plants. J Microsc 2016;263:181-91. [PMID: 27145353 DOI: 10.1111/jmi.12403] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2015] [Revised: 01/31/2016] [Accepted: 02/16/2016] [Indexed: 12/18/2022]

Donovan RM, Tapia JJ, Sullivan DP, Faeder JR, Murphy RF, Dittrich M, Zuckerman DM. Unbiased Rare Event Sampling in Spatial Stochastic Systems Biology Models Using a Weighted Ensemble of Trajectories. PLoS Comput Biol 2016;12:e1004611. [PMID: 26845334 PMCID: PMC4741515 DOI: 10.1371/journal.pcbi.1004611] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2015] [Accepted: 10/16/2015] [Indexed: 12/25/2022] Open

Abstract

The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation—by orders of magnitude for some observables.

Stochastic simulations (simulations where randomness plays a role) of even simple biological systems are often so computationally intensive that it is impossible, in practice, to simulate them exhaustively and gather good statistics about the likelihood of different outcomes. The difficulty is compounded for the observation of rare events in these simulations; unfortunately, rare events, such as state transitions and barrier crossings, are often those of particular interest. Using the weighted ensemble (WE) method, we are able to enhance the characterization of rare events in cell biology simulations, but in such a way that the statistics for these events remain unbiased. The histogram of outcomes that WE produces has the same shape as a naive one, but the resolution of events in the tails of the histogram is greatly improved. This improved resolution in rare event statistics can be used to infer unbiased estimates of long timescale dynamics from short simulations, and we show that using a weighted ensemble can result in a reduction in total simulation time needed to sample certain events of interest in spatial, stochastic models of biological systems.
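
The replicate-and-prune resampling described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified variant in which every occupied bin is resampled to a fixed number of equal-weight trajectories, with parents drawn in proportion to their weight. The names `resample_bin`, `weighted_ensemble_step`, `propagate`, `bin_of` and `per_bin` are hypothetical and chosen only for illustration.

```python
import math
import random


def resample_bin(walkers, target, rng):
    """Resample one bin to `target` walkers of equal weight.

    walkers: list of (state, weight) pairs that fell into the same bin.
    Parents are drawn in proportion to weight, so heavy walkers tend to be
    replicated and light ones pruned; the bin's total weight is unchanged.
    """
    total = sum(w for _, w in walkers)
    states = [s for s, _ in walkers]
    weights = [w for _, w in walkers]
    parents = rng.choices(states, weights=weights, k=target)
    return [(s, total / target) for s in parents]


def weighted_ensemble_step(walkers, propagate, bin_of, per_bin, rng):
    """One iteration: advance every walker, group by bin, resample each bin."""
    moved = [(propagate(s, rng), w) for s, w in walkers]
    bins = {}
    for s, w in moved:
        bins.setdefault(bin_of(s), []).append((s, w))
    out = []
    for members in bins.values():
        out.extend(resample_bin(members, per_bin, rng))
    return out


# Toy usage (assumed model): a 1D random walk binned by integer position.
def propagate(x, rng):
    return x + rng.gauss(0.0, 1.0)


def bin_of(x):
    return math.floor(x)


rng = random.Random(0)
walkers = [(0.0, 0.25) for _ in range(4)]
for _ in range(5):
    walkers = weighted_ensemble_step(walkers, propagate, bin_of, 4, rng)
```

Because every occupied bin keeps the same number of walkers regardless of how little probability it holds, rarely visited regions stay populated, while the weights keep the overall statistics consistent. Production weighted ensemble codes typically use explicit split and merge rules rather than this equal-weight shortcut.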

Naik AW, Kangas JD, Sullivan DP, Murphy RF. Active machine learning-driven experimentation to determine compound effects on protein patterns. eLife 2016;5:e10047. [PMID: 26840049 PMCID: PMC4798950 DOI: 10.7554/elife.10047] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2015] [Accepted: 01/28/2016] [Indexed: 12/03/2022] Open

Abstract

High throughput screening determines the effects of many conditions on a given biological target. Currently, to estimate the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for many conditions and targets without doing all possible experiments. We have previously described an active machine learning algorithm that can iteratively choose small sets of experiments to learn models of multiple effects. We now show that, with no prior knowledge and with liquid handling robotics and automated microscopy under its control, this learner accurately learned the effects of 48 chemical compounds on the subcellular localization of 48 proteins while performing only 29% of all possible experiments. The results represent the first practical demonstration of the utility of active learning-driven biological experimentation in which the set of possible phenotypes is unknown in advance.

DOI: http://dx.doi.org/10.7554/eLife.10047.001

Biomedical scientists have invested significant effort into making it easy to perform lots of experiments quickly and cheaply. These “high throughput” methods are the workhorses of modern “systems biology” efforts. However, we simply cannot perform an experiment for every possible combination of different cell type, genetic mutation and other conditions. In practice this has led researchers to either exhaustively test a few conditions or targets, or to try to pick the experiments that best allow a particular problem to be explored. But which experiments should we pick? The ones we think we can predict the outcome of accurately, the ones for which we are uncertain what the results will be, or a combination of the two?

Humans are not particularly well suited for this task because it requires reasoning about many possible outcomes at the same time. However, computers are much better at handling statistics for many experiments, and machine learning algorithms allow computers to “learn” how to make predictions and decisions based on the data they’ve previously processed.

Previous computer simulations showed that a machine learning approach termed “active learning” could do a good job of picking a series of experiments to perform in order to efficiently learn a model that predicts the results of experiments that were not done. Now, Naik et al. have performed cell biology experiments in which experiments were chosen by an active learning algorithm and then performed using liquid handling robots and an automated microscope. The key idea behind the approach is that you learn more from an experiment you can’t predict (or that you predicted incorrectly) than from just confirming your confident predictions.

The results of the robot-driven experiments showed that the active learning approach outperforms strategies a human might use, even when the potential outcomes of individual experiments are not known beforehand. The next challenge is to apply these methods to reduce the cost of achieving the goals of large projects, such as The Cancer Genome Atlas.

DOI: http://dx.doi.org/10.7554/eLife.10047.002
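The selection principle in this digest (run the experiment whose outcome is hardest to predict) can be sketched as follows. This is an illustrative toy, not the algorithm of Naik et al.: the binary outcome grid `truth`, the row-and-column predictor in `predict`, and the `budget` are all assumptions made for the example.

```python
def predict(observed, c, p):
    # Hypothetical predictor: smoothed fraction of positive outcomes seen so
    # far for the same compound (row c) or the same protein (column p).
    related = [v for (ci, pi), v in observed.items() if ci == c or pi == p]
    return (sum(related) + 1) / (len(related) + 2)

def most_uncertain(observed, n_compounds, n_proteins):
    # Choose the untested (compound, protein) pair whose predicted outcome
    # is closest to 0.5, i.e. the one the model is least sure about.
    candidates = [(c, p)
                  for c in range(n_compounds)
                  for p in range(n_proteins)
                  if (c, p) not in observed]
    return min(candidates, key=lambda cp: abs(predict(observed, *cp) - 0.5))

def run_active_learning(truth, budget):
    n_compounds, n_proteins = len(truth), len(truth[0])
    observed = {}
    for _ in range(budget):
        c, p = most_uncertain(observed, n_compounds, n_proteins)
        # "Perform" the chosen experiment by looking up its true outcome.
        observed[(c, p)] = truth[c][p]
    return observed

# Toy ground truth: even-numbered compounds affect the first three proteins.
truth = [[int(c % 2 == 0 and p < 3) for p in range(6)] for c in range(6)]
observed = run_active_learning(truth, budget=10)
```

Only `budget` of the 36 possible experiments are performed; the remaining outcomes would be filled in by the predictor, which is the trade-off the digest describes.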

CP-CHARM: segmentation-free image classification made accessible. BMC Bioinformatics 2016;17:51. [PMID: 26817459 PMCID: PMC4729047 DOI: 10.1186/s12859-016-0895-y] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 09/25/2015] [Accepted: 01/18/2016] [Indexed: 11/10/2022] Open

Abstract

Background

Automated classification using machine learning often relies on features derived from segmenting individual objects, which can be difficult to automate. WND-CHARM is a previously developed classification algorithm in which features are computed on the whole image, thereby avoiding the need for segmentation. The algorithm obtained encouraging results but requires considerable computational expertise to execute. Furthermore, some benchmark sets have been shown to be subject to confounding artifacts that overestimate classification accuracy.

Results

We developed CP-CHARM, a user-friendly image-based classification algorithm inspired by WND-CHARM in (i) its ability to capture a wide variety of morphological aspects of the image, and (ii) the absence of a segmentation requirement. In order to make such an image-based classification method easily accessible to the biological research community, CP-CHARM relies on the widely-used open-source image analysis software CellProfiler for feature extraction. To validate our method, we reproduced WND-CHARM’s results and ensured that CP-CHARM obtained comparable performance. We then successfully applied our approach on cell-based assay data and on tissue images. We designed these new training and test sets to reduce the effect of batch-related artifacts.

Conclusions

The proposed method preserves the strengths of WND-CHARM: it extracts a wide variety of morphological features directly on whole images, thereby avoiding the need for cell segmentation. Additionally, it makes the methods easily accessible to researchers without computational expertise by implementing them as a CellProfiler pipeline. It has been demonstrated to perform well on a wide range of bioimage classification problems, including on new datasets that have been carefully selected and annotated to minimize batch effects. This provides for the first time a realistic and reliable assessment of the whole image classification strategy.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-0895-y) contains supplementary material, which is available to authorized users.

Yang Q, Zou HY, Zhang Y, Tang LJ, Shen GL, Jiang JH, Yu RQ. Multiplex protein pattern unmixing using a non-linear variable-weighted support vector machine as optimized by a particle swarm optimization algorithm. Talanta 2016;147:609-14. [DOI: 10.1016/j.talanta.2015.10.047] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/06/2015] [Revised: 10/14/2015] [Accepted: 10/18/2015] [Indexed: 11/30/2022]

Shao W, Liu M, Zhang D. Human cell structure-driven model construction for predicting protein subcellular location from biological images. Bioinformatics 2015;32:114-21. [PMID: 26363175 DOI: 10.1093/bioinformatics/btv521] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/21/2015] [Accepted: 08/31/2015] [Indexed: 11/14/2022] Open

Coelho LP, Pato C, Friães A, Neumann A, von Köckritz-Blickwede M, Ramirez M, Carriço JA. Automatic determination of NET (neutrophil extracellular traps) coverage in fluorescent microscopy images. Bioinformatics 2015;31:2364-70. [PMID: 25792554 DOI: 10.1093/bioinformatics/btv156] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/09/2014] [Accepted: 02/16/2015] [Indexed: 01/07/2023] Open

Krauß SD, Petersen D, Niedieker D, Fricke I, Freier E, El-Mashtoly SF, Gerwert K, Mosig A. Colocalization of fluorescence and Raman microscopic images for the identification of subcellular compartments: a validation study. Analyst 2015;140:2360-8. [DOI: 10.1039/c4an02153c] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 01/15/2023]

Abbas SS, Dijkstra TMH, Heskes T. A comparative study of cell classifiers for image-based high-throughput screening. BMC Bioinformatics 2014;15:342. [PMID: 25336059 PMCID: PMC4287552 DOI: 10.1186/1471-2105-15-342] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 08/07/2014] [Accepted: 09/29/2014] [Indexed: 11/24/2022] Open

Yang F, Xu YY, Shen HB. Many local pattern texture features: which is better for image-based multilabel human protein subcellular localization classification? ScientificWorldJournal 2014;2014:429049. [PMID: 25050396 PMCID: PMC4094881 DOI: 10.1155/2014/429049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/03/2014] [Accepted: 05/22/2014] [Indexed: 01/14/2023] Open

Yang F, Xu YY, Wang ST, Shen HB. Image-based classification of protein subcellular location patterns in human reproductive tissue by ensemble learning global and local features. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2013.10.034] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 10/26/2022]

Predicting human protein subcellular locations by the ensemble of multiple predictors via protein-protein interaction network with edge clustering coefficients. PLoS One 2014;9:e86879. [PMID: 24466278 PMCID: PMC3900678 DOI: 10.1371/journal.pone.0086879] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 10/14/2013] [Accepted: 12/18/2013] [Indexed: 12/14/2022] Open