151
|
Cai Y, Droste R, Sharma H, Chatelain P, Drukker L, Papageorghiou AT, Noble JA. Spatio-temporal visual attention modelling of standard biometry plane-finding navigation. Med Image Anal 2020; 65:101762. [PMID: 32623278 DOI: 10.1016/j.media.2020.101762] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/22/2019] [Revised: 06/15/2020] [Accepted: 06/18/2020] [Indexed: 11/26/2022]
Abstract
We present a novel multi-task neural network called Temporal SonoEyeNet (TSEN) with a primary task to describe the visual navigation process of sonographers by learning to generate visual attention maps of ultrasound images around standard biometry planes of the fetal abdomen, head (trans-ventricular plane) and femur. TSEN has three components: a feature extractor, a temporal attention module (TAM), and an auxiliary video classification module (VCM). A soft dynamic time warping (sDTW) loss function is used to improve visual attention modelling. Variants of the model are trained on a dataset of 280 video clips, each containing one of the three biometry planes and lasting 3-7 seconds, with corresponding real-time recorded gaze tracking data of an experienced sonographer. We report the performance of the different variants of TSEN for visual attention prediction at standard biometry plane detection. The best model performance is achieved using bi-directional convolutional long short-term memory (biCLSTM) in both TAM and VCM, and it outperforms a previous spatial model on all static and dynamic saliency metrics. As an auxiliary task to validate the clinical relevance of the visual attention modelling, the predicted visual attention maps were used to guide standard biometry plane detection in consecutive ultrasound (US) video frames. All spatio-temporal TSEN models achieve higher scores than a spatial-only baseline; the best performing TSEN model achieves F1 scores on these standard biometry planes of 83.7%, 89.9% and 81.1%, respectively.
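The abstract names a soft dynamic time warping (sDTW) loss but does not say which quantities it compares. As a rough illustration only, the sketch below computes the generic soft-DTW discrepancy between two sequences of feature vectors, replacing the hard minimum of classic DTW with a smooth minimum controlled by gamma. The function names, the squared Euclidean frame cost, and the choice of inputs are assumptions for this example, not details taken from the paper.

```python
import numpy as np


def soft_min(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    v = -np.asarray(values, dtype=float) / gamma
    m = v.max()
    return -gamma * (m + np.log(np.exp(v - m).sum()))


def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW discrepancy between two sequences of feature vectors.

    x has shape (n, d) and y has shape (m, d). The squared Euclidean
    frame cost is an assumption made for this sketch.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Pairwise cost between every frame of x and every frame of y.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    # r[i, j] is the soft alignment cost of x[:i] against y[:j].
    r = np.full((n + 1, m + 1), np.inf)
    r[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            r[i, j] = cost[i - 1, j - 1] + soft_min(
                [r[i - 1, j], r[i, j - 1], r[i - 1, j - 1]], gamma
            )
    return r[n, m]
```

As gamma approaches zero the value approaches the classic DTW cost; larger gamma gives a smoother quantity, which is what makes it usable as a differentiable training loss.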
Affiliation(s)
- Yifan Cai
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK.
| | - Richard Droste
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
| | - Harshita Sharma
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
| | - Pierre Chatelain
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
| | - Lior Drukker
- Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK
| | - Aris T Papageorghiou
- Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK
| | - J Alison Noble
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
| |
|
152
|
Sebai M, Wang X, Wang T. MaskMitosis: a deep learning framework for fully supervised, weakly supervised, and unsupervised mitosis detection in histopathology images. Med Biol Eng Comput 2020; 58:1603-1623. [PMID: 32445109 DOI: 10.1007/s11517-020-02175-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 03/21/2019] [Accepted: 03/30/2020] [Indexed: 01/14/2023]
Abstract
Counting the mitotic cells in histopathological cancerous tissue areas is the most relevant indicator of tumor grade in aggressive breast cancer diagnosis. In this paper, we propose a robust and accurate technique for the automatic detection of mitoses from histological breast cancer slides using the multi-task deep learning framework for object detection and instance segmentation Mask RCNN. Our mitosis detection and instance segmentation framework is deployed for two main tasks: it is used as a detection network to perform mitosis localization and classification in the fully annotated mitosis datasets (i.e., the pixel-level annotated datasets), and it is used as a segmentation network to estimate the mitosis mask labels for the weakly annotated mitosis datasets (i.e., the datasets with centroid-pixel labels only). We evaluate our approach on the fully annotated 2012 ICPR grand challenge dataset and the weakly annotated 2014 ICPR MITOS-ATYPIA challenge dataset. Our evaluation experiments show that we can obtain the highest F-score of 0.863 on the 2012 ICPR dataset by applying the mitosis detection and instance segmentation model trained on the pixel-level labels provided by this dataset. For the weakly annotated 2014 ICPR dataset, we first employ the mitosis detection and instance segmentation model trained on the fully annotated 2012 ICPR dataset to segment the centroid-pixel annotated mitosis ground truths, and produce the mitosis mask and bounding box labels. These estimated labels are then used to train another mitosis detection and instance segmentation model for mitosis detection on the 2014 ICPR dataset. By adopting this two-stage framework, our method outperforms all state-of-the-art mitosis detection approaches on the 2014 ICPR dataset by achieving an F-score of 0.475. 
Moreover, we show that the proposed framework can also perform unsupervised mitosis detection through the estimation of pseudo labels for an unlabeled dataset and it can achieve promising detection results. Code has been made available at: https://github.com/MeriemSebai/MaskMitosis. Graphical abstract: overview of the MaskMitosis framework.
Affiliation(s)
- Meriem Sebai
- School of Computer Science and Technology, Huazhong University of Science and Technology (HUST), Wuhan, People's Republic of China.
| | - Xinggang Wang
- School of Electronics Information and Communications, Huazhong University of Science and Technology (HUST), Wuhan, People's Republic of China
| | - Tianjiang Wang
- School of Computer Science and Technology, Huazhong University of Science and Technology (HUST), Wuhan, People's Republic of China.
| |
|
153
|
Li L, Chang D, Han L, Zhang X, Zaia J, Wan XF. Multi-task learning sparse group lasso: a method for quantifying antigenicity of influenza A(H1N1) virus using mutations and variations in glycosylation of Hemagglutinin. BMC Bioinformatics 2020; 21:182. [PMID: 32393178 PMCID: PMC7216668 DOI: 10.1186/s12859-020-3527-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/09/2020] [Accepted: 04/30/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In addition to causing the pandemic influenza outbreaks of 1918 and 2009, subtype H1N1 influenza A viruses (IAVs) have caused seasonal epidemics since 1977. The antigenic properties of influenza viruses are determined by both the protein sequence and the N-linked glycosylation of influenza glycoproteins, especially hemagglutinin (HA). The currently available computational methods consider only features in the protein sequence and not N-linked glycosylation. RESULTS A multi-task learning sparse group least absolute shrinkage and selection operator (LASSO) (MTL-SGL) regression method was developed and applied to derive two types of predominant features, protein sequence and N-linked glycosylation in HA, that affect variations in serologic data for human and swine H1N1 IAVs. Results suggested that mutations and changes in N-linked glycosylation sites are associated with the rise of antigenic variants of H1N1 IAVs. Furthermore, the implicated mutations are predominantly located at five reported antibody-binding sites, and within or close to the HA receptor binding site. All three N-linked glycosylation sites (i.e. sequons NCSV at HA 54, NHTV at HA 125, and NLSK at HA 160) identified by MTL-SGL as determining antigenic changes were experimentally validated in the H1N1 antigenic variants using mass spectrometry analyses. Compared with conventional sparse learning methods, MTL-SGL achieved a lower prediction error and higher accuracy, indicating that grouped features and MTL in the MTL-SGL method are not only able to handle serologic data generated from multiple reagents, supplies, and protocols, but also perform better in genetic sequence-based antigenic quantification. CONCLUSIONS In summary, the results of this study suggest that mutations and variations in N-glycosylation in HA caused antigenic variations in H1N1 IAVs and that the sequence-based antigenicity predictive model will be useful in understanding antigenic evolution of IAVs.
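The abstract does not spell out the MTL-SGL objective. As a hedged illustration of the sparse group lasso idea it builds on, the sketch below evaluates a standard sparse group lasso penalty and its proximal operator for a single task. The grouping of coefficients, the square-root group-size weighting, and all names are assumptions for this example; the multi-task coupling used in the paper is not reproduced.

```python
import numpy as np


def sgl_penalty(w, groups, lam_l1, lam_group):
    """Sparse group lasso penalty: an L1 term plus a weighted group L2 term.

    `groups` is a list of index lists (a hypothetical feature grouping).
    """
    w = np.asarray(w, dtype=float)
    l1_term = np.abs(w).sum()
    group_term = sum(np.sqrt(len(idx)) * np.linalg.norm(w[idx]) for idx in groups)
    return lam_l1 * l1_term + lam_group * group_term


def sgl_prox(w, groups, lam_l1, lam_group, step=1.0):
    """Proximal step: elementwise soft-threshold, then group-wise shrinkage."""
    w = np.asarray(w, dtype=float)
    # The L1 part zeroes out individually weak coefficients.
    z = np.sign(w) * np.maximum(np.abs(w) - step * lam_l1, 0.0)
    out = np.zeros_like(z)
    for idx in groups:
        norm = np.linalg.norm(z[idx])
        thresh = step * lam_group * np.sqrt(len(idx))
        # The group part removes a whole group if its norm is below the threshold.
        if norm > thresh:
            out[idx] = (1.0 - thresh / norm) * z[idx]
    return out
```

The two-level shrinkage is what lets such a model discard entire feature groups while also selecting individual features inside the groups it keeps.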
Affiliation(s)
- Lei Li
- Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, USA
| | - Deborah Chang
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston, MA, USA
| | - Lei Han
- Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, USA; Tencent AI Lab, Shenzhen, China
| | - Xiaojian Zhang
- Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, USA; Department of Molecular Microbiology and Immunology, School of Medicine, University of Missouri, Columbia, MO, USA; MU Center for Research on Influenza Systems Biology (CRISB), University of Missouri, Columbia, MO, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
| | - Joseph Zaia
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston, MA, USA
| | - Xiu-Feng Wan
- Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, USA; Department of Molecular Microbiology and Immunology, School of Medicine, University of Missouri, Columbia, MO, USA; MU Center for Research on Influenza Systems Biology (CRISB), University of Missouri, Columbia, MO, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO, USA; Department of Electrical Engineering & Computer Science, College of Engineering, University of Missouri, Columbia, MO, USA; MU Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA.
| |
|
154
|
Koyuncu CF, Gunesli GN, Cetin-Atalay R, Gunduz-Demir C. DeepDistance: A multi-task deep regression model for cell detection in inverted microscopy images. Med Image Anal 2020; 63:101720. [PMID: 32438298 DOI: 10.1016/j.media.2020.101720] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/13/2018] [Revised: 02/28/2020] [Accepted: 05/04/2020] [Indexed: 11/25/2022]
Abstract
This paper presents a new deep regression model, which we call DeepDistance, for cell detection in images acquired with inverted microscopy. This model treats cell detection as the task of finding the most probable locations of cell centers in an image. It represents this main task as a regression task of learning an inner distance metric. However, unlike previously reported regression-based methods, the DeepDistance model approaches its learning as a multi-task regression problem in which multiple tasks are learned using shared feature representations. To this end, it defines a secondary metric, normalized outer distance, to represent a different aspect of the problem and defines its learning as complementary to the main cell detection task. In order to learn these two complementary tasks more effectively, the DeepDistance model designs a fully convolutional network (FCN) with a shared encoder path and trains this FCN end-to-end to learn the tasks concurrently. For further performance improvement on the main task, this paper also presents an extended version of the DeepDistance model that includes an auxiliary classification task and learns it in parallel with the two regression tasks, also sharing feature representations with them. DeepDistance uses the inner distances estimated by these FCNs in a detection algorithm to locate individual cells in a given image. In addition to this detection algorithm, this paper also suggests a cell segmentation algorithm that employs the estimated maps to find cell boundaries. Our experiments on three different human cell lines reveal that the proposed multi-task learning models, the DeepDistance model and its extended version, successfully identify the locations of cells and delineate their boundaries, even for the cell line that was not used in training, and improve on the results of their counterparts.
Affiliation(s)
| | - Gozde Nur Gunesli
- Department of Computer Engineering, Bilkent University, Ankara TR-06800, Turkey.
| | - Rengul Cetin-Atalay
- CanSyL, Graduate School of Informatics, Middle East Technical University, Ankara TR-06800, Turkey.
| | - Cigdem Gunduz-Demir
- Department of Computer Engineering, Bilkent University, Ankara TR-06800, Turkey; Neuroscience Graduate Program, Bilkent University, Ankara TR-06800, Turkey.
| |
|
155
|
Abstract
An immense amount of observable diversity exists for all traits and across global populations. In the post-genomic era, equipped with efficient sequencing capabilities and better genotyping methods, we are now able to more fully appreciate how regulation of gene expression is consequential to one's genotypes in coding and non-coding DNA. The identification of genetic loci that contribute to quantifiable variation in gene expression is critical in further improving our understanding of the biological regulation of complex traits. Expression quantitative trait loci (eQTL) mapping studies have provided a powerful suite of techniques for genome-wide analysis to detect these regulatory effects. However, a typical eQTL analysis relies on a large number of samples with many genetic variants to achieve robust power and significance for detection. As a result, eQTL analysis poses distinct computational and statistical challenges that require advanced methodological development to overcome. In recent years, many statistical and machine learning methods for eQTL analysis have been developed that offer a richer view of the relationships between genetic variation and gene expression. In this chapter, we provide a comprehensive review of statistical and machine learning methods. We will present various machine learning methods based upon regularization terms and several other statistical analysis methods. Finally, we will discuss prior knowledge integration and hyperparameter optimization.
Affiliation(s)
- Junjie Chen
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA.
| | - Conor Nodzak
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA
| |
|
156
|
Shao W, Peng Y, Zu C, Wang M, Zhang D. Hypergraph based multi-task feature selection for multimodal classification of Alzheimer's disease. Comput Med Imaging Graph 2019; 80:101663. [PMID: 31923610 DOI: 10.1016/j.compmedimag.2019.101663] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 03/11/2018] [Revised: 06/14/2019] [Accepted: 10/01/2019] [Indexed: 12/18/2022]
Abstract
Multi-modality based classification methods are superior to single modality based approaches for the automatic diagnosis of Alzheimer's disease (AD) and mild cognitive impairment (MCI). However, most multi-modality based methods ignore the structure information of the data and simply reduce it to pairwise relationships. In real-world applications, the relationships among subjects are much more complex than pairwise, and a high-order structure containing more discriminative information should benefit the learning tasks. In light of this, a hypergraph based multi-task feature selection method for AD/MCI classification is proposed in this paper. Specifically, we first perform feature selection on each modality as a single task and incorporate a group-sparsity regularizer to jointly select common features across multiple modalities. Then, we introduce a hypergraph based regularization term for the standard multi-task feature selection to model the high-order structure relationship among subjects. Finally, a multi-kernel support vector machine is adopted to fuse the features selected from different modalities for the final classification. The experimental results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) demonstrate that our proposed method achieves better classification performance than the state-of-the-art multi-modality based methods.
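The abstract does not give the form of the hypergraph regularization term. One common choice, shown here purely as an assumption, is the normalized hypergraph Laplacian built from a subject-by-hyperedge incidence matrix; a regularizer of the form f^T L f then penalizes predictions that differ among subjects sharing a hyperedge. The function name and the toy incidence matrix in the usage note are hypothetical.

```python
import numpy as np


def hypergraph_laplacian(H, edge_weights=None):
    """Normalized hypergraph Laplacian I - Dv^-1/2 H W De^-1 H^T Dv^-1/2.

    H is an (n_vertices, n_edges) 0/1 incidence matrix. Every vertex must
    belong to at least one hyperedge, otherwise its degree is zero.
    """
    H = np.asarray(H, dtype=float)
    n_v, n_e = H.shape
    w = np.ones(n_e) if edge_weights is None else np.asarray(edge_weights, dtype=float)
    d_v = H @ w          # vertex degree: total weight of incident hyperedges
    d_e = H.sum(axis=0)  # hyperedge degree: number of vertices it contains
    dv_inv_sqrt = 1.0 / np.sqrt(d_v)
    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / d_e) @ (H.T * dv_inv_sqrt[None, :])
    return np.eye(n_v) - theta
```

For example, with four subjects and hyperedges {0, 1, 2} and {2, 3}, the incidence matrix is [[1, 0], [1, 0], [1, 1], [0, 1]], and the resulting matrix is symmetric and positive semi-definite, so f^T L f is a valid smoothness penalty.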
Affiliation(s)
- Wei Shao
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.
| | - Yao Peng
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.
| | - Chen Zu
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.
| | - Mingliang Wang
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.
| | - Daoqiang Zhang
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.
| |
|
157
|
Abstract
In the clinical domain, it is important to understand whether an adverse drug reaction (ADR) is caused by a particular medication. Clinical judgement studies help judge the causal relation between a medication and its ADRs. In this study, we present the first attempt to automatically infer the causality between a drug and an ADR from electronic health records (EHRs) by answering the Naranjo questionnaire, the validated clinical question answering set used by domain experts for ADR causality assessment. Using physicians' annotation as the gold standard, our proposed joint model, which uses multi-task learning to predict the answers to a subset of the Naranjo questionnaire, significantly outperforms the baseline pipeline model by a good margin, achieving a macro-weighted F-score between 0.3652 and 0.5271 and a micro-weighted F-score between 0.9523 and 0.9918.
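The large gap between the macro and micro scores reported above is typical of heavily imbalanced labels. The sketch below computes the standard macro-averaged and micro-averaged F1 on a toy example; the abstract does not define its "macro-weighted" and "micro-weighted" variants exactly, so the averaging used here and the "yes"/"no" labels are assumptions for illustration.

```python
def f1_scores(y_true, y_pred):
    """Return (macro F1, micro F1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    tp_all = fp_all = fn_all = 0
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        per_class.append(2 * tp / denom if denom else 0.0)
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    macro = sum(per_class) / len(per_class)   # every class counts equally
    micro = 2 * tp_all / (2 * tp_all + fp_all + fn_all)  # every instance counts equally
    return macro, micro


# Toy data: 98 "no" answers and 2 "yes" answers, with a model that always says "no".
y_true = ["no"] * 98 + ["yes"] * 2
y_pred = ["no"] * 100
macro, micro = f1_scores(y_true, y_pred)
```

In this toy case the micro score is 0.98 while the macro score is below 0.5, because the rare class is missed entirely yet carries half of the macro average.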
Affiliation(s)
| | - Fei Li
- UMass Lowell, Lowell, USA
| | | |
|
158
|
Sadawi N, Olier I, Vanschoren J, van Rijn JN, Besnard J, Bickerton R, Grosan C, Soldatova L, King RD. Multi-task learning with a natural metric for quantitative structure activity relationship learning. J Cheminform 2019; 11:68. [PMID: 33430958 PMCID: PMC6852942 DOI: 10.1186/s13321-019-0392-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/05/2019] [Accepted: 11/04/2019] [Indexed: 11/24/2022] Open
Abstract
The goal of quantitative structure activity relationship (QSAR) learning is to learn a function that, given the structure of a small molecule (a potential drug), outputs the predicted activity of the compound. We employed multi-task learning (MTL) to exploit commonalities in drug targets and assays. We used datasets containing curated records about the activity of specific compounds on drug targets provided by ChEMBL. In total, 1091 assays were analysed. As a baseline, a single task learning approach that trains a random forest to predict drug activity for each drug target individually was considered. We then carried out feature-based and instance-based MTL to predict drug activities. We introduced a natural metric of evolutionary distance between drug targets as a measure of task relatedness. Instance-based MTL significantly outperformed both feature-based MTL and the base learner on 741 drug targets out of 1091. Feature-based MTL won on 179 occasions and the base learner performed best on 171 drug targets. We conclude that MTL QSAR is improved by incorporating the evolutionary distance between targets. These results indicate that QSAR learning can be performed effectively, even if little data is available for specific drug targets, by leveraging what is known about similar drug targets.
Affiliation(s)
- Noureddin Sadawi
- Department of Medicine, Imperial College London, London, UK
- Brunel University London, London, UK
| | - Ivan Olier
- Department of Applied Mathematics, Liverpool John Moores University, Liverpool, UK
| | | | | | - Jeremy Besnard
- University of Dundee, Dundee, Dundee, UK
- Ex Scientia Ltd, Dundee, UK
| | | | | | - Larisa Soldatova
- Brunel University London, London, UK
- Goldsmiths, University of London, London, UK
| | | |
|
159
|
Chen P, Dong W, Lu X, Kaymak U, He K, Huang Z. Deep representation learning for individualized treatment effect estimation using electronic health records. J Biomed Inform 2019; 100:103303. [PMID: 31610264 DOI: 10.1016/j.jbi.2019.103303] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/17/2019] [Revised: 09/22/2019] [Accepted: 10/07/2019] [Indexed: 12/25/2022]
Abstract
Utilizing clinical observational data to estimate individualized treatment effects (ITE) is a challenging task, as confounding inevitably exists in clinical data. Most of the existing models for ITE estimation tackle this problem by creating unbiased estimators of the treatment effects. Although valuable, learning a balanced representation is sometimes directly opposed to the objective of learning an effective and discriminative model for ITE estimation. We propose a novel hybrid model bridging multi-task deep learning and K-nearest neighbors (KNN) for ITE estimation. In detail, the proposed model first adopts multi-task deep learning to extract both outcome-predictive and treatment-specific latent representations from Electronic Health Records (EHR), by jointly performing the outcome prediction and treatment category classification. Thereafter, we estimate counterfactual outcomes by KNN based on the learned hidden representations. We validate the proposed model on a widely used semi-simulated dataset, i.e. IHDP, and a real-world clinical dataset consisting of 736 heart failure (HF) patients. The performance of our model remains robust and reaches 1.7 and 0.23 in terms of Precision in the estimation of heterogeneous effect (PEHE) and average treatment effect (ATE), respectively, on the IHDP dataset, and 0.703 and 0.796 in terms of accuracy and F1 score, respectively, on the HF dataset. The results demonstrate that the proposed model achieves competitive performance over state-of-the-art models. In addition, the results reveal several findings which are consistent with existing medical domain knowledge, and discover certain suggestive hypotheses that could be validated through further investigations in the clinical domain.
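The KNN step described above can be illustrated with a small sketch. It assumes a binary treatment and takes the latent representations as given; in the paper they come from the multi-task network, which is not reproduced here. The function names, the Euclidean distance, and the use of the root-mean-square form of PEHE are assumptions for this example.

```python
import numpy as np


def knn_ite(z, t, y, k=3):
    """Estimate individual treatment effects by nearest-neighbor matching.

    z: (n, d) latent representations; t: (n,) binary treatment (0 or 1);
    y: (n,) observed outcomes. Both treatment groups must be non-empty.
    """
    z = np.asarray(z, dtype=float)
    t = np.asarray(t)
    y = np.asarray(y, dtype=float)
    ite = np.empty(len(z))
    for i in range(len(z)):
        # Candidate matches are patients who received the other treatment.
        other = np.flatnonzero(t != t[i])
        dist = np.linalg.norm(z[other] - z[i], axis=1)
        nearest = other[np.argsort(dist)[:k]]
        y_cf = y[nearest].mean()  # estimated counterfactual outcome
        # The effect is always reported as treated minus control.
        ite[i] = y[i] - y_cf if t[i] == 1 else y_cf - y[i]
    return ite


def pehe(ite_hat, ite_true):
    """Root-mean-square error between estimated and true effects."""
    diff = np.asarray(ite_hat, dtype=float) - np.asarray(ite_true, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

PEHE can only be computed when the true effects are known, which is why it is reported on semi-simulated data such as IHDP rather than on the real-world HF cohort.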
Affiliation(s)
- Peipei Chen
- College of Biomedical Engineering and Instrumental Science, Zhejiang University, 310008 Hangzhou, China; School of Industrial Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
| | - Wei Dong
- Department of Cardiology, Chinese PLA General Hospital, 100853 Beijing, China
| | - Xudong Lu
- College of Biomedical Engineering and Instrumental Science, Zhejiang University, 310008 Hangzhou, China; School of Industrial Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
| | - Uzay Kaymak
- School of Industrial Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands; College of Biomedical Engineering and Instrumental Science, Zhejiang University, 310008 Hangzhou, China
| | - Kunlun He
- Department of Cardiology, Chinese PLA General Hospital, 100853 Beijing, China.
| | - Zhengxing Huang
- College of Biomedical Engineering and Instrumental Science, Zhejiang University, 310008 Hangzhou, China.
| |
|
160
|
Jin Y, Li H, Dou Q, Chen H, Qin J, Fu CW, Heng PA. Multi-task recurrent convolutional network with correlation loss for surgical video analysis. Med Image Anal 2019; 59:101572. [PMID: 31639622 DOI: 10.1016/j.media.2019.101572] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 11/08/2018] [Revised: 09/29/2019] [Accepted: 10/03/2019] [Indexed: 12/16/2022]
Abstract
Surgical tool presence detection and surgical phase recognition are two fundamental yet challenging tasks in surgical video analysis, as well as essential components of various applications in modern operating rooms. While these two analysis tasks are highly correlated in clinical practice, as the surgical process is typically well-defined, most previous methods tackled them separately, without making full use of their relatedness. In this paper, we present a novel method by developing a multi-task recurrent convolutional network with correlation loss (MTRCNet-CL) to exploit their relatedness to simultaneously boost the performance of both tasks. Specifically, our proposed MTRCNet-CL model has an end-to-end architecture with two branches, which share earlier feature encoders to extract general visual features while holding respective higher layers targeting specific tasks. Given that temporal information is crucial for phase recognition, long short-term memory (LSTM) is explored to model the sequential dependencies in the phase recognition branch. More importantly, a novel and effective correlation loss is designed to model the relatedness between tool presence and phase identification of each video frame, by minimizing the divergence of predictions from the two branches. By leveraging both low-level feature sharing and high-level prediction correlation, our MTRCNet-CL method encourages interaction between the two tasks to a large extent, so that each benefits the other. Extensive experiments on a large surgical video dataset (Cholec80) demonstrate outstanding performance of our proposed method, consistently exceeding the state-of-the-art methods by a large margin, e.g., 89.1% vs. 81.0% for the mAP in tool presence detection and 87.4% vs. 84.5% for F1 score in phase recognition.
Affiliation(s)
- Yueming Jin
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
| | - Huaxia Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
| | - Qi Dou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China.
| | - Hao Chen
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
| | - Jing Qin
- Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University, China
| | - Chi-Wing Fu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
| | - Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China; T Stone Robotics Institute, The Chinese University of Hong Kong, China
| |
|
161
|
Wang D, Li M, Ben-Shlomo N, Corrales CE, Cheng Y, Zhang T, Jayender J. Mixed-Supervised Dual-Network for Medical Image Segmentation. Med Image Comput Comput Assist Interv 2019; 11765:192-200. [PMID: 32395724 PMCID: PMC7213952 DOI: 10.1007/978-3-030-32245-8_22] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Abstract
Deep learning based medical image segmentation models usually require large datasets with high-quality dense segmentations to train, which are very time-consuming and expensive to prepare. One way to tackle this difficulty is to use the mixed-supervised learning framework, where only part of the data is densely annotated with segmentation labels and the rest is weakly labeled with bounding boxes. The model is trained jointly in a multi-task learning setting. In this paper, we propose Mixed-Supervised Dual-Network (MSDN), a novel architecture which consists of two separate networks for the detection and segmentation tasks respectively, and a series of connection modules between the layers of the two networks. These connection modules are used to transfer useful information from the auxiliary detection task to help the segmentation task. We propose to use a recent technique called 'Squeeze and Excitation' in the connection module to boost the transfer. We conduct experiments on two medical image segmentation datasets. The proposed MSDN model outperforms multiple baselines.
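For readers unfamiliar with 'Squeeze and Excitation', the sketch below shows the generic operation on a single feature map: global average pooling per channel, a small two-layer gating network, and channel-wise rescaling. How MSDN wires this into its connection modules between the detection and segmentation networks is not described in the abstract, so the function, its parameter names, and the weight shapes are assumptions for illustration.

```python
import numpy as np


def squeeze_excite(features, w1, b1, w2, b2):
    """Apply a squeeze-and-excitation gate to one feature map.

    features: (C, H, W) array; w1: (C_r, C) and b1: (C_r,) form the
    bottleneck layer; w2: (C, C_r) and b2: (C,) map back to C channels.
    """
    # Squeeze: one summary value per channel via global average pooling.
    squeezed = features.mean(axis=(1, 2))
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))
    # Rescale each channel by its gate value in (0, 1).
    return features * gates[:, None, None]
```

Because the gates depend on globally pooled statistics, each channel is reweighted using context from the whole image rather than from a local neighborhood.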
Affiliation(s)
- Duo Wang
- Department of Automation, Tsinghua University, Beijing, China
- Department of Radiology, Brigham and Women's Hospital, Boston, USA
| | - Ming Li
- Department of Radiology and Radiation Oncology, Huadong Hospital affiliated to Fudan University, Shanghai, China
| | - Nir Ben-Shlomo
- Department of Surgery, Brigham and Women's Hospital, Boston, USA
| | - C Eduardo Corrales
- Department of Surgery, Brigham and Women's Hospital, Boston, USA
- Harvard Medical School, Boston, USA
| | - Yu Cheng
- Microsoft AI & Research, Redmond, WA, USA
| | - Tao Zhang
- Department of Automation, Tsinghua University, Beijing, China
| | - Jagadeesan Jayender
- Department of Radiology, Brigham and Women's Hospital, Boston, USA
- Harvard Medical School, Boston, USA
| |
|
162
|
Bui TD, Wang L, Chen J, Lin W, Li G, Shen D. Multi-task Learning for Neonatal Brain Segmentation Using 3D Dense-Unet with Dense Attention Guided by Geodesic Distance. Domain Adapt Represent Transf Med Image Learn Less Labels Imperfect Data (2019) 2019; 11795:243-251. [PMID: 32090208 PMCID: PMC7034948 DOI: 10.1007/978-3-030-33391-1_28] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/13/2022]
Abstract
Deep convolutional neural networks have achieved outstanding performance on neonatal brain MRI tissue segmentation. However, they may fail to produce reasonable results on unseen datasets whose imaging appearance distributions differ from those of the training data. The main reason is that deep learning models tend to fit the training dataset well but do not generalize well to unseen datasets. To address this problem, we propose a multi-task learning method, which simultaneously learns both tissue segmentation and geodesic distance regression to regularize a shared encoder network. Furthermore, a dense attention gate is explored to force the network to learn rich contextual information. Using three neonatal brain datasets with different imaging protocols from different scanners, our experimental results demonstrate superior performance of our proposed method over the existing deep learning-based methods on the unseen datasets.
Affiliation(s)
- Toan Duc Bui
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Li Wang
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jian Chen
- School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, China
| | - Weili Lin
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Gang Li
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Dinggang Shen
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
| |
|
163
|
Lin Z, Li S, Ni D, Liao Y, Wen H, Du J, Chen S, Wang T, Lei B. Multi-task learning for quality assessment of fetal head ultrasound images. Med Image Anal 2019; 58:101548. [PMID: 31525671 DOI: 10.1016/j.media.2019.101548] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 02/21/2019] [Revised: 07/15/2019] [Accepted: 08/23/2019] [Indexed: 11/26/2022]
Abstract
Measuring anatomical parameters in prenatal ultrasound images is essential for assessing the growth and development of the fetus, and it relies heavily on obtaining a standard plane. However, the acquisition of a standard plane is, in turn, highly subjective and depends on the clinical experience of sonographers. To deal with this challenge, we propose a new multi-task learning framework using a faster regional convolutional neural network (MF R-CNN) architecture for standard plane detection and quality assessment. MF R-CNN can identify the critical anatomical structure of the fetal head and analyze whether the magnification of the ultrasound image is appropriate, and then performs quality assessment of ultrasound images based on clinical protocols. Specifically, the first five convolution blocks of the MF R-CNN learn features shared within the input data, which can be associated with the detection and classification tasks, and then extend to the task-specific output streams. In training, to speed up the convergence of the different tasks, we devise a section train method based on transfer learning. In addition, our proposed method uses prior clinical and statistical knowledge to reduce the false detection rate. By identifying the key anatomical structure and the magnification of the ultrasound image, we score the ultrasound plane of the fetal head to judge whether it is a standard image. Experimental results on our own collected dataset show that our method can accurately assess the quality of an ultrasound plane within half a second. Our method achieves promising performance compared with state-of-the-art methods, and can improve examination effectiveness and alleviate measurement error caused by improper ultrasound scanning.
Affiliation(s)
- Zehui Lin
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China
| | - Shengli Li
- Department of Ultrasound, Affiliated Shenzhen Maternal and Child Healthcare Hospital of Nanfang Medical University, 3012 Fuqiang Rd, Shenzhen, 518060, China
| | - Dong Ni
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China
| | - Yimei Liao
- Department of Ultrasound, Affiliated Shenzhen Maternal and Child Healthcare Hospital of Nanfang Medical University, 3012 Fuqiang Rd, Shenzhen, 518060, China
| | - Huaxuan Wen
- Department of Ultrasound, Affiliated Shenzhen Maternal and Child Healthcare Hospital of Nanfang Medical University, 3012 Fuqiang Rd, Shenzhen, 518060, China
| | - Jie Du
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China
| | - Siping Chen
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China
| | - Tianfu Wang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China.
| | - Baiying Lei
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China.
| |
|
164
|
Liu X, Cao P, Wang J, Kong J, Zhao D. Fused Group Lasso Regularized Multi-Task Feature Learning and Its Application to the Cognitive Performance Prediction of Alzheimer's Disease. Neuroinformatics 2019; 17:271-294. [PMID: 30284672 DOI: 10.1007/s12021-018-9398-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/09/2023]
Abstract
Alzheimer's disease (AD) is characterized by gradual neurodegeneration and loss of brain function, especially for memory during early stages. Regression analysis has been widely applied to AD research to relate clinical and biomarker data such as predicting cognitive outcomes from MRI measures. Recently, multi-task based feature learning (MTFL) methods with sparsity-inducing [Formula: see text]-norm have been widely studied to select a discriminative feature subset from MRI features by incorporating inherent correlations among multiple clinical cognitive measures. However, existing MTFL assumes the correlation among all tasks is uniform, and the task relatedness is modeled by encouraging a common subset of features via sparsity-inducing regularizations that neglect the inherent structure of tasks and MRI features. To address this issue, we proposed a fused group lasso regularization to model the underlying structures, involving 1) a graph structure within tasks and 2) a group structure among the image features. To this end, we present a multi-task feature learning framework with a mixed norm of fused group lasso and [Formula: see text]-norm to model these more flexible structures. For optimization, we employed the alternating direction method of multipliers (ADMM) to efficiently solve the proposed non-smooth formulation. We evaluated the performance of the proposed method using the Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets. The experimental results demonstrate that incorporating the two prior structures with fused group lasso norm into the multi-task feature learning can improve prediction performance over several competing methods, with estimated correlations of cognitive functions and identification of cognition-relevant imaging markers that are clinically and biologically meaningful.
Affiliation(s)
- Xiaoli Liu
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, China
| | - Peng Cao
- Computer Science and Engineering, Northeastern University, Shenyang, China.
| | - Jianzhong Wang
- College of Information Science and Technology, Northeast Normal University, Changchun, China
| | - Jun Kong
- College of Information Science and Technology, Northeast Normal University, Changchun, China; Key Laboratory of Applied Statistics of MOE, Changchun, China
| | - Dazhe Zhao
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, China
| |
|
165
|
Abstract
BACKGROUND Biomedical named entity recognition (BioNER) is a fundamental and essential task for biomedical literature mining, which affects the performance of downstream tasks. Most BioNER models rely on domain-specific features or hand-crafted rules, but extracting features from massive data requires much time and human effort. To solve this, neural network models are used to automatically learn features. Recently, multi-task learning has been applied successfully to neural network models of biomedical literature mining. For BioNER models, multi-task learning makes use of features from multiple datasets and improves the performance of models. RESULTS In experiments, we compared our proposed model with other multi-task models and found that our model outperformed the others on datasets in the gene, protein, and disease categories. We also tested the performance of different dataset pairs to find the best dataset partners. In addition, we explored and analyzed the influence of different entity types by using sub-datasets. When dataset size was reduced, our model still produced positive results. CONCLUSION We propose a novel multi-task model for BioNER with a cross-sharing structure to improve the performance of multi-task models. The cross-sharing structure in our model makes use of features from both datasets in the training procedure. A detailed analysis of the best dataset partners and of the influence between entity categories can provide guidance for choosing proper dataset pairs for multi-task training. Our implementation is available at https://github.com/JogleLew/bioner-cross-sharing.
Affiliation(s)
- Xi Wang
- State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
| | - Jiagao Lyu
- State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
| | - Li Dong
- State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
| | - Ke Xu
- State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China.
| |
|
166
|
Tseng SY, Baucom B, Georgiou P. Unsupervised online multitask learning of behavioral sentence embeddings. PeerJ Comput Sci 2019; 5:e200. [PMID: 33816853 PMCID: PMC7924526 DOI: 10.7717/peerj-cs.200] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/22/2019] [Accepted: 05/22/2019] [Indexed: 06/12/2023]
Abstract
Appropriate embedding transformation of sentences can aid in downstream tasks such as NLP and emotion and behavior analysis. Such efforts evolved from word vectors which were trained in an unsupervised manner using large-scale corpora. Recent research, however, has shown that sentence embeddings trained using in-domain data or supervised techniques, often through multitask learning, perform better than unsupervised ones. Representations have also been shown to be applicable in multiple tasks, especially when training incorporates multiple information sources. In this work we aspire to combine the simplicity of using abundant unsupervised data with transfer learning by introducing an online multitask objective. We present a multitask paradigm for unsupervised learning of sentence embeddings which simultaneously addresses domain adaptation. We show that embeddings generated through this process increase performance in subsequent domain-relevant tasks. We evaluate on the affective tasks of emotion recognition and behavior analysis and compare our results with state-of-the-art general-purpose supervised sentence embeddings. Our unsupervised sentence embeddings outperform the alternative universal embeddings both in identifying behaviors within couples therapy and in emotion recognition.
Affiliation(s)
- Shao-Yen Tseng
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States of America
| | - Brian Baucom
- Department of Psychology, University of Utah, Salt Lake City, UT, United States of America
| | - Panayiotis Georgiou
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States of America
| |
|
167
|
Sosnin S, Vashurina M, Withnall M, Karpov P, Fedorov M, Tetko IV. A Survey of Multi-task Learning Methods in Chemoinformatics. Mol Inform 2019; 38:e1800108. [PMID: 30499195 PMCID: PMC6587441 DOI: 10.1002/minf.201800108] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 08/27/2018] [Accepted: 10/16/2018] [Indexed: 01/09/2023]
Abstract
Despite the increasing volume of available data, the proportion of experimentally measured data remains small compared to the virtual chemical space of possible chemical structures. Therefore, there is a strong interest in simultaneously predicting different ADMET and biological properties of molecules, which are frequently strongly correlated with one another. Such joint data analyses can increase the accuracy of models by exploiting their common representation and identifying common features between individual properties. In this work we review the recent developments in multi-learning approaches as well as cover the freely available tools and packages that can be used to perform such studies.
Affiliation(s)
- Sergey Sosnin
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
| | - Mariia Vashurina
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Michael Withnall
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Pavel Karpov
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Maxim Fedorov
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- University of Strathclyde, Department of Physics, John Anderson Building, 107 Rottenrow East, G4 0NG Glasgow, United Kingdom
| | - Igor V. Tetko
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
- BIGCHEM GmbH, Ingolstädter Landstraße 1, b. 60w, D-85764 Neuherberg, Germany
| |
|
168
|
Cheplygina V, de Bruijne M, Pluim JPW. Not-so-supervised: A survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med Image Anal 2019; 54:280-296. [PMID: 30959445 DOI: 10.1016/j.media.2019.03.009] [Citation(s) in RCA: 303] [Impact Index Per Article: 60.6] [Received: 09/24/2018] [Revised: 12/20/2018] [Accepted: 03/25/2019] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) algorithms have made a tremendous impact in the field of medical imaging. While medical imaging datasets have been growing in size, a frequently mentioned challenge for supervised ML algorithms is the lack of annotated data. As a result, various methods that can learn with less or other types of supervision have been proposed. We give an overview of semi-supervised, multiple instance, and transfer learning in medical imaging, in both diagnosis and segmentation tasks. We also discuss connections between these learning scenarios, and opportunities for future research. A dataset with the details of the surveyed papers is available via https://figshare.com/articles/Database_of_surveyed_literature_in_Not-so-supervised_a_survey_of_semi-supervised_multi-instance_and_transfer_learning_in_medical_image_analysis_/7479416.
Affiliation(s)
- Veronika Cheplygina
- Medical Image Analysis, Department Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands.
| | - Marleen de Bruijne
- Biomedical Imaging Group Rotterdam, Departments Radiology and Medical Informatics, Erasmus Medical Center, Rotterdam, the Netherlands; The Image Section, Department Computer Science, University of Copenhagen, Copenhagen, Denmark
| | - Josien P W Pluim
- Medical Image Analysis, Department Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands; Image Sciences Institute, University Medical Center Utrecht, Utrecht, the Netherlands
| |
|
169
|
Bi X, Wang H. Early Alzheimer's disease diagnosis based on EEG spectral images using deep learning. Neural Netw 2019; 114:119-135. [PMID: 30903945 DOI: 10.1016/j.neunet.2019.02.005] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 08/04/2018] [Revised: 01/04/2019] [Accepted: 02/14/2019] [Indexed: 11/27/2022]
Abstract
Early diagnosis of Alzheimer's disease (AD) is an increasingly pressing issue given the sharp upward trend in its incidence rate. Recently, early diagnosis of AD using the electroencephalogram (EEG) as a specific hallmark has become an increasingly active research topic. Given the limited number of available EEG spectral images, extracting more abstract features for better generalization remains very difficult. In this paper, we demonstrate that this can be addressed well with a multi-task learning strategy based on a discriminative convolutional high-order Boltzmann machine with hybrid feature maps. First, unlike our original model, the Contractive Slab and Spike Convolutional Deep Boltzmann Machine (CssCDBM), we directly perform EEG spectral image classification by introducing a label layer, resulting in a discriminative version of CssCDBM, referred to as DCssCDBM. This demonstrates that DCssCDBM can serve as a classification model rather than only as a feature extractor, as before. Then, as the most important methodological innovation, we train DCssCDBM for the first time within a multi-task learning framework using EEG spectral image-based identification and verification tasks to reduce overfitting; these tasks increase the inter-subject variations and reduce the intra-subject variations, respectively, both of which are essential to early diagnosis of AD. The proposed method shows a better ability to extract high-level representations and achieves better results than several state-of-the-art methods.
Affiliation(s)
- Xiaojun Bi
- School of Information Engineering, Minzu University of China, Beijing, China.
| | - Haibo Wang
- College of Information And Communication Engineering, Harbin Engineering University, Harbin, China.
| |
|
170
|
Lin A, Horvath D, Marcou G, Beck B, Varnek A. Multi-task generative topographic mapping in virtual screening. J Comput Aided Mol Des 2019; 33:331-343. [PMID: 30739238 DOI: 10.1007/s10822-019-00188-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/15/2018] [Accepted: 02/02/2019] [Indexed: 12/16/2022]
Abstract
The previously reported procedure to generate "universal" Generative Topographic Maps (GTMs) of the drug-like chemical space is in practice a multi-task learning process, in which both operational GTM parameters (for example, map grid size) and hyperparameters (key example: the molecular descriptor space to be used) are chosen by an evolutionary process in order to fit/select "universal" GTM manifolds. After selection (a one-time task aimed at optimizing the compromise in terms of neighborhood behavior compliance over a large pool of various biological targets), the manifolds are ready to provide "fit-free" predictive models for any further use. Using any structure-activity set, irrespective of whether the associated target served at the map fitting stage or not, the generation or "coloring" of a property landscape enables predicting the property for any external molecule, with zero additional fittable parameters involved. While previous works have signaled the excellent behavior of such models in aggressive three-fold cross-validation assessments of their predictive power, the present work explores their behavior in Virtual Screening (VS), here simulated using external DUD ligand and decoy series that are fully disjoint from the ChEMBL-extracted landscape coloring sets. Beyond the rather robust results of the universal GTM manifolds in this challenge, it could be shown that the descriptor spaces selected by the evolutionary multi-task learner were intrinsically able to serve as an excellent support for many other VS procedures, from parameter-free similarity searching, to local (target-specific) GTM models, to parameter-rich, nonlinear Random Forest and Neural Network approaches.
Affiliation(s)
- Arkadii Lin
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France; Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397, Biberach an der Riss, Germany
| | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France
| | - Gilles Marcou
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France
| | - Bernd Beck
- Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397, Biberach an der Riss, Germany
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France.
| |
|
171
|
Ho LST, Dinh V, Nguyen CV. Multi-task learning improves ancestral state reconstruction. Theor Popul Biol 2019; 126:33-39. [PMID: 30641072 DOI: 10.1016/j.tpb.2019.01.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/12/2018] [Revised: 10/28/2018] [Accepted: 01/08/2019] [Indexed: 11/20/2022]
Abstract
We consider the ancestral state reconstruction problem where we need to infer phenotypes of ancestors using observations from present-day species. For this problem, we propose a multi-task learning method that uses regularized maximum likelihood to estimate the ancestral states of various traits simultaneously. We then show both theoretically and by simulation that this method improves the estimates of the ancestral states compared to the maximum likelihood method. The result also indicates that for the problem of ancestral state reconstruction under the Brownian motion model, the maximum likelihood method can be improved.
Affiliation(s)
- Lam Si Tung Ho
- Department of Mathematics and Statistics Dalhousie University, Halifax, Nova Scotia, Canada.
| | - Vu Dinh
- Department of Mathematical Sciences, University of Delaware, USA
| | | |
|
172
|
Abstract
BACKGROUND Biomedical semantic indexing is important for information retrieval and many other research fields in bioinformatics. It annotates biomedical citations with Medical Subject Headings. In the face of unbalanced category distribution in the training data, sampling methods are difficult to apply to the semantic indexing task. RESULTS In this paper, we present a novel deep serial multi-task learning model. The primary task treats biomedical semantic indexing as a multi-label text classification problem that considers the relations of the labels. The auxiliary task is a regression task that predicts the MeSH number of the citation and provides hints for the network to make it converge faster. The experimental results on the BioASQ-Task5A open dataset show that our model outperforms the state-of-the-art solution "MTI", proposed by the US National Library of Medicine. Further, it not only achieves the highest precision among all the solutions in BioASQ-Task5A but also converges faster than some naive deep learning methods. CONCLUSIONS Rather than running in parallel as in an ordinary multi-task structure, the tasks in our model are serial and tightly coupled. It can achieve satisfactory performance without any handcrafted features.
Affiliation(s)
- Yongping Du
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Yunpeng Pan
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Chencheng Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
| | - Junzhong Ji
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
| |
|
173
|
Abstract
BACKGROUND Accurate predictive modeling in clinical research enables effective early intervention that patients are most likely to benefit from. However, due to the complex biological nature of disease progression, capturing the highly non-linear information from low-level input features is quite challenging. This requires predictive models with high capacity. In practice, clinical datasets are often of limited size, bringing a danger of overfitting for high-capacity models. To address these two challenges, we propose a deep multi-task neural network for predictive modeling. METHODS The proposed network leverages clinical measures as auxiliary targets that are related to the primary target. The predictions for the primary and auxiliary targets are made simultaneously by the neural network. The network structure is specifically designed to capture the clinical relevance by learning a shared feature representation between the primary and auxiliary targets. We apply the proposed model to a hypertension dataset and a breast cancer dataset, where the primary tasks are to predict the left ventricular mass indexed to body surface area and the time of recurrence of breast cancer. Moreover, we analyze the weights of the proposed neural network to rank input features for model interpretability. RESULTS The experimental results indicate that the proposed model outperforms the other models, achieving the best predictive accuracy (mean squared error 199.76 for hypertension data, 860.62 for Wisconsin prognostic breast cancer data) with the ability to rank features according to their contributions to the targets. The ranking is supported by previous related research. CONCLUSION We propose a novel, effective method for clinical predictive modeling by combining the deep neural network and multi-task learning. By leveraging auxiliary measures clinically related to the primary target, our method improves the predictive accuracy. Based on feature ranking, our model is interpreted and shows consistency with previous studies on cardiovascular diseases and cancers.
Affiliation(s)
- Xiangrui Li
- Department of Computer Science, Wayne State University, Detroit, MI, USA
| | - Dongxiao Zhu
- Department of Computer Science, Wayne State University, Detroit, MI, USA.
| | - Phillip Levy
- Department of Emergency Medicine, Wayne State University, Detroit, MI, USA; Integrative Biosciences Center, Wayne State University, Detroit, MI, USA
| |
|
174
|
Lan Q, Sun H, Robertson J, Deng X, Jin R. Non-invasive assessment of liver quality in transplantation based on thermal imaging analysis. Comput Methods Programs Biomed 2018; 164:31-47. [PMID: 30195430 DOI: 10.1016/j.cmpb.2018.06.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/22/2017] [Revised: 05/25/2018] [Accepted: 06/05/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Liver quality evaluation is one of the vital steps for predicting the success of liver transplantation. Current evaluation methods, such as biopsy and visual inspection, which are either invasive or lack consistent standards, provide limited predictive value for long-term transplant viability. Objective analytical models, based on real-time infrared images of livers during perfusion and preservation, are proposed as novel methods to precisely evaluate donated liver quality. METHODS In this study, using principal component analysis to extract infrared image features as predictors, we construct a multivariate logistic regression model for single liver quality evaluation, and a multi-task learning logistic regression model for cross-liver quality evaluation. RESULTS The single liver quality predictions show testing errors of 0%. The leave-one-liver-out predictions show testing errors ranging from 9% to 36%. CONCLUSIONS There is a strong correlation between the viability of livers and the infrared image features in both single liver and cross-liver quality evaluations. These analytical methods also determine that the selected significant infrared image features indicate regional differences in viability and show that more stringent pre-implantation evaluation may be needed to predict transplant outcomes.
Affiliation(s)
- Qing Lan
- Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA 24061, USA
| | - Hongyue Sun
- Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA 24061, USA
| | - John Robertson
- School of Biomedical Engineering and Sciences, Virginia Tech, VA 24061, USA
| | - Xinwei Deng
- Department of Statistics, Virginia Tech, VA 24061, USA
| | - Ran Jin
- Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA 24061, USA.
| |
|
175
|
Cao P, Liu X, Liu H, Yang J, Zhao D, Huang M, Zaiane O. Generalized fused group lasso regularized multi-task feature learning for predicting cognitive outcomes in Alzheimers disease. Comput Methods Programs Biomed 2018; 162:19-45. [PMID: 29903486 DOI: 10.1016/j.cmpb.2018.04.028] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/28/2017] [Revised: 03/19/2018] [Accepted: 04/30/2018] [Indexed: 06/08/2023]
Abstract
OBJECTIVE Alzheimer's disease (AD) is characterized by gradual neurodegeneration and loss of brain function, especially of memory during the early stages. Regression analysis has been widely applied in AD research to relate clinical and biomarker data, for example to predict cognitive outcomes from Magnetic Resonance Imaging (MRI) measures. Recently, multi-task feature learning (MTFL) methods have been widely studied to predict cognitive outcomes and select a discriminative feature subset from MRI features by incorporating the inherent correlations among multiple clinical cognitive measures. However, existing MTFL methods assume that the correlation among all tasks is uniform, and they model task relatedness by encouraging a common subset of features while neglecting the inherent structure of the tasks and the MRI features. METHODS In this paper, we propose a generalized fused group lasso (GFGL) regularization to model the underlying structures, involving (1) a graph structure within tasks and (2) a group structure among the image features. We then present a multi-task learning framework (called GFGL-MTFL) that combines the ℓ2,1-norm with the GFGL regularization to model these flexible structures. RESULTS Through empirical evaluation and comparison with different baseline methods and state-of-the-art MTL methods on data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, we illustrate that the proposed GFGL-MTFL method outperforms other methods in terms of both normalized Mean Squared Error (nMSE) and weighted correlation coefficient (wR). Improvements are statistically significant for most scores (tasks).
CONCLUSIONS The experimental results with real and synthetic data demonstrate that incorporating the two prior structures into multi-task feature learning through the generalized fused group lasso norm can improve prediction performance over several state-of-the-art competing methods, and that the estimated correlations among the cognitive functions and the identified cognition-relevant imaging markers are clinically and biologically meaningful.
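For readers unfamiliar with the ℓ2,1-norm named in the abstract, the following is a minimal sketch of that one ingredient only. It assumes a weight matrix with one row per feature and one column per task; the fused and group terms of GFGL are not shown, and the function names are illustrative rather than the authors' code.

```python
import math


def l21_norm(W):
    """Sum over rows (features) of the Euclidean norm across columns (tasks)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def prox_l21(W, t):
    """Row-wise soft-thresholding, the proximal operator of t * ||W||_{2,1}.

    Rows whose norm is at most t are set to zero, which removes that
    feature from every task at once.
    """
    result = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
        result.append([scale * w for w in row])
    return result


# Illustrative weights: 3 features x 2 tasks.
W = [[3.0, 4.0], [0.1, 0.0], [0.0, 0.0]]
norm_value = l21_norm(W)
shrunk = prox_l21(W, 0.5)
print(norm_value)
print(shrunk)
```

Because the penalty acts on whole rows, a weak feature is discarded jointly for all cognitive scores, which is the shared-feature-subset behaviour the abstract attributes to standard MTFL.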
Affiliation(s)
- Peng Cao
  - Computer Science and Engineering, Northeastern University, Shenyang, China
- Xiaoli Liu
  - Computer Science and Engineering, Northeastern University, Shenyang, China
- Hezi Liu
  - The Third People's Hospital of Fushun, Fushun, China
- Jinzhu Yang
  - Computer Science and Engineering, Northeastern University, Shenyang, China
- Dazhe Zhao
  - Computer Science and Engineering, Northeastern University, Shenyang, China
- Min Huang
  - College of Information Science and Engineering, Northeastern University, Shenyang, China
- Osmar Zaiane
  - Computing Science, University of Alberta, Edmonton, Alberta, Canada

176
Wu H, Bailey C, Rasoulinejad P, Li S. Automated comprehensive Adolescent Idiopathic Scoliosis assessment using MVC-Net. Med Image Anal 2018; 48:1-11. [PMID: 29803920 DOI: 10.1016/j.media.2018.05.005] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 08/02/2017] [Revised: 04/24/2018] [Accepted: 05/11/2018] [Indexed: 10/16/2022]
Abstract
Automated quantitative estimation of spinal curvature is an important task for the ongoing evaluation and treatment planning of Adolescent Idiopathic Scoliosis (AIS). It addresses the widely recognized disadvantages (time-consuming and unreliable) of manual Cobb angle measurement, which is currently the gold standard for AIS assessment. Attempts have been made to improve the reliability of automated Cobb angle estimation. However, accurate and robust estimation of Cobb angles is very challenging because all the required vertebrae must be correctly identified in both Anterior-posterior (AP) and Lateral (LAT) view x-rays. The challenge is especially evident in LAT x-rays, where the vertebrae are occluded by the ribcage. We therefore propose a novel Multi-View Correlation Network (MVC-Net) architecture that provides a fully automated end-to-end framework for spinal curvature estimation in multi-view (both AP and LAT) x-rays. The proposed MVC-Net uses our newly designed multi-view convolution layers to incorporate joint features of multi-view x-rays, which allows the network to mitigate the occlusion problem by utilizing the structural dependencies of the two views. The MVC-Net consists of three closely linked components: (1) a series of X-modules for joint representation of spinal structure, (2) a Spinal Landmark Estimator network for robust spinal landmark estimation, and (3) a Cobb Angle Estimator network for accurate Cobb angle estimation. By utilizing an iterative multi-task training algorithm to train the Spinal Landmark Estimator and Cobb Angle Estimator in tandem, the MVC-Net leverages the multi-task relationship between landmark and angle estimation to reliably detect all the required vertebrae for accurate Cobb angle estimation.
Experimental results on 526 x-ray images from 154 patients show an impressive 4.04° Circular Mean Absolute Error (CMAE) in AP Cobb angle and 4.07° CMAE in LAT Cobb angle estimation, which demonstrates the MVC-Net's capability of robust and accurate estimation of Cobb angles in multi-view x-rays. Our method therefore provides clinicians with a framework for efficient, accurate, and reliable estimation of spinal curvature for comprehensive AIS assessment.

177
Ma Q, Zhang T, Zanetti MV, Shen H, Satterthwaite TD, Wolf DH, Gur RE, Fan Y, Hu D, Busatto GF, Davatzikos C. Classification of multi-site MR images in the presence of heterogeneity using multi-task learning. Neuroimage Clin 2018; 19:476-486. [PMID: 29984156 PMCID: PMC6029565 DOI: 10.1016/j.nicl.2018.04.037] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 12/22/2017] [Revised: 04/09/2018] [Accepted: 04/28/2018] [Indexed: 12/21/2022]
Abstract
With the advent of Big Data Imaging Analytics applied to neuroimaging, datasets from multiple sites need to be pooled into larger samples. However, heterogeneity across different scanners, protocols and populations renders the task of finding underlying disease signatures challenging. The current work investigates the value of multi-task learning in finding disease signatures that generalize across studies and populations. Herein, we present a multi-task learning formulation in which the different tasks correspond to the different studies and populations being pooled together. We test this approach in an MRI study of the neuroanatomy of schizophrenia (SCZ) by pooling data from 3 different sites and populations: Philadelphia, Sao Paulo and Tianjin (50 controls and 50 patients from each site), which posed integration challenges due to variability in disease chronicity, treatment exposure, and data collection. Some existing methods are also tested for comparison purposes. Experiments show that the classification accuracy obtained on multi-site data with multi-task feature learning exceeded that obtained on single-site data and on pooled data, and also exceeded that of the other comparison methods. Several anatomical regions were identified as common discriminant features across sites. These included prefrontal, superior temporal, insular, anterior cingulate cortex, temporo-limbic and striatal regions consistently implicated in the pathophysiology of schizophrenia, as well as the cerebellum, precuneus, and fusiform, middle temporal, inferior parietal, postcentral, angular, lingual and middle occipital gyri. These results indicate that the proposed multi-task learning method is robust in finding consistent and reliable structural brain abnormalities associated with SCZ across different sites, in the presence of multiple sources of heterogeneity.
Affiliation(s)
- Qiongmin Ma
  - College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan 410073, China; Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States; Beijing Institute of System Engineering, China
- Tianhao Zhang
  - Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States
- Marcus V Zanetti
  - Laboratory of Psychiatric Neuroimaging (LIM-21), Department and Institute of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil
- Hui Shen
  - College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
- Daniel H Wolf
  - Department of Psychiatry, University of Pennsylvania, Philadelphia, PA 19104, United States
- Raquel E Gur
  - Department of Psychiatry, University of Pennsylvania, Philadelphia, PA 19104, United States
- Yong Fan
  - Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States
- Dewen Hu
  - College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
- Geraldo F Busatto
  - Laboratory of Psychiatric Neuroimaging (LIM-21), Department and Institute of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil
- Christos Davatzikos
  - Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States

178
Adeli E, Meng Y, Li G, Lin W, Shen D. Multi-task prediction of infant cognitive scores from longitudinal incomplete neuroimaging data. Neuroimage 2018; 185:783-792. [PMID: 29709627 DOI: 10.1016/j.neuroimage.2018.04.052] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/14/2017] [Revised: 03/26/2018] [Accepted: 04/23/2018] [Indexed: 01/13/2023] Open
Abstract
The early postnatal brain undergoes a stunning period of development. Over the past few years, research on dynamic infant brain development has received increased attention, showing how important the early stages of a child's life are in terms of brain development. To precisely chart early brain developmental trajectories, longitudinal studies with data acquired over a long-enough period of infants' early life are essential. However, in practice, missing data at one or more time points during the data gathering procedure is often inevitable. This leads to an incomplete set of longitudinal data, which poses a major challenge for such studies. In this paper, the prediction of multiple future cognitive scores from incomplete longitudinal imaging data is formulated as a multi-task machine learning framework. To efficiently learn this model, we account for the selection of informative features (i.e., neuroimaging morphometric measurements for different time points), while preserving the structural information and the interrelation between these multiple cognitive scores. Several experiments are conducted on a carefully acquired in-house dataset, and the results affirm that we can predict the cognitive scores measured at four years of age, using the imaging data of earlier time points, as early as 24 months of age, with reasonable performance (i.e., root mean square error of 0.18).
Affiliation(s)
- Ehsan Adeli
  - Department of Radiology and BRIC, University of North Carolina at Chapel Hill, NC 27599, United States; Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305, United States
- Yu Meng
  - Department of Radiology and BRIC, University of North Carolina at Chapel Hill, NC 27599, United States; Department of Computer Science, University of North Carolina at Chapel Hill, NC 27599, United States
- Gang Li
  - Department of Radiology and BRIC, University of North Carolina at Chapel Hill, NC 27599, United States
- Weili Lin
  - Department of Radiology and BRIC, University of North Carolina at Chapel Hill, NC 27599, United States
- Dinggang Shen
  - Department of Radiology and BRIC, University of North Carolina at Chapel Hill, NC 27599, United States; Department of Brain & Cognitive Eng, Korea University, Seoul, 02841, Republic of Korea

179
Abstract
Background Finding potential drug targets is a crucial step in drug discovery and development. Recently, resources such as the Library of Integrated Network-Based Cellular Signatures (LINCS) L1000 database provide gene expression profiles induced by various chemical and genetic perturbations and thereby make it possible to analyze the relationship between compounds and gene targets at a genome-wide scale. Current approaches for comparing the expression profiles are based on pairwise connectivity mapping analysis. However, this method makes the simple assumption that the effect of a drug treatment is similar to knocking down its single target gene. Since many compounds can bind multiple targets, the pairwise mapping ignores the combined effects of multiple targets, and therefore fails to detect many potential targets of the compounds. Results We propose an algorithm to find sets of gene knock-downs that induce gene expression changes similar to a drug treatment. Assuming that the effects of gene knock-downs are additive, we propose a novel bipartite block-wise sparse multi-task learning model with super-graph structure (BBSS-MTL) for multi-target drug repositioning that overcomes the restrictive assumptions of connectivity mapping analysis. Conclusions The proposed method BBSS-MTL is more accurate for predicting potential drug targets than the simple pairwise connectivity mapping analysis on five datasets generated from different cancer cell lines. Availability The code can be obtained at http://gr.xjtu.edu.cn/web/liminli/codes.

180
Thung KH, Yap PT, Adeli E, Lee SW, Shen D. Conversion and time-to-conversion predictions of mild cognitive impairment using low-rank affinity pursuit denoising and matrix completion. Med Image Anal 2018; 45:68-82. [PMID: 29414437 PMCID: PMC6892173 DOI: 10.1016/j.media.2018.01.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 08/04/2016] [Revised: 12/12/2017] [Accepted: 01/12/2018] [Indexed: 10/18/2022]
Abstract
In this paper, we aim to predict conversion and time-to-conversion of mild cognitive impairment (MCI) patients using multi-modal neuroimaging data and clinical data, via cross-sectional and longitudinal studies. However, such data are often heterogeneous, high-dimensional, noisy, and incomplete. We thus propose a framework that includes sparse feature selection, low-rank affinity pursuit denoising (LRAD), and low-rank matrix completion (LRMC). Specifically, we first use sparse linear regressions to remove unrelated features. Then, considering the heterogeneity of the MCI data, which can be assumed to lie in a union of multiple subspaces, we propose to use a low-rank subspace method (i.e., LRAD) to denoise the data. Finally, we employ the LRMC algorithm with three data fitting terms and one inequality constraint for joint conversion and time-to-conversion predictions. Our framework aims to answer an important yet rarely explored question in AD studies, i.e., when will the MCI convert to AD? This is different from survival analysis, which provides the probabilities of conversion at different time points and is mainly used for global analysis, whereas our time-to-conversion prediction is made for each individual subject. Evaluations using the ADNI dataset indicate that our method outperforms conventional LRMC and other state-of-the-art methods. Our method achieves a maximal pMCI classification accuracy of 84% and time prediction correlation of 0.665.
Affiliation(s)
- Kim-Han Thung
  - Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA
- Pew-Thian Yap
  - Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA
- Ehsan Adeli
  - Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA
- Seong-Whan Lee
  - Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
- Dinggang Shen
  - Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea

181
Liu X, Goncalves AR, Cao P, Zhao D, Banerjee A. Modeling Alzheimer's disease cognitive scores using multi-task sparse group lasso. Comput Med Imaging Graph 2017; 66:100-114. [PMID: 29602022 DOI: 10.1016/j.compmedimag.2017.11.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 10/29/2016] [Revised: 06/10/2017] [Accepted: 11/14/2017] [Indexed: 01/12/2023]
Abstract
Alzheimer's disease (AD) is a severe neurodegenerative disorder characterized by loss of memory and reduction in cognitive functions due to progressive degeneration of neurons and their connections, eventually leading to death. In this paper, we consider the problem of simultaneously predicting several different cognitive scores associated with categorizing subjects as normal, mild cognitive impairment (MCI), or Alzheimer's disease (AD) in a multi-task learning framework using features extracted from brain images obtained from ADNI (Alzheimer's Disease Neuroimaging Initiative). To solve the problem, we present a multi-task sparse group lasso (MT-SGL) framework, which estimates sparse features coupled across tasks and can work with loss functions associated with any Generalized Linear Model. Through comparisons with a variety of baseline models using multiple evaluation metrics, we illustrate the promising predictive performance of MT-SGL on ADNI along with its ability to identify brain regions more likely to help characterize Alzheimer's disease progression.
Affiliation(s)
- Xiaoli Liu
  - College of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, China; Computing Science & Engineering, University of Minnesota, Twin Cities, USA
- André R Goncalves
  - Center for Research and Development in Telecommunications (CPqD), Brazil
- Peng Cao
  - College of Computer Science and Engineering, Northeastern University, Shenyang, China
- Dazhe Zhao
  - College of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, China
- Arindam Banerjee
  - Computing Science & Engineering, University of Minnesota, Twin Cities, USA

182
Zu C, Jie B, Liu M, Chen S, Shen D, Zhang D. Label-aligned multi-task feature learning for multimodal classification of Alzheimer's disease and mild cognitive impairment. Brain Imaging Behav 2017; 10:1148-1159. [PMID: 26572145 DOI: 10.1007/s11682-015-9480-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Indexed: 11/27/2022]
Abstract
Multimodal classification methods using different modalities of imaging and non-imaging data have recently shown great advantages over traditional single-modality-based ones for diagnosis and prognosis of Alzheimer's disease (AD), as well as its prodromal stage, i.e., mild cognitive impairment (MCI). However, to the best of our knowledge, most existing methods focus on mining the relationship across multiple modalities of the same subjects, while ignoring the potentially useful relationship across different subjects. Accordingly, in this paper, we propose a novel learning method for multimodal classification of AD/MCI, by fully exploring the relationships across both modalities and subjects. Specifically, our proposed method includes two sequential components, i.e., label-aligned multi-task feature selection and multimodal classification. In the first step, feature selection from each modality is treated as a separate learning task, and a group sparsity regularizer is imposed to jointly select a subset of relevant features. Furthermore, to utilize the discriminative information among labeled subjects, a new label-aligned regularization term is added to the objective function of standard multi-task feature selection, where label-alignment means that all multi-modality subjects with the same class labels should be closer in the new feature-reduced space. In the second step, a multi-kernel support vector machine (SVM) is adopted to fuse the selected features from multi-modality data for final classification. To validate our method, we perform experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database using baseline MRI and FDG-PET imaging data. The experimental results demonstrate that our proposed method achieves better classification performance compared with several state-of-the-art methods for multimodal classification of AD/MCI.
Affiliation(s)
- Chen Zu
  - Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
- Biao Jie
  - Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
  - School of Mathematics and Computer Science, Anhui Normal University, Wuhu, 241000, China
- Mingxia Liu
  - Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
- Songcan Chen
  - Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
- Dinggang Shen
  - Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Brain and Cognitive Engineering, Korea University, Seoul, 136-701, Korea
- Daoqiang Zhang
  - Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China

183
Crichton G, Pyysalo S, Chiu B, Korhonen A. A neural network multi-task learning approach to biomedical named entity recognition. BMC Bioinformatics 2017; 18:368. [PMID: 28810903 PMCID: PMC5558737 DOI: 10.1186/s12859-017-1776-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 02/21/2017] [Accepted: 07/31/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Named Entity Recognition (NER) is a key task in biomedical text mining. Accurate NER systems require task-specific, manually-annotated datasets, which are expensive to develop and thus limited in size. Since such datasets contain related but different information, an interesting question is whether it might be possible to use them together to improve NER performance. To investigate this, we develop supervised, multi-task, convolutional neural network models and apply them to a large number of varied existing biomedical named entity datasets. Additionally, we investigate the effect of dataset size on performance in both single- and multi-task settings. RESULTS We present a single-task model for NER, a Multi-output multi-task model and a Dependent multi-task model. We apply the three models to 15 biomedical datasets containing multiple named entities including Anatomy, Chemical, Disease, Gene/Protein and Species. Each dataset represents a task. The results from the single-task model and the multi-task models are then compared for evidence of benefits from Multi-task Learning. With the Multi-output multi-task model we observed an average F-score improvement of 0.8% when compared to the single-task model from an average baseline of 78.4%. Although there was a significant drop in performance on one dataset, performance improves significantly for five datasets by up to 6.3%. For the Dependent multi-task model we observed an average improvement of 0.4% when compared to the single-task model. There were no significant drops in performance on any dataset, and performance improves significantly for six datasets by up to 1.1%. The dataset size experiments found that as dataset size decreased, the Multi-output model's performance increased compared to the single-task model's.
Using 50, 25 and 10% of the training data resulted in an average drop of approximately 3.4, 8 and 16.7% respectively for the single-task model but approximately 0.2, 3.0 and 9.8% for the multi-task model. CONCLUSIONS Our results show that, on average, the multi-task models produced better NER results than the single-task models trained on a single NER dataset. We also found that Multi-task Learning is beneficial for small datasets. Across the various settings the improvements are significant, demonstrating the benefit of Multi-task Learning for this task.
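The F-scores quoted in the abstract can be made concrete with a minimal sketch of entity-level scoring. It assumes exact-match evaluation in which an entity is a (start, end, type) tuple; this is a common convention but an assumption here, since the abstract does not state the matching criterion, and the function name and example spans are hypothetical.

```python
def f_score(gold, predicted):
    """Harmonic mean of precision and recall over exact-match entities."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (start token, end token, entity type).
gold = {(0, 2, "Gene"), (5, 6, "Disease"), (9, 10, "Chemical")}
predicted = {(0, 2, "Gene"), (5, 6, "Chemical")}
score = f_score(gold, predicted)
print(score)
```

Here only one of the two predictions matches a gold entity in both span and type, so precision is 1/2 and recall is 1/3; a span with the wrong type counts as an error on both sides.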
Affiliation(s)
- Gamal Crichton
  - Language Technology Laboratory, DTAL, University of Cambridge, 9 West Road, Cambridge, CB39DB UK
- Sampo Pyysalo
  - Language Technology Laboratory, DTAL, University of Cambridge, 9 West Road, Cambridge, CB39DB UK
- Billy Chiu
  - Language Technology Laboratory, DTAL, University of Cambridge, 9 West Road, Cambridge, CB39DB UK
- Anna Korhonen
  - Language Technology Laboratory, DTAL, University of Cambridge, 9 West Road, Cambridge, CB39DB UK

184
Wachinger C, Reuter M, Klein T. DeepNAT: Deep convolutional neural network for segmenting neuroanatomy. Neuroimage 2017; 170:434-445. [PMID: 28223187 DOI: 10.1016/j.neuroimage.2017.02.035] [Citation(s) in RCA: 173] [Impact Index Per Article: 24.7] [Received: 10/25/2016] [Revised: 02/13/2017] [Accepted: 02/13/2017] [Indexed: 10/20/2022] Open
Abstract
We introduce DeepNAT, a 3D Deep convolutional neural network for the automatic segmentation of NeuroAnaTomy in T1-weighted magnetic resonance images. DeepNAT is an end-to-end learning-based approach to brain segmentation that jointly learns an abstract feature representation and a multi-class classification. We propose a 3D patch-based approach in which we predict not only the center voxel of the patch but also its neighbors, which is formulated as multi-task learning. To address a class imbalance problem, we arrange two networks hierarchically, where the first one separates foreground from background, and the second one identifies 25 brain structures on the foreground. Since patches lack spatial context, we augment them with coordinates. To this end, we introduce a novel intrinsic parameterization of the brain volume, formed by eigenfunctions of the Laplace-Beltrami operator. As network architecture, we use three convolutional layers with pooling, batch normalization, and non-linearities, followed by fully connected layers with dropout. The final segmentation is inferred from the probabilistic output of the network with a 3D fully connected conditional random field, which ensures label agreement between close voxels. The roughly 2.7 million parameters in the network are learned with stochastic gradient descent. Our results show that DeepNAT compares favorably to state-of-the-art methods. Finally, the purely learning-based method may have a high potential for adaptation to young, old, or diseased brains by fine-tuning the pre-trained network with a small training sample on the target application, where the availability of larger datasets with manual annotations may boost the overall segmentation accuracy in the future.
Affiliation(s)
- Christian Wachinger
  - Department of Child and Adolescent Psychiatry, Psychosomatic and Psychotherapy, Ludwig-Maximilian-University, Waltherstr. 23, 81369 München, Munich, Germany
- Martin Reuter
  - Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA; German Centre for Neurodegenerative Diseases (DZNE), Department of Image Analysis, Bonn, Germany

185
Yu G, Liu Y, Shen D. Graph-guided joint prediction of class label and clinical scores for the Alzheimer's disease. Brain Struct Funct 2015; 221:3787-801. [PMID: 26476928 DOI: 10.1007/s00429-015-1132-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 10/23/2014] [Accepted: 10/04/2015] [Indexed: 11/24/2022]
Abstract
Accurate diagnosis of Alzheimer's disease and its prodromal stage, i.e., mild cognitive impairment, is very important for early treatment. Over the last decade, various machine learning methods have been proposed to predict disease status and clinical scores from brain images. It is worth noting that many features extracted from brain images are correlated significantly. In this case, feature selection combined with the additional correlation information among features can effectively improve classification/regression performance. Typically, the correlation information among features can be modeled by the connectivity of an undirected graph, where each node represents one feature and each edge indicates that the two involved features are correlated significantly. In this paper, we propose a new graph-guided multi-task learning method incorporating this undirected graph information to predict multiple response variables (i.e., class label and clinical scores) jointly. Specifically, based on the sparse undirected feature graph, we utilize a new latent group Lasso penalty to encourage the correlated features to be selected together. Furthermore, this new penalty also encourages the intrinsic correlated tasks to share a common feature subset. To validate our method, we have performed many numerical studies using simulated datasets and the Alzheimer's Disease Neuroimaging Initiative dataset. Compared with the other methods, our proposed method has very promising performance.
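The undirected feature graph described in the abstract (one node per feature, one edge per significantly correlated pair) can be sketched as follows. As a simplifying assumption, an absolute Pearson correlation threshold stands in for the significance test, whose details the abstract does not give; the function names, the threshold value and the toy data are all illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def feature_graph(features, threshold=0.8):
    """Return undirected edges (i, j), i < j, where |corr| >= threshold.

    `features` holds one list of per-subject values for each feature.
    """
    edges = set()
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if abs(pearson(features[i], features[j])) >= threshold:
                edges.add((i, j))
    return edges


# Toy data: 3 features measured on 4 subjects.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.1, 5.9, 8.0],
    [4.0, 1.0, 3.0, 2.0],
]
edges = feature_graph(features)
print(edges)
```

In this toy example only features 0 and 1 are linked; a graph-guided penalty would then encourage those two features to be selected or discarded together.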
Affiliation(s)
- Guan Yu
  - Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Yufeng Liu
  - Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Carolina Center for Genome Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Dinggang Shen
  - Department of Radiology and BRIC, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Brain and Cognitive Engineering, Korea University, Seoul, 02841, Republic of Korea

186
Suk HI, Lee SW, Shen D. Deep sparse multi-task learning for feature selection in Alzheimer's disease diagnosis. Brain Struct Funct 2015; 221:2569-87. [PMID: 25993900 DOI: 10.1007/s00429-015-1059-y] [Citation(s) in RCA: 96] [Impact Index Per Article: 10.7] [Received: 07/25/2014] [Accepted: 05/07/2015] [Indexed: 12/31/2022]
Abstract
Recently, neuroimaging-based Alzheimer's disease (AD) or mild cognitive impairment (MCI) diagnosis has attracted researchers in the field, due to the increasing prevalence of the diseases. Unfortunately, the high dimensionality of neuroimaging data, combined with the small number of available samples, makes it challenging to build a robust computer-aided diagnosis system. Machine learning techniques have been considered a useful tool in this respect and, among various methods, sparse regression has shown its validity in the literature. However, to the best of our knowledge, the existing sparse regression methods mostly try to select features based on the optimal regression coefficients in one step. We argue that since the training feature vectors are composed of both informative and uninformative or less informative features, the resulting optimal regression coefficients are inevitably affected by the uninformative or less informative features. To this end, we first propose a novel deep architecture to recursively discard uninformative features by performing sparse multi-task learning in a hierarchical fashion. We further hypothesize that the optimal regression coefficients reflect the relative importance of features in representing the target response variables. In this regard, we use the optimal regression coefficients learned in one hierarchy as feature weighting factors in the following hierarchy, and formulate a weighted sparse multi-task learning method. Lastly, we also take into account the distributional characteristics of samples per class and use clustering-induced subclass label vectors as target response values in our sparse regression model. In our experiments on the ADNI cohort, we performed both binary and multi-class classification tasks in AD/MCI diagnosis and showed the superiority of the proposed method by comparing with the state-of-the-art methods.
Affiliation(s)
- Heung-Il Suk
  - Department of Brain and Cognitive Engineering, Korea University, Seoul, 136-713, Republic of Korea
- Seong-Whan Lee
  - Department of Brain and Cognitive Engineering, Korea University, Seoul, 136-713, Republic of Korea
- Dinggang Shen
  - Department of Brain and Cognitive Engineering, Korea University, Seoul, 136-713, Republic of Korea
  - Biomedical Research Imaging Center and Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
|
187
|
Thung KH, Wee CY, Yap PT, Shen D. Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and completion. Neuroimage 2014; 91:386-400. [PMID: 24480301 PMCID: PMC4096013 DOI: 10.1016/j.neuroimage.2014.01.033] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 07/14/2013] [Revised: 01/13/2014] [Accepted: 01/18/2014] [Indexed: 12/17/2022] Open
Abstract
In this work, we are interested in predicting the diagnostic statuses of potentially neurodegenerated patients using feature values derived from multi-modality neuroimaging data and biological data, which might be incomplete. Collecting the feature values into a matrix, with each row containing a feature vector of a sample, we propose a framework to predict the corresponding associated multiple target outputs (e.g., diagnosis label and clinical scores) from this feature matrix by performing matrix shrinkage followed by matrix completion. Specifically, we first combine the feature and target output matrices into a large matrix and then partition this large incomplete matrix into smaller submatrices, each consisting of samples with complete feature values (corresponding to a certain combination of modalities) and target outputs. Treating each target output as the outcome of a prediction task, we apply a 2-step multi-task learning algorithm to select the most discriminative features and samples in each submatrix. Features and samples that are not selected in any of the submatrices are discarded, resulting in a shrunk version of the original large matrix. The missing feature values and unknown target outputs of the shrunk matrix are then completed simultaneously. Experimental results using the ADNI dataset indicate that our proposed framework achieves higher classification accuracy at a greater speed when compared with conventional imputation-based classification methods and also yields competitive performance when compared with the state-of-the-art methods.
Affiliation(s)
- Kim-Han Thung
  - Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
- Chong-Yaw Wee
  - Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
- Pew-Thian Yap
  - Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
- Dinggang Shen
  - Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
  - Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
|
188
|
Tsao S, Gajawelli N, Zhou J, Shi J, Ye J, Wang Y, Lepore N. Evaluating the Predictive Power of Multivariate Tensor-based Morphometry in Alzheimers Disease Progression via Convex Fused Sparse Group Lasso. Proc SPIE Int Soc Opt Eng 2014; 9034:90342L. [PMID: 25076826 DOI: 10.1117/12.2042720] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]
Abstract
Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications in decisions concerning treatment strategy. To this end, we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity and temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information and ApoE status. Whilst volumetric information may hold generalized information on brain status, we hypothesized that hippocampus-specific information may be more useful in predictive modeling of AD. To this end, we applied the recently developed multivariate tensor-based morphometry (mTBM) parametric surface analysis method of Shi et al. [2] to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampal surface, we are able to significantly improve predictive performance of ADAS cognitive scores 6, 12, 24, 36 and 48 months from baseline.
Affiliation(s)
- Sinchai Tsao
  - University of Washington, Seattle, Washington, USA
- Jiayu Zhou
  - Arizona State University, Phoenix, Arizona, USA
- Jie Shi
  - Arizona State University, Phoenix, Arizona, USA
- Jieping Ye
  - Arizona State University, Phoenix, Arizona, USA
- Yalin Wang
  - Arizona State University, Phoenix, Arizona, USA
- Natasha Lepore
  - Children's Hospital Los Angeles, Los Angeles, California, USA
|
189
|
Marquand AF, Brammer M, Williams SC, Doyle OM. Bayesian multi-task learning for decoding multi-subject neuroimaging data. Neuroimage 2014; 92:298-311. [PMID: 24531053 DOI: 10.1016/j.neuroimage.2014.02.008] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 09/26/2013] [Revised: 01/28/2014] [Accepted: 02/03/2014] [Indexed: 11/25/2022] Open
Abstract
Decoding models based on pattern recognition (PR) are becoming increasingly important tools for neuroimaging data analysis. In contrast to alternative (mass-univariate) encoding approaches that use hierarchical models to capture inter-subject variability, inter-subject differences are not typically handled efficiently in PR. In this work, we propose to overcome this problem by recasting the decoding problem in a multi-task learning (MTL) framework. In MTL, a single PR model is used to learn different but related “tasks” simultaneously. The primary advantage of MTL is that it makes more efficient use of the data available and leads to more accurate models by making use of the relationships between tasks. In this work, we construct MTL models where each subject is modelled by a separate task. We use a flexible covariance structure to model the relationships between tasks and induce coupling between them using Gaussian process priors. We present an MTL method for classification problems and demonstrate a novel mapping method suitable for PR models. We apply these MTL approaches to classifying many different contrasts in a publicly available fMRI dataset and show that the proposed MTL methods produce higher decoding accuracy and more consistent discriminative activity patterns than currently used techniques. Our results demonstrate that MTL provides a promising method for multi-subject decoding studies by focusing on the commonalities between a group of subjects rather than the idiosyncratic properties of different subjects. In mass-univariate analysis, mixed effects models can capture subject variability. In pattern recognition (PR), subject variability is usually not modelled explicitly. Multi-task learning (MTL) is proposed to accommodate subject variability in PR. The proposed approach improves predictive accuracy and pattern reproducibility. A novel brain mapping approach is also proposed for MTL and existing PR models.
|
190
|
Zhang L, Sedykh A, Tripathi A, Zhu H, Afantitis A, Mouchlis VD, Melagraki G, Rusyn I, Tropsha A. Identification of putative estrogen receptor-mediated endocrine disrupting chemicals using QSAR- and structure-based virtual screening approaches. Toxicol Appl Pharmacol 2013; 272:67-76. [PMID: 23707773 PMCID: PMC3775906 DOI: 10.1016/j.taap.2013.04.032] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 01/31/2013] [Revised: 04/16/2013] [Accepted: 04/17/2013] [Indexed: 12/24/2022]
Abstract
Identification of endocrine disrupting chemicals is one of the important goals of environmental chemical hazard screening. We report on the development of validated in silico predictors of chemicals likely to cause estrogen receptor (ER)-mediated endocrine disruption to facilitate their prioritization for future screening. A database of relative binding affinity of a large number of ERα and/or ERβ ligands was assembled (546 for ERα and 137 for ERβ). Both single-task learning (STL) and multi-task learning (MTL) continuous quantitative structure-activity relationship (QSAR) models were developed for predicting ligand binding affinity to ERα or ERβ. High predictive accuracy was achieved for ERα binding affinity (MTL R²=0.71, STL R²=0.73). For ERβ binding affinity, MTL models were significantly more predictive (R²=0.53, p<0.05) than STL models. In addition, docking studies were performed on a set of ER agonists/antagonists (67 agonists and 39 antagonists for ERα, 48 agonists and 32 antagonists for ERβ, supplemented by putative decoys/non-binders) using the following ER structures (in complexes with respective ligands) retrieved from the Protein Data Bank: ERα agonist (PDB ID: 1L2I), ERα antagonist (PDB ID: 3DT3), ERβ agonist (PDB ID: 2NV7), and ERβ antagonist (PDB ID: 1L2J). We found that all four ER conformations discriminated their corresponding ligands from presumed non-binders. Finally, both QSAR models and ER structures were employed in parallel to virtually screen several large libraries of environmental chemicals to derive a ligand- and structure-based prioritized list of putative estrogenic compounds to be used for in vitro and in vivo experimental validation.
Affiliation(s)
- Liying Zhang
  - Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC
- Alexander Sedykh
  - Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC
- Ashutosh Tripathi
  - Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC
- Hao Zhu
  - The Rutgers Center for Computational and Integrative Biology, Rutgers University, Camden, NJ
  - Department of Chemistry, Rutgers University, Camden, NJ
- Ivan Rusyn
  - Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC
- Alexander Tropsha
  - Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC
|
191
|
Abstract
Multi-task learning (MTL) aims to improve the performance of multiple related tasks by exploiting the intrinsic relationships among them. Recently, multi-task feature learning algorithms have received increasing attention and they have been successfully applied to many applications involving high-dimensional data. However, they assume that all tasks share a common set of features, which is too restrictive and may not hold in real-world applications, since outlier tasks often exist. In this paper, we propose a Robust MultiTask Feature Learning algorithm (rMTFL) which simultaneously captures a common set of features among relevant tasks and identifies outlier tasks. Specifically, we decompose the weight (model) matrix for all tasks into two components. We impose the well-known group Lasso penalty on row groups of the first component for capturing the shared features among relevant tasks. To simultaneously identify the outlier tasks, we impose the same group Lasso penalty but on column groups of the second component. We propose to employ the accelerated gradient descent to efficiently solve the optimization problem in rMTFL, and show that the proposed algorithm is scalable to large-size problems. In addition, we provide a detailed theoretical analysis on the proposed rMTFL formulation. Specifically, we present a theoretical bound to measure how well our proposed rMTFL approximates the true evaluation, and provide bounds to measure the error between the estimated weights of rMTFL and the underlying true weights. Moreover, by assuming that the underlying true weights are above the noise level, we present a sound theoretical result to show how to obtain the underlying true shared features and outlier tasks (sparsity patterns). Empirical studies on both synthetic and real-world data demonstrate that our proposed rMTFL is capable of simultaneously capturing shared features among tasks and identifying outlier tasks.
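The decomposition described in this abstract can be sketched in a few lines: the weight matrix is written as W = P + Q, with a group Lasso penalty on the rows of P (shared features) and on the columns of Q (outlier tasks). The sketch below is illustrative only. It assumes a least-squares loss and uses plain proximal gradient steps rather than the accelerated scheme the abstract mentions; the names `prox_rows` and `rmtfl_sketch` and the parameters `lam1`, `lam2` are hypothetical.

```python
import numpy as np

def prox_rows(M, t):
    """Group soft-thresholding: shrink each row of M toward zero by t in 2-norm."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M * np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))

def rmtfl_sketch(Xs, ys, lam1, lam2, n_iter=300):
    """Proximal gradient on sum_t ||X_t w_t - y_t||^2 / (2 n_t)
    + lam1 * (sum of row norms of P) + lam2 * (sum of column norms of Q),
    where W = P + Q and w_t is column t of W. Assumed loss: least squares.
    """
    d, T = Xs[0].shape[1], len(Xs)
    P = np.zeros((d, T))
    Q = np.zeros((d, T))
    # The smooth part depends on P + Q, so its Lipschitz constant in (P, Q)
    # is twice the largest per-task curvature.
    L = 2.0 * max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / L
    for _ in range(n_iter):
        W = P + Q
        # Column t holds the gradient of task t's loss with respect to w_t.
        G = np.column_stack([X.T @ (X @ W[:, t] - y) / X.shape[0]
                             for t, (X, y) in enumerate(zip(Xs, ys))])
        P = prox_rows(P - step * G, step * lam1)        # row groups: shared features
        Q = prox_rows((Q - step * G).T, step * lam2).T  # column groups: outlier tasks
    return P, Q
```

Rows of P that end up entirely zero correspond to features dropped for all tasks, while nonzero columns of Q flag tasks that deviate from the shared pattern.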
Affiliation(s)
- Pinghua Gong
  - State Key Laboratory on Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Automation, Tsinghua University, Beijing 100084, China
- Jieping Ye
  - Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287
- Changshui Zhang
  - State Key Laboratory on Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Automation, Tsinghua University, Beijing 100084, China
|
192
|
Chen J, Liu J, Ye J. Learning Incoherent Sparse and Low-Rank Patterns from Multiple Tasks. ACM Trans Knowl Discov Data 2012; 5:22. [PMID: 24077658 PMCID: PMC3783291 DOI: 10.1145/2086737.2086742] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 12/01/2010] [Accepted: 11/01/2011] [Indexed: 06/01/2023]
Abstract
We consider the problem of learning incoherent sparse and low-rank patterns from multiple tasks. Our approach is based on a linear multi-task learning formulation, in which the sparse and low-rank patterns are induced by a cardinality regularization term and a low-rank constraint, respectively. This formulation is non-convex; we convert it into its convex surrogate, which can be routinely solved via semidefinite programming for small-size problems. We propose to employ the general projected gradient scheme to efficiently solve such a convex surrogate; however, in the optimization formulation, the objective function is non-differentiable and the feasible domain is non-trivial. We present the procedures for computing the projected gradient and ensuring the global convergence of the projected gradient scheme. The computation of the projected gradient involves a constrained optimization problem; we show that the optimal solution to such a problem can be obtained by solving an unconstrained optimization subproblem and a Euclidean projection subproblem. We also present two projected gradient algorithms and analyze their rates of convergence in detail. In addition, we illustrate the use of the presented projected gradient algorithms for the proposed multi-task learning formulation using the least squares loss. Experimental results on a collection of real-world data sets demonstrate the effectiveness of the proposed multi-task learning formulation and the efficiency of the proposed projected gradient algorithms.
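The two subproblems named in this abstract can be illustrated with standard building blocks. The sketch below assumes the usual convex surrogates, namely an l1 penalty in place of the cardinality term and a trace-norm (nuclear-norm) ball in place of the low-rank constraint; the abstract does not spell these out, so treat them as assumptions. The function names and the radius `tau` (assumed positive) are hypothetical.

```python
import numpy as np

def soft_threshold(P, t):
    """Closed-form solution of the unconstrained l1-regularized subproblem."""
    return np.sign(P) * np.maximum(np.abs(P) - t, 0.0)

def project_l1_ball(v, tau):
    """Euclidean projection of a nonnegative vector v onto {x >= 0, sum(x) <= tau}."""
    if v.sum() <= tau:
        return v.copy()
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    rho = np.nonzero(u - (css - tau) / ks > 0)[0][-1]
    theta = (css[rho] - tau) / (rho + 1)      # shift that makes the sum equal tau
    return np.maximum(v - theta, 0.0)

def project_trace_norm_ball(Q, tau):
    """Euclidean projection of Q onto {Z : sum of singular values of Z <= tau}."""
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    return (U * project_l1_ball(s, tau)) @ Vt
```

A projected gradient iteration in this setting would alternate a gradient step on the loss with `soft_threshold` on the sparse component and `project_trace_norm_ball` on the low-rank component.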
|
193
|
Abstract
This paper focuses on the problem of choosing a prior for an unknown random effects distribution within a Bayesian hierarchical model. The goal is to obtain a sparse representation by allowing a combination of global and local borrowing of information. A local partition process prior is proposed, which induces dependent local clustering. Subjects can be clustered together for a subset of their parameters, and one learns about similarities between subjects increasingly as parameters are added. Some basic properties are described, including simple two-parameter expressions for marginal and conditional clustering probabilities. A slice sampler is developed which bypasses the need to approximate the countably infinite random measure in performing posterior computation. The methods are illustrated using simulation examples, and an application to hormone trajectory data.
Affiliation(s)
- David B. Dunson
  - Department of Statistical Science, Box 90251, Duke University, Durham, North Carolina 27708, U.S.A.
|