1
A Novel Hybrid Transfer Learning Framework for Dynamic Cutterhead Torque Prediction of the Tunnel Boring Machine. Energies 2022. [DOI: 10.3390/en15082907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
A tunnel boring machine (TBM) is an important large-scale engineering machine that is widely applied in tunnel construction. Precise cutterhead torque prediction plays an essential role in estimating energy consumption costs and in safe operation during tunneling, since it directly influences the adaptive adjustment of excavation parameters. Under complicated and variable geological conditions, the operational and status parameters of the TBM usually exhibit spatio-temporally varying characteristics, which poses a serious challenge to conventional data-based methods for dynamic cutterhead torque prediction. In this study, a novel hybrid transfer learning framework, namely TRLS-SVR, is proposed to transfer knowledge from a historical dataset that may contain multiple working patterns and to alleviate noise interference in fresh data when addressing dynamic cutterhead torque prediction. Compared with conventional data-driven algorithms, TRLS-SVR takes long-past historical data into account and can effectively extract and leverage the common latent knowledge implied in historical datasets for current prediction. A collection of in situ TBM operation data from a tunnel project located in China is utilized to evaluate the performance of the proposed framework.
2
Forouzannezhad P, Maes D, Hippe DS, Thammasorn P, Iranzad R, Han J, Duan C, Liu X, Wang S, Chaovalitwongse WA, Zeng J, Bowen SR. Multitask Learning Radiomics on Longitudinal Imaging to Predict Survival Outcomes following Risk-Adaptive Chemoradiation for Non-Small Cell Lung Cancer. Cancers (Basel) 2022; 14:1228. [PMID: 35267535 PMCID: PMC8909466 DOI: 10.3390/cancers14051228] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 12/10/2021] [Revised: 02/23/2022] [Accepted: 02/25/2022] [Indexed: 11/16/2022]
Abstract
Medical imaging provides quantitative and spatial information to evaluate treatment response in the management of patients with non-small cell lung cancer (NSCLC). High throughput extraction of radiomic features on these images can potentially phenotype tumors non-invasively and support risk stratification based on survival outcome prediction. The prognostic value of radiomics from different imaging modalities and time points prior to and during chemoradiation therapy of NSCLC, relative to conventional imaging biomarker or delta radiomics models, remains uncharacterized. We investigated the utility of multitask learning of multi-time point radiomic features, as opposed to single-task learning, for improving survival outcome prediction relative to conventional clinical imaging feature model benchmarks. Survival outcomes were prospectively collected for 45 patients with unresectable NSCLC enrolled on the FLARE-RT phase II trial of risk-adaptive chemoradiation and optional consolidation PD-L1 checkpoint blockade (NCT02773238). FDG-PET, CT, and perfusion SPECT imaging pretreatment and week 3 mid-treatment was performed and 110 IBSI-compliant pyradiomics shape-/intensity-/texture-based features from the metabolic tumor volume were extracted. Outcome modeling consisted of a fused Laplacian sparse group LASSO with component-wise gradient boosting survival regression in a multitask learning framework. Testing performance under stratified 10-fold cross-validation was evaluated for multitask learning radiomics of different imaging modalities and time points. Multitask learning models were benchmarked against conventional clinical imaging and delta radiomics models and evaluated with the concordance index (c-index) and index of prediction accuracy (IPA). FDG-PET radiomics had higher prognostic value for overall survival in test folds (c-index 0.71 [0.67, 0.75]) than CT radiomics (c-index 0.64 [0.60, 0.71]) or perfusion SPECT radiomics (c-index 0.60 [0.57, 0.63]). 
Multitask learning of pre-/mid-treatment FDG-PET radiomics (c-index 0.71 [0.67, 0.75]) outperformed benchmark clinical imaging (c-index 0.65 [0.59, 0.71]) and FDG-PET delta radiomics (c-index 0.52 [0.48, 0.58]) models. Similarly, the IPA for multitask learning FDG-PET radiomics (30%) was higher than clinical imaging (26%) and delta radiomics (15%) models. Radiomics models performed consistently under different voxel resampling conditions. Multitask learning radiomics for outcome modeling provides a clinical decision support platform that leverages longitudinal imaging information. This framework can reveal the relative importance of different imaging modalities and time points when designing risk-adaptive cancer treatment strategies.
Affiliation(s)
- Parisa Forouzannezhad
  - Department of Radiation Oncology, School of Medicine, University of Washington, Seattle, WA 98195, USA
- Dominic Maes
  - Department of Radiation Oncology, School of Medicine, University of Washington, Seattle, WA 98195, USA
- Daniel S. Hippe
  - Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Phawis Thammasorn
  - Department of Industrial Engineering, University of Arkansas, Fayetteville, AR 72701, USA
- Reza Iranzad
  - Department of Industrial Engineering, University of Arkansas, Fayetteville, AR 72701, USA
- Jie Han
  - Department of Industrial, Manufacturing, and System Engineering, University of Texas, Arlington, TX 76019, USA
- Chunyan Duan
  - Department of Mechanical Engineering, Tongji University, Shanghai 200092, China
- Xiao Liu
  - Department of Industrial Engineering, University of Arkansas, Fayetteville, AR 72701, USA
- Shouyi Wang
  - Department of Industrial, Manufacturing, and System Engineering, University of Texas, Arlington, TX 76019, USA
- W. Art Chaovalitwongse
  - Department of Industrial Engineering, University of Arkansas, Fayetteville, AR 72701, USA
- Jing Zeng
  - Department of Radiation Oncology, School of Medicine, University of Washington, Seattle, WA 98195, USA
- Stephen R. Bowen
  - Department of Radiation Oncology, School of Medicine, University of Washington, Seattle, WA 98195, USA
  - Department of Radiology, School of Medicine, University of Washington, Seattle, WA 98195, USA
3
Uncertain Interval Forecasting for Combined Electricity-Heat-Cooling-Gas Loads in the Integrated Energy System Based on Multi-Task Learning and Multi-Kernel Extreme Learning Machine. Mathematics 2021. [DOI: 10.3390/math9141645] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
Abstract
The accurate prediction of electricity-heat-cooling-gas loads on the demand side in the integrated energy system (IES) can provide a significant reference for multiple energy planning and stable operation of the IES. This paper combines the multi-task learning (MTL) method, the Bootstrap method, the improved Salp Swarm Algorithm (ISSA) and the multi-kernel extreme learning machine (MKELM) method to establish an uncertain interval prediction model of electricity-heat-cooling-gas loads. The ISSA introduces a dynamic inertia weight and a chaotic local searching mechanism into the basic SSA to improve the searching speed and avoid falling into local optima. The MKELM model is established by combining the RBF kernel function and the Poly kernel function to integrate the learning ability and generalization ability of the two functions. Based on the established model, weather, calendar information, socio-economic factors, and historical load are selected as the input variables. Empirical analysis and comparative discussion show that: (1) the prediction results for workdays are better than those for holidays; (2) the Bootstrap-ISSA-MKELM model based on the MTL method outperforms that based on the STL method; and (3) the established uncertain interval prediction model has superior performance in combined electricity-heat-cooling-gas load prediction.
4
Zhang C, Tao D, Hu T, Liu B. Generalization Bounds of Multitask Learning From Perspective of Vector-Valued Function Learning. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:1906-1919. [PMID: 32497006 DOI: 10.1109/tnnls.2020.2995428] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
In this article, we study the generalization performance of multitask learning (MTL) by considering MTL as a learning process of vector-valued functions (VFs). We answer two theoretical questions, given a small-size training sample: 1) under what conditions does MTL perform better than single-task learning (STL)? And 2) under what conditions does MTL guarantee the consistency of all tasks during learning? In contrast to the conventional task-summation based MTL, the introduction of the VF form enables us to detect the behavior of each task and the task-group relatedness in MTL. Specifically, the task-group relatedness examines how the success (or failure) of some tasks affects the performance of the other tasks. By deriving the specific deviation and symmetrization inequalities for VFs, we obtain a generalization bound for MTL that upper-bounds the joint probability that there is at least one task with a large generalization gap. To answer the first question, we discuss how the synergic relatedness between task groups affects the generalization performance of MTL and show that MTL outperforms STL if almost any pair of complementary task groups is predominantly synergic. Moreover, to answer the second question, we present a sufficient condition to guarantee the consistency of each task in MTL, which requires that the function class of each task should not have high complexity. In addition, our findings provide a strategy to examine whether the task settings will enjoy the advantages of MTL.
5
Yu C, Gao Z, Zhang W, Yang G, Zhao S, Zhang H, Zhang Y, Li S. Multitask Learning for Estimating Multitype Cardiac Indices in MRI and CT Based on Adversarial Reverse Mapping. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:493-506. [PMID: 32310804 DOI: 10.1109/tnnls.2020.2984955] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 06/11/2023]
Abstract
The estimation of multitype cardiac indices from cardiac magnetic resonance imaging (MRI) and computed tomography (CT) images attracts great attention because of its clinical potential for comprehensive function assessment. However, most existing models can only work in one imaging modality (MRI or CT) without transferable capability. In this article, we propose a multitask learning method with reverse inferring for estimating multitype cardiac indices in MRI and CT. Different from the existing forward inferring methods, our method builds a reverse mapping network that maps the multitype cardiac indices to cardiac images. The task dependencies are then learned and shared with the multitask learning networks using an adversarial training approach. Finally, we transfer the parameters learned from MRI to CT. A series of experiments were conducted in which we first optimized the performance of our framework via ten-fold cross-validation on over 2900 cardiac MRI images. Then, the fine-tuned network was run on an independent data set with 2360 cardiac CT images. The results of all the experiments conducted on the proposed adversarial reverse mapping show excellent performance in estimating multitype cardiac indices.
6
Sun G, Cong Y, Zhang Y, Zhao G, Fu Y. Continual Multiview Task Learning via Deep Matrix Factorization. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:139-150. [PMID: 32175877 DOI: 10.1109/tnnls.2020.2977497] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
State-of-the-art multitask multiview (MTMV) learning tackles a scenario where multiple tasks are related to each other via multiple shared feature views. However, in many real-world scenarios where a sequence of multiview tasks arrives, the higher storage requirement and computational cost of retraining previous tasks with MTMV models present a formidable challenge for this lifelong learning scenario. To address this challenge, in this article, we propose a new continual multiview task learning model that integrates deep matrix factorization and sparse subspace learning in a unified framework, which is termed deep continual multiview task learning (DCMvTL). More specifically, as a new multiview task arrives, DCMvTL first adopts a deep matrix factorization technique to capture hidden and hierarchical representations for this newly arriving multiview task while accumulating the fresh multiview knowledge in a layerwise manner. Then, a sparse subspace learning model is employed for the extracted factors at each layer and further reveals cross-view correlations via a self-expressive constraint. For model optimization, we derive a general multiview learning formulation for when a new multiview task arrives and apply an alternating minimization strategy to achieve lifelong learning. Extensive experiments on benchmark data sets demonstrate the effectiveness of our proposed DCMvTL model compared with the existing state-of-the-art MTMV and lifelong multiview task learning models.
7
He F, Liu T, Tao D. Why ResNet Works? Residuals Generalize. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:5349-5362. [PMID: 32031953 DOI: 10.1109/tnnls.2020.2966319] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Indexed: 06/10/2023]
Abstract
Residual connections significantly boost the performance of deep neural networks. However, few theoretical results address the influence of residuals on the hypothesis complexity and the generalization ability of deep neural networks. This article studies the influence of residual connections on the hypothesis complexity of the neural network in terms of the covering number of its hypothesis space. We first present an upper bound of the covering number of networks with residual connections. This bound shares a similar structure with that of neural networks without residual connections. This result suggests that moving a weight matrix or nonlinear activation from the bone to a vine would not increase the hypothesis space. Afterward, an O(1 / √N) margin-based multiclass generalization bound is obtained for ResNet, as an exemplary case of any deep neural network with residual connections. Generalization guarantees for similar state-of-the-art neural network architectures, such as DenseNet and ResNeXt, are straightforward. According to the obtained generalization bound, regularization terms should be introduced in practice to keep the norms of the weight matrices from growing too large and thereby ensure good generalization ability, which justifies the technique of weight decay.
8
Qin N, Liang K, Huang D, Ma L, Kemp AH. Multiple Convolutional Recurrent Neural Networks for Fault Identification and Performance Degradation Evaluation of High-Speed Train Bogie. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:5363-5376. [PMID: 32054588 DOI: 10.1109/tnnls.2020.2966744] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 05/27/2023]
Abstract
The bogie is an important part of a high-speed train (HST), and its mechanical performance has a direct impact on the safety and reliability of the HST. Most existing fault diagnosis methods focus only on the identification of bogie fault types and disregard the potential mechanical performance degradation status. However, for application scenarios such as auxiliary maintenance, identifying the performance degradation of the bogie is critical in determining a particular maintenance strategy. In this article, by considering the intrinsic link between fault type and performance degradation of the bogie, a novel multiple convolutional recurrent neural network (M-CRNN) that consists of two CRNN frameworks is proposed for simultaneous diagnosis of fault type and performance degradation state. Specifically, CRNN framework 1 is designed to detect the fault types of the bogie. Meanwhile, CRNN framework 2, which is formed by CRNN framework 1 and an RNN module, is adopted to further extract the features of fault performance degradation. It is worth highlighting that M-CRNN extends the structure of traditional neural networks and makes full use of the temporal correlation of performance degradation and model fault types. The effectiveness of the proposed M-CRNN algorithm is tested via the HST model CRH380A at different running speeds, including 160, 200, and 220 km/h. The overall accuracy of M-CRNN, i.e., the product of the accuracies for identifying the fault types and evaluating the fault performance degradation, exceeds 94.6% in all cases. This demonstrates the potential applicability of the proposed method for multiple fault diagnosis tasks of the HST bogie system.
9
10
Kim H, Suh D. Hybrid Particle Swarm Optimization for Multi-Sensor Data Fusion. Sensors (Basel, Switzerland) 2018; 18:E2792. [PMID: 30149565 PMCID: PMC6165151 DOI: 10.3390/s18092792] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/11/2018] [Revised: 08/17/2018] [Accepted: 08/22/2018] [Indexed: 11/22/2022]
Abstract
A hybrid particle swarm optimization (PSO) method, able to overcome large-scale nonlinearity or heavy correlation in the data fusion model of multiple sensing information, is proposed in this paper. In recent smart convergence technology, multiple similar and/or dissimilar sensors are widely used to sense information precisely from different perspectives, and these are integrated with data fusion algorithms to obtain synergistic effects. However, constructing the data fusion model is not trivial because of the restricted conditions of a multi-sensor system, such as limited options for deploying sensors, nonlinear characteristics, or correlation errors of multiple sensors. This paper presents a hybrid PSO to facilitate the construction of a robust data fusion model based on a neural network while ensuring the balance between exploration and exploitation. The performance of the proposed model was evaluated on benchmarks composed of representative datasets. The well-optimized data fusion model is expected to enhance synergistic accuracy.
Affiliation(s)
- Hyunseok Kim
  - IoT Research Division, Electronics and Telecommunications Research Institute (ETRI), Daejeon 34129, Korea
- Dongjun Suh
  - School of Convergence & Fusion System Engineering, Kyungpook National University, Sangju 37224, Korea