1. Uddin S, Lu H. Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data. PLoS One 2024; 19:e0301541. [PMID: 38635591] [PMCID: PMC11025817] [DOI: 10.1371/journal.pone.0301541]
Abstract
Many individual studies in the literature have observed the superiority of tree-based machine learning (ML) algorithms. However, the current body of literature lacks statistical validation of this superiority. This study addresses this gap by applying five ML algorithms to 200 open-access datasets from a wide range of research contexts to statistically confirm the superiority of tree-based ML algorithms over their counterparts. Specifically, it examines two tree-based ML algorithms (Decision tree and Random forest) and three non-tree-based ML algorithms (Support vector machine, Logistic regression and k-nearest neighbour). Results from paired-sample t-tests show that both tree-based ML algorithms perform better than each non-tree-based ML algorithm on the four ML performance measures (accuracy, precision, recall and F1 score) considered in this study, each at the p<0.001 significance level. This performance superiority is consistent across both the model development and test phases. For further validation, this study also applied paired-sample t-tests to the subsets of the research datasets from the disease prediction (66 datasets) and university-ranking (50 datasets) research contexts. The observed superiority of the tree-based ML algorithms remains valid for these subsets: tree-based ML algorithms significantly outperformed non-tree-based algorithms in both research contexts on all four performance measures. We discuss the research implications of these findings in detail in this article.
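The paired-sample t-test underlying these comparisons can be sketched as follows; the per-dataset accuracies below are hypothetical illustrations, not values from the study:

```python
import math

# Hypothetical per-dataset accuracies for a tree-based and a non-tree-based
# model on the same eight datasets (illustrative numbers only).
rf_acc  = [0.91, 0.88, 0.93, 0.85, 0.90, 0.87, 0.92, 0.89]
svm_acc = [0.86, 0.84, 0.90, 0.81, 0.88, 0.83, 0.89, 0.85]

# Paired-sample t statistic: t = mean(d) / (sd(d) / sqrt(n)),
# where d_i is the per-dataset difference in accuracy.
d = [a - b for a, b in zip(rf_acc, svm_acc)]
n = len(d)
mean_d = sum(d) / n
var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # sample variance
t_stat = mean_d / math.sqrt(var_d / n)
print(round(t_stat, 3))
```

A large positive t on n − 1 degrees of freedom is what yields the p < 0.001 conclusions reported above; in practice `scipy.stats.ttest_rel` computes the same statistic together with its p-value.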
Affiliation(s)
- Shahadat Uddin
- School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, Australia
- Haohui Lu
- School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, Australia
2. Aminizadeh S, Heidari A, Dehghan M, Toumaj S, Rezaei M, Jafari Navimipour N, Stroppa F, Unal M. Opportunities and challenges of artificial intelligence and distributed systems to improve the quality of healthcare service. Artif Intell Med 2024; 149:102779. [PMID: 38462281] [DOI: 10.1016/j.artmed.2024.102779]
Abstract
The healthcare sector, characterized by vast datasets and many diseases, is pivotal in shaping community health and overall quality of life. Traditional healthcare methods are often limited in disease prevention, predominantly reacting to illnesses after their onset rather than proactively averting them. The advent of Artificial Intelligence (AI) has ushered in a wave of transformative applications designed to enhance healthcare services, with Machine Learning (ML) as a noteworthy subset of AI. ML empowers computers to analyze extensive datasets, while Deep Learning (DL), a specific ML methodology, excels at extracting meaningful patterns from them. Despite notable technological advancements in recent years, the full potential of these applications within medical contexts remains largely untapped, primarily due to the medical community's cautious stance toward novel technologies. The motivation of this paper lies in recognizing the pivotal role of the healthcare sector in community well-being and the necessity of a shift toward proactive healthcare approaches. To our knowledge, there is a notable absence of a comprehensive published review that delves into ML, DL and distributed systems, all aimed at elevating the Quality of Service (QoS) in healthcare. This study seeks to bridge this gap by presenting a systematic and organized review of prevailing ML, DL, and distributed-system algorithms as applied in healthcare settings. Within our work, we outline key challenges that both current and future developers may encounter, with a particular focus on aspects such as approach, data utilization, strategy, and development processes. Our findings reveal that the Internet of Things (IoT) stands out as the most frequently utilized platform (44.3%), with disease diagnosis emerging as the predominant healthcare application (47.8%). Notably, discussions center significantly on the prevention and identification of cardiovascular diseases (29.2%). The studies under examination employ a diverse range of ML and DL methods, along with distributed systems, with Convolutional Neural Networks (CNNs) being the most commonly used (16.7%), followed by Long Short-Term Memory (LSTM) networks (14.6%) and shallow learning networks (12.5%). In evaluating QoS, the predominant emphasis is on the accuracy parameter (80%). This study highlights how ML, DL, and distributed systems reshape healthcare. It contributes to advancing healthcare quality, bridging the gap between technology and medical adoption, and benefiting practitioners and patients.
Affiliation(s)
- Sarina Aminizadeh
- Medical Faculty, Tabriz Branch, Islamic Azad University, Tabriz, Iran
- Arash Heidari
- Department of Software Engineering, Haliç University, Istanbul 34060, Türkiye
- Mahshid Dehghan
- Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Shiva Toumaj
- Urmia University of Medical Sciences, Urmia, Iran
- Mahsa Rezaei
- Faculty of Surgery, Tabriz University of Medical Sciences, Tabriz, Iran
- Nima Jafari Navimipour
- Future Technology Research Center, National Yunlin University of Science and Technology, Douliou 64002, Taiwan; Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Kadir Has University, Istanbul, Türkiye
- Fabio Stroppa
- Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Kadir Has University, Istanbul, Türkiye
- Mehmet Unal
- Department of Mathematics, School of Engineering and Natural Sciences, Bahçeşehir University, Istanbul, Türkiye
3. Wu S, Yan Y, Wang W. CF-YOLOX: An Autonomous Driving Detection Model for Multi-Scale Object Detection. Sensors (Basel) 2023; 23:3794. [PMID: 37112134] [PMCID: PMC10144478] [DOI: 10.3390/s23083794]
Abstract
In self-driving cars, object detection algorithms are becoming increasingly important, and the accurate and fast recognition of objects is critical to realizing autonomous driving. Existing detection algorithms are not ideal for detecting small objects. This paper proposes a YOLOX-based network model for multi-scale object detection tasks in complex scenes. The method adds a CBAM-G module to the backbone of the original network, which performs grouping operations on CBAM. It changes the height and width of the convolution kernel of the spatial attention module to 7 × 1 to improve the model's ability to extract prominent features. We propose an object-contextual feature fusion module, which can provide more semantic information and improve the perception of multi-scale objects. Finally, considering that small objects are fewer in number and contribute less to the loss, we introduce a scaling factor that increases the loss of small objects to improve small-object detection. We validated the effectiveness of the proposed method on the KITTI dataset, and the mAP value was 2.46% higher than that of the original model. Experimental comparisons showed that our model achieved superior detection performance compared to other models.
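The small-object loss-scaling idea can be sketched as a size-dependent weight applied per box; the inverse-area rule, reference fraction, and cap below are hypothetical choices, not the paper's exact factor:

```python
import math

def small_object_weight(box_area, image_area, ref=0.01, cap=3.0):
    """Loss weight that grows as the box gets smaller relative to the image.

    A box covering `ref` (1%) of the image gets weight 1.0; smaller boxes get
    sqrt-scaled larger weights, clipped at `cap`. All constants are assumed.
    """
    rel = max(box_area / image_area, 1e-8)
    return min(cap, max(1.0, math.sqrt(ref / rel)))

# A 32x32 box in a 640x640 image covers 0.25% of it, so it is up-weighted:
w = small_object_weight(32 * 32, 640 * 640)
scaled_loss = w * 0.8  # 0.8 is a dummy per-box loss, scaled before summation
print(round(w, 3))
```

The design intent is that large boxes keep their ordinary loss (weight 1.0) while small, under-represented boxes contribute more gradient, which is the effect the abstract attributes to its scaling factor.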
4. Zhu G, Luo X, Yang T, Cai L, Yeo JH, Yan G, Yang J. Deep learning-based recognition and segmentation of intracranial aneurysms under small sample size. Front Physiol 2022; 13:1084202. [PMID: 36601346] [PMCID: PMC9806214] [DOI: 10.3389/fphys.2022.1084202]
Abstract
The manual identification and segmentation of intracranial aneurysms (IAs) involved in the 3D reconstruction procedure are labor-intensive and prone to human error. To meet the demands of routine clinical management and large cohort studies of IAs, fast and accurate patient-specific IA reconstruction has become a research frontier. In this study, a deep-learning-based framework for IA identification and segmentation was developed, and the impacts of image pre-processing and convolutional neural network (CNN) architecture on the framework's performance were investigated. Three-dimensional (3D) segmentation-dedicated architectures, including 3D UNet, VNet, and 3D Res-UNet, were evaluated. The dataset used in this study included 101 sets of anonymized cranial computed tomography angiography (CTA) images with 140 IA cases. After labeling and image pre-processing, a training set and a test set containing 112 and 28 IA lesions, respectively, were used to train and evaluate the convolutional neural networks mentioned above. The performances of the three convolutional neural networks were compared in terms of training performance, segmentation performance, and segmentation efficiency using multiple quantitative metrics. All the convolutional neural networks showed a non-zero voxel-wise recall (V-Recall) at the case level. Among them, 3D UNet exhibited the best overall segmentation performance under the relatively small sample size. The automatic segmentation results based on 3D UNet reached an average V-Recall of 0.797 ± 0.140 (3.5% and 17.3% higher than that of VNet and 3D Res-UNet), as well as an average dice similarity coefficient (DSC) of 0.818 ± 0.100, which was 4.1% and 11.7% higher than that of VNet and 3D Res-UNet. Moreover, the average Hausdorff distance (HD) of the 3D UNet was 3.323 ± 3.212 voxels, which was 8.3% and 17.3% lower than that of VNet and 3D Res-UNet. The three-dimensional deviation analysis also showed that the segmentations of 3D UNet had the smallest deviation, with a max distance of +1.4760/-2.3854 mm, an average distance of 0.3480 mm, a standard deviation (STD) of 0.5978 mm, and a root mean square (RMS) of 0.7269 mm. In addition, the average segmentation time (AST) of the 3D UNet was 0.053 s, equal to that of 3D Res-UNet and 8.62% shorter than that of VNet. The results of this study suggest that the proposed deep learning framework integrated with 3D UNet can provide fast and accurate IA identification and segmentation.
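The voxel-wise recall and Dice similarity coefficient reported above can be computed from overlap counts; a minimal sketch on toy flattened binary masks:

```python
# Toy flattened binary masks standing in for 3D prediction and ground truth.
pred  = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]

tp = sum(p and t for p, t in zip(pred, truth))       # true-positive voxels
fp = sum(p and not t for p, t in zip(pred, truth))   # false-positive voxels
fn = sum(t and not p for p, t in zip(pred, truth))   # false-negative voxels

v_recall = tp / (tp + fn)          # voxel-wise recall (V-Recall)
dsc = 2 * tp / (2 * tp + fp + fn)  # Dice similarity coefficient
print(v_recall, dsc)
```

The Hausdorff distance in the abstract is a surface-distance metric and needs the voxel coordinates rather than these counts, which is why it is reported separately.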
Affiliation(s)
- Guangyu Zhu
- School of Energy and Power Engineering, Xi’an Jiaotong University, Xi’an, China. *Correspondence: Guangyu Zhu; Jian Yang
- Xueqi Luo
- School of Energy and Power Engineering, Xi’an Jiaotong University, Xi’an, China
- Tingting Yang
- School of Energy and Power Engineering, Xi’an Jiaotong University, Xi’an, China
- Li Cai
- Xi’an Key Laboratory of Scientific Computation and Applied Statistics, Xi’an, China; School of Mathematics and Statistics, Northwestern Polytechnical University, Xi’an, China
- Joon Hock Yeo
- School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, Singapore
- Ge Yan
- Department of Radiology, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
- Jian Yang
- Department of Radiology, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China. *Correspondence: Guangyu Zhu; Jian Yang
5. Comparisons of deep learning and machine learning while using text mining methods to identify suicide attempts of patients with mood disorders. J Affect Disord 2022; 317:107-113. [PMID: 36029873] [DOI: 10.1016/j.jad.2022.08.054]
Abstract
BACKGROUND: Suicide attempt is one of the most severe consequences for patients with mood disorders. This study aimed to apply deep learning and machine learning with text mining to identify patients with suicide attempts and to compare the methods' effectiveness. METHODS: A total of 13,100 patients with mood disorders were selected. Two traditional machine learning methods applied to text mining features, logistic regression and the support vector machine (SVM), and one deep learning model (a convolutional neural network, CNN) were adopted to perform an overall analysis and gender-specific subgroup analyses to identify suicide attempts. The classification effectiveness of these models was evaluated by accuracy, F1 score, precision, recall, and the area under the receiver operating characteristic (ROC) curve. RESULTS: The CNN's results were better than those of the other two models on all indicators except recall, which was slightly lower than the SVM's in the male subgroup analysis. The accuracy values of the CNN were 98.4%, 98.2%, and 98.5% in the overall analysis and the subgroup analyses for males and females, respectively. McNemar's test showed that the CNN's and SVM's predictions differed statistically from the logistic regression model's predictions in the overall analysis and the female subgroup analysis (P < 0.050). LIMITATIONS: A fixed number of features was selected based on document frequency to train the models, and this was a single-site study. CONCLUSIONS: The CNN model was the better way to detect suicide attempts in patients with mood disorders prior to hospital admission, saving time and resources in recognizing high-risk patients and preventing suicide.
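McNemar's test used here compares two classifiers on the same patients through their discordant predictions; a minimal sketch with hypothetical counts (`b` and `c` are assumed, not from the study):

```python
import math

# Hypothetical disagreement counts between CNN and logistic regression on the
# same test patients: b = cases only the CNN got right, c = only LR got right.
b, c = 34, 12

# McNemar chi-square with continuity correction, 1 degree of freedom.
chi2 = (abs(b - c) - 1) ** 2 / (b + c)

# Survival function of chi-square(1) expressed via the normal tail:
# P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 3), round(p_value, 4))
```

Only the discordant pairs matter: cases both models classify identically carry no information about which model is better, which is why the test reduces to `b` and `c`.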
6. ECG Heartbeat Classification Using ConvXGB Model. Electronics 2022. [DOI: 10.3390/electronics11152280]
Abstract
Electrocardiogram (ECG) signals are reliable for identifying and monitoring patients with various cardiac diseases and severe cardiovascular syndromes, including arrhythmia and myocardial infarction (MI). Thus, cardiologists use ECG signals in diagnosing cardiac diseases. Machine learning (ML) has also proven its usefulness in the medical field and in signal classification. However, current ML approaches rely on hand-crafted feature extraction methods or very complicated deep learning networks. This paper presents ConvXGB, a novel method for feature extraction from ECG signals and ECG classification using a convolutional neural network (CNN) with eXtreme Gradient Boosting (XGBoost). The model was established by stacking two convolutional layers for automatic feature extraction from ECG signals, followed by XGBoost as the last layer, which is used for classification. This technique simplifies ECG classification in comparison to other methods by minimizing the number of required parameters and eliminating the need for weight readjustment throughout the backpropagation phase. Furthermore, experiments on two well-known ECG datasets, the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) and Physikalisch-Technische Bundesanstalt (PTB) datasets, demonstrated that this technique handles the ECG signal classification issue better than either a CNN or XGBoost alone. In addition, a comparison showed that this model outperformed state-of-the-art models, with scores of 0.9938, 0.9839, 0.9836, 0.9837, and 0.9911 for accuracy, precision, recall, F1-score, and specificity, respectively.
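The ConvXGB idea of convolutional layers feeding a boosted classifier can be sketched with a minimal 1D convolution feature extractor; the kernel and signal below are hypothetical, and the downstream XGBoost step is only indicated:

```python
# Minimal 1D convolution (valid padding, single filter) standing in for the
# stacked conv layers that feed the gradient-boosted classifier.
def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(xs):
    return [max(0.0, x) for x in xs]

# Toy ECG-like beat window and a simple edge-detecting kernel (both assumed).
beat = [0.0, 0.1, 0.9, 1.0, 0.2, 0.0, -0.1, 0.0]
features = relu(conv1d(beat, [-1.0, 0.0, 1.0]))
# In a ConvXGB-style stack, `features` would be flattened and passed to an
# XGBoost classifier instead of a fully connected softmax layer.
print(features)
```

Replacing the dense output layer with boosted trees is what removes the final layer from backpropagation, matching the parameter-saving argument in the abstract.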
7. Detection and Monitoring of Pitting Progression on Gear Tooth Flank Using Deep Learning. Appl Sci (Basel) 2022. [DOI: 10.3390/app12115327]
Abstract
Gears are essential machine elements that are exposed to heavy loads. In some cases, gearboxes are critical elements, since they drive machines that must operate almost every day for an extended period, such as years or even tens of years. Any interruption due to gear failure can cause significant losses, so it is necessary to have a monitoring system that ensures proper operation. Tooth surface damage is a common occurrence in operating gears, and one of the most common types of damage to tooth surfaces is pitting. Normal gear operation requires regularly determining the occurrence and extent of tooth surface damage caused by pitting. In this paper, we propose a machine vision system, as part of the inspection process, for detecting pitting and monitoring its progression. The implemented inspection system uses a Faster R-CNN network to identify and localize pitting on a specific tooth, which enables monitoring. Prediction confidence values for pitting damage detection are between 99.5% and 99.9%, while prediction confidence values for teeth recognized as crucial for monitoring are between 97% and 99%.
8. Oralhan Z, Oralhan B, Khayyat MM, Abdel-Khalek S, Mansour RF. 3D Input Convolutional Neural Network for SSVEP Classification in Design of Brain Computer Interface for Patient User. Comput Math Methods Med 2022; 2022:8452002. [PMID: 35664638] [PMCID: PMC9159868] [DOI: 10.1155/2022/8452002]
Abstract
This research presents the performance of 3-dimensional-input convolutional neural networks for steady-state visual evoked potential (SSVEP) classification in a wireless EEG-based brain-computer interface system. The overall performance of a brain-computer interface system depends on its information transfer rate. Parameters such as the signal classification accuracy rate, the signal stimulator structure, and the user task completion time affect the information transfer rate. In this study, we used three signal classification methods: 1-dimensional-, 2-dimensional-, and 3-dimensional-input convolutional neural networks. In an online experiment using the 3-dimensional-input convolutional neural network, we reached an average classification accuracy rate of 93.75% and an average information transfer rate of 58.35 bit/min. Both results are significantly higher than those of the other methods used in the experiments. Moreover, the user task completion time was reduced when using the 3-dimensional-input convolutional neural network. Our proposed method is a novel, state-of-the-art model for steady-state visual evoked potential classification.
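The information transfer rate mentioned above is commonly computed with the Wolpaw formula; a sketch with a hypothetical target count and selection time (the paper's exact settings are not reproduced here, so the result only approximates the reported 58.35 bit/min):

```python
import math

def itr_bits_per_min(n_targets, accuracy, time_per_selection_s):
    """Wolpaw information transfer rate for an N-target BCI.

    bits/selection = log2(N) + P*log2(P) + (1-P)*log2((1-P)/(N-1)),
    scaled by the number of selections per minute.
    """
    p, n = accuracy, n_targets
    bits = math.log2(n)
    if 0 < p < 1:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * (60.0 / time_per_selection_s)

# Hypothetical settings: 8 SSVEP targets, 93.75% accuracy, 2.5 s per selection.
print(round(itr_bits_per_min(8, 0.9375, 2.5), 2))
```

The formula shows why both accuracy and task completion time drive ITR: shortening the selection time raises the rate linearly, while accuracy gains compound through the per-selection bit count.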
Affiliation(s)
- Zeki Oralhan
- Department of Electrical Electronics Engineering, Nuh Naci Yazgan University, 38090 Kayseri, Turkey
- Burcu Oralhan
- Department of Business Administration, Nuh Naci Yazgan University, 38090 Kayseri, Turkey
- Manal M. Khayyat
- Computer Science Department, Deanship of Preparatory Year of the Joint Medical Track, Umm Al-Qura University, Makkah, Saudi Arabia
- Sayed Abdel-Khalek
- Department of Mathematics, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Romany F. Mansour
- Department of Mathematics, Faculty of Science, New Valley University, El-Kharga 72511, Egypt
9. Latif J, Tu S, Xiao C, Ur Rehman S, Imran A, Latif Y. ODGNet: a deep learning model for automated optic disc localization and glaucoma classification using fundus images. SN Appl Sci 2022. [DOI: 10.1007/s42452-022-04984-3]
Abstract
Glaucoma is one of the prevalent causes of blindness in the modern world. It is a serious chronic eye disease that leads to irreversible vision loss. The harm caused by glaucoma can be limited if it is identified at an early stage. In this paper, a novel two-phase Optic Disc localization and Glaucoma Diagnosis Network (ODGNet) is proposed. In the first phase, a visual saliency map incorporated with a shallow CNN is used for effective optic disc (OD) localization from fundus images. In the second phase, transfer-learning-based pre-trained models are used for glaucoma diagnosis. The transfer-learning-based models, such as AlexNet, ResNet, and VGGNet, incorporated with saliency maps, are evaluated on five public retinal datasets (ORIGA, HRF, DRIONS-DB, DR-HAGIS, and RIM-ONE) to differentiate between normal and glaucomatous images. The experimental results demonstrate that the proposed ODGNet evaluated on ORIGA for glaucoma diagnosis is the most predictive model, achieving accuracy, specificity, sensitivity, and area under the curve of 95.75%, 94.90%, 94.75%, and 97.85%, respectively. These results indicate that the proposed OD localization method based on the saliency map and a shallow CNN is robust and accurate and saves computational cost.
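The reported accuracy, specificity, and sensitivity follow directly from confusion-matrix counts; a minimal sketch with hypothetical counts (not the study's data):

```python
# Toy confusion counts for a binary normal-vs-glaucoma screen (hypothetical);
# glaucoma is treated as the positive class.
tp, tn, fp, fn = 190, 192, 10, 8

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # recall on glaucomatous eyes
specificity = tn / (tn + fp)   # recall on normal eyes
print(round(accuracy, 4), round(sensitivity, 4), round(specificity, 4))
```

The area under the curve additionally requires the model's continuous scores across thresholds, so it cannot be recovered from a single confusion matrix like the three metrics above.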
10. Gradient-Sensitive Optimization for Convolutional Neural Networks. Comput Intell Neurosci 2021. [DOI: 10.1155/2021/6671830]
Abstract
Convolutional neural networks (CNNs) are effective models for image classification and recognition. Gradient descent (GD) optimization is the basic algorithm for CNN model optimization, and since GD appeared, a series of improved algorithms have been derived. Among these, adaptive moment estimation (Adam) has been widely recognized. However, Adam ignores local gradient changes to some extent. In this paper, we introduce an adaptive learning rate factor based on the current and recent gradients. Using this factor, we dynamically adjust the learning rate of each independent parameter to adaptively steer the global convergence process. The convergence of the proposed algorithm is proven using the regret bound approach of the online learning framework. In the experimental section, the proposed algorithm is compared with existing algorithms, such as AdaGrad, RMSprop, Adam, diffGrad, and AdaHMG, on test functions and the MNIST dataset. The results show that Adam and RMSprop combined with our algorithm not only find the global minimum faster on the test functions but also yield a better convergence curve and higher test set accuracy on the datasets. Our algorithm is a supplement to existing gradient descent algorithms and can be combined with many of them to improve the efficiency of iteration, speed up the convergence of the cost function, and improve the final recognition rate.
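The idea of scaling each parameter's learning rate by a factor built from the current and recent gradients can be sketched on a 1D problem; the sigmoid-of-gradient-change factor below is a hypothetical stand-in in the spirit of diffGrad, not the paper's exact rule:

```python
import math

def optimize(grad, x0, steps=1000, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Adam-style update with an extra gradient-sensitive factor xi."""
    x, m, v, prev_g = x0, 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias corrections
        v_hat = v / (1 - b2 ** t)
        # Hypothetical local-change factor: large gradient change -> xi near 1,
        # settled gradient -> xi near 0.5, damping the step.
        xi = 1.0 / (1.0 + math.exp(-abs(prev_g - g)))
        x -= lr * xi * m_hat / (math.sqrt(v_hat) + eps)
        prev_g = g
    return x

# Minimize f(x) = (x - 3)^2 from x0 = -5; gradient is 2*(x - 3).
x_min = optimize(lambda x: 2 * (x - 3), -5.0)
```

Because the factor multiplies the per-parameter step, it plugs into Adam or RMSprop without changing their moment estimates, matching the abstract's claim that the method supplements existing optimizers.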
11. A Hybrid Deep CNN Model for Abnormal Arrhythmia Detection Based on Cardiac ECG Signal. Sensors (Basel) 2021; 21:951. [PMID: 33535397] [PMCID: PMC7867037] [DOI: 10.3390/s21030951]
Abstract
Electrocardiogram (ECG) signals play a vital role in diagnosing and monitoring patients suffering from various cardiovascular diseases (CVDs). This research aims to develop a robust algorithm that can accurately classify the electrocardiogram signal even in the presence of environmental noise. A one-dimensional convolutional neural network (CNN) with two convolutional layers, two down-sampling layers, and a fully connected layer is proposed in this work. The same 1D data were also transformed into two-dimensional (2D) images to improve the model's classification accuracy, and a 2D CNN model consisting of input and output layers, three 2D convolutional layers, three down-sampling layers, and a fully connected layer was applied. Classification accuracies of 97.38% and 99.02% are achieved with the proposed 1D and 2D models, respectively, when tested on the publicly available Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database. Both proposed CNN models outperformed the corresponding state-of-the-art classification algorithms on the same data, which validates their effectiveness.
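The 1D-to-2D transformation described above amounts to reshaping a beat window into an image; a minimal sketch (the window length and image size are assumed, not the paper's):

```python
# Reshape a 1D beat window into a 2D "image" for a 2D CNN, using plain
# nested lists to keep the sketch dependency-free.
def to_2d(signal, rows, cols):
    assert len(signal) == rows * cols, "window must fill the image exactly"
    return [signal[r * cols:(r + 1) * cols] for r in range(rows)]

beat = [i / 255 for i in range(256)]   # dummy 256-sample beat window
img = to_2d(beat, 16, 16)              # 16x16 single-channel "image"
print(len(img), len(img[0]))
```

The payoff of the reshape is that 2D convolutions can then pick up correlations between samples that are far apart in time but adjacent across rows, which 1D kernels of modest width cannot reach.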
12. ModPSO-CNN: an evolutionary convolution neural network with application to visual recognition. Soft Comput 2020. [DOI: 10.1007/s00500-020-05288-7]
13. Attention to the Variation of Probabilistic Events: Information Processing with Message Importance Measure. Entropy 2019; 21:439. [PMID: 33267153] [PMCID: PMC7514929] [DOI: 10.3390/e21050439]
Abstract
Events with different probabilities attract different degrees of attention in many scenarios, such as anomaly detection and security systems. To characterize the importance of events from a probabilistic perspective, the message importance measure (MIM) has been proposed as a semantics analysis tool. Similar to Shannon entropy, the MIM has its own role in information representation, in which the parameter of the MIM plays a vital part. Indeed, the parameter dominates the properties of the MIM, giving the measure three working regions where it can be used flexibly for different goals. When the parameter is positive but not too large, the MIM not only provides a new viewpoint for information processing but also has some similarities with Shannon entropy in information compression and transmission. In this regard, this paper first constructs a system model with the message importance measure and proposes the message importance loss to enrich information processing strategies. Moreover, the message importance loss capacity is proposed to measure the information importance harvest in a transmission. Furthermore, the message importance distortion function is discussed to give an upper bound on information compression based on the MIM. Additionally, the bitrate of transmission constrained by the message importance loss is investigated to broaden the scope of Shannon information theory.
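The MIM is commonly stated as L(p, w) = log Σᵢ pᵢ·exp(w·(1 − pᵢ)); a sketch of this form follows (verify the exact definition against the paper before relying on it):

```python
import math

# Message importance measure (MIM) of a distribution p with parameter w,
# in the commonly stated form L(p, w) = log(sum_i p_i * exp(w * (1 - p_i))).
def mim(p, w):
    return math.log(sum(pi * math.exp(w * (1 - pi)) for pi in p))

# For a uniform distribution over n events, L reduces to w * (1 - 1/n),
# which makes the role of the parameter w easy to see.
uniform4 = [0.25] * 4
print(round(mim(uniform4, 1.0), 4))  # 0.75 = 1.0 * (1 - 1/4)
```

The exponential weight exp(w·(1 − pᵢ)) is what emphasizes small-probability events for positive w, which is the measure's point of departure from Shannon entropy.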
14. Recognizing Information Feature Variation: Message Importance Transfer Measure and Its Applications in Big Data. Entropy 2018; 20:401. [PMID: 33265491] [PMCID: PMC7512920] [DOI: 10.3390/e20060401]
Abstract
Information transfer that characterizes information feature variation can have a crucial impact on big data analytics and processing. A measure of information transfer can reflect system change statistically through the variable distributions, in the manner of Kullback-Leibler (KL) divergence and Rényi divergence. Furthermore, to some degree, small-probability events may carry the most important part of the total message in an information transfer over big data. It is therefore worthwhile to propose an information transfer measure that captures message importance from the viewpoint of small-probability events. In this paper, we present the message importance transfer measure (MITM) and analyze its performance and applications in three respects. First, we discuss the robustness of the MITM by using it to measure information distance. Then, we present a message importance transfer capacity based on the MITM and give an upper bound for the information transfer process with disturbance. Finally, we apply the MITM to the queue-length selection problem, which is fundamental to caching operations in mobile edge computing.
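The KL divergence that the MITM is contrasted with can be sketched directly; the before/after distributions below are hypothetical:

```python
import math

# Kullback-Leibler divergence D(p || q), the classical baseline for measuring
# how a feature's distribution shifts when the system changes.
def kl_divergence(p, q):
    # Terms with p_i = 0 contribute 0 by the usual 0*log(0) convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

before = [0.70, 0.20, 0.10]   # hypothetical distribution before a change
after  = [0.60, 0.20, 0.20]   # distribution after the system changes
d_pq = kl_divergence(before, after)
print(round(d_pq, 4))
```

KL weights each log-ratio by pᵢ, so shifts in high-probability events dominate; the MITM's motivation, per the abstract, is precisely to re-weight toward the small-probability events that KL de-emphasizes.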