1
Naeem MR, Amin R, Farhan M, Alotaibi FA, Alnfiai MM, Sampedro GA, Karovič V. Harnessing AI and analytics to enhance cybersecurity and privacy for collective intelligence systems. PeerJ Comput Sci 2024; 10:e2264. [PMID: 39314701 PMCID: PMC11419604 DOI: 10.7717/peerj-cs.2264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 07/24/2024] [Indexed: 09/25/2024]
Abstract
Collective intelligence systems like Chat Generative Pre-Trained Transformer (ChatGPT) have emerged, bringing both promise and peril to cybersecurity and privacy protection. This study introduces novel approaches to harness the power of artificial intelligence (AI) and big data analytics to enhance security and privacy in this new era. Contributions could explore topics such as: leveraging natural language processing (NLP) in ChatGPT-like systems to strengthen information security; evaluating privacy-enhancing technologies to maximize data utility while minimizing personal data exposure; modeling human behavior and agency to build secure and ethical human-centric systems; applying machine learning to detect threats and vulnerabilities in a data-driven manner; using analytics to preserve privacy in large datasets while enabling value creation; and crafting AI techniques that operate in a trustworthy and explainable manner. This article advances the state of the art at the intersection of cybersecurity, privacy, human factors, ethics, and cutting-edge AI, providing impactful solutions to emerging challenges. Our research presents an approach to malware detection that leverages deep learning (DL) based methodologies to automatically learn features from raw data. Our approach involves constructing a grayscale image from a malware file and extracting features to minimize its size. This process lets us discern patterns that might remain hidden from other techniques, enabling us to use convolutional neural networks (CNNs) to learn from these grayscale images and a stacking ensemble to classify malware. The goal is to model a highly complex nonlinear function with parameters that can be optimized to achieve superior performance. To test our approach, we ran it on 6,414 malware variants and 2,050 benign files from the MalImg collection, resulting in 99.86% validation accuracy for malware detection.
Furthermore, we conducted a classification experiment on 15 malware families and 13 tests with varying parameters to compare our model with comparable research. Our model outperformed most similar research, with detection accuracy ranging from 47.07% to 99.81%, a significant increase in detection performance. Our results demonstrate the efficacy of our approach, which unlocks the hidden patterns that underlie complex systems, advancing the frontiers of computational security.
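The abstract does not spell out how the grayscale image is built from a malware file. A minimal sketch, assuming the common byte-per-pixel mapping (each byte becomes one 0-255 intensity, laid out row by row), might look like the following. The function names, the fixed width, and the zero-padding of the last row are illustrative assumptions, not the authors' pipeline.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte (0-255) to one pixel intensity, filling rows left to right."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Assumption: zero-pad the tail so the final row is complete.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def to_pgm(image: list[list[int]]) -> bytes:
    """Serialize the rows as a binary PGM (P5) file, a simple grayscale format."""
    height = len(image)
    width = len(image[0]) if image else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(pixel for row in image for pixel in row)


# Hypothetical input: ten bytes standing in for the contents of a binary file.
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
pgm = to_pgm(image)
```

In practice the resulting image would be resized to a fixed shape before being passed to a CNN; that step is omitted here.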
Affiliation(s)
- Muhammad Rehan Naeem
  - Department of Computer Science, University of Engineering and Technology Taxila, Taxila, Punjab, Pakistan
- Rashid Amin
  - Department of Computer Science, University of Engineering and Technology Taxila, Taxila, Punjab, Pakistan
- Muhammad Farhan
  - School of Science and Engineering, Al Akhawayn University in Ifrane, Ifrane, Morocco
- Faiz Abdullah Alotaibi
  - Assistant Professor, Department of Information Science, College of Humanities and Social Sciences, King Saud University, Riyadh, Saudi Arabia
- Mrim M. Alnfiai
  - Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Gabriel Avelino Sampedro
  - Faculty of Information and Communication Studies, University of the Philippines Open University, Los Baños, Philippines
  - Center for Computational Imaging and Visual Innovations, De La Salle University, Taft Ave, Malate, Manila, Philippines
- Vincent Karovič
  - Faculty of Management, Comenius University in Bratislava, Odbojárov, Bratislava, Slovakia
2
Alsubai S, Dutta AK, Alnajim AM, Wahab Sait AR, Ayub R, AlShehri AM, Ahmad N. Artificial intelligence-driven malware detection framework for internet of things environment. PeerJ Comput Sci 2023; 9:e1366. [PMID: 37346520 PMCID: PMC10280412 DOI: 10.7717/peerj-cs.1366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 04/04/2023] [Indexed: 06/23/2023]
Abstract
The Internet of Things (IoT) environment demands a malware detection (MD) framework for protecting sensitive data from unauthorized access. The study intends to develop an image-based MD framework. The authors apply image conversion and enhancement techniques to convert malware binaries into RGB images. You Only Look Once version 7 (YOLOv7) is employed to extract the key features from the malware images. Harris Hawks optimization is used to optimize the DenseNet161 model, which classifies images as malware or benign. The IoT malware and VirusShare datasets are used to evaluate the proposed framework's performance. The outcome reveals that the proposed framework outperforms current MD frameworks. It achieves an accuracy and F1-score of 98.65 and 98.5 on the IoT malware dataset and 97.3 and 96.63 on the VirusShare dataset. In addition, it achieves areas under the receiver operating characteristic and precision-recall curves of 0.98 and 0.85 on the IoT malware dataset and 0.97 and 0.84 on the VirusShare dataset. The study's outcome reveals that the proposed framework can be deployed in the IoT environment to protect resources.
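The abstract does not describe the image conversion step in detail. As a rough sketch only, one simple way to turn a binary into an RGB image is to treat every three consecutive bytes as one (R, G, B) pixel. The function name, the image width, and the zero-padding below are assumptions for illustration; the enhancement techniques the authors mention are not reproduced.

```python
import math


def bytes_to_rgb(data: bytes, width: int = 64) -> list[list[tuple[int, ...]]]:
    """Group bytes into (R, G, B) triples and arrange them into rows of `width` pixels."""
    if width <= 0:
        raise ValueError("width must be positive")
    pixel_count = math.ceil(len(data) / 3)
    height = math.ceil(pixel_count / width)
    # Assumption: zero-pad so the last pixel and the last row are complete.
    padded = data.ljust(height * width * 3, b"\x00")
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


# Hypothetical input: seven bytes standing in for a malware binary.
rgb_image = bytes_to_rgb(bytes(range(7)), width=2)
```

Other mappings are possible, for example deriving the three channels from different views of the same bytes; the source does not say which one was used.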
Affiliation(s)
- Shtwai Alsubai
  - Prince Sattam Bin Abdulaziz University, Al-Kharj, Kingdom of Saudi Arabia
- Ashit Kumar Dutta
  - Department of Computer Science and Information Technology, Almaarefa University, Riyadh, Kingdom of Saudi Arabia
- Abdullah M. Alnajim
  - Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Abdul Rahaman Wahab Sait
  - Department of Archives and Communication, King Faisal University, Al Ahsa, Hofuf, Kingdom of Saudi Arabia
- Rashid Ayub
  - Department of Science Technology & Innovation Unit, King Saud University, Riyadh, Saudi Arabia
- Afnan Mushabbab AlShehri
  - Department of Computer Science and Information Systems, College of Applied Sciences, AlMaarefa University, Ad Diriyah, Riyadh, Kingdom of Saudi Arabia
- Naved Ahmad
  - Department of Computer Science and Information Systems, College of Applied Sciences, AlMaarefa University, Ad Diriyah, Riyadh, Kingdom of Saudi Arabia
3
Yahya AA, Liu K, Hawbani A, Wang Y, Hadi AN. A Novel Image Classification Method Based on Residual Network, Inception, and Proposed Activation Function. Sensors (Basel, Switzerland) 2023; 23:2976. [PMID: 36991687 PMCID: PMC10056718 DOI: 10.3390/s23062976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 03/04/2023] [Accepted: 03/06/2023] [Indexed: 06/19/2023]
Abstract
In its deeper layers, ResNet depends heavily on skip connections and ReLU. Although skip connections have demonstrated their usefulness, a major issue arises when the dimensions of the connected layers do not match. In such cases, techniques such as zero-padding or projection are needed to match the dimensions. These adjustments increase the complexity of the network architecture, raising both the parameter count and the computational cost. Another problem is the vanishing gradient caused by ReLU. In our model, after making appropriate adjustments to the inception blocks, we replace the deeper layers of ResNet with modified inception blocks and ReLU with our non-monotonic activation function (NMAF). To reduce the parameter count, we use symmetric factorization and 1×1 convolutions. Together, these two techniques removed around 6 M parameters, which cut the run time by 30 s/epoch. Unlike ReLU, which outputs zero for non-positive inputs, NMAF keeps negative values active by outputting small negative numbers, which improved convergence speed and increased accuracy by 5%, 15%, and 5% on the non-noisy datasets and by 5%, 6%, and 21% on the noisy datasets.
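To see why factorization and 1×1 convolutions cut the parameter count, a small back-of-the-envelope sketch helps. It assumes that "symmetric factorization" means replacing one larger square kernel with stacked smaller square kernels, as in Inception-style designs, and it uses hypothetical channel widths (64 and 256) with biases ignored. None of these numbers come from the paper.

```python
def conv_params(kernel_h: int, kernel_w: int, in_ch: int, out_ch: int,
                bias: bool = False) -> int:
    """Number of learnable parameters in a standard 2D convolution layer."""
    weights = kernel_h * kernel_w * in_ch * out_ch
    return weights + (out_ch if bias else 0)


# Factorization: one 5x5 convolution versus two stacked 3x3 convolutions,
# which cover the same 5x5 receptive field (64 channels is a hypothetical width).
single_5x5 = conv_params(5, 5, 64, 64)
stacked_3x3 = 2 * conv_params(3, 3, 64, 64)

# 1x1 bottleneck: shrink 256 channels to 64 before the 3x3 convolution,
# versus applying the 3x3 convolution directly on 256 channels.
direct_3x3 = conv_params(3, 3, 256, 256)
bottleneck = conv_params(1, 1, 256, 64) + conv_params(3, 3, 64, 256)

print(f"5x5: {single_5x5}, two 3x3: {stacked_3x3}")
print(f"direct 3x3: {direct_3x3}, 1x1 bottleneck + 3x3: {bottleneck}")
```

With these illustrative widths, the stacked 3×3 pair needs 28% fewer weights than the single 5×5 layer, and the bottleneck path needs roughly 72% fewer than the direct 3×3 layer. Savings of this kind, repeated across many blocks, are what add up to reductions on the order of millions of parameters.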
Affiliation(s)
- Ali Abdullah Yahya
  - School of Computer and Information, Anqing Normal University, Anqing 246011, China
- Kui Liu
  - School of Computer and Information, Anqing Normal University, Anqing 246011, China
- Ammar Hawbani
  - School of Computer and Technology, University of Science and Technology of China, Hefei 230027, China
- Yibin Wang
  - School of Computer and Information, Anqing Normal University, Anqing 246011, China
- Ali Naser Hadi
  - School of Computer and Information, Hefei University of Technology, Hefei 230009, China
4
Deep Convolution Neural Network sharing for the multi-label images classification. Machine Learning with Applications 2022. [DOI: 10.1016/j.mlwa.2022.100422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
5
SAGMAD—A Signature Agnostic Malware Detection System Based on Binary Visualisation and Fuzzy Sets. Electronics 2022. [DOI: 10.3390/electronics11071044] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/10/2022]
Abstract
Image conversion of byte-level data, or binary visualisation, is a relevant approach to security applications interested in malicious activity detection. However, in practice, binary visualisation has always been seen to have great limitations when dealing with large volumes of data, and would be a reluctant candidate as the core building block of an intrusion detection system (IDS). This is due to the requirements of computational time when processing the flow of byte data into image format. Machine intelligence solutions based on colour tone variations that are intended for pattern recognition would overtax the process. In this paper, we aim to solve this issue by proposing a fast binary visualisation method that uses Fuzzy Set theory and the H-indexing space filling curve. Our model can assign different colour tones on a byte, allowing it to be influenced by neighbouring byte values while preserving optimal locality indexing. With this work, we wish to establish the first steps in pursuit of a signature-free IDS. For our experiment, we used 5000 malicious and benign files of different sizes. Our methodology was tested on various platforms, including GRNET’s High-Performance Computing services. Further improvements in computation time allowed larger files to convert in roughly 0.5 s on a desktop environment. Its performance was also compared with existing machine learning-based detection applications that used traditional binary visualisation. Despite lack of optimal tuning, SAGMAD was able to achieve 91.94% accuracy, 90.63% precision, 92.7% recall, and an F-score of 91.61% on average when tested within previous binary visualisation applications and following their parameterisation scheme. The results exceeded malware file-based experiments and were similar to network intrusion applications. Overall, the results demonstrated here prove our method to be a promising mechanism for a fast AI-based signature-agnostic IDS.
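The precision, recall, and F-score reported above are linked by the standard definition of the F-score as the harmonic mean of precision and recall. The short sketch below applies that definition to the averaged figures. It is an illustration of the formula only, and the helper name is hypothetical.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averaged precision and recall from the abstract, in percent.
f_from_averages = f_score(90.63, 92.7)
print(f"F-score from averaged precision and recall: {f_from_averages:.2f}%")
```

The value computed this way is close to, but not identical to, the reported 91.61%. That is expected if the reported figure is an average of per-experiment F-scores, since the harmonic mean of averages generally differs from the average of harmonic means.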