1
Bantan RAR, Ahmad Z, Khan F, Elgarhy M, Almaspoor Z, Hamedani GG, El-Morshedy M, Gemeay AM. Predictive modeling of the COVID-19 data using a new version of the flexible Weibull model and machine learning techniques. Math Biosci Eng 2023; 20:2847-2873. [PMID: 36899561 DOI: 10.3934/mbe.2023134] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/18/2023]
Abstract
Statistical modeling and forecasting of time-to-event data are crucial in every applied sector, and several statistical methods have been introduced and implemented for such data sets. This paper has two aims: (i) statistical modeling and (ii) forecasting. For modeling time-to-event data, we introduce a new statistical model by combining the flexible Weibull model with the Z-family approach. The new model is called the Z flexible Weibull extension (Z-FWE) model, and its characterizations are obtained. The maximum likelihood estimators of the Z-FWE distribution are derived, and their performance is assessed in a simulation study. The Z-FWE distribution is applied to analyze the mortality rate of COVID-19 patients. Finally, for forecasting the COVID-19 data set, we use machine learning (ML) techniques, i.e., an artificial neural network (ANN) and the group method of data handling (GMDH), alongside the autoregressive integrated moving average (ARIMA) model. Based on our findings, ML techniques are more robust in terms of forecasting than the ARIMA model.
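To make the simulation step concrete, the following is a minimal sketch of how samples could be drawn from the baseline flexible Weibull distribution, whose CDF is F(x) = 1 - exp(-exp(αx - β/x)) for x > 0. It does not implement the Z-FWE model itself, since the Z-family transformation is not given in the abstract. The function names and parameter values are illustrative assumptions.

```python
import math
import random


def fw_cdf(x, alpha, beta):
    """CDF of the baseline flexible Weibull: 1 - exp(-exp(alpha*x - beta/x)), x > 0."""
    return 1.0 - math.exp(-math.exp(alpha * x - beta / x))


def fw_quantile(u, alpha, beta):
    """Closed-form quantile for 0 < u < 1.

    Setting w = log(-log(1 - u)) turns F(x) = u into the quadratic
    alpha*x**2 - w*x - beta = 0, whose positive root is returned.
    """
    w = math.log(-math.log(1.0 - u))
    return (w + math.sqrt(w * w + 4.0 * alpha * beta)) / (2.0 * alpha)


def fw_sample(n, alpha, beta, seed=0):
    """Draw n values by inverse-transform sampling (hypothetical helper)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # u == 0 would make log(-log(1 - u)) undefined
            draws.append(fw_quantile(u, alpha, beta))
    return draws
```

A simulation study of the kind mentioned in the abstract would repeat such draws for several sample sizes and refit the parameters each time; the fitting step is omitted here.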
Affiliation(s)
- Rashad A R Bantan
  - Department of Marine Geology, Faculty of Marine Science, King Abdulaziz University, Jeddah 21551, Saudi Arabia
- Zubair Ahmad
  - Department of Statistics, Yazd University, P.O. Box 89175-741, Yazd, Iran
- Mohammed Elgarhy
  - The Higher Institute of Commercial Sciences, Al mahalla Al kubra, Algarbia 31951, Egypt
- Zahra Almaspoor
  - Department of Statistics, Yazd University, P.O. Box 89175-741, Yazd, Iran
- G G Hamedani
  - Department of Mathematical and Statistical Sciences, Marquette University, Milwaukee, WI, USA
- Mahmoud El-Morshedy
  - Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
  - Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
- Ahmed M Gemeay
  - Department of Mathematics, Faculty of Science, Tanta University, Tanta 31527, Egypt
2
Qureshi M, Khan S, Bantan RAR, Daniyal M, Elgarhy M, Marzo RR, Lin Y. Modeling and Forecasting Monkeypox Cases Using Stochastic Models. J Clin Med 2022; 11:6555. [PMID: 36362783 PMCID: PMC9659136 DOI: 10.3390/jcm11216555] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/28/2022] [Revised: 10/24/2022] [Accepted: 10/27/2022] [Indexed: 08/25/2023]
Abstract
BACKGROUND: Monkeypox virus is gaining attention due to its severity and spread among people. This study sheds light on the modeling and forecasting of new monkeypox cases. Knowledge of the future course of the virus, obtained from more accurate time series and stochastic models, is required to plan future actions to cope with the challenge.
METHODS: We conduct a side-by-side comparison of a machine learning approach with a traditional time series model. The multilayer perceptron (MLP) model is used as the machine learning technique, and the Box-Jenkins methodology, also known as the ARIMA model, is used for classical modeling. Both methods are applied to the cumulative monkeypox data set and compared using model selection criteria such as root mean square error, mean square error, mean absolute error, and mean absolute percentage error.
RESULTS: Among the candidate ARIMA models, ARIMA (7,1,7) fits the monkeypox series best, with a root mean square error of 150.78. For comparison, we use the MLP model with the sigmoid activation function and varying numbers of hidden neurons in a single hidden layer. The MLP model with a single input and ten hidden neurons has a root mean square error of 54.40, significantly lower than that of the ARIMA model. Plots of actual confirmed cases versus fitted values also show that the MLP model fits the monkeypox data better than the ARIMA model.
CONCLUSIONS AND RECOMMENDATION: For predicting monkeypox, the machine learning method outperforms the traditional time series model. A better fit may be achieved in future studies by applying the extreme learning machine (ELM), support vector machine (SVM), and other methods with various activation functions. It is thus concluded that the selected data provide a real picture of the virus. If the situation remains the same, governments and other stakeholders should ensure that the public follows Standard Operating Procedures (SOPs), as the trend will continue rising over the upcoming 10 days. Governments should also undertake serious interventions to cope with the virus.
LIMITATION: In the ARIMA models selected for forecasting, we did not incorporate the effect of covariates such as net migration of monkeypox patients, government interventions, etc.
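The four comparison criteria named in the methods can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the toy numbers in the usage note are assumptions made for the example and are not taken from the study.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for two equal-length series.

    MAPE divides by the actual values, so every actual value must be nonzero.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}
```

For example, `forecast_errors([100, 200, 400], [110, 190, 400])` gives an MAE of about 6.67 and a MAPE of 5.0. Computing the same dictionary for the fitted values of each candidate model allows a like-for-like comparison, where lower values indicate a closer fit.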
Affiliation(s)
- Moiz Qureshi
  - Department of Statistics, Shaheed Benazir Bhutto University, Nawabshah 67450, Pakistan
- Shahid Khan
  - Department of Mathematics, National University of Modern Languages, Islamabad 44000, Pakistan
- Rashad A. R. Bantan
  - Department of Marine Geology, Faculty of Marine Science, King AbdulAziz University, Jeddah 21551, Saudi Arabia
- Muhammad Daniyal
  - Department of Statistics, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
- Mohammed Elgarhy
  - The Higher Institute of Commercial Sciences, Al Mahalla Al Kubra 31951, Egypt
- Roy Rillera Marzo
  - Department of Community Medicine, International Medical School, Management and Science University, Shah Alam 40100, Selangor, Malaysia
  - Global Public Health, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Jalan Lagoon Selatan, Subang Jaya 47500, Selangor, Malaysia
- Yulan Lin
  - Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou 350122, China
3
Bantan RAR, Chesneau C, Jamal F, Elbatal I, Elgarhy M. The Truncated Burr X-G Family of Distributions: Properties and Applications to Actuarial and Financial Data. Entropy (Basel) 2021; 23:e23081088. [PMID: 34441228 PMCID: PMC8391697 DOI: 10.3390/e23081088] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/24/2021] [Revised: 08/17/2021] [Accepted: 08/19/2021] [Indexed: 11/16/2022]
Abstract
In this article, the "truncated-composed" scheme was applied to the Burr X distribution to motivate a new family of univariate continuous-type distributions, called the truncated Burr X generated family. It is mathematically simple and provides more modeling freedom for any parental distribution. Additional functionality is conferred on the probability density and hazard rate functions, improving their peak, asymmetry, tail, and flatness levels. These characteristics are represented analytically and graphically with three special distributions of the family derived from the exponential, Rayleigh, and Lindley distributions. Subsequently, we conducted asymptotic, first-order stochastic dominance, series expansion, Tsallis entropy, and moment studies. Useful risk measures were also investigated. The remainder of the study was devoted to the statistical use of the associated models. In particular, we developed an adapted maximum likelihood methodology aiming to efficiently estimate the model parameters. The special distribution extending the exponential distribution was applied as a statistical model to fit two sets of actuarial and financial data. It performed better than a wide variety of selected competing non-nested models. Numerical applications for risk measures are also given.
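One plausible reading of the "truncated-composed" scheme is sketched below, under stated assumptions: the one-parameter Burr X CDF (1 - exp(-t²))^θ is truncated to the unit interval and then evaluated at a baseline CDF G(x). The abstract does not give the formula, so this construction, the function names, and the exponential baseline with its rate parameter are illustrative assumptions rather than the article's exact definition.

```python
import math


def burr_x_cdf(t, theta):
    """One-parameter Burr X CDF on t >= 0: (1 - exp(-t**2)) ** theta."""
    return (1.0 - math.exp(-t * t)) ** theta


def truncated_burr_x_g_cdf(x, theta, baseline_cdf):
    """Assumed truncated-composed CDF.

    The Burr X CDF is restricted to [0, 1] by dividing by its value at 1,
    and the argument is replaced by the baseline CDF value G(x) in [0, 1].
    """
    g = baseline_cdf(x)
    return burr_x_cdf(g, theta) / burr_x_cdf(1.0, theta)


def exponential_cdf(x, rate=1.0):
    """Exponential baseline CDF (one of the special cases named in the abstract)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0
```

Because G(x) rises from 0 to 1, the composed function rises from 0 to 1 as well, so it is a valid CDF for any baseline. Swapping `exponential_cdf` for a Rayleigh or Lindley CDF would give the other special members mentioned above.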
Affiliation(s)
- Rashad A. R. Bantan
  - Department of Marine Geology, Faculty of Marine Science, King AbdulAziz University, Jeddah 21551, Saudi Arabia
- Christophe Chesneau
  - Department of Mathematics, Université de Caen, LMNO, Campus II, Science 3, 14032 Caen, France
- Farrukh Jamal
  - Department of Statistics, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
- Ibrahim Elbatal
  - Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
- Mohammed Elgarhy
  - The Higher Institute of Commercial Sciences, Al mahalla Al kubra, Algarbia 31951, Egypt
4
Bantan RAR, Ali A, Naeem S, Jamal F, Elgarhy M, Chesneau C. Discrimination of sunflower seeds using multispectral and texture dataset in combination with region selection and supervised classification methods. Chaos 2020; 30:113142. [PMID: 33261340 DOI: 10.1063/5.0024017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/06/2020] [Accepted: 10/14/2020] [Indexed: 06/12/2023]
Abstract
The purpose of this study is to discriminate sunflower seed varieties using a dataset of spectral and textural features. Crop production depends on seed purity and quality, and sunflower seed is used worldwide for its oil content. The dataset covers six sunflower seed varieties (Syngenta CG, HS360, S278, HS30, Armani, and High Sun 33), acquired from the agricultural farms of The Islamia University of Bahawalpur, Pakistan, which form six classes. For preprocessing, a new region-oriented seed-based segmentation was deployed for the automatic selection of regions and the extraction of 53 multi-features from each region, and 11 optimized fused multi-features were then selected using the chi-square feature selection technique. For discrimination, four supervised classifiers, namely, deep learning J4, support vector machine, random committee, and Bayes net, were applied to the optimized multi-feature dataset. We observe very promising accuracies of 98.2%, 97.5%, 96.6%, and 94.8%, respectively, when the region size is 180 × 180.
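The chi-square feature selection step can be illustrated with a minimal sketch. It assumes features that have already been discretized into categories, whereas the spectral and textural features of the study are presumably continuous and would first need binning. The function names and the dictionary-of-columns layout are hypothetical.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between a categorical feature and class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    class_counts = Counter(labels)
    observed = Counter(zip(feature, labels))  # missing pairs count as 0
    score = 0.0
    for f_value, n_f in feature_counts.items():
        for c_value, n_c in class_counts.items():
            expected = n_f * n_c / n  # count expected under independence
            score += (observed[(f_value, c_value)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the largest chi-square score."""
    scores = {name: chi_square_score(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A feature that is independent of the class labels scores close to zero, while one that separates the classes scores high. Keeping the top-ranked names mirrors the reduction from 53 extracted features to 11 selected ones described above.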
Affiliation(s)
- Rashad A R Bantan
  - Department of Marine Geology, Faculty of Marine Science, King Abdulaziz University, Jeddah 21551, Saudi Arabia
- Aqib Ali
  - Department of Computer Science & IT, Glim Institute of Modern Studies, Bahawalpur 61300, Pakistan
- Samreen Naeem
  - Department of Computer Science & IT, Glim Institute of Modern Studies, Bahawalpur 61300, Pakistan
- Farrukh Jamal
  - Department of Statistics, The Islamia University of Bahawalpur, Bahawalpur, Punjab 63100, Pakistan
- Mohammed Elgarhy
  - Valley High Institute for Management Finance and Information Systems, Obour, Qaliubia 11828, Egypt
- Christophe Chesneau
  - Department of Mathematics, Université de Caen, LMNO, Campus II, Science 3, 14032 Caen, France
5
Abstract
In this article, we introduce a new general family of distributions derived from the truncated inverted Kumaraswamy distribution (on the unit interval), called the truncated inverted Kumaraswamy generated family. Among its qualities, it is characterized by tractable functions, can enhance the flexibility of a given distribution, and has nice statistical properties, including competitive fits for various kinds of data. Particular focus is given to a special member of the family defined with the exponential distribution as baseline, which offers a new three-parameter lifetime distribution. This new distribution has the advantage of a hazard rate function that allows monotonically increasing, decreasing, and upside-down bathtub shapes. In full generality, important properties of the new family are determined, with an emphasis on the entropy (Rényi and Shannon entropy). The model parameters are estimated by the maximum likelihood method. A numerical simulation study illustrates the nice performance of the obtained estimates. Two practical data sets are then analyzed. We thus demonstrate the potential of the new model in terms of fitting, with favorable results in comparison to other modern parametric models in the literature.
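The two entropy measures emphasized above have standard definitions for a density f: the Shannon entropy is -∫ f(x) log f(x) dx, and the Rényi entropy of order δ (δ > 0, δ ≠ 1) is (1 / (1 - δ)) log ∫ f(x)^δ dx. The sketch below evaluates both numerically for the exponential baseline only, not for the new three-parameter distribution, whose density is not given in the abstract. The helper names, the integration cutoff, and the step count are assumptions.

```python
import math


def exp_pdf(x, rate):
    """Density of the exponential baseline distribution."""
    return rate * math.exp(-rate * x)


def integrate(func, lower, upper, steps=200_000):
    """Composite trapezoid rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h


def shannon_entropy(pdf, upper):
    """-integral of f*log(f) over [0, upper]; upper should cover almost all mass."""
    def integrand(x):
        f = pdf(x)
        return -f * math.log(f) if f > 0.0 else 0.0
    return integrate(integrand, 0.0, upper)


def renyi_entropy(pdf, delta, upper):
    """(1 / (1 - delta)) * log of the integral of f**delta over [0, upper]."""
    return math.log(integrate(lambda x: pdf(x) ** delta, 0.0, upper)) / (1.0 - delta)
```

For an exponential density with rate λ, the closed forms are 1 - log λ for the Shannon entropy and -log λ + log(δ) / (δ - 1) for the Rényi entropy, which gives a quick check on the numerical values before applying the same routines to a more complicated density.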
Affiliation(s)
- Rashad A. R. Bantan
  - Deanship of Scientific Research, King Abdulaziz University, Jeddah 21442, Saudi Arabia
- Farrukh Jamal
  - Department of Statistics, Govt. S.A Postgraduate College Dera Nawab Sahib, Bahawalpur, Punjab 63100, Pakistan
- Christophe Chesneau
  - Department of Mathematics, Université de Caen, LMNO, Campus II, Science 3, 14032 Caen, France
  - Correspondence: Tel.: +33-02-3156-7424
- Mohammed Elgarhy
  - Valley High Institute for Management Finance and Information Systems, Obour, Qaliubia 11828, Egypt