1
|
Dabiri MS, Hadavimoghaddam F, Ashoorian S, Schaffie M, Hemmati-Sarapardeh A. Modeling liquid rate through wellhead chokes using machine learning techniques. Sci Rep 2024; 14:6945. [PMID: 38521803 PMCID: PMC10960849 DOI: 10.1038/s41598-024-54010-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 02/07/2024] [Indexed: 03/25/2024] Open
Abstract
Precise measurement and prediction of fluid flow rates in production wells are crucial for anticipating production volume and hydrocarbon recovery and for maintaining a steady, controllable flow regime in such wells. This study proposes two approaches to predict the liquid flow rate through wellhead chokes. The first is a data-driven approach using several methods: adaptive boosting support vector regression (Adaboost-SVR), multivariate adaptive regression splines (MARS), radial basis function (RBF), and multilayer perceptron (MLP) trained with three algorithms: Levenberg-Marquardt (LM), Bayesian regularization (BR), and scaled conjugate gradient (SCG). The second is a newly developed correlation that depends on wellhead pressure (Pwh), gas-to-liquid ratio (GLR), and choke size (Dc). A dataset of 565 data points was used for model development. The performance of the two approaches is compared with earlier correlations. The proposed models outperform the existing ones, with the Adaboost-SVR model performing best, at an average absolute percent relative error (AAPRE) of 5.15% and a correlation coefficient of 0.9784. The developed correlation also gave better predictions than the earlier ones. A sensitivity analysis of the input variables revealed that choke size had the most significant effect on the liquid rate, while Pwh and GLR had only a slight effect. Finally, the leverage approach showed that only 2.1% of the data points were in the suspicious range.
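The AAPRE figure quoted above can be reproduced for any set of measured and predicted rates. The following is a minimal sketch, assuming the usual definition of AAPRE as the mean of the absolute relative errors expressed in percent; the function name and the sample rates are hypothetical and are not taken from the paper's 565-point dataset.

```python
def aapre(measured, predicted):
    """Average absolute percent relative error, in percent.

    Assumed definition: 100/N * sum(|(measured - predicted) / measured|).
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical liquid rates, for illustration only.
measured_rates = [1000.0, 2000.0, 4000.0]
predicted_rates = [1050.0, 1900.0, 4000.0]
error_percent = aapre(measured_rates, predicted_rates)
print(f"AAPRE = {error_percent:.2f}%")
```

Because every error is divided by the measured value, the metric is undefined for zero measured rates, so such points would have to be handled separately.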
Affiliation(s)
- Mohammad-Saber Dabiri
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
- Sefatallah Ashoorian
  - Institute of Petroleum Engineering, School of Chemical Engineering, University of Tehran, P.O. Box: 11155-4563, Tehran, Iran
- Mahin Schaffie
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
- Abdolhossein Hemmati-Sarapardeh
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
  - State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing, China
|
2
|
Wei-yu C, Sun L, Zhou J, Li X, Huang L, Xia G, Meng X, Wang K. Toward Predicting Interfacial Tension of Impure and Pure CO2-Brine Systems Using Robust Correlative Approaches. ACS OMEGA 2024; 9:7937-7957. [PMID: 38405476 PMCID: PMC10882694 DOI: 10.1021/acsomega.3c07956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 01/18/2024] [Accepted: 01/23/2024] [Indexed: 02/27/2024]
Abstract
In the context of global climate change, significant attention is being directed toward renewable energy and the pivotal role of carbon capture and storage (CCS) technologies. These innovations involve secure CO2 storage in deep saline aquifers through structural and capillary processes, with the interfacial tension (IFT) of the CO2-brine system influencing the storage capacity of formations. In this study, an extensive data set of 2811 experimental data points was compiled to model the IFT of impure and pure CO2-brine systems. Three white-box machine learning (ML) methods, namely, genetic programming (GP), gene expression programming (GEP), and group method of data handling (GMDH) were employed to establish accurate mathematical correlations. Notably, the study utilized two distinct modeling approaches: one focused on impurity compositions and the other incorporating a pseudocritical temperature variable (Tcm) offering a versatile predictive tool suitable for various gas mixtures. Among the correlation methods explored, GMDH, employing five inputs, exhibited exceptional accuracy and reliability across all metrics. Its mean absolute percentage error (MAPE) values for testing, training, and complete data sets stood at 7.63, 7.31, and 7.38%, respectively. In the case of six-input models, the GEP correlation displayed the highest precision, with MAPE values of 9.30, 8.06, and 8.31% for the testing, training, and total data sets, respectively. The sensitivity and trend analyses revealed that pressure exerted the most significant impact on the IFT of CO2-brine, showcasing an adverse effect. Moreover, an impurity possessing a critical temperature below that of CO2 resulted in an elevated IFT. Consequently, this relationship leads to higher impurity concentrations aligning with lower Tcm values and subsequently elevated IFT. 
Also, monovalent and divalent cation molalities exhibited a growing influence on the IFT, with divalent cations exerting approximately double the influence of monovalent cations. Finally, the Leverage approach confirmed both the reliability of the experimental data and the robust statistical validity of the best correlations established in this study.
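The link between impurity content and Tcm can be illustrated numerically. The abstract does not state how Tcm is computed, so the sketch below assumes a simple mole-fraction-weighted average of component critical temperatures (Kay's mixing rule); the function name and the example composition are hypothetical, while the critical temperatures are standard literature values.

```python
# Critical temperatures in kelvin (standard literature values).
CRITICAL_TEMPERATURE_K = {"CO2": 304.13, "N2": 126.19, "CH4": 190.56}


def pseudocritical_temperature(mole_fractions):
    """Mole-fraction-weighted critical temperature (Kay's rule, assumed here)."""
    total = sum(mole_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("mole fractions must sum to 1")
    return sum(y * CRITICAL_TEMPERATURE_K[gas] for gas, y in mole_fractions.items())


# Pure CO2 versus a hypothetical stream with 10 mol% N2 impurity.
tcm_pure = pseudocritical_temperature({"CO2": 1.0})
tcm_impure = pseudocritical_temperature({"CO2": 0.9, "N2": 0.1})
print(f"Tcm pure CO2: {tcm_pure:.2f} K, with 10% N2: {tcm_impure:.2f} K")
```

Under this assumption, adding an impurity whose critical temperature is below that of CO2 lowers Tcm, which matches the direction of the trend the abstract describes.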
Affiliation(s)
- Chen Wei-yu
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Lin Sun
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Jiyong Zhou
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Xuguang Li
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Liping Huang
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Guang Xia
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Xiangli Meng
  - CNOOC EnerTech-Drilling & Production Co., Ltd., Tianjin 300452, China
- Kui Wang
  - State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing 102249, China
|
3
|
Zou X, Zhu Y, Lv J, Zhou Y, Ding B, Liu W, Xiao K, Zhang Q. Toward Estimating CO2 Solubility in Pure Water and Brine Using Cascade Forward Neural Network and Generalized Regression Neural Network: Application to CO2 Dissolution Trapping in Saline Aquifers. ACS OMEGA 2024; 9:4705-4720. [PMID: 38313487 PMCID: PMC10831835 DOI: 10.1021/acsomega.3c07962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 12/28/2023] [Accepted: 01/04/2024] [Indexed: 02/06/2024]
Abstract
Predicting carbon dioxide (CO2) solubility in water and brine is crucial for understanding carbon capture and storage (CCS) processes. Accurate solubility predictions inform the feasibility and effectiveness of CO2 dissolution trapping, a key mechanism in carbon sequestration in saline aquifers. In this work, a comprehensive data set of 1278 experimental solubility data points for CO2-brine systems was assembled, covering diverse operating conditions. The data include brines containing six different salts (NaCl, KCl, NaHCO3, CaCl2, MgCl2, and Na2SO4), temperatures from 273.15 to 453.15 K, and pressures from 0.06 to 100 MPa. To model this solubility databank, cascade forward neural network (CFNN) and generalized regression neural network (GRNN) models were employed. Three optimization algorithms, namely Bayesian regularization (BR), Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton, and Levenberg-Marquardt (LM), were applied to enhance the performance of the CFNN models. The CFNN-LM model achieved average absolute percent relative error (AAPRE) values of 5.37% for the overall data set, 5.26% for the training subset, and 5.85% for the testing subset. It was the most accurate of the models developed in this study, with the highest overall R2 value of 0.9949. Based on sensitivity analysis, pressure has the most significant influence and is the sole parameter with a positive impact on CO2 solubility in brine. Conversely, temperature and the concentrations of all six salts have a negative impact. The salts reduce CO2 solubility through their salting-out effect, to varying degrees. Ranked from most to least pronounced, the salting-out effects are: MgCl2 > CaCl2 > NaCl > KCl > NaHCO3 > Na2SO4.
By employing the leverage approach, only a few instances of potential suspected and out-of-leverage data were found. The relatively low count of identified potential suspected and out-of-leverage data, given the expansive solubility database, underscores the reliability and accuracy of both the data set and the CFNN-LM model's performance in this survey.
Affiliation(s)
- Xinyuan Zou
  - State Key Laboratory of Enhanced Oil Recovery, Research Institute of Petroleum Exploration and Development, CNPC, Beijing 100083, China
  - Research Institute of Petroleum Exploration & Development, Beijing 100083, China
- Yingting Zhu
  - Research Institute of Petroleum Exploration & Development, Beijing 100083, China
  - Key Laboratory of Oilfield Chemistry of CNPC, Beijing 100083, China
- Jing Lv
  - Research Institute of Petroleum Exploration & Development, Beijing 100083, China
  - Key Laboratory of Oilfield Chemistry of CNPC, Beijing 100083, China
- Yuchi Zhou
  - Oil and Gas Engineering Research Institute, PetroChina Jilin Oilfield Company, Songyuan 138000, China
- Bin Ding
  - Research Institute of Petroleum Exploration & Development, Beijing 100083, China
  - Key Laboratory of Oilfield Chemistry of CNPC, Beijing 100083, China
- Weidong Liu
  - Research Institute of Petroleum Exploration & Development, Beijing 100083, China
  - Key Laboratory of Oilfield Chemistry of CNPC, Beijing 100083, China
- Kai Xiao
  - State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing 102249, China
- Qun Zhang
  - State Key Laboratory of Enhanced Oil Recovery, Research Institute of Petroleum Exploration and Development, CNPC, Beijing 100083, China
  - Research Institute of Petroleum Exploration & Development, Beijing 100083, China
|
4
|
Hadavimoghaddam F, Rozhenko A, Mohammadi MR, Mostajeran Gortani M, Pourafshary P, Hemmati-Sarapardeh A. Modeling crude oil pyrolysis process using advanced white-box and black-box machine learning techniques. Sci Rep 2023; 13:22649. [PMID: 38114589 PMCID: PMC10730853 DOI: 10.1038/s41598-023-49349-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 12/07/2023] [Indexed: 12/21/2023] Open
Abstract
Accurate prediction of fuel deposition during crude oil pyrolysis is pivotal for sustaining the combustion front and ensuring the effectiveness of in-situ combustion enhanced oil recovery (ISC EOR). Employing 2071 experimental TGA datasets from 13 diverse crude oil samples extracted from the literature, this study sought to precisely model crude oil pyrolysis. A suite of robust machine learning techniques, encompassing three black-box approaches (Categorical Gradient Boosting-CatBoost, Gaussian Process Regression-GPR, Extreme Gradient Boosting-XGBoost), and a white-box approach (Genetic Programming-GP), was employed to estimate crude oil residue at varying temperature intervals during TGA runs. Notably, the XGBoost model emerged as the most accurate, boasting a mean absolute percentage error (MAPE) of 0.7796% and a determination coefficient (R2) of 0.9999. Subsequently, the GPR, CatBoost, and GP models demonstrated commendable performance. The GP model, while displaying slightly higher error in comparison to the black-box models, yielded acceptable results and proved suitable for swift estimation of crude oil residue during pyrolysis. Furthermore, a sensitivity analysis was conducted to reveal the varying influence of input parameters on residual crude oil during pyrolysis. Among the inputs, temperature and asphaltenes were identified as the most influential factors in the crude oil pyrolysis process. Higher temperatures and oil °API gravity were associated with a negative impact, leading to a decrease in fuel deposition. On the other hand, increased values of asphaltenes, resins, and heating rates showed a positive impact, resulting in an increase in fuel deposition. These findings underscore the importance of precise modeling for fuel deposition during crude oil pyrolysis, offering insights that can significantly benefit ISC EOR practices.
Affiliation(s)
- Fahimeh Hadavimoghaddam
  - Key Laboratory of Continental Shale Hydrocarbon Accumulation and Efficient Development, Ministry of Education, Northeast Petroleum University, Daqing, 163318, China
  - Ufa State Petroleum Technological University, Ufa, 450064, Russia
- Alexei Rozhenko
  - Plekhanov Russian University of Economics, Moscow, 117997, Russia
- Peyman Pourafshary
  - School of Mining and Geosciences, Nazarbayev University, Astana, Kazakhstan
- Abdolhossein Hemmati-Sarapardeh
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
  - State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing, China
|
5
|
Kigo SN, Omondi EO, Omolo BO. Assessing predictive performance of supervised machine learning algorithms for a diamond pricing model. Sci Rep 2023; 13:17315. [PMID: 37828360 PMCID: PMC10570374 DOI: 10.1038/s41598-023-44326-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/14/2022] [Accepted: 10/06/2023] [Indexed: 10/14/2023] Open
Abstract
This study conducted a comprehensive analysis of multiple supervised machine learning models, both regressors and classifiers, to accurately predict diamond prices. Diamond pricing is a complex task due to the non-linear relationships between key features such as carat, cut, clarity, table, and depth. The analysis aimed to develop an accurate predictive model by utilizing both regression and classification approaches. The data were preprocessed in several steps. Outliers were handled using the inter-quartile range method, numerical features were normalized through standardization, and missing values in numerical features were imputed using the median, preserving the integrity of the dataset. Multicollinearity was resolved with correlation-based feature selection, which eliminated highly correlated variables so that only relevant features were included in the models. Equal-width binning on the cut variable was performed to handle class imbalance. Among the models evaluated, the random forest (RF) regressor performed best. It achieved the lowest root mean squared error (RMSE) of 523.50 and a high R-squared (R2) score of 0.985, meaning it explained most of the variance in diamond prices. Furthermore, the area under the curve with the RF classifier for the test set was 1.00 [Formula: see text], indicating perfect classification performance. These results make RF the best-performing model in terms of accuracy and predictive power, in both regression and classification. The MLP regressor showed promising results, with an RMSE of 563.74 and an R2 score of 0.980, demonstrating its ability to capture the complex relationships in the data. Its errors were slightly higher than those of the RF regressor, and further analysis is needed to determine its suitability and potential advantages. The XGBoost regressor achieved an RMSE of 612.88 and an R2 score of 0.972, so it predicted diamond prices effectively but with slightly higher errors than the RF regressor. The boosted decision tree regressor had an RMSE of 711.31 and an R2 score of 0.968, capturing some of the underlying patterns but with higher errors than the RF and XGBoost models. In contrast, the KNN regressor yielded a higher RMSE of 1346.65 and a lower R2 score of 0.887, indicating inferior performance compared to the other models. The linear regression model performed similarly to the KNN regressor, with an RMSE of 1395.41 and an R2 score of 0.876. The support vector regression model showed the highest RMSE of 3044.49 and the lowest R2 score of 0.421, indicating its limited effectiveness in capturing the complex relationships in the data. Overall, the study demonstrates that RF outperforms the other models in terms of accuracy and predictive power, as evidenced by its lowest RMSE, highest R2 score, and perfect classification performance. This highlights its suitability for accurately predicting diamond prices. The study not only provides an effective tool for the diamond industry but also emphasizes the importance of considering both regression and classification approaches in developing accurate predictive models. The findings contribute valuable insights for pricing strategies, market trends, and decision-making processes in the diamond industry and related fields.
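The preprocessing steps named in the abstract (median imputation, inter-quartile range outlier handling, standardization) can be sketched with the Python standard library. This is only an illustration under stated assumptions: the abstract does not give the order of the steps, whether outliers were dropped or capped, or the fence multiplier, so the ordering, the choice to drop, and the conventional factor of 1.5 are all assumptions, and the carat values are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def drop_iqr_outliers(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical carat column with one missing value and one extreme value.
carat = [0.3, 0.4, None, 0.5, 0.6, 0.7, 0.8, 0.9, 9.0]
filled = impute_median(carat)
kept = drop_iqr_outliers(filled)
scaled = standardize(kept)
print(len(carat), len(kept), [round(z, 2) for z in scaled])
```

Dropping rows is the simplest reading of "handled"; capping the extreme values at the fences would be an equally plausible alternative.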
Affiliation(s)
- Samuel Njoroge Kigo
  - Institute of Mathematical Sciences, Strathmore University, P.O Box 59857-00200, Nairobi, Kenya
- Evans Otieno Omondi
  - Institute of Mathematical Sciences, Strathmore University, P.O Box 59857-00200, Nairobi, Kenya
  - African Population and Health Research Center, P.O Box 10787-00100, APHRC Campus, Kitisuru, Nairobi, Kenya
- Bernard Oguna Omolo
  - Institute of Mathematical Sciences, Strathmore University, P.O Box 59857-00200, Nairobi, Kenya
  - Division of Mathematics and Computer Science, University of South Carolina-Upstate, Hodge Center 223 800 University Way, Spartanburg, SC, 29303, USA
  - Faculty of Health Sciences, School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
|
6
|
Rezaei F, Akbari M, Rafiei Y, Hemmati-Sarapardeh A. Compositional modeling of gas-condensate viscosity using ensemble approach. Sci Rep 2023; 13:9659. [PMID: 37316502 DOI: 10.1038/s41598-023-36122-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 05/30/2023] [Indexed: 06/16/2023] Open
Abstract
In gas-condensate reservoirs, liquid dropout occurs near the wellbore when the pressure falls below the dew point pressure. Estimating the production rate in these reservoirs is important, and it requires the viscosity of the liquid that drops out below the dew point. In this study, the most comprehensive database of gas-condensate viscosity, comprising 1370 laboratory data points, was used. Several intelligent techniques, including ensemble methods, support vector regression (SVR), K-nearest neighbors (KNN), radial basis function (RBF), and multilayer perceptron (MLP) optimized by Bayesian regularization (BR) and Levenberg-Marquardt (LM), were applied for modeling. Models presented in the literature use the solution gas-oil ratio (Rs) as one of their input parameters. Measuring Rs at the wellhead requires special equipment and is somewhat difficult, and measuring it in the laboratory takes time and money. For these reasons, and unlike previous work, Rs was not used to develop the models in this research. The input parameters were temperature, pressure, and condensate composition. The data cover a wide range of temperature and pressure, and the models presented in this research are the most accurate to date for predicting condensate viscosity. Using these intelligent approaches, precise compositional models were developed to predict the viscosity of gas condensate at different temperatures and pressures and for different compositions. The ensemble method was the most accurate model, with an average absolute percent relative error (AAPRE) of 4.83%. The AAPRE values for the SVR, KNN, MLP-BR, MLP-LM, and RBF models developed in this study are 4.95%, 5.45%, 6.56%, 7.89%, and 10.9%, respectively. The effect of the input parameters on condensate viscosity was then determined with the relevancy factor, using the results of the ensemble method. The most negative and most positive effects on gas-condensate viscosity were those of the reservoir temperature and the mole fraction of C11, respectively. Finally, suspicious laboratory data were identified and reported using the leverage technique.
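The relevancy-factor ranking described above can be sketched as follows. The abstract does not define the relevancy factor; the sketch assumes it is the Pearson correlation coefficient between one input and the model output, which is a common choice in this kind of sensitivity analysis. The function name and the temperature and viscosity values are hypothetical.

```python
import math


def relevancy_factor(inputs, outputs):
    """Pearson correlation between one input variable and the output (assumed definition)."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    spread_x = sum((x - mean_x) ** 2 for x in inputs)
    spread_y = sum((y - mean_y) ** 2 for y in outputs)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical data: viscosity falling as temperature rises.
temperature_k = [300.0, 320.0, 340.0, 360.0]
viscosity_cp = [0.40, 0.33, 0.29, 0.24]
r_temperature = relevancy_factor(temperature_k, viscosity_cp)
print(f"relevancy factor for temperature: {r_temperature:.3f}")
```

A value near -1 indicates a strong negative effect and a value near +1 a strong positive effect, which is how a statement such as "temperature has the most negative effect" would be read under this definition.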
Affiliation(s)
- Farzaneh Rezaei
  - Department of Petroleum Engineering, Amirkabir University of Technology, Tehran, Iran
- Mohammad Akbari
  - Department of Mathematics and Computer Science, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
- Yousef Rafiei
  - Department of Petroleum Engineering, Amirkabir University of Technology, Tehran, Iran
- Abdolhossein Hemmati-Sarapardeh
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
  - State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing, China
|
7
|
Zheng H, Mahmoudzadeh A, Amiri-Ramsheh B, Hemmati-Sarapardeh A. Modeling Viscosity of CO2-N2 Gaseous Mixtures Using Robust Tree-Based Techniques: Extra Tree, Random Forest, GBoost, and LightGBM. ACS OMEGA 2023; 8:13863-13875. [PMID: 37091404 PMCID: PMC10116627 DOI: 10.1021/acsomega.3c00228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Accepted: 03/23/2023] [Indexed: 05/03/2023]
Abstract
Carbon dioxide (CO2) has an essential role in most enhanced oil recovery (EOR) methods in the oil industry. Oil swelling and viscosity reduction are the dominant mechanisms in an immiscible CO2-EOR process. Despite the numerous applications of CO2 in EOR, most oil reservoirs do not have access to natural CO2, and capturing it from flue gas and other sources is costly. Flue gases are available in huge quantities at a significantly lower price and can be considered economically viable agents for EOR operations. In this work, four powerful machine learning algorithms, namely, extra tree (ET), random forest (RF), gradient boosting (GBoost), and light gradient boosted machine (LightGBM), were utilized to accurately estimate the viscosity of CO2-N2 mixtures. To this end, a databank containing 3036 data points over wide ranges of pressure and temperature was employed. Temperature, pressure, and CO2 mole fraction were the input parameters, and the viscosity of the CO2-N2 mixture was the output. The RF model had the highest precision, with the lowest average absolute percent relative error (AAPRE) of 1.58%, a root mean square error (RMSE) of 2.221, and a determination coefficient (R2) of 0.9993. The trend analysis showed that the RF model could precisely predict the real physical behavior of the CO2-N2 viscosity variation. Finally, outlier detection was performed using the leverage approach to demonstrate the validity of the utilized databank and the applicability domain of the developed RF model. Accordingly, nearly 96% of the data points appeared dependable and valid, and the rest were located in the suspected and out-of-leverage data zones.
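The leverage approach mentioned at the end of the abstract sorts data points into valid, suspected, and out-of-leverage zones. The sketch below is a simplified single-input illustration, not the paper's procedure: it assumes the commonly used warning leverage H* = 3(p + 1)/n and a standardized-residual cutoff of ±3, computes hat values with the closed form for a straight-line fit with intercept, and scales residuals by their root mean square. All names and data are hypothetical.

```python
import math


def leverage_screen(x, residuals, cutoff=3.0):
    """Label each point as 'valid', 'suspected', or 'out of leverage'.

    Assumptions: one input variable (p = 1), warning leverage 3*(p + 1)/n,
    and residuals standardized by their root mean square.
    """
    n, p = len(x), 1
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    # Hat-matrix diagonal for simple linear regression with an intercept.
    hat = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    warning_leverage = 3.0 * (p + 1) / n
    scale = math.sqrt(sum(e * e for e in residuals) / n)
    labels = []
    for h, e in zip(hat, residuals):
        if abs(e / scale) > cutoff:
            labels.append("suspected")
        elif h > warning_leverage:
            labels.append("out of leverage")
        else:
            labels.append("valid")
    return labels


# Hypothetical data: 20 points, one far from the others in x, one with a large residual.
x_values = [float(i) for i in range(1, 20)] + [60.0]
model_residuals = [0.5, -0.5] * 10
model_residuals[5] = 10.0
zones = leverage_screen(x_values, model_residuals)
valid_share = 100.0 * zones.count("valid") / len(zones)
print(zones.count("valid"), zones.count("suspected"), zones.count("out of leverage"))
```

With several inputs, the hat values would instead come from the diagonal of X(XᵀX)⁻¹Xᵀ, and the share of points labeled valid corresponds to a figure like the "nearly 96%" reported above.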
Affiliation(s)
- Haimin Zheng
  - Engn & Design Dept, Proc Sect, CNOOC Research Institute Co., Beijing 100027, P.R. China
- Atena Mahmoudzadeh
  - Department of Chemical and Petroleum Engineering, Sharif University of Technology, Tehran 1234567812, Iran
- Behnam Amiri-Ramsheh
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman 1234567891, Iran
- Abdolhossein Hemmati-Sarapardeh
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman 1234567891, Iran
  - State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing 102249, China
|
8
|
Prediction of the Ibuprofen Loading Capacity of MOFs by Machine Learning. BIOENGINEERING (BASEL, SWITZERLAND) 2022; 9:bioengineering9100517. [PMID: 36290485 PMCID: PMC9598200 DOI: 10.3390/bioengineering9100517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 09/14/2022] [Accepted: 09/28/2022] [Indexed: 11/07/2022]
Abstract
Metal-organic frameworks (MOFs) have been widely researched as drug delivery systems due to their intrinsic porous structures. Herein, machine learning (ML) technologies were applied for the screening of MOFs with high drug loading capacity. To achieve this, a comprehensive dataset was first gathered, including 40 data points from more than 100 different publications. The organic linkers, metal ions, and functional groups, as well as the surface area and pore volume of the investigated MOFs, were chosen as the model’s inputs, and the output was the ibuprofen (IBU) loading capacity. Thereafter, several advanced machine learning algorithms, namely support vector regression (SVR), random forest (RF), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), were employed to predict the ibuprofen loading capacity of MOFs. Coefficients of determination (R2) of 0.70, 0.72, 0.66, and 0.76 were obtained for the SVR, RF, AdaBoost, and CatBoost approaches, respectively. Among all the algorithms, CatBoost was the most reliable, exhibiting superior performance on sparse matrices and categorical features. Shapley additive explanations (SHAP) analysis was employed to explore the impact of the eigenvalues of the model’s outputs. Our initial results indicate that this methodology is a well-generalized, straightforward, and cost-effective method that can be applied not only to the prediction of IBU loading capacity but also to many other biomaterials projects.
|
9
|
Mohammadi MR, Hadavimoghaddam F, Atashrouz S, Abedi A, Hemmati-Sarapardeh A, Mohaddespour A. Modeling the solubility of light hydrocarbon gases and their mixture in brine with machine learning and equations of state. Sci Rep 2022; 12:14943. [PMID: 36056055 PMCID: PMC9440136 DOI: 10.1038/s41598-022-18983-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/31/2021] [Accepted: 08/23/2022] [Indexed: 11/09/2022] Open
Abstract
Knowledge of the solubilities of the hydrocarbon components of natural gas in pure water and aqueous electrolyte solutions is important for engineering design and environmental assessment. In the current work, six machine-learning algorithms, namely Random Forest, Extra Tree, adaptive boosting support vector regression (AdaBoost-SVR), Decision Tree, group method of data handling (GMDH), and genetic programming (GP), were proposed for estimating the solubility of methane, ethane, propane, and n-butane, pure and in mixtures, in pure water and aqueous electrolyte systems. To this end, a large database of hydrocarbon gas solubility (1836 experimental data points) was prepared over extensive ranges of operating temperature (273-637 K) and pressure (0.051-113.27 MPa). Two modeling approaches, with eight and five inputs, were adopted. Moreover, three well-known equations of state (EOSs), namely Peng-Robinson (PR), the Valderrama modification of Patel-Teja (VPT), and Soave-Redlich-Kwong (SRK), were compared with the machine-learning models. The AdaBoost-SVR models developed with eight and five inputs outperform the other models proposed in this study, the EOSs, and available intelligent models in predicting the solubility of pure and/or mixed hydrocarbon gases in pure water and aqueous electrolyte systems up to high-pressure and high-temperature conditions, with average absolute relative error values of 10.65% and 12.02%, respectively, and a determination coefficient of 0.9999. Among the EOSs, VPT gave the best predictions, followed by SRK and PR. The two mathematical correlations developed with GP and GMDH also gave satisfactory results and can provide accurate and quick estimates. According to the sensitivity analysis, temperature and pressure had the greatest effect on hydrocarbon gas solubility. Additionally, increasing the ionic strength of the solution and the pseudo-critical temperature of the gas mixture decreases the solubilities of hydrocarbon gases in aqueous electrolyte systems. Finally, the leverage approach confirmed the validity of the hydrocarbon solubility databank and the reliability of the AdaBoost-SVR models in estimating the solubilities of hydrocarbon gases in aqueous solutions.
Affiliation(s)
- Fahimeh Hadavimoghaddam
  - Key Laboratory of Continental Shale Hydrocarbon Accumulation and Efficient Development (Northeast Petroleum University), Ministry of Education, Northeast Petroleum University, Daqing, 163318, Heilongjiang, China
  - Institute of Unconventional Oil and Gas, Northeast Petroleum University, Daqing, 163318, China
- Saeid Atashrouz
  - Department of Chemical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
- Ali Abedi
  - College of Engineering and Technology, American University of the Middle East, Kuwait City, Kuwait
- Abdolhossein Hemmati-Sarapardeh
  - Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
  - College of Construction Engineering, Jilin University, Changchun, China
- Ahmad Mohaddespour
  - Department of Chemical Engineering, McGill University, Montreal, QC, H3A 0C5, Canada
|
10
|
Mapping Seasonal Leaf Nutrients of Mangrove with Sentinel-2 Images and XGBoost Method. REMOTE SENSING 2022. [DOI: 10.3390/rs14153679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
Monitoring the seasonal leaf nutrients of mangrove forests helps one to understand the dynamics of carbon (C) sequestration and to diagnose the availability and limitation of nitrogen (N) and phosphorus (P). To date, very little attention has been paid to mapping the seasonal leaf C, N, and P of mangrove forests with remote sensing techniques. Based on Sentinel-2 images taken in spring, summer, and winter, this study aimed to compare three machine learning models (XGBoost, extreme gradient boosting; RF, random forest; LightGBM, light gradient boosting machine) in estimating the three leaf nutrients and then to apply the best-performing model to map the leaf nutrients of 15 seasons from 2017 to 2021. The results showed that there were significant differences in leaf nutrients (p < 0.05) across the three seasons. Among the three machine learning models, XGBoost with sensitive spectral features of Sentinel-2 images was optimal for estimating the leaf C (R² = 0.655, 0.799, and 0.829 in spring, summer, and winter, respectively), N (R² = 0.668, 0.743, and 0.704) and P (R² = 0.539, 0.622, and 0.596) over the three seasons. Moreover, the red-edge (especially B6) and near-infrared bands (B8 and B8a) of Sentinel-2 images were efficient estimators of mangrove leaf nutrients. Information on species, elevation, and canopy structure (leaf area index [LAI] and canopy height) should be incorporated into the present model in future studies to improve its accuracy and transferability.
|
11
|
Nakhaei-Kohani R, Ali Madani S, Mousavi SP, Atashrouz S, Abedi A, Hemmati-Sarapardeh A, Mohaddespour A. Machine Learning Assisted Structure-based Models for Predicting Electrical Conductivity of Ionic Liquids. J Mol Liq 2022. [DOI: 10.1016/j.molliq.2022.119509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
12
|
Numerical Analysis of Shallow Foundations with Varying Loading and Soil Conditions. BUILDINGS 2022. [DOI: 10.3390/buildings12050693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
The load–deformation relationship under the footing is essential for foundation design. Shallow foundations are subjected to changes in hydrological conditions such as rainfall and drought, which affect the saturation level and condition of the soil. The actual load–settlement response for design and reconstruction is determined experimentally, numerically, or by both approaches. Settlement computation is performed through large-scale physical modeling or extensive laboratory testing, which is expensive, labor intensive, and time consuming. This study determines the effect of different degrees of saturation and loading conditions on the settlement of shallow foundations using numerical modeling in Plaxis 2D (Bentley Systems, Exton, Pennsylvania, US). The plastic calculation type was used for dry soil, while the fully coupled flow-deformation type was used for partially saturated soil. Pore pressure and deformation changes were computed in the fully coupled analysis. The Mohr–Coulomb model was used in the simulation, and model parameters were calculated from experimental results. The study results show that the degree of saturation is more critical to soil settlement than loading conditions. When a 200 kPa load was applied at the center of the footing, settlement was recorded as 28.81 mm, which was less than the 42.96 mm in the case of the full-depth shale layer; therefore, settlement was reduced by 30% in the underlying limestone rock layer. Regarding settlement under various degrees of saturation (DOS), settlement increases with the degree of saturation, which increases pore pressure and decreases the shear strength of the soil. Settlement was observed as 0.69 mm at 0% saturation, 1.93 mm at 40% saturation, 2.21 mm at 50% saturation, 2.77 mm at 70% saturation, and 2.84 mm at 90% saturation of soil.
|
13
|
Mohammadi MR, Hadavimoghaddam F, Atashrouz S, Abedi A, Hemmati-Sarapardeh A, Mohaddespour A. Toward predicting SO2 solubility in ionic liquids utilizing soft computing approaches and equations of state. J Taiwan Inst Chem Eng 2022. [DOI: 10.1016/j.jtice.2022.104220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
14
|
Numerical Analysis of Piled-Raft Foundations on Multi-Layer Soil Considering Settlement and Swelling. BUILDINGS 2022. [DOI: 10.3390/buildings12030356] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
Numerical modelling can simulate the interaction between structural elements and the soil continuum in a piled-raft foundation. The present work used the two-dimensional finite element software Plaxis 2D to investigate the settlement, swelling, and structural behavior of foundations on various soil profiles under various load combinations and geometry conditions. Field and laboratory testing were performed to determine the soil parameters necessary for numerical modelling. The Mohr–Coulomb model is used to simulate the behavior of soil, as this model requires very few input parameters, which is important for practical geotechnical applications. From this study, it was observed that, because the soil is soft and has low stiffness, the un-piled raft was not sufficient to resist higher loads and exceeded the settlement limits. A piled raft increases the load-carrying capacity of the soil, and because the pile rests on a lower soil layer with higher stiffness, settlement decreases significantly. Further, the effects of (L/d) and (s/d) of the pile and Krs on the settlement are also discussed and detailed numerically under different scenarios. The swelling of expansive soil was also simulated in Plaxis 2D by applying a positive volumetric strain. The above-mentioned parametric study was similarly implemented for the heaving of foundations on expansive soil.
|
15
|
A modeling approach for estimating hydrogen sulfide solubility in fifteen different imidazole-based ionic liquids. Sci Rep 2022; 12:4415. [PMID: 35292713 PMCID: PMC8924225 DOI: 10.1038/s41598-022-08304-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/17/2021] [Accepted: 03/07/2022] [Indexed: 11/25/2022] Open
Abstract
Absorption has always been an attractive process for removing hydrogen sulfide (H2S). Possessing unique properties and promising removal capacity, ionic liquids (ILs) are potential media for H2S capture. Engineering design of such an absorption process needs accurate measurements or reliable estimation of the H2S solubility in ILs. Since experimental measurements are time-consuming and expensive, this study uses machine learning methods to estimate H2S solubility in fifteen different ILs accurately. Six robust machine learning methods, including adaptive neuro-fuzzy inference system, least-squares support vector machine (LS-SVM), radial basis function, cascade, multilayer perceptron, and generalized regression neural networks, are implemented and compared. A vast experimental databank comprising 792 datasets was used. Temperature, pressure, acentric factor, critical pressure, and critical temperature of the investigated ILs are the input parameters of the models. Sensitivity and statistical error analyses were used to assess the performance and accuracy of the proposed models. The calculated solubility data and the derived models were validated using seven statistical criteria. The obtained results showed that the LS-SVM accurately predicts H2S solubility in ILs and possesses R², RMSE, MSE, RRSE, RAE, MAE, and AARD of 0.99798, 0.01079, 0.00012, 6.35%, 4.35%, 0.0060, and 4.03, respectively. It was found that the H2S solubility decreases with increasing temperature and increases with increasing pressure. Furthermore, the combination of OMIM+ and Tf2N-, i.e., the [OMIM][Tf2N] ionic liquid, is the best choice for H2S capture among the investigated absorbents. The H2S solubility in this ionic liquid can exceed 0.8 in terms of mole fraction.
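The seven statistical criteria named in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are the commonly used textbook ones and are an assumption; the function name `regression_metrics` and the sample values are hypothetical and not taken from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute common regression error metrics (assumed textbook definitions)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(r * r for r in residuals)              # sum of squared errors
    sae = sum(abs(r) for r in residuals)             # sum of absolute errors
    sst = sum((t - mean_y) ** 2 for t in y_true)     # total sum of squares
    sat = sum(abs(t - mean_y) for t in y_true)       # total absolute deviation
    return {
        "R2": 1.0 - sse / sst,
        "MSE": sse / n,
        "RMSE": math.sqrt(sse / n),
        "MAE": sae / n,
        "RRSE_pct": 100.0 * math.sqrt(sse / sst),    # root relative squared error
        "RAE_pct": 100.0 * sae / sat,                # relative absolute error
        # average absolute relative deviation; requires nonzero y_true values
        "AARD_pct": 100.0 / n * sum(abs(r / t) for r, t in zip(residuals, y_true)),
    }

# Hypothetical measured and predicted values, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting both absolute metrics (MSE, RMSE, MAE) and relative ones (RRSE, RAE, AARD) is useful because the relative metrics stay comparable across data sets whose target values differ in scale.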
|
16
|
Modeling Interfacial Tension of N2/CO2 Mixture + n-Alkanes with Machine Learning Methods: Application to EOR in Conventional and Unconventional Reservoirs by Flue Gas Injection. MINERALS 2022. [DOI: 10.3390/min12020252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
The combustion of fossil fuels in oil refineries and power plants, and the venting or flaring of produced gases in oil fields, lead to greenhouse gas emissions. Economic usage of greenhouse and flue gases in conventional and unconventional reservoirs would not only enhance oil and gas recovery but also offer CO2 sequestration. In this regard, the accurate estimation of the interfacial tension (IFT) between the injected gases and the crude oils is crucial for the successful execution of injection scenarios in enhanced oil recovery (EOR) operations. In this paper, the IFT between a CO2/N2 mixture and n-alkanes at different pressures and temperatures is investigated using machine learning (ML) methods. To this end, a data set containing 268 IFT data points was gathered from the literature. Pressure, temperature, the carbon number of n-alkanes, and the mole fraction of N2 were selected as the input parameters. Then, six well-known ML methods (radial basis function (RBF), the adaptive neuro-fuzzy inference system (ANFIS), the least square support vector machine (LSSVM), random forest (RF), multilayer perceptron (MLP), and extremely randomized tree (extra-tree)) were used along with four optimization methods (colliding bodies optimization (CBO), particle swarm optimization (PSO), the Levenberg–Marquardt (LM) algorithm, and coupled simulated annealing (CSA)) to model the IFT of the CO2/N2 mixture and n-alkanes. The RBF model predicted all the IFT values with exceptional precision, with an average absolute relative error of 0.77%, and outperformed all other models in this paper and those available in the literature. Furthermore, sensitivity analysis showed that the pressure and the carbon number of n-alkanes have the highest influence on the IFT of the CO2/N2 mixture and n-alkanes. Finally, the utilized IFT database and the applicability domain of the RBF model were investigated via the leverage method.
|
17
|
Modeling of nitrogen solubility in unsaturated, cyclic, and aromatic hydrocarbons: Deep learning methods and SAFT equation of state. J Taiwan Inst Chem Eng 2022. [DOI: 10.1016/j.jtice.2021.10.024] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/31/2022]
|
18
|
Liu Y, Qi Z, Zhao M, Jiang H, Liu Y, Chen R. Kinetics of liquid-phase phenol hydrogenation enhanced by membrane dispersion. Chem Eng Sci 2022. [DOI: 10.1016/j.ces.2021.117346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
19
|
Travel Time Prediction and Explanation with Spatio-Temporal Features: A Comparative Study. ELECTRONICS 2021. [DOI: 10.3390/electronics11010106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Travel time information is used as input or auxiliary data for tasks such as dynamic navigation, infrastructure planning, congestion control, and accident detection. Various data-driven Travel Time Prediction (TTP) methods have been proposed in recent years. One of the most challenging tasks in TTP is developing and selecting the most appropriate prediction algorithm. The existing studies that empirically compare different TTP models only use a few models with specific features. Moreover, there is a lack of research on explaining TTPs made by black-box models. Such explanations can help to tune and apply TTP methods successfully. To fill these gaps in the current TTP literature, using three data sets, we compare three types of TTP methods (ensemble tree-based learning, deep neural networks, and hybrid models) and ten different prediction algorithms overall. Furthermore, we apply XAI (Explainable Artificial Intelligence) methods (SHAP and LIME) to understand and interpret models’ predictions. The prediction accuracy and reliability for all models are evaluated and compared. We observed that the ensemble learning methods, i.e., XGBoost and LightGBM, are the best performing models over the three data sets, and XAI methods can adequately explain how various spatial and temporal features influence travel time.
|
20
|
Modeling of nitrogen solubility in normal alkanes using machine learning methods compared with cubic and PC-SAFT equations of state. Sci Rep 2021; 11:24403. [PMID: 34937872 PMCID: PMC8695585 DOI: 10.1038/s41598-021-03643-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/28/2021] [Accepted: 12/07/2021] [Indexed: 11/24/2022] Open
Abstract
Accurate prediction of the solubility of gases in hydrocarbons is a crucial factor in designing enhanced oil recovery (EOR) operations by gas injection, as well as separation and chemical reaction processes in a petroleum refinery. In this work, nitrogen (N2) solubility in normal alkanes, the major constituents of crude oil, was modeled using five representative machine learning (ML) models, namely gradient boosting with categorical features support (CatBoost), random forest, light gradient boosting machine (LightGBM), k-nearest neighbors (k-NN), and extreme gradient boosting (XGBoost). A large solubility databank containing 1982 data points was used to establish the models for predicting N2 solubility in normal alkanes as a function of pressure, temperature, and molecular weight of normal alkanes over broad ranges of operating pressure (0.0212–69.12 MPa) and temperature (91–703 K). The molecular weight of the normal alkanes ranged from 16 to 507 g/mol. Also, five equations of state (EOSs), including Redlich–Kwong (RK), Soave–Redlich–Kwong (SRK), Zudkevitch–Joffe (ZJ), Peng–Robinson (PR), and perturbed-chain statistical associating fluid theory (PC-SAFT), were compared with the ML models in estimating N2 solubility in normal alkanes. Results revealed that the CatBoost model is the most precise model in this work, with a root mean square error of 0.0147 and a coefficient of determination of 0.9943. The ZJ EOS provided the best estimates of N2 solubility in normal alkanes among the EOSs. Lastly, the results of relevancy factor analysis indicated that pressure has the greatest influence on N2 solubility in normal alkanes and that N2 solubility increases with increasing molecular weight of normal alkanes.
|