1
Pradhan UK, Meher PK, Naha S, Pal S, Gupta S, Gupta A, Parsad R. RBPLight: a computational tool for discovery of plant-specific RNA-binding proteins using light gradient boosting machine and ensemble of evolutionary features. Brief Funct Genomics 2023; 22:401-410. [PMID: 37158175 DOI: 10.1093/bfgp/elad016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 04/12/2023] [Accepted: 04/21/2023] [Indexed: 05/10/2023]
Abstract
RNA-binding proteins (RBPs) are essential for post-transcriptional gene regulation in eukaryotes, including splicing control, mRNA transport and decay. Thus, accurate identification of RBPs is important for understanding gene expression and the regulation of cell state. A number of computational models have been developed to detect RBPs. These methods made use of datasets from several eukaryotic species, specifically mice and humans. Although some models have been tested on Arabidopsis, these techniques fall short of correctly identifying RBPs in other plant species. Therefore, a powerful computational model for identifying plant-specific RBPs is needed. In this study, we present a novel computational model for identifying RBPs in plants. Five deep learning models and ten shallow learning algorithms were used for prediction with 20 sequence-derived and 20 evolutionary feature sets. The highest repeated five-fold cross-validation accuracy, 91.24% AU-ROC and 91.91% AU-PRC, was achieved by the light gradient boosting machine. When evaluated on an independent dataset, the developed approach achieved 94.00% AU-ROC and 94.50% AU-PRC. The proposed model achieved significantly higher accuracy in predicting plant-specific RBPs than the currently available state-of-the-art RBP prediction models. Although certain models have already been trained and assessed on the model organism Arabidopsis, this is the first comprehensive computational model for the discovery of plant-specific RBPs. The web server RBPLight, publicly accessible at https://iasri-sg.icar.gov.in/rbplight/, was also developed to help researchers identify RBPs in plants.
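As a reading aid for the AU-ROC figures quoted in the abstract, the following is a minimal sketch of how that metric can be computed from predicted scores. It is not the authors' code: the function name `auroc` and the toy labels and scores are hypothetical, and the pairwise-ranking formulation is just one standard way to obtain the area under the ROC curve.

```python
def auroc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive example scores higher than a randomly
    chosen negative one (ties count as half a win)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = RBP, 0 = non-RBP (not taken from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(labels, scores))  # 8 of the 9 positive/negative pairs are ranked correctly
```

In a repeated five-fold cross-validation, a value like this would be computed on each held-out fold and then averaged across folds and repeats.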
Affiliation(s)
- Upendra K Pradhan
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Prabina K Meher
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Sanchita Naha
  - Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Soumen Pal
  - Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Sagar Gupta
  - CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur (HP) 176061, India
- Ajit Gupta
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Rajender Parsad
  - ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
2
Pradhan UK, Meher PK, Naha S, Sharma NK, Agarwal A, Gupta A, Parsad R. DBPMod: a supervised learning model for computational recognition of DNA-binding proteins in model organisms. Brief Funct Genomics 2023:elad039. [PMID: 37651627 DOI: 10.1093/bfgp/elad039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 08/09/2023] [Accepted: 08/15/2023] [Indexed: 09/02/2023]
Abstract
DNA-binding proteins (DBPs) play critical roles in many biological processes, including gene expression, DNA replication, recombination and repair. Understanding the molecular mechanisms underlying these processes depends on the precise identification of DBPs. In recent times, several computational methods have been developed to identify DBPs. However, because these models are generic, they are unable to identify species-specific DBPs with high accuracy. Therefore, a species-specific computational model is needed to predict species-specific DBPs. In this paper, we introduce the computational method DBPMod, which uses a machine learning approach to identify species-specific DBPs. For prediction, both shallow learning algorithms and deep learning models were used, with shallow learning models achieving higher accuracy. Additionally, the evolutionary features outperformed sequence-derived features in terms of accuracy. Five model organisms, namely Caenorhabditis elegans, Drosophila melanogaster, Escherichia coli, Homo sapiens and Mus musculus, were used to assess the performance of DBPMod. Five-fold cross-validation and independent test set analyses were used to evaluate the prediction accuracy in terms of area under the receiver operating characteristic curve (auROC) and area under the precision-recall curve (auPRC), which were found to be ~89-92% and ~89-95%, respectively. The comparative results demonstrate that DBPMod outperforms 12 current state-of-the-art computational approaches in identifying DBPs for all five model organisms. We further developed a DBPMod web server to make it easier for researchers to detect DBPs; it is publicly available at https://iasri-sg.icar.gov.in/dbpmod/. DBPMod is expected to be an invaluable tool for discovering DBPs, supplementing the current experimental and computational methods.
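To make the auPRC figure in the abstract more concrete, here is a minimal sketch of average precision, one common estimator of the area under the precision-recall curve. The abstract does not say which estimator the authors used, so this choice is an assumption, and the function name `average_precision` and the toy data are hypothetical.

```python
def average_precision(labels, scores):
    """Average precision: the mean of the precision values measured at
    the rank of each true positive, with examples sorted by descending
    score. Tied scores are ordered arbitrarily in this simple sketch."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_positives


# Hypothetical toy data: 1 = DBP, 0 = non-DBP (not taken from the paper).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(labels, scores))  # (1/1 + 2/3) / 2
```

Unlike auROC, this metric depends on the proportion of positives in the dataset, which is one reason the two are often reported together.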
Affiliation(s)
- Upendra K Pradhan
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Prabina K Meher
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Sanchita Naha
  - Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Nitesh K Sharma
  - Titus Family Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, 1540 Alcazar Street, Los Angeles, CA 90033, USA
- Aarushi Agarwal
  - Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh 201313, India
- Ajit Gupta
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Rajender Parsad
  - ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
3
Rao MN, Gaikwad S, Ram A, Pradhan UK, Sautya S, Kumbhar L, Udayakrishnan PB, Siddaiha V. Effects of sedimentary heavy metals on meiobenthic community in tropical estuaries along eastern Arabian Sea. Environ Geochem Health 2023; 45:731-750. [PMID: 35292879 DOI: 10.1007/s10653-022-01239-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 02/25/2022] [Indexed: 06/14/2023]
Abstract
The central west coast of India comprises the 720 km long coastline of Maharashtra state and houses widespread industrial zones along the eastern Arabian Sea. Sediments from seven industry-dominated estuaries along the central west coast were studied for metal enrichment and benthic assemblages to determine sediment quality status and ecological effects in these areas. The suite of geochemical indices highlighted heavy metal contamination of sediment in the estuaries. Positive correlations of Hg with Co, Zn, Ni, Cr, and Pb indicated a similar source and the effect of anthropogenic activity. Non-metric Multidimensional Scaling (n-MDS) based on meiofaunal abundance showed a clear separation of clusters along the gradient of heavy metal concentrations. The Canonical Correspondence Analysis (CCA) results with the Monte Carlo test indicated that heavy metals influenced the meiobenthic community. Heavy metals (Cr, Ni, Zn, Cd, Pb, and Hg) were the main drivers shaping the meiofaunal community, with a significant (p < 0.05) reduction in taxa richness, diversity, and evenness. The dominant meiofaunal assemblages indicate the tolerance of foraminiferans and nematodes. However, these taxa also showed decreased abundance at impacted sites compared to other fauna. In conclusion, the results demonstrated that impairment occurred in the meiofaunal community in most estuaries (except AB and KK).
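The abstract reports reductions in taxa richness, diversity, and evenness without naming the indices used. As an illustration only, the sketch below assumes the Shannon diversity index and Pielou's evenness, which are common choices in benthic ecology; the function name `diversity_summary` and the abundance counts are hypothetical and not taken from the study.

```python
import math


def diversity_summary(counts):
    """Return (richness, Shannon diversity H', Pielou evenness J') for a
    mapping of taxon name to abundance. Taxa with zero abundance are ignored."""
    present = [c for c in counts.values() if c > 0]
    total = sum(present)
    richness = len(present)
    # H' = -sum(p_i * ln p_i), where p_i is the relative abundance of taxon i
    shannon = -sum((c / total) * math.log(c / total) for c in present)
    # J' = H' / ln(S); defined as 0 when fewer than two taxa are present
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Hypothetical abundances (individuals per sample) at two illustrative sites.
reference_site = {"Nematoda": 50, "Foraminifera": 50, "Copepoda": 50, "Polychaeta": 50}
impacted_site = {"Nematoda": 90, "Foraminifera": 10, "Copepoda": 0, "Polychaeta": 0}

print(diversity_summary(reference_site))
print(diversity_summary(impacted_site))
```

In this made-up example the impacted site scores lower on all three measures, the same direction of change the abstract describes for metal-contaminated estuaries.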
Affiliation(s)
- M Nageswar Rao
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India
  - Department of Organic Chemistry and Food, Drug and Water, Andhra University, Visakhapatnam, 530003, India
- S Gaikwad
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India
- Anirudh Ram
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India
- U K Pradhan
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India.
- S Sautya
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India
- L Kumbhar
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India
- P B Udayakrishnan
  - CSIR-National Institute of Oceanography, Regional Centre, Mumbai, 400053, India
- V Siddaiha
  - Department of Organic Chemistry and Food, Drug and Water, Andhra University, Visakhapatnam, 530003, India
4
Pradhan UK, Wu Y, Shirodkar PV, Kumar HS, Zhang J. Connecting land use-land cover and precipitation with organic matter biogeochemistry in a tropical river-estuary system of western peninsular India. J Environ Manage 2020; 271:110993. [PMID: 32778283 DOI: 10.1016/j.jenvman.2020.110993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2020] [Revised: 05/20/2020] [Accepted: 06/17/2020] [Indexed: 06/11/2023]
Abstract
Changes in organic matter (OM) composition caused by land use-land cover (LULC) and hydrology modification are distinctly linked to sustainable environmental management in tropical river systems. This linkage is crucial in small river systems, which experience delayed freshwater flow to the estuaries due to headwater damming as well as LULC alteration along the entire basin. To understand this fundamental linkage in the tropical Zuari river-estuary (ZRE), we analyzed multi-proxy data comprising the organic carbon to total nitrogen ratio (Corg/N), the stable organic carbon isotope (δ13Corg) and lignin phenols measured in seasonally collected suspended particulate matter (SPM) and sediment samples. The results highlighted moderate seasonality of OM tracers, with a significant effect of LULC alteration, which is nevertheless a striking feature in monsoon-fed river-estuaries of peninsular India. Particulate Corg export from the ZRE, estimated at 20 × 10³ kg yr⁻¹, was much lower than that of tropical river-estuary systems elsewhere. The OM fraction contributed by vascular plants (mangroves) to SPM and sediment was 15% and 40%, respectively, as calculated with a Bayesian mixing calculation through Stable Isotope Analysis in R (SIAR). The presence of mudflat LULC in the estuarine region notably caused a 20% decrease in Corg and a 60% increase in lignin phenols (Λ8) compared with upstream values, even though mudflats account for only 3% of the ZRE catchment. The degree of shift in OM tracers points to efficient entrapment, transformation and/or utilization of riverine OM in the mudflats of the ZRE. This study highlights that accelerated human-induced LULC change dampens the seasonality of OM characteristics and flow, an insight essential for sustainable environmental management practice in small rivers of India and the world.
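The source apportionment in this abstract was done with a Bayesian mixing calculation in SIAR. The sketch below is not that method: it shows the much simpler deterministic two-end-member mixing equation, which conveys how an isotope value translates into a source fraction. The function name `two_source_fraction` and every δ13C value are hypothetical illustrations, not data from the study.

```python
def two_source_fraction(delta_sample, delta_source_a, delta_source_b):
    """Fraction of source A in a two-source mixture, from the linear
    mixing equation:
        delta_sample = f * delta_source_a + (1 - f) * delta_source_b
    The result is clipped to the physically meaningful range [0, 1]."""
    if delta_source_a == delta_source_b:
        raise ValueError("end-members must have distinct isotope values")
    fraction = (delta_sample - delta_source_b) / (delta_source_a - delta_source_b)
    return min(1.0, max(0.0, fraction))


# Hypothetical end-members in per mil (assumed for illustration only):
# a terrestrial/mangrove-like source and a marine-like source.
DELTA_PLANT = -28.0
DELTA_MARINE = -21.0

sample = -24.5  # hypothetical sediment delta13Corg value
print(two_source_fraction(sample, DELTA_PLANT, DELTA_MARINE))  # halfway between the end-members
```

A Bayesian approach such as SIAR extends this idea by handling more than two sources, several tracers at once, and uncertainty in the end-member values, returning a distribution of plausible fractions rather than a single number.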
Affiliation(s)
- U K Pradhan
  - CSIR‒National Institute of Oceanography, Regional Centre, Lokhandwala Rd. Andheri (W), Mumbai, 400053, India.
- Y Wu
  - State Key Laboratory of Estuarine and Coastal Research, East China Normal University, 3663 North Zhongshan Road, 200062, Shanghai, PR China
- P V Shirodkar
  - CSIR‒National Institute of Oceanography, Dona Paula, 403004, Goa, India
- H Shiva Kumar
  - Advisory Services and Satellite oceanography Group (ASG), Indian National Centre for Ocean Information Services (INCOIS), Ocean Valley, Pragathi Nagar (BO), Nizampet (SO), Hyderabad, 500090, Andhra Pradesh, India
- J Zhang
  - State Key Laboratory of Estuarine and Coastal Research, East China Normal University, 3663 North Zhongshan Road, 200062, Shanghai, PR China
5
Nageswar Rao M, Ram A, Pradhan UK, Siddaiah V. Factors controlling organic matter composition and trophic state in seven tropical estuaries along the west coast of India. Environ Geochem Health 2019; 41:545-562. [PMID: 29982906 DOI: 10.1007/s10653-018-0150-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2018] [Accepted: 06/29/2018] [Indexed: 06/08/2023]
Abstract
To understand organic matter (OM) sources and trophic states, spatial and seasonal (post-monsoon and pre-monsoon) variation in sedimentary OM composition was investigated in seven tropical estuaries of the state of Maharashtra along the central west coast of India. Based on the results of cluster analysis, the estuaries were segregated into two distinct groups, Northern Maharashtra and Southern Maharashtra, owing to dissimilarity in OM characteristics potentially constrained by geomorphology and catchment properties. Enrichment of Corg and major biochemical compounds (lipids, carbohydrates and proteins) in the middle zone of most estuaries pointed to the addition of allochthonous OM. Results of principal component analysis highlighted a similar source of OM in most of the estuaries during both seasons, with its distribution largely constrained by changes in grain size. The benthic trophic state indicated the prevalence of a eutrophic state in the middle zone of the investigated estuaries, which may be sporadic and dependent upon anthropogenic activities in the study area.
Affiliation(s)
- M Nageswar Rao
  - Chemical Oceanography Division, Regional Centre, CSIR-National Institute of Oceanography, Lokhandwala Rd. Four Bungalows, Andheri (West), Mumbai, 400 053, India
- Anirudh Ram
  - Chemical Oceanography Division, Regional Centre, CSIR-National Institute of Oceanography, Lokhandwala Rd. Four Bungalows, Andheri (West), Mumbai, 400 053, India.
- U K Pradhan
  - Chemical Oceanography Division, Regional Centre, CSIR-National Institute of Oceanography, Lokhandwala Rd. Four Bungalows, Andheri (West), Mumbai, 400 053, India
- V Siddaiah
  - Department of Organic Chemistry & FDW, Andhra University, Visakhapatnam, 530 003, India
6
Shirodkar PV, Mesquita A, Pradhan UK, Verlekar XN, Babu MT, Vethamony P. Factors controlling physico-chemical characteristics in the coastal waters off Mangalore--a multivariate approach. Environ Res 2009; 109:245-257. [PMID: 19171328 DOI: 10.1016/j.envres.2008.11.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/01/2008] [Revised: 10/20/2008] [Accepted: 11/10/2008] [Indexed: 05/27/2023]
Abstract
Water quality parameters (temperature, pH, salinity, DO, BOD, suspended solids, nutrients, PHc, phenols, trace metals--Pb, Cd and Hg, chlorophyll-a (chl-a) and phaeopigments) and the sediment quality parameters (total phosphorus, total nitrogen, organic carbon and trace metals) were analysed from samples collected at 15 stations along 3 transects off Karnataka coast (Mangalore harbour in the south to Suratkal in the north), west coast of India during 2007. The analyses showed high ammonia off Suratkal, high nitrite (NO(2)-N) and nitrate (NO(3)-N) in the nearshore waters off Kulai and high nitrite (NO(2)-N) and ammonia (NH(3)-N) in the harbour area. Similarly, high petroleum hydrocarbon (PHc) values were observed near the harbour, while phenols remained high in the nearshore waters of Kulai and Suratkal. Significantly, high concentrations of cadmium and mercury with respect to the earlier studies were observed off Kulai and harbour regions, respectively. R-mode varimax factor analyses were applied separately to surface and bottom water data sets due to existing stratification in the water column caused by riverine inflow and to sediment data. This helped to understand the interrelationships between the variables and to identify probable source components for explaining the environmental status of the area. Six factors (each for surface and bottom waters) were found responsible for variance (86.9% in surface and 82.4% in bottom) in the coastal waters between Mangalore and Suratkal. In sediments, 4 factors explained 86.8% of the observed total variance. The variances indicated addition of nutrients and suspended solids to the coastal waters due to weathering and riverine transport and are categorized as natural sources.
The observed contamination of coastal waters indicated anthropogenic inputs of Cd and phenol from industrial effluent sources at Kulai and Suratkal, ammonia from wastewater discharges off Kulai and harbour, PHc and Hg from boat traffic and harbour activities of New Mangalore harbour. However, the strong seasonal currents and the seasonal winds keep the coastal waters well mixed and aerated, which helps to disperse the contaminants without significantly affecting chlorophyll-a concentrations. The interrelationship between the stations, as shown by cluster analyses and depicted in dendrograms, categorizes the contamination levels sector-wise.
Affiliation(s)
- P V Shirodkar
  - National Institute of Oceanography, Dona Paula, Goa 403 004, India.