1
Karasaki S, Morello-Frosch R, Callaway D. Machine learning for environmental justice: Dissecting an algorithmic approach to predict drinking water quality in California. Science of the Total Environment 2024; 951:175730. [PMID: 39187077 DOI: 10.1016/j.scitotenv.2024.175730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 08/19/2024] [Accepted: 08/21/2024] [Indexed: 08/28/2024]
Abstract
The potential for machine learning to answer questions of environmental science, monitoring, and regulatory enforcement is evident, but there is cause for concern regarding potential embedded bias: algorithms can codify discrimination and exacerbate systematic gaps. This paper, organized into two halves, underscores the importance of vetting algorithms for bias when used for questions of environmental science and justice. In the first half, we present a case study of using machine learning for environmental justice-motivated research: prediction of drinking water quality. While performance varied across models and contaminants, some performed well. Multiple models had overall accuracy rates at or above 90% and F2 scores above 0.60 on their respective test sets. In the second half, we dissect this algorithmic approach to examine how modeling decisions affect modeling outcomes: not only how these decisions change whether the model is correct or incorrect, but for whom. We find that multiple decision points in the modeling process can lead to different predictive outcomes. More importantly, we find that these choices can result in significant differences in demographic characteristics of false negatives. We conclude by proposing a set of practices for researchers and policy makers to follow (and improve upon) when applying machine learning to questions of environmental science, management, and justice.
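The abstract above reports overall accuracy and F2 scores. As a minimal sketch of how those two metrics relate to a confusion matrix, the Python below computes both from raw counts. The function names and the counts are hypothetical illustrations, not taken from the paper; the F-beta formula itself is the standard one, where beta = 2 weights recall more heavily than precision and so penalizes false negatives more than false positives.

```python
def fbeta_from_counts(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion-matrix counts; beta > 1 favors recall."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return (1 + b2) * tp / denom if denom else 0.0


def accuracy_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts for one contaminant model (assumption, not paper data).
counts = {"tp": 8, "tn": 86, "fp": 4, "fn": 2}
f2 = fbeta_from_counts(counts["tp"], counts["fp"], counts["fn"])
acc = accuracy_from_counts(**counts)
print(f"accuracy={acc:.2f}, F2={f2:.3f}")
```

With imbalanced classes, accuracy can be high while F2 stays modest, which is one reason both are worth reporting together.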
Affiliation(s)
- Seigi Karasaki
  - University of California Berkeley, Energy and Resources Group, Berkeley, California, United States
- Rachel Morello-Frosch
  - University of California Berkeley, Environmental Science, Policy, and Management, Berkeley, California, United States; University of California Berkeley, School of Public Health, Berkeley, California, United States
- Duncan Callaway
  - University of California Berkeley, Energy and Resources Group, Berkeley, California, United States

2
Archer H, González DJX, Walsh J, English P, Reynolds P, Boscardin WJ, Carpenter C, Morello-Frosch R. Upstream Oil and Gas Production and Community COVID-19 Case and Mortality Rates in California, USA. GeoHealth 2024; 8:e2024GH001070. [PMID: 39524319 PMCID: PMC11543630 DOI: 10.1029/2024gh001070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 10/15/2024] [Accepted: 10/17/2024] [Indexed: 11/16/2024]
Abstract
Higher concentrations of ambient air pollutants, including PM2.5 and NO2, and other pollutants have been found near active oil and gas wells and may be associated with adverse COVID-19 outcomes. We assessed whether residential exposure to nearby oil and gas production was associated with higher rates of the respiratory infection COVID-19 and related mortality using a population-based ecological study in California. Using gridded population estimates, we estimated area-level exposure to annual average oil and gas production volume from active wells within 1 kilometer (km) of populated areas within census block groups from 2018 to 2020. We geocoded confirmed cases and associated deaths to assess block group case and mortality rates from COVID-19 from February 2020 to January 2021. We fit hierarchical Poisson models with individual and area covariates (e.g., age, sex, socioeconomic disadvantage), and included time and other interactions to assess additional variation (e.g., testing, reporting rates). In the first 4 months of the study period (February-May 2020), block groups in the highest tertile of oil and gas production exposure had 34% higher case rates (IRR: 1.34; 95% CI: 1.20, 1.49) and 55% higher mortality rates (MRR: 1.52; 95% CI: 1.14, 2.03) than those with no estimated production, after accounting for area-level covariates. Over the entire study period, we observed moderately higher mortality rates in the highest group (MRR: 1.16; 95% CI: 1.01, 1.33) and null associations for case rates.
Affiliation(s)
- Helena Archer
  - Department of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, USA
- David J X González
  - Department of Environmental Science, Policy, & Management, School of Public Health, University of California, Berkeley, Berkeley, CA, USA
- Julia Walsh
  - Department of Maternal and Child Health, School of Public Health, University of California, Berkeley, Berkeley, CA, USA
- Paul English
  - Tracking California, Public Health Institute, Oakland, CA, USA
- Peggy Reynolds
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
- W John Boscardin
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
  - Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Rachel Morello-Frosch
  - Department of Environmental Science, Policy, & Management, School of Public Health, University of California, Berkeley, Berkeley, CA, USA

3
Reckling SK, Hu XC, Keshaviah A. Equity in wastewater monitoring: Differences in the demographics and social vulnerability of sewered and unsewered populations across North Carolina. PLoS One 2024; 19:e0311516. [PMID: 39388434 PMCID: PMC11466389 DOI: 10.1371/journal.pone.0311516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Accepted: 09/19/2024] [Indexed: 10/12/2024]
Abstract
Wastewater monitoring is a valuable public health tool that can track a variety of health markers. The strong correlations between trends in wastewater viral concentrations and county-level COVID-19 case counts point to the ability of wastewater data to represent changes in a community's disease burden. However, studies are lacking on whether the populations sampled through wastewater monitoring represent the characteristics of the broader community and the implications for health equity. We conducted a geospatial analysis to examine the extent to which populations contributing to wastewater collected through the North Carolina Wastewater Monitoring Network as of June 2022 represent the broader countywide and statewide populations. After intersecting sewershed boundary polygons for 38 wastewater treatment plants across 18 counties with census block and tract polygons, we compared the demographics and social vulnerability of (1) people residing in monitored sewersheds with countywide and statewide populations, and (2) sewered residents, regardless of inclusion in wastewater monitoring, with unsewered residents. We flagged as meaningful any differences greater than ±5 percentage points or 5 percent (for categorical and continuous variables, respectively) and noted statistically significant differences (p < 0.05). We found that residents within monitored sewersheds largely resembled the broader community on most variables analyzed, with only a few exceptions. We also observed that when multiple sewersheds were monitored within a county, their combined service populations resembled the county population, although individual sewershed and county populations sometimes differed. When we contrasted sewered and unsewered populations within a given county, we found that sewered populations were more vulnerable than unsewered populations, suggesting that wastewater monitoring may fill in the data gaps needed to improve health equity. The approach we present here can be used to characterize sewershed populations nationwide to ensure that wastewater monitoring is implemented in a manner that informs equitable public health decision-making.
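The flagging rule described in the abstract (more than 5 percentage points for categorical variables, more than 5 percent for continuous ones) can be sketched as a small Python function. This is an illustration under stated assumptions, not the authors' code: the function name, the `kind` labels, and the choice to compute the relative difference against the comparison population are all hypothetical, since the abstract does not say which denominator was used. The statistical significance test (p < 0.05) is not covered here.

```python
def is_meaningful_difference(group_value: float, reference_value: float,
                             kind: str, threshold: float = 5.0) -> bool:
    """Flag a difference between a sewershed population and a reference population.

    kind="categorical": both values are percentages of a population, so the
        difference is measured in percentage points.
    kind="continuous": the difference is measured as a percent of the reference
        value (assumption: the reference population is the denominator).
    """
    if kind == "categorical":
        return abs(group_value - reference_value) > threshold
    if kind == "continuous":
        if reference_value == 0:
            raise ValueError("relative difference is undefined for a zero reference")
        relative = abs(group_value - reference_value) / abs(reference_value) * 100
        return relative > threshold
    raise ValueError(f"unknown variable kind: {kind!r}")


# Hypothetical values for illustration only.
print(is_meaningful_difference(32.0, 25.0, "categorical"))   # 7-point gap
print(is_meaningful_difference(52000, 50000, "continuous"))  # 4 percent gap
```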
Affiliation(s)
- Stacie K. Reckling
  - Center for Geospatial Analytics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Division of Public Health, North Carolina Department of Health and Human Services, Raleigh, North Carolina, United States of America
- Xindi C. Hu
  - Mathematica, Inc., Princeton, New Jersey, United States of America
- Aparna Keshaviah
  - Mathematica, Inc., Princeton, New Jersey, United States of America

4
Libenson A, Karasaki S, Cushing LJ, Tran T, Rempel JL, Morello-Frosch R, Pace CE. PFAS-Contaminated Pesticides Applied near Public Supply Wells Disproportionately Impact Communities of Color in California. ACS ES&T Water 2024; 4:2495-2503. [PMID: 38903201 PMCID: PMC11186009 DOI: 10.1021/acsestwater.3c00845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 04/27/2024] [Accepted: 04/29/2024] [Indexed: 06/22/2024]
Abstract
Contaminated drinking water from widespread environmental pollutants such as perfluoroalkyl and polyfluoroalkyl substances (PFAS) poses a rising threat to public health. PFAS monitoring in groundwater is limited and fails to consider pesticides found to contain PFAS as a potential contamination source. Given previous findings on the disproportionate exposure of communities of Color to both pesticides and PFAS, we investigated disparities in PFAS-contaminated pesticide applications in California based on community-level sociodemographic characteristics. We utilized statewide pesticide application data from the California Department of Pesticide Regulation and recently reported concentrations of PFAS chemicals detected in eight pesticide products to calculate the areal density of PFAS applied within 1 km of individual community water systems' (CWSs) supply wells. Spatial regression analyses suggest that statewide, CWSs that serve a greater proportion of Latinx and non-Latinx People of Color residents experience a greater areal density of PFAS applied and greater likelihood of PFAS application near their public supply wells. These results highlight agroecosystems as potentially important sources of PFAS in drinking water and identify areas that may be at risk of PFAS contamination and warrant additional PFAS monitoring and remediation.
Affiliation(s)
- Arianna Libenson
  - Environmental Science, Policy, and Management, University of California Berkeley, Berkeley, California 94720, United States
- Seigi Karasaki
  - Energy and Resources Group, University of California Berkeley, Berkeley, California 94720, United States
- Lara J. Cushing
  - Fielding School of Public Health, University of California Los Angeles, Los Angeles, California 90095, United States
- Tien Tran
  - Community Water Center, Sacramento and Visalia, California 93291, United States
- Jenny L. Rempel
  - Energy and Resources Group, University of California Berkeley, Berkeley, California 94720, United States
- Rachel Morello-Frosch
  - Environmental Science, Policy, and Management, University of California Berkeley, Berkeley, California 94720, United States
  - School of Public Health, University of California Berkeley, Berkeley, California 94720, United States
- Clare E. Pace
  - Environmental Science, Policy, and Management, University of California Berkeley, Berkeley, California 94720, United States

5
Zhao R, Wang S, Zhang Y, Dong C. Partition refinement of WorldPop population spatial distribution data method: A case study of Zhuhai, China. PLoS One 2024; 19:e0301127. [PMID: 38578753 PMCID: PMC10997122 DOI: 10.1371/journal.pone.0301127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 03/08/2024] [Indexed: 04/07/2024]
Abstract
Currently, the core idea of the refined method of population spatial distribution is to establish a correlation between the population and auxiliary data at the administrative-unit level and, then, refine it to the grid unit. However, this method ignores the advantages of public population spatial distribution data. Given these problems, this study proposed a partition strategy using the natural break method at the grid-unit level, which adopts the population density to constrain the land class weight and redistributes the population under the dual constraints of land class and area weights. Accordingly, we used the dasymetric method to refine the population distribution data. The study established a partition model for public population spatial distribution data and auxiliary data at the grid-unit level and, then, refined it to smaller grid units. This method effectively utilizes the public population spatial distribution data and solves the problem of the dataset being not sufficiently accurate to describe small-scale regions and low resolutions. Taking the public WorldPop population spatial distribution dataset as an example, the results indicate that the proposed method has higher accuracy than other public datasets and can also describe the actual spatial distribution characteristics of the population accurately and intuitively. Simultaneously, this provides a new concept for research on population spatial distribution refinement methods.
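As a rough illustration of the dasymetric idea mentioned in the abstract, the Python sketch below redistributes the population of one coarse grid cell to finer cells in proportion to land-class weight times area. It is a generic weighting scheme under stated assumptions, not the paper's method: the natural-break partitioning and the population-density constraint on land-class weights are not reproduced, and the function name, weights, and areas are hypothetical.

```python
def redistribute_population(total_population, cells):
    """Split a coarse cell's population across its fine cells.

    cells: list of (land_class_weight, area) tuples, one per fine cell.
    Each fine cell receives a share proportional to weight * area, so the
    coarse-cell total is preserved.
    """
    scores = [weight * area for weight, area in cells]
    total_score = sum(scores)
    if total_score == 0:
        # Assumption: if no fine cell has a positive weight, fall back to
        # area-proportional allocation so that no population is lost.
        scores = [area for _, area in cells]
        total_score = sum(scores)
    if total_score == 0:
        raise ValueError("fine cells have no area to receive population")
    return [total_population * score / total_score for score in scores]


# Hypothetical coarse cell with 100 people and three fine cells:
# built-up land (weight 3), farmland (weight 1), water (weight 0).
fine_cells = [(3.0, 1.0), (1.0, 1.0), (0.0, 2.0)]
allocation = redistribute_population(100, fine_cells)
print(allocation)
```

Because shares are normalized by the sum of scores, the allocation always adds back up to the coarse-cell population, which is the defining constraint of this kind of refinement.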
Affiliation(s)
- Rong Zhao
  - Chinese Academy of Surveying and Mapping, Beijing, China
  - School of Geomatics, Liaoning Technical University, Fuxin, China
- Shuang Wang
  - Chinese Academy of Surveying and Mapping, Beijing, China
  - School of Geomatics, Liaoning Technical University, Fuxin, China
- Yu Zhang
  - Chinese Academy of Surveying and Mapping, Beijing, China
- Chun Dong
  - Chinese Academy of Surveying and Mapping, Beijing, China

6
Cushing LJ, Ju Y, Kulp S, Depsky N, Karasaki S, Jaeger J, Raval A, Strauss B, Morello-Frosch R. Toxic Tides and Environmental Injustice: Social Vulnerability to Sea Level Rise and Flooding of Hazardous Sites in Coastal California. Environmental Science & Technology 2023; 57:7370-7381. [PMID: 37129408 PMCID: PMC10193577 DOI: 10.1021/acs.est.2c07481] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/11/2022] [Revised: 03/17/2023] [Accepted: 03/20/2023] [Indexed: 05/03/2023]
Abstract
Sea level rise (SLR) and heavy precipitation events are increasing the frequency and extent of coastal flooding, which can trigger releases of toxic chemicals from hazardous sites, many of which are in low-income communities of color. We used regression models to estimate the association between facility flood risk and social vulnerability indicators in low-lying block groups in California. We applied dasymetric mapping techniques to refine facility boundaries and population estimates and probabilistic SLR projections to estimate facilities' future flood risk. We estimate that 423 facilities are at risk of flooding in 2100 under a high emissions scenario (RCP 8.5). One unit standard deviation increases in nonvoters, poverty rate, renters, residents of color, and linguistically isolated households were associated with a 1.5-2.2 times higher odds of the presence of an at-risk site within 1 km (ORs [95% CIs]: 2.2 [1.8, 2.8], 1.9 [1.5, 2.3], 1.7 [1.4, 1.9], 1.5 [1.2, 1.9], and 1.5 [1.2, 1.9], respectively). Among block groups near at least one at-risk site, the number of sites increased with poverty, proportion of renters and residents of color, and lower voter turnout. These results underscore the need for further research and disaster planning that addresses the differential hazards and health risks of SLR.
Affiliation(s)
- Lara J. Cushing
  - Department of Environmental Health Sciences, University of California Los Angeles, Los Angeles, California 90095, United States
- Yang Ju
  - School of Architecture and Urban Planning, Nanjing University, Nanjing, China 210093
- Scott Kulp
  - Climate Central, Princeton, New Jersey 08542, United States
- Nicholas Depsky
  - Energy and Resources Group, University of California, Berkeley, Berkeley, California 94720, United States
- Seigi Karasaki
  - Energy and Resources Group, University of California, Berkeley, Berkeley, California 94720, United States
- Jessie Jaeger
  - PSE Healthy Energy, Oakland, California 94612, United States
- Amee Raval
  - Asian Pacific Environmental Network, Oakland, California 94612, United States
- Rachel Morello-Frosch
  - Department of Environmental Science, Policy and Management & School of Public Health, University of California, Berkeley, Berkeley, California 94720, United States