Project 3: Modeling spatial patterns of metals and metal mixtures in drinking water in the US

Across the US, tens of millions of individuals unknowingly consume drinking water with concentrations of metals that exceed regulatory guidelines. The problem is more prominent in untreated water from private wells than in municipal drinking water supplies that undergo water treatment. Although private wells supply drinking water for more than 15% of the population, elevated concentrations of metals in these wells may go undetected and unremediated due to sporadic and incomplete monitoring data.  

Figure: Map of well water locations in the US

Our Goals

  • Develop a novel statistical method to identify populations potentially at risk for elevated exposures to metals through contaminated drinking water
  • Develop a map of metal mixtures characteristic of drinking water exposures in different regions
  • Investigate whether there are socioeconomic differences or disparities based on metal concentrations in drinking water

Our Approach

Project 3 is developing statistical models (based on logistic regression and random forest methods) to predict concentrations of metals and metal mixtures (arsenic, cadmium, and lead) in private wells. The models draw on 50 years of aggregated well data from the National Water Quality Monitoring Council, along with information about hydrological and geological features associated with metal occurrence.
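
As a rough illustration of the classification side of this approach, the sketch below fits a logistic regression that predicts whether a well exceeds a metal threshold. Everything in it is an assumption made for illustration: the wells are synthetic, the two predictors (a standardized well depth and groundwater pH) and their coefficients are hypothetical, and none of it comes from the project's data or code. Only the logistic regression half is shown, using the Python standard library; a random forest would normally come from an established library instead.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_wells(n, seed=0):
    """Generate hypothetical wells as ((depth, ph), exceeds) pairs.

    depth and ph are standardized stand-ins for hydrological and
    geological predictors; exceeds is 1 when the simulated metal
    concentration is above a threshold. The coefficients are invented.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        depth = rng.gauss(0.0, 1.0)
        ph = rng.gauss(0.0, 1.0)
        p_exceed = sigmoid(-1.0 + 1.5 * depth + 1.0 * ph)
        exceeds = 1 if rng.random() < p_exceed else 0
        rows.append(((depth, ph), exceeds))
    return rows


def fit_logistic(rows, lr=0.5, epochs=300):
    """Fit weights and intercept by batch gradient descent on log loss."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a well exceeds the threshold."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def accuracy(w, b, rows, cutoff=0.5):
    """Share of wells whose predicted class matches the label."""
    hits = sum((predict_proba(w, b, x) >= cutoff) == bool(y) for x, y in rows)
    return hits / len(rows)


wells = make_synthetic_wells(1000, seed=0)
train, test = wells[:800], wells[800:]
weights, intercept = fit_logistic(train)
print("weights:", weights, "intercept:", intercept)
print("held-out accuracy:", accuracy(weights, intercept, test))
```

The random train/test split here ignores well location. With real well data, nearby wells tend to resemble one another, so a spatially blocked split would give a more honest estimate of performance.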

In addition, together with the Environmental Working Group, we are examining decades of public water supply records in combination with information on point sources and indicators of socioeconomic disparities to determine whether elevated concentrations of metals in public drinking water supplies are associated with indicators of environmental justice. 

Recent Publications

Mona Q Dai, Benjamin M Geyman, Xindi C Hu, Colin P Thackray, and Elsie M Sunderland. 2023. “Sociodemographic Disparities in Mercury Exposure from United States Coal-Fired Power Plants.” Environ Sci Technol Lett, 10, 7, Pp. 589-595.

Hazardous air pollutants emitted by United States (U.S.) coal-fired power plants have been controlled by the Mercury and Air Toxics Standards (MATS) since 2012. Sociodemographic disparities in traditional air pollutant exposures from U.S. power plants are known to occur but have not been evaluated for mercury (Hg), a neurotoxicant that bioaccumulates in food webs. Atmospheric Hg deposition from domestic power plants decreased by 91% across the contiguous U.S. from 6.4 Mg in 2010 to 0.55 Mg in 2020. Prior to MATS, populations living within 5 km of power plants (n = 507) included greater proportions of frequent fish consumers, individuals with low annual income and less than a high school education, and limited English-proficiency households compared to the U.S. general population. These results reinforce a lack of distributional justice in plant siting found in prior work. Significantly greater proportions of low-income individuals lived within 5 km of active facilities in 2020 (n = 277) compared to plants that retired after 2010, suggesting that socioeconomic status may have played a role in retirement. Despite large deposition declines, an end-member scenario for remaining exposures from the largest active power plants for individuals consuming self-caught fish suggests they could still exceed the U.S. Environmental Protection Agency reference dose for methylmercury.

Xindi C Hu, Mona Dai, Jennifer M Sun, and Elsie M Sunderland. 3/2023. “The Utility of Machine Learning Models for Predicting Chemical Contaminants in Drinking Water: Promise, Challenges, and Opportunities.” Curr Environ Health Rep, 10, 1, Pp. 45-60.

PURPOSE OF REVIEW: This review aims to better understand the utility of machine learning algorithms for predicting spatial patterns of contaminants in the United States (U.S.) drinking water.

RECENT FINDINGS: We found 27 U.S. drinking water studies in the past ten years that used machine learning algorithms to predict water quality. Most studies (42%) developed random forest classification models for groundwater. Continuous models show low predictive power, suggesting that larger datasets and additional predictors are needed. Categorical/classification models for arsenic and nitrate that predict exceedances of pollution thresholds are most common in the literature because of good national scale data coverage and priority as environmental health concerns. Most groundwater data used to develop models were obtained from the United States Geological Survey (USGS) National Water Information System (NWIS). Predictors were similar across contaminants but challenges are posed by the lack of a standard methodology for imputation, pre-processing, and differing availability of data across regions.

SUMMARY: We reviewed 27 articles that focused on seven drinking water contaminants. Good performance metrics were reported for binary models that classified chemical concentrations above a threshold value by finding significant predictors. Classification models are especially useful for assisting in the design of sampling efforts by identifying high-risk areas. Only a few studies have developed continuous models and obtaining good predictive performance for such models is still challenging. Improving continuous models is important for potential future use in epidemiological studies to supplement data gaps in exposure assessments for drinking water contaminants. While significant progress has been made over the past decade, methodological advances are still needed for selecting appropriate model performance metrics and accounting for spatial autocorrelations in data. Finally, improved infrastructure for code and data sharing would spearhead more rapid advances in machine-learning models for drinking water quality.

Mayuri Bhatia, Aaron J Specht, Vallabhuni Ramya, Dahy Sulaiman, Manasa Konda, Prentiss Balcom, Elsie M Sunderland, and Asif Qureshi. 10/5/2021. “Portable X-ray Fluorescence as a Rapid Determination Tool to Detect Parts per Million Levels of Ni, Zn, As, Se, and Pb in Human Toenails: A South India Case Study.” Environ Sci Technol, 55, 19, Pp. 13113-13121.

Chronic exposure to inorganic pollutants adversely affects human health. Inductively coupled plasma mass spectrometry (ICP-MS) is the most common method used for trace metal(loid) analysis of human biomarkers. However, it leads to sample destruction, generation of secondary waste, and significant recurring costs. Portable X-ray fluorescence (XRF) instruments can rapidly and nondestructively determine low concentrations of metal(loid)s. In this work, we evaluated the applicability of portable XRF as a rapid method for analyzing trace metal(loid)s in toenail samples from three populations (n = 97) near the city of Chennai, India. A Passing-Bablok regression analysis of results from both methods revealed that there was no proportional bias among the two methods for nickel (measurement range ∼25 to 420 mg/kg), zinc (10 to 890 mg/kg), and lead (0.29 to 4.47 mg/kg). There was a small absolute bias between the two methods. There was a strong proportional bias (slope = 0.253, 95% CI: 0.027, 0.614) between the two methods for arsenic (below detection to 3.8 mg/kg) and for selenium when the concentrations were lower than 2 mg/kg. Limits of agreement between the two methods using Bland-Altman analysis were derived for nickel, zinc, and lead. Overall, a suitably calibrated and evaluated portable XRF shows promise in making high-throughput assessments at population scales.