For thousands of years, rivers across the world have been the lifeblood of civilizations, shaping the rise and fall of empires, fostering trade, and nourishing entire societies. From the Nile to the Yangtze, rivers have not just witnessed history; they’ve made it.1,2
Beyond their historical significance, rivers are also fascinating from a scientific perspective because they are “packed” with data, including parameters that are crucial for water quality monitoring.
Monitoring river systems is important for ensuring environmental sustainability, protecting communities, and maintaining water quality. By analyzing trends in water flow, pollution levels, and ecological health, organizations can proactively address potential issues, such as flooding risks, contamination, or habitat loss. This work not only supports regulatory compliance but also safeguards vital resources for future generations.
There are many water quality parameters that can be used to evaluate the health of a river ecosystem. We will focus on four key water quality parameters: temperature (T), specific conductance (κ), dissolved oxygen (DO), and pH. Understanding how these parameters affect water quality and overall health of river ecosystems is essential. The table below seeks to clarify the importance, what low and high values may indicate, and the impact on water quality.
Depending on the T, κ, DO, and pH values, river ecosystems can be severely impacted, affecting aquatic life, water quality, and the fauna and flora in general. Although one may investigate these factors in isolation, it is essential to understand that they are interdependent and influence each other; having a solid grasp of this interplay may help uncover relevant insights regarding a river ecosystem’s health and, consequently, its water quality.3
The Ohio River, formed at the confluence of the Allegheny and Monongahela Rivers, serves as a prime example for research on pollution, water quality, and environmental sciences. Its vast watershed, spanning multiple states, plays a critical role in the ecological health of the surrounding regions, impacting both wildlife and human populations. The river's historical significance and ongoing challenges with industrial runoff, agricultural pollutants, and urban waste make it a focal point for scientists and policymakers alike. Studies conducted along the Ohio River contribute valuable insights into the effects of human activities on aquatic ecosystems and the efficacy of environmental regulations.4,5
However, despite the abundance of data available for rivers across the United States, monitoring water quality parameters like pH, temperature, specific conductivity, and dissolved oxygen remains a complex and challenging task. This complexity arises from the need to not only collect and store vast amounts of data but also to analyze and interpret it effectively to drive actionable insights. The variability of water quality over time and across different locations further complicates the task, often requiring sophisticated tools and methods to ensure accurate and meaningful results.6
Leveraging Minitab Connect and publicly available data from the United States Geological Survey (USGS), I have explored how the water quality parameters of the Ohio River have changed over the past ~2 years at a specific USGS data collection station, shown in the figure below.
Every 15 minutes, the USGS API provides new measurements for various water quality parameters collected at the station. A connection was established between Connect and the USGS API to pull data hourly for each parameter. The Rest API connector simplified the process of parsing API responses, enabling accurate, frequent, and integrated data collection.
Using Custom SQL, I combined the tables for Temperature, Specific Conductance, Dissolved Oxygen, and pH, while adding fields like Date, Month, Year, Day of the Week, Season, and Quarter. This streamlined process was made significantly easier with Connect, as the SQL table would be frequently updated to include new data points. After the initial set-up, Connect enables automated and efficient data ingestion, allowing scientists like me to conduct robust analysis and exploration without having to spend additional time cleaning and processing data.
First, I added text and a figure to the top area of my dashboard.
Then, I added a time series plot containing the data for each of the four parameters, along with KPIs for their maximum, mean, and minimum values. To make data exploration easier and more interactive, I added slicers that allow me to filter the data based on Date, Month, Year, Temperature, Specific Conductance, Dissolved Oxygen, and pH.
To identify seasonal trends in my data more easily, I added line plots for Temperature, Dissolved Oxygen, Specific Conductance, all categorized by month. A correlogram was also a helpful addition to determine and quantify correlations between the parameters. A clear correlation shown by the data is the inverse relationship between water temperature and dissolved oxygen (as colder water can “hold” more dissolved oxygen than warmer water).
Finally, I decided to add control charts showing the Specific Conductance and pH for the last 90 days, separated by Month. I also established statistical tests and thresholds that, if exceeded by the daily average, yield alerts via SMS, e-mail, or in-app notification.
In the Connect Dashboard, setting up the alerts was easy and extremely important to facilitate monitoring of dangerous conditions for the river’s aquatic life and ecosystem. If the daily average specific conductance goes above 400 μS/cm, I receive an immediate alert, which may indicate poor water quality due to high levels of dissolved solids, chemicals, and minerals.7 If the river’s daily average pH falls outside of the 6.5 – 8.0 range, I also receive an alert, which may indicate that the biodiversity of the river may face challenges due to stress of their physiological systems and reproductive decline.8
In this use case, an advantage of Minitab Connect is the ability to quickly and efficiently extract, transform, and load data from the USGS API. In addition, another relevant advantage is the ability to trigger alerts immediately when the data point is out of control and/or does not meet desired thresholds. Along with other unique features, Connect can be relevant in a variety of scenarios that require immediate public policy and environmental responses, like the one reported in this blog post.
It is exciting to use tools like Minitab Connect to help analyze trends in historical data, in addition to assisting with monitoring of real-time conditions. By doing so, it has the potential to enable rapid ESG responses and public policy interventions when needed. Overall, anyone engaged with scientific research and exploration could significantly benefit from leveraging cutting-edge software solutions to manage, integrate, analyze, and process data. Thus, it might be time to propel R&D and other industries forward by leaving behind scattered USB sticks and countless spreadsheets. Embracing a more integrated, robust approach to handling data can be extremely beneficial by saving resources, time, and effort. Minitab Connect offers a powerful solution to streamline data processes and help unlock new insights.