Water quality warnings based on cluster analysis in Colombian river basins

Edwin Ferney Castillo, Wilmer Fernando Gonzales, David Camilo Corrales, Iván Darío López, Miller Guzmán Hoyos, Apolinar Figueroa, Juan Carlos Corrales


Fresh water is considered one of the most important renewable natural resources in the world. Colombia is among the countries with the highest water supply and has five watersheds: the Caribbean, Orinoco, Amazon, Pacific and Catatumbo. It is therefore vital to study and evaluate the water quality of its rivers and other lotic systems. In recent studies, some researchers have used biological indices to calculate water quality, while others have assessed water quality through machine learning techniques. However, these studies do not allow users to easily interpret the results. This limitation motivated us to propose a dataset for generating water quality alerts in the Piedras river basin, based on analysis with the K-Means clustering algorithm and the C4.5 classification technique.


Clustering; water quality data; aquatic macro-invertebrates; taxon; C4.5 decision tree.

