
Monday, 30 April 2018

Modeling the risk of water pollution by pesticides from imbalanced data

Abstract

The pollution of ground and surface waters with pesticides is a serious ecological issue that requires adequate treatment. Most of the existing water pollution models are mechanistic mathematical models. While they have made a significant contribution to understanding the transfer processes, they are difficult to validate because of their complexity, the user subjectivity in their parameterization, and the lack of empirical data. In addition, the data describing water pollution with pesticides are, in most cases, very imbalanced. This is due to strict regulations for pesticide applications, which lead to only a few pollution events. In this study, we propose the use of data mining to build models for assessing the risk of water pollution by pesticides in field-drained outflow water. Unlike the mechanistic models, the models generated by data mining are based on easily obtainable empirical data, and their parameterization is not influenced by the subjectivity of ecological modelers. We used empirical data from field trials at the La Jaillière experimental site in France and applied the random forests algorithm to build models that classify pesticide application events as "risky" or "not-risky". To address the class imbalance in the data, we used cost-sensitive learning and several measures of predictive performance. Despite the high imbalance between risky and not-risky application events, we managed to build models that make reliable predictions. The proposed modeling approach can be easily applied to other ecological modeling problems where we encounter empirical data with highly imbalanced classes.


