Microgenetic algorithms and artificial neural networks to assess minimum data requirements for prediction of pesticide concentrations in shallow groundwater on a regional scale

Media type: E-Article
Title: Microgenetic algorithms and artificial neural networks to assess minimum data requirements for prediction of pesticide concentrations in shallow groundwater on a regional scale
Contributor: Sahoo, Goloka Behari; Ray, Chittaranjan
Imprint: American Geophysical Union (AGU), 2008
Published in: Water Resources Research
Language: English
DOI: 10.1029/2007wr005875
ISSN: 1944-7973; 0043-1397
Keywords: Water Science and Technology
Description: Artificial neural networks (ANNs) have been extensively used for forecasting problems involving water quantity and quality. In most cases, the geometry and model parameters of the ANN are set using a trial‐and‐error approach to achieve better network generalization ability, whereby the available data are divided arbitrarily into training, testing, and validation subsets. It has been shown that using the arbitrary sample selection method to assign samples into the training subset commonly results in the inclusion of samples from densely clustered regions and omission of samples from sparsely represented regions. This paper presents a systematic approach using the self‐organizing map (SOM) clustering technique that identifies which samples and determines how many samples should be included in each of the three subsets required by ANN for optimum predictive performance efficiency. In addition, this paper presents the microgenetic algorithms (μGA) that optimize ANN's geometry and model parameters in terms of the correlation coefficient (R). In the sensitivity analysis, μGA model parameters are found to be least sensitive to the optimum R value, while ANN's predictive performance is significantly affected by (1) the poor selection of its geometry and model parameters and (2) the arbitrary selection of samples for the three subsets of data used. It is demonstrated that the μGA‐ANN model using the SOM technique for data division outperforms the μGA‐ANN model using arbitrary data division. For the training subset, the model using the SOM technique identifies samples that are representative of the region, requiring only 20% of the total samples, whereas the arbitrary sample selection method requires 50–90%. Because resampling on a regional scale is expensive and time consuming, substantial cost and time could be saved if resampling could be done only on the 20% representative drinking water wells.
Access State: Open Access
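
Illustration: the description contrasts arbitrary data division with division guided by SOM clusters. The sketch below is one way such a cluster-guided split might look, not the authors' implementation. It trains a very small one-dimensional SOM in plain Python, assigns each sample to its best-matching node, and then draws training, testing, and validation samples from every non-empty cluster, so a sparsely populated cluster still contributes at least one training sample. The 20% training fraction comes from the description. The function names, the map size, the learning-rate and radius schedules, and the 40% testing fraction are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def best_matching_unit(weights, x):
    """Index of the SOM node whose weight vector is closest to sample x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)),
    )


def train_som(samples, n_nodes=4, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM (assumed configuration) and return its node weights."""
    rng = random.Random(seed)
    # Initialise each node at a randomly chosen sample.
    weights = [list(rng.choice(samples)) for _ in range(n_nodes)]
    radius0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # linearly decaying rate
        radius = max(radius0 * (1.0 - frac), 0.5)    # shrinking neighbourhood
        for x in rng.sample(samples, len(samples)):  # shuffled pass over data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def som_split(samples, train_frac=0.2, test_frac=0.4, seed=0, **som_kwargs):
    """Return (train, test, validation) index lists drawn from every cluster."""
    weights = train_som(samples, seed=seed, **som_kwargs)
    clusters = defaultdict(list)
    for i, x in enumerate(samples):
        clusters[best_matching_unit(weights, x)].append(i)

    rng = random.Random(seed)
    train, test, valid = [], [], []
    for members in clusters.values():
        rng.shuffle(members)
        # Every non-empty cluster contributes at least one training sample.
        n_train = max(1, round(train_frac * len(members)))
        n_test = round(test_frac * len(members))
        train += members[:n_train]
        test += members[n_train:n_train + n_test]
        valid += members[n_train + n_test:]
    return train, test, valid
```

Calling `som_split(samples)` on a list of equal-length numeric tuples returns three disjoint index lists that together cover every sample. A real study would more likely use a two-dimensional map and tune the fractions per cluster, which this sketch does not attempt.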
