Estimation of daily dissolved oxygen concentration for river water quality using conventional regression analysis, multivariate adaptive regression splines, and TreeNet techniques


Nacar S., Mete B., Bayram A.

ENVIRONMENTAL MONITORING AND ASSESSMENT, vol.192, no.12, 2020 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 192 Issue: 12
  • Publication Date: 2020
  • DOI: 10.1007/s10661-020-08649-9
  • Journal Name: ENVIRONMENTAL MONITORING AND ASSESSMENT
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, ABI/INFORM, Agricultural & Environmental Science Database, Aqualine, Aquatic Science & Fisheries Abstracts (ASFA), BIOSIS, CAB Abstracts, Compendex, EMBASE, Environment Index, Food Science & Technology Abstracts, Geobase, Greenfile, MEDLINE, Pollution Abstracts, Public Affairs Index, Veterinary Science Database, Civil Engineering Abstracts
  • Keywords: Broad River, Conventional regression analysis, Modeling of dissolved oxygen, Multivariate adaptive regression splines, TreeNet gradient boosting machine, LEARNING BASED OPTIMIZATION, PREDICTION, MODEL, PERFORMANCE, ANFIS
  • Affiliated with Karadeniz Technical University: Yes

Abstract

The aim of this study was to model the surface water quality of the Broad River Basin, South Carolina. The two most suitable monitoring stations, USGS 02156500 (near Carlisle) and USGS 02160991 (near Jenkinsville), were selected because the river water temperature (WT), pH, and specific conductance (SC), as well as dissolved oxygen (DO) concentration, were simultaneously monitored and recorded at these sites. The monitoring period from September 2016 to August 2017 was used for the modeling studies. The electrical conductivity (EC) values corresponding to the river SC values were calculated. First, conventional regression analysis (CRA) was applied with three regression forms, i.e., linear, power, and exponential functions, to estimate the river DO concentration. Then, the multivariate adaptive regression splines (MARS) and TreeNet gradient boosting machine (TreeNet) techniques were employed. Three performance statistics, i.e., root mean square error (RMSE), mean absolute error (MAE), and Nash-Sutcliffe coefficient of efficiency (NS), were used to compare the estimation capabilities of these techniques. The TreeNet technique, which was used for the first time in the modeling of DO concentration, had the higher estimation success in the training phase, with RMSE, MAE, and NS values of 0.182 mg/L, 0.123 mg/L, and 0.990, respectively, for the Carlisle station and 0.313 mg/L, 0.233 mg/L, and 0.965, respectively, for the Jenkinsville station. The MARS technique, which has seen only limited application in the modeling of DO concentration, had the higher estimation success in the testing phase, with RMSE, MAE, and NS values of 0.240 mg/L, 0.195 mg/L, and 0.981, respectively, for the Carlisle station and 0.527 mg/L, 0.432 mg/L, and 0.980, respectively, for the Jenkinsville station.
The Carlisle station stood out, with lower RMSE and MAE values and higher NS values for the model using WT, pH, and EC as inputs. It was concluded that researchers engaged in river water quality modeling can confidently use the MARS and TreeNet techniques to estimate the river DO concentration. The models developed for the Carlisle station were tested with the data sets for the monitoring period from September 2017 to August 2018 at the same station. Similarly, the models developed for the Jenkinsville station were tested with the data sets for the monitoring period from September 2017 to August 2018 at the same station. It was concluded that the models could estimate river DO concentrations very close to the in situ measurements at the same site for a different monitoring period. Furthermore, the models developed for the Carlisle station were tested with the data sets from the Jenkinsville station for the same monitoring period. Similarly, the models developed for the Jenkinsville station were tested with the data sets from the Carlisle station for the same monitoring period. It was also concluded that the developed models could estimate river DO concentrations very close to the in situ measurements at a different monitoring site on the same river for the same monitoring period. It can be asserted that the models developed for any monitoring site on a river can be employed for another monitoring site on the same river, as in the case of the Broad River, South Carolina.
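
The abstract names three performance statistics (RMSE, MAE, and NS) without stating their formulas. The following is a minimal Python sketch of how these statistics are commonly defined, assuming the standard textbook forms; it is not taken from the paper. The function names and the sample DO values are hypothetical and chosen only for illustration.

```python
import math


def rmse(observed, estimated):
    """Root mean square error, in the units of the data (e.g. mg/L)."""
    n = len(observed)
    squared_errors = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(squared_errors / n)


def mae(observed, estimated):
    """Mean absolute error, in the units of the data (e.g. mg/L)."""
    n = len(observed)
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / n


def nash_sutcliffe(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared
    estimation error to the variance of the observations about their mean.
    A value of 1 means a perfect match."""
    mean_observed = sum(observed) / len(observed)
    squared_errors = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    spread = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - squared_errors / spread


# Hypothetical daily DO values in mg/L (made up, not data from the study).
observed_do = [8.0, 9.0, 10.0, 11.0]
estimated_do = [8.5, 9.0, 9.5, 11.0]

print("RMSE:", rmse(observed_do, estimated_do))
print("MAE:", mae(observed_do, estimated_do))
print("NS:", nash_sutcliffe(observed_do, estimated_do))
```

Under these definitions, lower RMSE and MAE and an NS closer to 1 indicate a better estimate, which matches how the abstract compares the techniques.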