Technical gazette, Vol. 30 No. 1, 2023.
Preliminary communication
https://doi.org/10.17559/TV-20221103085856
Recovery of Outliers in Water Environment Monitoring Data
Jinling Song; School of Mathematics and Information Technology, Hebei Agricultural Data Intelligent Perception and Application Technology Innovation Center, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China; Hebei Key Laboratory of Ocean Dynamics, Resources and Environments, Qinhuangdao 066004, China
Meining Zhu; School of Mathematics and Information Technology, Hebei Agricultural Data Intelligent Perception and Application Technology Innovation Center, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China
Liming Huang; School of Business Administration, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China
Gang Wang; School of Mathematics and Information Technology, Hebei Agricultural Data Intelligent Perception and Application Technology Innovation Center, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China
Dongyan Jia; School of Mathematics and Information Technology, Hebei Agricultural Data Intelligent Perception and Application Technology Innovation Center, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China
Abstract
Water environment monitoring data are time sequences that contain outliers, which degrade data quality. Outlier detection and recovery therefore play an important role in applications such as knowledge acquisition and prediction modelling of water environment indicators. To detect outliers, a short-term chain comparison with a sliding window, based on the characteristics of the time sequence, is adopted. To recover values closer to the real data at the time of measurement, the sequence is divided dynamically into sub-sequences according to the change characteristics of the dataset; the similarity between sub-sequences is then measured by the shape distance, and each outlier is recovered according to the change trend of the corresponding data in the most similar sub-sequences. Monitoring data from a water station are used in the study. The experimental results show that the proposed recovery method is superior to the commonly used prediction-based and fitting-based recovery methods: the recovered data are smoother and the short-term trend is more obvious.
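The detection step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact comparison rule, so the sketch assumes that a reading is flagged when its relative change from the median of the preceding window exceeds a threshold. The function name `detect_outliers`, the parameters `window` and `threshold`, and the sample readings are all hypothetical.

```python
from statistics import median


def detect_outliers(values, window=5, threshold=0.5):
    """Return indices of readings that deviate sharply from recent history.

    Assumed rule (not taken from the paper): a reading is an outlier when
    its relative change from the median of the previous `window` accepted
    readings is greater than `threshold`.
    """
    flagged = []
    history = []  # accepted (non-outlier) readings seen so far
    for i, v in enumerate(values):
        if len(history) >= window:
            ref = median(history[-window:])
            scale = abs(ref) if ref != 0 else 1.0  # avoid division by zero
            if abs(v - ref) / scale > threshold:
                flagged.append(i)
                continue  # keep outliers out of the reference window
        history.append(v)
    return flagged


# Hypothetical pH-like readings with two injected spikes.
readings = [7.1, 7.2, 7.0, 7.1, 7.2, 15.0, 7.1, 7.3, 7.2, 0.5, 7.1]
print(detect_outliers(readings))  # [5, 9]
```

Excluding flagged readings from the reference window keeps one spike from masking the next. The first `window` readings are never flagged, because no reference exists for them yet.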
Keywords
outliers; shape distance; sub-sequence similarity; water monitoring data
Hrčak ID: 288434
Publication date: 15.12.2022.