Autonomous Sensor Data Cleaning in Stream Mining Setting

Kenda, Klemen; Mladenić, Dunja

doi:10.2478/bsrj-2018-0020

Business Systems Research : International journal of the Society for Advancing Innovation and Research in Economy, Vol. 9 No. 2, 2018.

Original scientific paper

https://doi.org/10.2478/bsrj-2018-0020

Klemen Kenda ; Jožef Stefan Institute, Ljubljana, Slovenia
Dunja Mladenić ; Jožef Stefan International Postgraduate School, Ljubljana, Slovenia

Full text: English, PDF, 632 KB

pp. 69-79

Abstract

Background: The Internet of Things (IoT), earth observation and large scientific experiments generate extensive amounts of sensor data today. Measurements are cheap, so the data volumes are large. A standard approach in such cases is stream mining, in which each measurement is examined only once, during real-time processing. This requires the methods to be completely autonomous. In the past, very little attention was given to the most time-consuming part of the data mining process, i.e. data pre-processing.

Objectives: In this paper we propose an algorithm for data cleaning that can be applied to real-world streaming big data.

Methods/Approach: We use a short-term prediction method based on the Kalman filter to determine admissible intervals for future measurements. The model can adapt to concept drift and is useful for detecting random additive outliers in a sensor data stream.

Results: For datasets with low noise, our method performed better than the method commonly used in batch processing scenarios. On datasets with higher noise, the results are comparable.

Conclusions: We have demonstrated a successful application of the proposed method in real-world scenarios, including groundwater level, server load and smart-grid data.
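The idea of using a Kalman-filter prediction to define an admissible interval for the next measurement can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar random-walk state model, and the class name KalmanCleaner, the noise variances q and r, and the interval width k are all hypothetical choices made for illustration only.

```python
import math


class KalmanCleaner:
    """Scalar random-walk Kalman filter (an assumed model, not the paper's)
    that flags measurements falling outside an admissible interval."""

    def __init__(self, q=0.01, r=0.1, k=3.0):
        self.q = q      # process noise variance (assumed value)
        self.r = r      # measurement noise variance (assumed value)
        self.k = k      # interval half-width in standard deviations (assumed)
        self.x = None   # current state estimate
        self.p = 1.0    # variance of the current estimate

    def process(self, z):
        """Consume one measurement once; return (is_outlier, cleaned_value)."""
        if self.x is None:
            # The first measurement initialises the state.
            self.x = z
            return False, z

        # Short-term prediction for the incoming measurement.
        x_pred = self.x
        p_pred = self.p + self.q
        s = p_pred + self.r                 # predicted innovation variance
        half_width = self.k * math.sqrt(s)  # admissible interval half-width

        if abs(z - x_pred) > half_width:
            # Outside the admissible interval: treat as an additive outlier,
            # skip the update, and return the prediction instead.
            self.p = p_pred
            return True, x_pred

        # Inside the interval: standard Kalman update.
        gain = p_pred / s
        self.x = x_pred + gain * (z - x_pred)
        self.p = (1.0 - gain) * p_pred
        return False, z


if __name__ == "__main__":
    cleaner = KalmanCleaner()
    for z in [10.0, 10.1, 9.9, 10.05, 50.0, 10.0, 10.1]:
        print(z, cleaner.process(z))
```

Each measurement is touched exactly once, which matches the stream mining constraint described above. In this sketch the random-walk model lets the estimate follow slow changes in the signal, and the estimate variance grows whenever an update is skipped, so the interval widens after repeated rejections; how the paper itself handles concept drift is not specified in the abstract.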

Keywords

big data; autonomous processing; real-world applications; data cleaning; stream mining; water management; data-centre management; smart-grids

Hrčak ID:

203483

URI

https://hrcak.srce.hr/203483

Publication date:

11 July 2018
