A Classification Model for Predicting Road Accidents Using Web Data
DOI:
https://doi.org/10.54820/entrenova-2023-0006
Keywords:
data mining, web scraping, classification model, road accident prediction
Abstract
Increasing urbanisation and vehicle use in recent decades have led to a rise in road accidents. The causes of road accidents vary and include human error, weather conditions and inadequate road infrastructure. Knowing where and why accidents occur can help prevent them: state institutions can take the necessary measures, and citizens can be informed about accident-prone areas. The primary purpose of this study is to explore patterns in web data on road accidents in Albania and to construct a classification model using data mining techniques. These techniques have been applied to data that include about 30,000 articles from several leading online media portals in Albania and reports from the state authorities. The constructed classification model is expected to be used to predict accident likelihood by location, weather and period of the year.
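As a rough illustration of the kind of model the abstract describes, the following minimal sketch trains a categorical naive Bayes classifier on location, weather and season. It is not the authors' implementation: the abstract does not state which algorithm was used, naive Bayes is chosen here only for brevity, and the feature names, class labels and all records are invented placeholders rather than data from the study.

```python
import math
from collections import Counter, defaultdict


def train(records, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    # value_counts[label][feature][value] -> number of training rows
    value_counts = defaultdict(lambda: defaultdict(Counter))
    # feature_values[feature] -> set of distinct values seen in training
    feature_values = defaultdict(set)
    for row, label in zip(records, labels):
        for feature, value in row.items():
            value_counts[label][feature][value] += 1
            feature_values[feature].add(value)
    return class_counts, value_counts, feature_values, alpha


def predict_proba(model, row):
    """Return a dict mapping each class label to its posterior probability."""
    class_counts, value_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for feature, value in row.items():
            seen = value_counts[label][feature][value]
            k = len(feature_values[feature])
            score += math.log((seen + alpha) / (n + alpha * k))
        log_scores[label] = score
    # Normalise in log space for numerical stability.
    top = max(log_scores.values())
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(unnorm.values())
    return {label: v / z for label, v in unnorm.items()}


# Hypothetical toy records: NOT taken from the study's data set.
records = [
    {"location": "Tirana", "weather": "rain", "season": "winter"},
    {"location": "Tirana", "weather": "rain", "season": "autumn"},
    {"location": "Tirana", "weather": "clear", "season": "summer"},
    {"location": "Durres", "weather": "clear", "season": "summer"},
    {"location": "Durres", "weather": "rain", "season": "winter"},
    {"location": "Durres", "weather": "clear", "season": "spring"},
]
labels = ["high", "high", "low", "low", "high", "low"]  # invented likelihood classes

model = train(records, labels)
probs = predict_proba(
    model, {"location": "Tirana", "weather": "rain", "season": "winter"}
)
print(probs)
```

In a real pipeline the records would come from features extracted from the scraped articles and official reports, and the classifier would be evaluated on held-out data before any prediction is relied on.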
License
Copyright (c) 2024 Erind Bedalli, Luan Sinanaj, Lejla Abazi Bexheti
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.