A Classification Model for Predicting Road Accidents Using Web Data

Authors

  • Luan Sinanaj, South East European University, North Macedonia, https://orcid.org/0009-0009-4852-5571
  • Erind Bedalli, University of Elbasan, Albania
  • Lejla Abazi Bexheti, South East European University, North Macedonia

DOI:

https://doi.org/10.54820/entrenova-2023-0006

Keywords:

data mining, web scraping, classification model, road accident prediction

Abstract

The increase in urbanisation and vehicle use in recent decades has led to a rise in road accidents. Road accidents have various causes, including human error, weather conditions, and inadequate road infrastructure. Knowing the causes and locations of road accidents can help prevent them: state institutions can take the necessary measures, and citizens can be informed about accident-prone areas. The primary purpose of this study is to explore patterns in web data on accidents in Albania and to construct a classification model using data mining techniques and methods. These techniques have been applied to data obtained from several leading media portals in Albania, including about 30,000 articles from online portals and reports from the state authorities. The constructed classification model is expected to be used to predict accident likelihood according to location, weather, and period of the year.

Author Biographies

Luan Sinanaj, South East European University, North Macedonia

Luan Sinanaj is a PhD student at South East European University, North Macedonia, and works as a full-time lecturer in the Department of Information Technology at "Aleksandër Moisiu" University of Durrës. He completed his B.Sc. studies in the "Informatics" program at the University of Pisa, Italy. He then pursued a professional master's degree in "Internet Technologies" at the same university. Later, he graduated with a scientific master's degree in the "Economic Informatics" program at the European University of Tirana. PhD(c) Sinanaj is the co-author of the book "Basics of Programming in JAVA" and the author or co-author of several publications at national and international conferences. His work and research experience and interests are in programming, data mining, and artificial intelligence. The author can be contacted at ls30441@seeu.edu.mk

Erind Bedalli, University of Elbasan, Albania

Erind Bedalli is a lecturer in the Department of Informatics at the University of Elbasan. He received his B.Sc. degree in Computer Engineering from Hacettepe University, Ankara, and his M.Sc. degree in Informatics from the University of Tirana. He completed his doctoral studies in fuzzy logic and exploratory data analysis at the University of Tirana in 2014. His research experience and interests include fuzzy logic, data mining, artificial intelligence, and large-scale computing. The author can be contacted at erind.bedalli@uniel.edu.al

Lejla Abazi Bexheti, South East European University, North Macedonia

Lejla Abazi Bexheti is an Associate Professor at the Faculty of Contemporary Sciences and Technologies (CST) at South East European University (SEEU) in North Macedonia. She holds a PhD in Computer Science and has been part of the CST teaching staff since 2002. Her main research activity is in Learning Systems and eLearning, and she has been involved in many international projects and research activities in this area. At SEEU, she was involved in resolving the Learning Management System issue. Currently, she is Pro-rector for academic issues at SEEU. The author can be contacted at l.abazi@seeu.edu.mk

Published

2024-03-12

How to Cite

Sinanaj, L., Bedalli, E., & Abazi Bexheti, L. (2024). A Classification Model for Predicting Road Accidents Using Web Data. ENTRENOVA - ENTerprise REsearch InNOVAtion, 9(1), 50–61. https://doi.org/10.54820/entrenova-2023-0006

Section

Mathematical and Quantitative Methods