
Original scientific paper

https://doi.org/10.7307/ptt.v37i6.1016

Prediction and Investigation of the Injury Severity of Drivers Involved in Speeding-Related Crashes Using Machine Learning Models

Neero Gumsar SORUM ; Department of Civil Engineering, North Eastern Regional Institute of Science and Technology, Nirjuli, India *
Martina Gumsar SORUM ; Department of Civil Engineering, North Eastern Regional Institute of Science and Technology, Nirjuli, India

* Corresponding author.


page 1612-1627

Abstract

Speeding is the major cause of road traffic crashes and deaths in India. Other driver faults include driving under the influence, using mobile phones while driving and driving on the wrong side of the road. This study therefore attempts to predict and investigate driver injury severity (DIS) in speeding-related crashes. Data on 793 police-reported single-vehicle and two-vehicle crashes in Imphal City, India, collected between 2011 and 2020, were analysed and modelled. For DIS prediction, eleven supervised machine learning (ML) models were implemented using 5-fold and 10-fold cross-validation (FCV), and each was trained at train ratio (TR) values of 0.5, 0.6, 0.7 and 0.8 in each FCV setting. The top ML model for DIS prediction was selected based on the best combination of recall, accuracy, F1 score, area under the curve (AUC) and precision. Feature importance analysis (FIA) was conducted to determine the most impactful factors in DIS prediction. In 5-FCV, the gradient boosting tree (GBT), stochastic gradient descent, decision tree and lasso-LARS models were identified as the top-performing ML models at TR = 0.5, 0.6, 0.7 and 0.8, respectively. In 10-FCV, the best-performing ML models were light GBM (TR = 0.5 and 0.7), GBT (TR = 0.6) and lasso-LARS (TR = 0.8). The FIA results indicated that vehicle type (two-wheeler), nature of crash (head-on collision) and time of crash (12 PM–6 PM and 6 AM–12 PM) had the greatest impact on DIS prediction in Imphal speeding-related crashes. These ML models can be employed in hilly areas for the accurate prediction of DIS. The study results can help transportation planners design road safety measures and strategies to lessen DIS in speeding-related crashes.
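
To make the experimental design concrete, the following minimal Python sketch enumerates the eight evaluation settings described in the abstract (two fold counts times four train ratios) for the 793 crash records. It is illustrative only and is not the authors' Dataiku workflow. The function names, the rounding-down of the training set size and the assumption that cross-validation folds are drawn from the training portion are all hypothetical choices, since the abstract does not state them.

```python
import math
from itertools import product

N_CRASHES = 793                      # number of crash records, from the abstract
TRAIN_RATIOS = (0.5, 0.6, 0.7, 0.8)  # TR values, from the abstract
FOLD_COUNTS = (5, 10)                # 5-FCV and 10-FCV, from the abstract


def fold_sizes(n_samples, n_folds):
    """Split n_samples into n_folds nearly equal validation folds."""
    base, extra = divmod(n_samples, n_folds)
    # The first `extra` folds receive one additional sample each.
    return [base + 1 if i < extra else base for i in range(n_folds)]


def experiment_grid(n_samples=N_CRASHES):
    """List every (fold count, train ratio) setting with the resulting sizes."""
    grid = []
    for n_folds, ratio in product(FOLD_COUNTS, TRAIN_RATIOS):
        # Assumption: the training set size is rounded down.
        n_train = math.floor(n_samples * ratio)
        grid.append({
            "n_folds": n_folds,
            "train_ratio": ratio,
            "n_train": n_train,
            "n_test": n_samples - n_train,
            # Assumption: folds are drawn from the training portion only.
            "fold_sizes": fold_sizes(n_train, n_folds),
        })
    return grid


if __name__ == "__main__":
    for setting in experiment_grid():
        print(setting)
```

Each of the eleven models would be fitted once per setting, so under these assumptions a full run covers 88 model and setting combinations before the metrics are compared.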

Keywords

speeding; driver injury severity; machine learning; Dataiku; gradient boosting tree; lasso-LARS

Hrčak ID:

337235

URI

https://hrcak.srce.hr/337235

Publication date:

27.10.2025.
