Original scientific paper

https://doi.org/10.17559/TV-20240917001995

Investigating the Contribution of Structural and Contextual Features for Spam Detection

Ulin Nuha ; Dept. of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80778, Taiwan
Chih-Hsueh Lin ; Dept. of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80778, Taiwan *

* Corresponding author.


Full text: English, PDF, 697 KB

pages 958-965


Abstract

One detrimental consequence of the rapid and effortless dissemination of information is the heavy flow of spam messages to users' electronic devices. Previous studies have proposed several spam detection approaches based on different input features. However, the effectiveness and efficiency of spam detection still need improvement. Therefore, this paper investigates significant feature schemes for spam detection, including structural features and contextual representation. We propose a hybrid approach combining both feature types and evaluate it using various machine learning algorithms for short message service (SMS) spam classification. Experimental results show that relying solely on contextual representation outperforms relying solely on structural features, achieving an accuracy of 92.56%. However, the hybrid approach, combining both structural and contextual features, achieves superior results with an accuracy of 93.22% and an F-score of 95.12%. Among the classifiers tested, random forest demonstrated the most robust performance, consistently achieving accuracy above 90% across all feature extraction schemes. The findings highlight the potential of combining structural and contextual features to enhance spam detection performance, with practical implications for telecommunication providers aiming to improve SMS filtering accuracy.
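
As a rough illustration of what combining structural and contextual features can look like, the sketch below concatenates a few hand-crafted surface statistics of a message with a placeholder text vector. This is not the authors' implementation: the abstract does not list the specific structural features or name the contextual model, so the five statistics, the function names (structural_features, contextual_stub, hybrid_features), and the hashed bag-of-words stand-in for a contextual embedding are all assumptions made for this example.

```python
import hashlib
import re


def structural_features(text):
    """Surface statistics of a message (illustrative choices, not from the paper)."""
    n = max(len(text), 1)  # avoid division by zero on empty input
    return [
        float(len(text)),                                      # character count
        float(len(text.split())),                              # whitespace-separated word count
        sum(c.isdigit() for c in text) / n,                    # share of digit characters
        sum(c.isupper() for c in text) / n,                    # share of uppercase characters
        1.0 if re.search(r"https?://|www\.", text) else 0.0,   # contains a URL-like pattern
    ]


def contextual_stub(text, dim=8):
    """Placeholder for a contextual representation: a hashed bag of words.

    Assumption: a real system would obtain this vector from a pretrained
    language model; hashing is used here only to keep the sketch self-contained.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9']+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket each token by its first hash byte
    return vec


def hybrid_features(text, dim=8):
    """Hybrid scheme: concatenate structural and (stub) contextual vectors."""
    return structural_features(text) + contextual_stub(text, dim)


sms = "WIN a FREE prize! Visit http://example.com now"
features = hybrid_features(sms)
print(len(features), features)
```

The resulting vector could then be passed to any standard classifier, such as the random forest mentioned in the abstract; the classifier itself is left out here to keep the example dependency-free.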

Keywords

classifier algorithms; contextual representation; feature analysis; hybrid scheme; spam detection

Hrčak ID:

330561

URI

https://hrcak.srce.hr/330561

Publication date:

1.5.2025.