Original scientific paper

https://doi.org/10.17559/TV-20231004000989

Sentiment Analysis on Big Data: A Hybrid SED-TABU Feature Selection Method

Sabitha Rajagopal ; Department of Computer Science and Engineering, SNS College of Technology, Tamilnadu 641035, India
Sreemathy Jayaprakash ; Department of Computer Science and Engineering, Sri Eshwar College of Engineering, Kinathakadavu, Coimbatore, Tamilnadu 641202, India *
Karthik Subburathinam ; Department of Computer Science and Engineering, SNS College of Technology, Coimbatore, Tamilnadu 641035, India

* Corresponding author.


Full text: English, PDF, 377 KB

pp. 2079-2086

Abstract

Big data mining is a crucial component of contemporary decision support systems linked to social networks and other data sources. Sentiment analysis (SA) uses text analytics to mine opinions from many data sources. This research seeks to create a feature selection method for sentiment analysis that is efficient and robust against noise and high dimensionality in Big data environments. The objective is to select a compact set of useful features that increases sentiment classification precision. A novel hybrid feature selection method is proposed that combines Tabu Search (TS) and Stream Evolution Dynamics (SED): SED provides exploration, and TS provides exploitation. A classifier assesses the performance of each feature subset that SED-TS selects. Instances are classified using the AdaBoost classifier. The proposed method was evaluated on Amazon product review data. The results show that the technique outperforms wrapper and filter-based feature selection methods. By extracting a small feature subset, the SED-TS hybrid technique attained a best accuracy of 93% and an F1 score of 0.95. The work combines SED and TS for feature selection specifically suited to sentiment analysis on Big data. The hybrid strategy offers higher accuracy and better generalization by exploiting the complementary characteristics of the two strategies. This shows how metaheuristic approaches can be used to classify sentiment in high-dimensional noisy data.
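
To make the exploitation half of the idea concrete, the following is a minimal sketch of a generic tabu search over feature subsets. It is not the authors' SED-TS method: the abstract does not specify the neighbourhood, the tabu tenure, or how SED works, so the single-feature flip move, the `tabu_size` parameter, and the function names are all assumptions made for illustration. The hypothetical `toy_score` stands in for the classifier-based evaluation (for example, AdaBoost accuracy on the chosen subset) that a wrapper method would use.

```python
from collections import deque
from typing import Callable, Deque, FrozenSet

# Assumption: features 0, 2 and 5 are the "useful" ones in this toy problem.
INFORMATIVE = frozenset({0, 2, 5})


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical stand-in for classifier accuracy on a feature subset.

    Rewards informative features and penalises noisy ones.
    """
    return len(subset & INFORMATIVE) - 0.5 * len(subset - INFORMATIVE)


def tabu_feature_search(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    iterations: int = 50,
    tabu_size: int = 5,
) -> FrozenSet[int]:
    """Generic tabu search using single-feature flip moves (illustrative only)."""
    current: FrozenSet[int] = frozenset()
    best, best_score = current, score(current)
    tabu: Deque[int] = deque(maxlen=tabu_size)  # recently flipped features

    for _ in range(iterations):
        candidates = []
        for f in range(n_features):
            neighbour = current ^ frozenset([f])  # add or remove feature f
            s = score(neighbour)
            # Aspiration rule: a tabu move is allowed if it beats the best so far.
            if f not in tabu or s > best_score:
                candidates.append((s, f, neighbour))
        if not candidates:
            break
        # Take the best admissible move, even if it is worse than the current
        # subset; ties go to the lowest feature index.
        s, f, current = max(candidates, key=lambda c: (c[0], -c[1]))
        tabu.append(f)
        if s > best_score:
            best, best_score = current, s
    return best


selected = tabu_feature_search(8, toy_score)
print(sorted(selected))
```

The tabu list is what lets the search step away from a local optimum without immediately undoing its last moves; an exploratory component such as SED would, under the abstract's description, supply diverse starting subsets for this kind of local refinement.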

Keywords

AdaBoost classifier; Big data; feature selection; sentiment analysis (SA); stream evolution dynamics (SED); tabu search (TS)

Hrčak ID:

321943

URI

https://hrcak.srce.hr/321943

Publication date:

31.10.2024.
