Original scientific paper

https://doi.org/10.1080/00051144.2024.2326383

Ensemble machine learning technique-based plagiarism detection over opinions in social media

Sethu Vinayaga Vadivu ; Department of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, India *
Palanigurupackiam Nagaraj ; Department of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, India
Bagavathi Ammai Shanmugam Murugan ; Department of Computer Science and Engineering, M Kumarasamy College of Engineering and Technology, Karur, India

* Corresponding author.


Full text: english pdf 2.206 Kb

pages 983-991


Abstract

With the steady growth of social media, many people now prefer to post their opinions on social
media platforms rather than through radio, television or newspapers. The postings vary in size
and include various titles and comments. Plagiarism, which occurs when someone rewrites or
repeats another person's work, is increasing rapidly, and there are many ways to detect it by
searching the internet. This paper aims to detect plagiarism in social media in four phases:
data pre-processing, n-gram evaluation, similarity/distance computation analysis and plagiarism
detection. Pre-processing comprises data cleaning steps, such as removing redundant data,
upper-case letters, noise and irrelevant punctuation, and converting the text into vector form.
The pre-processed data are then fed to n-gram evaluation to develop a posting attribution system.
Finally, an ensemble support vector machine-based African vulture optimization (ESVM-AVO)
approach is employed to detect plagiarism; it improves detection performance and achieves a high
detection accuracy with a very low execution time. A performance evaluation and a comparative
analysis are carried out to assess the proposed system.
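
The first three phases named in the abstract (pre-processing, n-gram evaluation and similarity computation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of word bigrams and the use of Jaccard similarity are assumptions made for illustration, "removal of upper-case letters" is interpreted here as lower-casing, and the ESVM-AVO classifier is not reproduced.

```python
import string


def preprocess(text):
    """Lower-case the post, strip ASCII punctuation and split into word tokens."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def word_ngrams(tokens, n=2):
    """Return the set of word n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngram_similarity(post_a, post_b, n=2):
    """Similarity score in [0, 1] between two posts, based on shared n-grams."""
    grams_a = word_ngrams(preprocess(post_a), n)
    grams_b = word_ngrams(preprocess(post_b), n)
    return jaccard_similarity(grams_a, grams_b)


if __name__ == "__main__":
    # Two posts that differ only in case and punctuation.
    print(ngram_similarity("The food was GREAT!", "the food was great"))
```

In a full pipeline, scores of this kind would serve as input features for a classifier rather than as a final decision; any threshold applied to them would have to be tuned on labelled data.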

Keywords

Plagiarism; n-gram; support vector machine; African vulture optimization; opinion mining; social media

Hrčak ID:

326219

URI

https://hrcak.srce.hr/326219

Publication date:

15.3.2024.
