
Original scientific paper

https://doi.org/10.1080/1331677X.2015.1095110

Processing unstructured documents and social media using Big Data techniques

Vlad Diaconita (ORCID: orcid.org/0000-0002-5169-9232)



Pages: 981-993



Abstract

Big Data technologies are well suited to storing terabytes or petabytes of data and processing them with sophisticated algorithms. With recent advancements such as Hadoop YARN, processing can be done not only in batch but also in real time. In this paper, we detail a methodology, followed by a case study that investigates the power of machine learning algorithms, run in a Hadoop environment, to classify unstructured data. We also investigate how to capture geolocated messages from social networks and how kriging can be used to assess whether there is a strong relationship between two or more such datasets.
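
As a rough illustration of the classification task mentioned above, the following is a minimal single-machine sketch of k-NN (listed among the keywords) applied to short text documents. It is not the paper's Hadoop implementation: the bag-of-words representation, the cosine similarity measure, the function names (`vectorize`, `cosine`, `knn_classify`) and the toy training data are all assumptions made for illustration only.

```python
from collections import Counter
import math


def vectorize(text):
    """Turn a document into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, labelled_docs, k=3):
    """Label `query` by majority vote among its k most similar documents."""
    q = vectorize(query)
    scored = sorted(
        ((cosine(q, vectorize(text)), label) for text, label in labelled_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, invented for this sketch.
TRAIN = [
    ("invoice payment due amount total", "finance"),
    ("bank transfer invoice amount", "finance"),
    ("payment received total amount", "finance"),
    ("match goal team score league", "sport"),
    ("team player goal season", "sport"),
    ("league season match player", "sport"),
]

if __name__ == "__main__":
    print(knn_classify("invoice total amount due", TRAIN))
    print(knn_classify("the team scored a goal this season", TRAIN))
```

In a Hadoop setting, the similarity computation would typically be distributed across mappers and the vote taken in a reducer; that decomposition is likewise an assumption here, not a description of the paper's design.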

Keywords

Hadoop; MapReduce; k-NN; social media; geolocated messages; large data sets

Hrčak ID:

171608

URI

https://hrcak.srce.hr/171608

Publication date:

20.12.2015.
