Original scientific paper
https://doi.org/10.1080/1331677X.2015.1095110
Processing unstructured documents and social media using Big Data techniques
Vlad Diaconita
orcid.org/0000-0002-5169-9232
Abstract
Big Data technologies are well suited to storing terabytes or petabytes of data and processing them with sophisticated algorithms. With recent advances such as Hadoop YARN, processing can be done not only in batch but also in real time. In this paper, we detail a methodology, followed by a case study, that investigates the power of machine learning algorithms used in a Hadoop environment to classify unstructured data. We also investigate how to capture geolocated messages from social networks and how kriging can be used to assess whether there is a strong relationship between two or more such datasets.
Keywords
Hadoop; MapReduce; k-NN; social media; geolocated messages; large data sets
Hrčak ID: 171608
Publication date: 20 December 2015