Please use this identifier to cite or link to this item: https://hdl.handle.net/11499/23998
Title: Text Classification on Mahout with Naive-Bayes Machine Learning Algorithm
Authors: Salur, MU
Tokat, Sezai
Aydilek, IB
Keywords: Machine Learning; Hadoop; Mahout; Text Classification
Publisher: IEEE
Abstract: In daily life, we use the Internet for many purposes. The Internet makes our lives easier and has led to the emergence of new technologies. Smart devices that use the Internet infrastructure generate digital data in different formats and at different rates. The generated data are evaluated with algorithms from the field of machine learning. Given the large volume and the diverse formats of the generated data, high-performance parallel computing tools are required in addition to machine learning algorithms. One such tool is Mahout, a big data processing tool that runs on Hadoop and was developed for this purpose. Mahout is described as an open source framework that runs machine learning algorithms in parallel on distributed servers. In this study, tweets belonging to three daily newspapers are classified according to newspaper categories using the Naive-Bayes algorithm with the Mahout framework. The proposed model achieves a success rate of about 79% in classifying the tweets. The classification results are tabulated in the last section of the study.
URI: https://hdl.handle.net/11499/23998
Appears in Collections: Mühendislik Fakültesi Koleksiyonu / Faculty of Engineering Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.