Publication:
Occupational groups prediction in Turkish Twitter data by using machine learning algorithms with multinomial approach

dc.contributor.authorYILDIZ, KAZIM
dc.contributor.authorsCiplak Z., YILDIZ K.
dc.date.accessioned2024-06-10T14:39:13Z
dc.date.available2024-06-10T14:39:13Z
dc.date.issued2024-10-15
dc.description.abstractMuch research has used posts shared on social networks to study personality, sentiment, and demographic and professional attributes. Twitter data in particular is used to extract information and produce value. This study aims to predict the occupational groups of users who tweet in Turkish, using machine learning methods. First, occupational groups and the Twitter accounts belonging to the occupations in these groups were determined manually, and the tweets shared by these accounts were scraped. All tweets were then grouped by occupation into groups of one, five, and ten, creating datasets with different characteristics, each containing more than 500,000 tweets. Some datasets were preprocessed using the Zemberek library, which is used in many Turkish NLP studies, and experiments were conducted with a total of 6 datasets. During the preprocessing phase, since the ready-made stopword lists were not considered sufficient, lists of unnecessary words consisting of single words and word pairs were created manually. Count and TF-IDF vectorizers were used to convert the textual data into numerical form. Since each word represents a variable in text classification, new variables were created during feature extraction by combining words into two- and three-word phrases (n-grams). In the experiments, in which 24 different models were run, the method of "determining the optimal number of features", which keeps only the most valuable features, was used instead of all the features created. The most successful model in the experiments using machine learning algorithms with a multinomial approach achieved a score of 97.3% on all calculated metrics.
dc.identifier.citationCiplak Z., YILDIZ K., "Occupational groups prediction in Turkish Twitter data by using machine learning algorithms with multinomial approach", Expert Systems with Applications, vol. 252, 2024
dc.identifier.doi10.1016/j.eswa.2024.124175
dc.identifier.issn0957-4174
dc.identifier.urihttps://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85192984274&origin=inward
dc.identifier.urihttps://hdl.handle.net/11424/297024
dc.identifier.volume252
dc.language.isoeng
dc.relation.ispartofExpert Systems with Applications
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectBilgisayar Bilimleri
dc.subjectAlgoritmalar
dc.subjectMühendislik ve Teknoloji
dc.subjectComputer Sciences
dc.subjectalgorithms
dc.subjectEngineering and Technology
dc.subjectMühendislik, Bilişim ve Teknoloji (ENG)
dc.subjectBilgisayar Bilimi
dc.subjectMühendislik
dc.subjectBİLGİSAYAR BİLİMİ, YAPAY ZEKA
dc.subjectEngineering, Computing & Technology (ENG)
dc.subjectCOMPUTER SCIENCE
dc.subjectENGINEERING
dc.subjectCOMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
dc.subjectGenel Mühendislik
dc.subjectFizik Bilimleri
dc.subjectBilgisayar Bilimi Uygulamaları
dc.subjectYapay Zeka
dc.subjectGeneral Engineering
dc.subjectPhysical Sciences
dc.subjectComputer Science Applications
dc.subjectArtificial Intelligence
dc.subjectData mining
dc.subjectMachine learning
dc.subjectMultinomial approach
dc.subjectOccupation prediction
dc.subjectTurkish twitter data analysis
dc.titleOccupational groups prediction in Turkish Twitter data by using machine learning algorithms with multinomial approach
dc.typearticle
dspace.entity.typePublication
local.avesis.idcf7f9da5-9a6d-48d7-8985-387b17dcf5e6
local.indexed.atSCOPUS
relation.isAuthorOfPublication5f9350a3-17ea-4eb8-a7ab-8fe4644d3a2c
relation.isAuthorOfPublication.latestForDiscovery5f9350a3-17ea-4eb8-a7ab-8fe4644d3a2c
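
The pipeline outlined in the abstract (n-gram features, a reduced feature set, a multinomial classifier) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the English toy sentences, the labels, the use of document frequency as the selection criterion, and the choice of multinomial naive Bayes are all assumptions made for illustration, and the record does not state which algorithms or selection method the study used.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=3):
    """Return word n-grams of length 1..n_max from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def select_top_features(docs, k):
    """Keep the k n-grams with the highest document frequency.

    Document frequency is a stand-in criterion (an assumption); ties are
    broken alphabetically so the result is deterministic.
    """
    df = Counter()
    for doc in docs:
        df.update(set(ngrams(doc)))
    ranked = sorted(df, key=lambda f: (-df[f], f))
    return set(ranked[:k])


class MultinomialNB:
    """Multinomial naive Bayes over n-gram counts with Laplace smoothing."""

    def __init__(self, vocab, alpha=1.0):
        self.vocab = vocab
        self.alpha = alpha
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            self.class_counts[label] += 1
            self.feat_counts[label].update(
                f for f in ngrams(doc) if f in self.vocab)
        return self

    def predict(self, doc):
        feats = [f for f in ngrams(doc) if f in self.vocab]
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.feat_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(count / n_docs)  # log prior
            for f in feats:
                score += math.log((self.feat_counts[label][f] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy data; the study itself used Turkish tweets.
docs = [
    "the patient needs surgery today",
    "new vaccine for the patient",
    "the court hearing is today",
    "the judge ruled in court",
]
labels = ["doctor", "doctor", "lawyer", "lawyer"]

vocab = select_top_features(docs, k=10)
model = MultinomialNB(vocab).fit(docs, labels)
print(model.predict("patient surgery"))
print(model.predict("court judge"))
```

Restricting the vocabulary before training mirrors the abstract's idea of using only the most valuable features rather than every n-gram; in practice the cutoff `k` would be tuned on held-out data.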

Files

Original bundle
Name:
file.pdf
Size:
2.3 MB
Format:
Adobe Portable Document Format
