Occupational groups prediction in Turkish Twitter data by using machine learning algorithms with multinomial approach

YILDIZ, KAZIM

dc.contributor.author	YILDIZ, KAZIM
dc.contributor.authors	Ciplak Z., YILDIZ K.
dc.date.accessioned	2024-06-10T14:39:13Z
dc.date.available	2024-06-10T14:39:13Z
dc.date.issued	2024-10-15
dc.description.abstract	Much research has used content that users share on social networks to study personality, sentiment, and demographic and professional attributes. Twitter data in particular has been used to extract information and create value. This study aims to predict the occupational groups of users who post in Turkish on Twitter, using machine learning methods. First, the occupational groups and the Twitter accounts of the occupations in these groups were determined manually, and the tweets shared by these accounts were scraped. All tweets were then grouped by occupation into groups of one, five and ten, creating datasets with different characteristics, each containing more than 500,000 tweets. Some datasets were preprocessed using the Zemberek library, which is used in many Turkish NLP studies, and experiments were conducted with a total of 6 datasets. During the preprocessing phase, since the ready-made stopword lists were not considered sufficient, lists of unnecessary words consisting of single words and two-word phrases were created manually. Count and TF-IDF vectorizers were used to convert the textual data into numerical form. Since each word represents a variable in the text classification study, new variables were created during feature extraction by combining words into two- and three-word phrases (n-grams). In the experiments, in which 24 different models were run, instead of using all the features created, the method of "determining the optimal number of features", which retains only the most valuable features, was used. The most successful model in the experiments using machine learning algorithms with a multinomial approach achieved 97.3% on all calculated metrics.
dc.identifier.citation	Ciplak Z., YILDIZ K., "Occupational groups prediction in Turkish Twitter data by using machine learning algorithms with multinomial approach", Expert Systems with Applications, vol.252, 2024
dc.identifier.doi	10.1016/j.eswa.2024.124175
dc.identifier.issn	0957-4174
dc.identifier.uri	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85192984274&origin=inward
dc.identifier.uri	https://hdl.handle.net/11424/297024
dc.identifier.volume	252
dc.language.iso	eng
dc.relation.ispartof	Expert Systems with Applications
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Bilgisayar Bilimleri
dc.subject	Algoritmalar
dc.subject	Mühendislik ve Teknoloji
dc.subject	Computer Sciences
dc.subject	algorithms
dc.subject	Engineering and Technology
dc.subject	Mühendislik, Bilişim ve Teknoloji (ENG)
dc.subject	Bilgisayar Bilimi
dc.subject	Mühendislik
dc.subject	BİLGİSAYAR BİLİMİ, YAPAY ZEKA
dc.subject	Engineering, Computing & Technology (ENG)
dc.subject	COMPUTER SCIENCE
dc.subject	ENGINEERING
dc.subject	COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
dc.subject	Genel Mühendislik
dc.subject	Fizik Bilimleri
dc.subject	Bilgisayar Bilimi Uygulamaları
dc.subject	Yapay Zeka
dc.subject	General Engineering
dc.subject	Physical Sciences
dc.subject	Computer Science Applications
dc.subject	Artificial Intelligence
dc.subject	Data mining
dc.subject	Machine learning
dc.subject	Multinomial approach
dc.subject	Occupation prediction
dc.subject	Turkish twitter data analysis
dc.title	Occupational groups prediction in Turkish Twitter data by using machine learning algorithms with multinomial approach
dc.type	article
dspace.entity.type	Publication
local.avesis.id	cf7f9da5-9a6d-48d7-8985-387b17dcf5e6
local.indexed.at	SCOPUS
relation.isAuthorOfPublication	5f9350a3-17ea-4eb8-a7ab-8fe4644d3a2c
relation.isAuthorOfPublication.latestForDiscovery	5f9350a3-17ea-4eb8-a7ab-8fe4644d3a2c
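
The pipeline outlined in the abstract (n-gram count features, a reduced feature set, a multinomial classifier, and a search for the best number of features) can be illustrated with a minimal standard-library sketch. Everything here is an assumption made for illustration: the function names, the toy English tweets, the frequency-based feature ranking, and the choice of multinomial Naive Bayes are hypothetical stand-ins, since this record does not state which scoring rule or classifiers the authors used.

```python
import math
from collections import Counter, defaultdict

def ngrams(text, n_max=3):
    """Return word n-grams (1..n_max) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def top_k_features(docs, k):
    """Keep the k most frequent n-grams (an assumed, simplistic ranking rule)."""
    totals = Counter(g for d in docs for g in ngrams(d))
    # Sort by count, then alphabetically, so the selection is deterministic.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return {g for g, _ in ranked[:k]}

def train_multinomial_nb(docs, labels, vocab):
    """Fit multinomial Naive Bayes with add-one smoothing over the vocabulary."""
    class_docs = Counter(labels)
    counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        counts[y].update(g for g in ngrams(d) if g in vocab)
    model = {}
    for y in class_docs:
        total = sum(counts[y].values())
        log_prior = math.log(class_docs[y] / len(docs))
        log_lik = {g: math.log((counts[y][g] + 1) / (total + len(vocab)))
                   for g in vocab}
        model[y] = (log_prior, log_lik)
    return model

def predict(model, vocab, text):
    """Return the class with the highest log-posterior (ties broken by name)."""
    feats = [g for g in ngrams(text) if g in vocab]
    scores = {y: lp + sum(ll[g] for g in feats) for y, (lp, ll) in model.items()}
    return max(sorted(scores), key=scores.get)

def best_feature_count(train, valid, candidates):
    """Try each candidate k; return (k, accuracy) with the best validation accuracy."""
    docs, labels = zip(*train)
    best = None
    for k in candidates:
        vocab = top_k_features(docs, k)
        model = train_multinomial_nb(docs, labels, vocab)
        acc = sum(predict(model, vocab, t) == y for t, y in valid) / len(valid)
        if best is None or acc > best[1]:
            best = (k, acc)
    return best

# Hypothetical toy data standing in for tweets labeled by occupation.
train = [("patient clinic surgery today", "doctor"),
         ("clinic patient treatment", "doctor"),
         ("court hearing client today", "lawyer"),
         ("client court appeal", "lawyer")]
valid = [("surgery patient", "doctor"), ("appeal hearing", "lawyer")]

result = best_feature_count(train, valid, [2, 5, 50])
print(result)
```

On real data the ranking rule would more plausibly be a statistical score such as chi-square or mutual information, and the candidate feature counts would be evaluated with cross-validation rather than a single held-out split.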

Files

Original bundle

Name: file.pdf
Size: 2.3 MB
Format: Adobe Portable Document Format

Collections

Research Outputs
