Veri madenciliği teknikleri ile istenmeyen türkçe e-postaların önlenmesi üzerine bir uygulama

Saylan, Sefa

dc.contributor.advisor	ÇAKIR, Özgür
dc.contributor.author	Saylan, Sefa
dc.contributor.department	Marmara Üniversitesi
dc.contributor.department	Sosyal Bilimler Enstitüsü
dc.contributor.department	Sayısal Yöntemler Bilim Dalı
dc.contributor.department	İşletme Anabilim Dalı
dc.date.accessioned	2026-01-13T11:35:51Z
dc.date.issued	2018
dc.description.abstract	İstenmeyen e-postalara maruz kalmak işletmelerin iş süreçlerinde aksamalara, zaman kayıplarına ve hatta maddi kayıplarına sebep olduğundan günümüzün önemli sorunlarından biri olarak görülmektedir. İstenmeyen e-postaların engellenmesi için öncelikle tespit edilmeleri gerekmektedir. Bu çalışmada, gelen e-postaların sınıflandırılması ve istenmeyen Türkçe e-postaların tespiti için Naive Bayes algoritmaları (iki terimli ve çok terimli) ve Destek Vektör Makinesi algoritmaları (doğrusal ve RBF çekirdek fonksiyonlu) kullanılmıştır. Çalışmada, öğrenme kümesinin Türkçede kullanılan etkisiz kelimelerden arındırılması ve arındırılmaması durumunda TF-IDF yöntemi ile oluşturulan farklı boyutlardaki özellik vektörlerinin sınıflandırma başarısına etkisi 72 farklı model oluşturularak incelenmiştir. Öğrenme kümesinden etkisiz kelimelerin arındırılmaması durumunda oluşturulan modellerin çoğunlukla daha yüksek başarı ile sınıflandırma işlemini gerçekleştirdiği sonucuna ulaşılmıştır. En yüksek başarıyı elde eden sınıflandırma algoritmasının çok terimli naive bayes algoritması olduğu gözlemlenmiştir.
dc.description.abstract	Spam (junk) e-mail is an important problem today because it disrupts business processes, wastes time, and can cause financial losses. Preventing spam first requires detecting it. In this study, Naive Bayes algorithms (Bernoulli and multinomial) and Support Vector Machine algorithms (with linear and RBF kernel functions) are applied to classify incoming e-mails and detect unwanted Turkish e-mails. In addition, 72 models are built to examine how TF-IDF feature vectors of different sizes affect classification accuracy when Turkish stop-words are or are not removed from the training set. Models built without removing stop-words mostly achieved higher classification accuracy. The multinomial Naive Bayes algorithm achieved the best result.
dc.format.extent	X, 128 p.
dc.identifier.uri	https://katalog.marmara.edu.tr/veriler/yordambt/cokluortam/4C/6D686B38-F632-194C-9478-DAF928F3F948.pdf
dc.identifier.uri	https://hdl.handle.net/11424/203145
dc.language.iso	tur
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Bernoulli Naive Bayes
dc.subject	Classification
dc.subject	Data mining
dc.subject	Destek Vektör Makinesi
dc.subject	Electronic data processing
dc.subject	Electronic mail systems
dc.subject	Elektronik bilgi işlem
dc.subject	Elektronik posta sistemleri
dc.subject	İstenmeyen Türkçe E-postalar
dc.subject	Linear Functions
dc.subject	Lineer Çekirdek Fonksiyonu
dc.subject	Management
dc.subject	Multinomial Naive Bayes
dc.subject	Naive Bayes
dc.subject	RBF Çekirdek Fonksiyonu
dc.subject	Spam/Junk mails
dc.subject	RBF Kernel
dc.subject	Sınıflandırma
dc.subject	Support Vector Machine
dc.subject	Veri madenciliği
dc.subject	Yönetim
dc.title	Veri madenciliği teknikleri ile istenmeyen türkçe e-postaların önlenmesi üzerine bir uygulama
dc.type	masterThesis
dspace.entity.type	Publication

Collections

Tezler