Instance labeling in semi-supervised learning with meaning values of words

ALTINEL GİRGİN, AYŞE BERNA; GANİZ, MURAT CAN

doi:10.1016/j.engappai.2017.04.003

dc.contributor.author	ALTINEL GİRGİN, AYŞE BERNA
dc.contributor.author	GANİZ, MURAT CAN
dc.contributor.authors	Altinel, Berna; Ganiz, Murat Can; Diri, Banu
dc.date.accessioned	2022-03-12T20:32:47Z
dc.date.accessioned	2026-01-11T13:14:29Z
dc.date.available	2022-03-12T20:32:47Z
dc.date.issued	2017
dc.description.abstract	In supervised learning systems, only labeled samples are used to build a classifier, which is then used to predict the class labels of unlabeled samples. However, obtaining labeled data is expensive, time-consuming, and difficult in real-life settings, because labeling a data set requires the effort of a human expert. Unlabeled data, on the other hand, are often plentiful and therefore relatively inexpensive and easy to obtain. Semi-supervised learning (SSL) methods strive to exploit this plentiful source of unlabeled examples to increase the learning capacity of the classifier, particularly when the number of labeled examples is limited. Since SSL techniques usually reach higher accuracy and require less human effort, they attract substantial attention in both practical applications and theoretical research. This study proposes a novel semi-supervised methodology. The algorithm uses a new method to predict the class labels of unlabeled examples in a corpus and incorporates them into the training set to build a better classifier. The approach depends on a meaning calculation, which computes the meaning scores of words within the scope of classes. Meaning computation is based on the Helmholtz principle and has been applied to various text mining tasks, such as feature extraction, information retrieval, and document summarization. Nevertheless, according to the literature, the proposed method, ILBOM, is the first work that uses meaning calculation in a semi-supervised way to construct a semantic smoothing kernel for Support Vector Machines (SVM). The proposed methodology is evaluated through various experiments on standard textual datasets. ILBOM's experimental results are compared with three baseline algorithms, including SVM with a linear kernel, which is one of the most frequently used algorithms in text classification. The experimental results show that labeling unlabeled instances based on the meaning scores of words to augment the training set is valuable and significantly increases classification accuracy on previously unseen test instances.
dc.identifier.doi	10.1016/j.engappai.2017.04.003
dc.identifier.eissn	1873-6769
dc.identifier.issn	0952-1976
dc.identifier.uri	https://hdl.handle.net/11424/234432
dc.identifier.wos	WOS:000403122500012
dc.language.iso	eng
dc.publisher	PERGAMON-ELSEVIER SCIENCE LTD
dc.relation.ispartof	ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	Text classification
dc.subject	Semantic kernel
dc.subject	Semi-supervised learning
dc.subject	Instance labeling
dc.subject	Helmholtz principle
dc.title	Instance labeling in semi-supervised learning with meaning values of words
dc.type	article
dspace.entity.type	Publication
oaire.citation.endPage	163
oaire.citation.startPage	152
oaire.citation.title	ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
oaire.citation.volume	62

Collections

Araştırma Çıktıları (Research Outputs)
