Semantic text classification: A survey of past and recent advances

ALTINEL GİRGİN, AYŞE BERNA; GANİZ, MURAT CAN

doi:10.1016/j.ipm.2018.08.001

dc.contributor.author	ALTINEL GİRGİN, AYŞE BERNA
dc.contributor.author	GANİZ, MURAT CAN
dc.contributor.authors	Altinel, Berna; Ganiz, Murat Can
dc.date.accessioned	2022-03-12T22:25:02Z
dc.date.accessioned	2026-01-11T10:25:51Z
dc.date.available	2022-03-12T22:25:02Z
dc.date.issued	2018
dc.description.abstract	Automatic text classification is the task of organizing documents into predetermined classes, generally using machine learning algorithms. It is one of the most important methods for organizing and making use of the vast amounts of information that exist in unstructured textual format. Text classification is a widely studied research area of language processing and text mining. In traditional text classification, a document is represented as a bag of words, in which the words (in other words, terms) are cut from their finer context, i.e., their location in a sentence or in a document. Only the broader context of the document is used, with some type of term frequency information in the vector space. Consequently, the semantics of words that can be inferred from the finer context of their location in a sentence and their relations with neighboring words are usually ignored. However, the meanings of words and the semantic connections between words, documents, and even classes are important, since methods that capture semantics generally reach better classification performance. Several surveys have been published to analyze diverse approaches to traditional text classification. Most of these surveys cover the application of different semantic term relatedness methods in text classification to a certain degree. However, they do not specifically target semantic text classification algorithms and their advantages over traditional text classification. To fill this gap, we undertake a comprehensive discussion of semantic text classification vs. traditional text classification. This survey explores past and recent advancements in semantic text classification and attempts to organize existing approaches under five fundamental categories: domain knowledge-based approaches, corpus-based approaches, deep learning based approaches, word/character sequence enhanced approaches and linguistic enriched approaches. Furthermore, this survey highlights the advantages of semantic text classification algorithms over traditional text classification algorithms.
dc.identifier.doi	10.1016/j.ipm.2018.08.001
dc.identifier.eissn	1873-5371
dc.identifier.issn	0306-4573
dc.identifier.uri	https://hdl.handle.net/11424/234862
dc.identifier.wos	WOS:000445713800017
dc.language.iso	eng
dc.publisher	ELSEVIER SCI LTD
dc.relation.ispartof	INFORMATION PROCESSING & MANAGEMENT
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	Text classification
dc.subject	Semantic text classification
dc.subject	Knowledge-based systems
dc.subject	Corpus-based systems
dc.subject	Neural language models
dc.subject	Deep learning
dc.subject	SMOOTHING METHOD
dc.subject	KERNEL METHODS
dc.subject	WORD
dc.subject	RELATEDNESS
dc.subject	ALGORITHM
dc.subject	COVERAGE
dc.subject	VALUES
dc.title	Semantic text classification: A survey of past and recent advances
dc.type	article
dspace.entity.type	Publication
oaire.citation.endPage	1153
oaire.citation.issue	6
oaire.citation.startPage	1129
oaire.citation.title	INFORMATION PROCESSING & MANAGEMENT
oaire.citation.volume	54

Collections

Araştırma Çıktıları (Research Outputs)
