Publication:
Helmholtz principle based supervised and unsupervised feature selection methods for text mining

dc.contributor.authorGANİZ, MURAT CAN
dc.contributor.authorsTutkan, Melike; Ganiz, Murat Can; Akyokus, Selim
dc.date.accessioned2022-03-14T08:17:10Z
dc.date.accessioned2026-01-11T11:42:17Z
dc.date.available2022-03-14T08:17:10Z
dc.date.issued2016-09
dc.description.abstractOne of the important problems in text classification is the high dimensionality of the feature space. Feature selection methods are used to reduce the dimensionality of the feature space by selecting the most valuable features for classification. Apart from reducing the dimensionality, feature selection methods have the potential to improve text classifiers' performance in terms of both accuracy and time. Furthermore, they help to build simpler and, as a result, more comprehensible models. In this study, we propose new methods for feature selection from textual data, called Meaning Based Feature Selection (MBFS), which are based on the Helmholtz principle from the Gestalt theory of human perception, which is used in image processing. The proposed approaches are extensively evaluated by their effect on the classification performance of two well-known classifiers on several datasets and compared with several feature selection algorithms commonly used in text mining. Our results demonstrate the value of the MBFS methods in terms of classification accuracy and execution time. (C) 2016 Elsevier Ltd. All rights reserved.
dc.identifier.doi10.1016/j.ipm.2016.03.007
dc.identifier.eissn1873-5371
dc.identifier.issn0306-4573
dc.identifier.urihttps://hdl.handle.net/11424/241427
dc.identifier.wosWOS:000381540600011
dc.language.isoeng
dc.publisherELSEVIER SCI LTD
dc.relation.ispartofINFORMATION PROCESSING & MANAGEMENT
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectFeature selection
dc.subjectAttribute selection
dc.subjectMachine learning
dc.subjectText mining
dc.subjectText classification
dc.subjectHelmholtz principle
dc.subjectSEMANTIC SMOOTHING METHOD
dc.subjectALGORITHM
dc.titleHelmholtz principle based supervised and unsupervised feature selection methods for text mining
dc.typearticle
dspace.entity.typePublication
oaire.citation.endPage910
oaire.citation.issue5
oaire.citation.startPage885
oaire.citation.titleINFORMATION PROCESSING & MANAGEMENT
oaire.citation.volume52

Files

Original bundle

Name:
file.pdf
Size:
907.09 KB
Format:
Adobe Portable Document Format