Publication:
A clustering framework for unbalanced partitioning and outlier filtering on high dimensional datasets

dc.contributor.authors: Bilgin, Turgay Tugay; Camurcu, A. Yilmaz
dc.contributor.editor: Ioannidis, Y
dc.contributor.editor: Novikov, B
dc.contributor.editor: Rachev, B
dc.date.accessioned: 2022-03-12T15:59:57Z
dc.date.accessioned: 2026-01-10T21:15:50Z
dc.date.available: 2022-03-12T15:59:57Z
dc.date.issued: 2007
dc.description.abstract: In this study, we propose an improved relationship-based clustering framework for unbalanced clustering and outlier filtering on high dimensional datasets. The original relationship-based clustering framework is built on METIS, a weighted graph partitioning system. However, it has two major drawbacks: it performs no outlier filtering, and it forces clusters to be balanced. Our proposed framework instead uses Graclus, an unbalanced kernel k-means based partitioning system. We make two major improvements over the original framework. First, we introduce a new space consisting of tiny unbalanced partitions created using Graclus, which we call the micro-partition space. We apply a filtering step that drops singletons and micro-partitions with fewer members than a threshold value. Second, we agglomerate the filtered micro-partition space and apply Graclus again for clustering. The results are visualized with CLUSION. Our experiments show that the proposed framework produces promising results on high dimensional datasets.
dc.identifier.doi: doiWOS:000250283400016
dc.identifier.eissn: 1611-3349
dc.identifier.isbn: 978-3-540-75184-7
dc.identifier.issn: 0302-9743
dc.identifier.uri: https://hdl.handle.net/11424/224554
dc.identifier.wos: WOS:000250283400016
dc.language.iso: eng
dc.publisher: SPRINGER-VERLAG BERLIN
dc.relation.ispartof: ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS
dc.relation.ispartofseries: Lecture Notes in Computer Science
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: data mining
dc.subject: dimensionality
dc.subject: clustering
dc.subject: outlier filtering
dc.subject: VISUALIZATION
dc.title: A clustering framework for unbalanced partitioning and outlier filtering on high dimensional datasets
dc.type: conferenceObject
dspace.entity.type: Publication
oaire.citation.endPage: +
oaire.citation.startPage: 205
oaire.citation.title: ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS
oaire.citation.volume: 4690
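
The filtering step described in the abstract (dropping singletons and micro-partitions with fewer members than a threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `filter_micro_partitions`, the `min_size` parameter, and the sample labels are hypothetical, and the labels are assumed to have already been produced by some partitioner such as Graclus.

```python
from collections import Counter


def filter_micro_partitions(labels, min_size):
    """Split point indices into kept points and filtered outliers.

    labels[i] is the (hypothetical) micro-partition id of point i,
    assumed to come from an earlier partitioning step.
    A point is kept only if its micro-partition has at least
    min_size members; otherwise it is treated as an outlier.
    """
    sizes = Counter(labels)  # members per micro-partition
    kept, outliers = [], []
    for i, label in enumerate(labels):
        if sizes[label] >= min_size:
            kept.append(i)
        else:
            outliers.append(i)
    return kept, outliers


# Illustrative labels only: partitions 1 and 3 are singletons.
labels = [0, 0, 0, 1, 2, 2, 3, 0, 2]
kept, outliers = filter_micro_partitions(labels, min_size=2)
print(kept)      # indices of points in sufficiently large micro-partitions
print(outliers)  # indices of points dropped by the filter
```

Under these assumptions, the kept points would then form the filtered micro-partition space that the abstract says is agglomerated and clustered again; that second stage is not shown here.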
