Pairwise Exemplar Clustering

Yingzhen Yang; Xinqi Chu; Feng Liang; Thomas Huang

doi:10.1609/aaai.v26i1.8291

Authors

Yingzhen Yang, University of Illinois at Urbana-Champaign
Xinqi Chu, University of Illinois at Urbana-Champaign
Feng Liang, University of Illinois at Urbana-Champaign
Thomas Huang, University of Illinois at Urbana-Champaign

DOI:

https://doi.org/10.1609/aaai.v26i1.8291

Keywords:

Clustering, Kernel density estimation, Misclassification rate, Supervised learning, Message computation

Abstract

Exemplar-based clustering methods have been shown to be effective in many clustering problems. They adaptively determine the number of clusters and have the appealing advantage of not requiring the estimation of latent parameters, which is otherwise difficult for complicated parametric models and high-dimensional data. However, modeling an arbitrary underlying distribution of the data remains difficult for existing exemplar-based clustering methods. We present Pairwise Exemplar Clustering (PEC) to alleviate this problem by modeling the underlying cluster distributions more accurately with non-parametric kernel density estimation. Interpreting the clusters as classes from a supervised learning perspective, we search for an optimal partition of the data that balances two quantities: (1) the misclassification rate of the data partition, which measures how well the clusters are separated; and (2) the sum of within-cluster dissimilarities, which controls the cluster size. The widely used kernel form of cut turns out to be a special case of our formulation. Moreover, we optimize the corresponding objective function with a new, efficient algorithm for message computation in a pairwise Markov random field (MRF). Experimental results on synthetic and real data demonstrate the effectiveness of our method.
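The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation or its message-passing optimizer: it only scores a given partition. It assumes a Gaussian kernel, a leave-one-out plug-in classifier built from kernel density estimates, squared Euclidean distance to each cluster's exemplar as the dissimilarity, and a hypothetical trade-off weight `lam`. All function and parameter names are illustrative.

```python
import numpy as np

def gaussian_kernel_matrix(X, bandwidth):
    # Pairwise Gaussian kernel values between all rows of X.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def loo_kde_error(X, labels, bandwidth):
    # Leave-one-out error of a KDE plug-in classifier that treats
    # clusters as classes. The summed kernel mass from a cluster is
    # roughly proportional to (cluster prior) * (KDE class density).
    K = gaussian_kernel_matrix(X, bandwidth)
    np.fill_diagonal(K, 0.0)  # a point does not vote for itself
    clusters = np.unique(labels)
    scores = np.stack(
        [K[:, labels == c].sum(axis=1) for c in clusters], axis=1
    )
    predicted = clusters[np.argmax(scores, axis=1)]
    return float(np.mean(predicted != labels))

def within_cluster_dissimilarity(X, labels, exemplars):
    # Sum of squared distances from each point to its cluster's exemplar.
    # `exemplars` maps a cluster label to the row index of its exemplar.
    total = 0.0
    for c, e in exemplars.items():
        members = X[labels == c]
        total += np.sum((members - X[e]) ** 2)
    return float(total)

def partition_score(X, labels, exemplars, bandwidth=1.0, lam=0.1):
    # Illustrative objective: separation term plus weighted size term.
    # `lam` is an assumed trade-off weight, not a value from the paper.
    return (loo_kde_error(X, labels, bandwidth)
            + lam * within_cluster_dissimilarity(X, labels, exemplars))

X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
good = np.array([0, 0, 0, 1, 1, 1])
bad = np.array([0, 1, 0, 1, 0, 1])
exemplars = {0: 0, 1: 3}
good_score = partition_score(X, good, exemplars)
bad_score = partition_score(X, bad, exemplars)
```

On this toy data the partition that follows the two visible groups scores lower than the interleaved one, since it is both better separated and more compact. Searching over partitions and exemplars, which the paper does with message computation in a pairwise MRF, is outside the scope of this sketch.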
