View Submission - COMPSTAT2022
A0484
Title: $K$-means algorithm with positive and negative equivalence constraints
Authors: Igor Melnykov - University of Minnesota Duluth (United States) [presenting]
Abstract: A modification of the $K$-means algorithm is considered that accommodates two types of hard constraints often encountered in semi-supervised clustering. A positive equivalence constraint requires that the data points it binds be placed in the same class in the clustering solution. A negative equivalence constraint specifies which points must be separated from each other and placed in different classes. Although it is common to check constraints as an add-on to the basic $K$-means algorithm, the objective function is usually left unchanged in the process. In our approach, the constraints are included in the objective function itself, making any restrictions on the placement of points an integral part of the algorithm. The proposed methodology is illustrated with several examples, and its connection to model-based clustering is discussed.
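The two constraint types can be made concrete with a minimal Python sketch. It illustrates only the definitions above and the standard, unmodified $K$-means objective; it is not the proposed method, whose modified objective function is not stated in this abstract. The function names `satisfies_constraints` and `kmeans_objective`, and the representation of constraints as index pairs, are assumptions made for illustration.

```python
def satisfies_constraints(labels, positive, negative):
    """Return True if a labeling respects every hard constraint.

    labels   -- list where labels[i] is the class assigned to point i
    positive -- (i, j) index pairs that must share a class (assumed format)
    negative -- (i, j) index pairs that must be in different classes
    """
    same_ok = all(labels[i] == labels[j] for i, j in positive)
    diff_ok = all(labels[i] != labels[j] for i, j in negative)
    return same_ok and diff_ok


def kmeans_objective(points, labels, k):
    """Standard K-means objective: within-class sum of squared distances.

    This is the unchanged objective; it ignores the constraints entirely.
    """
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty class contributes nothing
        dim = len(members[0])
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - center[d]) ** 2 for p in members for d in range(dim))
    return total


# Toy data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
positive = [(0, 1)]  # points 0 and 1 must share a class
negative = [(0, 2)]  # points 0 and 2 must be separated

feasible = [0, 0, 1, 1]
infeasible = [0, 1, 1, 1]  # splits points 0 and 1

feasible_ok = satisfies_constraints(feasible, positive, negative)
infeasible_ok = satisfies_constraints(infeasible, positive, negative)
feasible_cost = kmeans_objective(points, feasible, k=2)
```

In an add-on scheme of the kind the abstract contrasts with, a check like `satisfies_constraints` would filter candidate assignments while `kmeans_objective` stays as written; the approach described here instead builds the constraints into the objective itself.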