Thesis

  1. Lianyu Hu, "Clustering based on Space Transformation", 2019 [link]

    supervised by Prof. Caiming Zhong

    Clustering based on Space Transformation (CST) is a novel framework for unified clustering. It improves the effectiveness and universality of a clustering algorithm across varied datasets, and it avoids the problem of selecting a suitable clustering algorithm and its parameters for an unknown dataset. CST replaces the original Euclidean distance matrix of a dataset with a new similarity matrix, obtained through a nonlinear mapping that is explicit and interpretable for clustering algorithms. In the transformed similarity space, more evidence of cluster structure is extracted, which allows traditional clustering algorithms to produce more accurate results. Surprisingly, CST is also robust to outliers. In this thesis, Normalized Cut is used to demonstrate the effectiveness and universality of CST. As a starting point, I studied two kinds of CST and tried to identify what is essential to validity in clustering. To choose a specific similarity from a candidate set of CSTs, a newly proposed internal validity index is used; designing a good internal validity index is the key to my studies. Finally, several mechanisms underlying clusters and clustering are brought together in a new internal validity index called CVDD. Experimental results show that CVDD outperforms some classic internal validity indices and can cope with challenging datasets, such as those with non-spherical clusters, density-separated clusters, or outliers.
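
The first step described in the abstract, turning a Euclidean distance matrix into a similarity matrix through a nonlinear mapping, can be illustrated with a minimal sketch. The abstract does not specify the mapping used in the thesis, so a Gaussian kernel is used here purely as a stand-in; the function names, the `sigma` parameter, and the sample points are all hypothetical.

```python
import math


def euclidean_distance_matrix(points):
    """Return the pairwise Euclidean distance matrix as a list of lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def gaussian_similarity(dist, sigma=1.0):
    """Map distances to similarities in (0, 1] with a Gaussian kernel.

    Assumption: this kernel is only a placeholder for a nonlinear mapping;
    it is not the transformation proposed in the thesis.
    """
    return [[math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row] for row in dist]


# Hypothetical toy data: two nearby points and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
D = euclidean_distance_matrix(points)
S = gaussian_similarity(D, sigma=1.0)
```

A matrix such as `S` could then be passed to a graph-based method like Normalized Cut in place of the raw distances. Nearby points receive similarities close to 1, while distant points receive similarities close to 0.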