The study introduces a novel dimensionality reduction algorithm, CBMAP (Clustering-Based Manifold Approximation and Projection), designed to address limitations of recent methods. Its primary objective is to preserve the structure of high-dimensional clusters after dimensionality reduction.
The key points of the CBMAP algorithm are:
CBMAP initiates clustering within the high-dimensional space to determine cluster centers, which are then utilized to compute membership values for each data point relative to these centers. During the data embedding process, CBMAP ensures that the membership values between low-dimensional cluster centers and data points mirror those obtained in the high-dimensional space. This methodology aids in preserving both the global data structure and the local cluster arrangement.
CBMAP is fast and scalable, and it has no hyperparameters that substantially change the algorithm's behavior. Moreover, CBMAP can project previously unseen test data into the low-dimensional space, which is highly desirable in machine learning applications.
Experimental evaluations on benchmark datasets demonstrate CBMAP's effectiveness in preserving both global and local structures compared to recent dimensionality reduction methods like t-SNE, UMAP, TriMap, and PaCMAP. CBMAP outperforms these methods in terms of global structure preservation while maintaining competitive performance in local structure preservation.
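The membership-matching idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the membership formula (normalized inverse squared distance, in the style of fuzzy c-means), the small k-means helper, the PCA-based candidate embedding, and all function names are assumptions chosen for illustration. The sketch only measures how closely low-dimensional memberships mirror the high-dimensional ones, which is the quantity an optimizer in the spirit of CBMAP would try to reduce.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns the cluster centers (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def memberships(X, centers, eps=1e-12):
    """Assumed membership: normalized inverse squared distance to each center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
    inv = 1.0 / d2
    return inv / inv.sum(axis=1, keepdims=True)

def membership_mismatch(U_high, U_low):
    """Mean squared difference between the two membership matrices."""
    return float(np.mean((U_high - U_low) ** 2))

# Synthetic data: three Gaussian blobs in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 10)) for c in (0.0, 3.0, 6.0)])

# Step 1: cluster in the high-dimensional space and compute memberships.
centers_high = kmeans(X, k=3)
U_high = memberships(X, centers_high)

# Step 2: build a candidate 2-D embedding (PCA here, purely as a stand-in)
# and map the cluster centers with the same projection.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:2].T
Y = (X - mean) @ P
centers_low = (centers_high - mean) @ P

# Step 3: compare low-dimensional memberships with the high-dimensional ones.
U_low = memberships(Y, centers_low)
loss = membership_mismatch(U_high, U_low)
print(f"membership mismatch: {loss:.6f}")
```

A lower mismatch means the embedding keeps each point's relationship to every cluster center, not just to its nearest one, which is why matching memberships constrains both the local cluster arrangement and the global layout.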
Key insights drawn from a paper by Berat Dogan, arxiv.org, 04-30-2024
https://arxiv.org/pdf/2404.17940.pdf