The study introduces CBMAP (Clustering-Based Manifold Approximation and Projection), a novel dimensionality reduction algorithm designed to address limitations of recent methods. CBMAP's primary objective is to preserve the structure of high-dimensional clusters after dimensionality reduction.
The key highlights of the CBMAP algorithm are:
CBMAP first clusters the data in the high-dimensional space to determine cluster centers, then uses these centers to compute membership values for each data point. During embedding, CBMAP ensures that the membership values between the low-dimensional cluster centers and the data points mirror those obtained in the high-dimensional space. This helps preserve both the global data structure and the local cluster arrangement.
CBMAP is fast and scalable, and it has no hyperparameters that substantially affect its behavior. It can also project test data into the low-dimensional space, which is highly desirable in machine learning applications.
Experimental evaluations on benchmark datasets compare CBMAP with recent dimensionality reduction methods such as t-SNE, UMAP, TriMap, and PaCMAP. CBMAP outperforms these methods in global structure preservation while remaining competitive in local structure preservation.
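The membership-matching idea described in the highlights above can be illustrated with a small NumPy sketch. This is not the paper's implementation, and every specific choice in it is an assumption made for illustration: plain k-means as the clustering step, an inverse-squared-distance membership function, PCA to place the low-dimensional centers, a squared-error loss minimized by simple gradient descent with step halving, and a membership-weighted average as a crude rule for projecting test data. All function names (`kmeans`, `memberships`, `embed`, `project_new`) and parameters (`k`, `dim`, `lr`) are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; a stand-in for whatever clustering CBMAP uses."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def memberships(P, centers, eps=1e-9):
    """Assumed membership: inverse squared distance, normalized per point."""
    w = 1.0 / (((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps)
    return w / w.sum(axis=1, keepdims=True)

def pca_project(A, dim):
    """Project the rows of A onto their top `dim` principal directions."""
    A0 = A - A.mean(axis=0)
    _, _, Vt = np.linalg.svd(A0, full_matrices=False)
    return A0 @ Vt[:dim].T

def loss_and_grad(Y, C_low, U_high, eps=1e-9):
    """Squared mismatch between low-d and high-d memberships, with gradient in Y."""
    diff = Y[:, None, :] - C_low[None, :, :]            # (n, k, dim)
    w = 1.0 / ((diff ** 2).sum(axis=2) + eps)           # (n, k)
    S = w.sum(axis=1, keepdims=True)
    U = w / S
    g = 2.0 * (U - U_high)                              # dL/dU
    gw = (g - (g * U).sum(axis=1, keepdims=True)) / S   # dL/dw
    grad = ((gw * -2.0 * w ** 2)[:, :, None] * diff).sum(axis=1)
    return ((U - U_high) ** 2).sum(), grad

def embed(X, k=3, dim=2, n_steps=200, lr=0.1, seed=0):
    C_high = kmeans(X, k, seed=seed)       # cluster in the high-d space
    U_high = memberships(X, C_high)        # target memberships
    C_low = pca_project(C_high, dim)       # assumed placement of low-d centers
    Y = U_high @ C_low                     # start from a membership-weighted mix
    loss, grad = loss_and_grad(Y, C_low, U_high)
    init_loss = loss
    for _ in range(n_steps):
        Y_try = Y - lr * grad
        loss_try, grad_try = loss_and_grad(Y_try, C_low, U_high)
        if loss_try < loss:                # accept only steps that reduce the loss
            Y, loss, grad = Y_try, loss_try, grad_try
        else:
            lr *= 0.5                      # otherwise shrink the step
    return Y, C_high, C_low, init_loss, loss

def project_new(X_new, C_high, C_low):
    """Assumed out-of-sample rule: reuse fitted centers, mix low-d centers by membership."""
    return memberships(X_new, C_high) @ C_low

# Toy data: three Gaussian blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=0.3, size=(40, 5)) for m in (0.0, 3.0, 6.0)])
Y, C_high, C_low, init_loss, final_loss = embed(X)
Y_new = project_new(X[:5] + 0.01, C_high, C_low)
print(Y.shape, Y_new.shape, init_loss, final_loss)
```

Because the fitted high-dimensional and low-dimensional centers are kept, new points can be placed without refitting, which is the property the summary describes for test data. The sketch needs more clusters than target dimensions (`k > dim`) for the PCA placement of the centers to be meaningful.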
Key insights distilled from the source by Berat Dogan, arxiv.org, 04-30-2024
https://arxiv.org/pdf/2404.17940.pdf