q-SNE: Visualizing Data using q-Gaussian Distributed Stochastic Neighbor Embedding

Paper title

q-SNE: Visualizing Data using q-Gaussian Distributed Stochastic Neighbor Embedding

Authors

Motoshi Abe, Junichi Miyao, Takio Kurita

Abstract

Dimensionality reduction is widely used to make high-dimensional data usable for regression, classification, feature analysis, and visualization. Stochastic neighbor embedding (SNE) is one such technique: it produces strong visualizations of high-dimensional data by matching the similarities defined by local Gaussian distributions in the high-dimensional and low-dimensional spaces. t-distributed stochastic neighbor embedding (t-SNE) was introduced to improve on SNE; by using a t-distribution for the low-dimensional data, it gives more powerful and flexible 2- or 3-dimensional visualizations than SNE. More recently, uniform manifold approximation and projection (UMAP) was proposed as a dimensionality reduction technique. We present a novel technique called q-Gaussian distributed stochastic neighbor embedding (q-SNE). By using a q-Gaussian distribution for the low-dimensional data, q-SNE gives more powerful and flexible 2- or 3-dimensional visualizations than t-SNE and SNE. The q-Gaussian distribution includes the Gaussian distribution and the t-distribution as special cases, at q = 1.0 and q = 2.0 respectively. q-SNE can therefore reproduce both SNE and t-SNE by changing the parameter q, which makes it possible to find the best visualization by tuning q. We evaluate q-SNE on 2-dimensional visualization and on classification with a k-nearest neighbors (k-NN) classifier in the embedded space, comparing against SNE, t-SNE, and UMAP on the MNIST, COIL-20, OlivettiFaces, FashionMNIST, and Glove datasets.
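The statement that the q-Gaussian contains the Gaussian and the t-distribution as special cases can be checked numerically. The sketch below is not the authors' implementation. It assumes the standard q-exponential form of the kernel, e_q(-beta * d^2) = [1 + (q - 1) * beta * d^2]^(-1 / (q - 1)), with a hypothetical scale parameter `beta`; the exact parameterization used in the paper may differ. The function names `q_gaussian_kernel` and `low_dim_similarities` are illustrative only.

```python
import numpy as np


def q_gaussian_kernel(sq_dist, q, beta=1.0):
    """Unnormalized q-Gaussian kernel e_q(-beta * d^2).

    Assumed form (standard q-exponential), not taken from the paper:
        [1 + (q - 1) * beta * d^2] ** (-1 / (q - 1))
    The limit q -> 1 is the Gaussian kernel exp(-beta * d^2).
    """
    sq_dist = np.asarray(sq_dist, dtype=float)
    if np.isclose(q, 1.0):
        return np.exp(-beta * sq_dist)
    base = 1.0 + (q - 1.0) * beta * sq_dist
    # For q < 1 the kernel has compact support, so clip negative bases to zero.
    base = np.maximum(base, 0.0)
    return base ** (-1.0 / (q - 1.0))


def low_dim_similarities(Y, q, beta=1.0):
    """Pairwise similarities of embedded points Y (n x d), normalized to sum to 1."""
    Y = np.asarray(Y, dtype=float)
    diff = Y[:, None, :] - Y[None, :, :]      # (n, n, d) pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)      # (n, n) squared Euclidean distances
    W = q_gaussian_kernel(sq_dist, q, beta)
    np.fill_diagonal(W, 0.0)                  # a point is not its own neighbor
    return W / W.sum()


if __name__ == "__main__":
    d2 = np.array([0.0, 0.5, 1.0, 4.0])
    print("q = 2.0:", q_gaussian_kernel(d2, 2.0))   # same values as 1 / (1 + d^2)
    print("q = 1.0:", q_gaussian_kernel(d2, 1.0))   # same values as exp(-d^2)
```

Under this assumed form, q = 2.0 gives the heavy-tailed 1 / (1 + d^2) kernel that t-SNE uses in the low-dimensional space, and q = 1.0 gives a Gaussian kernel as in SNE. Values of q in between interpolate the tail weight, which is the knob the abstract describes tuning.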
