Machine Learning via Quantum Nearest-Centroid Algorithm for k-Means Clustering

k-means clustering is a popular machine learning algorithm that structures an unlabelled dataset into k classes. Finding the optimal k-means clustering is NP-hard, but methods that reduce the average-case complexity remain an open area of research. A popular way of classifying an input vector is to compare its distance to the centroid vector of each class (the centroid being the mean of the vectors already in that class). The vector is assigned to the class whose centroid is closest. We refer to this classification subroutine of k-means clustering as the nearest-centroid algorithm.
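The classical nearest-centroid rule described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation; the function names `centroids` and `classify` and the toy data are invented for the example.

```python
import numpy as np

def centroids(X, labels):
    """Return the class labels and the mean vector (centroid) of each class."""
    classes = np.unique(labels)
    cents = np.stack([X[labels == c].mean(axis=0) for c in classes])
    return classes, cents

def classify(u, classes, cents):
    """Assign u to the class whose centroid is closest in Euclidean distance."""
    dists = np.linalg.norm(cents - u, axis=1)
    return classes[np.argmin(dists)]

# Toy data: two well-separated classes in two dimensions.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

classes, cents = centroids(X, labels)
pred_far = classify(np.array([4.0, 4.5]), classes, cents)
pred_near = classify(np.array([0.5, 0.2]), classes, cents)
```

Computing the centroids touches every entry of every reference vector and each classification compares the new vector with every centroid, which is the classical cost the quantum algorithm aims to reduce.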


Seth Lloyd and others have constructed a quantum nearest-centroid algorithm, which only classifies vectors after the optimal clustering has been found. They show that the distance between an input vector, |u⟩, and the set of n reference vectors {|v_j⟩} of length m in class C can be efficiently calculated to within error ε in O(ε⁻¹ log nm) steps on a quantum computer. The algorithm works by constructing the state

$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle|0\rangle + \frac{1}{\sqrt{n}}\sum_{j=1}^{n}|v_j\rangle|j\rangle\right)$$

and performing a swap test with the state

$$|\Phi\rangle = \frac{1}{\sqrt{Z}}\left(|u|\,|0\rangle - \frac{1}{\sqrt{n}}\sum_{j=1}^{n}|v_j|\,|j\rangle\right)$$

where Z = |u|² + (1/n) Σ_j |v_j|².

The squared distance between the input vector and the average of the vectors in class C is then proportional to the probability of a successful swap test. The algorithm is repeated for each class until a desired confidence is reached, and the vector is classified into the class to which it has the shortest distance. The complexity arguments on the dependence on m were rigorously confirmed by Lloyd and others using the QPCA (quantum principal component analysis) construction for a support vector machine (SVM) algorithm. An SVM can roughly be thought of as a k-means clustering problem with k=2. The speedup arises because the classical computation of the required inner products takes O(nm)² steps.
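The link between the swap-test probability and the distance can be checked numerically on a classical machine. The sketch below is only a linear-algebra simulation under stated assumptions: the second state is taken with a relative minus sign between its two terms (needed for the overlap to give a difference rather than a sum), the data vectors are real, and the proportionality constant 2Z follows from the 1/√2 normalisation used here. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 5                      # vector length and number of reference vectors
u = rng.normal(size=m)           # input vector
V = rng.normal(size=(n, m))      # reference vectors of one class, one per row

u_norm = np.linalg.norm(u)
v_norms = np.linalg.norm(V, axis=1)
Z = u_norm**2 + (v_norms**2).mean()

# |Psi> stored as an (m, n+1) array: rows index the data register,
# column 0 is ancilla state |0> and columns 1..n are ancilla states |j>.
psi = np.zeros((m, n + 1))
psi[:, 0] = (u / u_norm) / np.sqrt(2)
psi[:, 1:] = (V / v_norms[:, None]).T / np.sqrt(2 * n)

# |Phi> lives on the ancilla register only (assumed minus sign on the sum).
phi = np.concatenate(([u_norm], -v_norms / np.sqrt(n))) / np.sqrt(Z)

# Probability of projecting the ancilla of |Psi> onto |Phi>.
overlap = np.linalg.norm(psi @ phi) ** 2
# A swap test accepts with probability (1 + overlap) / 2.
p_swap = 0.5 * (1.0 + overlap)

distance_sq = 2.0 * Z * overlap
direct = np.linalg.norm(u - V.mean(axis=0)) ** 2
```

In this simulation `distance_sq` agrees with the directly computed squared distance to the centroid. On a quantum device `overlap` would have to be estimated from repeated swap tests, which is where the ε dependence of the running time enters.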

The algorithm has some caveats: in particular, it only classifies data without performing the harder task of clustering, and it assumes access to a QRAM (quantum random access memory). In the same paper, Lloyd and others develop a k-means algorithm, including clustering, for implementation on an adiabatic quantum computer. The potential of this algorithm is hard to judge, and it is perhaps less promising given the current focus of the quantum computing field on circuit-based architectures.

Machine learning data is usually (very) high dimensional, containing redundant or irrelevant information. Thus, machine learning benefits from pre-processing data through statistical procedures such as principal component analysis (PCA). PCA reduces the dimensionality by transforming the data to a new set of uncorrelated variables (the principal components), of which the first few retain most of the variation present in the original dataset. The standard way to calculate the principal components boils down to finding the eigenvalues and eigenvectors of the data covariance matrix.
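As a point of comparison for the quantum version, classical PCA via the covariance matrix can be sketched as follows. The function name `pca` and the synthetic dataset are made up for the illustration; the dataset is built so that almost all of its variance lies along a single direction.

```python
import numpy as np

def pca(X, k):
    """Project X onto its first k principal components.

    Returns the projected data, all eigenvalues (descending) and the
    k leading eigenvectors of the sample covariance matrix.
    """
    Xc = X - X.mean(axis=0)                 # centre the data
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    w, V = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(w)[::-1]             # reorder to descending
    w, V = w[order], V[:, order]
    return Xc @ V[:, :k], w, V[:, :k]

# Synthetic 3-dimensional data that is essentially 1-dimensional plus noise.
rng = np.random.default_rng(0)
direction = np.array([[2.0, 1.0, 0.5]])
latent = rng.normal(size=(200, 1))
X = latent @ direction + 0.05 * rng.normal(size=(200, 3))

scores, eigvals, comps = pca(X, 1)
explained = eigvals[0] / eigvals.sum()      # fraction of variance retained
```

Here a single component retains nearly all of the variance, which is the regime (a few dominant eigenvalues) in which dimensionality reduction is worthwhile.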


Lloyd and others suggested a quantum version of PCA (QPCA). The core of the algorithm is the ability to exponentiate an arbitrary density matrix ρ efficiently. More specifically, given n copies of ρ, Lloyd and others propose a way to apply the unitary operator e^{−iρt} to any state σ with accuracy ε = O(t²/n). This is done using repeated infinitesimal applications of the swap operator on ρ ⊗ σ. Using phase estimation, the result is used to generate a state that can be sampled to obtain information on the eigenvectors and eigenvalues of ρ. The algorithm is most effective when ρ has a few large eigenvalues and can be represented well by its principal components. In this case, the restriction ρ′ of ρ to the subspace spanned by the principal components closely approximates ρ, such that ‖ρ − PρP‖ ≤ ε, where P is the projector onto that subspace. This method of QPCA allows construction of the eigenvectors and eigenvalues of the matrix ρ in time O(R log d), where d and R are the dimensions of the spaces spanned by ρ and ρ′ respectively. For low-rank matrices, this is an improvement over the classical algorithm that requires O(d) time. In a machine learning context, if ρ is the covariance matrix of the data, this procedure performs PCA in the desired fashion.
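The swap-based exponentiation step can be simulated classically for small matrices. The sketch below assumes the standard form of the trick: apply e^{−iSΔt} with Δt = t/n to ρ ⊗ σ, discard the register holding ρ, and repeat with a fresh copy of ρ. It uses the identity e^{−iSΔt} = cos(Δt) I − i sin(Δt) S, which holds because the swap operator S squares to the identity. The helper names are hypothetical, and the code only illustrates the O(t²/n) error scaling; it says nothing about quantum running time.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Random d x d density matrix (Hermitian, positive, unit trace)."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def swap_operator(d):
    """Matrix of the swap S|i>|j> = |j>|i> on two d-dimensional registers."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[i * d + j, j * d + i] = 1.0
    return S

def trace_out_first(M, d):
    """Partial trace over the first register of an operator on C^d (x) C^d."""
    return np.einsum('ikil->kl', M.reshape(d, d, d, d))

def simulate_exponentiation(rho, sigma, t, n):
    """Approximate e^{-i rho t} sigma e^{i rho t} using n copies of rho."""
    d = rho.shape[0]
    dt = t / n
    U = np.cos(dt) * np.eye(d * d) - 1j * np.sin(dt) * swap_operator(d)
    for _ in range(n):
        joint = np.kron(rho, sigma)          # fresh copy of rho each step
        sigma = trace_out_first(U @ joint @ U.conj().T, d)
    return sigma

def exact_evolution(rho, sigma, t):
    """Exact e^{-i rho t} sigma e^{i rho t} via eigendecomposition of rho."""
    w, V = np.linalg.eigh(rho)
    U = (V * np.exp(-1j * w * t)) @ V.conj().T
    return U @ sigma @ U.conj().T

rng = np.random.default_rng(0)
rho = random_density_matrix(2, rng)
sigma = random_density_matrix(2, rng)
target = exact_evolution(rho, sigma, 1.0)
sim50 = simulate_exponentiation(rho, sigma, 1.0, 50)
sim400 = simulate_exponentiation(rho, sigma, 1.0, 400)
err50 = np.linalg.norm(sim50 - target)
err400 = np.linalg.norm(sim400 - target)
```

Increasing the number of copies from 50 to 400 shrinks the deviation from the exact evolution, in line with the stated accuracy ε = O(t²/n).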

The QPCA algorithm has a number of caveats that need to be addressed before one can apply it to machine learning scenarios. For example, to gain a speedup, some of the eigenvalues of ρ need to be large (i.e. ρ needs to be well approximated by ρ′). For the case where all eigenvalues are equal and of size O(1/d), the algorithm reduces to scaling in time O(d), which offers no improvement over classical algorithms. Other aspects that need to be considered include the necessity of QRAM and the scaling of the algorithm with the allowed error ε. As yet, it is unclear how these requirements affect the applicability of the algorithm to real scenarios.
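The difference between a peaked and a flat spectrum can be made concrete with a small numerical check. The sketch below measures ‖ρ − PρP‖ in the trace norm (a choice made for this example; the text does not fix the norm), with P the projector onto the R leading eigenvectors. The spectra and helper names are illustrative.

```python
import numpy as np

def density_from_spectrum(spectrum, rng):
    """Real symmetric density matrix with the given eigenvalues."""
    d = len(spectrum)
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthogonal basis
    return (Q * spectrum) @ Q.T

def truncation_error(rho, R):
    """Trace norm of rho - P rho P, P projecting onto the top-R eigenvectors."""
    w, V = np.linalg.eigh(rho)                     # ascending eigenvalues
    top = V[:, -R:]
    P = top @ top.conj().T
    diff = rho - P @ rho @ P
    return np.abs(np.linalg.eigvalsh(diff)).sum()

rng = np.random.default_rng(1)
d, R = 8, 2
peaked = np.array([0.6, 0.35] + [0.05 / 6] * 6)    # two dominant eigenvalues
flat = np.full(d, 1.0 / d)                         # all eigenvalues equal 1/d

err_peaked = truncation_error(density_from_spectrum(peaked, rng), R)
err_flat = truncation_error(density_from_spectrum(flat, rng), R)
```

With the peaked spectrum the two leading components miss only 5% of the weight, whereas for the flat spectrum any two components miss 75% of it, so no small set of principal components represents ρ well.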
