Purely Random Correlations of the Matrix, or Studying Noise in Neural Networks


In its most general form, and in essentially all cases of practical interest, the n × n matrix W used to describe a complex system is constructed as

$$W = XY^{T} \tag{1}$$

where X and Y denote rectangular n × m matrices. Correlation matrices are one example; their standard form corresponds to Y = X. In this case one thinks of n observations or cases, each represented by an m-dimensional row vector $x_i$ ($y_i$), i = 1, …, n, and typically m is larger than n. In the limit of purely random correlations, the matrix W is said to be a Wishart matrix. The resulting eigenvalue density $\rho_W(\lambda)$ is known analytically, with the limits $\lambda_{\min} \le \lambda \le \lambda_{\max}$ prescribed by

$$\lambda_{\max,\min} = 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}, \qquad Q = \frac{m}{n} \ge 1.$$

The variance of the elements of $x_i$ is assumed here to be unity.
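The bounds above can be checked numerically. The following is a minimal sketch, assuming Gaussian entries of unit variance and the normalisation W = XX^T/m; the text does not state the normalisation, so that choice is an assumption, as are the helper name `spectrum_bounds` and the sizes n = 200, m = 800.

```python
import numpy as np

def spectrum_bounds(Q):
    """Edges of the eigenvalue density: 1 + 1/Q -/+ 2*sqrt(1/Q)."""
    centre = 1.0 + 1.0 / Q
    half_width = 2.0 * np.sqrt(1.0 / Q)
    return centre - half_width, centre + half_width

rng = np.random.default_rng(0)
n, m = 200, 800                      # illustrative sizes, Q = m/n = 4
X = rng.standard_normal((n, m))      # n rows, each an m-dimensional vector
W = X @ X.T / m                      # assumed normalisation (the Y = X case)

eigenvalues = np.linalg.eigvalsh(W)  # W is symmetric, so eigenvalues are real
lam_min, lam_max = spectrum_bounds(m / n)
print(f"predicted edges: [{lam_min:.3f}, {lam_max:.3f}]")
print(f"observed range:  [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
```

At finite n the extreme eigenvalues fluctuate slightly around the predicted edges, so the agreement is approximate rather than exact.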

The more general case, in which X and Y differ, results in asymmetric correlation matrices with complex eigenvalues λ. Here a limiting distribution corresponding to purely random correlations does not yet seem to be known analytically as a function of m/n. Indications are, however, that in the absence of correlations one may quite generically expect a largely uniform distribution of λ bounded by an ellipse in the complex plane.
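A quick numerical look at the asymmetric case, as a sketch only: X and Y are drawn as independent Gaussian matrices, and the normalisation by m is again an assumption rather than something the text specifies. The code only confirms that the spectrum is complex and confined to a bounded region; it does not test the shape of that region.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 800                      # illustrative sizes
X = rng.standard_normal((n, m))
Y = rng.standard_normal((n, m))      # independent of X: no true correlations
W = X @ Y.T / m                      # assumed normalisation

lam = np.linalg.eigvals(W)           # general solver for non-symmetric matrices
print("max asymmetry |W - W^T|:", np.abs(W - W.T).max())
print("largest |Im lambda|:", np.abs(lam.imag).max())
print("largest |lambda|:", np.abs(lam).max())
```

Plotting `lam.real` against `lam.imag` would show how the eigenvalues fill the complex plane for a given m/n.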

Further examples of matrices of similar structure, of great interest from the point of view of complexity, include the Hamiltonian matrices of strongly interacting quantum many-body systems such as atomic nuclei. This holds true at the level of bound states, where the problem is described by Hermitian matrices, as well as for excitations embedded in the continuum. The latter case can be formulated in terms of an open quantum system, which is represented by a complex non-Hermitian Hamiltonian matrix. Several neural network models also belong to this category of matrix structure. In this domain the reference is provided by the Gaussian (orthogonal, unitary, symplectic) ensembles of random matrices, with the semicircle law for the eigenvalue distribution. For irreversible processes there exists a complex version of these ensembles, with a special case, the so-called scattering ensemble, which accounts for S-matrix unitarity.

As noted above, several variants of random matrix ensembles provide an appropriate and natural reference for quantifying various characteristics of complexity. The bulk of such characteristics is expected to be consistent with Random Matrix Theory (RMT), and in fact there is strong evidence that it is. Once this is established, however, the deviations become even more interesting, especially those signaling the emergence of synchronous or coherent patterns, i.e., effects connected with a reduction of dimensionality. In matrix terminology such patterns can be associated with a significantly reduced rank k (k ≪ n) of a leading component of W. A satisfactory structure of the matrix, one that allows chaos or noise to coexist with collectivity, thus reads:

$$W = W_r + W_c \tag{2}$$

Of course, in the absence of $W_r$, the second term ($W_c$) of W generates k nonzero eigenvalues, and the remaining n − k constitute the zero modes. When $W_r$ enters as a noise correction (a random-like matrix), a trace of this effect is expected to remain: k large eigenvalues, and a bulk of n − k small eigenvalues whose distribution and fluctuations are consistent with an appropriate random matrix ensemble. One likely mechanism that may lead to such a segregation of the eigenspectrum is that m in eq. (1) is significantly smaller than n, or that the number of large components makes it effectively small at the level of the large entries w of W. Such an effective reduction of m ($M = m_{\mathrm{eff}}$) is then expressed by the following distribution P(w) of the large off-diagonal matrix elements, in the case that they are still generated by random-like processes:
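The separation into k large eigenvalues and a noisy bulk can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the noise part is taken as a symmetric Gaussian matrix scaled so that its bulk lies roughly in [−2, 2], and the collective part is built from k random orthonormal vectors with hand-picked strengths.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 300, 3                              # illustrative sizes, k << n

# W_r: symmetric Gaussian noise, off-diagonal variance 1/n
A = rng.standard_normal((n, n))
W_r = (A + A.T) / np.sqrt(2 * n)

# W_c: rank-k collective part from k orthonormal column vectors
U, _ = np.linalg.qr(rng.standard_normal((n, k)))
strengths = np.array([10.0, 8.0, 6.0])     # hypothetical collective strengths
W_c = (U * strengths) @ U.T

eig_c = np.linalg.eigvalsh(W_c)            # W_c alone: k nonzero, n - k zero modes
eig = np.linalg.eigvalsh(W_r + W_c)        # ascending order

print("nonzero eigenvalues of W_c:", int((np.abs(eig_c) > 1e-8).sum()))
print("top of the bulk:", eig[-k - 1])
print("k largest eigenvalues:", eig[-k:])
```

With these choices the k largest eigenvalues of W stay close to the strengths put into W_c, while the remaining n − k form the noise bulk.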

$$P(w) = \frac{|w|^{(M-1)/2}\, K_{(M-1)/2}(|w|)}{2^{(M-1)/2}\, \Gamma(M/2)\, \sqrt{\pi}} \tag{3}$$

where K stands for the modified Bessel function. Asymptotically, for large w, this leads to $P(w) \sim e^{-|w|}\,|w|^{M/2-1}$, and thus reflects an enhanced probability of the appearance of a few large off-diagonal matrix elements compared with a Gaussian distribution. Consistent with the central limit theorem, the distribution quickly converges to a Gaussian with increasing M.
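The heavy tails at small M, and their disappearance as M grows, can be seen by direct sampling. This sketch assumes that each matrix element w is a sum of M products of independent unit-variance Gaussians, as eq. (1) suggests with m replaced by M; that construction is an assumption here. For such a sum the excess kurtosis is 6/M, which follows from the Gaussian moments and serves as a simple measure of the distance from a Gaussian.

```python
import numpy as np

def excess_kurtosis(w):
    """Sample excess kurtosis; zero for a Gaussian."""
    c = w - w.mean()
    return np.mean(c**4) / np.mean(c**2) ** 2 - 3.0

rng = np.random.default_rng(0)
samples = 200_000
kurt = {}
for M in (1, 4, 16):
    x = rng.standard_normal((samples, M))
    y = rng.standard_normal((samples, M))
    w = np.sum(x * y, axis=1)        # one matrix element built from M terms
    kurt[M] = excess_kurtosis(w)
    print(f"M = {M:2d}: sample excess kurtosis {kurt[M]:.2f}, theory {6 / M:.2f}")
```

The estimate at M = 1 is the noisiest, because the tails there are heaviest.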

Based on several examples of natural complex dynamical systems, like the strongly interacting Fermi systems, the human brain and the financial markets, one could systematize evidence that such effects are indeed common to all the phenomena that intuitively can be qualified as complex.

