Random Uniform Deviate, or Correlation Dimension


A widely used dimension algorithm in data analysis is the correlation dimension. Fix m, a positive integer, and r, a positive real number. Given a time series of data u(1), u(2), …, u(N), from measurements equally spaced in time, form a sequence of vectors x(1), x(2), …, x(N − m + 1) in $\mathbb{R}^m$, defined by x(i) = [u(i), u(i + 1), …, u(i + m − 1)]. Next, define for each i, 1 ≤ i ≤ N − m + 1,

$$C_i^m(r) = \frac{\text{number of } j \text{ such that } d[x(i), x(j)] \le r}{N - m + 1} \qquad [1]$$

We must define d[x(i), x(j)] for vectors x(i) and x(j). We define

$$d[x(i), x(j)] = \max_{k = 1, 2, \ldots, m} \left( |u(i + k - 1) - u(j + k - 1)| \right) \qquad [2]$$

From the $C_i^m(r)$, define

$$C^m(r) = (N - m + 1)^{-1} \sum_{i=1}^{N - m + 1} C_i^m(r) \qquad [3]$$

and define

$$\beta_m = \lim_{r \to 0} \lim_{N \to \infty} \frac{\log C^m(r)}{\log r} \qquad [4]$$

The assertion is that for m sufficiently large, $\beta_m$ is the correlation dimension. Such a limiting slope has been shown to exist for the commonly studied chaotic attractors. This procedure has frequently been applied to experimental data; investigators seek a “scaling range” of r values for which $\log C^m(r)/\log r$ is nearly constant for large m, and they infer that this ratio is the correlation dimension. In some instances, investigators have concluded that this procedure establishes deterministic chaos.
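Definitions [1] to [3] translate into a few lines of array code. The following is a minimal Python sketch, not a reference implementation: the function name `correlation_sum`, the use of NumPy and the toy series are assumptions made here for illustration. It builds the max-norm distance of [2] one coordinate at a time and, as in [1], counts the self-match j = i.

```python
import numpy as np

def correlation_sum(u, m, r):
    """Return C^m(r) of Eq. [3] for the series u (hypothetical helper)."""
    u = np.asarray(u, dtype=float)
    n = len(u) - m + 1                      # number of vectors x(1), ..., x(N-m+1)
    dist = np.zeros((n, n))
    for k in range(m):                      # Eq. [2]: max over the m coordinates
        col = u[k:k + n]
        np.maximum(dist, np.abs(col[:, None] - col[None, :]), out=dist)
    c_i = (dist <= r).sum(axis=1) / n       # Eq. [1], self-match j = i included
    return c_i.mean()                       # Eq. [3]

# Toy series: for m = 2 the vectors are (0,1), (1,0), (0,1), (1,0).
u = [0.0, 1.0, 0.0, 1.0, 0.0]
c_small = correlation_sum(u, m=2, r=0.5)    # each vector matches itself and its twin
c_large = correlation_sum(u, m=2, r=1.0)    # every pair lies within distance 1
print(c_small, c_large)
```

The sketch stores the full matrix of pairwise distances, so it is only practical for series of a few thousand points. A dimension estimate would then be read off from how $\log C^m(r)$ varies with $\log r$ over a range of r.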

This latter conclusion is not necessarily correct: a converged, finite correlation dimension value does not guarantee that the defining process is deterministic. Consider the following stochastic process. Fix $0 \le p \le 1$. Define $X_j = \alpha^{-1/2} \sin(2\pi j/12)$ for all j, where $\alpha$ is specified below. Define $Y_j$ as a family of independent identically distributed (i.i.d.) real random variables, with uniform density on the interval $[-\sqrt{3}, \sqrt{3}]$. Define $Z_j$ as a family of i.i.d. random variables, with $Z_j = 1$ with probability $p$ and $Z_j = 0$ with probability $1 - p$. Set

$$\alpha = \left( \sum_{j=1}^{12} \sin^2(2\pi j/12) \right) / 12 \qquad [5]$$

and define $\mathrm{MIX}_j = (1 - Z_j) X_j + Z_j Y_j$. Intuitively, MIX(p) is generated by first ascertaining, for each j, whether the jth sample will be from the deterministic sine wave or from the random uniform deviate, with likelihood $(1 - p)$ of the former choice, then calculating either $X_j$ or $Y_j$. Increasing p marks a tendency towards greater system randomness.

We now show that almost surely $\beta_m$ in [4] equals 0 for all m for the MIX(p) process, p ≠ 1. Fix m, define $k(j) = (12m)j - 12m$, and define $N_j = 1$ if $(\mathrm{MIX}_{k(j)+1}, \ldots, \mathrm{MIX}_{k(j)+m}) = (X_1, \ldots, X_m)$, $N_j = 0$ otherwise. The $N_j$ are i.i.d. random variables, with the expected value of $N_j$ satisfying $E(N_j) \ge (1 - p)^m$. By the Strong Law of Large Numbers, almost surely

$$\lim_{N \to \infty} \sum_{j=1}^{N} N_j / N = E(N_1) \ge (1 - p)^m$$

Observe that $\left( \sum_{j=1}^{N} N_j / (12mN) \right)^2$ is a lower bound to $C^m(r)$, since $x(k(i)+1) = x(k(j)+1)$ if $N_i = N_j = 1$. Thus, for r < 1,

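$$\limsup_{N \to \infty} \frac{\log C^m(r)}{\log r} \le \frac{1}{\log r} \lim_{N \to \infty} \log \left( \sum_{j=1}^{N} N_j / (12mN) \right)^2 \le \frac{\log \left( (1 - p)^{2m} / (12m)^2 \right)}{\log r}$$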

Since $(1 - p)^{2m}/(12m)^2$ is independent of r, $\beta_m = \lim_{r \to 0} \lim_{N \to \infty} \log C^m(r)/\log r = 0$. Since $\beta_m \ne 0$ with probability 0 for each m, by countable additivity, almost surely $\beta_m = 0$ for all m.
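The argument can be checked numerically on a simulated MIX(p) series. The sketch below is an illustration under stated assumptions, not part of the derivation: the helper names `mix` and `correlation_sum`, the seed, the series length and the choice p = 0.3, m = 2 are all hypothetical. The sine is evaluated at j mod 12 so that samples one period apart are exactly equal in floating point; mathematically this is the same $X_j$.

```python
import numpy as np

def mix(p, n, rng):
    """Draw MIX_1, ..., MIX_n for mixing probability p (hypothetical helper)."""
    alpha = np.mean(np.sin(2 * np.pi * np.arange(1, 13) / 12) ** 2)  # Eq. [5], equals 1/2
    j = np.arange(1, n + 1)
    x = np.sin(2 * np.pi * (j % 12) / 12) / np.sqrt(alpha)  # deterministic X_j
    y = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n)        # uniform deviate Y_j
    z = rng.random(n) < p                                   # Z_j = 1 with probability p
    return np.where(z, y, x)                                # (1 - Z_j) X_j + Z_j Y_j

def correlation_sum(u, m, r):
    """C^m(r) of Eq. [3], using the max-norm distance of Eq. [2]."""
    n = len(u) - m + 1
    dist = np.zeros((n, n))
    for k in range(m):
        col = u[k:k + n]
        np.maximum(dist, np.abs(col[:, None] - col[None, :]), out=dist)
    return (dist <= r).mean()               # mean of the C_i^m(r) over all i

p, m = 0.3, 2
u = mix(p, 2000, np.random.default_rng(0))
bound = (1 - p) ** (2 * m) / (12 * m) ** 2  # limiting lower bound from the text
radii = [1e-3, 1e-6, 1e-9]
sums = [correlation_sum(u, m, r) for r in radii]
ratios = [np.log(c) / np.log(r) for c, r in zip(sums, radii)]
for r, c, q in zip(radii, sums, ratios):
    print(f"r={r:g}  C={c:.4f}  log C / log r = {q:.3f}  (bound {bound:.6f})")
```

Under these settings the correlation sum should stay near a constant as r shrinks, because exact repeats of the sine-wave windows keep it bounded away from 0. The ratio $\log C^m(r)/\log r$ then decays like $1/|\log r|$, in line with $\beta_m = 0$.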

The MIX(p) process can be motivated by considering an autonomous unit that produces sinusoidal output, surrounded by a world of interacting processes that in ensemble produces output that resembles noise relative to the timing of the unit. The extent to which the surrounding world interacts with the unit could be controlled by a gateway between the two, with a larger gateway admitting greater apparent noise to compete with the sinusoidal signal.

It is easy to show that, given a sequence $X_j$, a sequence of i.i.d. $Y_j$, defined by a density function and independent of the $X_j$, and $Z_j = X_j + Y_j$, then $Z_j$ has an infinite correlation dimension. It appears that correlation dimension distinguishes between correlated and uncorrelated successive iterates, with larger estimates of dimension corresponding to uncorrelated data. For a more complete interpretation of correlation dimension results, stochastic processes with correlated increments should be analyzed.

Error estimates in dimension calculations are commonly seen. In statistics, one presumes a specified underlying stochastic distribution to estimate misclassification probabilities. Without knowing the form of a distribution, or whether the system is deterministic or stochastic, one must be suspicious of error estimates. There often appears to be a desire to establish a noninteger dimension value, to give a fractal and chaotic interpretation to the result. But again, prior to a thorough study of the relationship between the geometric Hausdorff dimension and the time series formula labeled correlation dimension, it is speculation to draw conclusions from a noninteger correlation dimension value.
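The remark that uncorrelated data give larger dimension estimates can be illustrated on the i.i.d. uniform deviate alone. The following is a rough sketch under assumptions made here: the helper `correlation_sum`, the seed, the sample size and the two radii used for a finite-difference slope are illustrative choices, not taken from the text. Because [1] counts self-matches, the slopes at finite N are expected to fall somewhat short of m.

```python
import numpy as np

def correlation_sum(u, m, r):
    """C^m(r) of Eq. [3], using the max-norm distance of Eq. [2]."""
    n = len(u) - m + 1
    dist = np.zeros((n, n))
    for k in range(m):
        col = u[k:k + n]
        np.maximum(dist, np.abs(col[:, None] - col[None, :]), out=dist)
    return (dist <= r).mean()

rng = np.random.default_rng(1)
u = rng.uniform(-np.sqrt(3), np.sqrt(3), size=2000)   # i.i.d. samples, like Y_j
r1, r2 = 0.2, 0.4                                     # two radii for a local slope
slopes = []
for m in (1, 2, 3):
    c1, c2 = correlation_sum(u, m, r1), correlation_sum(u, m, r2)
    # Finite-difference slope of log C^m(r) against log r between r1 and r2.
    slopes.append((np.log(c2) - np.log(c1)) / (np.log(r2) - np.log(r1)))
    print(f"m={m}  local slope = {slopes[-1]:.2f}")
```

For this noise series the local slope should grow with the embedding dimension m instead of levelling off, which is the finite-sample signature of the unbounded dimension described above.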

