Lévy Process as Combination of a Brownian Motion with Drift and Infinite Sum of Independent Compound Poisson Processes: Introduction to Martingales. Part 4.

Every piecewise constant Lévy process Xt0 can be represented as the sum of its jumps, as in (2) below, for some Poisson random measure with intensity measure of the form ν(dx)dt, where ν is a finite measure defined by

ν(A) = E[#{t ∈ [0,1] : ∆Xt0 ≠ 0, ∆Xt0 ∈ A}], A ∈ B(Rd) —– (1)

Given a Brownian motion with drift γt + Wt, independent from X0, the sum Xt = Xt0 + γt + Wt defines another Lévy process, which can be decomposed as:

Xt = γt + Wt + ∑s∈[0,t] ΔXs = γt + Wt + ∫[0,t]×Rd x JX (ds × dx) —– (2)

where JX is a Poisson random measure on [0,∞[×Rd with intensity ν(dx)dt.

Can every Lévy process be represented in this form? Given a Lévy process Xt, we can still define its Lévy measure ν as above. ν(A) is still finite for any compact set A such that 0 ∉ A: if this were not true, the process would have an infinite number of jumps of finite size on [0, T], which contradicts the cadlag property. So ν defines a Radon measure on Rd \ {0}. But ν is not necessarily a finite measure: the above restriction still allows it to blow up at zero and X may have an infinite number of small jumps on [0, T]. In this case the sum of the jumps becomes an infinite series and its convergence imposes some conditions on the measure ν, under which we obtain a decomposition of X.

Let (Xt)t≥0 be a Lévy process on Rd and ν its Lévy measure.

ν is a Radon measure on Rd \ {0} and verifies:

∫|x|≤1 |x|2 ν(dx) < ∞

The jump measure of X, denoted by JX, is a Poisson random measure on [0,∞[×Rd with intensity measure ν(dx)dt.

∃ a vector γ and a d-dimensional Brownian motion (Bt)t≥0 with covariance matrix A such that

Xt = γt + Bt + Xtl + limε↓0 X’εt —– (3)


Xtl = ∫|x|≥1,s∈[0,t] x JX (ds × dx)

X’εt = ∫ε≤|x|<1,s∈[0,t] x {JX (ds × dx) – ν(dx)ds}

≡ ∫ε≤|x|<1,s∈[0,t] x J’X (ds × dx)

The terms in (3) are independent and the convergence in the last term is almost sure and uniform in t on [0,T].

The Lévy-Itô decomposition entails that for every Lévy process ∃ a vector γ, a positive semi-definite matrix A and a positive measure ν that uniquely determine its distribution. The triplet (A,ν,γ) is called the characteristic triplet or Lévy triplet of the process Xt. γt + Bt is a continuous Gaussian Lévy process; conversely, every Gaussian Lévy process is continuous, can be written in this form, and is described by two parameters: the drift γ and the covariance matrix A of the Brownian motion. The other two terms are discontinuous processes incorporating the jumps of Xt and are described by the Lévy measure ν. The condition ∫|y|≥1 ν(dy) < ∞ means that X has a finite number of jumps with absolute value larger than 1. So the sum

Xtl = ∑0≤s≤t, |∆Xs|≥1 ∆Xs

contains almost surely a finite number of terms and Xtl is a compound Poisson process. There is nothing special about the threshold ∆X = 1: for any ε > 0, the sum of jumps with amplitude between ε and 1:

Xεt = ∑0≤s≤t, 1>|∆Xs|≥ε ∆Xs = ∫ε≤|x|<1,s∈[0,t] x JX(ds × dx) —– (4)

is again a well-defined compound Poisson process. However, contrary to the compound Poisson case, ν can have a singularity at zero: there can be infinitely many small jumps and their sum does not necessarily converge. This prevents us from letting ε go to 0 directly in (4). In order to obtain convergence we have to center the remainder term, i.e., replace the jump integral by its compensated version,

X’εt = ∫ε≤|x|<1,s∈[0,t] x J’X (ds × dx) —– (5)

which is a martingale. While Xε can be interpreted as an infinite superposition of independent Poisson processes, X’ε should be seen as an infinite superposition of independent compensated, i.e., centered, Poisson processes to which a central-limit-type argument can be applied to show convergence. An important implication of the Lévy-Itô decomposition is that every Lévy process is a combination of a Brownian motion with drift and a possibly infinite sum of independent compound Poisson processes. This also means that every Lévy process can be approximated with arbitrary precision by a jump-diffusion process, that is, by the sum of a Brownian motion with drift and a compound Poisson process.
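The approximation just described, a Brownian motion with drift plus a compound Poisson process whose jumps are truncated below a cutoff ε, can be sketched numerically. The snippet below is a minimal illustration under assumed parameters: the Laplace jump law and the rate and scale values are arbitrary choices for the demonstration, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_levy(T=1.0, n=1000, gamma=0.1, sigma=0.3,
                  jump_rate=5.0, jump_scale=0.2, eps=1e-3):
    """Simulate one path of a jump-diffusion approximation to a Levy
    process: drift + Brownian motion + compound Poisson jumps of size
    >= eps (jump sizes drawn, as an assumption, from a Laplace law)."""
    dt = T / n
    t = np.linspace(0.0, T, n + 1)
    # Continuous Gaussian part: gamma*dt + sigma*dW per step
    dW = rng.normal(0.0, np.sqrt(dt), n)
    x = gamma * dt + sigma * dW
    # Compound Poisson part: jumps arrive at rate jump_rate
    n_jumps = rng.poisson(jump_rate * dt, n)
    for k in range(n):
        for _ in range(n_jumps[k]):
            size = rng.laplace(0.0, jump_scale)
            if abs(size) >= eps:      # discard jumps smaller than eps
                x[k] += size
    return t, np.concatenate([[0.0], np.cumsum(x)])

t, X = simulate_levy()
print(X[-1])
```

Shrinking eps while compensating the small jumps, as in (5), is what turns this finite approximation into the full Lévy-Itô decomposition.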

CUSUM Deceleration. Drunken Risibility.


CUSUM, or the cumulative sum, is used for detecting and monitoring abrupt changes. Let us introduce a measurable space (Ω, F), where Ω = R∞, F = ∪nFn and Fn = σ{Yi, i ∈ {0, 1, …, n}}. The law of the sequence Yi, i = 1, …, is defined by the family of probability measures {Pv}, v ∈ N*. In other words, the probability measure Pv for a given v > 0, playing the role of the change point, is the measure generated on Ω by the sequence Yi, i = 1, …, when the distribution of the Yi’s changes at time v. The probability measures P0 and P∞ are the measures generated on Ω by the random variables Yi when they have an identical distribution. In other words, the system defined by the sequence Yi undergoes a “regime change” from the distribution P∞ to the distribution P0 at the change point time v.

The CUSUM (cumulative sum control chart) statistic is defined as the maximum of the log-likelihood ratio of the measure Pv to the measure P on the σ-algebra Fn. That is,

Cn := max0≤v≤n log dPv/dP∞|Fn —– (1)

is the CUSUM statistic on the σ-algebra Fn. The CUSUM statistic process is then the collection of the CUSUM statistics {Cn} of (1) for n = 1, ….

The CUSUM stopping rule is then

T(h) := inf {n ≥ 0: max0≤v≤n log dPv/dP∞|Fn ≥ h} —– (2)

for some threshold h > 0. In the CUSUM stopping rule (2), the CUSUM statistic process of (1) is initialized at

C0 = 0 —– (3)

The CUSUM statistic process was first introduced by E. S. Page in the form that it takes when the sequence of random variables Yi is independent and Gaussian; that is, Yi ∼ N(μ, 1), i = 1, 2, …, with μ = μ0 for i < v and μ = μ1 for i ≥ v. Since its introduction by Page, the CUSUM statistic process of (1) and its associated CUSUM stopping time of (2) have been used in a plethora of applications where it is of interest to detect abrupt changes in the statistical behavior of observations in real time. Examples of such applications are signal processing, monitoring the outbreak of an epidemic, financial surveillance, and computer vision. The popularity of the CUSUM stopping time (2) is mainly due to its low complexity and optimality properties in both discrete and continuous time models.

Let Yi ∼ N(μ0, σ2), which change to Yi ∼ N(μ1, σ2) at the change point time v. We now proceed to derive the form of the CUSUM statistic process (1) and its associated CUSUM stopping time (2). Let us denote by φ(x) = (1/√2π) e−x2/2 the Gaussian kernel. For the sequence of random variables Yi given earlier,

Cn := max0≤v≤n log dPv/dP∞|Fn

= max0≤v≤n log [∏i=1v−1 (1/σ)φ((Yi – μ0)/σ) ∏i=vn (1/σ)φ((Yi – μ1)/σ)] / [∏i=1n (1/σ)φ((Yi – μ0)/σ)]

= (1/σ2) max0≤v≤n (μ1 – μ0) ∑i=vn [Yi – (μ1 + μ0)/2] —– (4)

In view of (3), let us initialize the sequence (4) at Y0 = (μ1 + μ0)/2 and distinguish two cases.

a) μ1 > μ0: divide out (μ1 – μ0), multiply by the constant σ2 in (4) and use (2) to obtain the CUSUM stopping time T+:

T+(h+) = inf {n ≥ 0: max0≤v≤n ∑i=vn [Yi – (μ1 + μ0)/2] ≥ h+} —– (5)

for an appropriately scaled threshold h+ > 0.

b) μ1 < μ0: divide out (μ1 – μ0), multiply by the constant σ2 in (4) and use (2) to obtain the CUSUM stopping time T–:

T–(h–) = inf {n ≥ 0: max0≤v≤n ∑i=vn [(μ1 + μ0)/2 – Yi] ≥ h–} —– (6)

for an appropriately scaled threshold h– > 0.

The sequences form a CUSUM according to the deviation of the monitored sequential observations from the average of their pre- and post-change means. Although the stopping times (5) and (6) can be derived by formal CUSUM regime change considerations, they may also be used as general nonparametric stopping rules directly applied to sequential observations.
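The stopping rule (5) admits a simple recursive implementation. The sketch below uses the standard recursion Cn = max(0, Cn−1 + Yn − (μ0 + μ1)/2), which reproduces the running maximum in (5) when initialized at C0 = 0, as in (3); the simulated data and the threshold h are illustrative assumptions.

```python
import numpy as np

def cusum_stopping_time(y, mu0, mu1, h):
    """One-sided CUSUM stopping rule T+ of eq. (5) for a Gaussian mean
    shift mu0 -> mu1 (mu1 > mu0). The recursion below equals
    max_{0<=v<=n} sum_{i=v}^n [y_i - (mu0+mu1)/2] clipped at zero."""
    k = 0.5 * (mu0 + mu1)            # reference value
    c = 0.0                          # C_0 = 0, eq. (3)
    for n, yn in enumerate(y, start=1):
        c = max(0.0, c + yn - k)
        if c >= h:
            return n                 # alarm raised at observation n
    return None                      # no alarm within the sample

rng = np.random.default_rng(1)
# Mean shifts from 0 to 1 after observation 100
y = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
print(cusum_stopping_time(y, mu0=0.0, mu1=1.0, h=10.0))
```

The rule for T– of (6) is obtained by feeding the negated observations and swapping the roles of μ0 and μ1.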

Quantum Field Theory and Evolution of Forward Rates in Quantitative Finance. Note Quote.


Applications of physics to finance are well known, as is the application of quantum mechanics to the theory of option pricing. Hence it is natural to utilize the formalism of quantum field theory to study the evolution of forward rates. Quantum field theory models of the term structure originated with Baaquie. The intuition behind quantum field theory models of the term structure stems from allowing each forward rate maturity to both evolve randomly and be imperfectly correlated with every other maturity. This may also be accomplished by increasing the number of random factors in the original HJM framework towards infinity. However, the infinite number of factors in a field theory model are linked via a single function that governs the correlation between forward rate maturities. Thus, instead of estimating additional volatility functions in a multifactor HJM framework, one additional parameter is sufficient for a field theory model to instill imperfect correlation between every forward rate maturity. As the correlation between forward rate maturities approaches unity, field theory models reduce to the standard one-factor HJM model. Therefore, the fundamental difference between finite factor HJM and field theory models is the minimal structure the latter requires to instill imperfect correlation between forward rates. The Heath-Jarrow-Morton framework refers to a class of models that are derived by directly modeling the dynamics of instantaneous forward rates. The central insight of this framework is to recognize that there is an explicit relationship between the drift and volatility parameters of the forward-rate dynamics in a no-arbitrage world. The familiar short-rate models can be derived in the HJM framework; in general, however, HJM models are non-Markovian. As a result, it is not possible to use the PDE-based computational approach for pricing derivatives. Instead, discrete-time HJM models and Monte Carlo methods are often used in practice.
Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. Their essential idea is using randomness to solve problems that might be deterministic in principle.

A Lagrangian is introduced to describe the field. The Lagrangian has the advantage over Brownian motion of being able to control fluctuations in the field, hence forward rates, with respect to maturity through the addition of a maturity dependent gradient as detailed in the definition below. The action of the field integrates the Lagrangian over time and when exponentiated and normalized serves as the probability distribution for forward rate curves. The propagator measures the correlation in the field and captures the effect the field at time t and maturity x has on maturity x′ at time t′. In the one factor HJM model, the propagator equals one which allows the quick recovery of one factor HJM results. Previous research has begun with the propagator or “correlation” function for the field instead of deriving this quantity from the Lagrangian. More importantly, the Lagrangian and its associated action generate a path integral that facilitates the solution of contingent claims and hedge parameters. However, previous term structure models have not defined the Lagrangian and are therefore unable to utilize the path integral in their applications. The Feynman path integral, path integral in short, is a fundamental quantity that provides a generating function for forward rate curves. Although crucial for pricing and hedging, the path integral has not appeared in previous term structure models with generalized continuous random processes.


Let t0 denote the current time and T the set of forward rate maturities with t0 ≤ T . The upper bound on the forward rate maturities is the constant TFR which constrains the forward rate maturities T to lie within the interval [t0, t0 + TFR].

To illustrate the field theory approach, the original finite factor HJM model is derived using field theory principles in appendix A. In the case of a one factor model, the derivation does not involve the propagator, as the propagator is identically one when forward rates are perfectly correlated. However, the propagator is nontrivial for field theory models, as it governs the imperfect correlation between forward rate maturities. Let A(t, x) be a two-dimensional field driving the evolution of forward rates f(t, x) through time. Following Baaquie, the Lagrangian of the field is defined as


The Lagrangian of the field equals

L[A] = −1/(2TFR) {A2(t, x) + (1/μ2)(∂A(t, x)/∂x)2} —– (1)

The definition is not unique; other Lagrangians exist and would imply different propagators. However, the Lagrangian in the definition is sufficient to explain the field-theory contribution ∂A(t, x)/∂x that controls field fluctuations in the direction of the forward rate maturity. The constant μ measures the strength of the fluctuations in the maturity direction. The Lagrangian in the definition implies the field is continuous, Gaussian, and Markovian. Forward rates involving the field are expressed below, where the drift and volatility functions satisfy the usual regularity conditions.

∂f(t,x)/∂t = α (t, x) + σ (t, x)A(t, x) —– (2)

The forward rate process in equation (2) incorporates existing term structure research on Brownian sheets, stochastic strings, etc., that have been used in previous continuous term structure models. Note that equation (2) is easily generalized to the K factor case by introducing K independent and identical fields Ai(t, x). Forward rates could then be defined as

∂f(t, x)/∂t = α (t, x) + ∑i=1K σi(t, x)Ai(t, x) —– (3)

However, a multifactor HJM model can be reproduced without introducing multiple fields. In fact, under specific correlation functions, the field theory model reduces to a multifactor HJM model without any additional fields to proxy for additional Brownian motions.
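To make the role of the field concrete, the evolution (2) can be discretized in time. The sketch below assumes an exponential correlation exp(−μ|x − x′|) for the field across maturities, which captures the qualitative decay implied by the Lagrangian (1) though it is not claimed to be its exact propagator; the initial curve and the values of α, σ and μ are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def propagate_forward_rates(f0, alpha, sigma, mu, dt, n_steps, x):
    """Euler discretization of eq. (2): df/dt = alpha + sigma * A(t, x).
    A is white in t and correlated across maturities x through an
    assumed exponential kernel D(x, x') = exp(-mu * |x - x'|)."""
    D = np.exp(-mu * np.abs(x[:, None] - x[None, :]))    # maturity correlation
    L = np.linalg.cholesky(D + 1e-10 * np.eye(len(x)))   # factorize kernel
    f = f0.copy()
    for _ in range(n_steps):
        A = L @ rng.normal(0.0, 1.0, len(x)) / np.sqrt(dt)  # field at step
        f = f + (alpha + sigma * A) * dt
    return f

x = np.linspace(0.0, 10.0, 50)        # forward rate maturities
f0 = 0.03 + 0.001 * x                 # initial forward curve (assumed)
f = propagate_forward_rates(f0, alpha=0.0, sigma=0.01, mu=0.5,
                            dt=1/252, n_steps=252, x=x)
print(f.mean())
```

As μ → 0 the kernel tends to all-ones, every maturity moves in lockstep, and the one-factor HJM case is recovered, matching the propagator-equals-one remark above.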


Lagrangian of Multifactor HJM

The Lagrangian describing the random process of a K-factor HJM model is given by

L[A] = −1/2 ∫∫ A(t, x) G−1(t, x, x′) A(t, x′) dx dx′ —– (4)


∂f(t, x)/∂t = α(t, x) + A(t, x)

and G−1(t, x, x′) denotes the inverse of the function

G(t, x, x′) = ∑i=1K σi(t, x) σi(t, x’) —– (5)

The above proposition is an interesting academic exercise to illustrate the parallel between field theory and traditional multifactor HJM models. However, multifactor HJM models have the disadvantages associated with a finite dimensional basis. Therefore, this approach is not pursued in later empirical work. In addition, it is possible for forward rates to be perfectly correlated within a segment of the forward rate curve but imperfectly correlated with forward rates in other segments. For example, one could designate short, medium, and long maturities of the forward rate curve. This situation is not identical to the multifactor HJM model but justifies certain market practices that distinguish between short, medium, and long term durations when hedging. However, more complicated correlation functions would be required, compromising model parsimony and reintroducing the same conceptual problems of finite factor models. Furthermore, there is little economic intuition to justify why the correlation between forward rates should be discontinuous.

What Drives Investment? Or How Responsible is Kelly’s Optimum Investment Fraction?


A reasonable way to describe asset price variations (on a given time-scale) is to assume them to be multiplicative random walks with log-normal steps. This comes from the assumption that growth rates of prices are more significant than their absolute variations. So we describe the price of a financial asset as a time-dependent multiplicative random process. We introduce a set of N Gaussian random variables xi(t) depending on a time parameter t. By this set, we define N independent multiplicative Gaussian random walks, whose discrete time evolution is given by

pi(t+1) = exi(t)pi(t) —– (1)

for i = 1,…,N, where each xi(t) is uncorrelated in time. To optimize an investment, one can choose different risk-return strategies. Here, by optimization we will mean the maximization of the typical capital growth rate of a portfolio. A capital W(t), invested into different financial assets that behave as multiplicative random walks, grows almost certainly at an exponential rate ⟨ln W(t+1)/W(t)⟩, where one must average over the distribution of the single multiplicative step. We assume that an investment is diversified according to Kelly’s optimum investment fraction, in order to maximize the typical capital growth rate over N assets with identical average return α = ⟨exi(t)⟩ − 1 and squared volatility ∆ = ⟨e2xi(t)⟩ − ⟨exi(t)⟩2. It should be noted that the Kelly capital growth criterion, which maximizes the expected log of final wealth, provides the strategy that maximizes long-run wealth growth asymptotically for repeated investments over time. However, one drawback is its very risky behavior due to the log’s essentially zero risk aversion; consequently it tends to suggest large concentrated investments or bets that can lead to high volatility in the short term. Many investors, hedge funds, and sports bettors use the criterion, and its seminal application is to a long sequence of favorable investment situations. On each asset, the investor will allocate a fraction fi of his capital, according to the return expected from that asset. The time evolution of the total capital is ruled by the following multiplicative process

W(t+1) = [1 + ∑i=1Nfi(exi(t) -1)] W(t) —– (2)

First, we consider the case of an unlimited investment, i.e. we put no restriction on the value of ∑i=1N fi. The typical growth rate

Vtyp = ⟨ln[1+  ∑i=1Nfi(exi -1)]⟩ —– (3)

of the investor’s capital can be calculated through the following 2nd-order expansion in exi − 1, if we assume that fluctuations of prices are small and uncorrelated, which seems quite reasonable:

Vtyp ≅ ∑i=1N [fi(⟨exi⟩ – 1) – (fi2/2)(⟨e2xi⟩ – 2⟨exi⟩ + 1)] —– (4)

By solving dVtyp/dfi = 0, it is easy to show that the optimal value for fi is fiopt(α, Δ) = α/(α2 + Δ) ∀ i. We assume that the investor has some ignorance about the real value of α, which we represent by a Gaussian fluctuation around the real value of α. In the investor’s mind, each asset is different because of this fluctuation αi = α + εi. The εi are drawn from the same distribution, with ⟨εi⟩ = 0, as errors are normally distributed around the real value. We suppose that the investor makes an effort E to investigate and get information about the statistical parameters of the N assets upon which he will spread his capital. So, his ignorance (i.e. the width of the distribution of the εi) about the real value of αi will be a decreasing function of the effort “per asset” E/N; moreover, we suppose that even an infinite effort will not make this ignorance vanish. In order to plug these assumptions into the model, we write the width of the distribution of ε as

⟨εi2⟩ = D0 + (N/E)γ —– (5)

with γ > 0. As one can see, the greater is E, the more exact is the perception, and the better is the investment. D0 is the asymptotic ignorance. All the invested fractions fiopt(αi, Δ) will be different, according to the investor’s perception. Assuming that the εi are small, we expand all fi(α + εi) in equation (4) up to the 2nd order in εi, and after averaging over the distribution of εi, we obtain the mean value of the typical capital growth rate for an investor who provides a given effort E:

Vtyp = N[A − (D0 + (N/E)γ )B] —– (6)


A = α(3Δ – α2)/(α2 + Δ)3,  B = (α2 – Δ)2/[2(α2 + Δ)3] —– (7)

We are now able to find the optimal number of assets to be included in the portfolio (i.e., for which the investment is most advantageous, taking into account the effort provided to get information). By solving d/dN Vtyp = 0, it is easy to see that the optimal number of assets is given by

Nopt(E) = E {(A – D0B)/[(1 + γ)B]}1/γ —– (8)

which is an increasing function of the effort E. If the investor has no limit on the total capital fraction invested in the portfolio (so that it can be greater than 1, i.e. the investor can invest more money than he has, borrowing it from an external source), the capital can take negative values if the assets included in the portfolio encounter a simultaneous negative step. So, if the total investment fraction is greater than 1, we should also take into account the cost of refunding losses to the bank in order to predict the typical growth rate of the capital.
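The closed-form quantities above are easy to compute. The sketch below evaluates fiopt = α/(α2 + Δ) and Nopt(E) of (8); it takes B with the positive sign (α2 − Δ)2/[2(α2 + Δ)3], an assumption needed for ignorance to penalize the growth rate in (6), and the numeric values are purely illustrative.

```python
def kelly_fraction(alpha, delta):
    """Optimal fraction f_opt = alpha / (alpha**2 + delta) obtained from
    the 2nd-order expansion of the typical growth rate, eq. (4)."""
    return alpha / (alpha**2 + delta)

def optimal_n_assets(E, alpha, delta, D0, gamma):
    """N_opt(E) of eq. (8), with A and B the constants of eq. (7).
    Assumes A - D0*B > 0, i.e. some investment is worthwhile."""
    A = alpha * (3 * delta - alpha**2) / (alpha**2 + delta)**3
    B = (alpha**2 - delta)**2 / (2 * (alpha**2 + delta)**3)
    return E * ((A - D0 * B) / ((1 + gamma) * B)) ** (1 / gamma)

# Illustrative parameters: 5% mean return, 4% squared volatility
print(kelly_fraction(0.05, 0.04))
print(optimal_n_assets(10.0, 0.05, 0.04, 1.0, 1.0))
```

Note that fiopt can exceed 1, which is exactly the leveraged regime discussed above where the investor borrows from an external source.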

Comment on Purely Random Correlations of the Matrix, or Studying Noise in Neural Networks


In the presence of two-body interactions the many-body Hamiltonian matrix elements vJα,α′ of good total angular momentum J in the shell-model basis |α⟩ generated by the mean field, can be expressed as follows:

vJα,α′ = ∑J’ii’ cJαα’J’ii’ gJ’ii’ —– (4)

The summation runs over all combinations of the two-particle states |i⟩ coupled to the angular momentum J′ and connected by the two-body interaction g. The analogy of this structure to the one schematically captured by the eq. (2) is evident. gJ’ii’ denote here the radial parts of the corresponding two-body matrix elements while cJαα’J’ii’ globally represent elements of the angular momentum recoupling geometry. gJ’ii’ are drawn from a Gaussian distribution while the geometry expressed by cJαα’J’ii’ enters explicitly. This originates from the fact that a quasi-random coupling of individual spins results in the so-called geometric chaoticity and thus the cJαα’J’ii’ coefficients are also Gaussian distributed. In this case, these two essentially random ingredients (gJ’ii’ and cJαα’J’ii’) lead however to an order of magnitude larger separation of the ground state from the remaining states as compared to a pure Random Matrix Theory (RMT) limit. Due to more severe selection rules the effect of geometric chaoticity does not apply for J = 0. Consistently, the ground state energy gaps measured relative to the average level spacing characteristic for a given J are larger for J > 0 than for J = 0, and also J > 0 ground states are more orderly than those for J = 0, as can be quantified in terms of the information entropy.

Interestingly, such reductions of dimensionality of the Hamiltonian matrix can also be seen locally in explicit calculations with realistic (non-random) nuclear interactions. A collective state, the one which turns out to be coherent with some operator representing a physical external field, is always surrounded by a reduced density of states, i.e., it repels the other states. In all those cases, the global fluctuation characteristics remain however largely consistent with the corresponding version of the random matrix ensemble.

Recently, a broad arena of applicability of the random matrix theory opens in connection with the most complex systems known to exist in the universe. Without doubt, the most complex is the human brain and those phenomena that result from its activity. From the physics point of view, the financial world, reflecting such activity, is of particular interest because its characteristics are quantified directly in terms of numbers and a huge amount of electronically stored financial data is readily available. Access to the activity of a single brain is also possible by detecting the electric or magnetic fields generated by the neuronal currents. With the present-day techniques of electro- or magnetoencephalography, it is possible in this way to generate time series which resolve neuronal activity down to the scale of 1 ms.

One may debate over what is more complex, the human brain or the financial world, and there is no unique answer. It seems however to us that it is the financial world that is even more complex. After all, it involves the activity of many human brains and it seems even less predictable due to more frequent changes between different modes of action. Noise is of course overwhelming in either of these systems, as can be inferred from the structure of eigenspectra of the correlation matrices taken across different space areas at the same time, or across different time intervals. There however always exist several well-identifiable deviations, which, with the help of reference to the universal characteristics of the random matrix theory, and with the methodology briefly reviewed above, can be classified as real correlations or collectivity. An easily identifiable gap between the corresponding eigenvalues of the correlation matrix and the bulk of its eigenspectrum plays the central role in this connection. The brain, when responding to sensory stimulation, develops larger gaps than the brain at rest. The correlation matrix formalism in its most general asymmetric form allows one to study also the time-delayed correlations, like the ones between the opposite hemispheres. The time delay reflecting the maximum of correlation (the time needed for information to be transmitted between the different sensory areas in the brain) is also associated with the appearance of one significantly larger eigenvalue. Similar effects appear to govern the formation of heteropolymeric biomolecules. The ones that nature makes use of are separated by an energy gap from the purely random sequences.


Purely Random Correlations of the Matrix, or Studying Noise in Neural Networks


Expressed in the most general form, in essentially all the cases of practical interest, the n × n matrices W used to describe the complex system are by construction designed as

W = XYT —– (1)

where X and Y denote rectangular n × m matrices. Such, for instance, are the correlation matrices whose standard form corresponds to Y = X. In this case one thinks of n observations or cases, each represented by an m-dimensional row vector xi (yi), (i = 1, …, n), and typically m is larger than n. In the limit of purely random correlations the matrix W is then said to be a Wishart matrix. The resulting density ρW(λ) of eigenvalues is here known analytically, with the limits (λmin ≤ λ ≤ λmax) prescribed by

λmax,min = 1 + 1/Q ± 2√(1/Q), where Q = m/n ≥ 1.

The variance of the elements of xi is here assumed unity.
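These Marchenko-Pastur limits can be checked with a quick simulation; the sizes n and m below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)

n, m = 200, 1000                      # n cases, m-dimensional vectors, Q = m/n
X = rng.normal(0.0, 1.0, (n, m))      # unit-variance entries, as assumed above
W = X @ X.T / m                       # Wishart-type correlation matrix, eq. (1)
eigs = np.linalg.eigvalsh(W)

Q = m / n
lam_min = 1 + 1/Q - 2 * np.sqrt(1/Q)  # lower edge of the eigenvalue density
lam_max = 1 + 1/Q + 2 * np.sqrt(1/Q)  # upper edge
print(eigs.min(), eigs.max(), lam_min, lam_max)
```

For finite n the extreme eigenvalues fluctuate slightly around the predicted edges, but the bulk already fills the interval [λmin, λmax] closely at these sizes.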

The more general case, of X and Y different, results in asymmetric correlation matrices with complex eigenvalues λ. In this more general case a limiting distribution corresponding to purely random correlations seems not to be yet known analytically as a function of m/n. It indicates however that in the case of no correlations, quite generically, one may expect a largely uniform distribution of λ bound in an ellipse on the complex plane.

Further examples of matrices of similar structure, of great interest from the point of view of complexity, include the Hamiltonian matrices of strongly interacting quantum many body systems such as atomic nuclei. This holds true on the level of bound states where the problem is described by the Hermitian matrices, as well as for excitations embedded in the continuum. This latter case can be formulated in terms of an open quantum system, which is represented by a complex non-Hermitian Hamiltonian matrix. Several neural network models also belong to this category of matrix structure. In this domain the reference is provided by the Gaussian (orthogonal, unitary, symplectic) ensembles of random matrices with the semi-circle law for the eigenvalue distribution. For the irreversible processes there exists their complex version with a special case, the so-called scattering ensemble, which accounts for S-matrix unitarity.

As it has already been expressed above, several variants of ensembles of the random matrices provide an appropriate and natural reference for quantifying various characteristics of complexity. The bulk of such characteristics is expected to be consistent with Random Matrix Theory (RMT), and in fact there exists strong evidence that it is. Once this is established, even more interesting are however deviations, especially those signaling emergence of synchronous or coherent patterns, i.e., the effects connected with the reduction of dimensionality. In the matrix terminology such patterns can thus be associated with a significantly reduced rank k (thus k ≪ n) of a leading component of W. A satisfactory structure of the matrix that would allow some coexistence of chaos or noise and of collectivity thus reads:

W = Wr + Wc —– (2)

Of course, in the absence of Wr, the second term (Wc) of W generates k nonzero eigenvalues, and all the remaining ones (n − k) constitute the zero modes. When Wr enters as a noise (random-like matrix) correction, a trace of the above effect is expected to remain, i.e., k large eigenvalues and the bulk composed of n − k small eigenvalues whose distribution and fluctuations are consistent with an appropriate version of the random matrix ensemble. One likely mechanism that may lead to such a segregation of eigenspectra is that m in eq. (1) is significantly smaller than n, or that the number of large components makes it effectively small on the level of the large entries w of W. Such an effective reduction of m (M = meff) is then expressed by the following distribution P(w) of the large off-diagonal matrix elements in the case they are still generated by random-like processes

P(w) = (|w|(M-1)/2K(M-1)/2(|w|))/(2(M-1)/2Γ(M/2)√π) —– (3)

where K stands for the modified Bessel function. Asymptotically, for large w, this leads to P(w) ∼ e−|w| |w|M/2−1, and thus reflects an enhanced probability of appearance of a few large off-diagonal matrix elements as compared to a Gaussian distribution. Consistent with the central limit theorem, the distribution quickly converges to a Gaussian with increasing M.
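The eigenvalue segregation implied by eq. (2), namely k large collective eigenvalues detached from a random bulk, can be illustrated with a symmetric random-matrix noise term plus a rank-k component; the rank, the coupling strength 10.0 and the matrix size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

n, k = 300, 2                          # matrix size and collective rank
# Wr: symmetric noise with a semicircle bulk of radius ~sqrt(2)
Wr = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))
Wr = (Wr + Wr.T) / 2
# Wc: rank-k collective component built from k near-orthogonal directions
v = rng.normal(0.0, 1.0, (n, k))
v /= np.linalg.norm(v, axis=0)
Wc = 10.0 * v @ v.T
eigs = np.linalg.eigvalsh(Wr + Wc)     # ascending order
print(eigs[-k:], eigs[:-k].max())      # k detached eigenvalues vs. bulk edge
```

The k top eigenvalues sit near the collective strength while the remaining n − k stay inside the noise bulk, which is the gap structure used above to identify real correlations.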

Based on several examples of natural complex dynamical systems, like the strongly interacting Fermi systems, the human brain and the financial markets, one could systematize evidence that such effects are indeed common to all the phenomena that intuitively can be qualified as complex.

Phenomenological Model for Stock Portfolios. Note Quote.


The data analysis and modeling of financial markets have been hot research subjects for physicists, as well as economists and mathematicians, in recent years. The non-Gaussian property of the probability distributions of price changes in stock markets and foreign exchange markets has been one of the main problems in this field. From the analysis of high-frequency time series of market indices, a universal property was found in the probability distributions. The central part of the distribution agrees well with the Lévy stable distribution, while the tail deviates from it and shows another power-law asymptotic behavior. In probability theory, a distribution or a random variable is said to be stable if a linear combination of two independent copies of a random sample has the same distribution up to location and scale parameters. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution, after Paul Lévy, the first mathematician to have studied it. The scaling property on the sampling time interval of the data is also well described by the crossover of the two distributions. Several stochastic models of the fluctuation dynamics of stock prices have been proposed that reproduce the power-law behavior of the probability density. The auto-correlation of financial time series is also an important problem for markets. There is no time correlation of price changes on the daily scale, while more detailed data analysis found an exponential decay with a characteristic time τ = 4 minutes. The fact that there is no auto-correlation on the daily scale is not equal to the independence of the time series on that scale. In fact, there is auto-correlation of volatility (the absolute value of price change) with a power-law tail.

A portfolio is a set of stock issues. The Hamiltonian of the system is introduced and is expressed by spin-spin interactions, as in spin glass models of disordered magnetic systems. The interaction coefficients between two stocks are phenomenologically determined from empirical data. They are derived from the covariance of sequences of up and down spins using the fluctuation-response theorem. We start with the Hamiltonian expression of our system, which contains N stock issues. It is a function of the configuration S consisting of N coded price changes Si (i = 1, 2, …, N) at equal trading times. The interaction coefficients are also dynamical variables, because the interactions between stocks are thought to change from time to time. We divide a coefficient into two parts: the constant part Jij, which will be phenomenologically determined later, and the dynamical part δJij. The Hamiltonian including the interaction with external fields hi (i = 1, 2, …, N) is defined as

H[S, δJ, h] = ∑<i,j>[δJij²/(2Δij) − (Jij + δJij)SiSj] − ∑i hiSi —– (1)

The summation is taken over all pairs of stock issues. This form of Hamiltonian is that of an annealed spin glass. The fluctuations δJij are assumed to be distributed according to a Gaussian function. The main task of statistical physics is the evaluation of the partition function, which in this case is given by the functional

Z[h] = ∑{Si} ∫ ∏<i,j> dδJij/√(2πΔij) e^(−H[S, δJ, h]) —– (2)

The integration over the variables δJij is easily performed and gives

Z[h] = A ∑{Si} e^(−Heff[S, h]) —– (3)

Here the effective Hamiltonian Heff[S,h] is defined as

Heff[S, h] = −∑<i,j> JijSiSj − ∑i hiSi —– (4)

and A = e^(∑<i,j> Δij/2) is just a normalization factor, which is irrelevant to the following steps. This form of Hamiltonian with constant Jij is that of a quenched spin glass.
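The Gaussian integration that produces the constant factor can be checked numerically; a minimal sketch (the value Δ = 0.4 and the grid are arbitrary choices):

```python
import numpy as np

# Numerical check of the Gaussian integration leading from (2) to (3):
# for S_i*S_j = ±1, integrating out dJ with weight
# exp(-dJ^2/(2*Delta) + dJ*S_i*S_j)/sqrt(2*pi*Delta) gives exp(Delta/2)
# regardless of the spin configuration, hence A is a pure constant.
Delta = 0.4
grid = np.linspace(-10.0, 10.0, 200001)
step = grid[1] - grid[0]
vals = []
for s in (+1.0, -1.0):  # s = S_i * S_j
    integrand = np.exp(-grid**2 / (2 * Delta) + grid * s) / np.sqrt(2 * np.pi * Delta)
    vals.append(np.sum(integrand) * step)  # simple Riemann sum
print(vals, np.exp(Delta / 2))  # both values ≈ exp(Delta/2)
```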

The constant interaction coefficients Jij are still undetermined. To determine them we use the fluctuation-response theorem, which relates the susceptibility χij to the covariance Cij of the dynamical variables through the equation

χij = ∂mi/∂hj |h=0 = Cij —– (5)

The Thouless-Anderson-Palmer (TAP) equation for the quenched spin glass is

mi = tanh(∑j Jijmj + hi − ∑j Jij²(1 − mj²)mi) —– (6)
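Equation (6) can be solved by fixed-point iteration; a minimal sketch with small random couplings (chosen weak so the iteration contracts; all sizes and scales are illustrative):

```python
import numpy as np

# Fixed-point iteration for the TAP equation (6): repeatedly apply
# m_i = tanh(sum_j J_ij m_j + h_i - sum_j J_ij^2 (1 - m_j^2) m_i)
# until the magnetizations stop changing.
rng = np.random.default_rng(4)
N = 6
J = 0.3 * rng.standard_normal((N, N)) / np.sqrt(N)
J = (J + J.T) / 2          # symmetric couplings
np.fill_diagonal(J, 0.0)   # no self-interaction
h = 0.1 * np.ones(N)
m = 0.01 * rng.standard_normal(N)
for _ in range(1000):
    onsager = (J**2 @ (1 - m**2)) * m   # reaction (Onsager) term
    m_new = np.tanh(J @ m + h - onsager)
    if np.max(np.abs(m_new - m)) < 1e-12:
        break
    m = m_new
print(m)  # self-consistent magnetizations, all in (-1, 1)
```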

Equation (5) and the linear approximation of equation (6) yield the equation

∑k (Kik − Jik)Ckj = δij —– (7)

where Kik = δik(1 + ∑l Jil²).

Interpreting Cij as the time average of empirical data over an observation time rather than as an ensemble average, the constant interaction coefficients Jij are phenomenologically determined by equation (7).
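A sketch of this determination on synthetic data: since the diagonal part of (7) is carried by Kik, the off-diagonal couplings follow from inverting the empirical covariance, Jik = −(C⁻¹)ik for i ≠ k. The toy spin series below stand in for real up/down coded price changes:

```python
import numpy as np

# Phenomenological determination of J_ij from data: the fluctuation-response
# relation (5) with the linearized TAP equation (7) gives, off-diagonally,
# J_ik = -(C^{-1})_ik, where C is the empirical covariance of the coded
# price changes S_i = ±1.
rng = np.random.default_rng(1)
# Toy data: 4 correlated ±1 series over 5000 time steps, coupled through
# a common latent factor (a stand-in for a common market mode).
latent = rng.standard_normal((5000, 1))
S = np.sign(0.6 * latent + rng.standard_normal((5000, 4)))
C = np.cov(S, rowvar=False)
J = -np.linalg.inv(C)
np.fill_diagonal(J, 0.0)   # diagonal absorbed into K_ik; no self-coupling
print(J)  # symmetric, positive off-diagonal couplings for this toy data
```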

The energy spectrum of the system, called simply the portfolio energy, is defined by the eigenvalues of the Hamiltonian Heff[S, 0]. The probability density of the portfolio energy can be obtained in two ways. We can calculate the probability density from data by the equation

p(E) ΔE = P(E − ΔE/2 ≤ E′ ≤ E + ΔE/2) —– (8)

where E′ denotes the observed portfolio energy.
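Equation (8) amounts to a normalized histogram of the sampled portfolio energies; a minimal sketch with random couplings and spins (all quantities illustrative):

```python
import numpy as np

# Empirical estimate of the portfolio-energy density (8): evaluate
# E_t = -(1/2) * sum_{i,j} J_ij S_i(t) S_j(t) for each time step
# (the 1/2 counts each pair <i,j> once), then bin the values so that
# p(E)*dE approximates the probability of each energy bin.
rng = np.random.default_rng(2)
N, T = 10, 4000
J = rng.standard_normal((N, N)) / N
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
S = rng.choice([-1.0, 1.0], size=(T, N))      # coded price changes
E = -0.5 * np.einsum('ti,ij,tj->t', S, J, S)  # energy per time step
p, edges = np.histogram(E, bins=30, density=True)
print(p.sum() * (edges[1] - edges[0]))  # ≈ 1: normalized density
```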

This is a fully consistent phenomenological model for stock portfolios, expressed by the effective Hamiltonian (4). The model should also be applicable to other financial markets that show collective time evolution, e.g., foreign exchange markets, options markets, and inter-market interactions.

Spreading Dynamics Over Trading Prices in the Market


A market time series can be seen as a composite of a set of M interacting dynamical sub-systems. Investors make their trading decisions according to their portfolio and market strategies, shaping the prices of the traded stocks. Over time, the prices depict the dynamical processes within the collective behavior of the investors. The vicissitudes of one price can affect the dynamics of other prices through the investors' portfolios. Capturing the dynamics of spreading ups and downs within the market amounts to observing the information flow from one price to another. For instance, take a source sub-system 𝒴(t) as the origin of information affecting another sub-system 𝒳(t), and collect the remaining sub-systems in the vector 𝒵(t). From information theory, we know that the differential entropy of a random vector 𝒳 is defined as

h(𝒳(t)) = −∫ p(x) ln p(x) dx

as the random vector takes values in ℝd with probability density function p(x). When the random variable 𝒳(t) is discrete, taking values x ∈ {x1, x2, …, xn}, the entropy is

H(𝒳(t)) = − ∑ni=1 p(xi) ln p(xi)

where now p is the probability mass function of 𝒳. Thus, the transfer entropy of the aforementioned 𝒳(t), 𝒴(t), and 𝒵(t) is written as

𝒯Y(t)→X(t)|Z(t) = H(X(t) | ⟦X⁻(t), Z⁻(t)⟧) − H(X(t) | ⟦X⁻(t), Y⁻(t), Z⁻(t)⟧)

where H(A) denotes the entropy of the variable A, and H(A|B) the conditional entropy,

H(X(t)|Y(t)) = − ∑ni=1 ∑mj=1 p(xi, yj) ln p(xi|yj)

where m can differ from n, and p(xi|yj) is the conditional probability, given by

p(xi|yj) = p(xi, yj)/p(yj)

with p(xi, yj) as the joint probability. The pasts of the vectors 𝒳(t), 𝒴(t), and 𝒵(t) are respectively X⁻(t) = {X(t − 1), X(t − 2), …, X(t − p)}, Y⁻(t) = {Y(t − 1), Y(t − 2), …, Y(t − p)}, and Z⁻(t) = {Z(t − 1), Z(t − 2), …, Z(t − p)}, with vector length p, and the vectors in the bracket ⟦A, B⟧ are concatenated.
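The discrete entropy and conditional entropy above reduce to simple plug-in computations on probability tables; a minimal sketch (the two test tables are illustrative):

```python
import numpy as np

# Plug-in entropies from probability tables, matching the discrete
# definitions above: H(X) = -sum p ln p, and
# H(X|Y) = -sum_{i,j} p(x_i,y_j) ln p(x_i|y_j) = H(X,Y) - H(Y).
def entropy(p):
    p = p[p > 0]                   # 0 * ln 0 is taken as 0
    return float(-np.sum(p * np.log(p)))

def cond_entropy(pxy):
    """H(X|Y) from the joint table pxy[i, j] = p(x_i, y_j)."""
    return entropy(pxy.ravel()) - entropy(pxy.sum(axis=0))

uniform = np.full((2, 2), 0.25)    # X and Y independent, uniform
diagonal = np.array([[0.5, 0.0],
                     [0.0, 0.5]])  # Y determines X exactly
print(cond_entropy(uniform))   # ln 2: knowing Y tells nothing about X
print(cond_entropy(diagonal))  # 0: knowing Y removes all uncertainty
```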

From there we have,

𝒯Y(t)→X(t)|Z(t) ≡ ∑ p(X(t), X⁻(t), Y⁻(t), Z⁻(t)) ln [p(X(t) | X⁻(t), Y⁻(t), Z⁻(t)) / p(X(t) | X⁻(t), Z⁻(t))]

where p(A) is the probability associated with the vector variable A, and p(A|B) = p(A, B)/p(B) is the probability of observing A given knowledge of the values of B.

The notion of entropy is an information-theoretic quantity that can be regarded as a measure of the level of disorder within the random variable of the time series data. The transfer entropy from 𝒴(t) to 𝒳(t) reflects the amount of disorder in the future values of 𝒳(t) that is reduced by knowing the past values of 𝒴(t), beyond what is already explained by the pasts of 𝒳(t) and 𝒵(t). Time “moves” as entropy is transferred and is observed as information flowing from series to series.

We have two regressions toward X(t). The first is for the moving series without taking Y⁻(t) into account,

X(t) = A⟦X⁻(t), Z⁻(t)⟧ + ε1(t)

and the other takes into account the information transfer from Y(t) to X(t),

X(t) = A⟦X⁻(t), Y⁻(t), Z⁻(t)⟧ + ε2(t)

where A is the vector of linear regression coefficients, and ε1 and ε2 are the residuals of the regressions. The residuals have respective variances σ(ε1) and σ(ε2), and under a Gaussian assumption the entropy of X(t) is

H(X(t) | X⁻(t), Z⁻(t)) = 1/2 (ln σ(ε1) + ln 2πe)

H(X(t) | X⁻(t), Y⁻(t), Z⁻(t)) = 1/2 (ln σ(ε2) + ln 2πe)

Thus, we can get the estimated transfer entropy

𝒯Y(t)→X(t)|Z(t) = 1/2 ln (σ(ε1)/σ(ε2))

This information-theoretic notion opens a bridge to the statistics of the autoregressive methods of Granger causality. The idea of Granger causality is that 𝒴(t) is said to cause 𝒳(t) if 𝒴(t) helps predict the future of 𝒳(t). This statistical concept is equivalent to transfer entropy, and in our case the Granger causality is estimated as

𝒢Y(t)→X(t)|Z(t) = ln (σ(ε1)/σ(ε2)) = 2𝒯Y(t)→X(t)|Z(t)
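The two-regression estimate of the transfer entropy and its Granger-causality counterpart can be sketched as follows, assuming lag p = 1 and synthetic data in which Y drives X (all coefficients are illustrative):

```python
import numpy as np

# Gaussian transfer entropy via the two regressions of X(t): once on the
# pasts of X and Z only, once adding the past of Y. With lag p = 1, the
# estimate is T = (1/2) ln(var(e1)/var(e2)), and Granger causality G = 2T.
rng = np.random.default_rng(3)
n = 5000
Y = rng.standard_normal(n)
Z = rng.standard_normal(n)
X = np.zeros(n)
for t in range(1, n):
    X[t] = 0.3 * X[t - 1] + 0.8 * Y[t - 1] + 0.1 * Z[t - 1] \
           + 0.2 * rng.standard_normal()

def resid_var(target, regressors):
    """Variance of the residuals of an ordinary least-squares fit."""
    A = np.column_stack(regressors)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return np.var(target - A @ coef)

v1 = resid_var(X[1:], [X[:-1], Z[:-1]])          # without Y's past
v2 = resid_var(X[1:], [X[:-1], Y[:-1], Z[:-1]])  # with Y's past
T = 0.5 * np.log(v1 / v2)
G = np.log(v1 / v2)
print(T, G)  # T > 0 since Y drives X, and G = 2T by construction
```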

Thus, the entropy transferred can be seen as a measure of causal relations among random variables, with which we can learn the spreading dynamics over trading prices in the market represented by the multivariate data.