A market time series can be seen as a composite of a set of 𝑀 interacting dynamical sub-systems. Investors make trading decisions according to their portfolios and market strategies, and these decisions shape the prices of the traded stocks. Over time, the prices trace the dynamical processes within the collective behavior of the investors. Because investors hold portfolios, the ups and downs of one price can affect the dynamics of other prices. Capturing how these ups and downs spread through the market amounts to observing the information flow from one price to another. Suppose, for instance, that a source system 𝒴(𝑡) sends information to another sub-system 𝒳(𝑡), and collect the remaining sub-systems in the vector 𝒵(𝑡). From information theory, the differential entropy of a random vector 𝒳 is defined as

$$h(\mathcal{X}(t)) = -\int_{\mathbb{R}^d} p(\boldsymbol{x}) \ln p(\boldsymbol{x}) \, d\boldsymbol{x}$$

where the random vector takes values in ℝᵈ with probability density function 𝑝(𝒙). When the random variable 𝒳(𝑡) is discrete, taking values 𝑥 ∈ {𝑥₁, 𝑥₂, … , 𝑥ₙ}, the entropy is

$$H(\mathcal{X}(t)) = -\sum_{i=1}^{n} p(x_i) \ln p(x_i)$$

where 𝑝 is now the probability mass function of 𝒳(𝑡). The transfer entropy,

$$\mathcal{T}_{Y(t) \to X(t) \mid Z(t)}$$

of the systems 𝒳(𝑡), 𝒴(𝑡), and 𝒵(𝑡) introduced above is written as

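$$\mathcal{T}_{Y(t) \to X(t) \mid Z(t)} = H\big(X(t) \mid \llbracket X^-(t), Z^-(t) \rrbracket\big) - H\big(X(t) \mid \llbracket X^-(t), Y^-(t), Z^-(t) \rrbracket\big)$$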

where 𝐻(𝐴) denotes the entropy of the variable 𝐴, 𝐻(𝐴|𝐵) the conditional entropy,

$$H(X(t) \mid Y(t)) = -\sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \ln p(x_i \mid y_j)$$

where 𝑚 may differ from 𝑛 and 𝑝(𝑥ᵢ | 𝑦ⱼ) is the conditional probability, or equivalently

$$H(X(t) \mid Y(t)) = -\sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \ln \frac{p(x_i, y_j)}{p(y_j)}$$

with 𝑝(𝑥ᵢ, 𝑦ⱼ) the joint probability. The pasts of 𝒳(𝑡), 𝒴(𝑡), and 𝒵(𝑡) are, respectively, 𝑋⁻(𝑡) = {𝑋(𝑡 − 1), 𝑋(𝑡 − 2), … , 𝑋(𝑡 − 𝑝)}, 𝑌⁻(𝑡) = {𝑌(𝑡 − 1), 𝑌(𝑡 − 2), … , 𝑌(𝑡 − 𝑝)}, and 𝑍⁻(𝑡) = {𝑍(𝑡 − 1), 𝑍(𝑡 − 2), … , 𝑍(𝑡 − 𝑝)}, where 𝑝 here is the number of lags (not to be confused with the probability function 𝑝(·)), and the bracket ⟦𝐴, 𝐵⟧ denotes the concatenation of the vectors 𝐴 and 𝐵.
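The discrete entropy and conditional entropy defined above can be checked numerically. The following is a minimal sketch, assuming the joint probability mass function is available as a small table; the function names `entropy` and `conditional_entropy` and the example table are illustrative choices, not part of the original text. The table has 𝑛 = 2 rows and 𝑚 = 3 columns to reflect that 𝑚 may differ from 𝑛.

```python
import numpy as np

def entropy(pmf):
    """Shannon entropy in nats of a pmf given as an array of probabilities."""
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]                      # terms with p = 0 contribute nothing
    return float(-np.sum(p * np.log(p)))

def conditional_entropy(joint):
    """H(X|Y) in nats, where joint[i, j] = p(x_i, y_j)."""
    joint = np.asarray(joint, dtype=float)
    p_y = joint.sum(axis=0)           # marginal p(y_j), summing over i
    i_idx, j_idx = np.nonzero(joint)  # only cells with positive probability
    p_xy = joint[i_idx, j_idx]
    # ln p(x_i | y_j) = ln( p(x_i, y_j) / p(y_j) )
    return float(-np.sum(p_xy * np.log(p_xy / p_y[j_idx])))

# Hypothetical joint table with n = 2 values of X and m = 3 values of Y.
joint = np.array([[0.30, 0.10, 0.10],
                  [0.05, 0.15, 0.30]])

h_x = entropy(joint.sum(axis=1))      # H(X) from the marginal of X
h_x_given_y = conditional_entropy(joint)
```

Knowing 𝑌 cannot increase the uncertainty about 𝑋, so `h_x_given_y` does not exceed `h_x`; the gap between the two is the quantity that transfer entropy builds on once the conditioning variables are pasts of the series.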

From there we have,

$$\mathcal{T}_{Y(t) \to X(t) \mid Z(t)} \equiv \sum p\big(X(t), X^-(t), Y^-(t), Z^-(t)\big) \ln \frac{p\big(X(t) \mid X^-(t), Y^-(t), Z^-(t)\big)}{p\big(X(t) \mid X^-(t), Z^-(t)\big)}$$

where 𝑝(𝐴) is the probability associated with the vector variable 𝐴, and 𝑝(𝐴|𝐵) = 𝑝(𝐴, 𝐵)/𝑝(𝐵) is the probability of observing 𝐴 given knowledge of the values of 𝐵.
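For discrete (for example, sign-coded) series, the sum above can be estimated by replacing each probability with an observed frequency. The sketch below is a simplified illustration under stated assumptions: it uses a single lag, omits the conditioning set 𝒵(𝑡), and works on synthetic binary data in which 𝑋 copies 𝑌 with a one-step delay. The function name `transfer_entropy_lag1` is hypothetical, and plug-in estimates of this kind are biased upward in small samples.

```python
from collections import Counter
import math
import random

def transfer_entropy_lag1(x, y):
    """Plug-in estimate (nats) of T_{Y->X} with one lag and no conditioning set Z."""
    triples = Counter(zip(x[1:], x[:-1], y[:-1]))   # (x_t, x_{t-1}, y_{t-1})
    pairs_xy = Counter(zip(x[:-1], y[:-1]))         # (x_{t-1}, y_{t-1})
    pairs_xx = Counter(zip(x[1:], x[:-1]))          # (x_t, x_{t-1})
    singles = Counter(x[:-1])                       # x_{t-1}
    n = len(x) - 1                                  # number of observed transitions
    te = 0.0
    for (xt, xp, yp), count in triples.items():
        p_full = count / pairs_xy[(xp, yp)]         # p(x_t | x_{t-1}, y_{t-1})
        p_reduced = pairs_xx[(xt, xp)] / singles[xp]  # p(x_t | x_{t-1})
        te += (count / n) * math.log(p_full / p_reduced)
    return te

# Synthetic data (an assumption for illustration): x is y delayed by one step.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(5000)]
x = [0] + y[:-1]

te_y_to_x = transfer_entropy_lag1(x, y)   # close to ln 2: y's past determines x
te_x_to_y = transfer_entropy_lag1(y, x)   # close to 0: x's past says nothing new about y
```

The asymmetry between the two directions is what makes the measure useful for tracing which series leads and which follows.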

Entropy is an information-theoretic notion that can be regarded as a measure of the level of disorder in the random variable underlying the time series data. The transfer entropy from 𝒴(𝑡) to 𝒳(𝑡) reflects how much the disorder in the future values of 𝒳(𝑡) is reduced by knowing the past values of 𝒴(𝑡), over and above what is already known from the past values of 𝒳(𝑡) and 𝒵(𝑡). Time “moves” as entropy is transferred, and the transfer is observed as information flowing from series to series.

We consider two regressions for 𝑋(𝑡). The first describes the series without taking 𝑌(𝑡) into account,

$$X(t) = A \, \llbracket X^-(t), Z^-(t) \rrbracket + \epsilon_1(t)$$

and the second accounts for the information transfer from 𝑌(𝑡) to 𝑋(𝑡),

$$X(t) = A \, \llbracket X^-(t), Y^-(t), Z^-(t) \rrbracket + \epsilon_2(t)$$

where 𝐴 is the vector of linear regression coefficients (estimated separately for each regression), and ε₁ and ε₂ are the residuals of the two regressions. The residuals have respective variances 𝜎(ε₁) and 𝜎(ε₂), and under the Gaussian assumption the conditional entropies of 𝑋(𝑡) are

$$H\big(X(t) \mid X^-(t), Z^-(t)\big) = \tfrac{1}{2} \big( \ln \sigma(\epsilon_1) + \ln 2\pi e \big)$$

and

$$H\big(X(t) \mid X^-(t), Y^-(t), Z^-(t)\big) = \tfrac{1}{2} \big( \ln \sigma(\epsilon_2) + \ln 2\pi e \big)$$

Taking the difference of the two conditional entropies, the constant terms cancel, and we obtain the estimated transfer entropy

$$\mathcal{T}_{Y(t) \to X(t) \mid Z(t)} = \tfrac{1}{2} \ln \frac{\sigma(\epsilon_1)}{\sigma(\epsilon_2)}$$

This information-theoretic notion builds a bridge to the autoregressive statistics of Granger causality. The idea behind Granger causality is that 𝒴(𝑡) is said to cause 𝒳(𝑡) if 𝒴(𝑡) helps predict the future of 𝒳(𝑡). Under the Gaussian assumption used here, this statistical concept is equivalent to the transfer entropy, and in our case the Granger causality is estimated as

$$\mathcal{G}_{Y(t) \to X(t) \mid Z(t)} = \ln \frac{\sigma(\epsilon_1)}{\sigma(\epsilon_2)} = 2 \, \mathcal{T}_{Y(t) \to X(t) \mid Z(t)}$$
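The two regressions and the resulting estimate can be sketched with ordinary least squares. This is an illustration under stated assumptions rather than the procedure used in the study: the helper names `lagged`, `residual_variance`, and `gaussian_transfer_entropy` are hypothetical, an intercept is added to each regression (the equations above have none), and the data are synthetic, with 𝑋 driven by the past of 𝑌 while 𝑍 is unrelated noise.

```python
import numpy as np

def lagged(series, p):
    """Stack lags 1..p of a 1-D series as columns, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])

def residual_variance(target, regressors):
    """Variance of the OLS residuals of target on regressors plus an intercept."""
    design = np.column_stack([np.ones(len(target)), regressors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)

def gaussian_transfer_entropy(x, y, z, p=1):
    """0.5 * ln(var(eps1) / var(eps2)) from the reduced and full regressions."""
    target = x[p:]
    reduced = np.column_stack([lagged(x, p), lagged(z, p)])            # without Y
    full = np.column_stack([lagged(x, p), lagged(y, p), lagged(z, p)])  # with Y
    return 0.5 * np.log(residual_variance(target, reduced)
                        / residual_variance(target, full))

# Synthetic example (an assumption for illustration): y drives x with one lag.
rng = np.random.default_rng(0)
n = 2000
y = rng.standard_normal(n)
z = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + rng.standard_normal()

te_y_to_x = gaussian_transfer_entropy(x, y, z, p=1)   # clearly positive
te_x_to_y = gaussian_transfer_entropy(y, x, z, p=1)   # near zero
granger_y_to_x = 2.0 * te_y_to_x                       # Granger statistic, G = 2T
```

Because the reduced regression is nested in the full one, the residual variance cannot increase when the past of 𝑌 is added, so the estimate is non-negative by construction. Deciding whether a small positive value is meaningful needs a significance test, which this sketch does not include.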

Thus, the transferred entropy can be read as a causal relation among random variables, from which we can learn how price movements spread across the traded stocks in the market represented by the multivariate data.