One could conceive a financial market as a set of N agents each of them taking a binary decision every time step. This is an extremely crude representation, but capture the essential feature that decision could be coded by binary symbols (buy = 0, sell = 1, for example). Although the extreme simplification, the above setup allow a “stylized” definition of price.
Let Nt0, Nt1 be the number of agents taking the decision 0, 1 respectively at the time t. Obviously, N = Nt0 + Nt1 for every t . Then, with the above definition of the binary code the price can be defined as:
pt = f(Nt0/Nt1)
where f is an increasing and convex function which also hold that:
a) f(0)=0
b) limx→∞ f(x) = ∞
c) limx→∞ f'(x) = 0
The above definition perfectly agree with the common believe about how offer and demand work. If Nt0 is small and Nt1 large, then there are few agents willing to buy and a lot of agents willing to sale, hence the price should be low. If on the contrary, Nt0 is large and Nt1 is small, then there are a lot of agents willing to buy and just few agents willing to sale, hence the price should be high. Notice that the winning choice is related with the minority choice. We exploit the above analogy to construct a binary time-series associated to each real time-series of financial markets. Let {pt}t∈N be the original real time-series. Then we construct a binary time-series {at}t∈N by the rule:
at = {1 pt > pt-1
at = {0 pt < pt-1
Physical complexity is defined as the number of binary digits that are explainable (or meaningful) with respect to the environment in a string η. In reference to our problem the only physical record one gets is the binary string built up from the original real time series and we consider it as the environment ε . We study the physical complexity of substrings of ε . The comprehension of their complex features has high practical importance. The amount of data agents take into account in order to elaborate their choice is finite and of short range. For every time step t, the binary digits at-l, at-l+1,…, at-1 carry some information about the behavior of agents. Hence, the complexity of these finite strings is a measure of how complex information agents face. The Kolmogorov – Chaitin complexity is defined as the length of the shortest program π producing the sequence η when run on universal Turing machine T:
K(η) = min {|π|: η = T(π)}
where π represent the length of π in bits, T(π) the result of running π on Turing machine T and K(η) the Kolmogorov-Chaitin complexity of sequence π. In the framework of this theory, a string is said to be regular if K(η) < η . It means that η can be described by a program π with length smaller than η length. The interpretation of a string should be done in the framework of an environment. Hence, let imagine a Turing machine that takes the string ε as input. We can define the conditional complexity K(η / ε) as the length of the smallest program that computes η in a Turing machine having ε as input:
K(η / ε) = min {|π|: η = CT(π, ε)}
We want to stress that K(η / ε) represents those bits in η that are random with respect to ε. Finally, the physical complexity can be defined as the number of bits that are meaningful in η with respect to ε :
K(η : ε) = |η| – K(η / ε)
η also represent the unconditional complexity of string η i.e., the value of complexity if the input would be ε = ∅ . Of course, the measure K (η : ε ) as defined in the above equation has few practical applications, mainly because it is impossible to know the way in which information about ε is encoded in η . However, if a statistical ensemble of strings is available to us, then the determination of complexity becomes an exercise in information theory. It can be proved that the average values C(η) of the physical complexity K(η : ε) taken over an ensemble Σ of strings of length η can be approximated by:
C|(η)| = 〈K(η : ε)∑ ≅ |η| – K(η : ε), where
K(η : ε) = -∑η∈∑p(η / ε) log2p(η / ε)
and the sum is taking over all the strings η in the ensemble Σ. In a population of N strings in environment ε, the quantity n(η)/N, where n(s) denotes the number of strings equal to η in ∑, approximates p(η / ε) as N → ∞.
Let ε = {at}t∈N and l be a positive integer l ≥ 2. Let Σl be the ensemble of sequences of length l built up by a moving window of length l i.e., if η ∈ Σl then η = aiai+1ai+l−1 for some value of i. The selection of strings ε is related to periods before crashes and in contrast, period with low uncertainty in the market…..