Malicious Machine Learnings? Privacy Preservation and Computational Correctness Across Parties. Note Quote/Didactics.


Multi-Party Computation deals with the following problem: There are n ≥ 2 parties P1, . . ., Pn where party Pi holds input ti, 1 ≤ i ≤ n, and they wish to compute together a function s = f (t1, . . . , tn) on their inputs. The goal is that each party will learn the output of the function, s, yet with the restriction that Pi will not learn any additional information about the input of the other parties aside from what can be deduced from the pair (ti, s). Clearly it is the secrecy restriction that adds complexity to the problem, as without it each party could announce its input to all other parties, and each party would locally compute the value of the function. Thus, the goal of Multi-Party Computation is to achieve the following two properties at the same time: correctness of the computation and privacy preservation of the inputs.

The following two generalizations are often useful:

(i) Probabilistic functions. Here the value of the function depends on some random string r chosen according to some distribution: s = f (t1, . . . , tn; r). An example of this is the coin-flipping functionality, which takes no inputs, and outputs an unbiased random bit. It is crucial that the value r is not controlled by any of the parties, but is somehow jointly generated during the computation.

(ii) Multioutput functions. It is not mandatory that there be a single output of the function. More generally there could be a unique output for each party, i.e., (s1, . . . , sn) = f(t1,…, tn). In this case, only party Pi learns the output si, and no other party learns any information about the other parties’ input and outputs aside from what can be derived from its own input and output.

One of the most interesting aspects of Multi-Party Computation is to reach the objective of computing the function value, but under the assumption that some of the parties may deviate from the protocol. In cryptography, the parties are usually divided into two types: honest and faulty. An honest party follows the protocol without any deviation. Otherwise, the party is considered to be faulty. Faulty behavior can manifest itself in a wide range of ways. The most benign faulty behavior is where the parties follow the protocol, yet try to learn as much as possible about the inputs of the other parties. These parties are called honest-but-curious (or semihonest). At the other end of the spectrum, the parties may deviate from the prescribed protocol in any way that they desire, with the goal of either influencing the computed output value in some way, or of learning as much as possible about the inputs of the other parties. These parties are called malicious.

We envision an adversary A, who controls all the faulty parties and can coordinate their actions. Thus, in a sense we assume that the faulty parties are working together and can exert the most knowledge and influence over the computation out of this collusion. The adversary can corrupt any number of parties out of the n participating parties. Yet, in order to be able to achieve a solution to the problem, in many cases we would need to limit the number of corrupted parties. This limit is called the threshold k, indicating that the protocol remains secure as long as the number of corrupted parties is at most k.

Assume that there exists a trusted party who privately receives the inputs of all the participating parties, calculates the output value s, and then transmits this value to each one of the parties. This process clearly computes the correct output of f, and also does not enable the participating parties to learn any additional information about the inputs of others. We call this model the ideal model. The security of Multi-Party Computation then states that a protocol is secure if its execution satisfies the following: (1) the honest parties compute the same (correct) outputs as they would in the ideal model; and (2) the protocol does not expose more information than a comparable execution with the trusted party, in the ideal model.
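To make the comparison with the ideal model concrete, here is a minimal sketch for one particular function, the sum of the inputs modulo a prime, with honest-but-curious parties only. The choice of function, the additive secret-sharing technique, the modulus Q and every function name are assumptions made for illustration; none of them comes from the text above, and the sketch says nothing about malicious parties.

```python
import secrets

Q = 2**61 - 1  # public prime modulus (an assumption of this sketch)

def ideal_sum(inputs):
    """Ideal model: a trusted party sees every input and returns s to all."""
    return sum(inputs) % Q

def share(value, n):
    """Split value into n additive shares that sum to value modulo Q."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

def real_sum(inputs):
    """Real model: semi-honest parties, pairwise private channels assumed."""
    n = len(inputs)
    # Party i splits its input t_i and sends the j-th share to party j.
    dealt = [share(t, n) for t in inputs]
    # Party j adds the shares it received and announces only that partial sum.
    partial = [sum(dealt[i][j] for i in range(n)) % Q for j in range(n)]
    # Every party adds the announced partial sums to obtain s.
    return sum(partial) % Q

inputs = [12, 7, 30]
assert real_sum(inputs) == ideal_sum(inputs)
```

Each individual share is uniformly distributed modulo Q, so a single share on its own carries no information about the input it came from, while the output agrees with what the trusted party would have returned.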

Intuitively, the adversary’s interaction with the parties (on a vector of inputs) in the protocol generates a transcript. This transcript is a random variable that includes the outputs of all the honest parties, which is needed to ensure correctness, and the output of the adversary A. The latter output, without loss of generality, includes all the information that the adversary learned, including its inputs, private state, all the messages sent by the honest parties to A, and, depending on the model, maybe even include more information, such as public messages that the honest parties exchanged. If we show that exactly the same transcript distribution can be generated when interacting with the trusted party in the ideal model, then we are guaranteed that no information is leaked from the computation via the execution of the protocol, as we know that the ideal process does not expose any information about the inputs. More formally,

Let f be a function on n inputs and let π be a protocol that computes the function f. Given an adversary A, which controls some set of parties, we define REALA,π(t) to be the sequence of outputs of honest parties resulting from the execution of π on input vector t under the attack of A, in addition to the output of A. Similarly, given an adversary A′ which controls a set of parties, we define IDEALA′,f(t) to be the sequence of outputs of honest parties computed by the trusted party in the ideal model on input vector t, in addition to the output of A′. We say that π securely computes f if, for every adversary A as above, ∃ an adversary A′, which controls the same parties in the ideal model, such that, on any input vector t, we have that the distribution of REALA,π(t) is “indistinguishable” from the distribution of IDEALA′,f(t).

Intuitively, the task of the ideal adversary A′ is to generate (almost) the same output as A generates in the real execution or the real model. Thus, the attacker A′ is often called the simulator of A. The transcript value generated in the ideal model, IDEALA′,f(t), also includes the outputs of the honest parties (even though we do not give these outputs to A′), which we know were correctly computed by the trusted party. Thus, the real transcript REALA,π(t) should also include correct outputs of the honest parties in the real model.

We assumed that every party Pi has an input ti, which it enters into the computation. However, if Pi is faulty, nothing stops Pi from changing ti into some ti′. Thus, the notion of a “correct” input is defined only for honest parties. However, the “effective” input of a faulty party Pi could be defined as the value ti′ that the simulator A′ gives to the trusted party in the ideal model. Indeed, since the outputs of honest parties look the same in both models, for all effective purposes Pi must have “contributed” the same input ti′ in the real model.

Another possible misbehavior of Pi, even in the ideal model, might be a refusal to give any input at all to the trusted party. This can be handled in a variety of ways, ranging from aborting the entire computation to simply assigning ti some “default value.” For concreteness, we assume that the domain of f includes a special symbol ⊥ indicating this refusal to give the input, so that it is well defined how f should be computed on such missing inputs. What this requires is that in any real protocol we detect when a party does not enter its input and deal with it exactly in the same manner as if the party would input ⊥ in the ideal model.

As regards security, it is implicitly assumed that all honest parties receive the output of the computation. This is achieved by stating that IDEALA′,f(t) includes the outputs of all honest parties. We therefore say that our current definition guarantees output delivery. A more relaxed property than output delivery is fairness. If fairness is achieved, then this means that if at least one (even faulty) party learns its outputs, then all (honest) parties eventually do too. A bit more formally, we allow the ideal model adversary A′ to instruct the trusted party not to compute any of the outputs. In this case, in the ideal model either all the parties learn the output, or none do. Since A’s transcript is indistinguishable from that of A′, the same fairness guarantee must hold in the real model as well.

A further relaxation of the definition of security is to provide only correctness and privacy. This means that faulty parties can learn their outputs, and prevent the honest parties from learning theirs. Yet, at the same time the protocol will still guarantee that (1) if an honest party receives an output, then this is the correct value, and (2) the privacy of the inputs and outputs of the honest parties is preserved.

The basic security notions are universal and model-independent. However, specific implementations crucially depend on spelling out precisely the model where the computation will be carried out. In particular, the following issues must be specified:

  1. The faulty parties could be honest-but-curious or malicious, and there is usually an upper bound k on the number of parties that the adversary can corrupt.
  2. The computational setting must be distinguished from the information-theoretic setting. In the latter, the adversary is unlimited in its computing power, and the term “indistinguishable” is formalized by requiring the two transcript distributions to be either identical (so-called perfect security) or, at least, statistically close in their variation distance (so-called statistical security). In the computational setting, on the other hand, the power of the adversary (as well as that of the honest parties) is restricted. A bit more precisely, the Multi-Party Computation problem is parameterized by the security parameter λ, in which case (a) all the computation and communication shall be done in time polynomial in λ; and (b) the misbehavior strategies of the faulty parties are also restricted to run in time polynomial in λ. Furthermore, the term “indistinguishability” is formalized by computational indistinguishability: two distribution ensembles {Xλ}λ and {Yλ}λ are said to be computationally indistinguishable if, for any polynomial-time distinguisher D, the quantity ε, defined as |Pr[D(Xλ) = 1] − Pr[D(Yλ) = 1]|, is a “negligible” function of λ. This means that for any j > 0 and all sufficiently large λ, ε is smaller than λ^(−j). This modeling helps us to build secure Multi-Party Computation protocols depending on plausible computational assumptions, such as the hardness of factoring large integers.
  3. The two common communication assumptions are the existence of a secure channel and the existence of a broadcast channel. Secure channels assume that every pair of parties Pi and Pj are connected via an authenticated, private channel. A broadcast channel is a channel with the following properties: if a party Pi (honest or faulty) broadcasts a message m, then m is correctly received by all the parties (who are also sure the message came from Pi). In particular, if an honest party receives m, then it knows that every other honest party also received m. A different communication assumption is the existence of envelopes. An envelope guarantees the following properties: a value m can be stored inside the envelope, it will be held without exposure for a given period of time, and then the value m will be revealed without modification. A ballot box is an enhancement of the envelope setting that also provides a random shuffling mechanism of the envelopes. These are, of course, idealized assumptions that allow for a clean description of a protocol, as they separate the communication issues from the computational ones. These idealized assumptions may be realized by physical mechanisms, but in some settings such mechanisms may not be available. Then it is important to address the question of whether, and under what circumstances, we can remove a given communication assumption. For example, we know that the assumption of a secure channel can be replaced by a protocol, at the cost of introducing a computational assumption and a public key infrastructure.
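The two notions of closeness in item 2 can be made tangible on a toy example. The sketch below computes the variation distance between two small distributions and the advantage |Pr[D(X) = 1] − Pr[D(Y) = 1]| of one particular distinguisher D. The two coin distributions and all names are hypothetical and chosen only for illustration; real security arguments concern ensembles indexed by λ, not a single pair of distributions.

```python
from fractions import Fraction

def statistical_distance(p, q):
    """Variation distance: half the L1 distance between two pmfs (dicts)."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in support) / 2

def advantage(distinguisher, p, q):
    """|Pr[D(X) = 1] - Pr[D(Y) = 1]| for a deterministic 0/1 distinguisher."""
    pr_x = sum(w for x, w in p.items() if distinguisher(x) == 1)
    pr_y = sum(w for x, w in q.items() if distinguisher(x) == 1)
    return abs(pr_x - pr_y)

# A fair coin and a slightly biased coin (illustrative values).
fair = {0: Fraction(1, 2), 1: Fraction(1, 2)}
biased = {0: Fraction(3, 8), 1: Fraction(5, 8)}

def best(x):
    """Output 1 exactly on outcomes that are more likely under `biased`."""
    return 1 if biased[x] > fair[x] else 0

assert statistical_distance(fair, biased) == Fraction(1, 8)
assert advantage(best, fair, biased) == Fraction(1, 8)
```

No distinguisher, however powerful, can have an advantage larger than the variation distance, which is why statistical closeness is the natural requirement when the adversary is computationally unbounded.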

Stochasticities. Lévy processes. Part 2.

Define the characteristic function of Xt:

$$\Phi_t(z) \equiv \Phi_{X_t}(z) \equiv E\left[e^{i z \cdot X_t}\right], \quad z \in \mathbb{R}^d$$

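For $s, t \ge 0$, by writing $X_{t+s} = X_s + (X_{t+s} - X_s)$ and using the fact that $X_{t+s} - X_s$ is independent of $X_s$ and, by stationarity of the increments, has the same law as $X_t$, we obtain that $t \mapsto \Phi_t(z)$ is a multiplicative function.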

$$\Phi_{t+s}(z) = \Phi_{X_{t+s}}(z) = \Phi_{X_s}(z)\,\Phi_{X_{t+s} - X_s}(z) = \Phi_{X_s}(z)\,\Phi_{X_t}(z) = \Phi_s(z)\,\Phi_t(z)$$

The stochastic continuity of $t \mapsto X_t$ implies in particular that $X_s \to X_t$ in distribution when $s \to t$. Therefore, $\Phi_{X_s}(z) \to \Phi_{X_t}(z)$ when $s \to t$, so $t \mapsto \Phi_t(z)$ is a continuous function of $t$. Together with the multiplicative property $\Phi_{s+t}(z) = \Phi_s(z)\,\Phi_t(z)$, this implies that $t \mapsto \Phi_t(z)$ is an exponential function.

Let $(X_t)_{t \ge 0}$ be a Lévy process on $\mathbb{R}^d$. There exists a continuous function $\psi : \mathbb{R}^d \to \mathbb{C}$, called the characteristic exponent of $X$, such that:

$$E\left[e^{i z \cdot X_t}\right] = e^{t\,\psi(z)}, \quad z \in \mathbb{R}^d$$

Here $\psi$ is the cumulant generating function of $X_1$, $\psi = \Psi_{X_1}$, and the cumulant generating function of $X_t$ varies linearly in $t$: $\Psi_{X_t} = t\,\Psi_{X_1} = t\,\psi$. The law of $X_t$ is therefore determined by the law of $X_1$: the only degree of freedom we have in specifying a Lévy process is the distribution of $X_t$ for a single time (say, $t = 1$).
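The exponential form can be checked numerically on a concrete Lévy process. The sketch below uses a Poisson process, which is my choice of example and not one made in the text; its characteristic exponent is ψ(z) = rate · (e^{iz} − 1). The characteristic function of X_t is summed directly from the Poisson probability mass function and compared with e^{tψ(z)}. The function names, the intensity 1.5 and the truncation at 200 terms are illustrative assumptions.

```python
import cmath
import math

def psi_poisson(z, rate):
    """Characteristic exponent of a Poisson process with intensity `rate`."""
    return rate * (cmath.exp(1j * z) - 1)

def char_fn_poisson(z, rate, t, terms=200):
    """E[exp(i z X_t)] summed directly from the Poisson(rate * t) pmf."""
    mean = rate * t
    total = 0j
    pmf = math.exp(-mean)            # P(X_t = 0)
    for n in range(terms):
        total += pmf * cmath.exp(1j * z * n)
        pmf *= mean / (n + 1)        # P(X_t = n + 1) from P(X_t = n)
    return total

z, rate = 0.7, 1.5
for t in (0.5, 2.0, 3.0):
    direct = char_fn_poisson(z, rate, t)
    # Exponential form: Phi_t(z) = exp(t * psi(z)).
    assert abs(direct - cmath.exp(t * psi_poisson(z, rate))) < 1e-10

# Multiplicative property: Phi_{t+s}(z) = Phi_t(z) * Phi_s(z).
lhs = char_fn_poisson(z, rate, 2.5)
rhs = char_fn_poisson(z, rate, 2.0) * char_fn_poisson(z, rate, 0.5)
assert abs(lhs - rhs) < 1e-10
```

The same check works for any process whose exponent is known in closed form; only `psi_poisson` and the direct computation of the law of X_t would change.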

This lecture covers stochastic processes, including continuous-time stochastic processes and standard Brownian motion by Choongbum Lee

Haircuts and Collaterals.


In a repo-style securities financing transaction, the repo buyer or lender is exposed to the borrower’s default risk for the whole duration with a market contingent exposure, framed on a short window for default settlement. A margin period of risk (MPR) is a time period starting from the last date when margin is met to the date when the defaulting counterparty is closed out with completion of collateral asset disposal. MPR could cover a number of events or processes, including collateral valuation, margin calculation, margin call, valuation dispute and resolution, default notification and default grace period, and finally time to sell collateral to recover the lent principal and accrued interest. If the sales proceeds are not sufficient, the deficiency could be made a claim to the borrower’s estate, unless the repo is non-recourse. The lender’s exposure in a repo during the MPR is simply principal plus accrued and unpaid interest. Since the accrued and unpaid interest is usually margined at cash, repo exposure in the MPR is flat.

A flat exposure could apply to OTC derivatives as well. For an OTC netting set, the mark-to-market of the derivatives could fluctuate as the underlying prices move. The derivatives exposure is formally set on the early termination date, which could be days behind the point of default. The surviving counterparty, however, could have delta hedged against market factors following the default so that the derivative exposure remains a more manageable gamma exposure. For developing a collateral haircut model, what is generally assumed is a constant exposure during the MPR.

The primary driver of haircuts is asset volatility. Market liquidity risk is another significant one, as liquidation of the collateral assets might negatively impact the market, if the collateral portfolio is illiquid, large, or concentrated in certain asset sectors or classes. Market prices could be depressed, bid/ask spreads could widen, and some assets might have to be sold at a steep discount. This is particularly pronounced with private securitization and lower grade corporates, which trade infrequently and often rely on valuation services rather than actual market quotations. A haircut model therefore needs to capture liquidity risk, in addition to asset volatility.

In an idealized setting, we therefore consider a counterparty (or borrower) C’s default time at t, when the margin is last met, an MPR of u during which there is no margin posting, and the collateral assets are sold at time t+u instantaneously on the market, with a possible liquidation discount g.

Let us denote the collateral market value as $B(t)$ and the exposure to the defaulting counterparty C as $E(t)$. At time $t$, one share of the asset is margined properly, i.e., $E(t) = (1-h)\,B(t)$, where $h$ is a constant haircut, $0 \le h < 1$. The margin agreement is assumed to have a zero minimum transfer amount. The lender would have a residual exposure $(E(t) - B(t+u)(1-g))^+$, where $g$ is a constant, $0 \le g < 1$. Exposure to C is assumed flat after $t$. We can write the loss function from holding the collateral as follows,

$$L(t+u) = E_t\left(1 - \frac{B_{t+u}}{B_t}\,\frac{1-g}{1-h}\right)^+ = (1-g)\,B_t\left(1 - \frac{B_{t+u}}{B_t} - \frac{h-g}{1-g}\right)^+ \tag{1}$$

Conditional on default happening at time t, the above determines a one-period loss distribution driven by asset price return B(t+u)/B(t). For repos, this loss function is slightly different from the lender’s ultimate loss which would be lessened due to a claim and recovery process. In the regulatory context, haircut is viewed as a mitigation to counterparty exposure and made independent of counterparty, so recovery from the defaulting party is not considered.
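A quick numerical check shows that the two forms of the loss function in (1) are the same quantity as the residual exposure (E(t) − B(t+u)(1 − g))+. This is only a sketch: the function names and the sample numbers (collateral worth 100, a 10% haircut, a 2% liquidation discount) are illustrative assumptions, not values from the text.

```python
def residual_exposure(B_t, B_tu, h, g):
    """(E(t) - B(t+u) * (1 - g))^+ with E(t) = (1 - h) * B(t)."""
    return max(0.0, (1.0 - h) * B_t - B_tu * (1.0 - g))

def loss_form_a(B_t, B_tu, h, g):
    """E_t * (1 - (B_{t+u}/B_t) * (1 - g)/(1 - h))^+."""
    E_t = (1.0 - h) * B_t
    return E_t * max(0.0, 1.0 - (B_tu / B_t) * (1.0 - g) / (1.0 - h))

def loss_form_b(B_t, B_tu, h, g):
    """(1 - g) * B_t * (1 - B_{t+u}/B_t - (h - g)/(1 - g))^+."""
    return (1.0 - g) * B_t * max(0.0, 1.0 - B_tu / B_t - (h - g) / (1.0 - g))

# Collateral falls from 100 to 85 during the MPR.
# Exposure is 90 and sale proceeds are 85 * 0.98 = 83.3, so the loss is 6.7.
args = (100.0, 85.0, 0.10, 0.02)
assert abs(loss_form_a(*args) - residual_exposure(*args)) < 1e-9
assert abs(loss_form_b(*args) - residual_exposure(*args)) < 1e-9

# A smaller decline (100 to 95) is absorbed entirely by the haircut.
assert residual_exposure(100.0, 95.0, 0.10, 0.02) == 0.0
```

Writing the loss in the second form makes it clear that a loss occurs only when the price decline 1 − B(t+u)/B(t) exceeds (h − g)/(1 − g), which reduces to h when there is no liquidation discount.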

Let $y = 1 - B_{t+u}/B_t$ be the price decline. If $g = 0$, $\Pr(y > h)$ equals $\Pr(L(u) > 0)$. There is no loss if the price decline is less than or equal to $h$; a first rupee loss occurs only if $y > h$, so $h$ provides a cushion before a loss is incurred. Given a target rating class’s default probability $p$, the first loss haircut can be written as

$$h_p = \inf\{h > 0 : \Pr(L(u) > 0) \le p\} \tag{2}$$

Let $\mathrm{VaR}_q$ denote the VaR of holding the asset, an amount which the price decline will not exceed at a confidence level of $q$, say 99%. In light of the adoption of expected shortfall (ES) in Basel IV’s new market risk capital standard, we can also define the haircut as the ES at the $q$-quantile,

$$h_{ES} = \mathrm{ES}_q = E[\,y \mid y > \mathrm{VaR}_q\,]$$

$$\mathrm{VaR}_q = \inf\{y_0 > 0 : \Pr(y > y_0) \le 1 - q\} \tag{3}$$
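On a finite sample of price declines, definition (3) and the ES haircut can be evaluated directly. The sketch below applies the two definitions to the empirical distribution of a hypothetical sample; the sample values, the function names and the convention of returning the VaR itself when no observation exceeds it are assumptions made for illustration.

```python
def var_q(declines, q):
    """Empirical version of (3): smallest y0 with Pr(y > y0) <= 1 - q."""
    if not declines:
        raise ValueError("need at least one observation")
    ys = sorted(declines)
    n = len(ys)
    for y0 in ys:
        tail = sum(1 for y in ys if y > y0) / n
        # Small tolerance guards against floating-point noise in 1 - q.
        if tail <= (1.0 - q) + 1e-12:
            # The infimum in (3) runs over y0 > 0, so it is floored at zero.
            return max(y0, 0.0)

def expected_shortfall(declines, q):
    """Empirical E[y | y > VaR_q]; falls back to VaR_q if the tail is empty."""
    v = var_q(declines, q)
    tail = [y for y in declines if y > v]
    return sum(tail) / len(tail) if tail else v

# Hypothetical price declines over the MPR (negative values are price gains).
sample = [-0.02, -0.01, 0.0, 0.01, 0.01, 0.02, 0.03, 0.04, 0.06, 0.10]
h_var = var_q(sample, 0.8)              # two of ten observations exceed 0.04
h_es = expected_shortfall(sample, 0.8)  # mean of 0.06 and 0.10
assert abs(h_var - 0.04) < 1e-12
assert abs(h_es - 0.08) < 1e-12
```

As expected, the ES haircut is at least as large as the VaR haircut at the same q, since it averages over the tail beyond the quantile.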

Without the liquidity discount, $h_p$ is the same as $\mathrm{VaR}_q$ when $p = 1 - q$. If haircuts are set to $\mathrm{VaR}_q$ or $h_{ES}$, the market risk capital for holding the asset for the given MPR, defined as a multiple of VaR or ES, is zero. This implies that we can define a haircut to meet a minimum economic capital (EC) requirement $C_0$,

$$h_{EC} = \inf\{h \in \mathbb{R}^+ : \mathrm{EC}[L \mid h] \le C_0\} \tag{4}$$

where EC is measured as either VaR or ES less expected loss (EL). For rating criteria employing an EL-based target per rating class, we could introduce one more definition of haircuts based on an EL target $L_0$,

$$h_{EL} = \inf\{h \in \mathbb{R}^+ : E[L \mid h] \le L_0\} \tag{5}$$

The expected loss target $L_0$ can be set based on the EL criteria of a designated high credit rating, whether bank internal or external. With an external rating such as Moody’s, for example, a firm can set the haircut to a level such that the expected (cumulative) loss satisfies the expected loss tolerance $L_0$ of some predetermined Moody’s rating target, e.g., ‘Aaa’ or ‘Aa1’. In (4) and (5), the holding period of the loss $L$ does not have to be an MPR. In fact, these two definitions apply to the general trading book credit risk capital approach, where the standard horizon is one year with a 99.9% confidence level for default risk.
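Definition (5) can be solved numerically because the expected loss falls as the haircut rises. The sketch below does this by bisection over a small set of equally likely scenarios for the gross return B(t+u)/B(t), with the loss measured per unit of collateral value. The scenarios, the target of 0.01 and the function names are hypothetical; a real calculation would take the return distribution from a calibrated model.

```python
def expected_loss(h, returns, g=0.0):
    """Mean of ((1 - h) - R * (1 - g))^+ over equally likely gross returns R."""
    losses = [max(0.0, (1.0 - h) - r * (1.0 - g)) for r in returns]
    return sum(losses) / len(losses)

def haircut_for_el(returns, L0, g=0.0, tol=1e-10):
    """Smallest h in [0, 1] with expected_loss(h) <= L0, found by bisection."""
    if expected_loss(0.0, returns, g) <= L0:
        return 0.0
    lo, hi = 0.0, 1.0   # invariant: EL(lo) > L0 >= EL(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_loss(mid, returns, g) <= L0:
            hi = mid
        else:
            lo = mid
    return hi

# Four equally likely scenarios for B(t+u)/B(t), no liquidation discount.
scenarios = [0.80, 0.90, 1.00, 1.10]
h_el = haircut_for_el(scenarios, L0=0.01)
# For h between 0.10 and 0.20 only the worst scenario loses, so
# EL = (0.20 - h) / 4, which equals 0.01 at h = 0.16.
assert abs(h_el - 0.16) < 1e-8
```

The same bisection applies to definition (4) once `expected_loss` is replaced by a function that returns the economic capital for a given haircut, provided that function is also non-increasing in h.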

Different from VaRq, definitions hp, hEL, and hEC are based on a loss distribution solely generated by collateral market risk exposure. As such, we no longer apply the usual wholesale credit risk terminology of probability of default (PD) and loss given default (LGD) to determine EL as product of PD and LGD. Here EL is directly computed from a loss distribution originated from market risk and the haircut intends to be wholesale counterparty independent. For real repo transactions where repo haircuts are known to be counterparty dependent, these definitions remain fit, when the loss distribution incorporates the counterparty credit quality.

Orthodoxy of the Neoclassical Synthesis: Minsky’s Capitalism Without Capitalists, Capital Assets, and Financial Markets


During the very years when orthodoxy turned Keynesianism on its head, extolling Reaganomics and Thatcherism as adequate for achieving stabilisation in the epoch of global capitalism, Minsky (Stabilizing an Unstable Economy) pointed to the destabilising consequences of this approach. The view that instability is the result of the internal processes of a capitalist economy, he wrote, stands in sharp contrast to neoclassical theory, whether Keynesian or monetarist, which holds that instability is due to events that are outside the working of the economy. The neoclassical synthesis and the Keynes theories are different because the focus of the neoclassical synthesis is on how a decentralized market economy achieves coherence and coordination in production and distribution, whereas the focus of the Keynes theory is upon the capital development of an economy. The neoclassical synthesis emphasizes equilibrium and equilibrating tendencies, whereas Keynes’s theory revolves around bankers and businessmen making deals on Wall Street. The neoclassical synthesis ignores the capitalist nature of the economy, a fact that the Keynes theory is always aware of.

Minsky here identifies the main flaw of the neoclassical synthesis, which is that it ignores the capitalist nature of the economy, while authentic Keynesianism proceeds from precisely this nature. Minsky lays bare the preconceived approach of orthodoxy, which has mainstream economics concentrating all its focus on an equilibrium which is called upon to confirm the orthodox belief in the stability of capitalism. At the same time, orthodoxy fails to devote sufficient attention to the speculation in the area of finance and banking that is the precise cause of the instability of the capitalist economy.

Elsewhere, Minsky stresses still more firmly that from the theory of Keynes, the neoclassical standard included in its arsenal only those earlier-mentioned elements which could be interpreted as confirming its preconceived position that capitalism was so perfect that it could not have innate flaws. In this connection Minsky writes:

Whereas Keynes in The General Theory proposed that economists look at the economy in quite a different way from the way they had, only those parts of The General Theory that could be readily integrated into the old way of looking at things survive in today’s standard theory. What was lost was a view of an economy always in transit because it accumulates in response to disequilibrating forces that are internal to the economy. As a result of the way accumulation takes place in a capitalist economy, Keynes’s theory showed that success in operating the economy can only be transitory; instability is an inherent and inescapable flaw of capitalism.

The view that survived is that a number of special things went wrong, which led the economy into the Great Depression. In this view, apt policy can assure that it cannot happen again. The standard theory of the 1950s and 1960s seemed to assert that if policy were apt, then full employment at stable prices could be attained and sustained. The existence of internally disruptive forces was ignored; the neoclassical synthesis became the economics of capitalism without capitalists, capital assets, and financial markets. As a result, very little of Keynes has survived today in standard economics.

Here, resting on Keynes’s analysis, one can find the central idea of Minsky’s book: the innate instability of capitalism, which in time will lead the system to a new Great Depression. This forecast has now been brilliantly confirmed, but previously there were few who accepted it. Economic science was orchestrated by proponents of neoclassical orthodoxy under the direction of Nobel prizewinners, authors of popular economics textbooks, and other authorities recognized by the mainstream. These people argued that the main problems which capitalism had encountered in earlier times had already been overcome, and that before it lay a direct, sunny road to an even better future.

Robed in complex theoretical constructs, and underpinned by an abundance of mathematical formulae, these ideas of a cloudless future for capitalism interpreted the economic situation, it then seemed, in thoroughly convincing fashion. These analyses were balm for the souls of the people who had come to believe that capitalism had attained perfection. In this respect, capitalism has come to bear an uncanny resemblance to communism. There is, however, something beyond the preconceptions and prejudices innate to people in all social systems, and that is the reality of historical and economic development. This provides a filter for our ideas, and over time makes it easier to separate truth from error. The present financial and economic crisis is an example of such reality. While the mainstream was still euphoric about the future of capitalism, the post-Keynesians saw the approaching outlines of a new Great Depression. The fate of Post Keynesianism will depend very heavily on the future development of the world capitalist economy. If the business cycle has indeed been abolished (this time), so that stable, non-inflationary growth continues indefinitely under something approximating to the present neoclassical (or pseudo-monetarist) policy consensus, then there is unlikely to be a significant market for Post Keynesian ideas. Things would be very different in the event of a new Great Depression, to think one last time in terms of extreme possibilities. If it happened again, to quote Hyman Minsky, the appeal of both a radical interventionist programme and the analysis from which it was derived would be very greatly enhanced.

Neoclassical orthodoxy, that is, today’s mainstream economic thinking, proceeds from the position that capitalism is so good and perfect that an alternative to it does not and cannot exist. Post-Keynesianism takes a different standpoint. Unlike Marxism it is not so revolutionary a theory as to call for a complete rejection of capitalism. At the same time, it does not consider capitalism so perfect that there is nothing in it that needs to be changed. To the contrary, Post-Keynesianism maintains that capitalism has definite flaws, and requires changes of such scope as to allow alternative ways of running the economy to be fully effective. To the prejudices of the mainstream, post-Keynesianism counterposes an approach based on an objective analysis of the real situation. Its economic and philosophical approach – the methodology of critical realism – has been developed accordingly, the methodological import of which helps post-Keynesianism answer a broad range of questions, providing an alternative both to market fundamentalism, and to bureaucratic centralism within a planned economy. This is the source of its attraction for us….

Conjuncted: Avarice

Greed followed by avarice….We consider the variation in which events occur at a rate equal to the difference in capital of the two traders. That is, an individual is more likely to take capital from a much poorer person rather than from someone of slightly less wealth. For this “avaricious” exchange, the corresponding rate equations are

$$\frac{dc_k}{dt} = c_{k-1}\sum_{j=1}^{k-1}(k-1-j)\,c_j + c_{k+1}\sum_{j=k+1}^{\infty}(j-k-1)\,c_j - c_k\sum_{j=1}^{\infty}|k-j|\,c_j \tag{1}$$

while the total density obeys,

$$\frac{dN}{dt} = -c_1\,(1 - N) \tag{2}$$

under the assumption that the total wealth density is set equal to one, $\sum_k k\,c_k = 1$.

These equations can be solved by again applying scaling. For this purpose, it is first expedient to rewrite the rate equation as,

$$\frac{dc_k}{dt} = (c_{k-1} - c_k)\sum_{j=1}^{k-1}(k-j)\,c_j - c_{k-1}\sum_{j=1}^{k-1}c_j + (c_{k+1} - c_k)\sum_{j=k+1}^{\infty}(j-k)\,c_j - c_{k+1}\sum_{j=k+1}^{\infty}c_j \tag{3}$$

taking the continuum limits

$$\frac{\partial c}{\partial t} = \frac{\partial c}{\partial k} - N\,\frac{\partial}{\partial k}(k\,c) \tag{3}$$

We now substitute the scaling ansatz,

$c_k(t) \cong N^2\,C(x)$, with $x = kN$, to yield

$$C(0)\,[2C + xC'] = (x - 1)\,C' + C \tag{4}$$


$$\frac{dN}{dt} = -C(0)\,N^2 \tag{5}$$

Solving the above equations gives $N \cong [C(0)\,t]^{-1}$ and

$$C(x) = (1 + \mu)\,(1 + \mu x)^{-2 - 1/\mu} \tag{6}$$

with $\mu = C(0) - 1$. The scaling approach has thus found a family of solutions which are parameterized by $\mu$, and additional information is needed to determine which of these solutions is appropriate for our system. For this purpose, note that equation (6) exhibits different behaviors depending on the sign of $\mu$. When $\mu > 0$, there is an extended non-universal power-law distribution, while for $\mu = 0$ the solution is the pure exponential, $C(x) = e^{-x}$. These solutions may be rejected because the wealth distribution cannot extend over an unbounded domain if the initial wealth extends over a finite range.
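That the family (6) really solves the scaled equation (4) for every admissible μ can be confirmed with a few lines of arithmetic. The sketch below evaluates the left side minus the right side of (4) using the analytic derivative of (6). The function names and the sample values of μ and x are illustrative, and μ = 0 is excluded because (6) must then be read as its exponential limit.

```python
def C(x, mu):
    """Scaling function (6); valid where 1 + mu * x > 0 and mu != 0."""
    return (1.0 + mu) * (1.0 + mu * x) ** (-2.0 - 1.0 / mu)

def dC(x, mu):
    """Derivative of (6) with respect to x."""
    return -(1.0 + 2.0 * mu) * (1.0 + mu) * (1.0 + mu * x) ** (-3.0 - 1.0 / mu)

def residual(x, mu):
    """Left side minus right side of (4), using C(0) = 1 + mu."""
    c0 = 1.0 + mu
    lhs = c0 * (2.0 * C(x, mu) + x * dC(x, mu))
    rhs = (x - 1.0) * dC(x, mu) + C(x, mu)
    return lhs - rhs

# The residual vanishes for compact (mu < 0) and power-law (mu > 0) members.
for mu in (-0.8, -0.5, 0.3):
    for x in (0.0, 0.5, 1.0):
        assert abs(residual(x, mu)) < 1e-9

# mu = -1/2 gives a flat profile, C(x) = 1/2 on its support.
assert abs(C(0.7, -0.5) - 0.5) < 1e-12
```

The check confirms that the scaling equation alone cannot pick out μ, which is why an extra condition is needed to select the physical solution.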

The accessible solutions therefore correspond to $-1 < \mu < 0$, where the distribution is compact and finite, with $C(x) \equiv 0$ for $x \ge x_f = -\mu^{-1}$. To determine the true solution, let us re-examine the continuum form of the rate equation, equation (3). From naive power counting, the first two terms are asymptotically dominant and they give a propagating front with $k_f$ exactly equal to $t$. Consequently, the scaled location of the front is given by $x_f = N k_f$. Now the result $N \simeq [C(0)\,t]^{-1}$ gives $x_f = 1/C(0)$. Comparing this expression with the corresponding value from the scaling approach, $x_f = [1 - C(0)]^{-1}$, selects the value $C(0) = 1/2$. Remarkably, this scaling solution coincides with the Fermi distribution that was found for the case of constant interaction rate. Finally, in terms of the unscaled variables $k$ and $t$, the wealth distribution is

$$c_k(t) = N^2\,C(x) = \begin{cases} 2/t^2, & k < t \\ 0, & k \ge t \end{cases} \tag{7}$$

This discontinuity is smoothed out by diffusive spreading. Another interesting feature is that if the interaction rate is sufficiently greedy, “gelation” occurs, whereby a finite fraction of the total capital is possessed by a single individual. For interaction rates, or kernels K(j, k) between individuals of capital j and k which do not give rise to gelation, the total density typically varies as a power law in time, while for gelling kernels N(t) goes to zero at some finite time. At the border between these regimes N(t) typically decays exponentially in time. We seek a similar transition in behavior for the capital exchange model by considering the rate equation for the density

$$\frac{dN}{dt} = -c_1\sum_{k=1}^{\infty} K(1, k)\,c_k \tag{8}$$

For the family of kernels with $K(1, k) \sim k^{\nu}$ as $k \to \infty$, substitution of the scaling ansatz gives $\dot{N} \sim -N^{3-\nu}$. Thus $N(t)$ exhibits a power-law behavior $N \sim t^{-1/(2-\nu)}$ for $\nu < 2$ and an exponential behavior for $\nu = 2$. Thus gelation should arise for $\nu > 2$.
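The three regimes follow from integrating the density equation in closed form. The sketch below assumes a unit prefactor, dN/dt = −N^(3−ν), and an initial density N(0) = N0; both are simplifications of the asymptotic relation above, and the function name is hypothetical. Writing a = 2 − ν, the quantity N^(−a) grows linearly in time at rate a, which gives a power law for ν < 2, an exponential for ν = 2, and a density that reaches zero at a finite time for ν > 2.

```python
import math

def density(t, nu, N0=1.0):
    """Solution of dN/dt = -N**(3 - nu) with N(0) = N0 (unit prefactor assumed)."""
    a = 2.0 - nu
    if a == 0.0:
        return N0 * math.exp(-t)      # borderline case: exponential decay
    base = N0 ** (-a) + a * t         # equals N(t)**(-a)
    if base <= 0.0:                   # only reachable for nu > 2
        return 0.0                    # the finite gelation time has passed
    return base ** (-1.0 / a)

# nu = 1 (the avaricious kernel): N = 1 / (1 + t), a t^(-1) power law.
assert abs(density(9.0, 1.0) - 0.1) < 1e-12
# nu = 2: exponential decay.
assert abs(density(1.0, 2.0) - math.exp(-1.0)) < 1e-12
# nu = 3: N = 1 - t, which hits zero at the finite time t = 1.
assert abs(density(0.5, 3.0) - 0.5) < 1e-12
assert density(1.5, 3.0) == 0.0
```

For ν > 2 the zero is reached at t = N0^(ν−2)/(ν−2), the finite-time vanishing of N(t) that signals gelation.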

Banking? There isn’t much more to it than this anyways.

Don’t go by the innocuous sounding title, for this is at its wit alternative. The modus operandi (Oh!, how much I feel like saying modis operandi in honor of the you Indians know who?) for accumulation of wealth to parts of ‘the system’ (which, for historic reasons, we call ‘capitalists’) is banking. The ‘capitalists’ (defined as those that skim the surplus labor of others) accumulate it through the banking system. That is nearly an empty statement, since wealth = money. That is, money is the means of increasing wealth and thus one represents the other. If capitalists skim surplus labor, it means that they skim surplus money. Money is linked to (only!) banks, and thus, accumulation is in the banks.

If interest is charged, borrowers will go bankrupt. This idea can be extended. If interest is charged, all money is accumulated in banks. Or, better to say, a larger and larger fraction of money is accumulated in the banks, and kept in financial institutions. The accumulation of wealth is accumulation of money in and by banks. It can only be interesting to see whom the money belongs to.

By the way, these institutions, the capitalists naturally wanting to part with as little as possible from this money, are often in fiscal paradises. Famous are The Cayman Islands, The Bahamas, The Seychelles, etc. With the accumulated money the physical property is bought. Once again, this is an empty statement. Money represents buying power (to buy more wealth). For instance buying the means-of-production (the Marxian mathematical equation raises its head again, MoP), such as land, factories, people’s houses (which will then be rented to them; more money). Etc.

Also, a tiny fraction of the money is squandered. It is what normally draws most attention. Oil sheiks that drive golden cars, bunga-bunga parties etc. That, however, is rather insignificant, this way of re-injecting money into the system. Mostly money is used to increase capital. That is why it is an obvious truth that “When you are rich, you must be extremely stupid to become poor. When you are poor, you must be extremely talented to become rich”. When you are rich, just let the capital work for you; it will have the tendency to increase, even if it increases slower than that of your more talented neighbor.

To accelerate the effect of skimming, means of production (MoP, ‘capital’) are confiscated from everyone – countries and individual people – who cannot repay the loan plus interest (which is unavoidable). Or they are bought at far below market value, in the manner of “Take it or leave it; either give me my money back, which I know there is no way you can, or give me all your possessions and options for confiscation of the possessions of future generations as well, i.e., I’ll give you new loans (which you will also not be able to pay back, I know, but that way I’ll manage to forever take everything you and all generations after you will ever produce. Slaves, obey your masters!)”.

Although not essential (Marx did not analyze it like this), the banking system accelerates the condensation of wealth. It is the modus operandi. Money is accumulated. With that money, capital is bought, and then the money is re-confiscated with that newly bought capital, or by means of new loans, etc. It is a feedback system in which all money and capital condense onto one big pile. Money and capital are synonyms. Note that this pile is not necessarily a set of people. It is just ‘the system’. There is no ‘class struggle’ between rich and poor, where the latter are trying to steal/take back the money (depending on which side of the alleged theft the person analyzing it is). It is a class struggle of people against ‘the system’.

There is only one stable final distribution: all money/capital belonging to one person or institution, one ‘entity’. That is what is called a ‘singularity’, and it is the only mathematical function that is stable in this case. It is called a delta function, or Dirac delta function: zero everywhere except at one point, where it is infinite, with the total integral (total money) equal to unity. In this case: all money on one big pile. All other functions are unstable.

Imagine that there are two brothers who wound up with all the money, while the rest of the people are destitute and left without anything. These two brothers will then start lending things to each other. Since they are doing this in the commercial way (having to give back more than was borrowed), one of the brothers will confiscate everything from the other.
Note: There is only one way out of it, namely that the brother ‘feels sorry’ for his sibling and gives him things without anything in return, to compensate for the steady unidirectional flow of wealth….
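The two-brothers picture can be played out as a toy simulation. This is only a sketch under assumptions that are not in the text above: money comes in whole coins, the total is fixed, each round a randomly chosen brother is the lender, and the loan returns with exactly one coin of interest, so the net effect of a round is one coin moving to the lender. The function name and parameters are hypothetical.

```python
import random

def two_brothers(total=100, seed=0):
    """Toy lending model: returns the final holdings of both brothers and the round count."""
    rng = random.Random(seed)
    a = total // 2
    b = total - a
    rounds = 0
    # Keep lending until one brother has nothing left to repay with.
    while a > 0 and b > 0:
        if rng.random() < 0.5:
            a, b = a + 1, b - 1   # brother A lent; B repaid the loan plus one coin
        else:
            a, b = a - 1, b + 1   # brother B lent; A repaid the loan plus one coin
        rounds += 1
    return a, b, rounds

if __name__ == "__main__":
    a, b, rounds = two_brothers()
    print(f"after {rounds} rounds: A holds {a}, B holds {b}")
```

Under these assumptions the only resting states are the two in which one brother holds everything, which is the discrete analogue of the single pile described above. A unidirectional flow (the same brother always lending) reaches that state even faster.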

Conjuncted: Ergodicity. Thought of the Day 51.1


When we scientifically investigate a system, we cannot normally observe all possible histories in Ω, or directly access the conditional probability structure {Pr_E}_{E⊆Ω}. Instead, we can only observe specific events. Conducting many “runs” of the same experiment is an attempt to observe as many histories of a system as possible, but even the best experimental design rarely allows us to observe all histories or to read off the full conditional probability structure. Furthermore, this strategy works only for smaller systems that we can isolate in laboratory conditions. When the system is the economy, the global ecosystem, or the universe in its entirety, we are stuck in a single history. We cannot step outside that history and look at alternative histories. Nonetheless, we would like to infer something about the laws of the system in general, and especially about the true probability distribution over histories.

Can we discern the system’s laws and true probabilities from observations of specific events? And what kinds of regularities must the system display in order to make this possible? In other words, are there certain “metaphysical prerequisites” that must be in place for scientific inference to work?

To answer these questions, we first consider a very simple example. Here T = {1,2,3,…}, and the system’s state at any time is the outcome of an independent coin toss. So the state space is X = {Heads, Tails}, and each possible history in Ω is one possible Heads/Tails sequence.

Suppose the true conditional probability structure on Ω is induced by the single parameter p, the probability of Heads. In this example, the Law of Large Numbers guarantees that, with probability 1, the limiting frequency of Heads in a given history (as time goes to infinity) will match p. This means that the subset of Ω consisting of “well-behaved” histories has probability 1, where a history is well-behaved if (i) there exists a limiting frequency of Heads for it (i.e., the proportion of Heads converges to a well-defined limit as time goes to infinity) and (ii) that limiting frequency is p. For this reason, we will almost certainly (with probability 1) arrive at the true conditional probability structure on Ω on the basis of observing just a single history and counting the number of Heads and Tails in it.
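The coin-tossing example is easy to try out. Below is a minimal sketch, assuming a pseudo-random generator is an acceptable stand-in for independent tosses; the value p = 0.3, the seed and the function name are illustrative assumptions.

```python
import random

def heads_frequency(p, n_tosses, seed=0):
    """Proportion of Heads in the first n_tosses steps of one simulated history."""
    rng = random.Random(seed)
    heads = 0
    for _ in range(n_tosses):
        if rng.random() < p:   # this toss comes up Heads with probability p
            heads += 1
    return heads / n_tosses

if __name__ == "__main__":
    p = 0.3
    for n in (100, 10_000, 1_000_000):
        print(n, heads_frequency(p, n))
```

As the observed stretch of the single history grows, the printed frequency should settle near p, which is the sense in which one history suffices to recover the parameter.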

Does this result generalize? The short answer is “yes”, provided the system’s symmetries are of the right kind. Without suitable symmetries, generalizing from local observations to global laws is not possible. In a slogan, for scientific inference to work, there must be sufficient regularities in the system. In our toy system of the coin tosses, there are. Wigner (1967) recognized this point, taking symmetries to be “a prerequisite for the very possibility of discovering the laws of nature”.

Generally, symmetries allow us to infer general laws from specific observations. For example, let T = {1,2,3,…}, and let Y and Z be two subsets of the state space X. Suppose we have made the observation O: “whenever the state is in the set Y at time 5, there is a 50% probability that it will be in Z at time 6”. Suppose we know, or are justified in hypothesizing, that the system has the set of time symmetries {ψ_r : r = 1,2,3,…}, with ψ_r(t) = t + r, as defined in the previous section. Then, from observation O, we can deduce the following general law: “for any t in T, if the state of the system is in the set Y at time t, there is a 50% probability that it will be in Z at time t + 1”.

However, this example still has a problem. It only shows that if we could make observation O, then our generalization would be warranted, provided the system has the relevant symmetries. But the “if” is a big “if”. Recall what observation O says: “whenever the system’s state is in the set Y at time 5, there is a 50% probability that it will be in the set Z at time 6”. Clearly, this statement is only empirically well supported – and thus a real observation rather than a mere hypothesis – if we can make many observations of possible histories at times 5 and 6. We can do this if the system is an experimental apparatus in a lab or a virtual system in a computer, which we are manipulating and observing “from the outside”, and on which we can perform many “runs” of an experiment. But, as noted above, if we are participants in the system, as in the case of the economy, an ecosystem, or the universe at large, we only get to experience times 5 and 6 once, and we only get to experience one possible history. How, then, can we ever assemble a body of evidence that allows us to make statements such as O?

The solution to this problem lies in the property of ergodicity. This is a property that a system may or may not have and that, if present, serves as the desired metaphysical prerequisite for scientific inference. To explain this property, let us give an example. Suppose T = {1,2,3,…}, and the system has all the time symmetries in the set Ψ = {ψr : r = 1,2,3,….}. Heuristically, the symmetries in Ψ can be interpreted as describing the evolution of the system over time. Suppose each time-step corresponds to a day. Then the history h = (a,b,c,d,e,….) describes a situation where today’s state is a, tomorrow’s is b, the next day’s is c, and so on. The transformed history ψ1(h) = (b,c,d,e,f,….) describes a situation where today’s state is b, tomorrow’s is c, the following day’s is d, and so on. Thus, ψ1(h) describes the same “world” as h, but as seen from the perspective of tomorrow. Likewise, ψ2(h) = (c,d,e,f,g,….) describes the same “world” as h, but as seen from the perspective of the day after tomorrow, and so on.

Given the set Ψ of symmetries, an event E (a subset of Ω) is Ψ-invariant if the inverse image of E under ψ is E itself, for all ψ in Ψ. This implies that if a history h is in E, then ψ(h) will also be in E, for all ψ. In effect, if the world is in the set E today, it will remain in E tomorrow, and the day after tomorrow, and so on. Thus, E is a “persistent” event: an event one cannot escape from by moving forward in time. In a coin-tossing system, where Ψ is still the set of time translations, examples of Ψ-invariant events are “all Heads”, where E contains only the history (Heads, Heads, Heads, …), and “all Tails”, where E contains only the history (Tails, Tails, Tails, …).
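The action of the time translations on histories can be sketched in a few lines. This assumes a history is represented as a function from times t = 1, 2, 3, … to states, which is a modelling choice made here for illustration; `shift`, `prefix` and the two example histories are hypothetical names.

```python
from typing import Callable, List

History = Callable[[int], str]  # maps a time t = 1, 2, 3, ... to a state

def shift(h: History, r: int) -> History:
    """Return psi_r(h): the same world as h, seen from r time-steps later."""
    return lambda t: h(t + r)

def prefix(h: History, n: int) -> List[str]:
    """First n states of a history, for inspection."""
    return [h(t) for t in range(1, n + 1)]

def alphabet_history(t: int) -> str:
    """Example history (a, b, c, d, e, ...); only meaningful for t up to 26."""
    return chr(ord("a") + t - 1)

def all_heads(t: int) -> str:
    """The single history contained in the event 'all Heads'."""
    return "Heads"

if __name__ == "__main__":
    print(prefix(alphabet_history, 5))             # today's view
    print(prefix(shift(alphabet_history, 1), 5))   # tomorrow's view
    print(prefix(shift(all_heads, 3), 5))          # still all Heads after shifting
```

Shifting the all-Heads history returns a history that is again all Heads, which is what makes the event “all Heads” persistent in the sense described above.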

The system is ergodic (with respect to Ψ) if, for any Ψ-invariant event E, the unconditional probability of E, i.e., PrΩ(E), is either 0 or 1. In other words, the only persistent events are those which occur in almost no history (i.e., PrΩ(E) = 0) and those which occur in almost every history (i.e., PrΩ(E) = 1). Our coin-tossing system is ergodic, as exemplified by the fact that the Ψ-invariant events “all Heads” and “all Tails” occur with probability 0.
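The claim that “all Heads” has probability 0 follows in one line, assuming the probability of Heads satisfies 0 < p < 1 (a restriction not stated explicitly above; for p = 1 the event would instead have probability 1, which is still compatible with ergodicity):

```latex
\Pr\nolimits_{\Omega}(\text{all Heads})
  = \lim_{n\to\infty} \Pr\nolimits_{\Omega}(\text{Heads at times } 1,\dots,n)
  = \lim_{n\to\infty} p^{\,n} = 0 ,
\qquad
\Pr\nolimits_{\Omega}(\text{all Tails})
  = \lim_{n\to\infty} (1-p)^{\,n} = 0 .
```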

In an ergodic system, it is possible to estimate the probability of any event “empirically”, by simply counting the frequency with which that event occurs. Frequencies are thus evidence for probabilities. The formal statement of this is the following important result from the theory of dynamical systems and stochastic processes.

Ergodic Theorem: Suppose the system is ergodic. Let E be any event and let h be any history. For all times t in T, let Nt be the number of elements r in the set {1, 2, …, t} such that ψr(h) is in E. Then, with probability 1, the ratio Nt/t will converge to PrΩ(E) as t increases towards infinity.

Intuitively, Nt is the number of times the event E has “occurred” in history h from time 1 up to time t. The ratio Nt/t is therefore the frequency of occurrence of event E (up to time t) in history h. This frequency might be measured, for example, by performing a sequence of experiments or observations at times 1, 2, …, t. The Ergodic Theorem says that, almost certainly (i.e., with probability 1), the empirical frequency will converge to the true probability of E, PrΩ(E), as the number of observations becomes large. The estimation of the true conditional probability structure from the frequencies of Heads and Tails in our illustrative coin-tossing system is possible precisely because the system is ergodic.

To understand the significance of this result, let Y and Z be two subsets of X, and suppose E is the event “h(1) is in Y”, while D is the event “h(2) is in Z”. Then the intersection E ∩ D is the event “h(1) is in Y, and h(2) is in Z”. The Ergodic Theorem says that, by performing a sequence of observations over time, we can empirically estimate PrΩ(E) and PrΩ(E ∩ D) with arbitrarily high precision. Thus, we can compute the ratio PrΩ(E ∩ D)/PrΩ(E). But this ratio is simply the conditional probability Pr_E(D). And so, we are able to estimate the conditional probability that the state at time 2 will be in Z, given that at time 1 it was in Y. This illustrates that, by allowing us to estimate unconditional probabilities empirically, the Ergodic Theorem also allows us to estimate conditional probabilities, and in this way to learn the properties of the conditional probability structure {Pr_E}_{E⊆Ω}.
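The estimation procedure just described can be tried on a simulated single history. The sketch below assumes a specific toy system that is not in the text: a two-state Markov chain with states "Y" and "Z", a 50% chance of moving from Y to Z, and a 40% chance of moving back. All names and transition probabilities are illustrative.

```python
import random

def simulate_chain(n_steps, p_y_to_z=0.5, p_z_to_y=0.4, seed=0):
    """One history h(1), h(2), ... of a two-state chain; list index i holds h(i + 1)."""
    rng = random.Random(seed)
    state = "Y"
    history = [state]
    for _ in range(n_steps - 1):
        if state == "Y":
            state = "Z" if rng.random() < p_y_to_z else "Y"
        else:
            state = "Y" if rng.random() < p_z_to_y else "Z"
        history.append(state)
    return history

def estimate_conditional(history):
    """Estimate Pr_E(D) as (N_t for E and D) / (N_t for E) along one history."""
    t = len(history) - 2               # number of shifts psi_1, ..., psi_t we can inspect
    n_e = 0
    n_ed = 0
    for r in range(1, t + 1):
        if history[r] == "Y":          # psi_r(h) is in E: its first state is in Y
            n_e += 1
            if history[r + 1] == "Z":  # psi_r(h) is also in D: its second state is in Z
                n_ed += 1
    return (n_ed / t) / (n_e / t)

if __name__ == "__main__":
    h = simulate_chain(200_000)
    print(estimate_conditional(h))
```

For a long enough history the printed estimate should lie close to the transition probability 0.5 that was built into the chain, even though only one history was ever observed.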

We may thus conclude that ergodicity is what allows us to generalize from local observations to global laws. In effect, when we engage in scientific inference about some system, or even about the world at large, we rely on the hypothesis that this system, or the world, is ergodic. If our system, or the world, were “dappled”, then presumably we would not be able to presuppose ergodicity, and hence our ability to make scientific generalizations would be compromised.

Conjuncted: Austrian Economics. Some Ruminations. Part 1.

Ludwig von Mises’ argument concerning the impossibility of economic calculation under socialism provides a hint as to what a historically specific theory of capital could look like. He argues that financial accounting based on business capital is an indispensable tool when it comes to the allocation and distribution of resources in the economy. Socialism, which has to do without private ownership of the means of production and, therefore, must also sacrifice the concepts of (business) capital and financial accounting, cannot rationally appraise the value of the production factors. Without such an appraisal, production must necessarily result in chaos.

Phenomenological Model for Stock Portfolios. Note Quote.


The data analysis and modeling of financial markets have been hot research subjects for physicists as well as economists and mathematicians in recent years. The non-Gaussian property of the probability distributions of price changes, in stock markets and foreign exchange markets, has been one of the main problems in this field. From the analysis of high-frequency time series of market indices, a universal property was found in the probability distributions: the central part of the distribution agrees well with a Lévy stable distribution, while the tails deviate from it and show a different power-law asymptotic behavior. In probability theory, a distribution or a random variable is said to be stable if a linear combination of two independent copies of a random sample has the same distribution, up to location and scale parameters. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution, after Paul Lévy, the first mathematician to have studied it. The scaling property with respect to the sampling time interval of the data is also well described by the crossover between the two distributions. Several stochastic models of the fluctuation dynamics of stock prices have been proposed that reproduce the power-law behavior of the probability density. The auto-correlation of financial time series is also an important problem for markets. There is no time correlation of price changes on the daily scale, while more detailed data analysis found an exponential decay with a characteristic time τ = 4 minutes. The absence of auto-correlation on the daily scale does not imply that the time series is independent on that scale. In fact, there is auto-correlation of volatility (the absolute value of the price change) with a power-law tail.

A portfolio is a set of stock issues. The Hamiltonian of the system is introduced and expressed through spin-spin interactions, as in spin glass models of disordered magnetic systems. The interaction coefficients between two stocks are determined phenomenologically from empirical data: they are derived from the covariance of sequences of up and down spins using the fluctuation-response theorem. We start with the Hamiltonian of our system, which contains N stock issues. It is a function of the configuration S consisting of N coded price changes S_i (i = 1, 2, …, N) at equal trading times. The interaction coefficients are also dynamical variables, because the interactions between stocks are thought to change from time to time. We divide each coefficient into two parts: the constant part J_{ij}, which will be phenomenologically determined later, and the dynamical part δJ_{ij}. The Hamiltonian, including the interaction with external fields h_i (i = 1, 2, …, N), is defined as

H[S, δJ, h] = ∑_{<i,j>} [δJ_{ij}^2 / (2Δ_{ij}) − (J_{ij} + δJ_{ij}) S_i S_j] − ∑_i h_i S_i —– (1)

The summation is taken over all pairs of stock issues. This form of Hamiltonian is that of an annealed spin glass. The fluctuations δJ_{ij} are assumed to follow a Gaussian distribution. The central task of statistical physics is the evaluation of the partition function, which in this case is given by the following functional

Z[h] = ∑_{{S_i}} ∫ ∏_{<i,j>} (dδJ_{ij} / √(2πΔ_{ij})) e^{−H[S, δJ, h]} —– (2)

The integration over the variables δJij is easily performed and gives

Z[h] = A ∑_{{S_i}} e^{−H_eff[S, h]} —– (3)

Here the effective Hamiltonian Heff[S,h] is defined as

H_eff[S, h] = −∑_{<i,j>} J_{ij} S_i S_j − ∑_i h_i S_i —– (4)

and A = e^{(1/2) ∑_{<i,j>} Δ_{ij}} is just a normalization factor, which is irrelevant to the following steps. This form of Hamiltonian, with constant J_{ij}, is that of a quenched spin glass.
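The integration step can be written out for a single pair <i,j>. The following is a sketch of the standard Gaussian integral, assuming the coded price changes take the values S_i = ±1 (up and down spins), so that (S_i S_j)^2 = 1:

```latex
\int_{-\infty}^{\infty} \frac{d\,\delta J_{ij}}{\sqrt{2\pi\Delta_{ij}}}
  \exp\!\left[-\frac{\delta J_{ij}^{2}}{2\Delta_{ij}} + \delta J_{ij}\, S_i S_j\right]
  = \exp\!\left[\frac{\Delta_{ij}}{2}\,(S_i S_j)^{2}\right]
  = \exp\!\left[\frac{\Delta_{ij}}{2}\right].
```

Each pair therefore contributes a factor that does not depend on the configuration S, and the product of these factors over all pairs is the constant A. The terms that remain in the exponent are exactly those of the effective Hamiltonian in equation (4).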

The constant interaction coefficients J_{ij} are still undetermined. To determine them, we use the fluctuation-response theorem, which relates the susceptibility χ_{ij} to the covariance C_{ij} between dynamical variables and is given by the equation

χ_{ij} = ∂m_i/∂h_j |_{h=0} = C_{ij} —– (5)

The Thouless-Anderson-Palmer (TAP) equation for a quenched spin glass is

m_i = tanh(∑_j J_{ij} m_j + h_i − ∑_j J_{ij}^2 (1 − m_j^2) m_i) —– (6)

Equation (5) and the linear approximation of the equation (6) yield the equation

∑_k (δ_{ik} − J_{ik}) C_{kj} = δ_{ij} —– (7)
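The intermediate steps can be spelled out as follows. This is a sketch that assumes the reaction term (the second sum in equation (6)) is neglected in the linear approximation, which is what the form of equation (7) appears to require; the source does not state this explicitly.

```latex
% Linearize tanh(x) \approx x around m = 0 and drop the reaction term:
m_i \approx \sum_k J_{ik}\, m_k + h_i .
% Differentiate with respect to h_j and evaluate at h = 0:
\chi_{ij} = \frac{\partial m_i}{\partial h_j}\Big|_{h=0}
          = \sum_k J_{ik}\, \chi_{kj} + \delta_{ij} .
% Insert \chi_{kj} = C_{kj} from equation (5) and rearrange:
\sum_k \left(\delta_{ik} - J_{ik}\right) C_{kj} = \delta_{ij} .
```

In matrix form this reads (I − J) C = I, so that J = I − C^{-1} whenever the covariance matrix is invertible.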

Interpreting C_{ij} as the time average of empirical data over an observation time, rather than as an ensemble average, the constant interaction coefficients J_{ij} are phenomenologically determined by equation (7).
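Equation (7) can be solved for the couplings once a covariance matrix is available. The sketch below reads equation (7) in matrix form as J = I − C^{-1} and uses a small hand-written matrix inversion to stay dependency-free; the 3×3 covariance matrix is made-up illustrative data, and the function names are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [M | I] -> [I | M^-1].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def couplings_from_covariance(cov):
    """Solve sum_k (delta_ik - J_ik) C_kj = delta_ij for J, i.e. J = I - C^{-1}."""
    n = len(cov)
    inv = invert(cov)
    return [[(1.0 if i == j else 0.0) - inv[i][j] for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    # Illustrative covariance of three coded price-change series (not real data).
    C = [[1.0, 0.3, 0.1],
         [0.3, 1.0, 0.2],
         [0.1, 0.2, 1.0]]
    for row in couplings_from_covariance(C):
        print([round(x, 4) for x in row])
```

Only the off-diagonal entries enter the effective Hamiltonian, since its sum runs over pairs of distinct stocks; the diagonal of the returned matrix can be ignored for that purpose.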

The energy spectrum of the system, or simply the portfolio energy, is defined as the eigenvalues of the Hamiltonian H_eff[S, 0]. The probability density of the portfolio energy can be obtained in two ways. We can calculate the probability density from data by the equation

p(E) ΔE = p(E – ΔE/2 ≤ E ≤ E + ΔE/2) —– (8)
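The data-driven route can be sketched as a histogram of portfolio energies. The code below assumes each observed configuration is a list of spins S_i = ±1 and that the portfolio energy of a configuration is H_eff[S, 0] from equation (4); the bin settings and the example numbers are illustrative, not taken from the source.

```python
from itertools import combinations

def portfolio_energy(spins, J):
    """H_eff[S, 0] = - sum over pairs <i,j> of J_ij * S_i * S_j."""
    return -sum(J[i][j] * spins[i] * spins[j]
                for i, j in combinations(range(len(spins)), 2))

def energy_density(energies, e_min, bin_width, n_bins):
    """Histogram estimate of p(E): fraction of samples in each bin, divided by the bin width."""
    counts = [0] * n_bins
    for e in energies:
        k = int((e - e_min) // bin_width)
        if 0 <= k < n_bins:            # samples outside the binned range are ignored
            counts[k] += 1
    total = len(energies)
    return [c / (total * bin_width) for c in counts]

if __name__ == "__main__":
    J = [[0.0, 0.5, -0.2],
         [0.5, 0.0, 0.1],
         [-0.2, 0.1, 0.0]]
    observed = [[1, 1, -1], [1, 1, 1], [-1, 1, 1], [1, -1, 1]]
    energies = [portfolio_energy(s, J) for s in observed]
    print(energies)
    print(energy_density(energies, e_min=-1.0, bin_width=0.5, n_bins=4))
```

Each bin value multiplied by the bin width is the fraction of observed configurations whose energy falls in that bin, which is the quantity on the right-hand side of equation (8).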

This is a fully consistent phenomenological model for stock portfolios, expressed by the effective Hamiltonian (4). The model will also be applicable to other financial markets that show collective time evolution, e.g., foreign exchange markets, options markets, and inter-market interactions.