# Man Proposes OR Woman Proposes – “Matches” Happen Algorithmically and are Economically Walrasian Rather than Keynesian. Note Quote & Didactics.

Consider a set M of men and a set W of women. Each m ∈ M has a strict preference ordering over the elements of W, and each w ∈ W has a strict preference ordering over the men. Let us denote the preference ordering of an agent i by ≻i, so that x ≻i y means that agent i ranks x above y. A marriage, or matching, is an assignment of men to women such that each man is assigned to at most one woman and vice versa. But what if an agent decides to remain single? This can be modelled in two ways: either a man or a woman may be matched with him- or herself, or for each man and each woman there is a dummy woman or man in the set W or M that corresponds to being single. Under this construction we can safely assume |M| = |W|. There is another impediment, however, whereby a subversion of sorts is possible, in that a group of agents could simply opt out of the matching exercise. In such a scenario, it becomes necessary to define a blocking set. To capture this subversiveness, a matching is called unstable if there are two men m, m’ and two women w, w’ such that

1. m is matched to w
2. m’ is matched to w’, and
3. w’ ≻m w and m ≻w’ m’

Then the pair (m, w’) is a blocking pair. Any matching without a blocking pair is called stable.

Now, given the preferences of men and women, is it always possible to find a stable matching? Gale and Shapley’s deferred acceptance algorithm shows that the answer is yes.

Let us concentrate on the male-proposal version.

First, each man proposes to his top-ranked choice. Next, each woman who has received at least two proposals keeps (tentatively) her top-ranked proposal and rejects the rest. Then, each man who has been rejected proposes to his top-ranked choice among the women who have not rejected him. Again each woman who has at least two proposals (including ones from previous rounds) keeps her top-ranked proposal and rejects the rest. The process repeats until no man has a woman to propose to or each woman has at most one proposal. At this point the algorithm terminates and each man is assigned to a woman who has not rejected his proposal. No man is assigned to more than one woman. Since each woman is allowed to keep only one proposal at any stage, no woman is assigned to more than one man. Therefore the algorithm terminates in a matching.
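The steps above can be sketched in a few lines of Python. This is a minimal sketch; the function name is our own, and the preference lists are reconstructed so as to be consistent with the worked example below (the original figure with the lists is not reproduced in the text):

```python
def deferred_acceptance(proposer_prefs, acceptor_prefs):
    """Gale-Shapley deferred acceptance.

    Both arguments map each agent to a list of the other side's agents,
    best first.  Returns a dict: proposer -> matched acceptor.
    """
    # rank[a][p]: position of proposer p in acceptor a's list (lower = better)
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index to propose to
    engaged = {}                                  # acceptor -> tentative proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]     # best not-yet-tried choice
        next_choice[p] += 1
        if a not in engaged:
            engaged[a] = p                        # a tentatively keeps p
        elif rank[a][p] < rank[a][engaged[a]]:
            free.append(engaged[a])               # a trades up; her match re-enters
            engaged[a] = p
        else:
            free.append(p)                        # a rejects p; he tries again
    return {p: a for a, p in engaged.items()}

men_prefs   = {'m1': ['w2', 'w1', 'w3'],
               'm2': ['w1', 'w3', 'w2'],
               'm3': ['w1', 'w2', 'w3']}
women_prefs = {'w1': ['m1', 'm3', 'm2'],
               'w2': ['m3', 'm1', 'm2'],
               'w3': ['m2', 'm3', 'm1']}

male_optimal   = deferred_acceptance(men_prefs, women_prefs)
# male_optimal   == {'m1': 'w2', 'm2': 'w3', 'm3': 'w1'}
female_optimal = deferred_acceptance(women_prefs, men_prefs)
# female_optimal == {'w1': 'm1', 'w2': 'm3', 'w3': 'm2'}
```

Because the procedure is symmetric in the two sides, swapping the two arguments runs the female-proposal version; on this instance the two runs return two different stable matchings.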

Consider the matching {(m1, w1), (m2, w2), (m3, w3)}. This is an unstable matching, since (m1, w2) is a blocking pair. The matching {(m1, w1), (m3, w2), (m2, w3)}, however, is stable. Running the male-proposal algorithm on the preferences of this example (given in a figure not reproduced here), m1 proposes to w2, m2 to w1, and m3 to w1. At the end of this round, w1 is the only woman to have received two proposals: one from m3 and the other from m2. Since she ranks m3 above m2, she keeps m3 and rejects m2. Since m2 is the only man to have been rejected, he is the only one to propose again in the second round. This time he proposes to w3, his top-ranked choice among the women who have not rejected him. Now each woman has exactly one proposal and the algorithm terminates with the matching {(m1, w2), (m2, w3), (m3, w1)}.
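The stability claims in this example can be verified by brute force. The preference lists below are our reconstruction, chosen to be consistent with the account above, and the helper names are our own; enumerating all 3! matchings confirms that this instance has exactly the two stable matchings discussed:

```python
from itertools import permutations

# Preference lists reconstructed from the worked example (best choice first).
men_prefs   = {'m1': ['w2', 'w1', 'w3'],
               'm2': ['w1', 'w3', 'w2'],
               'm3': ['w1', 'w2', 'w3']}
women_prefs = {'w1': ['m1', 'm3', 'm2'],
               'w2': ['m3', 'm1', 'm2'],
               'w3': ['m2', 'm3', 'm1']}

def is_stable(match):                       # match: dict man -> woman
    inv = {w: m for m, w in match.items()}  # woman -> man
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == match[m]:
                break                       # m prefers his own match from here on
            # m prefers w to his match; (m, w) blocks iff w also prefers m
            if women_prefs[w].index(m) < women_prefs[w].index(inv[w]):
                return False
    return True

men = sorted(men_prefs)
stable = [dict(zip(men, ws))
          for ws in permutations(sorted(women_prefs))
          if is_stable(dict(zip(men, ws)))]
# stable holds exactly the male-optimal and female-optimal matchings
```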

The male propose algorithm terminates in a stable matching.

Suppose not. Then ∃ a blocking pair (m1, w1), with m1 matched to w2, say, and w1 matched to m2. Since (m1, w1) is blocking, w1 ≻m1 w2, so in the proposal algorithm m1 would have proposed to w1 before w2. Since the algorithm did not match m1 with w1, it must be that w1 received a proposal from a man she ranked higher than m1. Since the algorithm matches her to m2, it follows that m2 ≻w1 m1. This contradicts the fact that (m1, w1) is a blocking pair.

Even when the women propose, the outcome is still a stable matching. The difference is that the stable matching may not be the same as the one generated when the men propose. This implies that although a stable matching is guaranteed to exist, there may be more than one such matching. Why, then, prefer one to the other? There is a reason:

Denote a matching by μ. The woman assigned to man m in the matching μ is denoted μ(m); similarly, μ(w) is the man assigned to woman w. A matching μ is male-optimal if there is no stable matching ν such that ν(m) ≻m μ(m) or ν(m) = μ(m) for all m ∈ M, with ν(j) ≻j μ(j) for at least one j ∈ M. Female-optimality is defined analogously.

The stable matching produced by the (male-proposal) Deferred Acceptance Algorithm is male-optimal.

Let μ be the matching returned by the male-propose algorithm. Suppose μ is not male-optimal. Then there is a stable matching ν such that ν(m) ≻m μ(m) or ν(m) = μ(m) for all m, with ν(j) ≻j μ(j) for at least one j ∈ M. Therefore, in the application of the proposal algorithm, there must be an iteration where some man j proposes to ν(j) before μ(j), since ν(j) ≻j μ(j), and is rejected by woman ν(j). Consider the first such iteration. Since woman ν(j) rejects j, she must have received a proposal from a man i whom she prefers to man j. Since this is the first iteration at which a man is rejected by his partner under ν, it follows that man i ranks woman ν(j) higher than ν(i). Summarizing, i ≻ν(j) j and ν(j) ≻i ν(i), implying that ν is not stable, a contradiction.

Now, the obvious question: is this stable matching optimal with respect to both men and women? The answer this time is no. From the example above there are two stable matchings, one of them male-optimal and the other female-optimal. At least one woman is strictly better off under the female-optimal matching than under the male-optimal one, and no woman is worse off. From the men's point of view, the symmetric conclusion holds. A stable marriage is immune to a pair of agents opting out of the matching. We could ask that no subset of agents should have an incentive to opt out of the matching. Formally, a matching μ′ dominates a matching μ if there is a set S ⊂ M ∪ W such that for all m, w ∈ S, both (i) μ′(m), μ′(w) ∈ S and (ii) μ′(m) ≻m μ(m) and μ′(w) ≻w μ(w). Stability is the special case of this dominance condition in which attention is restricted to sets S consisting of a single couple. The set of undominated matchings is called the core of the matching game.

The direct mechanism associated with the male propose algorithm is strategy-proof for the males.

Suppose not. Then there is a profile of preferences π = (≻m1 , ≻m2 , . . . , ≻mn) for the men, such that man m1, say, can misreport his preferences and obtain a better match. To express this formally, let μ be the stable matching obtained by applying the male proposal algorithm to the profile π. Suppose that m1 reports the preference ordering ≻ instead. Let ν be the stable matching that results when the male-proposal algorithm is applied to the profile π1 = (≻, ≻m2 , . . . , ≻mn). For a contradiction, suppose ν(m1) ≻m1 μ(m1). For notational convenience let a ≽m b mean that a ≻m b or a = b.

First we show that m1 can achieve the same effect by choosing an ordering ≻̄ where woman ν(m1) is ranked first. Let π2 = (≻̄ , ≻m2 , . . . , ≻mn). Knowing that ν is stable w.r.t the profile π1 we show that it is stable with respect to the profile π2. Suppose not. Then under the profile π2 there must be a pair (m, w) that blocks ν. Since ν assigns to m1 its top choice with respect to π2, m1 cannot be part of this blocking pair. Now the preferences of all agents other than m1 are the same in π1 and π2. Therefore, if (m, w) blocks ν w.r.t the profile π2, it must block ν w.r.t the profile π1, contradicting the fact that ν is a stable matching under π1.

Let λ be the male-propose stable matching for the profile π2. Since ν is a stable matching w.r.t the profile π2 and λ is male-optimal w.r.t π2, it follows that λ(m1) = ν(m1).

Thus we may assume that ν(m1) is the top-ranked woman in the reported ordering ≻. We now show that the set B = {mj : μ(mj) ≻mj ν(mj)} is empty. This means that all men, not just m1, are no worse off under ν than under μ. Since ν is stable w.r.t the original profile π, this contradicts the male-optimality of μ.

Suppose B ≠ ∅. Then, when the male-proposal algorithm is applied to the profile π1, each mj ∈ B is rejected by his match under μ, i.e., by μ(mj). Consider the first iteration of the proposal algorithm at which some mj ∈ B is rejected by μ(mj). This means that woman μ(mj) has a proposal from a man mk that she ranks higher, i.e., mk ≻μ(mj) mj. Since mk was not matched to μ(mj) under μ, it must be that μ(mk) ≻mk μ(mj). Hence mk ∈ B, for otherwise μ(mj) ≽mk ν(mk) ≽mk μ(mk) ≻mk μ(mj), which is a contradiction. Since mk ∈ B and mk has proposed to μ(mj) by the time mj proposes, mk must have been rejected by μ(mk) before mj was rejected, contradicting our choice of mj.

The mechanism associated with the male propose algorithm is not strategy-proof for the females. Let us see how this is the case by way of the same example. The male propose algorithm returns the matching {(m1, w2), (m2, w3), (m3, w1)}. In the course of the algorithm the only woman who receives at least two proposals is w1: she receives proposals from m2 and m3. She rejects m2, who goes on to propose to w3, and the algorithm terminates. Notice that w1 is matched with her second choice. Suppose instead that she had rejected m3. Then m3 would have gone on to propose to w2. Woman w2 would then have a choice between m1 and m3; she would keep m3 and reject m1, who would go on to propose to w1. Woman w1 would keep m1 over m2 and, in the final matching, be paired with her first-ranked choice.
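The manipulation above can be replayed in code. The sketch below (helper names and reconstructed preference lists are our own) runs the male-proposal algorithm once with w1's true ranking and once with a misreported ranking in which she demotes m3 below m2, which forces the cascade of rejections described above:

```python
# Sketch of w1's profitable misreport under the male-proposal algorithm.
def male_propose(men_prefs, women_prefs):
    """Deferred acceptance with men proposing; returns man -> woman."""
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    nxt = {m: 0 for m in men_prefs}          # next index each man proposes to
    engaged, free = {}, list(men_prefs)      # engaged: woman -> tentative man
    while free:
        m = free.pop()
        w = men_prefs[m][nxt[m]]
        nxt[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])          # w trades up to the new proposer
            engaged[w] = m
        else:
            free.append(m)                   # w rejects m
    return {m: w for w, m in engaged.items()}

men   = {'m1': ['w2', 'w1', 'w3'],
         'm2': ['w1', 'w3', 'w2'],
         'm3': ['w1', 'w2', 'w3']}
truth = {'w1': ['m1', 'm3', 'm2'],           # w1's true ranking: m1 best
         'w2': ['m3', 'm1', 'm2'],
         'w3': ['m2', 'm3', 'm1']}
lie   = dict(truth, w1=['m1', 'm2', 'm3'])   # w1 demotes m3 below m2

honest = male_propose(men, truth)   # w1 ends up with m3, her second choice
gaming = male_propose(men, lie)     # w1 ends up with m1, her true first choice
```

The misreport makes w1 strictly better off under her true preferences, confirming that the male-proposal mechanism is manipulable by the women.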

Transposing this onto economic theory, the construction fits neatly into the Walrasian framework. Walras’ law states that excess supply in one market must be matched by excess demand in another, so that in aggregate the two balance out. It asserts that any one market must be in equilibrium if all other markets are in equilibrium. This contrasts with Keynesianism, which assumes that it is possible for just one market to be out of balance without a “matching” imbalance elsewhere. Moreover, Walrasian equilibria are the solutions of a fixed-point problem. In the cases when they can be computed efficiently, it is because the set of Walrasian equilibria can be described by a set of convex inequalities. The same can be said of stable matchings/marriages: the set of stable matchings is the set of fixed points of a nondecreasing function defined on a lattice.

# Fragmentation – Lit and Dark Electronic Exchanges. Thought of the Day 116.0

The issue of distance from the trading engine brings us to another key dimension of trading nowadays, especially in US equity markets, namely fragmentation. A trader in US equities markets has to be aware that there are up to 13 lit electronic exchanges and more than 40 dark ones. Together with this wide range of trading options, there is also specific regulation (the so-called ‘trade-through’ rules) which affects what happens to market orders sent to one exchange if there are better execution prices at other exchanges. The interaction of multiple trading venues, latency when moving between these venues, and regulation introduces additional dimensions to keep in mind when designing successful trading strategies.

The role of time is fundamental in the usual price-time priority electronic exchange, and in a fragmented market, the issue becomes even more important. Traders need to be able to adjust their trading positions fast in response to or in anticipation of changes in market circumstances, not just at the local exchange but at other markets as well. The race to be the first in or out of a certain position is one of the focal points of the debate on the benefits and costs of ‘high-frequency trading’.

The importance of speed permeates the whole process of designing trading algorithms, from the actual code, to the choice of programming language, to the hardware it is implemented on, to the characteristics of the connection to the matching engine, and the way orders are routed within an exchange and between exchanges. Exchanges, being aware of the importance of speed, have adapted and, amongst other things, moved well beyond the basic two types of orders (Market Orders and Limit Orders). Any trader should be very well-informed regarding all the different order types available at the exchanges, what they are and how they may be used.

When coding an algorithm one should be very aware of all the possible types of orders allowed, not just in one exchange, but in all competing exchanges where one’s asset of interest is traded. Being uninformed about the variety of order types can lead to significant losses. Since some of these order types allow changes and adjustments at the trading-engine level, they cannot be beaten in terms of latency by the trader’s own engine, regardless of how efficiently one’s algorithms are coded and hardwired.

Another important issue to be aware of is that trading in an exchange is not free, and the cost is not the same for all traders. For example, many exchanges run what is referred to as a maker-taker system of fees whereby a trader sending an MO (and hence taking liquidity away from the market) pays a trading fee, while a trader whose posted LO is filled by the MO (that is, the LO with which the MO is matched) will pay a much lower trading fee, or even receive a payment (a rebate) from the exchange for providing liquidity (making the market). On the other hand, there are markets with an inverted fee schedule, a taker-maker system where the fee structure is the reverse: those providing liquidity pay a higher fee than those taking liquidity (who may even get a rebate). The issue of exchange fees is quite important, as fees distort observed market prices (when you make a transaction the relevant price for you is the net price you pay/receive, which is the published price net of fees).
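As a toy illustration of how fees distort the relevant price, the arithmetic below uses made-up round-number fee levels, not any actual exchange's schedule:

```python
# Net prices under a maker-taker schedule (illustrative, made-up fee levels).
quoted = 100.00        # published price per share
taker_fee = 0.0030     # per-share fee charged to the incoming MO
maker_rebate = 0.0020  # per-share rebate paid to the resting LO that is filled

# A buyer whose MO lifts a resting sell LO:
buyer_net  = quoted + taker_fee      # pays more than the published price
seller_net = quoted + maker_rebate   # receives more than the published price

# Under an inverted (taker-maker) schedule the roles swap:
inv_buyer_net  = quoted - 0.0020     # taker receives the rebate
inv_seller_net = quoted - 0.0030     # maker pays the fee
```

The published price is the same in both scenarios, yet every participant's economically relevant price differs from it, and in opposite directions depending on the fee schedule.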

# Evolutionary Game Theory. Note Quote

In classical evolutionary biology the fitness landscape for possible strategies is considered static. Therefore optimization theory is the usual tool for analyzing the evolution of strategies, which consequently tend to climb the peaks of the static landscape. In more realistic scenarios, however, the evolution of populations modifies the environment, so that the fitness landscape becomes dynamic. In other words, the maxima of the fitness landscape depend on the number of specimens that adopt each strategy (a frequency-dependent landscape). In this case, when the evolution depends on agents’ actions, game theory is the adequate mathematical tool to describe the process. But this is precisely the scheme in which the evolving physical laws (i.e. algorithms or strategies) are generated from agent-agent interactions (a bottom-up process) subject to natural selection.

The concept of evolutionarily stable strategy (ESS) is central to evolutionary game theory. An ESS is defined as a strategy that cannot be displaced by any alternative strategy when it is followed by the great majority – almost all – of the systems in a population. In general,

an ESS is not necessarily optimal; however, it might be assumed that in the last stages of evolution – before achieving the quantum equilibrium – the fitness landscape of possible strategies could be considered static, or at least slowly varying. In this simplified case an ESS would be the strategy with the highest payoff, therefore satisfying an optimizing criterion. Different ESSs could exist in other regions of the fitness landscape.
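The frequency-dependent landscape and the ESS notion can be illustrated with the textbook Hawk–Dove game; the payoff values, step size, and iteration count below are illustrative choices of ours, not part of the theory above:

```python
# Replicator-dynamics sketch for the Hawk-Dove game, a standard
# frequency-dependent fitness landscape (V < C; values are illustrative).
V, C = 2.0, 4.0                       # resource value, cost of an escalated fight
payoff = [[(V - C) / 2, V],           # payoff[i][j]: row strategy i vs column j
          [0.0,         V / 2]]       # 0 = Hawk, 1 = Dove

p = 0.9                               # initial fraction of Hawks
for _ in range(5000):
    f_hawk = p * payoff[0][0] + (1 - p) * payoff[0][1]
    f_dove = p * payoff[1][0] + (1 - p) * payoff[1][1]
    p += 0.05 * p * (1 - p) * (f_hawk - f_dove)   # discrete replicator step
# p converges to the mixed ESS V/C = 0.5: once (almost) everyone plays the
# mixture, neither pure Hawk nor pure Dove earns more and can invade
```

Because the fitness of each strategy depends on the current population frequencies, the dynamics settle on the frequency at which Hawk and Dove payoffs are equal, the defining property of the mixed ESS.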

In the information-theoretic Darwinian approach it seems plausible to assume as optimization criterion the optimization of information flows for the system. A set of three regulating principles could be:

Structure: The complexity of the system is optimized (maximized). The definition adopted for complexity is Bennett’s logical depth, which for a binary string is the time needed to execute the minimal program that generates that string. There is no generally accepted definition of complexity, nor a consensus on the relation between the increase of complexity – for a given definition – and Darwinian evolution. However, there seems to be some agreement on the fact that, in the long term, Darwinian evolution should drive an increase in complexity in the biological realm, for an adequate natural definition of this concept. The complexity of a system at time t in this theory would then be the Bennett logical depth of the program stored at time t in its Turing machine. The increase of complexity is a characteristic of Lamarckian evolution, and it is also admitted that the trend of evolution in the Darwinian theory is in the direction in which complexity grows, although whether this tendency depends on the timescale – or on some other factors – is still not very clear.

Dynamics: The information outflow of the system is optimized (minimized). The information here is the Fisher information measure for the probability density function of the position of the system. According to S. A. Frank, natural selection acts by maximizing the Fisher information within a Darwinian system. As a consequence, assuming that the flow of information between a system and its surroundings can be modeled as a zero-sum game, Darwinian systems would follow these dynamics.

Interaction: The interaction between two subsystems optimizes (maximizes) the complexity of the total system. The complexity is again equated with Bennett’s logical depth. The role of Interaction is central in the generation of composite systems, and therefore in the structure of the information processor of composite systems resulting from the logical interconnections among the processors of the constituents. There is an enticing option of defining the complexity of a system in contextual terms, as the capacity of a system for anticipating the behavior at t + ∆t of the surrounding systems included in the sphere of radius r centered at the position X(t) occupied by the system. This definition would directly lead to the maximization of the predictive power of the systems that maximized their complexity. However, this magnitude would be very difficult even to estimate, in principle much more so than the usual definitions of complexity.

Quantum behavior of microscopic systems should now emerge from the ESS. In other terms, the postulates of quantum mechanics should be deduced from the application of the three regulating principles on our physical systems endowed with an information processor.

Let us apply Structure. It is reasonable to consider that the maximization of the complexity of a system would in turn maximize its predictive power, and this optimal statistical-inference capacity would plausibly induce the complex Hilbert-space structure for the system’s space of states. Let us now consider Dynamics. This is basically the application of the principle of minimum Fisher information, or maximum Cramér-Rao bound, to the probability distribution function for the position of the system. The concept of entanglement seems to be decisive in the study of the generation of composite systems, in particular in this theory through the application of Interaction. The theory admits a simple model that characterizes the entanglement between two subsystems as the mutual exchange of randomizers (R1, R2), programs (P1, P2) – with their respective anticipation modules (A1, A2) – and wave functions (Ψ1, Ψ2). In this way, each subsystem can anticipate not only the behavior of its own surrounding systems, but also that of the environment of its entangled partner. In addition, entanglement can be considered a natural phenomenon in this theory, a consequence of the tendency to increase complexity, and therefore, in a certain sense, an experimental support for the theory.

In addition, the information-theoretic Darwinian approach is a minimalist realist theory: every system follows a continuous trajectory in time, as in Bohmian mechanics, and the theory is local in physical space. Apparent nonlocality, as in Bell-inequality violations, would be an artifact of the anticipation module in the information space. Randomness, however, would necessarily be intrinsic to nature through the random number generator methodologically associated with every fundamental system at t = 0, an essential ingredient to start and fuel – through variation – Darwinian evolution. As time increases, random events determined by the random number generators would progressively be replaced by causal events determined by the evolving programs that gradually take control of the elementary systems. Randomness would be displaced by causality as physical Darwinian evolution gave rise to the quantum equilibrium regime, but not completely, since randomness would still play a crucial role in the optimization of strategies – and thus of information flows – as game theory states.

# Accelerated Capital as an Anathema to the Principles of Communicative Action. A Note Quote on the Reciprocity of Capital and Ethicality of Financial Economics

Markowitz portfolio theory explicitly observes that portfolio managers are not (expected) utility maximisers, as they diversify, and offers the hypothesis that a desire for reward is tempered by a fear of uncertainty. The model concludes that all investors should hold the same portfolio; their individual risk-reward objectives are satisfied by the weighting of this ‘index portfolio’ against riskless cash in the bank, a point on the capital market line. The slope of the Capital Market Line is the market price of risk, an important parameter in arbitrage arguments.

Merton had initially attempted to provide an alternative to Markowitz based on utility maximisation employing stochastic calculus. He was only able to resolve the problem by employing the hedging arguments of Black and Scholes, and in doing so built a model based on the absence of arbitrage, free of turpe-lucrum. The prescriptive statement “it should not be possible to make sure profits” is explicit in the Efficient Markets Hypothesis and in the employment of an Arrow security in the context of the Law of One Price. Based on these observations, we conjecture that the whole paradigm of financial economics is built on the principle of balanced reciprocity. In order to explore this conjecture we shall examine the relationship between commerce and themes in Pragmatic philosophy. Specifically, we highlight Robert Brandom’s (Making It Explicit: Reasoning, Representing, and Discursive Commitment) position that there is a pragmatist conception of norms – a notion of primitive correctnesses of performance implicit in practice that precede and are presupposed by their explicit formulation in rules and principles.

The ‘primitive correctnesses’ of commercial practices were recognised by Aristotle when he investigated the nature of Justice in the context of commerce, and later by Olivi when he looked favourably on merchants. It is exhibited in the doux-commerce thesis; compare Fourcade and Healy’s contemporary description of the thesis – “Commerce teaches ethics mainly through its communicative dimension, that is, by promoting conversations among equals and exchange between strangers” – with Putnam’s description of Habermas’ communicative action as “based on the norm of sincerity, the norm of truth-telling, and the norm of asserting only what is rationally warranted … [and] is contrasted with manipulation” (Hilary Putnam, The Collapse of the Fact/Value Dichotomy and Other Essays).

There are practices (that should be) implicit in commerce that make it an exemplar of communicative action. A further expression of markets as centres of communication is manifested in the Asian description of a market, which brings to mind Donald Davidson’s (Subjective, Intersubjective, Objective) argument that knowledge is not the product of a bipartite conversation but of a tripartite relationship between two speakers and their shared environment. Replacing the negotiation between market agents with an algorithm that delivers a theoretical price replaces ‘knowledge’, generated through communication, with dogma. The problem with the performativity that Donald MacKenzie (An Engine, Not a Camera: How Financial Models Shape Markets) is concerned with is one of monism. In employing pricing algorithms, the markets cannot perform to something that comes close to ‘true belief’, which can only be identified through communication between sapient humans. This is an almost trivial observation to (successful) market participants, but difficult to appreciate for spectators who seek to attain ‘objective’ knowledge of markets from a distance. The relevance to financial crises lies in the position that ‘true belief’ is about establishing coherence through myriad triangulations centred on an asset, rather than relying on a theoretical model.

Shifting gears now: unless the martingale measure is a by-product of a hedging approach, the price given by such martingale measures is not related to the cost of a hedging strategy, and therefore the meaning of such ‘prices’ is not clear. If the hedging argument cannot be employed, as in the markets studied by Cont and Tankov (Financial Modelling with Jump Processes), there is no conceptual framework supporting the prices obtained from the Fundamental Theorem of Asset Pricing. This lack of meaning can be interpreted as a consequence of the strict fact/value dichotomy in contemporary mathematics that came with the eclipse of Poincaré’s Intuitionism by Hilbert’s Formalism and Bourbaki’s Rationalism. The practical problem of supporting the social norms of market exchange has been replaced by a theoretical problem of developing formal models of markets. These models then legitimate the actions of agents in the market without having to make reference to explicitly normative values.

The Efficient Market Hypothesis is based on the axiom that the market price is determined by the balance between supply and demand, so that an increase in trading facilitates the convergence to equilibrium. If this axiom is replaced by the axiom of reciprocity, the justification for speculative activity in support of efficient markets disappears. In fact, the axiom of reciprocity would de-legitimise ‘true’ arbitrage opportunities as being unfair. This would not necessarily make the activities of actual market arbitrageurs illicit, since there are rarely strategies that are without the risk of a loss; however, it would place more emphasis on the risks of speculation and inhibit the hubris that has been associated with the prelude to the recent Crisis. These points raise the question of the legitimacy of speculation in the markets. In an attempt to understand this issue, Gabrielle and Reuven Brenner identify three types of market participant. ‘Investors’ are preoccupied with future scarcity and so defer income. Because uncertainty exposes the investor to the risk of loss, investors wish to minimise uncertainty at the cost of potential profits; this is the basis of classical investment theory. ‘Gamblers’ will bet on an outcome taking odds that have been agreed on by society, such as with a sporting bet or in a casino; this relates to de Moivre’s and Montmort’s ‘taming of chance’. ‘Speculators’ bet on a mis-calculation of the odds quoted by society, and the reason why speculators are regarded as socially questionable is that they have opinions that are explicitly at odds with the consensus: they are practitioners who rebel against a theoretical ‘Truth’. This is captured in Arjun Appadurai’s argument that the leading agents in modern finance “believe in their capacity to channel the workings of chance to win in the games dominated by cultures of control … [they] are not those who wish to ‘tame chance’ but those who wish to use chance to animate the otherwise deterministic play of risk [quantifiable uncertainty]”.

In the context of Pragmatism, financial speculators embody pluralism, a concept essential to Pragmatic thinking and an antidote to the problem of radical uncertainty. Appadurai was motivated to study finance by Marcel Mauss’ essay Le Don (The Gift), which explores the moral force behind reciprocity in primitive and archaic societies, and he goes on to say that the contemporary financial speculator is “betting on the obligation of return”, and that this is the fundamental axiom of contemporary finance. David Graeber (Debt: The First 5,000 Years) also recognises the fundamental position reciprocity has in finance, but whereas Appadurai recognises the importance of reciprocity in the presence of uncertainty, Graeber essentially ignores uncertainty in an analysis that ends with the conclusion that “we don’t ‘all’ have to pay our debts”. In advocating that reciprocity need not be honoured, Graeber is challenging not just contemporary capitalism but also the foundations of the civitas, based on equality and reciprocity. The origins of Graeber’s argument lie in the first half of the nineteenth century. In 1836 John Stuart Mill defined political economy as being concerned with “[man] solely as a being who desires to possess wealth, and who is capable of judging of the comparative efficacy of means for obtaining that end”.

In Principles of Political Economy with Some of Their Applications to Social Philosophy, Mill defended Thomas Malthus’ An Essay on the Principle of Population, which focused on scarcity. Mill was writing at a time when Europe was struck by the Cholera pandemic of 1829–1851 and the famines of 1845–1851, and while Lord Tennyson was describing nature as “red in tooth and claw”. At this time, society’s fear of uncertainty seems to have been replaced by a fear of scarcity, and these standards of objectivity dominated economic thought through the twentieth century. Almost a hundred years after Mill, Lionel Robbins defined economics as “the science which studies human behaviour as a relationship between ends and scarce means which have alternative uses”. Dichotomies emerged in the aftermath of the Cartesian revolution, which aimed to remove doubt from philosophy: theory and practice, subject and object, facts and values, means and ends were all separated. In this environment ex cathedra norms, in particular utility (profit) maximisation, encroached on commercial practice.

In order to set boundaries on commercial behaviour motivated by profit maximisation, particularly when market uncertainty returned after the Nixon shock of 1971, society imposes regulations on practice. As a consequence, two competing ethics vie for supremacy: a functional Consequentialist ethic guiding market practices and a regulatory Deontological ethic attempting to stabilise the system. It is in this debilitating competition between two essentially theoretical ethical frameworks that we offer an explanation for the Financial Crisis of 2007–2009: profit maximisation, not speculation, is destabilising in the presence of radical uncertainty, and regulation cannot keep up with motivated profit maximisers who can justify their actions through abstract mathematical models that bear little resemblance to actual markets. An implication of reorienting financial economics to focus on markets as centres of ‘communicative action’ is that markets could become self-regulating, in the same way that the legal or medical spheres are self-regulated through professions. This is not a ‘libertarian’ argument based on freeing the Consequentialist ethic from a Deontological brake. Rather, it argues that being a market participant entails restrictive norms on the agent, such as sincerity and truth-telling, that support knowledge creation, of asset prices, within a broader objective of social cohesion. This immediately calls into question the legitimacy of algorithmic/high-frequency trading, which seems anathema to the principles of communicative action.

# Algorithmic Subfield Representation of the Depth of Descent Tree

A finite field K admits a sparse medium subfield representation if

– it has a subfield of q² elements for a prime power q, i.e. K is isomorphic to F_{q^{2k}} with k ≥ 1;

– there exist two polynomials h_0 and h_1 over F_{q^2} of small degree, such that h_1 X^q − h_0 has a degree-k irreducible factor.

We shall assume that all the fields under consideration admit a sparse medium subfield representation. Furthermore, we also assume that the degrees of the polynomials h_0 and h_1 are uniformly bounded by a constant δ. Any finite field of the form F_{q^{2k}} with k ≤ q + 2 admits a sparse medium subfield representation with polynomials h_0 and h_1 of degree at most 2.

In a field in sparse medium subfield representation, elements will always be represented as polynomials of degree less than k with coefficients in F_{q^2}. When we talk about the discrete logarithm of such an element, we implicitly assume that a basis for this discrete logarithm has been chosen, and we limit ourselves to the case of a subgroup whose order has no small prime factor.

Proposition: Let K = F_{q^{2k}} be a finite field that admits a sparse medium subfield representation. Under the heuristics, there exists an algorithm whose complexity is polynomial in q and k and which can be used for the following two tasks.

1. Given an element of K represented by a polynomial P ∈ F_{q^2}[X] with 2 ≤ deg P ≤ k − 1, the algorithm returns an expression of log P(X) as a linear combination of at most O(kq²) logarithms log P_i(X) with deg P_i ≤ ⌈(deg P)/2⌉ and of log h_1(X).

2. The algorithm returns the logarithm of h_1(X) and the logarithms of all the elements of K of the form X + a, for a ∈ F_{q^2}.

Let P(X) be an element of K for which we want to compute the discrete logarithm. Here P is a polynomial of degree at most k − 1 with coefficients in F_{q^2}. We start by applying the algorithm of the above Proposition to P. We obtain a relation of the form

log P = e_0 log h_1 + Σ_i e_i log P_i,

where the sum has at most κq²k terms for a constant κ and the P_i's have degree at most ⌈(deg P)/2⌉. Then we apply the algorithm recursively to the P_i's, thus creating a descent procedure where at each step a given element P is expressed as a product of elements whose degree is at most half the degree of P (rounded up), and the arity of the descent tree is in O(q²k). At the end of the process, the logarithm of P is expressed as a linear combination of the logarithms of h_1 and of the linear polynomials, for which the logarithms are computed with the algorithm of the above Proposition in its second form.

We are left with the complexity analysis of the descent process. Each internal node of the descent tree corresponds to one application of the algorithm of the above Proposition, therefore each internal node has a cost which is bounded by a polynomial in q and k. The total cost of the descent is therefore bounded by the number of nodes in the descent tree times a polynomial in q and k. The depth of the descent tree is in O(log k). The number of nodes of the tree is then less than or equal to its arity raised to the power of its depth, which is (q²k)^{O(log k)}. Since any polynomial in q and k is absorbed in the O() notation in the exponent, we obtain the following result.

Let K = F_{q^{2k}} be a finite field that admits a sparse medium subfield representation. Assuming the same heuristics as in the above Proposition, any discrete logarithm in K can be computed in a time bounded by

max(q, k)^{O(log k)}
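As a toy illustration of the complexity argument, the halving recursion deg → ⌈deg/2⌉ reaches degree 1 in O(log k) steps, so the descent tree has at most arity^depth nodes. A minimal sketch (the function names are ours, not from any reference implementation):

```python
import math

def descent_depth(deg):
    """Number of halving steps deg -> ceil(deg/2) until degree 1 is reached."""
    depth = 0
    while deg > 1:
        deg = math.ceil(deg / 2)
        depth += 1
    return depth

def node_bound(q, k):
    """Crude bound on the descent-tree size: arity**depth with arity q^2 * k,
    i.e. (q^2 k)^O(log k) as in the text (the initial degree is at most k - 1)."""
    return (q ** 2 * k) ** descent_depth(k - 1)
```

For instance, a degree-100 polynomial needs only 7 halving steps, which is where the O(log k) depth, and hence the quasi-polynomial bound, comes from.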

# Glue Code + Pipeline Jungles. Thought of the Day 25.0

Machine learning researchers tend to develop general purpose solutions as self-contained packages. A wide variety of these are available as open-source packages at places like mloss.org, or from in-house code, proprietary packages, and cloud-based platforms. Using self-contained solutions often results in a glue code system design pattern, in which a massive amount of supporting code is written to get data into and out of general-purpose packages.

This glue code design pattern can be costly in the long term, as it tends to freeze a system to the peculiarities of a specific package. General purpose solutions often have different design goals: they seek to provide one learning system to solve many problems, but many practical software systems are highly engineered to apply to one large-scale problem, for which many experimental solutions are sought. While generic systems might make it possible to interchange optimization algorithms, it is quite often the refactoring of the construction of the problem space which yields the most benefit to mature systems. The glue code pattern implicitly embeds this construction in supporting code instead of in principled components. As a result, the glue code pattern often makes experimentation with other machine learning approaches prohibitively expensive, resulting in an ongoing tax on innovation.

Glue code can be reduced by choosing to re-implement specific algorithms within the broader system architecture. At first, this may seem like a high cost to pay – reimplementing a machine learning package in C++ or Java that is already available in R or MATLAB, for example, may appear to be a waste of effort. But the resulting system may require dramatically less glue code to integrate in the overall system, be easier to test, be easier to maintain, and be better designed to allow alternate approaches to be plugged in and empirically tested. Problem-specific machine learning code can also be tweaked with problem-specific knowledge that is hard to support in general packages.
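One way to make that trade-off concrete: hide whichever package (or re-implementation) is used behind a thin, problem-specific interface, so alternate approaches can be plugged in and tested without touching glue. A minimal sketch with hypothetical names (`Ranker` and `MeanThresholdRanker` are ours, purely for illustration):

```python
from abc import ABC, abstractmethod

class Ranker(ABC):
    """Problem-specific interface: the rest of the system codes against this,
    not against any particular ML package."""
    @abstractmethod
    def fit(self, features, labels): ...
    @abstractmethod
    def score(self, features): ...

class MeanThresholdRanker(Ranker):
    """Deliberately trivial in-house implementation; a wrapped external
    package would plug in behind the same interface."""
    def fit(self, features, labels):
        # Threshold at the mean of the positive examples.
        positives = [f for f, l in zip(features, labels) if l == 1]
        self.threshold = sum(positives) / len(positives)
        return self

    def score(self, features):
        return [1 if f >= self.threshold else 0 for f in features]
```

Swapping in a different learner is then a one-line change at the construction site, rather than a rewrite of supporting code.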

As a special case of glue code, pipeline jungles often appear in data preparation. These can evolve organically, as new signals are identified and new information sources added. Without care, the resulting system for preparing data in an ML-friendly format may become a jungle of scrapes, joins, and sampling steps, often with intermediate file outputs. Managing these pipelines, detecting errors and recovering from failures are all difficult and costly. Testing such pipelines often requires expensive end-to-end integration tests. All of this adds to the technical debt of a system and makes further innovation more costly. It is worth noting that glue code and pipeline jungles are symptomatic of integration issues that may have a root cause in overly separated "research" and "engineering" roles. When machine learning packages are developed in an ivory-tower setting, the resulting packages may appear to be more like black boxes to the teams that actually employ them in practice.
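A pipeline jungle can often be tamed by composing the preparation steps explicitly, so the whole chain is one unit-testable function rather than a pile of scripts and intermediate files. A minimal sketch (the step names are hypothetical):

```python
def pipeline(*steps):
    """Compose data-preparation steps into a single, testable unit instead of
    a jungle of ad-hoc scripts writing intermediate files."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run

# Hypothetical preparation steps, for illustration only.
strip_nulls = lambda rows: [r for r in rows if r is not None]
to_float = lambda rows: [float(r) for r in rows]

prepare = pipeline(strip_nulls, to_float)
```

Each step can then be tested in isolation, and the composed `prepare` in one cheap end-to-end check.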

Price prediction is extremely difficult because price fluctuations are small and are secondary to liquidity fluctuations. A question arises whether liquidity deficit can be traded directly. If we accept that liquidity deficit is an entity of the same nature as volatility, then the answer is yes, and liquidity deficit can be traded through some kind of derivative instrument. Let us illustrate the approach on a simple case – options trading. Whatever option model is used, its key element is implied volatility. An implied volatility trading strategy can be implemented by trading some delta-neutral "synthetic asset", built e.g. as long–short pairs of a call on an asset and the asset itself, call–put pairs or similar "delta-neutral vehicles". Delta neutral is a portfolio strategy consisting of multiple positions with offsetting positive and negative deltas so that the overall delta of the assets in question totals zero. A delta-neutral portfolio balances the response to market movements over a certain range to bring the net change of the position to zero. Delta measures how much an option's price changes when the underlying security's price changes. Optimal implementation of such a "synthetic asset" depends on commissions, liquidity available, exchange access, etc., and varies from fund to fund. Assume we have built such a delta-neutral instrument, the price of which depends on volatility only. How to trade it? We have the same two requirements: 1) avoid catastrophic P&L drain and 2) predict the future value of volatility (forward volatility). When trading a delta-neutral strategy, this matches our framework exactly, and the trading algorithm becomes this:

1. If for the underlying asset we have (execution flow at time t = 0) I0 < IIL (liquidity deficit), then enter a "long volatility" position in the "delta-neutral" synthetic asset. This entry condition means that if the current execution flow is low, its future value will be high: if at the current price the value of I0 is low, the price will change so as to increase future I.
2. If for the underlying asset we have (execution flow at time t = 0) I0 > IIH, then close any existing "long volatility" position in the "delta-neutral" synthetic asset. At high I0, the future value of I cannot be determined; it can either go down (typically) or increase even more (much more seldom, but just a few such events are sufficient to incur a catastrophic P&L drain). According to the main concept of the P&L trading strategy, one should hold a zero position during market uncertainty.
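The two-threshold rule above can be sketched as a state machine; `i_low` and `i_high` stand in for the thresholds IIL and IIH, whose calibration the text leaves open:

```python
def volatility_signal(i0, i_low, i_high, position_open):
    """Two-threshold rule sketch: i0 is the current execution flow I0;
    i_low and i_high are the (assumed) liquidity-deficit thresholds IIL, IIH."""
    if i0 < i_low and not position_open:
        return "enter_long_volatility"   # liquidity deficit: I0 < IIL
    if i0 > i_high and position_open:
        return "close_position"          # high I0 is unpredictable: go flat
    return "hold"
```

Note the asymmetry: the rule only ever enters long-volatility positions, consistent with the never-short-volatility point made below.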

The reason why this strategy is expected to be profitable is that experiments show that implied volatility is very much price fluctuation–dependent, and execution flow spikes I0 > IIH in the underlying asset typically lead to a substantial price move and then an implied volatility increase for the "synthetic asset". This is a typical "buy low volatility, sell high volatility" strategy. The key difference from the regular case is that, instead of price volatility, liquidity deficit is used as a proxy for forward volatility. The described strategy never goes short volatility, so catastrophic P&L drain is unlikely. In addition, the actual trading implementation requires the use of a "delta-neutral" synthetic asset, which incurs substantial costs in commissions and execution, so actual P&L is difficult to estimate without an existing setup for high-frequency option trading.

# Random Uniform Deviate, or Correlation Dimension

A widely used dimension algorithm in data analysis is the correlation dimension. Fix m, a positive integer, and r, a positive real number. Given a time series of data u(1), u(2), …, u(N), from measurements equally spaced in time, form a sequence of vectors x(1), x(2), …, x(N − m + 1) in R^m, defined by x(i) = [u(i), u(i+1), …, u(i+m−1)]. Next, define for each i, 1 ≤ i ≤ N − m + 1,

C_i^m(r) = (number of j such that d[x(i), x(j)] ≤ r)/(N − m + 1) ———- [1]

We must define d[x(i), x(j)] for vectors x(i) and x(j). We define

d[x(i), x(j)] = max_{k=1,2,…,m} |u(i+k−1) − u(j+k−1)| ———- [2]

From the C_i^m(r), define

C^m(r) = (N − m + 1)^{−1} Σ_{i=1}^{N−m+1} C_i^m(r) ———- [3]

and define

β_m = lim_{r→0} lim_{N→∞} log C^m(r)/log r ———- [4]

The assertion is that for m sufficiently large, β_m is the correlation dimension. Such a limiting slope has been shown to exist for the commonly studied chaotic attractors. This procedure has frequently been applied to experimental data; investigators seek a "scaling range" of r values for which log C^m(r)/log r is nearly constant for large m, and they infer that this ratio is the correlation dimension. In some instances, investigators have concluded that this procedure establishes deterministic chaos.
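The estimator reads directly off Eqs. [1]–[3]; here is a brute-force O(N²) sketch (an estimate of β_m would then come from the slope of log C^m(r) against log r over a scaling range):

```python
import numpy as np

def correlation_sum(u, m, r):
    """C^m(r) of Eqs. [1]-[3]: average fraction of delay vectors within
    Chebyshev distance r of each reference vector."""
    n = len(u) - m + 1
    x = np.array([u[i:i + m] for i in range(n)])               # delay vectors x(i)
    d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)  # max-norm, Eq. [2]
    return np.mean(np.sum(d <= r, axis=1) / n)                 # Eqs. [1] and [3]
```

This counts self-pairs, as the definition above does; for long series that bias is negligible.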

The latter conclusion is not necessarily correct: a converged, finite correlation dimension value does not guarantee that the defining process is deterministic. Consider the following stochastic process. Fix 0 ≤ p ≤ 1. Define X_j = α^{−1/2} sin(2πj/12) for all j, where α is specified below. Define Y_j as a family of independent identically distributed (i.i.d.) real random variables, with uniform density on the interval [−√3, √3]. Define Z_j = 1 with probability p, Z_j = 0 with probability 1 − p.

α = (Σ_{j=1}^{12} sin²(2πj/12))/12 ———- [5]

and define MIX_j = (1 − Z_j)X_j + Z_j Y_j. Intuitively, MIX(p) is generated by first ascertaining, for each j, whether the jth sample will be from the deterministic sine wave or from the random uniform deviate, with likelihood (1 − p) of the former choice, then calculating either X_j or Y_j. Increasing p marks a tendency towards greater system randomness. We now show that almost surely β_m in [4] equals 0 for all m for the MIX(p) process, p ≠ 1. Fix m, define K_j = (12m)j − 12m, and define N_j = 1 if (MIX_{K_j+1}, …, MIX_{K_j+m}) = (X_1, …, X_m), N_j = 0 otherwise. The N_j are i.i.d. random variables, with expected value E(N_j) ≥ (1 − p)^m. By the Strong Law of Large Numbers,

lim_{N→∞} Σ_{j=1}^N N_j/N = E(N_1) ≥ (1 − p)^m

Observe that (Σ_{j=1}^N N_j/(12mN))² is a lower bound to C^m(r), since x_{K_i+1} = x_{K_j+1} if N_i = N_j = 1. Thus for r < 1

lim sup_{N→∞} log C^m(r)/log r ≤ (1/log r) log lim_{N→∞} (Σ_{j=1}^N N_j/(12mN))² = log[(1 − p)^{2m}/(12m)²]/log r

Since (1 − p)^{2m}/(12m)² is independent of r, β_m = lim_{r→0} lim_{N→∞} log C^m(r)/log r = 0. Since β_m ≠ 0 with probability 0 for each m, by countable additivity, β_m = 0 for all m almost surely.

The MIX(p) process can be motivated by considering an autonomous unit that produces sinusoidal output, surrounded by a world of interacting processes that in ensemble produces output that resembles noise relative to the timing of the unit. The extent to which the surrounding world interacts with the unit could be controlled by a gateway between the two, with a larger gateway admitting greater apparent noise to compete with the sinusoidal signal. It is easy to show that, given a sequence X_j, a sequence of i.i.d. Y_j defined by a density function and independent of the X_j, and Z_j = X_j + Y_j, then Z_j has an infinite correlation dimension. It appears that correlation dimension distinguishes between correlated and uncorrelated successive iterates, with larger estimates of dimension corresponding to uncorrelated data. For a more complete interpretation of correlation dimension results, stochastic processes with correlated increments should be analyzed. Error estimates in dimension calculations are commonly seen. In statistics, one presumes a specified underlying stochastic distribution to estimate misclassification probabilities. Without knowing the form of a distribution, or whether the system is deterministic or stochastic, one must be suspicious of error estimates. There often appears to be a desire to establish a noninteger dimension value, to give a fractal and chaotic interpretation to the result, but again, prior to a thorough study of the relationship between the geometric Hausdorff dimension and the time-series formula labeled correlation dimension, it is speculation to draw conclusions from a noninteger correlation dimension value.
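The MIX(p) construction above is easy to sample, which makes the claim directly testable numerically; a minimal sketch (the function name and seeding are ours):

```python
import numpy as np

def mix(p, n, seed=0):
    """Sample MIX(p): each point is the normalised sine X_j with probability
    1 - p, or a uniform deviate Y_j on [-sqrt(3), sqrt(3)] with probability p."""
    rng = np.random.default_rng(seed)
    j = np.arange(1, n + 1)
    alpha = np.mean(np.sin(2 * np.pi * np.arange(1, 13) / 12) ** 2)  # Eq. [5]
    x = alpha ** -0.5 * np.sin(2 * np.pi * j / 12)                   # X_j
    y = rng.uniform(-np.sqrt(3), np.sqrt(3), n)                      # Y_j
    z = rng.random(n) < p                                            # Z_j = 1 w.p. p
    return np.where(z, y, x)
```

Both branches are normalised to unit mean square, so changing p alters the correlation structure of the series without changing its power, which is what makes the correlation-dimension result striking.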

# The Poverty of Left in Latin America as orchestrated and endorsed by Joseph Stiglitz

For a change, here is another mushroom cropping up (it actually cropped up as an idea and then materialised in 2008-09), this time thanks to the Leftist governments of Venezuela, Ecuador and Bolivia. Christened "Bank of the South", the \$7 billion development bank fund is the most logical culmination (what else is there?) of these Latin Americans' stand against the neoliberal, austerity-directed reforms of the Bretton Woods behemoths. Let them dig the grounds for fecundity; what really caught my attention here is that almost a decade back, Joseph Stiglitz endorsed Hugo Chavez's economic policies, and in 2007 even called for such a development bank, describing it as reflecting the perspectives of those in the South. And that was a bad call.

I want to chip in on why I think such endorsements portray a vacuity, and strangely so, coming from the likes of Joseph Stiglitz.

South–South cooperation is actually becoming the mala fide of the resistance against neoliberal (oh, how much do I despise this word now, and all the more so when it is gaining currency amongst alternative political viewpoints) policies, and this is easily gauged by the receding of the Pink Tide in Latin America, the so-called cradle of the Left in the late 90s of the earlier century and the first decade of the new. "With global stagnation and falling export prices, the 'pink tide' states must choose between their social programme and their economic strategy," the Financial Times writes, tightening the screws on the coffin of the Left on the continent. With the recent killing of Bolivia's Deputy Interior Minister Rodolfo Illanes by striking miners, President Evo Morales has resorted to what the Left has always classically resorted to: "conspiracy theory". As Richard Seymour has quipped with a lot of prescience, Morales' resort to conspiracy theory makes a certain sense in the context of Latin America, where a series of left-wing governments elected as part of a "pink tide" in the 2000s have gone into crisis. Argentina elected its first right-wing government in 12 years in November. Venezuela's economic crash has led to the victory of the right-wing opposition in the National Assembly. Notwithstanding the hyperventilating coverage of the country's total collapse, the country is beset by real problems, a combination of opposition disruption, international pressure and government mistakes exacerbating the turmoil. In Brazil, impeachment proceedings against Dilma Rousseff have put the unelected opposition in power. Rousseff was impeached for manipulating the figures to make the government's finances look better than they were, but the real problem appears to be that amid economic troubles, Rousseff was elected on a programme of investment rather than austerity.
Bolivia did set an example of an anomaly, where growth stabilised, public investment reached a high level, and minimum wage increases greater than the rate of inflation were introduced. So, why this turnaround? The government has built its authority on support from the police and army, and has repeatedly deployed police against social movements where they were inconvenient, such as during the protests against fuel price increases in 2010, or against a road built on indigenous land during 2011. Dissatisfaction with the left-wing and left-centre governments in Argentina, Brazil, and Venezuela did not arise because the right wing is admired by the public, but rather because the improvements begun under the left have stalled. The problems are most severe in Venezuela, where the drop in world oil prices has led to extreme inflation and scarcity of supplies in many sectors of the economy. It is important to understand that, largely, the problems have not arisen in the socialist part of the economies of these countries, but in sectors that are still under the control of private enterprise. In Venezuela, more than 70 percent of economic activity is still private. Food distribution, which is central to the problems of scarcity and inflation, is virtually monopolized by a small number of private companies that have ties to the right-wing opposition, especially the Polar company that controls 40 percent of the market. Venezuela, in short, is a failed state. So, obviously, the tide is turning.

From within the simmering rises a reinvigorated bank, and its efficiency will depend on a host of issues, viz. a replication of the WB/IMF's "more contributions to the fund, more weightage to the vote"; exemption from taxes on salaries and on procurement of investment, which also incidentally happens to be a copy of the WB/IMF; undecidability on reserve funds; prioritising infrastructure over agriculture and social sectors; chalking out a plan for investment in financial intermediaries to develop national companies; procurement; and participation and transparency. Whether the Bank will actually be able to overcome these, only time will tell. But my main intention has been Stiglitz's remark, or rather his subscription to such alternatives to the WB/IMF. More than his actually welcoming this bank, or for that matter the NDB, as rivalling the hegemonic structures of the WB/IMF, it is his stance on the financialisation of capital that needs to be sent through a scanner. I have to admit honestly that I was bowled over by his Globalization and Its Discontents, which did send me on a trip to track change via his honest and integrity-filled analysis of globalisation. Even reading Bhagwati in concomitance wasn't a powerful let-down from following Stiglitz. But, just like the Left stands precariously, Stiglitz's conceptualisation somehow misses the beat for me these days.

He undoubtedly was a voice to hear during and in the aftermath of the global crisis. His attack was three-pronged, and all of it suited the purpose of helping the non-esoteric figure out the causes of the 2008-09 downturn. That there are problems with mainstream economics' over-reliance on algorithms designed by mathematical geniuses, the questionable character of rationality as a result of conflicts of interest impairing ratings agencies, and the lack of accountability for Wall Street's excessive risk-taking adventures isn't really in any doubt. But thereafter fluctuations start becoming noticeable, and as a left-inclined theorist, he blames the neoliberal policies that had their beginnings in the 70s for all the ills of the current economic and financial mess the world over. Assuming it to be true, then how do we explain the fact that Western Europe's hyper-regulated economies are presently in even worse shape than America's? Today Greece is a nation on financial life-support. Yet it has long been one of the most regulated and interventionist economies in the entire EU. This, however, doesn't stop Stiglitz from proposing a massive expansion of regulation. This, he says, should be shaped "by financial experts in unions, nongovernmental organisations and universities". More generally, there's nothing new about what Stiglitz calls "New Capitalism". It's a return to old-fashioned Keynesian demand-management and the pursuit of "full employment", that old Keynesian mantra, through the government's direction of any number of economic sectors. Then there's Stiglitz's proposal for a Global Reserve System to effectively undertake demand-management for the world economy. To be fair, this is not an instance of megalomania on Stiglitz's part. Keynes argued for something similar almost 70 years ago. But here Stiglitz wraps himself again in contradiction.
Having stressed the Fed’s inability to manage America’s economy, why does Stiglitz imagine a global central bank could possibly manage monetary policy for the entire world economy? What precisely, we might ask, is the optimal interest rate for the global economy? Surely only God could know that. Until then, I’d have my reservations in taking him seriously.

# Machine Learning via Quantum Nearest-Centroid Algorithm for k-Means Clustering

k-means clustering is a popular machine learning algorithm that structures an unlabelled dataset into k classes. k-means clustering is an NP-hard problem, but examining methods that reduce the average-case complexity is an open area of research. A popular way of classifying the input vectors is to compare the distance of a new vector with the centroid vector of each class (the latter being calculated from the mean of the vectors already in that class). The class with the shortest distance to the vector is the one to which the vector is classified. We refer to this form of classification sub-routine for k-means clustering as the nearest-centroid algorithm.
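For reference, the classical step that the quantum routine below accelerates can be written in a few lines (the function name and data layout are ours):

```python
import numpy as np

def nearest_centroid(u, classes):
    """Classical nearest-centroid step: classes maps label -> list of vectors;
    returns the label whose class centroid is closest to u."""
    centroids = {c: np.mean(vs, axis=0) for c, vs in classes.items()}
    return min(centroids, key=lambda c: np.linalg.norm(u - centroids[c]))
```

For n vectors of dimension m per class, evaluating the distances this way costs O(nm) arithmetic per class, which is the baseline against which the quantum O(ε⁻¹ log nm) claim is measured.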

Seth Lloyd and others have constructed a quantum nearest-centroid algorithm, only classifying vectors after the optimal clustering has been found. They show that the distance between an input vector, |u⟩, and the set of n reference vectors {|v_j⟩} of length m in class C, can be efficiently calculated to within error ε in O(ε⁻¹ log nm) steps on a quantum computer. The algorithm works by constructing the state

|Ψ⟩ = 1/√2 (|u⟩|0⟩ + 1/√n Σ_{j=1}^n |v_j⟩|j⟩)

and performing a swap test with the state

|Φ⟩ = 1/√Z (|u| |0⟩ − 1/√n Σ_{j=1}^n |v_j| |j⟩)

where Z = |u|² + (1/n) Σ_j |v_j|².

The distance between the input vector and the weighted average of the vectors in class C is then proportional to the probability of a successful swap test. The algorithm is repeated for each class until a desired confidence is reached, with the vector being classified into the class from which it has the shortest distance. The complexity arguments on the dependence on m were rigorously confirmed by Lloyd and others using the QPCA (quantum principal component analysis) construction for a support vector machine (SVM) algorithm. This can roughly be thought of as a k-means clustering problem with k = 2. A speedup is obtained because the classical computation of the required inner products takes O(nm) time.
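The swap-test statistics themselves are easy to sanity-check classically: for normalised states the probability of measuring the ancilla in |0⟩ is 1/2 + |⟨Φ|Ψ⟩|²/2. A small sketch (not the quantum circuit, just the resulting probability):

```python
import numpy as np

def swap_test_p0(phi, psi):
    """Probability of measuring the swap-test ancilla in |0>:
    P(0) = 1/2 + |<phi|psi>|^2 / 2 for normalised state vectors."""
    return 0.5 + 0.5 * abs(np.vdot(phi, psi)) ** 2
```

Orthogonal states give P(0) = 1/2 and identical states give P(0) = 1, so repeating the test estimates the overlap, and hence the distance, to the stated error ε.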

The algorithm has some caveats, in particular it only classifies data without performing the harder task of clustering, and assumes access to a QRAM (Quantum random access memory). In the same paper, Lloyd and others develop a k-means algorithm, including clustering, for implementation on an adiabatic quantum computer. The potential of this algorithm is hard to judge, and is perhaps less promising due to the current focus of the quantum computing field on circuit-based architectures.

Machine learning data is usually (very) high dimensional, containing redundant or irrelevant information. Thus, machine learning benefits from pre-processing data through statistical procedures such as principal component analysis (PCA). PCA reduces the dimensionality by transforming the data to a new set of uncorrelated variables (the principal components) of which the first few retain most of the variation present in the original dataset. The standard way to calculate the principal components boils down to finding the eigenvalues of the data covariance matrix.
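The classical baseline just described can be sketched in a few lines, assuming rows are samples and columns are variables (the function name is ours):

```python
import numpy as np

def pca(data, n_components):
    """Classical PCA: project centred data onto the top eigenvectors of its
    covariance matrix."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)                 # eigenvalues ascending
    order = np.argsort(vals)[::-1][:n_components]    # keep the largest ones
    return centered @ vecs[:, order]                 # projected data
```

The eigendecomposition of the d × d covariance matrix is the step whose cost the quantum proposal below aims to reduce.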

Lloyd and others suggested a quantum version of PCA (QPCA). The bulk of the algorithm consists of the ability to generate the exponent of an arbitrary density matrix ρ efficiently. More specifically, given n copies of ρ, Lloyd and others propose a way to apply the unitary operator e^{−iρt} to any state σ with accuracy ε = O(t²/n). This is done using repeated infinitesimal applications of the swap operator on ρ ⊗ σ. Using phase estimation, the result is used to generate a state which can be sampled from to attain information on the eigenvectors and eigenvalues of the state ρ. The algorithm is most effective when ρ contains a few large eigenvalues and can be represented well by its principal components. In this case, the subspace spanned by the principal components ρ′ closely approximates ρ, such that ||ρ − PρP|| ≤ ε, where P is the projector onto ρ′. This method of QPCA allows construction of the eigenvectors and eigenvalues of the matrix ρ in time O(R log d), where d is the dimension of ρ and R the dimension of the space spanned by ρ′. For low-rank matrices, this is an improvement over the classical algorithm that requires O(d) time. In a machine learning context, if ρ is the covariance matrix of the data, this procedure performs PCA in the desired fashion.
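The primitive at the heart of the algorithm, applying e^{−iρt} to a state, can be emulated classically for intuition, though of course at cost exponential in the number of qubits (`apply_exp_rho` is our name, not Lloyd et al.'s):

```python
import numpy as np

def apply_exp_rho(rho, state, t):
    """Classically emulate the QPCA primitive: apply the unitary e^{-i rho t}
    to a state vector via the eigendecomposition of the density matrix rho."""
    vals, vecs = np.linalg.eigh(rho)
    u = vecs @ np.diag(np.exp(-1j * vals * t)) @ vecs.conj().T
    return u @ state
```

Phase estimation applied to this unitary is what exposes the eigenvalues of ρ; the quantum advantage lies in never diagonalising ρ explicitly, using the n copies of ρ instead.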

The QPCA algorithm has a number of caveats that need to be covered before one can apply it to machine learning scenarios. For example, to gain a speedup, some of the eigenvalues of ρ need to be large (i.e. ρ needs to be well approximated by ρ′). For the case where all eigenvalues are equal and of size O(1/d), the algorithm reduces to scaling in time O(d) which offers no improvement over classical algorithms. Other aspects that need to be considered include the necessity of QRAM and the scaling of the algorithm with the allowed error ε. As of yet, it is unclear how these requirements affect the applicability of the algorithm to real scenarios.