Infinite Sequences and Halting Problem. Thought of the Day 76.0

In attempting to extend the notion of depth from finite strings to infinite sequences, one encounters a familiar phenomenon: the definitions become sharper (e.g. recursively invariant), but their intuitive meaning is less clear, because of distinctions (e.g. between infinitely-often and almost-everywhere properties) that do not exist in the finite case.

An infinite sequence X is called strongly deep if at every significance level s, and for every recursive function f, all but finitely many initial segments Xn have depth exceeding f(n).
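
Restated symbolically (a restatement of the sentence above; here $D_s(X_n)$ denotes the depth of the initial segment $X_n$ at significance level $s$, as defined for finite strings later in the text):

$$X \text{ is strongly deep} \iff \forall s\ \forall \text{recursive } f:\; D_s(X_n) > f(n) \text{ for all but finitely many } n.$$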

It is necessary to require the initial segments to be deep almost everywhere rather than infinitely often, because even the most trivial sequence has infinitely many deep initial segments Xn (viz. the segments whose lengths n are deep numbers).

It is not difficult to show that the property of strong depth is invariant under truth-table equivalence (this is the same as Turing equivalence in recursively bounded time, or via a total recursive operator), and that the same notion would result if the initial segments were required to be deep in the sense of receiving less than 2−s of their algorithmic probability from f(n)-fast programs. The characteristic sequence of the halting set K is an example of a strongly deep sequence.

A weaker definition of depth, also invariant under truth-table equivalence, is perhaps more analogous to that adopted for finite strings:

An infinite sequence X is weakly deep if it is not computable in recursively bounded time from any algorithmically random infinite sequence.

Computability in recursively bounded time is equivalent to two other properties, viz. truth-table reducibility and reducibility via a total recursive operator.

By contrast to the situation with truth-table reducibility, Péter Gács has shown that every sequence is computable from (i.e. Turing reducible to) an algorithmically random sequence if no bound is imposed on the time. This is the infinite analog of the far more obvious fact that every finite string is computable from an algorithmically random string (e.g. its minimal program).

Every strongly deep sequence is weakly deep, but by intermittently padding K with large blocks of zeros, one can construct a weakly deep sequence with infinitely many shallow initial segments.

Truth-table reducibility to an algorithmically random sequence is equivalent to the property studied by Levin et al. of being random with respect to some recursive measure. Levin calls sequences with this property “proper” or “complete” sequences, and views them as more realistic and interesting than other sequences because they are the typical outcomes of probabilistic or deterministic effective processes operating in recursively bounded time.

Weakly deep sequences arise with finite probability when a universal Turing machine (with one-way input and output tapes, so that it can act as a transducer of infinite sequences) is given an infinite coin toss sequence for input. These sequences are necessarily produced very slowly: the time to output the n’th digit is bounded by no recursive function, and the output sequence contains evidence of this slowness. Because they are produced with finite probability, such sequences can contain only finite information about the halting problem.

String’s Depth of Burial

A string’s depth might be defined as the execution time of its minimal program.

The difficulty with this definition arises in cases where the minimal program is only a few bits smaller than some much faster program, such as a print program, to compute the same output x. In this case, slight changes in x may induce arbitrarily large changes in the run time of the minimal program, by changing which of the two competing programs is minimal. Analogous instability manifests itself in translating programs from one universal machine to another. This instability emphasizes the essential role of the quantity of buried redundancy, not as a measure of depth, but as a certifier of depth. In terms of the philosophy-of-science metaphor, an object whose minimal program is only a few bits smaller than its print program is like an observation that points to a nontrivial hypothesis, but with only a low level of statistical confidence.

To adequately characterize a finite string’s depth one must therefore consider the amount of buried redundancy as well as the depth of its burial. A string’s depth at significance level s might thus be defined as that amount of time complexity which is attested by s bits worth of buried redundancy. This characterization of depth may be formalized in several ways.

A string’s depth at significance level s might be defined as the time required to compute the string by a program no more than s bits larger than the minimal program.

This definition solves the stability problem, but is unsatisfactory in the way it treats multiple programs of the same length. Intuitively, 2^k distinct (n + k)-bit programs that compute the same output ought to be accorded the same weight as one n-bit program; but, by the present definition, they would be given no more weight than one (n + k)-bit program.

A string’s depth at significance level s might instead be defined as the time t required for the string’s time-bounded algorithmic probability Pt(x) to rise to within a factor 2−s of its asymptotic time-unbounded value P(x).
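
Spelled out, this tentative definition reads (a restatement of the sentence above, with $P^t$ and $P$ the time-bounded and unbounded algorithmic probabilities defined later in the text):

$$D_s(x) = \min\{\, t : P^t(x) \ge 2^{-s}\,P(x) \,\}.$$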

This formalizes the notion that for the string to have originated by an effective process of t steps or fewer is less plausible than for the first s tosses of a fair coin all to come up heads.

It is not known whether there exist strings that are deep according to the first of these definitions but not the second; in other words, strings having no small fast programs, even though they have enough large fast programs to contribute a significant fraction of their algorithmic probability. Such strings might be called deterministically deep but probabilistically shallow, because their chance of being produced quickly in a probabilistic computation (e.g. one where the input bits of U are supplied by coin tossing) is significant compared to their chance of being produced slowly. The question of whether such strings exist is probably hard to answer because it does not relativize uniformly. Deterministic and probabilistic depths are not very different relative to a random coin-toss oracle A (a consequence of the equality of random-oracle-relativized deterministic and probabilistic polynomial time complexity classes); but they can be very different relative to an oracle B deliberately designed to hide information from deterministic computations (this parallels Hunt’s proof that deterministic and probabilistic polynomial time are unequal relative to such an oracle).

(Depth of Finite Strings): Let x and w be strings and s a significance parameter. A string’s depth at significance level s, denoted Ds(x), will be defined as min{T(p) : (|p|−|p∗| < s)∧(U(p) = x)}, the least time required to compute it by an s-incompressible program. At any given significance level, a string will be called t-deep if its depth exceeds t, and t-shallow otherwise.

The difference between this definition and the previous one is rather subtle philosophically and not very great quantitatively. Philosophically, the present definition requires that every individual hypothesis for the rapid origin of x be implausible at the 2−s confidence level, whereas the previous one requires only that a weighted average of all such hypotheses be implausible.

There exist constants c1 and c2 such that for any string x, if programs running in time ≤ t contribute a fraction between 2−s and 2−s+1 of the string’s total algorithmic probability, then x has depth at most t at significance level s + c1 and depth at least t at significance level s − min{H(s), H(t)} − c2.
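
Schematically, the lemma says (restating the sentence above in the notation of the depth definition):

$$2^{-s} \le \frac{P^t(x)}{P(x)} \le 2^{-s+1} \;\Longrightarrow\; D_{s+c_1}(x) \le t \;\text{ and }\; D_{s-\min\{H(s),H(t)\}-c_2}(x) \ge t.$$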

Proof: The first part follows easily from the fact that any k-compressible self-delimiting program p is associated with a unique program, k − O(1) bits shorter, of the form “execute the result of executing p∗”. Therefore there exists a constant c1 such that if all t-fast programs for x were (s + c1)-compressible, the associated shorter programs would contribute more than the total algorithmic probability of x. The second part follows because, roughly, if fast programs contribute only a small fraction of the algorithmic probability of x, then the property of being a fast program for x is so unusual that no program having that property can be random. More precisely, the t-fast programs for x constitute a finite prefix set, a superset S of which can be computed by a program of size H(x) + min{H(t), H(s)} + O(1) bits. (Given x∗ and either t∗ or s∗, begin enumerating all self-delimiting programs that compute x, in order of increasing running time, and quit when either the running time exceeds t or the accumulated measure of the programs so far enumerated exceeds 2−(H(x)−s).) Therefore there exists a constant c2 such that every member of S, and thus every t-fast program for x, is compressible by at least s − min{H(s), H(t)} − O(1) bits.

The ability of universal machines to simulate one another efficiently implies a corresponding degree of machine-independence for depth: for any two efficiently universal machines of the sort considered here, there exists a constant c and a linear polynomial L such that for any t, strings whose (s+c)-significant depth is at least L(t) on one machine will have s-significant depth at least t on the other.

Depth of one string relative to another may be defined analogously, and represents the plausible time required to produce one string, x, from another, w.

(Relative Depth of Finite Strings): For any two strings w and x, the depth of x relative to w at significance level s, denoted Ds(x/w), will be defined as min{T(p, w) : (|p|−|(p/w)∗| < s)∧(U(p, w) = x)}, the least time required to compute x from w by a program that is s-incompressible relative to w.

Depth of a string relative to its length is a particularly useful notion, allowing us, as it were, to consider the triviality or nontriviality of the “content” of a string (i.e. its bit sequence), independent of its “form” (length). For example, although the infinite sequence 000… is intuitively trivial, its initial segment 0n is deep whenever n is deep. However, 0n is always shallow relative to n, as is, with high probability, a random string of length n.

In order to adequately represent the intuitive notion of stored mathematical work, it is necessary that depth obey a “slow growth” law, i.e. that fast deterministic processes be unable to transform a shallow object into a deep one, and that fast probabilistic processes be able to do so only with low probability.

(Slow Growth Law): Given any data string x and two significance parameters s2 > s1, a random program generated by coin tossing has probability less than 2−(s2−s1)+O(1) of transforming x into an excessively deep output, i.e. one whose s2-significant depth exceeds the s1-significant depth of x plus the run time of the transforming program plus O(1). More precisely, there exist positive constants c1, c2 such that for all strings x, and all pairs of significance parameters s2 > s1, the prefix set {q : Ds2(U(q, x)) > Ds1(x) + T(q, x) + c1} has measure less than 2−(s2−s1)+c2.
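
Written as a single display (a restatement of the precise form above, reading $\mu(S) = \sum_{q\in S} 2^{-|q|}$ for the coin-toss measure of a prefix set $S$, a reading consistent with the Kraft inequality discussed below):

$$\mu\big(\{\, q : D_{s_2}(U(q,x)) > D_{s_1}(x) + T(q,x) + c_1 \,\}\big) \;<\; 2^{-(s_2-s_1)+c_2}.$$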

Proof: Let p be an s1-incompressible program which computes x in time Ds1(x), and let r be the restart prefix mentioned in the definition of the U machine. Let Q be the prefix set {q : Ds2(U(q, x)) > T(q, x) + Ds1(x) + c1}, where the constant c1 is sufficient to cover the time overhead of concatenation. For all q ∈ Q, the program rpq by definition computes some deep result U(q, x) in less time than that result’s own s2-significant depth, and so rpq must be compressible by s2 bits. The sum of the algorithmic probabilities of strings of the form rpq, where q ∈ Q, is therefore

$$\sum_{q\in Q} P(rpq) \;\ge\; \sum_{q\in Q} 2^{-|rpq|+s_2} \;=\; 2^{-|r|-|p|+s_2}\,\mu(Q)$$

On the other hand, since the self-delimiting program p can be recovered from any string of the form rpq (by deleting r and executing the remainder pq until halting occurs, by which time exactly p will have been read), the algorithmic probability of p is at least as great (within a constant factor) as the sum of the algorithmic probabilities of the strings {rpq : q ∈ Q} considered above:

$$P(p) \;>\; \mu(Q)\cdot 2^{-|r|-|p|+s_2-O(1)}$$

Recalling the fact that minimal program size is equal within an additive constant to the −log of algorithmic probability, and the s1-incompressibility of p, we have $P(p) < 2^{-(|p|-s_1)+O(1)}$, and therefore finally

$$\mu(Q) \;<\; 2^{-(s_2-s_1)+O(1)},$$ which was to be demonstrated.

Universal Turing Machine: Algorithmic Halting

A natural number x will be identified with the x’th binary string in lexicographic order (Λ,0,1,00,01,10,11,000…), and a set X of natural numbers will be identified with its characteristic sequence, and with the real number between 0 and 1 having that sequence as its dyadic expansion. The length of a string x will be denoted |x|, the n’th bit of an infinite sequence X will be denoted X(n), and the initial n bits of X will be denoted Xn. Concatenation of strings p and q will be denoted pq.
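
One standard way to realize this identification (a toy sketch, not part of the formalism; the function names are ours) is to write x + 1 in binary and drop the leading 1, which reproduces the ordering (Λ, 0, 1, 00, 01, 10, 11, 000, …):

```python
def nat_to_string(x: int) -> str:
    """Return the x'th binary string in lexicographic order, counting from 0:
    '' (Lambda), '0', '1', '00', '01', '10', '11', '000', ..."""
    return bin(x + 1)[3:]  # bin(x+1) is '0b1...': drop '0b' and the leading 1

def string_to_nat(s: str) -> int:
    """Inverse map: prepend a 1, read as a binary number, subtract 1."""
    return int('1' + s, 2) - 1

# The first eight strings match the ordering given in the text.
assert [nat_to_string(x) for x in range(8)] == ['', '0', '1', '00', '01', '10', '11', '000']
assert all(string_to_nat(nat_to_string(x)) == x for x in range(100))
```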

We now define the information content (and later the depth) of finite strings using a universal Turing machine U. A universal Turing machine may be viewed as a partial recursive function of two arguments. It is universal in the sense that by varying one argument (“program”) any partial recursive function of the other argument (“data”) can be obtained. In the usual machine formats, program, data and output are all finite strings, or, equivalently, natural numbers. However, it is not possible to take a uniformly weighted average over a countably infinite set of programs; to obtain a natural probability distribution over programs one therefore uses a self-delimiting format. Chaitin’s universal machine has two tapes: a read-only one-way tape containing the infinite program; and an ordinary two-way read/write tape, which is used for data input, intermediate work, and output, all of which are finite strings. Our machine differs from Chaitin’s in having some additional auxiliary storage (e.g. another read/write tape) which is needed only to improve the time efficiency of simulations.

We consider only terminating computations, during which, of course, only a finite portion of the program tape can be read. Therefore, the machine’s behavior can still be described by a partial recursive function of two string arguments U(p, w), if we use the first argument to represent that portion of the program that is actually read in the course of a particular computation. The expression U (p, w) = x will be used to indicate that the U machine, started with any infinite sequence beginning with p on its program tape and the finite string w on its data tape, performs a halting computation which reads exactly the initial portion p of the program, and leaves output data x on the data tape at the end of the computation. In all other cases (reading less than p, more than p, or failing to halt), the function U(p, w) is undefined. Wherever U(p, w) is defined, we say that p is a self-delimiting program to compute x from w, and we use T(p, w) to represent the time (machine cycles) of the computation. Often we will consider computations without input data; in that case we abbreviate U(p, Λ) and T(p, Λ) as U(p) and T(p) respectively.

The self-delimiting convention for the program tape forces the domain of U and T, for each data input w, to be a prefix set, that is, a set of strings no member of which is the extension of any other member. Any prefix set S obeys the Kraft inequality

$$\sum_{p\in S} 2^{-|p|} \;\le\; 1$$
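
A small sketch (ours, purely illustrative; the function names are not part of the formalism) of both conditions: checking that a set of program strings is prefix-free, and that it satisfies the Kraft inequality.

```python
from itertools import permutations
from fractions import Fraction

def is_prefix_set(programs):
    """True iff no member of the set is a proper prefix of another member."""
    return not any(q.startswith(p) for p, q in permutations(programs, 2))

def kraft_sum(programs):
    """Sum of 2^-|p| over the set; at most 1 for any prefix set."""
    return sum(Fraction(1, 2 ** len(p)) for p in programs)

S = {'0', '10', '110', '1110'}   # a prefix set
assert is_prefix_set(S)
assert kraft_sum(S) <= 1         # here the sum is 15/16
```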

Besides being self-delimiting with regard to its program tape, the U machine must be efficiently universal in the sense of being able to simulate any other machine of its kind (Turing machines with self-delimiting program tape) with at most an additive constant increase in program size and a linear increase in execution time.

Without loss of generality we assume that there exists for the U machine a constant prefix r which has the effect of stacking an instruction to restart the computation when it would otherwise end. This gives the machine the ability to concatenate programs to run consecutively: if U(p, w) = x and U(q, x) = y, then U(rpq, w) = y. Moreover, this concatenation should be efficient in the sense that T (rpq, w) should exceed T (p, w) + T (q, x) by at most O(1). This efficiency of running concatenated programs can be realized with the help of the auxiliary storage to stack the restart instructions.

Sometimes we will generalize U to have access to an “oracle” A, i.e. an infinite look-up table which the machine can consult in the course of its computation. The oracle may be thought of as an arbitrary 0/1-valued function A(x) which the machine can cause to be evaluated by writing the argument x on a special tape and entering a special state of the finite control unit. In the next machine cycle the oracle responds by sending back the value A(x). The time required to evaluate the function is thus linear in the length of its argument. In particular we consider the case in which the information in the oracle is random, each location of the look-up table having been filled by an independent coin toss. Such a random oracle is a function whose values are reproducible, but otherwise unpredictable and uncorrelated.

Let {φ^A_i(p, w) : i = 0, 1, 2, …} be an acceptable Gödel numbering of A-partial recursive functions of two arguments and {Φ^A_i(p, w)} an associated Blum complexity measure, henceforth referred to as time. An index j is called self-delimiting iff, for all oracles A and all values w of the second argument, the set {x : φ^A_j(x, w) is defined} is a prefix set. A self-delimiting index j has efficient concatenation if there exists a string r such that for all oracles A and all strings w, x, y, p, and q: if φ^A_j(p, w) = x and φ^A_j(q, x) = y, then φ^A_j(rpq, w) = y and Φ^A_j(rpq, w) ≤ Φ^A_j(p, w) + Φ^A_j(q, x) + O(1). A self-delimiting index u with efficient concatenation is called efficiently universal iff, for every self-delimiting index j with efficient concatenation, there exists a simulation program s and a linear polynomial L such that, for all oracles A and all strings p and w,

$$\phi^A_u(sp, w) = \phi^A_j(p, w)$$

and

$$\Phi^A_u(sp, w) \;\le\; L\!\left(\Phi^A_j(p, w)\right).$$

The functions U^A(p, w) and T^A(p, w) are defined respectively as φ^A_u(p, w) and Φ^A_u(p, w), where u is an efficiently universal index.

For any string x, the minimal program, denoted x∗, is min{p : U(p) = x}, the least self-delimiting program to compute x. For any two strings x and w, the minimal program of x relative to w, denoted (x/w)∗, is defined similarly as min{p : U(p,w) = x}.

By contrast to its minimal program, any string x also has a print program, of length |x| + O(log|x|), which simply transcribes the string x from a verbatim description of x contained within the program. The print program is logarithmically longer than x because, being self-delimiting, it must indicate the length as well as the contents of x. Because it makes no effort to exploit redundancies to achieve efficient coding, the print program can be made to run quickly (e.g. linear time in |x|, in the present formalism). Extra information w may help, but cannot significantly hinder, the computation of x, since a finite subprogram would suffice to tell U to simply erase w before proceeding. Therefore, a relative minimal program (x/w)∗ may be much shorter than the corresponding absolute minimal program x∗, but can only be longer by O(1), independent of x and w.
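
The logarithmic overhead of the print program comes entirely from the need to delimit itself. A toy encoding (ours, illustrative only; the text leaves the actual convention used by U unspecified) makes the |x| + O(log |x|) bound concrete: a header giving |x| in doubled bits followed by a terminator, then x verbatim.

```python
def self_delimiting_encode(x: str) -> str:
    """Encode x so it can be read off a one-way tape without an end marker:
    |x| in binary with each bit doubled, then '01' as a terminator, then the
    |x| bits of x verbatim.  Total length is |x| + 2*ceil(log2(|x|+1)) + 2."""
    length_bits = bin(len(x))[2:]
    header = ''.join(b + b for b in length_bits) + '01'
    return header + x

def self_delimiting_decode(tape: str):
    """Read one codeword from the front of the tape; return (x, bits consumed)."""
    i, length_bits = 0, ''
    while tape[i:i + 2] != '01':      # doubled bits until the '01' terminator
        length_bits += tape[i]
        i += 2
    i += 2
    n = int(length_bits, 2)
    return tape[i:i + n], i + n

x = '101100111000'
code = self_delimiting_encode(x)
assert self_delimiting_decode(code + 'junk that is never read')[0] == x
assert len(code) <= len(x) + 2 * len(bin(len(x))[2:]) + 2
```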

A string is compressible by s bits if its minimal program is shorter by at least s bits than the string itself, i.e. if |x∗| ≤ |x| − s. Similarly, a string x is said to be compressible by s bits relative to a string w if |(x/w)∗| ≤ |x| − s. Regardless of how compressible a string x may be, its minimal program x∗ is compressible by at most an additive constant depending on the universal computer but independent of x. [If (x∗)∗ were much smaller than x∗, then the role of x∗ as minimal program for x would be undercut by a program of the form “execute the result of executing (x∗)∗.”] Similarly, a relative minimal program (x/w)∗ is compressible relative to w by at most a constant number of bits independent of x or w.

The algorithmic probability of a string x, denoted P(x), is defined as ∑{2−|p| : U(p) = x}. This is the probability that the U machine, with a random program chosen by coin tossing and an initially blank data tape, will halt with output x. The time-bounded algorithmic probability, Pt(x), is defined similarly, except that the sum is taken only over programs which halt within time t. We use P(x/w) and Pt(x/w) to denote the analogous algorithmic probabilities of one string x relative to another w, i.e. for computations that begin with w on the data tape and halt with x on the data tape.
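
Collected as displays (a restatement of this paragraph’s definitions):

$$P(x) = \sum_{p\,:\,U(p)=x} 2^{-|p|}, \qquad P^t(x) = \sum_{\substack{p\,:\,U(p)=x \\ T(p)\le t}} 2^{-|p|},$$

with $P(x/w)$ and $P^t(x/w)$ defined in the same way using $U(p,w)$ and $T(p,w)$.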

The algorithmic entropy H(x) is defined as the least integer greater than −log2P(x), and the conditional entropy H(x/w) is defined similarly as the least integer greater than − log2P(x/w). Among the most important properties of the algorithmic entropy is its equality, to within O(1), with the size of the minimal program:

$$\exists c\;\forall x\;\forall w:\quad H(x/w) \;\le\; |(x/w)^*| \;\le\; H(x/w) + c$$

The first part of the relation, viz. that algorithmic entropy should be no greater than minimal program size, is obvious, because of the minimal program’s own contribution to the algorithmic probability. The second half of the relation is less obvious. The approximate equality of algorithmic entropy and minimal program size means that there are few near-minimal programs for any given input/output pair (x/w), and that every string gets a constant fraction of its algorithmic probability from its minimal program.

Finite strings, such as minimal programs, which are incompressible or nearly so are called algorithmically random. The definition of randomness for finite strings is necessarily a little vague because of the ±O(1) machine-dependence of H(x) and, in the case of strings other than self-delimiting programs, because of the question of how to count the information encoded in the string’s length, as opposed to its bit sequence. Roughly speaking, an n-bit self-delimiting program is considered random (and therefore not ad-hoc as a hypothesis) iff its information content is about n bits, i.e. iff it is incompressible; while an externally delimited n-bit string is considered random iff its information content is about n + H(n) bits, enough to specify both its length and its contents.

For infinite binary sequences (which may be viewed also as real numbers in the unit interval, or as characteristic sequences of sets of natural numbers) randomness can be defined sharply: a sequence X is incompressible, or algorithmically random, if there is an O(1) bound to the compressibility of its initial segments Xn. Intuitively, an infinite sequence is random if it is typical in every way of sequences that might be produced by tossing a fair coin; in other words, if it belongs to no informally definable set of measure zero. Algorithmically random sequences, by contrast, are those belonging to no effectively definable set of measure zero; they therefore constitute a larger class, including sequences such as Ω which can be specified by ineffective definitions.
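
In the notation of the preceding definitions (and up to the O(1) difference between H(Xn) and minimal program size), the O(1) bound on the compressibility of initial segments reads:

$$X \text{ is algorithmically random} \iff \exists c\;\forall n:\; H(X_n) \ge n - c.$$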

The busy beaver function B(n) is the greatest number computable by a self-delimiting program of n bits or fewer. The halting set K is {x : φx(x) converges}. This is the standard representation of the halting problem.
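
Under the identification of strings with natural numbers introduced earlier, the busy beaver function can be restated as:

$$B(n) = \max\{\, U(p) : |p| \le n \text{ and } U(p) \text{ halts} \,\}.$$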

The self-delimiting halting set K0 is the (prefix) set of all self-delimiting programs for the U machine that halt: {p : U(p) converges}.

K and K0 are readily computed from one another (e.g. by regarding the self-delimiting programs as a subset of ordinary programs, the first 2^n bits of K0 can be recovered from the first 2^(n+O(1)) bits of K; by encoding each n-bit ordinary program as a self-delimiting program of length n + O(log n), the first 2^n bits of K can be recovered from the first 2^(n+O(log n)) bits of K0.)

The halting probability Ω is defined as ∑{2−|p| : U(p) converges}, the probability that the U machine would halt on an infinite input supplied by coin tossing. Ω is thus a real number between 0 and 1.

The first 2^n bits of K0 can be computed from the first n bits of Ω, by enumerating halting programs until enough have halted to account for all but 2^(−n) of the total halting probability. The time required for this decoding (between B(n − O(1)) and B(n + H(n) + O(1))) grows faster than any computable function of n. Although K0 is only slowly computable from Ω, the first n bits of Ω can be rapidly computed from the first 2^(n+H(n)+O(1)) bits of K0, by asking about the halting of programs of the form “enumerate halting programs until (if ever) their cumulative weight exceeds q, then halt”, where q is an n-bit rational number…
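
The decoding of K0 from a prefix of Ω described above can be sketched schematically (ours, purely illustrative: `enumerate_halting_programs` stands for a hypothetical dovetailing enumerator over some concrete self-delimiting universal machine, yielding each halting program once, in the order its halting is discovered):

```python
from fractions import Fraction
from itertools import product

def all_strings_of_length(k):
    """All binary strings of length k, in lexicographic order."""
    return (''.join(bits) for bits in product('01', repeat=k))

def decode_K0_prefix(omega_prefix, enumerate_halting_programs):
    """Given the first n bits of Omega (as a '0'/'1' string) and a dovetailing
    enumerator of halting self-delimiting programs, return the halting status
    of every program of length <= n.  The loop runs for an enormous
    (busy-beaver-scale) but finite number of steps."""
    n = len(omega_prefix)
    # Truncation: omega_n <= Omega < omega_n + 2^-n.
    omega_n = Fraction(int(omega_prefix, 2), 2 ** n)
    accumulated, halted = Fraction(0), set()
    for p in enumerate_halting_programs():   # hypothetical dovetailer
        halted.add(p)
        accumulated += Fraction(1, 2 ** len(p))
        if accumulated >= omega_n:
            # The remaining halting probability is now below 2^-n, so any
            # program of length <= n that has not yet halted never will.
            break
    return {p: p in halted for k in range(n + 1) for p in all_strings_of_length(k)}
```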

Malignant Acceleration in Tech-Finance. Some Further Rumination on Regulations. Thought of the Day 72.1

Regardless of the positive effects that HFT offers, such as reduced spreads, higher liquidity, and faster price discovery, its negative side is mostly what has caught people’s attention. Several notorious market failures and accidents in recent years all seem to be related to HFT practices. They showed how much risk HFT can involve and how huge the damage can be.

HFT heavily depends on the reliability of the trading algorithms that generate, route, and execute orders. High-frequency traders thus must ensure that these algorithms have been tested completely and thoroughly before they are deployed into the live systems of the financial markets. Any improperly-tested, or prematurely-released algorithms may cause losses to both investors and the exchanges. Several examples demonstrate the extent of the ever-present vulnerabilities.

In August 2012, the Knight Capital Group implemented a new liquidity-testing software routine in its trading system, which was running live on the NYSE. The system started making bizarre trading decisions, quadrupling the price of one company, Wizzard Software, as well as bidding up the price of much larger entities, such as General Electric. Within 45 minutes, the company lost USD 440 million. After this event and the weakening of Knight Capital’s capital base, it agreed to merge with another algorithmic trading firm, Getco, which is the biggest HFT firm in the U.S. today. This example emphasizes the importance of implementing precautions to ensure that trading algorithms are not mistakenly deployed.

Another example is Everbright Securities in China. In 2013, the state-owned brokerage firm Everbright Securities Co. sent more than 26,000 mistaken buy orders, worth RMB 23.4 billion (USD 3.82 billion), to the Shanghai Stock Exchange (SSE), pushing its benchmark index up 6 % in two minutes. This resulted in a trading loss of approximately RMB 194 million (USD 31.7 million). In a follow-up evaluative study, the China Securities Regulatory Commission (CSRC) found that there were significant flaws in Everbright’s information and risk management systems.

The damage caused by HFT errors is not limited to the trading firms themselves, but may also involve the stock exchanges and the stability of the related financial market. On Friday, May 18, 2012, the stock of the social network giant Facebook was listed on the NASDAQ exchange, in the most anticipated initial public offering (IPO) in the exchange’s history. However, technology problems with the opening made a mess of the IPO. The offering attracted HFT traders and very large order flows were expected, and before the IPO, NASDAQ was confident in its ability to deal with the high volume of orders.

But when the deluge of orders to buy, sell and cancel trades came, NASDAQ’s trading software began to fail under the strain. This resulted in a 30-minute delay on NASDAQ’s side, and a 17-second blackout for all stock trading at the exchange, causing further panic. Scrutiny of the problems immediately led to fines for the exchange and accusations that HFT traders bore some responsibility too. Problems persisted after the opening, with many customer orders from institutional and retail buyers unfilled for hours or never filled at all, while others ended up buying more shares than they had intended. This incredible gaffe, estimated to have cost traders some USD 100 million, eclipsed NASDAQ’s achievement in winning Facebook’s listing, the third largest IPO in U.S. history.

Another instance occurred on May 6, 2010, when U.S. financial markets were surprised by what has been referred to ever since as the “Flash Crash”. Within less than 30 minutes, the main U.S. stock markets experienced their single largest intraday price declines, with declines of more than 5 % for many U.S.-based equity products. In addition, the Dow Jones Industrial Average (DJIA), at its lowest point that day, fell by nearly 1,000 points, although it was followed by a rapid rebound. This brief period of extreme intraday volatility demonstrated the weakness of the structure and stability of U.S. financial markets, as well as the opportunities for volatility-focused HFT traders. Although a subsequent investigation by the SEC cleared high-frequency traders of directly having caused the Flash Crash, they were still blamed for exaggerating market volatility and withdrawing liquidity for many U.S.-based equities (Flash Boys).

Since the mid-2000s, the average trade size in the U.S. stock market had plummeted, the markets had fragmented, and the gap in time between the public view of the markets and the view of high-frequency traders had widened. The rise of high-frequency trading had also been accompanied by a rise in stock market volatility – over and above the turmoil caused by the 2008 financial crisis. The price volatility within each trading day in the U.S. stock market between 2010 and 2013 was nearly 40 percent higher than the volatility between 2004 and 2006, for instance. There were days in 2011 in which volatility was higher than in the most volatile days of the dot-com bubble.

Although these different incidents have different causes, the effects were similar and some common conclusions can be drawn. The presence of algorithmic trading and HFT in the financial markets exacerbates the adverse impacts of trading-related mistakes. It may lead to far higher market volatility and to surprises about suddenly diminished liquidity. This raises concerns about the stability and health of the financial markets for regulators. With the continuous and fast development of HFT, larger and larger shares of equity trades in the U.S. financial markets were executed by high-frequency traders. There was also mounting evidence that HFT-related errors disturbed market stability and caused significant financial losses. This led the regulators to increase their attention and effort to provide the exchanges and traders with guidance on HFT practices. They also expressed concerns about high-frequency traders extracting profit at the cost of traditional investors and even manipulating the market. For instance, high-frequency traders can generate a large number of orders within microseconds to exacerbate a trend. Other types of misconduct include: ping orders, which use some orders to detect other hidden orders; and quote stuffing, which issues a large number of orders to create uncertainty in the market. HFT creates room for these kinds of market abuses, and its blazing speed and huge trade volumes make their detection difficult for regulators.

Regulators have taken steps to increase their regulatory authority over HFT activities. Some of the problems that arose in the mid-2000s led to regulatory hearings in the United States Senate on dark pools, flash orders and HFT practices. Another example occurred after the Facebook IPO problem, which led the SEC to call for a limit up-limit down mechanism at the exchanges to prevent trades in individual securities from occurring outside of a specified price range, so that market volatility would be under better control. These regulatory actions put stricter requirements on HFT practices, aiming to minimize market disturbance when many fast trading orders occur within a day.

Regulating the Velocities of Dark Pools. Thought of the Day 72.0

On 22 September 2010 the SEC chair Mary Schapiro signaled that US authorities were considering the introduction of regulations targeted at HFT:

…High frequency trading firms have a tremendous capacity to affect the stability and integrity of the equity markets. Currently, however, high frequency trading firms are subject to very little in the way of obligations either to protect that stability by promoting reasonable price continuity in tough times, or to refrain from exacerbating price volatility.

However, regulating an industry working towards moving as fast as the speed of light is no ordinary administrative task: “Modern finance is undergoing a fundamental transformation. Artificial intelligence, mathematical models, and supercomputers have replaced human intelligence, human deliberation, and human execution…. Modern finance is becoming cyborg finance – an industry that is faster, larger, more complex, more global, more interconnected, and less human.” C W Lin proposes a number of principles for regulating this cyborg finance industry:

  1. Update antiquated paradigms of reasonable investors and compartmentalised institutions, confront the emerging institutional realities, and realise that the old paradigms of governance of markets may be ill-suited for the new finance industry;
  2. Enhance disclosure which recognises the complexity and technological capacities of the new finance industry;
  3. Adopt regulations to moderate the velocities of finance realising that as these approach the speed of light they may contain more risks than rewards for the new financial industry;
  4. Introduce smarter coordination harmonising financial regulation beyond traditional spaces of jurisdiction.

Electronic markets will require international coordination, surveillance and regulation. The high-frequency trading environment has the potential to generate errors and losses at a speed and magnitude far greater than that in a floor or screen-based trading environment… Moreover, issues related to risk management of these technology-dependent trading systems are numerous and complex and cannot be addressed in isolation within domestic financial markets. For example, placing limits on high-frequency algorithmic trading or restricting unfiltered sponsored access and co-location within one jurisdiction might only drive trading firms to another jurisdiction where controls are less stringent.

In these regulatory endeavours it will be vital to remember that not all innovation is intrinsically good, that some of it might be inherently dangerous, and that the objective is to make a more efficient and equitable financial system, not simply a faster one: despite its fast computers and credit derivatives, the current financial system does not seem better at transferring funds from savers to borrowers than the financial system of 1910. Furthermore, as Thomas Piketty’s Capital in the Twenty-First Century amply demonstrates, any thought of a democratisation of finance induced by the huge expansion of superannuation funds, together with the increased access to finance afforded by credit cards and ATMs, is something of a fantasy, since levels of structural inequality have endured through these technological transformations. The tragedy is that under the guise of technological advance and sophistication we could be destroying the capacity of financial markets to fulfil their essential purpose, as Haldane eloquently states:

An efficient capital market transfers savings today into investment tomorrow and growth the day after. In that way, it boosts welfare. Short-termism in capital markets could interrupt this transfer. If promised returns the day after tomorrow fail to induce saving today, there will be no investment tomorrow. If so, long-term growth and welfare would be the casualty.

Momentum of Accelerated Capital. Note Quote.

Distinct types of high frequency trading firms include independent proprietary firms, which use private funds and specific strategies which remain secretive, and may act as market makers generating automatic buy and sell orders continuously throughout the day. Broker-dealer proprietary desks are part of traditional broker-dealer firms but are not related to their client business, and are operated by the largest investment banks. Thirdly, hedge funds focus on complex statistical arbitrage, taking advantage of pricing inefficiencies between asset classes and securities.

Today strategies using algorithmic trading and High Frequency Trading play a central role on financial exchanges, alternative markets, and banks’ internalized (over-the-counter) dealings:

High frequency traders typically act in a proprietary capacity, making use of a number of strategies and generating a very large number of trades every single day. They leverage technology and algorithms from end-to-end of the investment chain – from market data analysis and the operation of a specific trading strategy to the generation, routing, and execution of orders and trades. What differentiates HFT from algorithmic trading is the high frequency turnover of positions as well as its implicit reliance on ultra-low latency connection and speed of the system.

The use of algorithms in computerised exchange trading has experienced a long evolution with the increasing digitalisation of exchanges:

Over time, algorithms have continuously evolved: while initial first-generation algorithms – fairly simple in their goals and logic – were pure trade execution algos, second-generation algorithms – strategy implementation algos – have become much more sophisticated and are typically used to produce own trading signals which are then executed by trade execution algos. Third-generation algorithms include intelligent logic that learns from market activity and adjusts the trading strategy of the order based on what the algorithm perceives is happening in the market. HFT is not a strategy per se, but rather a technologically more advanced method of implementing particular trading strategies. The objective of HFT strategies is to seek to benefit from market liquidity imbalances or other short-term pricing inefficiencies.

While algorithms are employed by most traders in contemporary markets, the intense focus on speed and the momentary holding periods are the unique practices of the high frequency traders. The defence of high frequency trading is built around the principles that it increases liquidity, narrows spreads, and improves market efficiency: the high number of trades made by HFT traders results in greater liquidity in the market; algorithmic trading has resulted in the prices of securities being updated more quickly, with more competitive bid-ask prices and narrowing spreads; and HFT enables prices to reflect information more quickly and accurately, ensuring accurate pricing at smaller time intervals. But there are critical differences between high frequency traders and traditional market makers:

  1. HFT firms do not have an affirmative market making obligation, that is, they are not obliged to provide liquidity by constantly displaying two-sided quotes, which may translate into a lack of liquidity during volatile conditions.
  2. HFT firms contribute little market depth due to the marginal size of their quotes, which may result in larger orders having to transact with many small orders, and this may impact on overall transaction costs.
  3. HFT quotes are barely accessible due to the extremely short duration for which the liquidity is available, since orders are cancelled within milliseconds.

Beyond the shallowness of the HFT contribution to liquidity, there are real fears about how HFT can compound and magnify risk through the rapidity of its actions:

There is evidence that high-frequency algorithmic trading also has some positive benefits for investors by narrowing spreads – the difference between the price at which a buyer is willing to purchase a financial instrument and the price at which a seller is willing to sell it – and by increasing liquidity at each decimal point. However, a major issue for regulators and policymakers is the extent to which high-frequency trading, unfiltered sponsored access, and co-location amplify risks, including systemic risk, by increasing the speed at which trading errors or fraudulent trades can occur.

Although there have always been occasional trading errors and episodic volatility spikes in markets, the speed, automation and interconnectedness of today’s markets create a different scale of risk. These risks demand that exchanges and market participants employ effective quality management systems and sophisticated risk mitigation controls adapted to these new dynamics in order to protect against potential threats to market stability arising from technology malfunctions or episodic illiquidity. However, there are more deliberate aspects of HFT strategies which may present serious problems for market structure and functioning, and where conduct may be illegal. For example, order anticipation seeks to ascertain the existence of large buyers or sellers in the marketplace and then to trade ahead of those buyers and sellers in anticipation that their large orders will move market prices. A momentum strategy involves initiating a series of orders and trades in an attempt to ignite a rapid price move. HFT strategies can resemble traditional forms of market manipulation that violate the Exchange Act:

  1. Spoofing and layering occur when traders create a false appearance of market activity by entering multiple non-bona fide orders on one side of the market at increasing or decreasing prices, in order to induce others to buy or sell the stock at a price altered by the bogus orders.
  2. Painting the tape involves placing successive small amounts of buy orders at increasing prices in order to stimulate increased demand.
  3. Quote stuffing and price fade are additional dubious HFT practices: quote stuffing floods the market with huge numbers of orders and cancellations in rapid succession, which may generate buying or selling interest or compromise the trading position of other market participants; order or price fade involves the rapid cancellation of orders in response to other trades.

The World Federation of Exchanges insists: “Exchanges are committed to protecting market stability and promoting orderly markets, and understand that a robust and resilient risk control framework adapted to today’s high speed markets is a cornerstone of enhancing investor confidence.” However this ‘robust and resilient risk control framework’ seems lacking, including in the dark pools now established for trading that were initially proposed as safer than the open market.

Quantum Informational Biochemistry. Thought of the Day 71.0

A natural extension of the information-theoretic Darwinian approach for biological systems is obtained by taking into account that biological systems are constituted at their fundamental level by physical systems. Therefore it is through the interaction among elementary physical systems that the biological level is reached, after the size of the system increases by several orders of magnitude, and only for certain associations of molecules – biochemistry.

In particular, this viewpoint lies at the foundation of the “quantum brain” project established by Hameroff and Penrose (Shadows of the Mind). They tried to lift quantum physical processes associated with microsystems composing the brain to the level of consciousness. Microtubules were considered as the basic quantum information processors. This project, as well as the general project of reducing biology to quantum physics, has its strong and weak sides. One of the main problems is that decoherence should quickly wash out quantum features such as superposition and entanglement. (Hameroff and Penrose would disagree with this statement. They try to develop models of the hot and macroscopic brain preserving quantum features of its elementary micro-components.)

However, even if we assume that microscopic quantum physical behavior disappears with increasing size and number of atoms due to decoherence, it seems that the basic quantum features of information processing can survive in macroscopic biological systems (operating on temporal and spatial scales which are essentially different from the scales of the quantum micro-world). The associated information processor for the mesoscopic or macroscopic biological system would be a network of increasing complexity formed by the elementary probabilistic classical Turing machines of the constituents. Such a composed network of processors can exhibit special behavioral signatures which are similar to quantum ones. We call such biological systems quantum-like. In a series of works, Asano and others (Quantum Adaptivity in Biology: From Genetics to Cognition) developed an advanced formalism for modeling the behavior of quantum-like systems based on the theory of open quantum systems and the more general theory of adaptive quantum systems. This formalism is known as quantum bioinformatics.

The present quantum-like model of biological behavior is of the operational type (as well as the standard quantum mechanical model endowed with the Copenhagen interpretation). It cannot explain physical and biological processes behind the quantum-like information processing. Clarification of the origin of quantum-like biological behavior is related, in particular, to understanding of the nature of entanglement and its role in the process of interaction and cooperation in physical and biological systems. Qualitatively the information-theoretic Darwinian approach supplies an interesting possibility of explaining the generation of quantum-like information processors in biological systems. Hence, it can serve as the bio-physical background for quantum bioinformatics. There is an intriguing point in the fact that if the information-theoretic Darwinian approach is right, then it would be possible to produce quantum information from optimal flows of past, present and anticipated classical information in any classical information processor endowed with a complex enough program. Thus the unified evolutionary theory would supply a physical basis to Quantum Information Biology.