Game Theory and Finite Strategies: Nash Equilibrium Takes Quantum Computations to Optimality.


Finite games of strategy, within the framework of noncooperative quantum game theory, can be approached from finite chain categories, where, by a finite chain category, is understood a category C(n;N) generated by n objects and N morphic chains, called primitive chains, linking the objects in a specific order, such that there is a single labelling. C(n;N) is thus generated by N primitive chains of the form:

x0 →f1 x1 →f2 x2 → … → xn−1 →fn xn —– (1)

A finite chain category is interpreted as a finite game category as follows: to each morphism in a chain, xi−1 →fi xi, there corresponds a strategy played by the player occupying position i; in this way, a chain corresponds to a sequence of strategic choices available to the players. A quantum formal theory, for a finite game category C(n;N), is defined as a formal structure such that each morphic fundament fi of the morphic relation xi−1 →fi xi is a tuple of the form:

fi := (Hi, Pi, Pˆfi) —– (2)

where Hi is the i-th player’s Hilbert space, Pi is a complete set of projectors onto a basis that spans the Hilbert space, and Pˆfi ∈ Pi. This structure is interpreted as follows: from the strategic Hilbert space Hi, given the pure strategies’ projectors Pi, the player chooses to play Pˆfi.

From the morphic fundament (2), an assumption has to be made on composition in the finite category; we assume the following tensor-product composition operation:

fj ◦ fi = fji —– (3)

fji = (Hji = Hj ⊗ Hi, Pji = Pj ⊗ Pi, Pˆfji = Pˆfj ⊗ Pˆfi) —– (4)

From here, a morphism for a game choice path could be introduced as:

x0 →fn…21 xn —– (5)

fn…21 = (HG = ⊗_{i=n}^{1} Hi, PG = ⊗_{i=n}^{1} Pi, Pˆfn…21 = ⊗_{i=n}^{1} Pˆfi) —– (6)

in this way, the choices along the chain of players are completely encoded in the tensor-product projectors Pˆfn…21. There are N = ∏_{i=1}^{n} dim(Hi) such morphisms, a number that coincides with the number of primitive chains in the category C(n;N).
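Since the composition rule (3)-(6) is just an iterated tensor product, it can be made concrete in a few lines of code. The following is a minimal sketch, assuming numpy, two players, and 2-dimensional strategic Hilbert spaces; the dimensions, names, and helper functions are illustrative choices, not fixed by the text:

```python
# Toy construction of the game-path projectors of eq. (6), assuming
# two players with 2-dimensional strategic Hilbert spaces.
import numpy as np
from itertools import product

def pure_strategy_projectors(dim):
    """Projectors onto the computational basis of a dim-dimensional space."""
    return [np.outer(e, e) for e in np.eye(dim)]

dims = [2, 2]                                 # dim(H_1), dim(H_2)
P = [pure_strategy_projectors(d) for d in dims]

# One Kronecker product per choice chain; later players end up on the
# left, matching the ordering H_n (x) ... (x) H_1 of eq. (6).
game_paths = {}
for choice in product(*[range(d) for d in dims]):
    proj = np.array([[1.0]])
    for i, c in enumerate(choice):
        proj = np.kron(P[i][c], proj)
    game_paths[choice] = proj

# N = prod_i dim(H_i) distinct morphisms, matching the primitive chains,
# and the game-path projectors resolve the identity on H_G.
assert len(game_paths) == int(np.prod(dims))
assert np.allclose(sum(game_paths.values()), np.eye(int(np.prod(dims))))
```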

Each projector can be addressed as a strategic marker of a game path, and leads to the matrix form of an Arrow-Debreu security; we therefore call it a game Arrow-Debreu projector. While, in traditional financial economics, the Arrow-Debreu securities pay one unit of numeraire per state of nature, in the present game setting, they pay one unit of payoff per game path at the beginning of the game. However far this analogy may be taken, it must be addressed with some care, since these are not securities; rather, they represent, projectively, strategic choice chains in the game, so that the price of a projector Pˆfn…21 (the Arrow-Debreu price) is the price of a strategic choice and, therefore, the result of the strategic evaluation of the game by the different players.

Now, let |Ψ⟩ be a ket vector in the game’s Hilbert space HG, such that:

|Ψ⟩ = ∑fn…21 ψ(fn…21)|fn…21⟩ —– (7)

where ψ(fn…21) is the Arrow-Debreu price amplitude, with the condition:

∑fn…21 |ψ(fn…21)|² = D —– (8)

for D > 0. The |ψ(fn…21)|² then correspond to the Arrow-Debreu prices for the game paths fn…21, and D is the discount factor in riskless borrowing, defining an economic scale for temporal connections between one unit of payoff now and one unit of payoff at the end of the game: one unit of payoff now can be capitalized to the end of the game (when the decision takes place) through multiplication by 1/D, while one unit of payoff at the end of the game can be discounted to the beginning of the game through multiplication by D.

In this case, the unit operator 1ˆ = ∑fn…21 Pˆfn…21 has a profile similar to that of a bond in standard financial economics, with ⟨Ψ|1ˆ|Ψ⟩ = D. On the other hand, the general payoff system, for each player, can be addressed from an operator expansion:

πˆi = ∑fn…21 πi(fn…21) Pˆfn…21 —– (9)

where each weight πi(fn…21) corresponds to quantities associated with each Arrow-Debreu projector that can be interpreted as similar to the quantities of each Arrow-Debreu security for a general asset. Multiplying each weight by the corresponding Arrow-Debreu price, one obtains the payoff value for each alternative such that the total payoff for the player at the end of the game is given by:

⟨Ψ|πˆi|Ψ⟩/D = ∑fn…21 πi(fn…21) |ψ(fn…21)|²/D —– (10)

We can discount the total payoff to the beginning of the game using the discount factor D, leading to the present value payoff for the player:

PVi = D ∑fn…21 πi(fn…21) |ψ(fn…21)|²/D = ⟨Ψ|πˆi|Ψ⟩ —– (11)

where πi(fn…21) represents quantities, while the ratio |ψ(fn…21)|²/D represents the future value, at the decision moment, of the quantum Arrow-Debreu prices (capitalized quantum Arrow-Debreu prices). Introducing the ket |Q⟩ ∈ HG such that:

|Q⟩ = 1/√D |Ψ⟩ —– (12)

then, |Q⟩ is a normalized ket for which the price amplitudes are expressed in terms of their future value. Replacing in (11), we have:

PVi = D ⟨Q|πˆi|Q⟩ —– (13)
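A toy numerical check of (8)-(13) helps fix the bookkeeping between present and future values. The sketch below continues the two-player toy above; the amplitudes and payoff vector are invented for illustration:

```python
# Toy check of the discounting identities, with invented numbers.
import numpy as np

D = 0.9                                          # riskless discount factor
psi = np.sqrt(D) * np.full(4, 0.5)               # eq. (8): sum |psi|^2 = D
pi_1 = np.array([3.0, 0.0, 5.0, 1.0])            # player 1's payoff per path

q = psi / np.sqrt(D)                             # eq. (12): |Q> = |Psi>/sqrt(D)
assert np.isclose(np.sum(np.abs(q) ** 2), 1.0)   # |Q> is normalized

pi_op = np.diag(pi_1)                            # eq. (9): diagonal in the
                                                 # game-path basis
pv_1 = D * q.conj() @ pi_op @ q                  # eq. (13)
assert np.isclose(pv_1, np.sum(pi_1 * np.abs(psi) ** 2))   # = <Psi|pi^_1|Psi>
print(pv_1)                                      # 2.025 with these numbers
```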

In the quantum game setting, the capitalized Arrow-Debreu price amplitudes ⟨fn…21|Q⟩ become quantum strategic configurations, resulting from the strategic cognition of the players with respect to the game. Given |Q⟩, each player’s strategic valuation of each pure strategy can be obtained by introducing the projector chains:

Cˆfi = ∑fn…i+1, fi−1…1 Pˆfn…i+1 ⊗ Pˆfi ⊗ Pˆfi−1…1 —– (14)

with ∑fi Cˆfi = 1ˆ. For each alternative choice of player i, the chain sums over all of the other players’ choice paths; such chains are called coarse-grained chains in the decoherent histories approach to quantum mechanics. Following this approach, one may introduce the pricing functional from the expression for the decoherence functional:

D(fi, gi : |Q⟩) = ⟨Q| Cˆfi Cˆgi |Q⟩ —– (15)

We then have, for each player:

D (fi, gi : |Q⟩) = 0, ∀ fi ≠ gi —– (16)

this is the usual quantum mechanical condition for additivity of measure (also known as the decoherence condition), which means that the capitalized prices for two different alternative choices of player i are additive. We can then work with the pricing functional D(fi, fi : |Q⟩) as giving, for each player, an Arrow-Debreu capitalized price associated with the pure strategy expressed by fi. Given that (16) is satisfied, each player’s quantum Arrow-Debreu pricing matrix, defined analogously to the decoherence matrix from the decoherent histories approach, is a diagonal matrix and can be expanded as a linear combination of the projectors for each player’s pure strategies as follows:

Di(|Q⟩) = ∑fi D(fi, fi : |Q⟩) Pˆfi —– (17)

which has the mathematical expression of a mixed strategy. Thus, each player chooses, from all of the possible quantum computations, the one that maximizes the present-value payoff function with all the other players’ strategies held fixed, in agreement with the Nash equilibrium condition.
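The coarse-graining of (14)-(17) also has a direct numerical transcription: for a diagonal pricing matrix, player i’s capitalized price for a pure strategy is just the total path probability with that strategy fixed. A hedged sketch for the 2×2 toy game, with amplitudes invented for illustration:

```python
# Diagonal of each player's Arrow-Debreu pricing matrix, eq. (17),
# obtained by coarse-graining |<f|Q>|^2 over the other player's choices.
import numpy as np

# Capitalized amplitudes <f|Q>, indexed q[f_a, f_b] for players a and b.
q = np.sqrt(np.array([[0.4, 0.1],
                      [0.3, 0.2]]))      # squared entries sum to 1

def pricing_diagonal(q, axis):
    """D(f_i, f_i : |Q>) for the player living on `axis` of q."""
    other = tuple(ax for ax in range(q.ndim) if ax != axis)
    return np.sum(np.abs(q) ** 2, axis=other)

for axis, name in enumerate("ab"):
    prices = pricing_diagonal(q, axis)
    assert np.isclose(prices.sum(), 1.0)     # since sum_fi C^_fi = 1^
    print(f"player {name}: capitalized price per pure strategy = {prices}")

# Weighting these prices by payoffs pi_i(f_i) and discounting by D gives the
# present-value payoff each player maximizes with the other held fixed.
```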


Intuitive Algebra (Groupoid/Categorical Structure) of Open Strings As Morphisms

A geometric Dirichlet brane is a triple (L, E, ∇E) – a submanifold L ⊂ M, carrying a vector bundle E, with connection ∇E.

The real dimension of L is also often brought into the nomenclature, so that one speaks of a Dirichlet p-brane if p = dim_R L.

An open string which stretches from a Dirichlet brane (L, E, ∇E) to a Dirichlet brane (K, F, ∇F), is a map X from an interval I ≅ [0,1] to M, such that X(0) ∈ L and X(1) ∈ K. An “open string history” is a map from R into open strings, or equivalently a map from a two-dimensional surface with boundary, say Σ ≡ I × R, to M , such that the two boundaries embed into L and K.


The quantum theory of these open strings is defined by a functional integral over these histories, with a weight which depends on the connections ∇E and ∇F. It describes the time evolution of an open string state which is a wave function in a Hilbert space HB,B′ labelled by the two choices of brane B = (L, E, ∇E) and B′ = (K, F, ∇F).


Distinct Dirichlet branes can embed into the same submanifold L. One way to represent this would be to specify the configurations of Dirichlet branes as a set of submanifolds with multiplicity. However, we can also represent this choice by using the choice of bundle E. Thus, a set of N identical branes will be represented by tensoring the bundle E with C^N. The connection is also obtained by tensor product. An N-fold copy of the Dirichlet brane (L, E, ∇E) is thus a triple (L, E ⊗ C^N, ∇E ⊗ id_N).

In physics, one visualizes this choice by labelling each open string boundary with a basis vector of CN, which specifies a choice among the N identical branes. These labels are called Chan-Paton factors. One then uses them to constrain the interactions between open strings. If we picture such an interaction as the joining of two open strings to one, the end of the first to the beginning of the second, we require not only the positions of the two ends to agree, but also the Chan-Paton factors. This operation is the intuitive algebra of open strings.

Mathematically, an algebra of open strings can always be tensored with a matrix algebra, in general producing a noncommutative algebra. More generally, if there is more than one possible boundary condition, then, rather than an algebra, it is better to think of this as a groupoid or categorical structure on the boundary conditions and the corresponding open strings. In the language of groupoids, particular open strings are elements of the groupoid, and the composition law is defined only for pairs of open strings with a common boundary. In the categorical language, boundary conditions are objects, and open strings are morphisms. The simplest intuitive argument that a non-trivial choice can be made here is to call upon the general principle that any local deformation of the world-sheet action should be a physically valid choice. In particular, particles in physics can be charged under a gauge field, for example the Maxwell field for an electron, the color Yang-Mills field for a quark, and so on. The wave function for a charged particle is then not complex-valued, but takes values in a bundle E.
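The composition rule just described, where morphisms compose only when their boundary conditions match and Chan-Paton indices are contracted, can be mocked up directly. A minimal sketch, with all class and field names invented for illustration:

```python
# Boundary conditions as objects, open strings as morphisms; composition
# is defined only when the shared boundary matches, and it contracts the
# matching Chan-Paton indices by matrix multiplication.
from dataclasses import dataclass
import numpy as np

@dataclass(frozen=True)
class Brane:
    name: str
    n: int                 # Chan-Paton multiplicity (N identical copies)

@dataclass
class OpenString:
    src: Brane             # brane at X(0)
    dst: Brane             # brane at X(1)
    wave: np.ndarray       # Chan-Paton matrix, shape (dst.n, src.n)

def compose(g: OpenString, f: OpenString) -> OpenString:
    """g after f: only defined when f ends where g begins."""
    if f.dst != g.src:
        raise ValueError("boundary conditions do not match")
    return OpenString(f.src, g.dst, g.wave @ f.wave)

L, K = Brane("L", 2), Brane("K", 3)
f = OpenString(L, K, np.ones((3, 2)))
g = OpenString(K, L, np.ones((2, 3)))
h = compose(g, f)          # an L -> L string; compose(f, f) would raise
```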

Now, the effect of a general connection ∇E is to modify the functional integral by modifying the weight associated to a given history of the particle. Suppose the trajectory of a particle is defined by a map φ : R → M; then a natural functional on trajectories associated with a connection ∇E is simply its holonomy along the trajectory, a linear map from E|φ(t1) to E|φ(t2). The functional integral is now defined physically as a sum over trajectories with this holonomy included in the weight.
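Numerically, the holonomy entering the weight is a path-ordered product of infinitesimal parallel transports. A rough sketch, where the connection and trajectory are made up for illustration:

```python
# Holonomy along a discretized trajectory as an ordered product of
# (I + A(x) dx); later times act on the left.
import numpy as np

def holonomy(A, path):
    hol = np.eye(A(path[0]).shape[0], dtype=complex)
    for x0, x1 in zip(path[:-1], path[1:]):
        hol = (np.eye(hol.shape[0]) + A(0.5 * (x0 + x1)) * (x1 - x0)) @ hol
    return hol

# Abelian example: A(x) = i x on the segment [0, 1]; the ordered product
# converges to exp(i/2) as the discretization is refined.
A = lambda x: 1j * x * np.eye(1)
print(holonomy(A, np.linspace(0.0, 1.0, 2000)))
```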

The simplest way to generalize this to a string is to consider the l_s → 0 limit. Now the constraint of finiteness of energy is satisfied only by a string of vanishingly small length, effectively a particle. In this limit, both ends of the string map to the same point, which must therefore lie on L ∩ K.

The upshot is that, in this limit, the wave function of an open string between Dirichlet branes (L, E, ∇E) and (K, F, ∇F) transforms as a section of E ⊠ F over L ∩ K, with the natural connection on the direct product. In the special case of (L, E, ∇E) ≅ (K, F, ∇F), this reduces to the statement that an open string state is a section of End E. Open string states are sections of a graded vector bundle End E ⊗ Λ•T∗L, the degree-1 part of which corresponds to infinitesimal deformations of ∇E. In fact, these open string states are the infinitesimal deformations of ∇E, in the standard sense of quantum field theory, i.e., a single open string is a localized excitation of the field obtained by quantizing the connection ∇E. Similarly, other open string states are sections of the normal bundle of L within X, and are related in the same way to infinitesimal deformations of the submanifold. These relations, and their generalizations to open strings stretched between Dirichlet branes, define the physical sense in which the particular set of Dirichlet branes associated to a specified background X can be deduced from string theory.

Energy Trading: Asian Options.


Consider a risky asset (stock, commodity, a unit of energy) with the price S(t), where t ∈ [0, T], for a given T > 0. Consider an option with the payoff

Fu = Φ(u(·), S(·)) —– (1)

This payoff depends on a control process u(·) that is selected by an option holder from a certain class of admissible controls U. The mapping Φ : U × S → R is given; S is the set of paths of S(t). All processes from U have to be adapted to the current information flow, i.e., adapted to some filtration Ft that describes this information flow. We call the corresponding options controlled options.

For simplicity, we assume that all options give the right on the corresponding payoff of the amount Fu in cash rather than the right to buy or sell stock or commodities.

Consider a risky asset with the price S(t). Let T > 0 be given, and let g : R → R and f : R × [0, T] → R be some functions. Consider an option with the payoff at time T

Fu = g(∫_0^T u(t) f(S(t), t) dt) —– (2)

Here u(t) is the control process that is selected by the option holder. The process u(t) has to be adapted to the filtration Ft describing the information flow. In addition, it has to be selected such that

∫_0^T u(t) dt = 1

A possible modification is the option with the payoff

Fu = ∫_0^T u(t) f(S(t), t) dt + (1 − ∫_0^T u(t) dt) f(S(T), T)

In this case, the unused weight u(t) is accumulated and used at the terminal time. Let us consider some examples of possible selections of f and g. We denote x+ = max(0, x).

Important special cases are the options with g(x) = x, g(x) = (x − K)+, g(x) = (K − x)+, or g(x) = min(M, x), where M > 0 is a cap on benefits, and with

f(x, t) = x, f(x, t) = (x − K)+, f(x, t) = (K − x)+ —– (3)

or

f(x, t) = e^{r(T−t)}(x − K)+, f(x, t) = e^{r(T−t)}(K − x)+ —– (4)

where K > 0 is given and where r > 0 is the risk-free rate. Options (3) correspond to the case when the payments are made at current time t ∈ [0, T], and options (4) correspond to the case when the payment is made at terminal time T. This takes into account accumulation of interest up to time T on any payoff.

The option with payoff (2) with f(x, t) ≡ x represents a generalization of the Asian option, where the weight u(t) is selected by the holder. Note that an Asian option, also called an average option, is an option whose payoff depends on the average price of the underlying asset over a certain period of time, as opposed to the price at maturity. The option with payoff (2) with g(x) ≡ x represents a limit version of multi-exercise options, when the distribution of exercise times approaches a continuous distribution. An additional restriction |u(t)| ≤ const would represent the continuous analog of the requirement for multi-exercise options that exercise times be some distance from each other. For an analog of the model without this condition, strategies may approach delta functions.
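As a sanity check on the definitions, the payoff (2) with the choices (4) can be priced by Monte Carlo under the simplest admissible control, the uniform weighting u(t) = 1/T. The sketch below assumes risk-neutral geometric Brownian motion dynamics and invented parameter values; any adapted control improving on the uniform weighting can only raise the holder's value:

```python
# Monte-Carlo value of the controlled payoff (2) with g(x) = x and
# f(x, t) = exp(r (T - t)) (x - K)+, under the naive control u(t) = 1/T.
# GBM dynamics and all parameter values are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
S0, r, sigma, T, K = 100.0, 0.03, 0.2, 1.0, 100.0
n_steps, n_paths = 250, 20_000
dt = T / n_steps
t = dt * np.arange(1, n_steps + 1)

# Risk-neutral GBM paths S(t).
z = rng.standard_normal((n_paths, n_steps))
S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1))

u = 1.0 / T                                   # satisfies int_0^T u(t) dt = 1
F_u = np.sum(u * np.exp(r * (T - t)) * np.maximum(S - K, 0.0) * dt, axis=1)
print(f"price under uniform u: {np.exp(-r * T) * F_u.mean():.4f}")
```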

These options can be used, for instance, for energy trading, with u(t) representing the quantity of energy purchased at time t for the fixed price K when the market price is above K. In this case, the option represents a modification of the multi-exercise call option with continuously distributed payoff time. For this model, the total amount of energy that can be purchased per option is limited. Therefore, the option holder may prefer to postpone the purchase if she expects better opportunities in the future.

String’s Depth of Burial


A string’s depth might be defined as the execution time of its minimal program.

The difficulty with this definition arises in cases where the minimal program is only a few bits smaller than some much faster program, such as a print program, to compute the same output x. In this case, slight changes in x may induce arbitrarily large changes in the run time of the minimal program, by changing which of the two competing programs is minimal. Analogous instability manifests itself in translating programs from one universal machine to another. This instability emphasizes the essential role of the quantity of buried redundancy, not as a measure of depth, but as a certifier of depth. In terms of the philosophy-of-science metaphor, an object whose minimal program is only a few bits smaller than its print program is like an observation that points to a nontrivial hypothesis, but with only a low level of statistical confidence.

To adequately characterize a finite string’s depth one must therefore consider the amount of buried redundancy as well as the depth of its burial. A string’s depth at significance level s might thus be defined as that amount of time complexity which is attested by s bits worth of buried redundancy. This characterization of depth may be formalized in several ways.

A string’s depth at significance level s might be defined as the time required to compute the string by a program no more than s bits larger than the minimal program.

This definition solves the stability problem, but is unsatisfactory in the way it treats multiple programs of the same length. Intuitively, 2^k distinct (n + k)-bit programs that compute the same output ought to be accorded the same weight as one n-bit program; but, by the present definition, they would be given no more weight than a single (n + k)-bit program.

A string’s depth at significance level s might instead be defined as the time t required for the string’s time-bounded algorithmic probability Pt(x) to rise to within a factor 2^−s of its asymptotic time-unbounded value P(x).

This formalizes the notion that for the string to have originated by an effective process of t steps or fewer is less plausible than for the first s tosses of a fair coin all to come up heads.

It is not known whether there exist strings that are deep in this sense, that is, strings having no small fast programs even though they have enough large fast programs to contribute a significant fraction of their algorithmic probability. Such strings might be called deterministically deep but probabilistically shallow, because their chance of being produced quickly in a probabilistic computation (e.g., one where the input bits of U are supplied by coin tossing) is significant compared to their chance of being produced slowly. The question of whether such strings exist is probably hard to answer because it does not relativize uniformly: deterministic and probabilistic depths are not very different relative to a random coin-toss oracle A (a consequence of the equality of random-oracle-relativized deterministic and probabilistic polynomial time complexity classes), but they can be very different relative to an oracle B deliberately designed to hide information from deterministic computations (this parallels Hunt’s proof that deterministic and probabilistic polynomial time are unequal relative to such an oracle).

(Depth of Finite Strings): Let x and w be strings and s a significance parameter. A string’s depth at significance level s, denoted Ds(x), will be defined as min{T(p) : (|p| − |p∗| < s) ∧ (U(p) = x)}, the least time required to compute it by an s-incompressible program (p∗ denoting the minimal program for p). At any given significance level, a string will be called t-deep if its depth exceeds t, and t-shallow otherwise.
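Depth is of course uncomputable in general, but the shape of the definition is easy to exercise on a toy, fully-enumerated "machine". In the sketch below the program table (bit-lengths, runtimes, outputs) is entirely invented, and incompressibility is simplified to distance from the shortest program for x rather than from each program's own minimal description:

```python
# Toy illustration of D_s(x): the least runtime over programs within
# s bits of minimal. Programs are a made-up finite table standing in
# for the universal machine U.
programs = [
    (10, 5, "x"),        # minimal program: 10 bits, 5 steps
    (11, 10**6, "x"),    # near-minimal but very slow
    (12, 3, "x"),
    (25, 1, "x"),        # a "print x"-style program: long but instant
]

def depth(x, s):
    px = [(length, t) for length, t, out in programs if out == x]
    minimal = min(length for length, _ in px)
    return min(t for length, t in px if length - minimal < s)

print(depth("x", s=1))    # only the minimal program qualifies -> 5
print(depth("x", s=3))    # the 12-bit, 3-step program qualifies -> 3
print(depth("x", s=16))   # the print program qualifies -> depth drops to 1
```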

The difference between this definition and the previous one is rather subtle philosophically and not very great quantitatively. Philosophically, whereas the previous definition requires that each individual hypothesis for the rapid origin of x be implausible at the 2^−s confidence level, the present definition requires only that a weighted average of all such hypotheses be implausible.

There exist constants c1 and c2 such that for any string x, if programs running in time ≤ t contribute a fraction between 2^−s and 2^{−s+1} of the string’s total algorithmic probability, then x has depth at most t at significance level s + c1 and depth at least t at significance level s − min{H(s), H(t)} − c2.

Proof: The first part follows easily from the fact that any k-compressible self-delimiting program p is associated with a unique program, k − O(1) bits shorter, of the form “execute the result of executing p∗”. Therefore there exists a constant c1 such that if all t-fast programs for x were (s + c1)-compressible, the associated shorter programs would contribute more than the total algorithmic probability of x. The second part follows because, roughly, if fast programs contribute only a small fraction of the algorithmic probability of x, then the property of being a fast program for x is so unusual that no program having that property can be random. More precisely, the t-fast programs for x constitute a finite prefix set, a superset S of which can be computed by a program of size H(x) + min{H(t), H(s)} + O(1) bits. (Given x∗ and either t∗ or s∗, begin enumerating all self-delimiting programs that compute x, in order of increasing running time, and quit when either the running time exceeds t or the accumulated measure of programs so far enumerated exceeds 2^−(H(x)−s).) Therefore there exists a constant c2 such that every member of S, and thus every t-fast program for x, is compressible by at least s − min{H(s), H(t)} − O(1) bits.

The ability of universal machines to simulate one another efficiently implies a corresponding degree of machine-independence for depth: for any two efficiently universal machines of the sort considered here, there exists a constant c and a linear polynomial L such that for any t, strings whose (s+c)-significant depth is at least L(t) on one machine will have s-significant depth at least t on the other.

Depth of one string relative to another may be defined analogously, and represents the plausible time required to produce one string, x, from another, w.

(Relative Depth of Finite Strings): For any two strings w and x, the depth of x relative to w at significance level s, denoted Ds(x/w), will be defined as min{T(p, w) : (|p|−|(p/w)∗| < s)∧(U(p, w) = x)}, the least time required to compute x from w by a program that is s-incompressible relative to w.

Depth of a string relative to its length is a particularly useful notion, allowing us, as it were, to consider the triviality or nontriviality of the “content” of a string (i.e. its bit sequence), independent of its “form” (length). For example, although the infinite sequence 000… is intuitively trivial, its initial segment 0n is deep whenever n is deep. However, 0n is always shallow relative to n, as is, with high probability, a random string of length n.

In order to adequately represent the intuitive notion of stored mathematical work, it is necessary that depth obey a “slow growth” law, i.e. that fast deterministic processes be unable to transform a shallow object into a deep one, and that fast probabilistic processes be able to do so only with low probability.

(Slow Growth Law): Given any data string x and two significance parameters s2 > s1, a random program generated by coin tossing has probability less than 2^{−(s2−s1)+O(1)} of transforming x into an excessively deep output, i.e., one whose s2-significant depth exceeds the s1-significant depth of x plus the run time of the transforming program plus O(1). More precisely, there exist positive constants c1, c2 such that for all strings x, and all pairs of significance parameters s2 > s1, the prefix set {q : Ds2(U(q, x)) > Ds1(x) + T(q, x) + c1} has measure less than 2^{−(s2−s1)+c2}.

Proof: Let p be an s1-incompressible program which computes x in time Ds1(x), and let r be the restart prefix mentioned in the definition of the U machine. Let Q be the prefix set {q : Ds2(U(q, x)) > T(q, x) + Ds1(x) + c1}, where the constant c1 is sufficient to cover the time overhead of concatenation. For all q ∈ Q, the program rpq by definition computes some deep result U(q, x) in less time than that result’s own s2-significant depth, and so rpq must be compressible by s2 bits. The sum of the algorithmic probabilities of strings of the form rpq, where q ∈ Q, is therefore

∑q∈Q P(rpq) < ∑q∈Q 2^{−|rpq|+s2} = 2^{−|r|−|p|+s2} μ(Q)

On the other hand, since the self-delimiting program p can be recovered from any string of the form rpq (by deleting r and executing the remainder pq until halting occurs, by which time exactly p will have been read), the algorithmic probability of p is at least as great (within a constant factor) as the sum of the algorithmic probabilities of the strings {rpq : q ∈ Q} considered above:

P(p) > μ(Q) · 2^{−|r|−|p|+s2−O(1)}

Recalling the fact that minimal program size is equal, within a constant factor, to the −log of algorithmic probability, and the s1-incompressibility of p, we have P(p) < 2^{−(|p|−s1+O(1))}, and therefore finally

μ(Q) < 2^{−(s2−s1)+O(1)}, which was to be demonstrated.

Belief Networks “Acyclicity”. Thought of the Day 69.0

Belief networks are used to model uncertainty in a domain. The term “belief networks” encompasses a whole range of different but related techniques which deal with reasoning under uncertainty. Both quantitative (mainly Bayesian probabilistic) and qualitative techniques are used. Belief networks are used to develop knowledge-based applications in domains which are characterised by inherent uncertainty, and increasingly they are being employed to deliver advanced knowledge-based systems that solve real-world problems. They are particularly useful for diagnostic applications and have been used in many deployed systems; the free-text help facility in the Microsoft Office product, for example, employs Bayesian belief network technology. Within a belief network the belief of each node (the node’s conditional probability) is calculated based on observed evidence, and various methods have been developed for evaluating node beliefs and for performing probabilistic inference.

Influence diagrams are an extension of belief networks used for decision making: they provide facilities for structuring the goals of the diagnosis and for ascertaining the value (the influence) that given information will have when determining a diagnosis. In influence diagrams there are three types of node: chance nodes, which correspond to the nodes in Bayesian belief networks; utility nodes, which represent the utilities of decisions; and decision nodes, which represent decisions which can be taken to influence the state of the world. Influence diagrams are useful in real-world applications where there is often a cost, both in time and money, in obtaining information.

The basic idea in belief networks is that the problem domain is modelled as a set of nodes interconnected with arcs to form a directed acyclic graph. Each node represents a random variable, or uncertain quantity, which can take two or more possible values. The arcs signify the existence of direct influences between the linked variables, and the strength of each influence is quantified by a forward conditional probability.

The Belief Network, also called the Bayesian Network, is a directed acyclic graph for probabilistic reasoning. It defines the conditional dependencies of the model by associating each node X with a conditional probability P(X|Pa(X)), where Pa(X) denotes the parents of X. Two of its conditional independence properties are:

1. Each node is conditionally independent of its non-descendants given its parents.

2. Each node is conditionally independent of all other nodes given its Markov blanket, which consists of its parents, children, and children’s parents.

Inference in a Belief Network consists of computing the posterior probability distribution

P(H|V) = P(H,V) / ∑_H P(H,V)

where H is the set of query variables and V is the set of evidence variables. Approximate inference involves sampling to compute posteriors. The Sigmoid Belief Network is a type of Belief Network such that

P(Xi = 1 | Pa(Xi)) = σ(∑_{Xj ∈ Pa(Xi)} Wji Xj + bi)

where Wji is the weight assigned to the edge from Xj to Xi, and σ is the sigmoid function.
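Both formulas can be exercised together in a few lines: ancestral sampling from a tiny two-layer Sigmoid Belief Network, followed by a rejection-sampling estimate of the posterior P(H|V). The network shape, weights, and evidence pattern below are all invented for this sketch:

```python
# Ancestral sampling from a toy Sigmoid Belief Network, plus a
# rejection-sampling approximation of P(H|V).
import numpy as np

rng = np.random.default_rng(0)
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))

n_top, n_bottom = 2, 3
W = rng.normal(size=(n_top, n_bottom))   # W[j, i]: weight on edge X_j -> X_i
b_top, b_bottom = np.zeros(n_top), np.zeros(n_bottom)

def sample():
    """Parents first, then children given parents (the DAG ordering)."""
    x_top = (rng.random(n_top) < sigma(b_top)).astype(int)
    p = sigma(x_top @ W + b_bottom)      # P(X_i = 1 | Pa(X_i))
    x_bottom = (rng.random(n_bottom) < p).astype(int)
    return x_top, x_bottom

# Evidence V fixes the bottom layer; keep only consistent samples and
# average the query variables H (the top layer).
evidence = np.array([1, 0, 1])
kept = [h for h, v in (sample() for _ in range(100_000))
        if np.array_equal(v, evidence)]
print(np.mean(kept, axis=0))             # estimate of P(X_j = 1 | V)
```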
