Black Hole Entropy in terms of Mass. Note Quote.


If M-theory is compactified on a d-torus, it becomes a D = 11 − d dimensional theory with Newton constant

G_D = G_11/L^d = l_11^9/L^d —– (1)

A Schwarzschild black hole of mass M has a radius

R_s ~ M^(1/(D-3)) G_D^(1/(D-3)) —– (2)

According to Bekenstein and Hawking the entropy of such a black hole is

S = Area/(4G_D) —– (3)

where Area refers to the D – 2 dimensional hypervolume of the horizon:

Area ~ R_s^(D-2) —– (4)

Thus

S ~ (1/G_D)(M G_D)^((D-2)/(D-3)) ~ M^((D-2)/(D-3)) G_D^(1/(D-3)) —– (5)
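As a quick sanity check, (5) follows mechanically from (2)-(4); a minimal sympy sketch (the symbol names are ours, D kept symbolic):

```python
import sympy as sp

M, G, D = sp.symbols('M G_D D', positive=True)

Rs = M**(1/(D - 3)) * G**(1/(D - 3))   # horizon radius, eq. (2)
S = Rs**(D - 2) / G                    # eqs. (3)-(4), dropping the factor 1/4

# eq. (5): S ~ M^((D-2)/(D-3)) * G_D^(1/(D-3))
expected = M**((D - 2)/(D - 3)) * G**(1/(D - 3))
print(sp.simplify(S / expected))       # -> 1
print(((D - 2)/(D - 3)).subs(D, 4))    # -> 2: in D = 4, S ~ M^2, the familiar Schwarzschild result
```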

From the traditional relativists’ point of view, black holes are extremely mysterious objects. They are described by unique classical solutions of Einstein’s equations. All perturbations quickly die away, leaving a featureless “bald” black hole with “no hair”. On the other hand, Bekenstein and Hawking have given persuasive arguments that black holes possess thermodynamic entropy and temperature, which point to the existence of a hidden microstructure. In particular, entropy generally represents the counting of hidden microstates which are invisible in a coarse grained description.

An ultimate exact treatment of objects in matrix theory requires a passage to the infinite N limit. Unfortunately this limit is extremely difficult. For the study of Schwarzschild black holes, the optimal value of N (the value which is large enough to obtain an adequate description without involving many redundant variables) is of order the entropy, S, of the black hole.

Considering the minimum such value for N, we have

N_min(S) = M R_s = M (M G_D)^(1/(D-3)) = S —– (6)

We see that the value of N_min in every dimension is proportional to the entropy of the black hole. The thermodynamic properties of super Yang-Mills theory can be estimated by standard arguments only if S ≤ N. Thus we are caught between conflicting requirements. For N >> S we don’t have tools to compute. For N ~ S the black hole will not fit into the compact geometry. Therefore we are forced to study the black hole using N = N_min = S.

Matrix theory compactified on a d-torus is described by a d + 1 dimensional super Yang-Mills theory with 16 real supercharges. For d = 3 we are dealing with a very well known and special quantum field theory. In the standard 3+1 dimensional terminology it is U(N) Yang-Mills theory with 4 supersymmetries and with all fields in the adjoint representation. This theory is very special in that, in addition to having electric/magnetic duality, it enjoys another property which makes it especially easy to analyze, namely it is exactly scale invariant.

Let us begin by considering it in the thermodynamic limit. The theory is characterized by a “moduli” space defined by the expectation values of the scalar fields φ. Since the φ also represent the positions of the original D0-branes in the noncompact directions, we choose them at the origin. This represents the fact that we are considering a single compact object – the black hole – and not several disconnected pieces.

The equation of state of the system is defined by giving the entropy S as a function of temperature. Since entropy is extensive, it is proportional to the volume Σ^3 of the dual torus. Furthermore, scale invariance ensures that S has the form

S = constant Σ^3 T^3 —– (7)

The constant in this equation counts the number of degrees of freedom. For vanishing coupling constant, the theory is described by free quanta in the adjoint of U(N). This means that the number of degrees of freedom is ~ N^2.

From the standard thermodynamic relation,

dE = TdS —– (8)

the energy of the system is

E ~ N^2 Σ^3 T^4 —– (9)

In order to relate entropy and mass of the black hole, let us eliminate temperature from (7) and (9).

S = N^2 Σ^3 (E/(N^2 Σ^3))^(3/4) —– (10)

Now the energy of the quantum field theory is identified with the light cone energy of the system of D0-branes forming the black hole. That is

E ≈ M^2 R/N —– (11)

Plugging (11) into (10)

S = N^2 Σ^3 (M^2 R/(N^3 Σ^3))^(3/4) —– (12)

This makes sense only when N << S; when N >> S, computing the equation of state is slightly trickier. At N ~ S, this is precisely the correct form for the black hole entropy in terms of the mass.
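To see that (12) indeed reproduces (5) at N = N_min = S, one can impose N = S in (12) and solve for S symbolically; a minimal sympy sketch (V stands for the dual-torus volume Σ^3; names are ours):

```python
import sympy as sp

S, M, R, V = sp.symbols('S M R V', positive=True)

# eq. (12) with N set equal to S, V standing for the dual-torus volume Sigma^3
rhs = S**2 * V * (M**2 * R / (S**3 * V))**sp.Rational(3, 4)

sol = sp.solve(sp.Eq(S, sp.powsimp(rhs, force=True)), S)
print(sol)   # -> [M**(6/5)*R**(3/5)*V**(1/5)] (up to power rearrangement), i.e. S ~ M^(6/5)
```

The exponent 6/5 is exactly (D − 2)/(D − 3) for D = 11 − 3 = 8, the d = 3 compactification discussed here, matching (5).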

From God’s Perspective, There Are No Fields…Justified Newtonian, Unjustified Relativistic Claim. Note Quote.

Electromagnetism is a relativistic theory. Indeed, it had been relativistic, or Lorentz invariant, before Einstein and Minkowski understood that this somewhat peculiar symmetry of Maxwell’s equations was not accidental but expressive of a radically new structure of time and space. Minkowski spacetime, in contrast to Newtonian spacetime, doesn’t come with a preferred space-like foliation; its geometric structure is not one of ordered slices representing “objective” hyperplanes of absolute simultaneity. But Minkowski spacetime does have an objective (geometric) structure of light-cones, with one double-light-cone originating in every point. The most natural way to define a particle interaction in Minkowski spacetime is to have the particles interact directly, not along equal-time hyperplanes but along light-cones.

[Figure: particle b interacts with particle a at a point x via retarded and advanced waves.]

In other words, if z_i(τ_i) and z_j(τ_j) denote the trajectories of two charged particles, it wouldn’t make sense to say that the particles interact at “equal times”, as in Newtonian theory. It would however make perfect sense to say that the particles interact whenever

(z_i − z_j)^μ (z_i − z_j)_μ = (z_i − z_j)^2 = 0 —– (1)
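As a toy illustration of (1) (our own example, not part of the original argument): for straight-line worldlines the light-cone condition picks out one retarded and one advanced interaction point, which can be located numerically:

```python
import numpy as np
from scipy.optimize import brentq

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric, signature (+,-,-,-)

def interval(x, y):
    d = x - y
    return d @ eta @ d

# toy straight-line worldlines (c = 1); particle j held at rest at x = 1
def z_i(t): return np.array([t, 0.3 * t, 0.0, 0.0])
def z_j(t): return np.array([t, 1.0, 0.0, 0.0])

# condition (1): particle i at time t0 interacts with particle j at the times t
# where (z_i(t0) - z_j(t))^2 = 0 -- one retarded (t < t0) and one advanced (t > t0) root
t0 = 0.0
f = lambda t: interval(z_i(t0), z_j(t))
t_ret = brentq(f, -10.0, t0 - 1e-9)
t_adv = brentq(f, t0 + 1e-9, 10.0)
print(t_ret, t_adv)   # -> -1.0 and 1.0: the light-cone crossings at spatial distance 1
```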

For an observer finding himself in a universe guided by such laws it might then seem like the effects of particle interactions were propagating through space with the speed of light. And this observer may thus insist that there must be something in addition to the particles, something moving or evolving in spacetime and mediating interactions between charged particles. And all this would be a completely legitimate way of speaking, only that it would reflect more about how things appear from a local perspective in a particular frame of reference than about what is truly and objectively going on in the physical world. From “God’s perspective” there are no fields (or photons, or anything of that kind) – only particles in spacetime interacting with each other. This might sound hypothetical, but it is not entirely fictitious, for such a formulation of electrodynamics actually exists and is known as Wheeler–Feynman electrodynamics, or Wheeler–Feynman absorber theory.

There is a formal property of field equations called “gauge invariance” which makes it possible to look at things in several different, but equivalent, ways. Because of gauge invariance, this theory says that when you push on something, it creates a disturbance in the gravitational field that propagates outward into the future. Out there in the distant future the disturbance interacts with, chiefly, the distant matter in the universe. It wiggles. When it wiggles it sends a gravitational disturbance backward in time (a so-called “advanced” wave). The effect of all of these “advanced” disturbances propagating backward in time is to create the inertial reaction force you experience at the instant you start to push (and to cancel the advanced wave that would otherwise be created by you pushing on the object). So, in this view, fields do not have a real existence independent of the sources that emit and absorb them; the theory is defined by the principle of least action.

Wheeler–Feynman electrodynamics and Maxwell–Lorentz electrodynamics are for all practical purposes empirically equivalent, and it may seem that the choice between the two candidate theories is merely one of convenience and philosophical preference. But this is not really the case, since the sad truth is that the field theory, despite its phenomenal success in practical applications and the crucial role it played in the development of modern physics, is inconsistent. The reason is quite simple. The Maxwell–Lorentz theory for a system of N charged particles is defined, as it should be, by a set of mathematical equations. The equation of motion for the particles is given by the Lorentz force law:

The electromagnetic force F on a test charge at a given point and time is a certain function of its charge q and velocity v, which can be parameterized by exactly two vectors E and B, in the functional form

F = q(E + v × B)

describing the acceleration of a charged particle in an electromagnetic field. The electromagnetic field, represented by the field-tensor Fμν, is described by Maxwell’s equations. The homogeneous Maxwell equations tell us that the antisymmetric tensor Fμν (a 2-form) can be written as the exterior derivative of a potential (a 1-form) Aμ(x), i.e. as

Fμν = ∂μ Aν – ∂ν Aμ —– (2)

The inhomogeneous Maxwell equations couple the field degrees of freedom to matter, that is, they tell us how the charges determine the configuration of the electromagnetic field. Fixing the gauge-freedom contained in (2) by demanding ∂μAμ(x) = 0 (Lorentz gauge), the remaining Maxwell equations take the particularly simple form:

□ Aμ = – 4π jμ —– (3)

where

□ = ∂_μ ∂^μ

is the d’Alembert operator and jμ the 4-current density.
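For orientation (a standard result, not part of the original text): the retarded solution of (3) for a single point charge e on a world-line z(τ) with 4-velocity u^μ = dz^μ/dτ is the covariant Liénard–Wiechert potential

A^μ(x) = e u^μ(τ_ret) / (u(τ_ret)·(x − z(τ_ret)))

where τ_ret is fixed by the light-cone condition (x − z(τ_ret))^2 = 0 with z^0(τ_ret) < x^0; the advanced solution replaces the past light-cone by the future one.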

The light-cone structure of relativistic spacetime is reflected in the Lorentz-invariant equation (3). The Liénard–Wiechert field at spacetime point x depends on the trajectories of the particles at the points of intersection with the (past and future) light-cones originating in x. The Liénard–Wiechert field (the solution of (3)) is singular precisely at the points where it is needed, namely on the world-lines of the particles. This is the notorious problem of the electron self-interaction: a charged particle generates a field, the field acts back on the particle, the field-strength becomes infinite at the point of the particle and the interaction terms blow up. Hence, the simple truth is that the field concept for managing interactions between point-particles doesn’t work, unless one relies on formal manipulations like renormalization or modifies Maxwell’s laws on small scales.

However, we don’t need the fields: by taking the idea of a relativistic interaction theory seriously, we can “cut out the middle man” and let the particles interact directly. The status of the Maxwell equations (3) in Wheeler–Feynman theory is then somewhat analogous to the status of Laplace’s equation in Newtonian gravity. We can get to the Galilean invariant theory by writing the force as the gradient of a potential and having that potential satisfy the simplest nontrivial Galilean invariant equation, which is the Laplace equation:

ΔV(x, t) = ∑_i δ(x − x_i(t)) —– (4)

Similarly, we can get the (arguably) simplest Lorentz invariant theory by writing the force as the exterior derivative of a potential and having that potential satisfy the simplest nontrivial Lorentz invariant equation, which is (3). And as concerns the equation of motion for the particles: if the trajectories are parametrized by proper time, then the Minkowski norm of the 4-velocity is a constant of motion. In Newtonian gravity, we can make sense of the gravitational potential at any point in space by conceiving its effect on a hypothetical test particle, feeling the gravitational force without gravitating itself. However, nothing in the theory suggests that we should take the potential seriously in that way and conceive of it as a physical field. Indeed, the gravitational potential is really a function on configuration space rather than a function on physical space, and it is really a useful mathematical tool rather than something corresponding to physical degrees of freedom. From the point of view of a direct interaction theory, an analogous reasoning would apply in the relativistic context. It may seem (and historically it has certainly been the usual understanding) that (3), in contrast to (4), is a dynamical equation, describing the temporal evolution of something. However, from a relativistic perspective, this conclusion seems unjustified.
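For comparison (a standard fact, added here for orientation): the solution of (4) vanishing at infinity is the instantaneous potential

V(x, t) = −(1/4π) ∑_i 1/|x − x_i(t)|

(using Δ(1/|x|) = −4πδ(x)), which makes explicit that the Newtonian “field” is just a bookkeeping device for the instantaneous particle configuration.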

Stochasticities. Lévy processes. Part 2.

Define the characteristic function of Xt:

Φ_t(z) ≡ Φ_Xt(z) ≡ E[e^(iz·Xt)], z ∈ R^d

For t > s, by writing Xt+s = Xs + (Xt+s − Xs) and using the fact that Xt+s − Xs is independent of Xs, we obtain that t ↦ Φt(z) is a multiplicative function.

Φ_(t+s)(z) = Φ_(Xt+s)(z) = Φ_(Xs)(z) Φ_(Xt+s − Xs)(z) = Φ_s(z) Φ_t(z)

The stochastic continuity of t ↦ Xt implies in particular that Xs → Xt in distribution when s → t. Therefore, Φ_(Xs)(z) → Φ_(Xt)(z) when s → t, so t ↦ Φ_t(z) is a continuous function of t. Together with the multiplicative property Φ_(s+t)(z) = Φ_s(z)·Φ_t(z), this implies that t ↦ Φ_t(z) is an exponential function.

Let (Xt)t≥0 be a Lévy process on R^d. ∃ a continuous function ψ : R^d ↦ C, called the characteristic exponent of X, such that:

E[e^(iz·Xt)] = e^(tψ(z)), z ∈ R^d

ψ is the cumulant generating function of X1: ψ = Ψ_X1, and the cumulant generating function of Xt varies linearly in t: Ψ_Xt = tΨ_X1 = tψ. The law of Xt is therefore determined by the knowledge of the law of X1: the only degree of freedom we have in specifying a Lévy process is to specify the distribution of Xt for a single time (say, t = 1).
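These properties are easy to check numerically; a minimal sketch for a toy Lévy process (Brownian motion plus compound Poisson jumps; all parameter choices are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def levy_increments(t, n, sigma=1.0, lam=2.0, jump_scale=0.5):
    """n i.i.d. samples of X_t for X = Brownian motion + compound Poisson jumps."""
    gauss = sigma * np.sqrt(t) * rng.standard_normal(n)
    jumps = np.array([rng.normal(0, jump_scale, k).sum()
                      for k in rng.poisson(lam * t, n)])
    return gauss + jumps

z, t, n = 1.3, 0.7, 200_000
phi = lambda x: np.mean(np.exp(1j * z * x))   # empirical characteristic function

phi_t  = phi(levy_increments(t, n))
phi_2t = phi(levy_increments(2 * t, n))
print(phi_2t, phi_t**2)    # approximately equal: Phi_{2t}(z) = Phi_t(z)^2

# closed-form exponent for this process: psi(z) = -sigma^2 z^2/2 + lam*(E[e^{izJ}] - 1)
psi = -0.5 * z**2 + 2.0 * (np.exp(-0.5 * 0.5**2 * z**2) - 1)
print(np.exp(t * psi))     # matches phi_t up to Monte Carlo error
```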


Network Theoretic of the Fermionic Quantum State – Epistemological Rumination. Thought of the Day 150.0


In quantum physics, fundamental particles are believed to be of two types: fermions or bosons, depending on the value of their spin (an intrinsic ‘angular momentum’ of the particle). Fermions have half-integer spin and cannot occupy a quantum state (a configuration with specified microscopic degrees of freedom, or quantum numbers) that is already occupied. In other words, at most one fermion at a time can occupy one quantum state. The resulting probability that a quantum state is occupied is known as the Fermi-Dirac statistics.

Now, if we want to convert this into a maximum-entropy model, where the real movement is defined topologically, then we require it to reproduce the observed heterogeneity. The starting recourse is network theory, with an ensemble of networks where each vertex i has the same degree k_i as in the real network. This choice is justified by the fact that, being an entirely local topological property, the degree is expected to be directly affected by some intrinsic (non-topological) property of vertices. The caveat is that the real network should not be naively compared with the randomized one, which could otherwise lead to interpreting the observed values as ‘unavoidable’ topological constraints, in the sense that violating them would lead to ‘impossible’, or at least very unrealistic, values.

The resulting model is known as the Configuration Model, and is defined as a maximum-entropy ensemble of graphs with given degree sequence. The degree sequence, which is the constraint defining the model, is nothing but the ordered vector k of degrees of all vertices (where the ith component ki is the degree of vertex i). The ordering preserves the ‘identity’ of vertices: in the resulting network ensemble, the expected degree ⟨ki⟩ of each vertex i is the same as the empirical value ki for that vertex. In the Configuration Model, the graph probability is given by

P(A) = ∏_(i<j) q_ij(a_ij) = ∏_(i<j) p_ij^(a_ij) (1 − p_ij)^(1−a_ij) —– (1)

where q_ij(a) = p_ij^a (1 − p_ij)^(1−a) is the probability that a particular entry of the adjacency matrix A takes the value a_ij = a, which is a Bernoulli process with different pairs of vertices characterized by different connection probabilities p_ij. A Bernoulli trial (or Bernoulli process) is the simplest random event, i.e. one characterized by only two possible outcomes. One of the two outcomes is referred to as the ‘success’ and is assigned a probability p. The other outcome is referred to as the ‘failure’, and is assigned the complementary probability 1 − p. These probabilities read

⟨a_ij⟩ = p_ij = x_i x_j/(1 + x_i x_j) —– (2)

where x_i is the Lagrange multiplier obtained by ensuring that the expected degree of the corresponding vertex i equals its observed value: ⟨k_i⟩ = k_i ∀ i. As always happens in maximum-entropy ensembles, the probabilistic nature of configurations implies that the constraints are valid only on average (the angular brackets indicate an average over the ensemble of realizable networks). Also note that p_ij is a monotonically increasing function of x_i and x_j. This implies that ⟨k_i⟩ is a monotonically increasing function of x_i. An important consequence is that two vertices i and j with the same degree k_i = k_j must have the same value x_i = x_j.
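The x_i have no closed form in general, but the conditions ⟨k_i⟩ = k_i ∀ i can be solved by a standard fixed-point iteration; a minimal sketch (the function name and toy degree sequence are ours):

```python
import numpy as np

def fit_configuration_model(k, n_iter=5000, tol=1e-10):
    """Solve <k_i> = sum_{j != i} x_i x_j / (1 + x_i x_j) = k_i for the x_i."""
    k = np.asarray(k, dtype=float)
    x = k / np.sqrt(k.sum())                # standard starting guess
    for _ in range(n_iter):
        xx = np.outer(x, x)
        denom = x[None, :] / (1.0 + xx)     # entry (i, j): x_j / (1 + x_i x_j)
        np.fill_diagonal(denom, 0.0)        # no self-loops
        x_new = k / denom.sum(axis=1)       # x_i <- k_i / sum_{j != i} x_j/(1 + x_i x_j)
        if np.max(np.abs(x_new - x)) < tol:
            x = x_new
            break
        x = x_new
    p = np.outer(x, x) / (1.0 + np.outer(x, x))
    np.fill_diagonal(p, 0.0)
    return x, p

x, p = fit_configuration_model([1, 2, 2, 3])   # toy degree sequence
print(p.sum(axis=1))                           # -> approximately [1, 2, 2, 3]
```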


(2) provides an interesting connection with quantum physics, and in particular the statistical mechanics of fermions. The ‘selection rules’ of fermions dictate that only one particle at a time can occupy a single-particle state, exactly as each pair of vertices in binary networks can be either connected or disconnected. In this analogy, every pair i, j of vertices is a ‘quantum state’ identified by the ‘quantum numbers’ i and j. So each link of a binary network is like a fermion that can be in one of the available states, provided that no two objects are in the same state. (2) indicates the expected number of particles/links in the state specified by i and j. With no surprise, it has the same form as the Fermi-Dirac statistics describing the expected number of fermions in a given quantum state. The probabilistic nature of links also allows for the presence of empty states, whose occurrence is regulated by the probability coefficients (1 − p_ij). The Configuration Model allows the whole degree sequence of the observed network to be preserved (on average), while randomizing other (unconstrained) network properties. Now, when one compares the higher-order (unconstrained) observed topological properties with their expected values calculated over the maximum-entropy ensemble, this indicates whether the degree sequence is informative in explaining the rest of the topology, which follows via the probabilities in (2). Collected into a scatter plot, the agreement between model and observations can be simply assessed as follows: the less scattered the cloud of points around the identity function, the better the agreement between model and reality. In principle, a broadly scattered cloud around the identity function would indicate the little effectiveness of the chosen constraints in reproducing the unconstrained properties, signaling the presence of genuine higher-order patterns of self-organization, not simply explainable in terms of the degree sequence alone. Thus, the ‘fermionic’ character of the binary model is the mere result of the restriction that no two binary links can be placed between any two vertices, leading to a mathematical result which is formally equivalent to the one of quantum statistics.
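The identification can be made explicit (an elementary rewriting, added here for clarity): writing x_i = e^(−θ_i) for the Lagrange multipliers, (2) becomes

⟨a_ij⟩ = 1/(e^(θ_i + θ_j) + 1)

which is the Fermi-Dirac occupation number of the ‘state’ (i, j) with energy ε_ij = θ_i + θ_j, at unit temperature and zero chemical potential.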

Geometry and Localization: An Unholy Alliance? Thought of the Day 95.0


There are many misleading metaphors obtained from naively identifying geometry with localization. One which is very close to that of String Theory is the idea that one can embed a lower dimensional Quantum Field Theory (QFT) into a higher dimensional one. This is not possible, but what one can do is restrict a QFT on a spacetime manifold to a submanifold. However, if the submanifold contains the time axis (a “brane”), the restricted theory has too many degrees of freedom in order to merit the name “physical”, namely it contains as many as the unrestricted one; the naive idea that by using a subspace one only gets a fraction of the phase space degrees of freedom is a delusion. This can only happen if the subspace does not contain a timelike line, as for a null-surface (holographic projection onto a horizon).

The geometric picture of a string in terms of a multi-component conformal field theory is that of an embedding of an n-component chiral theory into its n-dimensional component space (referred to as a target space), which is certainly a string. But this is not what modular localization reveals; rather, those oscillatory degrees of freedom of the multicomponent chiral current go into an infinite dimensional Hilbert space over one localization point and do not arrange themselves according to the geometric source-target idea. A theory of this kind is of course consistent, but String Theory is certainly a very misleading terminology for this state of affairs. Any attempt to imitate Feynman rules by replacing world lines by world sheets (of strings) may produce prescriptions for cooking up some mathematically interesting functions, but those results cannot be brought into the only form which counts in a quantum theory, namely a perturbative approach in terms of operators and states.

String Theory is by no means the only area in particle theory where geometry and modular localization are at loggerheads. Closely related is the interpretation of the Riemann surfaces, which result from the analytic continuation of chiral theories on the lightray/circle, as the “living space” in the sense of localization. The mathematical theory of Riemann surfaces does not specify how it should be realized; whether it refers to surfaces in an ambient space, a distinguished subgroup of a Fuchsian group, or any other of the many possible realizations is of no concern to a mathematician. But in the context of chiral models it is important not to confuse the living space of a QFT with its analytic continuation.

Whereas geometry as a mathematical discipline does not care about how it is concretely realized, the geometrical aspects of modular localization in spacetime have a very specific geometric content, namely that which can be encoded in subspaces (Reeh-Schlieder spaces) generated by operator subalgebras acting on the vacuum reference state. In other words, the physically relevant spacetime geometry and the symmetry group of the vacuum are contained in the abstract positioning of certain subalgebras in a common Hilbert space, and not in that which comes with classical theories.

Weyl’s Lagrange Density of General Relativistic Maxwell Theory

Weyl pondered on the reasons why the structure group of the physical automorphisms still contained the “Euclidean rotation group” (respectively the Lorentz group) in such a prominent role:

The Euclidean group of rotations has survived even such radical changes of our concepts of the physical world as general relativity and quantum theory. What then are the peculiar merits of this group to which it owes its elevation to the basic group pattern of the universe? For what ‘sufficient reasons’ did the Creator choose this group and no other?

He recalled that Helmholtz had characterized ∆o ≅ SO(3, ℜ) by the “fact that it gives to a rotating solid what we may call its just degrees of freedom”; but this method “breaks down for the Lorentz group that in the four-dimensional world takes the place of the orthogonal group in 3-space”. In the early 1920s he himself had given another characterization living up to the new demands of the theories of relativity in his mathematical analysis of the problem of space.

He mentioned the idea that the Lorentz group might play its prominent role for the physical automorphisms because it expresses deep lying matter structures; but he strongly qualified the idea immediately after having stated it:

Since we have the dualism of invariance with respect to two groups and Ω certainly refers to the manifold of space points, it is a tempting idea to ascribe ∆o to matter and see in it a characteristic of the localizable elementary particles of matter. I leave it undecided whether this idea, the very meaning of which is somewhat vague, has any real merits.

. . . But instead of analysing the structure of the orthogonal group of transformations ∆o, it may be wiser to look for a characterization of the group ∆o as an abstract group. Here we know that the homogeneous n-dimensional orthogonal groups form one of 3 great classes of simple Lie groups. This is at least a partial solution of the problem.

He left it open why it ought to be “wiser” to look for abstract structure properties in order to answer a natural philosophical question. Could it be that he wanted to indicate an open-mindedness toward the more structuralist perspective on automorphism groups, preferred by the young algebraists around him at Princeton in the 1930/40s? Today the classification of simple Lie groups distinguishes 4 series, Ak, Bk, Ck, Dk. Weyl apparently counted the two orthogonal series Bk and Dk as one. The special orthogonal groups in even complex space dimension form the series of simple Lie groups of type Dk, with complex form SO(2k, C) and compact real form SO(2k, ℜ). The special orthogonal groups in odd space dimension form the series of type Bk, with complex form SO(2k + 1, C) and compact real form SO(2k + 1, ℜ).

But even if one accepted such a general structuralist view as a starting point, there remained the question of specifying the space dimension of the group inside the series.

But the number of the dimensions of the world is 4 and not an indeterminate n. It is a fact that the structure of ∆o is quite different for the various dimensionalities n. Hence the group may serve as a clue by which to discover some cogent reason for the dimensionality 4 of the world. What must be brought to light, is the distinctive character of one definite group, the four-dimensional Lorentz group, either as a group of linear transformations, or as an abstract group.

The remark that the “structure of ∆o is quite different for the various dimensionalities n” with regard to even or odd complex space dimensions (type Dk, resp. Bk) strongly qualifies the import of the general structuralist characterization. But already in the 1920s Weyl had used the fact that for the (real) space dimension n = 4 the universal covering of the unity component of the Lorentz group SO(1, 3)o is the realification of SL(2, C). The latter belongs to the first of the Ak series (with complex form SL(k + 1, C)). Because of the isomorphism of the initial terms of the series, A1 ≅ B1, this does not imply an exception to Weyl’s general statement. We may even be tempted to interpret Weyl’s otherwise cryptic remark that the structuralist perspective gives “at least a partial solution of the problem” by the observation that the Lorentz group in dimension n = 4 is, in a rather specific way, the realification of the complex form of one of the three most elementary non-commutative simple Lie groups of type A1 ≅ B1. Its compact real form is SO(3, ℜ), respectively the latter’s universal cover SU(2, C).

Weyl stated clearly that the answer cannot be expected from structural considerations alone. The problem is only “partly one of pure mathematics”; the other part is “empirical”. But the question itself appeared of utmost importance to him:

We can not claim to have understood Nature unless we can establish the uniqueness of the four-dimensional Lorentz group in this sense. It is a fact that many of the known laws of nature can at once be generalized to n dimensions. We must dig deep enough until we hit a layer where this is no longer the case.

In 1918 he had given an argument why, in the framework of his new scale gauge geometry, the “world” had to be of dimension 4. His argument had used the construction of the Lagrange density of general relativistic Maxwell theory, L_f = f_μν f^μν √(|det g|), with f_μν the components of the curvature of his newly introduced scale/length connection, physically interpreted by him as the electromagnetic field. L_f is scale invariant only in spacetime dimension n = 4. The shift from scale gauge to phase gauge undermined the importance of this argument. Although it remained correct mathematically, it lost its convincing power once the scale gauge transformations were relegated from physics to the mathematical automorphism group of the theory only.
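The dimension count behind this claim is elementary (a worked restatement, not Weyl’s own wording). Under a constant rescaling of the metric g_μν → λ^2 g_μν in n dimensions, the length connection and hence f_μν are unchanged, while

f^μν = g^μα g^νβ f_αβ → λ^(−4) f^μν and √(|det g|) → λ^n √(|det g|),

so L_f = f_μν f^μν √(|det g|) picks up a factor λ^(n−4) and is scale invariant precisely for n = 4.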

Weyl said:

Our question has this in common with most questions of philosophical nature: it depends on the vague distinction between essential and non-essential. Several competing solutions are thinkable; but it may also happen that, once a good solution of the problem is found, it will be of such cogency as to command general recognition.

Vector Fields Tangent to the Surfaces of Foliation. Note Quote.


Although we are interested in gauge field theories, we will use mainly the language of mechanics, that is, of a finite number of degrees of freedom, which is sufficient for our purposes. A quick switch to the field theory language can be achieved by using DeWitt’s condensed notation. Consider, as our starting point, a time-independent first-order Lagrangian L(q, q̇) defined in configuration-velocity space TQ, that is, the tangent bundle of some configuration manifold Q that we assume to be of dimension n. Gauge theories rely on singular as opposed to regular Lagrangians, that is, Lagrangians whose Hessian matrix with respect to the velocities (where q stands, in a free index notation, for local coordinates in Q),

W_ij ≡ ∂²L/∂q̇^i∂q̇^j —– (1)

is not invertible. Two main consequences are drawn from this non-invertibility. First notice that the Euler-Lagrange equations of motion [L]_i = 0, with

[L]_i := α_i − W_ij q̈^j

and

α_i := ∂L/∂q^i − (∂²L/∂q̇^i∂q^j) q̇^j

cannot be written in normal form, that is, isolating the accelerations on one side: q̈ = f(q, q̇). This makes the usual theorems about the existence and uniqueness of solutions of ordinary differential equations inapplicable. Consequently, there may be points in the tangent bundle where there are no solutions passing through the point, and others where there is more than one solution.

The second consequence of the Hessian matrix being singular concerns the construction of the canonical formalism. The Legendre map from the tangent bundle TQ to the cotangent bundle — or phase space — T*Q (we use the notation p̂(q, q̇) := ∂L/∂q̇),

FL : TQ → T*Q —– (2)

(q, q̇) ↦ (q, p = p̂) —– (3)

is no longer invertible because ∂p̂/∂q̇ = ∂²L/∂q̇∂q̇ is the Hessian matrix. There appears then an issue about the projectability of structures from the tangent bundle to phase space: there will be functions defined on TQ that cannot be translated (projected) to functions on phase space. This feature of the formalism propagates in a corresponding way to the tensor structures, forms, vector fields, etc.

In order to better identify the problem and to obtain the conditions of projectability, we must be more specific. We will make a single assumption, which is that the rank of the Hessian matrix is constant everywhere. If this condition is not satisfied throughout the whole tangent bundle, we will restrict our considerations to a region of it, with the same dimensionality, where this condition holds. So we are assuming that the rank of the Legendre map FL is constant throughout TQ and equal to, say, 2n − k. The image of FL will be locally defined by the vanishing of k independent functions, φ_μ(q, p), μ = 1, 2, …, k. These functions are the primary constraints, and their pullback FL*φ_μ to the tangent bundle is identically zero:

(FL*φ_μ)(q, q̇) := φ_μ(q, p̂) = 0, ∀ q, q̇ —– (4)

The primary constraints form a generating set of the ideal of functions that vanish on the image of the Legendre map. With their help it is easy to obtain a basis of null vectors for the Hessian matrix. Indeed, applying ∂/∂q̇^i to (4) we get

W_ij (∂φ_μ/∂p_j)|_(p=p̂) = 0, ∀ q, q̇ —– (5)

With this result in hand, let us consider some geometrical aspects of the Legendre map. We already know that its image in T∗Q is given by the primary constraints’ surface. A foliation in TQ is also defined, with each element given as the inverse image of a point in the primary constraints’ surface in T∗Q. One can easily prove that the vector fields tangent to the surfaces of the foliation are generated by

Γ_μ = (∂φ_μ/∂p_j)|_(p=p̂) ∂/∂q̇^j —– (6)

The proof goes as follows. Consider two neighboring points in TQ belonging to the same sheet, (q, q̇) and (q, q̇ + δq̇) (the configuration coordinates q must be the same because they are preserved by the Legendre map). Then, using the definition of the Legendre map, we must have p̂(q, q̇) = p̂(q, q̇ + δq̇), which implies, expanding to first order,

(∂p̂/∂q̇) δq̇ = 0

which identifies δq̇ as a null vector of the Hessian matrix (here expressed as ∂p̂/∂q̇). Since we already know a basis for such null vectors, (∂φ_μ/∂p_j)|_(p=p̂), μ = 1, 2, …, k, it follows that the vector fields Γ_μ form a basis for the vector fields tangent to the foliation.

The knowledge of these vector fields is instrumental for addressing the issue of the projectability of structures. Consider a real-valued function f_L : TQ → R. It will — locally — define a function f_H : T*Q → R iff it is constant on the sheets of the foliation, that is, when

Γ_μ f_L = 0, μ = 1, 2, …, k —– (7)

Equation (7) is the projectability condition we were looking for. We express it in the following way:

Γ_μ f_L = 0, μ = 1, 2, …, k ⇔ there exists f_H such that FL*f_H = f_L
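A minimal sympy sketch of (1)-(7) for the singular toy Lagrangian L = ½(q̇^1 − q̇^2)^2 (the example and the names are ours):

```python
import sympy as sp

q1, q2, v1, v2 = sp.symbols('q1 q2 v1 v2')   # v_i stands for qdot^i
L = sp.Rational(1, 2) * (v1 - v2)**2

# Hessian (1): W_ij = d^2 L / dv_i dv_j -- singular, rank 1
W = sp.Matrix(2, 2, lambda i, j: sp.diff(L, [v1, v2][i], [v1, v2][j]))
print(W.det())                    # -> 0

# Legendre map: p1 = v1 - v2, p2 = -(v1 - v2), so the primary constraint
# is phi = p1 + p2 = 0, and by (6)
# Gamma = (dphi/dp1) d/dv1 + (dphi/dp2) d/dv2 = d/dv1 + d/dv2
Gamma = lambda f: sp.diff(f, v1) + sp.diff(f, v2)

print(Gamma(v1 - v2))             # -> 0 : projectable to phase space (it equals p1)
print(Gamma(v1))                  # -> 1 : not projectable, by condition (7)
```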

Yield Curve Dynamics or Fluctuating Multi-Factor Rate Curves


The actual dynamics (as opposed to the risk-neutral dynamics) of the forward rate curve cannot be reduced to that of the short rate: the statistical evidence points to the necessity of taking into account more degrees of freedom in order to represent in an adequate fashion the complicated deformations of the term structure. In particular, the imperfect correlation between maturities and the rich variety of term structure deformations show that a one-factor model is too rigid to describe yield curve dynamics.

Furthermore, in practice the value of the short rate is either fixed or at least strongly influenced by an authority exterior to the market (the central banks), through a mechanism different in nature from that which determines rates of higher maturities which are negotiated on the market. The short rate can therefore be viewed as an exogenous stochastic input which then gives rise to a deformation of the term structure as the market adjusts to its variations.

Traditional term structure models define – implicitly or explicitly – the random motion of an infinite number of forward rates as diffusions driven by a finite number of independent Brownian motions. This choice may appear surprising, since it introduces a lot of constraints on the type of evolution one can ascribe to each point of the forward rate curve and greatly reduces the dimensionality, i.e. the number of degrees of freedom, of the model, such that the resulting model is no longer able to reproduce the complex dynamics of the term structure. Multifactor models are usually justified by referring to the results of principal component analysis of term structure fluctuations. However, one should note that the quantities of interest when dealing with the term structure of interest rates are not the first two moments of the forward rates but typically involve expectations of non-linear functions of the forward rate curve: caps and floors are typical examples from this point of view. Hence, although a multifactor model might explain the variance of the forward rate itself, the same model may not be able to explain correctly the variability of portfolio positions involving non-linear combinations of the same forward rates. In other words, a principal component whose associated eigenvalue is small may have a non-negligible effect on the fluctuations of a non-linear function of forward rates. This question is especially relevant when calculating quantiles and Value-at-Risk measures.
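The point that a ‘negligible’ principal component can dominate a non-linear position can be illustrated on simulated data (an illustrative sketch; the two-factor dynamics and all parameters are our own choices, not an estimated model):

```python
import numpy as np

rng = np.random.default_rng(1)
n_days, n_mat = 2000, 10

# toy forward-curve increments: a large 'level' factor plus small maturity-specific noise
level = 0.010 * rng.standard_normal((n_days, 1)) * np.ones((1, n_mat))
idio  = 0.001 * rng.standard_normal((n_days, n_mat))
df = level + idio

# principal component analysis of the curve increments
eigvals = np.linalg.eigvalsh(np.cov(df.T))[::-1]
print(eigvals / eigvals.sum())        # first component explains ~99% of the rate variance

# a non-linear, spread-like position is insensitive to the level factor,
# so its variance is driven entirely by the 'negligible' components
payoff = np.maximum(df[:, 9] - df[:, 0], 0.0)
level_only = np.maximum(level[:, 9] - level[:, 0], 0.0)   # identically zero
print(payoff.var(), level_only.var())                     # non-zero vs. zero
```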

In a multifactor model with k sources of randomness, one can use any k + 1 instruments to hedge a given risky payoff. However, this is not what traders do in real markets: a given interest-rate contingent payoff is hedged with bonds of the same maturity. These practices reflect the existence of a risk specific to instruments of a given maturity. The representation of a maturity-specific risk means that, in a continuous-maturity limit, one must also allow the number of sources of randomness to grow with the number of maturities; otherwise one loses the localization in maturity of the source of randomness in the model.

An important ingredient for the tractability of a model is its Markovian character. Non-Markov processes are difficult to simulate and even harder to manipulate analytically. Of course, any process can be transformed into a Markov process if it is embedded into a space of sufficiently high dimension; this amounts to injecting a sufficient number of “state variables” into the model. These state variables may or may not be observable quantities; for example one such state variable may be the short rate itself but another one could be an economic variable whose value is not deducible from knowledge of the forward rate curve. If the state variables are not directly observed, they are obtainable in principle from the observed interest rates by a filtering process. Nevertheless the presence of unobserved state variables makes the model more difficult to handle both in terms of interpretation and statistical estimation. This drawback has motivated the development of so-called affine curve models, where one imposes that the state variables be affine functions of the observed yield curve. While the affine hypothesis is not necessarily realistic from an empirical point of view, it has the property of directly relating state variables to the observed term structure.

Another feature of term structure movements is that, as a curve, the forward rate curve displays a continuous deformation: configurations of the forward rate curve at dates not too far from each other tend to be similar. Most applications require the yield curve to have some degree of smoothness e.g. differentiability with respect to the maturity. This is not only a purely mathematical requirement but is reflected in market practices of hedging and arbitrage on fixed income instruments. Market practitioners tend to hedge an interest rate risk of a given maturity with instruments of the same maturity or close to it. This important observation means that the maturity is not simply a way of indexing the family of forward rates: market operators expect forward rates whose maturities are close to behave similarly. Moreover, the model should account for the observation that the volatility term structure displays a hump but that multiple humps are never observed.

Automorphisms. Note Quote.


A group automorphism is an isomorphism from a group to itself. If G is a finite multiplicative group, an automorphism of G can be described as a way of rewriting its multiplication table without altering its pattern of repeated elements. For example, the multiplication table of the group of 4th roots of unity G = {1, -1, i, -i},

 ·  |  1  -1   i  -i
 1  |  1  -1   i  -i
-1  | -1   1  -i   i
 i  |  i  -i  -1   1
-i  | -i   i   1  -1

keeps its pattern of repeated elements when i and -i are everywhere interchanged, which means that the map defined by

1 ↦ 1,  -1 ↦ -1,  i ↦ -i,  -i ↦ i

is an automorphism of G.
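One can check the homomorphism property directly (a small sketch in Python; the encoding of G as complex numbers is ours):

```python
# the group of 4th roots of unity under multiplication
G = [1, -1, 1j, -1j]

# the candidate automorphism: complex conjugation, i.e. 1->1, -1->-1, i->-i, -i->i
sigma = {1: 1, -1: -1, 1j: -1j, -1j: 1j}

# an automorphism must be a bijection with sigma(ab) = sigma(a) * sigma(b)
assert sorted(sigma.values(), key=lambda z: (z.real, z.imag)) == \
       sorted(G, key=lambda z: (z.real, z.imag))
assert all(sigma[a * b] == sigma[a] * sigma[b] for a in G for b in G)
print("sigma is an automorphism of G")
```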

Looking at classical geometry and mechanics, Weyl followed Newton and Helmholtz in considering congruence as the basic relation which lay at the heart of the “art of measuring” by the handling of that “sort of bodies we call rigid”. He explained how the local congruence relations established by the comparison of rigid bodies can be generalized and abstracted to congruences of the whole space. In this respect Weyl followed an empiricist approach to classical physical geometry, based on a theoretical extension of the material practice with rigid bodies and their motions. Even the mathematical abstraction to mappings of the whole space carried the mark of their empirical origin and was restricted to the group of proper congruences (orientation preserving isometries of Euclidean space, generated by the translations and rotations) denoted by him as ∆+. This group seems to express “an intrinsic structure of space itself; a structure stamped by space upon all the inhabitants of space”.

But already on the earlier level of physical knowledge, so Weyl argued, the mathematical automorphisms of space were larger than ∆. Even if one sees “with Newton, in congruence the one and only basic concept of geometry from which all others derive”, the group Γ of automorphisms in the mathematical sense turns out to be constituted by the similarities.

The structural condition for an automorphism C ∈ Γ of classical congruence geometry is that any pair (v1, v2) of congruent geometric configurations is transformed into another pair (v1*, v2*) of congruent configurations (vj* = C(vj), j = 1, 2). For evaluating this property Weyl introduced a diagram relating a congruence T to its conjugates C T C−1 and C−1TC.


Because of the condition for automorphisms just mentioned, the maps C T C−1 and C−1TC belong to ∆+ whenever T does. By this argument he showed that the mathematical automorphism group Γ is the normalizer of the congruences ∆+ in the group of bijective mappings of Euclidean space.
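The normalizer argument can be made concrete in coordinates (a small numpy sketch; the specific rotation, translation and scale factor are arbitrary choices of ours):

```python
import numpy as np

# a proper congruence T in Delta^+: x -> R x + t, with R a rotation
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
t = np.array([1.0, -2.0, 0.5])
T = lambda x: R @ x + t

# a similarity C in Gamma: x -> lam * x
lam = 3.0
C     = lambda x: lam * x
C_inv = lambda x: x / lam

# the conjugate C T C^-1 : x -> R x + lam * t is again a proper congruence,
# so C normalizes Delta^+ without belonging to it
x = np.array([0.3, 0.1, -0.4])
print(C(T(C_inv(x))), R @ x + lam * t)   # identical
```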

More generally, it also explains the reason for his characterization of generalized similarities in his analysis of the problem of space in the early 1920s. In 1918 he translated the relationship between physical equivalences as congruences to the mathematical automorphisms as the similarities/normalizer of the congruences from classical geometry to special relativity (Minkowski space) and “localized” them (in the sense of physics), i.e., he transferred the structural relationship to the infinitesimal neighbourhoods of the differentiable manifold characterizing spacetime (in more recent language, to the tangent spaces) and developed what later would be called Weylian manifolds, a generalization of Riemannian geometry. In his discussion of the problem of space he generalized the same relationship even further by allowing any (closed) sub-group of the general linear group as a candidate for characterizing generalized congruences at every point.

Moreover, Weyl argued that the enlargement of the physico-geometrical automorphisms of classical geometry (proper congruences) by the mathematical automorphisms (similarities) sheds light on Kant’s riddle of the “incongruous counterparts”. Weyl presented it as the question: Why are “incongruous counterparts” like the left and right hands intrinsically indiscernible, although they cannot be transformed into one another by a proper motion? From his point of view the intrinsic indiscernibility could be characterized by the mathematical automorphisms Γ. Of course, the congruences ∆ including the reflections are part of the latter, ∆ ⊂ Γ; this implies indiscernibility between “left and right” as a special case. In this way Kant’s riddle was solved by a Leibnizian type of argument. Weyl very cautiously indicated a philosophical implication of this observation:

And he (Kant) is inclined to think that only transcendental idealism is able to solve this riddle. No doubt, the meaning of congruence and similarity is founded in spatial intuition. Kant seems to aim at some subtler point. But just this point is one which can be completely clarified by general concepts, namely by subsuming it under the general and typical group-theoretic situation explained before . . . .

Weyl stopped here without discussing the relationship between group theoretical methods and the “subtler point” Kant aimed at more explicitly. But we may read this remark as an indication that he considered his reflections on automorphism groups as a contribution to the transcendental analysis of the conceptual constitution of modern science. In his book on Symmetry, he went a tiny step further. Still with the Weylian restraint regarding the discussion of philosophical principles he stated: “As far as I see all a priori statements in physics have their origin in symmetry” (126).

To prepare for the following, Weyl specified the subgroup ∆o ⊂ ∆ with all those transformations that fix one point (∆o = O(3, R), the orthogonal group in 3 dimensions, R the field of real numbers). In passing he remarked:

In the four-dimensional world the Lorentz group takes the place of the orthogonal group. But here I shall restrict myself to the three-dimensional space, only occasionally pointing to the modifications, the inclusion of time into the four-dimensional world brings about.

Keeping this caveat in mind (restriction to three-dimensional space) Weyl characterized the “group of automorphisms of the physical world”, in the sense of classical physics (including quantum mechanics), by the combination (more technically, the semidirect product) of translations and rotations, while the mathematical automorphisms arise from a normal extension:

– physical automorphisms ∆ ≅ R^3 ⋊ ∆o with ∆o ≅ O(3), respectively ∆ ≅ R^4 ⋊ ∆o for the Lorentz group ∆o ≅ O(1, 3),

– mathematical automorphisms Γ = R^+ × ∆ (R^+ the positive real numbers with multiplication).

In Weyl’s view the difference between mathematical and physical automorphisms established a fundamental distinction between mathematical geometry and physics.

Congruence, or physical equivalence, is a geometric concept, the meaning of which refers to the laws of physical phenomena; the congruence group ∆ is essentially the group of physical automorphisms. If we interpret geometry as an abstract science dealing with such relations and such relations only as can be logically defined in terms of the one concept of congruence, then the group of geometric automorphisms is the normalizer of ∆ and hence wider than ∆.

He considered this as a striking argument against what he considered to be the Cartesian program of a reductionist geometrization of physics (physics as the science of res extensa):

According to this conception, Descartes’s program of reducing physics to geometry would involve a vicious circle, and the fact that the group of geometric automorphisms is wider than that of physical automorphisms would show that such a reduction is actually impossible.

In this Weyl alluded to an illusion he himself had shared for a short time as a young scientist. After the creation of his gauge geometry in 1918 and the proposal of a geometrically unified field theory of electromagnetism and gravity he believed, for a short while, to have achieved a complete geometrization of physics.

He gave up this illusion in the middle of the 1920s under the impression of the rising quantum mechanics. In his own contribution to the new quantum mechanics, groups and their linear representations played a crucial role. In this respect the mathematical automorphisms of geometry and the physical automorphisms “of Nature”, or more precisely the automorphisms of physical systems, moved even further apart, because now the physical automorphisms started to take non-geometrical material degrees of freedom into account (phase symmetry of wave functions and, already earlier, the permutation symmetries of n-particle systems).

But already during the 19th century the physical automorphism group had acquired a far deeper aspect than that of the mobility of rigid bodies:

In physics we have to consider not only points but many types of physical quantities such as velocity, force, electromagnetic field strength, etc. . . .

All these quantities can be represented, relative to a Cartesian frame, by sets of numbers such that any orthogonal transformation T performed on the coordinates keeps the basic physical relations, the physical laws, invariant. Weyl accordingly stated:

All the laws of nature are invariant under the transformations thus induced by the group ∆. Thus physical relativity can be completely described by means of a group of transformations of space-points.

By this argumentation Weyl described a deep shift which occurred in the late 19th century for the understanding of physics. He described it as an extension of the group of physical automorphisms. The laws of physics (“basic relations” in his more abstract terminology above) could no longer be directly characterized by the motion of rigid bodies because the physics of fields, in particular of electric and magnetic fields, had become central. In this context, the motions of material bodies lost their epistemological primary status and the physical automorphisms acquired a more abstract character, although they were still completely characterizable in geometric terms, by the full group of Euclidean isometries. The indistinguishability of left and right, observed already in clear terms by Kant, acquired the status of a physical symmetry in electromagnetism and in crystallography.

Weyl thus insisted that in classical physics the physical automorphisms could be characterized by the group ∆ of Euclidean isometries, larger than the physical congruences (proper motions) ∆+ but smaller than the mathematical automorphisms (similarities) Γ.

This view fitted well with insights which Weyl drew from recent developments in quantum physics. He insisted – differently from what he had thought in 1918 – on the consequence that “length is not relative but absolute” (Hs, p. 15). He argued that physical length measurements were no longer dependent on an arbitrarily chosen unit, as in Euclidean geometry. An “absolute standard of length” could be fixed by the quantum mechanical laws of the atomic shell:

The atomic constants of charge and mass of the electron and Planck’s quantum of action h, which enter the universal field laws of nature, fix an absolute standard of length, that through the wave lengths of spectral lines is made available for practical measurements.