The Third Trichotomy. Thought of the Day 121.0


The decisive logical role is played by continuity in the third trichotomy, which is Peirce’s generalization of the old distinction between term, proposition and argument in logic. In his terminology, the technical notions are rheme, dicent and argument, and all of them may be represented by symbols. A crucial step in Peirce’s logic of relations (parallel to Frege) is the extension of the predicate from having only one possible subject in a proposition to the possibility for a predicate to take potentially infinitely many subjects. Predicates so complicated may, however, be reduced to combinations of (at most) three-subject predicates, according to Peirce’s reduction hypothesis. Let us consider the definitions from ‘Syllabus’ (The Essential Peirce: Selected Philosophical Writings, Volume 2) in continuation of the earlier trichotomies:

According to the third trichotomy, a Sign may be termed a Rheme, a Dicisign or Dicent Sign (that is, a proposition or quasi-proposition), or an Argument.

A Rheme is a Sign which, for its Interpretant, is a Sign of qualitative possibility, that is, is understood as representing such and such a kind of possible Object. Any Rheme, perhaps, will afford some information; but it is not interpreted as doing so.

A Dicent Sign is a Sign, which, for its Interpretant, is a Sign of actual existence. It cannot, therefore, be an Icon, which affords no ground for an interpretation of it as referring to actual existence. A Dicisign necessarily involves, as a part of it, a Rheme, to describe the fact which it is interpreted as indicating. But this is a peculiar kind of Rheme; and while it is essential to the Dicisign, it by no means constitutes it.

An Argument is a Sign which, for its Interpretant, is a Sign of a law. Or we may say that a Rheme is a sign which is understood to represent its object in its characters merely; that a Dicisign is a sign which is understood to represent its object in respect to actual existence; and that an Argument is a Sign which is understood to represent its Object in its character as Sign. (…) The proposition need not be asserted or judged. It may be contemplated as a sign capable of being asserted or denied. This sign itself retains its full meaning whether it be actually asserted or not. (…) The proposition professes to be really affected by the actual existent or real law to which it refers. The argument makes the same pretension, but that is not the principal pretension of the argument. The rheme makes no such pretension.

The interpretant of the Argument represents it as an instance of a general class of Arguments, which class on the whole will always tend to the truth. It is this law, in some shape, which the argument urges; and this ‘urging’ is the mode of representation proper to Arguments.

Predicates being general is of course a standard logical notion; in Peirce’s version this generality is further emphasized by the fact that the simple predicate is seen as relational and containing up to three subject slots to be filled in; each of them may be occupied by a continuum of possible subjects. The predicate itself refers to a possible property, a possible relation between subjects; the empty – or partly saturated – predicate does not in itself constitute any claim that this relation does in fact hold. The information it contains is potential, because no single or general indication has yet been chosen to indicate which subjects among the continuum of possible subjects it refers to. The proposition, on the contrary, the dicisign, is a predicate where some of the empty slots have been filled in with indices (proper names, demonstrative pronouns, deixis, gesture, etc.), and is, in fact, asserted. It thus consists of an indexical part and an iconical part, corresponding to the usual distinction between subject and predicate, with its indexical part connecting it to some level of reference reality. This reality need not, of course, be actual reality; the subject slots may be filled in with general subjects, thus importing pieces of continuity into it – but the reality status of such subjects may vary, so they may equally be filled in with fictitious references of all sorts. Even if the dicisign, the proposition, is not an icon, it contains, via its rhematic core, iconical properties. Elsewhere, Peirce simply defines the dicisign as a sign making explicit its reference. Thus a portrait equipped with a sign indicating the portraitee will be a dicisign, just as a caricature sketch with a pointing gesture towards the person it depicts will be a dicisign. Even such dicisigns may be general; the pointing gesture could single out a group or a representative for a whole class of objects.
While the dicisign specifies its object, the argument is a sign specifying its interpretant – which is what is normally called the conclusion. The argument thus consists of two dicisigns, a premiss (which may be, in turn, composed of several dicisigns and is traditionally seen as consisting of two dicisigns) and a conclusion – a dicisign represented as ensuing from the premiss due to the power of some law. The argument is thus – just like the other thirdness signs in the trichotomies – in itself general. It is a legisign and a symbol – but adds to them the explicit specification of a general, lawlike interpretant. In the full-blown sign, the argument, the more primitive degenerate sign types are orchestrated together in a threefold generality where no less than three continua are evoked: first, the argument itself is a legisign with a halo of possible instantiations of itself as a sign; second, it is a symbol referring to a general object, in turn with a halo of possible instantiations around it; third, the argument implies a general law which is represented by one instantiation (the premiss and the rule of inference) but which has a halo of other, related inferences as possible instantiations. As Peirce says, the argument persuades us that this lawlike connection holds for all other cases of the same type.

Canonical Actions on Bundles – Philosophizing Identity Over Gauge Transformations.


In physical applications, fiber bundles often come with a preferred group of transformations (usually the symmetry group of the system). The modern attitude of physicists is to regard this group as a fundamental structure which should be implemented from the very beginning, enriching bundles with a further structure and defining a new category.

A similar feature appears on manifolds as well: for example, on ℝ² one can restrict to Cartesian coordinates when we regard it just as a vector space endowed with a differentiable structure, but one can also allow translations if the “bigger” affine structure is considered. Moreover, coordinates can be chosen in much bigger sets: for instance, one can fix the symplectic form ω = dx ∧ dy on ℝ² so that ℝ² is covered by an atlas of canonical coordinates (which include all Cartesian ones). But ℝ² also happens to be identifiable with the cotangent bundle T*ℝ, so that we can restrict the previous symplectic atlas to allow only natural fibered coordinates. Finally, ℝ² can be considered as a bare manifold, so that general curvilinear coordinates should be allowed accordingly; only if the full (i.e., unrestricted) manifold structure is considered can one use a full maximal atlas. Other choices define instead maximal atlases in suitably restricted sub-classes of allowed charts. As any manifold structure is associated with a maximal atlas, geometric bundles are associated to “maximal trivializations”. However, it may happen that one can restrict (or enlarge) the allowed local trivializations, so that the same geometrical bundle can be trivialized just using the appropriate smaller class of local trivializations. In geometrical terms this corresponds, of course, to imposing a further structure on the bare bundle. Of course, this newly structured bundle is defined by the same basic ingredients, i.e. the same base manifold M, the same total space B, the same projection π and the same standard fiber F, but it is characterized by a new maximal trivialization where, however, maximal refers now to a smaller set of local trivializations.
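
The restriction to canonical charts can be illustrated numerically: a linear coordinate change preserves ω = dx ∧ dy exactly when its Jacobian has unit determinant. A minimal sketch (the specific maps below are illustrative assumptions, not from the text):

```python
import numpy as np

# The symplectic form w = dx ^ dy on R^2 picks up the factor det(J)
# under a linear coordinate change with Jacobian J; a chart is
# canonical (symplectic) exactly when det(J) = 1.

def omega_factor(J):
    """Factor by which w = dx ^ dy rescales under the linear map J."""
    return np.linalg.det(J)

theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])  # Cartesian change: canonical
shear = np.array([[1.0, 0.5],
                  [0.0, 1.0]])                          # area-preserving: canonical
scaling = np.array([[2.0, 0.0],
                    [0.0, 1.0]])                        # det = 2: excluded once w is fixed

print(omega_factor(rotation))  # ~1.0: allowed in the symplectic atlas
print(omega_factor(shear))     # ~1.0: also a canonical chart
print(omega_factor(scaling))   # ~2.0: not a canonical chart
```

A curvilinear chart would be admitted by the bare manifold structure but, in general, rejected by the symplectic one; the determinant test above is the linearized version of that selection.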

Examples are: vector bundles are characterized by linear local trivializations, affine bundles are characterized by affine local trivializations, principal bundles are characterized by left translations on the fiber group. Further examples come from Physics: gauge transformations are used as transition functions for the configuration bundles of any gauge theory. For these reasons we give the following definition of a fiber bundle with structure group.

A fiber bundle with structure group G is given by a sextuple B = (B, M, π; F; λ, G) such that:

  • (B, M, π; F) is a fiber bundle. The structure group G is a Lie group (possibly a discrete one) and λ : G → Diff(F) defines a left action of G on the standard fiber F.
  • There is a family of preferred trivializations {(Uα, t(α))}α∈I of B such that the following holds: let the transition functions be gˆ(αβ) : Uαβ → Diff(F) and let eG be the neutral element of G. There exists a family of maps g(αβ) : Uαβ → G such that, for each x ∈ Uαβγ = Uα ∩ Uβ ∩ Uγ

    g(αα)(x) = eG

    g(αβ)(x) = [g(βα)(x)]-1

    g(αβ)(x) . g(βγ)(x) . g(γα)(x) = eG

    and

    gˆ(αβ)(x) = λ(g(αβ)(x)) ∈ Diff(F)

The maps g(αβ) : Uαβ → G, which depend on the trivialization, are said to form a cocycle with values in G. They are called the transition functions with values in G (or also, shortly, the transition functions). The preferred trivializations will be said to be compatible with the structure. Whenever dealing with fiber bundles with structure group, the choice of a compatible trivialization will be implicitly assumed.
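
The three cocycle conditions are easy to verify concretely. A minimal sketch, taking G = SO(2) represented by 2×2 rotation matrices and evaluating the transition functions at a single point of a triple overlap (the specific angles are arbitrary assumptions):

```python
import numpy as np

def rot(theta):
    """An element of G = SO(2) as a 2x2 rotation matrix."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Transition functions at a fixed point x in the triple overlap.
g_ab = rot(0.3)
g_bc = rot(1.1)
g_ac = g_ab @ g_bc        # forced by the cocycle condition
g_ba = g_ab.T             # inverse of a rotation is its transpose
g_ca = g_ac.T
eG = np.eye(2)            # neutral element of G

# g_aa = eG; g_ab = (g_ba)^{-1}; g_ab . g_bc . g_ca = eG
assert np.allclose(rot(0.0), eG)
assert np.allclose(g_ab, np.linalg.inv(g_ba))
assert np.allclose(g_ab @ g_bc @ g_ca, eG)
print("cocycle conditions hold at x")
```

Acting on the standard fiber via λ (here, rotations acting on ℝ²) then yields the Diff(F)-valued transition functions ĝ(αβ) of the definition.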

Fiber bundles with structure group provide the suitable framework to deal with bundles with a preferred group of transformations. To see this, let us begin by introducing the notion of the structure bundle of a fiber bundle with structure group B = (B, M, π; F; λ, G).

Let B = (B, M, π; F; λ, G) be a bundle with a structure group; let us fix a trivialization {(Uα, t(α))}α∈I and denote by g(αβ) : Uαβ → G its transition functions. By using the canonical left action L : G → Diff(G) of G on itself, let us define gˆ(αβ) : Uαβ → Diff(G) given by gˆ(αβ)(x) = L(g(αβ)(x)); they obviously satisfy the cocycle properties. One can now construct a (unique up to isomorphism) principal bundle PB = P(B) having G as structure group and g(αβ) as transition functions, acting on G by left translation Lg : G → G.

The principal bundle P(B) = (P, M, p; G) constructed above is called the structure bundle of B = (B, M, π; F; λ, G).

Notice that there is no similar canonical way of associating a structure bundle to a geometric bundle B = (B, M, π; F), since in that case the structure group G is at least partially undetermined.

Each automorphism of P(B) naturally acts over B.

Let, in fact, {σ(α)}α∈I be a trivialization of PB together with its transition functions g(αβ) : Uαβ → G defined by σ(β) = σ(α) . g(αβ). Then any principal morphism Φ = (Φ, φ) over PB is locally represented by local maps ψ(α) : Uα → G such that

Φ : [x, h]α ↦ [φ(α)(x), ψ(α)(x).h](α)

Since Φ is a global automorphism of PB, the above local expression implies that the following properties hold true in Uαβ:

φ(α)(x) = φ(β)(x) ≡ x’

ψ(α)(x) = g(αβ)(x’) . ψ(β)(x) . g(βα)(x)

By using the family of maps {(φ(α), ψ(α))} one can then define a family of global automorphisms of B. In fact, using the trivialization {(Uα, t(α))}α∈I, one can define local automorphisms of B given by

Φ(α)B : (x, y) ↦ (φ(α)(x), [λ(ψ(α)(x))](y))

These local maps glue together to give a global automorphism ΦB of the bundle B, due to the fact that the g(αβ) are also transition functions of B with respect to its trivialization {(Uα, t(α))}α∈I.

In this way B is endowed with a preferred group of transformations, namely the group Aut(PB) of automorphisms of the structure bundle PB, represented on B by means of the canonical action. These transformations are called (generalized) gauge transformations. Vertical gauge transformations, i.e. gauge transformations projecting over the identity, are also called pure gauge transformations.

Matter Fields

In classical relativity theory, one generally takes for granted that all there is, and all that happens, can be described in terms of various “matter fields,” each of which is represented by one or more smooth tensor (or spinor) fields on the spacetime manifold M. The latter are assumed to satisfy particular “field equations” involving the spacetime metric gab.

Associated with each matter field F is a symmetric smooth tensor field Tab characterized by the property that, for all points p in M, and all future-directed, unit timelike vectors ξa at p, Tabξb is the four-momentum density of F at p as determined relative to ξa.

Tab is called the energy-momentum field associated with F. The four-momentum density vector Tabξb at a point can be further decomposed into its temporal and spatial components relative to ξa,

Tabξb = (Tmbξmξb)ξa + Tmbhmaξb

where the first term on the RHS is the energy density, while the second term is the three-momentum density. A number of assumptions about matter fields can be captured as constraints on the energy-momentum tensor fields with which they are associated.
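
The decomposition can be checked numerically. A minimal sketch, assuming a Minkowski metric with signature (+,−,−,−) and a perfect fluid as the concrete matter field (both choices are illustrative, not from the text):

```python
import numpy as np

# Minkowski metric, signature (+,-,-,-); a perfect fluid at rest with
# energy density rho and pressure p serves as the example field.
g = np.diag([1.0, -1.0, -1.0, -1.0])
rho, p = 1.0, 0.3
u = np.array([1.0, 0.0, 0.0, 0.0])                 # fluid four-velocity
T = (rho + p) * np.outer(u, u) - p * np.linalg.inv(g)

# A boosted observer: unit, future-directed timelike vector xi^a.
chi = 0.3
xi_up = np.array([np.cosh(chi), np.sinh(chi), 0.0, 0.0])
xi_dn = g @ xi_up

four_momentum = T @ xi_dn                          # T^{ab} xi_b
energy_density = xi_dn @ T @ xi_dn                 # T^{mb} xi_m xi_b
h = np.eye(4) - np.outer(xi_up, xi_dn)             # h^a_m = delta^a_m - xi^a xi_m
three_momentum = h @ four_momentum                 # spatial part relative to xi

# The temporal and spatial pieces recompose the four-momentum density.
assert np.allclose(energy_density * xi_up + three_momentum, four_momentum)
print(energy_density)
```

The identity holds for any unit timelike ξa, since h projects out exactly the component along ξa.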

Weak Energy Condition (WEC): Given any timelike vector ξa at any point in M, Tabξaξb ≥ 0.

Dominant Energy Condition (DEC): Given any timelike vector ξa at any point in M, Tabξaξb ≥ 0 and Tabξb is timelike or null.

Strengthened Dominant Energy Condition (SDEC): Given any timelike vector ξa at any point in M, Tabξaξb ≥ 0 and, if Tab ≠ 0 there, then Tabξb is timelike.

Conservation Condition (CC): ∇aTab = 0 at all points in M.

The WEC asserts that the energy density of F, as determined by any observer at any point, is non-negative. The DEC adds the requirement that the four-momentum density of F, as determined by any observer at any point, is a future-directed causal (i.e., timelike or null) vector. We can understand this second clause to assert that the energy of F does not propagate with superluminal velocity. The strengthened version of the DEC just changes “causal” to “timelike” in the second clause. It avoids reference to “point particles.” Each of the listed energy conditions is strictly stronger than the ones that precede it.
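
The conditions can be spot-checked numerically. The sketch below samples random future-directed unit timelike vectors and verifies WEC, DEC and SDEC for a perfect fluid with 0 ≤ p ≤ ρ; the fluid parameters and the Minkowski setting, signature (+,−,−,−), are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
g = np.diag([1.0, -1.0, -1.0, -1.0])
rho, p = 1.0, 0.3                      # energy density and pressure
u = np.array([1.0, 0.0, 0.0, 0.0])     # fluid at rest
T = (rho + p) * np.outer(u, u) - p * np.linalg.inv(g)

def random_timelike():
    """A random unit, future-directed timelike vector."""
    v = rng.normal(size=3) * 0.5
    return np.array([np.sqrt(1.0 + v @ v), *v])

for _ in range(1000):
    xi_dn = g @ random_timelike()
    flux = T @ xi_dn                   # four-momentum density T^{ab} xi_b
    energy = flux @ xi_dn              # T^{ab} xi_a xi_b
    norm = flux @ g @ flux             # causal character of the flux
    assert energy >= 0                 # WEC
    assert norm >= -1e-12              # DEC: flux is causal
    assert norm > 0                    # SDEC: flux strictly timelike (T != 0)
print("WEC, DEC and SDEC hold for all sampled observers")
```

A fluid violating the sampled inequalities (e.g. p > ρ) would trip the DEC/SDEC assertions, illustrating that the conditions are genuinely stronger one than the next.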

The CC, finally, asserts that the energy-momentum carried by F is locally conserved. If two or more matter fields are present in the same region of space-time, it need not be the case that each one individually satisfies the condition. Interaction may occur. But it is a fundamental assumption that the composite energy-momentum field formed by taking the sum of the individual ones satisfies it. Energy-momentum can be transferred from one matter field to another, but it cannot be created or destroyed. The stated conditions have a number of consequences that support the interpretations.

A subset S of M is said to be achronal if there do not exist points p and q in S such that p ≪ q. Let γ : I → M be a smooth curve. We say that a point p in M is a future-endpoint of γ if, for all open sets O containing p, there exists an s0 in I such that, ∀ s ∈ I, if s ≥ s0, then γ(s) ∈ O; i.e., γ eventually enters and remains in O. Now let S be an achronal subset of M. The domain of dependence D(S) of S is the set of all points p in M with this property: given any smooth causal curve without (past- or future-) endpoint, if its image contains p, then it intersects S. So, in particular, S ⊆ D(S).


Let S be an achronal subset of M. Further, let Tab be a smooth, symmetric field on M that satisfies both the dominant energy and conservation conditions. Finally, assume Tab = 0 on S. Then Tab = 0 on all of D(S).

The intended interpretation of the proposition is clear. If energy-momentum cannot propagate (locally) outside the null-cone, and if it is conserved, and if it vanishes on S, then it must vanish throughout D(S). After all, how could it “get to” any point in D(S)? According to the interpretive principle at issue, free massive point particles traverse (images of) timelike geodesics. The principle can be captured by considering a sequence of smaller and smaller bodies converging to a point: it turns out that if the energy-momentum content of each body in the sequence satisfies appropriate conditions, then the convergence point will necessarily traverse (the image of) a timelike geodesic.

Let γ : I → M be a smooth curve. Suppose that, given any open subset O of M containing γ[I], there exists a smooth symmetric field Tab on M such that the following conditions hold.

(1) Tab satisfies the SDEC.
(2) Tab satisfies the CC.
(3) Tab = 0 outside of O.
(4) Tab ≠ 0 at some point in O.

Then γ is timelike and can be reparametrized so as to be a geodesic. This might be paraphrased another way. Suppose that, for some smooth curve γ, arbitrarily small bodies with energy-momentum satisfying conditions (1) and (2) can contain the image of γ in their worldtubes. Then γ must be a timelike geodesic (up to reparametrization). Bodies here are understood to be “free” if their internal energy-momentum is conserved (by itself). If a body is acted on by a field, it is only the composite energy-momentum of the body and field together that is conserved.


But this formulation takes for granted that we can keep the background spacetime metric gab fixed while altering the fields Tab that live on M. This is justifiable only to the extent that we are dealing with test bodies whose effect on the background spacetime structure is negligible.

We have here a precise proposition in the language of matter fields that, at least to some degree, captures the interpretive principle. Similarly, it is possible to capture the behavior of light by considering the behavior of solutions to Maxwell’s equations in a limiting regime (“the optical limit”) where wavelengths are small. The resulting proposition asserts, in effect, that when one passes to this limit, packets of electromagnetic waves are constrained to move along (images of) null geodesics.

Prisoner’s Dilemma. Thought of the Day 64.0


A system suffering from a Prisoner’s Dilemma cannot find the optimal solution, because the individual driving forces go against the overall driving force. The name comes from the imaginary situation of two prisoners:

Imagine two criminals, named alphabetically A and B, being caught and put in separate prison cells. The police are trying to get confessions out of them. They know that if neither talks, they will both walk out of there for lack of evidence. So the police make a proposal to each one: “We’ll make it worth your while. If you confess and your colleague does not, we give you 10 thousand euro and your colleague will get 50 years in prison. If you both confess, you will each get 20 years in prison”. The decision table for these prisoners is like this:

                    A confesses                  A stays silent
B confesses         20 years each                A: 50 years, B: 10k euro
B stays silent      A: 10k euro, B: 50 years     both walk free

As you can see for yourself, the individual option for A, independent of what B decides to do, is confessing; moving from the right column to the left column, it is either reducing his sentence from 50 to 20 years, or getting a fat bonus instead of merely walking out of there. The same applies to B, moving from the bottom row to the top row of the table. So they wind up both confessing and getting 20 years in prison, even though it is obvious that the optimal situation is both not talking and walking out of prison scot-free (with the loot!). Because A and B cannot come to an agreement, but both optimize their own personal yield instead, they both get severely punished!
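
The dominance reasoning can be sketched in a few lines of code; the payoff numbers follow the story above, with the 10-thousand-euro bonus scored as +1 and prison years as negative:

```python
# payoffs[(a, b)] = (payoff to A, payoff to B); 0 = stay silent, 1 = confess
payoffs = {
    (0, 0): (0, 0),        # both silent: both walk free
    (0, 1): (-50, 1),      # A silent, B confesses: A gets 50 years, B a bonus
    (1, 0): (1, -50),      # mirror case
    (1, 1): (-20, -20),    # both confess: 20 years each
}

def best_response_A(b):
    """A's best choice given B's choice b."""
    return max((0, 1), key=lambda a: payoffs[(a, b)][0])

# Confessing is dominant: it is A's best response whatever B does,
# yet mutual confession (-20, -20) is worse for both than (0, 0).
assert best_response_A(0) == 1 and best_response_A(1) == 1
print("dominant strategy for A: confess")
```

By symmetry the same computation applies to B, so the dominant-strategy outcome is mutual confession, the non-optimal equilibrium described above.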

The Prisoner’s Dilemma applies to the economy. If people in society cannot come to an agreement, but instead let everybody take decisions to optimize the situation for themselves (as in liberalism), they wind up with a non-optimal situation in which all the wealth is concentrated in a single entity. This does not even have to be a person; it can be the capital itself. Nobody will get anything beyond the alms granted by the system. In fact, the system will tend to reduce these alms – the minimum wages, or unemployment benefits – and will have all kinds of dogmatic justifications for doing so, but basically it is a strategy of divide-and-conquer, inhibiting people from coming to agreements, for instance by breaking the trade unions.

An example of a dogmatic reason is “lowering wages will mean that more people get hired for work”. Lowering wages will make the distortion more severe. Nothing more. Moreover, as we have seen, work can be done without human labor. So if it is about competition, humans will be cut out of the deal sooner or later. It is not about production. It is about who gets the rights to the consumption of the goods produced. That is also why it is important that people should unite, to come to an agreement where everybody benefits, up to and including the richest of them all! It is better to have 1% of 1 million than 100% of 1 thousand. Imagine this final situation: all property in the world belongs to the final pan-global bank, with its headquarters in an offshore fiscal paradise. It pays no tax. The salaries (even of the bank managers) are minimal, so small that it is indeed not even worth it to call them salaries.

Derivability from Relational Logic of Charles Sanders Peirce to Essential Laws of Quantum Mechanics


Charles Sanders Peirce made important contributions to logic, where he invented and elaborated a novel system of logical syntax and fundamental logical concepts. The starting point is the binary relation SiRSj between the two ‘individual terms’ (subjects) Sj and Si. In shorthand notation we represent this relation by Rij. Relations may be composed: whenever we have relations of the form Rij, Rjl, a third, transitive relation Ril emerges following the rule

RijRkl = δjkRil —– (1)

In ordinary logic the individual subject is the starting point and it is defined as a member of a set. Peirce considered the individual as the aggregate of all its relations

Si = ∑j Rij —– (2)

The individual Si thus defined is an eigenstate of the Rii relation

RiiSi = Si —– (3)

The relations Rii are idempotent

Rii² = Rii —– (4)

and they span the identity

∑i Rii = 1 —– (5)

The Peircean logical structure bears resemblance to category theory. In categories the concept of transformation (transition, map, morphism or arrow) enjoys an autonomous, primary and irreducible role. A category consists of objects A, B, C,… and arrows (morphisms) f, g, h,… . Each arrow f is assigned an object A as domain and an object B as codomain, indicated by writing f : A → B. If g is an arrow g : B → C with domain B, the codomain of f, then f and g can be “composed” to give an arrow g∘f : A → C. The composition obeys the associative law h∘(g∘f) = (h∘g)∘f. For each object A there is an arrow 1A : A → A called the identity arrow of A. The analogy with the relational logic of Peirce is evident: Rij stands as an arrow, the composition rule is manifested in equation (1) and the identity arrow for A ≡ Si is Rii.
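
These algebraic relations are easy to check concretely. A minimal sketch, under the illustrative assumption that the r-states are the standard orthonormal basis of ℂ³, so that Rij = |ri⟩⟨rj| becomes the matrix unit with a 1 in position (i, j):

```python
import numpy as np

# R_ij = |r_i><r_j| realized as outer products of the standard basis.
N = 3
basis = np.eye(N)
R = {(i, j): np.outer(basis[i], basis[j]) for i in range(N) for j in range(N)}

# Composition rule (1): R_ij R_kl = delta_jk R_il
assert np.allclose(R[0, 1] @ R[1, 2], R[0, 2])              # j = k: composes
assert np.allclose(R[0, 1] @ R[2, 2], np.zeros((N, N)))     # j != k: vanishes

# Eigenstate property (3), idempotence (4) and completeness (5)
assert np.allclose(R[2, 2] @ basis[2], basis[2])
assert np.allclose(R[1, 1] @ R[1, 1], R[1, 1])
assert np.allclose(sum(R[i, i] for i in range(N)), np.eye(N))
print("relations (1), (3), (4), (5) verified")
```

Read categorically, the first two assertions say that arrows compose exactly when the codomain of one matches the domain of the next, and the last says the identity arrows Rii resolve the identity.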

Rij may receive multiple interpretations: as a transition from the j state to the i state, as a measurement process that rejects all impinging systems except those in the state j and permits only systems in the state i to emerge from the apparatus, as a transformation replacing the j state by the i state. We proceed to a representation of Rij

Rij = |ri⟩⟨rj| —– (6)

where state ⟨ri | is the dual of the state |ri⟩ and they obey the orthonormal condition

⟨ri |rj⟩ = δij —– (7)

It is immediately seen that our representation satisfies the composition rule equation (1). The completeness, equation (5), takes the form

∑n |rn⟩⟨rn| = 1 —– (8)

All relations remain satisfied if we replace the state |ri⟩ by |ξi⟩ where

|ξi⟩ = 1/√N ∑n |ri⟩⟨rn| —– (9)

with N the number of states. Thus we verify Peirce’s suggestion, equation (2), and the state |ri⟩ is derived as the sum of all its interactions with the other states. Rij acts as a projection, transferring from one r state to another r state

Rij |rk⟩ = δjk |ri⟩ —– (10)

We may think also of another property characterizing our states and define a corresponding operator

Qij = |qi⟩⟨qj | —– (11)

with

Qij |qk⟩ = δjk |qi⟩ —– (12)

and

∑n |qn⟩⟨qn| = 1 —– (13)

Successive measurements of the q-ness and r-ness of the states are provided by the operator

RijQkl = |ri⟩⟨rj |qk⟩⟨ql | = ⟨rj |qk⟩ Sil —– (14)

with

Sil = |ri⟩⟨ql | —– (15)

Considering the matrix elements of an operator A as Anm = ⟨rn |A |rm⟩ we find for the trace

Tr(Sil) = ∑n ⟨rn |Sil |rn⟩ = ⟨ql |ri⟩ —– (16)

From the above relation we deduce

Tr(Rij) = δij —– (17)

Any operator can be expressed as a linear superposition of the Rij

A = ∑i,j AijRij —– (18)

with

Aij =Tr(ARji) —– (19)

The individual states could be redefined

|ri⟩ → e^(iφi) |ri⟩ —– (20)

|qi⟩ → e^(iθi) |qi⟩ —– (21)

without affecting the corresponding composition laws. However the overlap number ⟨ri |qj⟩ changes and therefore we need an invariant formulation for the transition |ri⟩ → |qj⟩. This is provided by the trace of the closed operation RiiQjjRii

Tr(RiiQjjRii) ≡ p(qj, ri) = |⟨ri|qj⟩|² —– (22)

The completeness relation, equation (13), guarantees that p(qj, ri) may assume the role of a probability since

∑j p(qj, ri) = 1 —– (23)
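
A small numerical check of this probability interpretation, under the illustrative assumption that the r-states form the standard basis of ℂ² and the q-states the Hadamard basis:

```python
import numpy as np

r = np.eye(2)                                   # r-basis: standard basis
q = np.array([[1, 1], [1, -1]]) / np.sqrt(2)    # q-basis: Hadamard (columns)

def proj(v):
    """Rank-one projector |v><v|."""
    return np.outer(v, v.conj())

for i in range(2):
    # p(q_j, r_i) computed as the trace of the closed operation R_ii Q_jj R_ii
    probs = [np.trace(proj(r[i]) @ proj(q[:, j]) @ proj(r[i])).real
             for j in range(2)]
    assert all(p >= 0 for p in probs)
    assert abs(sum(probs) - 1.0) < 1e-12         # equation (23)
print("probabilities:", probs)                   # [0.5, 0.5] for these bases
```

The traces are manifestly invariant under the phase redefinitions (20)-(21), which is exactly why p(qj, ri) is the right invariant quantity.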

We discover that starting from the relational logic of Peirce we obtain all the essential laws of Quantum Mechanics. Our derivation underlines the utmost relational nature of Quantum Mechanics and goes in parallel with the analysis of the quantum algebra of microscopic measurement.

Of Magnitudes, Metrization and Materiality of Abstracto-Concrete Objects.


The possibility of introducing magnitudes in a certain domain of concrete material objects is by no means immediate, granted or elementary. First of all, it is necessary to find a property of such objects that permits one to compare them, so that a quasi-serial ordering can be introduced in their set, that is, a total linear ordering not excluding that more than one object may occupy the same position in the series. Such an ordering must then undergo a metrization, which depends on finding a fundamental measuring procedure permitting the determination of a standard sample to which the unit of measure can be bound. This also depends on the existence of an operation of physical composition, which behaves additively with respect to the quantity which we intend to measure. Only if all these conditions are satisfied will it be possible to introduce a magnitude in a proper sense, that is, a function which assigns to each object of the material domain a real number. This real number represents the measure of the object with respect to the intended magnitude. This condition, by introducing a homomorphism between the domain of the material objects and that of the positive real numbers, transforms the language of analysis (that is, of the concrete theory of real numbers) into a language capable of speaking faithfully and truly about those physical objects to which such a magnitude is said to belong.
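
The additivity requirement can be made concrete with a toy model. The sketch below is purely illustrative (the Rod class and all names in it are assumptions, not from the text): physical end-to-end composition is mapped homomorphically onto addition of positive reals.

```python
class Rod:
    """A concrete object whose comparison property is its extension."""
    def __init__(self, length):
        # 'length' stands in for the outcome of a fundamental
        # measuring procedure against the standard sample.
        self._length = length

    def compose(self, other):
        """Physical composition: laying two rods end to end."""
        return Rod(self._length + other._length)

def m(rod):
    """The magnitude: assigns a positive real to each object."""
    return rod._length

unit = Rod(1.0)            # the standard sample bound to the unit
a, b = Rod(2.5), Rod(4.0)

# Homomorphism: m(a o b) = m(a) + m(b), and the unit measures 1.
assert m(a.compose(b)) == m(a) + m(b)
assert m(unit) == 1.0
print(m(a.compose(b)))     # 6.5
```

Only because composition behaves additively does m deserve the name "magnitude"; for a quantity like temperature, where juxtaposition is not additive, this construction would fail.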

Does the success of applying mathematics in the study of the physical world mean that this world has a mathematical structure in an ontological sense, or does it simply mean that we find in mathematics nothing but a convenient practical tool for putting order in our representations of the world? Neither of the answers to this question is right, and this is because the question itself is not correctly raised. Indeed it tacitly presupposes that the endeavour of our scientific investigations consists in facing the reality of “things” as it is, so to speak, in itself. But we know that any science is uniquely concerned with a limited “cut” operated in reality by adopting a particular point of view, that is concretely manifested by adopting a restricted number of predicates in the discourse on reality. Several skilful operational manipulations are needed in order to bring about a homomorphism with the structure of the positive real numbers. It is therefore clear that the objects that are studied by an empirical theory are by no means the rough things of everyday experience, but bundles of “attributes” (that is of properties, relations and functions), introduced through suitable operational procedures having often the explicit and declared goal of determining a concrete structure as isomorphic, or at least homomorphic, to the structure of real numbers or to some other mathematical structure. But now, if the objects of an empirical theory are entities of this kind, we are fully entitled to maintain that they are actually endowed with a mathematical structure: this is simply that structure which we have introduced through our operational procedures. However, this structure is objective and real and, with respect to it, the mathematized discourse is far from having a purely conventional and pragmatic function, with the goal of keeping our ideas in order: it is a faithful description of this structure. 
Of course, we could never pretend that such a discourse determines the structure of reality in a full and exhaustive way, and this for two distinct reasons. In the first place, reality (both in the sense of the totality of existing things, and of the “whole” of any single thing) is much richer than the particular “slice” that it is possible to cut out by means of our operational manipulations. In the second place, we must be aware that a scientific object, defined as a structured set of attributes, is an abstract object: it is a conceptual construction that is perfectly defined just because it is totally determined by a finite list of predicates. But concrete objects are by no means so: they are endowed with a great deal of attributes of an indefinite variety, so that they can at best exemplify, with an acceptable approximation, certain abstract objects that totally encode a given set of attributes through their corresponding predicates. The reason why such an exemplification can only be partial is that the different attributes that are simultaneously present in a concrete object are, in a way, mutually limiting, so that the object never fully exemplifies any one of them. This explains the correct sense of such common and obvious remarks as: “a rigid body, a perfect gas, an adiabatic transformation, a perfect elastic recoil, etc., do not exist in reality (or in Nature)”. Sometimes this remark is intended to convey the thesis that these are nothing but intellectual fictions devoid of any correspondence with reality, but instrumentally used by scientists in order to organize their ideas. This interpretation is totally wrong, and is simply due to a confusion between encoding and exemplifying: no concrete thing encodes any finite and explicit number of characteristics that, on the contrary, can be appropriately encoded in a concept.
Things can exemplify several concepts, while concepts (or abstract objects) do not exemplify the attributes they encode. Going back to the distinction between sense on the one hand, and reference or denotation on the other hand, we could also say that abstract objects belong to the level of sense, while their exemplifications belong to the level of reference, and constitute what is denoted by them. It is obvious that in the case of empirical sciences we try to construct conceptual structures (abstract objects) having empirical denotations (exemplified by concrete objects). If one has well understood this elementary but important distinction, one is in the position of correctly seeing how mathematics can concern physical objects. These objects are abstract objects, are structured sets of predicates, and there is absolutely nothing surprising in the fact that they could receive a mathematical structure (for example, a structure isomorphic to that of the positive real numbers, or to that of a given group, or of an abstract mathematical space, etc.). If it happens that these abstract objects are exemplified by concrete objects within a certain degree of approximation, we are entitled to say that the corresponding mathematical structure also holds true (with the same degree of approximation) for this domain of concrete objects. Now, in the case of physics, the abstract objects are constructed by isolating certain ontological attributes of things by means of concrete operations, so that they actually refer to things, and are exemplified by the concrete objects singled out by means of such operations up to a given degree of approximation or accuracy. In conclusion, one can maintain that mathematics constitutes at the same time the most exact language for speaking of the objects of the domain under consideration, and faithfully mirrors the concrete structure (in an ontological sense) of this domain of objects. 
Of course, it is very reasonable to recognize that other aspects of these things (or other attributes of them) might not be treatable by means of the particular mathematical language adopted, and this may imply either that these attributes could perhaps be handled through a different available mathematical language, or even that no mathematical language found as yet could be used for handling them.

Conjuncted: Ergodicity. Thought of the Day 51.1


When we scientifically investigate a system, we cannot normally observe all possible histories in Ω, or directly access the conditional probability structure {PrE}E⊆Ω. Instead, we can only observe specific events. Conducting many “runs” of the same experiment is an attempt to observe as many histories of a system as possible, but even the best experimental design rarely allows us to observe all histories or to read off the full conditional probability structure. Furthermore, this strategy works only for smaller systems that we can isolate in laboratory conditions. When the system is the economy, the global ecosystem, or the universe in its entirety, we are stuck in a single history. We cannot step outside that history and look at alternative histories. Nonetheless, we would like to infer something about the laws of the system in general, and especially about the true probability distribution over histories.

Can we discern the system’s laws and true probabilities from observations of specific events? And what kinds of regularities must the system display in order to make this possible? In other words, are there certain “metaphysical prerequisites” that must be in place for scientific inference to work?

To answer these questions, we first consider a very simple example. Here T = {1,2,3,…}, and the system’s state at any time is the outcome of an independent coin toss. So the state space is X = {Heads, Tails}, and each possible history in Ω is one possible Heads/Tails sequence.

Suppose the true conditional probability structure on Ω is induced by the single parameter p, the probability of Heads. In this example, the Law of Large Numbers guarantees that, with probability 1, the limiting frequency of Heads in a given history (as time goes to infinity) will match p. This means that the subset of Ω consisting of “well-behaved” histories has probability 1, where a history is well-behaved if (i) there exists a limiting frequency of Heads for it (i.e., the proportion of Heads converges to a well-defined limit as time goes to infinity) and (ii) that limiting frequency is p. For this reason, we will almost certainly (with probability 1) arrive at the true conditional probability structure on Ω on the basis of observing just a single history and counting the number of Heads and Tails in it.
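The coin-toss claim lends itself to a direct numerical check. A minimal sketch (the function name and the use of Python's `random` module are my own, not from the text): simulate one long history and read off the limiting frequency of Heads, which recovers the true parameter p.

```python
import random

random.seed(0)

def heads_frequency(p, n):
    """Frequency of Heads in a single simulated history of n tosses."""
    heads = sum(1 for _ in range(n) if random.random() < p)
    return heads / n

# One sufficiently long history is enough to recover p with high accuracy,
# as the Law of Large Numbers promises for almost every history.
p_true = 0.3
p_est = heads_frequency(p_true, 200_000)
```

With 200,000 tosses the sampling error is on the order of a tenth of a percent, illustrating why observing a single well-behaved history suffices here.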

Does this result generalize? The short answer is “yes”, provided the system’s symmetries are of the right kind. Without suitable symmetries, generalizing from local observations to global laws is not possible. In a slogan, for scientific inference to work, there must be sufficient regularities in the system. In our toy system of the coin tosses, there are. Wigner (1967) recognized this point, taking symmetries to be “a prerequisite for the very possibility of discovering the laws of nature”.

Generally, symmetries allow us to infer general laws from specific observations. For example, let T = {1,2,3,…}, and let Y and Z be two subsets of the state space X. Suppose we have made the observation O: “whenever the state is in the set Y at time 5, there is a 50% probability that it will be in Z at time 6”. Suppose we know, or are justified in hypothesizing, that the system has the set of time symmetries {ψr : r = 1,2,3,….}, with ψr(t) = t + r, as defined in the previous section. Then, from observation O, we can deduce the following general law: “for any t in T, if the state of the system is in the set Y at time t, there is a 50% probability that it will be in Z at time t + 1”.
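A hedged illustration of how time symmetry licenses this inference, using an invented time-homogeneous two-state chain in place of the abstract system (the states, sets Y = {0} and Z = {1}, and transition probabilities below are my own example): many "runs" estimate the Y-to-Z frequency at times (5, 6), and the same estimate at times (50, 51) agrees, precisely because the dynamics are shift-invariant.

```python
import random

random.seed(1)

# Invented chain on X = {0, 1}: P(next = 1 | 0) = 0.5, P(next = 0 | 1) = 0.8.
def step(s):
    if s == 0:
        return 1 if random.random() < 0.5 else 0
    return 0 if random.random() < 0.8 else 1

def transition_freq(t, runs=40_000):
    """Across many runs, the frequency of 'state in Z = {1} at time t + 1'
    among runs whose state is in Y = {0} at time t."""
    in_Y, hits = 0, 0
    for _ in range(runs):
        s = 0
        for _ in range(t):          # evolve up to time t
            s = step(s)
        if s == 0:                  # observation window: state in Y at time t
            in_Y += 1
            hits += (step(s) == 1)  # one more step: in Z at time t + 1?
    return hits / in_Y

f5 = transition_freq(5)    # observation O, made at times (5, 6)
f50 = transition_freq(50)  # the same regularity at times (50, 51)
```

Both estimates land near 0.5: the time translations ψr carry the observation at (5, 6) to every other pair (t, t + 1).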

However, this example still has a problem. It only shows that if we could make observation O, then our generalization would be warranted, provided the system has the relevant symmetries. But the “if” is a big “if”. Recall what observation O says: “whenever the system’s state is in the set Y at time 5, there is a 50% probability that it will be in the set Z at time 6”. Clearly, this statement is only empirically well supported – and thus a real observation rather than a mere hypothesis – if we can make many observations of possible histories at times 5 and 6. We can do this if the system is an experimental apparatus in a lab or a virtual system in a computer, which we are manipulating and observing “from the outside”, and on which we can perform many “runs” of an experiment. But, as noted above, if we are participants in the system, as in the case of the economy, an ecosystem, or the universe at large, we only get to experience times 5 and 6 once, and we only get to experience one possible history. How, then, can we ever assemble a body of evidence that allows us to make statements such as O?

The solution to this problem lies in the property of ergodicity. This is a property that a system may or may not have and that, if present, serves as the desired metaphysical prerequisite for scientific inference. To explain this property, let us give an example. Suppose T = {1,2,3,…}, and the system has all the time symmetries in the set Ψ = {ψr : r = 1,2,3,….}. Heuristically, the symmetries in Ψ can be interpreted as describing the evolution of the system over time. Suppose each time-step corresponds to a day. Then the history h = (a,b,c,d,e,….) describes a situation where today’s state is a, tomorrow’s is b, the next day’s is c, and so on. The transformed history ψ1(h) = (b,c,d,e,f,….) describes a situation where today’s state is b, tomorrow’s is c, the following day’s is d, and so on. Thus, ψ1(h) describes the same “world” as h, but as seen from the perspective of tomorrow. Likewise, ψ2(h) = (c,d,e,f,g,….) describes the same “world” as h, but as seen from the perspective of the day after tomorrow, and so on.

Given the set Ψ of symmetries, an event E (a subset of Ω) is Ψ-invariant if the inverse image of E under ψ is E itself, for all ψ in Ψ. This implies that if a history h is in E, then ψ(h) will also be in E, for all ψ. In effect, if the world is in the set E today, it will remain in E tomorrow, and the day after tomorrow, and so on. Thus, E is a “persistent” event: an event one cannot escape from by moving forward in time. In a coin-tossing system, where Ψ is still the set of time translations, examples of Ψ-invariant events are “all Heads”, where E contains only the history (Heads, Heads, Heads, …), and “all Tails”, where E contains only the history (Tails, Tails, Tails, …).
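The shift maps and a Ψ-invariant event can be sketched in a few lines, modelling a history as a function of time (the finite horizon is an approximation of the infinitary membership test, and all names here are illustrative):

```python
def shift(h, r):
    """ψr: transform a history h (a function of time) into its r-step shift."""
    return lambda t: h(t + r)

def all_heads(h, horizon=1000):
    """Finite-horizon membership test for the Ψ-invariant event
    E = {(Heads, Heads, Heads, ...)}."""
    return all(h(t) == "Heads" for t in range(1, horizon + 1))

constant = lambda t: "Heads"                       # the history in E
mixed = lambda t: "Heads" if t % 2 else "Tails"    # a history outside E
```

The constant history stays in E under every shift, while no shift of the mixed history ever enters it, matching the "persistence" described above.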

The system is ergodic (with respect to Ψ) if, for any Ψ-invariant event E, the unconditional probability of E, i.e., PrΩ(E), is either 0 or 1. In other words, the only persistent events are those which occur in almost no history (i.e., PrΩ(E) = 0) and those which occur in almost every history (i.e., PrΩ(E) = 1). Our coin-tossing system is ergodic, as exemplified by the fact that the Ψ-invariant events “all Heads” and “all Tails” occur with probability 0.

In an ergodic system, it is possible to estimate the probability of any event “empirically”, by simply counting the frequency with which that event occurs. Frequencies are thus evidence for probabilities. The formal statement of this is the following important result from the theory of dynamical systems and stochastic processes.

Ergodic Theorem: Suppose the system is ergodic. Let E be any event and let h be any history. For all times t in T, let Nt be the number of elements r in the set {1, 2, …, t} such that ψr(h) is in E. Then, with probability 1, the ratio Nt/t will converge to PrΩ(E) as t increases towards infinity.

Intuitively, Nt is the number of times the event E has “occurred” in history h from time 1 up to time t. The ratio Nt/t is therefore the frequency of occurrence of event E (up to time t) in history h. This frequency might be measured, for example, by performing a sequence of experiments or observations at times 1, 2, …, t. The Ergodic Theorem says that, almost certainly (i.e., with probability 1), the empirical frequency will converge to the true probability of E, PrΩ(E), as the number of observations becomes large. The estimation of the true conditional probability structure from the frequencies of Heads and Tails in our illustrative coin-tossing system is possible precisely because the system is ergodic.
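A hedged numerical sketch of the theorem, with an invented two-state ergodic Markov chain standing in for the abstract system: the single-history frequency Nt/t of the event E = {h : h(1) = 1} converges to PrΩ(E), which for this chain is the stationary probability of state 1 (all numbers below are my own example, not from the text).

```python
import random

random.seed(2)

# Invented ergodic chain: P(next = 1 | 0) = 0.5, P(next = 0 | 1) = 0.8.
def step(s):
    if s == 0:
        return 1 if random.random() < 0.5 else 0
    return 0 if random.random() < 0.8 else 1

# E = {h : h(1) = 1}, so ψr(h) is in E iff the state at time 1 + r equals 1,
# and Nt/t is simply the single-history frequency of state 1.
T = 300_000
s, N = 0, 0
for _ in range(T):
    s = step(s)
    N += (s == 1)
time_average = N / T            # Nt/t for t = T
stationary_prob = 0.5 / 1.3     # exact PrΩ(E) for this chain
```

One long history, observed "from the inside", reproduces the true probability of E, which is exactly what the Ergodic Theorem warrants.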

To understand the significance of this result, let Y and Z be two subsets of X, and suppose E is the event “h(1) is in Y”, while D is the event “h(2) is in Z”. Then the intersection E ∩ D is the event “h(1) is in Y, and h(2) is in Z”. The Ergodic Theorem says that, by performing a sequence of observations over time, we can empirically estimate PrΩ(E) and PrΩ(E ∩ D) with arbitrarily high precision. Thus, we can compute the ratio PrΩ(E ∩ D)/PrΩ(E). But this ratio is simply the conditional probability PrΕ(D). And so, we are able to estimate the conditional probability that the state at time 2 will be in Z, given that at time 1 it was in Y. This illustrates that, by allowing us to estimate unconditional probabilities empirically, the Ergodic Theorem also allows us to estimate conditional probabilities, and in this way to learn the properties of the conditional probability structure {PrE}E⊆Ω.
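Continuing the same invented two-state chain, the ratio of two single-history frequencies recovers a conditional probability: here PrE(D) with E = {h : h(1) = 0} and D = {h : h(2) = 1}, whose true value for this chain is the transition probability 0.5 (the chain is my own illustration, not from the text).

```python
import random

random.seed(3)

# Invented ergodic chain: P(next = 1 | 0) = 0.5, P(next = 0 | 1) = 0.8.
def step(s):
    if s == 0:
        return 1 if random.random() < 0.5 else 0
    return 0 if random.random() < 0.8 else 1

# Record one long history, then count the frequencies of
# E = {h : h(1) = 0} and E ∩ D = {h : h(1) = 0 and h(2) = 1}
# along its shifts ψr(h).
T = 300_000
history, s = [], 0
for _ in range(T):
    s = step(s)
    history.append(s)

freq_E = sum(1 for t in range(T - 1) if history[t] == 0) / (T - 1)
freq_ED = sum(1 for t in range(T - 1)
              if history[t] == 0 and history[t + 1] == 1) / (T - 1)
cond_prob = freq_ED / freq_E   # estimate of PrE(D)
```

The ratio lands on the transition probability 0.5, illustrating how unconditional time averages yield the conditional probability structure.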

We may thus conclude that ergodicity is what allows us to generalize from local observations to global laws. In effect, when we engage in scientific inference about some system, or even about the world at large, we rely on the hypothesis that this system, or the world, is ergodic. If our system, or the world, were “dappled”, then presumably we would not be able to presuppose ergodicity, and hence our ability to make scientific generalizations would be compromised.

Category Theory of a Sketch. Thought of the Day 50.0


If a sketch can be thought of as an abstract concept, a model of a sketch is not so much an interpretation of a sketch, but a concrete or particular instantiation or realization of it. It is tempting to adopt a Kantian terminology here and say that a sketch is an abstract concept, a functor between a sketch and a category C a schema and the models of a sketch the constructions in the “intuition” of the concept.

The schema is not unique since a sketch can be realized in many different categories by many different functors. What varies from one category to the other is not the basic structure of the realizations, but the types of morphisms of the underlying category, e.g., arbitrary functions, continuous maps, etc. Thus, even though a sketch captures essential structural ingredients, others are given by the “environment” in which this structure will be realized, which can be thought of as being itself another structure. Hence, the “meaning” of some concepts cannot be uniquely given by a sketch, which is not to say that it cannot be given in a structuralist fashion.

We now distinguish the group as a structure, given by the sketch for the theory of groups, from the structure of groups, given by a category of groups, that is the category of models of the sketch for groups in a given category, be it Set or another category, e.g., the category of topological spaces with continuous maps. In the latter case, the structure is given by the exactness properties of the category, e.g., Cartesian closed, etc. This is an important improvement over the traditional framework in which one was unable to say whether we should talk about the structure common to all groups, usually taken to be given by the group axioms, or the structure generated by “all” groups. Indeed, one can now ask in a precise manner whether a category C of structures, e.g., the category of (small) groups, is sketchable, that is, whether there exists a sketch S such that Mod(S, Set) is equivalent as a category to C.

There is another category associated to a sketch, namely the theory of that sketch. The theory of a sketch S, denoted by Th(S), is in a sense “freely” constructed from S : the arrows of the underlying graph are freely composed and the diagrams are imposed as equations, and so are the cones and the cocones. Th(S) is in fact a model of S in the previous sense with the following universal property: for any other model M of S in a category C there is a unique functor F: Th(S) → C such that FU = M, where U: S → Th(S). Thus, for instance, the theory of groups is a category with a group object, the generic group, “freely” constructed from the sketch for groups. It is in a way the “universal” group in the sense that any other group in any category can be constructed from it. This is possible since it contains all possible arrows, i.e., all definable operations, obtained in a purely internal or abstract manner. It is debatable whether this category should be called the theory of the sketch. But that may be more a matter of terminology than anything else, since it is clear that the “free” category called the theory is there to stay in one way or another.
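The notions of sketch and model can be miniaturized in code. This is a toy illustration, not a full sketch implementation (everything here is invented for exposition): a sketch is taken to be a graph together with equations between composable arrows, and a model is an assignment of a set to the node and functions to the arrows under which the equations hold, here for the one-node, one-arrow sketch with the single equation e∘e = e (idempotents).

```python
def compose(*fs):
    """Compose functions left to right: compose(f, g)(x) = g(f(x))."""
    def h(x):
        for f in fs:
            x = f(x)
        return x
    return h

def is_model(carrier, e):
    """Check the sketch equation e∘e = e pointwise on the carrier set,
    i.e., whether (carrier, e) realizes the sketch in Set."""
    ee = compose(e, e)
    return all(ee(x) == e(x) for x in carrier)

X = {0, 1, 2}
collapse = lambda x: 0         # satisfies e∘e = e: a model of the sketch
cycle = lambda x: (x + 1) % 3  # violates the equation: not a model
```

Changing the carrier from a finite set to, say, a topological space with a continuous idempotent changes the "environment" of the realization without touching the sketch itself, which is the point made above.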

Conjuncted: Indiscernibles – Philosophical Constructibility. Thought of the Day 48.1


Conjuncted here.

“Thought is nothing other than the desire to finish with the exorbitant excess of the state” (Being and Event). Since Cantor’s theorem implies that this excess cannot be removed or reduced to the situation itself, the only way left is to take control of it. A basic, paradigmatic strategy for achieving this goal is to subject the excess to the power of language. Its essence has been expressed by Leibniz in the form of the principle of indiscernibles: there cannot exist two things whose difference cannot be marked by a describable property. In this manner, language assumes the role of a “law of being”, postulating identity, where it cannot find a difference. Meanwhile – according to Badiou – the generic truth is indiscernible: there is no property expressible in the language of set theory that characterizes elements of the generic set. Truth is beyond the power of knowledge, only the subject can support a procedure of fidelity by deciding what belongs to a truth. This key thesis is established using purely formal means, so it should be regarded as one of the peak moments of the mathematical method employed by Badiou.

Badiou composes the indiscernible out of as many as three different mathematical notions. First of all, he decides that it corresponds to the concept of the inconstructible. Later, however, he writes that “a set δ is discernible (…) if there exists (…) an explicit formula λ(x) (…) such that ‘belong to δ’ and ‘have the property expressed by λ(x)’ coincide”. Finally, at the outset of the argument designed to demonstrate the indiscernibility of truth he brings in yet another definition: “let us suppose the contrary: the discernibility of G. A formula thus exists λ(x, a1,…, an) with parameters a1…, an belonging to M[G] such that for an inhabitant of M[G] it defines the multiple G”. In short, discernibility is understood as:

  1. constructibility
  2. definability by a formula F(y) with one free variable and no parameters. In this approach, a set a is definable if there exists a formula F(y) such that b is an element of a iff F(b) holds.
  3. definability by a formula F (y, z1 . . . , zn) with parameters. This time, a set a is definable if there exists a formula F(y, z1,…, zn) and sets a1,…, an such that after substituting z1 = a1,…, zn = an, an element b belongs to a iff F(b, a1,…, an) holds.
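The gap between definitions 2 and 3 can be made concrete in a toy model (the universe and the predicates below are invented for illustration): without parameters only "describable" subsets arise, while with parameters the membership formula 'y ∈ z' recovers every set after substituting z = a, which is why the third reading trivializes discernibility.

```python
# Toy universe: elements are 0..3; "sets" are frozensets of them.
U = {0, 1, 2, 3}

def defined_by(formula):
    """The set defined by a parameter-free formula F(y)."""
    return frozenset(b for b in U if formula(b))

def defined_by_param(formula, *params):
    """The set defined by F(y, z1, ..., zn) after substituting parameters."""
    return frozenset(b for b in U if formula(b, *params))

even = lambda y: y % 2 == 0       # a "describable property" in Leibniz's sense
membership = lambda y, a: y in a  # the formula 'y belongs to z'

evens = defined_by(even)          # definable without parameters
target = frozenset({1, 3})        # an arbitrary set...
recovered = defined_by_param(membership, target)  # ...defined via z = target
```

Every set in the universe is "discernible" in the parameterized sense, exactly as the next paragraph argues for M[G].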

Even though in “Being and Event” Badiou does not explain the reasons for this variation, it clearly follows from his other writings (Alain Badiou Conditions) that he is convinced that these notions are equivalent. It should be emphasized then that this is not true: a set may be discernible in one sense, but indiscernible in another. First of all, the last definition has been included probably by mistake because it is trivial. Every set in M[G] is discernible in this sense because for every set a the formula F(y, x) defined as y belongs to x defines a after substituting x = a. Accepting this version of indiscernibility would lead to the conclusion that truth is always discernible, while Badiou claims that it is not so.

Is it not possible, then, to choose the second option and identify discernibility with definability by a formula with no parameters? After all, this notion is most similar to the original idea of Leibniz: intuitively, the formula F(y) expresses a property characterizing the elements of the set it defines. Unfortunately, this solution does not warrant the indiscernibility of the generic set either. As a matter of fact, assuming that in ontology, that is, in set theory, discernibility corresponds to constructibility, Badiou is right that the generic set is necessarily indiscernible. However, constructibility is a highly technical notion, and its philosophical interpretation seems very problematic. Let us take a closer look at it.

The class of constructible sets – usually denoted by the letter L – forms a hierarchy indexed or numbered by ordinal numbers. The lowest level L0 is simply the empty set. Assuming that some level – let us denote it by Lα – has already been constructed, the next level Lα+1 is constructed by choosing all subsets of Lα that can be defined by a formula (possibly with parameters) bounded to the lower level Lα.

Bounding a formula to Lα means that its parameters must belong to Lα and that its quantifiers are restricted to elements of Lα. For instance, the formula ‘there exists z such that z is in y’ simply says that y is not empty. After bounding it to Lα this formula takes the form ‘there exists z in Lα such that z is in y’, so it says that y is not empty, and some element from Lα witnesses it. Accordingly, the set defined by it consists of precisely those sets in Lα that contain an element from Lα.
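The worked example of a bounded formula can be computed directly on a toy level. A sketch, assuming Lα is modelled by the first few hereditarily finite sets (the names and the tiny level are illustrative, not part of the actual constructible hierarchy):

```python
# A toy level L_alpha: the first few hereditarily finite sets.
empty = frozenset()
one = frozenset({empty})        # {∅}
two = frozenset({empty, one})   # {∅, {∅}}
L_alpha = {empty, one, two}

def bounded_exists(L, phi):
    """'there exists z in L such that phi(z, y)', as a predicate of y."""
    return lambda y: any(phi(z, y) for z in L)

# The bounded formula 'there exists z in L_alpha such that z is in y'
# defines, as a subset of L_alpha, exactly those members that contain
# an element of L_alpha.
nonempty_witnessed = frozenset(
    y for y in L_alpha if bounded_exists(L_alpha, lambda z, y: z in y)(y)
)
```

Only the empty set fails the test, so the defined set is {{∅}, {∅, {∅}}}: the members of Lα whose non-emptiness is witnessed inside Lα.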

After constructing an infinite sequence of levels, the level directly above them all is simply the set of all elements constructed so far. For example, the first infinite level Lω consists of all elements constructed on levels L0, L1, L2,….

As a result of this inductive definition, all the formulas are used at each level of the hierarchy, so that two distinct sets may be defined by the same formula. On the other hand, only bounded formulas take part in the construction. The definition of constructibility thus offers too little and too much at the same time. This technical notion resembles the Leibnizian discernibility only insofar as it refers to formulas. In set theory, though, there are more notions of this type.

To realize difficulties involved in attempts to philosophically interpret constructibility, one may consider a slight, purely technical, extension of it. Let us also accept sets that can be defined by a formula F (y, z1, . . . , zn) with constructible parameters, that is, parameters coming from L. Such a step does not lead further away from the common understanding of Leibniz’s principle than constructibility itself: if parameters coming from lower levels of the hierarchy are admissible when constructing a new set, why not admit others as well, especially since this condition has no philosophical justification?

Actually, one can accept parameters coming from an even more restricted class, e.g., the class of ordinal numbers. Then we will obtain the notion of definability from ordinal numbers. This minor modification of the concept of constructibility – a relaxation of the requirement that the procedure of construction has to be restricted to lower levels of the hierarchy – results in drastic consequences.

Stationarity or Homogeneity of Random Fields


Let (Ω, F, P) be a probability space on which all random objects will be defined. A filtration {Ft : t ≥ 0} of σ-algebras is fixed and defines the information available at each time t.

Random field: A real-valued random field is a family of random variables Z(x) indexed by x ∈ Rd together with a collection of distribution functions of the form Fx1,…,xn which satisfy

Fx1,…,xn(b1,…,bn) = P[Z(x1) ≤ b1,…,Z(xn) ≤ bn], b1,…,bn ∈ R

The mean function of Z is m(x) = E[Z(x)] whereas the covariance function and the correlation function are respectively defined as

R(x, y) = E[Z(x)Z(y)] − m(x)m(y)

c(x, y) = R(x, y)/√(R(x, x)R(y, y))

Notice that the covariance function of a random field Z is a non-negative definite function on Rd × Rd, that is, if x1, …, xk is any collection of points in Rd, and ξ1, …, ξk are arbitrary real constants, then

∑l=1k ∑j=1k ξlξj R(xl, xj) = ∑l=1k ∑j=1k ξlξj E[Z(xl) Z(xj)] = E[(∑j=1k ξj Z(xj))2] ≥ 0

Without loss of generality, we assumed m = 0. The property of non-negative definiteness characterizes covariance functions. Hence, given any function m : Rd → R and a non-negative definite function R : Rd × Rd → R, it is always possible to construct a random field for which m and R are the mean and covariance function, respectively.
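Non-negative definiteness can be probed numerically. A minimal sketch, assuming the squared-exponential kernel R(x, y) = exp(−(x − y)²) as the candidate covariance (my choice of example, not from the text): evaluate the quadratic form for random points and random coefficient vectors ξ and confirm it never drops below zero, up to rounding.

```python
import math
import random

random.seed(4)

def R(x, y):
    """Candidate covariance: the squared-exponential kernel (assumed example)."""
    return math.exp(-(x - y) ** 2)

k = 8
points = [random.uniform(-3, 3) for _ in range(k)]

def quadratic_form(xi):
    """The double sum ∑_l ∑_j ξ_l ξ_j R(x_l, x_j) from the text."""
    return sum(xi[l] * xi[j] * R(points[l], points[j])
               for l in range(k) for j in range(k))

# Probe with many random coefficient vectors; for a genuine covariance
# function the form can never go (more than numerically) negative.
min_form = min(quadratic_form([random.gauss(0, 1) for _ in range(k)])
               for _ in range(200))
```

A single negative value of the form would disqualify R as a covariance function; here none appears, consistent with the kernel being non-negative definite.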

Bochner’s Theorem: A continuous function R from Rd to the complex plane is non-negative definite if and only if it is the Fourier-Stieltjes transform of a measure F on Rd, that is, the representation

R(x) = ∫Rd eix·λ dF(λ)

holds for x ∈ Rd. Here, x·λ denotes the scalar product ∑k=1d xkλk and F is a bounded, real-valued function satisfying ∫A dF(λ) ≥ 0 ∀ measurable A ⊂ Rd
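Bochner's representation can be checked numerically for one concrete choice of F. A hedged sketch, assuming a standard Gaussian spectral measure in d = 1 (my own example; for this F the representation gives R(x) = exp(−x²/2) in closed form): approximate the integral by a midpoint sum, noting that by the symmetry of F only the cosine part survives.

```python
import math

def spectral_density(lam):
    """Standard normal density, playing the role of dF/dλ (assumed example)."""
    return math.exp(-lam ** 2 / 2) / math.sqrt(2 * math.pi)

def covariance_from_spectrum(x, lo=-10.0, hi=10.0, n=4000):
    """Midpoint-rule approximation of R(x) = ∫ e^{i x·λ} dF(λ);
    the imaginary part vanishes by symmetry, leaving the cosine integral."""
    d = (hi - lo) / n
    total = 0.0
    for k in range(n):
        lam = lo + (k + 0.5) * d
        total += math.cos(x * lam) * spectral_density(lam) * d
    return total
```

The numerical transform reproduces exp(−x²/2), and R(0) = 1 recovers the total mass of the spectral measure.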

The cross covariance function of two random fields Z1 and Z2 is defined as

R12(x, y) = E[Z1(x)Z2(y)] − m1(x)m2(y),

where m1 and m2 are the respective mean functions. Obviously, R12(x, y) = R21(y, x). A family of processes Zι with ι belonging to some index set I can be considered as a process in the product space (Rd, I).

A central concept in the study of random fields is that of homogeneity or stationarity. A random field is homogeneous or (second-order) stationary if E[Z(x)2] is finite ∀ x and

• m(x) ≡ m is independent of x ∈ Rd

• R(x, y) solely depends on the difference x − y

Thus we may consider R(h) = Cov(Z(x), Z(x+h)) = E[Z(x) Z(x+h)] − m2, h ∈ Rd,

and denote by R the covariance function of Z. In this case, the following correspondence exists between the covariance function and the correlation function:

c(h) = R(h)/R(0)

i.e. c(h) ∝ R(h). For this reason, attention is confined to either c or R. Two stationary random fields Z1, Z2 are stationarily correlated if their cross covariance function R12(x, y) depends on the difference x − y only. The two random fields are uncorrelated if R12 vanishes identically.
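The lag covariance R(h) of a stationary field can be estimated from a single realization. A sketch using an invented example, an MA(1) sequence Zt = εt + θεt−1 with i.i.d. standard normal εt, whose exact covariances are R(0) = 1 + θ², R(1) = θ and R(h) = 0 for h ≥ 2 (the model and all names are mine, not from the text):

```python
import random

random.seed(5)

# Invented stationary field on T = {1, 2, ...}: an MA(1) sequence.
theta = 0.6
n = 200_000
e = [random.gauss(0, 1) for _ in range(n + 1)]   # i.i.d. innovations
Z = [e[t + 1] + theta * e[t] for t in range(n)]  # Z_t = e_t + theta * e_{t-1}

def R_hat(h):
    """Empirical covariance at lag h from the single realization Z
    (mean is 0 by construction, so no centering term is needed)."""
    return sum(Z[t] * Z[t + h] for t in range(n - h)) / (n - h)

R0, R1, R2 = R_hat(0), R_hat(1), R_hat(2)
```

The three estimates land on 1 + θ², θ and 0 respectively; stationarity is what makes a single long realization informative about R.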

An interesting special class of homogeneous random fields that often arise in practice is the class of isotropic fields. These are characterized by the property that the covariance function R depends only on the length ∥h∥ of the vector h:

R(h) = R(∥h∥) .

In many applications, random fields are considered as functions of “time” and “space”. In this case, the parameter set is most conveniently written as (t,x) with t ∈ R+ and x ∈ Rd. Such processes are often homogeneous in (t, x) and isotropic in x in the sense that

E[Z(t, x)Z(t + h, x + y)] = R(h, ∥y∥) ,

where R is a function from R2 into R. In such a situation, the covariance function can be written as

R(t, ∥x∥) = ∫R ∫λ=0∞ eitu Hd(λ ∥x∥) dG(u, λ),

where

Hd(r) = (2/r)^((d − 2)/2) Γ(d/2) J(d − 2)/2(r)

and Jm is the Bessel function of the first kind of order m and G is a multiple of a distribution function on the half plane {(λ,u)|λ ≥ 0,u ∈ R}.
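The function Hd admits elementary closed forms in low dimensions, which serve as a sanity check on the formula: H1(r) = cos(r), H2(r) = J0(r) and H3(r) = sin(r)/r. A small sketch (the power-series Bessel evaluation is my own stdlib stand-in for a library routine such as scipy's):

```python
import math

def bessel_J(nu, r, terms=40):
    """Power-series evaluation of the Bessel function of the first kind:
    J_nu(r) = sum_k (-1)^k / (k! Γ(k + nu + 1)) (r/2)^(2k + nu), r > 0."""
    return sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1))
               * (r / 2) ** (2 * k + nu) for k in range(terms))

def H(d, r):
    """H_d(r) = (2/r)^((d - 2)/2) * Γ(d/2) * J_{(d-2)/2}(r), r > 0."""
    nu = (d - 2) / 2
    return (2 / r) ** nu * math.gamma(d / 2) * bessel_J(nu, r)
```

For d = 3 the half-integer Bessel function collapses to sin(r)/r, which is the familiar isotropic correlation kernel in three-dimensional space.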