# Probability Space Intertwines Random Walks – Thought of the Day 144.0

Many deliberations of stochasticity start with “let (Ω, F, P) be a probability space”. One can actually follow such discussions without having the slightest idea what Ω is and who lives inside. So, what is “Ω, F, P” and why do we need it? Indeed, for many users of probability and statistics, a random variable X is synonymous with its probability distribution μX and all computations such as sums, expectations, etc., done on random variables amount to analytical operations such as integrations, Fourier transforms, convolutions, etc., done on their distributions. For defining such operations, you do not need a probability space. Isn’t this all there is to it?

One can in fact compute quite a lot of things without using probability spaces in an essential way. However the notions of probability space and random variable are central in modern probability theory so it is important to understand why and when these concepts are relevant.

From a modelling perspective, the starting point is a set of observations taking values in some set E (think for instance of numerical measurement, E = R) for which we would like to build a stochastic model. We would like to represent such observations x1, . . . , xn as samples drawn from a random variable X defined on some probability space (Ω, F, P). It is important to see that the only natural ingredient here is the set E where the random variables will take their values: the set of events Ω is not given a priori and there are many different ways to construct a probability space (Ω, F, P) for modelling the same set of observations.

Sometimes it is natural to identify Ω with E, i.e., to identify the randomness ω with its observed effect. For example, if we consider the outcome of a die-rolling experiment as an integer-valued random variable X, we can define the set of events to be precisely the set of possible outcomes: Ω = {1, 2, 3, 4, 5, 6}. In this case, X(ω) = ω: the outcome of the randomness is identified with the randomness itself. This choice of Ω is called the canonical space for the random variable X. In this case the random variable X is simply the identity map X(ω) = ω and the probability measure P is formally the same as the distribution of X. Note that here X is a one-to-one map: given the outcome of X one knows which scenario has happened, so any other random variable Y is completely determined by the observation of X. Therefore, using the canonical construction for the random variable X, we cannot define, on the same probability space, another (non-constant) random variable which is independent of X: X will be the sole source of randomness for all other variables in the model. This also shows that, although the canonical construction is the simplest way to construct a probability space for representing a given random variable, it forces us to identify this particular random variable with the “source of randomness” in the model. Therefore, when we want to deal with models with a sufficiently rich structure, we need to distinguish Ω – the set of scenarios of randomness – from E, the set of values of our random variables.
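The claim that the canonical space leaves no room for an independent second variable can be checked by brute force. The following is a minimal sketch, assuming a fair die and restricting attention to {0, 1}-valued variables Y; the helper names `prob` and `independent` are illustrative and not taken from any library. It enumerates all 64 such Y on Ω = {1, ..., 6} and keeps those that are independent of X.

```python
from fractions import Fraction
from itertools import product

# Canonical space for a fair die: Omega is the set of outcomes, X is the identity.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    return w

def prob(event):
    """Probability of the set of scenarios w for which event(w) is true."""
    return sum(P[w] for w in omega if event(w))

def independent(U, V):
    """Check P(U=u, V=v) == P(U=u) * P(V=v) for every pair of values."""
    return all(
        prob(lambda w: U(w) == u and V(w) == v)
        == prob(lambda w: U(w) == u) * prob(lambda w: V(w) == v)
        for u in {U(w) for w in omega}
        for v in {V(w) for w in omega}
    )

# Enumerate every {0, 1}-valued random variable Y on Omega (2**6 = 64 of them).
independent_of_X = []
for values in product((0, 1), repeat=len(omega)):
    table = dict(zip(omega, values))
    Y = lambda w, t=table: t[w]
    if independent(X, Y):
        independent_of_X.append(values)

print(independent_of_X)
```

Only the two constant variables survive the check, which are trivially independent of everything. Any Y that actually varies is a deterministic function of X and therefore fails.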

Let us give an example where it is natural to distinguish the source of randomness from the random variable itself. For instance, if one is modelling the market value of a stock at some date T in the future as a random variable S1, one may consider that the stock value is affected by many factors such as external news, market supply and demand, economic indicators, etc., summed up in some abstract variable ω, which may not even have a numerical representation: it corresponds to a scenario for the future evolution of the market. S1(ω) is then the stock value if the market scenario which occurs is given by ω. If the only interesting quantity in the model is the stock price then one can always label the scenario ω by the value of the stock price S1(ω), which amounts to identifying all scenarios where the stock S1 takes the same value and using the canonical construction. However if one considers a richer model where there are now other stocks S2, S3, . . . involved, it is more natural to distinguish the scenario ω from the random variables S1(ω), S2(ω),… whose values are observed in these scenarios but may not completely pin them down: knowing S1(ω), S2(ω),… one does not necessarily know which scenario has happened. In this way one reserves the possibility of adding more random variables later on without changing the probability space.

These observations have the following important consequence: the probabilistic description of a random variable X can be reduced to the knowledge of its distribution μX only in the case where the random variable X is the only source of randomness. In this case, a stochastic model can be built using a canonical construction for X. In all other cases – as soon as we are concerned with a second random variable which is not a deterministic function of X – the underlying probability measure P contains more information on X than just its distribution. In particular, it contains all the information about the dependence of the random variable X with respect to all other random variables in the model: specifying P means specifying the joint distributions of all random variables constructed on Ω. For instance, knowing the distributions μX, μY of two variables X, Y does not allow one to compute their covariance or joint moments. Only in the case where all random variables involved are mutually independent can one reduce all computations to operations on their distributions. This is the case covered in most introductory texts on probability, which explains why one can go quite far, for example in the study of random walks, without formalizing the notion of probability space.
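To make the point about marginals concrete, here is a small sketch with two hypothetical joint laws for a pair of fair coins (the numbers are chosen purely for illustration). Both have identical marginal distributions μX and μY, yet their covariances differ, so the covariance cannot be a function of the marginals alone.

```python
from itertools import product

def marginal(joint, axis):
    """Marginal pmf of coordinate `axis` of a joint pmf given as {(x, y): prob}."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out

def covariance(joint):
    """Cov(X, Y) = E[XY] - E[X] E[Y] under the joint pmf."""
    ex = sum(x * p for (x, y), p in joint.items())
    ey = sum(y * p for (x, y), p in joint.items())
    exy = sum(x * y * p for (x, y), p in joint.items())
    return exy - ex * ey

# Joint law 1: X and Y are independent fair coins.
independent_coins = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}
# Joint law 2: Y is a copy of X, so the pair is perfectly dependent.
copied_coin = {(0, 0): 0.5, (1, 1): 0.5}

for name, joint in (("independent", independent_coins), ("copied", copied_coin)):
    print(name, marginal(joint, 0), marginal(joint, 1), covariance(joint))
```

The two joint laws correspond to two different measures P on the same space of pairs, which is exactly the extra information that the distributions of X and Y by themselves do not carry.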

# Bayesianism in Game Theory. Thought of the Day 24.0

Bayesianism in game theory can be characterised as the view that it is always possible to define probabilities for anything that is relevant for the players’ decision-making. In addition, it is usually taken to imply that the players use Bayes’ rule for updating their beliefs. If the probabilities are to be always definable, one also has to specify what players’ beliefs are before the play is supposed to begin. The standard assumption is that such prior beliefs are the same for all players. This common prior assumption (CPA) means that the players have the same prior probabilities for all those aspects of the game for which the description of the game itself does not specify different probabilities. Common priors are usually justified with the so called Harsanyi doctrine, according to which all differences in probabilities are to be attributed solely to differences in the experiences that the players have had. Different priors for different players would imply that there are some factors that affect the players’ beliefs even though they have not been explicitly modelled. The CPA is sometimes considered to be equivalent to the Harsanyi doctrine, but there seems to be a difference between them: the Harsanyi doctrine is best viewed as a metaphysical doctrine about the determination of beliefs, and it is hard to see why anybody would be willing to argue against it: if everything that might affect the determination of beliefs is included in the notion of ‘experience’, then it alone does determine the beliefs. The Harsanyi doctrine has some affinity to some convergence theorems in Bayesian statistics: if individuals are fed with similar information indefinitely, their probabilities will ultimately be the same, irrespective of the original priors.
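The convergence idea can be illustrated with a toy calculation. The sketch below assumes a setting that the text does not specify: two agents hold different Beta priors about the bias of a coin, observe the same flips, and update by Bayes’ rule (for a Beta prior this reduces to adding the observed counts). The priors, the true bias and the seed are all hypothetical.

```python
import random

def posterior_mean(a, b, heads, tails):
    """Mean of the Beta(a + heads, b + tails) posterior for the coin's bias."""
    return (a + heads) / (a + b + heads + tails)

rng = random.Random(0)      # fixed seed so the run is repeatable
true_bias = 0.7             # hypothetical state of the world
flips = [rng.random() < true_bias for _ in range(10_000)]

optimist = (9, 1)           # Beta(9, 1) prior, prior mean 0.9
pessimist = (1, 9)          # Beta(1, 9) prior, prior mean 0.1

def gap_after(n):
    """Distance between the two posterior means after the first n shared flips."""
    heads = sum(flips[:n])
    tails = n - heads
    return abs(posterior_mean(*optimist, heads, tails)
               - posterior_mean(*pessimist, heads, tails))

for n in (0, 10, 100, 1_000, 10_000):
    print(n, round(gap_after(n), 4))
```

For these two priors the gap equals 8 / (10 + n) whatever the flips turn out to be, so the disagreement shrinks towards zero as the shared evidence accumulates.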

The CPA, however, is a methodological injunction to include everything that may affect the players’ behaviour in the game: not just everything that motivates the players, but also everything that affects the players’ beliefs should be explicitly modelled by the game. If players had different priors, this would mean that the game structure would not be completely specified, because there would be differences in players’ behaviour that are not explained by the model. In a dispute over the status of the CPA, Faruk Gul essentially argues that the CPA does not follow from the Harsanyi doctrine. He does this by distinguishing between two different interpretations of the common prior, the ‘prior view’ and the ‘infinite hierarchy view’. The former is a genuinely dynamic story in which it is assumed that there really is a prior stage in time. The latter framework refers to Mertens and Zamir’s construction in which prior beliefs can be consistently formulated. This framework, however, is static in the sense that the players do not have any information on a prior stage; indeed, the ‘priors’ in this framework do not even pin down a player’s priors for his own types. Thus, the existence of a common prior in the latter framework does not have anything to do with the view that differences in beliefs reflect differences in information only.

Everyone agrees that for most (real-world) problems there is no prior stage in which the players know each other’s beliefs, let alone that they would be the same. The CPA, if understood as a modelling assumption, is clearly false. Robert Aumann, however, defends the CPA by arguing that whenever there are differences in beliefs, there must have been a prior stage in which the priors were the same, and from which the current beliefs can be derived by conditioning on the differentiating events. If players differ in their present beliefs, they must have received different information at some previous point in time, and they must have processed this information correctly. Based on this assumption, he further argues that players cannot ‘agree to disagree’: if a player knows that his opponents’ beliefs are different from his own, he should revise his beliefs to take the opponents’ information into account. The only case where the CPA would be violated, then, is when players have different beliefs, and have common knowledge about each other’s different beliefs and about each other’s epistemic rationality. Aumann’s argument seems perfectly legitimate if it is taken as a metaphysical one, but we do not see how it could be used as a justification for using the CPA as a modelling assumption in this or that application of game theory, and Aumann does not argue that it should.
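The idea that present beliefs are derived from a common prior by conditioning on differentiating events can be sketched on a toy example. Everything below is an illustrative assumption: four equally likely states, two players whose private information is modelled as a partition of the states, and an event whose probability they each assess.

```python
from fractions import Fraction

states = [1, 2, 3, 4]
prior = {s: Fraction(1, 4) for s in states}   # the common prior

# Each player learns only which cell of his own partition contains the true state.
partition_1 = [{1, 2}, {3, 4}]
partition_2 = [{1, 2, 3}, {4}]
event = {1, 4}

def cell_of(partition, state):
    """The information cell that a player observes when `state` occurs."""
    return next(cell for cell in partition if state in cell)

def posterior(partition, state, event):
    """Bayes' rule: prior probability of the event conditioned on the observed cell."""
    cell = cell_of(partition, state)
    return sum(prior[s] for s in cell & event) / sum(prior[s] for s in cell)

true_state = 1
belief_1 = posterior(partition_1, true_state, event)
belief_2 = posterior(partition_2, true_state, event)
print(belief_1, belief_2)
```

The two players start from the same prior and still end up with different beliefs, purely because they conditioned on different information. The sketch does not model the further step in the argument, where each player revises his belief on learning the other’s.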

# Forward, Futures Contracts and Options: Top Down or Bottom Up Modeling?

The simulation of financial markets can be modeled, from a theoretical viewpoint, according to two separate approaches: a bottom up approach and (or) a top down approach. For instance, the modeling of financial markets starting from diffusion equations and adding a noise term to the evolution of a function of a stochastic variable is a top down approach. This type of description is, effectively, a statistical one.
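A top down description of this kind can be sketched in a few lines. The example below assumes a specific diffusion, geometric Brownian motion dS = μ S dt + σ S dW, discretised with the Euler-Maruyama scheme; the text does not name a particular equation, so the model, the parameter values and the function name `simulate_path` are all illustrative choices.

```python
import math
import random

def simulate_path(s0, mu, sigma, horizon, n_steps, rng):
    """Euler-Maruyama discretisation of dS = mu * S dt + sigma * S dW."""
    dt = horizon / n_steps
    path = [s0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over one step
        s = path[-1]
        # deterministic drift plus the added noise term
        path.append(s + mu * s * dt + sigma * s * dw)
    return path

# Hypothetical parameters: 5% drift, 20% volatility, one year of 252 daily steps.
path = simulate_path(100.0, 0.05, 0.2, 1.0, 252, random.Random(42))
print(len(path), path[0], path[-1])
```

Nothing in this model refers to individual traders: the whole market is summarised by two statistical parameters, which is what makes the description top down.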

A bottom up approach, instead, is the modeling of artificial markets using complex data structures (agent based simulations) and general updating rules to describe the collective state of the market. The number of procedures implemented in the simulations can be quite large, although the computational cost of the simulation becomes forbidding as the size of each agent increases. Readers familiar with Sugarscape models and the computational strategies based on Growing Artificial Societies probably have an idea of the enormous potentialities of the field. All Sugarscape models include the agents (inhabitants), the environment (a two-dimensional grid) and the rules governing the interaction of the agents with each other and the environment. The original model presented by J. Epstein & R. Axtell (considered as the first large scale agent model) is based on a 51 x 51 cell grid, where every cell can contain different amounts of sugar (or spice). In every step agents look around, find the closest cell filled with sugar, move and metabolize. They can leave pollution, die, reproduce, inherit sources, transfer information, trade or borrow sugar, generate immunity or transmit diseases – depending on the specific scenario and variables defined at the set-up of the model. Sugar in the simulation can be seen as a metaphor for resources in an artificial world through which the examiner can study the effects of social dynamics such as evolution, marital status and inheritance on populations. Exact simulation of the original rules provided by J. Epstein & R. Axtell in their book can be problematic and it is not always possible to recreate the same results as those presented in Growing Artificial Societies. However, one would expect that the bottom up description should become comparable to the top down description for a very large number of simulated agents.
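The look, move and metabolize cycle can be sketched as follows. This is a deliberately stripped-down illustration and not a reproduction of the Epstein and Axtell rules: the sugar landscape is random, sugar never grows back, the grid wraps around at the edges, and the agent attributes (`wealth`, `vision`, `metabolism`) and their ranges are hypothetical.

```python
import random

SIZE = 51  # side of the grid, matching the 51 x 51 grid mentioned above
DIRECTIONS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def make_world(rng, n_agents=100, max_sugar=4):
    """Random sugar landscape plus agents on distinct cells (all values hypothetical)."""
    sugar = [[rng.randint(0, max_sugar) for _ in range(SIZE)] for _ in range(SIZE)]
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    agents = [
        {"pos": pos, "wealth": 5,
         "vision": rng.randint(1, 4), "metabolism": rng.randint(1, 3)}
        for pos in rng.sample(cells, n_agents)
    ]
    return sugar, agents

def step(sugar, agents):
    """One round: each agent looks, moves, harvests and metabolizes; returns survivors."""
    occupied = {agent["pos"] for agent in agents}
    survivors = []
    for agent in agents:
        r, c = agent["pos"]
        # Default is to stay put; a candidate wins with more sugar, then by being nearer.
        best, best_key = (r, c), (sugar[r][c], 0)
        for dr, dc in DIRECTIONS:
            for dist in range(1, agent["vision"] + 1):
                cell = ((r + dr * dist) % SIZE, (c + dc * dist) % SIZE)
                if cell in occupied:
                    continue
                key = (sugar[cell[0]][cell[1]], -dist)
                if key > best_key:
                    best, best_key = cell, key
        occupied.discard((r, c))
        occupied.add(best)
        agent["pos"] = best
        # Harvest everything on the new cell, then pay the metabolic cost.
        agent["wealth"] += sugar[best[0]][best[1]] - agent["metabolism"]
        sugar[best[0]][best[1]] = 0
        if agent["wealth"] > 0:
            survivors.append(agent)
        else:
            occupied.discard(best)   # the agent starves and frees its cell
    return survivors

rng = random.Random(0)
sugar, agents = make_world(rng)
history = [len(agents)]
for _ in range(20):
    agents = step(sugar, agents)
    history.append(len(agents))
print(history)
```

Even in this reduced form, the aggregate quantity of interest (here the population over time) is not specified anywhere in the code; it emerges from the local rules, which is the defining feature of the bottom up approach.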

The bottom up approach should also provide a better description of extreme events, such as crashes, collectively conditioned behaviour and market incompleteness, this approach being of purely algorithmic nature. A top down approach is, therefore, a model of reduced complexity and follows a statistical description of the dynamics of complex systems.

Forward, Futures Contracts and Options: Let the price at time t of a security be S(t). A specific good can be traded at time t at the price S(t) between a buyer and a seller. The seller (short position) agrees to sell the goods to the buyer (long position) at some time T in the future at a price F(t,T) (the contract price). Notice that contract prices have a 2-time dependence (actual time t and maturity time T). Their difference τ = T − t is usually called time to maturity. Equivalently, the actual price of the contract is determined by the prevailing actual prices and interest rates and by the time to maturity. Entering into a forward contract requires no money, and the value of the contract for long position holders and short position holders at maturity T will be

$$(-1)^{p} \left( S(T) - F(t,T) \right) \tag{1}$$

where p = 0 for long positions and p = 1 for short positions. Futures Contracts are similar, except that after the contract is entered, any changes in the market value of the contract are settled by the parties. Hence, the cashflows occur all the way to expiry unlike in the case of the forward where only one cashflow occurs. They are also highly regulated and involve a third party (a clearing house). Forward, futures contracts and options go under the name of derivative products, since their contract price F(t, T) depends on the value of the underlying security S(T). Options are derivatives that can be written on any security and have a more complicated payoff function than the futures or forwards. For example, a call option gives the buyer (long position) the right (but not the obligation) to buy the security at some predetermined strike-price at maturity. A payoff function is the precise form of the price. Path dependent options are derivative products whose value depends on the actual path followed by the underlying security up to maturity. In the case of path-dependent options, since the payoff may not be directly linked to an explicit right, they must be settled by cash. This is sometimes true for futures and plain options as well, since cash settlement is more efficient.
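The payoffs discussed here can be written as small functions. The forward payoff follows formula (1) directly. The call payoff max(S(T) − K, 0) is the standard form for the right to buy at strike K. The arithmetic-average (Asian) call is included only as one common example of a path-dependent payoff; the text does not single out any particular one, and all function and parameter names are illustrative.

```python
def forward_payoff(s_T, contract_price, p):
    """Formula (1): value at maturity; p = 0 for a long position, p = 1 for a short one."""
    return (-1) ** p * (s_T - contract_price)

def call_payoff(s_T, strike):
    """Call option: the holder buys at the strike only when that is worthwhile."""
    return max(s_T - strike, 0.0)

def asian_call_payoff(path, strike):
    """Path-dependent example: a call written on the average price along the path."""
    average = sum(path) / len(path)
    return max(average - strike, 0.0)

print(forward_payoff(110.0, 100.0, 0))   # long forward gains when S(T) > F(t,T)
print(forward_payoff(110.0, 100.0, 1))   # the short side loses the same amount
print(call_payoff(90.0, 100.0))          # option left unexercised
print(asian_call_payoff([90.0, 100.0, 110.0, 120.0], 100.0))
```

The forward payoff is linear and can be negative, while the option payoffs are floored at zero because the holder has a right and not an obligation.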

# Simulations of Representations: Rational Calculus versus Empirical Weights

While modeling a complex system, it should never be taken for granted that these models somehow simplify the systems, for that would only strip the models of the capability to account for encoding, decoding, and retaining information that are sine qua non for the environment they plan to model, and the environment that these models find themselves embedded in. Now that the traditional problems of representation are fraught with loopholes, there needs to be a way to jump out of this quandary, if the modeling of complex systems is not to be impacted by the traces of these very traditional notions of representation. The employment of post-structuralist theories is surely indicative of getting rid of the symptoms, since they score over the analytical tradition, where representation is only an analogue of the thing represented, whereas simulation, with its affinity to French theory, is conducive to a distributed and holistic analogy. Any argument against representation is not to be taken as anti-scientific, since it is merely an argument against a particular scientific methodology and/or strategy that assumes complexity to be reducible, and therefore implementable or representable in a machine. The argument takes force only as an appreciation for the nature of complexity, something that could perhaps be repeated in a machine, should the machine itself be complex enough to cope with the distributed character of complexity. Representation is a state that stands in for some other state, and hence is nothing short of “essentially” about meaning. The language and thought incorporated in understanding the world we are embedded in are efficacious only if representation relates to the world, and therefore “relationship” is another pillar of representation. Unless a relationship relates the two, one gets only an abstracted version of the so-called identities in themselves with no explanatory discourse.

In the world of complexity, such identity-based abstractions lose their essence, for modeling takes over the onus of explanation, and therefore it is, without doubt, the establishment of these relations that bring together states of representation that takes high priority. Representation holds a central value both in formal systems and in neural networks or connectionism, where the former is characterized by a rational calculus, and the latter by patterns that operate over the network, lending it a more empirical weight.

Let logic programming be the starting point for deliberations here. The idea behind it is to apply mathematical logic to computer programming. When logic is used in this way, it is used as a declarative representational language; declarative because the logic of a computation is expressed without accounting for the flow of control. In other words, within this language, the question is centered around what-ness, rather than how-ness. Declarative representation has a counterpart in procedural representation, where the onus is on procedures, functions, routines and methods. Procedural representation is more algorithmic in nature, as it depends upon following steps to carry out computation. In other words, the question is centered around how-ness. But logic programming as it is commonly understood cannot do without both of them becoming a part of the programming language at the same time. Since both of them are required, propositional logic that deals primarily with declarative representational languages would not suffice all alone, and hence, what is required is a logic that would touch upon predicates as well. This is made possible by first-order predicate logic, which distinguishes itself from propositional logic by its use of quantifiers(1). Predicate logic thus finds its applications suited to the deductive apparatus of formal systems, where axioms and rules of inference are instrumental in deriving theorems that guide these systems. This setup is too formal in character and thus calls for a connectionist approach, since the latter is simply not keen to have predicate logic operate over the deductive apparatus of a formal system.
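What quantifiers add over propositional logic can be shown on a finite domain of discourse. The sketch below is only an illustration in Python and not a logic programming system: the domain, the predicates `human` and `mortal`, and the helpers `forall` and `exists` are all hypothetical. Propositional logic would have to treat “every human is mortal” as one unanalysed atom, whereas here it is built from predicates and a bound variable.

```python
# Domain of discourse and two predicates given by their extensions (toy data).
domain = ["socrates", "plato", "zeus"]
human = {"socrates", "plato"}
mortal = {"socrates", "plato"}

def forall(domain, predicate):
    """Universal quantifier: the predicate holds for every x in the domain."""
    return all(predicate(x) for x in domain)

def exists(domain, predicate):
    """Existential quantifier: the predicate holds for at least one x in the domain."""
    return any(predicate(x) for x in domain)

# For all x: Human(x) implies Mortal(x), with the implication written as (not A) or B.
every_human_is_mortal = forall(domain, lambda x: (x not in human) or (x in mortal))
# There exists x such that not Mortal(x).
something_is_immortal = exists(domain, lambda x: x not in mortal)

print(every_human_is_mortal, something_is_immortal)
```

The two statements say what is the case over the domain without prescribing an order of evaluation, which is the declarative side; the loops hidden inside `all` and `any` are the procedural side that actually carries out the check.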

If brain and language (natural language and not computer languages, which are more rule-based and hence strict) as complex systems could be shown to have circumvented representationism via modeling techniques, the classical issues inherent in representation would be gotten rid of in the sense of a problematic. Functionalism as the prevalent theory in philosophy of mind that parallels computational model is the target here. In the words of Putnam,

I may have been the first philosopher to advance the thesis that the computer is the right model for mind. I gave my form of this doctrine the name ‘functionalism’, and under this name, it has become the dominant view – some say the orthodoxy – in contemporary philosophy of mind.

The computer metaphor for mind is clearly visible here, with the former having a hardware apparatus that is operated upon by software programs, while the latter shares the same relation with brain (hardware) and mind (software). So far, so good, but there is a hitch. Like the computer side of the metaphor, where software can be loaded onto different hardware, provided there is enough computational capability possessed by the hardware, the mind-brain relationship should meet the same criteria as well. If one goes by what Sterelny has hinted for functionalism as a certain physical state of the machine realizing a certain functional state, then a couple of descriptions, mutually exclusive of one another, result, viz., a description on the physical level, and a description on the mental level. The consequences of such descriptions are bizarre to the extent that mind as software can also find its implementation on any other hardware, provided the conditions for the hardware’s capability to run the software are met successfully. One could hardly argue against these consequences that follow logically enough from the premisses, but a couple of blocks are not to be ignored at the same time, viz., the adequacy of the physical systems to implement the functional states, and what defines the relationships between these two mutually exclusive descriptions under the context of the same physical system. Sterelny comes up with a couple of criteria for adequate physical systems: designed, and teleological. Rather than provide any support for what he means by the systems as designed, he comes up with evolutionary tendencies, thus vouching for an external designer. The second one gets disturbing, if there is no description made, and this is precisely what Sterelny never offers. His citation of a bucket of water not having a telos in the sense of brain having one only makes matters slide into metaphysics.

Even otherwise, functionalism as a theory of the nature of mental states is metaphysical and ontological in import. This claim gets all the more highlighted if one believes, following Brentano, that intentionality is the mark of the mental; then any theory of intentionality can be converted into a theory of the ontological nature of psychological states. Getting back to the second description of Sterelny, functional states attain meaning if they stand for something else, hence functionalism gets representational. And as Paul Cilliers says cogently, the grammatical structure of the language represents semantical content, and the neurological states of the brain represent certain mental states, thus placing, without doubt, the responsibility on representation for establishing a link between the states of the system and conceptual meaning. This is again echoed in Sterelny,

There can be no informational sensitivity without representation. There can be no flexible and adaptive response to the world without representation. To learn about the world, and to use what we learn to act in new ways, we must be able to represent the world, our goals and options. Furthermore we must make appropriate inferences from these representations.

As representation is essentially about meaning, two levels are to be related with one another for any meaning to be possible. In formal systems, or the rule-based approach, these relations are provided by creating a nexus between a “symbol” and what it “symbolizes”. This fundamental linkage is offered by Fodor in his 1975 book, The Language of Thought. The main thesis of the book is that cognition and cognitive processes are remotely plausible only when expressed computationally in terms of representational systems. This language, in possession of its own syntactic and semantic structures and independent of any medium, exhibits a causal effect on mental representations. He terms such a language “mentalese”; it is implemented in the neural structure (a case in point for internal representation(2)) and, following permutations, allows complex thoughts to be built up from simpler ones. The underlying hypothesis states that such a language applies to thoughts having propositional content, implying that thoughts have syntax. In order for complex thoughts to be generated, simple concepts are attached to the most basic linguistic tokens, which combine following rules of logic (combinatorial rules). The language thus enriched is not only productive, in that sentences can (potentially) get longer without altering the meaning (concatenation), but also structured, in that rules of grammar allow us to make inferences about linguistic elements previously unrelated. Once this task is accomplished, the representational theory of thought steps in to explicate the essence of tokens and how they behave and relate. The representational theory of thought validates mental representations that stand in uniquely for a subject of representation having a specific content to itself, to allow for causally generated complex thought. Sterelny echoes this when he says,

For this model, and based on it, require an agent to represent the world as it is and as it might be, and to draw appropriate inferences from that representation. Fodor argues that the agent must have a language-like symbol system, for she can represent indefinitely many and indefinitely complex actual and possible states of her environment. She could not have this capacity without an appropriate means of representation, a language of thought. Mentalese thus is too rationalist in its approach, and hence in opposition to neural networks or connectionism. As there can be no possible cognitive processes without mental representations, the theory has many takers(3). One line of thought that supports this approach is the plausibility of psychological models that represent cognitive processes as representational thereby inviting computational thought to compute.

(1) Quantifier is an operator that binds a variable over a domain of discourse. The domain of discourse in turn specifies the range of these relevant variables.

(2) Internal representation helps us visualize our movements in the world and our embeddedness in the world. Internal representation takes it for granted that organisms inherently have such an attribute to have any cognition whatsoever. The plus point as in the work of Fodor is the absence of any other theory that successfully negotiates or challenges the very inherent-ness of internal representation.

(3) Tim Crane is a notable figure here. Crane explains Fodor’s Mentalese Hypothesis as desiring one thing and something else. Crane returns to the question of why we should believe the vehicle of mental representation is a language. Crane states that while he agrees with Fodor, his method of reaching the conclusion is very different. Crane goes on to say that reason, our ability as humans to make a rational decision from the information given, is his argument for this question. Association of ideas leads to other ideas which only have a connection for the thinker. Fodor agrees that free association goes on, but he says that it does so in a systematic, rational way that can be shown to work with the Language of Thought theory. Fodor states that one must look at it in a computational manner, that this allows it to be seen in a different light than normal, and that free association follows a certain manner that can be broken down and explained with the Language of Thought.