# Cadlag Stochasticities: Lévy Processes. Part 1.

Two canonical examples: a compound Poisson process with a Gaussian distribution of jump sizes, and a jump diffusion – a Lévy process with a Gaussian component and finite jump intensity.

A cadlag stochastic process (Xt)t≥0 on (Ω,F,P) with values in Rd such that X0 = 0 is called a Lévy process if it possesses the following properties:

1. Independent increments: for every increasing sequence of times t0 ≤ t1 ≤ … ≤ tn, the random variables Xt0, Xt1 − Xt0, …, Xtn − Xtn−1 are independent.

2. Stationary increments: the law of Xt+h − Xt does not depend on t.

3. Stochastic continuity: ∀ε > 0, limh→0 P(|Xt+h − Xt| ≥ ε) = 0.

A sample function x on a well-ordered set T is cadlag if it is continuous from the right and has limits from the left at every point. That is, for every t0 ∈ T, t ↓ t0 implies x(t) → x(t0), and for t ↑ t0 the limit limt↑t0 x(t) exists, but need not equal x(t0). A stochastic process X is cadlag if almost all its sample paths are cadlag.

The third condition does not imply in any way that the sample paths are continuous; it is satisfied, for example, by the Poisson process. It serves to exclude processes with jumps at fixed (nonrandom) times, which can be regarded as “calendar effects”, and it means that for any given time t, the probability of seeing a jump exactly at t is zero: discontinuities occur at random times.

If we sample a Lévy process at regular time intervals 0, ∆, 2∆, . . ., we obtain a random walk: defining Sn(∆) ≡ Xn∆, we can write Sn(∆) = ∑k=0n−1 Yk where Yk = X(k+1)∆ − Xk∆ are independent and identically distributed random variables whose common distribution is that of X∆. Since this can be done for any sampling interval ∆, we see that by specifying a Lévy process one specifies a whole family of random walks Sn(∆).

Choosing n∆ = t, we see that for any t > 0 and any n ≥ 1, Xt = Sn(∆) can be represented as a sum of n independent and identically distributed random variables whose distribution is that of Xt/n: Xt can be “divided” into n independent and identically distributed parts. A distribution having this property is said to be infinitely divisible.
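The random-walk view suggests a direct way to simulate a Lévy path on a grid: draw the n increments Yk independently from the law of X∆. Below is a minimal sketch for the jump-diffusion case mentioned at the top (Brownian part plus compound Poisson with Gaussian jump sizes); all parameter values and function names are illustrative, not taken from the text.

```python
import math
import random

def sample_poisson(rate):
    """Knuth's method; adequate for the small rates lam*dt used here."""
    L = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

def jump_diffusion_path(T=1.0, n=1000, sigma=0.4, lam=5.0,
                        jump_mu=0.0, jump_sigma=0.3, seed=7):
    """Simulate X on the grid 0, dt, 2dt, ..., T by summing i.i.d.
    increments: a Gaussian diffusion part plus a compound Poisson
    part with Gaussian jump sizes (finite jump intensity lam)."""
    random.seed(seed)
    dt = T / n
    path = [0.0]                                        # X_0 = 0
    for _ in range(n):
        inc = random.gauss(0.0, sigma * math.sqrt(dt))  # diffusion part
        for _ in range(sample_poisson(lam * dt)):       # jumps in this interval
            inc += random.gauss(jump_mu, jump_sigma)
        path.append(path[-1] + inc)
    return path

path = jump_diffusion_path()
```

Because the increments over disjoint intervals are drawn independently with a common law, the sampled points Xk∆ form exactly the random walk Sn(∆) described above.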

A probability distribution F on Rd is said to be infinitely divisible if for any integer n ≥ 2, ∃ n independent and identically distributed random variables Y1, …Yn such that Y1 + … + Yn has distribution F.

Since the distribution of a sum of independent random variables is given by the convolution of the distributions of the summands, denoting by μ the common distribution of the Yk, we have F = μ ∗ μ ∗ · · · ∗ μ, the n-fold convolution of μ. So an infinitely divisible distribution can also be defined as a distribution F whose nth convolution root is still a probability distribution, for any n ≥ 2.
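This convolution characterization can be checked concretely for the Poisson law, which is infinitely divisible: the n-fold convolution of Poisson(λ/n) is again Poisson, with parameter λ. A small numerical sketch (the values of λ, n and the truncation point are arbitrary choices for the example):

```python
import math

def poisson_pmf(lam, kmax):
    """P(N = k) for k = 0..kmax, with N ~ Poisson(lam)."""
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(kmax + 1)]

def convolve(a, b):
    """Distribution of the sum of two independent nonnegative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

lam, n, kmax = 3.0, 4, 20
mu = poisson_pmf(lam / n, kmax)      # law of each summand Y_k
f = mu
for _ in range(n - 1):
    f = convolve(f, mu)              # mu * mu * mu * mu  (n-fold convolution)
target = poisson_pmf(lam, kmax)
err = max(abs(f[k] - target[k]) for k in range(kmax + 1))
```

Entries 0..kmax of the n-fold convolution only involve summand values up to kmax, so the comparison is exact up to floating-point rounding.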

Thus, if X is a Lévy process, for any t > 0 the distribution of Xt is infinitely divisible. This puts a constraint on the possible choices of distributions for Xt: whereas the increments of a discrete-time random walk can have arbitrary distribution, the distribution of increments of a Lévy process has to be infinitely divisible.

The most common examples of infinitely divisible laws are the Gaussian distribution, the gamma distribution, α-stable distributions and the Poisson distribution: a random variable having any of these distributions can be decomposed into a sum of n independent and identically distributed parts having the same type of distribution, but with modified parameters. Conversely, given an infinitely divisible distribution F, it is easy to see that for any n ≥ 1, by chopping it into n independent and identically distributed components, we can construct a random walk model on a time grid with step size 1/n such that the law of the position at t = 1 is given by F. In the limit, this procedure can be used to construct a continuous-time Lévy process (Xt)t≥0 such that the law of X1 is given by F. To summarize: let (Xt)t≥0 be a Lévy process. Then for every t, Xt has an infinitely divisible distribution. Conversely, if F is an infinitely divisible distribution then ∃ a Lévy process (Xt) such that the distribution of X1 is given by F.

# Probability Space Intertwines Random Walks – Thought of the Day 144.0

Many deliberations of stochasticity start with “let (Ω, F, P) be a probability space”. One can actually follow such discussions without having the slightest idea what Ω is and who lives inside. So, what is “Ω, F, P” and why do we need it? Indeed, for many users of probability and statistics, a random variable X is synonymous with its probability distribution μX and all computations such as sums, expectations, etc., done on random variables amount to analytical operations such as integrations, Fourier transforms, convolutions, etc., done on their distributions. For defining such operations, you do not need a probability space. Isn’t this all there is to it?

One can in fact compute quite a lot of things without using probability spaces in an essential way. However, the notions of probability space and random variable are central in modern probability theory, so it is important to understand why and when these concepts are relevant.

From a modelling perspective, the starting point is a set of observations taking values in some set E (think, for instance, of numerical measurements: E = R) for which we would like to build a stochastic model. We would like to represent such observations x1, . . . , xn as samples drawn from a random variable X defined on some probability space (Ω, F, P). It is important to see that the only natural ingredient here is the set E where the random variables will take their values: the set of events Ω is not given a priori, and there are many different ways to construct a probability space (Ω, F, P) for modelling the same set of observations.

Sometimes it is natural to identify Ω with E, i.e., to identify the randomness ω with its observed effect. For example, if we consider the outcome of a die-rolling experiment as an integer-valued random variable X, we can define the set of events to be precisely the set of possible outcomes: Ω = {1, 2, 3, 4, 5, 6}. In this case, X(ω) = ω: the outcome of the randomness is identified with the randomness itself. This choice of Ω is called the canonical space for the random variable X. In this case the random variable X is simply the identity map X(ω) = ω and the probability measure P is formally the same as the distribution of X. Note that here X is a one-to-one map: given the outcome of X, one knows which scenario has happened, so any other random variable Y is completely determined by the observation of X. Therefore, using the canonical construction for the random variable X, we cannot define, on the same probability space, another random variable which is independent of X: X will be the sole source of randomness for all other variables in the model. This also shows that, although the canonical construction is the simplest way to construct a probability space for representing a given random variable, it forces us to identify this particular random variable with the “source of randomness” in the model. Therefore, when we want to deal with models with a sufficiently rich structure, we need to distinguish Ω – the set of scenarios of randomness – from E, the set of values of our random variables.

Let us give an example where it is natural to distinguish the source of randomness from the random variable itself. For instance, if one is modelling the market value of a stock at some date T in the future as a random variable S1, one may consider that the stock value is affected by many factors such as external news, market supply and demand, economic indicators, etc., summed up in some abstract variable ω, which may not even have a numerical representation: it corresponds to a scenario for the future evolution of the market. S1(ω) is then the stock value if the market scenario which occurs is given by ω. If the only interesting quantity in the model is the stock price then one can always label the scenario ω by the value of the stock price S1(ω), which amounts to identifying all scenarios where the stock S1 takes the same value and using the canonical construction. However if one considers a richer model where there are now other stocks S2, S3, . . . involved, it is more natural to distinguish the scenario ω from the random variables S1(ω), S2(ω),… whose values are observed in these scenarios but may not completely pin them down: knowing S1(ω), S2(ω),… one does not necessarily know which scenario has happened. In this way one reserves the possibility of adding more random variables later on without changing the probability space.

This discussion has the following important consequence: the probabilistic description of a random variable X can be reduced to the knowledge of its distribution μX only in the case where the random variable X is the only source of randomness. In this case, a stochastic model can be built using a canonical construction for X. In all other cases – as soon as we are concerned with a second random variable which is not a deterministic function of X – the underlying probability measure P contains more information on X than just its distribution. In particular, it contains all the information about the dependence of the random variable X on all other random variables in the model: specifying P means specifying the joint distributions of all random variables constructed on Ω. For instance, knowing the distributions μX, μY of two variables X and Y does not allow one to compute their covariance or joint moments. Only in the case where all random variables involved are mutually independent can one reduce all computations to operations on their distributions. This is the case covered in most introductory texts on probability, which explains why one can go quite far, for example in the study of random walks, without formalizing the notion of probability space.

# Gauge Theory of Arbitrage, or Financial Markets Resembling Screening in Electrodynamics

When a mispricing appears in a market, market speculators and arbitrageurs rectify the mistake by extracting a profit from it. When profitable fluctuations appear, they move into the profitable assets, leaving comparatively less profitable ones. This affects prices in such a way that all assets of similar risk become equally attractive, i.e. the speculators restore the equilibrium. If this process occurs infinitely rapidly, then the market corrects the mispricing instantly and current prices fully reflect all relevant information. In this case one says that the market is efficient. Clearly, however, this is an idealization, and it fails at sufficiently short time scales.

The general picture sketched above of the restoration of equilibrium in financial markets resembles screening in electrodynamics. Indeed, in electrodynamics negative charges move into a region of positive electric field while positive charges move out of it, and thus screen the field. Comparing this with the financial market, we can say that a local virtual arbitrage opportunity with a positive excess return plays the role of the positive electric field, speculators in the long position behave as negative charges, whilst speculators in the short position behave as positive ones. Movements of positive and negative charges screen out a profitable fluctuation and restore the equilibrium, so that there is no arbitrage opportunity any more, i.e. the speculators have eliminated the arbitrage opportunity.

The analogy may appear superficial, but it is not. It emerges naturally in the framework of the Gauge Theory of Arbitrage (GTA). The theory treats the calculation of net present values and the buying and selling of assets as a parallel transport of money in some curved space, and interprets the interest rate, exchange rates and asset prices as the corresponding connection components. This structure is exactly equivalent to the geometrical structure underlying electrodynamics, where the components of the vector potential are connection components responsible for the parallel transport of the charges. The components of the corresponding curvature tensors are the electromagnetic field in the case of electrodynamics and the excess rate of return in the case of GTA. The presence of uncertainty is equivalent to the introduction of noise in the electrodynamics, i.e. quantization of the theory. It allows one to map the theory of the capital market onto the theory of a quantized gauge field interacting with matter (money flow) fields. The gauge transformations of the matter field correspond to a change of the par value of the asset units, whose effect is eliminated by a gauge tuning of the prices and rates. Free quantum gauge field dynamics (in the absence of money flows) is described by a geometrical random walk for the asset prices with a log-normal probability distribution. In the general case the construction maps the capital market onto quantum electrodynamics, where the price walks are affected by money flows.

See also: *Electrodynamical model of quasi-efficient financial market*.

# 1 + 2 + 3 + … = -1/12

The Bernoulli numbers Bn are a sequence of signed rational numbers that can be defined by the exponential generating function

x/(e^x − 1) = ∑n=0∞ Bn x^n/n!, valid for |x| < 2π.

These numbers arise in the series expansions of trigonometric functions, and are extremely important in number theory and analysis.

The Bernoulli number Bn can also be defined by the contour integral

Bn = (n!/(2πi)) ∮ z/(e^z − 1) · dz/z^(n+1),

where the contour encloses the origin, has radius less than 2π (to avoid the poles at ±2πi), and is traversed in a counterclockwise direction.

The numbers of digits in the numerators of B2, B4, B6, … are 1, 1, 1, 1, 1, 3, 1, 4, 5, 6, 6, 9, 7, 11, …, while the numbers of digits in the corresponding denominators are 1, 2, 2, 2, 2, 4, 1, 3, 3, 3, 3, 4, 1, 3, 5, 3, ….

The denominator of B2n is given by

denom(B2n) = ∏(p−1)|2n p,

where the product is taken over the primes p such that p − 1 divides 2n, a result which is related to the von Staudt-Clausen theorem.
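Both facts are easy to check by machine. The sketch below computes Bn exactly with rational arithmetic, using the standard recurrence ∑j=0..m C(m+1, j) Bj = 0 (with B0 = 1, so B1 = −1/2), and then verifies the von Staudt-Clausen denominator for B12; the helper names are made up for this example.

```python
from fractions import Fraction
from math import comb

def bernoulli(n_max):
    """B_0..B_n_max via the recurrence sum_{j=0}^{m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(Fraction(-1, m + 1) *
                 sum(comb(m + 1, j) * B[j] for j in range(m)))
    return B

def von_staudt_clausen_denominator(two_n):
    """Product of the primes p such that (p - 1) divides 2n."""
    prod = 1
    for p in range(2, two_n + 2):
        if all(p % d for d in range(2, p)) and two_n % (p - 1) == 0:
            prod *= p
    return prod

B = bernoulli(12)
```

For B12 = −691/2730, the primes p with p − 1 dividing 12 are 2, 3, 5, 7 and 13, and indeed 2·3·5·7·13 = 2730.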

In 1859 Riemann published a paper giving an explicit formula for the number of primes up to any preassigned limit—a decided improvement over the approximate value given by the prime number theorem. However, Riemann’s formula depended on knowing the values at which a generalized version of the zeta function equals zero. (The Riemann zeta function is defined for all complex numbers—numbers of the form x + iy, where i = √(−1)—except for the point s = 1, where it has a simple pole.) Riemann knew that the function equals zero for all negative even integers −2, −4, −6, … (so-called trivial zeros), and that it has an infinite number of zeros in the critical strip of complex numbers between the lines x = 0 and x = 1, and he also knew that all nontrivial zeros are symmetric with respect to the critical line x = 1/2. Riemann conjectured that all of the nontrivial zeros are on the critical line, a conjecture that subsequently became known as the Riemann hypothesis. In 1900 the German mathematician David Hilbert called the Riemann hypothesis one of the most important questions in all of mathematics, as indicated by its inclusion in his influential list of 23 unsolved problems with which he challenged 20th-century mathematicians. In 1915 the English mathematician Godfrey Hardy proved that an infinite number of zeros occur on the critical line, and by 1986 the first 1,500,000,001 nontrivial zeros were all shown to be on the critical line. Although the hypothesis may yet turn out to be false, investigations of this difficult problem have enriched the understanding of complex numbers.

Suppose you want to put a probability distribution on the natural numbers for the purpose of doing number theory. What properties might you want such a distribution to have? Well, if you’re doing number theory then you want to think of the prime numbers as acting “independently”: knowing that a number is divisible by p should give you no information about whether it’s divisible by q.

That quickly leads you to the following realization: you should choose the exponent of each prime in the prime factorization independently. So how should you choose these exponents? It turns out that the probability distribution on the non-negative integers with maximum entropy and a given mean is a geometric distribution. So let’s take the probability that the exponent of p is k to be equal to (1 − r_p) r_p^k for some constant r_p ∈ (0, 1).

This gives the probability that a positive integer n = p_1^(e_1) ⋯ p_k^(e_k) occurs as

C ∏i=1..k r_(p_i)^(e_i),

where

C = ∏p (1 − r_p),

so we need to choose the r_p such that this product converges. Now, we’d like the probability that n occurs to be monotonically decreasing as a function of n. It turns out that this is true iff r_p = p^(−s) for some s > 1 (since C has to converge), which gives the probability that n occurs as

n^(−s) / ζ(s),

where ζ(s) is the zeta function.
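One can sanity-check the construction numerically: under the law P(n) = n^(−s)/ζ(s), the event “p divides n” has probability p^(−s), and divisibility by distinct primes is independent. The sketch below checks this on a truncated version of the distribution (the values of s and the cutoff N are arbitrary choices for the example):

```python
# Truncated zeta(s) distribution on {1, ..., N}: P(n) proportional to n^(-s).
s, N = 2.0, 200_000
w = [0.0] + [k ** -s for k in range(1, N + 1)]   # w[n] = n^(-s)
Z = sum(w)

def prob_divisible(d):
    """P(d divides n) under the truncated distribution."""
    return sum(w[n] for n in range(d, N + 1, d)) / Z

p2, p3, p6 = prob_divisible(2), prob_divisible(3), prob_divisible(6)
```

Up to the small truncation error, p2 ≈ 2^(−s), p3 ≈ 3^(−s), and p6 ≈ p2 · p3, which is exactly the independence of the prime-divisibility events.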

The Riemann zeta function is a complex function that tells us many things about the theory of numbers. Its mystery is increased by the fact that it has no closed form – i.e. it can’t be expressed as a single formula that contains only other standard (elementary) functions.

A plot of the “ridges” of |ζ(x + iy)| for 0 < x < 1 and 0 < y < 100 shows that the ridges appear to decrease monotonically for 0 ≤ x ≤ 1/2. This is not a coincidence, since it turns out that monotonic decrease of the ridges would imply the Riemann hypothesis.

On the real line with x > 1, the Riemann zeta function can be defined by the integral

ζ(x) = (1/Γ(x)) ∫0∞ u^(x−1)/(e^u − 1) du,

where Γ(x) is the gamma function. If x is an integer n, then we have the identity

ζ(n) = (1/Γ(n)) ∫0∞ (u^(n−1) e^(−u))/(1 − e^(−u)) du

= (1/Γ(n)) ∫0∞ ∑k=1∞ u^(n−1) e^(−ku) du

= (1/Γ(n)) ∑k=1∞ Γ(n)/k^n,

so

ζ(n) = ∑k=1∞ 1/k^n.

The Riemann zeta function can also be defined in the complex plane by the contour integral

ζ(s) = (Γ(1 − s)/(2πi)) ∮ (−z)^(s−1)/(e^z − 1) dz,

valid for all s ≠ 1, where the contour is a Hankel contour: it comes in from +∞ along the real axis, circles the origin once in the counterclockwise direction, and returns to +∞.

Zeros of ζ(s) come in (at least) two different types. So-called “trivial zeros” occur at all negative even integers s = −2, −4, −6, …, and “nontrivial zeros” occur at certain values of s = x + iy in the “critical strip” 0 < x < 1. The Riemann hypothesis asserts that the nontrivial zeros of the Riemann zeta function all have real part x = 1/2, a line called the “critical line.” This has been verified numerically for a very large number of initial zeros.

One can also plot the real and imaginary parts of ζ(1/2 + iy) (i.e., the values of ζ along the critical line) as y is varied, say from 0 to 35.

Now consider John Cook’s take on this…

Define the partial sums

Sp(n) = 1^p + 2^p + 3^p + ⋯ + n^p,

where p is a positive integer. Here we look at what happens when p becomes a negative integer and we let n go to infinity.

If p < -1, then the limit as n goes to infinity of Sp(n) is ζ(-p). That is, for s > 1, the Riemann zeta function ζ(s) is defined by

ζ(s) = ∑n=1∞ 1/n^s.

We don’t have to limit ourselves to real numbers s > 1; the definition holds for complex numbers s with real part greater than 1. That’ll be important below.

When s is a positive even number s = 2n, there’s a formula for ζ(s) in terms of the Bernoulli numbers:

ζ(2n) = (−1)^(n+1) (2π)^(2n) B2n / (2 (2n)!).

The best-known special case of this formula is that

1 + 1/4 + 1/9 + 1/16 + … = π²/6.
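This special case is easy to check numerically. Since the tail ∑k>N 1/k² is roughly 1/N, the partial sum with N = 100,000 terms should agree with π²/6 to about 1e-5 (the choice of N is arbitrary):

```python
import math

N = 100_000
partial = sum(1.0 / k**2 for k in range(1, N + 1))
gap = abs(partial - math.pi**2 / 6)   # expected to be about 1/N = 1e-5
```

The gap shrinks only like 1/N, so this series converges far too slowly to be a good way to compute π²/6, but it confirms the identity.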

It’s a famous open problem to find a closed-form expression for ζ(3) or any other odd argument.

The formula relating the zeta function and the Bernoulli numbers tells us a couple of things about the Bernoulli numbers. First, for n ≥ 1 the Bernoulli numbers with index 2n alternate in sign. Second, by looking at the sum defining ζ(2n) we can see that it is approximately 1 for large n. This tells us that for large n, |B2n| is approximately (2n)!/(2^(2n−1) π^(2n)).
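The asymptotic claim can also be tested by machine: compute B30 exactly from the standard recurrence ∑j=0..m C(m+1, j) Bj = 0 and compare |B30| with (30)!/(2^29 π^30). The ratio of the approximation to the exact value is 1/ζ(30), which differs from 1 by about 1e-9.

```python
import math
from fractions import Fraction
from math import comb

def bernoulli(n_max):
    """B_0..B_n_max via the recurrence sum_{j=0}^{m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(Fraction(-1, m + 1) *
                 sum(comb(m + 1, j) * B[j] for j in range(m)))
    return B

n = 15                                        # look at B_{2n} = B_30
exact = abs(float(bernoulli(2 * n)[2 * n]))   # |B_30| as a float
approx = math.factorial(2 * n) / (2 ** (2 * n - 1) * math.pi ** (2 * n))
ratio = approx / exact                        # equals 1/zeta(30), nearly 1
```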

We said above that the sum defining the Riemann zeta function is valid for complex numbers s with real part greater than 1. There is a unique analytic extension of the zeta function to the rest of the complex plane, except at s = 1. The zeta function is defined, for example, at negative integers, but the sum defining zeta in the half-plane Re(s) > 1 is not valid.

You may well have seen the equation

1 + 2 + 3 + … = -1/12.

This is an abuse of notation. The sum on the left clearly diverges to infinity. But if the sum defining ζ(s) for Re(s) > 1 were valid for s = -1 (which it is not) then the left side would equal ζ(-1). The analytic continuation of ζ is valid at -1, and in fact ζ(-1) = -1/12. So the equation above is true if you interpret the left side, not as an ordinary sum, but as a way of writing ζ(-1). The same approach could be used to make sense of similar equations such as

1² + 2² + 3² + … = 0

and

1³ + 2³ + 3³ + … = 1/120.
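These values can be verified with exact rational arithmetic via the identity ζ(−n) = −B(n+1)/(n + 1) for integers n ≥ 1, computing the Bernoulli numbers from the standard recurrence ∑j=0..m C(m+1, j) Bj = 0:

```python
from fractions import Fraction
from math import comb

def bernoulli(n_max):
    """B_0..B_n_max via the recurrence sum_{j=0}^{m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(Fraction(-1, m + 1) *
                 sum(comb(m + 1, j) * B[j] for j in range(m)))
    return B

def zeta_neg(n):
    """zeta(-n) for integer n >= 1, via zeta(-n) = -B_{n+1}/(n+1)."""
    return -bernoulli(n + 1)[n + 1] / (n + 1)
```

Here zeta_neg(1) gives −1/12, zeta_neg(2) gives 0, and zeta_neg(3) gives 1/120, matching the regularized sums above.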