Entropic Value Measurement and Monetary Value Measures. End Part.

Every Grothendieck topology on χ discussed in the following is assumed to have at least one sheaf. Therefore, we may assume that any covering sieve I on U satisfies ∨I = U.

Let J be a Grothendieck topology on χ. Then,

J(Uk) = {↓Uk}

for k = 0, 1, 2, or 3.

Let us now examine J(U).

Define the sieve I := ↓U on U and, for k = 0, 1, 2, 3, sieves Ik := ↓Uk (each Ik is the maximal sieve on Uk, and is also a sieve on U).

Besides I0, I1, I2, I3 and I itself, the following are the remaining possible sieves on U.

I12 = I1 ∪ I2, I13 = I1 ∪ I3, I23 = I2 ∪ I3, I123 = I1 ∪ I2 ∪ I3

Now, we define two Grothendieck topologies J0 and J1

J0 is defined by J0(Uk) = {Ik} for k = 0, 1, 2, 3, and J0(U) = {I}

J1 is defined by J1(Uk) = {Ik} for k = 0, 1, 2, 3, and J1(U) = {I, I123}

We can easily show that any Grothendieck topology on χ other than J0 that has at least one sheaf contains J1. In other words, J1 is the smallest Grothendieck topology on χ next to J0.

(The omitted diagram shows the unique extension from I123 to I.)


So, we have a necessary and sufficient condition for a monetary value measure to be a J1-sheaf.

1.0 Ψ becomes a sheaf for J1 iff ∀ a, a’, b, b’, c, c’ ∈ ℜ

g1 (a – c’) + c’ = g2 (b – a’) + a’= g3 (c – b’) + b’

⇒ (c’ = f1(b – c) + c) ∧ (a’ = f2(c – a) + a) ∧ (b’ = f3(a – b) + b)

Entropic value measurement: Let P be a probability measure on Ω defined by P = (p1, p2, p3), and let Ψ be the entropic value measure defined by

ΨVU(X) := 1/λ log EP [eλX | V]

Then from


Ψ1 (a, b, c)  = 1/λ log E[(eλa, eλb, eλc) | U1]

= (a, 1/λ log (p2eλb + p3eλc)/(p2 + p3), 1/λ log (p2eλb + p3eλc)/(p2 + p3))

the corresponding six functions from Part 3 now are

f1(x) = 1/λ log (p2eλx + p3)/(p2 + p3)

f2(x) = 1/λ log (p3eλx + p1)/(p3 + p1)

f3(x) = 1/λ log (p1eλx + p2)/(p1 +p2)

g1(x) = 1/λ log (p1eλx + p2 + p3)

g2(x) = 1/λ log (p1 + p2eλx + p3)

g3(x) = 1/λ log (p1 + p2 + p3eλx)
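As a sanity check, the formula for Ψ1 above can be verified numerically on the finite Ω. The helper below is a minimal sketch of ours (not from the text): it computes ΨVU as the conditional expectation over the blocks of the partition generating V.

```python
import math

def entropic(p, lam, X, blocks):
    # Ψ_V^U(X) = (1/λ) log E_P[e^{λX} | V] on a finite Ω;
    # the sub-σ-field V is encoded as a partition ('blocks') of indices
    out = [0.0] * len(X)
    for blk in blocks:
        w = sum(p[i] for i in blk)                            # P(block)
        m = sum(p[i] * math.exp(lam * X[i]) for i in blk) / w
        for i in blk:
            out[i] = math.log(m) / lam
    return out

p, lam = (0.2, 0.3, 0.5), 1.0
a, b, c = 1.0, -0.5, 2.0
psi1 = entropic(p, lam, [a, b, c], [[0], [1, 2]])             # V = U1
# the second coordinate should equal f1(b - c) + c, with f1 as above
f1 = lambda x: math.log((p[1]*math.exp(lam*x) + p[2]) / (p[1] + p[2])) / lam
```

The first coordinate stays a (the atom {1} of U1 is a singleton), while the other two collapse to f1(b − c) + c, matching the displayed formula.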

So, the question is whether the entropic value measure is a J1-sheaf. The necessary and sufficient condition becomes

p1eλa + (1 – p1)eλc’ = p2eλb + (1 – p2)eλa’ = p3eλc + (1 – p3)eλb’ := Z

⇒ Z = p1eλa + p2eλb + p3eλc

However, this does not hold true in general. As a corollary, any set of axioms on Ω = {1, 2, 3} that admits concave monetary value measures is not complete.
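To make the failure concrete, here is a small numerical counterexample (a sketch; the specific numbers p = (1/3, 1/3, 1/3), λ = 1, a = b = c = 0 and Z = 2 are our own choice):

```python
import math

p, lam = (1/3, 1/3, 1/3), 1.0
a = b = c = 0.0

# Premise: p1 e^{λa} + (1 - p1) e^{λc'} = p2 e^{λb} + (1 - p2) e^{λa'}
#        = p3 e^{λc} + (1 - p3) e^{λb'} =: Z.
# Any Z > max_k p_k e^{λ·} determines a', b', c' uniquely; pick Z = 2.
Z = 2.0
cp = math.log((Z - p[0]*math.exp(lam*a)) / (1 - p[0])) / lam
ap = math.log((Z - p[1]*math.exp(lam*b)) / (1 - p[1])) / lam
bp = math.log((Z - p[2]*math.exp(lam*c)) / (1 - p[2])) / lam

# The conclusion would force Z = p1 e^{λa} + p2 e^{λb} + p3 e^{λc} = 1 here,
# so the implication fails for this choice of Z
rhs = sum(pk * math.exp(lam * x) for pk, x in zip(p, (a, b, c)))
```

Since Z was an arbitrary admissible choice, the premise can hold while the conclusion fails, which is exactly the obstruction to the sheaf condition.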

We defined the concept of monetary value measures in the language of category theory, as an appropriate class of presheaves over the set of σ-fields viewed as a poset. The resulting monetary value measures naturally satisfy the so-called time-consistency condition as well as the dynamic programming principle. Next, we saw the concrete shape of the largest Grothendieck topology for which monetary value measures satisfying a given set of axioms become sheaves. By using the sheafification functor, for any monetary value measure we can construct the best approximation of it that satisfies the given axioms, in case the axioms are complete.

Sheafification Functor and Arbitrary Monetary Value Measure. Part 3

Here are Part 1 and Part 2

Let A be a fixed set of axioms. Then, for a given arbitrary monetary value measure Ψ, can we make a good alternative for it? In other words, can we find a monetary value measure that satisfies A and is the best approximation of the original Ψ? For a Grothendieck topology J on χ, define Sh(χ, J) ⊂ Setχop to be the full subcategory whose objects are all sheaves for J. Then, it is well known that the inclusion functor has a left adjoint πJ, as in the following diagram.

Sh(χ, J) → Setχop

Sh(χ, J) ←πJ Setχop

πJ (Ψ) ← Ψ

The functor πJ is known as the sheafification functor, which has the following limit cone:


for sieves I, K on U. It also satisfies the following theorem.

1.0 πJ(Ψ) is a sheaf for J

1.1 If Ψ is a sheaf for J, then for any U ∈ χ,  πJ (Ψ)(U) ≅ L(U)

The theorem suggests that for an arbitrary monetary value measure, the sheafification functor provides one of its closest monetary value measures that may satisfy the given set of axioms. To make this precise, we need the following definition.

2.0 Let A be a set of axioms of monetary value measures

2.1 M(A) := the collection of all monetary value measures satisfying A

2.2 MO := collection of all monetary value measures

2.3 A is called complete if

πJM(A) (MO) ⊂ M(A)

3.0 Let A be a complete set of axioms. Then, for a monetary value measure Ψ ∈ MO, πJM(A)(Ψ) is the monetary value measure that is the best approximation satisfying A.

Let us investigate whether the set of axioms of concave monetary value measures is complete in the case Ω = {1, 2, 3} with the σ-field F := 2Ω.

We enumerate all possible sub-σ-fields of Ω, that is, the shape of the category χ = χ(Ω),



U := F := 2Ω

U1 := {Φ, {1}, {2, 3}, Ω}

U2 := {Φ, {2}, {1, 3}, Ω}

U3 := {Φ, {3}, {1, 2}, Ω}

U0 := {Φ, Ω}
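Since sub-σ-fields of a finite Ω correspond one-to-one to partitions of Ω, the enumeration above can be cross-checked by counting set partitions. A small sketch (the helper name is ours):

```python
def partitions(items):
    # generate all set partitions of a finite list of atoms
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # put 'first' into each existing block, or into a new block
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i+1:]
        yield part + [[first]]

parts = list(partitions([1, 2, 3]))
# Bell number B_3 = 5 partitions, matching U, U1, U2, U3, U0
```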

The Banach spaces defined by the elements of χ are

L := L(U) := {(a, b, c) | a, b, c ∈ ℜ}

L1 := L(U1) := {(a, b, b) | a, b ∈ ℜ}

L2 := L(U2) := {(a, b, a) | a, b ∈ ℜ}

L3 := L(U3) := {(a, a, c) | a, c ∈ ℜ}

L0 := L(U0) := {(a, a, a) | a ∈ ℜ}

Then a monetary value measure Ψ : χop → Set on χ is determined by the following six functions:

Ψ1 := ΨU1U : L → L1, Ψ2 := ΨU2U : L → L2, Ψ3 := ΨU3U : L → L3,

Ψ01 := ΨU0U1 : L1 → L0, Ψ02 := ΨU0U2 : L2 → L0, Ψ03 := ΨU0U3 : L3 → L0
We will investigate their concrete shapes one by one by considering the axioms Ψ satisfies.

For Ψ1 : L → L1, we have by the cash invariance axiom,

Ψ1 (a, b, c) = Ψ1 ((0, b – c, 0) + (a, c, c))

= Ψ1 ((0, b – c, 0)) + (a, c, c)

= (f12 (b – c), f11 (b – c), f11 (b – c)) + (a, c, c)

= (f12 (b – c) + a, f11 (b – c) + c, f11 (b – c)+ c)

where f11, f12 : ℜ → ℜ are defined by (f12(x), f11(x), f11(x)) = Ψ1 (0, x, 0).

Similarly, if we define nine functions

f11, f12, f21, f22, f31, f32, g1, g2, g3 : ℜ → ℜ by

(f12(x), f11(x), f11(x)) = Ψ1(0, x, 0)

(f21(x), f22(x), f21(x)) = Ψ2(0, 0, x)

(f31(x), f31(x), f32(x)) = Ψ3(x, 0, 0)

(g1(x), g1(x), g1(x)) = Ψ01(x, 0, 0)

(g2(x), g2(x), g2(x)) = Ψ02(0, x, 0)

(g3(x), g3(x), g3(x)) = Ψ03(0, 0, x)

we can represent the original six functions in terms of the nine functions:

Ψ1(a, b, c) = (f12(b – c) + a, f11(b – c) + c, f11(b – c) + c),

Ψ2(a, b, c) = (f21(c – a) + a, f22(c – a) + b, f21(c – a) + a),

Ψ3(a, b, c) = (f31(a – b) + b, f31(a – b) + b, f32(a – b) + c),

Ψ01(a, b, b) = (g1(a – b) + b, g1(a – b) + b, g1(a – b) + b),

Ψ02(a, b, a) = (g2(b – a) + a, g2(b – a) + a, g2(b – a) + a),

Ψ03(a, a, c) = (g3(c – a) + a, g3(c – a) + a, g3(c – a) + a)

Next, by the normalization axiom, we have

f11(0) = f12(0) = f21(0) = f22(0) = f31(0) = f32(0) = g1(0) = g2(0) = g3(0) = 0

Partially differentiating the functions in Ψ1(a, b, c), we have

∂Ψ1(a, b, c)/∂a = (1, 0, 0)

∂Ψ1(a, b, c)/∂b = (f’12(b – c), f’11(b – c), f’11(b – c))

∂Ψ1(a, b, c)/∂c = (– f’12(b – c), 1 – f’11(b – c), 1 – f’11(b – c))

Therefore, by monotonicity, we have f’12(x) = 0 and 0 ≤ f’11(x) ≤ 1. Then, combined with the normalization axiom, we have ∀ x ∈ ℜ, f12(x) = 0. By the same argument applied to Ψ2 and Ψ3, ∀ x ∈ ℜ,

f12(x) = f22(x) = f32(x) = 0

With this knowledge, let us redefine the three functions f1, f2, f3 : ℜ → ℜ by

(0, f1(x), f1(x)) = Ψ1(0, x, 0)

(f2(x), 0, f2(x)) = Ψ2(0, 0, x)

(f3(x), f3(x), 0) = Ψ3(x, 0, 0)

Then, we have a new representation of the original six functions

Ψ1(a, b, c) = (a, f1(b – c) + c, f1(b – c) + c)

Ψ2(a, b, c) = (f2(c – a) + a, b, f2(c – a) + a)

Ψ3(a, b, c) = (f3(a – b) + b, f3(a – b) + b, c)

Ψ01(a, b, b) = (g1(a – b) + b, g1(a – b) + b, g1(a – b) + b)

Ψ02(a, b, a) = (g2(b – a) + a, g2(b – a) + a, g2(b – a) + a)

Ψ03(a, a, c) = (g3(c – a) + a, g3(c – a) + a, g3(c – a) + a)

Considering the composition rule, we have

Ψ0 = Ψ01 o Ψ1 = Ψ02 o Ψ2 = Ψ03 o Ψ3

g1(a – f1(b – c) – c) + f1(b – c) + c

= g2(b – f2(c – a) – a) + f2(c – a) + a

= g3(c – f3(a – b) – b) + f3(a – b) + b
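With the entropic value measure from the End Part (taking λ = 1 and an arbitrary choice of p, a, b, c of ours), the three expressions can be checked to agree numerically; each one reduces to log(p1 eᵃ + p2 eᵇ + p3 eᶜ). A small sketch:

```python
import math

p = (0.2, 0.3, 0.5)
f1 = lambda x: math.log((p[1]*math.exp(x) + p[2]) / (p[1] + p[2]))
f2 = lambda x: math.log((p[2]*math.exp(x) + p[0]) / (p[2] + p[0]))
f3 = lambda x: math.log((p[0]*math.exp(x) + p[1]) / (p[0] + p[1]))
g1 = lambda x: math.log(p[0]*math.exp(x) + p[1] + p[2])
g2 = lambda x: math.log(p[0] + p[1]*math.exp(x) + p[2])
g3 = lambda x: math.log(p[0] + p[1] + p[2]*math.exp(x))

a, b, c = 0.7, -1.2, 0.3
e1 = g1(a - f1(b - c) - c) + f1(b - c) + c   # Ψ01 ∘ Ψ1
e2 = g2(b - f2(c - a) - a) + f2(c - a) + a   # Ψ02 ∘ Ψ2
e3 = g3(c - f3(a - b) - b) + f3(a - b) + b   # Ψ03 ∘ Ψ3
target = math.log(p[0]*math.exp(a) + p[1]*math.exp(b) + p[2]*math.exp(c))
```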



Grothendieck Sheaves and Topologies Within Monetary Value Measures. Part 2

A contravariant functor ρ : Cop → Set is called a presheaf on a category C. By definition, a monetary value measure is a presheaf. The name presheaf suggests that it is related to another concept, sheaves, which is quite important in classical branches of mathematics such as algebraic topology. For a given set, a topology defined on it provides a criterion to distinguish continuous functions among given functions on the set. In a similar way, there is a concept called a Grothendieck topology, defined on a given category, that gives a criterion to distinguish good presheaves (= sheaves) among given presheaves on the category. In both cases, the topology can be seen as a vehicle to identify good functions (sheaves) among general functions (presheaves).


On the other hand, if we have a set of functions that we want to make continuous, we can find the weakest topology that makes those functions continuous. In a similar way, if we have a set of presheaves that we want to make good, it is known that we can pick a Grothendieck topology with which those presheaves become sheaves. Since a monetary value measure is a presheaf, if we have a set of good monetary value measures (= monetary value measures that satisfy a given set of axioms), we may find a Grothendieck topology with which those monetary value measures become sheaves. Now suppose we have the weakest topology that makes given functions continuous. This, however, does not imply that every continuous function w.r.t. that topology is contained in the originally given set of functions. Similarly, suppose that we have a Grothendieck topology that makes all monetary value measures satisfying a given set of axioms sheaves. It does not follow that every sheaf w.r.t. that Grothendieck topology satisfies the given set of axioms.

What are Grothendieck topologies and sheaves?

1.0 Let U ∈ χ

1.1 ↓ U := {V ∈ χ | V ⊂ U}

1.2 A sieve on U is a set I ⊂↓ U such that (∀ V ∈ ↓ U)(∀ W ∈ ↓ U)[W ⊂ V ∈ I ⇒ W ∈ I]

1.3 For a sieve I on U and V ⊂ U in χ, I ↓ V := I ∩ ↓ V

1.4 A family of I is an element X ∈ ∏V∈I L(V), or X = (XV)V∈I

1.5 A family X = (XV)V∈I is called a P-martingale if (∀ V ∈ I)(∀ W ∈ I)[W ⊂ V ⇒ EP[XV | W] = XW]

A sieve on U is considered as a kind of a time domain.
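On the five-object poset χ(Ω) from the running Ω = {1, 2, 3} example, the sieve condition (a downward-closed subset of ↓U) is easy to check mechanically. A sketch with our own encoding (objects named by strings):

```python
objs = ['U0', 'U1', 'U2', 'U3', 'U']
# the inclusion order: U0 ⊂ U1, U2, U3 ⊂ U (plus reflexivity)
leq = {(u, u) for u in objs} | {('U0', u) for u in objs} | \
      {(u, 'U') for u in objs}

def down(u):
    # ↓u = {v | v ⊂ u}
    return {v for v in objs if (v, u) in leq}

def is_sieve(I, u):
    # I ⊆ ↓u and I is downward closed
    return all((v, u) in leq for v in I) and \
           all(w in I for v in I for w in down(v))

I123 = down('U1') | down('U2') | down('U3')   # = {U0, U1, U2, U3}
```

For instance, {U1} alone is not a sieve on U, because U0 ⊂ U1 is missing, while I123 is.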

2.0 Ξ : χop → Set is a contravariant functor such that for iVU : V → U in χ, Ξ(U) is the set of all sieves on U, and Ξ(iVU)(I) = I ↓ V for I ∈ Ξ(U)

2.1 A Grothendieck topology on χ is a subfunctor J → Ξ satisfying the following conditions:

2.2 (∀ U ∈ χ)[↓U ∈ J(U)]

2.3 (∀ U ∈ χ)(∀ I ∈ J(U))(∀ K ∈ Ξ(U))[(∀ V ∈ I) K ↓ V ∈ J(V) ⇒ K ∈ J(U)]

We say that a sieve I J-covers U if I ∈ J(U).

U is considered as the time horizon of a time domain I if I covers U.

3.0 Theorem: Let {Ja | a ∈ A} be a collection of Grothendieck topologies on χ. Then the subfunctor J → Ξ defined by J(U) := ∩a∈A Ja(U) is a Grothendieck topology.

4.0 Let Ψ ∈ Setχop be a monetary value measure, and I be a sieve on U ∈ χ

4.1 A family X = (XV)V∈I is called Ψ-matching if (∀ V ∈ I)(∀ W ∈ I) ΨV∧WV(XV) = ΨV∧WW(XW)

4.2 A random variable X̄ ∈ L(U) is called a Ψ-amalgamation for a family X = (XV)V∈I if (∀ V ∈ I) ΨVU(X̄) = XV

5.0 Let Ψ ∈ Setχop be a monetary value measure, I a sieve on U ∈ χ, and X = (XV)V∈I a family that has a Ψ-amalgamation. Then X is Ψ-matching.

Let X̄ ∈ L(U) be a Ψ-amalgamation. Then for every V ∈ I, XV = ΨVU(X̄). Therefore, for every V, W ∈ I, ΨV∧WV(XV) = XV∧W = ΨV∧WW(XW).

6.0 Let Ψ ∈ Setχop be a monetary value measure, I a sieve on U ∈ χ, and X = (XV)V∈I a Ψ-matching family.

6.1 For V, W ∈ I, if W ⊂ V, we have ΨWV(XV) = XW, since

ΨWV(XV) = ΨV∧WV(XV) = ΨV∧WW(XW) = ΨWW(XW) = XW

6.2 If U ∈ I, then XU is the unique Ψ-amalgamation for X. Indeed, by 6.1, for every V ∈ I we have ΨVU(XU) = XV, so XU is a Ψ-amalgamation. Now, let X̄ ∈ L(U) be another Ψ-amalgamation for X. Then for every V ∈ I, XV = ΨVU(X̄). Put V := U. Then we have XU = ΨUU(X̄) = 1L(U)(X̄) = X̄.
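Statement 6.2 can be illustrated with the entropic value measure on Ω = {1, 2, 3}: build the family XV := ΨVU(X̄) over the maximal sieve ↓U and observe that XU recovers X̄. A sketch with our own encoding of σ-fields as partitions:

```python
import math

def psi(p, lam, X, blocks):
    # entropic Ψ_V(X) on a finite Ω, with V given as a partition of indices
    out = [0.0] * len(X)
    for blk in blocks:
        w = sum(p[i] for i in blk)
        v = math.log(sum(p[i]*math.exp(lam*X[i]) for i in blk) / w) / lam
        for i in blk:
            out[i] = v
    return out

p, lam = (0.2, 0.3, 0.5), 1.0
sieve = {'U':  [[0], [1], [2]], 'U1': [[0], [1, 2]],
         'U2': [[1], [0, 2]],  'U3': [[2], [0, 1]], 'U0': [[0, 1, 2]]}
Xbar = [0.5, -1.0, 2.0]
family = {name: psi(p, lam, Xbar, blk) for name, blk in sieve.items()}
# since U belongs to the sieve, X_U = Ψ_UU(X̄) = X̄ is the unique amalgamation
```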

7.0 Let J be a Grothendieck topology on χ. A monetary value measure Ψ ∈ Setχop is called a sheaf for J if for any U ∈ χ, any covering sieve I ∈ J(U) and any Ψ-matching family X = (XV)V∈I, X has a unique Ψ-amalgamation. Now, we will try to find a Grothendieck topology for which a given class of monetary value measures, specified by a given set of (extra) axioms, are sheaves.

Let us consider a sieve I on U ∈ χ as a subfunctor I → Homχ(–, U), that is, a contravariant functor I : χop → Set defined by

I(V) := {iVU} if V ∈ I

:= Φ if V ∉ I

for V ∈ χ

8.0 Let M ⊂ Setχop be the collection of all monetary value measures satisfying a given set of axioms. Then, there exists a largest Grothendieck topology for which all monetary value measures in M are sheaves; it is the largest topology representing the axioms, and we denote it by JM.

Indeed, let JM := ∩Ψ∈M JΨ, where JΨ denotes the largest Grothendieck topology for which Ψ is a sheaf.

Then JM is the largest Grothendieck topology for which every monetary value measure in M is a sheaf.


Monetary Value Measure as a Contravariant Functor (Category Theory) Part 1


Let us get a very brief review of dynamic risk measure theory and category theory.

1.0 A one-period monetary risk measure is a function ρ : L∞(Ω, F, P) → ℜ satisfying the following axioms

1.1 Cash Invariance (∀X) (∀ a ∈ ℜ) ρ (X + a) = ρ (X) – a

1.2 Monotonicity (∀ X) (∀ Y) X ≤ Y ⇒ ρ (X) ≥ ρ (Y)

1.3 Normalization ρ (0) = 0

where L∞(Ω, F, P) is the space of equivalence classes of ℜ-valued random variables that are bounded in the || · ||∞ norm.

2.0 For a σ-field U ⊂ F, L(U) := L∞(Ω, U, P|U) is the space of all equivalence classes of bounded ℜ-valued random variables, equipped with the usual sup norm.

3.0 Let F = {Ft}t∈[0,T] be a filtration. A dynamic monetary value measure is a collection of functions Ψ = {Ψt : L(FT) → L(Ft)}t∈[0,T] satisfying

3.1 Cash Invariance (∀ X ∈ L(FT))(∀ Z ∈ L(Ft)) Ψt (X + Z) = Ψt (X) + Z

3.2 Monotonicity (∀ X ∈ L(FT))(∀ Y ∈ L(FT)) X ≤ Y ⇒ Ψt (X) ≤ Ψt (Y)

3.3 Normalization Ψt (0) = 0

Note that the directions of some inequalities in Definition 1.0-1.3 are different from those of Definition 3.0-3.3, because we now are dealing with monetary value measures instead of monetary risk measures.

Since dynamic monetary value measures treat multi-period situations, we may require some extra axioms to regulate them toward the time dimension.

Axiom 4.0 Dynamic Programming Principle: For 0 ≤ s ≤ t ≤ T, (∀ X ∈ L(FT)) Ψs(X) = Ψs(Ψt(X))

Axiom 5.0 Time Consistency: For 0 ≤ s ≤ t ≤ T, (∀ X, ∀ Y ∈  L(FT)) Ψt (X) ≤ Ψt (Y) ⇒ Ψs (X) ≤ Ψs (Y)

Category theory is an area of study in mathematics that examines in an abstract way the properties of maps (called morphisms or arrows) satisfying some basic conditions.

A category C consists of a collection OC of objects and a collection MC of arrows or morphisms such that

6.0 There are two functions dom : MC → OC and cod : MC → OC

When dom(f) = A and cod (f) = B, we write f : A → B

We define a so-called hom-set of given objects A and B by

HomC(A, B) := {f ∈ MC | f : A → B}

6.1 For f : A → B & g : B → C, there is an arrow g o f : A → C called the composition of g and f. 

6.2 Every object A is associated with an identity arrow 1A : A → A, such that f o 1A = f and 1A o g = g, where dom(f) = A & cod(g) = A

7.0 Functors: Let C and D be two categories. A functor F: C → D consists of two functions

FO : OC → OD and FM : MC → MD


7.1 f : A → B ⇒ F(f) : F(A) → F(B)

7.2 F(g o f) = F(g) o F(f)

7.3 F(1A) = 1F(A)

8.0 Contravariant Functors: A functor F : Cop → D is called a contravariant functor if 7.1 and 7.2 are replaced by

8.1 f : A → B ⇒ F(f) : F(B) → F(A)

8.2 F(g o f) = F(f) o F(g)

9.0 Natural Transformations: Let F : C → D and G : C → D be two functors. A natural transformation α : F → G consists of a family of arrows 〈αC | C ∈ OC〉 making the following diagram commute:

C1          F(C1) —αC1→ G(C1)

f ↓          F(f) ↓          ↓ G(f)

C2          F(C2) —αC2→ G(C2)

10.0 Functor Categories: Let C and D be categories. A functor category DC is the category such that

10.1 ODC := collection of all functors from C to D

10.2 HomDC (F, G) := collection of all natural transformations from F to G.

Now, for defining monetary value measures with the language of category theory, we introduce a simple category that is actually a partially-ordered set derived by the σ-field F.

11.0 Let χ := χ(F) be the set of all sub-σ-fields of F. It becomes a poset with the set-inclusion relation, and hence a category whose hom-set Homχ(V, U) for U, V ∈ χ is defined by

Homχ(V, U) := {iVU} if V ⊂ U

:= Φ otherwise.

The arrow iVU is called the inclusion map. 

12.0 ⊥ := {Ω, Φ}, which is the least element of χ. 

13.0 Monetary Value Measure is a contravariant functor

Ψ : χop → Set

satisfying the following two conditions

13.1 for U ∈ χ, Ψ(U) := L(U)

13.2 for U, V ∈ χ, such that V ⊂ U, the map ΨVU := Ψ(iVU) : L(U) → L(V) satisfies

13.3 cash invariance: (∀ X ∈ L(U))(∀ Z ∈ L(V)) ΨVU (X + Z) =  ΨVU (X) + Z

13.4 monotonicity: (∀ X ∈ L(U)) (∀ Y ∈ L(U)) X ≤ Y ⇒ ΨVU(X) ≤ ΨVU(Y)

13.5 normalization: ΨVU(0) = 0

At this point, we do not require the monetary value measures to satisfy some familiar conditions such as concavity or law invariance. Instead, we want to see what kind of properties are deduced from this minimal setting. One of the key consequences of 13.0 is that Ψ is a contravariant functor, and thus for any triple of σ-fields W ⊂ V ⊂ U in χ, we have

13.6 ΨUU = 1L(U) and ΨWV o ΨVU = ΨWU
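For a concrete instance of 13.6, the entropic value measure ΨVU(X) = (1/λ) log EP[eλX | V] (used in the End Part above) satisfies ΨWV ∘ ΨVU = ΨWU by the tower property of conditional expectation. A minimal numerical sketch, with our own encoding of σ-fields as partitions of Ω = {1, 2, 3}:

```python
import math

def psi(p, lam, X, blocks):
    # Ψ_V(X) = (1/λ) log E_P[e^{λX} | V]; V encoded as a partition of indices
    out = [0.0] * len(X)
    for blk in blocks:
        w = sum(p[i] for i in blk)
        v = math.log(sum(p[i]*math.exp(lam*X[i]) for i in blk) / w) / lam
        for i in blk:
            out[i] = v
    return out

p, lam = (0.2, 0.3, 0.5), 1.5
V = [[0], [1, 2]]        # U1
W = [[0, 1, 2]]          # ⊥ = {Φ, Ω}
X = [1.0, -0.4, 2.3]
lhs = psi(p, lam, psi(p, lam, X, V), W)   # Ψ_WV ∘ Ψ_VU
rhs = psi(p, lam, X, W)                   # Ψ_WU
zero = psi(p, lam, [0.0, 0.0, 0.0], V)    # normalization check
```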

14.0 Concave monetary value measure: A monetary value measure Ψ is said to be concave if for any V ⊂ U in χ, X, Y ∈ L(U) and λ ∈ [0,1],

ΨVU(λX + (1- λ)Y) ≥ λΨVU(X) + (1-λ)ΨVU(Y)

An entropic value measure is concave.

Here are some properties of monetary value measures.

15.0 Proposition: Let Ψ : χop → Set be a monetary value measure, and W ⊂ V ⊂ U be σ-fields in χ.

15.1 (∀ X ∈ L(V)) ΨVU(X) = X

By cash invariance and normalization, ΨVU(X) = ΨVU(0 + X) = ΨVU(0) + X = X

15.2 Idempotence: (∀ X ∈ L(U)) ΨVU(ΨVU(X)) = ΨVU(X)

Since ΨVU(X) ∈ L(V), this is obvious by 15.1.

15.3 Local property: (∀ X ∈ L(U))(∀ Y ∈ L(U))(∀ A ∈  V) ΨVU (1AX + 1ACY) = 1AΨVU(X) + 1AC ΨVU(Y) 

First we show that for any A ∈ V,

1AΨVU(1AX) = 1AΨVU(X)

Since X ∈ L(Ω, U, P), we have |X| ≤ ||X||, so

1AX – 1AC||X|| ≤ 1AX + 1ACX = X ≤ 1AX + 1AC||X||

Hence, by cash invariance and monotonicity,

ΨVU(1AX) – 1AC||X|| = ΨVU(1AX – 1AC||X||) ≤ ΨVU(X) ≤ ΨVU(1AX + 1AC||X||) = ΨVU(1AX) + 1AC||X||

Multiplying through by 1A,

1AΨVU(1AX) = 1A(ΨVU(1AX) – 1AC||X||) ≤ 1AΨVU(X) ≤ 1A(ΨVU(1AX) + 1AC||X||) = 1AΨVU(1AX)

which gives 1AΨVU(1AX) = 1AΨVU(X). Using this identity twice (for A and for AC), we have

ΨVU(1AX + 1ACY) = 1AΨVU(1AX + 1ACY) + 1ACΨVU(1AX + 1ACY)

= 1AΨVU(1A(1AX + 1ACY)) + 1ACΨVU(1AC(1AX + 1ACY))

= 1AΨVU(1AX) + 1ACΨVU(1ACY)

= 1AΨVU(X) + 1ACΨVU(Y)

which proves 15.3.



15.4 Dynamic programming principle: (∀ X ∈ L(U)) ΨWU(X) = ΨWU(ΨVU(X))

Indeed, by idempotence (15.2) and the composition rule (13.6),

ΨWU(X) = ΨWV(ΨVU(X)) = ΨWV(ΨVU(ΨVU(X))) = (ΨWV o ΨVU)(ΨVU(X)) = ΨWU(ΨVU(X))

15.5 Time consistency: (∀ X ∈ L(U))(∀ Y ∈ L(U)) ΨVU(X) ≤ ΨVU(Y) ⇒ ΨWU(X) ≤ ΨWU(Y)

Assume ΨVU(X) ≤ ΨVU(Y). Then, by monotonicity and the composition rule,

ΨWU(X) = ΨWV(ΨVU(X)) ≤ ΨWV(ΨVU(Y)) = ΨWU(Y)

Binary, Ternary Connect, Neural N/W Deep Learning & Eliminating Multiplications in Forward and Backward Pass

Consider a neural network layer with N input and M output units. The forward computation is y = h(W x + b) where W and b are weights and biases, respectively, h is the activation function, and x and y are the layer’s inputs and outputs. If we choose ReLU, or Rectified Linear Unit/Ramp Function, as h, there will be no multiplications in computing the activation function, thus all multiplications reside in the matrix product W x. For each input vector x, N × M floating-point multiplications are needed.


Binary connect eliminates these multiplications by stochastically sampling weights to be −1 or 1. Full-resolution weights w̄ are kept in memory as reference, and each time y is needed, we sample a stochastic weight matrix W according to w̄. For each element of the sampled matrix W, the probability of getting a 1 is proportional to how “close” its corresponding entry in w̄ is to 1, i.e.,

P(Wij = 1) = (w̄ij + 1)/2;

P(Wij = −1) = 1 − P(Wij = 1)

It is necessary to add some edge constraints to w̄. To ensure that P(Wij = 1) lies in a reasonable range, values in w̄ are forced to be real values in the interval [−1, 1]. If during the updates any value grows beyond that interval, we set it to its corresponding edge value, −1 or 1. That way floating-point multiplications become sign changes.
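The sampling rule can be sketched in a few lines (a minimal illustration; the function name and example matrix are ours, not from the paper):

```python
import random

def sample_binary(w_bar):
    # w_bar: full-resolution weights, already clipped to [-1, 1];
    # P(W_ij = 1) = (w_bar_ij + 1)/2, otherwise W_ij = -1
    return [[1.0 if random.random() < (wij + 1.0) / 2.0 else -1.0
             for wij in row] for row in w_bar]

W = sample_binary([[0.9, -0.7], [0.0, 0.2]])
```

Multiplying an activation by a sampled entry then reduces to copying it or flipping its sign.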

A remaining question concerns the use of multiplications in the random number generator involved in the sampling process. Sampling an integer has to be faster than multiplication for the algorithm to be worth it.

Moving on from binary to ternary connect, whereas in the former weights are allowed to be −1 or 1, in a trained neural network, it is common to observe that many learned weights are zero or close to zero. Although the stochastic sampling process would allow the mean value of sampled weights to be zero, this suggests that it may be beneficial to explicitly allow weights to be zero.

To allow weights to be zero, split the interval [−1, 1], within which the full-resolution weight value w̄ij lies, into two sub-intervals: [−1, 0] and (0, 1]. If a weight value w̄ij falls into one of them, we sample Wij to be one of the two edge values of that interval,

according to its distance from w̄ij, i.e., if w̄ij > 0:

P(Wij = 1) = w̄ij; P(Wij = 0) = 1 − w̄ij

and if w̄ij ≤ 0:

P(Wij = −1) = −w̄ij; P(Wij = 0) = 1 + w̄ij
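The two-sided rule above can be sketched as follows (our own naming, mirroring the binary-connect sketch):

```python
import random

def sample_ternary(w_bar):
    # if w_bar_ij > 0:  P(W_ij = 1)  = w_bar_ij,  else W_ij = 0
    # if w_bar_ij <= 0: P(W_ij = -1) = -w_bar_ij, else W_ij = 0
    out = []
    for row in w_bar:
        r = []
        for wij in row:
            if wij > 0:
                r.append(1.0 if random.random() < wij else 0.0)
            else:
                r.append(-1.0 if random.random() < -wij else 0.0)
        out.append(r)
    return out

W = sample_ternary([[0.9, -0.7], [0.0, 0.2]])
```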

Like binary connect, ternary connect also eliminates all multiplications in the forward pass.

We move from the forward to the backward pass. Suppose the i-th layer of the network has N input and M output units, and consider an error signal δ propagating downward from its output. The updates for weights and biases would be the outer product of the layer’s input and the error signal:

∆W = ηδ◦h′ (W x + b) xT

∆b = ηδ◦h′ (W x + b)

where η is the learning rate, and x the input to the layer. While propagating through the layers, the error signal δ needs to be updated, too. Its update taking into account the next layer below takes the form:

δ = WTδ◦h′ (W x + b)

Three terms appear repeatedly in the above three equations, viz. δ, h′ (W x + b) and x. The latter two terms introduce matrix outer products. To eliminate multiplications, one can quantize one of them to be an integer power of 2, so that multiplications involving that term become binary shifts. The expression h′ (W x + b) contains down-flowing gradients, which are largely determined by the cost function and network parameters, thus it is hard to bound its values. However, bounding the values is essential for quantization because we need to supply a fixed number of bits for each sampled value, and if that value varies too much, we will need too many bits for the exponent. This, in turn, will result in the need for more bits to store the sampled value and unnecessarily increase the required amount of computation.

While h′ (W x + b) is not a good choice for quantization, x is a better choice, because it is the hidden representation at each layer, and we know roughly the distribution of each layer’s activation.

The approach is therefore to eliminate multiplications in

∆W = ηδ◦h′ (W x + b) xT

by quantizing each entry in x to an integer power of 2. That way the outer product in

∆W = ηδ◦h′ (W x + b) xT becomes a series of bit shifts. Experimentally, it is discovered that allowing a maximum of 3 to 4 bits of shift is sufficient to make the network work well. This means that 3 bits are already enough to quantize x. As the float32 format has 24 bits of mantissa, shifting (to the left or right) by 3 to 4 bits is completely tolerable. This approach is referred to as “quantized back propagation”.
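The quantization step itself can be sketched as rounding each entry to the nearest signed power of two, with the exponent clipped to the allowed shift range (a minimal illustration of ours; the exact rounding rule in the paper may differ):

```python
import math

def quantize_pow2(x, max_shift=4):
    # round |x| to the nearest integer power of two and clip the
    # exponent to [-max_shift, max_shift]; zero maps to zero
    if x == 0.0:
        return 0.0
    e = round(math.log2(abs(x)))
    e = max(-max_shift, min(max_shift, e))
    return math.copysign(2.0 ** e, x)
```

Multiplying by a quantized entry is then a single bit shift of the mantissa (plus a sign flip).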

If we choose ReLU as the activation function and use binary (ternary) connect to sample W, computing the term h’ (W x + b) involves no multiplications at all. In addition, quantized back propagation eliminates the multiplications in the outer product in

∆W = ηδ◦h′ (W x + b) xT.

The only place where multiplications remain is the element-wise product. From

∆W = ηδ◦h′ (W x + b) xT, ∆b = ηδ◦h′ (W x + b), and δ = WTδ◦h′ (W x + b), one can see that 6 × M multiplications are needed for all computations. Like in the forward pass, most of the multiplications are used in the weight updates. Compared with standard back propagation, which would need 2MN + 6M multiplications, the amount of multiplications left is negligible in quantized back propagation.

Implementing a Quantum Support Vector Machine on a 4-Qubit Quantum Simulator

Whilst great progress has been made in the field of quantum technologies, a general-purpose error-corrected quantum computer with a meaningful number of qubits is far from realisation. It is not yet clear how many logical qubits quantum computers require to outperform classical computers, which are very powerful, but it is thought that QML or quantum simulation may provide the first demonstration of a quantum speedup. The obstacles in engineering a quantum computer include ensuring that the qubits remain coherent for the time taken to implement an algorithm, being able to implement gates with ≈0.1% error rates, such that quantum error correction may be performed, and having the qubit implementation be scalable, such that the system size can be grown efficiently.

A recent attempt at implementing quantum machine learning using a liquid-state nuclear magnetic resonance (NMR) processor was carried out by Zhaokai Li and others. Their approach focused on solving a simple pattern recognition problem of whether a hand-written number was a 6 or a 9. This kind of task can usually be split into preprocessing, image division, feature extraction and classification. First, an image containing a number of characters will be fed into the computer and transformed to an appropriate input format for the classification algorithm. If necessary, a number of other adjustments can be made at this stage, such as resizing the pixels. Next, the image must be split by character, so each can be categorised separately. The NMR-machine built by Li et al. is only configured to accept inputs which represent single digits, so this step was omitted. Key features of the character are then calculated and stored in a vector. In the case of Li et al., each number was split along the horizontal and vertical axes, such that the pixel number ratio across each division could be ascertained. These ratios (one for the horizontal split and one for the vertical) work well as features, since they are heavily dependent on whether the digit is a 6 or a 9. Finally, the features of the input characters are compared with those from a training set. In this case, the training set was constructed from numbers which had been type-written in standard fonts, allowing the machine to determine which class each input belonged to.


Splitting a character in half, either horizontally or vertically, enables it to be classified in a binary fashion. To identify whether a hand-written input is a 6 or a 9, the proportion of the character’s constituent pixels which lie on one side of the division is compared with the corresponding features from a type-written training set.
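The two features can be computed in a few lines. This sketch (our own naming, and a toy glyph of our own) takes a binary pixel grid and returns the fraction of ink above the horizontal split and to the left of the vertical split:

```python
def pixel_ratio_features(img):
    # img: 2D list of 0/1 pixels
    h, w = len(img), len(img[0])
    total = sum(sum(row) for row in img)
    top = sum(sum(row) for row in img[:h // 2])                # horizontal split
    left = sum(row[j] for row in img for j in range(w // 2))   # vertical split
    return (top / total, left / total)

# a toy 4x4 glyph with most ink in the top-left quadrant
glyph = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1]]
features = pixel_ratio_features(glyph)
```

A 6 and a 9 are roughly mirror images, so these two ratios separate them well.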

In order to classify hand-written numbers, Li et al. used a quantum support vector machine, which is simply a more rigorous version of Lloyd’s quantum nearest centroid algorithm.

We define a normal vector, n, as

n = Σi=1..m wi xi

where wi is the weight of the training vector xi. The machine then identifies an optimal hyperplane (a subspace of one dimension less than the space in which it resides), satisfying the linear equation

n · x + c = 0

The optimisation procedure consists of maximising the distance 2/|n|2 between the two classes, by solving a linear equation made up of the hyperplane parameters wi and c. The Harrow–Hassidim–Lloyd (HHL) algorithm solves linear systems of equations exponentially faster than classical algorithms designed to tackle the same problem. Therefore, it is hoped that reformulating the support vector machine in a quantum environment will also result in a speedup.

After perfect classification we find that, if xi corresponds to the number 6,

n·xi + c ≥ 1,

whereas if it corresponds to the number 9,

n·xi + c ≤ −1. 

As a result, it is possible to determine whether a hand-written digit is a 6 or a 9 simply by evaluating where its feature vector resides with respect to the hyperplane.
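The final decision rule is just a sign test against the trained hyperplane. A sketch (the particular n and c below are hypothetical values, not from Li et al.):

```python
def classify(n, c, x):
    # the side of the hyperplane n·x + c = 0 decides the class
    s = sum(ni * xi for ni, xi in zip(n, x)) + c
    return '6' if s >= 0 else '9'

# hypothetical hyperplane separating the two pixel-ratio features
label = classify([2.0, -2.0], 0.0, [1.0, 0.0])
```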

The experimental results published by Li et al. are presented in the Figure below. We can see that their machine was able to recognise the hand-written characters across all instances. Unfortunately, it has long been established that quantum entanglement is not present in any physical implementation of liquid-state NMR. As such, it is highly likely that the work presented here is only a classical simulation of quantum machine learning.


The Poverty of Left in Latin America as orchestrated and endorsed by Joseph Stiglitz

For a change, here is another mushroom cropping up (it actually cropped up as an idea and then materialised in 2008-09), this time thanks to the Leftist governments of Venezuela, Ecuador and Bolivia. Christened the “Bank of the South”, the $7 billion development bank is the most logical culmination (what else is there?) of these Latin Americans’ resistance against the neoliberal, austerity-directed reforms of the Bretton Woods behemoths. Let them dig the grounds for fecundity; what really caught my attention here is that almost a decade back, Joseph Stiglitz endorsed Hugo Chavez’s economic policies, and in 2007 even welcomed such a development bank, calling it one reflecting the perspectives of those in the South. And that was a bad call.

I want to chip in on why I think such endorsements portray vacuity, and strangely so, coming as they do from the likes of Joseph Stiglitz.

South-South cooperation is actually becoming the mainstay of the resistance against neoliberal (oh! how much I despise this word now, and all the more so as it gains currency amongst alternative political viewpoints) policies, and this is easily gauged by the receding of the Pink Tide in Latin America, the so-called cradle of the Left in the late 90s of the earlier century and the first decade of the new. “With global stagnation and falling export prices, the ‘pink tide’ states must choose between their social programme and their economic strategy,” writes the Financial Times, tightening the screws on the coffin of the Left on the continent.

With the recent killing of Bolivia’s Deputy Interior Minister Rodolfo Illanes by striking miners, President Evo Morales has resorted to what the Left has always classically resorted to: “conspiracy theory”. As Richard Seymour has quipped with a lot of prescience, Morales’ resort to conspiracy theory makes a certain sense in the context of Latin America, where a series of left-wing governments elected as part of a “pink tide” in the 2000s have gone into crisis. Argentina elected its first right-wing government in 12 years in November. Venezuela’s economic crash has led to the victory of the right-wing opposition in the senate. Notwithstanding the hyperventilating coverage of the country’s total collapse, the country is beset by real problems, with a combination of opposition disruption, international pressure and government mistakes exacerbating the turmoil. In Brazil, impeachment proceedings against Dilma Rousseff have put the unelected opposition in power. Rousseff was impeached for manipulating figures to make the government’s finances look better than they were, but the real problem appears to be that, amid economic troubles, Rousseff was elected on a programme of investment rather than austerity.

Bolivia did set an example of an anomaly, where growth stabilised, public investment reached a high level, and minimum-wage increases greater than the rate of inflation were introduced. So, why this turnaround? The government has built its authority on support from the police and army, and has repeatedly deployed police against social movements where they were inconvenient, such as during the protests against fuel price increases in 2010, or against a road built on indigenous land during 2011.

Dissatisfaction with the left-wing and left-center governments in Argentina, Brazil, and Venezuela did not arise because the right wing is admired by the public, but rather because the improvements begun under the Left have stalled. The problems are most severe in Venezuela, where the drop in world oil prices has led to extreme inflation and scarcity of supplies in many sectors of the economy. It is important to understand that, largely, the problems have not arisen in the socialist part of the economies of these countries, but in sectors that are still under the control of private enterprise. In Venezuela, more than 70 percent of economic activity is still private. Food distribution, which is central to the problems of scarcity and inflation, is virtually monopolized by a small number of private companies with ties to the right-wing opposition, especially the Polar company, which controls 40 percent of the market. Venezuela, in short, is a failed state. So, obviously, the tide is turning.


From within the simmering rises a reinvigorated bank, and its efficiency would depend on a host of issues, viz. a replication of the WB/IMF principle of “more contributions to the fund, more weightage to the vote”; exemption from taxes on salaries and on procurement of investment, which also incidentally happens to be a copy of WB/IMF practice; undecidability on reserve funds; prioritising infrastructure over agriculture and social sectors; chalking out a plan for investment in financial intermediaries to develop national companies; procurement; and participation & transparency. Whether the Bank would actually be able to overcome these, only time will tell. But my main intention has been Joseph Stiglitz’s remark, or rather his subscription to such alternatives to the WB/IMF. More than his actually welcoming this bank, or for that matter the NDB, as rivalling the hegemonic structures of the WB/IMF, it is his stance on the financialisation of capital that needs to be sent through a scanner. I have to admit honestly that I was bowled over by his Discontents book, which did send me on a trip to track change via his honest and integrity-filled analysis of globalisation. Even reading Bhagwati in concomitance wasn’t a powerful enough let-down to stop me following JS. But, just like the Left stands precariously, JS’s conceptualisation somehow misses the beat for me these days.

He undoubtedly was a voice to hear during and in the aftermath of the global crisis. His attack was three-pronged, and all of it suited the purpose of helping the non-esoteric figure out the causes of the 2008–09 downturn. That there are problems associated with mainstream economics’ overreliance on algorithms designed by mathematical geniuses, with the questionable character of rationality when conflicts of interest impair the ratings agencies, and with the lack of accountability for Wall Street’s excessive risk-taking adventures isn’t really in any doubt. But thereafter fluctuations start becoming noticeable, and as a left-inclined theorist, he blames the neoliberal policies that had their beginnings in the 70s for all the ills of the current economic and financial mess the world over. Assuming this to be true, how do we then explain the fact that Western Europe’s hyper-regulated economies are presently in even worse shape than America’s? Today Greece is a nation on financial life-support. Yet it has long been one of the most regulated and interventionist economies in the entire EU. This, however, doesn’t stop Stiglitz from proposing a massive expansion of regulation. This, he says, should be shaped “by financial experts in unions, nongovernmental organisations and universities”. More generally, there’s nothing new about what Stiglitz calls “New Capitalism”. It’s a return to old-fashioned Keynesian demand-management and the pursuit of “full employment” – that old Keynesian mantra – through the government’s direction of any number of economic sectors. Then there’s Stiglitz’s proposal for a Global Reserve System to effectively undertake demand-management for the world economy. To be fair, this is not an instance of megalomania on Stiglitz’s part. Keynes argued for something similar almost 70 years ago. But here Stiglitz wraps himself again in contradiction.
Having stressed the Fed’s inability to manage America’s economy, why does Stiglitz imagine a global central bank could possibly manage monetary policy for the entire world economy? What precisely, we might ask, is the optimal interest rate for the global economy? Surely only God could know that. Until then, I’d have my reservations about taking him seriously.

The Transmission of Affect, or Brennan’s Argument Against Neo-Darwinism…Note Quote

[According to neo-Darwinism], the individual organism is born with the urges and affects that will determine its fate. Its predisposition to certain behaviors is part of its individual genetic package, and, of course, these behaviors are intrinsically affective. Such behaviors and affects may be modified by the environment, or they may not survive because they are not adaptive. But the point is that no other source or origin for the affects is acknowledged outside of the individual one. The dominant model for transmission in neo-Darwinism is genetic transmission… and the critical thing about it here is that its proponents ignore the claims of social and historical context when it comes to accounting for causation.

As Brennan convincingly argues below, the neo-Darwinist adopts an essentialist position that neglects to engage at all with the capacity of affects to occur outside of the genetically formed individual. 


To be sure, in both biological and non-biological contexts, the neo-Darwinian paradigm negates the creative potential of chance encounters by grossly inflating the status of a deterministic code mechanism. By analogy it attributes the same high level of agency to the fidelity, fecundity and longevity of the genetic package as it does to the passive passing on of a competing idea. Memetics crudely consigns, as such, the by and large capricious, unconscious and imitative transmission of desire and social invention through a population to an insentient surrender to a self-serving code.

Model Concession Agreement, or Why Environmental Clearances are Not Required Before Financial Closures?


There is something called a model concession agreement, which is tied to what is termed a financial closure. The Model Concession Agreement (MCA) forms the core of public-private partnership (PPP) projects in India. The MCA spells out the policy and regulatory framework for implementing a PPP project. It addresses a gamut of critical issues pertaining to a PPP framework, like mitigation and unbundling of risks; allocation of risks and returns; symmetry of obligations between the principal parties; precision and predictability of costs & obligations; reduction of transaction costs; and termination. The MCA allocates risks to the parties best suited to manage them. The preparation of contract documents can be a major administrative task in PPP development and may also require a considerable amount of time. The availability of standardized contract documents, or model contract agreements with model clauses, can be of great help in this respect. It helps considerably in streamlining the administrative process by reducing the time spent preparing such documents and getting them cleared by the concerned government agencies. Model concession/contract agreements also reduce the legal fees incurred in preparing contract documents. Considering these advantages, many governments have developed MCAs for their PPP programmes. According to the model concession agreement, ‘financial closure’ is defined as the fulfilment of all conditions precedent to the initial availability of funds under the financing agreements. The phrase ‘conditions precedent’ refers to commitments to be met by the developer. After loans are tied up, the lenders agree that the ‘conditions precedent’ in the loan document are fulfilled. These are the conditions to be fulfilled before work on a project can begin, and can include land acquisition, rehabilitation, and environmental clearances.
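The definition above is essentially a conjunction: financial closure is achieved only when every condition precedent is fulfilled, and funds under the financing agreements become available only then. A minimal illustrative sketch of that logic (the three condition names are taken from the paragraph above; the variable and function names are my own, not drawn from any actual MCA text):

```python
# Hypothetical sketch of "financial closure" as a conjunction of
# conditions precedent. Illustrative only -- not from any real MCA.

CONDITIONS_PRECEDENT = {
    "land_acquisition": False,
    "rehabilitation": False,
    "environmental_clearance": False,
}

def financial_closure_achieved(conditions: dict) -> bool:
    """Funds are released only when every condition precedent is met."""
    return all(conditions.values())

# At the outset, no commitment has been met, so no closure:
print(financial_closure_achieved(CONDITIONS_PRECEDENT))   # False

# Once the developer fulfils every commitment, closure follows:
fulfilled = {name: True for name in CONDITIONS_PRECEDENT}
print(financial_closure_achieved(fulfilled))              # True
```

The point the sketch makes concrete is that environmental clearance is, by the MCA's own definition, one of the inputs to closure, which is what makes the sequencing question in the next paragraph worth asking.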
While the developer’s responsibility would be arranging for shareholders’ funds and setting up an escrow account, the development authority would have to deal with the necessary state support required for the project in terms of clearances and land acquisition. Now, the question is: is there anything wrong in this mechanism, even without the project being properly conceptualized and sent over for environmental clearances? On paper, everything is deemed to go haywire, but if one were to connect PPP, the model concession agreement, financial closure and SPV models within the larger ambit of project finance, this route is apparently considered the silk route (more for smoothness of operation than anything else), since banks have their work cut out for them in terms of the increased number of companies that would approach them for financial closure. Add to that the IFIs’ stakes in equity, and nationalized banks + private banks consider it all the more worthwhile and profitable (in some goddamn sense) to honor the model concession agreement.