Grothendieck’s Universes and Wiles’s Proof (Fermat’s Last Theorem). Thought of the Day 77.0


In formulating the general theory of cohomology, Grothendieck developed the concept of a universe: a collection of sets large enough to be closed under any operation that arose. Grothendieck proved that the existence of a single universe is equivalent over ZFC to the existence of a strongly inaccessible cardinal. More precisely, a universe $U$ is the set $V_\alpha$ of all sets with rank below $\alpha$, for some uncountable strongly inaccessible cardinal $\alpha$.
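For reference, the standard set-theoretic definitions behind this statement can be written out as follows. This is a sketch of textbook definitions, not a quotation from Grothendieck or McLarty; the symbols $\kappa$, $\lambda$ and $\beta$ are introduced here only for illustration.

```latex
% Cumulative hierarchy, defined by transfinite recursion on ordinals
\[
V_0 = \emptyset, \qquad
V_{\beta+1} = \mathcal{P}(V_\beta), \qquad
V_\lambda = \bigcup_{\beta < \lambda} V_\beta \quad (\lambda \text{ a limit ordinal}).
\]

% A set has rank below alpha exactly when it belongs to V_alpha
\[
\operatorname{rank}(x) < \alpha \iff x \in V_\alpha .
\]

% A cardinal kappa is strongly inaccessible when it is uncountable,
% regular, and a strong limit
\[
\kappa > \aleph_0, \qquad
\operatorname{cf}(\kappa) = \kappa, \qquad
\forall \lambda < \kappa \;\; 2^{\lambda} < \kappa .
\]
```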

Colin McLarty summarised the general situation:

Large cardinals as such were neither interesting nor problematic to Grothendieck and this paper shares his view. For him they were merely legitimate means to something else. He wanted to organize explicit calculational arithmetic into a geometric conceptual order. He found ways to do this in cohomology and used them to produce calculations which had eluded a decade of top mathematicians pursuing the Weil conjectures. He thereby produced the basis of most current algebraic geometry and not only the parts bearing on arithmetic. His cohomology rests on universes but weaker foundations also suffice at the loss of some of the desired conceptual order.

The applications of cohomology theory implicitly rely on universes. Most number theorists regard the applications as requiring much less than their ‘on their face’ strength and in particular believe the large cardinal appeals are ‘easily eliminable’. There are in fact two issues. McLarty writes:

Wiles’s proof uses hard arithmetic some of which is on its face one or two orders above PA, and it uses functorial organizing tools some of which are on their face stronger than ZFC.

There are two current programs for verifying in detail the intuition that the formal requirements for Wiles’s proof of Fermat’s Last Theorem can be substantially reduced. On the one hand, McLarty’s current work aims to reduce the ‘on their face’ strength of the results in cohomology from large cardinal hypotheses to finite-order Peano arithmetic. On the other hand, Macintyre aims to reduce the ‘on their face’ strength of results in hard arithmetic to Peano arithmetic (PA). These programs may be complementary, or a full implementation of Macintyre’s might make the first unnecessary.

McLarty reduces

  1. ‘all of SGA (Revêtements Étales et Groupe Fondamental)’ to Bounded Zermelo plus a Universe.
  2. ‘the currently existing applications’ to Bounded Zermelo itself, thus the consistency strength of simple type theory. The Grothendieck duality theorem and others like it become theorem schemata.

The essential insight of McLarty’s papers on cohomology is the role of replacement in giving strength to the universe hypothesis. A $ZC$-universe is defined to be a transitive set $U$ modeling $ZC$ such that every subset of an element of $U$ is itself an element of $U$. He remarks that any $V_\alpha$ for $\alpha$ a limit ordinal is provable in $ZFC$ to be a $ZC$-universe. McLarty then asserts that the essential use of replacement in the original Grothendieck formulation is to prove the following: for an arbitrary ring $R$, every module over $R$ embeds in an injective $R$-module, and thus injective resolutions exist for all $R$-modules. But he gives a proof, in a system with the proof-theoretic strength of finite-order arithmetic, that every sheaf of modules on any small site has an injective resolution.

Angus Macintyre dismisses with little comment the worries about the use of ‘large-structure’ tools in Wiles’s proof. He begins his appendix:

At present, all roads to a proof of Fermat’s Last Theorem pass through some version of a Modularity Theorem (generically MT) about elliptic curves defined over Q . . . A casual look at the literature may suggest that in the formulation of MT (or in some of the arguments proving whatever version of MT is required) there is essential appeal to higher-order quantification, over one of the following.

He then lists such objects as $\mathbb{C}$, modular forms, Galois representations … and summarises that a superficial formulation of MT would be $\Pi^1_m$ for some small $m$. But he continues,

I hope nevertheless that the present account will convince all except professional sceptics that MT is really $\Pi^0_1$.

There then follows a 13-page, highly technical sketch of an argument for the proposition that MT can be expressed by a sentence in $\Pi^0_1$, along with a less detailed strategy for proving MT in PA.
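To make the gap between the two complexity classes concrete, the shapes of the sentences involved can be sketched as follows. These are the standard definitions from the arithmetical and analytical hierarchies, not Macintyre’s own formulation of MT. The Fermat example is included only as a familiar illustration, and it assumes the standard fact that the graph of exponentiation can be expressed with bounded quantifiers.

```latex
% Pi^0_1: a block of universal number quantifiers in front of a formula
% theta whose quantifiers are all bounded
\[
\Pi^0_1 : \quad \forall n_1 \cdots \forall n_k \; \theta(n_1, \dots, n_k).
\]

% Pi^1_m: m alternating blocks of set (second-order) quantifiers,
% starting with a universal one, in front of an arithmetical formula phi
\[
\Pi^1_m : \quad \forall X_1 \, \exists X_2 \cdots Q X_m \; \varphi(X_1, \dots, X_m).
\]

% Illustration: Fermat's Last Theorem itself has the Pi^0_1 shape
\[
\forall x \, \forall y \, \forall z \, \forall n \;
\bigl( x, y, z \geq 1 \wedge n \geq 3 \;\rightarrow\; x^n + y^n \neq z^n \bigr).
\]
```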

Macintyre’s complexity analysis is in traditional proof-theoretic terms. But his remark that ‘genus’ is a more useful geometric classification of curves than the syntactic notion of degree suggests that other criteria may be relevant. McLarty’s approach is not really a meta-theorem, but a statement that there was only one essential use of replacement and that it can be eliminated. In contrast, Macintyre argues that ‘apparent second order quantification’ can be replaced by first-order quantification. But the argument requires deep understanding of the number theory for each replacement in a large number of situations. Again, there is no general theorem that this type of result is provable in PA.

Welfare Economics, or Social Psychic Wellbeing. Note Quote.


The economic system is a social system in which commodities are exchanged. Sets of these commodities can be represented by vectors $x$ within a metric space $X$ contained within the non-negative orthant of a Euclidean space $\mathbb{R}^{N_x}_+$ of dimensionality $N_x$ equal to the number of such commodities.

An allocation $\{x_i\}_{i \in N} \subset X \subset \mathbb{R}^{N_x}_+$ of commodities in society is a set of vectors $x_i$ representing the commodities allocated within the economic system to each individual $i \in N$.

In questions of welfare economics, at least in all practical policy matters, the state of society is equated with this allocation, that is, $s = \{x_i\}_{i \in N}$, and the set of all possible information concerning the economic state of society is $S = X$. It is typically taken to be the case that the individual’s preference-information is simply their own allocation, $s_i = x_i$. The concept of Pareto efficiency is thus narrowed to “neoclassical Pareto efficiency”, named for the school of economic thought in which it originates and to distinguish it from the weaker criterion.

An allocation $\{x_i\}_{i \in N}$ is said to be neoclassical Pareto efficient iff $\nexists \, \{x'_i\}_{i \in N} \subset X$ and $i \in N$ such that $x'_i \succ x_i$ and $x'_j \succsim x_j \ \forall j \neq i \in N$.

A movement between two allocations, $\{x_i\}_{i \in N} \to \{x'_i\}_{i \in N}$, is called a neoclassical Pareto improvement iff $\exists \, i \in N$ such that $x'_i \succ x_i$ and $x'_j \succsim x_j \ \forall j \neq i \in N$.

For technical simplicity, it is almost always assumed in practice that individual preference relations are monotonically increasing across the space of commodities.

If individual preferences are monotonically increasing, then $x'_i \succsim x_i \iff x'_i \geq x_i$, and $x'_i \succ x_i \iff x'_i > x_i$.
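Under monotone preferences the improvement test reduces to vector comparisons, which can be sketched in a few lines of Python. This is a minimal illustration, not part of the original argument. The function names are hypothetical, and it assumes that $x' \geq x$ means "at least as much of every commodity" and that $x' > x$ additionally requires strictly more of at least one commodity.

```python
from typing import Sequence

Bundle = Sequence[float]


def weakly_prefers(new: Bundle, old: Bundle) -> bool:
    """Monotone preferences: new is weakly preferred iff new >= old componentwise."""
    return all(a >= b for a, b in zip(new, old))


def strictly_prefers(new: Bundle, old: Bundle) -> bool:
    """Assumption: strict preference means >= everywhere and > somewhere."""
    return weakly_prefers(new, old) and any(a > b for a, b in zip(new, old))


def is_pareto_improvement(old_alloc: Sequence[Bundle],
                          new_alloc: Sequence[Bundle]) -> bool:
    """Someone is strictly better off and nobody is worse off."""
    pairs = list(zip(old_alloc, new_alloc))
    nobody_worse = all(weakly_prefers(new, old) for old, new in pairs)
    someone_better = any(strictly_prefers(new, old) for old, new in pairs)
    return nobody_worse and someone_better
```

Note that the test looks only at each individual's own bundle, so any amount of inequality between individuals passes as long as nobody's bundle shrinks.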

This is problematic. A normative economics guided by the principle of implementing a decision whenever it yields a neoclassical Pareto improvement, where individuals have preference relations of the kind above, leads to the following situation.

Suppose that each individual’s preference-information is their own allocation of commodities, and that their preferences are monotonically increasing. Take one individual $j \in N$ and an initial allocation $\{x_i\}_{i \in N}$.

– A series of movements between allocations $\{\{x_i\}^t_{i \in N} \to \{x'_i\}^t_{i \in N}\}_{t=1}^{T}$ such that $x_i = x'_i \ \forall i \neq j, \ \forall t$ and $x'_j > x_j \ \forall t$, and therefore such that $x_j - x_i \to \infty \ \forall i \neq j \in N$, are neoclassical Pareto improvements. Furthermore, if these movements are made possible only by the discovery of new commodities, each state in the series is neoclassical Pareto efficient prior to the next discovery, provided the first allocation was neoclassical Pareto efficient.
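The series of movements just described can be simulated directly. The sketch below is illustrative only. The helper `enrich_one`, the fixed `increment` and the two-commodity starting bundles are assumptions introduced here, and monotone preferences are again read as componentwise comparison.

```python
from typing import List, Sequence, Tuple

Bundle = Tuple[float, ...]


def is_pareto_improvement(old_alloc: Sequence[Bundle],
                          new_alloc: Sequence[Bundle]) -> bool:
    """Nobody holds less of anything, and somebody holds more of something."""
    pairs = list(zip(old_alloc, new_alloc))
    nobody_worse = all(all(n >= o for o, n in zip(old, new)) for old, new in pairs)
    someone_better = any(any(n > o for o, n in zip(old, new)) for old, new in pairs)
    return nobody_worse and someone_better


def enrich_one(allocation: Sequence[Bundle], j: int, steps: int,
               increment: float = 1.0) -> List[List[Bundle]]:
    """Repeatedly give individual j more of every commodity; others unchanged."""
    current = [tuple(bundle) for bundle in allocation]
    history = [current]
    for _ in range(steps):
        proposal = list(current)
        proposal[j] = tuple(q + increment for q in current[j])
        # Every single step passes the neoclassical Pareto test.
        assert is_pareto_improvement(current, proposal)
        current = proposal
        history.append(current)
    return history


if __name__ == "__main__":
    states = enrich_one([(1.0, 1.0), (1.0, 1.0)], j=0, steps=5)
    for t, state in enumerate(states):
        gap = sum(state[0]) - sum(state[1])
        print(t, state, "gap:", gap)
```

The gap between individual `j` and everyone else grows with every step, yet no step is ever rejected by the criterion.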

Perhaps not to the economic theorist, but to most people this seems a rather dubious outcome. It means that, if we are guided by neoclassical Pareto efficiency, it is acceptable, indeed desirable, that one individual within society be made increasingly “richer” without end and without increasing the wealth of others, provided only that the wealth of others does not decrease. The same result would hold if, instead of an individual, we made a whole group, or indeed the whole of society, “better off” without making anyone else “worse off”.

Even the most devoted disciple of Ayn Rand would find this situation dubious, for there is no requirement that the individual in question be in some sense “deserving” of their riches. But it is perfectly logically consistent with Pareto optimality if individual preferences concern only their own allocation and are monotonically increasing. So what is it that is strange here? What generates this odd condonation? It is the narrowing of that which the polity care about to each individual allocation alone, independent of others. Neoclassical Pareto improvements are distribution-invariant because the members of the polity are supposed to care only about their own individual allocation $x_i \in \{x_i\}^t_{i \in N}$, rather than about broader states of society $s_i \subset s$ as they see it.

To avoid such awkward results, the economist may move from the preference-axiomatic concept of Pareto efficiency to embrace utilitarianism. The policy criterion (actually not immediately representative of Bentham’s surprisingly subtle statement) is then the maximisation of some combination $W(x) = W(\{u_i(x_i)\}_{i \in N})$ of individual utilities $u_i(x_i)$ over allocations: the “social psychic wellbeing” metric known as the Social Welfare Function.

In theory, the maximisation of $W(x)$ would, given the “right” assumptions on the combination method $W(\cdot)$ (sum, multiplication, maximin etc.) and on utilities (concavity, monotonicity, independence etc.), fail to condone a distribution of commodities $x$ as extreme as that discussed above, by dint of its failure to maximise social welfare $W(x)$. But to obtain this egalitarian sensitivity to the distribution of income, three properties of Social Welfare Functions are introduced which prove fatal to the a-politicality of the economist’s policy advice. They introduce presuppositions which must lie naked upon the political passions of the economist, so much more indecently for their hazy concealment under the technicalistic canopy of functional mathematics.
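The distribution sensitivity claimed here can be illustrated with a small Python sketch. It rests on assumptions that are not in the text: a single commodity, a fixed total to be divided, and the same concave utility $u(x) = \sqrt{x}$ for everyone. The three combination methods follow the list above (sum, multiplication, maximin); the function names are hypothetical.

```python
import math
from typing import Callable, Sequence


def utilitarian(utilities: Sequence[float]) -> float:
    """W as the sum of individual utilities."""
    return sum(utilities)


def multiplicative(utilities: Sequence[float]) -> float:
    """W as the product of individual utilities."""
    return math.prod(utilities)


def maximin(utilities: Sequence[float]) -> float:
    """W as the utility of the worst-off individual."""
    return min(utilities)


def welfare(allocation: Sequence[float],
            combine: Callable[[Sequence[float]], float],
            utility: Callable[[float], float] = math.sqrt) -> float:
    """Social welfare W(x) = combine({u(x_i)}), same concave u for all."""
    return combine([utility(x) for x in allocation])


if __name__ == "__main__":
    equal = [5.0, 5.0]      # total of 10 split evenly
    extreme = [9.0, 1.0]    # same total, very unequal
    for combine in (utilitarian, multiplicative, maximin):
        print(combine.__name__,
              welfare(equal, combine), welfare(extreme, combine))
```

With these assumptions all three combination methods rank the even split above the extreme one. The ranking depends on the choice of `utility` and `combine`, and those choices are made by the economist.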

Firstly, it is so famous a result as to be called the “third theorem of welfare economics” that any such function $W(\cdot)$ as has certain “uncontroversially” desirable technical properties will impose upon the polity $N$ the preferences of a dictator $i \in N$ within it. The preference of one individual $i \in N$ will serve to determine the preference indicated by society between different states by $W(x)$. In practice, these are the preferences of the economist, who decides upon the form of $W(\cdot)$ and thus imposes their particular political passions (be they egalitarian or otherwise) upon policy, deeming what is “socially optimal” by the different weightings assigned to individual utilities $u_i(\cdot)$ within the polity.

But the political presuppositions imported by the economist in fact go deeper than this. Utilitarianism which allows for inter-personal comparisons of utility in the construction of $W(x)$ requires that utility functions be “cardinal”, representing “how much” utility one derives from commodities over and above the bare preference between different sets thereof. Utility is an extremely vague concept, because it was constructed to represent a common hedonistic experiential metric where the very existence of such a metric is uncertain in the first place. In practice, the economist decides upon, extrapolates and assigns to $i \in N$ a particular utility function, which imports yet further assumptions about how any one individual values their commodity allocation and thus contributes to social psychic wellbeing.

And finally, utilitarianism not only makes political statements about who in the polity is to be assigned a disimproved situation. It makes statements so outlandish and outrageous to the common sensibility as to have provided the impetus for two of the great systems of philosophy of justice in modernity, those of John Rawls and Amartya Sen. Under almost any combination method $W(\cdot)$, the maximization of $W(\cdot)$ demands allocation to those most able to realize utility from their allocation. It would demand, for instance, redistribution of commodities from sick children to the hedonistic libertine, for the latter can obtain greater “utility” therefrom. A Theory of Justice is, of course, a direct response to the problematic political content of utilitarianism.
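The "allocation to those most able to realize utility" can be seen in a toy calculation. The sketch below assumes two individuals, a single divisible commodity, a utilitarian sum for $W$, and utilities of the hypothetical form $u_i(x) = c_i \sqrt{x}$, where $c_i$ is an assumed "capacity" to convert commodities into utility. It finds the welfare-maximising split by a simple grid search.

```python
import math
from typing import Tuple


def utilitarian_split(total: float, capacity_a: float, capacity_b: float,
                      grid: int = 1000) -> Tuple[float, float]:
    """Split `total` between A and B to maximise c_a*sqrt(x_a) + c_b*sqrt(x_b)."""
    best_xa, best_w = 0.0, -math.inf
    for k in range(grid + 1):
        xa = total * k / grid
        xb = max(0.0, total - xa)  # guard against tiny negative rounding
        w = capacity_a * math.sqrt(xa) + capacity_b * math.sqrt(xb)
        if w > best_w:
            best_xa, best_w = xa, w
    return best_xa, total - best_xa


if __name__ == "__main__":
    # A has three times B's assumed capacity to realise utility.
    print(utilitarian_split(10.0, capacity_a=3.0, capacity_b=1.0))
    # Equal capacities for comparison.
    print(utilitarian_split(10.0, capacity_a=1.0, capacity_b=1.0))
```

For this functional form the first-order condition gives $x_a / x_b = (c_a / c_b)^2$, so the individual with the higher capacity receives the larger share, whatever their need. The even split appears only when the capacities are assumed equal.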

So Pareto optimality stands as the best hope for the economist to make a-political statements about policy, refraining from making statements therein concerning the assignation of dis-improvements in the situation of any individual. Yet if applied to preferences over individual allocations alone, it condones some extreme situations of dubious political desirability across the spectrum of political theory and philosophy. But how robust a guide is it when we allow the polity to be concerned with states of society in general, and not only with their own individual allocation of commodities, as they must be in the process of public reasoning in every political philosophy from Plato to Popper and beyond?