The Affinity of Mirror Symmetry to Algebraic Geometry: Going Beyond Formalism



Homological mirror symmetry is by now a well-established formalism, but what of other explanations of mirror symmetry that lie closer to classical differential and algebraic geometry? One approach is the proposal of Strominger, Yau and Zaslow, known as SYZ mirror symmetry for short.

The central physical ingredient in this proposal is T-duality. To explain this, let us consider a superconformal sigma model with target space (M, g), and denote it (defined either as a geometric functor or as a set of correlation functions) by

CFT(M, g)

In physics, a duality is an equivalence

CFT(M, g) ≅ CFT(M′, g′)

which holds despite the fact that the underlying geometries (M,g) and (M′, g′) are not classically diffeomorphic.

T-duality is a duality which relates two CFTs with toroidal target space, M ≅ M′ ≅ T^d, but different metrics. In rough terms, the duality relates a “small” target space, with noncontractible cycles of length L < l_s (the string length), to a “large” target space in which all such cycles have length L > l_s.

This sort of relation is generic to dualities and follows from the following logic. If all length scales (lengths of cycles, curvature lengths, etc.) are greater than l_s, string theory reduces to conventional geometry. Now, in conventional geometry, we know what it means for (M, g) and (M′, g′) to be non-isomorphic. Any modification to this notion must be associated with a breakdown of conventional geometry, which requires some length scale to be “sub-stringy,” with L < l_s.

To state T-duality precisely, let us first consider M = M′ = S^1. We parameterise this with a coordinate X ∈ R, making the identification X ∼ X + 2π. Consider a Euclidean metric g_R given by ds^2 = R^2 dX^2. The real parameter R is usually called the “radius,” from the obvious embedding in R^2. This manifold is Ricci-flat, and thus the sigma model with this target space is a conformal field theory, the “c = 1 boson.” Let us furthermore set the string scale l_s = 1. With this, we obtain a complete physical equivalence

CFT(S^1, g_R) ≅ CFT(S^1, g_{1/R})

Thus these two target spaces are indistinguishable from the point of view of string theory.

Just to give a physical picture for what this means, suppose for sake of discussion that superstring theory describes our universe, and thus that in some sense there must be six extra spatial dimensions. Suppose further that we had evidence that the extra dimensions factorized topologically and metrically as K_5 × S^1; then it would make sense to ask: What is the radius R of this S^1 in our universe? In principle this could be measured by producing sufficiently energetic particles (so-called “Kaluza-Klein modes”), or perhaps by measuring deviations from Newton’s inverse square law of gravity at distances L ∼ R. In string theory, T-duality implies that R ≥ l_s, because any theory with R < l_s is equivalent to another theory with R > l_s. Thus we have a nontrivial relation between two (in principle) observable quantities, R and l_s, which one might imagine testing experimentally.

Let us now consider the theory CFT(T^d, g), where T^d is the d-dimensional torus, with coordinates X^i parameterising R^d/2πZ^d, and a constant metric tensor g_{ij}. Then there is a complete physical equivalence

CFT(T^d, g) ≅ CFT(T^d, g^{-1})

In fact this is just one element of a discrete group of T-duality symmetries, generated by T-dualities along one-cycles, and large diffeomorphisms (those not continuously connected to the identity). The complete group is isomorphic to SO(d, d; Z).

While very different from conventional geometry, T-duality has a simple intuitive explanation. This starts with the observation that the possible embeddings of a string into X can be classified by the fundamental group π_1(X). Strings representing non-trivial homotopy classes are usually referred to as “winding states.” Furthermore, since strings interact by interconnecting at points, the group structure on π_1 provided by concatenation of based loops is meaningful and is respected by interactions in the string theory. Now π_1(T^d) ≅ Z^d, as an abelian group, referred to as the group of “winding numbers”.

Of course, there is another Z^d we could bring into the discussion, the Pontryagin dual of the U(1)^d of which T^d is an affinization. An element of this group is referred to physically as a “momentum,” as it is the eigenvalue of a translation operator on T^d. Again, this group structure is respected by the interactions. These two group structures, momentum and winding, can be summarized in the statement that the full closed string algebra contains the group algebra C[Z^d] ⊕ C[Z^d].

In essence, the point of T-duality is that if we quantize the string on a sufficiently small target space, the roles of momentum and winding will be interchanged. But the main point can be seen by bringing in some elementary spectral geometry. Besides the algebra structure, another invariant of a conformal field theory is the spectrum of its Hamiltonian H (technically, the Virasoro operator L_0 + L̄_0). This Hamiltonian can be thought of as an analog of the standard Laplacian ∆_g on functions on X, and its spectrum on T^d with metric g is

Spec ∆_g = { ∑_{i,j=1}^{d} g^{ij} p_i p_j : p ∈ Z^d }

where g^{ij} denotes the inverse metric. On the other hand, the energy of a winding string is (intuitively) a function of its length. On our torus, a geodesic with winding number w ∈ Z^d has length squared

L^2 = ∑_{i,j=1}^{d} g_{ij} w^i w^j

Now, the only string theory input we need to bring in is that the total Hamiltonian contains both terms,

H = ∆_g + L^2 + · · ·

where the extra terms · · · express the energy of excited (or “oscillator”) modes of the string. Then, the inversion g → g^{-1}, combined with the interchange p ↔ w, leaves the spectrum of H invariant. This is T-duality.
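The invariance of the zero-mode part of the spectrum can be checked numerically. The following is a minimal sketch, assuming d = 2, dropping the oscillator terms, and using a sample metric chosen purely for illustration; the function names are hypothetical. It uses exact rational arithmetic so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def inverse_2x2(g):
    """Exact inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def quadratic_form(m, v):
    """Return the sum over i, j of m[i][j] * v[i] * v[j]."""
    return sum(m[i][j] * v[i] * v[j] for i in range(2) for j in range(2))


def zero_mode_energy(g, p, w):
    """Momentum term g^{ij} p_i p_j plus winding term g_{ij} w^i w^j."""
    return quadratic_form(inverse_2x2(g), p) + quadratic_form(g, w)


def truncated_spectrum(g, cutoff=2):
    """Sorted zero-mode energies for all momenta and windings in a box."""
    box = list(product(range(-cutoff, cutoff + 1), repeat=2))
    return sorted(zero_mode_energy(g, p, w) for p in box for w in box)


# Sample metric on T^2 (an assumption for illustration): symmetric, positive definite.
g = ((Fraction(2), Fraction(1, 2)), (Fraction(1, 2), Fraction(3)))
g_dual = inverse_2x2(g)

# A single state: swap momentum and winding while inverting the metric.
p, w = (1, -2), (3, 0)
same_state = zero_mode_energy(g, p, w) == zero_mode_energy(g_dual, w, p)

# The whole truncated spectrum, compared as a sorted list of energies.
same_spectrum = truncated_spectrum(g) == truncated_spectrum(g_dual)
print(same_state, same_spectrum)
```

The box of momenta and windings is symmetric under p ↔ w, which is why the truncated spectra can be compared directly.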

There is a simple generalization of the above to the case with a non-zero B-field on the torus satisfying dB = 0. In this case, taking B to be a constant antisymmetric tensor, we can label CFTs by the matrix g + B. Now, the basic T-duality relation becomes

CFT(T^d, g + B) ≅ CFT(T^d, (g + B)^{-1})
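The same kind of check can be sketched with a B-field. The text does not give the Hamiltonian in this case, so the code below assumes one common convention for the zero-mode energy, (p − Bw)^T g^{-1} (p − Bw) + w^T g w, with g and B the symmetric and antisymmetric parts of the matrix E = g + B. The sample matrix and all names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def inverse(m):
    """Exact inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def split(e):
    """Split E into its symmetric part g and antisymmetric part B."""
    g = tuple(tuple((e[i][j] + e[j][i]) / 2 for j in range(2)) for i in range(2))
    b = tuple(tuple((e[i][j] - e[j][i]) / 2 for j in range(2)) for i in range(2))
    return g, b


def zero_mode_energy(e, p, w):
    """Assumed convention: (p - B w)^T g^{-1} (p - B w) + w^T g w."""
    g, b = split(e)
    g_inv = inverse(g)
    shifted = [p[i] - sum(b[i][j] * w[j] for j in range(2)) for i in range(2)]
    momentum_term = sum(
        g_inv[i][j] * shifted[i] * shifted[j] for i in range(2) for j in range(2)
    )
    winding_term = sum(g[i][j] * w[i] * w[j] for i in range(2) for j in range(2))
    return momentum_term + winding_term


# Sample E = g + B with g = ((2, 1/2), (1/2, 3)) and B = ((0, 1/3), (-1/3, 0)).
e = ((Fraction(2), Fraction(5, 6)), (Fraction(1, 6), Fraction(3)))
e_dual = inverse(e)

# Compare every state in a box with its partner under p <-> w in the dual theory.
box = list(product(range(-2, 3), repeat=2))
invariant = all(
    zero_mode_energy(e, p, w) == zero_mode_energy(e_dual, w, p)
    for p in box
    for w in box
)
print(invariant)
```

Setting B = 0 reduces this to the metric-only relation g → g^{-1}.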

Another generalization, which is considerably more subtle, is to do T-duality in families, or fiberwise T-duality. The same arguments can be made, and would become precise in the limit that the metric on the fibers varies on length scales far greater than l_s, and has curvature lengths far greater than l_s. This is sometimes called the “adiabatic limit” in physics. While this is a very restrictive assumption, there are more heuristic physical arguments that T-duality should hold more generally, with corrections to the relations proportional to curvatures l_s^2 R and derivatives l_s ∂ of the fiber metric, both in perturbation theory and from world-sheet instantons.

Grothendieck’s Abstract Homotopy Theory


Let E be a Grothendieck topos (think of E as the category, Sh(X), of set-valued sheaves on a space X). Within E, we can pick out a subcategory, C, of locally finite, locally constant objects in E. (If X is a space with E = Sh(X), C corresponds to those sheaves whose espace étalé is a finite covering space of X.) Picking a base point in X generalises to picking a ‘fibre functor’ F : C → Sets_fin, a functor satisfying various conditions implying that it is pro-representable. (If x0 ∈ X is a base point, the inclusion {x0} → X induces a ‘fibre functor’ Sh(X) → Sh{x0} ≅ Sets by pullback.)

If F is pro-representable by P, then π1(E, F) is defined to be Aut(P), which is a profinite group. Grothendieck proves there is an equivalence of categories C ≃ π1(E)-Sets_fin, the category of finite π1(E)-sets. If X is a locally nicely behaved space such as a CW-complex and E = Sh(X), then π1(E) is the profinite completion of π1(X). This profinite completion occurs only because Grothendieck considers locally finite objects. Without this restriction, a covering space Y of X would correspond to a π1(X)-set, Y′, but if Y is a finite covering of X then the homomorphism from π1(X) to the finite group of transformations of Y factors through the profinite completion of π1(X). The profinite completion is defined as follows: if G is a group, then Ĝ = lim(G/H : H ◁ G, H of finite index), the inverse limit over the normal subgroups of finite index, is its profinite completion. This idea of using covering spaces or their analogue in E raises several important points:

a) These are homotopy theoretic results, but no paths are used. The argument involving sheaf theory, the theory of (pro)representable functors, etc., is of a purely categorical nature. This means it is applicable to spaces where the use of paths, and other homotopies is impossible because of bad (or unknown) local properties. Such spaces have been studied within Shape Theory and Strong Shape Theory, although not by using Grothendieck’s fundamental group, nor using sheaf theory.

b) As no paths are used, these methods can also be applied to non-spaces, e.g. locales and possibly to their non-commutative analogues, quantales. For instance, classically one could consider a field k and an algebraic closure K of k and then choose C to be a category of étale algebras over k, in such a way that π1(E) ≅ Gal(K/k), the Galois group of k. It, in fact, leads to a classification theorem for Grothendieck toposes. From this viewpoint, low dimensional homotopy theory is seen as being part of Galois theory, or vice versa.

c) This underlines the fact that π1(X) classifies covering spaces – but for i > 1, πi(X) does not seem to classify anything other than maps from Si into X!

This is abstract homotopy theory par excellence.
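The correspondence between finite covers and finite π1-sets can be made concrete for the circle, where π1 = Z. The sketch below assumes the standard description of a finite cover of the circle as a finite set (the fibre over the base point) together with a monodromy permutation; the sample permutation and the function names are hypothetical. It checks that the action of Z factors through a finite quotient Z/n, which is the reason only the profinite completion is visible to finite covers.

```python
from math import gcd


def compose(s, t):
    """Permutation s after t; a permutation is a tuple sending i to s[i]."""
    return tuple(s[t[i]] for i in range(len(t)))


def inverse(s):
    """Inverse permutation."""
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)


def power(s, k):
    """Action of the integer k in pi_1(S^1) = Z on the fibre: s to the k."""
    base = s if k >= 0 else inverse(s)
    result = tuple(range(len(s)))
    for _ in range(abs(k)):
        result = compose(base, result)
    return result


def cycle_lengths(s):
    """Orbit sizes of s, i.e. the degrees of the connected components."""
    seen, lengths = set(), []
    for start in range(len(s)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = s[i]
            length += 1
        if length:
            lengths.append(length)
    return sorted(lengths)


def order(s):
    """Order of s as a group element: the lcm of its cycle lengths."""
    n = 1
    for c in cycle_lengths(s):
        n = n * c // gcd(n, c)
    return n


# Sample monodromy on a 5-point fibre: a 3-cycle and a 2-cycle.
monodromy = (1, 2, 0, 4, 3)
n = order(monodromy)

# The Z-action only depends on k modulo n, so it factors through Z/n.
factors_through_quotient = all(
    power(monodromy, k) == power(monodromy, k % n) for k in range(-12, 13)
)
print(cycle_lengths(monodromy), n, factors_through_quotient)
```

Here the two orbits correspond to a connected 3-fold cover and a connected 2-fold cover of the circle.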

Of Topos and Torsors

Let X be a topological space. One goal of algebraic topology is to study the topology of X by means of algebraic invariants, such as the singular cohomology groups Hn(X;G) of X with coefficients in an abelian group G. These cohomology groups have proven to be an extremely useful tool, due largely to the fact that they enjoy excellent formal properties, and the fact that they tend to be very computable. However, the usual definition of Hn(X;G) in terms of singular G-valued cochains on X is perhaps somewhat unenlightening. This raises the following question: can we understand the cohomology group Hn(X;G) in more conceptual terms?

As a first step toward answering this question, we observe that Hn(X;G) is a representable functor of X. That is, there exists an Eilenberg-MacLane space K(G,n) and a universal cohomology class η ∈ Hn(K(G,n);G) such that, for any topological space X, pullback of η determines a bijection

[X, K(G, n)] → Hn(X; G)

Here [X,K(G,n)] denotes the set of homotopy classes of maps from X to K(G,n). The space K(G,n) can be characterized up to homotopy equivalence by the above property, or by the formula

π_k K(G,n) ≃ ∗ if k ≠ n,
π_k K(G,n) ≃ G if k = n.

In the case n = 1, we can be more concrete. An Eilenberg-MacLane space K(G,1) is called a classifying space for G, and is typically denoted by BG. The universal cover of BG is a contractible space EG, which carries a free action of the group G by covering transformations. We have a quotient map π : EG → BG. Each fiber of π is a discrete topological space, on which the group G acts simply transitively. We can summarize the situation by saying that EG is a G-torsor over the classifying space BG. For every continuous map X → BG, the fiber product X̃ := EG ×_{BG} X has the structure of a G-torsor on X: that is, it is a space endowed with a free action of G and a homeomorphism X̃/G ≃ X. This construction determines a map from [X,BG] to the set of isomorphism classes of G-torsors on X. If X is a well-behaved space (such as a CW complex), then this map is a bijection. We therefore have (at least) three different ways of thinking about a cohomology class η ∈ H1(X; G):

(1) As a G-valued singular cocycle on X, which is well-defined up to coboundaries.

(2) As a continuous map X → BG, which is well-defined up to homotopy.

(3) As a G-torsor on X, which is well-defined up to isomorphism.

The singular cohomology of a space X is constructed using continuous maps from simplices ∆k into X. If there are not many maps into X (for example if every path in X is constant), then we cannot expect singular cohomology to tell us very much about X. The second definition uses maps from X into the classifying space BG, which (ultimately) relies on the existence of continuous real-valued functions on X. If X does not admit many real-valued functions, then the set of homotopy classes [X,BG] is also not a very useful invariant. For such spaces, the third approach is the most powerful: there is a good theory of G-torsors on an arbitrary topological space X.
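For a very small example, the cocycle description can be computed by brute force. The sketch below is an illustration under stated assumptions: it models the circle as a hollow triangle (a simplicial stand-in for the singular picture), takes G = Z/n, and uses hypothetical function names. It groups 1-cocycles into classes modulo coboundaries and checks that each class is determined by its monodromy around the loop, which is the element of G that specifies the corresponding torsor on the circle.

```python
from itertools import product

VERTICES = (0, 1, 2)
EDGES = ((0, 1), (1, 2), (0, 2))  # each edge oriented from lower to higher vertex


def coboundary(f, n):
    """Coboundary of a 0-cochain f: the edge (a, b) gets f(b) - f(a) mod n."""
    return tuple((f[b] - f[a]) % n for a, b in EDGES)


def monodromy(c, n):
    """Sum of the 1-cochain c around the loop 0 -> 1 -> 2 -> 0."""
    return (c[0] + c[1] - c[2]) % n


def first_cohomology(n):
    """Map a canonical representative of each class to the cocycles in it."""
    # There are no 2-simplices, so every 1-cochain is a cocycle.
    cocycles = list(product(range(n), repeat=len(EDGES)))
    coboundaries = {
        coboundary(f, n) for f in product(range(n), repeat=len(VERTICES))
    }
    classes = {}
    for c in cocycles:
        # The smallest element of the coset c + coboundaries labels the class.
        key = min(tuple((x + y) % n for x, y in zip(c, d)) for d in coboundaries)
        classes.setdefault(key, []).append(c)
    return classes


n = 4
classes = first_cohomology(n)

# Each class has a single monodromy value, and distinct classes differ in it.
monodromies = [{monodromy(c, n) for c in members} for members in classes.values()]
print(len(classes), sorted(min(m) for m in monodromies))
```

With these assumptions the number of classes equals the order of G, matching H1 of the circle with coefficients in Z/n.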

There is another reason for thinking about H1(X;G) in the language of G-torsors: it continues to make sense in situations where the traditional ideas of topology break down. If X̃ is a G-torsor on a topological space X, then the projection map X̃ → X is a local homeomorphism; we may therefore identify X̃ with a sheaf of sets F on X. The action of G on X̃ determines an action of G on F. The sheaf F (with its G-action) and the space X̃ (with its G-action) determine each other, up to canonical isomorphism. Consequently, we can formulate the definition of a G-torsor in terms of the category ShvSet(X) of sheaves of sets on X without ever mentioning the topological space X̃ itself. The same definition makes sense in any category which bears a sufficiently strong resemblance to the category of sheaves on a topological space: for example, in any Grothendieck topos. This observation allows us to construct a theory of torsors in a variety of nonstandard contexts, such as the étale topology of algebraic varieties.

Describing the cohomology of X in terms of the sheaf theory of X has still another advantage, which comes into play even when the space X is assumed to be a CW complex. For a general space X, isomorphism classes of G-torsors on X are classified not by the singular cohomology H1sing(X;G), but by the sheaf cohomology H1sheaf(X; G) of X with coefficients in the constant sheaf associated to G. This sheaf cohomology is defined more generally for any sheaf of groups 𝒢 on X. Moreover, we have a conceptual interpretation of H1sheaf(X; 𝒢) in general: it classifies 𝒢-torsors on X (that is, sheaves F on X which carry an action of 𝒢 and locally admit a 𝒢-equivariant isomorphism F ≃ 𝒢) up to isomorphism. The general formalism of sheaf cohomology is extremely useful, even if we are interested only in the case where X is a nice topological space: it includes, for example, the theory of cohomology with coefficients in a local system on X.

Let us now attempt to obtain a similar interpretation for cohomology classes η ∈ H2(X; G). What should play the role of a G-torsor in this case? To answer this question, we return to the situation where X is a CW complex, so that η can be identified with a continuous map X → K(G,2). We can think of K(G,2) as the classifying space of a group: not the discrete group G, but instead the classifying space BG (which, if built in a sufficiently careful way, comes equipped with the structure of a topological abelian group). Namely, we can identify K(G, 2) with the quotient E/BG, where E is a contractible space with a free action of BG. Any cohomology class η ∈ H2(X;G) determines a map X → K(G,2), and we can form the pullback X̃ = E ×_{BG} X. We now think of X̃ as a torsor over X: not for the discrete group G, but instead for its classifying space BG.

To complete the analogy with our analysis in the case n = 1, we would like to interpret the fibration X̃ → X as defining some kind of sheaf F on the space X. This sheaf F should have the property that for each x ∈ X, the stalk Fx can be identified with the fiber X̃_x ≃ BG. Since the space BG is not discrete (or homotopy equivalent to a discrete space), the situation cannot be adequately described in the usual language of set-valued sheaves. However, the classifying space BG is almost discrete: since the homotopy groups πiBG vanish for i > 1, we can recover BG (up to homotopy equivalence) from its fundamental groupoid. This suggests that we might try to think about F as a “groupoid-valued sheaf” on X, or a stack (in groupoids) on X.

Time-Evolution in Quantum Mechanics is a “Flow” in the (Abstract) Space of Automorphisms of the Algebra of Observables


In quantum mechanics, time is not a geometrical flow. Time-evolution is characterized as a transformation that preserves the algebraic relations between physical observables. If at a time t = 0 an observable, say the angular momentum L(0), is defined as a certain combination (product and sum) of some other observables, for instance positions X(0), Y(0) and momenta P_X(0), P_Y(0), that is to say

L(0) = X(0) P_Y(0) − Y(0) P_X(0)     (1)

then one asks that the same relation be satisfied at any other instant t (preceding or following t = 0),

L(t) = X(t) P_Y(t) − Y(t) P_X(t)     (2)

The quantum time-evolution is thus a map from an observable at time 0 to an observable at time t that preserves the algebraic form of the relation between observables. Technically speaking, one talks of an automorphism of the algebra of observables.

At first sight, this time-evolution has nothing to do with a flow. However, there is still “something flowing”, although in an abstract mathematical space. Indeed, to any value of t (here time is an absolute parameter, as in Newtonian mechanics) is associated an automorphism α_t that allows one to deduce the observables at time t from the knowledge of the observables at time 0. Mathematically, one writes

L(t) = α_t(L(0)), X(t) = α_t(X(0))     (3)

and so on for the other observables. The automorphisms α_t, taken together for all values of t, form a one-parameter group. The term “group” is important, for it explains precisely why it still makes sense to talk about a flow. It refers to the additivity of the evolution: going from t to t′ is equivalent to going from t to t1, then from t1 to t′. Dividing the interval into small steps (t′−t)/n, where n is an integer, one finds in the limit of large n that going from t to t′ consists in flowing through n small variations, exactly as a geometric flow consists in going from a point x to a point y through a great number of infinitesimal variations (y−x)/n. That is why the time-evolution in quantum mechanics can be seen as a “flow” in the (abstract) space of automorphisms of the algebra of observables.

To summarize, in quantum mechanics time is still “something that flows”, although in a less intuitive manner than in relativity. The idea of “flow of time” makes sense, as a flow in an abstract space rather than a geometrical flow.
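Both properties, preservation of algebraic relations and the group law, can be illustrated on a toy model. The sketch below assumes a two-level system with a diagonal Hamiltonian and takes α_t(A) = U(t)† A U(t) with U(t) = exp(−iHt), the usual Heisenberg-picture form. The energies, the sample observables and the function names are assumptions for illustration, not taken from the text.

```python
import cmath


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_sub(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))


def dagger(a):
    """Conjugate transpose."""
    return tuple(tuple(a[j][i].conjugate() for j in range(2)) for i in range(2))


def alpha(t, a, energies=(0.3, 1.7)):
    """alpha_t(A) = U(t)^dagger A U(t), with U(t) = exp(-i H t) and H diagonal."""
    u = tuple(
        tuple(cmath.exp(-1j * energies[i] * t) if i == j else 0j for j in range(2))
        for i in range(2)
    )
    return mat_mul(mat_mul(dagger(u), a), u)


def combination(a, b):
    """A sample algebraic relation: the product-and-sum expression AB - BA."""
    return mat_sub(mat_mul(a, b), mat_mul(b, a))


def close(a, b, tol=1e-9):
    """Entrywise comparison up to floating-point tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# Two sample observables (Hermitian matrices) at time 0.
A = ((0, 1), (1, 0))
B = ((0, -1j), (1j, 0))
t, s, n = 0.8, 1.3, 100

# The relation defining AB - BA at time 0 still holds at time t.
preserves_relation = close(
    alpha(t, combination(A, B)), combination(alpha(t, A), alpha(t, B))
)

# Additivity: evolving by t and then by s equals evolving by t + s.
group_law = close(alpha(t + s, A), alpha(s, alpha(t, A)))

# Flowing through n small steps of size t / n reaches the same observable.
stepped = A
for _ in range(n):
    stepped = alpha(t / n, stepped)
many_small_steps = close(stepped, alpha(t, A))
print(preserves_relation, group_law, many_small_steps)
```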

Philosophizing Forgetful Functors: This Functor Forgets only Properties: Namely, the Property of Being Abelian + This Functor Forgets Both Structure (the generating set) and Properties (the property of being a free group).


A forgetful functor is a functor which is defined by ‘forgetting’ something. For example, the forgetful functor from Grp to Set forgets the group structure of a group, remembering only the underlying set.

In common parlance, the term ‘forgetful functor’ has no precise definition, being simply used whenever a functor is obviously defined by forgetting something. Many forgetful functors of this sort have left or right adjoints (and many are actually monadic or comonadic), leading to the paradigmatic adjunction “free ⊣ forgetful.”

On the other hand, from the perspective of stuff, structure, property, every functor is regarded as a forgetful functor and classified by how much it forgets (namely, stuff, structure, or properties). From this perspective, the forgetful functor from Grp to Set forgets the structure of a group and the property of admitting a group structure, as usual; but its left adjoint (the free group functor) is also forgetful: if you identify Set with the category of free groups with specified generators, then it forgets the structure of a set of free generators and the property of being free.

There are many cases in which we want to say that one kind of mathematical object has more structure than another kind of mathematical object. For instance, a topological space has more structure than a set. A Lie group has more structure than a smooth manifold. A ring has more structure than a group. And so on. In each of these cases, there is a sense in which the first sort of object – say, a topological space – results by taking an instance of the second sort – say, a set – and adding something more – in this case, a topology. In other cases, we want to say that two different kinds of mathematical objects have the same amount of structure. For instance, given a Boolean algebra, one can construct a special kind of topological space, known as a Stone space, from which one can uniquely reconstruct the original Boolean algebra; and vice-versa.

These sorts of relationships between mathematical objects are naturally captured in the language of category theory, via the notion of a forgetful functor. For instance, there is a functor F : Top → Set from the category Top, whose objects are topological spaces and whose arrows are continuous maps, to the category Set, whose objects are sets and whose arrows are functions. This functor takes every topological space to its underlying set, and it takes every continuous function to its underlying function. We say this functor is forgetful because, intuitively speaking, it forgets something: namely the choice of topology on a given set.

The idea of a forgetful functor is made precise by a classification of functors due to Baez et al. (2004). This requires some machinery. A functor F : C → D is said to be full if for every pair of objects A, B of C, the map F : hom(A, B) → hom(F (A), F (B)) induced by F is surjective, where hom(A, B) is the collection of arrows from A to B. Likewise, F is faithful if this induced map is injective for every such pair of objects. Finally, a functor is essentially surjective if for every object X of D, there exists some object A of C such that F(A) is isomorphic to X.

If a functor is full, faithful, and essentially surjective, we will say that it forgets nothing. A functor F : C → D is full, faithful, and essentially surjective if and only if it is essentially invertible, i.e., there exists a functor G : D → C such that G ◦ F : C → C is naturally isomorphic to 1C, the identity functor on C, and F ◦ G : D → D is naturally isomorphic to 1D. (Note, then, that G is also essentially invertible, and thus G also forgets nothing.) This means that for each object A of C, there is an isomorphism ηA : G ◦ F (A) → A such that for any arrow f : A → B in C, ηB ◦ G ◦ F(f) = f ◦ ηA, and similarly for every object of D. When two categories are related by a functor that forgets nothing, we say the categories are equivalent and that the pair F, G realizes an equivalence of categories.

Conversely, any functor that fails to be full, faithful, and essentially surjective forgets something. But functors can forget in different ways. A functor F : C → D forgets structure if it is not full; properties if it is not essentially surjective; and stuff if it is not faithful. Of course, “structure”, “property”, and “stuff” are technical terms in this context. But they are intended to capture our intuitive ideas about what it means for one kind of object to have more structure (resp., properties, stuff) than another. We can see this by considering some examples.

For instance, the functor F : Top → Set described above is faithful and essentially surjective, but not full, because not every function is continuous. So this functor forgets only structure, which is just the verdict we expected. Likewise, there is a functor G : AbGrp → Grp from the category AbGrp, whose objects are Abelian groups and whose arrows are group homomorphisms, to the category Grp, whose objects are (arbitrary) groups and whose arrows are group homomorphisms. This functor acts as the identity on the objects and arrows of AbGrp. It is full and faithful, but not essentially surjective because not every group is Abelian. So this functor forgets only properties: namely, the property of being Abelian. Finally, consider the unique functor H : Set → 1, where 1 is the category with one object and one arrow. This functor is full and essentially surjective, but it is not faithful, so it forgets only stuff: namely all of the elements of the sets, since we may think of 1 as the category whose only object is the empty set, which has exactly one automorphism. (Strictly speaking, fullness here requires hom(A, B) to be nonempty, which fails only when B is the empty set and A is not; restricting to nonempty sets removes this edge case.)
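The failure of fullness for Top → Set can be seen on the smallest interesting example. The sketch below assumes a two-point set carrying the indiscrete and the discrete topology; the names are hypothetical. It enumerates all functions between the underlying sets and keeps the continuous ones.

```python
from itertools import product

POINTS = (0, 1)

# Two topologies on the same underlying set, given as collections of open sets.
DISCRETE = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}
INDISCRETE = {frozenset(), frozenset({0, 1})}


def is_continuous(f, source_opens, target_opens):
    """f is continuous iff the preimage of every open set is open."""
    return all(
        frozenset(x for x in POINTS if f[x] in u) in source_opens
        for u in target_opens
    )


# All four functions {0, 1} -> {0, 1}, each stored as a dict.
all_functions = [dict(zip(POINTS, values)) for values in product(POINTS, repeat=2)]

# Arrows in Top from the indiscrete space to the discrete space.
continuous = [f for f in all_functions if is_continuous(f, INDISCRETE, DISCRETE)]

identity = {0: 0, 1: 1}
print(len(continuous), len(all_functions), identity in continuous)
```

Only the two constant functions are continuous, so the induced map on hom-sets is injective but misses two of the four functions: faithful, not full.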

In what follows, we will say that one sort of object has more structure (resp. properties, stuff) than another if there is a functor from the first category to the second that forgets structure (resp. properties, stuff). It is important to note, however, that comparisons of this sort must be relativized to a choice of functor. In many cases, there is an obvious functor to choose – i.e., a functor that naturally captures the standard of comparison in question. But there may be other ways of comparing mathematical objects that yield different verdicts.

For instance, there is a natural sense in which groups have more structure than sets, since any group may be thought of as a set of elements with some additional structure. This relationship is captured by a forgetful functor F : Grp → Set that takes groups to their underlying sets and group homomorphisms to their underlying functions. But any set also uniquely determines a group, known as the free group generated by that set; likewise, functions generate group homomorphisms between free groups. This relationship is captured by a different functor, G : Set → Grp, that takes every set to the free group generated by it and every function to the corresponding group homomorphism. This functor forgets both structure (the generating set) and properties (the property of being a free group). So there is a sense in which sets may be construed to have more structure than groups.

Galois Theor(y)/(em)

The most significant discovery of Galois is that under some hypotheses, there is a one-to-one correspondence between

1. subgroups of the Galois group Gal(E/F)

2. subfields M of E such that F ⊆ M.

The correspondence goes as follows:

To each intermediate subfield M, associate the group Gal(E/M) of all M-automorphisms of E:

G = Gal : {intermediate fields} → {subgroups of Gal(E/F)}

M ↦ G(M) = Gal(E/M)

To each subgroup H of Gal(E/F), associate the fixed subfield F(H):

F : {subgroups of Gal(E/F)} → {intermediate fields}

H ↦ F(H)

We will prove that, under the right hypotheses, we actually have a bijection (namely G is the inverse of F). First, an example.

Consider the field extension E = Q(i, √5)/Q. It has four Q-automorphisms, given by (it is enough to describe their actions on i and √5):
σ1 : i ↦ i, √5 ↦ √5
σ2 : i ↦ −i, √5 ↦ √5
σ3 : i ↦ i, √5 ↦ −√5
σ4 : i ↦ −i, √5 ↦ −√5
Thus Gal(E/Q) = {σ1, σ2, σ3, σ4}. The proper subgroups of Gal(E/Q) are {σ1}, {σ1, σ2}, {σ1, σ3}, {σ1, σ4}, and their corresponding fixed subfields are E, Q(√5), Q(i), Q(i√5). The whole group fixes Q, so the lattice of subfields has E at the top, the three quadratic fields in the middle, and Q at the bottom.
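The example can be checked mechanically. The sketch below assumes elements of E are written in the Q-basis 1, i, √5, i√5 and stored as 4-tuples of rationals, so that each automorphism acts by sign changes on the coordinates; the labels s1 to s4 and the function names are hypothetical. It verifies that the sign changes respect multiplication and reads off the fixed field of each subgroup.

```python
from fractions import Fraction

# a + b*i + c*sqrt5 + d*i*sqrt5 is stored as the tuple (a, b, c, d).
BASIS = ("1", "i", "sqrt5", "i*sqrt5")

SIGNS = {
    "s1": (1, 1, 1, 1),     # identity
    "s2": (1, -1, 1, -1),   # i -> -i, sqrt5 fixed
    "s3": (1, 1, -1, -1),   # sqrt5 -> -sqrt5, i fixed
    "s4": (1, -1, -1, 1),   # both signs flipped
}


def apply(name, x):
    """Apply the automorphism called name to the element x."""
    return tuple(s * c for s, c in zip(SIGNS[name], x))


def multiply(x, y):
    """Product in Q(i, sqrt5), using i^2 = -1 and sqrt5^2 = 5."""
    a, b, c, d = x
    e, f, g, h = y
    return (
        a * e - b * f + 5 * c * g - 5 * d * h,
        a * f + b * e + 5 * c * h + 5 * d * g,
        a * g + c * e - b * h - d * f,
        a * h + d * e + b * g + c * f,
    )


def fixed_basis(subgroup):
    """Basis elements fixed by every automorphism in the subgroup."""
    return [
        BASIS[k] for k in range(4) if all(SIGNS[name][k] == 1 for name in subgroup)
    ]


# Two arbitrary sample elements of E.
x = (Fraction(1), Fraction(2), Fraction(3), Fraction(4))
y = (Fraction(1, 2), Fraction(-1), Fraction(0), Fraction(2))

# Each sign pattern is a field automorphism on these samples.
respects_products = all(
    apply(name, multiply(x, y)) == multiply(apply(name, x), apply(name, y))
    for name in SIGNS
)

for subgroup in (("s1",), ("s1", "s2"), ("s1", "s3"), ("s1", "s4"), tuple(SIGNS)):
    print(subgroup, fixed_basis(subgroup))
```

A fixed field spanned by k basis elements has degree k over Q, so the subgroups of order 1, 2 and 4 correspond to subfields of degree 4, 2 and 1.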
Theorem: Let E/F be a finite Galois extension with Galois group G.
  1. The map F is a bijection from subgroups to intermediate fields, with inverse G.
  2. Consider the intermediate field K = F(H) which is fixed by H, and σ ∈ G. Then the intermediate field σK = {σ(x), x ∈ K}

    is fixed by σHσ^{-1}, namely σK = F(σHσ^{-1}).

    Proof: 1. We first consider the composition of maps H → F(H) → GF(H).

    We need to prove that GF(H) = H. Take σ in H, then σ fixes F(H) by definition and σ ∈ Gal(E/F(H)) = G(F(H)), showing that

    H ⊆ GF(H).

    To prove equality, we need to rule out the strict inclusion. If H were a proper subgroup of G(F(H)), by the above proposition the fixed field F(H) of H should properly contain the fixed field of GF(H) which is F(H) itself, a contradiction, showing that

    H = GF(H)

    Now consider the reverse composition of maps K → G(K) → FG(K)

    This time we need to prove that K = FG(K). But FG(K) is the fixed field of Gal(E/K), which is exactly K by the above proposition (its first point).

    2. It is enough to compute F(σHσ^{-1}) and show that it is actually equal to

    σK = σF(H).

    F(σHσ^{-1}) = {x ∈ E, στσ^{-1}(x) = x ∀ τ ∈ H} = {x ∈ E, τσ^{-1}(x) = σ^{-1}(x) ∀ τ ∈ H}

    = {x ∈ E, σ^{-1}(x) ∈ F(H)}

    = {x ∈ E, x ∈ σ(F(H))} = σ(F(H))

    We now look at subextensions of the finite Galois extension E/F and ask about their respective Galois groups.

    Theorem: Let E/F be a finite Galois extension with Galois group G. Let K be an intermediate subfield, fixed by the subgroup H.

    1. The extension E/K is Galois.

    2. The extension K/F is normal if and only if H is a normal subgroup of G.

    3. If H is a normal subgroup of G, then

    Gal(K/F ) ≃ G/H = Gal(E/F )/Gal(E/K).

    4. Whether K/F is normal or not, we have

    [K : F] = [G : H]


    Proof: 1. That E/K is Galois is immediate from the fact that a subextension E/K/F inherits normality and separability from E/F.

    2. For the second point, first note that σ is an F-monomorphism of K into E if and only if σ is the restriction to K of an element of G: if σ is an F-monomorphism of K into E, it can be extended to an F-monomorphism of E into itself thanks to the normality of E. Conversely, if τ is an F-automorphism of E, then σ = τ|K is surely an F-monomorphism of K into E.

    Now, this time by a characterization of a normal extension, we have

    K/F normal ⇐⇒ σ(K) = K ∀ σ ∈ G

    Since K = F(H), we just rewrite

    K/F normal ⇐⇒ σ(F(H)) = F(H) ∀ σ ∈ G.

    Now by the above theorem, we know that σ(F(H)) = F(σHσ^{-1}), and we have

    K/F normal ⇐⇒ F(σHσ^{-1}) = F(H) for all σ ∈ G

    We now use again the above theorem that tells us that F is invertible, with inverse G, to get the conclusion:

    K/F normal ⇐⇒ σHσ^{-1} = H ∀ σ ∈ G

    3. To prove the isomorphism of the third point, we will use the first isomorphism theorem for groups. Consider the group homomorphism

    Gal(E/F) → Gal(K/F), σ ↦ σ|K.

    This map is surjective and its kernel is given by

    Ker = {σ, σ|K = 1} = H = Gal(E/K).

    Applying the first isomorphism Theorem for groups, we get

    Gal(K/F ) ≃ Gal(E/F )/Gal(E/K)

    4. Finally, for the fourth point, by multiplicativity of the degrees:

    [E : F] = [E : K][K : F]

    Since E/F and E/K are Galois, we can rewrite |G| = |H|[K : F]. We conclude by Lagrange's theorem:

    [G:H]=|G|/|H|=[K :F]
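The group-theoretic side of the theorem can be explored by brute force. The sketch below is an illustration under an assumption not taken from the text: it uses the symmetric group S3, which is the Galois group of the splitting field of x^3 − 2 over Q (a standard fact), so that normal and non-normal subgroups both occur. It lists every subgroup H with its order, its index [G : H] and whether it is normal; the function names are hypothetical.

```python
from itertools import combinations, permutations

G = list(permutations(range(3)))  # the six elements of S3
IDENTITY = (0, 1, 2)


def compose(s, t):
    """Permutation s after t."""
    return tuple(s[t[i]] for i in range(3))


def inverse(s):
    """Inverse permutation."""
    inv = [0] * 3
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)


def subgroups():
    """All subsets containing the identity that are closed under composition."""
    others = [g for g in G if g != IDENTITY]
    found = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            h = {IDENTITY, *extra}
            # In a finite group, a nonempty closed subset is a subgroup.
            if all(compose(a, b) in h for a in h for b in h):
                found.append(frozenset(h))
    return found


def is_normal(h):
    """H is normal iff g x g^{-1} lies in H for all g in G and x in H."""
    return all(compose(compose(g, x), inverse(g)) in h for g in G for x in h)


# (order of H, index [G : H], normal?) for every subgroup H.
summary = sorted((len(h), len(G) // len(h), is_normal(h)) for h in subgroups())
for row in summary:
    print(row)
```

Under the correspondence, the three subgroups of order 2 have index 3 and are not normal, matching intermediate fields of degree 3 over Q that are not normal extensions, such as the field generated by the real cube root of 2.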