Malicious Machine Learnings? Privacy Preservation and Computational Correctness Across Parties. Note Quote/Didactics.

Invincea_graph_DARPA

Multi-Party Computation deals with the following problem: There are n ≥ 2 parties P1, . . ., Pn where party Pi holds input ti, 1 ≤ i ≤ n, and they wish to compute together a functions = f (t1, . . . , tn) on their inputs. The goal is that each party will learn the output of the function, s, yet with the restriction that Pi will not learn any additional information about the input of the other parties aside from what can be deduced from the pair (ti, s). Clearly it is the secrecy restriction that adds complexity to the problem, as without it each party could announce its input to all other parties, and each party would locally compute the value of the function. Thus, the goal of Multi-Party Computation is to achieve the following two properties at the same time: correctness of the computation and privacy preservation of the inputs.

The following two generalizations are often useful:

(i) Probabilistic functions. Here the value of the function depends on some random string r chosen according to some distribution: s = f (t1, . . . , tn; r). An example of this is the coin-flipping functionality, which takes no inputs, and outputs an unbiased random bit. It is crucial that the value r is not controlled by any of the parties, but is somehow jointly generated during the computation.

(ii) Multioutput functions. It is not mandatory that there be a single output of the function. More generally there could be a unique output for each party, i.e., (s1, . . . , sn) = f(t1,…, tn). In this case, only party Pi learns the output si, and no other party learns any information about the other parties’ input and outputs aside from what can be derived from its own input and output.

One of the most interesting aspects of Multi-Party Computation is to reach the objective of computing the function value, but under the assumption that some of the parties may deviate from the protocol. In cryptography, the parties are usually divided into two types: honest and faulty. An honest party follows the protocol without any deviation. Otherwise, the party is considered to be faulty. The faulty behavior can exemplify itself in a wide range of possibilities. The most benign faulty behavior is where the parties follow the protocol, yet try to learn as much as possible about the inputs of the other parties. These parties are called honest-but-curious (or semihonest). At the other end of the spectrum, the parties may deviate from the prescribed protocol in any way that they desire, with the goal of either influencing the computed output value in some way, or of learning as much as possible about the inputs of the other parties. These parties are called malicious.

We envision an adversary A, who controls all the faulty parties and can coordinate their actions. Thus, in a sense we assume that the faulty parties are working together and can exert the most knowledge and influence over the computation out of this collusion. The adversary can corrupt any number of parties out of the n participating parties. Yet, in order to be able to achieve a solution to the problem, in many cases we would need to limit the number of corrupted parties. This limit is called the threshold k, indicating that the protocol remains secure as long as the number of corrupted parties is at most k.

Assume that there exists a trusted party who privately receives the inputs of all the participating parties, calculates the output value s, and then transmits this value to each one of the parties. This process clearly computes the correct output of f, and also does not enable the participating parties to learn any additional information about the inputs of others. We call this model the ideal model. The security of Multi-Party Computation then states that a protocol is secure if its execution satisfies the following: (1) the honest parties compute the same (correct) outputs as they would in the ideal model; and (2) the protocol does not expose more information than a comparable execution with the trusted party, in the ideal model.

Intuitively, the adversary’s interaction with the parties (on a vector of inputs) in the protocol generates a transcript. This transcript is a random variable that includes the outputs of all the honest parties, which is needed to ensure correctness, and the output of the adversary A. The latter output, without loss of generality, includes all the information that the adversary learned, including its inputs, private state, all the messages sent by the honest parties to A, and, depending on the model, maybe even include more information, such as public messages that the honest parties exchanged. If we show that exactly the same transcript distribution can be generated when interacting with the trusted party in the ideal model, then we are guaranteed that no information is leaked from the computation via the execution of the protocol, as we know that the ideal process does not expose any information about the inputs. More formally,

Let f be a function on n inputs and let π be a protocol that computes the function f. Given an adversary A, which controls some set of parties, we define REALA,π(t) to be the sequence of outputs of honest parties resulting from the execution of π on input vector t under the attack of A, in addition to the output of A. Similarly, given an adversary A′ which controls a set of parties, we define IDEALA′,f(t) to be the sequence of outputs of honest parties computed by the trusted party in the ideal model on input vector t, in addition to the output of A′. We say that π securely computes f if, for every adversary A as above, ∃ an adversary A′, which controls the same parties in the ideal model, such that, on any input vector t, we have that the distribution of REALA,π(t) is “indistinguishable” from the distribution of IDEALA′,f(t).

Intuitively, the task of the ideal adversary A′ is to generate (almost) the same output as A generates in the real execution or the real model. Thus, the attacker A′ is often called the simulator of A. The transcript value generated in the ideal model, IDEALA′,f(t), also includes the outputs of the honest parties (even though we do not give these outputs to A′), which we know were correctly computed by the trusted party. Thus, the real transcript REALA,π(t) should also include correct outputs of the honest parties in the real model.

We assumed that every party Pi has an input ti, which it enters into the computation. However, if Pi is faulty, nothing stops Pi from changing ti into some ti′. Thus, the notion of a “correct” input is defined only for honest parties. However, the “effective” input of a faulty party Pi could be defined as the value ti′ that the simulator A′ gives to the trusted party in the ideal model. Indeed, since the outputs of honest parties look the same in both models, for all effective purposes Pi must have “contributed” the same input ti′ in the real model.

Another possible misbehavior of Pi, even in the ideal model, might be a refusal to give any input at all to the trusted party. This can be handled in a variety of ways, ranging from aborting the entire computation to simply assigning ti some “default value.” For concreteness, we assume that the domain of f includes a special symbol ⊥ indicating this refusal to give the input, so that it is well defined how f should be computed on such missing inputs. What this requires is that in any real protocol we detect when a party does not enter its input and deal with it exactly in the same manner as if the party would input ⊥ in the ideal model.

As regards security, it is implicitly assumed that all honest parties receive the output of the computation. This is achieved by stating that IDEALA′,f(t) includes the outputs of all honest parties. We therefore say that our currency guarantees output delivery. A more relaxed property than output delivery is fairness. If fairness is achieved, then this means that if at least one (even faulty) party learns its outputs, then all (honest) parties eventually do too. A bit more formally, we allow the ideal model adversary A′ to instruct the trusted party not to compute any of the outputs. In this case, in the ideal model either all the parties learn the output, or none do. Since A’s transcript is indistinguishable from A′’s this guarantees that the same fairness guarantee must hold in the real model as well.

A further relaxation of the definition of security is to provide only correctness and privacy. This means that faulty parties can learn their outputs, and prevent the honest parties from learning theirs. Yet, at the same time the protocol will still guarantee that (1) if an honest party receives an output, then this is the correct value, and (2) the privacy of the inputs and outputs of the honest parties is preserved.

The basic security notions are universal and model-independent. However, specific implementations crucially depend on spelling out precisely the model where the computation will be carried out. In particular, the following issues must be specified:

  1. The faulty parties could be honest-but-curious or malicious, and there is usually an upper bound k on the number of parties that the adversary can corrupt.
  2. Distinguishing between the computational setting and the information theoretic setting, in the latter, the adversary is unlimited in its computing powers. Thus, the term “indistinguishable” is formalized by requiring the two transcript distributions to be either identical (so-called perfect security) or, at least, statistically close in their variation distance (so-called statistical security). On the other hand, in the computational, the power of the adversary (as well as that of the honest parties) is restricted. A bit more precisely, Multi-Party Computation problem is parameterized by the security parameter λ, in which case (a) all the computation and communication shall be done in time polynomial in λ; and (b) the misbehavior strategies of the faulty parties are also restricted to be run in time polynomial in λ. Furthermore, the term “indistinguishability” is formalized by computational indistinguishability: two distribution ensembles {Xλ}λ and {Yλ}λ are said to be computationally indistinguishable, if for any polynomial-time distinguisher D, the quantity ε, defined as |Pr[D(Xλ) = 1] − Pr[D(Yλ) = 1]|, is a “negligible” function of λ. This means that for any j > 0 and all sufficiently large λ, ε eventually becomes smaller than λ − j. This modeling helps us to build secure Multi-Party Computational protocols depending on plausible computational assumptions, such as the hardness of factoring large integers.
  3. The two common communication assumptions are the existence of a secure channel and the existence of a broadcast channel. Secure channels assume that every pair of parties Pi and Pj are connected via an authenticated, private channel. A broadcast channel is a channel with the following properties: if a party Pi (honest or faulty) broadcasts a message m, then m is correctly received by all the parties (who are also sure the message came from Pi). In particular, if an honest party receives m, then it knows that every other honest party also received m. A different communication assumption is the existence of envelopes. An envelope guarantees the following properties: a value m can be stored inside the envelope, it will be held without exposure for a given period of time, and then the value m will be revealed without modification. A ballot box is an enhancement of the envelope setting that also provides a random shuffling mechanism of the envelopes. These are, of course, idealized assumptions that allow for a clean description of a protocol, as they separate the communication issues from the computational ones. These idealized assumptions may be realized by a physical mechanisms, but in some settings such mechanisms may not be available. Then it is important to address the question if and under what circumstances we can remove a given communication assumption. For example, we know that the assumption of a secure channel can be substituted with a protocol, but under the introduction of a computational assumption and a public key infrastructure.
Advertisement

Fermi Surface Singularities

In ideal Fermi gases, the Fermi surface at p = pF = √2μm is the boundary in p-space between the occupied states (np = 1) at p2/2m < μ and empty states (np = 0) at p2/2m > μ. At this boundary (the surface in 3D momentum space) the energy is zero. What happens when the interaction between particles is introduced? Due to interaction the distribution function np of particles in the ground state is no longer exactly 1 or 0. However, it appears that the Fermi surface survives as the singularity in np. Such stability of the Fermi surface comes from a topological property of the one-particle Green’s function at imaginary frequency:

G-1 = iω – p2/2m + μ —– (1)

Let us for simplicity skip one spatial dimension pz so that the Fermi surface becomes the line in 2D momentum space (px,py); this does not change the co-dimension of zeroes which remains 1 = 3−2 = 2−1. The Green’s function has singularities lying on a closed line ω = 0, p2x + p2y = p2F in the 3D momentum-frequency space (ω,px,py). This is the line of the quantized vortex in the momentum space, since the phase Φ of the Green’s function G = |G|e changes by 2πN1 around the path embracing any element of this vortex line. In the considered case the phase winding number is N1 = 1. If we add the third momentum dimension pz the vortex line becomes the surface in the 4D momentum-frequency space (ω,px,py,pz) – the Fermi surface – but again the phase changes by 2π along any closed loop empracing the element of the 2D surface in the 4D momentum-frequency space.

Untitled

Fermi surface is a topological object in momentum space – a vortex loop. When the chemical potential μ decreases the loop shrinks and disappears at μ < 0. The point μ = T = 0 marks the Lifshitz transition between the gapless ground state at μ > 0 to the fully gapped vacuum at μ < 0.

The winding number cannot change by continuous deformation of the Green’s function: the momentum-space vortex is robust toward any perturbation. Thus the singularity of the Green’s function on the Fermi surface is preserved, even when interaction between fermions is introduced. The invariant is the same for any space dimension, since the co-dimension remains 1.

The Green function is generally a matrix with spin indices. In addition, it may have the band indices (in the case of electrons in the periodic potential of crystals). In such a case the phase of the Green’s function becomes meaning-less; however, the topological property of the Green’s function remains robust. The general analysis demonstrates that topologically stable Fermi surfaces are described by the group Z of integers. The winding number N1 is expressed analytically in terms of the Green’s function:

N= tr ∮C dl/2πi G(μ,p) ∂lG-1(μ,p) —– (2)

Here the integral is taken over an arbitrary contour C around the momentum- space vortex, and tr is the trace over the spin, band and/or other indices.

The Fermi surface cannot be destroyed by small perturbations, since it is protected by topology and thus is robust to perturbations. But the Fermi surface can be removed by large perturbations in the processes which reproduces the processes occurring for the real-space counterpart of the Fermi surface – the loop of quantized vortex in superfluids and superconductors. The vortex ring can continuously shrink to a point and then disappear, or continuously expand and leave the momentum space. The first scenario occurs when one continuously changes the chemical potential from the positive to the negative value: at μ < 0 there is no vortex loop in momentum space and the ground state (vacuum) is fully gapped. The point μ = 0 marks the quantum phase transition – the Lifshitz transition – at which the topology of the energy spectrum changes. At this transition the symmetry of the ground state does not changes. The second scenario of the quantum phase transition to the fully gapped states occurs when the inverse mass 1/m in (1) crosses zero.

Similar Lifshitz transitions from the fully gapped state to the state with the Fermi surface may occur in superfluids and superconductors. This happens, for example, when the superfluid velocity crosses the Landau critical velocity. The symmetry of the order parameter does not change across such a quantum phase transition. In the non-superconduting states, the transition from the gapless to gapped state is the metal-insulator transition.

Untitled

The Lifshitz transitions involving the vortex lines in p-space may occur be- tween the gapless states. They are accompanied by the change of the topology of the Fermi surface itself. The simplest example of such a phase transition discussed in terms of the vortex lines is provided by the reconnection of the vortex lines.

Untitled

 

Weil Conjectures. Note Quote.

2

Solving Diophantine equations, that is giving integer solutions to polynomials, is often unapproachably difficult. Weil describes one strategy in a letter to his sister, the philosopher Simone Weil: Look for solutions in richer fields than the rationals, perhaps fields of rational functions over the complex numbers. But these are quite different from the integers:

We would be badly blocked if there were no bridge between the two. And voilà god carries the day against the devil: this bridge exists; it is the theory of algebraic function fields over a finite field of constants.

A solution modulo 5 to a polynomial P(X,Y,..Z) is a list of integers X,Y,..Z making the value P(X,Y,..Z) divisible by 5, or in other words equal to 0 modulo 5. For example, X2 + Y2 − 3 has no integer solutions. That is clear since X and Y would both have to be 0 or ±1, to keep their squares below 3, and no combination of those works. But it has solutions modulo 5 since, among others, 32 + 32 − 3 = 15 is divisible by 5. Solutions modulo a given prime p are easier to find than integer solutions and they amount to the same thing as solutions in the finite field of integers modulo p.

To see if a list of polynomial equations Pi(X, Y, ..Z) = 0 have a solution modulo p we need only check p different values for each variable. Even if p is impractically large, equations are more manageable modulo p. Going farther, we might look at equations modulo p, but allow some irrationals, and ask how the number of solutions grows as we allow irrationals of higher and higher degree—roots of quadratic polynomials, roots of cubic polynomials, and so on. This is looking for solutions in all finite fields, as in Weil’s letter.

The key technical points about finite fields are: For each prime number p, the field of integers modulo p form a field, written Fp. For each natural number r > 0 there is (up to isomorphism) just one field with pr elements, written as Fpr or as Fq with q = pr. This comes from Fp by adjoining the roots of a degree r polynomial. These are all the finite fields. Trivially, then, for any natural number s > 0 there is just one field with qs elements, namely Fp(r+s) which we may write Fqs. The union for all r of the Fpr is the algebraic closure Fp. By Galois theory, roots for polynomials in Fpr, are fixed points for the r-th iterate of the Frobenius morphism, that is for the map taking each x ∈ Fp to xpr.

Take any good n-dimensional algebraic space (any smooth projective variety of dimension n) defined by integer polynomials on a finite field Fq. For each s ∈ N, let Ns be the number of points defined on the extension field F(qs). Define the zeta function Z(t) as an exponential using a formal variable t:

Z(t) = exp(∑s=1Nsts/s)

The first Weil conjecture says Z(t) is a rational function:

Z(t) = P(t)/Q(t)

for integer polynomials P(t) and Q(t). This is a strong constraint on the numbers of solutions Ns. It means there are complex algebraic numbers a1 . . . ai and b1 . . . bj such that

Ns =(as1 +…+ asi) − (bs1 +…+ bsj)

And each algebraic conjugate of an a (resp. b) also an a (resp. b).

The second conjecture is a functional equation:

Z(1/qnt) = ± qnE/2tEZ(t)

This says the operation x → qn/x permutes the a’s (resp. the b’s).The third is a Riemann Hypothesis

Z(t) = (P1(t)P3(t) · · · P2n−1(t))/(P0(t)P2(t) · · · P2n(t))

where each Pk is an integer polynomial with all roots of absolute value q−k/2. That means each a has absolute value qk for some 0 ≤ k ≤ n. Each b has absolute value q(2k−1)/2 for some 0 ≤ k ≤ n.

Over it all is the conjectured link to topology. Let B0, B1, . . . B2n be the Betti numbers of the complex manifold defined by the same polynomials. That is, each Bk gives the number of k-dimensional holes or handles on the continuous space of complex number solutions to the equations. And recall an algebraically n-dimensional complex manifold is topologically 2n-dimensional. Then each Pk has degree Bk. And E is the Euler number of the manifold, the alternating sum

k=02n (−1)kBk

On its face the topology of a continuous manifold is worlds apart from arithmetic over finite fields. But the topology of this manifold tells how many a’s and b’s there are with each absolute value. This implies useful numerical approximations to the numbers of roots Ns. Special cases of these conjectures, with aspects of the topology, were proved before Weil, and he proved more. All dealt with curves (1-dimensional) or hypersurfaces (defined by a single polynomial).

Weil presented the topology as motivating the conjectures for higher dimensional varieties. He especially pointed out how the whole series of conjectures would follow quickly if we could treat the spaces of finite field solutions as topological manifolds. The topological strategy was powerfully seductive but seriously remote from existing tools. Weil’s arithmetic spaces were not even precisely defined. To all appearances they would be finite or (over the algebraic closures of the finite fields) countable and so everywhere discontinuous. Topological manifold methods could hardly apply.

Cantorian Diagonal Slash

What is Cantor’s diagonal slash? This is often considered to be an absurd argument from physics point of view. Why is that so? The argument says that “infinity of reals is uncountable, and infinity of integers is countable”, thus giving us two different quantifiable infinities.

diagonal_argument_powerset_svg-jpg

What happens if we apply Cantor’s diagonal slash to the integers? How can this be done? Think of the infinite binary tree, which starts at ground level as the trunk and splits in two (bifurcates) as we ascend; if we take the left branch then we assign 0 to the first binary digit, if the right branch then 1. At the next level we do the same….0 if we go left, 1 if we go right…. and we ascend the binary tree all the way to infinity. In this way every possible infinite sequence of binary digits is represented somehow or other (mapped out) by a path up the tree; and every path up the tree corresponds to a unique infinite sequence of binary digits….“the number of routes up the tree is equal to 2n (where n is the number of levels), and the number of nodes (branches) in the tree is equal to (2n)-1.”

Now we can arrange all of these paths in an infinite x infinite binary matrix, of form similar to the infinite x infinite matrix of real numbers that Cantor used for his diagonal slash. What happens if we perform Cantor’s diagonal slash on this infinite binary matrix? According to Cantor, we produce a new infinite binary sequence which is NOT contained within the original matrix (nor is it contained on the infinite binary tree). But how can this be? The matrix contains every possible route up the binary tree, there IS no other route not contained in the matrix. What does this show? That Cantor’s diagonal slash argument is meaningless when applied to infinite matrices.