Malicious Machine Learnings? Privacy Preservation and Computational Correctness Across Parties. Note Quote/Didactics.

Invincea_graph_DARPA

Multi-Party Computation deals with the following problem: There are n ≥ 2 parties P1, . . ., Pn where party Pi holds input ti, 1 ≤ i ≤ n, and they wish to compute together a functions = f (t1, . . . , tn) on their inputs. The goal is that each party will learn the output of the function, s, yet with the restriction that Pi will not learn any additional information about the input of the other parties aside from what can be deduced from the pair (ti, s). Clearly it is the secrecy restriction that adds complexity to the problem, as without it each party could announce its input to all other parties, and each party would locally compute the value of the function. Thus, the goal of Multi-Party Computation is to achieve the following two properties at the same time: correctness of the computation and privacy preservation of the inputs.

The following two generalizations are often useful:

(i) Probabilistic functions. Here the value of the function depends on some random string r chosen according to some distribution: s = f (t1, . . . , tn; r). An example of this is the coin-flipping functionality, which takes no inputs, and outputs an unbiased random bit. It is crucial that the value r is not controlled by any of the parties, but is somehow jointly generated during the computation.

(ii) Multioutput functions. It is not mandatory that there be a single output of the function. More generally there could be a unique output for each party, i.e., (s1, . . . , sn) = f(t1,…, tn). In this case, only party Pi learns the output si, and no other party learns any information about the other parties’ input and outputs aside from what can be derived from its own input and output.

One of the most interesting aspects of Multi-Party Computation is to reach the objective of computing the function value, but under the assumption that some of the parties may deviate from the protocol. In cryptography, the parties are usually divided into two types: honest and faulty. An honest party follows the protocol without any deviation. Otherwise, the party is considered to be faulty. The faulty behavior can exemplify itself in a wide range of possibilities. The most benign faulty behavior is where the parties follow the protocol, yet try to learn as much as possible about the inputs of the other parties. These parties are called honest-but-curious (or semihonest). At the other end of the spectrum, the parties may deviate from the prescribed protocol in any way that they desire, with the goal of either influencing the computed output value in some way, or of learning as much as possible about the inputs of the other parties. These parties are called malicious.

We envision an adversary A, who controls all the faulty parties and can coordinate their actions. Thus, in a sense we assume that the faulty parties are working together and can exert the most knowledge and influence over the computation out of this collusion. The adversary can corrupt any number of parties out of the n participating parties. Yet, in order to be able to achieve a solution to the problem, in many cases we would need to limit the number of corrupted parties. This limit is called the threshold k, indicating that the protocol remains secure as long as the number of corrupted parties is at most k.

Assume that there exists a trusted party who privately receives the inputs of all the participating parties, calculates the output value s, and then transmits this value to each one of the parties. This process clearly computes the correct output of f, and also does not enable the participating parties to learn any additional information about the inputs of others. We call this model the ideal model. The security of Multi-Party Computation then states that a protocol is secure if its execution satisfies the following: (1) the honest parties compute the same (correct) outputs as they would in the ideal model; and (2) the protocol does not expose more information than a comparable execution with the trusted party, in the ideal model.

Intuitively, the adversary’s interaction with the parties (on a vector of inputs) in the protocol generates a transcript. This transcript is a random variable that includes the outputs of all the honest parties, which is needed to ensure correctness, and the output of the adversary A. The latter output, without loss of generality, includes all the information that the adversary learned, including its inputs, private state, all the messages sent by the honest parties to A, and, depending on the model, maybe even include more information, such as public messages that the honest parties exchanged. If we show that exactly the same transcript distribution can be generated when interacting with the trusted party in the ideal model, then we are guaranteed that no information is leaked from the computation via the execution of the protocol, as we know that the ideal process does not expose any information about the inputs. More formally,

Let f be a function on n inputs and let π be a protocol that computes the function f. Given an adversary A, which controls some set of parties, we define REALA,π(t) to be the sequence of outputs of honest parties resulting from the execution of π on input vector t under the attack of A, in addition to the output of A. Similarly, given an adversary A′ which controls a set of parties, we define IDEALA′,f(t) to be the sequence of outputs of honest parties computed by the trusted party in the ideal model on input vector t, in addition to the output of A′. We say that π securely computes f if, for every adversary A as above, ∃ an adversary A′, which controls the same parties in the ideal model, such that, on any input vector t, we have that the distribution of REALA,π(t) is “indistinguishable” from the distribution of IDEALA′,f(t).

Intuitively, the task of the ideal adversary A′ is to generate (almost) the same output as A generates in the real execution or the real model. Thus, the attacker A′ is often called the simulator of A. The transcript value generated in the ideal model, IDEALA′,f(t), also includes the outputs of the honest parties (even though we do not give these outputs to A′), which we know were correctly computed by the trusted party. Thus, the real transcript REALA,π(t) should also include correct outputs of the honest parties in the real model.

We assumed that every party Pi has an input ti, which it enters into the computation. However, if Pi is faulty, nothing stops Pi from changing ti into some ti′. Thus, the notion of a “correct” input is defined only for honest parties. However, the “effective” input of a faulty party Pi could be defined as the value ti′ that the simulator A′ gives to the trusted party in the ideal model. Indeed, since the outputs of honest parties look the same in both models, for all effective purposes Pi must have “contributed” the same input ti′ in the real model.

Another possible misbehavior of Pi, even in the ideal model, might be a refusal to give any input at all to the trusted party. This can be handled in a variety of ways, ranging from aborting the entire computation to simply assigning ti some “default value.” For concreteness, we assume that the domain of f includes a special symbol ⊥ indicating this refusal to give the input, so that it is well defined how f should be computed on such missing inputs. What this requires is that in any real protocol we detect when a party does not enter its input and deal with it exactly in the same manner as if the party would input ⊥ in the ideal model.

As regards security, it is implicitly assumed that all honest parties receive the output of the computation. This is achieved by stating that IDEALA′,f(t) includes the outputs of all honest parties. We therefore say that our currency guarantees output delivery. A more relaxed property than output delivery is fairness. If fairness is achieved, then this means that if at least one (even faulty) party learns its outputs, then all (honest) parties eventually do too. A bit more formally, we allow the ideal model adversary A′ to instruct the trusted party not to compute any of the outputs. In this case, in the ideal model either all the parties learn the output, or none do. Since A’s transcript is indistinguishable from A′’s this guarantees that the same fairness guarantee must hold in the real model as well.

A further relaxation of the definition of security is to provide only correctness and privacy. This means that faulty parties can learn their outputs, and prevent the honest parties from learning theirs. Yet, at the same time the protocol will still guarantee that (1) if an honest party receives an output, then this is the correct value, and (2) the privacy of the inputs and outputs of the honest parties is preserved.

The basic security notions are universal and model-independent. However, specific implementations crucially depend on spelling out precisely the model where the computation will be carried out. In particular, the following issues must be specified:

  1. The faulty parties could be honest-but-curious or malicious, and there is usually an upper bound k on the number of parties that the adversary can corrupt.
  2. Distinguishing between the computational setting and the information theoretic setting, in the latter, the adversary is unlimited in its computing powers. Thus, the term “indistinguishable” is formalized by requiring the two transcript distributions to be either identical (so-called perfect security) or, at least, statistically close in their variation distance (so-called statistical security). On the other hand, in the computational, the power of the adversary (as well as that of the honest parties) is restricted. A bit more precisely, Multi-Party Computation problem is parameterized by the security parameter λ, in which case (a) all the computation and communication shall be done in time polynomial in λ; and (b) the misbehavior strategies of the faulty parties are also restricted to be run in time polynomial in λ. Furthermore, the term “indistinguishability” is formalized by computational indistinguishability: two distribution ensembles {Xλ}λ and {Yλ}λ are said to be computationally indistinguishable, if for any polynomial-time distinguisher D, the quantity ε, defined as |Pr[D(Xλ) = 1] − Pr[D(Yλ) = 1]|, is a “negligible” function of λ. This means that for any j > 0 and all sufficiently large λ, ε eventually becomes smaller than λ − j. This modeling helps us to build secure Multi-Party Computational protocols depending on plausible computational assumptions, such as the hardness of factoring large integers.
  3. The two common communication assumptions are the existence of a secure channel and the existence of a broadcast channel. Secure channels assume that every pair of parties Pi and Pj are connected via an authenticated, private channel. A broadcast channel is a channel with the following properties: if a party Pi (honest or faulty) broadcasts a message m, then m is correctly received by all the parties (who are also sure the message came from Pi). In particular, if an honest party receives m, then it knows that every other honest party also received m. A different communication assumption is the existence of envelopes. An envelope guarantees the following properties: a value m can be stored inside the envelope, it will be held without exposure for a given period of time, and then the value m will be revealed without modification. A ballot box is an enhancement of the envelope setting that also provides a random shuffling mechanism of the envelopes. These are, of course, idealized assumptions that allow for a clean description of a protocol, as they separate the communication issues from the computational ones. These idealized assumptions may be realized by a physical mechanisms, but in some settings such mechanisms may not be available. Then it is important to address the question if and under what circumstances we can remove a given communication assumption. For example, we know that the assumption of a secure channel can be substituted with a protocol, but under the introduction of a computational assumption and a public key infrastructure.

Fragmentation – Lit and Dark Electronic Exchanges. Thought of the Day 116.0

Untitled

Exchanges also control the amount and degree of granularity of the information you receive (e.g., you can use the consolidated/public feed at a low cost or pay a relatively much larger cost for direct/proprietary feeds from the exchanges). They also monetise the need for speed by renting out computer/server space next to their matching engines, a process called colocation. Through coloca­tion, exchanges can provide uniform service to trading clients at competitive rates. Having the traders’ trading engines at a common location owned by the exchange simplifies the exchange’s ability to provide uniform service as it can control the hardware connecting each client to the trading engine, the cable (so all have the same cable of the same length), and the network. This ensures that all traders in colocation have the same fast access, and are not disadvantaged (at least in terms of exchange-provided hardware). Naturally, this imposes a clear distinction between traders who are colocated and those who are not. Those not colocated will always have a speed disadvantage. It then becomes an issue for reg­ulators who have to ensure that exchanges keep access to colocation sufficiently competitive.

The issue of distance from the trading engine brings us to another key dimen­sion of trading nowadays, especially in US equity markets, namely fragmentation. A trader in US equities markets has to be aware that there are up to 13 lit electronic exchanges and more than 40 dark ones. Together with this wide range of trading options, there is also specific regulation (the so-called ‘trade-through’ rules) which affects what happens to market orders sent to one exchange if there are better execution prices at other exchanges. The interaction of multiple trading venues, latency when moving be­tween these venues, and regulation introduces additional dimensions to keep in mind when designing success l trading strategies.

The role of time is fundamental in the usual price-time priority electronic ex­change, and in a fragmented market, the issue becomes even more important. Traders need to be able to adjust their trading positions fast in response to or in anticipation of changes in market circumstances, not just at the local exchange but at other markets as well. The race to be the first in or out of a certain position is one of the focal points of the debate on the benefits and costs of ‘high-frequency trading’.

The importance of speed permeates the whole process of designing trading algorithms, from the actual code, to the choice of programming language, to the hardware it is implemented on, to the characteristics of the connection to the matching engine, and the way orders are routed within an exchange and between exchanges. Exchanges, being aware of the importance of speed, have adapted and, amongst other things, moved well beyond the basic two types of orders (Market Orders and Limit Orders). Any trader should be very well-informed regarding all the different order types available at the exchanges, what they are and how they may be used.

When coding an algorithm one should be very aware of all the possible types of orders allowed, not just in one exchange, but in all competing exchanges where one’s asset of interest is traded. Being uninformed about the variety of order types can lead to significant losses. Since some of these order types allow changes and adjustments at the trading engine level, they cannot be beaten in terms of latency by the trader’s engine, regardless of how efficiently your algorithms are coded and hardwired.

Untitled

Another important issue to be aware of is that trading in an exchange is not free, but the cost is not the same for all traders. For example, many exchanges run what is referred to as a maker-taker system of fees whereby a trader sending an MO (and hence taking liquidity away from the market) pays a trading fee, while a trader whose posted LO is filled by the MO (that is, the LO with which the MO is matched) will a pay much lower trading fee, or even receive a payment (a rebate) from the exchange for providing liquidity (making the market). On the other hand, there are markets with an inverted fee schedule, a taker-maker system where the fee structure is the reverse: those providing liquidity pay a higher fee than those taking liquidity (who may even get a rebate). The issue of exchange fees is quite important as fees distort observed market prices (when you make a transaction the relevant price for you is the net price you pay/receive, which is the published price net of fees).

String’s Depth of Burial

string_conversion_03.2013

A string’s depth might be defined as the execution time of its minimal program.

The difficulty with this definition arises in cases where the minimal program is only a few bits smaller than some much faster program, such as a print program, to compute the same output x. In this case, slight changes in x may induce arbitrarily large changes in the run time of the minimal program, by changing which of the two competing programs is minimal. Analogous instability manifests itself in translating programs from one universal machine to another. This instability emphasizes the essential role of the quantity of buried redundancy, not as a measure of depth, but as a certifier of depth. In terms of the philosophy-of-science metaphor, an object whose minimal program is only a few bits smaller than its print program is like an observation that points to a nontrivial hypothesis, but with only a low level of statistical confidence.

To adequately characterize a finite string’s depth one must therefore consider the amount of buried redundancy as well as the depth of its burial. A string’s depth at significance level s might thus be defined as that amount of time complexity which is attested by s bits worth of buried redundancy. This characterization of depth may be formalized in several ways.

A string’s depth at significance level s be defined as the time required to compute the string by a program no more than s bits larger than the minimal program.

This definition solves the stability problem, but is unsatisfactory in the way it treats multiple programs of the same length. Intuitively, 2k distinct (n + k)-bit programs that compute same output ought to be accorded the same weight as one n-bit program; but, by the present definition, they would be given no more weight than one (n + k)-bit program.

A string’s depth at signicifcance level s depth might be defined as the time t required for the string’s time-bounded algorithmic probability Pt(x) to rise to within a factor 2−s of its asymptotic time-unbounded value P(x).

This formalizes the notion that for the string to have originated by an effective process of t steps or fewer is less plausible than for the first s tosses of a fair coin all to come up heads.

It is not known whether there exist strings that are deep, in other words, strings having no small fast programs, even though they have enough large fast programs to contribute a significant fraction of their algorithmic probability. Such strings might be called deterministically deep but probabilistically shallow, because their chance of being produced quickly in a probabilistic computation (e.g. one where the input bits of U are supplied by coin tossing) is significant compared to their chance of being produced slowly. The question of whether such strings exist is probably hard to answer because it does not relativize uniformly. Deterministic and probabilistic depths are not very different relative to a random coin-toss oracle A of the equality of random-oracle-relativized deterministic and probabilistic polynomial time complexity classes; but they can be very different relative to an oracle B deliberately designed to hide information from deterministic computations (this parallels Hunt’s proof that deterministic and probabilistic polynomial time are unequal relative to such an oracle).

(Depth of Finite Strings): Let x and w be strings and s a significance parameter. A string’s depth at significance level s, denoted Ds(x), will be defined as min{T(p) : (|p|−|p| < s)∧(U(p) = x)}, the least time required to compute it by a s-incompressible program. At any given significance level, a string will be called t-deep if its depth exceeds t, and t-shallow otherwise.

The difference between this definition and the previous one is rather subtle philosophically and not very great quantitatively. Philosophically, when each individual hypothesis for the rapid origin of x is implausible at the 2−s confidence level, then it requires only that a weighted average of all such hypotheses be implausible.

There exist constants c1 and c2 such that for any string x, if programs running in time ≤ t contribute a fraction between 2−s and 2−s+1 of the string’s total algorithmic probability, then x has depth at most t at significance level s + c1 and depth at least t at significance level s − min{H(s), H(t)} − c2.

Proof : The first part follows easily from the fact that any k-compressible self-delimiting program p is associated with a unique, k − O(1) bits shorter, program of the form “execute the result of executing p∗”. Therefore there exists a constant c1 such that if all t-fast programs for x were s + c1– compressible, the associated shorter programs would contribute more than the total algorithmic probability of x. The second part follows because, roughly, if fast programs contribute only a small fraction of the algorithmic probability of x, then the property of being a fast program for x is so unusual that no program having that property can be random. More precisely, the t-fast programs for x constitute a finite prefix set, a superset S of which can be computed by a program of size H(x) + min{H(t), H(s)} + O(1) bits. (Given x∗ and either t∗ or s∗, begin enumerating all self-delimiting programs that compute x, in order of increasing running time, and quit when either the running time exceeds t or the accumulated measure of programs so far enumerated exceeds 2−(H(x)−s)). Therefore there exists a constant c2 such that, every member of S, and thus every t-fast program for x, is compressible by at least s − min{H(s), H(t)} − O(1) bits.

The ability of universal machines to simulate one another efficiently implies a corresponding degree of machine-independence for depth: for any two efficiently universal machines of the sort considered here, there exists a constant c and a linear polynomial L such that for any t, strings whose (s+c)-significant depth is at least L(t) on one machine will have s-significant depth at least t on the other.

Depth of one string relative to another may be defined analogously, and represents the plausible time required to produce one string, x, from another, w.

(Relative Depth of Finite Strings): For any two strings w and x, the depth of x relative to w at significance level s, denoted Ds(x/w), will be defined as min{T(p, w) : (|p|−|(p/w)∗| < s)∧(U(p, w) = x)}, the least time required to compute x from w by a program that is s-incompressible relative to w.

Depth of a string relative to its length is a particularly useful notion, allowing us, as it were, to consider the triviality or nontriviality of the “content” of a string (i.e. its bit sequence), independent of its “form” (length). For example, although the infinite sequence 000… is intuitively trivial, its initial segment 0n is deep whenever n is deep. However, 0n is always shallow relative to n, as is, with high probability, a random string of length n.

In order to adequately represent the intuitive notion of stored mathematical work, it is necessary that depth obey a “slow growth” law, i.e. that fast deterministic processes be unable to transform a shallow object into a deep one, and that fast probabilistic processes be able to do so only with low probability.

(Slow Growth Law): Given any data string x and two significance parameters s2 > s1, a random program generated by coin tossing has probability less than 2−(s2−s1)+O(1) of transforming x into an excessively deep output, i.e. one whose s2-significant depth exceeds the s1-significant depth of x plus the run time of the transforming program plus O(1). More precisely, there exist positive constants c1, c2 such that for all strings x, and all pairs of significance parameters s2 > s1, the prefix set {q : Ds2(U(q, x)) > Ds1(x) + T(q, x) + c1} has measure less than 2−(s2−s1)+c2.

Proof: Let p be a s1-incompressible program which computes x in time Ds1(x), and let r be the restart prefix mentioned in the definition of the U machine. Let Q be the prefix set {q : Ds2(U(q, x)) > T(q, x) + Ds1(x) + c1}, where the constant c1 is sufficient to cover the time overhead of concatenation. For all q ∈ Q, the program rpq by definition computes some deep result U(q, x) in less time than that result’s own s2-significant depth, and so rpq must be compressible by s2 bits. The sum of the algorithmic probabilities of strings of the form rpq, where q ∈ Q, is therefore

Σq∈Q P(rpq)< Σq∈Q 2−|rpq| + s2 = 2−|r|−|p|+s2 μ(Q)

On the other hand, since the self-delimiting program p can be recovered from any string of the form rpq (by deleting r and executing the remainder pq until halting occurs, by which time exactly p will have been read), the algorithmic probability of p is at least as great (within a constant factor) as the sum of the algorithmic probabilities of the strings {rpq : q ∈ Q} considered above:

P(p) > μ(Q) · 2−|r|−|p|+s2−O(1)

Recalling the fact that minimal program size is equal within a constant factor to the −log of algorithmic probability, and the s1-incompressibility of p, we have P(p) < 2−(|p|−s1+O(1)), and therefore finally

μ(Q) < 2−(s2−s1)+O(1), which was to be demonstrated.

Main Stuxnet DLL: Installing Stuxnet into the Infected Machine

stuxnet_tweets_per-week2

When the main DLL begins the execution. It unupx itself (as the DLL is upxed) and then checks the configuration data of this stuxnet sample and checks the environment to choose if it will continue or exit from the beginning.

It checks if the configuration data is correct and recent and then it checks the admin rights. If it’s not running on administrator level, it uses one of two zero-day vulnerabilities to escalate the privileges and run in the administrator level.

CVE-2010-2743(MS-10-073) – Win32K.sys Keyboard Layout Vulnerability CVE-xxxx-xxxx(MS-xx-xxx) – Windows Task Scheduler Vulnerability

These two vulnerabilities allow the worm to escalate the privileges and run in a new process (“csrss.exe” in case of Win32K.sys) or as a new task in the Task Scheduler case.

It makes also some other checks like checking on 64bits or 32bits and so on.

After everything goes right and the environment is prepared to be infected by stuxnet, it injects itself into another process to install itself from that process. The injection begins by searching for an Antivirus application installed in the machine.

Depending on the antivirus application (AVP or McAfee or what?), stuxnet chooses the process to inject itself into. If there’s no antivirus program it chooses “lsass.exe”….

The Function #16 begins by checking the configuration data and be sure that everything is ready to begin the installation. And also, it checks if the there’s a value in the registry with this name “NTVDM TRACE” in

SOFTWARE\Microsoft\Windows\CurrentVersion\MS-DOS Emulation

And then, it checks if this value equal to “19790509”. This special number seems a date “May 9, 1979” and this date has a historical meaning “Habib Elghanian was executed by a firing squad in Tehran sending shock waves through the closely knit Iranian Jewish community”….

Stuxnet_Malware_Analysis_Paper