CUSUM, or the cumulative sum, is a statistic used for sequential change detection and monitoring. Let us introduce a measurable space (Ω, F), where Ω = R^{∞}, F = ∪_{n} F_{n} and F_{n} = σ{Y_{i}, i ∈ {0, 1, …, n}}. The law of the sequence Y_{i}, i = 1, 2, …, is defined by the family of probability measures {P_{v}}, v ∈ N*. In other words, the probability measure P_{v} for a given v > 0, playing the role of the change point, is the measure generated on Ω by the sequence Y_{i}, i = 1, 2, …, when the distribution of the Y_{i}'s changes at time v. The probability measures P_{0} and P_{∞} are the measures generated on Ω by the random variables Y_{i} when they are identically distributed throughout: under P_{∞} no change ever occurs, while under P_{0} the change occurs at time 0. In other words, the system defined by the sequence Y_{i} undergoes a "regime change" at the change point time v, from the pre-change distribution (that of P_{∞}) to the post-change distribution (that of P_{0}).

The CUSUM (cumulative sum control chart) statistic is defined as the maximum of the log-likelihood ratio of the measure P_{v} to the measure P_{∞} on the σ-algebra F_{n}. That is,

C_{n} := max_{0≤v≤n} log dP_{v}/dP_{∞}|_{F_{n}} —– (1)

is the CUSUM statistic on the σ-algebra F_{n}. The CUSUM statistic process is then the collection of the CUSUM statistics {C_{n}} of (1) for n = 1, 2, ….

The CUSUM stopping rule is then

T(h) := inf {n ≥ 0: max_{0≤v≤n} log dP_{v}/dP_{∞}|_{F_{n}} ≥ h} —– (2)

for some threshold h > 0. In the CUSUM stopping rule (2), the CUSUM statistic process of (1) is initialized at

C_{0} = 0 —– (3)

The CUSUM statistic process was first introduced by E. S. Page in the form that it takes when the sequence of random variables Y_{i} is independent and Gaussian; that is, Y_{i} ∼ N(μ, 1), i = 1, 2, …, with μ = μ_{0} for i < v and μ = μ_{1} for i ≥ v. Since its introduction by Page, the CUSUM statistic process of (1) and its associated CUSUM stopping time of (2) have been used in a plethora of applications where it is of interest to detect abrupt changes in the statistical behavior of observations in real time. Examples of such applications are signal processing, monitoring the outbreak of an epidemic, financial surveillance, and computer vision. The popularity of the CUSUM stopping time (2) is mainly due to its low complexity and its optimality properties in both discrete- and continuous-time models.
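As a concrete illustration of Page's Gaussian setting, the following sketch simulates a mean shift and runs the stopping rule (2), computing the statistic with the well-known recursion C_{n} = max(0, C_{n−1} + increment) rather than the maximization in (1). All numerical values here (μ_{0} = 0, μ_{1} = 2, σ = 1, change point v = 50, threshold h = 10) are illustrative choices, not taken from the text.

```python
import numpy as np

def cusum_alarm(y, mu0, mu1, sigma, h):
    """Return the first index n at which the CUSUM statistic, computed
    via the recursion C_n = max(0, C_{n-1} + increment), reaches the
    threshold h; return None if it never does."""
    c = 0.0
    center = 0.5 * (mu0 + mu1)          # average of pre- and post-change means
    scale = (mu1 - mu0) / sigma ** 2    # constant factor from (4)
    for n, yn in enumerate(y):
        c = max(0.0, c + scale * (yn - center))
        if c >= h:
            return n
    return None

rng = np.random.default_rng(0)          # fixed seed for reproducibility
v = 50                                  # illustrative change point
y = np.concatenate([rng.normal(0.0, 1.0, v),     # pre-change: N(mu0, sigma^2)
                    rng.normal(2.0, 1.0, 150)])  # post-change: N(mu1, sigma^2)
alarm = cusum_alarm(y, mu0=0.0, mu1=2.0, sigma=1.0, h=10.0)
print(alarm)                            # alarm index shortly after v = 50
```

With a shift this large relative to the noise, the rule raises an alarm a handful of observations after the change while (with very high probability) staying silent before it.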

Let Y_{i} ∼ N(μ_{0}, σ^{2}) change to Y_{i} ∼ N(μ_{1}, σ^{2}) at the change point time v. We now proceed to derive the form of the CUSUM statistic process (1) and its associated CUSUM stopping time (2). Let us denote by φ(x) = (1/√(2π)) e^{−x^{2}/2} the standard normal (Gaussian) density. For the sequence of random variables Y_{i} given earlier,

C_{n} := max_{0≤v≤n} log dP_{v}/dP_{∞}|_{F_{n}}

= max_{0≤v≤n} log [∏_{i=1}^{v−1} (1/σ)φ((Y_{i} − μ_{0})/σ) ∏_{i=v}^{n} (1/σ)φ((Y_{i} − μ_{1})/σ)] / [∏_{i=1}^{n} (1/σ)φ((Y_{i} − μ_{0})/σ)]

= (1/σ^{2}) max_{0≤v≤n} (μ_{1} − μ_{0}) ∑_{i=v}^{n} [Y_{i} − (μ_{1} + μ_{0})/2] —– (4)

In view of (3), let us initialize the sequence (4) at Y_{0} = (μ_{1} + μ_{0})/2 and distinguish two cases.
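A quick numerical sanity check, with arbitrary illustrative values, confirms that the maximization over v in (4), with the empty tail treated as 0 per the initialization (3), agrees with the standard one-pass recursion C_{n} = max(0, C_{n−1} + increment):

```python
# Illustrative parameters and data; any values would do for the check.
mu0, mu1, sigma = 0.0, 1.0, 1.0
center = 0.5 * (mu0 + mu1)
scale = (mu1 - mu0) / sigma ** 2
y = [0.7, 0.2, 1.4, -0.3, 0.9]

def cusum_direct(y):
    """Equation (4) verbatim: maximize the weighted tail sum over the
    candidate change points v, treating the empty tail as 0 per (3)."""
    tails = [scale * sum(yi - center for yi in y[v:]) for v in range(len(y))]
    return max(0.0, max(tails))

def cusum_recursive(y):
    """One-pass form: C_n = max(0, C_{n-1} + scale * (Y_n - center))."""
    c = 0.0
    for yn in y:
        c = max(0.0, c + scale * (yn - center))
    return c

assert abs(cusum_direct(y) - cusum_recursive(y)) < 1e-12
```

The recursion is what makes the CUSUM cheap in practice: each new observation updates the statistic in O(1) instead of re-maximizing over all candidate change points.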

a) μ_{1} > μ_{0}: dividing (4) by the positive constant (μ_{1} − μ_{0})/σ^{2}, absorbing it into the threshold, and using (2), we obtain the CUSUM stopping time T^{+}:

T^{+}(h^{+}) = inf {n ≥ 0: max_{0≤v≤n} ∑_{i=v}^{n} [Y_{i} − (μ_{1} + μ_{0})/2] ≥ h^{+}} —– (5)

for an appropriately scaled threshold h^{+} > 0.
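The one-sided rule (5) can be sketched directly in code, again via the recursive update; the function name and the test values below are illustrative:

```python
def t_plus(y, mu0, mu1, h_plus):
    """Stopping time (5): first n at which the running maximum of the
    tail sums reaches h_plus (computed recursively); None otherwise.
    Assumes mu1 > mu0."""
    c = 0.0
    center = 0.5 * (mu0 + mu1)
    for n, yn in enumerate(y):
        c = max(0.0, c + (yn - center))   # resets to 0 when the sum dips below
        if c >= h_plus:
            return n
    return None

# Five on-target samples followed by an upward level shift:
print(t_plus([0.0] * 5 + [3.0] * 5, mu0=0.0, mu1=2.0, h_plus=4.0))  # → 6
```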

b) μ_{1} < μ_{0}: writing (μ_{1} − μ_{0})[Y_{i} − (μ_{1} + μ_{0})/2] = (μ_{0} − μ_{1})[(μ_{1} + μ_{0})/2 − Y_{i}], dividing (4) by the positive constant (μ_{0} − μ_{1})/σ^{2}, absorbing it into the threshold, and using (2), we obtain the CUSUM stopping time T^{−}:

T^{−}(h^{−}) = inf {n ≥ 0: max_{0≤v≤n} ∑_{i=v}^{n} [(μ_{1} + μ_{0})/2 − Y_{i}] ≥ h^{−}} —– (6)

for an appropriately scaled threshold h^{−} > 0.

The sequences in (5) and (6) accumulate the deviations of the monitored sequential observations from the average of their pre- and post-change means. Although the stopping times (5) and (6) are derived here from formal CUSUM regime-change considerations, they may also be used as general nonparametric stopping rules applied directly to sequential observations.
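In such nonparametric use, (5) and (6) can be run side by side around a reference value c (standing in for (μ_{1} + μ_{0})/2), raising an alarm at the first crossing in either direction. A minimal sketch, with invented data and threshold:

```python
def two_sided_cusum(y, c, h):
    """Run the upward rule (5) and the downward rule (6) in parallel
    around a reference value c; return (n, direction) at the first
    threshold crossing, or None if neither side ever crosses."""
    up = down = 0.0
    for n, yn in enumerate(y):
        up = max(0.0, up + (yn - c))      # statistic of (5)
        down = max(0.0, down + (c - yn))  # statistic of (6)
        if up >= h:
            return n, "up"
        if down >= h:
            return n, "down"
    return None

# A downward level shift: the "down" side of the detector fires first.
print(two_sided_cusum([1.0, 1.1, 0.9, -2.0, -2.0, -2.0], c=1.0, h=5.0))
```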