1. Itô Integrals with Respect to Brownian Motion

Stochastic integrals with respect to Brownian motion [3][4] are fundamental objects in stochastic calculus, enabling the rigorous treatment of stochastic differential equations and random processes in continuous time. Unlike ordinary calculus, where integration is performed with respect to smooth functions, stochastic integration involves integrating with respect to Brownian motion, a process with infinite variation (but finite quadratic variation) and nowhere-differentiable paths.

As with the other posts, this one records the step-by-step process for working with Itô integrals.


1.1 The Itô Integral: Construction and Definition

1.1.1 Mathematical Framework

Let \(W(t)\) be a standard Brownian motion defined on a filtered probability space \((\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geq 0}, \mathbb{P})\). We want to define and analyze integrals of the form:

\[ \begin{equation} \int_s^t f(u)\,dW(u) \label{eq:ito_integral_notation} \end{equation} \]

where \(f(u)\) is either a deterministic function of time or an adapted stochastic process (i.e., \(f(u)\) is \(\mathcal{F}_u\)-measurable for each \(u\)).

Why Not Riemann-Stieltjes Integration?

Brownian motion has unbounded variation on any interval \([s, t]\), meaning:

\[ \sum_{i=0}^{n-1} |W(t_{i+1}) - W(t_i)| \to \infty \quad \text{as } n \to \infty \]

This makes classical Riemann-Stieltjes integration impossible. Instead, the construction relies on mean-square convergence rather than pathwise convergence, exploiting the fact that Brownian motion has finite quadratic variation.

1.1.2 Construction via Partition and Approximation

The Itô integral is defined as the limit of Riemann-like sums constructed through partitioning. Divide the interval \([s, t]\) into \(n\) subintervals:

\[ \begin{equation} s = t_0 < t_1 < t_2 < \cdots < t_n = t \label{eq:partition} \end{equation} \]

with mesh size \(\Delta t_i = t_{i+1} - t_i\) for \(i = 0, 1, \ldots, n-1\). Then, the Itô integral is approximated by:

\[ \begin{equation} \int_s^t f(u)\,dW(u) = \lim_{n \to \infty} \sum_{i=0}^{n-1} f(t_i)\left(W(t_{i+1}) - W(t_i)\right) \label{eq:riemann_sum_approximation} \end{equation} \]

where:

  • \(f(t_i)\) is the value of the integrand at the left endpoint of the interval \([t_i, t_{i+1}]\)
  • \(W(t_{i+1}) - W(t_i)\) is the Brownian motion increment over that interval
  • The limit is taken in the mean-square sense as the mesh size \(\max_i \Delta t_i \to 0\)
Left-Endpoint Evaluation: The Itô Convention

In the Itô integral, we evaluate the integrand at the left endpoint of each subinterval. This is crucial for two reasons:

  1. Causality: \(f(t_i)\) is known (or measurable with respect to \(\mathcal{F}_{t_i}\)) at time \(t_i\), before the random increment \(W(t_{i+1}) - W(t_i)\) occurs
  2. Martingale Property: This non-anticipating property ensures that Itô integrals are martingales with zero drift

Alternative conventions (e.g., Stratonovich integration) use midpoint evaluation and yield different calculus rules.
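The left-endpoint construction above can be sketched numerically. Below is a minimal NumPy simulation (the function name and parameter values are illustrative, not from the original): it samples independent Gaussian increments and evaluates the integrand at the left endpoint of each subinterval, exactly as in the Riemann-sum definition.

```python
import numpy as np

rng = np.random.default_rng(0)

def ito_riemann_sum(f, s, t, n, rng):
    """Left-endpoint Riemann-sum approximation of the Ito integral of f
    over [s, t] along one simulated Brownian path."""
    grid = np.linspace(s, t, n + 1)
    dt = np.diff(grid)
    # Brownian increments: independent N(0, dt_i) random variables
    dW = rng.normal(0.0, np.sqrt(dt))
    # Ito convention: integrand evaluated at the LEFT endpoint of each interval
    return np.sum(f(grid[:-1]) * dW)

# One sample of ∫_0^1 e^u dW(u) on a fine partition
sample = ito_riemann_sum(np.exp, 0.0, 1.0, 10_000, rng)
```

Averaging many such samples gives a value close to zero, consistent with the martingale property discussed below.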

1.1.3 Integrability Conditions

For the limit in \(\ref{eq:riemann_sum_approximation}\) to be well-defined, the integrand must satisfy a square-integrability condition:

  • Deterministic integrand \(f(u)\): \(\int_s^t f(u)^2\,du < \infty\)
  • Stochastic (adapted) integrand \(H(u)\): \(\mathbb{E}\left[\int_s^t H(u)^2\,du\right] < \infty\)
Adapted Process Requirement

For stochastic integrands \(H(u)\), we require that \(H(u)\) is adapted to the filtration \((\mathcal{F}_u)\), meaning \(H(u)\) only depends on information available up to time \(u\). This ensures:

  • Non-anticipation: The integrand cannot "look into the future"
  • Martingale property: The integral remains a martingale with zero drift
  • Well-definedness: The mean-square limit exists

The square-integrability condition for stochastic integrands ensures that the mean-square limit in \(\ref{eq:riemann_sum_approximation}\) exists. This is weaker than requiring pathwise convergence and is the natural framework for stochastic integration.

1.1.4 Example: Exponential Integrand

For the specific case \(f(u) = e^{au}\) where \(a\) is a constant:

\[ \begin{equation} \int_s^t e^{au}\,dW(u) = \lim_{n \to \infty} \sum_{i=0}^{n-1} e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \label{eq:exponential_ito_integral_construction} \end{equation} \]

This integral appears frequently in the solutions to linear stochastic differential equations, including the Vasicek model.


1.2 Expected Value: The Martingale Property

1.2.1 Statement

One of the most important properties of Itô integrals is the martingale property: conditioned on the information available at time \(s\), the integral over \([s, t]\) has zero expectation:

\[ \begin{equation} \mathbb{E}\left[\int_s^t f(u)\,dW(u) \mid \mathcal{F}_s\right] = 0 \label{eq:ito_integral_zero_expectation} \end{equation} \]

for all \(0 \leq s < t\) and any adapted integrand \(f(u)\) satisfying the integrability conditions. This fundamental result tells us that stochastic integrals represent pure randomness with no systematic drift, as they average to zero when conditioned on past information.

1.2.2 Step-by-Step Proof for Deterministic Integrands

We will prove equation \(\ref{eq:ito_integral_zero_expectation}\) for the exponential case:

\[ \begin{equation} \mathbb{E}\left[\int_s^t e^{au}\,dW(u) \mid \mathcal{F}_s\right] = 0 \label{eq:exponential_ito_integral} \end{equation} \]

We compute the conditional expectation of the approximating sum from \(\ref{eq:riemann_sum_approximation}\):

\[ \begin{equation} \mathbb{E}\left[\sum_{i=0}^{n-1} e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] \label{eq:conditional_expectation_sum} \end{equation} \]

Using the linearity of conditional expectation, we can say that

\[ \begin{equation} \mathbb{E}\left[\sum_{i=0}^{n-1} e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] = \sum_{i=0}^{n-1} \mathbb{E}\left[e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] \label{eq:linearity_of_expectation} \end{equation} \]
Critical Insight

Consider a single term:

\[ \begin{equation} \mathbb{E}\left[e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] \label{eq:single_term} \end{equation} \]

Since we integrate from \(s\) to \(t\), we have \(t_i \geq s\) for all terms in the sum. The increment \(W(t_{i+1}) - W(t_i)\) occurs in the future relative to time \(s\).

The key to understanding why this expectation is zero lies in recognizing when information becomes available and when randomness occurs.

Subsequently, we can apply the tower property (law of iterated expectations):

\[ \begin{equation} \mathbb{E}\left[e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] = \mathbb{E}\left[\mathbb{E}\left[e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_{t_i}\right] \mid \mathcal{F}_s\right] \label{eq:tower_property} \end{equation} \]
Tower Property

The tower property (also called the law of iterated expectations) states that for any \(s \leq t_i\):

\[ \mathbb{E}\left[X \mid \mathcal{F}_s\right] = \mathbb{E}\left[\mathbb{E}\left[X \mid \mathcal{F}_{t_i}\right] \mid \mathcal{F}_s\right] \]

This lets us condition on the finer \(\sigma\)-algebra \(\mathcal{F}_{t_i}\) first and then average the result with respect to the coarser \(\mathcal{F}_s\).

Why This is Crucial for Computing Itô Integrals:

When we want to find \(\mathbb{E}\left[\int_s^t f(u)\,dW(u) \mid \mathcal{F}_s\right]\), we face a challenge: the integral involves increments \(W(t_{i+1}) - W(t_i)\) that occur at times \(t_i \geq s\), which are in the future relative to time \(s\).

The tower property lets us break this into two steps:

  1. First, condition on the intermediate time \(\mathcal{F}_{t_i}\) where we can use the fact that \(e^{at_i}\) is known and the future increment \(W(t_{i+1}) - W(t_i)\) is independent of \(\mathcal{F}_{t_i}\)
  2. Then, average this result back down to the original filtration \(\mathcal{F}_s\)

Without this property, we couldn't separate the "known" part (\(e^{at_i}\)) from the "random" part (\(W(t_{i+1}) - W(t_i)\)) and exploit the independent increments property of Brownian motion.
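That separation can be illustrated with a small Monte Carlo experiment (a sketch, not a proof; the choice of \(\sin\) as the \(\mathcal{F}_{t_i}\)-measurable quantity and the numeric parameters are arbitrary assumptions): any function of \(W(t_i)\) multiplied by the independent, zero-mean future increment averages to zero.

```python
import numpy as np

rng = np.random.default_rng(42)
t_i, dt = 0.5, 0.01
n_paths = 200_000

# W(t_i) ~ N(0, t_i): the "known" part at time t_i
W_ti = rng.normal(0.0, np.sqrt(t_i), n_paths)
# W(t_{i+1}) - W(t_i): independent of F_{t_i}, mean zero
dW = rng.normal(0.0, np.sqrt(dt), n_paths)

# E[g(W(t_i)) * dW] = E[g(W(t_i))] * E[dW] = 0 by independence
product_mean = np.mean(np.sin(W_ti) * dW)
```

Here `product_mean` comes out near zero, up to Monte Carlo error, mirroring the factoring step in the proof.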

Now, since \(e^{at_i}\) is deterministic (it depends only on time, not on randomness), it is \(\mathcal{F}_{t_i}\)-measurable and can be factored out of the inner expectation:

\[ \begin{equation} \mathbb{E}\left[e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_{t_i}\right] = e^{at_i} \mathbb{E}\left[W(t_{i+1}) - W(t_i) \mid \mathcal{F}_{t_i}\right] \label{eq:factor_out_deterministic} \end{equation} \]

The key property of Brownian motion is that increments are independent of the past. The increment \(W(t_{i+1}) - W(t_i)\) is:

  • Independent of all information available at time \(t_i\) (i.e., independent of \(\mathcal{F}_{t_i}\))
  • Normally distributed: \(W(t_{i+1}) - W(t_i) \sim \mathcal{N}(0, t_{i+1} - t_i)\)

Therefore:

\[ \begin{equation} \mathbb{E}\left[W(t_{i+1}) - W(t_i) \mid \mathcal{F}_{t_i}\right] = \mathbb{E}\left[W(t_{i+1}) - W(t_i)\right] = 0 \label{eq:brownian_increment_expectation} \end{equation} \]
Why Zero?

Brownian motion increments have zero mean:

\[ W(t_{i+1}) - W(t_i) \sim \mathcal{N}(0, t_{i+1} - t_i) \]

The mean of a \(\mathcal{N}(0, \sigma^2)\) distribution is always 0, regardless of the variance.

Substituting back:

\[ \begin{equation} e^{at_i} \times 0 = 0 \label{eq:term_conclusion} \end{equation} \]

Since every term in the sum \(\ref{eq:linearity_of_expectation}\) equals zero:

\[ \begin{equation} \sum_{i=0}^{n-1} \mathbb{E}\left[e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] = \sum_{i=0}^{n-1} 0 = 0 \label{eq:sum_of_zeros} \end{equation} \]

As the partition becomes finer (mesh size \(\to 0\)), the Riemann sum converges to the Itô integral:

\[ \begin{equation} \mathbb{E}\left[\int_s^t e^{au}\,dW(u) \mid \mathcal{F}_s\right] = \lim_{n \to \infty} \mathbb{E}\left[\sum_{i=0}^{n-1} e^{at_i}\left(W(t_{i+1}) - W(t_i)\right) \mid \mathcal{F}_s\right] = 0 \label{eq:limit_conclusion} \end{equation} \]

1.2.3 Generalization to Arbitrary Integrands

The proof above generalizes to any deterministic function \(f(u)\) satisfying the integrability condition:

\[ \begin{equation} \mathbb{E}\left[\int_s^t f(u)\,dW(u) \mid \mathcal{F}_s\right] = 0 \label{eq:general_deterministic_integrand} \end{equation} \]

Moreover, for adapted stochastic processes \(H(u)\) (where \(H(u)\) is \(\mathcal{F}_u\)-measurable for each \(u\)), the same result holds under the square-integrability condition:

\[ \begin{equation} \mathbb{E}\left[\int_s^t H(u)\,dW(u) \mid \mathcal{F}_s\right] = 0 \label{eq:general_adapted_integrand} \end{equation} \]

1.3 Variance: Itô's Isometry

While the martingale property tells us that the mean of an Itô integral is zero, we also need to characterize its variance to fully understand the distribution of stochastic integrals.

1.3.1 Itô's Isometry Theorem

For any deterministic function \(f(u)\) satisfying \(\int_s^t f(u)^2\,du < \infty\):

\[ \begin{equation} \mathbb{E}\left[\left(\int_s^t f(u)\,dW(u)\right)^2\right] = \int_s^t f(u)^2\,du \label{eq:ito_isometry} \end{equation} \]

More generally, for the conditional variance [1]:

\[ \begin{equation} \begin{aligned} \text{Var}\left[\int_s^t f(u)\,dW(u) \mid \mathcal{F}_s\right] &= \mathbb{E}\left[\left(\int_s^t f(u)\,dW(u)\right)^2 \mid \mathcal{F}_s\right]\\ &= \int_s^t f(u)^2\,du \end{aligned} \label{eq:ito_isometry_variance} \end{equation} \]

1.3.2 Proof Sketch for Itô's Isometry

First, approximate the integral by its Riemann sum:

\[ \begin{equation} \left(\int_s^t f(u)\,dW(u)\right)^2 \approx \left(\sum_{i=0}^{n-1} f(t_i)\left(W(t_{i+1}) - W(t_i)\right)\right)^2 \label{eq:squared_riemann_sum} \end{equation} \]

Expanding the square, with \(\Delta W_i = W(t_{i+1}) - W(t_i)\):

\[ \begin{equation} \left(\sum_{i=0}^{n-1} f(t_i)\Delta W_i\right)^2 = \sum_{i=0}^{n-1} f(t_i)^2 (\Delta W_i)^2 + 2\sum_{i<j} f(t_i)f(t_j)\Delta W_i \Delta W_j \label{eq:expansion} \end{equation} \]

The cross terms vanish because increments at different times are independent:

\[ \begin{equation} \mathbb{E}[\Delta W_i \Delta W_j] = \mathbb{E}[\Delta W_i]\mathbb{E}[\Delta W_j] = 0 \quad \text{for } i < j \label{eq:cross_terms_zero} \end{equation} \]

The remaining diagonal terms contribute [2]:

\[ \begin{equation} \mathbb{E}[(\Delta W_i)^2] = \text{Var}(\Delta W_i) = t_{i+1} - t_i = \Delta t_i \label{eq:diagonal_variance} \end{equation} \]

Taking expectations and summing the surviving diagonal terms:

\[ \begin{equation} \mathbb{E}\left[\left(\sum_{i=0}^{n-1} f(t_i)\Delta W_i\right)^2\right] = \sum_{i=0}^{n-1} f(t_i)^2 \Delta t_i \to \int_s^t f(u)^2\,du \label{eq:isometry_limit} \end{equation} \]

as the mesh size goes to zero.
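The isometry can be verified numerically. This NumPy sketch (with the arbitrary illustrative choice \(f(u) = \cos u\) on \([0, 1]\)) compares the sample second moment of simulated Riemann sums with the closed-form value of \(\int_0^1 \cos^2 u\,du\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_paths = 400, 20_000
grid = np.linspace(0.0, 1.0, n + 1)
dW = rng.normal(0.0, np.sqrt(np.diff(grid)), size=(n_paths, n))

# Left-endpoint Riemann-sum approximations of ∫_0^1 cos(u) dW(u)
I = (np.cos(grid[:-1]) * dW).sum(axis=1)

lhs = np.mean(I**2)               # Monte Carlo estimate of E[(∫ f dW)^2]
rhs = 0.5 + np.sin(2.0) / 4.0     # ∫_0^1 cos(u)^2 du in closed form
```

Up to Monte Carlo and discretization error, `lhs` matches `rhs`, as the isometry predicts.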

1.3.3 Example: Exponential Integrand

For \(f(u) = e^{au}\):

\[ \begin{equation} \text{Var}\left[\int_s^t e^{au}\,dW(u) \mid \mathcal{F}_s\right] = \int_s^t e^{2au}\,du = \frac{e^{2at} - e^{2as}}{2a} \label{eq:exponential_variance} \end{equation} \]

This variance formula in \(\ref{eq:exponential_variance}\) is crucial for computing the variance of the short rate in the Vasicek model.
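As a quick sanity check of the antiderivative in \(\ref{eq:exponential_variance}\), the closed form can be compared against a simple midpoint-rule quadrature of \(\int_s^t e^{2au}\,du\) (the parameter values below are hypothetical; any \(a \neq 0\) behaves the same way).

```python
import numpy as np

# Hypothetical parameter values for illustration
a, s, t = 0.5, 0.0, 1.0
closed_form = (np.exp(2*a*t) - np.exp(2*a*s)) / (2*a)

# Midpoint-rule evaluation of ∫_s^t e^{2au} du as a cross-check
u = np.linspace(s, t, 100_001)
mid = (u[:-1] + u[1:]) / 2
numeric = np.sum(np.exp(2*a*mid) * np.diff(u))
```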

1.4 Complete Distributional Characterization

Combining the zero mean property \(\ref{eq:ito_integral_zero_expectation}\) with the variance formula \(\ref{eq:ito_isometry_variance}\), when \(f(u)\) is deterministic:

\[ \begin{equation} \int_s^t f(u)\,dW(u) \mid \mathcal{F}_s \sim \mathcal{N}\left(0, \int_s^t f(u)^2\,du\right) \label{eq:ito_integral_distribution} \end{equation} \]

This Gaussian distribution is fundamental to stochastic calculus:

  • The integral is normally distributed
  • Mean is zero
  • Variance is the integral of the squared integrand
Key Insight

The pair (mean, variance) = \(\left(0, \int_s^t f(u)^2\,du\right)\) completely characterizes the distribution of Itô integrals with deterministic integrands.
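This characterization suggests a direct numerical check (a sketch with illustrative parameters, not from the original): standardizing simulated integrals of \(e^{au}\) by the theoretical standard deviation \(\sqrt{(e^{2at} - e^{2as})/(2a)}\) should yield samples that look approximately standard normal.

```python
import numpy as np

rng = np.random.default_rng(3)
a, s, t, n, n_paths = 0.4, 0.0, 1.0, 400, 20_000
grid = np.linspace(s, t, n + 1)
dW = rng.normal(0.0, np.sqrt(np.diff(grid)), size=(n_paths, n))

# Simulated left-endpoint approximations of ∫_s^t e^{au} dW(u)
I = (np.exp(a * grid[:-1]) * dW).sum(axis=1)

# Standardize by the theoretical variance from the isometry
theory_var = (np.exp(2*a*t) - np.exp(2*a*s)) / (2*a)
Z = I / np.sqrt(theory_var)   # approximately N(0, 1)
```

Sample mean and variance of `Z` land near 0 and 1 respectively, consistent with \(\ref{eq:ito_integral_distribution}\).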


1.5 Connection to Other Results

  • Brownian Motion: The integrator in Itô integrals; its independent increments and zero mean properties are essential

  • Itô's Isometry (Full Treatment): A more detailed exposition of the variance formula and its applications

  • Vasicek Model: Practical application showing how to compute both mean and variance using these properties



  1. See Itô's Isometry

  2. See Brownian Motion, the section on the properties of Brownian increments.

  3. Bernt Øksendal. Stochastic Differential Equations: An Introduction with Applications. Springer, 6th edition, 2013. URL: https://link.springer.com/book/10.1007/978-3-642-14394-6

  4. Ioannis Karatzas and Steven E. Shreve. Brownian Motion and Stochastic Calculus. Springer, 2nd edition, 1991.