Donsker’s invariance principle

Donsker’s invariance principle is a classical extension of the central limit theorem. It says that the path of a discrete random walk converges, if properly rescaled, in distribution to a standard Brownian motion. More details can be found in your favorite text book or on wikipedia. Here just comes an animation:

library(dplyr)
library(gganimate)
library(transformr)
set.seed(42)
n <- 1000  # maximal steps of random walk  
n_seq <- seq(from=10, 100, by = 10)  # starting phase
n_seq <- c(n_seq, seq(from=100, n, by = 100))
number_iteration <-length(n_seq)   
X <- sample(c(-1,1), n, replace=TRUE, prob=c(0.5, 0.5))
S <- cumsum(X)  # random walk S_n

W <- numeric(number_iteration*n)   
id <- numeric(number_iteration*n)      
time <- numeric(number_iteration*n)   
for (i in 1:number_iteration){
  for (j in (1:n)){
    index <- floor((j-1)/n * n_seq[i])
    if (index==0) {
      W[j+(i-1)*n] <-((((j-1)/n * n_seq[i]) -(index)) *( S[index+1]))/ sqrt(n_seq[i])
    }
    else {
      W[j+(i-1)*n] <- (S[index] + (((j-1)/n * n_seq[i]) -(index)) *( S[index+1]-S[index]))/ sqrt(n_seq[i]) 
    }
  }
  id[(1+n*(i-1)):(n*i)] <- rep(n_seq[i],n)
  time[(1+n*(i-1)):(n*i)] <- (1:n)/n
}

rw <- data.frame(x = time,
                     y = W,
                     id =id)

p <- ggplot(rw, aes(x=x, y=y)) + 
  geom_line()+ 
  ylab(expression(W^{(n)}~ (x)))+
  transition_states(
    id,
    transition_length = 2,
    state_length = 1) +
  enter_fade() + 
  exit_shrink() +
  ease_aes('sine-in-out')+
  labs(title = "Donsker's invariance principle. n = {closest_state}") 

animate(p, nframes=number_iteration*2)

Brownian motion via Hilbert basis

Brownian motion is a central object of probability theory. One perspective to look at its construction uses Hilbert spaces.
Let (f_i)_{i\in I} be a Hilbert basis H:=L^2([0,1]) equipped with standard scalar product (\cdot,\cdot). Hence, every f\in H can be written in a unique way as

    \[f  =  \sum_{i\in I} \alpha_{i} f_i,\]

where \alpha_{i}=(f,f_i) and \sum_{i\in I} \alpha_{i}^2=|f|_2^2<\infty. Let (\xi_i)_{i\in I} be an i.i.d. sequeence of standard gaussian random variables and consider

    \[ I: \left{ \begin{array}{ccc}  H & \longrightarrow & L^2(\Omega)\cr f=\sum_{i\in I} \alpha_{i} f_i & \longmapsto & \sum_{i\in I} \alpha_{i} \xi_i. \end{array} \right.\]

Using the independence of \xi_i we obtain

    \[ \mathbb{E} [I(f)I(g)]  =  (f,g).\]

Brownian motion is then given by (a continuous modification of) B_t:= {I}(\mathbf{1}_{[0,t]}), t\ge 0. Hence, “all” we need to do is to find a Hilbert basis. The most known is the following.

Haar basis

Here we refer to http://datatreker.com/haar-basis for an introduction.

haar_mother <- function(x){
   (x >0 & x <= 0.5) - (x >0.5 & x <= 1)
}
haar <- function(x,j,k){
  (2^(j/2) * haar_mother(2^j*x-k))
}
set.seed(42)

j_max <- 10
n_max <- 11  # maximal resolution
delta <- 2^{-n_max}
x <- (0: 2^n_max)/2^n_max
xi_0 <- rnorm(1) # xi corresponding to constant function
xi <- list()     # list of random variables xi
for (j in 0:j_max){
  xi[[j+1]] <- rep(0, 2^j)
  for (k  in 0:(2^j-1))
  {
    xi[[j+1]][k+1] <-rnorm(1)  
  }
}

# data.frame containing approximation of the Brownian motion
df <- data.frame(x=numeric(),
                 y=numeric(),
                 id=numeric())
for (i in x){
  y <- c(1,rep(1, i*2^n_max), rep(0, 2^n_max-i*2^n_max)) 
  alpha <- list() # wavelet coefficients for y
  bm <- sum(y)/(length(y)+1) * xi_0
  for (j in 0:j_max){
    alpha[[j+1]] <- rep(0, 2^j)
    for (k  in 0:(2^j-1))
    {
      alpha[[j+1]][k+1] <-sum(haar(x,j,k)*y)*delta 
      bm <- bm + alpha[[j+1]][k+1]*xi[[j+1]][k+1]
    }
    df_new <- data.frame(x=i,
                         y=bm,
                         id=j)
    df <- bind_rows(df, df_new)
  }
}

#### Animation using gganimate

ggplot(df, aes(x=x, y=y)) + 
  geom_line()+
  transition_states(
    id,
    transition_length = 2,
    state_length = 1
  ) +
  labs(title = "Construction of Brownian motion using Haar basis. Step: {closest_state}") +
  ease_aes('sine-in-out')

The result looks the same as with Levy’s construction which is not astonishing. The implementation above is not optimal and is much slower than the previous one.

It might be worth to consider a different Hilbert basis.

Trigonometric Hilbert basis

The trigonometric basis is defined as

    \[f_n(x) = \sqrt{2} \cos (\pi n x), \quad n\in \mathbb{N},\]


together with f_0=1.

set.seed(42)
N_max <-  200 # number of trigonometric basis functions used
n_max <- 11  # maximal resolution
delta <- 2^{-n_max}
x <- (0: 2^n_max)/2^n_max
xi <- rnorm(N_max+1)     # list of random variables xi



# data.frame containing approximation of the Brownian motion
df <- data.frame(x=numeric(),
                 y=numeric(),
                 id=numeric())

for (i in x){
  y <- c(1,rep(1, i*2^n_max), rep(0, 2^n_max-i*2^n_max)) 
  alpha <- numeric() # wavelet coefficients for y
  alpha[1] <- mean(y)
  bm <-  alpha[1] * xi[1]
  for (n in 1:N_max){
      alpha[n] <-sum(sqrt(2)*cos(x*pi*n)*y)*delta 
      bm <- bm + alpha[n]*xi[n]
      df_new <- data.frame(x=i,
                         y=bm,
                         id=n)
      df <- bind_rows(df, df_new)}
}

#### Animation using gganimate

p <- ggplot(df, aes(x=x, y=y)) + 
  geom_line()+
  transition_states(
    id,
    transition_length = 2,
    state_length = 1
  ) +
  labs(title = "Construction of Brownian motion using trigonometric basis. Step: {closest_state}") +
  ease_aes('sine-in-out')
animate(p, nframes=1000)

The above convergence is very slow. The irregularity properties of the Brownian motion are not yet very visible.

Haar basis

Hilbert spaces play a prominent role in various fields of mathematics. An orthonormal basis of such a space is called a Hilbert basis. The purpose of this blog is to illustrate a very clasic and basic Hilbert basis – the Haar basis.

Let (f_i)_{i \in I} be a Hilbert basis of H:=L^2([0,1]) equipped with the standard scalar product (\cdot,\cdot). Hence, every f\in H can be written in a unique way as

    \[f  =  \sum_{i\in I} \alpha_{i} f_i,\]

where \alpha_{i}=(f,f_i) and \sum_{i\in I} \alpha_{i}^2=\|f\|_2^2<\infty.

A classic Hilbert basis consists of Haar functions that are supported on [0, 1]. They are defined using the Haar wavelet:

    \[ h(x)=\left\{ \begin{array}{ll}         1 & 0< x \leq 1/2 \cr         -1& 1/2 \leq x \leq 1 \cr         0& \mbox{otherwise}.\end{array}\right.\]

The Haar basis consists then of rescaled (by 2^{j}) versions of h(x) shifted by 2^{-j}k,

    \[h_{k}^{j}(x)= 2^{j/2} h(2^{j}x -k),\quad j \in \mathbb{N}\cup\{0\}, 0\leq k < 2^{n},\]

together with the constant function 1. (Note: the constant function has to be added since we consider the interval [0,1] and not \mathbb{R}). In R this looks like:

haar_mother <- function(x){
   (x >0 & x <= 0.5) - (x >0.5 & x <= 1)
}
haar <- function(x,j,k){
  2^(j/2) * haar_mother(2^j*x-k)
}

Animation of the Haar basis

j_max <- 3    # maximal depth
n_max <- 10   # resolution of grid
df <- data.frame(x=numeric(),
                 y=numeric(),
                 id=numeric())

x <- (1: 2^n_max)/2^n_max
id<-1
for (j in 0:j_max){
  for (k  in 0:(2^j-1))
  {
    df_new<- data.frame(x=x,
                        y=haar(x,j,k),
                        id=id)
    df <- bind_rows(df, df_new)
    id <- id+1
  }
}

ggplot(df, aes(x=x, y=y)) + 
  geom_step()+
  transition_states(
    id,
    transition_length = 2,
    state_length = 5
  ) +
  labs(title = 'Illustration of Haar basis') +
  ease_aes('sine-in-out')

Approximation via Haar basis

Now, every function f\in L^{2}([0,1]) can be written as

    \[ f(x)=\sum_{j,k} (f, h_{k}^{j}) h_{k}^{j}(x) + (f,1).\]

The coefficients \alpha_{k}^{j}=(f, h_{k}^{j}) are called the Haar Wavelet coefficients. Let us calculate them in the discrete setting. We give us a mesh x and values y and some function f(x)=x \sin(x):

j_max <- 12
n_max <- 13  # maximal resolution
delta <- 2^{-n_max}
x <- (1: 2^n_max)/2^n_max
y <- x* sin(1/x) 
alpha <- list()  # list of Haar coefficients
for (j in 0:j_max){
  alpha[[j+1]] <- rep(0, 2^j)
  for (k  in 0:(2^j-1))
  {
    alpha[[j+1]][k+1] <-sum(haar(x,j,k)*y)*delta  # approximation of scalar product
  }
  
}

y_approx <- rep(0, length(y))  # approximated values of y

#### Calculating the approximations for different values of j

df <- data.frame(x=numeric(),
                 y=numeric(),
                 id=numeric())


y_approx <-  mean(y) # this is the contribution of the constant function
for (j in 0:j_max){
  for (k  in 0:(2^j-1))
  {
    y_approx <- y_approx + alpha[[j+1]][k+1]*haar(x,j,k)
  }
  df_new <- data.frame(x=x,
                       y=y_approx,
                       id=j)
  df <- bind_rows(df, df_new)
}

Animation of convergence


ggplot(df, aes(x=x, y=y)) + 
  geom_step()+
  transition_states(
    id,
    transition_length = 2,
    state_length = 1
  ) +
  labs(title = "Haar basis approximation of x sin(1/x). Depth:  {closest_state}") +
  ease_aes('sine-in-out')

Lévy’s construction of Brownian motion

Animation of Lévy’s construction of Brownian motion

Brownian motion is a central object of probability theory. The idea of Lévy’s construction is to construct the Brownian motion step by step on finite sets

    \[\mathcal{D}_n := \left\{\frac{k}{2^n}: 0\leq k \leq n \right\}\]


of dyadic points. As \mathcal{D} := \bigcup_n \mathcal{D}_n is dense in [0,1] the Brownian motion is then obtained as the uniform limit of linear interpolation on \mathcal{D}_n.
It is pretty easy to illustrate this construction using R and the package gganimate. We use the same notation as in the proof of Wiener’s theorem given on page 23 in “Brownian motion” by Peter Mörters and Yuval Peres.


library(dplyr)
library(gganimate)
library(transformr)
set.seed(42)
n_max <- 14                # maximal number of steps
D <- (0: 2^n_max)/2^n_max   # this is in fact D_n_max
B <- list()  
Z <- rnorm(1)
B[[1]] <- c(0, Z/2 + rnorm(1), Z) 
for (n in 2:n_max){
  B[[n]] <- rep(0, 2^n+1)
  index_known <- seq(1,2^n+1, by=2) # indices where values are known from previous steps
  B[[n]][index_known] <- B[[n-1]]
  index_unknown <- seq(2, 2^n, by=2) # indices where values are not yet defined
  for (d in index_unknown){
    B[[n]][d] <- 0.5*(B[[n]][d-1]+B[[n]][d+1])+ rnorm(1)*2^(-(n+1)/2)
  }
}


## interpolation and transformation in a data.frame
df <- data.frame(time=numeric(),
                 value=numeric(),
                 id=numeric())

for (n in 1:n_max){
  D_n<-(0: 2^n)/2^n   
  B_interpol<- approx(D_n, B[[n]], xout = D)$y   # interpolation
  df_new <- data.frame(time=D,
                       value=B_interpol,
                       id=n)
  df <- bind_rows(df, df_new)
}

## animation
ggplot(df, aes(x=time, y=value)) + 
  geom_line()+
  transition_states(
    id,
    transition_length = 2,
    state_length = 1
  ) +
labs(title = 'Levy`s construction of Brownian motion.  Step: {closest_state}', x = 'time', y = 'position') +
ease_aes('sine-in-out')