### persistent homology of Brownian motions

#### Introduction

This talk deals with the topology of random functions. Random functions enter data analysis as distance functions
to a sample. Gaussian random functions
are the key model for the null hypothesis in many image-analysis algorithms.

More generally, understanding the topology of
random sets seems to be the necessary step beyond the analysis of random sets.

In all these cases, one studies the excursion sets $M_c=\{f\leq c\}$, where $f\in C(\Real^D)$ and $c\in\Real$.

#### Example: point clouds

To determine the topology of a point cloud $X\subset \Real^D$, one thickens the points or, equivalently, considers the excursion sets
of the distance function to $X$. The hope (often justified, see Niyogi-Smale-Weinberger) is that if $X$ is a (slightly) noisy sample from a submanifold $N\subset \Real^D$,
the union of the balls around points of $X$ would have the same topology as $N$.
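
As a small illustration of the thickening step, the sketch below counts the connected components of a union of closed balls of radius $r$, using the fact that two such balls meet exactly when their centers are at distance at most $2r$. The function name `count_ball_components` and the sample (40 equally spaced points on the unit circle) are assumptions made for this example; it only tracks components, not the loop of the circle.

```python
import math

def count_ball_components(points, r):
    """Number of connected components of the union of closed balls of radius r."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # Two closed balls of radius r intersect iff the centers are within 2r.
            if math.dist(points[i], points[j]) <= 2 * r:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# Hypothetical sample: 40 equally spaced points on the unit circle.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]

print(count_ball_components(circle, 0.05))  # balls too small to touch
print(count_ball_components(circle, 0.10))  # neighbouring balls overlap
```

Neighbouring sample points are about $0.157$ apart, so the balls stay disjoint for $r=0.05$ and merge into a single component for $r=0.1$.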

#### Example: Gaussian random fields

Real-life data (MRI scans, WMAP, ...) are inherently noisy, and statistical analysis always requires understanding the null hypothesis: that what is observed is merely noise
(say, a Gaussian random field). Hence the need to understand the topology of Gaussian random fields, a well-tended field (see Adler-Taylor's corpus of work).

#### Topology of noisy functions: persistence

Typically, whichever level one chooses, the excursion set has a lot of noise: small components, small holes, handles, etc. Changing the level removes them but introduces new ones.
One needs the features that are there for a long time, that *persist*. This is where persistence kicks in. One defines the persistent homology $\ph_k(a,b)$ as
$$
\image H_a^b, \,\mathrm{where}\, H_a^b:H_k(M_a)\to H_k(M_b), a\lt b
$$
is the map induced on homology by the natural inclusion $i(a,b):M_a\to M_b$ (homology here is taken over a field).

#### Barcodes and persistence diagrams

Persistence allows one to record the birth and death of homology classes. Long-lived ones represent important features; short-lived ones represent noise.
One uses either *barcodes* or *persistence diagrams* to represent these data.

#### persistence diagrams

For a *random* function, it seems that the natural object to study is the $k$-th persistence point process $\phpp_k$,
which places a dot of multiplicity $\mu$ (i.e. $\mu \delta_{(x,y)}$) on the $k$-th sheet of the persistence diagram if the rank of $\ph_k(x,y)$ equals $\mu$.

Here we will be mostly interested in the *intensity density* $\beta_k=\ex \phpp_k$ of this point process.
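
For a function on an interval, the birth-death pairs on the $0$-th sheet can be computed by sweeping the level upward and merging components with a union-find structure. The sketch below is one standard way to do this for a sampled function (read as piecewise linear between samples), not a method stated in the talk; the name `sublevel_pairs` is hypothetical, zero-length pairs are dropped, and the class born at the global minimum is given an infinite death time.

```python
def sublevel_pairs(values):
    """(birth, death) pairs of 0-dimensional sublevel-set persistence of a sequence."""
    n = len(values)
    parent = [None] * n   # None marks samples not yet swept in
    birth = {}            # root index -> birth level of its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Sweep the samples in order of increasing value.
    for i in sorted(range(n), key=lambda j: values[j]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the current level.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if values[i] > birth[young]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((min(values), float("inf")))
    return sorted(pairs)

print(sublevel_pairs([0, 3, 1, 4, 2, 5]))
```

Averaging the resulting diagrams over many sampled trajectories gives an empirical stand-in for the intensity $\beta_0$.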

#### persistence for Brownian trajectories

The persistence toolbox aims at discarding homological noise generated by small wrinkles of a mapping.
Understanding the nature of noise for random functions is an essential component of statistical
topology.

Here we address the most basic type of random function, 1D Brownian motion. The fractal nature of
the trajectories leads to an interesting structure of the persistent homology (only $\beta_0$ is nontrivial, for dimensional reasons).

In fact, the naive definition of $\ph$ is a bit awkward; it is better to use the identity
$$
\image H_a^b\cong \ker(H_k(M_b)\to H_k(M_b,M_a));
$$
for $k=0$ this means that we count only those components of $M_b$ that contain points of $M_a$.
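
For a function sampled on an interval, this count is easy to sketch in code: components of $M_b$ are maximal runs of consecutive samples with value at most $b$, and a run is counted if it contains a sample with value at most $a$. The function name `persistent_rank0` and the sample values are assumptions for this illustration, and the samples are read as a piecewise-linear function.

```python
def persistent_rank0(values, a, b):
    """Count components of {f <= b} that contain a point of {f <= a}, for a <= b."""
    count = 0
    in_run = False   # currently inside a run of samples with value <= b
    hit = False      # the current run has already met {f <= a}
    for v in values:
        if v <= b:
            if not in_run:
                in_run, hit = True, False
            if v <= a and not hit:
                hit = True
                count += 1
        else:
            in_run = False
    return count

samples = [0, 3, 1, 3, 2.5, 3, 0.5, 4]
print(persistent_rank0(samples, 1, 2))  # three separate wells reach below level 1
print(persistent_rank0(samples, 1, 3))  # at level 3 the wells have merged
```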

#### persistence for Brownian trajectories

Let
\[
\bbr:[0,1]\to \Real
\]
be a sample trajectory of the Brownian bridge (Brownian motion conditioned on $\bm(0)=\bm(1)=0$, or, equivalently, Brownian motion on the unit circle).
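
A discretized trajectory can be simulated with the standard construction $B(t)=W(t)-tW(1)$, where $W$ is a Brownian motion. This is a minimal sketch; the grid size and the name `brownian_bridge` are assumptions of the example.

```python
import math
import random

def brownian_bridge(n, rng):
    """Sample a Brownian bridge on the grid t_i = i/n, i = 0..n."""
    w = [0.0]
    for _ in range(n):
        # Independent Gaussian increments with variance 1/n.
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(1.0 / n)))
    # Subtracting t * W(1) pins the endpoint at zero.
    return [w[i] - (i / n) * w[n] for i in range(n + 1)]

path = brownian_bridge(1000, random.Random(0))
print(path[0], path[-1])  # both endpoints are pinned at zero
```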

#### fractal nature of persistent homology for Brownian motion

Denote by $N_k^f(a,b)$ the number of bars in the $k$-th persistent homology of $f$ overlapping the interval $[a,b]$, and set
$B(x,y)=\ex N_0^\bbr (x,y)$. Then, for $0 \leq x \leq y$, one has
\[
B(x,y)=\sum_{n\geq 1} e^{-2(n(y-x)+x)^2}
\]

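When $x$ is close to $y$, the series is a Riemann sum for $(y-x)^{-1}\int_x^\infty e^{-2u^2}\,du$, so the expected number of bars overlapping $[x,y]$, $x>0$, is close to
\[
(y-x)^{-1}\,\tfrac{1}{2}\sqrt{\pi/2}\; \mathrm{erfc}(\sqrt{2}\,x)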
\]
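
For small $y-x$ the series for $B(x,y)$ is a Riemann sum for $(y-x)^{-1}\int_x^\infty e^{-2u^2}\,du$, and that integral equals $\tfrac12\sqrt{\pi/2}\,\mathrm{erfc}(\sqrt2\,x)$. The sketch below compares the truncated series with this integral approximation; the function names and the truncation length are assumptions of the example.

```python
import math

def expected_bars(x, y, terms=100000):
    """Truncated series sum_{n>=1} exp(-2 (n (y - x) + x)^2)."""
    delta = y - x
    return sum(math.exp(-2.0 * (n * delta + x) ** 2) for n in range(1, terms + 1))

def integral_approx(x, y):
    """(y - x)^{-1} * integral_x^inf exp(-2 u^2) du, written with erfc."""
    return 0.5 * math.sqrt(math.pi / 2.0) * math.erfc(math.sqrt(2.0) * x) / (y - x)

x, y = 0.3, 0.31
print(expected_bars(x, y), integral_approx(x, y))
```

The two values agree up to a correction that stays bounded as $y-x\to 0$, so their ratio tends to $1$.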

#### fractal nature of persistent homology for Brownian motion

In other words, the density $\beta_0$ of the persistence point process blows up near the diagonal like
$$
\beta_0^{\bbr}(x,y)\sim (y-x)^{-3}
$$

#### fractal nature, cont'd

Similarly, for the Brownian motion with drift,
$$
\bmv_t=\sigma\bm_t+vt,
$$
the number of bars overlapping any given interval is finite a.s. and is geometrically distributed with parameter
$$
\exp(-v(y-x)/\sigma^2).
$$

As a corollary, if $f$ is smooth, then
$$
\lim_{\sigma\to 0} \frac{1}{\sigma^2} \lim_{\Delta\to 0} \Delta \beta_0^{f+\sigma \bm}(x, x+\Delta)=\sum_{s:f(s)=x} \frac{1}{|f'(s)|}=\frac{f_*(dx)}{dx}.
$$
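
As a quick check of the right-hand side on an illustrative example (the choice of $f$ is an assumption, not taken from the talk), let $f(s)=s^2$ on $[0,1]$. For $0<x<1$ the only solution of $f(s)=x$ is $s=\sqrt{x}$, where $f'(s)=2\sqrt{x}>0$, so
$$
\sum_{s:f(s)=x} \frac{1}{f'(s)}=\frac{1}{2\sqrt{x}},
\qquad
\int_0^1 \frac{dx}{2\sqrt{x}}=1,
$$
which is the density of the pushforward of Lebesgue measure on $[0,1]$ under $f$, with total mass $1$ as expected.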

In particular, the *short bars* of a small perturbation of a polynomial determine the polynomial up to shifts and reflections.

#### Persistence dimension

Somewhat more generally, one can define the persistence dimension of a function $f$ on a manifold $M$ as
$$
\dimph(f):=\inf\{k: \langle (y-x)^k \cdot \phpp \rangle <\infty\},
$$
and the persistence dimension of a class of functions $\class$ as $\sup_{f\in \class} \dimph(f)$.
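
For a barcode, the pairing $\langle (y-x)^k \cdot \phpp \rangle$ is the sum of the $k$-th powers of the bar lengths. The sketch below evaluates it on a purely synthetic barcode with $4^j$ bars of length $2^{-j}$ at each scale $j$. This toy scaling is an assumption chosen so that the number of bars longer than $\varepsilon$ grows like $\varepsilon^{-2}$; it is not derived from Brownian motion, and the function names are hypothetical.

```python
def persistence_moment(bars, k):
    """Sum of (death - birth)^k over all bars."""
    return sum((death - birth) ** k for birth, death in bars)

def synthetic_bars(levels):
    """Toy barcode: 4**j bars of length 2**-j for j = 0 .. levels - 1."""
    bars = []
    for j in range(levels):
        bars.extend([(0.0, 2.0 ** -j)] * 4 ** j)
    return bars

for levels in (4, 6, 8):
    bars = synthetic_bars(levels)
    # k = 2 grows linearly with the number of scales; k = 3 stays below 2.
    print(levels, persistence_moment(bars, 2), persistence_moment(bars, 3))
```

In this toy model the moment of order $k$ stays bounded as more scales are added exactly when $k>2$, which is the behaviour the definition of $\dimph$ is designed to detect.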

Some recent results by Cohen-Steiner-Edelsbrunner-Harer-Mileyko suggest that for
Lipschitz functions on a smooth manifold $M$ of dimension $d$ the persistence dimension is $d$;
the results above make it plausible that for almost all Brownian motions the
persistence dimension is $2$.

In general, it is natural to conjecture that for the class of $\alpha$-Hölder functions on $M$,
the persistence dimension is $d/\alpha$.

#### questions

- The big question is to understand the persistence dimension for large classes of functions and underlying topological spaces. Already for smooth or Lipschitz
functions on smooth manifolds it is unknown.
- Correlation functions for $\phpp_k$ for Brownian motions.