Quasi-Monte Carlo for R^s: digital nets and worst-case error

Recently I submitted the paper titled Quasi-Monte Carlo numerical integration over $\mathbb{R}^s$: digital nets and worst-case error. I will give some heuristic explanation of the results in this paper. Interesting in this context is also the sparse grid approach from the PhD thesis of Markus Holtz (supervised by Prof. Michael Griebel).

Therein I aim to develop a theory on quasi-Monte Carlo integration over $\mathbb{R}^s$. The idea is to transform a digital net from $[0,1]^s$ to $\mathbb{R}^s$ such that in elementary intervals of the form $\displaystyle \prod_{i=1}^s [A b^j, (A+1) b^j)$ one has a digitally shifted digital net, and at the same time, globally one has a given (discretized) distribution.

1. Tilings of the Walsh phase plane

The analysis is based on tilings of the Walsh phase plane. We restrict ourselves to the case where the dimension $s = 1$ and the base $b = 2$.

First we define Walsh functions

$\displaystyle {\rm wal}_k(x) = (-1)^{\kappa_0 x_1 + \cdots + \kappa_{a-1} x_a} \, 1_{[0,1)}(x)$,

where the nonnegative integer $k = \kappa_0 + 2\kappa_1 + \cdots + 2^{a-1} \kappa_{a-1}$ and $x = X + x_1 2^{-1} + x_2 2^{-2} + \cdots \in \mathbb{R}$ for some integer $X$ and $\kappa_l, x_l \in \{0,1\}$. Here $1_{[0,1)}(x)$ denotes the characteristic function of the interval $[0,1)$.

We define translations and dilations of ${\rm wal}_k$ by

$\displaystyle w_{j,k,l}(x) = 2^{-j/2} {\rm wal}_k(x 2^{-j} - l),$

for $j,k,l \in \mathbb{Z}$ with $k \ge 0$. The support of $w_{j,k,l}$ is $[2^j l, 2^j (l+1))$. To each function $w_{j,k,l}$ we associate a so-called tile, this is simply a rectangle in $\mathbb{R} \times \mathbb{R}_0^+$, where $\mathbb{R}_0^+ = \{x \in \mathbb{R}: x \ge 0\}$, of area $1$ given by

$\displaystyle T_{j,k,l} = [2^j l, 2^j (l+1)) \times [2^{-j} k, 2^{-j} (k+1))$.

The first interval is simply the support of $w_{j,k,l}$, whereas the second interval corresponds to the frequency of $w_{j,k,l}$. Hence one can think of the first coordinate of the half plane $\mathbb{R} \times \mathbb{R}_0^+$ as the location of $w_{j,k,l}$ and the second coordinate as the phase (or frequency) of $w_{j,k,l}$.

Tiles are useful in the following way:

1. Two functions $w_{j,k,l}, w_{j',k',l'}$ are orthogonal if and only if $T_{j,k,l}, T_{j',k',l'}$ do not intersect.

Orthogonal means that the $L_2$ inner product

$\displaystyle \langle w_{j,k,l}, w_{j',k',l'} \rangle = \int_{\mathbb{R}} w_{j,k,l}(x) \overline{w_{j',k',l'}(x)} \, {\rm d} x = 0.$

If the support of the two functions does not intersect then the corresponding tiles do also not intersect. If the support does intersect but the frequency differs then the functions are still orthogonal (as usual for Walsh functions). In this case the tiles do not intersect as well. The second property concerns the completeness, namely, roughly speaking,

2. if one takes a set of functions of the form $w_{j,k,l}$ such that the corresponding tiles are disjoint and cover $\mathbb{R} \times \mathbb{R}_0^+$ then the set of functions is a complete orthonormal system in $L_2(\mathbb{R})$.

This is now useful in the sense that we are flexible in choosing an orthonormal system in $L_2(\mathbb{R})$.

2. Three types of smoothness

Again we consider only dimension $s = 1$. Let the Walsh coefficient be given by

$\displaystyle \widehat{f}_{j,k,l} = \langle f, w_{j,k,l}\rangle_{L_2} = \int_{\mathbb{R}} f(x) \, \overline{w_{j,k,l}(x)} \, {\rm d} x.$

Assume we write in each tile $T_{j,k,l}$ the value $\widehat{f}_{j,k,l}$. Then, roughly speaking, the values of $\widehat{f}_{j,k,l}$ decay in three directions as shown in the following picture.

Direction 1 is when the frequency ${}k$ increases, Direction 2 is when the location changes by moving further away from the origin, that is, ${}|l|$ increases, and Direction 3 is when both frequency and location change simultaneously, that is, ${}k$ and $|l|$ increase simultaneously.

Roughly speaking, this decay follows from the bounds

$\displaystyle |\widehat{f}_{j,k,l}| \le C_j 2^{-k} \left(\int_{2^j l}^{2^j (l+1)} |f'(x)|^2 \, {\rm d} x \right)^{1/2}$

where $C_j > 0$ is a constant which depends only on $j$ and

$\displaystyle |\widehat{f}_{j,0,l}| \le 2^{-j/2} \int_{2^j l}^{2^j (l+1)} |f(x)| \, {\rm d} x.$

Those three types of smoothness are needed because of the following:

• Direction 1: When the frequency increases the Walsh coefficients decay according to the smoothness of the function $f$. This is needed since we want the Walsh series to converge.
• Direction 2: We need $\int_{\mathbb{R}} |f(x)| \, {\rm d} x < \infty$, hence we want $f(x) \to 0$ as $|x| \to \infty$. Hence also the Walsh coefficients should decay as one moves away from the origin.
• Direction 3: Roughly speaking, if $\int_{\mathbb{R}} |f'(x)| \, {\rm d} x < \infty$, then $|f'(x)| \to 0$ as $|x| \to \infty$. In this case we also have a decay of the Walsh coefficients in direction 3.

Why a decay in Direction 3 is usefull will be explained in the following section.

3. Numerical integration

Still consider $s = 1$. We now construct point sets which integrate certain $w_{j,k,l}$ exactly.

For example one could aim at integrating all functions $w_{j,k,l}$ for which the tiles $T_{j,k,l}$ are in a certain area exactly. To integrate tiles for which the second coordinate is large, we need a large number of points. Hence, if we have a decay of the Walsh coefficients in Direction 3, then the further we go away from the origin, the less points are needed in these regions.

If there is no decay of the Walsh coefficients in Direction 3, then this means that the red lines in the picture above should form a rectangle. Hence the number of points in this case is large (and increases exponentially with the number of dimensions). This makes the problem infeasible.

Thus, for the method we consider here, to integrate functions $f:\mathbb{R} \to \mathbb{R}$ we need to look at the distribution of the Walsh coefficients $\widehat{f}_{j,k,l}$ of $f$.

4. Construction of points

In one dimension, the construction of points is relatively simple. For each elementary interval (corresponding to the support of $w_{j,k,l}$, we put $2^m$ equally spaced points such that the first point is on the left side of the elementary interval and the distance between the points is $2^{-m}$ as shown in the picture below.

For dimension $s > 1$ we use digital nets. For this case, in each coordinate we introduce a numbering of the points like in the picture below.

Then we construct a point set in the following way. Let $C_1, \ldots, C_s \in \mathbb{Z}_2^{m \times m}$ be the generating matrices of a digital $(t,m,s)$-net over $\mathbb{Z}_2$. Let

$\displaystyle C_i \vec{n} = (\eta_0, \ldots, \eta_{m-1})^\top \in \mathbb{Z}_2^m.$

Then instead of mapping the vector $(\eta_0,\ldots, \eta_{m-1})$ onto $[0,1)$ we define a number

$\displaystyle z_i(n) = \eta_0 2^{m-1} + \eta_1 2^{m-2} + \cdots + \eta_{m-1} 2^0.$

Then the point in our numbering as in the picture which was assigned $z_i(n)$ yields the $i$th coordinate of the $n$th point.

An example of a point set in $\mathbb{R}^2$ (i.e. $s = 2$) constructed this way may look the following way.

The inner four rectangles all contain digitally shifted digital nets with $4$ points and the outer rectangles contain digitally shifted digital nets with $2$ points. (Notice that the rectangles are considered to be closed on the left and bottom and open on the right and top, so that points on the boundary of two rectangles always belong to the one which is on the right or at the top.)

Notice that the number of points decreases as we go away from the origin since we assume a decay of the Walsh coefficients in Direction 3. In terms of the construction of the points this means that the number of points decreases as we move away from the origin. Furthermore, if we move away from the origin in both coordinates simultaneously, then we need even less points, since we now have a decay of the Walsh coefficients in both coordinates. Hence we only need to use points which are in a hyperbolic cross in $\mathbb{R}^2$ as shown in the picture below.

Since the points are in some hyperbolic cross shape, we can control the number of points needed as the dimension $s$ grows, since the area we need to cover also does not grow too fast as the dimension $s$ increases. Again, this is connected with the decrease of the Walsh coefficients in the three directions.

In the paper we show that points constructed this way yield the optimal rate of convergence of the worst-case integration error.

This entry printed to pdf.