Quasi-Monte Carlo for R^s: digital nets and worst-case error

Recently I submitted the paper titled Quasi-Monte Carlo numerical integration over \mathbb{R}^s: digital nets and worst-case error. I will give some heuristic explanation of the results in this paper. Interesting in this context is also the sparse grid approach from the PhD thesis of Markus Holtz (supervised by Prof. Michael Griebel).

Therein I aim to develop a theory on quasi-Monte Carlo integration over \mathbb{R}^s. The idea is to transform a digital net from [0,1]^s to \mathbb{R}^s such that in elementary intervals of the form \displaystyle \prod_{i=1}^s [A b^j, (A+1) b^j) one has a digitally shifted digital net, and at the same time, globally one has a given (discretized) distribution.

1. Tilings of the Walsh phase plane

The analysis is based on tilings of the Walsh phase plane. We restrict ourselves to the case where the dimension s = 1 and the base b = 2.

First we define Walsh functions

\displaystyle {\rm wal}_k(x) = (-1)^{\kappa_0 x_1 + \cdots + \kappa_{a-1} x_a} \, 1_{[0,1)}(x),

where the nonnegative integer k = \kappa_0 + 2\kappa_1 + \cdots + 2^{a-1} \kappa_{a-1} and x = X + x_1 2^{-1} + x_2 2^{-2} + \cdots \in \mathbb{R} for some integer X and \kappa_l, x_l \in \{0,1\}. Here 1_{[0,1)}(x) denotes the characteristic function of the interval [0,1).

We define translations and dilations of {\rm wal}_k by

\displaystyle w_{j,k,l}(x) = 2^{-j/2} {\rm wal}_k(x 2^{-j} - l),

for j,k,l \in \mathbb{Z} with k \ge 0. The support of w_{j,k,l} is [2^j l, 2^j (l+1)). To each function w_{j,k,l} we associate a so-called tile, this is simply a rectangle in \mathbb{R} \times \mathbb{R}_0^+, where \mathbb{R}_0^+ = \{x \in \mathbb{R}: x \ge 0\}, of area 1 given by

\displaystyle T_{j,k,l} = [2^j l, 2^j (l+1)) \times [2^{-j} k, 2^{-j} (k+1)).

The first interval is simply the support of w_{j,k,l}, whereas the second interval corresponds to the frequency of w_{j,k,l}. Hence one can think of the first coordinate of the half plane \mathbb{R} \times \mathbb{R}_0^+ as the location of w_{j,k,l} and the second coordinate as the phase (or frequency) of w_{j,k,l}.

Tiles are useful in the following way:

1. Two functions w_{j,k,l}, w_{j',k',l'} are orthogonal if and only if T_{j,k,l}, T_{j',k',l'} do not intersect.

Orthogonal means that the L_2 inner product

\displaystyle \langle w_{j,k,l}, w_{j',k',l'} \rangle = \int_{\mathbb{R}} w_{j,k,l}(x) \overline{w_{j',k',l'}(x)} \, {\rm d} x = 0.

If the support of the two functions does not intersect then the corresponding tiles do also not intersect. If the support does intersect but the frequency differs then the functions are still orthogonal (as usual for Walsh functions). In this case the tiles do not intersect as well. The second property concerns the completeness, namely, roughly speaking,

2. if one takes a set of functions of the form w_{j,k,l} such that the corresponding tiles are disjoint and cover \mathbb{R} \times \mathbb{R}_0^+ then the set of functions is a complete orthonormal system in L_2(\mathbb{R}).

This is now useful in the sense that we are flexible in choosing an orthonormal system in L_2(\mathbb{R}).

Example of tiling of Walsh phase plane.

2. Three types of smoothness

Again we consider only dimension s = 1. Let the Walsh coefficient be given by

\displaystyle \widehat{f}_{j,k,l} = \langle f, w_{j,k,l}\rangle_{L_2} = \int_{\mathbb{R}} f(x) \, \overline{w_{j,k,l}(x)} \, {\rm d} x.

Assume we write in each tile T_{j,k,l} the value \widehat{f}_{j,k,l}. Then, roughly speaking, the values of \widehat{f}_{j,k,l} decay in three directions as shown in the following picture.

The Walsh coefficients of a function can decay in the three directions indicated.

Direction 1 is when the frequency {}k increases, Direction 2 is when the location changes by moving further away from the origin, that is, {}|l| increases, and Direction 3 is when both frequency and location change simultaneously, that is, {}k and |l| increase simultaneously.

Roughly speaking, this decay follows from the bounds

\displaystyle |\widehat{f}_{j,k,l}| \le C_j 2^{-k} \left(\int_{2^j l}^{2^j (l+1)} |f'(x)|^2 \, {\rm d} x \right)^{1/2}

where C_j > 0 is a constant which depends only on j and

\displaystyle |\widehat{f}_{j,0,l}| \le 2^{-j/2} \int_{2^j l}^{2^j (l+1)} |f(x)| \, {\rm d} x.

Those three types of smoothness are needed because of the following:

  • Direction 1: When the frequency increases the Walsh coefficients decay according to the smoothness of the function f. This is needed since we want the Walsh series to converge.
  • Direction 2: We need \int_{\mathbb{R}} |f(x)| \, {\rm d} x < \infty, hence we want f(x) \to 0 as |x| \to \infty. Hence also the Walsh coefficients should decay as one moves away from the origin.
  • Direction 3: Roughly speaking, if \int_{\mathbb{R}} |f'(x)| \, {\rm d} x < \infty, then |f'(x)| \to 0 as |x| \to \infty. In this case we also have a decay of the Walsh coefficients in direction 3.

Why a decay in Direction 3 is usefull will be explained in the following section.

3. Numerical integration

Still consider s = 1. We now construct point sets which integrate certain w_{j,k,l} exactly.

Integrate all function corresponding to tiles under red lines exactly.

For example one could aim at integrating all functions w_{j,k,l} for which the tiles T_{j,k,l} are in a certain area exactly. To integrate tiles for which the second coordinate is large, we need a large number of points. Hence, if we have a decay of the Walsh coefficients in Direction 3, then the further we go away from the origin, the less points are needed in these regions.

If there is no decay of the Walsh coefficients in Direction 3, then this means that the red lines in the picture above should form a rectangle. Hence the number of points in this case is large (and increases exponentially with the number of dimensions). This makes the problem infeasible.

Thus, for the method we consider here, to integrate functions f:\mathbb{R} \to \mathbb{R} we need to look at the distribution of the Walsh coefficients \widehat{f}_{j,k,l} of f.

4. Construction of points

In one dimension, the construction of points is relatively simple. For each elementary interval (corresponding to the support of w_{j,k,l}, we put 2^m equally spaced points such that the first point is on the left side of the elementary interval and the distance between the points is 2^{-m} as shown in the picture below.

Construction of quadrature points.

For dimension s > 1 we use digital nets. For this case, in each coordinate we introduce a numbering of the points like in the picture below.

Numbering of the points in one dimension.

Then we construct a point set in the following way. Let C_1, \ldots, C_s \in \mathbb{Z}_2^{m \times m} be the generating matrices of a digital (t,m,s)-net over \mathbb{Z}_2. Let

\displaystyle C_i \vec{n} = (\eta_0, \ldots, \eta_{m-1})^\top \in \mathbb{Z}_2^m.

Then instead of mapping the vector (\eta_0,\ldots, \eta_{m-1}) onto [0,1) we define a number

\displaystyle z_i(n) = \eta_0 2^{m-1} + \eta_1 2^{m-2} + \cdots + \eta_{m-1} 2^0.

Then the point in our numbering as in the picture which was assigned z_i(n) yields the ith coordinate of the nth point.

An example of a point set in \mathbb{R}^2 (i.e. s = 2) constructed this way may look the following way.

Example of a point set in two dimensions.

The inner four rectangles all contain digitally shifted digital nets with 4 points and the outer rectangles contain digitally shifted digital nets with 2 points. (Notice that the rectangles are considered to be closed on the left and bottom and open on the right and top, so that points on the boundary of two rectangles always belong to the one which is on the right or at the top.)

Notice that the number of points decreases as we go away from the origin since we assume a decay of the Walsh coefficients in Direction 3. In terms of the construction of the points this means that the number of points decreases as we move away from the origin. Furthermore, if we move away from the origin in both coordinates simultaneously, then we need even less points, since we now have a decay of the Walsh coefficients in both coordinates. Hence we only need to use points which are in a hyperbolic cross in \mathbb{R}^2 as shown in the picture below.Points in a hyperbolic cross area

Since the points are in some hyperbolic cross shape, we can control the number of points needed as the dimension s grows, since the area we need to cover also does not grow too fast as the dimension s increases. Again, this is connected with the decrease of the Walsh coefficients in the three directions.

In the paper we show that points constructed this way yield the optimal rate of convergence of the worst-case integration error.

This entry printed to pdf.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s