Exponential Convergence and Tractability of Multivariate Integration for Korobov Spaces

I discuss the recently resubmitted manuscript [DLPW] titled `Exponential Convergence and Tractability of Multivariate Integration for Korobov Spaces' by J.D., G. Larcher, F. Pillichshammer, and H. Woźniakowski.

The initial aim of the paper is to show that lattice rules can achieve an exponential rate of convergence for infinitely differentiable functions. The technical difficulty lies in the fact that an application of Jensen's inequality (which states that {(\sum_{n} |a_n|)^\lambda \le \sum_n |a_n|^\lambda} for {0 < \lambda < 1}) yields only a convergence of {O(n^{-\alpha})}, where {n} is the number of quadrature points. Though {\alpha} can be arbitrarily large, in the land of asymptotia this is still worse than a convergence of, say, {O(\omega^{n^{1/s}})} for some {0 < \omega < 1}. Hence the first challenge is to find ways to prove convergence rates without relying on Jensen's inequality.

1. The worst case error in the Korobov space

Let {0 < \omega < 1}. Then the worst case error for integration using a lattice rule is given by

\displaystyle e^2_n(\boldsymbol{g}) = \sum_{\boldsymbol{h} \in \mathcal{L} \setminus \{\boldsymbol{0}\}} \omega^{|h_1|+ \cdots + |h_s|}, \ \ \ \ \ (1)

where {\boldsymbol{g} = (g_1,\ldots, g_s) \in \{1,\ldots, n-1\}^s}, {\boldsymbol{h} = (h_1,\ldots, h_s)}, {\boldsymbol{h} \cdot \boldsymbol{g} = h_1 g_1 + \cdots + h_s g_s}, and the dual lattice {\mathcal{L}} is defined as

\displaystyle \mathcal{L} = \mathcal{L}(\boldsymbol{g}) = \{\boldsymbol{h} \in \mathbb{Z}^s: \boldsymbol{h} \cdot \boldsymbol{g} \equiv 0 \pmod{n}\}.

This means that for any {f:[0,1]^s \rightarrow \mathbb{R}} with {f(\boldsymbol{x}) = \sum_{\boldsymbol{h} \in \mathbb{Z}^s} \widehat{f}(\boldsymbol{h}) {\rm e}^{2\pi {\rm i} \boldsymbol{h} \cdot \boldsymbol{x}}} we have

\displaystyle \left|\int_{[0,1]^s} f(\boldsymbol{x}) \, {\rm d}\boldsymbol{x} - \frac{1}{n} \sum_{k=0}^{n-1} f(\{k \boldsymbol{g}/n \}) \right| \le e_n(\boldsymbol{g}) \|f\|,

where

\displaystyle \|f\| = \left(\sum_{\boldsymbol{h} \in \mathbb{Z}^s} \frac{|\widehat{f}(\boldsymbol{h})|^2}{\omega^{|h_1|+\cdots + |h_s|}} \right)^{1/2}.
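The squared worst case error (1) can be probed numerically. The following is a minimal sketch (my own illustration, not code from the paper) that approximates {e_n^2(\boldsymbol{g})} by truncating the dual-lattice sum to {|h_j| \le H}; the truncation parameter {H} and the small example values are my own choices, and the geometric decay of {\omega^{|h_1|+\cdots+|h_s|}} makes a modest {H} sufficient.

```python
from itertools import product

def wce_squared(g, n, omega, H=30):
    """Approximate e_n^2(g) from (1): sum of omega^{|h_1|+...+|h_s|} over
    nonzero dual-lattice points h, truncated to |h_j| <= H."""
    s = len(g)
    total = 0.0
    for h in product(range(-H, H + 1), repeat=s):
        # h is in the dual lattice iff h . g == 0 (mod n)
        if any(h) and sum(hj * gj for hj, gj in zip(h, g)) % n == 0:
            total += omega ** sum(abs(hj) for hj in h)
    return total

# Toy example: s = 2, n = 13, generator g = (1, 5), omega = 1/2.
print(wce_squared((1, 5), 13, 0.5))
```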

2. Averaging argument and Jensen’s inequality

Let us first describe the averaging argument with Jensen’s inequality. This argument works in the following way. Let {\boldsymbol{g}^\ast \in \{1,\ldots, n-1\}^s} be such that {e_n^2(\boldsymbol{g}^\ast) \le e_n^2(\boldsymbol{g})} for all {\boldsymbol{g} \in \{1,\ldots, n-1\}^s}. Then for any {0 < \lambda < 1} we have

\displaystyle \begin{array}{rcl} e^{2\lambda}_n(\boldsymbol{g}^\ast) & \le & \frac{1}{(n-1)^s} \sum_{\boldsymbol{g} \in \{1,\ldots, n-1\}^s} e^{2\lambda}_n(\boldsymbol{g}) \\ && \\ & = & \frac{1}{(n-1)^s} \sum_{\boldsymbol{g} \in \{1,\ldots, n-1\}^s} \left(\sum_{\boldsymbol{h} \in \mathcal{L} \setminus \{\boldsymbol{0}\}} \omega^{|h_1|+\cdots + |h_s|} \right)^\lambda \\ && \\ & \le & \frac{1}{(n-1)^s} \sum_{\boldsymbol{g} \in \{1,\ldots, n-1\}^s} \sum_{\boldsymbol{h} \in \mathcal{L} \setminus \{\boldsymbol{0}\}} \omega^{\lambda(|h_1|+\cdots +|h_s|)} \\ && \\ & = & \sum_{\boldsymbol{h} \in \mathbb{Z}^s \setminus \{\boldsymbol{0}\}} \omega^{\lambda(|h_1|+\cdots + |h_s|)} \frac{1}{(n-1)^s} \sum_{\boldsymbol{g} \in \{1,\ldots, n-1\}^s} 1_{\boldsymbol{g} \in \mathcal{L}(\boldsymbol{h})}, \end{array}

where the last inequality follows by using Jensen’s inequality and where {1_{\boldsymbol{g} \in \mathcal{L}(\boldsymbol{h})}} is {1} if {\boldsymbol{h} \cdot \boldsymbol{g} \equiv 0 \pmod{n}} and {0} otherwise.

The last expression can now be separated into two parts, namely

\displaystyle \frac{1}{(n-1)^s} \sum_{\boldsymbol{g} \in \{1,\ldots, n-1\}^s} 1_{\boldsymbol{g} \in \mathcal{L}(\boldsymbol{h})} \approx \frac{1}{n}

and

\displaystyle \sum_{\boldsymbol{h} \in \mathbb{Z}^s \setminus \{\boldsymbol{0}\}} \omega^{\lambda(|h_1|+\cdots + |h_s|)} = C_{s,\lambda, \omega},

where {0 < C_{s,\lambda, \omega} < \infty} is a constant which depends on {s}, {\lambda}, and {\omega} but not on {n} or {\boldsymbol{g}}.
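The first of these two approximations can be checked by direct enumeration. The sketch below (my own illustration; the helper name is hypothetical) computes, for a fixed nonzero {\boldsymbol{h}}, the fraction of generators {\boldsymbol{g} \in \{1,\ldots, n-1\}^s} lying in {\mathcal{L}(\boldsymbol{h})}, i.e., the inner average in the last display.

```python
from itertools import product

def dual_membership_fraction(h, n):
    """Fraction of generators g in {1,...,n-1}^s with h . g == 0 (mod n),
    i.e. the average of 1_{g in L(h)} appearing above."""
    s = len(h)
    hits = sum(
        1
        for g in product(range(1, n), repeat=s)
        if sum(hj * gj for hj, gj in zip(h, g)) % n == 0
    )
    return hits / (n - 1) ** s

# Toy example: h = (1, 2), prime n = 13; the fraction is close to 1/13.
print(dual_membership_fraction((1, 2), 13))
```

For this example the exact value is {1/12}: for each of the {12} choices of {g_2} there is exactly one residue {g_1 \equiv -2g_2 \pmod{13}}, and it is never {0}.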

Thus we obtain

\displaystyle e_n^{2\lambda}(\boldsymbol{g}^\ast) \lessapprox C_{s,\lambda,\omega} n^{-1}

and therefore

\displaystyle e_n(\boldsymbol{g}^\ast) \lessapprox C_{s,\lambda,\omega}^{1/(2\lambda)} n^{-1/(2\lambda)}.

This way we do get {e_n(\boldsymbol{g}^\ast) = O(n^{-\alpha})} with {\alpha > 0} arbitrarily large, but we do not get {e_n(\boldsymbol{g}^\ast) = O(\omega^{n^{1/s}})}.

3. Figure of Merit, Trigonometric Degree, and Exponential Rate of Convergence

We now show how an exponential rate of convergence can be obtained. The main observation is that the sum (1) is dominated by its largest term in the following way.

Let {\boldsymbol{h}^\ast = (h^\ast_1,\ldots, h_s^\ast)} be such that

\displaystyle \omega^{|h_1^\ast|+\cdots + |h_s^\ast|} \ge \omega^{|h_1|+\cdots + |h_s|}

for all {\boldsymbol{h} = (h_1,\ldots, h_s) \in \mathcal{L}}. Then the largest term in (1) is given by {\omega^{|h_1^\ast|+\cdots + |h_s^\ast|}} (this term may occur several times though). That means that

\displaystyle |h_1^\ast|+\cdots + |h_s^\ast| \le |h_1|+\cdots + |h_s|,

for all {\boldsymbol{h} = (h_1,\ldots, h_s) \in \mathcal{L}}. To shorten the notation let {|\boldsymbol{h}|_1 = |h_1| + \cdots + |h_s|}. This leads us to define a so-called figure of merit by

\displaystyle \rho(\boldsymbol{g}) = \min_{\boldsymbol{h} \in \mathcal{L}\setminus\{\boldsymbol{0}\}} |\boldsymbol{h}|_1.
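For small examples the figure of merit can be computed by brute force. The sketch below (my own, not the search algorithm from the paper) scans the shells {|\boldsymbol{h}|_1 = r} in increasing order and returns the first {r} containing a dual-lattice point.

```python
from itertools import product

def figure_of_merit(g, n):
    """rho(g): smallest |h|_1 over nonzero h in the dual lattice L(g).
    Scans shells |h|_1 = r; terminates by r = n at the latest, since
    (n, 0, ..., 0) always lies in the dual lattice."""
    s = len(g)
    for r in range(1, n + 1):
        for h in product(range(-r, r + 1), repeat=s):
            if any(h) and sum(abs(x) for x in h) == r \
                    and sum(hj * gj for hj, gj in zip(h, g)) % n == 0:
                return r

# Toy example: n = 13, g = (1, 5); the shell search stops at r = 5,
# e.g. at h = (2, -3) with h . g = -13.
print(figure_of_merit((1, 5), 13))
```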

This figure of merit has already occurred elsewhere in the literature, where it is called the trigonometric degree. The motivation for the trigonometric degree is the following. Let {\boldsymbol{h}=(h_1,\ldots, h_s) \in \mathbb{Z}^s\setminus\{\boldsymbol{0}\}} with {|\boldsymbol{h}|_1 < \rho(\boldsymbol{g})}. Then {\boldsymbol{h} \cdot \boldsymbol{g} \not\equiv 0 \pmod{n}}, since otherwise {\boldsymbol{h}} would lie in {\mathcal{L}\setminus\{\boldsymbol{0}\}} and hence satisfy {|\boldsymbol{h}|_1 \ge \rho(\boldsymbol{g})}, and therefore

\displaystyle \frac{1}{n}\sum_{k=0}^{n-1} {\rm e}^{2\pi {\rm i} k \boldsymbol{h}\cdot \boldsymbol{g}/n} = 0.

Hence all nonconstant trigonometric polynomials of degree smaller than {\rho(\boldsymbol{g})} are integrated exactly by a lattice rule with generator {\boldsymbol{g}}.
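This exactness property is easy to verify numerically. The sketch below (my own illustration) applies the lattice rule to the monomials {e^{2\pi i \boldsymbol{h}\cdot\boldsymbol{x}}}. For the toy pair {n = 13}, {\boldsymbol{g} = (1,5)}, a brute-force search gives {\rho(\boldsymbol{g}) = 5}, and every nonzero {\boldsymbol{h}} with {|\boldsymbol{h}|_1 < 5} is integrated exactly (to machine precision).

```python
import cmath
from itertools import product

def lattice_rule_applied_to_monomial(h, g, n):
    """Apply the n-point lattice rule with generator g to the monomial
    exp(2 pi i h . x); the exact integral is 0 for nonzero h."""
    hg = sum(hj * gj for hj, gj in zip(h, g))
    return sum(cmath.exp(2j * cmath.pi * k * hg / n) for k in range(n)) / n

g, n = (1, 5), 13  # brute force gives rho(g) = 5 for this pair
for h in product(range(-4, 5), repeat=2):
    if 0 < abs(h[0]) + abs(h[1]) < 5:
        # |h|_1 < rho(g): the rule integrates the monomial exactly
        assert abs(lattice_rule_applied_to_monomial(h, g, n)) < 1e-9
```

In contrast, for {\boldsymbol{h} = (2,-3)} one has {\boldsymbol{h}\cdot\boldsymbol{g} = -13 \equiv 0 \pmod{13}}, so the rule returns {1} instead of {0}; this aliasing is exactly what the dual lattice records.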

Coming back to our aim of proving an exponential convergence, we show that

\displaystyle e_n^2(\boldsymbol{g}) \lessapprox \omega^{\rho(\boldsymbol{g})} {\rho(\boldsymbol{g}) \choose s-1}.

Hence the problem of obtaining an exponential rate now comes down to showing a lower bound on {\rho(\boldsymbol{g})}. It is known that for each prime number {n} there exists a {\boldsymbol{g} \in \{1,\ldots, n-1\}^s} such that

\displaystyle \rho(\boldsymbol{g}) \asymp (s!n)^{1/s}.

This now yields an exponential rate of convergence:

\displaystyle e_n^2(\boldsymbol{g}) \lessapprox \omega^{(s!n)^{1/s}} n.

4. Further Results

The question arises whether an exponential rate of decay of the worst case error can be achieved with a bound that does not depend on the dimension. That is, is there a constant {C > 0}, independent of {s} and {n}, such that

\displaystyle e^2_n(\boldsymbol{g}) \le C \omega^{(n^\delta)} \quad \mbox{for some } \delta > 0 \mbox{ independent of } s?

In the manuscript we show that this is not the case if one wants to have an exponential rate of convergence, see Theorem 3 in [DLPW]. On the other hand if we are satisfied with a convergence of {O(n^{-\alpha})}, then we are able to obtain a bound which does not depend on the dimension, see Remark 3 in [DLPW].

In order to obtain tractability one therefore needs to introduce some weights. These weights change the shape of the unit ball of the space we are considering such that certain groups of coordinates are easier to integrate.

The results described above are pure existence results, hence we cannot actually find good generating vectors {\boldsymbol{g}}. We do have, however, some constructive results. One is outlined in Section 6 of [DLPW], where we show that the regular grid (as quadrature points) performs reasonably well, i.e., yields an exponential rate of decay of the worst case error.

Another semi-constructive approach, based on algebraic number fields, is outlined in Remark 1 in [DLPW]. There it is shown that good generating vectors {\boldsymbol{g} = (g_1,\ldots, g_s)}, where {g_j = g_j(n)} is the nearest integer to {\delta_j n}, exist. Here {1, \delta_1,\ldots, \delta_s} form a basis of an algebraic number field {F} of degree {s+1}. For example, one can choose {\delta_j = 2^{j/(s+1)}} for {1 \le j \le s}. A possible search algorithm is also outlined. The constant in the lower bound for {\rho(\boldsymbol{g})} obtained with this method is considerably weaker than the one from the existence result.
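As an illustration of this recipe, the sketch below builds {g_j} as the nearest integer to {2^{j/(s+1)} n} and evaluates the figure of merit by brute force. Reducing {g_j} modulo {n} to land in {\{0,\ldots,n-1\}} is my own choice here; the remark in the paper may handle values above {n} differently.

```python
from itertools import product

def rho(g, n):
    """Brute-force figure of merit: min |h|_1 over nonzero dual vectors.
    Terminates by r = n, since (n, 0, ..., 0) is always a dual vector."""
    s = len(g)
    for r in range(1, n + 1):
        for h in product(range(-r, r + 1), repeat=s):
            if any(h) and sum(abs(x) for x in h) == r \
                    and sum(hj * gj for hj, gj in zip(h, g)) % n == 0:
                return r

def number_field_generator(n, s):
    """g_j = nearest integer to delta_j * n with delta_j = 2^{j/(s+1)};
    the reduction mod n is my own choice to land in {0,...,n-1}."""
    return tuple(round(2 ** (j / (s + 1)) * n) % n for j in range(1, s + 1))

n, s = 101, 2
g = number_field_generator(n, s)
print(g, rho(g, n))
```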

A numerical investigation of this method has shown that one can find reasonably good generating vectors this way. The results are competitive with those obtained in [CKN]:

R. Cools, F. Y. Kuo, and D. Nuyens, Constructing lattice rules based on weighted degree of exactness and worst case error, to appear in Computing;

(though the vectors are not extensible in the dimension as they are in [CKN]). On the other hand, the study of the trigonometric degree of lattice rules has mainly been aimed at finding the best lattice rule, i.e., at finding {\boldsymbol{g}^\ast} such that for given {n} and {s} one has {\rho(\boldsymbol{g}^\ast) \ge \rho(\boldsymbol{g})} for all {\boldsymbol{g} \in \{1,\ldots, n-1\}^s}. For that purpose the construction outlined here does not appear to be suitable; instead, exhaustive or nearly exhaustive searches are employed.
