Non-abelian amplification and bilinear forms with Kloosterman sums

Alexandru Pascadi Mathematisches Institut, Endenicher Allee 60, 53115 Bonn, Germany alexpascadi@gmail.com
Abstract.

We introduce a new method to bound bilinear (Type II) sums of Kloosterman sums with composite moduli $c$, using Fourier analysis on $\mathrm{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ and an amplification argument with non-abelian characters. For sums of length $\sqrt{c}$, our method produces a non-trivial bound for all moduli except near-primes, saving $c^{-1/12}$ for products of two primes of the same size. Combining this with previous results for prime moduli, we achieve savings beyond the Pólya–Vinogradov range for all moduli. We give applications to moments of twisted cuspidal $L$-functions, and to large sieve inequalities for exceptional cusp forms with composite levels.

1. Introduction

1.1. Brief background

There is by now a fairly comprehensive history of bounds for bilinear forms with Kloosterman sums and their applications [3, 1, 2, 14, 24, 25, 11, 28, 32, 40, 39, 22, 46, 44]. In the simplest form, the objects of interest are the sums

$$\sum_{m\leq M}\sum_{n\leq N}\alpha_{m}\beta_{n}S(m,n;c),\qquad\text{where}\qquad S(m,n;c):=\sum_{x\in(\mathbb{Z}/c\mathbb{Z})^{\times}}e\left(\frac{mx+n\overline{x}}{c}\right), \tag{1.1}$$

for positive integers $c,M,N$ with $M,N\leq c$ and complex sequences $(\alpha_{m})$, $(\beta_{n})$; here $e(t):=\exp(2\pi it)$ and $x\overline{x}\equiv 1\ (\textnormal{mod }c)$. In this work, we are mainly concerned with the ‘Type II’ setting where $(\alpha_{m})$ and $(\beta_{n})$ are arbitrary sequences, and we search for an upper bound in terms of their $\ell^{2}$ norms $\|\alpha\|:=(\sum_{m}|\alpha_{m}|^{2})^{1/2}$, $\|\beta\|:=(\sum_{n}|\beta_{n}|^{2})^{1/2}$. This is equivalent to bounding the operator norm, or the largest singular value, of the $M\times N$ matrix $(S(m,n;c))_{m\leq M,n\leq N}$.
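For readers who wish to experiment numerically, the definition (1.1) can be evaluated directly for small moduli. The following is a minimal sketch, not an efficient implementation; the function names `kloosterman` and `bilinear_form` are our own illustrative choices, and the modular inverse relies on the three-argument `pow` available in Python 3.8 and later.

```python
import cmath
from math import gcd


def kloosterman(m: int, n: int, c: int) -> complex:
    """S(m, n; c): sum of e((m*x + n*xbar)/c) over the units x modulo c."""
    total = 0j
    for x in range(1, c + 1):
        if gcd(x, c) != 1:
            continue
        xbar = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
        phase = (m * x + n * xbar) % c
        total += cmath.exp(2j * cmath.pi * phase / c)
    return total


def bilinear_form(alpha, beta, c: int) -> complex:
    """Sum of alpha_m * beta_n * S(m, n; c) over 1 <= m <= M, 1 <= n <= N.

    Here alpha[0] is alpha_1, alpha[1] is alpha_2, and so on.
    """
    return sum(
        a * b * kloosterman(m, n, c)
        for m, a in enumerate(alpha, start=1)
        for n, b in enumerate(beta, start=1)
    )
```

Direct evaluation costs about $MNc$ operations, so this is only suitable for sanity checks on small $c$.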

For the Type II sums, it is in general necessary to incorporate a coprimality constraint $(m,n,c)=1$. In practice, since $S(gm,gn;gc)=\tfrac{\phi(gc)}{\phi(c)}S(m,n;c)$, one can separately consider each value of $(m,n,c)$; therefore, in most bounds discussed henceforth, one can replace the restriction $(m,n,c)=1$ with the assumption that $(\alpha_{m})$ and $(\beta_{n})$ are $1$-bounded (and the norms $\|\alpha\|$, $\|\beta\|$ with $\sqrt{M}$, $\sqrt{N}$).
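The scaling identity for $S(gm,gn;gc)$ used above follows because the summand depends only on $x$ modulo $c$, and each unit modulo $c$ has exactly $\phi(gc)/\phi(c)$ lifts to a unit modulo $gc$. A small numerical check, as a sketch under the assumption of brute-force evaluation (the helper names are hypothetical), could look as follows.

```python
import cmath
from math import gcd


def phi(c: int) -> int:
    """Euler's totient, by direct counting (fine for small c)."""
    return sum(1 for x in range(1, c + 1) if gcd(x, c) == 1)


def kloosterman(m: int, n: int, c: int) -> complex:
    """S(m, n; c), evaluated directly from the definition."""
    total = 0j
    for x in range(1, c + 1):
        if gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * ((m * x + n * xbar) % c) / c)
    return total


def scaling_defect(g: int, m: int, n: int, c: int) -> float:
    """|S(gm, gn; gc) - phi(gc)/phi(c) * S(m, n; c)|; should be ~0."""
    lhs = kloosterman(g * m, g * n, g * c)
    rhs = phi(g * c) / phi(c) * kloosterman(m, n, c)
    return abs(lhs - rhs)
```

The defect is zero up to floating-point error for every choice of $g, m, n, c$, including the case where $g$ shares prime factors with $c$.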

There are two main ‘trivial’ bounds to beat, packaged together into the following inequality with a slightly more general setup. For any (integer) intervals $\mathcal{I},\mathcal{J}\subset\mathbb{Z}$ with $|\mathcal{I}|=M\leq c$, $|\mathcal{J}|=N\leq c$, any complex sequences $(\alpha_{m})_{m\in\mathcal{I}}$, $(\beta_{n})_{n\in\mathcal{J}}$, and any $a\in(\mathbb{Z}/c\mathbb{Z})^{\times}$, one has

$$\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha\|\|\beta\|c^{o(1)}\min\left(c,\sqrt{MNc}\right). \tag{1.2}$$

The term $c$ comes from Fourier analysis (a.k.a. the Pólya–Vinogradov method; see also [14, Theorem 1.17] for more standard versions of the Pólya–Vinogradov bound) and is sharp when $M=N=c$, while the term $\sqrt{MNc}$ comes from the pointwise Weil bound $S(am,n;c)\ll c^{o(1)}\sqrt{(m,n,c)c}$, and performs better when $MN<c$. The best one could hope for is the perfect-orthogonality bound $\|\alpha\|\|\beta\|c^{o(1)}\sqrt{(M+N)c}$, but making any improvement over (1.2), particularly in the range $MN\approx c$ where the two trivial bounds match, is notoriously difficult. We note that while many applications [24, 25] require an improvement of the pointwise Weil bound when $MN\ll c$ (i.e., beyond the Pólya–Vinogradov range), some applications require an improvement of the Fourier-theoretic bound for larger values of $M,N\leq c^{1-o(1)}$; this is the case in our Section 9.
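To make the two trivial bounds concrete, the sketch below checks the pointwise Weil bound in its classical form for a prime modulus, $|S(m,n;p)|\leq 2\sqrt{p}$ when $p\nmid mn$, and evaluates the quantity $\min(c,\sqrt{MNc})$ from (1.2). The helper names are our own, and the brute-force search is only meant for small primes.

```python
import cmath
from math import gcd, sqrt


def kloosterman(m: int, n: int, c: int) -> complex:
    """S(m, n; c), evaluated directly from the definition."""
    total = 0j
    for x in range(1, c + 1):
        if gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * ((m * x + n * xbar) % c) / c)
    return total


def max_abs_kloosterman(p: int) -> float:
    """Largest |S(m, n; p)| over 1 <= m, n < p, for a prime p."""
    return max(abs(kloosterman(m, n, p)) for m in range(1, p) for n in range(1, p))


def trivial_bound(c: int, M: int, N: int) -> float:
    """The factor min(c, sqrt(M*N*c)) on the right-hand side of (1.2)."""
    return min(c, sqrt(M * N * c))
```

For instance, `trivial_bound(100, 5, 5)` is governed by the Weil term, while `trivial_bound(100, 100, 100)` is governed by the Fourier-theoretic term $c$; the two terms coincide exactly when $MN=c$.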

The first improvement of (1.2) when $MN\approx c$ was the celebrated breakthrough of Kowalski–Michel–Sawin [24], which requires a prime modulus $c=p$, and which saves a factor of $p^{-1/64}$ when $M,N\asymp\sqrt{p}$; see also [25] for their follow-up work, which outperforms the pointwise Weil bound for $MN$ as small as $p^{3/4+o(1)}$. These results rely on a shift-by-$ab$ trick of Vinogradov and Karatsuba, a Hölder step, and deep inputs of $\ell$-adic cohomology; notably, the same bounds hold for more general algebraic trace functions, including hyper-Kloosterman sums.

Closer to our methods is an approach of Shkredov [38] for prime moduli $p$, which relies (in the Type II setting) on non-abelian Fourier analysis [37, Lemma 22], expansion in $\textnormal{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$ [37, Theorem 50], and combinatorics; this beats (1.2) in the full range $M,N\in(p^{1/2-\delta},p^{1-o(1)})$ with a small but effective power saving, and for sequences $(\alpha_{m})$, $(\beta_{n})$ with more general additively-structured supports. We also mention some related additive-combinatorial approaches of Shparlinski–Zhang [40] for smooth sequences, of the author [32, §4] for additively-structured sequences, and of Kerr–Shparlinski–Wu–Xi [22] for Type I bilinear forms (where only the sequence $(\alpha_{m})$ is smooth).

For moduli with a suitable factorization, the best Type II bounds so far have come from the $q$-van der Corput method [15], which relies on the twisted multiplicativity of Kloosterman sums, Cauchy–Schwarz, and a shifting trick; it was first applied in this setting by Blomer–Milićević [3]. The $q$-van der Corput method can also be iterated, leading to strong results for smooth square-free moduli [46, 44]. Unfortunately, these arguments fail to handle certain types of composite moduli when $MN\approx c$, including squares of primes and products of two distinct primes of the same size.

1.2. Main results.

In this work, we develop a new method to bound bilinear forms with Kloosterman sums for essentially all composite moduli. Like Shkredov [38, 37], we rely on non-abelian Fourier analysis; unlike Shkredov, we use the normal subgroups of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ to our advantage, and we avoid relying on $L^{2}$-flattening results, to arrive at quantitatively-good power savings over (1.2) (up to $c^{-1/12}$; see Example 1.3). Our key innovation is a new type of amplification argument with non-abelian characters, detailed in Section 2.3, which may be of independent interest.

Combining our bounds with those of Kowalski–Michel–Sawin [24] (as well as Blomer–Milićević [3] for an optimization), we obtain a non-trivial result for general moduli beyond the Pólya–Vinogradov range, given in Theorem 7.4. (One could also combine our bounds with the results of Shkredov [38, Theorem 4] for prime moduli, to obtain a result like Theorem 1.1 which does not rely on algebraic geometry.) We state a particular case of this result below, when $M,N\ll c^{1/2+o(1)}$.

Theorem 1.1.

Let $c,M,N\in\mathbb{Z}_{+}$ with $M,N\ll c^{1/2+o(1)}$. Then for any complex sequences $(\alpha_{m})_{m\leq M}$, $(\beta_{n})_{n\leq N}$ and $a\in(\mathbb{Z}/c\mathbb{Z})^{\times}$, one has

$$\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha\|\|\beta\|c^{1-\frac{1}{700}+o(1)}.$$

Moreover, if $|\alpha_{m}|\leq 1$ for all $m$ (so $\|\alpha\|\leq\sqrt{M}$), then

$$\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\sqrt{M}\|\beta\|c^{1-\frac{1}{276}+o(1)}.$$

Our main technical result leading to Theorem 1.1 is Theorem 7.1, which considers a factorization of the modulus into three parts, $c=dd^{\prime}e$ (but one can usually take $d^{\prime}=1$ or $e=1$). Below we state a particular case of Theorem 7.1, focusing on the same range $M,N\ll c^{1/2+o(1)}$ as in Theorem 1.1.

Theorem 1.2.

Let $c=dd^{\prime}e$ for some $d,d^{\prime},e\in\mathbb{Z}_{+}$ with $d^{\prime}\mid d$ and $(d,e)=1$, and let $f$ be the largest integer such that $f^{2}\mid cd$. Let $\mathcal{I},\mathcal{J}\subset\mathbb{Z}$ be intervals with $|\mathcal{I}|,|\mathcal{J}|\ll c^{1/2+o(1)}$. Then for any complex sequences $(\alpha_{m})_{m\in\mathcal{I}}$, $(\beta_{n})_{n\in\mathcal{J}}$ and $a\in(\mathbb{Z}/c\mathbb{Z})^{\times}$, one has

$$\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha\|\|\beta\|c^{1+o(1)}\left(\frac{f}{\min(c,d^{2})}\right)^{\frac{1}{6}}.$$

Since $f\leq\sqrt{cd}$, Theorem 1.2 automatically gives a non-trivial result when $d\in(c^{1/3+o(1)},c^{1-o(1)})$. Unless $c$ has a prime factor larger than $c^{1-o(1)}$, one can always find a factorization $c=dd^{\prime}e$ with $d$ in this range, $(d,e)=1$, and $d^{\prime}\mid d$, which makes the general result in Theorem 1.1 possible.

Example 1.3.

Let $d\mid c$ such that $\tfrac{c}{d}$ is square-free. Then one can take $d^{\prime}:=(\tfrac{c}{d},d)$, $e:=\tfrac{c}{dd^{\prime}}$, $f=d$ in Theorem 1.2, so the saving over the trivial bound becomes $(d+\tfrac{c}{d})^{-1/6}$. If $d\asymp\sqrt{c}$, this is roughly

$$c^{-\frac{1}{12}}.$$

In particular, this saving is achieved if $c\in\{p^{2},pq\}$, where $p$ and $q$ are distinct primes with $p\asymp q$. For the same values of $c$, the more general Theorem 7.1 beats (1.2) in the range

$$M\asymp N\in[c^{\frac{5}{12}+o(1)},c^{\frac{5}{8}-o(1)}].$$

Notably, while the values $c\in\{p^{2},pq\}$, $M,N\asymp\sqrt{c}$ give blind spots of the $q$-van der Corput method, they happen to give one of the best cases for our methods; this case has until now constituted the remaining barrier towards the application in Theorem 1.5.
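As an illustration of Example 1.3, the parameters $d^{\prime}$, $e$, $f$ and the resulting saving factor $(f/\min(c,d^{2}))^{1/6}$ from Theorem 1.2 can be computed for explicit moduli. The sketch below is ours, with hypothetical helper names; it assumes that $d\mid c$ with $c/d$ square-free, and it reports the saving as an exponent of $c$.

```python
from math import gcd, log


def largest_square_divisor_root(n: int) -> int:
    """Largest integer f such that f*f divides n (trial division)."""
    f, p = 1, 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        f *= p ** (exponent // 2)
        p += 1
    return f  # any leftover prime factor occurs to the first power only


def example_parameters(c: int, d: int):
    """Return (d', e, f, exponent) with saving = c**exponent.

    Assumes d divides c and c/d is square-free, as in Example 1.3.
    """
    assert c % d == 0
    d_prime = gcd(c // d, d)
    e = c // (d * d_prime)
    f = largest_square_divisor_root(c * d)
    saving = (f / min(c, d * d)) ** (1 / 6)
    return d_prime, e, f, log(saving) / log(c)
```

For $c=p^{2}$ and $d=p$ this gives $d^{\prime}=p$, $e=1$, $f=p$ and exponent exactly $-\tfrac{1}{12}$; for $c=pq$ and $d=p$ with $p<q$ close together, it gives $d^{\prime}=1$, $e=q$, $f=p$ and an exponent close to $-\tfrac{1}{12}$.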

As a quick corollary of Theorem 1.2, we prove a trilinear-sum bound which includes a short averaging over $c$ with a given large divisor $q$; the point is that only a factorization of $q$ (rather than $c$) is assumed. Such sums arise in the spectral theory of automorphic forms, in particular in Section 9. Below is such a trilinear-sum bound, which is a particular case of Corollary 7.5.

Corollary 1.4.

Let $C\geq\tfrac{1}{2}$, $q=dd^{\prime}e$ for some $d,d^{\prime},e\in\mathbb{Z}_{+}$ with $d^{\prime}\mid d$ and $(d,e)=1$, and let $f$ be the largest integer such that $f^{2}\mid qd$. Let $\mathcal{I},\mathcal{J}\subset\mathbb{Z}$ be intervals of lengths $|\mathcal{I}|,|\mathcal{J}|\ll C^{1/2+o(1)}$. Then for any complex sequences $(\alpha_{m})_{m\in\mathcal{I}}$, $(\beta_{n})_{n\in\mathcal{J}}$, one has

$$\sum_{\begin{subarray}{c}C<c\leq 2C\\ q\mid c\end{subarray}}\left|\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,q)=1\end{subarray}}\alpha_{m}\beta_{n}S(m,n;c)\right|\ll\|\alpha\|\|\beta\|\frac{C^{2+o(1)}}{q}\left(\frac{f}{\min(C,d^{2})+\min(q,d^{2}\frac{C}{q})}\right)^{\frac{1}{6}}.$$
Remark.

Milićević, Qin and Wu [29] have simultaneously and independently obtained results similar to our Theorems 1.1 and 1.5 using substantially different methods. The two papers are complementary, each performing slightly better in different ranges and for different types of moduli, and both achieving power savings for bilinear sums of square-root length and general moduli. The methods in [29] (which use algebraic geometry and build on Kowalski–Michel–Sawin [24] and Blomer–Milićević [3]) obtain better savings for general moduli and remove the dependency on the Ramanujan–Petersson conjecture in the application to moments of twisted cuspidal $L$-functions. Our methods (which use non-abelian Fourier analysis and are closer to the work of Shkredov [38, 37]) perform better and in longer ranges for specific classes of moduli $c$ (see Examples 1.3 and 7.2), can handle more general supports of the sequences $(\alpha_{m}),(\beta_{n})$ (intervals or other additively-structured subsets of $\mathbb{Z}/c\mathbb{Z}$), and find an application to exceptional-spectrum large sieve inequalities (Corollary 1.6).

Finally, we note that our methods might also lead to bounds for other exponential sums with $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ or $\textnormal{GL}_{2}(\mathbb{Z}/c\mathbb{Z})$ structure, such as

$$\sideset{}{{}^{*}}{\sum}_{x\in\mathbb{Z}/c\mathbb{Z}}e\left(\frac{mx+nF(x)}{c}\right),$$

where $F(x)$ is a suitable Möbius transformation (and the sum is restricted to those $x$ such that $F(x)$ is well-defined). Related methods for $\textnormal{SL}_{3}(\mathbb{Z}/c\mathbb{Z})$ could be worth investigating as well.

1.3. Applications

As our first application, we prove an asymptotic for the averaged second moment of modular $L$-functions twisted by primitive Dirichlet characters modulo $q$, where the modulus $q$ is arbitrary. Blomer–Milićević [1] established such an asymptotic for most moduli, more specifically whenever $q$ is not close to a prime or to a product of two primes of the same size. The missing ingredient in these cases has been precisely a power-saving bound for bilinear forms with Kloosterman sums modulo $q$, where both sums have length $\approx\sqrt{q}$. The case of prime moduli $q$ was established by Kowalski–Michel–Sawin [24, Theorem 1.5], and the remaining case can now be handled using our Theorem 1.1 (essentially in the setting from Example 1.3).

To state this application, we introduce some quick notation as in [3]. Given $q\in\mathbb{Z}_{+}$, we write

$$\phi^{*}(q):=\sum_{d\mid q}\phi(d)\,\mu\left(\frac{q}{d}\right)$$

for the number of primitive characters modulo $q$; this vanishes if and only if $q\equiv 2\ (\textnormal{mod }4)$, and is otherwise of size $q^{1-o(1)}$. We write $\sum^{*}_{\chi\ \textnormal{mod }q}$ for a sum over all primitive characters modulo $q$. Also following [3], given an $L$-function $L(s)$, we write $L_{q}(s)$ for the product $\prod_{p\mid q}L_{p}(s)$ over all local factors at primes dividing $q$; thus for example $\zeta_{q}(s)=\prod_{p\mid q}(1-p^{-s})^{-1}$.
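The formula for $\phi^{*}(q)$ and its vanishing criterion are easy to check numerically. The following sketch evaluates the divisor sum directly; the function names are our own, and the totient and Möbius function are computed naively, which is adequate for small $q$.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's totient, by direct counting."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def mobius(n: int) -> int:
    """Moebius function mu(n), by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n is divisible by p^2
            result = -result
        p += 1
    return -result if n > 1 else result


def phi_star(q: int) -> int:
    """Number of primitive characters mod q: sum over d | q of phi(d) * mu(q/d)."""
    return sum(phi(d) * mobius(q // d) for d in range(1, q + 1) if q % d == 0)
```

For example, `phi_star(p)` equals $p-2$ for a prime $p$, and `phi_star(q)` is zero precisely when $q\equiv 2\ (\textnormal{mod }4)$.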

Theorem 1.5.

Let $f_{1},f_{2}$ be fixed holomorphic cuspidal newforms for $\textnormal{SL}_{2}(\mathbb{Z})$ with even weights $\kappa_{1},\kappa_{2}$ and Hecke eigenvalues $\lambda_{1}(n),\lambda_{2}(n)$ normalized as in (8.1). Provided that $\kappa_{1}\equiv\kappa_{2}\ (\textnormal{mod }4)$, one has the asymptotic

$$\sideset{}{{}^{*}}{\sum}_{\chi\ \textnormal{mod }q}L(\tfrac{1}{2},f_{1}\otimes\chi)\overline{L(\tfrac{1}{2},f_{2}\otimes\chi)}=\frac{2\phi^{*}(q)}{\zeta(2)}M(f_{1},f_{2};q)+O_{f_{1},f_{2}}\left(q^{1-\frac{1}{674}+o(1)}\right),$$

with a main term of

$$M(f_{1},f_{2};q):=\begin{cases}P(1)\,L(1,\textnormal{sym}^{2}f_{1})\left(\log q+c(f_{1})+\frac{P^{\prime}(1)}{P(1)}\right),&f_{1}=f_{2},\\ Q(1)\,L(1,f_{1}\times f_{2}),&f_{1}\neq f_{2},\end{cases}$$

where $c(f_{1})$ is a constant depending only on $f_{1}$ and

$$P(s):=\frac{\zeta_{q}(2s)}{L_{q}(s,\textnormal{sym}^{2}f_{1})},\qquad\qquad Q(s):=\frac{\zeta_{q}(2s)}{L_{q}(s,f_{1}\times f_{2})}.$$
Remark.

Similar results can be obtained for Maass cusp forms, with some care in removing the dependency on the Ramanujan conjecture; see also [29]. The case of (non-cuspidal) Eisenstein series reduces to the result of Young [47] on fourth moments of Dirichlet $L$-functions for prime moduli, extended to all moduli by Wu [45].

Our second application concerns the exceptional spectrum of the hyperbolic Laplacian on $\Gamma_{0}(q)\backslash\mathbb{H}$, consisting of Maass cusp forms of level $q\in\mathbb{Z}_{+}$ with eigenvalues $\lambda<\tfrac{1}{4}$. Selberg’s eigenvalue conjecture [34], one of the central open problems in the theory of $\textnormal{GL}_{2}$ automorphic forms, states that this exceptional spectrum is empty. However, unconditionally, exceptional forms often produce the worst contribution in applications of the Kuznetsov trace formula [11, 27] to analytic number theory problems [12, 43, 7, 32], losing exponential factors in the parameter $\theta:=(\tfrac{1}{4}-\lambda)^{1/2}$. The best known pointwise bound is Kim–Sarnak $\theta\leq\tfrac{7}{64}$ [23, Appendix 2], but on-average results can also lead to savings in the $\theta$-aspect, sometimes enough to match the conditional results [33, 17]. Following Deshouillers–Iwaniec [11], these on-average results often take the shape of large sieve inequalities for the Fourier coefficients of exceptional Maass forms, incorporating factors of $X^{2\theta}$. While improvements are now possible [32] for exceptional-spectrum large sieve inequalities with special sequences $(\alpha_{n})_{n\leq N}$, the savings in the $\theta$-aspect for arbitrary sequences have been limited to $(\tfrac{q}{N})^{2\theta}$, due to Deshouillers–Iwaniec [11, Theorem 5]. In fact, obtaining any power saving for arbitrary sequences when $N\asymp q$ is as hard as proving Selberg’s eigenvalue conjecture [32, §2].

In Theorem 9.4, we overcome this barrier at $(\tfrac{q}{N})^{2\theta}$ if $q$ is suitably-composite and $N$ is not too large, using Theorem 7.1. We note that in many applications [4, 28, 12, 9, 10], the level $q$ is a product of two factors of similar sizes, and $N\in(\sqrt{q},q)$. We state below a particular case of Theorem 9.4, for $N\approx\sqrt{q}$. We point the reader to Section 9 for more background and notation.

Corollary 1.6.

Let $q\in\mathbb{Z}_{+}$ have a divisor $d\asymp\sqrt{q}$ such that $\tfrac{q}{d}$ is square-free. Consider an orthonormal basis of Maass cusp forms for $\Gamma_{0}(q)$, with Laplacian eigenvalues $\lambda_{j}$ and Fourier coefficients $(\rho_{j}(n))_{n\in\mathbb{Z}}$ around $\infty$ (normalized as in [11, 32]). Let $N\ll q^{\frac{1}{2}+o(1)}$, and $(\alpha_{n})_{N<n\leq 2N}$ be a complex sequence supported on $(n,q)=1$. Then with $\theta_{j}:=(\tfrac{1}{4}-\lambda_{j})^{1/2}\leq\tfrac{7}{64}$, one has

$$\sum_{\lambda_{j}<\frac{1}{4}}q^{\frac{6}{5}\theta_{j}}\left|\sum_{N<n\leq 2N}\alpha_{n}\,\rho_{j}(n)\right|^{2}\ll(qN)^{o(1)}\|\alpha\|^{2}. \tag{1.3}$$

For reference, [11, Theorem 5] of Deshouillers–Iwaniec would include a factor of $(\tfrac{q}{N})^{2\theta_{j}}$ (which is $q^{\theta_{j}}$ when $N=\sqrt{q}$) in the left-hand side, so Corollary 1.6 wins a factor of $q^{\theta/5}$ in this case.

Remark.

It follows from the more general Theorem 9.4 that one can relax the condition that $\tfrac{q}{d}$ is square-free when some averaging over levels $q\leq Q$ with $d\mid q$ (and $d\asymp\sqrt{Q}$) is available. The sequence $(\alpha_{n})$ inside the large sieve may depend on $q$ in this case, unlike in [11, Theorem 6].

1.4. Acknowledgements

The author is deeply grateful to Valentin Blomer, James Maynard, Sary Drappeau, Philippe Michel, Emmanuel Kowalski, and Ilya Shkredov for helpful comments and discussions. This work was supported by the ERC Advanced Grant 101054336 and Germany’s Excellence Strategy grant EXC-2047/1-390685813. For a part of the duration of this project, the author was also supported by an EPSRC Scholarship, as well as a Campus France Scholarship.

2. Outline

2.1. Structure of the paper

Our proof of Theorem 1.2 has three main steps:

  • I (Fourier analysis). In Section 4 (particularly, Proposition 4.10), we relate matrices of Kloosterman sums to the Fourier transform of certain functions at a special representation $\rho_{c}^{\circ}$ of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$. This involves Fourier analysis on both abelian ($\mathbb{Z}/c\mathbb{Z}$, $\mathbb{R}$) and non-abelian ($\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$) groups, and a Möbius inversion process for representations of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$.

  • II (Amplification). In Section 5 (particularly, Proposition 5.5), we upper bound the spectral norm of the above Fourier coefficients by a weighted count of solutions to an equation in $\textnormal{PSL}_{2}(\mathbb{Z}/d\mathbb{Z})$, with $d\mid c$. This is where our non-abelian amplification argument comes in.

  • III (Combinatorics). In Section 6 (particularly, Proposition 6.4), we analyze this counting problem using elementary arguments. In particular, we do not rely on expansion techniques.

In Section 7, we combine these ingredients to deduce Theorem 1.2 and its variations. The applications to moments of twisted cuspidal $L$-functions and large sieve inequalities for exceptional cusp forms are handled in Sections 8 and 9, respectively.

For the rest of this section, we give a brief informal overview of our argument, ignoring various technical details. We will use the symbols ‘$\approx$’, ‘$\lesssim$’ for identities and inequalities that are ‘morally’ true (and can be made rigorous with minor modifications, such as including $c^{o(1)}$ factors).

2.2. First steps: Fourier analysis

Let us focus on the balanced case $M=N$. We begin by considering the $N\times N$ complex matrix

$$K:=\left(S(m,n;c)\right)_{m,n\leq N},$$

where $c,N\in\mathbb{Z}_{+}$ with $N\leq c$. Our task is to bound the operator norm $\|K\|$ by less than $\min(c,N\sqrt{c})$, to beat (1.2). We extend $K$ to a $c\times c$ matrix, and multiply it on both sides by the unitary matrix $(\tfrac{1}{\sqrt{c}}e(\tfrac{xy}{c}))_{x,y\in\mathbb{Z}/c\mathbb{Z}}$, which preserves the operator norm and essentially amounts to taking a Fourier transform in the $m,n$ variables. Letting $H\approx\tfrac{c}{N}$, a truncated version of Poisson summation yields

$$c\,U^{*}KU^{*}\approx\frac{1}{H^{2}}\left(\sum_{|h|\leq H}\mathcal{T}^{h}\right)\mathcal{S}\left(\sum_{|h|\leq H}\mathcal{T}^{h}\right),\qquad\text{where}\qquad\begin{cases}\mathcal{T}:=(\mathbbm{1}_{u=x+1})_{u,x\in\mathbb{Z}/c\mathbb{Z}},\\ \mathcal{S}:=(\mathbbm{1}_{xy=-1})_{x,y\in\mathbb{Z}/c\mathbb{Z}}.\end{cases}$$

By inserting a few more rows and columns, we can in fact work over the projective line $\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})$ rather than $\mathbb{Z}/c\mathbb{Z}$. The matrices $\mathcal{T}$ and $\mathcal{S}$ then extend to $\rho_{c}(T)$ and $\rho_{c}(S)$, where $T$ and $S$ are the usual generators of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ (see (3.13)), and

$$\rho_{c}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to\left\{\text{Unitary maps of }\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})}\right\}$$

is the $c^{1+o(1)}$-dimensional permutation representation corresponding to the action of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ on $\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})$ by Möbius transformations. It then remains to bound the spectral norm of the matrix

$$\frac{1}{H^{2}}\sum_{|h_{1}|,|h_{2}|\leq H}\rho_{c}(T^{h_{1}}ST^{h_{2}})$$

by less than $\min(1,N/\sqrt{c})$. In this form, our task is actually impossible: the matrix above decomposes as a direct sum corresponding to the irreducible representations inside $\rho_{c}$, one of which is the trivial representation, and this contributes exactly one singular value of size $1$. Other small-dimensional subrepresentations of $\rho_{c}$ are also problematic for similar reasons.
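The obstruction coming from the trivial representation can be seen concretely for a prime modulus $p$: every $\rho_{p}(T^{h_{1}}ST^{h_{2}})$ is a permutation matrix on $\mathbb{P}^{1}(\mathbb{Z}/p\mathbb{Z})$, so any average of such matrices fixes the constant vector. The sketch below illustrates this under our own conventions, which are assumptions rather than the paper's: $T$ acts by $x\mapsto x+1$, $S$ acts by $x\mapsto -1/x$, the point at infinity is encoded by the integer `p`, and the average is normalized by the exact number of terms $(2H+1)^{2}$ instead of $H^{2}$.

```python
def mobius_image(h1: int, h2: int, x: int, p: int) -> int:
    """Image of x in P^1(Z/pZ) under T^h1 S T^h2; the value p encodes infinity."""
    inf = p
    y = x if x == inf else (x + h2) % p  # apply T^h2: x -> x + h2
    if y == inf:  # apply S: y -> -1/y
        y = 0
    elif y == 0:
        y = inf
    else:
        y = (-pow(y, -1, p)) % p
    return y if y == inf else (y + h1) % p  # apply T^h1


def apply_average(v, p: int, H: int):
    """Apply the average of the permutation matrices of T^h1 S T^h2 to v.

    v is a list of length p + 1 indexed by the points of P^1(Z/pZ).
    """
    out = [0.0] * (p + 1)
    count = (2 * H + 1) ** 2
    for h1 in range(-H, H + 1):
        for h2 in range(-H, H + 1):
            for x in range(p + 1):
                out[mobius_image(h1, h2, x, p)] += v[x] / count
    return out
```

Applying `apply_average` to the all-ones vector returns the all-ones vector, which is the singular value of size $1$ mentioned above.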

This is where the coprimality constraint $(m,n,c)=1$ comes in. Incorporating this weight into the matrix $K$ and expanding it by Möbius inversion ultimately results in a ‘sifted’ representation,

$$K^{\circ}:=(S(m,n;c)\mathbbm{1}_{(m,n,c)=1})_{m\leq M,n\leq N}\qquad\rightsquigarrow\qquad\rho_{c}^{\circ}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}),$$

where $\rho_{c}^{\circ}$ is essentially obtained by removing from $\rho_{c}$ the contribution of all subrepresentations isomorphic to

$$\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\xrightarrow{\text{Reduction mod }d}\textnormal{SL}_{2}(\mathbb{Z}/d\mathbb{Z})\xrightarrow{\rho_{d}}\left\{\text{Unitary maps of }\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z})}\right\},$$

for $d\mid c$. Although $\rho_{c}^{\circ}$ is not irreducible in general, it has the key property that all of its $c^{o(1)}$ irreducible subrepresentations are large, of dimension $c^{1-o(1)}$ (see Proposition 4.6).

2.3. The key step: Amplification

We are left to bound the spectral norm of the non-abelian Fourier coefficient

$$\widehat{F}(\rho)=\sum_{g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})}F(g)\rho(g),\qquad\quad F:=\frac{1}{H^{2}}\sum_{|h_{1}|,|h_{2}|\leq H}\mathbbm{1}_{T^{h_{1}}ST^{h_{2}}}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to\mathbb{C},$$

where $\rho$ is any irreducible subrepresentation of $\rho_{c}^{\circ}$. A natural approach is to use the trace method, i.e., to bound the top singular value $\|\widehat{F}(\rho_{c}^{\circ})\|$ by an even moment of all singular values, and then to expand the latter as a trace; this brings in the character $\chi:=\textnormal{Tr}\rho$.

One can then attempt to use non-abelian Fourier analysis by summing over all irreducible characters $\chi^{\prime}$ of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$. However, this sum must somehow amplify the contribution of $\chi^{\prime}=\chi$ compared to other irreducible characters of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$, especially the small-dimensional ones; otherwise, our construction of $\rho_{c}^{\circ}$ by eliminating various subrepresentations from $\rho_{c}$ will have been useless.

If $\chi$ were an abelian character of $(\mathbb{Z}/c\mathbb{Z})^{\times}$, i.e., a Dirichlet character, then following the ideas of Duke–Friedlander–Iwaniec [13], one could weigh the sum by an amplifier of the shape

$$A(\chi^{\prime}):=\left|\sum_{\ell\in\mathcal{L}}\overline{\chi}^{\prime}(\ell)\chi(\ell)\right|^{2}=\sum_{\ell_{1},\ell_{2}\in\mathcal{L}}\overline{\chi}^{\prime}(\ell_{1}\ell_{2}^{-1})\chi(\ell_{1}\ell_{2}^{-1}),\qquad\quad\chi\in\widehat{(\mathbb{Z}/c\mathbb{Z})^{\times}},$$

where $\mathcal{L}$ is some set of positive integers (e.g., the primes in a dyadic interval). This $A(\chi^{\prime})$ has size $\approx|\mathcal{L}|^{2}$ when $\chi^{\prime}=\chi$, and should typically obey square-root cancellation when $\chi^{\prime}\neq\chi$.

Inspired by this, we construct a general amplifier for irreducible representations of a finite non-abelian group $G$; to the best of our knowledge, this is the first instance of such a construction, and it might find applications to other problems. We set

$$A(\chi^{\prime}):=\left\|\sum_{\ell\in\mathcal{L}}\overline{\rho^{\prime}(\ell)}\otimes\rho(\ell)\right\|_{S^{2}}^{2}=\sum_{\ell_{1},\ell_{2}\in\mathcal{L}}\overline{\chi}^{\prime}(\ell_{1}\ell_{2}^{-1})\chi(\ell_{1}\ell_{2}^{-1}),\qquad\quad\rho^{\prime}\in\widehat{G},\ \chi^{\prime}:=\textnormal{Tr}\rho^{\prime},$$

where $\|\cdot\|_{S^{2}}$ denotes the Frobenius norm of a map (which is the $\ell^{2}$ norm of its singular values), and $\mathcal{L}$ is a well-chosen subset of $G$. It is most convenient to pick $\mathcal{L}$ to be a normal subgroup of $G$; note that when $G=\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$, this is only possible if $c$ is composite. We will in fact pick

$$\mathcal{L}=\Gamma_{c}(d):=\ker\left(\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to\textnormal{SL}_{2}(\mathbb{Z}/d\mathbb{Z})\right),$$

for a suitable divisor $d$ of $c$. The result of this amplification argument for the sixth moment of singular values is a bound of the shape (see Proposition 5.1)

$$\|\widehat{F}(\rho)\|^{6}\lesssim\frac{c^{3}H^{-6}}{\sum_{\ell\in\Gamma_{c}(d)}|\chi(\ell)|^{2}}\sum_{\begin{subarray}{c}|h_{1}|,\ldots,|h_{6}|\leq H\\ T^{h_{1}}S\cdots T^{h_{6}}S\in\Gamma_{c}(d)\end{subarray}}\chi(T^{h_{1}}S\cdots T^{h_{6}}S). \tag{2.1}$$

To go any further, we need to know the typical size of the character $\chi$ on $\Gamma_{c}(d)$, based on the information that $\dim\chi\gg c^{1-o(1)}$. This is a somewhat challenging computation involving Clifford theory, and depends on the factorizations of $c$ and $d$; see Lemmas 5.4 and 5.2.
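For small moduli, the congruence kernel $\Gamma_{c}(d)$ chosen as the amplifier set can be enumerated by brute force, which makes its size and normality easy to inspect. The sketch below is ours and uses hypothetical helper names; matrices are encoded as tuples $(a,b,g,h)$ standing for the rows $(a,b)$ and $(g,h)$, and the enumeration costs $c^{4}$ steps, so it is only meant for very small $c$.

```python
from itertools import product


def sl2(c: int):
    """All matrices (a, b, g, h) over Z/cZ with a*h - b*g = 1."""
    return [
        (a, b, g, h)
        for a, b, g, h in product(range(c), repeat=4)
        if (a * h - b * g) % c == 1 % c
    ]


def congruence_kernel(c: int, d: int):
    """Gamma_c(d): elements of SL_2(Z/cZ) reducing to the identity mod d."""
    assert c % d == 0
    identity = (1 % d, 0, 0, 1 % d)
    return [m for m in sl2(c) if tuple(t % d for t in m) == identity]


def mul(x, y, c: int):
    """Product of two 2x2 matrices modulo c."""
    a, b, g, h = x
    s, t, u, v = y
    return ((a * s + b * u) % c, (a * t + b * v) % c,
            (g * s + h * u) % c, (g * t + h * v) % c)


def inv(x, c: int):
    """Inverse of a determinant-one matrix modulo c."""
    a, b, g, h = x
    return (h % c, -b % c, -g % c, a % c)


def is_normal(c: int, d: int) -> bool:
    """Check that Gamma_c(d) is stable under conjugation by SL_2(Z/cZ)."""
    kernel = set(congruence_kernel(c, d))
    return all(mul(mul(g, l, c), inv(g, c), c) in kernel
               for g in sl2(c) for l in kernel)
```

Since reduction modulo $d$ is surjective, one expects $|\Gamma_{c}(d)|=|\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})|/|\textnormal{SL}_{2}(\mathbb{Z}/d\mathbb{Z})|$; for instance $|\Gamma_{4}(2)|=48/6=8$ and $|\Gamma_{9}(3)|=648/24=27$, while $\Gamma_{c}(c)$ is trivial.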

Let us now focus on the case when $c=p^{2}$ is the square of a prime and $N\approx H\approx p$; we naturally pick $d=p$. It turns out that $\chi$ typically has size $\approx p$ on $\Gamma_{p^{2}}(p)$, and roughly $\approx p^{2}$ at $\pm I\in\textnormal{SL}_{2}(\mathbb{Z}/p^{2}\mathbb{Z})$. To beat (1.2), it essentially remains to bound

$$\sum_{|h_{1}|,\ldots,|h_{6}|\leq p}\mathbbm{1}_{T^{h_{1}}S\cdots T^{h_{6}}S\equiv\pm I\ (\textnormal{mod }p^{2})}\stackrel{?}{<}p^{4},\qquad\sum_{|h_{1}|,\ldots,|h_{6}|\leq p}\mathbbm{1}_{T^{h_{1}}S\cdots T^{h_{6}}S\equiv\pm I\ (\textnormal{mod }p)}\stackrel{?}{<}p^{5}. \tag{2.2}$$

2.4. Final steps: Combinatorics

The estimates in (2.2) amount to counting the number of solutions to the system of congruences

$$\begin{cases}1-h_{2}h_{3}\equiv\mp(1-h_{5}h_{6})\\ h_{1}(1-h_{2}h_{3})+h_{3}\equiv\pm h_{5}\\ h_{4}(1-h_{2}h_{3})+h_{2}\equiv\pm h_{6}\end{cases}\ (\textnormal{mod }p^{2}\text{, respectively, }p),$$

with $|h_{1}|,\ldots,|h_{6}|\leq p$. For the generic solutions, one can expect each congruence to cut down the total number of solutions $p^{6}$ by the size of the modulus, but one must also account for certain diagonal solutions where some $h_{i}=0$. A careful but elementary analysis (which becomes more involved when the modulus $c$ is arbitrary) shows that these congruences have $\approx p^{2}$ solutions modulo $p^{2}$ and $\approx p^{3}$ solutions modulo $p$; see Proposition 6.4. Both of these counts are sharp, and save a factor of $p$ over the bounds required in (2.2). This saving is ultimately raised to the power $\tfrac{1}{6}$ in (2.1) (since we considered a sixth moment of singular values), and putting everything together yields

$$\left\|\left(S(m,n;p^{2})\right)_{m,n\leq p}\mathbbm{1}_{(m,n,p)=1}\right\|\lesssim p^{2-\frac{1}{6}},$$

as in Example 1.3. We note that for the other case $c=pq$ from Example 1.3, one can simplify the amplification argument by noting that all irreducible characters of $\textnormal{SL}_{2}(\mathbb{Z}/pq\mathbb{Z})$ are tensor products of irreducible characters of $\textnormal{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$ and $\textnormal{SL}_{2}(\mathbb{Z}/q\mathbb{Z})$, but the end result is the same.
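The counting problem behind (2.2) can be explored by brute force for tiny moduli. The sketch below counts words $T^{h_{1}}S\cdots T^{h_{k}}S\equiv\pm I$ directly from matrix products; it is an illustration under stated assumptions and not the argument of Proposition 6.4. In particular, we assume the generators $T=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ and $S=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$, so that $T^{h}S=\begin{pmatrix}h&-1\\1&0\end{pmatrix}$, and we let each $h_{i}$ range over all residues modulo the modulus rather than over $|h_{i}|\leq p$. The function names are hypothetical.

```python
from itertools import product


def mat_mul(A, B, mod: int):
    """Product of two 2x2 matrices (tuples of rows) modulo mod."""
    return (
        ((A[0][0] * B[0][0] + A[0][1] * B[1][0]) % mod,
         (A[0][0] * B[0][1] + A[0][1] * B[1][1]) % mod),
        ((A[1][0] * B[0][0] + A[1][1] * B[1][0]) % mod,
         (A[1][0] * B[0][1] + A[1][1] * B[1][1]) % mod),
    )


def word(hs, mod: int):
    """The matrix T^{h_1} S T^{h_2} S ... T^{h_k} S modulo mod."""
    result = ((1, 0), (0, 1))
    for h in hs:
        # T^h S = [[h, -1], [1, 0]] under the assumed conventions
        result = mat_mul(result, ((h % mod, -1 % mod), (1, 0)), mod)
    return result


def count_pm_identity(k: int, mod: int) -> int:
    """Number of (h_1, ..., h_k) in (Z/mod)^k whose word is +I or -I."""
    plus = ((1, 0), (0, 1))
    minus = ((mod - 1, 0), (0, mod - 1))
    return sum(1 for hs in product(range(mod), repeat=k)
               if word(hs, mod) in (plus, minus))
```

For words of length two only $h_{1}=h_{2}=0$ contributes (since $S^{2}=-I$), and for length four and an odd prime modulus $p$ a short computation gives $3p-2$ solutions; `count_pm_identity(6, p)` can be compared against the heuristic size $\approx p^{3}$ for small $p$.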

2.5. Comments on prime moduli

When c=pc=p is a prime and d=pd=p, the amplifier from Section˜2.3 reduces to the ‘trivial’ choice

A(χ)=χ¯(I)χ(I)=dimχdimχ,A(\chi^{\prime})=\overline{\chi}^{\prime}(I)\chi(I)=\dim\chi^{\prime}\dim\chi,

since =Γc(c)={I}\mathcal{L}=\Gamma_{c}(c)=\{I\}. In this setting, ˜2.1 (with 66 replaced by another even integer qq) reads

F^(ρ)qp2Hq|h1|,,|hq|H𝟙Th1SThqS=I.\|\widehat{F}(\rho)\|^{q}\lesssim p^{2}H^{-q}\sum_{|h_{1}|,\ldots,|h_{q}|\leq H}\mathbbm{1}_{T^{h_{1}}S\cdots T^{h_{q}}S=I}.

This was observed, using a somewhat different language, by Shkredov [37, proofs of Lemmas 22 and 53]. Shkredov then relied on an L2L^{2}-flattening lemma [37, Theorem 50], which stems from a result of Helfgott [19], to bound the right-hand side above by O(p2HqHqp3)=O(p1)O(p^{2}H^{-q}H^{q}p^{-3})=O(p^{-1}) for a quite large value of qq depending on logplogH\tfrac{\log p}{\log H}. This leads to a bound for bilinear (Type II) sums of Kloosterman sums with prime moduli [38, (6) of Theorem 4], with a power saving of pδp^{-\delta}, where δ1q\delta\approx\tfrac{1}{q}.

To obtain a more quantitatively-relevant power saving, competitive with [24, 25], one must use a smaller value of qq; one would then need to solve a counting problem with few variables h1,,hqh_{1},\ldots,h_{q}, as in Section˜2.4. We do not know how to do this, but such an approach could produce good results when, e.g., q{8,10,12}q\in\{8,10,12\}. In particular, assuming ˜6.2 for q=8q=8, one could prove a non-trivial bound for bilinear sums of Kloosterman sums with prime moduli pp and sequences of lengths MN>p3/8+o(1)M\asymp N>p^{3/8+o(1)}; interestingly, the same limit at p3/8+o(1)p^{3/8+o(1)} appears in the results of Kowalski–Michel–Sawin [25], so our work reaffirms the difficulty of this barrier.

Alternatively, to obtain non-trivial results at prime moduli, it might be possible to use a different choice of subset $\mathcal{L}\subset\textnormal{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$ in the construction of the amplifier from Section 2.3. Indeed, although a normal subgroup is the most natural choice for $\mathcal{L}$, it is possible that another conjugation-invariant subset might produce a useful amplifier when normal subgroups are not available.

3. Preliminaries

3.1. Analytic and arithmetic notation

We use the standard asymptotic notation from analytic number theory, indicating dependencies of implicit constants on a parameter ε\varepsilon through subscripts. In particular, fεgf\ll_{\varepsilon}g and f=Oε(g)f=O_{\varepsilon}(g) both mean |f|Cεg|f|\leq C_{\varepsilon}g for some constant Cε>0C_{\varepsilon}>0 depending only on ε\varepsilon; fεgf\asymp_{\varepsilon}g means fεgεff\ll_{\varepsilon}g\ll_{\varepsilon}f, f=Ωε(g)f=\Omega_{\varepsilon}(g) means fεgf\gg_{\varepsilon}g; f(x)=o(g(x))f(x)=o(g(x)) means f(x)g(x)0\tfrac{f(x)}{g(x)}\to 0 as xx\to\infty; f(x)xo(1)g(x)f(x)\ll x^{o(1)}g(x) is equivalent to the statement that f(x)εxεg(x)f(x)\ll_{\varepsilon}x^{\varepsilon}g(x) for all ε>0\varepsilon>0. With this notation, the divisor bound reads dc1co(1)\sum_{d\mid c}1\ll c^{o(1)}.

We use the notation $\mathbbm{1}_{S}$ for both indicator functions of sets $S$ and truth values ($0$ or $1$) of statements $S$; we also abbreviate $\mathbbm{1}_{x}:=\mathbbm{1}_{\{x\}}$ for singletons. We write $n\sim N$ for the range $N<n\leq 2N$, $\|\alpha\|:=(\sum_{n}|\alpha_{n}|^{2})^{1/2}$ for the $\ell^{2}$ norm of a sequence $(\alpha_{n})_{n\in\mathcal{N}}$ for some $\mathcal{N}\subset\mathbb{Z}$ (or $\mathcal{N}\subset\mathbb{Z}/c\mathbb{Z}$), and $e(t):=\exp(2\pi it)$ for $t\in\mathbb{R}/\mathbb{Z}$. Given a positive integer $c$, we let $c\mathbb{Z}$ (resp., $c\mathbb{Z}_{+}$) be the set of integers (resp., positive integers) divisible by $c$, and $\overline{x}$ be the inverse of $x$ modulo $c$ (here $c$ may be implied from context, e.g., in an exponential phase $e(\overline{x}/c)$). We let $\mu$ and $\phi$ be the Möbius and Euler totient functions. Given $a,b\in\mathbb{Z}_{+}$, we write $(a,b)$ for their greatest common divisor (and similarly for more positive integers), and $(a,b^{\infty})$ for the greatest divisor of $a$ whose prime factors all divide $b$. We write $p^{k}\|c$ when a prime power exactly divides a positive integer, meaning that $p^{k}\mid c$ but $p^{k+1}\nmid c$. We will reserve the letter $\psi$ for functions on $\mathbb{Z}/c\mathbb{Z}$, and $\Phi,\Psi$ for functions on $\mathbb{R}$. We denote the Fourier transform of an $L^{1}$ function $\Phi:\mathbb{R}\to\mathbb{C}$ by

Φ^(ξ):=Φ(t)e(tξ)𝑑t.\widehat{\Phi}(\xi):=\int_{-\infty}^{\infty}\Phi(t)\,e(-t\xi)\,dt. (3.1)

In particular, if Ψ(t):=Φ(At)e(Bt)\Psi(t):=\Phi(At)e(Bt) for some A>0A>0 and BB\in\mathbb{R}, then a change of variables yields

Ψ^(ξ)\displaystyle\widehat{\Psi}(\xi) =Φ(At)e(t(ξB))𝑑t=1AΦ^(ξBA),\displaystyle=\int_{-\infty}^{\infty}\Phi(At)\,e(-t(\xi-B))\,dt=\frac{1}{A}\widehat{\Phi}\left(\frac{\xi-B}{A}\right), (3.2)

and the Poisson summation identity reads

nΦ(n)=kΦ^(k).\sum_{n\in\mathbb{Z}}\Phi(n)=\sum_{k\in\mathbb{Z}}\widehat{\Phi}(k). (3.3)

Given a map MM between finite-dimensional complex Hilbert spaces, we write its operator norm as

M:=supv=1Mv=supv=w=1|wTMv|.\|M\|:=\sup_{\|\vec{v}\|=1}\|M\vec{v}\|=\sup_{\|\vec{v}\|=\|\vec{w}\|=1}|\vec{w}^{T}M\vec{v}|. (3.4)

On the Hilbert space n\mathbb{C}^{n} equipped with the Euclidean norm, we define MSq\|M\|_{S^{q}} as the q\ell^{q} norm of singular values of a map (or a matrix) MM, for q[1,]q\in[1,\infty]. In particular, we have

\[\|M\|_{S^{\infty}}=\|M\|\qquad\quad\text{and}\qquad\quad\|M\|_{S^{q}}=\textnormal{Tr}\left((MM^{*})^{q/2}\right)^{\frac{1}{q}}\text{ for }q\in 2\mathbb{Z}_{+},\tag{3.5}\]

where MM^{*} denotes the adjoint (conjugate transpose) of MM. We quickly record the following simple fact about projections and operator norms.

Lemma 3.1.

Let VV be a finite-dimensional complex Hilbert space, WVW\subset V be a subspace, and PW:VVP_{W}:V\to V be the orthogonal projection onto WW. Suppose that WW is an invariant subspace of a linear map M:VVM:V\to V (i.e., the restriction M|W:WWM|_{W}:W\to W is well-defined). Then M|W=MPW\|M|_{W}\|=\|MP_{W}\|.

Proof.

By definition, we have M|W=supwW,w=1Mw\|M|_{W}\|=\sup_{\vec{w}\in W,\|\vec{w}\|=1}\|M\vec{w}\| and MPW=supvV,v=1MPWv\|MP_{W}\|=\sup_{\vec{v}\in V,\|\vec{v}\|=1}\|MP_{W}\vec{v}\|. Since PWvWP_{W}\vec{v}\in W with PWvv=1\|P_{W}\vec{v}\|\leq\|\vec{v}\|=1 for all vV\vec{v}\in V with v=1\|\vec{v}\|=1, we have MPWM|W\|MP_{W}\|\leq\|M|_{W}\|. On the other hand, for each wW\vec{w}\in W with w=1\|\vec{w}\|=1, we have PWw=wP_{W}\vec{w}=\vec{w}, so M|WMPW\|M|_{W}\|\leq\|MP_{W}\|. ∎

3.2. Bounds for Kloosterman sums

We now recall the Ramanujan and Weil bounds for Kloosterman sums, as well as some results of Kowalski–Michel–Sawin [24] and Blomer–Milićević [3].

Lemma 3.2 (Ramanujan bound).

For c+c\in\mathbb{Z}_{+} and nn\in\mathbb{Z}, one has

|S(0,n;c)|(n,c).|S(0,n;c)|\leq(n,c).
Proof.

This is a classical result which follows from Möbius inversion. ∎

Lemma 3.3 (Weil bound).

For c+c\in\mathbb{Z}_{+} and m,nm,n\in\mathbb{Z}, one has

S(m,n;c)co(1)(m,n,c)c.S(m,n;c)\ll c^{o(1)}\sqrt{(m,n,c)c}.
Proof.

This is [21, Corollary 11.12] followed by the divisor bound. ∎
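Both bounds are easy to sanity-check numerically for small moduli. The following is a minimal Python sketch, with function names of our own choosing: it evaluates $S(m,n;c)$ directly from the definition (1.1), checks Lemma 3.2 for all $2\leq c\leq 30$, and checks the classical explicit form $|S(m,n;p)|\leq 2\sqrt{p}$ of the Weil bound for small primes $p\nmid mn$. The explicit constant $2$ is the classical prime-modulus statement, not the form stated in Lemma 3.3.

```python
import cmath
from math import gcd, sqrt

def kloosterman(m: int, n: int, c: int) -> complex:
    """S(m, n; c): sum of e((m*x + n*xbar)/c) over units x mod c (c >= 2)."""
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * ((m * x + n * xbar) % c) / c)
    return total

def ramanujan_bound_holds(c_max: int) -> bool:
    """Check |S(0, n; c)| <= gcd(n, c) for 2 <= c <= c_max and all n mod c."""
    return all(
        abs(kloosterman(0, n, c)) <= gcd(n, c) + 1e-9
        for c in range(2, c_max + 1)
        for n in range(c)
    )

def weil_bound_holds(primes) -> bool:
    """Check |S(m, n; p)| <= 2*sqrt(p) for the given primes p and m, n coprime to p."""
    return all(
        abs(kloosterman(m, n, p)) <= 2 * sqrt(p) + 1e-9
        for p in primes
        for m in range(1, p)
        for n in range(1, p)
    )
```

The small tolerance `1e-9` only absorbs floating-point error; the Ramanujan bound is attained exactly in several cases (e.g., $c=4$, $n=2$).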

For the sake of completeness, we give a quick proof of the trivial bound from ˜1.2.

Proof of ˜1.2.

The second bound implicit in ˜1.2, with a term of MNc\sqrt{MNc}, follows immediately from Lemma˜3.3 and Cauchy–Schwarz. For the first bound implicit in ˜1.2, we eliminate the constraint (m,n,c)=1(m,n,c)=1 by Möbius inversion and use the identity S(dm,dn;c)=ϕ(c)ϕ(c/d)S(m,n;cd)S(dm,dn;c)=\tfrac{\phi(c)}{\phi(c/d)}S(m,n;\tfrac{c}{d}) to write

m,n𝒥(m,n,c)=1αmβnS(am,n;c)co(1)maxdcd|dmαdmdn𝒥βdnS(am,n;cd)|.\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c)\ll c^{o(1)}\max_{d\mid c}d\left|\sum_{dm\in\mathcal{I}}\alpha_{dm}\sum_{dn\in\mathcal{J}}\beta_{dn}S(am,n;\tfrac{c}{d})\right|.

Now apply Cauchy–Schwarz in the sum over mm, and complete the sum over m(mod cd)m\ (\textnormal{mod }\tfrac{c}{d}) to get

d|dmαdmdn𝒥βdnS(am,n;cd)|α(d2m(mod cd)|dn𝒥βdnS(am,n;cd)|2)12.d\left|\sum_{dm\in\mathcal{I}}\alpha_{dm}\sum_{dn\in\mathcal{J}}\beta_{dn}S(am,n;\tfrac{c}{d})\right|\leq\|\alpha\|\left(d^{2}\sum_{m\ (\textnormal{mod }\frac{c}{d})}\left|\sum_{dn\in\mathcal{J}}\beta_{dn}S(am,n;\tfrac{c}{d})\right|^{2}\right)^{\frac{1}{2}}.

Expanding the square and the Kloosterman sums, then performing the sum over mm, one reaches

d2m(mod cd)|dn𝒥βdnS(am,n;cd)|2=dcx(/cd)×|dn𝒥βdne(nxc/d)|2.d^{2}\sum_{m\ (\textnormal{mod }\frac{c}{d})}\left|\sum_{dn\in\mathcal{J}}\beta_{dn}S(am,n;\tfrac{c}{d})\right|^{2}=dc\sum_{x\in(\mathbb{Z}/\frac{c}{d}\mathbb{Z})^{\times}}\left|\sum_{dn\in\mathcal{J}}\beta_{dn}e\left(\frac{nx}{c/d}\right)\right|^{2}.

Finally, complete the sum over x(mod cd)x\ (\textnormal{mod }\tfrac{c}{d}), expand the square, and perform the sum over xx to obtain

dcx(/cd)×|dn𝒥βdne(nxc/d)|2c2β2.dc\sum_{x\in(\mathbb{Z}/\frac{c}{d}\mathbb{Z})^{\times}}\left|\sum_{dn\in\mathcal{J}}\beta_{dn}e\left(\frac{nx}{c/d}\right)\right|^{2}\leq c^{2}\|\beta\|^{2}.

Putting these bounds together completes our proof. ∎
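The scaling identity $S(dm,dn;c)=\tfrac{\phi(c)}{\phi(c/d)}S(m,n;\tfrac{c}{d})$ used in the proof above can be verified by brute force for small moduli. The sketch below is only an illustration with helper names of our own; it treats the modulus $1$ separately, where the Kloosterman sum has a single term equal to $1$.

```python
import cmath
from math import gcd

def phi(n: int) -> int:
    """Euler totient, by direct counting (fine for small n)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def kloosterman(m: int, n: int, c: int) -> complex:
    """S(m, n; c) from the definition; the trivial modulus c = 1 gives 1."""
    if c == 1:
        return 1 + 0j
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * ((m * x + n * xbar) % c) / c)
    return total

def scaling_identity_holds(c_max: int, tol: float = 1e-8) -> bool:
    """Check S(dm, dn; c) = phi(c)/phi(c/d) * S(m, n; c/d) for all d | c <= c_max."""
    for c in range(1, c_max + 1):
        for d in range(1, c + 1):
            if c % d:
                continue
            q = c // d
            factor = phi(c) // phi(q)  # an integer, since phi(q) divides phi(c) when q | c
            for m in range(q):
                for n in range(q):
                    lhs = kloosterman(d * m, d * n, c)
                    rhs = factor * kloosterman(m, n, q)
                    if abs(lhs - rhs) > tol:
                        return False
    return True
```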

Theorem 3.4 (Kowalski–Michel–Sawin [24]).

Let pp be a prime and M,NM,N\in\mathbb{Z} be such that 1NMp11\leq N\leq M\leq p-1 and p1/4<MN<p5/4p^{1/4}<MN<p^{5/4}. Then for any complex sequences (αm)mM(\alpha_{m})_{m\leq M}, (βn)nN(\beta_{n})_{n\leq N} and any a(/p)×a\in(\mathbb{Z}/p\mathbb{Z})^{\times}, one has

m=1Mn=1NαmβnS(am,n;p)αβpo(1)MNp(N12+(MN)316p1164).\sum_{m=1}^{M}\sum_{n=1}^{N}\alpha_{m}\beta_{n}S(am,n;p)\ll\|\alpha\|\|\beta\|p^{o(1)}\sqrt{MNp}\left(N^{-\frac{1}{2}}+(MN)^{-\frac{3}{16}}p^{\frac{11}{64}}\right).
Proof.

This is [24, Theorem 1.1] with k=2k=2 and M,NM,N swapped. ∎

Remark.

The constraint p1/4<MN<p5/4p^{1/4}<MN<p^{5/4} from Theorem˜3.4 can be removed in light of the trivial bound ˜1.2. Indeed, if MNp1/4MN\leq p^{1/4}, then

MNp(MN)316p1164MNpp1164364>MNp,\sqrt{MNp}\cdot(MN)^{-\frac{3}{16}}\cdot p^{\frac{11}{64}}\geq\sqrt{MNp}\cdot p^{\frac{11}{64}-\frac{3}{64}}>\sqrt{MNp},

so the bound ˜1.2 is better. Similarly, if MNp5/4MN\geq p^{5/4}, then

MNp(MN)316p1164p54(12316)p12+1164=p2564+4364>p.\sqrt{MNp}\cdot(MN)^{-\frac{3}{16}}\cdot p^{\frac{11}{64}}\geq p^{\frac{5}{4}(\frac{1}{2}-\frac{3}{16})}\cdot p^{\frac{1}{2}+\frac{11}{64}}=p^{\frac{25}{64}+\frac{43}{64}}>p.
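The exponent arithmetic in this remark can be double-checked with exact rational arithmetic. The snippet below is a small sketch (function names are ours) which recomputes the exponents of $p$ at the two endpoints $MN=p^{1/4}$ and $MN=p^{5/4}$.

```python
from fractions import Fraction as Fr

def exponent_small_range() -> Fr:
    """Exponent of p in (MN)^(-3/16) * p^(11/64) at the endpoint MN = p^(1/4)."""
    return Fr(11, 64) - Fr(3, 16) * Fr(1, 4)

def exponent_large_range() -> Fr:
    """Exponent of p in sqrt(MNp) * (MN)^(-3/16) * p^(11/64) at MN = p^(5/4)."""
    return Fr(5, 4) * (Fr(1, 2) - Fr(3, 16)) + Fr(1, 2) + Fr(11, 64)
```

The first exponent is positive and the second exceeds $1$, matching the two displayed inequalities.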
Theorem 3.5 (Blomer–Milićević [3]).

Let c,d,M,N+c,d,M,N\in\mathbb{Z}_{+} such that dcd\mid c and dd is odd. Then for any complex sequences (αm)mM(\alpha_{m})_{m\leq M} and (βn)nN(\beta_{n})_{n\leq N} such that |αm|1|\alpha_{m}|\leq 1 for all mm, and any a(/c)×a\in(\mathbb{Z}/c\mathbb{Z})^{\times}, one has

m=1Mn=1N(n,c)=1αmβnS(am,n;c)Mβ(MNc)12+o(1)(c1/2d1/2M1/2+1d1/4+d1/4N1/2).\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\sqrt{M}\|\beta\|(MNc)^{\frac{1}{2}+o(1)}\left(\frac{c^{1/2}}{d^{1/2}M^{1/2}}+\frac{1}{d^{1/4}}+\frac{d^{1/4}}{N^{1/2}}\right).
Proof.

Although [3, Theorem 5] does not include an $a$-scalar inside the Kloosterman sum, it holds in this slightly more general form with the same proof, and it is in fact applied this way in [3, p. 471, after (4.2)]. Dyadically summing instances of [3, Theorem 5] with $(q,r,s,M,K,\lambda(k))$ in loc. cit. replaced by $(c,c,\tfrac{c}{d},N,M,\alpha_{m})$, one obtains the bound

nN(n,c)=1|mMαmS(am,n;c)|2(cMN)o(1)M2Nc(cdM+1d+dN).\sum_{\begin{subarray}{c}n\leq N\\ (n,c)=1\end{subarray}}\left|\sum_{m\leq M}\alpha_{m}S(am,n;c)\right|^{2}\ll(cMN)^{o(1)}M^{2}Nc\left(\frac{c}{dM}+\frac{1}{\sqrt{d}}+\frac{\sqrt{d}}{N}\right).

The desired bound now follows from Cauchy–Schwarz in the shape

|m=1Mn=1N(n,c)=1αmβnS(am,n;c)|2β2nN(n,c)=1|mMαmS(am,n;c)|2.\left|\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\right|^{2}\leq\|\beta\|^{2}\sum_{\begin{subarray}{c}n\leq N\\ (n,c)=1\end{subarray}}\left|\sum_{m\leq M}\alpha_{m}S(am,n;c)\right|^{2}.

(Since (βn)(\beta_{n}) can be chosen to attain equality in this Cauchy–Schwarz step, Theorem˜3.5 is in fact a restatement of [3, Theorem 5].) ∎

3.3. Fourier analysis on finite groups

Here we recall some general facts and notation from representation theory on finite groups; we point the reader to [35, 16, 42, 20] for more background. Let GG be a finite group with identity element ee. A (unitary) representation of GG is a homomorphism

ρ:GU(V),\rho:G\to U(V),

where $V$ is a finite-dimensional complex Hilbert space and $U(V)$ is the set of unitary transformations of $V$. In particular, $\rho(e)=\textnormal{Id}_{V}$ is the identity transformation on $V$. Given a choice of orthonormal basis of $V\cong\mathbb{C}^{\dim\rho}$, one may of course represent the transformations $\rho(g)$ for $g\in G$ as matrices in $\mathbb{C}^{\dim\rho\times\dim\rho}$. We write

dimρ:=dimV\dim\rho:=\dim V

for the dimension of ρ\rho. We say that two representations ρ1:GU(V1)\rho_{1}:G\to U(V_{1}), ρ2:GU(V2)\rho_{2}:G\to U(V_{2}) are isomorphic iff there is an invertible linear map M:V1V2M:V_{1}\to V_{2} such that Mρ1(g)=ρ2(g)MM\circ\rho_{1}(g)=\rho_{2}(g)\circ M for all gGg\in G (since we normalize all representations to be unitary, the map MM can also be taken unitary).

Example 3.6.

We write 𝟎:GU({0})\mathbf{0}:G\to U(\{0\}) for the zero representation given by 𝟎(g)=0gG\mathbf{0}(g)=0\ \forall g\in G, and 𝟏:GU()\mathbf{1}:G\to U(\mathbb{C}) for the trivial representation given by 𝟏(g)=IdgG\mathbf{1}(g)=\textnormal{Id}_{\mathbb{C}}\ \forall g\in G. Any action of GG on a finite set XX induces a permutation representation ρ:GU(X)\rho:G\to U(\mathbb{C}^{X}), defined by (ρ(g)f)(x):=f(g1x)(\rho(g)f)(x):=f(g^{-1}x) for gGg\in G, xXx\in X. The regular representation RGR_{G} is the permutation representation induced by the action by left-multiplication on X=GX=G, so dimRG=|G|\dim R_{G}=|G|.

Given two representations $\rho_{1}:G\to U(V_{1})$ and $\rho_{2}:G\to U(V_{2})$, we write $\rho_{1}\oplus\rho_{2}:G\to U(V_{1}\oplus V_{2})$ and $\rho_{1}\otimes\rho_{2}:G\to U(V_{1}\otimes V_{2})$ for their direct sum and tensor product. The operations $\oplus$ and $\otimes$ have identity elements $\mathbf{0}$ and $\mathbf{1}$ respectively (up to isomorphism). Given $\rho:G\to U(V)$, we write

ρm:=ρρm times\rho\oplus^{m}:=\underbrace{\rho\oplus\cdots\oplus\rho}_{m\text{ times}}

for all nonnegative integers mm; when m=0m=0, we interpret this as the zero representation 𝟎\mathbf{0}. We use a similar notation for repeated direct sums of linear maps.

An invariant subspace $W$ of a representation $\rho:G\to U(V)$ is a subspace of $V$ such that $\rho(g)W\subset W$ for all $g\in G$. For such $W$, we define $\rho|_{W}:G\to U(W)$ by $\rho|_{W}(g):=\rho(g)|_{W}$ for all $g\in G$, which is automatically unitary, and we say that $\rho|_{W}$ is a subrepresentation of $\rho$. One can decompose $\rho\cong\rho|_{W}\oplus\rho|_{W^{\perp}}$; conversely, if $\rho\cong\rho_{1}\oplus\rho_{2}$ then $\rho_{1}$ and $\rho_{2}$ are isomorphic to subrepresentations of $\rho$.

We say that a representation of GG is irreducible iff it is nonzero and has no nonzero subrepresentations other than itself. We write G^\widehat{G} for a complete set of irreducible representations of GG up to isomorphism, which always includes the trivial representation 𝟏\mathbf{1}. Any representation ρ\rho of GG has a unique decomposition (up to permutation and isomorphism) into irreducible representations,

ρρG^ρMult(ρ,ρ),\rho\cong\bigoplus_{\rho^{\prime}\in\widehat{G}}\rho^{\prime}\oplus^{\textnormal{Mult}(\rho^{\prime},\rho)}, (3.6)

where Mult(ρ,ρ)\textnormal{Mult}(\rho^{\prime},\rho) is called the multiplicity of ρ\rho^{\prime} inside ρ\rho. In particular, Mult(ρ,RG)=dimρ\textnormal{Mult}(\rho^{\prime},R_{G})=\dim\rho^{\prime}.

Given two finite groups G1,G2G_{1},G_{2} and representations ρ1:G1U(V1)\rho_{1}:G_{1}\to U(V_{1}) and ρ2:G2U(V2)\rho_{2}:G_{2}\to U(V_{2}), we write ρ1ρ2:G1×G2U(V1V2)\rho_{1}\boxtimes\rho_{2}:G_{1}\times G_{2}\to U(V_{1}\otimes V_{2}) for the representation of G1×G2G_{1}\times G_{2} given by

(ρ1ρ2)(g1,g2):=ρ1(g1)ρ2(g2),g1G1,g2G2.(\rho_{1}\boxtimes\rho_{2})(g_{1},g_{2}):=\rho_{1}(g_{1})\otimes\rho_{2}(g_{2}),\qquad\quad g_{1}\in G_{1},\ g_{2}\in G_{2}.

The irreducible representations of G1×G2G_{1}\times G_{2} are (up to isomorphism) precisely those of the form ρ1ρ2\rho_{1}\boxtimes\rho_{2} where ρ1G^1\rho_{1}\in\widehat{G}_{1} and ρ2G^2\rho_{2}\in\widehat{G}_{2} [35, §3.2].

Notation 3.7.

If G1,G2,ρ1,ρ2G_{1},G_{2},\rho_{1},\rho_{2} are as above, and G1,2G_{1,2} is a group isomorphic to G1×G2G_{1}\times G_{2} by a fixed implicit map (such as ˜3.16), we also use the notation ρ1ρ2\rho_{1}\boxtimes\rho_{2} to describe representations of G1,2G_{1,2}.

A character χ:G\chi:G\to\mathbb{C} is any function of the form χ(g)=Trρ(g)\chi(g)=\textnormal{Tr}\rho(g), where ρ\rho is a representation of GG; note that characters are constant on conjugacy classes, that χ(e)=dimρ\chi(e)=\dim\rho and χ(g1)=χ¯(g)\chi(g^{-1})=\overline{\chi}(g), and that isomorphic representations induce the same character. If ρ1,ρ2\rho_{1},\rho_{2} are two representations of GG with characters χ1,χ2\chi_{1},\chi_{2}, then Tr(ρ1ρ2)=χ1+χ2\textnormal{Tr}(\rho_{1}\oplus\rho_{2})=\chi_{1}+\chi_{2} and Tr(ρ1ρ2)=χ1χ2\textnormal{Tr}(\rho_{1}\otimes\rho_{2})=\chi_{1}\chi_{2}. If ρ1,ρ2\rho_{1},\rho_{2} are representations of G1,G2G_{1},G_{2} with characters χ1,χ2\chi_{1},\chi_{2} (respectively), then Tr(ρ1ρ2)(g1,g2)=χ1(g1)χ2(g2)\textnormal{Tr}(\rho_{1}\boxtimes\rho_{2})(g_{1},g_{2})=\chi_{1}(g_{1})\chi_{2}(g_{2}). We say that χ\chi is irreducible iff ρ\rho is, and write Irr(G)\textnormal{Irr}(G) for the set of all irreducible characters of GG. The character table of GG satisfies the following orthogonality relations.

Lemma 3.8 (Character orthogonality).

One has

gGχ1(g)χ¯2(g)\displaystyle\sum_{g\in G}\chi_{1}(g)\overline{\chi}_{2}(g) =|G|𝟙χ1=χ2,χ1,χ2Irr(G),\displaystyle=|G|\mathbbm{1}_{\chi_{1}=\chi_{2}},\qquad\qquad\chi_{1},\chi_{2}\in\textnormal{Irr}(G), (3.7)
χIrr(G)χ(g1)χ¯(g2)\displaystyle\sum_{\chi\in\textnormal{Irr}(G)}\chi(g_{1})\overline{\chi}(g_{2}) ={|G||C|,g1,g2 belong to the same conjugacy class C of G,0,g1,g2G are not conjugate.\displaystyle=\begin{cases}\frac{|G|}{|C|},&g_{1},g_{2}\text{ belong to the same conjugacy class $C$ of $G$,}\\ 0,&g_{1},g_{2}\in G\text{ are not conjugate.}\end{cases} (3.8)
Proof.

See, e.g., [16, Theorem 2.12 and Exercise 2.21]. ∎
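As a concrete illustration of (3.7) and (3.8), one can check both relations on the character table of the symmetric group $S_{3}$. This example is ours and is not used elsewhere in the paper; the table below lists the trivial, sign, and two-dimensional standard characters on the classes of the identity, the transpositions, and the $3$-cycles.

```python
# Character table of S_3 (illustrative example, not taken from the paper).
CLASS_SIZES = [1, 3, 2]          # identity, transpositions, 3-cycles
CHARACTERS = [
    [1, 1, 1],                   # trivial character
    [1, -1, 1],                  # sign character
    [2, 0, -1],                  # standard 2-dimensional character
]
GROUP_ORDER = sum(CLASS_SIZES)   # |S_3| = 6

def row_orthogonality_holds() -> bool:
    """Relation (3.7): sum over g of chi1(g) * conj(chi2(g)) = |G| if chi1 = chi2, else 0."""
    for i, chi1 in enumerate(CHARACTERS):
        for j, chi2 in enumerate(CHARACTERS):
            total = sum(s * a * b for s, a, b in zip(CLASS_SIZES, chi1, chi2))
            if total != (GROUP_ORDER if i == j else 0):
                return False
    return True

def column_orthogonality_holds() -> bool:
    """Relation (3.8): sum over chi of chi(g1) * conj(chi(g2)) = |G|/|C| or 0."""
    for i, size in enumerate(CLASS_SIZES):
        for j in range(len(CLASS_SIZES)):
            total = sum(chi[i] * chi[j] for chi in CHARACTERS)
            expected = GROUP_ORDER // size if i == j else 0
            if total != expected:
                return False
    return True
```

All characters of $S_{3}$ are real-valued, so complex conjugation is omitted in this sketch.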

It follows from ˜3.7 and 3.6 that for an arbitrary character χ=Trρ\chi=\textnormal{Tr}\rho of GG, one has

1|G|gG|χ(g)|2=ρG^Mult(ρ,ρ)2.\frac{1}{|G|}\sum_{g\in G}|\chi(g)|^{2}=\sum_{\rho^{\prime}\in\widehat{G}}\textnormal{Mult}(\rho^{\prime},\rho)^{2}. (3.9)

We may also restrict a representation $\rho:G\to U(V)$ and its character $\chi=\textnormal{Tr}\rho$ to a subgroup $H\leq G$, to obtain a representation $\rho|_{H}:H\to U(V)$ with character $\chi|_{H}=\textnormal{Tr}\rho|_{H}$. If $\rho$ is irreducible, $\rho|_{H}$ is not necessarily irreducible. When $H$ is a normal subgroup, the structure of $\rho|_{H}$ can be better understood using Clifford theory [6].

Lemma 3.9 (Clifford).

Let GG be a group, NGN\triangleleft G be a normal subgroup, and ρG^\rho\in\widehat{G} be an irreducible representation. Then there exist positive integers L,m,dL,m,d with dimρ=Lmd\dim\rho=Lmd, and non-isomorphic irreducible representations σ1,,σLN^\sigma_{1},\ldots,\sigma_{L}\in\widehat{N} of dimension dd, all lying in the same orbit of the action of GG by conjugation (i.e., (gσ)(n):=σ(gng1)(g\cdot\sigma)(n):=\sigma(gng^{-1}) for gGg\in G and nNn\in N), such that

ρ|N=1Lσm.\rho|_{N}\cong\bigoplus_{\ell=1}^{L}\sigma_{\ell}\oplus^{m}.
Proof.

See, e.g., [20, Theorem 6.5]. ∎

Given a function F:GF:G\to\mathbb{C} and a (not necessarily irreducible) representation ρ:GU(V)\rho:G\to U(V), we define the Fourier coefficient F^(ρ):VV\widehat{F}(\rho):V\to V by

F^(ρ):=gGF(g)ρ(g).\widehat{F}(\rho):=\sum_{g\in G}F(g)\rho(g). (3.10)

This obeys $\widehat{F_{1}*F_{2}}(\rho)=\widehat{F_{1}}(\rho)\widehat{F_{2}}(\rho)$, where $(F_{1}*F_{2})(g):=\sum_{g_{1}g_{2}=g}F_{1}(g_{1})F_{2}(g_{2})$ denotes the convolution of two functions $F_{1},F_{2}:G\to\mathbb{C}$. In particular, if $G=\mathbb{Z}/c\mathbb{Z}$, the irreducible representations (which are all $1$-dimensional since $G$ is abelian) are of the shape $\rho_{a}(g):=e(\tfrac{ag}{c})$ for $a,g\in\mathbb{Z}/c\mathbb{Z}$. In this case, we write

F^(a):=F^(ρa)=g/cF(g)e(agc).\widehat{F}(a):=\widehat{F}(\rho_{-a})=\sum_{g\in\mathbb{Z}/c\mathbb{Z}}F(g)\,e\left(-\frac{ag}{c}\right). (3.11)
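In the abelian case $G=\mathbb{Z}/c\mathbb{Z}$, the multiplicativity of Fourier coefficients under convolution can be checked directly. The sketch below (helper names are ours) represents a function $F:\mathbb{Z}/c\mathbb{Z}\to\mathbb{C}$ as a Python list of length $c$, uses the normalization of (3.11), and takes the convolution to be $\sum_{g_{1}+g_{2}=g}F_{1}(g_{1})F_{2}(g_{2})$.

```python
import cmath

def fourier(F, a: int) -> complex:
    """Fourier coefficient (3.11): sum over g of F(g) * e(-a*g/c), with c = len(F)."""
    c = len(F)
    return sum(F[g] * cmath.exp(-2j * cmath.pi * a * g / c) for g in range(c))

def convolve(F1, F2):
    """Convolution on Z/cZ: (F1 * F2)(g) = sum over g1 + g2 = g of F1(g1) * F2(g2)."""
    c = len(F1)
    return [sum(F1[g1] * F2[(g - g1) % c] for g1 in range(c)) for g in range(c)]

def convolution_theorem_holds(F1, F2, tol: float = 1e-9) -> bool:
    """Check that the transform of F1 * F2 is the product of the transforms."""
    conv = convolve(F1, F2)
    return all(
        abs(fourier(conv, a) - fourier(F1, a) * fourier(F2, a)) < tol
        for a in range(len(F1))
    )
```

For instance, `convolution_theorem_holds([1, 2, 0, -1, 3], [0.5, 1j, 2, 0, -1])` tests the identity on two arbitrary functions on $\mathbb{Z}/5\mathbb{Z}$.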
Lemma 3.10.

Let F:GF:G\to\mathbb{C}, ρ:GU(V)\rho:G\to U(V) be a representation, and q[1,)q\in[1,\infty). Then one has

F^(ρ)Sqq=ρG^Mult(ρ,ρ)F^(ρ)Sqq,F^(ρ)=maxρG^Mult(ρ,ρ)>0F^(ρ).\|\widehat{F}(\rho)\|_{S^{q}}^{q}=\sum_{\rho^{\prime}\in\widehat{G}}\textnormal{Mult}(\rho^{\prime},\rho)\|\widehat{F}(\rho^{\prime})\|_{S^{q}}^{q},\qquad\qquad\|\widehat{F}(\rho)\|=\max_{\begin{subarray}{c}\rho^{\prime}\in\widehat{G}\\ \textnormal{Mult}(\rho^{\prime},\rho)>0\end{subarray}}\|\widehat{F}(\rho^{\prime})\|.
Proof.

By ˜3.6, there exists a unitary map UU (from VV to the direct sum of ρ\rho’s irreducible invariant subspaces) such that for any gGg\in G,

Uρ(g)U=ρG^ρ(g)Mult(ρ,ρ),U\rho(g)U^{*}=\bigoplus_{\rho^{\prime}\in\widehat{G}}{\rho^{\prime}(g)\oplus}^{\textnormal{Mult}(\rho^{\prime},\rho)},

using some implicit ordering of G^\widehat{G}. But then, by ˜3.10, we have

UF^(ρ)U=gGF(g)Uρ(g)U\displaystyle U\widehat{F}(\rho)U^{*}=\sum_{g\in G}F(g)U\rho(g)U^{*} =gGF(g)ρG^ρ(g)Mult(ρ,ρ)\displaystyle=\sum_{g\in G}F(g)\bigoplus_{\rho^{\prime}\in\widehat{G}}\rho^{\prime}(g)\oplus^{\textnormal{Mult}(\rho^{\prime},\rho)}
=ρG^(gGF(g)ρ(g))Mult(ρ,ρ)=ρG^F^(ρ)Mult(ρ,ρ),\displaystyle=\bigoplus_{\rho^{\prime}\in\widehat{G}}\left(\sum_{g\in G}F(g)\rho^{\prime}(g)\right)\oplus^{\textnormal{Mult}(\rho^{\prime},\rho)}=\bigoplus_{\rho^{\prime}\in\widehat{G}}\widehat{F}(\rho^{\prime})\oplus^{\textnormal{Mult}(\rho^{\prime},\rho)},

and the conclusion follows from the fact that the multiset of singular values of a direct sum of matrices is the union of the multisets of singular values of those matrices. ∎

3.4. Facts about SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})

Let c+c\in\mathbb{Z}_{+}. Recall the special linear groups SL2()\textnormal{SL}_{2}(\mathbb{Z}) and SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) of matrices in 2×2\mathbb{Z}^{2\times 2} (resp., (/c)2×2(\mathbb{Z}/c\mathbb{Z})^{2\times 2}) with determinant 11, and the projective special linear groups,

PSL2():=SL2()/{±I},PSL2(/c):=SL2(/c)/{γI:γ/c,γ2=1}.\textnormal{PSL}_{2}(\mathbb{Z}):=\textnormal{SL}_{2}(\mathbb{Z})/\{\pm I\},\qquad\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}):=\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})/_{\{\gamma I:\gamma\in\mathbb{Z}/c\mathbb{Z},\gamma^{2}=1\}}. (3.12)

When the group SL2()\textnormal{SL}_{2}(\mathbb{Z}), PSL2()\textnormal{PSL}_{2}(\mathbb{Z}), SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) or PSL2(/c)\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}) is understood from context, we write

I:=(1001),T:=(1101),S:=(0110),I:=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\qquad T:=\begin{pmatrix}1&1\\ 0&1\end{pmatrix},\qquad S:=\begin{pmatrix}0&-1\\ 1&0\end{pmatrix}, (3.13)

which satisfy the relations S2=(ST)3=I-S^{2}=-(ST)^{3}=I, and in the case of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) or PSL2(/c)\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}), Tc=IT^{c}=I. Note that TT and SS generate SL2()\textnormal{SL}_{2}(\mathbb{Z}).
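These relations can be confirmed by direct $2\times 2$ matrix multiplication modulo $c$. The following is a minimal sketch with helper names of our own, valid for $c\geq 2$; matrices are stored as nested tuples with entries reduced modulo $c$.

```python
def mat_mul(A, B, c: int):
    """Product of two 2x2 matrices with entries reduced modulo c."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % c for j in range(2))
        for i in range(2)
    )

def mat_pow(A, e: int, c: int):
    """e-th power of a 2x2 matrix modulo c, by repeated multiplication."""
    result = ((1, 0), (0, 1))
    for _ in range(e):
        result = mat_mul(result, A, c)
    return result

def relations_hold(c: int) -> bool:
    """Check S^2 = -I, (ST)^3 = -I and T^c = I in SL_2(Z/cZ), for c >= 2."""
    identity = ((1, 0), (0, 1))
    minus_identity = ((c - 1, 0), (0, c - 1))
    T = ((1, 1), (0, 1))
    S = ((0, c - 1), (1, 0))
    return (
        mat_pow(S, 2, c) == minus_identity
        and mat_pow(mat_mul(S, T, c), 3, c) == minus_identity
        and mat_pow(T, c, c) == identity
    )
```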

Notation 3.11 (Projective line).

For c+c\in\mathbb{Z}_{+}, we recall the projective line

1(/c):={(x,y):x,y/c,x(/c)+y(/c)=/c}/,\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}):=\left\{(x,y):x,y\in\mathbb{Z}/c\mathbb{Z},\ x(\mathbb{Z}/c\mathbb{Z})+y(\mathbb{Z}/c\mathbb{Z})=\mathbb{Z}/c\mathbb{Z}\right\}/_{\sim},

where \sim is the equivalence relation generated by (x,y)(αx,αy)(x,y)\sim(\alpha x,\alpha y) for α(/c)×\alpha\in(\mathbb{Z}/c\mathbb{Z})^{\times}. We write the equivalence class of (x,y)(x,y) as [x:y][x:y], and we will typically use the letters u,vu,v to denote projective points in 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}), reserving x,yx,y for elements of /c\mathbb{Z}/c\mathbb{Z}. For dcd\mid c, we write the natural map 1(/c)1(/d)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})\to\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z}) which reduces both entries modulo dd as uumod du\mapsto u\ \textnormal{mod }d.

The group PSL2(/c)\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}) (and, through it, SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})) acts on 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) by

(mnpq)[x:y]:=[mx+ny:px+qy].\begin{pmatrix}m&n\\ p&q\end{pmatrix}[x:y]:=[mx+ny:px+qy]. (3.14)

One can think of 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) as /c\mathbb{Z}/c\mathbb{Z} with a few additional points, which must be included to obtain a well-defined action. Indeed, any projective point u1(/c)u\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) can be written as u=[x:y]u=[x:y] for some x{1,,c}x\in\{1,\ldots,c\} and ycy\mid c with (x,y)=1(x,y)=1, and thus |1(/c)|=c1+o(1)|\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})|=c^{1+o(1)}. In particular, one can embed /c1(/c)\mathbb{Z}/c\mathbb{Z}\subset\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) by x[x:1]x\mapsto[x:1], and via this embedding, the generators from ˜3.13 act on elements of /c\mathbb{Z}/c\mathbb{Z} by

Tx=x+1,Sy=y¯,for x/c,y(/c)×.Tx=x+1,\qquad\qquad Sy=-\overline{y},\qquad\qquad\text{for }x\in\mathbb{Z}/c\mathbb{Z},\ y\in(\mathbb{Z}/c\mathbb{Z})^{\times}.

We now briefly go over a few facts about the subgroups and representations of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}).

Notation 3.12 (Reduction mod dd).

Given a positive integer dd with dcd\mid c, we denote by

πc,d:SL2(/c)SL2(/d)\pi_{c,d}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to\textnormal{SL}_{2}(\mathbb{Z}/d\mathbb{Z})

the natural epimorphism which ‘reads’ the entries of gSL2(/c)g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) modulo dd. We write

Γc(d):=kerπc,d\Gamma_{c}(d):=\ker\pi_{c,d}

for the congruence subgroup given by the kernel of this map (consisting of matrices of the form I+dAI+dA, where one may view the entries of AA as elements of /cd\mathbb{Z}/\tfrac{c}{d}\mathbb{Z}).

Lemma 3.13.

SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) acts transitively on 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) (i.e., there is only one orbit). In fact, for dcd\mid c, there is a bijection between 1(/d)\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z}) and orbits of 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) under Γc(d)\Gamma_{c}(d),

Γc(d)\1(/c)1(/d),Γc(d)uumod d.\begin{array}[]{rcl}\Gamma_{c}(d)\backslash\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})&\longrightarrow&\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z}),\\ \Gamma_{c}(d)\cdot u&\longmapsto&u\ \textnormal{mod }d.\end{array} (3.15)
Proof.

For any [x:y]1(/c)[x:y]\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}), there exist by definition a,b/ca,b\in\mathbb{Z}/c\mathbb{Z} with ax+by1(mod c)ax+by\equiv 1\ (\textnormal{mod }c), so [x:y]=(xbya)[1:0]SL2(/c)[1:0][x:y]=\left(\begin{smallmatrix}x&-b\\ y&a\end{smallmatrix}\right)[1:0]\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\cdot[1:0]. Thus the action of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) on 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) is transitive.

The map in ˜3.15 is well-defined since (I+dA)u(mod d)=u(mod d)(I+dA)u\ (\textnormal{mod }d)=u\ (\textnormal{mod }d) for any I+dAΓc(d)I+dA\in\Gamma_{c}(d). It is surjective since the original map 1(/c)1(/d)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})\to\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z}) is surjective. To show that ˜3.15 is also injective, suppose u(mod d)=v(mod d)u\ (\textnormal{mod }d)=v\ (\textnormal{mod }d) for some u,v1(/c)u,v\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}), and we aim to show that Γc(d)u=Γc(d)v\Gamma_{c}(d)\cdot u=\Gamma_{c}(d)\cdot v. By the transitivity of the action of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}), we can find gSL2(/c)g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) such that gv=[1:0]1(/c)gv=[1:0]\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}), so

(gu)mod d=(gv)mod d=[1:0]1(/d).(gu)\ \textnormal{mod }d=(gv)\ \textnormal{mod }d=[1:0]\in\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z}).

Write gu=[xd+1:yd]gu=[xd+1:yd] for some x,y/cx,y\in\mathbb{Z}/c\mathbb{Z}. Since gu1(/c)gu\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}), we have 1=(xd+1,yd,c)=(xd+1,yd2,c)1=(xd+1,yd,c)=(xd+1,yd^{2},c), so there exist a,b/ca,b\in\mathbb{Z}/c\mathbb{Z} with a(xd+1)+byd21(mod c)a(xd+1)+byd^{2}\equiv 1\ (\textnormal{mod }c), and in particular a1(mod d)a\equiv 1\ (\textnormal{mod }d). Then,

gu=(xd+1bdyda)[1:0]Γc(d)gv=gΓc(d)v,gu=\begin{pmatrix}xd+1&-bd\\ yd&a\end{pmatrix}[1:0]\in\Gamma_{c}(d)\cdot gv=g\Gamma_{c}(d)\cdot v,

where the last equality is due to the normality of Γc(d)\Gamma_{c}(d). Hence uΓc(d)vu\in\Gamma_{c}(d)\cdot v, as we wanted. ∎
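For small moduli, the bijection (3.15) can be verified by brute-force enumeration. The sketch below is purely illustrative (all helper names are ours): it represents each point of $\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})$ by the lexicographically smallest unit multiple of a primitive pair, lists $\Gamma_{c}(d)$ explicitly, and checks that reduction modulo $d$ is constant on orbits, separates distinct orbits, and hits every point of $\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z})$.

```python
from itertools import product
from math import gcd

def units(c: int):
    """Units of Z/cZ, as representatives in 1..c (for c = 1 this is [1])."""
    return [a for a in range(1, c + 1) if gcd(a, c) == 1]

def canon(x: int, y: int, c: int):
    """Canonical representative of [x : y]: the smallest unit multiple of (x, y)."""
    return min(((a * x) % c, (a * y) % c) for a in units(c))

def proj_line(c: int):
    """All points of P^1(Z/cZ), as canonical representatives."""
    return {
        canon(x, y, c)
        for x, y in product(range(c), repeat=2)
        if gcd(gcd(x, y), c) == 1
    }

def gamma(c: int, d: int):
    """Matrices (m, n, p, q) in SL_2(Z/cZ) that are congruent to I modulo d."""
    return [
        (m, n, p, q)
        for m, n, p, q in product(range(c), repeat=4)
        if (m * q - n * p) % c == 1 % c
        and (m - 1) % d == 0 and (q - 1) % d == 0
        and n % d == 0 and p % d == 0
    ]

def orbit_map_is_bijection(c: int, d: int) -> bool:
    """Check that orbits of Gamma_c(d) on P^1(Z/cZ) biject with P^1(Z/dZ)."""
    group = gamma(c, d)
    remaining = proj_line(c)
    images = []
    while remaining:
        x, y = next(iter(remaining))
        orbit = {canon(m * x + n * y, p * x + q * y, c) for m, n, p, q in group}
        remaining -= orbit
        reductions = {canon(u % d, v % d, d) for u, v in orbit}
        if len(reductions) != 1:      # the map must be well-defined on orbits
            return False
        images.append(reductions.pop())
    # injective (no repeated image) and surjective (all of P^1(Z/dZ) is hit)
    return len(set(images)) == len(images) and set(images) == proj_line(d)
```

The enumeration costs about $c^{4}$ operations, so it is only meant for very small $c$, such as $(c,d)\in\{(4,2),(6,3),(9,3)\}$.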

By the Chinese remainder theorem (/cpkc/pk\mathbb{Z}/c\mathbb{Z}\cong\prod_{p^{k}\|c}\mathbb{Z}/p^{k}\mathbb{Z}), combining the maps πc,pk\pi_{c,p^{k}} for pkcp^{k}\|c produces isomorphisms

SL2(/c)pkcSL2(/pk),Γc(d)pkcpjdΓpk(pj),\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\cong\prod_{p^{k}\|c}\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}),\qquad\qquad\Gamma_{c}(d)\cong\prod_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}\Gamma_{p^{k}}(p^{j}), (3.16)

for dcd\mid c (in the products above, it is understood that only primes which divide cc are included, so k1k\geq 1, but we allow j=0j=0). Since |SL2(/pk)|=p3k(11p2)|\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})|=p^{3k}(1-\tfrac{1}{p^{2}}), it follows that

\[|\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})|=c^{3}\prod_{\text{prime }p|c}\left(1-\frac{1}{p^{2}}\right)\asymp c^{3}\qquad\Rightarrow\qquad|\Gamma_{c}(d)|=\frac{|\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})|}{|\textnormal{SL}_{2}(\mathbb{Z}/d\mathbb{Z})|}\asymp\frac{c^{3}}{d^{3}},\tag{3.17}\]

and that the irreducible representations of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) can be parametrized as

\[\widehat{\textnormal{SL}}_{2}(\mathbb{Z}/c\mathbb{Z})=\left\{\mathop{\boxtimes}\limits_{p^{k}\|c}\rho_{p,k}:\rho_{p,k}\in\widehat{\textnormal{SL}}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})\right\}.\tag{3.18}\]
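The order formula in (3.17) can be confirmed by direct enumeration for small $c$. The following Python sketch (function names are ours) counts matrices of determinant $1$ modulo $c$, compares the count with $c^{3}\prod_{p\mid c}(1-p^{-2})$, and also counts the congruence subgroup $\Gamma_{c}(d)$ as the set of such matrices congruent to $I$ modulo $d$.

```python
from itertools import product

def sl2_order(c: int) -> int:
    """Number of 2x2 matrices over Z/cZ with determinant 1, by brute force."""
    return sum(
        1
        for m, n, p, q in product(range(c), repeat=4)
        if (m * q - n * p) % c == 1 % c
    )

def sl2_order_formula(c: int) -> int:
    """c^3 times the product over primes p | c of (1 - 1/p^2), in exact integers."""
    order = c ** 3
    for p in range(2, c + 1):
        is_prime = all(p % r for r in range(2, p))
        if is_prime and c % p == 0:
            order = order * (p * p - 1) // (p * p)  # exact, since p^3 divides c^3
    return order

def gamma_order(c: int, d: int) -> int:
    """Order of the kernel of reduction mod d on SL_2(Z/cZ), by brute force."""
    return sum(
        1
        for m, n, p, q in product(range(c), repeat=4)
        if (m * q - n * p) % c == 1 % c
        and (m - 1) % d == 0 and (q - 1) % d == 0
        and n % d == 0 and p % d == 0
    )
```

For example, `gamma_order(4, 2)` returns $8=4^{3}/2^{3}$, while `gamma_order(6, 2)` returns $24=144/6$, which differs from $6^{3}/2^{3}=27$ by the Euler factor $1-\tfrac{1}{9}$ at the prime $3$ dividing $c$ but not $d$.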

Now let pp be a prime and k+k\in\mathbb{Z}_{+}, and let us focus on understanding SL^2(/pk)\widehat{\textnormal{SL}}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}).

Definition 3.14 (Primitive representations).

A representation ρ:SL2(/pk)U(V)\rho:\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})\to U(V) is called primitive iff its kernel does not contain Γpk(pk1)\Gamma_{p^{k}}(p^{k-1}). Equivalently (by the first isomorphism theorem), ρ\rho cannot be factored as ρπpk,pk1\rho^{\prime}\circ\pi_{p^{k},p^{k-1}} for some representation ρ\rho^{\prime} of SL2(/pk1)\textnormal{SL}_{2}(\mathbb{Z}/p^{k-1}\mathbb{Z}). A primitive (resp., non-primitive) character is one induced by a primitive (resp., non-primitive) representation.

Thus the primitive irreducible representations of SL2(/pk)\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}) are ‘new’ at level pkp^{k}, much like primitive Dirichlet characters or newforms in the theory of automorphic representations. We can easily isolate the ‘maximal’ non-primitive component of a representation using the following lemma.

Lemma 3.15.

Let ρ:SL2(/pk)U(V)\rho:\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})\to U(V) be a representation and

Vf:={vV:ρ(g)v=v,gΓpk(pk1)}.V_{f}:=\{v\in V:\rho(g)v=v,\ \forall g\in\Gamma_{p^{k}}(p^{k-1})\}.

Then ρ|Vf\rho|_{V_{f}} is non-primitive, and ρ|Vf\rho|_{V_{f}^{\perp}} is isomorphic to a direct sum of primitive irreducible representations.

Proof.

The fact that $V_{f}$ and thus $V_{f}^{\perp}$ are invariant subspaces of $V$ follows quickly from the fact that $\Gamma_{p^{k}}(p^{k-1})\triangleleft\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})$, so $\rho|_{V_{f}}$ and $\rho|_{V_{f}^{\perp}}$ are well-defined. By definition, $\rho|_{V_{f}}(g)=\textnormal{Id}_{V_{f}}$ for all $g\in\Gamma_{p^{k}}(p^{k-1})$, so the kernel of $\rho|_{V_{f}}$ includes $\Gamma_{p^{k}}(p^{k-1})$, i.e., $\rho|_{V_{f}}$ is non-primitive.

Now let $\rho|_{V_{0}}$ be any irreducible subrepresentation of $\rho|_{V_{f}^{\perp}}$, where $V_{0}\subset V_{f}^{\perp}$. Since $V_{0}\neq\{0\}$ and $V_{0}\cap V_{f}=\{0\}$, we can find some $v\in V_{0}\setminus V_{f}$, and thus some $g\in\Gamma_{p^{k}}(p^{k-1})$ such that $\rho(g)v\neq v$. But then $\rho|_{V_{0}}(g)\neq\textnormal{Id}_{V_{0}}$, so the kernel of $\rho|_{V_{0}}$ does not contain $\Gamma_{p^{k}}(p^{k-1})$, i.e., $\rho|_{V_{0}}$ is primitive. ∎

The primitive irreducible representations of SL2(/pk)\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}) are fairly complicated, but the following lemma will suffice for our purposes. This generalizes the classical spectral-gap result for SL2(/p)\textnormal{SL}_{2}(\mathbb{Z}/p\mathbb{Z}).

Lemma 3.16.

Any primitive irreducible representation ρ\rho of SL2(/pk)\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}) has dimρpk\dim\rho\gg p^{k}.

Proof.

Complete tables with the dimensions of irreducible representations of $\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})$, including the case $p=2$, were given by Nobs–Wolfart [31, p. 525] (who refer to primitive representations in the sense of our Definition 3.14 as having ‘level $k$’). For odd primes $p$, these had been classified by Shalika [36, §4.3], Tanaka [41], and Kutzko [26].

For a more direct proof of the lower bound via Clifford theory, we refer to Bourgain–Gamburd [5, Lemma 7.1] (this assumes $p$ is odd, but an analogous argument applies if $p=2$). To summarize their argument when $k$ is even, Bourgain–Gamburd apply (a variant of) Lemma 3.9 with $G:=\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})$ and $N:=\Gamma_{p^{k}}(p^{k/2})$, to decompose $\rho|_{N}$ into irreducible representations $\sigma_{1},\ldots,\sigma_{L}\in\widehat{N}$, all lying in the same orbit under $G$-conjugation. But $N$ is abelian, so $\widehat{N}\cong N$, and $\sigma_{1},\ldots,\sigma_{L}$ correspond to $G$-conjugate elements $g_{1},\ldots,g_{L}\in N=\Gamma_{p^{k}}(p^{k/2})$. Moreover, the primitivity condition that $\ker\rho$ does not contain $\Gamma_{p^{k}}(p^{k-1})$ implies that $g_{1},\ldots,g_{L}\not\in\Gamma_{p^{k}}(p^{(k/2)+1})$. It follows that

dimρL=|G||CG(g1)|p3k|CG(g1)|,\dim\rho\geq L=\frac{|G|}{|C_{G}(g_{1})|}\gg\frac{p^{3k}}{|C_{G}(g_{1})|},

where CG(g1)C_{G}(g_{1}) is the centralizer of g1g_{1} in GG. It thus remains to bound |CG(g)|p2k|C_{G}(g)|\ll p^{2k} for gΓpk(pk/2)Γpk(p(k/2)+1)g\in\Gamma_{p^{k}}(p^{k/2})\setminus\Gamma_{p^{k}}(p^{(k/2)+1}), which follows from an explicit matrix computation [5, Claim 7.1] (minor modifications are needed here if p=2p=2, but these only incur a constant-factor loss). ∎
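The group order $|G|\gg p^{3k}$ used in the display above can be sanity-checked by brute force for small moduli. The following is only an illustrative sketch: it assumes the standard closed form $|\textnormal{SL}_{2}(\mathbb{Z}/n\mathbb{Z})|=n^{3}\prod_{p\mid n}(1-p^{-2})$, which is not stated in this section, and all function names are hypothetical.

```python
from itertools import product


def sl2_order_bruteforce(n):
    """Count 2x2 matrices over Z/nZ with determinant 1 by exhaustive search."""
    return sum(
        1
        for a, b, c, d in product(range(n), repeat=4)
        if (a * d - b * c) % n == 1 % n
    )


def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def sl2_order_formula(n):
    """Assumed closed form n^3 * prod_{p | n} (1 - 1/p^2), in exact integers."""
    order = n ** 3
    for p in prime_factors(n):
        order = order // (p * p) * (p * p - 1)
    return order


if __name__ == "__main__":
    for n in (2, 3, 4, 8, 9):
        print(n, sl2_order_bruteforce(n), sl2_order_formula(n))
```

Since $1-p^{-2}\geq 3/4$, agreement of the two counts is consistent with the lower bound $|G|\gg p^{3k}$ at $n=p^{k}$.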

4. Representations and Kloosterman matrices

Here we connect matrices of Kloosterman sums modulo cc to Fourier analysis on SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}).

4.1. The relevant representations

When digesting the notation below, the reader should keep in mind the informal outline from Section˜2.2. We will first define the simpler representations (ρc,Vc)(\rho_{c},V_{c}) which are connected to matrices of Kloosterman sums S(m,n;c)S(m,n;c), and then the more relevant subrepresentations (ρc,Vc)(\rho^{\circ}_{c},V_{c}^{\circ}) which correspond to adding the restriction (m,n,c)=1(m,n,c)=1. In fact, the subspace VcVcV^{\circ}_{c}\subset V_{c} will be constructed by sifting out ‘old’ subspaces isomorphic to VdV_{d} for dcd\mid c.

Definition 4.1 (Permutation representations of the projective action).

For c+c\in\mathbb{Z}_{+}, we denote the permutation representation of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) induced by the action ˜3.14 on 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) by

ρc:SL2(/c)U(Vc),Vc:=1(/c),\rho_{c}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to U(V_{c}),\qquad\quad V_{c}:=\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})}, (4.1)

and its character by χc:=Trρc\chi_{c}:=\textnormal{Tr}\rho_{c}. Hence VcV_{c} is the space of functions f:1(/c)f:\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})\to\mathbb{C}, equipped with the standard inner product, and (ρc(g)f)(u)=f(g1u)(\rho_{c}(g)f)(u)=f(g^{-1}u) for gSL2(/c)g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}), u1(/c)u\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}). In particular, for any u1(/c)u\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}), one has ρc(g)𝟙u=𝟙gu\rho_{c}(g)\mathbbm{1}_{u}=\mathbbm{1}_{gu}.

Definition 4.2 (Invariant subspaces).

For c,d+c,d\in\mathbb{Z}_{+} with dcd\mid c, define

Vc(d):={fVc:ρc(n)f=fnΓc(d)}Vc.V_{c}(d):=\left\{f\in V_{c}:\rho_{c}(n)f=f\quad\forall n\in\Gamma_{c}(d)\right\}\subset V_{c}.

In particular, Vc(c)=VcV_{c}(c)=V_{c}. Thus Vc(d)V_{c}(d) is the space of complex-valued functions on 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) which are constant on orbits of Γc(d)\Gamma_{c}(d), so by Lemma˜3.13,

Vc(d)Γc(d)\1(/c)1(/d)=Vd.V_{c}(d)\cong\mathbb{C}^{\Gamma_{c}(d)\backslash\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})}\cong\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z})}=V_{d}. (4.2)
Lemma 4.3.

For c,d+c,d\in\mathbb{Z}_{+} with dcd\mid c, Vc(d)V_{c}(d) is an invariant subspace of ρc\rho_{c}. In fact, using Notation˜3.12, we have

ρc|Vc(d)ρdπc,d.\rho_{c}|_{V_{c}(d)}\cong\rho_{d}\circ\pi_{c,d}.
Proof.

The fact that Vc(d)V_{c}(d) is an invariant subspace follows immediately from the normality of Γc(d)\Gamma_{c}(d). Now let Φ:VdVc(d)\Phi:V_{d}\to V_{c}(d) be the invertible linear map from ˜4.2, which relies on the bijection from ˜3.15. Then for any gSL2(/c)g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}), one can easily check that ρc(g)|Vc(d)Φ=Φρd(πc,d(g))\rho_{c}(g)|_{V_{c}(d)}\circ\Phi=\Phi\circ\rho_{d}(\pi_{c,d}(g)): both maps take the basis vector 𝟙uVd=1(/d)\mathbbm{1}_{u}\in V_{d}=\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z})} to the L2L^{2}-normalized function in Vc(d)V_{c}(d) which is only nonzero on the orbit gΓc(d)u=Γc(d)gug\Gamma_{c}(d)\cdot u=\Gamma_{c}(d)\cdot gu. ∎

In light of Lemma˜4.3, we will need to remove the contribution of ‘old’ representations (ρd,Vd)(\rho_{d},V_{d}) to (ρc,Vc)(\rho_{c},V_{c}). To this end, it will be helpful to adopt the following convention for tensor products.

Notation 4.4 (Ordered tensor products).

If c,c1,c2+c,c_{1},c_{2}\in\mathbb{Z}_{+} are such that c=c1c2c=c_{1}c_{2} and (c1,c2)=1(c_{1},c_{2})=1, then the Chinese Remainder Theorem gives 1(/c)1(/c1)×1(/c2)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})\cong\mathbb{P}^{1}(\mathbb{Z}/c_{1}\mathbb{Z})\times\mathbb{P}^{1}(\mathbb{Z}/c_{2}\mathbb{Z}) by u(umod c1,umod c2)u\mapsto(u\ \textnormal{mod }c_{1},u\ \textnormal{mod }c_{2}), so VcVc1Vc2V_{c}\cong V_{c_{1}}\otimes V_{c_{2}}. Since tensor products of vector spaces are defined up to isomorphism, it is not a great abuse of notation to write

Vc=Vc1Vc2.V_{c}=V_{c_{1}}\otimes V_{c_{2}}.

In particular, given f1Vc1f_{1}\in V_{c_{1}} and f2Vc2f_{2}\in V_{c_{2}}, we view f1f2f_{1}\otimes f_{2} as a function on 1(/c)\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) with values (f1f2)(u)=f1(umod c1)f2(umod c2)(f_{1}\otimes f_{2})(u)=f_{1}(u\ \textnormal{mod }c_{1})\cdot f_{2}(u\ \textnormal{mod }c_{2}). This notation extends to tensor products of subspaces W1Vc1W_{1}\subset V_{c_{1}}, W2Vc2W_{2}\subset V_{c_{2}} (so W1W2VcW_{1}\otimes W_{2}\subset V_{c}), and of linear transformations T1:Vc1Vc1T_{1}:V_{c_{1}}\to V_{c_{1}}, T2:Vc2Vc2T_{2}:V_{c_{2}}\to V_{c_{2}}.

We note that with the conventions from Notations˜3.7 and 4.4, for c=c1c2c=c_{1}c_{2} with (c1,c2)=1(c_{1},c_{2})=1, the product ρc1ρc2:SL2(/c)U(Vc1Vc2)=U(Vc)\rho_{c_{1}}\boxtimes\rho_{c_{2}}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to U(V_{c_{1}}\otimes V_{c_{2}})=U(V_{c}) is precisely the permutation representation ρc\rho_{c} (this is a genuine equality, not just an isomorphism). Moreover, a tensor product of invariant subspaces of ρc1\rho_{c_{1}} and ρc2\rho_{c_{2}} gives an invariant subspace of ρc\rho_{c}, and in fact Vc1(d1)Vc2(d2)=Vc(d)V_{c_{1}}(d_{1})\otimes V_{c_{2}}(d_{2})=V_{c}(d) for d1c1d_{1}\mid c_{1}, d2c2d_{2}\mid c_{2}, and d=d1d2d=d_{1}d_{2}. Iterating this yields the factorizations

V_{c}=\bigotimes_{p^{k}\|c}V_{p^{k}},\qquad\qquad\rho_{c}=\mathop{\boxtimes}_{p^{k}\|c}\rho_{p^{k}}, (4.3)

and, more generally, for dcd\mid c,

V_{c}(d)=\bigotimes_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}V_{p^{k}}(p^{j}),\qquad\qquad\rho_{c}|_{V_{c}(d)}=\mathop{\boxtimes}_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}\rho_{p^{k}}|_{V_{p^{k}}(p^{j})}. (4.4)

Finally, we can define the representations (ρc,Vc)(\rho_{c}^{\circ},V_{c}^{\circ}).

Definition 4.5 (Sifted representations).

For a prime power pkp^{k}, we let Vpk:=Vpk(pk1)VpkV_{p^{k}}^{\circ}:=V_{p^{k}}(p^{k-1})^{\perp}\subset V_{p^{k}} be the orthogonal complement of Vpk(pk1)V_{p^{k}}(p^{k-1}) inside VpkV_{p^{k}} (which is an invariant subspace of ρpk\rho_{p^{k}}). For c+c\in\mathbb{Z}_{+}, we define

Vc:=pkcVpk,ρc:=ρc|Vc,χc:=Trρc.V_{c}^{\circ}:=\bigotimes_{p^{k}\|c}V_{p^{k}}^{\circ},\qquad\qquad\rho_{c}^{\circ}:=\rho_{c}|_{V_{c}^{\circ}},\qquad\qquad\chi_{c}^{\circ}:=\textnormal{Tr}\rho_{c}^{\circ}.
Proposition 4.6 (Decomposition of sifted representations).

For any c+c\in\mathbb{Z}_{+}, one has

\rho_{c}^{\circ}=\mathop{\boxtimes}_{p^{k}\|c}\rho_{p^{k}}^{\circ}. (4.5)

Moreover, each ρpk\rho_{p^{k}}^{\circ} is isomorphic to a nonempty direct sum of primitive irreducible representations, and ρc\rho_{c}^{\circ} is isomorphic to a direct sum of co(1)c^{o(1)} irreducible representations of dimensions c1o(1)c^{1-o(1)}.

Proof.

The factorization in (4.5) follows immediately from (4.3) and Definition 4.5. The fact that $\rho_{p^{k}}^{\circ}=\rho_{p^{k}}|_{V_{p^{k}}^{\circ}}$ is isomorphic to a direct sum of primitive irreducible representations is precisely the content of Lemma 3.15, wherein $V_{f}=V_{p^{k}}(p^{k-1})$ and $V_{f}^{\perp}=V_{p^{k}}^{\circ}$. One can easily construct a function in $V_{p^{k}}$ which is not constant on orbits of $\Gamma_{p^{k}}(p^{k-1})$, so $V_{p^{k}}(p^{k-1})\neq V_{p^{k}}$; hence $V_{p^{k}}^{\circ}\neq\{0\}$, and thus $\rho_{p^{k}}^{\circ}\neq 0$.

Now write each ρpk\rho_{p^{k}}^{\circ} as a direct sum of primitive irreducible representations ρp,k\rho_{p,k} of SL2(/pk)\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}) (up to isomorphism), and expand the tensor product in ˜4.5. This expresses ρc\rho_{c}^{\circ} as a direct sum of representations (potentially with repetitions) of the shape

\rho=\mathop{\boxtimes}_{p^{k}\|c}\rho_{p,k},

which are irreducible, and have dimensions c1o(1)\gg c^{1-o(1)} by Lemma˜3.16 and the divisor bound. Since dimρcdimρc=|1(/c)|=c1+o(1)\dim\rho_{c}^{\circ}\leq\dim\rho_{c}=|\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})|=c^{1+o(1)}, the number of these representations is at most co(1)c^{o(1)}. ∎
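The size $|\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})|=c^{1+o(1)}$ used at the end of the proof can be illustrated numerically. The sketch below is hedged: it assumes the usual model of $\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})$ as pairs $(x,y)$ with $\gcd(x,y,c)=1$ modulo scaling by units (the definition is not restated in this section), and it compares the brute-force count with the standard formula $c\prod_{p\mid c}(1+1/p)$. The function names are hypothetical.

```python
from math import gcd


def projective_line(c):
    """Representatives of primitive pairs (x, y) mod c, up to scaling by units."""
    units = [t for t in range(c) if gcd(t, c) == 1]
    seen, representatives = set(), []
    for x in range(c):
        for y in range(c):
            if gcd(gcd(x, y), c) != 1 or (x, y) in seen:
                continue
            representatives.append((x, y))
            # Mark the whole orbit of (x, y) under unit scaling as visited.
            for t in units:
                seen.add((t * x % c, t * y % c))
    return representatives


def projective_line_size_formula(c):
    """Assumed closed form c * prod_{p | c} (1 + 1/p), in exact integers."""
    size, n, p = c, c, 2
    while p * p <= n:
        if n % p == 0:
            size = size // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        size = size // n * (n + 1)
    return size


if __name__ == "__main__":
    for c in (4, 6, 9, 12, 30):
        print(c, len(projective_line(c)), projective_line_size_formula(c))
```

The ratio of this count to $c$ is $\prod_{p\mid c}(1+1/p)$, which is $c^{o(1)}$, in line with the dimension count in the proof.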

We now briefly analyze the orthogonal projections onto invariant subspaces of VcV_{c}. It will turn out that the projection onto VcV_{c}^{\circ} can be obtained by a Möbius-inversion-type process.

Definition 4.7 (Special projections).

For c,d+c,d\in\mathbb{Z}_{+} with dcd\mid c, we let Pc(d),Pc:VcVcP_{c}(d),P_{c}^{\circ}:V_{c}\to V_{c} be the orthogonal projections onto Vc(d)V_{c}(d), respectively VcV_{c}^{\circ}. In particular, Pc(c)P_{c}(c) is the identity map on VcV_{c}.

Lemma 4.8.

For dcd\mid c, one has

Pc(d)=pkcpjdPpk(pj)=1|Γc(d)|nΓc(d)ρc(n).P_{c}(d)=\bigotimes_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}P_{p^{k}}(p^{j})=\frac{1}{|\Gamma_{c}(d)|}\sum_{n\in\Gamma_{c}(d)}\rho_{c}(n). (4.6)

In particular, Pc(d)P_{c}(d) commutes with ρc(g)\rho_{c}(g) for any gSL2(/c)g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}). The matrix representation of this map with respect to the standard basis of Vc=1(/c)V_{c}=\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})} has entries

Pc(d)u,v=d2ϕ(c)c2ϕ(d)𝟙uΓc(d)v,u,v1(/c).P_{c}(d)_{u,v}=\frac{d^{2}\phi(c)}{c^{2}\phi(d)}\mathbbm{1}_{u\in\Gamma_{c}(d)\cdot v},\qquad\qquad u,v\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}). (4.7)

The proof of Lemma˜4.8 is left to Appendix˜A.

Lemma 4.9.

For c+c\in\mathbb{Z}_{+}, one has

Pc=pkcPpk=dcμ(cd)Pc(d).P_{c}^{\circ}=\bigotimes_{p^{k}\|c}P_{p^{k}}^{\circ}=\sum_{d\mid c}\mu\left(\frac{c}{d}\right)P_{c}(d).
Proof.

The factorization as a tensor product follows immediately from Definitions˜4.5 and 4.7. Now for a prime power pkp^{k}, recall that Ppk(pk)P_{p^{k}}(p^{k}) is the identity map on VpkV_{p^{k}} and Ppk(pk1)P_{p^{k}}(p^{k-1}) is the orthogonal projection onto Vpk(pk1)V_{p^{k}}(p^{k-1}), so the orthogonal projection onto Vpk=Vpk(pk1)V_{p^{k}}^{\circ}=V_{p^{k}}(p^{k-1})^{\perp} can be written as

Ppk=Ppk(pk)Ppk(pk1).P_{p^{k}}^{\circ}=P_{p^{k}}(p^{k})-P_{p^{k}}(p^{k-1}).

It follows from this and ˜4.6 that

pkcPpk\displaystyle\bigotimes_{p^{k}\|c}P_{p^{k}}^{\circ} =pkc(Ppk(pk)Ppk(pk1))\displaystyle=\bigotimes_{p^{k}\|c}\left(P_{p^{k}}(p^{k})-P_{p^{k}}(p^{k-1})\right)
=dcμ(cd)pkcpjdPpk(pj)=dcμ(cd)Pc(d),\displaystyle=\sum_{d\mid c}\mu\left(\frac{c}{d}\right)\bigotimes_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}P_{p^{k}}(p^{j})\quad=\sum_{d\mid c}\mu\left(\frac{c}{d}\right)P_{c}(d),

as claimed. ∎
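To make the Möbius-inversion structure of Lemma 4.9 concrete, the following sketch lists the divisors $d\mid c$ that actually appear in the sum, together with their signs $\mu(c/d)$. It is only an illustration of the combinatorics (it does not construct the projections themselves), and the function names are hypothetical.

```python
def mobius(n):
    """Moebius function mu(n) for n >= 1, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n is divisible by a square
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def sifting_coefficients(c):
    """Map each divisor d of c with mu(c/d) != 0 to its coefficient mu(c/d)."""
    coefficients = {}
    for d in range(1, c + 1):
        if c % d == 0 and mobius(c // d) != 0:
            coefficients[d] = mobius(c // d)
    return coefficients


if __name__ == "__main__":
    # For c = 12 = 2^2 * 3 this mirrors the expansion of
    # (P_4(4) - P_4(2)) tensor (P_3(3) - P_3(1)).
    print(sifting_coefficients(12))
```

Only divisors $d$ with $c/d$ squarefree contribute, matching the choice of $p^{j}\in\{p^{k},p^{k-1}\}$ in each tensor factor.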

4.2. The Kloosterman matrix

Here we finally relate the abstract discussion in the preceding subsections to the classical Kloosterman sums.

Proposition 4.10 (From Kloosterman matrices to Fourier coefficients).

Let c+c\in\mathbb{Z}_{+}, ψ1,ψ2:/c\psi_{1},\psi_{2}:\mathbb{Z}/c\mathbb{Z}\to\mathbb{C}, and Kcψ1,ψ2/c×/cK_{c}^{\psi_{1},\psi_{2}}\in\mathbb{C}^{\mathbb{Z}/c\mathbb{Z}\times\mathbb{Z}/c\mathbb{Z}} be the c×cc\times c complex matrix with entries

(Kcψ1,ψ2)m,n:=ψ1(m)ψ2(n)𝟙(m,n,c)=1S(m,n;c).(K_{c}^{\psi_{1},\psi_{2}})_{m,n}:=\psi_{1}(m)\psi_{2}(n)\mathbbm{1}_{(m,n,c)=1}S(m,n;c). (4.8)

Consider the function Fcψ1,ψ2:SL2(/c)F_{c}^{\psi_{1},\psi_{2}}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to\mathbb{C} given by

Fcψ1,ψ2:=1c2h1,h2/cψ^1(h1)ψ^2(h2) 1Th1STh2,F_{c}^{\psi_{1},\psi_{2}}:=\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\,\mathbbm{1}_{T^{h_{1}}ST^{h_{2}}}, (4.9)

where TT and SS are as in ˜3.13. Then one has the inequality of operator norms

Kcψ1,ψ2cF^cψ1,ψ2(ρc).\|K_{c}^{\psi_{1},\psi_{2}}\|\leq c\|\widehat{F}_{c}^{\psi_{1},\psi_{2}}(\rho_{c}^{\circ})\|.
Remark.

In ψ^1\widehat{\psi}_{1} and ψ^2\widehat{\psi}_{2}, the Fourier transform is taken over /c\mathbb{Z}/c\mathbb{Z}, as in ˜3.11. In F^cψ1,ψ2\widehat{F}^{\psi_{1},\psi_{2}}_{c}, the Fourier transform is taken over the non-abelian group SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}), as in ˜3.10.

Proof of Proposition˜4.10.

Let UcU_{c} be the unitary c×cc\times c matrix with entries (Uc)u,v=c1/2e(uvc)(U_{c})_{u,v}=c^{-1/2}e(\tfrac{uv}{c}). By expanding the Kloosterman sums, we have

(UcKcψ1,ψ2Uc)u,v=1cm,n/cx(/c)×𝟙(m,n,c)=1ψ1(m)e(m(xu)c)ψ2(n)e(n(x¯+v)c),(U_{c}^{*}K_{c}^{\psi_{1},\psi_{2}}U_{c})_{u,v}=\frac{1}{c}\sum_{\begin{subarray}{c}m,n\in\mathbb{Z}/c\mathbb{Z}\\ x\in(\mathbb{Z}/c\mathbb{Z})^{\times}\end{subarray}}\mathbbm{1}_{(m,n,c)=1}\psi_{1}(m)\,e\left(\frac{m(x-u)}{c}\right)\psi_{2}(n)\,e\left(\frac{n(\overline{x}+v)}{c}\right),

for any $u,v\in\mathbb{Z}/c\mathbb{Z}$. We then expand the indicator $\mathbbm{1}_{(m,n,c)=1}$ by Möbius inversion and Fourier analysis,

\mathbbm{1}_{(m,n,c)=1}=\sum_{d\mid c}\mu(d)\mathbbm{1}_{d\mid m}\mathbbm{1}_{d\mid n}=\sum_{d\mid c}\frac{\mu(d)}{d^{2}}\sum_{a,b\in\mathbb{Z}/d\mathbb{Z}}e\left(\frac{am}{d}\right)e\left(\frac{bn}{d}\right),

and evaluate the sums over m,nm,n to obtain

(UcKcψ1,ψ2Uc)u,v\displaystyle(U_{c}^{*}K_{c}^{\psi_{1},\psi_{2}}U_{c})_{u,v} =1cdcμ(d)d2x(/c)×a,b/dψ^1(x+uacd)ψ^2(x¯vbcd)\displaystyle=\frac{1}{c}\sum_{d\mid c}\frac{\mu(d)}{d^{2}}\sum_{x\in(\mathbb{Z}/c\mathbb{Z})^{\times}}\sum_{a,b\in\mathbb{Z}/d\mathbb{Z}}\widehat{\psi}_{1}\left(-x+u-\frac{ac}{d}\right)\widehat{\psi}_{2}\left(-\overline{x}-v-\frac{bc}{d}\right)
=1cdcμ(d)d2x(/c)×h1,h2/cψ^1(h1)ψ^2(h2)𝟙xuh1(mod cd)x¯v+h2(mod cd),\displaystyle=\frac{1}{c}\sum_{d\mid c}\frac{\mu(d)}{d^{2}}\sum_{x\in(\mathbb{Z}/c\mathbb{Z})^{\times}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\mathbbm{1}_{\begin{subarray}{c}x\equiv u-h_{1}\ (\textnormal{mod }\frac{c}{d})\\ -\overline{x}\equiv v+h_{2}\ (\textnormal{mod }\frac{c}{d})\end{subarray}},

where we substituted $h_{1}:=-x+u-\tfrac{ac}{d}$, $h_{2}:=-\overline{x}-v-\tfrac{bc}{d}$. Switching divisors $d\mapsto\tfrac{c}{d}$, swapping sums, and evaluating the sum over $x$ (which gives either $0$ or $\phi(c)/\phi(d)$ solutions), we reach

(UcKcψ1,ψ2Uc)u,v=1ch1,h2/cψ^1(h1)ψ^2(h2)dcμ(cd)d2ϕ(c)c2ϕ(d)𝟙(uh1)(v+h2)1(mod d).(U_{c}^{*}K_{c}^{\psi_{1},\psi_{2}}U_{c})_{u,v}=\frac{1}{c}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\sum_{d\mid c}\mu\left(\frac{c}{d}\right)\frac{d^{2}\phi(c)}{c^{2}\phi(d)}\mathbbm{1}_{(u-h_{1})(v+h_{2})\equiv-1\ (\textnormal{mod }d)}. (4.10)
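The expansion of the indicator $\mathbbm{1}_{(m,n,c)=1}$ used in the derivation of (4.10) can be checked numerically for a small modulus. The sketch below evaluates both the Möbius form $\sum_{d\mid c}\mu(d)\mathbbm{1}_{d\mid m}\mathbbm{1}_{d\mid n}$ and the additive-character form with $e(am/d)\,e(bn/d)$; the function names are hypothetical and the code is an illustration rather than part of the argument.

```python
import cmath
from math import gcd


def mobius(n):
    """Moebius function mu(n) for n >= 1, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def e(t):
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def coprimality_via_mobius(m, n, c):
    """Sum of mu(d) over d | c with d | m and d | n."""
    return sum(
        mobius(d)
        for d in range(1, c + 1)
        if c % d == 0 and m % d == 0 and n % d == 0
    )


def coprimality_via_characters(m, n, c):
    """Same quantity, with each divisibility condition expanded into characters."""
    total = 0j
    for d in range(1, c + 1):
        if c % d != 0:
            continue
        inner = sum(e(a * m / d) * e(b * n / d) for a in range(d) for b in range(d))
        total += mobius(d) / d ** 2 * inner
    return total


if __name__ == "__main__":
    c = 12
    for m, n in ((1, 5), (2, 4), (6, 0), (3, 9)):
        print(m, n, coprimality_via_mobius(m, n, c),
              round(coprimality_via_characters(m, n, c).real, 9),
              int(gcd(gcd(m, n), c) == 1))
```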

Let us keep this in mind. Separately, by ˜4.9 and 4.5, we have

F^cψ1,ψ2(ρc)\displaystyle\widehat{F}_{c}^{\psi_{1},\psi_{2}}(\rho_{c}^{\circ}) =1c2h1,h2/cψ^1(h1)ψ^2(h2)ρc(Th1STh2)\displaystyle=\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\rho_{c}^{\circ}(T^{h_{1}}ST^{h_{2}})
=(1c2h1,h2/cψ^1(h1)ψ^2(h2)ρc(Th1STh2))|Vc,\displaystyle=\Bigg(\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\rho_{c}(T^{h_{1}}ST^{h_{2}})\Bigg)\Bigg|_{V_{c}^{\circ}},

and thus by Lemma˜3.1,

F^cψ1,ψ2(ρc)=Mcψ1,ψ2,\|\widehat{F}_{c}^{\psi_{1},\psi_{2}}(\rho_{c}^{\circ})\|=\|M_{c}^{\psi_{1},\psi_{2}}\|, (4.11)

where Mcψ1,ψ2:VcVcM_{c}^{\psi_{1},\psi_{2}}:V_{c}\to V_{c} is the map

Mcψ1,ψ2\displaystyle M_{c}^{\psi_{1},\psi_{2}} =1c2h1,h2/cψ^1(h1)ψ^2(h2)ρc(Th1STh2)Pc.\displaystyle=\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\rho_{c}(T^{h_{1}}ST^{h_{2}})P_{c}^{\circ}.

By Lemma˜4.9 and the commutativity claim in Lemma˜4.8, we can further write

Mcψ1,ψ2\displaystyle M_{c}^{\psi_{1},\psi_{2}} =1c2h1,h2/cψ^1(h1)ψ^2(h2)ρc(Th1STh2)dcμ(cd)Pc(d)\displaystyle=\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\rho_{c}(T^{h_{1}}ST^{h_{2}})\sum_{d\mid c}\mu\left(\frac{c}{d}\right)P_{c}(d)
=1c2h1,h2/cψ^1(h1)ψ^2(h2)dcμ(cd)ρc(Th1)Pc(d)ρc(STh2).\displaystyle=\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\sum_{d\mid c}\mu\left(\frac{c}{d}\right)\rho_{c}(T^{h_{1}})P_{c}(d)\rho_{c}(ST^{h_{2}}).

By Definitions˜4.1 and 4.7, we can represent this map as a matrix in 1(/c)×1(/c)\mathbb{C}^{\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})\times\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})} with entries

(Mcψ1,ψ2)u,v=1c2h1,h2/cψ^1(h1)ψ^2(h2)dcμ(cd)d2ϕ(c)c2ϕ(d)𝟙Th1uΓc(d)STh2v(M_{c}^{\psi_{1},\psi_{2}})_{u,v}=\frac{1}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\widehat{\psi}_{1}(h_{1})\widehat{\psi}_{2}(h_{2})\sum_{d\mid c}\mu\left(\frac{c}{d}\right)\frac{d^{2}\phi(c)}{c^{2}\phi(d)}\mathbbm{1}_{T^{-h_{1}}u\in\Gamma_{c}(d)\cdot ST^{h_{2}}v} (4.12)

for u,v1(/c)u,v\in\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}); compare this to ˜4.10. We will show that restricting the matrix Mcψ1,ψ2M_{c}^{\psi_{1},\psi_{2}} to those rows and columns indexed by u,v/c1(/c)u,v\in\mathbb{Z}/c\mathbb{Z}\subset\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}) (by the canonical embedding x[x:1]x\mapsto[x:1]) yields precisely the matrix UcKcψ1,ψ2UcU_{c}^{*}K_{c}^{\psi_{1},\psi_{2}}U_{c}. Indeed, using the notation above, if u,v/cu,v\in\mathbb{Z}/c\mathbb{Z}, then Th1u=uh1=:x/cT^{-h_{1}}u=u-h_{1}=:x\in\mathbb{Z}/c\mathbb{Z}, Th2v=v+h2=:y/cT^{h_{2}}v=v+h_{2}=:y\in\mathbb{Z}/c\mathbb{Z}, and we have xΓc(d)Syx\in\Gamma_{c}(d)\cdot Sy if and only if the equation

(I+dA)(x1)=α(1y)(I+dA)\begin{pmatrix}x\\ 1\end{pmatrix}=\alpha\begin{pmatrix}-1\\ y\end{pmatrix}

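has solutions in $I+dA\in\Gamma_{c}(d)$ and $\alpha\in(\mathbb{Z}/c\mathbb{Z})^{\times}$. On the one hand, the existence of such solutions implies that $\left(\begin{smallmatrix}x\\ 1\end{smallmatrix}\right)\equiv\left(\begin{smallmatrix}-\alpha\\ \alpha y\end{smallmatrix}\right)\ (\textnormal{mod }d)$, so $xy\equiv-1\ (\textnormal{mod }d)$. On the other hand, if $xy\equiv-1\ (\textnormal{mod }d)$, then $x$ is invertible modulo $d$ and $\left(\begin{smallmatrix}x\\ 1\end{smallmatrix}\right)\equiv-x\left(\begin{smallmatrix}-1\\ y\end{smallmatrix}\right)\ (\textnormal{mod }d)$, so $[x:1]$ and $[-1:y]$ have the same image in $\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z})$, and therefore lie in the same $\Gamma_{c}(d)$-orbit by the identification $\Gamma_{c}(d)\backslash\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z})\cong\mathbb{P}^{1}(\mathbb{Z}/d\mathbb{Z})$ from Lemma 3.13 (when $d=c$, one can simply take $A=0$ and $\alpha=-x$). It follows that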

𝟙Th1uΓc(d)STh2v=𝟙(uh1)(v+h2)1(mod d),u,v/c1(/c),\mathbbm{1}_{T^{-h_{1}}u\in\Gamma_{c}(d)\cdot ST^{h_{2}}v}=\mathbbm{1}_{(u-h_{1})(v+h_{2})\equiv-1\ (\textnormal{mod }d)},\qquad\qquad u,v\in\mathbb{Z}/c\mathbb{Z}\subset\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}),

and then by comparing ˜4.10 and 4.12, we find that

(UcKcψ1,ψ2Uc)u,v=c(Mcψ1,ψ2)u,vu,v/c1(/c).(U_{c}^{*}K_{c}^{\psi_{1},\psi_{2}}U_{c})_{u,v}=c(M_{c}^{\psi_{1},\psi_{2}})_{u,v}\qquad\qquad u,v\in\mathbb{Z}/c\mathbb{Z}\subset\mathbb{P}^{1}(\mathbb{Z}/c\mathbb{Z}).

Since removing some rows and columns of a matrix can only decrease its spectral norm, we conclude that

Kcψ1,ψ2=UcKcψ1,ψ2UccMcψ1,ψ2,\|K_{c}^{\psi_{1},\psi_{2}}\|=\|U_{c}^{*}K_{c}^{\psi_{1},\psi_{2}}U_{c}\|\leq c\|M_{c}^{\psi_{1},\psi_{2}}\|,

which, together with ˜4.11, completes our proof. ∎

Corollary 4.11.

Let c,M,N+c,M,N\in\mathbb{Z}_{+} with 1M,Nc1\leq M,N\leq c, a(/c)×a\in(\mathbb{Z}/c\mathbb{Z})^{\times}, and ,𝒥\mathcal{I},\mathcal{J}\subset\mathbb{Z} be intervals of lengths ||=M|\mathcal{I}|=M, |𝒥|=N|\mathcal{J}|=N. Let Kc,a,𝒥×𝒥K_{c,a}^{\mathcal{I},\mathcal{J}}\in\mathbb{C}^{\mathcal{I}\times\mathcal{J}} be the M×NM\times N matrix indexed by mm\in\mathcal{I} and n𝒥n\in\mathcal{J}, with entries

(Kc,a,𝒥)m,n:=S(am,n;c)𝟙(m,n,c)=1.(K_{c,a}^{\mathcal{I},\mathcal{J}})_{m,n}:=S(am,n;c)\mathbbm{1}_{(m,n,c)=1}. (4.13)

Let ε>0\varepsilon>0 and H1:=c1+εM1H_{1}:=c^{1+\varepsilon}M^{-1}, H2:=c1+εN1H_{2}:=c^{1+\varepsilon}N^{-1}. Then there exist complex weights αh,βh1\alpha_{h},\beta_{h}\ll 1 such that for the function Fc,aH1,H2:SL2(/c)F_{c,a}^{H_{1},H_{2}}:\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\to\mathbb{C} given by

Fc,aH1,H2:=1H1H2|h1|H1|h2|H2αh1βh2𝟙Ta¯h1STh2,F_{c,a}^{H_{1},H_{2}}:=\frac{1}{H_{1}H_{2}}\sum_{\begin{subarray}{c}|h_{1}|\leq H_{1}\\ |h_{2}|\leq H_{2}\end{subarray}}\alpha_{h_{1}}\beta_{h_{2}}\mathbbm{1}_{T^{\overline{a}h_{1}}ST^{h_{2}}}, (4.14)

one has

Kc,a,𝒥c1+2εF^c,aH1,H2(ρc)+Oε(c100).\|K_{c,a}^{\mathcal{I},\mathcal{J}}\|\leq c^{1+2\varepsilon}\|\widehat{F}_{c,a}^{H_{1},H_{2}}(\rho_{c}^{\circ})\|+O_{\varepsilon}(c^{-100}). (4.15)
Remark.

Given (4.14), one can apply the triangle inequality for the operator norm to obtain

\|\widehat{F}_{c,a}^{H_{1},H_{2}}(\rho_{c}^{\circ})\| =\Bigg\|\frac{1}{H_{1}H_{2}}\sum_{\begin{subarray}{c}|h_{1}|\leq H_{1}\\ |h_{2}|\leq H_{2}\end{subarray}}\alpha_{h_{1}}\beta_{h_{2}}\,\rho_{c}^{\circ}(T^{\overline{a}h_{1}}ST^{h_{2}})\Bigg\| (4.16)
\leq\frac{1}{H_{1}H_{2}}\sum_{\begin{subarray}{c}|h_{1}|\leq H_{1}\\ |h_{2}|\leq H_{2}\end{subarray}}|\alpha_{h_{1}}\beta_{h_{2}}|\,\|\rho_{c}^{\circ}(T^{\overline{a}h_{1}}ST^{h_{2}})\|\ll 1,

since $\|\rho_{c}^{\circ}(T^{\overline{a}h_{1}}ST^{h_{2}})\|=1$ (as the norm of a unitary map). Plugging this into (4.15) recovers the trivial bound $\|K_{c,a}^{\mathcal{I},\mathcal{J}}\|\ll c^{1+o(1)}$ from (1.2). Our task in the later sections will therefore be to establish some power-saving spectral cancellation in the sum over $h_{1},h_{2}$ from (4.16).
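The trivial bound discussed in this remark can be explored numerically for the full matrix $(S(m,n;c)\mathbbm{1}_{(m,n,c)=1})_{m,n\in\mathbb{Z}/c\mathbb{Z}}$ at a small prime modulus. The sketch below is an illustration under stated assumptions: it takes $a=1$ and $M=N=c$, estimates the spectral norm by a plain power iteration on $K^{*}K$ (a generic numerical method, not a technique from this paper), and uses hypothetical function names.

```python
import cmath
from math import gcd, sqrt


def kloosterman(m, n, c):
    """S(m, n; c) = sum over units x mod c of e((m x + n x^{-1}) / c)."""
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            x_inv = pow(x, -1, c)  # modular inverse (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * ((m * x + n * x_inv) % c) / c)
    return total


def kloosterman_matrix(c):
    """Full c x c matrix with entries S(m, n; c) * 1_{(m, n, c) = 1}."""
    return [
        [kloosterman(m, n, c) if gcd(gcd(m, n), c) == 1 else 0j for n in range(c)]
        for m in range(c)
    ]


def spectral_norm(matrix, iterations=200):
    """Estimate the largest singular value via power iteration on K* K."""
    size = len(matrix)
    v = [complex(1.0 / (j + 1)) for j in range(size)]
    sigma = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(size)) for i in range(size)]
        u = [sum(matrix[i][j].conjugate() * w[i] for i in range(size))
             for j in range(size)]
        norm_v = sqrt(sum(abs(t) ** 2 for t in v))
        norm_u = sqrt(sum(abs(t) ** 2 for t in u))
        if norm_u == 0.0:
            return 0.0
        sigma = sqrt(norm_u / norm_v)  # ||K* K v|| / ||v|| approximates sigma^2
        v = [t / norm_u for t in u]
    return sigma


if __name__ == "__main__":
    for c in (5, 7, 11):
        print(c, spectral_norm(kloosterman_matrix(c)))
```

Comparing the printed norm with $c$ gives a feel for how little room the trivial bound leaves at full length, which is why the savings must come from short ranges of $h_{1},h_{2}$.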

Proof of Corollary˜4.11.

Let us write [M]:={1,,M}[M]:=\{1,\ldots,M\}, [N]:={1,,N}[N]:=\{1,\ldots,N\}, and =[M]+r\mathcal{I}=[M]+r, 𝒥=[N]+s\mathcal{J}=[N]+s for some r,sr,s\in\mathbb{Z}. Since M,NcM,N\leq c, we may identify \mathcal{I}, 𝒥\mathcal{J} with their images in /c\mathbb{Z}/c\mathbb{Z}. Let Φ:\Phi:\mathbb{R}\to\mathbb{C} be a smooth function supported in [1,2][-1,2], such that Φ𝟙[0,1]\Phi\geq\mathbbm{1}_{[0,1]} and Φ(j)j1\Phi^{(j)}\ll_{j}1 for j0j\geq 0, and define ψ1,ψ2:/c\psi_{1},\psi_{2}:\mathbb{Z}/c\mathbb{Z}\to\mathbb{C} by

ψ1(m):=ma(m+r)m(mod c)Φ(mM),ψ2(n):=nn+sn(mod c)Φ(nN).\psi_{1}(m):=\sum_{\begin{subarray}{c}m^{\prime}\in\mathbb{Z}\\ a(m^{\prime}+r)\equiv m\ (\textnormal{mod }c)\end{subarray}}\Phi\left(\frac{m^{\prime}}{M}\right),\qquad\qquad\psi_{2}(n):=\sum_{\begin{subarray}{c}n^{\prime}\in\mathbb{Z}\\ n^{\prime}+s\equiv n\ (\textnormal{mod }c)\end{subarray}}\Phi\left(\frac{n^{\prime}}{N}\right). (4.17)

Since Φ𝟙[0,1]\Phi\geq\mathbbm{1}_{[0,1]}, we have ψ1𝟙a\psi_{1}\geq\mathbbm{1}_{a\mathcal{I}} and ψ2𝟙𝒥\psi_{2}\geq\mathbbm{1}_{\mathcal{J}} (viewing these as functions on /c\mathbb{Z}/c\mathbb{Z}). But scaling a row or a column of a matrix by a constant in [0,1][0,1] can only decrease its spectral norm, so with the notation from ˜4.8 we get

Kcψ1,ψ2Kc,a,𝒥.\|K_{c}^{\psi_{1},\psi_{2}}\|\geq\|K_{c,a}^{\mathcal{I},\mathcal{J}}\|.

So from Proposition˜4.10, it follows that

Kc,a,𝒥cF^cψ1,ψ2(ρc),\|K_{c,a}^{\mathcal{I},\mathcal{J}}\|\leq c\|\widehat{F}_{c}^{\psi_{1},\psi_{2}}(\rho_{c}^{\circ})\|, (4.18)

and it remains to compute Fcψ1,ψ2F_{c}^{\psi_{1},\psi_{2}}. For h/ch\in\mathbb{Z}/c\mathbb{Z}, we obtain from ˜4.17, ˜3.11, ˜3.3, and ˜3.2 that

ψ^1(h)=m/cψ1(m)e(hmc)\displaystyle\widehat{\psi}_{1}(h)=\sum_{m\in\mathbb{Z}/c\mathbb{Z}}\psi_{1}(m)\,e\left(-\frac{hm}{c}\right) =mΦ(mM)e(ah(m+r)c)\displaystyle=\sum_{m^{\prime}\in\mathbb{Z}}\Phi\left(\frac{m^{\prime}}{M}\right)e\left(-\frac{ah(m^{\prime}+r)}{c}\right)
=Me(rahc)kΦ^(M(k+ahc))\displaystyle=Me\left(-\frac{rah}{c}\right)\sum_{k\in\mathbb{Z}}\widehat{\Phi}\left(M\Big(k+\frac{ah}{c}\Big)\right)
=Me(rahc)hhah(mod c)Φ^(hc/M),\displaystyle=Me\left(-\frac{rah}{c}\right)\sum_{\begin{subarray}{c}h^{\prime}\in\mathbb{Z}\\ h^{\prime}\equiv ah\ (\textnormal{mod }c)\end{subarray}}\widehat{\Phi}\left(\frac{h^{\prime}}{c/M}\right),

and similarly for $\widehat{\psi}_{2}(h)$ (with $1$ in place of $a$). Plugging this into (4.9), we conclude that

F_{c}^{\psi_{1},\psi_{2}} =\frac{MN}{c^{2}}\sum_{h_{1},h_{2}\in\mathbb{Z}/c\mathbb{Z}}\sum_{\begin{subarray}{c}h_{1}^{\prime},h_{2}^{\prime}\in\mathbb{Z}\\ h_{1}^{\prime}\equiv ah_{1}\ (\textnormal{mod }c)\\ h_{2}^{\prime}\equiv h_{2}\ (\textnormal{mod }c)\end{subarray}}\widehat{\Phi}\left(\frac{h_{1}^{\prime}}{c/M}\right)\widehat{\Phi}\left(\frac{h_{2}^{\prime}}{c/N}\right)e\left(\frac{-rah_{1}-sh_{2}}{c}\right)\mathbbm{1}_{T^{h_{1}}ST^{h_{2}}}
=\frac{c^{2\varepsilon}}{H_{1}H_{2}}\sum_{h_{1}^{\prime},h_{2}^{\prime}\in\mathbb{Z}}\widehat{\Phi}\left(\frac{h_{1}^{\prime}}{c/M}\right)\widehat{\Phi}\left(\frac{h_{2}^{\prime}}{c/N}\right)e\left(\frac{-rh_{1}^{\prime}-sh_{2}^{\prime}}{c}\right)\mathbbm{1}_{T^{\overline{a}h_{1}^{\prime}}ST^{h_{2}^{\prime}}},

where the second line uses $ah_{1}\equiv h_{1}^{\prime}$ and $h_{2}\equiv h_{2}^{\prime}\ (\textnormal{mod }c)$, together with $MN/c^{2}=c^{2\varepsilon}/(H_{1}H_{2})$.

Using the Schwartz decay of $\widehat{\Phi}$, we can discard the contribution of the terms with $|h_{1}^{\prime}|>H_{1}$ or $|h_{2}^{\prime}|>H_{2}$ to $\|\widehat{F}_{c}^{\psi_{1},\psi_{2}}(\rho_{c}^{\circ})\|$, up to an error of $O_{\varepsilon}(c^{-100})$. Choosing

\alpha_{h}:=\widehat{\Phi}\left(\frac{h}{c/M}\right)e\left(-\frac{rh}{c}\right),\qquad\qquad\beta_{h}:=\widehat{\Phi}\left(\frac{h}{c/N}\right)e\left(-\frac{sh}{c}\right)

concludes our proof in light of (4.18) and (4.14). ∎

5. The amplification argument

Recall that Proposition 4.10 and Corollary 4.11 reduce the problem of bounding bilinear forms with Kloosterman sums to that of bounding Fourier coefficients of certain functions on $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ at the representation $\rho_{c}^{\circ}$. One can then reduce to irreducible subrepresentations via Lemma 3.10. To use Fourier analysis, we will pass to a sum over all irreducible representations of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$; to avoid a critical loss, however, we insert an amplifier weight in this sum, as outlined in Section 2.3. Making this work in a non-abelian setting is the key step in our argument.

5.1. Introducing the amplifier

We state the results in this subsection in fair generality, since they may be useful in other contexts. We recall the notation for Schatten norms from Section˜3.1.

Proposition 5.1 (Non-abelian amplification).

Let GG be a finite group, NGN\triangleleft G, F:GF:G\to\mathbb{C}, ρG^\rho\in\widehat{G}, χ:=Trρ\chi:=\textnormal{Tr}\,\rho, and qq be an even positive integer. Then one has

F^(ρ)Sqq|G|nN|χ(n)|2g1,,gqGg1gqNF(g1)F¯(g21)F(gq1)F¯(gq1)χ(g1gq).\|\widehat{F}(\rho)\|_{S^{q}}^{q}\leq\frac{|G|}{\sum_{n\in N}|\chi(n)|^{2}}\sum_{\begin{subarray}{c}g_{1},\ldots,g_{q}\in G\\ g_{1}\cdots g_{q}\in N\end{subarray}}F(g_{1})\overline{F}(g_{2}^{-1})\cdots F(g_{q-1})\overline{F}(g_{q}^{-1})\chi(g_{1}\cdots g_{q}).
Remark.

In comparison, expanding F^(ρ)Sqq\|\widehat{F}(\rho)\|_{S^{q}}^{q} by ˜3.10 and 3.5 yields

F^(ρ)Sqq=g1,,gqGF(g1)F¯(g21)F(gq1)F¯(gq1)χ(g1gq).\|\widehat{F}(\rho)\|_{S^{q}}^{q}=\sum_{g_{1},\ldots,g_{q}\in G}F(g_{1})\overline{F}(g_{2}^{-1})\cdots F(g_{q-1})\overline{F}(g_{q}^{-1})\chi(g_{1}\cdots g_{q}).

The upper bound from Proposition˜5.1 replaces χ\chi with a function proportional to χ𝟙N\chi\mathbbm{1}_{N}, normalized so that equality is attained when F=χ¯F=\overline{\chi}. The requirement that ρ\rho be irreducible is crucial.
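For intuition, the inequality of Proposition 5.1 can be tested numerically in the case $q=2$ on a small group. The sketch below is an illustration under explicit assumptions: it takes $G=S_{4}$, the $3$-dimensional standard representation (whose character is the number of fixed points minus one, a standard fact not taken from this paper), a random complex-valued $F$, and the normal subgroups $\{e\}$, $V_{4}$, $A_{4}$ and $G$. All names are hypothetical.

```python
import random
from itertools import permutations

G = list(permutations(range(4)))  # the symmetric group S_4
IDENTITY = (0, 1, 2, 3)


def compose(g, h):
    """Composition (g o h)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))


def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


def sign(g):
    inversions = sum(1 for i in range(len(g)) for j in range(i) if g[j] > g[i])
    return -1 if inversions % 2 else 1


def chi(g):
    """Character of the standard representation: fixed points minus one."""
    return sum(1 for i, gi in enumerate(g) if i == gi) - 1


TRIVIAL = [IDENTITY]
A4 = [g for g in G if sign(g) == 1]
V4 = [g for g in A4 if compose(g, g) == IDENTITY]  # identity and double transpositions


def both_sides(F, N):
    """Return (||F^(rho)||_{S^2}^2, amplified upper bound) for q = 2."""
    N = set(N)
    full, restricted = 0j, 0j
    for g1 in G:
        for g2 in G:
            g = compose(g1, g2)
            term = F[g1] * F[inverse(g2)].conjugate() * chi(g)
            full += term
            if g in N:
                restricted += term
    amplifier = sum(chi(n) ** 2 for n in N)  # sum over N of |chi(n)|^2
    return full.real, (len(G) / amplifier * restricted).real


if __name__ == "__main__":
    rng = random.Random(0)
    F = {g: complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for g in G}
    for name, N in (("trivial", TRIVIAL), ("V4", V4), ("A4", A4), ("S4", G)):
        print(name, both_sides(F, N))
```

With $N=\{e\}$ the right-hand side reduces to the Plancherel-type bound $(|G|/\dim\rho)\sum_{g}|F(g)|^{2}$, and with $N=G$ the two sides coincide, so intermediate normal subgroups interpolate between these extremes.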

We first prove Proposition˜5.1 using character orthogonality, which resembles more classical amplification arguments, and then give a more conceptual sketch of proof via induced representations. The first proof has the advantage that it may be generalized by considering other amplifiers.

Proof of Proposition˜5.1 via character orthogonality.

Write $F_{0}:=F$ and $F_{1}:G\to\mathbb{C}$ for the function $F_{1}(g):=\overline{F}(g^{-1})$, so that $\widehat{F}_{1}(\rho)=\widehat{F}(\rho)^{*}$. By considering the function

F_{\text{conv}}:=F_{0}*F_{1}*F_{0}*\cdots*F_{(\frac{q}{2}-1)\ \textnormal{mod }2},

where there are $\tfrac{q}{2}$ factors in the convolution, so that $\|\widehat{F}_{\text{conv}}(\rho)\|_{S^{2}}^{2}=\|\widehat{F}(\rho)\|_{S^{q}}^{q}$, we see that it suffices to prove the desired result when $q=2$.

Let 𝟙N:G{0,1}\mathbbm{1}_{N}:G\to\{0,1\} be the indicator function of NN, and consider the amplifier A:G^[0,)A:\widehat{G}\to[0,\infty) given at ρG^\rho^{\prime}\in\widehat{G} with χ:=Trρ\chi^{\prime}:=\textnormal{Tr}\rho^{\prime} by

A(ρ):=1|N|𝟙^N(ρ¯ρ)S22\displaystyle A(\rho^{\prime})=\frac{1}{|N|}\left\|\widehat{\mathbbm{1}}_{N}\left(\overline{\rho}^{\prime}\otimes\rho\right)\right\|_{S^{2}}^{2} =1|N|nNρ¯(n)ρ(n)S22\displaystyle=\frac{1}{|N|}\left\|\sum_{n\in N}\overline{\rho}^{\prime}(n)\otimes\rho(n)\right\|_{S^{2}}^{2}
=1|N|Tr(n1,n2Nρ¯(n1n21)ρ(n1n21))=nNχ¯(n)χ(n),\displaystyle=\frac{1}{|N|}\textnormal{Tr}\left(\sum_{n_{1},n_{2}\in N}\overline{\rho}^{\prime}(n_{1}n_{2}^{-1})\otimes\rho(n_{1}n_{2}^{-1})\right)=\sum_{n\in N}\overline{\chi^{\prime}}(n)\chi(n),

where we implicitly used that $N$ is a group. In particular, $A(\rho)=\sum_{n\in N}|\chi(n)|^{2}$ is a positive integer multiple of $|N|$, due to ˜3.9. It follows from this and the nonnegativity of $A$ that

(nN|χ(n)|2)F^(ρ)S22\displaystyle\left(\sum_{n\in N}|\chi(n)|^{2}\right)\|\widehat{F}(\rho)\|_{S^{2}}^{2} ρG^A(ρ)F^(ρ)S22\displaystyle\leq\sum_{\rho^{\prime}\in\widehat{G}}A(\rho^{\prime})\|\widehat{F}(\rho^{\prime})\|_{S^{2}}^{2}
=ρG^nNTrρ¯(n)χ(n)F^(ρ)S22=nNχ(n)ρG^Trρ¯(n)F^(ρ)S22.\displaystyle=\sum_{\rho^{\prime}\in\widehat{G}}\sum_{n\in N}\overline{\textnormal{Tr}\rho^{\prime}}(n)\chi(n)\|\widehat{F}(\rho^{\prime})\|_{S^{2}}^{2}=\sum_{n\in N}\chi(n)\sum_{\rho^{\prime}\in\widehat{G}}\overline{\textnormal{Tr}\rho^{\prime}}(n)\|\widehat{F}(\rho^{\prime})\|_{S^{2}}^{2}.

Expanding F^(ρ)S22\|\widehat{F}(\rho^{\prime})\|_{S^{2}}^{2} via ˜3.10 and 3.5 and then using ˜3.8, the sum over ρ\rho^{\prime} above becomes

ρG^Trρ¯(n)g1,g2GF(g1)F¯(g21)Trρ(g1g2)\displaystyle\sum_{\rho^{\prime}\in\widehat{G}}\overline{\textnormal{Tr}\rho^{\prime}}(n)\sum_{g_{1},g_{2}\in G}F(g_{1})\overline{F}(g_{2}^{-1})\textnormal{Tr}\rho^{\prime}(g_{1}g_{2}) =g1,g2GF(g1)F¯(g21)χIrr(G)χ¯(n)χ(g1g2)\displaystyle=\sum_{g_{1},g_{2}\in G}F(g_{1})\overline{F}(g_{2}^{-1})\sum_{\chi^{\prime}\in\textnormal{Irr}(G)}\overline{\chi^{\prime}}(n)\chi^{\prime}(g_{1}g_{2})
=g1,g2GF(g1)F¯(g21)|G||Cn|𝟙g1g2Cn,\displaystyle=\sum_{g_{1},g_{2}\in G}F(g_{1})\overline{F}(g_{2}^{-1})\frac{|G|}{|C_{n}|}\mathbbm{1}_{g_{1}g_{2}\in C_{n}},

where CnC_{n} denotes the conjugacy class of nn in GG. Since NN can be partitioned into such conjugacy classes by normality, and since characters are constant on conjugacy classes, it follows that

(nN|χ(n)|2)F^(ρ)S22\displaystyle\left(\sum_{n\in N}|\chi(n)|^{2}\right)\|\widehat{F}(\rho)\|_{S^{2}}^{2} |G|g1,g2GF(g1)F¯(g21)nNχ(n)𝟙g1g2Cn|Cn|\displaystyle\leq|G|\sum_{g_{1},g_{2}\in G}F(g_{1})\overline{F}(g_{2}^{-1})\sum_{n\in N}\chi(n)\frac{\mathbbm{1}_{g_{1}g_{2}\in C_{n}}}{|C_{n}|}
=|G|g1,g2Gg1g2NF(g1)F¯(g21)χ(g1g2).\displaystyle=|G|\sum_{\begin{subarray}{c}g_{1},g_{2}\in G\\ g_{1}g_{2}\in N\end{subarray}}F(g_{1})\overline{F}(g_{2}^{-1})\chi(g_{1}g_{2}).

This settles the case q=2q=2, thus completing our proof. ∎
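As a sanity check on the case $q=2$, the inequality can be tested numerically on a small group. The sketch below is an illustration only, not part of the argument: it assumes $G=S_{4}$, $N=V_{4}$ (the Klein four-group, which is normal in $S_{4}$), and $\chi$ the character of the 3-dimensional standard representation, computed as the number of fixed points minus one. All function names are hypothetical. It uses the trace expansion $\|\widehat{F}(\rho)\|_{S^{2}}^{2}=\sum_{g_{1},g_{2}}F(g_{1})\overline{F}(g_{2}^{-1})\chi(g_{1}g_{2})$, so no explicit matrices for $\rho$ are needed.

```python
from itertools import permutations

# All 24 permutations of {0,1,2,3}; p[i] is the image of i.
G = list(permutations(range(4)))
E = tuple(range(4))

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(4))

def inverse(p):
    inv = [0] * 4
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def chi(p):
    """Character of the standard representation of S_4: fixed points minus 1."""
    return sum(1 for i in range(4) if p[i] == i) - 1

def in_klein_four(p):
    """Membership in V_4: the identity or a product of two disjoint transpositions."""
    return p == E or (chi(p) == -1 and compose(p, p) == E)

def both_sides(F):
    """Return (lhs, rhs) of the q = 2 inequality for F given as a dict on G."""
    full = 0j        # sum over all (g1, g2): equals the squared S^2 norm
    restricted = 0j  # same sum restricted to g1*g2 in N
    for g1 in G:
        for g2 in G:
            prod = compose(g1, g2)
            term = F[g1] * complex(F[inverse(g2)]).conjugate() * chi(prod)
            full += term
            if in_klein_four(prod):
                restricted += term
    char_norm = sum(chi(n) ** 2 for n in G if in_klein_four(n))
    return char_norm * full.real, len(G) * restricted.real
```

With $F$ the indicator of the identity, the two sides are $36$ and $72$; with $F=\overline{\chi}=\chi$, both sides equal $2304$, matching the equality case mentioned before the proof.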

Remark.

After reducing to the case q=2q=2, one could also attempt to use the triangle inequality for Schatten norms and Cauchy–Schwarz, in the shape

gGF(g)ρ(g)S22\displaystyle\left\|\sum_{g\in G}F(g)\rho(g)\right\|_{S^{2}}^{2} (Ng0N\GgNg0F(g)ρ(g)S2)2\displaystyle\leq\left(\sum_{Ng_{0}\in N\backslash G}\left\|\sum_{g\in Ng_{0}}F(g)\rho(g)\right\|_{S^{2}}\right)^{2}
[G:N]Ng0N\GgNg0F(g)ρ(g)S22=|G||N|g1,g2Gg1g2NF(g1)F¯(g21)χ(g1g2).\displaystyle\leq[G:N]\sum_{Ng_{0}\in N\backslash G}\left\|\sum_{g\in Ng_{0}}F(g)\rho(g)\right\|_{S^{2}}^{2}=\frac{|G|}{|N|}\sum_{\begin{subarray}{c}g_{1},g_{2}\in G\\ g_{1}g_{2}\in N\end{subarray}}F(g_{1})\overline{F}(g_{2}^{-1})\chi(g_{1}g_{2}).

This argument does not use that NN is normal or that ρ\rho is irreducible, but it produces a weaker bound in general. Indeed, compared to Proposition˜5.1, the bound above loses a factor of

1|N|nN|χ(n)|2,\frac{1}{|N|}\sum_{n\in N}|\chi(n)|^{2},

which is a (potentially large) positive integer by ˜3.9; note that although χ\chi is irreducible on GG, it is usually not irreducible on NN (e.g., when N={e}N=\{e\} is the trivial subgroup, the factor lost is (dimρ)2(\dim\rho)^{2}). This is a discrepancy which does not arise in the abelian setting, when all irreducible characters are 11-dimensional—which is why classical amplification arguments with Dirichlet characters can often be re-formulated by applying Cauchy–Schwarz to a sum over residues (see, e.g., [28]).

Proof sketch of Proposition˜5.1 using induced representations.

Starting from ρ:GU(V)\rho:G\to U(V), consider the restricted representation ρ|N:NU(V)\rho|_{N}:N\to U(V), and then the induced representation

R:=IndNG(ρ|N).R:=\mathrm{Ind}_{N}^{G}(\rho|_{N}).

This acts by translation on the space

W:={fVG:f(ng)=ρ(n)f(g),nN,gG}VG/N,W:=\{f\in V^{G}:f(ng)=\rho(n)f(g),\forall n\in N,g\in G\}\cong V^{G/N},

so dimR=dimW=[G:N]dimV=|G||N|dimρ\dim R=\dim W=[G:N]\dim V=\tfrac{|G|}{|N|}\dim\rho. It can be shown using ˜3.9 that ρ\rho is a subrepresentation of RR, with

Mult(ρ,R)=1|N|nN|χ(n)|2,\textnormal{Mult}(\rho,R)=\frac{1}{|N|}\sum_{n\in N}|\chi(n)|^{2},

and therefore, by Lemma˜3.10,

F^(R)Sqq=ρG^Mult(ρ,R)F^(ρ)Sqq1|N|(nN|χ(n)|2)F^(ρ)Sqq.\|\widehat{F}(R)\|_{S^{q}}^{q}=\sum_{\rho^{\prime}\in\widehat{G}}\textnormal{Mult}(\rho^{\prime},R)\|\widehat{F}(\rho^{\prime})\|_{S^{q}}^{q}\geq\frac{1}{|N|}\left(\sum_{n\in N}|\chi(n)|^{2}\right)\|\widehat{F}(\rho)\|_{S^{q}}^{q}.

To complete the proof, one can expand F^(R)Sqq\|\widehat{F}(R)\|_{S^{q}}^{q} via ˜3.10 and 3.5, and use the Frobenius formula

TrR(g)=1|N|xGx1gxNχ(x1gx)=|G||N|χ(g)𝟙N(g),\textnormal{Tr}R(g)=\frac{1}{|N|}\sum_{\begin{subarray}{c}x\in G\\ x^{-1}gx\in N\end{subarray}}\chi(x^{-1}gx)=\frac{|G|}{|N|}\chi(g)\mathbbm{1}_{N}(g),

for all gGg\in G, where the last equality uses the normality of NN. ∎

Remark.

In the particular case when N={e}N=\{e\} is the trivial subgroup, the induced representation RR in the proof above is (isomorphic to) the regular representation RGR_{G}. In this case, the conclusion of Proposition˜5.1 simply reads

F^(ρ)Sqq1dimρF^(RG)Sqq,\|\widehat{F}(\rho)\|_{S^{q}}^{q}\leq\frac{1}{\dim\rho}\|\widehat{F}(R_{G})\|_{S^{q}}^{q},

and similar ideas appear in [37, 30].

5.2. Passing to a counting problem

We now return to the setting when G=SL2(/c)G=\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}) and N=Γc(d)N=\Gamma_{c}(d) for some dcd\mid c. The goal is to pass from the spectral norm F^c,aH1,H2(ρc)\|\widehat{F}_{c,a}^{H_{1},H_{2}}(\rho_{c}^{\circ})\| in ˜4.15 to a count of solutions to a certain equation in PSL2(/d)\textnormal{PSL}_{2}(\mathbb{Z}/d\mathbb{Z}). After applying Proposition˜5.1, we will need an upper bound for χ(g1gq)\chi(g_{1}\cdots g_{q}), and a lower bound for the denominator nN|χ(n)|2\sum_{n\in N}|\chi(n)|^{2}. For the specific characters χc\chi_{c} and χc\chi^{\circ}_{c} from Section˜4, such bounds are given in the following results.

Lemma 5.2.

Let c+c\in\mathbb{Z}_{+}, gSL2(/c)g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}), and dd be the largest divisor of cc such that

g{γ/c:γ2=1}Γc(d).g\in\{\gamma\in\mathbb{Z}/c\mathbb{Z}:\gamma^{2}=1\}\cdot\Gamma_{c}(d).

Let fcdf\leq\sqrt{cd} be the largest positive integer such that f2cdf^{2}\mid cd. Then one has

χc(g)co(1)f.\chi_{c}(g)\ll c^{o(1)}f. (5.1)
Remark.

The bound in ˜5.1 is sharp in terms of cc and dd, as can be seen by taking g=Tdg=T^{d}. In particular, if c=pkc=p^{k} is a prime power, then g=Tpjg=T^{p^{j}} has p(k+j)/2p^{\left\lfloor(k+j)/2\right\rfloor} fixed points of the shape [1:x][1:x], where pjx20(mod pk)p^{j}x^{2}\equiv 0\ (\textnormal{mod }p^{k}).

The proof of Lemma˜5.2 is left to Appendix˜A.
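The fixed-point count in the preceding remark is easy to confirm by brute force for small prime powers. The following sketch (with hypothetical function names) simply counts residues $x$ modulo $p^{k}$ satisfying $p^{j}x^{2}\equiv 0\ (\textnormal{mod }p^{k})$ and compares the result with $p^{\lfloor(k+j)/2\rfloor}$; it makes no claim about the remaining fixed points of $T^{p^{j}}$.

```python
def count_fixed_points(p, k, j):
    """Count x modulo p^k with p^j * x^2 ≡ 0 (mod p^k)."""
    modulus = p ** k
    return sum(1 for x in range(modulus) if (p ** j * x * x) % modulus == 0)

def predicted_fixed_points(p, k, j):
    """The count p^floor((k + j) / 2) stated in the remark."""
    return p ** ((k + j) // 2)
```

For instance, with $p=2$, $k=3$, $j=1$ the condition reads $x^{2}\equiv 0\ (\textnormal{mod }4)$, which holds for the four even residues modulo $8$.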

Proposition 5.3 (Lower bound for squared character sums).

Let c+c\in\mathbb{Z}_{+} and χ\chi be an irreducible character inside χc\chi_{c}^{\circ}. Let d,d,e+d,d^{\prime},e\in\mathbb{Z}_{+} be such that ddd^{\prime}\mid d, (d,e)=1(d,e)=1, and c=ddec=dd^{\prime}e. Then one has

nΓc(d)|χ(n)|2c3o(1)d.\sum_{n\in\Gamma_{c}(d)}|\chi(n)|^{2}\gg\frac{c^{3-o(1)}}{d}.
Remark.

The lower bound in Proposition˜5.3 wins a factor of about d2d^{2} over the ‘trivial’ bound of |Γc(d)|=c3d3|\Gamma_{c}(d)|=\tfrac{c^{3}}{d^{3}} due to ˜3.9. This is because |χ(n)||\chi(n)| typically has size d\gtrsim d when nΓc(d)n\in\Gamma_{c}(d) (note that when d=cd=c, one has Γc(c)={I}\Gamma_{c}(c)=\{I\} and χ(I)=dimχ\chi(I)=\dim\chi).

The proof of Proposition˜5.3 reduces to a local computation (i.e., for c=pkc=p^{k} a prime power) given below, which builds on Lemma˜3.16. Indeed, one can rephrase Lemma˜3.16 as follows: if χIrr(SL2(/pk))\chi\in\textnormal{Irr}(\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})) is primitive, then |χ(I)|2p2k|\chi(I)|^{2}\gg p^{2k}. Using a bit of Clifford theory, we can generalize this to averages of |χ|2|\chi|^{2} over the congruence subgroups Γpk(pj)\Gamma_{p^{k}}(p^{j}), for some values of jj.

Lemma 5.4.

Let pkp^{k} be a prime power and jj\in\mathbb{Z} be such that either j=0j=0 or k2jk\tfrac{k}{2}\leq j\leq k. Let χ\chi be a primitive irreducible character of SL2(/pk)\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}). Then

nΓpk(pj)|χ(n)|2(kj+1)1p3kj.\sum_{n\in\Gamma_{p^{k}}(p^{j})}|\chi(n)|^{2}\gg(k-j+1)^{-1}p^{3k-j}.
Remark.

We expect the bound in Lemma˜5.4 to be sharp, and to actually hold for all 0jk0\leq j\leq k. This might follow from a more careful study of Γ^pk(pj)\widehat{\Gamma}_{p^{k}}(p^{j}) for 1j<k21\leq j<\tfrac{k}{2}, and it would imply Proposition˜5.3 (as well as Theorem˜1.2) for more flexible factorizations c=dec=de.

The proof of Lemma˜5.4 is also left to Appendix˜A, a key ingredient being the fact that Γpk(pj)\Gamma_{p^{k}}(p^{j}) is abelian if k2jk\tfrac{k}{2}\leq j\leq k.

Proof of Proposition˜5.3.

By (the proof of) Proposition˜4.6, ρc\rho_{c}^{\circ} is isomorphic to a direct sum of representations of the shape

\[\rho=\mathop{\boxtimes}\limits_{p^{k}\|c}\rho_{p,k},\tag{5.2}\]

where each $\rho_{p,k}$ is a primitive irreducible representation of $\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})$. This gives the decomposition of $\rho_{c}^{\circ}$ into irreducible representations, so any irreducible representation occurring inside $\rho_{c}^{\circ}$ must be of the form in (5.2). Now let $\chi:=\textnormal{Tr}\rho$ and $\chi_{p,k}:=\textnormal{Tr}\rho_{p,k}$ for such a representation. From ˜3.16 and the fact that $\chi(n)=\prod_{p^{k}\|c}\chi_{p,k}(\pi_{c,p^{k}}(n))$ for $n\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$, it follows that

nΓc(d)|χ(n)|2=pkcpjdnΓpk(pj)|χp,k(n)|2.\sum_{n\in\Gamma_{c}(d)}|\chi(n)|^{2}=\prod_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}\,\sum_{n\in\Gamma_{p^{k}}(p^{j})}|\chi_{p,k}(n)|^{2}.

One can then apply Lemma˜5.4 to obtain

\[\sum_{n\in\Gamma_{c}(d)}|\chi(n)|^{2}\gg\prod_{\begin{subarray}{c}p^{k}\|c\\ p^{j}\|d\end{subarray}}(k+1)^{-1}p^{3k-j}.\]

Note that the hypothesis on jj from Lemma˜5.4 is satisfied because whenever pp is a prime dividing cc with pkcp^{k}\|c and pjdp^{j}\|d, one of the following holds:

  (1) One has $p\mid e$ and $p\nmid d$, so $j=0$, or

  (2) One has $p\mid d$ and $p\nmid e$, so $k=v_{p}(dd^{\prime})\leq 2j$.

The desired conclusion then follows from the divisor bound. ∎
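The role of the factorization $c=dd^{\prime}e$ in the proof above can be illustrated with a short script. The sketch below (function names are hypothetical) checks, prime by prime, the hypothesis "$j=0$ or $\tfrac{k}{2}\leq j\leq k$" of Lemma 5.4 for a given pair $(c,d)$, and enumerates small triples $(d,d^{\prime},e)$ with $d^{\prime}\mid d$ and $(d,e)=1$.

```python
from math import gcd

def prime_factorization(n):
    """Return {p: exponent} for a positive integer n, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def valuation(n, p):
    """Exponent of the prime p in the positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def lemma_5_4_applies(c, d):
    """Check that every p^k || c, p^j || d satisfies j == 0 or k/2 <= j <= k."""
    assert c % d == 0
    for p, k in prime_factorization(c).items():
        j = valuation(d, p)
        if not (j == 0 or k <= 2 * j <= 2 * k):
            return False
    return True

def admissible_triples(bound):
    """Yield (d, d', e) with d' | d, gcd(d, e) = 1 and d * d' * e <= bound."""
    for d in range(1, bound + 1):
        for d_prime in range(1, d + 1):
            if d % d_prime:
                continue
            for e in range(1, bound // (d * d_prime) + 1):
                if gcd(d, e) == 1:
                    yield d, d_prime, e
```

For example, $(c,d)=(24,4)$ passes the check (here $d^{\prime}=2$, $e=3$), while $(c,d)=(8,2)$ fails, since $c/d=4$ does not divide $d$.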

Finally, we can state the result of our amplification argument.

Proposition 5.5 (From Fourier coefficients to a counting problem).

Let c+c\in\mathbb{Z}_{+}, a(/c)×a\in(\mathbb{Z}/c\mathbb{Z})^{\times}, H1,H21H_{1},H_{2}\gg 1, and Fc,aH1,H2F_{c,a}^{H_{1},H_{2}} be as in ˜4.14. Let d,d,e+d,d^{\prime},e\in\mathbb{Z}_{+} be such that (d,e)=1(d,e)=1, ddd^{\prime}\mid d, and c=ddec=dd^{\prime}e. Then for any even positive integer qq, one has

F^c,aH1,H2(ρc)Sqqco(1)d(H1H2)q/2maxdd~cf~2cd~f~h1,,hq|hi|2Hjij(mod 2)𝟙Ta¯h1STh2STa¯hq1SThqS=I in PSL2(/d~),\|\widehat{F}_{c,a}^{H_{1},H_{2}}(\rho_{c}^{\circ})\|_{S^{q}}^{q}\ll\frac{c^{o(1)}d}{(H_{1}H_{2})^{q/2}}\max_{\begin{subarray}{c}d\mid\tilde{d}\mid c\\ \tilde{f}^{2}\mid c\tilde{d}\end{subarray}}\tilde{f}\sum_{\begin{subarray}{c}h_{1},\ldots,h_{q}\in\mathbb{Z}\\ |h_{i}|\leq 2H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}\mathbbm{1}_{T^{\overline{a}h_{1}}ST^{h_{2}}S\cdots T^{\overline{a}h_{q-1}}ST^{h_{q}}S=I\text{ in }\textnormal{PSL}_{2}(\mathbb{Z}/\tilde{d}\mathbb{Z})},

where both variables $\tilde{d},\tilde{f}$ in the maximum are understood to be integers (note that $\tilde{f}\leq\sqrt{c\tilde{d}}$).

Proof.

By Proposition˜4.6, ρc\rho_{c}^{\circ} is a sum of co(1)c^{o(1)} irreducible representations ρ\rho. By Lemma˜3.10, it suffices to prove the desired upper bound for each F^c,aH1,H2(ρ)Sqq\|\widehat{F}_{c,a}^{H_{1},H_{2}}(\rho)\|_{S^{q}}^{q}. The loss factor of co(1)c^{o(1)} is acceptable here (but if one is only interested in the spectral norm, so q=q=\infty, there is actually no loss at this step).

For each irreducible representation ρ\rho with Mult(ρ,ρc)>0\textnormal{Mult}(\rho,\rho_{c}^{\circ})>0, we apply Propositions˜5.1 and 5.3 with F:=Fc,aH1,H2F:=F_{c,a}^{H_{1},H_{2}} to obtain

F^(ρ)Sqqc3c3o(1)/dg1,,gqSL2(/c)g1gqΓc(d)F(g1)F¯(g21)F(gq1)F¯(gq1)χ(g1gq),\|\widehat{F}(\rho)\|_{S^{q}}^{q}\ll\frac{c^{3}}{c^{3-o(1)}/d}\sum_{\begin{subarray}{c}g_{1},\ldots,g_{q}\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\\ g_{1}\cdots g_{q}\in\Gamma_{c}(d)\end{subarray}}F(g_{1})\overline{F}(g_{2}^{-1})\cdots F(g_{q-1})\overline{F}(g_{q}^{-1})\chi(g_{1}\cdots g_{q}),

where χ:=Trρ\chi:=\textnormal{Tr}\rho. In fact, by Proposition˜5.1, the sum above is nonnegative if one replaces χ\chi with any irreducible character of SL2(/c)\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z}). Summing over all such characters χ=Trρ\chi^{\prime}=\textnormal{Tr}\rho^{\prime} with weight Mult(ρ,ρc)\textnormal{Mult}(\rho^{\prime},\rho_{c}) (which is at least 11 when ρ=ρ\rho^{\prime}=\rho), we find that

F^(ρ)Sqqco(1)dg1,,gqSL2(/c)g1gqΓc(d)F(g1)F¯(g21)F(gq1)F¯(gq1)χc(g1gq).\|\widehat{F}(\rho)\|_{S^{q}}^{q}\ll c^{o(1)}d\sum_{\begin{subarray}{c}g_{1},\ldots,g_{q}\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})\\ g_{1}\cdots g_{q}\in\Gamma_{c}(d)\end{subarray}}F(g_{1})\overline{F}(g_{2}^{-1})\cdots F(g_{q-1})\overline{F}(g_{q}^{-1})\chi_{c}(g_{1}\cdots g_{q}).

Here ρc\rho_{c} is the original permutation representation from Definition˜4.1. In light of Lemma˜5.2, we ought to split the sum above based on the largest d~c\tilde{d}\mid c such that g1gqγΓc(d~)g_{1}\cdots g_{q}\in\gamma\Gamma_{c}(\tilde{d}) for some γ/c\gamma\in\mathbb{Z}/c\mathbb{Z} with γ2=1\gamma^{2}=1; note that by ˜3.12, this is equivalent to the equation g1gq=Ig_{1}\cdots g_{q}=I in PSL2(/d~)\textnormal{PSL}_{2}(\mathbb{Z}/\tilde{d}\mathbb{Z}). Then from the triangle inequality, Lemma˜5.2, and the divisor bound for d~\tilde{d}, we find that

F^(ρ)Sqqco(1)dmaxdd~cf~2cd~f~g1,,gqSL2(/c)|F(g1)F(g21)F(gq1)F(gq1)|𝟙g1gq=I in PSL2(/d~).\|\widehat{F}(\rho)\|_{S^{q}}^{q}\ll c^{o(1)}d\max_{\begin{subarray}{c}d\mid\tilde{d}\mid c\\ \tilde{f}^{2}\mid c\tilde{d}\end{subarray}}\tilde{f}\sum_{g_{1},\ldots,g_{q}\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})}|F(g_{1})F(g_{2}^{-1})\cdots F(g_{q-1})F(g_{q}^{-1})|\mathbbm{1}_{g_{1}\cdots g_{q}=I\text{ in }\textnormal{PSL}_{2}(\mathbb{Z}/\tilde{d}\mathbb{Z})}. (5.3)

Now recalling ˜4.14, we can expand

|F(g)|=|Fc,aH1,H2(g)|1H1H2|h1|H1|h2|H2𝟙g=Ta¯h1STh2.|F(g)|=|F_{c,a}^{H_{1},H_{2}}(g)|\ll\frac{1}{H_{1}H_{2}}\sum_{\begin{subarray}{c}|h_{1}|\leq H_{1}\\ |h_{2}|\leq H_{2}\end{subarray}}\mathbbm{1}_{g=T^{\overline{a}h_{1}}ST^{h_{2}}}.

The conclusion follows by plugging this into ˜5.3, and noting that as h,hh,h^{\prime} vary in [H,H][-H,H]\cap\mathbb{Z}, the difference hhh-h^{\prime} varies in [2H,2H][-2H,2H]\cap\mathbb{Z}, each value being attained O(H)O(H) times. ∎

6. Counting solutions in PSL2(/c)\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})

We now develop the final ingredient towards Theorem 1.2, as outlined in Section 2.4. Given $c,q\in\mathbb{Z}_{+}$ with $q$ even, $a_{1},a_{2}\in(\mathbb{Z}/c\mathbb{Z})^{\times}$, and $1\leq H_{1}\leq H_{2}\ll c$, we will count solutions $(h_{1},\ldots,h_{q})\in\mathbb{Z}^{q}$ to the system

\[\begin{cases}T^{a_{1}h_{1}}ST^{a_{2}h_{2}}S\cdots T^{a_{1}h_{q-1}}ST^{a_{2}h_{q}}S=I\text{ in }\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}),\\ |h_{i}|\leq H_{j},\text{ for all }i,j\text{ with }i\equiv j\ (\textnormal{mod }2).\end{cases}\tag{6.1}\]

In particular, the ranges of hih_{i} alone produce the trivial bound q(H1H2)q/2\ll_{q}(H_{1}H_{2})^{q/2} for the number of solutions. Focusing on the case a1=a2=1a_{1}=a_{2}=1, below are some classes of solutions to ˜6.1:

  • (i) Integer solutions (1). If $h_{1}=h_{\frac{q}{2}+1}=0$ (and similarly for cyclic permutations of this case), then since $S^{2}=I$ in $\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})$, (6.1) becomes

    \[T^{h_{2}}S\cdots ST^{h_{\frac{q}{2}}}=T^{-h_{q}}ST^{-h_{q-1}}S\cdots ST^{-h_{\frac{q}{2}+2}}.\]

    This has $\asymp_{q}H_{1}^{\left\lfloor(q-2)/4\right\rfloor}H_{2}^{\left\lceil(q-2)/4\right\rceil}$ diagonal solutions with $h_{i}=-h_{q-i+2}$, which actually give solutions to (6.1) in $\textnormal{PSL}_{2}(\mathbb{Z})$.

  • (ii) Integer solutions (2). If $h_{1}=h_{3}=\cdots=h_{q-1}=0$, (6.1) becomes

    \[T^{h_{2}+h_{4}+\cdots+h_{q}}=I\qquad\iff\qquad h_{2}+h_{4}+\cdots+h_{q}\equiv 0\ (\textnormal{mod }c).\]

    This has $\asymp_{q}H_{2}^{(q-2)/2}$ solutions, which supersedes the diagonal contribution from (i).

  • (iii) Generic terms. The product $T^{h_{1}}ST^{h_{2}}S\cdots T^{h_{q}}S$ can take $\approx c^{3}$ values in $\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})$. If each matrix is attained roughly the same number of times (and such equidistribution ought to happen for large enough $q$), this gives an expected number of $\approx c^{-3}(H_{1}H_{2})^{q/2}$ solutions.

These heuristics can be formalized to produce a lower bound, imposing a limitation on our methods.
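For very small parameters, the solution count can be computed by exhaustive search, which gives a way to experiment with the classes of solutions listed above. The sketch below is a naive enumeration with hypothetical function names. It assumes that an element of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ is trivial in $\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})$ exactly when it is a scalar matrix $\gamma I$ (the condition $\gamma^{2}=1$ is then automatic from the determinant). Its running time grows like $(H_{1}H_{2})^{q/2}$.

```python
from itertools import product

S = ((0, -1), (1, 0))
IDENTITY = ((1, 0), (0, 1))

def mat_mul(A, B, c):
    """Product of two 2x2 matrices, with entries reduced modulo c."""
    return tuple(
        tuple(sum(A[i][t] * B[t][j] for t in range(2)) % c for j in range(2))
        for i in range(2)
    )

def T_power(h):
    """The matrix T^h = [[1, h], [0, 1]]."""
    return ((1, h), (0, 1))

def is_trivial_in_psl2(M, c):
    """True if M is a scalar matrix modulo c."""
    return M[0][1] % c == 0 and M[1][0] % c == 0 and (M[0][0] - M[1][1]) % c == 0

def count_solutions(c, q, H1, H2, a1=1, a2=1):
    """Count (h_1, ..., h_q) with |h_i| <= H_1 (i odd), |h_i| <= H_2 (i even)
    such that T^{a1 h1} S T^{a2 h2} S ... T^{a2 hq} S is trivial in PSL_2(Z/c)."""
    # Position 0 in the tuple corresponds to h_1, so even positions use H1.
    ranges = [range(-H1, H1 + 1) if i % 2 == 0 else range(-H2, H2 + 1) for i in range(q)]
    count = 0
    for hs in product(*ranges):
        M = IDENTITY
        for i, h in enumerate(hs):
            a = a1 if i % 2 == 0 else a2
            M = mat_mul(mat_mul(M, T_power(a * h), c), S, c)
        if is_trivial_in_psl2(M, c):
            count += 1
    return count

def count_type_ii(c, q, H2):
    """Solutions of class (ii) for a1 = a2 = 1: odd-indexed h_i vanish and
    the even-indexed h_i sum to 0 modulo c."""
    return sum(1 for hs in product(range(-H2, H2 + 1), repeat=q // 2) if sum(hs) % c == 0)
```

For example, with $c=7$, $q=4$ and $H_{1}=H_{2}=1$, the only solutions are $(h,0,-h,0)$ and $(0,h,0,-h)$ with $|h|\leq 1$, for a total of five.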

Lemma 6.1.

Let c,q+c,q\in\mathbb{Z}_{+} with qq even, a1,a2(/c)×a_{1},a_{2}\in(\mathbb{Z}/c\mathbb{Z})^{\times}, and 1H1H2c1\leq H_{1}\leq H_{2}\ll c. The number of solutions (h1,,hq)q(h_{1},\ldots,h_{q})\in\mathbb{Z}^{q} to ˜6.1 is at least

qH2(q2)/2+(H1H2)q/2c3.\gg_{q}H_{2}^{(q-2)/2}+\frac{(H_{1}H_{2})^{q/2}}{c^{3}}. (6.2)
Proof.

The lower bound by the first term in ˜6.2 follows by considering the aforementioned integer solutions with h1=h3==hq1=0h_{1}=h_{3}=\cdots=h_{q-1}=0 and h2+h4++hq=0h_{2}+h_{4}+\cdots+h_{q}=0. To obtain a lower bound by the second term in ˜6.2, we let k:=q+22k:=\tfrac{q+2}{2}, write (Hj,aj)(H_{j},a_{j}) for (H1,a1)(H_{1},a_{1}) or (H2,a2)(H_{2},a_{2}) depending on whether jj is odd or even, and apply Cauchy–Schwarz to obtain

h1,,hk|hi|12Hjij(mod 2)1\displaystyle\sum_{\begin{subarray}{c}h_{1},\ldots,h_{k}\in\mathbb{Z}\\ |h_{i}|\leq\frac{1}{2}H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}1 =gPSL2(/c)h1,,hk|hi|12Hjij(mod 2)𝟙Ta1h1STakhkS=g\displaystyle=\sum_{g\in\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})}\sum_{\begin{subarray}{c}h_{1},\ldots,h_{k}\in\mathbb{Z}\\ |h_{i}|\leq\frac{1}{2}H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}\mathbbm{1}_{T^{a_{1}h_{1}}S\cdots T^{a_{k}h_{k}}S=g}
c3/2(gPSL2(/c)(h1,,hk|hi|12Hjij(mod 2)𝟙Ta1h1STakhkS=g)2)1/2\displaystyle\ll c^{3/2}\Bigg(\sum_{g\in\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})}\Bigg(\sum_{\begin{subarray}{c}h_{1},\ldots,h_{k}\in\mathbb{Z}\\ |h_{i}|\leq\frac{1}{2}H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}\mathbbm{1}_{T^{a_{1}h_{1}}S\cdots T^{a_{k}h_{k}}S=g}\Bigg)^{2}\Bigg)^{1/2}
=c3/2(h1,,hkh1,,hk|hi|,|hi|12Hjij(mod 2)𝟙Ta1h1STakhkS=Ta1h1STakhkS in PSL2(/c))1/2.\displaystyle=c^{3/2}\Bigg(\sum_{\begin{subarray}{c}h_{1},\ldots,h_{k}\in\mathbb{Z}\\ h_{1}^{\prime},\ldots,h_{k}^{\prime}\in\mathbb{Z}\\ |h_{i}|,|h_{i}^{\prime}|\leq\frac{1}{2}H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}\mathbbm{1}_{T^{a_{1}h_{1}}S\cdots T^{a_{k}h_{k}}S=T^{a_{1}h_{1}^{\prime}}S\cdots T^{a_{k}h_{k}^{\prime}}S\text{ in }\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z})}\Bigg)^{1/2}.

One can rewrite the last equation Ta1h1STakhkS=Ta1h1STakhkST^{a_{1}h_{1}}S\cdots T^{a_{k}h_{k}}S=T^{a_{1}h_{1}^{\prime}}S\cdots T^{a_{k}h_{k}^{\prime}}S in PSL2(/c)\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}) as

\[T^{a_{1}(h_{1}-h_{1}^{\prime})}ST^{a_{2}h_{2}}S\cdots T^{a_{k-1}h_{k-1}}ST^{a_{k}(h_{k}-h_{k}^{\prime})}ST^{-a_{k-1}h_{k-1}^{\prime}}S\cdots T^{-a_{2}h_{2}^{\prime}}S=I.\]

Comparing this with ˜6.1, recalling that k=q+22k=\tfrac{q+2}{2}, and noting that h1h1h_{1}-h_{1}^{\prime} takes each value in [H1,H1]\mathbb{Z}\cap[-H_{1},H_{1}] at most O(H1)O(H_{1}) times (and similarly for hkhkh_{k}-h_{k}^{\prime}), we conclude that the desired count of solutions is at least

q1c3H1Hk(h1,,hk|hi|12Hjij(mod 2)1)2q1c3H1HkH12k/2H22k/2=H1k1H2k1c3.\gg_{q}\frac{1}{c^{3}H_{1}H_{k}}\Bigg(\sum_{\begin{subarray}{c}h_{1},\ldots,h_{k}\in\mathbb{Z}\\ |h_{i}|\leq\frac{1}{2}H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}1\Bigg)^{2}\gg_{q}\frac{1}{c^{3}H_{1}H_{k}}H_{1}^{2\left\lceil k/2\right\rceil}H_{2}^{2\left\lfloor k/2\right\rfloor}=\frac{H_{1}^{k-1}H_{2}^{k-1}}{c^{3}}.

Since k1=q2k-1=\tfrac{q}{2}, this completes our proof. ∎

Optimistically, we may expect the lower bound in Lemma˜6.1 to be essentially sharp.

Conjecture 6.2.

Let c,q+c,q\in\mathbb{Z}_{+} with qq even. For all a1,a2(/c)×a_{1},a_{2}\in(\mathbb{Z}/c\mathbb{Z})^{\times} and 1H1H2c1\leq H_{1}\leq H_{2}\ll c, the number of solutions (h1,,hq)q(h_{1},\ldots,h_{q})\in\mathbb{Z}^{q} to ˜6.1 is at most

qco(1)(H2(q2)/2+(H1H2)q/2c3).\ll_{q}c^{o(1)}\left(H_{2}^{(q-2)/2}+\frac{(H_{1}H_{2})^{q/2}}{c^{3}}\right). (6.3)
Remark.

When $c$ is large enough in terms of $H_{1},H_{2},q$ (so in particular, the first term in (6.3) dominates), Conjecture 6.2 becomes a statement about $\textnormal{PSL}_{2}(\mathbb{Z})$, which follows from a somewhat tedious combinatorial computation. When $q$ is large enough in terms of $H_{1},H_{2},c$ (so in particular, the second term in (6.3) dominates), Conjecture 6.2 can be attacked using expansion methods; see [37, Lemma 53 and Theorem 50] for the case when $c$ is prime. However, note that (6.3) saves at best a factor of $c^{3}$ over the trivial bound of $(H_{1}H_{2})^{q/2}$, and this saving becomes $c^{3/q}$ in our final bounds; therefore, using a large value of $q$ ultimately produces a quantitatively weak power saving. On the other hand, if $q$ is too small, then combining Conjecture 6.2 with Proposition 5.5 gives information only about a small moment of singular values, which produces a weak bound for the top singular value.

Because of this, Conjecture 6.2 is most relevant in the intermediate range when $q\asymp 1$, say $q\in\{6,8,10\}$, and $H_{1},H_{2}\in[\sqrt{c},c]$. It seems very difficult to fully establish Conjecture 6.2 in these cases, but we can nevertheless make some partial progress towards it. When $c$ is prime, the cases $q\in\{4,6\}$ are also related to some computations of Shkredov [38, Lemma 15].

Lemma 6.3.

Conjecture 6.2 holds if $q=2$ or $q=4$.

Proof.

When q=2q=2, the equation in ˜6.1 reads Ta1h1S=STa2h2T^{a_{1}h_{1}}S=ST^{-a_{2}h_{2}}, which implies the entry-wise congruence

(a1h1110)γ(011a2h2)(mod c),\begin{pmatrix}a_{1}h_{1}&-1\\ 1&0\end{pmatrix}\equiv\gamma\begin{pmatrix}0&-1\\ 1&-a_{2}h_{2}\end{pmatrix}\ (\textnormal{mod }c),

for some γ/c\gamma\in\mathbb{Z}/c\mathbb{Z} with γ2=1\gamma^{2}=1. This actually forces γ=1\gamma=1, and h1,h20(mod c)h_{1},h_{2}\equiv 0\ (\textnormal{mod }c). Since H1,H2cH_{1},H_{2}\ll c, we obtain O(1)O(1) choices of h1,h2h_{1},h_{2}, which matches the bound from ˜6.3.

When q=4q=4, the equation in ˜6.1 reads Ta1h1STa2h2S=STa2h4STa1h3T^{a_{1}h_{1}}ST^{a_{2}h_{2}}S=ST^{-a_{2}h_{4}}ST^{-a_{1}h_{3}}, which translates to

\[\begin{pmatrix}a_{1}a_{2}h_{1}h_{2}-1&-a_{1}h_{1}\\ a_{2}h_{2}&-1\end{pmatrix}\equiv\gamma\begin{pmatrix}-1&a_{1}h_{3}\\ -a_{2}h_{4}&a_{1}a_{2}h_{3}h_{4}-1\end{pmatrix}\ (\textnormal{mod }c),\]

for some $\gamma\in\mathbb{Z}/c\mathbb{Z}$ with $\gamma^{2}=1$. Since there are $c^{o(1)}$ such values of $\gamma$ (this follows from a local computation via Hensel's lemma, and the divisor bound), we may fix $\gamma$ up to an acceptable loss. To establish the desired bound of $O(c^{o(1)}H_{2})$ for the number of solutions $(h_{1},h_{2},h_{3},h_{4})$ with $|h_{1}|,|h_{3}|\leq H_{1}$ and $|h_{2}|,|h_{4}|\leq H_{2}$, we split into three cases.

Case 1: h1=0h_{1}=0. This forces h30(mod c)h_{3}\equiv 0\ (\textnormal{mod }c), and each choice of h2h_{2} induces a unique residue of h4(mod c)h_{4}\ (\textnormal{mod }c). Since H1,H2cH_{1},H_{2}\ll c, this gives a total of O(co(1)H2)O(c^{o(1)}H_{2}) solutions.

Case 2: h2=0h_{2}=0. This forces h40(mod c)h_{4}\equiv 0\ (\textnormal{mod }c), and each choice of h1h_{1} induces a unique residue of h3(mod c)h_{3}\ (\textnormal{mod }c). Since H1,H2cH_{1},H_{2}\ll c, this gives a total of O(co(1)H1)O(c^{o(1)}H_{1}) solutions, and recall H1H2H_{1}\leq H_{2}.

Case 3: h1h20h_{1}h_{2}\neq 0. Then the congruence a1a2h1h21γ(mod c)a_{1}a_{2}h_{1}h_{2}-1\equiv-\gamma\ (\textnormal{mod }c) fixes the residue of h1h2(mod c)h_{1}h_{2}\ (\textnormal{mod }c), leaving O(1+H1H2c)O(1+\tfrac{H_{1}H_{2}}{c}) possible values of h1h2h_{1}h_{2}, each of which gives O(co(1))O(c^{o(1)}) choices of h1,h2h_{1},h_{2} by the divisor bound. This gives a total of co(1)(1+H1H2c)co(1)H2\ll c^{o(1)}(1+\tfrac{H_{1}H_{2}}{c})\ll c^{o(1)}H_{2} solutions. ∎
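The claim used in the proof above, that there are only $c^{o(1)}$ residues $\gamma$ with $\gamma^{2}=1$, can be made concrete. The sketch below (hypothetical function names) lists these residues by brute force and compares their number with the standard local count: $2$ for each odd prime power dividing $c$ exactly, and $1$, $2$ or $4$ for the power of $2$ according to whether its exponent is at most $1$, equal to $2$, or at least $3$.

```python
def square_roots_of_unity(c):
    """All residues g modulo c with g^2 ≡ 1 (mod c), found by brute force."""
    return [g for g in range(c) if (g * g - 1) % c == 0]

def predicted_count(c):
    """Number of square roots of 1 modulo c, from the prime factorization of c."""
    count = 1
    k = 0
    while c % 2 == 0:          # strip the power of 2 and record its exponent
        c //= 2
        k += 1
    if k == 2:
        count *= 2
    elif k >= 3:
        count *= 4
    p = 3
    while p * p <= c:          # each odd prime power contributes a factor of 2
        if c % p == 0:
            count *= 2
            while c % p == 0:
                c //= p
        p += 2
    if c > 1:                  # leftover odd prime factor
        count *= 2
    return count
```

In particular the count is at most $2^{\omega(c)+1}$, where $\omega(c)$ is the number of distinct prime factors, which is $c^{o(1)}$ by the divisor bound.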

Proposition 6.4 (Combinatorial count for q=6q=6).

Let c+c\in\mathbb{Z}_{+}, a1,a2(/c)×a_{1},a_{2}\in(\mathbb{Z}/c\mathbb{Z})^{\times}, and 1H1H2c1\leq H_{1}\leq H_{2}\ll c. The number of solutions (h1,,h6)6(h_{1},\ldots,h_{6})\in\mathbb{Z}^{6} to ˜6.1 with q=6q=6 is at most

co(1)(H22+(H1H2)2c).\ll c^{o(1)}\left(H_{2}^{2}+\frac{(H_{1}H_{2})^{2}}{c}\right). (6.4)
Remark.

Conjecture 6.2 would replace the second term in (6.4) with $\tfrac{(H_{1}H_{2})^{3}}{c^{3}}$. In particular, Proposition 6.4 establishes Conjecture 6.2 when $q=6$ and either $H_{1}^{2}\ll c$ or $H_{1},H_{2}\asymp c$.

Proof of Proposition˜6.4.

To simplify the exposition, we focus on the case a1=a2=1a_{1}=a_{2}=1. The proof is almost completely unchanged when a1,a2(/c)×a_{1},a_{2}\in(\mathbb{Z}/c\mathbb{Z})^{\times} are arbitrary. We may then write the equation in ˜6.1 (with q=6q=6) as

Th1STh2STh3STh4S=STh6STh5.T^{h_{1}}ST^{h_{2}}ST^{h_{3}}ST^{h_{4}}S=ST^{-h_{6}}ST^{-h_{5}}.

A short computation brings this to the entry-wise congruence

(h1h2h3h4h1h4h3h4h1h2+1h1h2h3+h1+h3h2h3h4h2h4h2h3+1)γ(1h5h6h5h61)(mod c)\begin{pmatrix}h_{1}h_{2}h_{3}h_{4}-h_{1}h_{4}-h_{3}h_{4}-h_{1}h_{2}+1&-h_{1}h_{2}h_{3}+h_{1}+h_{3}\\ h_{2}h_{3}h_{4}-h_{2}-h_{4}&-h_{2}h_{3}+1\end{pmatrix}\equiv\gamma\begin{pmatrix}-1&h_{5}\\ -h_{6}&h_{5}h_{6}-1\end{pmatrix}\ (\textnormal{mod }c)

for some γ/c\gamma\in\mathbb{Z}/c\mathbb{Z} with γ2=1\gamma^{2}=1. As before, since there are co(1)c^{o(1)} possible values of γ\gamma, we may as well regard γ\gamma as fixed. Since both sides have determinant 11, this is actually a system of three congruences:

{1h2h3γ(h5h61)(mod c),h1(1h2h3)+h3γh5(mod c),h4(1h2h3)+h2γh6(mod c).\begin{cases}1-h_{2}h_{3}\equiv\gamma(h_{5}h_{6}-1)\ (\textnormal{mod }c),\\ h_{1}(1-h_{2}h_{3})+h_{3}\equiv\gamma h_{5}\ (\textnormal{mod }c),\\ h_{4}(1-h_{2}h_{3})+h_{2}\equiv\gamma h_{6}\ (\textnormal{mod }c).\end{cases} (6.5)
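The "short computation" behind the entry-wise congruence can be double-checked mechanically. The sketch below (hypothetical function names) multiplies out both sides over the integers, with $a_{1}=a_{2}=1$ and before any reduction modulo $c$ or multiplication by $\gamma$, and compares them with the closed forms displayed above.

```python
from itertools import product

S = ((0, -1), (1, 0))

def mat_mul(A, B):
    """Product of two 2x2 integer matrices."""
    return tuple(
        tuple(sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )

def T(h):
    """The matrix T^h = [[1, h], [0, 1]]."""
    return ((1, h), (0, 1))

def chain(*factors):
    """Multiply a sequence of 2x2 matrices from left to right."""
    result = ((1, 0), (0, 1))
    for M in factors:
        result = mat_mul(result, M)
    return result

def lhs_product(h1, h2, h3, h4):
    return chain(T(h1), S, T(h2), S, T(h3), S, T(h4), S)

def lhs_closed_form(h1, h2, h3, h4):
    return (
        (h1 * h2 * h3 * h4 - h1 * h4 - h3 * h4 - h1 * h2 + 1, -h1 * h2 * h3 + h1 + h3),
        (h2 * h3 * h4 - h2 - h4, -h2 * h3 + 1),
    )

def rhs_product(h5, h6):
    return chain(S, T(-h6), S, T(-h5))

def rhs_closed_form(h5, h6):
    return ((-1, h5), (-h6, h5 * h6 - 1))
```

Comparing the $(2,2)$, $(1,2)$ and $(2,1)$ entries of the two closed forms (the last one after changing signs) gives the three congruences in the system.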

Our argument now requires some casework.

Case 1: One has hj=0h_{j}=0 for some j{1,,6}j\in\{1,\ldots,6\}. Since the original equation can also be written as Thj1SThjSThj+1SThj+2SThj+3SThj+4S=IT^{h_{j-1}}ST^{h_{j}}ST^{h_{j+1}}ST^{h_{j+2}}ST^{h_{j+3}}ST^{h_{j+4}}S=I in PSL2(/c)\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}) (viewing indices modulo 66), we may assume without loss of generality that j=2j=2, up to potentially swapping H1H_{1} and H2H_{2} in the final bound (so we momentarily forget that H1H2H_{1}\leq H_{2}). So let us say h2=0h_{2}=0, which reduces ˜6.5 to

{1γ(h5h61)(mod c),h1+h3γh5(mod c),h4γh6(mod c).\begin{cases}1\equiv\gamma(h_{5}h_{6}-1)\ (\textnormal{mod }c),\\ h_{1}+h_{3}\equiv\gamma h_{5}\ (\textnormal{mod }c),\\ h_{4}\equiv\gamma h_{6}\ (\textnormal{mod }c).\end{cases} (6.6)

Subcase 1.1: One has h5=0h_{5}=0. Then for any values of h1h_{1} and h6h_{6}, the system in ˜6.6 leaves only O(1)O(1) possibilities for h3,h4h_{3},h_{4} (since H1,H2cH_{1},H_{2}\ll c). This gives O(H1H2)O(H_{1}H_{2}) solutions.

Subcase 1.2: One has h6=0h_{6}=0. Then for any values of h1h_{1} and h5h_{5}, the system in ˜6.6 leaves only O(1)O(1) possibilities for h3,h4h_{3},h_{4}. This gives O(H12)O(H_{1}^{2}) solutions.

Subcase 1.3: One has $h_{5}h_{6}\neq 0$. Then the first congruence in (6.6) fixes $h_{5}h_{6}\ (\textnormal{mod }c)$, leaving $O(1+\tfrac{H_{1}H_{2}}{c})$ possibilities for the nonzero integer $h_{5}h_{6}$, each of which gives $O(c^{o(1)})$ possible values for $h_{5},h_{6}$ by the divisor bound. Once $h_{5}$ and $h_{6}$ are fixed, each value of $h_{1}$ produces $O(1)$ final solutions. This gives $O(c^{o(1)}(1+\tfrac{H_{1}H_{2}}{c})H_{1})$ solutions.

From Case 1, we obtain a total number of solutions of

co(1)(H12+H22+(1+H1H2c)(H1+H2)),\ll c^{o(1)}\left(H_{1}^{2}+H_{2}^{2}+\left(1+\frac{H_{1}H_{2}}{c}\right)(H_{1}+H_{2})\right),

which is O(co(1)H22)O(c^{o(1)}H_{2}^{2}) once we remember that H1H2H_{1}\leq H_{2} and H1cH_{1}\ll c. This is acceptable in ˜6.4.

Case 2: One has $h_{j}\neq 0$ for all $j\in\{1,\ldots,6\}$. We fix $d:=(1-h_{2}h_{3},c)=(h_{5}h_{6}-1,c)$ up to an acceptable $O(c^{o(1)})$ loss. Since $h_{5}h_{6}\equiv 1\ (\textnormal{mod }d)$, there are $O(1+\frac{H_{1}H_{2}}{d})$ ways to pick the nonzero integer $h_{5}h_{6}$, each of which gives $O(c^{o(1)})$ ways to pick $h_{5},h_{6}$ by the divisor bound.

Once h5,h6h_{5},h_{6} are fixed, we pick h2,h3h_{2},h_{3} subject to the system

{1h2h3γ(h5h61)(mod c),h2γh6=:r(mod d),\begin{cases}1-h_{2}h_{3}\equiv\gamma(h_{5}h_{6}-1)\ (\textnormal{mod }c),\\ h_{2}\equiv\gamma h_{6}=:r\ (\textnormal{mod }d),\end{cases} (6.7)

which follows from ˜6.5. We can do this in either of the following ways:

  • Pick the nonzero integer h2h3h_{2}h_{3} subject to its residue mod cc (due to the first congruence in ˜6.7) in O(1+H1H2c)O(1+\tfrac{H_{1}H_{2}}{c}) ways, and then h2,h3h_{2},h_{3} in O(co(1))O(c^{o(1)}) ways by the divisor bound.

  • For each choice of h2h_{2} with |h2|H2|h_{2}|\leq H_{2} and h2r(mod d)h_{2}\equiv r\ (\textnormal{mod }d), pick h3h_{3} subject to its residue mod c(h2,c)\tfrac{c}{(h_{2},c)} (again, due to the first congruence in ˜6.7) in O(1+H1c(h2,c))O(1+\frac{H_{1}}{c}(h_{2},c)) ways.

Finally, once $h_{5},h_{6},h_{2},h_{3}$ are fixed, (6.5) determines the residues of $h_{1}$ and $h_{4}$ modulo $\tfrac{c}{d}$, so there are $O((1+\tfrac{H_{1}}{c}d)(1+\tfrac{H_{2}}{c}d))$ choices of $h_{1},h_{4}$. From Case 2, we obtain a total number of solutions of

co(1)maxdc(1+H1H2d)from picking h5,h6min(1+H1H2c,maxr(/d)×|h2|H2h2r(mod d)(1+H1c(h2,c)))from picking h2,h3\displaystyle\ll c^{o(1)}\max_{d\mid c}\underbrace{\left(1+\frac{H_{1}H_{2}}{d}\right)}_{\text{from picking }h_{5},h_{6}}\underbrace{\min\Bigg(1+\frac{H_{1}H_{2}}{c},\max_{r\in(\mathbb{Z}/d\mathbb{Z})^{\times}}\sum_{\begin{subarray}{c}|h_{2}|\leq H_{2}\\ h_{2}\equiv r\ (\textnormal{mod }d)\end{subarray}}\left(1+\frac{H_{1}}{c}(h_{2},c)\right)\Bigg)}_{\text{from picking }h_{2},h_{3}} (6.8)
×(1+H1cd)from picking h1(1+H2cd)from picking h4.\displaystyle\times\underbrace{\left(1+\frac{H_{1}}{c}d\right)}_{\text{from picking }h_{1}}\ \underbrace{\left(1+\frac{H_{2}}{c}d\right)}_{\text{from picking }h_{4}}.

To bound the sum over h2h_{2}, we write

\[\begin{aligned}
\sum_{\begin{subarray}{c}|h_{2}|\leq H_{2}\\ h_{2}\equiv r\ (\textnormal{mod }d)\end{subarray}}(h_{2},c)&\leq\sum_{\begin{subarray}{c}g\mid c\\ (g,d)=1\end{subarray}}g\sum_{\begin{subarray}{c}|h_{2}|\leq H_{2}\\ g\mid h_{2}\equiv r\ (\textnormal{mod }d)\end{subarray}}1\\
&=\sum_{\begin{subarray}{c}g\mid\frac{c}{d}\\ (g,d)=1\end{subarray}}g\sum_{\begin{subarray}{c}|h_{2}^{\prime}|\leq\frac{H_{2}}{g}\\ h_{2}^{\prime}\equiv\overline{g}r\ (\textnormal{mod }d)\end{subarray}}1\\
&\ll\sum_{\begin{subarray}{c}g\mid\frac{c}{d}\\ (g,d)=1\end{subarray}}g\left(1+\frac{H_{2}}{gd}\right)\ll c^{o(1)}\left(\frac{c}{d}+\frac{H_{2}}{d}\right),
\end{aligned}\]

and the term H2d\tfrac{H_{2}}{d} can be omitted since H2cH_{2}\ll c. Plugging this into ˜6.8 gives a total count of

co(1)maxdc(1+H1H2d)(1+H2cd)(1+H1cd)min(1+H1H2c,1+H2d+H1ccd)\ll c^{o(1)}\max_{d\mid c}\left(1+\frac{H_{1}H_{2}}{d}\right)\left(1+\frac{H_{2}}{c}d\right)\left(1+\frac{H_{1}}{c}d\right)\min\left(1+\frac{H_{1}H_{2}}{c},1+\frac{H_{2}}{d}+\frac{H_{1}}{c}\cdot\frac{c}{d}\right)

Since H1H2H_{1}\leq H_{2}, the final term of H1ccd=H1d\tfrac{H_{1}}{c}\cdot\tfrac{c}{d}=\tfrac{H_{1}}{d} can be omitted. The bound above now becomes

co(1)maxdc(1+H1H2d)(1+H2cd)max(H1cd,1)(1+H2dmin(H1cd,1))\displaystyle\ll c^{o(1)}\max_{d\mid c}\left(1+\frac{H_{1}H_{2}}{d}\right)\left(1+\frac{H_{2}}{c}d\right)\max\left(\frac{H_{1}}{c}d,1\right)\left(1+\frac{H_{2}}{d}\min\left(\frac{H_{1}}{c}d,1\right)\right)
=co(1)maxdc(1+H1H2d)(1+H2cd)(max(H1cd,1)+H2d(H1cd1))\displaystyle=c^{o(1)}\max_{d\mid c}\left(1+\frac{H_{1}H_{2}}{d}\right)\left(1+\frac{H_{2}}{c}d\right)\left(\max\left(\frac{H_{1}}{c}d,1\right)+\frac{H_{2}}{d}\left(\frac{H_{1}}{c}d\cdot 1\right)\right)
co(1)maxdc(1+H1H2d)(1+H2cd)(1+H1cd+H1H2c).\displaystyle\leq c^{o(1)}\max_{d\mid c}\left(1+\frac{H_{1}H_{2}}{d}\right)\left(1+\frac{H_{2}}{c}d\right)\left(1+\frac{H_{1}}{c}d+\frac{H_{1}H_{2}}{c}\right).

After expanding the expression inside the maximum, each term is a constant multiple of a power of $d$, hence monotone in $d\in[1,c]$, so each term is maximized at $d=1$ or $d=c$. Since the number of terms is $O(1)$, we can bound the maximum over $d\mid c$, up to a constant, by looking only at the extreme points $d=1$ and $d=c$. This gives a total count of

co(1)max(H1H2(1+H2c)(1+H1c+H1H2c),(1+H1H2c)H2(H1+H1H2c))\displaystyle\ll c^{o(1)}\max\left(H_{1}H_{2}\left(1+\frac{H_{2}}{c}\right)\left(1+\frac{H_{1}}{c}+\frac{H_{1}H_{2}}{c}\right),\left(1+\frac{H_{1}H_{2}}{c}\right)H_{2}\left(H_{1}+\frac{H_{1}H_{2}}{c}\right)\right)
co(1)max(H1H2(1+H1H2c),(1+H1H2c)H2H1),\displaystyle\ll c^{o(1)}\max\left(H_{1}H_{2}\left(1+\frac{H_{1}H_{2}}{c}\right),\left(1+\frac{H_{1}H_{2}}{c}\right)H_{2}H_{1}\right),

where, to reach the last line, we used that H1,H2cH_{1},H_{2}\ll c. This establishes the desired bound. ∎

Finally, we may remove the restriction that H1,H2cH_{1},H_{2}\ll c from the counting results in this section up to some additional factors.

Corollary 6.5.

Let c+c\in\mathbb{Z}_{+}, a1,a2(/c)×a_{1},a_{2}\in(\mathbb{Z}/c\mathbb{Z})^{\times}, and 1H1H21\leq H_{1}\leq H_{2}. Then the number of solutions to ˜6.1 with q=4q=4 is at most

co(1)(1+H12c2)(1+H2c)H2,\ll c^{o(1)}\left(1+\frac{H_{1}^{2}}{c^{2}}\right)\left(1+\frac{H_{2}}{c}\right)H_{2}, (6.9)

and the number of solutions to ˜6.1 with q=6q=6 is at most

co(1)(1+H1H2c+H13H2c3)H22.\ll c^{o(1)}\left(1+\frac{H_{1}H_{2}}{c}+\frac{H_{1}^{3}H_{2}}{c^{3}}\right)H_{2}^{2}. (6.10)
Proof.

Since the equation in ˜6.1 only depends on the residues of h1,,hq(mod c)h_{1},\ldots,h_{q}\ (\textnormal{mod }c), we may as well count solutions to the system

\[
\begin{cases}
T^{a_{1}h_{1}^{\prime}}ST^{a_{2}h_{2}^{\prime}}S\cdots T^{a_{1}h_{q-1}^{\prime}}ST^{a_{2}h_{q}^{\prime}}S=I\text{ in }\textnormal{PSL}_{2}(\mathbb{Z}/c\mathbb{Z}),\\
|h_{i}^{\prime}|\leq\min(H_{j},c),\text{ for all }i,j\text{ with }i\equiv j\ (\textnormal{mod }2),
\end{cases}
\tag{6.11}
\]

and multiply the final count by a factor of q(1+H1c)q/2(1+H2c)q/2\ll_{q}(1+\tfrac{H_{1}}{c})^{q/2}(1+\tfrac{H_{2}}{c})^{q/2} (indeed, each solution (h1,,hq)(h_{1}^{\prime},\ldots,h_{q}^{\prime}) to ˜6.11 induces at most this many solutions (h1,,hq)(h_{1},\ldots,h_{q}) to ˜6.1 with the same residues modulo cc, and all solutions to ˜6.1 can be obtained this way).

If q=4q=4, Lemma˜6.3 applied for min(H1,c)\min(H_{1},c) and min(H2,c)\min(H_{2},c) gives a total number of solutions of

co(1)(1+H1c)2(1+H2c)2min(H2,c),\ll c^{o(1)}\left(1+\frac{H_{1}}{c}\right)^{2}\left(1+\frac{H_{2}}{c}\right)^{2}\min(H_{2},c),

and the bound in ˜6.9 follows by noting that min(H2,c)H2(1+H2c)1\min(H_{2},c)\asymp H_{2}\left(1+\frac{H_{2}}{c}\right)^{-1}.

Similarly, if q=6q=6, Proposition˜6.4 applied for min(H1,c)\min(H_{1},c) and min(H2,c)\min(H_{2},c) gives a total count of

co(1)(1+H1c)3(1+H2c)3(min(H2,c)2+min(H1,c)2min(H2,c)2c)\displaystyle\ll c^{o(1)}\left(1+\frac{H_{1}}{c}\right)^{3}\left(1+\frac{H_{2}}{c}\right)^{3}\left(\min(H_{2},c)^{2}+\frac{\min(H_{1},c)^{2}\min(H_{2},c)^{2}}{c}\right)
co(1)(1+H13c3)(1+H2c)H22+co(1)(1+H1c)(1+H2c)H12H22c\displaystyle\ll c^{o(1)}\left(1+\frac{H_{1}^{3}}{c^{3}}\right)\left(1+\frac{H_{2}}{c}\right)H_{2}^{2}+c^{o(1)}\left(1+\frac{H_{1}}{c}\right)\left(1+\frac{H_{2}}{c}\right)\frac{H_{1}^{2}H_{2}^{2}}{c}
co(1)(1+H2c+H13c3+H13H2c4)H22+co(1)(1+H2c+H1H2c2)H12H22c.\displaystyle\ll c^{o(1)}\left(1+\frac{H_{2}}{c}+\frac{H_{1}^{3}}{c^{3}}+\frac{H_{1}^{3}H_{2}}{c^{4}}\right)H_{2}^{2}+c^{o(1)}\left(1+\frac{H_{2}}{c}+\frac{H_{1}H_{2}}{c^{2}}\right)\frac{H_{1}^{2}H_{2}^{2}}{c}.

We note that the third and fourth terms in the first parenthesis above can be ignored: their contribution to the final bound is H13H22c3+H13H23c4\tfrac{H_{1}^{3}H_{2}^{2}}{c^{3}}+\tfrac{H_{1}^{3}H_{2}^{3}}{c^{4}}, which is superseded by the contribution of H13H23c3\tfrac{H_{1}^{3}H_{2}^{3}}{c^{3}} from the third term in the second parenthesis. This gives a total count of

co(1)(1+H2c+H12c+H12H2c2+H13H2c3)H22.\ll c^{o(1)}\left(1+\frac{H_{2}}{c}+\frac{H_{1}^{2}}{c}+\frac{H_{1}^{2}H_{2}}{c^{2}}+\frac{H_{1}^{3}H_{2}}{c^{3}}\right)H_{2}^{2}. (6.12)

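This is bounded by (6.10): the terms $\tfrac{H_{2}}{c}$ and $\tfrac{H_{1}^{2}}{c}$ are at most $\tfrac{H_{1}H_{2}}{c}$ (since $1\leq H_{1}\leq H_{2}$), and $\tfrac{H_{1}^{2}H_{2}}{c^{2}}$ is the geometric mean of $\tfrac{H_{1}H_{2}}{c}$ and $\tfrac{H_{1}^{3}H_{2}}{c^{3}}$. ∎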
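The last step of the proof can be sanity-checked numerically. The sketch below is only an illustration (the function names and the constant 3 are our own choices, not taken from the text): it compares the parentheses of (6.12) and (6.10) in exact rational arithmetic, under the standing assumption $1\leq H_{1}\leq H_{2}$.

```python
from fractions import Fraction


def paren_6_12(H1, H2, c):
    """Parenthesis of (6.12), without the common factor c^{o(1)} H2^2."""
    H1, H2, c = Fraction(H1), Fraction(H2), Fraction(c)
    return 1 + H2 / c + H1**2 / c + H1**2 * H2 / c**2 + H1**3 * H2 / c**3


def paren_6_10(H1, H2, c):
    """Parenthesis of (6.10), without the common factor c^{o(1)} H2^2."""
    H1, H2, c = Fraction(H1), Fraction(H2), Fraction(c)
    return 1 + H1 * H2 / c + H1**3 * H2 / c**3


def dominated(H1, H2, c, constant=3):
    """True if (6.12) <= constant * (6.10); the constant 3 is an assumption
    that follows from AM-GM applied to the geometric-mean term."""
    return paren_6_12(H1, H2, c) <= constant * paren_6_10(H1, H2, c)
```

For instance, `dominated(5, 40, 100)` compares the two expressions at $H_{1}=5$, $H_{2}=40$, $c=100$.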

7. Bilinear forms with Kloosterman sums

We now combine the work in Sections 4, 5 and 6 to deduce our main results, Theorems 1.2 and 1.1.

7.1. Composite moduli

Here we prove a generalization of Theorem˜1.2, which allows for larger values of M,NM,N. We state our upper bound in two ways, to facilitate comparison with ˜1.2.

Theorem 7.1.

Let $c=dd^{\prime}e$ for some $d,d^{\prime},e\in\mathbb{Z}_{+}$ with $d^{\prime}\mid d$ and $(d,e)=1$, and let $f\leq\sqrt{cd}$ be the largest integer with $f^{2}\mid cd$. Let $\mathcal{I},\mathcal{J}\subset\mathbb{Z}$ be intervals of lengths $|\mathcal{I}|=M$, $|\mathcal{J}|=N$, with $1\leq N\leq M\leq c$. (The assumption that $N\leq M$ is only included to shorten the statement of Theorem 7.1; one can of course swap $m$ and $n$ in the bilinear sum, up to swapping $M$ and $N$ in the upper bound.) Then for any complex sequences $(\alpha_{m})_{m\in\mathcal{I}}$, $(\beta_{n})_{n\in\mathcal{J}}$ and $a\in(\mathbb{Z}/c\mathbb{Z})^{\times}$, one has

\[
\begin{aligned}
\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c)
&\ll\|\alpha\|\|\beta\|c^{1+o(1)}\left(\frac{dM^{3}N}{c^{3}}+\frac{fM^{2}}{c^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}}\\
&=\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\left(\frac{d}{N^{2}}+\frac{fc}{MN^{3}}+\frac{fc^{3}}{d^{2}M^{3}N^{3}}\right)^{\frac{1}{6}}.
\end{aligned}
\]
Example 7.2.

Suppose MNM\asymp N. The smallest value of NN for which Theorem˜7.1 can beat the Weil bound in ˜1.2 is N=c2/5+o(1)N=c^{2/5+o(1)}, attained, e.g., when c=pqc=pq where p,qp,q are distinct primes with pq3/2p\asymp q^{3/2}. The largest value of NN for which Theorem˜7.1 can beat the Fourier-theoretic bound in ˜1.2 is N=c3/4o(1)N=c^{3/4-o(1)}, attained when cc has a divisor d=co(1)d=c^{o(1)} such that c/dc/d is square-free.

Remark.

Additional savings are possible in Theorem˜7.1 in the unbalanced range M>NM>N. Firstly, the bound in ˜6.10 can be refined to ˜6.12, but we omit this optimization for the sake of simplicity. Secondly and more substantially, bounding the largest singular value of an M×NM\times N matrix by the sixth moment of its singular values (as we do) can be particularly lossy if M>NM>N, since then the singular values often exhibit concentration near their maximum; one can try to amend this by subtracting a suitable main term from the sixth moment, as in [18, Lemma 4.2].

Proof of Theorem˜7.1.

Let ε>0\varepsilon>0 and H1:=c1+εM1H_{1}:=c^{1+\varepsilon}M^{-1}, H2:=c1+εN1H_{2}:=c^{1+\varepsilon}N^{-1}. Since MNM\geq N, we have H1H2H_{1}\leq H_{2}. Let qq be an even positive integer.

Then by the characterization of operator norms from ˜3.4, Corollary˜4.11 (using the notation from ˜4.13 and 4.14), the fact that AASq\|A\|\leq\|A\|_{S^{q}} for any linear map AA, and then Proposition˜5.5, we have that

|m,n𝒥(m,n,c)=1αmβnS(am,n;c)|\displaystyle\left|\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c)\right| αβKc,a,𝒥\displaystyle\leq\|\alpha\|\|\beta\|\|K_{c,a}^{\mathcal{I},\mathcal{J}}\| (7.1)
αβ(c1+2εF^c,aH1,H2(ρc)+Oε(c100))\displaystyle\leq\|\alpha\|\|\beta\|\left(c^{1+2\varepsilon}\|\widehat{F}_{c,a}^{H_{1},H_{2}}(\rho_{c}^{\circ})\|+O_{\varepsilon}(c^{-100})\right)
εαβ(c1+3εd1/q(H1H2)1/2𝒮1/q+Oε(c100)),\displaystyle\ll_{\varepsilon}\|\alpha\|\|\beta\|\left(c^{1+3\varepsilon}\frac{d^{1/q}}{(H_{1}H_{2})^{1/2}}\mathscr{S}^{1/q}+O_{\varepsilon}(c^{-100})\right),

where

𝒮:=maxdd~cf~2cd~f~h1,,hq|hi|2Hjij(mod 2)𝟙Ta¯h1STh2STa¯hq1SThqS=I in PSL2(/d~).\mathscr{S}:=\max_{\begin{subarray}{c}d\mid\tilde{d}\mid c\\ \tilde{f}^{2}\mid c\tilde{d}\end{subarray}}\tilde{f}\sum_{\begin{subarray}{c}h_{1},\ldots,h_{q}\in\mathbb{Z}\\ |h_{i}|\leq 2H_{j}\\ \forall i\equiv j\ (\textnormal{mod }2)\end{subarray}}\mathbbm{1}_{T^{\overline{a}h_{1}}ST^{h_{2}}S\cdots T^{\overline{a}h_{q-1}}ST^{h_{q}}S=I\text{ in }\textnormal{PSL}_{2}(\mathbb{Z}/\tilde{d}\mathbb{Z})}.

The inner sum is a count of solutions to an equation of the type ˜6.1 with cc replaced by d~\tilde{d}, so we can estimate 𝒮\mathscr{S} using Corollary˜6.5. We will make use of the following quick fact. Recalling that ff is the maximal positive integer with f2cdf^{2}\mid cd, we claim that for all integers k0k\geq 0, one has

\[
\max_{\begin{subarray}{c}d\mid\tilde{d}\mid c\\ \tilde{f}^{2}\mid c\tilde{d}\end{subarray}}\frac{\tilde{f}}{\tilde{d}^{k}}=\begin{cases}\frac{f}{d^{k}},&k\geq 1,\\ c,&k=0.\end{cases}
\tag{7.2}
\]

Indeed, when one appends a prime pp to d~\tilde{d}, the numerator f~\tilde{f} can increase by at most pp, so the expression f~/d~k\tilde{f}/\tilde{d}^{k} cannot increase if k1k\geq 1, and the maximum is attained when d~=d\tilde{d}=d. On the other hand, if k=0k=0, then clearly f~cc=c\tilde{f}\leq\sqrt{c\cdot c}=c, and the maximum is attained when d~=c\tilde{d}=c.
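The claim (7.2) can be tested by brute force on small moduli. The following is a minimal sketch under our own naming (none of these helpers come from the text); it only uses that $d\mid c$ and enumerates every intermediate divisor $\tilde{d}$ with $d\mid\tilde{d}\mid c$.

```python
from fractions import Fraction


def largest_square_root_divisor(n):
    """Largest integer f with f*f dividing n (trial division)."""
    f, p = 1, 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        f *= p ** (e // 2)
        p += 1
    # Any leftover n is 1 or a prime to the first power: no contribution.
    return f


def lhs_of_7_2(c, d, k):
    """Maximum of f~ / d~^k over d | d~ | c, with f~ maximal for c * d~."""
    best = Fraction(0)
    for dt in range(d, c + 1, d):
        if c % dt == 0:
            ft = largest_square_root_divisor(c * dt)
            best = max(best, Fraction(ft, dt**k))
    return best


def rhs_of_7_2(c, d, k):
    """Claimed value: f / d^k for k >= 1, and c for k = 0."""
    if k == 0:
        return Fraction(c)
    return Fraction(largest_square_root_divisor(c * d), d**k)
```

Comparing `lhs_of_7_2(c, d, k)` with `rhs_of_7_2(c, d, k)` for all $d\mid c$ in a small range gives a quick consistency check of the two cases.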

Now set q=6q=6. From ˜6.10 and ˜7.2, we obtain that

𝒮εcεmaxdd~cf~2cd~f~(1+H1H2d~+H13H2d~3)H22cε(c+fH1H2d+fH13H2d3)H22.\mathscr{S}\ll_{\varepsilon}c^{\varepsilon}\max_{\begin{subarray}{c}d\mid\tilde{d}\mid c\\ \tilde{f}^{2}\mid c\tilde{d}\end{subarray}}\tilde{f}\left(1+\frac{H_{1}H_{2}}{\tilde{d}}+\frac{H_{1}^{3}H_{2}}{\tilde{d}^{3}}\right)H_{2}^{2}\leq c^{\varepsilon}\left(c+\frac{fH_{1}H_{2}}{d}+\frac{fH_{1}^{3}H_{2}}{d^{3}}\right)H_{2}^{2}.

Plugging this into ˜7.1, we conclude that

\[
\begin{aligned}
\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c)
&\ll_{\varepsilon}\|\alpha\|\|\beta\|c^{1+4\varepsilon}\frac{d^{1/6}}{(H_{1}H_{2})^{1/2}}\left(cH_{2}^{2}+\frac{fH_{1}H_{2}^{3}}{d}+\frac{fH_{1}^{3}H_{2}^{3}}{d^{3}}\right)^{1/6}\\
&=\|\alpha\|\|\beta\|c^{1+4\varepsilon}\left(\frac{cd}{H_{1}^{3}H_{2}}+\frac{f}{H_{1}^{2}}+\frac{f}{d^{2}}\right)^{1/6}.
\end{aligned}
\]

The desired bound follows by recalling that H1=c1+εM1H_{1}=c^{1+\varepsilon}M^{-1}, H2=c1+εN1H_{2}=c^{1+\varepsilon}N^{-1}. ∎
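The algebra relating the three shapes of the bound (in terms of $H_{1},H_{2}$, and the two forms in Theorem 7.1) can be double-checked in exact arithmetic. This is only a sketch with hypothetical function names; it takes $\varepsilon=0$, so that $H_{1}=c/M$ and $H_{2}=c/N$, and treats $d$ and $f$ as free positive parameters.

```python
from fractions import Fraction


def inner_c_form(c, d, f, M, N):
    """Parenthesis of the first bound in Theorem 7.1 (prefactor c)."""
    c, d, f, M, N = map(Fraction, (c, d, f, M, N))
    return d * M**3 * N / c**3 + f * M**2 / c**2 + f / d**2


def inner_sqrt_form(c, d, f, M, N):
    """Parenthesis of the second bound (prefactor sqrt(MNc))."""
    c, d, f, M, N = map(Fraction, (c, d, f, M, N))
    return d / N**2 + f * c / (M * N**3) + f * c**3 / (d**2 * M**3 * N**3)


def inner_H_form(c, d, f, H1, H2):
    """Parenthesis at the end of the proof, in terms of H1 and H2."""
    c, d, f, H1, H2 = map(Fraction, (c, d, f, H1, H2))
    return c * d / (H1**3 * H2) + f / H1**2 + f / d**2


def forms_agree(c, d, f, M, N):
    """c * X^(1/6) = sqrt(MNc) * Y^(1/6) iff c^6 X = (MNc)^3 Y; also compare
    the H-form with H1 = c/M, H2 = c/N (i.e. epsilon = 0, an assumption)."""
    X = inner_c_form(c, d, f, M, N)
    Y = inner_sqrt_form(c, d, f, M, N)
    Z = inner_H_form(c, d, f, Fraction(c, M), Fraction(c, N))
    return Fraction(c) ** 6 * X == Fraction(M * N * c) ** 3 * Y and Z == X
```

Raising both sides to the sixth power avoids any floating-point roots, so the comparison is exact.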

Remark.

Using ˜6.9 with q=4q=4 instead of ˜6.10 with q=6q=6 in the proof above leads to a final bound of

m,n𝒥(m,n,c)=1αmβnS(am,n;c)\displaystyle\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=1\end{subarray}}\alpha_{m}\beta_{n}S(am,n;c) co(1)αβc(dM2Nc2+fM2c2+fNdc+fd2)1/4\displaystyle\ll c^{o(1)}\|\alpha\|\|\beta\|c\left(\frac{dM^{2}N}{c^{2}}+\frac{fM^{2}}{c^{2}}+\frac{fN}{dc}+\frac{f}{d^{2}}\right)^{1/4}
=co(1)αβMNc(dN+fN2+fcdM2N+fc2d2M2N2)1/4,\displaystyle=c^{o(1)}\|\alpha\|\|\beta\|\sqrt{MNc}\left(\frac{d}{N}+\frac{f}{N^{2}}+\frac{fc}{dM^{2}N}+\frac{fc^{2}}{d^{2}M^{2}N^{2}}\right)^{1/4},

which is weaker than Theorem˜7.1 in the main ranges of interest.

Proof of Theorem˜1.2.

Note that the result holds trivially if c=O(1)c=O(1). Since M,Nc1/2+o(1)M,N\ll c^{1/2+o(1)} and the result is symmetric in M,NM,N, we can assume without loss of generality that NMcN\leq M\leq c. One can then apply Theorem˜7.1, and since M,Nc1/2+o(1)M,N\ll c^{1/2+o(1)}, the upper bound becomes

αβc1+o(1)(dc+fc+fd2)16.\|\alpha\|\|\beta\|c^{1+o(1)}\left(\frac{d}{c}+\frac{f}{c}+\frac{f}{d^{2}}\right)^{\frac{1}{6}}.

The first term can be omitted in light of the bound dfd\leq f (since d2cdd^{2}\mid cd). ∎

7.2. Near-prime moduli

Building towards an unconditional result for general moduli, we need to slightly develop the result of Kowalski–Michel–Sawin [24] from Theorem˜3.4.

Corollary 7.3 (Kowalski–Michel–Sawin bounds for near-primes).

Let c=pqc=pq where pp is a prime, q+q\in\mathbb{Z}_{+}, and pqp\nmid q. Let M,NM,N\in\mathbb{Z} be integers such that 1NMc1\leq N\leq M\leq c. Then for any complex sequences (αm)mM(\alpha_{m})_{m\leq M} and (βn)nN(\beta_{n})_{n\leq N}, and any a(/c)×a\in(\mathbb{Z}/c\mathbb{Z})^{\times}, one has

m=1Mn=1N(m,n,c)=1αmβnS(am,n;c)αβco(1)MNc(N12q+(MN)316c1164q5364).\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\left(N^{-\frac{1}{2}}q+(MN)^{-\frac{3}{16}}c^{\frac{11}{64}}q^{\frac{53}{64}}\right).
Proof.

Using the twisted multiplicativity of Kloosterman sums and splitting the sums over m,nm,n according to their residues modulo qq, we can write

m=1Mn=1N(m,n,pq)=1αmβnS(am,n;pq)\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,pq)=1}\alpha_{m}\beta_{n}S(am,n;pq) =m=1Mn=1N(m,n,pq)=1αmβnS(aq¯2m,n;p)S(ap¯2m,n;q)\displaystyle=\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,pq)=1}\alpha_{m}\beta_{n}S(a\overline{q}^{2}m,n;p)S(a\overline{p}^{2}m,n;q) (7.3)
=r=1qs=1q(r,s,q)=1S(ap¯2r,s;q)m=1Mn=1N(m,n,p)=1αm,rβn,sS(aq¯2m,n;p),\displaystyle=\mathop{\sum_{r=1}^{q}\sum_{s=1}^{q}}_{(r,s,q)=1}S(a\overline{p}^{2}r,s;q)\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,p)=1}\alpha_{m,r}\beta_{n,s}S(a\overline{q}^{2}m,n;p),

where

αm,r:=αm𝟙mr(mod q),βn,s:=βn𝟙ns(mod q).\alpha_{m,r}:=\alpha_{m}\mathbbm{1}_{m\equiv r\ (\textnormal{mod }q)},\qquad\qquad\beta_{n,s}:=\beta_{n}\mathbbm{1}_{n\equiv s\ (\textnormal{mod }q)}.

First, we consider the contribution to ˜7.3 of those m,nm,n with pmp\mid m (and pnp\nmid n) or pnp\mid n (and pmp\nmid m). By the Weil and Ramanujan bounds from Lemmas˜3.3 and 3.2, this contribution is

\[
\begin{aligned}
\ll\mathop{\sum_{r=1}^{q}\sum_{s=1}^{q}}_{(r,s,q)=1}q^{\frac{1}{2}+o(1)}\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,p)=1}|\alpha_{m,r}\beta_{n,s}|
&\ll q^{\frac{1}{2}+o(1)}\left(\sum_{m=1}^{M}|\alpha_{m}|\right)\left(\sum_{n=1}^{N}|\beta_{n}|\right)\\
&\ll\|\alpha\|\|\beta\|q^{o(1)}\sqrt{MNq}.
\end{aligned}
\]

Plugging this into ˜7.3 and applying the Weil bound from Lemma˜3.3, we find that

m=1Mn=1N(m,n,pq)=1αmβnS(am,n;pq)r=1qs=1q(r,s,q)=1q12+o(1)|m=1Mn=1Npmnαm,rβn,sS(aq¯2m,n;p)|\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,pq)=1}\alpha_{m}\beta_{n}S(am,n;pq)\ll\mathop{\sum_{r=1}^{q}\sum_{s=1}^{q}}_{(r,s,q)=1}q^{\frac{1}{2}+o(1)}\left|\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{p\nmid mn}\alpha_{m,r}\beta_{n,s}S(a\overline{q}^{2}m,n;p)\right| (7.4)
+αβqo(1)MNq.\displaystyle+\|\alpha\|\|\beta\|q^{o(1)}\sqrt{MNq}.

The last bilinear sum over m,nm,n is now almost in the correct shape to apply Theorem˜3.4. Indeed, splitting it according to the residues of mm and nn modulo pp, we can rewrite it as

m=1Mn=1Npmnαm,rβn,sS(aq¯2m,n;p)=m=1min(M,p1)n=1min(N,p1)αm,rβn,sS(aq¯2m,n;p),\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{p\nmid mn}\alpha_{m,r}\beta_{n,s}S(a\overline{q}^{2}m,n;p)=\sum_{m^{\prime}=1}^{\min(M,p-1)}\ \sum_{n^{\prime}=1}^{\min(N,p-1)}\alpha_{m^{\prime},r}\beta_{n^{\prime},s}S(a\overline{q}^{2}m^{\prime},n^{\prime};p), (7.5)

where

αm,r:=mMmm(mod p)αm,r=mMmm(mod p)mr(mod q)αm,\alpha_{m^{\prime},r}:=\sum_{\begin{subarray}{c}m\leq M\\ m\equiv m^{\prime}\ (\textnormal{mod }p)\end{subarray}}\alpha_{m,r}=\sum_{\begin{subarray}{c}m\leq M\\ m\equiv m^{\prime}\ (\textnormal{mod }p)\\ m\equiv r\ (\textnormal{mod }q)\end{subarray}}\alpha_{m},

and βn,s\beta_{n^{\prime},s} is defined similarly. Note that the last sum above contains at most one term, so

m=1min(M,p1)|αm,r|2=m=1min(M,p1)mMmm(mod p)mr(mod q)|αm|2mMmr(mod q)|αm|2,\sum_{m^{\prime}=1}^{\min(M,p-1)}|\alpha_{m^{\prime},r}|^{2}=\sum_{m^{\prime}=1}^{\min(M,p-1)}\sum_{\begin{subarray}{c}m\leq M\\ m\equiv m^{\prime}\ (\textnormal{mod }p)\\ m\equiv r\ (\textnormal{mod }q)\end{subarray}}|\alpha_{m}|^{2}\leq\sum_{\begin{subarray}{c}m\leq M\\ m\equiv r\ (\textnormal{mod }q)\end{subarray}}|\alpha_{m}|^{2},

and similarly for n=1min(N,p1)|βn,s|2\sum_{n^{\prime}=1}^{\min(N,p-1)}|\beta_{n^{\prime},s}|^{2}. Applying Theorem˜3.4 (and the remark that follows it) for a bilinear sum with lengths min(M,p1)\min(M,p-1) and min(N,p1)\min(N,p-1), and using the monotonicity of the bound from Theorem˜3.4 in MM and NN, we obtain

m=1min(M,p1)n=1min(N,p1)αm,rβn,sS(aq¯2m,n;p)mMmr(mod q)|αm|2nNns(mod q)|βn|2\displaystyle\sum_{m^{\prime}=1}^{\min(M,p-1)}\ \sum_{n^{\prime}=1}^{\min(N,p-1)}\alpha_{m^{\prime},r}\beta_{n^{\prime},s}S(a\overline{q}^{2}m^{\prime},n^{\prime};p)\ll\sqrt{\sum_{\begin{subarray}{c}m\leq M\\ m\equiv r\ (\textnormal{mod }q)\end{subarray}}|\alpha_{m}|^{2}\sum_{\begin{subarray}{c}n\leq N\\ n\equiv s\ (\textnormal{mod }q)\end{subarray}}|\beta_{n}|^{2}}
×po(1)MNp(N12+(MN)316p1164).\displaystyle\times\ p^{o(1)}\sqrt{MNp}\left(N^{-\frac{1}{2}}+(MN)^{-\frac{3}{16}}p^{\frac{11}{64}}\right).

Plugging this into ˜7.5 and ˜7.4, and applying Cauchy–Schwarz to the sum over r,sr,s, we find that

m=1Mn=1N(m,n,c)=1αmβnS(am,n;c)co(1)q32αβMNp(N12+(MN)316p1164)\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll c^{o(1)}q^{\frac{3}{2}}\|\alpha\|\|\beta\|\sqrt{MNp}\left(N^{-\frac{1}{2}}+(MN)^{-\frac{3}{16}}p^{\frac{11}{64}}\right)
+αβco(1)MNq.\displaystyle+\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNq}.

Finally, recalling that pq=cpq=c and M,NcM,N\leq c (which imply MNqMcqq3/2Mp\sqrt{MNq}\leq\sqrt{Mcq}\leq q^{3/2}\sqrt{Mp}), the last term can be omitted, and we arrive at the desired bound. ∎

7.3. General moduli

Finally, we prove a generalization of Theorem˜1.1.

Theorem 7.4.

Let δ[0,124]\delta\in[0,\tfrac{1}{24}], c,M,N+c,M,N\in\mathbb{Z}_{+} with M,NcM,N\leq c. Let M~:=max(M,N)\tilde{M}:=\max(M,N) and N~:=min(M,N)\tilde{N}:=\min(M,N). Then for any complex sequences (αm)mM(\alpha_{m})_{m\leq M}, (βn)nN(\beta_{n})_{n\leq N} and any a(/c)×a\in(\mathbb{Z}/c\mathbb{Z})^{\times}, one has

\[
\begin{aligned}
\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)
&\ll\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\\
&\quad\times\left(\frac{c^{\frac{11+53\delta}{64}}}{(MN)^{\frac{3}{16}}}+\frac{c^{\frac{1-\delta}{6}}}{\tilde{N}^{\frac{1}{3}}}+\frac{c^{\frac{4-\delta}{12}}}{\tilde{M}^{\frac{1}{6}}\tilde{N}^{\frac{1}{2}}}+\frac{c^{\frac{11}{24}}}{(MN)^{\frac{1}{2}}}\right).
\end{aligned}
\tag{7.6}
\]

Moreover, if |αm|1|\alpha_{m}|\leq 1 for all mm (so αM\|\alpha\|\leq\sqrt{M}), then

\[
\begin{aligned}
\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)
&\ll\sqrt{M}\|\beta\|c^{o(1)}\sqrt{MNc}\\
&\quad\times\left(\frac{c^{\frac{11+53\delta}{64}}}{(MN)^{\frac{3}{16}}}+\frac{c^{\frac{1-\delta}{4}}}{\tilde{N}^{\frac{1}{2}}}+\frac{1}{c^{\frac{3}{16}}}+\frac{c^{\frac{1}{8}}}{\tilde{N}^{\frac{1}{3}}}+\frac{c^{\frac{11}{24}}}{(MN)^{\frac{1}{2}}}\right).
\end{aligned}
\tag{7.7}
\]
Remark.

The bound (7.7) actually holds without the assumption that $M,N\leq c$. Indeed, Theorem 7.4 covers the case $M,N\leq c$. If $M,N>c$, then applying the (first) bound from (1.2) for the sequences $(\alpha^{\prime}_{m^{\prime}})_{m^{\prime}\leq c}$, $(\beta^{\prime}_{n^{\prime}})_{n^{\prime}\leq c}$ given by $\alpha^{\prime}_{m^{\prime}}:=\sum_{m\equiv m^{\prime}\ (\textnormal{mod }c)}\alpha_{m}$ and $\beta^{\prime}_{n^{\prime}}:=\mathbbm{1}_{(n^{\prime},c)=1}\sum_{n\equiv n^{\prime}\ (\textnormal{mod }c)}\beta_{n}$ leads to the bound

m=1Mn=1N(n,c)=1αmβnS(am,n;c)αβc1+o(1)\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha^{\prime}\|\|\beta^{\prime}\|c^{1+o(1)} MccNcβc1+o(1)\displaystyle\ll\frac{M}{c}\sqrt{c}\sqrt{\frac{N}{c}}\|\beta\|c^{1+o(1)}
Mβco(1)MNcc3/16,\displaystyle\ll\sqrt{M}\|\beta\|c^{o(1)}\sqrt{MNc}\cdot c^{-3/16},

so ˜7.7 still holds. If Nc<MN\leq c<M, then applying the (first) trivial bound from ˜1.2 for the sequences (αm)mc(\alpha^{\prime}_{m^{\prime}})_{m^{\prime}\leq c}, (βn)nN(\beta^{\prime}_{n})_{n\leq N} given by αm:=mm(mod c)αm\alpha^{\prime}_{m^{\prime}}:=\sum_{m\equiv m^{\prime}\ (\textnormal{mod }c)}\alpha_{m} and βn:=βn𝟙(n,c)=1\beta^{\prime}_{n}:=\beta_{n}\mathbbm{1}_{(n,c)=1} leads to the bound

m=1Mn=1N(n,c)=1αmβnS(am,n;c)αβc1+o(1)\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha^{\prime}\|\|\beta^{\prime}\|c^{1+o(1)} Mccβc1+o(1)\displaystyle\ll\frac{M}{c}\sqrt{c}\|\beta\|c^{1+o(1)}
Mβco(1)MNcc1δ4N12,\displaystyle\ll\sqrt{M}\|\beta\|c^{o(1)}\sqrt{MNc}\cdot\frac{c^{\frac{1-\delta}{4}}}{N^{\frac{1}{2}}},

so ˜7.7 still holds. An analogous argument covers the remaining case Mc<NM\leq c<N.

Proof of Theorem˜7.4.

We begin by noting a quick consequence of Theorem˜7.1. Suppose cc has a factorization c=ddec=dd^{\prime}e with ddd^{\prime}\mid d and (d,e)=1(d,e)=1 such that dc1/2d\geq c^{1/2}. Then by combining Theorem˜7.1 (applied for M~\tilde{M}, N~\tilde{N} instead of M,NM,N) with the bounds fcdf\leq\sqrt{cd} and then dc1/2d\geq c^{1/2}, we get

m=1Mn=1N(m,n,c)=1αmβnS(am,n;c)\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,c)=1}\alpha_{m}\beta_{n}S(am,n;c) (7.8)
αβco(1)MNc(dN~2+c32d12M~N~3+c114M3N3)16\displaystyle\ll\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\left(\frac{d}{\tilde{N}^{2}}+\frac{c^{\frac{3}{2}}d^{\frac{1}{2}}}{\tilde{M}\tilde{N}^{3}}+\frac{c^{\frac{11}{4}}}{M^{3}N^{3}}\right)^{\frac{1}{6}}
αβco(1)MNc(d16N~13+c14d112M~16N~12+c1124(MN)12)=:(d).\displaystyle\ll\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\left(\frac{d^{\frac{1}{6}}}{\tilde{N}^{\frac{1}{3}}}+\frac{c^{\frac{1}{4}}d^{\frac{1}{12}}}{\tilde{M}^{\frac{1}{6}}\tilde{N}^{\frac{1}{2}}}+\frac{c^{\frac{11}{24}}}{(MN)^{\frac{1}{2}}}\right)=:\mathcal{B}(d).

Note that this bound (d)\mathcal{B}(d) is increasing with d[c1/2,c]d\in[c^{1/2},c], and that the right-hand side of ˜7.6 supersedes (c1δ)\mathcal{B}(c^{1-\delta}). In particular,

(c3/4)\displaystyle\mathcal{B}(c^{3/4}) =αβco(1)MNc(c18N~13+c516M~16N~12+c1124(MN)12).\displaystyle=\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\left(\frac{c^{\frac{1}{8}}}{\tilde{N}^{\frac{1}{3}}}+\frac{c^{\frac{5}{16}}}{\tilde{M}^{\frac{1}{6}}\tilde{N}^{\frac{1}{2}}}+\frac{c^{\frac{11}{24}}}{(MN)^{\frac{1}{2}}}\right).

A quick computation shows that since δ124\delta\leq\tfrac{1}{24}, one has

\[
\frac{c^{\frac{5}{16}}}{\tilde{M}^{\frac{1}{6}}\tilde{N}^{\frac{1}{2}}}\leq\max\left(\frac{c^{\frac{11}{24}}}{(\tilde{M}\tilde{N})^{\frac{1}{2}}},\frac{c^{\frac{1-\delta}{4}}}{\tilde{N}^{\frac{1}{2}}}\right).
\]
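The quick computation behind this inequality can be reproduced on the level of exponents. In the sketch below (our own parametrization, not from the text) we write $\tilde{M}=c^{\mu}$ with $\mu\in[0,1]$; the common factor $\tilde{N}^{-1/2}$ cancels, so the inequality reduces to $\tfrac{5}{16}-\tfrac{\mu}{6}\leq\max\big(\tfrac{11}{24}-\tfrac{\mu}{2},\tfrac{1-\delta}{4}\big)$.

```python
from fractions import Fraction as F


def inequality_holds(mu, delta):
    """Compare exponents of c after writing M~ = c^mu (an assumed
    parametrization) and cancelling the common factor N~^(-1/2)."""
    lhs = F(5, 16) - mu / 6
    rhs = max(F(11, 24) - mu / 2, (1 - delta) / 4)
    return lhs <= rhs


def holds_on_grid(delta, steps=96):
    """Check the inequality for mu = 0, 1/steps, ..., 1."""
    return all(inequality_holds(F(i, steps), delta) for i in range(steps + 1))
```

The two branches of the maximum cross the left-hand side at $\mu=\tfrac{7}{16}$ and $\mu=\tfrac{3}{8}+\tfrac{3\delta}{2}$, which is where the constraint $\delta\leq\tfrac{1}{24}$ enters; for example, the check fails at $\delta=\tfrac{1}{12}$, $\mu=\tfrac{15}{32}$.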

Therefore, the right-hand side of ˜7.7 supersedes (c3/4)\mathcal{B}(c^{3/4}). We now split into cases depending on the factorization of the modulus cc.

Case 1: cc is divisible by a maximal prime power pkc1δp^{k}\geq c^{1-\delta}. Then let us write c=pkqc=p^{k}q, where qq is not necessarily a prime, but qcδq\leq c^{\delta} and (p,q)=1(p,q)=1.

Subcase 1.1: One has k=1k=1. Then we can apply Corollary˜7.3 (with M,NM,N replaced by M~,N~\tilde{M},\tilde{N}), which gives the bound

m=1Mn=1N(m,n,c)=1αmβnS(am,n;c)αβco(1)MNc(N~12cδ+(MN)316c11+53δ64).\displaystyle\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(m,n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\|\alpha\|\|\beta\|c^{o(1)}\sqrt{MNc}\left(\tilde{N}^{-\frac{1}{2}}c^{\delta}+(MN)^{-\frac{3}{16}}c^{\frac{11+53\delta}{64}}\right).

The first term here is superseded by the third term in ˜7.6 and the second term in ˜7.7, since M~c\tilde{M}\leq c and δ124\delta\leq\tfrac{1}{24}. The second term here appears directly in both ˜7.6 and 7.7.

Subcase 1.2: One has k2k\geq 2. Then we let d:=pk/2qd:=p^{\left\lceil k/2\right\rceil}q, d:=pk/2d^{\prime}:=p^{\left\lfloor k/2\right\rfloor}, and e:=1e:=1, which gives a valid decomposition c=ddec=dd^{\prime}e to use in our Theorem˜7.1. Moreover, since k2k\geq 2, we have k/22k/3\left\lceil k/2\right\rceil\leq 2k/3, so dp2k/3q=c2/3q1/3d\leq p^{2k/3}q=c^{2/3}q^{1/3}, and thus

d[c1/2,c(2+δ)/3].d\in[c^{1/2},c^{(2+\delta)/3}].

From ˜7.8 we thus obtain an upper bound of (c(2+δ)/3)\mathcal{B}(c^{(2+\delta)/3}), which is acceptable in both ˜7.6 and 7.7 since 2+δ334\tfrac{2+\delta}{3}\leq\tfrac{3}{4}.

Case 2: All prime powers pkcp^{k}\mid c have pk<c1δp^{k}<c^{1-\delta}.

Subcase 2.1: All prime powers pkcp^{k}\mid c have pk<c1/2p^{k}<c^{1/2}. Then we set d:=1d^{\prime}:=1, and construct d,ed,e by a greedy algorithm. Initially, we take d=e:=1d=e:=1. For each prime power pkcp^{k}\|c, we append pkp^{k} to the smaller of dd and ee. Note that throughout this process, dd and ee cannot differ by a factor larger than c1/2c^{1/2}. In the end, if d<ed<e, we swap dd and ee. This produces a factorization c=dec=de with (d,e)=1(d,e)=1 and

d[c1/2,c3/4].d\in[c^{1/2},c^{3/4}].

Then ˜7.8 gives a bound of (c3/4)\mathcal{B}(c^{3/4}), which is acceptable in both ˜7.6 and 7.7.
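The greedy construction in Subcase 2.1 is easy to simulate. The following is a minimal sketch with hypothetical helper names; it processes the prime powers in increasing order of the prime (the argument in the text does not depend on the order) and uses integer comparisons only.

```python
from math import gcd


def prime_power_factors(c):
    """Return the exact prime powers p^k || c, in increasing order of p."""
    out, p = [], 2
    while p * p <= c:
        if c % p == 0:
            pk = 1
            while c % p == 0:
                c //= p
                pk *= p
            out.append(pk)
        p += 1
    if c > 1:
        out.append(c)
    return out


def greedy_split(c):
    """Split c = d * e with (d, e) = 1 by appending each prime power
    to the currently smaller factor, then ordering so that d >= e."""
    d, e = 1, 1
    for pk in prime_power_factors(c):
        if d <= e:
            d *= pk
        else:
            e *= pk
    if d < e:
        d, e = e, d
    return d, e


def all_prime_powers_small(c):
    """Hypothesis of Subcase 2.1: every p^k || c satisfies p^k < c^(1/2)."""
    return all(pk * pk < c for pk in prime_power_factors(c))
```

For example, `greedy_split(30)` returns `(10, 3)`, and indeed $30^{1/2}\leq 10\leq 30^{3/4}$.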

Subcase 2.2: The largest prime power dividing cc is some pk[c1/2,c3/4)p^{k}\in[c^{1/2},c^{3/4}). Then we let d:=pkd:=p^{k}, d:=1d^{\prime}:=1, and e:=cpke:=cp^{-k}, and ˜7.8 gives an acceptable bound of (c3/4)\mathcal{B}(c^{3/4}) once again.

Subcase 2.3: The largest prime power dividing $c$ is some $p^{k}\in[c^{3/4},c^{1-\delta})$. Taking $d:=p^{k}$, $d^{\prime}:=1$, and $e:=cp^{-k}$ as in Subcase 2.2, (7.8) gives a bound of $\mathcal{B}(p^{k})\leq\mathcal{B}(c^{1-\delta})$, which is acceptable in (7.6); this completes the proof of (7.6).

Now assume (still within Subcase 2.3) that |αm|1|\alpha_{m}|\leq 1 for all mm, and we aim to establish ˜7.7.

  • If p=2p=2, then writing c=2kqc=2^{k}q, we can factorize c=ddec=dd^{\prime}e with d:=2k/2qd:=2^{\left\lceil k/2\right\rceil}q, d:=2k/2d^{\prime}:=2^{\left\lfloor k/2\right\rfloor}, and e:=1e:=1. Here d2kq2=cqc3/4d\ll\sqrt{2^{k}q^{2}}=\sqrt{cq}\leq c^{3/4}, since 2kc1/22^{k}\geq c^{1/2} implies qc1/2q\leq c^{1/2}. But then ˜7.8 gives an acceptable bound of (c3/4)\mathcal{B}(c^{3/4}).

  • If p>2p>2, then we can use d:=pkd:=p^{k} in Theorem˜3.5. Since d[c3/4,c1δ)d\in[c^{3/4},c^{1-\delta}), this gives the bound

    m=1Mn=1N(n,c)=1αmβnS(am,n;c)Mβco(1)MNc(c18M12+1c316+c1δ4N12).\mathop{\sum_{m=1}^{M}\sum_{n=1}^{N}}_{(n,c)=1}\alpha_{m}\beta_{n}S(am,n;c)\ll\sqrt{M}\|\beta\|c^{o(1)}\sqrt{MNc}\left(\frac{c^{\frac{1}{8}}}{M^{\frac{1}{2}}}+\frac{1}{c^{\frac{3}{16}}}+\frac{c^{\frac{1-\delta}{4}}}{N^{\frac{1}{2}}}\right).

    Since M,NN~M,N\geq\tilde{N} and 181δ4\tfrac{1}{8}\leq\tfrac{1-\delta}{4}, the first and the last terms in the parenthesis above are superseded by the term c(1δ)/4N~1/2c^{(1-\delta)/4}\tilde{N}^{-1/2} from ˜7.7. The second term appears directly in ˜7.7.

This covers all cases. ∎

Proof of Theorem˜1.1.

As before, we can assume without loss of generality that M,NcM,N\leq c since the result is trivial when c=O(1)c=O(1). Then the result follows by applying Theorem˜7.4 with M,Nc1/2+o(1)M,N\ll c^{1/2+o(1)}, and using the optimal choices δ=3175\delta=\frac{3}{175} in ˜7.6, respectively δ=169\delta=\frac{1}{69} in ˜7.7. ∎
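The choices of $\delta$ can be recovered by balancing exponents. The sketch below makes two assumptions of our own: it sets $M=N=c^{1/2}$ exactly, and it records each term inside the parentheses of (7.6) and (7.7) as $c^{a+b\delta}$ via the pair $(a,b)$. Since a maximum of linear functions is convex, the minimizer over $\delta\in[0,\tfrac{1}{24}]$ is an endpoint or a crossing point of two terms.

```python
from fractions import Fraction as F

# Exponents (a, b) of c^(a + b*delta) for each term, with M = N = c^(1/2).
TERMS_7_6 = [(F(-1, 64), F(53, 64)), (F(0), F(-1, 6)),
             (F(0), F(-1, 12)), (F(-1, 24), F(0))]
TERMS_7_7 = [(F(-1, 64), F(53, 64)), (F(0), F(-1, 4)), (F(-3, 16), F(0)),
             (F(-1, 24), F(0)), (F(-1, 24), F(0))]


def worst_exponent(terms, delta):
    """Largest exponent of c among the terms, for a given delta."""
    return max(a + b * delta for a, b in terms)


def best_delta(terms, upper=F(1, 24)):
    """Minimize the worst exponent over delta in [0, upper]."""
    candidates = {F(0), upper}
    for i, (a1, b1) in enumerate(terms):
        for a2, b2 in terms[i + 1:]:
            if b1 != b2:
                delta = (a2 - a1) / (b1 - b2)  # crossing of the two lines
                if 0 <= delta <= upper:
                    candidates.add(delta)
    return min(candidates, key=lambda dl: worst_exponent(terms, dl))
```

Under these assumptions, `worst_exponent(TERMS_7_6, best_delta(TERMS_7_6))` gives the exponent of the power saving over $\sqrt{MNc}$ in (7.6), and similarly for (7.7); these values should be compared against the statement of Theorem 1.1.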

7.4. Averaging over moduli

Finally, let us prove a generalization of Corollary˜1.4.

Corollary 7.5.

Let q=ddeq=dd^{\prime}e for some d,d,e+d,d^{\prime},e\in\mathbb{Z}_{+} with ddd^{\prime}\mid d and (d,e)=1(d,e)=1, and fqdf\leq\sqrt{qd} be the largest integer with f2qdf^{2}\mid qd. Let C12C\geq\tfrac{1}{2} and ,𝒥\mathcal{I},\mathcal{J}\subset\mathbb{Z} be intervals of lengths ||=M|\mathcal{I}|=M, |𝒥|=N|\mathcal{J}|=N, with 1NMC1\leq N\leq M\leq C. Let (αm)m,(βn)n𝒥(\alpha_{m})_{m\in\mathcal{I}},(\beta_{n})_{n\in\mathcal{J}} be complex sequences, and for each cCc\sim C, let (αm(c))m(\alpha_{m}(c))_{m\in\mathcal{I}}, (βn(c))n𝒥(\beta_{n}(c))_{n\in\mathcal{J}} be such that |αm(c)||αm||\alpha_{m}(c)|\leq|\alpha_{m}|, |βn(c)||βn||\beta_{n}(c)|\leq|\beta_{n}| for all m,n𝒥m\in\mathcal{I},n\in\mathcal{J}. Then one has

\[
\sum_{\begin{subarray}{c}c\sim C\\ q\mid c\end{subarray}}\left|\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,q)=1\end{subarray}}\alpha_{m}(c)\beta_{n}(c)S(m,n;c)\right|\ll\|\alpha\|\|\beta\|\frac{C^{2+o(1)}}{q}\min\left\{\left(\frac{dM^{3}N}{C^{3}}+\frac{fM^{2}}{C^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}},\ \left(\frac{dM^{3}N}{qC^{2}}+\frac{fM^{2}}{qC}+\frac{fq}{d^{2}C}\right)^{\frac{1}{6}}\right\}.
\]
Proof of Corollary˜7.5.

Throughout this proof, we will use the notation

fa:=maxf~2af~,αa:=mam|αm|2,βa:=n𝒥an|βn|2,f_{a}:=\max_{\tilde{f}^{2}\mid a}\tilde{f},\qquad\qquad\|\alpha_{a*}\|:=\sqrt{\sum_{\begin{subarray}{c}m\in\mathcal{I}\\ a\mid m\end{subarray}}|\alpha_{m}|^{2}},\qquad\qquad\|\beta_{a*}\|:=\sqrt{\sum_{\begin{subarray}{c}n\in\mathcal{J}\\ a\mid n\end{subarray}}|\beta_{n}|^{2}},

for any $a\in\mathbb{Z}_{+}$. In particular, the integer $f$ from the statement of Corollary 7.5 is $f=f_{qd}$. Note that $f_{a}\mid f_{ab}$ and $f_{a^{2}b}=af_{b}$ for any $a,b\in\mathbb{Z}_{+}$, and that $f_{ab}=f_{a}f_{b}$ when $(a,b)=1$.

We can of course assume without loss of generality that CqC\gg q, since otherwise the sum over cc is empty. For each cCc\sim C with qcq\mid c, we consider the sum

\[
\begin{aligned}
\mathcal{S}(c)&:=\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,q)=1\end{subarray}}\alpha_{m}(c)\beta_{n}(c)S(m,n;c)\\
&=\sum_{\begin{subarray}{c}g\mid c\\ (g,q)=1\end{subarray}}\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=g\end{subarray}}\alpha_{m}(c)\beta_{n}(c)S(m,n;c)=\sum_{\begin{subarray}{c}g\mid c\\ (g,q)=1\end{subarray}}\frac{\phi(c)}{\phi(c/g)}\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ (m,n,c)=g\end{subarray}}\alpha_{m}(c)\beta_{n}(c)S(\tfrac{m}{g},\tfrac{n}{g};\tfrac{c}{g}),
\end{aligned}
\]

where the last equality follows from the identity S(m,n;c)=ϕ(c)ϕ(c/g)S(mg,ng;cg)S(m,n;c)=\tfrac{\phi(c)}{\phi(c/g)}S(\tfrac{m}{g},\tfrac{n}{g};\tfrac{c}{g}). From the triangle inequality and the bound ϕ(c)ϕ(c/g)g\tfrac{\phi(c)}{\phi(c/g)}\leq g, we find that

cCqc|𝒮(c)|g2C/q(g,q)=1gcCgqc|𝒮(c;g)|,\sum_{\begin{subarray}{c}c\sim C\\ q\mid c\end{subarray}}|\mathcal{S}(c)|\leq\sum_{\begin{subarray}{c}g\leq 2C/q\\ (g,q)=1\end{subarray}}g\sum_{\begin{subarray}{c}c\sim C\\ gq\mid c\end{subarray}}|\mathcal{S}(c;g)|, (7.9)

where

𝒮(c;g):=m,n𝒥g(m,n)(mg,ng,cg)=1αm(c)βn(c)S(mg,ng;cg).\mathcal{S}(c;g):=\mathop{\sum\sum}_{\begin{subarray}{c}m\in\mathcal{I},n\in\mathcal{J}\\ g\mid(m,n)\\ (\frac{m}{g},\frac{n}{g},\frac{c}{g})=1\end{subarray}}\alpha_{m}(c)\beta_{n}(c)S(\tfrac{m}{g},\tfrac{n}{g};\tfrac{c}{g}).

We aim to apply Theorem˜7.1 (with M,N,cMg,Ng,cgM,N,c\leftarrow\tfrac{M}{g},\tfrac{N}{g},\tfrac{c}{g}) to bound each sum 𝒮(c;g)\mathcal{S}(c;g), and this requires a suitable factorization of the modulus cg\tfrac{c}{g}. There are two ways to construct this from the assumed factorization q=ddeq=dd^{\prime}e, which correspond to placing ‘most’ of the factor cgq\tfrac{c}{gq} into ee or into dd.

Method 1. For each $c\sim C$ with $gq\mid c$, consider the factorization

\[
\frac{c}{g}=:c'q=\tilde{d}d'\tilde{e},\qquad\tilde{d}:=(c',d^{\infty})d,\qquad\tilde{e}:=\frac{ec'}{(c',d^{\infty})},
\]

which has $d'\mid\tilde{d}$ and $(\tilde{d},\tilde{e})=1$. We find that

\[
f_{(c/g)\tilde{d}}^{2}\mid c'q(c',q^{\infty})d=(c',q^{\infty})^{2}\frac{c'}{(c',q^{\infty})}qd\qquad\Rightarrow\qquad f_{(c/g)\tilde{d}}\leq(c',q^{\infty})f_{c'}f_{qd}=(c',q^{\infty})f_{c'}f,
\]

so Theorem 7.1 gives

\[
\begin{aligned}
\mathcal{S}(c;g)=\mathcal{S}(c'gq;g) &\ll\|\alpha_{g*}\|\|\beta_{g*}\|\frac{c^{1+o(1)}}{g}\left(\frac{\tilde{d}M^{3}N}{c^{3}}+\frac{f_{(c/g)\tilde{d}}M^{2}}{c^{2}}+\frac{f_{(c/g)\tilde{d}}}{\tilde{d}^{2}}\right)^{\frac{1}{6}}g^{\frac{1}{6}}\\
&\ll\|\alpha_{g*}\|\|\beta_{g*}\|C^{1+o(1)}\left(\frac{dM^{3}N}{C^{3}}+\frac{fM^{2}}{C^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}}\left((c',q^{\infty})f_{c'}\right)^{\frac{1}{6}}.
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
\sum_{\substack{c\sim C\\ gq\mid c}}|\mathcal{S}(c;g)| &=\sum_{c'\sim\frac{C}{gq}}|\mathcal{S}(c;g)|\\
&\ll\|\alpha_{g*}\|\|\beta_{g*}\|C^{1+o(1)}\left(\frac{dM^{3}N}{C^{3}}+\frac{fM^{2}}{C^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}}\sum_{c'\sim\frac{C}{gq}}\left((c',q^{\infty})f_{c'}\right)^{\frac{1}{6}}.
\end{aligned}
\]

After applying Cauchy–Schwarz to the last sum, it remains to bound the sums $\sum_{c'\sim C/(gq)}(c',q^{\infty})$ and $\sum_{c'\sim C/(gq)}f_{c'}$, both of which are $O(\tfrac{C^{1+o(1)}}{gq})$. In particular, for the second sum, we can write

\[
\sum_{c'\sim\frac{C}{gq}}f_{c'}\leq\sum_{f\ll\sqrt{\frac{C}{gq}}}f\sum_{c'\sim\frac{C}{gq}}\mathbbm{1}_{f^{2}\mid c'}\ll\sum_{f\leq\sqrt{\frac{C}{gq}}}\frac{C}{gqf}\ll\frac{C^{1+o(1)}}{gq}.
\]

From this and (7.9), we conclude that

\[
\sum_{\substack{c\sim C\\ q\mid c}}|\mathcal{S}(c)|\ll\frac{C^{2+o(1)}}{q}\left(\frac{dM^{3}N}{C^{3}}+\frac{fM^{2}}{C^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}}\sum_{\substack{g\leq 2C/q\\ (g,q)=1}}\|\alpha_{g*}\|\|\beta_{g*}\|.
\]

Finally, the last sum is easily bounded by $C^{o(1)}\|\alpha\|\|\beta\|$ using Cauchy–Schwarz and the divisor bound. This establishes the bound from Corollary 7.5 with the first term from the minimum.

Method 2. For each $c\sim C$ with $gq\mid c$, consider the factorization

\[
\frac{c}{g}=:c'q=\tilde{d}d'\tilde{e},\qquad\tilde{d}:=\frac{c'd}{(c',e^{\infty})},\qquad\tilde{e}:=e(c',e^{\infty}),
\]

which satisfies $d'\mid\tilde{d}$ and $(\tilde{d},\tilde{e})=1$. We find that

\[
f_{(c/g)\tilde{d}}^{2}\mid(c')^{2}qd\qquad\Rightarrow\qquad f_{(c/g)\tilde{d}}\leq c'f_{qd}\ll\frac{fC}{q},
\]

so Theorem 7.1 gives

\[
\mathcal{S}(c;g)=\mathcal{S}(c'gq;g)\ll\|\alpha_{g*}\|\|\beta_{g*}\|C^{1+o(1)}\left(\frac{dM^{3}N}{qC^{2}}+\frac{fM^{2}}{qC}+\frac{fq}{d^{2}C}\right)^{\frac{1}{6}}(c',e^{\infty})^{\frac{1}{3}}.
\]

The second bound from Corollary 7.5 now follows similarly as before from (7.9), since the sum over $c'\sim\tfrac{C}{gq}$ ‘washes out’ the factor $(c',e^{\infty})$. ∎

Proof of Corollary 1.4.

This follows from Corollary 7.5 analogously to how Theorem 1.2 follows from Theorem 7.1. ∎

8. Moments of twisted modular $L$-functions

Here we prove Theorem 1.5, by inserting our bounds for bilinear forms with Kloosterman sums into the proofs from [3]. We begin by restating (7.7) in a shape more similar to [3, Theorem 5].

Corollary 8.1.

Let $r,q\in\mathbb{Z}_{+}$ with $r\mid q$. Let $K,M\geq 1$, $\tilde{K}:=\max(K,M)$, $\tilde{M}:=\min(K,M)$, and $(\lambda_{k})_{K\leq k\leq 2K}$ be a sequence with $|\lambda_{k}|\leq 1$ for all $k$. Then one has

\[
\begin{aligned}
\sum_{\substack{M\leq m\leq 2M\\ (m,q)=1}}\left|\sum_{K\leq k\leq 2K}\lambda_{k}S(k,m;r)\right|^{2} &\ll(qKM)^{o(1)}K^{2}Mr\\
&\quad\times\left(\frac{r^{\frac{11+53\delta}{32}}}{(KM)^{\frac{3}{8}}}+\frac{r^{\frac{1-\delta}{2}}}{\tilde{M}}+\frac{1}{r^{\frac{3}{8}}}+\frac{r^{\frac{1}{4}}}{\tilde{M}^{\frac{2}{3}}}+\frac{r^{\frac{11}{12}}}{KM}\right).
\end{aligned}
\]

Proof.

One can of course assume without loss of generality that $M,K\in\mathbb{Z}_{+}$, and extend the sum over $m$ to include all $m\in[M,2M]$ with $(m,r)=1$. By duality, it suffices to establish the bound

\[
\begin{aligned}
\sum_{\substack{M\leq m\leq 2M\\ (m,r)=1}}\beta_{m}\sum_{K\leq k\leq 2K}\lambda_{k}S(k,m;r) &\ll(qKM)^{o(1)}\|\beta\|K\sqrt{Mr}\\
&\quad\times\left(\frac{r^{\frac{11+53\delta}{64}}}{(KM)^{\frac{3}{16}}}+\frac{r^{\frac{1-\delta}{4}}}{\tilde{M}^{\frac{1}{2}}}+\frac{1}{r^{\frac{3}{16}}}+\frac{r^{\frac{1}{8}}}{\tilde{M}^{\frac{1}{3}}}+\frac{r^{\frac{11}{24}}}{(KM)^{\frac{1}{2}}}\right),
\end{aligned}
\]

for any sequence $(\beta_{m})_{M\leq m\leq 2M}$. But this is precisely the content of (7.7) with $(M,N,c)$ replaced by $(K,M,r)$; the remark after Theorem 7.4 allows us to ignore the constraint $K,M\leq c$. ∎

We can now prove an analogue of [3, Proposition 7]. We use the same normalization as in [3, (2.3)] for the Hecke eigenvalues $\lambda_{f}(n)$ of a holomorphic cuspidal newform $f$ for $\textnormal{SL}_{2}(\mathbb{Z})$, so that

\[
\lambda_{f}(n)\rho_{f}(1)=\sqrt{n}\rho_{f}(n),\qquad\text{where}\qquad f(z)=\sum_{n=1}^{\infty}\rho_{f}(n)(4\pi n)^{k/2}e(nz). \tag{8.1}
\]

In particular, the Deligne bound [8] reads

\[
\lambda_{f}(n)\ll n^{o(1)}. \tag{8.2}
\]

Proposition 8.2.

Let $\varepsilon>0$, $q,d\in\mathbb{Z}_{+}$ with $d\mid q$, $\frac{1}{20}N\geq M\geq 1$ with $MN\leq q^{2+\varepsilon}$, and let $\lambda_{1}(m)$, $\lambda_{2}(n)$ be the Hecke eigenvalues of two (fixed) holomorphic cuspidal newforms for $\textnormal{SL}_{2}(\mathbb{Z})$. Let $V_{1},V_{2}:\mathbb{R}\to\mathbb{C}$ be functions supported in $[1,2]$ with derivatives $V_{i}^{(j)}\ll_{j,\varepsilon}q^{\varepsilon}$, and denote

\[
S_{N,M,d,q}:=\frac{d}{(NM)^{1/2}}\sum_{r=1}^{2N/d}\sum_{\substack{n\equiv m\ (\textnormal{mod }d)\\ (nm,q)=1\\ n\neq m}}\lambda_{1}(m)\lambda_{2}(n)V_{1}\left(\frac{m}{M}\right)V_{2}\left(\frac{n}{N}\right). \tag{8.3}
\]

Then for any $\delta\in[0,\tfrac{1}{24}]$, one has

\[
S_{N,M,d,q}\ll_{\varepsilon}q^{O(\varepsilon)}\left(\frac{M^{\frac{5}{16}}q^{\frac{83+53\delta}{64}}}{N^{\frac{5}{16}}}+\frac{q^{\frac{7-\delta}{4}}}{\sqrt{N}}+\frac{M^{\frac{1}{2}}q^{\frac{21}{16}}}{N^{\frac{1}{2}}}+\frac{q^{\frac{13}{8}}M^{\frac{1}{6}}}{N^{\frac{1}{2}}}+q^{\frac{23}{24}}\right). \tag{8.4}
\]

Proof.

We closely follow the proof in [3, §4]. In particular, we decompose $q=q_{d}q'$ where $q'$ is maximal with $(q',d)=1$. The bound [3, (4.2)] reads

\[
S_{N,M,d,q}\ll\sqrt{N}\sum_{\substack{g\mid f\mid q'\\ r\mid d}}\frac{\mu^{2}(f)|\lambda_{2}(f/g)|}{fgr}\left(\sum_{\substack{m\asymp M\\ (m,q)=1}}\Big|\sum_{n}S(\overline{fg}m,n;r)\lambda_{2}(n)V_{2}^{\circ}\left(\frac{nN}{fgr^{2}}\right)\Big|^{2}\right)^{1/2},
\]

where $V_{2}^{\circ}$ is a transform of $V_{2}$ as in [3, (2.10)] (coming from an application of the Voronoi summation formula). Using the rapid decay of $V_{2}^{\circ}$, we may truncate the sum over $n$ at

\[
n\leq K_{f,g,r}:=q^{\varepsilon}\frac{fgr^{2}}{N},
\]

up to an acceptable loss. Note that the resulting sum over $n$ vanishes unless $K_{f,g,r}\geq 1$. From Corollary 8.1, (8.2), and the divisor bound, we conclude that

\[
\begin{aligned}
S_{N,M,d,q}\ll_{\varepsilon}q^{O(\varepsilon)}\sqrt{N}\max_{\substack{g\mid f\mid q'\\ r\mid d}}\frac{1}{fgr}K_{f,g,r}\sqrt{Mr}\Bigg(&\frac{r^{\frac{11+53\delta}{64}}}{(K_{f,g,r}M)^{\frac{3}{16}}}+\frac{r^{\frac{1-\delta}{4}}}{\min(K_{f,g,r},M)^{\frac{1}{2}}}+\frac{1}{r^{\frac{3}{16}}}\\
&+\frac{r^{\frac{1}{8}}}{\min(K_{f,g,r},M)^{\frac{1}{3}}}+\frac{r^{\frac{11}{24}}}{(K_{f,g,r}M)^{\frac{1}{2}}}\Bigg).
\end{aligned}
\]

Plugging in the definition of $K_{f,g,r}$, we see that the expression inside the maximum is non-decreasing in $r$ and non-increasing in $f,g$. Writing

\[
K:=K_{1,1,q}=\frac{q^{2+\varepsilon}}{N}\geq M,
\]

we find that

\[
\begin{aligned}
S_{N,M,d,q} &\ll_{\varepsilon}q^{O(\varepsilon)}\sqrt{N}\frac{1}{q}K\sqrt{Mq}\left(\frac{q^{\frac{11+53\delta}{64}}}{(KM)^{\frac{3}{16}}}+\frac{q^{\frac{1-\delta}{4}}}{M^{\frac{1}{2}}}+\frac{1}{q^{\frac{3}{16}}}+\frac{q^{\frac{1}{8}}}{M^{\frac{1}{3}}}+\frac{q^{\frac{11}{24}}}{(KM)^{\frac{1}{2}}}\right)\\
&\ll_{\varepsilon}q^{O(\varepsilon)}\frac{q^{\frac{3}{2}}M^{\frac{1}{2}}}{N^{\frac{1}{2}}}\left(\frac{q^{\frac{11+53\delta}{64}}}{(q^{2}M/N)^{\frac{3}{16}}}+\frac{q^{\frac{1-\delta}{4}}}{M^{\frac{1}{2}}}+\frac{1}{q^{\frac{3}{16}}}+\frac{q^{\frac{1}{8}}}{M^{\frac{1}{3}}}+\frac{q^{\frac{11}{24}}}{(q^{2}M/N)^{\frac{1}{2}}}\right),
\end{aligned}
\]

which reduces to the desired bound. ∎

We can now prove the desired asymptotic for twisted moments of modular $L$-functions.

Proof of Theorem 1.5.

Let $\varepsilon>0$ and $\gamma:=\tfrac{1}{674}$. We closely follow the proof in [3, §3], making no changes to the main term analysis from [3, §3.1]. Treating the off-diagonal term as in [3, §3.2], it remains to establish the bound

\[
S_{N,M,d,q}\stackrel{?}{\ll_{\varepsilon}}q^{1-\gamma+O(\varepsilon)}, \tag{8.5}
\]

for all $d\mid q$ and $N\geq M\geq 1$ with $MN\leq q^{2+\varepsilon}$, using the notation from (8.3). As in [3, §3.3], we can easily discount the contribution of the range $M\leq N<20M$ using [3, (3.12)], so let us assume that $N\geq 20M$. We will rely on the bounds

\[
\begin{aligned}
S_{N,M,d,q} &\ll_{\varepsilon}q^{O(\varepsilon)}(MN)^{\frac{1}{2}}, &\text{(8.6)}\\
S_{N,M,d,q} &\ll_{\varepsilon}q^{O(\varepsilon)}\left(\frac{(Nq)^{\frac{1}{2}}}{M^{\frac{1}{2}}}+\frac{N^{\frac{3}{4}}}{M^{\frac{1}{4}}}+\frac{N^{\frac{1}{4}}q^{\frac{3}{4}}}{M^{\frac{1}{4}}}+N^{\frac{1}{2}}q^{\frac{1}{4}}\right), &\text{(8.7)}
\end{aligned}
\]

from [3, (3.6) and (3.11)], as well as on our Proposition 8.2 (instead of [3, Proposition 7]). First, the trivial bound (8.6) establishes (8.5) unless

\[
M>\frac{q^{2-2\gamma}}{N}, \tag{8.8}
\]

so let us assume that we are in this range. We now split into cases depending on the size of $N$.

Case 1. One has $N\leq q^{3/2-3\gamma}$. Then by plugging (8.8) into (8.7), we obtain (8.5).

Case 2. One has $N\in(q^{3/2-3\gamma},q^{3/2-2\gamma}]$. Then by plugging (8.8) into (8.7), we find that

\[
S_{N,M,d,q}\ll_{\varepsilon}q^{O(\varepsilon)}\left(q^{1-\gamma}+\frac{N^{\frac{1}{4}}q^{\frac{3}{4}}}{M^{\frac{1}{4}}}\right),
\]

which is acceptable in (8.5) unless

\[
\frac{N}{M}>q^{1-4\gamma}.
\]

Plugging this and $N>q^{3/2-3\gamma}$ into (8.4), we find that

\[
S_{N,M,d,q}\ll_{\varepsilon}q^{O(\varepsilon)}\left(q^{\frac{63+53\delta}{64}+\frac{5\gamma}{4}}+q^{1-\frac{\delta}{4}+\frac{3\gamma}{2}}+q^{\frac{23}{24}+2\gamma}\right), \tag{8.9}
\]

which is acceptable in (8.5) provided that

\[
10\gamma\leq\delta\leq\frac{1-144\gamma}{53}.
\]

This is precisely attained for our choice of $\gamma=\frac{1}{674}$ by taking $\delta:=\frac{10}{674}$ in Proposition 8.2.
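The arithmetic behind this choice of parameters can be double-checked in exact rational arithmetic. The following Python sketch (our own illustration; the variable names are not from the paper) computes the three exponents of $q$ in (8.9), ignoring the factor $q^{O(\varepsilon)}$, and compares them with the target exponent $1-\gamma$.

```python
from fractions import Fraction

gamma = Fraction(1, 674)
delta = Fraction(10, 674)

# Exponents of q in the three terms of (8.9), ignoring q^{O(eps)}.
exponents = [
    Fraction(63, 64) + Fraction(53, 64) * delta + Fraction(5, 4) * gamma,
    1 - delta / 4 + Fraction(3, 2) * gamma,
    Fraction(23, 24) + 2 * gamma,
]

# Target exponent in (8.5).
target = 1 - gamma

# Admissible window for delta stated after (8.9).
lower = 10 * gamma
upper = (1 - 144 * gamma) / 53

all_acceptable = all(e <= target for e in exponents)
```

With these values, the first two exponents equal $1-\gamma$ exactly and the third is strictly smaller, which is why the window for $\delta$ degenerates to the single point $\delta=\tfrac{10}{674}$; this value also lies in the range $[0,\tfrac{1}{24}]$ allowed by Proposition 8.2.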

Case 3. One has $N\in(q^{3/2-2\gamma},q^{3/2+\gamma})$. Then (8.7) is useless because of the last term. We plug in $M\leq q^{2+\varepsilon}/N$ and then $N\geq q^{3/2-2\gamma}$ into (8.4) to find that

\[
S_{N,M,d,q}\ll_{\varepsilon}q^{O(\varepsilon)}\left(q^{\frac{63+53\delta}{64}+\frac{5\gamma}{4}}+q^{1-\frac{\delta}{4}+\gamma}+q^{\frac{23}{24}+2\gamma}\right),
\]

which is a stronger bound than (8.9). This completes our proof. ∎

9. Large sieve for exceptional cusp forms

Here we prove a generalization of Corollary 1.6, which requires some background from the spectral theory of automorphic forms. We recall [11] that for $q\in\mathbb{Z}_{+}$, the congruence subgroup $\Gamma_{0}(q)$ consists of those matrices in $\textnormal{SL}_{2}(\mathbb{Z})$ with bottom-left entries divisible by $q$. Each cusp $\mathfrak{a}$ of the fundamental domain $\Gamma_{0}(q)\backslash\mathbb{H}$ is equivalent to a fraction of the form $\tfrac{u}{w}$, where $u,w\in\mathbb{Z}_{+}$, $w\mid q$, $(u,w)=1$, and $u\leq(w,\tfrac{q}{w})$; in particular, the cusp at $\infty$ is equivalent to $\tfrac{1}{q}$. To such a cusp, one can associate a scaling matrix $\sigma_{\mathfrak{a}}\in\textnormal{PSL}_{2}(\mathbb{R})$ with $\sigma_{\mathfrak{a}}\infty=\mathfrak{a}$, and via these scaling matrices, functions on $\Gamma_{0}(q)\backslash\mathbb{H}$ can be Fourier expanded around $\mathfrak{a}$.

The discrete spectrum of the hyperbolic Laplacian $\Delta=-y^{2}(\partial_{x}^{2}+\partial_{y}^{2})$ is parametrized by Maass cusp forms: these are smooth functions $f:\Gamma_{0}(q)\backslash\mathbb{H}\to\mathbb{C}$ which are eigenfunctions of $\Delta$, vanish at all cusps of $\Gamma_{0}(q)\backslash\mathbb{H}$, and are square-integrable with respect to the Petersson inner product. Following the normalization of Deshouillers–Iwaniec [11], we write the Fourier expansion of $f$ at $z=x+iy\in\mathbb{H}$ around a cusp $\mathfrak{a}$ (with scaling matrix $\sigma_{\mathfrak{a}}$) as

\[
f(\sigma_{\mathfrak{a}}z)=y^{1/2}\sum_{n\neq 0}\rho_{\mathfrak{a}}(n)K_{i\kappa}(2\pi|n|y)\,e(nx),
\]

where $K$ is a Whittaker function as in [11, p. 264]. Altering the choice of scaling matrix $\sigma_{\mathfrak{a}}$ results in multiplying the Fourier coefficients $\rho_{\mathfrak{a}}(n)$ by an exponential phase $e(n\omega)$, for some uniform $\omega\in\mathbb{R}/\mathbb{Z}$.

The Kuznetsov trace formula [11, 27], as well as the large sieve inequalities that derive from it, involve an orthonormal basis of Maass cusp forms. The following notation will therefore be useful.

Notation 9.1.

Let $q\in\mathbb{Z}_{+}$, $\mathfrak{a}$ be a cusp of $\Gamma_{0}(q)$ equivalent$^{7}$ to $\tfrac{1}{s}$ for some $s\mid q$ with $(s,\tfrac{q}{s})=1$, and $\sigma_{\mathfrak{a}}\in\textnormal{PSL}_{2}(\mathbb{R})$ be any scaling matrix for $\mathfrak{a}$. Consider an orthonormal basis $(f_{j})_{j\geq 1}$ of Maass cusp forms for $\Gamma_{0}(q)$, with:

  • (i) Laplacian eigenvalues $\lambda_{j}$ and spectral parameters $\theta_{j}:=\max(0,\tfrac{1}{4}-\lambda_{j})^{1/2}$;

  • (ii) Fourier coefficients $(\rho_{j\mathfrak{a}}(n))_{n\in\mathbb{Z}}$ around the cusp $\mathfrak{a}$, using the scaling matrix $\sigma_{\mathfrak{a}}$.

$^{7}$ The assumption that $\mathfrak{a}$ is equivalent to $\tfrac{1}{s}$ is true in most applications (note that this includes the cusp at $\infty$), and only made for convenience; one can prove similar results at arbitrary cusps with small adjustments.

Proposition 9.2.

Assume Notation 9.1, let $X,N\geq 1/2$, and let $(\alpha_{n})_{n\sim N}$ be a complex sequence. Let $\Phi:\mathbb{R}\to[0,\infty)$ be a smooth function supported in $[\Omega(1),O(1)]$, with $\int\Phi(t)\,dt\gg 1$ and $\Phi^{(j)}(t)\ll_{j}1$. Then there exists $\omega\in\mathbb{R}/\mathbb{Z}$ (depending only on $\mathfrak{a}$, $\sigma_{\mathfrak{a}}$) such that

\[
\begin{aligned}
\sum_{\lambda_{j}<1/4}X^{2\theta_{j}}\left|\sum_{n\sim N}\alpha_{n}\,\rho_{j\mathfrak{a}}(n)\right|^{2} &\ll(qN)^{o(1)}\left(1+\frac{N}{q}\right)\|\alpha\|^{2}\\
&\quad+\left|\sum_{c\in q\mathbb{Z}_{+}}\frac{1}{c}\sum_{m,n\sim N}\overline{\alpha_{m}e(m\omega)}\,\alpha_{n}e(n\omega)\,S(m,n;c)\,\Phi\left(\frac{\sqrt{mn}}{c}X\right)\right|.
\end{aligned}
\tag{9.1}
\]

Proof.

This is [32, Corollary I], which follows from the Kuznetsov trace formula and the regular-spectrum large sieve inequalities of Deshouillers–Iwaniec [11, Theorem 2]. We have implicitly used [32, Lemma B] to write down the Kloosterman sums and $c$-supports for cusps $\mathfrak{a}$ equivalent to $\tfrac{1}{s}$ for some $s\mid q$ (the latter condition is written as $\mu(\mathfrak{a})=q^{-1}$ in loc. cit.). Note that we incur factors of $e(m\omega)$ and $e(n\omega)$ since we do not assume a special scaling matrix $\sigma_{\mathfrak{a}}$ (as we may), but this will be irrelevant in our computations since the sequence $(\alpha_{n})$ is arbitrary. ∎

In the right-hand side of (9.1), the sum over $c$ is really supported on $c\asymp NX$ due to the $\Phi$-weight, and it vanishes if $q\gg NX$ with a large enough implied constant. Deshouillers–Iwaniec used this simple observation to deduce the following result, which combines [11, Theorems 2 and 5].

Theorem 9.3 (Deshouillers–Iwaniec [11]).

Assume Notation 9.1, let $N\geq\tfrac{1}{2}$, and let $(\alpha_{n})_{n\sim N}$ be a complex sequence. Then one has

\[
\sum_{\lambda_{j}<1/4}X^{2\theta_{j}}\left|\sum_{n\sim N}\alpha_{n}\,\rho_{j\mathfrak{a}}(n)\right|^{2}\ll(qN)^{o(1)}\left(1+\frac{N}{q}\right)\|\alpha\|^{2}, \tag{9.2}
\]

for any positive $X\ll 1+\frac{q}{N}$.

Until now, if $\sqrt{q}\ll N\ll q$, Theorem 9.3 has been the state-of-the-art exceptional-spectrum large sieve bound for general sequences $(\alpha_{n})$ and a single group $\Gamma_{0}(q)$; the same is true if one averages over levels $q\sim Q$ and allows the sequence $(\alpha_{n})$ to depend on $q$.

We can now achieve an improvement of Theorem 9.3 when $q$ has a factorization as in Theorem 7.1, and similar results can be deduced for arbitrary levels $q$ using Theorem 7.4. We require a coprimality constraint $(n,q)=1$ for technical reasons, but this is usually harmless in applications. The resulting power savings are relatively small, but serve as a proof of concept that Theorem 9.3 is not a fundamental barrier.

Theorem 9.4 (Large sieve for composite levels).

Assume Notation 9.1, let $N\geq\tfrac{1}{2}$, and let $(\alpha_{n})_{n\sim N}$ be a complex sequence supported on $(n,q)=1$. Suppose that $q=dd'e$ with $d'\mid d$ and $(d,e)=1$, and let $f\leq\sqrt{qd}$ be the largest integer with $f^{2}\mid qd$. Then (9.2) holds for any positive

\[
X\ll 1+\frac{q}{N}+\min\left(\frac{q^{2}}{d^{1/3}N^{7/3}},\frac{q^{3/2}}{f^{1/4}N^{3/2}},\frac{qd^{1/3}}{f^{1/6}N}\right)+\min\left(\frac{q^{7/4}}{d^{1/4}N^{2}},\frac{q^{7/5}}{f^{1/5}N^{7/5}},\frac{qd^{2/5}}{f^{1/5}N}\right). \tag{9.3}
\]

Proof of Theorem 9.4.

We may assume without loss of generality that

\[
1+\frac{q}{N}<X\ll\min\left(\frac{q^{2}}{d^{1/3}N^{7/3}},\frac{q^{3/2}}{f^{1/4}N^{3/2}},\frac{qd^{1/3}}{f^{1/6}N}\right)+\min\left(\frac{q^{7/4}}{d^{1/4}N^{2}},\frac{q^{7/5}}{f^{1/5}N^{7/5}},\frac{qd^{2/5}}{f^{1/5}N}\right), \tag{9.4}
\]

since otherwise the result follows from Theorem 9.3. We apply Proposition 9.2 with a choice of $\Phi$ supported on $[2,4]$, then separate variables in the smooth weight $\Phi(\cdot)$ via two-dimensional Fourier inversion, as in [32, Proof of Theorem 13], to arrive at

\[
\begin{aligned}
\sum_{\lambda_{j}<1/4}X^{2\theta_{j}}\left|\sum_{n\sim N}\alpha_{n}\,\rho_{j\mathfrak{a}}(n)\right|^{2} &\ll(qN)^{o(1)}\left(1+\frac{N}{q}\right)\|\alpha\|^{2}\\
&\quad+\sum_{\substack{\frac{NX}{4}<c\leq NX\\ q\mid c}}\frac{1}{c}\sup_{\substack{(\beta_{n})_{n\sim N}\\ |\beta_{n}|=|\alpha_{n}|}}\ \sup_{\substack{(\gamma_{n})_{n\sim N}\\ |\gamma_{n}|=|\alpha_{n}|}}\left|\sum_{m,n\sim N}\beta_{m}\gamma_{n}S(m,n;c)\right|.
\end{aligned}
\]

The sequences $(\beta_{n})$, $(\gamma_{n})$ in the supremum arise by incorporating exponential phases $e(n\omega)$ into $(\alpha_{n})$, partly from the choice of the scaling matrix $\sigma_{\mathfrak{a}}$, and partly due to the separation of variables. The suprema are of course attained by some sequences $(\beta_{n})$, $(\gamma_{n})$ supported on $(n,q)=1$, so we can apply Corollary 7.5 with $M=N$ and $C\asymp NX$, to obtain

\[
\begin{aligned}
\sum_{\lambda_{j}<1/4}X^{2\theta_{j}}\left|\sum_{n\sim N}\alpha_{n}\,\rho_{j\mathfrak{a}}(n)\right|^{2} &\ll(qN)^{o(1)}\left(1+\frac{N}{q}\right)\|\alpha\|^{2}\\
&\quad+\|\alpha\|^{2}\frac{(NX)^{1+o(1)}}{q}\min\left\{\left(\frac{dN}{X^{3}}+\frac{f}{X^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}},\left(\frac{dN^{2}}{qX^{2}}+\frac{fN}{qX}+\frac{fq}{d^{2}NX}\right)^{\frac{1}{6}}\right\}.
\end{aligned}
\]

We conclude by noting that

\[
\frac{NX}{q}\left(\frac{dN}{X^{3}}+\frac{f}{X^{2}}+\frac{f}{d^{2}}\right)^{\frac{1}{6}}\ll 1\qquad\text{for}\qquad X\ll\min\left(\frac{q^{2}}{d^{1/3}N^{7/3}},\frac{q^{3/2}}{f^{1/4}N^{3/2}},\frac{qd^{1/3}}{f^{1/6}N}\right),
\]

and

\[
\frac{NX}{q}\left(\frac{dN^{2}}{qX^{2}}+\frac{fN}{qX}+\frac{fq}{d^{2}NX}\right)^{\frac{1}{6}}\ll 1\qquad\text{for}\qquad X\ll\min\left(\frac{q^{7/4}}{d^{1/4}N^{2}},\frac{q^{7/5}}{f^{1/5}N^{7/5}},\frac{qd^{2/5}}{f^{1/5}N}\right).
\]

This covers the range in (9.4). ∎

Proof of Corollary 1.6.

If $q$ has a divisor $d\asymp\sqrt{q}$ such that $\tfrac{q}{d}$ is square-free, then we can take $d'=(d,\tfrac{q}{d})$ and $f=d\asymp\sqrt{q}$ in Theorem 9.4, so (9.2) holds for any positive

\[
X\ll 1+\frac{q}{N}+\min\left(\frac{q^{11/6}}{N^{7/3}},\frac{q^{11/8}}{N^{3/2}},\frac{q^{13/12}}{N}\right)+\min\left(\frac{q^{13/8}}{N^{2}},\frac{q^{13/10}}{N^{7/5}},\frac{q^{11/10}}{N}\right).
\]

If additionally $N\ll q^{1/2+o(1)}$ (as Corollary 1.6 assumes), then we can take $X=q^{3/5}$, since this is only larger by a factor of $q^{o(1)}$ than the second minimum above. ∎
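The exponent bookkeeping in this deduction is easy to verify mechanically. The Python sketch below is our own illustration (the helper `minima_exponents` is hypothetical, not from the paper): it writes $d=q^{a_d}$, $f=q^{a_f}$, $N=q^{a_N}$, ignores all $q^{o(1)}$ factors and implied constants, and returns the exponents of $q$ in the six terms inside the two minima of (9.3).

```python
from fractions import Fraction as F


def minima_exponents(a_d, a_f, a_N):
    """Exponents of q in the six terms of (9.3), for d = q^a_d, f = q^a_f, N = q^a_N."""
    first = [
        2 - a_d / 3 - F(7, 3) * a_N,        # q^2 / (d^{1/3} N^{7/3})
        F(3, 2) - a_f / 4 - F(3, 2) * a_N,  # q^{3/2} / (f^{1/4} N^{3/2})
        1 + a_d / 3 - a_f / 6 - a_N,        # q d^{1/3} / (f^{1/6} N)
    ]
    second = [
        F(7, 4) - a_d / 4 - 2 * a_N,        # q^{7/4} / (d^{1/4} N^2)
        F(7, 5) - a_f / 5 - F(7, 5) * a_N,  # q^{7/5} / (f^{1/5} N^{7/5})
        1 + F(2, 5) * a_d - a_f / 5 - a_N,  # q d^{2/5} / (f^{1/5} N)
    ]
    return first, second


# Setting of Corollary 1.6: d and f of size q^{1/2}.
half = F(1, 2)
numerators = minima_exponents(half, half, F(0))  # powers of q before dividing by N
first, second = minima_exponents(half, half, half)  # N of size q^{1/2}
best = max(1 - half, min(first), min(second))  # compare with the term q/N
```

In this model, `numerators` reproduces the powers of $q$ displayed above, and for $N$ of size $q^{1/2}$ the largest admissible exponent is `best == F(3, 5)`, coming from the second minimum.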

Appendix A. Some necessary computations in $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$

Here we fill in some details involving explicit matrix computations, subgroups, and characters of $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$. In particular, we prove Lemmas 4.8, 5.4 and 5.2.

Proof of Lemma 4.8.

The first equality from (4.6) follows immediately from (4.4). Now let $T_{c}(d):V_{c}\to V_{c}$ denote the map in the (extreme) right-hand side of (4.6); we will show that $T_{c}(d)=P_{c}(d)$.

The fact that $\Gamma_{c}(d)$ is a subgroup quickly implies that $T_{c}(d)$ is self-adjoint and that $T_{c}(d)^{2}=T_{c}(d)$, so $T_{c}(d)$ is an orthogonal projection. Moreover, one has $\rho_{c}(n)T_{c}(d)=T_{c}(d)$ for any $n\in\Gamma_{c}(d)$, so any $T_{c}(d)f\in T_{c}(d)V_{c}$ has $\rho_{c}(n)T_{c}(d)f=T_{c}(d)f$, which shows $T_{c}(d)V_{c}\subset V_{c}(d)$. Conversely, if $f\in V_{c}(d)$, so $\rho_{c}(n)f=f$ for all $n\in\Gamma_{c}(d)$, then clearly $f=T_{c}(d)f$, which shows $V_{c}(d)\subset T_{c}(d)V_{c}$. Thus $T_{c}(d)$ is the orthogonal projection onto $T_{c}(d)V_{c}=V_{c}(d)$, i.e., $T_{c}(d)=P_{c}(d)$.

The claim about commutativity follows directly from (4.6) and the normality of $\Gamma_{c}(d)$.

Finally, let us prove (4.7). It follows from (4.6) and (3.17) that

\[
P_{c}(d)_{u,v}=\frac{1}{|\Gamma_{c}(d)|}\sum_{n\in\Gamma_{c}(d)}\mathbbm{1}_{u=nv}=\frac{d^{3}}{c^{3}}\mathbbm{1}_{u\in\Gamma_{c}(d)\cdot v}|\Gamma_{c}(d)_{u}|, \tag{A.1}
\]

where $\Gamma_{c}(d)_{u}$ is the stabilizer of $u$ inside $\Gamma_{c}(d)$ (indeed, once $u=n_{0}v$ for some $n_{0}\in\Gamma_{c}(d)$, all other solutions to $u=nv$ satisfy $n_{0}n^{-1}u=u$, so $n\in\Gamma_{c}(d)_{u}n_{0}$). By the normality of $\Gamma_{c}(d)$, we see that for any $g\in\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$,

\[
\begin{aligned}
|\Gamma_{c}(d)_{gu}| &=\#\{n\in\Gamma_{c}(d):ngu=gu\}\\
&=\#\{n\in\Gamma_{c}(d):(g^{-1}ng)u=u\}=|(g^{-1}\Gamma_{c}(d)g)_{u}|=|\Gamma_{c}(d)_{u}|,
\end{aligned}
\]

so by the transitivity of the projective action,

\[
|\Gamma_{c}(d)_{u}|=|\Gamma_{c}(d)_{[1:0]}|=\#\left\{\begin{pmatrix}q&r\\ s&t\end{pmatrix}\in\Gamma_{c}(d):[q:s]=[1:0]\right\}.
\]

Note that $[q:s]=[1:0]$ simply means $(q,s)=(\alpha,0)$ for some unit $\alpha\in(\mathbb{Z}/c\mathbb{Z})^{\times}$. This forces $s=0$, $q\in(\mathbb{Z}/c\mathbb{Z})^{\times}$, and $t=\overline{q}$; moreover, given such a choice of $q,s,t$, any $r\in d\mathbb{Z}/c\mathbb{Z}$ gives a solution. Since there are $\phi(c)/\phi(d)$ choices of $q$ in the kernel of $(\mathbb{Z}/c\mathbb{Z})^{\times}\to(\mathbb{Z}/d\mathbb{Z})^{\times}$, and $\tfrac{c}{d}$ choices of $r$ in $d\mathbb{Z}/c\mathbb{Z}\cong\mathbb{Z}/\tfrac{c}{d}\mathbb{Z}$, we find that

\[
|\Gamma_{c}(d)_{u}|=\frac{\phi(c)}{\phi(d)}\cdot\frac{c}{d},
\]

and plugging this into (A.1) proves (4.7). ∎
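For small moduli, the stabilizer count $|\Gamma_{c}(d)_{[1:0]}|=\tfrac{\phi(c)}{\phi(d)}\cdot\tfrac{c}{d}$ can be confirmed by brute force. The Python sketch below is only an illustration under the assumption (consistent with the argument above) that $\Gamma_{c}(d)$ is the set of matrices in $\textnormal{SL}_{2}(\mathbb{Z}/c\mathbb{Z})$ congruent to the identity modulo $d$; the names `phi` and `stabilizer_size` are our own.

```python
from itertools import product
from math import gcd


def phi(n):
    """Euler's totient, by direct counting (fine for small n)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def stabilizer_size(c, d):
    """Count matrices (q r; s t) mod c with det 1, congruent to I mod d, fixing [1:0].

    Fixing [1:0] means the first column (q, s) is a unit multiple of (1, 0),
    i.e. s = 0 (q is then automatically a unit because q*t = 1).
    """
    count = 0
    s = 0
    for q, r, t in product(range(c), repeat=3):
        in_sl2 = (q * t - r * s) % c == 1 % c
        in_kernel = (q - 1) % d == 0 and r % d == 0 and (t - 1) % d == 0
        if in_sl2 and in_kernel:
            count += 1
    return count
```

For example, `stabilizer_size(12, 3)` counts the $2$ admissible values of $q$ (namely $1$ and $7$) times the $4$ multiples of $3$ available for $r$.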

Proof of Lemma 5.4.

Set $G:=\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})$ and $N:=\Gamma_{p^{k}}(p^{j})$, so $N\triangleleft G$. Say $\chi=\textnormal{Tr}\rho$ where $\rho\in\widehat{G}$ is primitive. By (3.9), we have

\[
\frac{1}{|N|}\sum_{n\in N}|\chi(n)|^{2}=\sum_{\rho_{0}\in\widehat{N}}\textnormal{Mult}(\rho_{0},\rho|_{N})^{2}.
\]

By Lemma 3.9, $\rho|_{N}$ contains $L$ irreducible representations of $N$, each with multiplicity $m$, for some positive integers $L,m$. Thus

\[
\sum_{\rho_{0}\in\widehat{N}}\textnormal{Mult}(\rho_{0},\rho|_{N})^{2}=Lm^{2},
\]

and in light of (3.17), it remains to show that

\[
Lm^{2}\gg(k-j+1)^{-1}p^{2j}. \tag{A.2}
\]

If $j=0$, this is a trivial statement. Suppose now that $\tfrac{k}{2}\leq j\leq k$. Then

\[
\begin{aligned}
N=\Gamma_{p^{k}}(p^{j}) &=\left\{I+p^{j}A:A\in(\mathbb{Z}/p^{k-j}\mathbb{Z})^{2\times 2},\ \det(I+p^{j}A)\equiv 1\ (\textnormal{mod }p^{k})\right\}\\
&=\left\{I+p^{j}A:A\in(\mathbb{Z}/p^{k-j}\mathbb{Z})^{2\times 2},\ \textnormal{Tr}(A)\equiv 0\ (\textnormal{mod }p^{k-j})\right\}
\end{aligned}
\]

is abelian, since $(I+p^{j}A)(I+p^{j}B)=I+p^{j}(A+B)$ for $j\geq\tfrac{k}{2}$. In fact, this shows that

\[
(N,\cdot)\cong\left(\left\{A\in(\mathbb{Z}/p^{k-j}\mathbb{Z})^{2\times 2}:\textnormal{Tr}(A)=0\right\},+\right)\cong(\widehat{N},\cdot).
\]

In particular, all irreducible representations of $N$ are $1$-dimensional, and can be expressed as

\[
\sigma_{B}(I+p^{j}A):=e\left(\frac{\textnormal{Tr}(AB)}{p^{k-j}}\right),\qquad B\in(\mathbb{Z}/p^{k-j}\mathbb{Z})^{2\times 2},\ \textnormal{Tr}(B)=0. \tag{A.3}
\]

Now equating dimensions and using Lemma 3.16, we find that

\[
Lm=\dim\rho\gg p^{k}\qquad\iff\qquad Lm^{2}\gg\frac{p^{2k}}{L}. \tag{A.4}
\]

We will finish by finding an upper bound on $L$. By the conclusion of Lemma 3.9, all $L$ non-isomorphic representations in the decomposition of $\rho|_{N}$ lie in the same orbit of $G$’s action by conjugation. For $g\in G$ and $B,\sigma_{B}$ as in (A.3), we have

\[
\begin{aligned}
\sigma_{B}(g(I+p^{j}A)g^{-1})=\sigma_{B}(I+p^{j}gAg^{-1}) &=e\left(\frac{\textnormal{Tr}(gAg^{-1}B)}{p^{k-j}}\right)\\
&=e\left(\frac{\textnormal{Tr}(Ag^{-1}Bg)}{p^{k-j}}\right)=\sigma_{g^{-1}Bg}(I+p^{j}A).
\end{aligned}
\]

In other words, the action of $G$ by conjugation on irreducible representations of $N$ corresponds to conjugation of the underlying matrices $B$. It follows that $L$ is at most the maximal size of an orbit in the set

\[
\left\{B\in(\mathbb{Z}/p^{k-j}\mathbb{Z})^{2\times 2}:\textnormal{Tr}(B)=0\right\}
\]

under conjugation by $\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z})$, or equivalently by $\textnormal{SL}_{2}(\mathbb{Z}/p^{k-j}\mathbb{Z})$. Since conjugation preserves the determinant $\Delta=\det(B)$, we find that

\[
L\leq\max_{\Delta\in\mathbb{Z}/p^{k-j}\mathbb{Z}}\#\left\{B\in(\mathbb{Z}/p^{k-j}\mathbb{Z})^{2\times 2}:\textnormal{Tr}(B)=0,\ \det(B)=\Delta\right\}.
\]

Writing $B=\left(\begin{smallmatrix}x&y\\ z&-x\end{smallmatrix}\right)$, we further get

\[
L\leq\max_{\Delta\in\mathbb{Z}/p^{k-j}\mathbb{Z}}\sum_{x,y,z\in\mathbb{Z}/p^{k-j}\mathbb{Z}}\mathbbm{1}_{-x^{2}-yz=\Delta}\leq p^{k-j}\max_{a\in\mathbb{Z}/p^{k-j}\mathbb{Z}}\sum_{y,z\in\mathbb{Z}/p^{k-j}\mathbb{Z}}\mathbbm{1}_{yz=a},
\]

where we substituted $a:=-x^{2}-\Delta$. Now given $a\in\mathbb{Z}/p^{k-j}\mathbb{Z}$, write $a=p^{\ell}a'$ for some $0\leq\ell\leq k-j$ and $a'\in(\mathbb{Z}/p^{k-j-\ell}\mathbb{Z})^{\times}$. The equation $yz=a$ then implies

\[
y=p^{\ell_{y}}y',\qquad z=p^{\ell_{z}}z',\qquad y'z'=a',
\]

for some $\ell_{y},\ell_{z}\geq 0$ with $\ell_{y}+\ell_{z}=\ell$, and some $y'\in(\mathbb{Z}/p^{k-j-\ell_{y}}\mathbb{Z})^{\times}$, $z'\in(\mathbb{Z}/p^{k-j-\ell_{z}}\mathbb{Z})^{\times}$. There are $\ell+1\leq k-j+1$ choices of $(\ell_{y},\ell_{z})$, and for every choice of $(\ell_{y},\ell_{z},y')$, there are at most $p^{k-j-\ell_{z}}/p^{k-j-\ell}=p^{\ell_{y}}$ choices of $z'$ (since $z'\ (\textnormal{mod }p^{k-j-\ell})$ is fixed). Putting these counts together, we obtain

\[
L\leq p^{k-j}(k-j+1)\max_{\ell_{y}+\ell_{z}=\ell\leq k}p^{k-j-\ell_{y}}p^{\ell_{y}}\ll(k-j+1)p^{2k-2j}.
\]

Combining this with (A.4) establishes the desired bound from (A.2). ∎
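The key counting step above is that, writing $n:=k-j$, the congruence $yz\equiv a\ (\textnormal{mod }p^{n})$ has at most $(n+1)p^{n}$ solutions $(y,z)$ for every residue $a$. This is cheap to test by brute force for small prime powers; the Python sketch below is our own illustration (the names `count_products` and `max_count` are hypothetical) and is not part of the proof.

```python
def count_products(p, n, a):
    """Number of pairs (y, z) mod p**n with y*z congruent to a mod p**n."""
    q = p ** n
    return sum(1 for y in range(q) for z in range(q) if (y * z - a) % q == 0)


def max_count(p, n):
    """Maximum of count_products(p, n, a) over all residues a mod p**n."""
    q = p ** n
    return max(count_products(p, n, a) for a in range(q))


def bound_holds(p, n):
    """Check the bound (n + 1) * p**n used in the estimate for L."""
    return max_count(p, n) <= (n + 1) * p ** n
```

For instance, with $p=2$ and $n=3$ the worst case is $a=0$, which has $20$ solutions, against the bound $(3+1)\cdot 2^{3}=32$.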

Proof of Lemma˜5.2.

From ˜4.3, it follows that χc(g)=pkcχpk(πc,pk(g))\chi_{c}(g)=\prod_{p^{k}\|c}\chi_{p^{k}}(\pi_{c,p^{k}}(g)). Working locally at a prime p|cp|c, with say pkcp^{k}\|c and pjdp^{j}\|d, we will establish the bound

χpk(g)pk+j2,\chi_{p^{k}}(g)\ll p^{\left\lfloor\frac{k+j}{2}\right\rfloor}, (A.5)

for all gSL2(/pk)g\in\textnormal{SL}_{2}(\mathbb{Z}/p^{k}\mathbb{Z}) such that pjp^{j} is the largest pp-power for which g{γ/pk:γ2=1}Γpk(pj)g\in\{\gamma\in\mathbb{Z}/p^{k}\mathbb{Z}:\gamma^{2}=1\}\cdot\Gamma_{p^{k}}(p^{j}). Given ˜A.5, the desired bound in ˜5.1 follows from the divisor bound.

Since $\rho_{p^{k}}(g)$ is a permutation map, $\chi_{p^{k}}(g)=\textnormal{Tr}\,\rho_{p^{k}}(g)$ equals the number of fixed points of $g$ in $\mathbb{P}^{1}(\mathbb{Z}/p^{k}\mathbb{Z})$, i.e., the number of solutions in $u\in\mathbb{P}^{1}(\mathbb{Z}/p^{k}\mathbb{Z})$ to $gu=u$. Let us write $g=\left(\begin{smallmatrix}q&r\\ s&t\end{smallmatrix}\right)$ and $u=[x:y]$ for some integers $q,r,s,t,x,y$ with $qt-rs\equiv 1\ (\textnormal{mod }p^{k})$ and $(x,y,p)=1$. Scaling both entries of $u$ by a unit in $(\mathbb{Z}/p^{k}\mathbb{Z})^{\times}$, we can assume without loss of generality that $x=1$ or $y=1$; in fact, replacing $\left(\begin{smallmatrix}q&r\\ s&t\end{smallmatrix}\right)\leftrightarrow\left(\begin{smallmatrix}t&s\\ r&q\end{smallmatrix}\right)$ if necessary, we may assume that $y=1$. Then the equality $gu=u$ means that for some $\alpha\in\mathbb{Z}$, one has

\[
\begin{pmatrix}q&r\\ s&t\end{pmatrix}\begin{pmatrix}x\\ 1\end{pmatrix}\equiv\alpha\begin{pmatrix}x\\ 1\end{pmatrix}\ (\textnormal{mod }p^{k})\qquad\Rightarrow\qquad qx+r\equiv(sx+t)x\ (\textnormal{mod }p^{k}).
\]

This gives the quadratic congruence

\[
sx^{2}+(t-q)x-r\equiv 0\ (\textnormal{mod }p^{k}).
\]

Now from our assumption that $g\in\gamma\Gamma_{p^{k}}(p^{j})$ for some $\gamma\in\mathbb{Z}/p^{k}\mathbb{Z}$ with $\gamma^{2}=1$, we know that $p^{j}\mid s$, $p^{j}\mid r$, and $p^{j}\mid t-q$ (since $t\equiv\gamma\equiv q\ (\textnormal{mod }p^{j})$). In fact, $p^{j}$ is the largest $p$-power with this property (otherwise, we could pick some $\gamma\equiv q\ (\textnormal{mod }p^{j+1})$ such that $g\in\gamma\Gamma_{p^{k}}(p^{j+1})$). Therefore, letting $a_{2}:=sp^{-j}$, $a_{1}:=(t-q)p^{-j}$ and $a_{0}:=-rp^{-j}$, we find that

\[
a_{2}x^{2}+a_{1}x+a_{0}\equiv 0\ (\textnormal{mod }p^{k-j}), \tag{A.6}
\]

where $a_{0},a_{1},a_{2}$ are not all divisible by $p$. It now remains to show that this equation has

\[
O\left(p^{\left\lfloor\frac{k-j}{2}\right\rfloor}\right)
\]

solutions in $x\ (\textnormal{mod }p^{k-j})$; every such solution will have $p^{j}$ lifts to $\mathbb{Z}/p^{k}\mathbb{Z}$, inducing a total of $O(p^{\left\lfloor(k-j)/2\right\rfloor+j})=O(p^{\left\lfloor(k+j)/2\right\rfloor})$ fixed points $u=[x:1]$ of $g$.
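The fixed-point interpretation of $\chi_{p^{k}}(g)$ can be tested numerically in small cases. The sketch below assumes $p$ is an odd prime and represents $\mathbb{P}^{1}(\mathbb{Z}/p^{k}\mathbb{Z})$ by the points $[x:1]$ with $x$ modulo $p^{k}$ together with $[1:y]$ with $p\mid y$. The helper names `projective_line`, `count_fixed_points` and `level` are hypothetical, and `level` computes $j$ simply as the largest exponent (capped at $k$) with $p^{j}$ dividing $s$, $r$ and $t-q$.

```python
def projective_line(p: int, k: int) -> list[tuple[int, int]]:
    """Representatives of P^1(Z/p^k Z): [x:1] for all x, and [1:y] for p | y."""
    modulus = p ** k
    points = [(x, 1) for x in range(modulus)]
    points += [(1, y) for y in range(0, modulus, p)]
    return points


def count_fixed_points(g, p: int, k: int) -> int:
    """Number of u in P^1(Z/p^k Z) with g u = u, for g = ((q, r), (s, t))."""
    modulus = p ** k
    (q, r), (s, t) = g
    total = 0
    for x, y in projective_line(p, k):
        new_x, new_y = q * x + r * y, s * x + t * y
        # Two primitive pairs give the same point iff their cross product vanishes.
        if (new_x * y - new_y * x) % modulus == 0:
            total += 1
    return total


def level(g, p: int, k: int) -> int:
    """Largest j <= k such that p^j divides s, r and t - q."""
    (q, r), (s, t) = g
    j = 0
    while j < k and all(v % p ** (j + 1) == 0 for v in (s, r, t - q)):
        j += 1
    return j


if __name__ == "__main__":
    p, k = 3, 3
    for g in [((1, 1), (0, 1)), ((1, 3), (0, 1)), ((1, 0), (0, 1))]:
        j = level(g, p, k)
        print(g, j, count_fixed_points(g, p, k), p ** ((k + j) // 2))
```

For the unipotent matrix with $r=1$ and $p^{k}=27$, all fixed points are of the form $[1:y]$ with $y^{2}\equiv 0$, which gives exactly $p^{\lfloor k/2\rfloor}=3$ of them. The identity fixes all $p^{k}+p^{k-1}$ points, which exceeds $p^{k}$ only by a bounded factor.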

Case 1. $p\nmid a_{2}$. Then given any two solutions $x_{0},x$ of (A.6), we can subtract the two equalities to obtain

\[
p^{k-j}\mid a_{2}(x^{2}-x_{0}^{2})+a_{1}(x-x_{0})=(x-x_{0})(a_{2}(x+x_{0})+a_{1}). \tag{A.7}
\]

Let $\ell:=\left\lceil(k-j)/2\right\rceil$. By the pigeonhole principle, we must have $p^{\ell}\mid x-x_{0}$ or $p^{\ell}\mid a_{2}(x+x_{0})+a_{1}$. Since $p\nmid a_{2}$, either option uniquely determines $x\ (\textnormal{mod }p^{\ell})$ in terms of $x_{0}$. So, up to a factor of $2$ accounting for the two options, there can be at most

\[
\frac{p^{k-j}}{p^{\ell}}=p^{(k-j)-\left\lceil\frac{k-j}{2}\right\rceil}=p^{\left\lfloor\frac{k-j}{2}\right\rfloor}
\]

solutions in $x\ (\textnormal{mod }p^{k-j})$.

Case 2. $p\nmid a_{1}$. Given the previous case, we can assume $p\mid a_{2}$. Then $p\nmid a_{2}(x+x_{0})+a_{1}$, so from (A.7) we find that $p^{k-j}\mid x-x_{0}$, forcing only one solution in $x\ (\textnormal{mod }p^{k-j})$.

Case 3. $p\nmid a_{0}$. Then (A.6) implies $p\nmid x$, and by substituting $x\leftrightarrow\overline{x}\ (\textnormal{mod }p^{k-j})$, we reduce to the case $p\nmid a_{2}$. ∎
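The root count for quadratic congruences used in the three cases can be checked by brute force for small moduli. The sketch below writes $n$ for $k-j$, uses the hypothetical helpers `count_roots` and `max_roots_primitive`, and compares against $2p^{\lfloor n/2\rfloor}$. The explicit constant $2$ is an assumption of this sketch (one factor for each of the two pigeonhole options in Case 1), since the text only claims an $O(\cdot)$ bound.

```python
from itertools import product


def count_roots(a2: int, a1: int, a0: int, p: int, n: int) -> int:
    """Number of x mod p^n with a2*x^2 + a1*x + a0 = 0 (mod p^n)."""
    modulus = p ** n
    return sum(
        1 for x in range(modulus) if (a2 * x * x + a1 * x + a0) % modulus == 0
    )


def max_roots_primitive(p: int, n: int) -> int:
    """Largest root count over all (a2, a1, a0) mod p^n not all divisible by p."""
    modulus = p ** n
    best = 0
    for a2, a1, a0 in product(range(modulus), repeat=3):
        if a2 % p == 0 and a1 % p == 0 and a0 % p == 0:
            continue  # skip non-primitive coefficient triples
        best = max(best, count_roots(a2, a1, a0, p, n))
    return best


if __name__ == "__main__":
    for p, n in [(2, 3), (3, 2), (3, 3)]:
        print(p, n, max_roots_primitive(p, n), 2 * p ** (n // 2))
```

For instance, $x^{2}\equiv 1\ (\textnormal{mod }8)$ has the four solutions $1,3,5,7$, so the factor $2$ cannot be dropped in general when $p=2$.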

References

  • [1] Valentin Blomer, Étienne Fouvry, Emmanuel Kowalski, Philippe Michel, and Djordje Milićević. On moments of twisted $L$-functions. Amer. J. Math., 139(3):707–768, 2017.
  • [2] Valentin Blomer, Étienne Fouvry, Emmanuel Kowalski, Philippe Michel, Djordje Milićević, and Will Sawin. The second moment theory of families of $L$-functions—the case of twisted Hecke $L$-functions. Mem. Amer. Math. Soc., 282(1394):v+148, 2023.
  • [3] Valentin Blomer and Djordje Milićević. The second moment of twisted modular $L$-functions. Geom. Funct. Anal., 25(2):453–516, 2015.
  • [4] Enrico Bombieri, John B. Friedlander, and Henryk Iwaniec. Primes in arithmetic progressions to large moduli. Acta Math., 156(3-4):203–251, 1986.
  • [5] Jean Bourgain and Alex Gamburd. Expansion and random walks in $\mathrm{SL}_{d}(\mathbb{Z}/p^{n}\mathbb{Z})$. I. J. Eur. Math. Soc. (JEMS), 10(4):987–1011, 2008.
  • [6] A. H. Clifford. Representations induced in an invariant subgroup. Ann. of Math. (2), 38(3):533–550, 1937.
  • [7] Régis de La Bretèche and Sary Drappeau. Niveau de répartition des polynômes quadratiques et crible majorant pour les entiers friables. J. Eur. Math. Soc., 22(5):1577–1624, 2020.
  • [8] Pierre Deligne. La conjecture de Weil. I. Inst. Hautes Études Sci. Publ. Math., (43):273–307, 1974.
  • [9] J.-M. Deshouillers and H. Iwaniec. Power mean values of the Riemann zeta function. Mathematika, 29(2):202–212, 1982.
  • [10] J.-M. Deshouillers and H. Iwaniec. Power mean-values for Dirichlet’s polynomials and the Riemann zeta-function. II. Acta Arith., 43(3):305–312, 1984.
  • [11] Jean-Marc Deshouillers and Henryk Iwaniec. Kloosterman sums and Fourier coefficients of cusp forms. Invent. Math., 70(2):219–288, 1982.
  • [12] Sary Drappeau, Kyle Pratt, and Maksym Radziwiłł. One-level density estimates for Dirichlet $L$-functions with extended support. Algebra Number Theory, 17(4):805–830, 2023.
  • [13] William Duke, John Friedlander, and Henryk Iwaniec. Bilinear forms with Kloosterman fractions. Invent. Math., 128(1):23–43, 1997.
  • [14] Étienne Fouvry, Emmanuel Kowalski, and Philippe Michel. Algebraic trace functions over the primes. Duke Math. J., 163(9):1683–1736, 2014.
  • [15] Étienne Fouvry, Emmanuel Kowalski, Philippe Michel, and Will Sawin. Lectures on applied $\ell$-adic cohomology. In Analytic methods in arithmetic geometry, volume 740 of Contemp. Math., pages 113–195. Amer. Math. Soc., Providence, RI, 2019.
  • [16] William Fulton and Joe Harris. Representation theory, volume 129 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1991. A first course, Readings in Mathematics.
  • [17] Lasse Grimmelt and Jori Merikoski. On the greatest prime factor and uniform equidistribution of quadratic polynomials. Preprint, arXiv:2505.00493, 2025.
  • [18] Larry Guth and James Maynard. New large value estimates for Dirichlet polynomials. Ann. of Math., to appear. Preprint, arXiv:2405.20552, 2024.
  • [19] H. A. Helfgott. Growth and generation in $\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$. Ann. of Math. (2), 167(2):601–623, 2008.
  • [20] I. Martin Isaacs. Character theory of finite groups. AMS Chelsea Publishing, Providence, RI, 2006. Corrected reprint of the 1976 original [Academic Press, New York; MR0460423].
  • [21] Henryk Iwaniec and Emmanuel Kowalski. Analytic number theory, volume 53. American Mathematical Society, Providence, RI, 2021.
  • [22] Bryce Kerr, Igor E. Shparlinski, Xiaosheng Wu, and Ping Xi. Bounds on bilinear forms with Kloosterman sums. J. Lond. Math. Soc. (2), 108(2):578–621, 2023.
  • [23] Henry H. Kim. Functoriality for the exterior square of $\mathrm{GL}_{4}$ and the symmetric fourth of $\mathrm{GL}_{2}$. J. Amer. Math. Soc., 16(1):139–183, 2003. With Appendix 1 by Dinakar Ramakrishnan and Appendix 2 by Kim and Peter Sarnak.
  • [24] Emmanuel Kowalski, Philippe Michel, and Will Sawin. Bilinear forms with Kloosterman sums and applications. Ann. of Math. (2), 186(2):413–500, 2017.
  • [25] Emmanuel Kowalski, Philippe Michel, and Will Sawin. Stratification and averaging for exponential sums: bilinear forms with generalized Kloosterman sums. Ann. Sc. Norm. Super. Pisa Cl. Sci. (5), 21:1453–1530, 2020.
  • [26] Philip C. Kutzko. The characters of the binary modular congruence group. Bull. Amer. Math. Soc., 79:702–704, 1973.
  • [27] Nikolai V. Kuznetsov. The Petersson conjecture for cusp forms of weight zero and the Linnik conjecture. Sums of Kloosterman sums. Mat. Sb. (N.S.), 111(153)(3):334–383, 479, 1980.
  • [28] James Maynard. Primes in Arithmetic Progressions to Large Moduli I: Fixed Residue Classes. Mem. Amer. Math. Soc., 306(1542), 2025.
  • [29] Djordje Milićević, Xinhua Qin, and Xiaosheng Wu. Bilinear forms with Kloosterman sums and moments of twisted $L$-functions. arXiv preprint, November 2025.
  • [30] Nikolay G. Moshchevitin and Ilya D. Shkredov. On a modular form of Zaremba’s conjecture. Pacific J. Math., 309(1):195–211, 2020.
  • [31] Alexandre Nobs and Jürgen Wolfart. Die irreduziblen Darstellungen der Gruppen $SL_{2}(Z_{p})$, insbesondere $SL_{2}(Z_{p})$. II. Comment. Math. Helv., 51(4):491–526, 1976.
  • [32] Alexandru Pascadi. Large sieve inequalities for exceptional Maass forms and the greatest prime factor of $n^{2}+1$. Forum Math. Pi, to appear. Preprint, arXiv:2404.04239, 2025.
  • [33] Alexandru Pascadi. On the exponents of distribution of primes and smooth numbers. Preprint, arXiv:2505.00653, 2025.
  • [34] Atle Selberg. On the estimation of Fourier coefficients of modular forms. In Proc. Sympos. Pure Math., Vol. VIII, pages 1–15. Amer. Math. Soc., Providence, RI, 1965.
  • [35] Jean-Pierre Serre. Linear representations of finite groups, volume 42 of Graduate Texts in Mathematics. Springer-Verlag, New York-Heidelberg, French edition, 1977.
  • [36] Joseph A. Shalika. Representation of the two by two unimodular group over local fields. In Contributions to automorphic forms, geometry, and number theory, pages 1–38. Johns Hopkins Univ. Press, Baltimore, MD, 2004.
  • [37] I. D. Shkredov. On asymptotic formulae in some sum-product questions. Trans. Moscow Math. Soc., 79:231–281, 2018.
  • [38] I. D. Shkredov. Modular hyperbolas and bilinear forms of Kloosterman sums. J. Number Theory, 220:182–211, 2021.
  • [39] Igor E. Shparlinski. On sums of Kloosterman and Gauss sums. Trans. Amer. Math. Soc., 371(12):8679–8697, 2019.
  • [40] Igor E. Shparlinski and Tianping Zhang. Cancellations amongst Kloosterman sums. Acta Arith., 176(3):201–210, 2016.
  • [41] Shunichi Tanaka. Irreducible representations of the binary modular congruence groups $\mathrm{mod}\ p^{\lambda}$. J. Math. Kyoto Univ., 7:123–132, 1967.
  • [42] Audrey Terras. Fourier analysis on finite groups and applications, volume 43 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1999.
  • [43] Berke Topacogullari. The shifted convolution of generalized divisor functions. Int. Math. Res. Not. IMRN, (24):7681–7724, 2018.
  • [44] Jie Wu and Ping Xi. Arithmetic exponent pairs for algebraic trace functions and applications. Algebra Number Theory, 15(9):2123–2172, 2021. With an appendix by Will Sawin.
  • [45] Xiaosheng Wu. The fourth moment of Dirichlet $L$-functions at the central value. Math. Ann., 387(3-4):1199–1248, 2023.
  • [46] Ping Xi. Ternary divisor functions in arithmetic progressions to smooth moduli. Mathematika, 64(3):701–729, 2018.
  • [47] Matthew P. Young. The fourth moment of Dirichlet $L$-functions. Ann. of Math. (2), 173(1):1–50, 2011.