newfloatplacement\undefine@keynewfloatname\undefine@keynewfloatfileext\undefine@keynewfloatwithin
Non-abelian amplification and bilinear forms with Kloosterman sums
Abstract.
We introduce a new method to bound bilinear (Type II) sums of Kloosterman sums with composite moduli , using Fourier analysis on and an amplification argument with non-abelian characters. For sums of length , our method produces a non-trivial bound for all moduli except near-primes, saving for products of two primes of the same size. Combining this with previous results for prime moduli, we achieve savings beyond the Pólya–Vinogradov range for all moduli. We give applications to moments of twisted cuspidal -functions, and to large sieve inequalities for exceptional cusp forms with composite levels.
Contents
1. Introduction
1.1. Brief background
There is by now a fairly comprehensive history of bounds for bilinear forms with Kloosterman sums and their applications [3, 1, 2, 14, 24, 25, 11, 28, 32, 40, 39, 22, 46, 44]. In the simplest form, the objects of interest are the sums
| (1.1) |
for positive integers with and complex sequences , ; here and . In this work, we are mainly concerned with the ‘Type II’ setting where and are arbitrary sequences, and we search for an upper bound in terms of their norms , . This is equivalent to bounding the operator norm, or the largest singular value, of the matrix .
For the Type II sums, it is in general necessary to incorporate a coprimality constraint . In practice, since , one can separately consider each value of ; therefore, in most bounds discussed henceforth, one can replace the restriction with the assumption that and are -bounded (and the norms , with , ).
There are two main ‘trivial’ bounds to beat, packaged together into the following inequality with a slightly more general setup. For any (integer) intervals with , , any complex sequences , , and any , one has
| (1.2) |
The term comes from Fourier analysis (a.k.a. the Pólya–Vinogradov method111See also [14, Theorem 1.17] for more standard versions of the Pólya–Vinogradov bound.) and is sharp when , while the term comes from the pointwise Weil bound , and performs better when . The best one could hope for is the perfect-orthogonality bound , but making any improvement over ˜1.2, particularly in the range where the two trivial bounds match, is notoriously difficult. We note that while many applications [24, 25] require an improvement of the pointwise Weil bound when (i.e., beyond the Pólya–Vinogradov range), some applications require an improvement of the Fourier-theoretic bound for larger values of ; this is the case in our Section˜9.
The first improvement of ˜1.2 when was the celebrated breakthrough of Kowalski–Michel–Sawin [24], which requires a prime modulus , and which saves a factor of when ; see also [25] for their follow-up work, which outperforms the pointwise Weil bound for as small as . These results rely on a shift-by- trick of Vinogradov and Karatsuba, a Hölder step, and deep inputs of -adic cohomology; notably, the same bounds hold for more general algebraic trace functions, including hyper-Kloosterman sums.
Closer to our methods is an approach of Shkredov [38] for prime moduli , which relies (in the Type II setting) on non-abelian Fourier analysis [37, Lemma 22], expansion in [37, Theorem 50], and combinatorics; this beats ˜1.2 in the full range with a small but effective power saving, and for sequences , with more general additively-structured supports. We also mention some related additive-combinatorial approaches of Shparlinski–Zhang [40] for smooth sequences, of the author [32, §4] for additively-structured sequences, and of Kerr–Shparlinski–Wu–Xi [22] for Type I bilinear forms (where only the sequence is smooth).
For moduli with a suitable factorization, the best Type II bounds so far have come from the -van der Corput method [15], which relies on the twisted multiplicativity of Kloosterman sums, Cauchy–Schwarz, and a shifting trick; it was first applied in this setting by Blomer–Milićević [3]. The -van der Corput method can also be iterated, leading to strong results for smooth square-free moduli [46, 44]. Unfortunately, these arguments fail to handle certain types of composite moduli when , including squares of primes and products of two distinct primes of the same size.
1.2. Main results.
In this work, we develop a new method to bound bilinear forms with Kloosterman sums for essentially all composite moduli. Like Shkredov [38, 37], we rely on non-abelian Fourier analysis; unlike Shkredov, we use the normal subgroups of to our advantage, and we avoid relying on -flattening results, to arrive at quantitatively-good power savings over ˜1.2 (up to ; see Example˜1.3). Our key innovation is a new type of amplification argument with non-abelian characters, detailed in Section˜2.3, which may be of independent interest.
Combining our bounds with those of Kowalski–Michel–Sawin222One could also combine our bounds with the results of Shkredov [38, Theorem 4] for prime moduli, to obtain a result like Theorem 1.1 which does not rely on algebraic geometry. [24] (as well as Blomer–Milićević [3] for an optimization), we obtain a non-trivial result for general moduli beyond the Pólya–Vinogradov range, given in Theorem˜7.4. We state a particular case of this result below, when .
Theorem 1.1.
Let with . Then for any complex sequences , and , one has
Moreover, if for all (so ), then
Our main technical result leading to Theorem˜1.1 is Theorem˜7.1, which considers a factorization of the modulus into three parts, (but one can usually take or ). Below we state a particular case of Theorem˜7.1, focusing on the same range as in Theorem˜1.1.
Theorem 1.2.
Let for some with and , and let be the largest integer such that . Let be intervals with . Then for any complex sequences , and , one has
Since , Theorem˜1.2 automatically gives a non-trivial result when . Unless has a prime factor larger than , one can always find a factorization with in this range, , and , which makes the general result in Theorem˜1.1 possible.
Example 1.3.
Let such that is square-free. Then one can take , , in Theorem˜1.2, so the saving over the trivial bound becomes . If , this is roughly
In particular, this saving is achieved if , where and are distinct primes with . For the same values of , the more general Theorem˜7.1 beats ˜1.2 in the range
Notably, while the values , give blind spots of the -van der Corput method, they happen to give one of the best cases for our methods; this case has until now constituted the remaining barrier towards the application in Theorem˜1.5.
As a quick corollary of Theorem˜1.2, we prove a trilinear-sum bound which includes a short averaging over with a given large divisor ; the point is that only a factorization of (rather than ) is assumed. Such sums arise in the spectral theory of automorphic forms, in particular in Section˜9. Below is such a trilinear-sum bound, which is a particular case of Corollary˜7.5.
Corollary 1.4.
Let , for some with and , and let be the largest integer such that . Let be intervals of lengths . Then or any complex sequences , , one has
Remark.
Milićević, Qin and Wu [29] have simultaneously and independently obtained results similar to our Theorems˜1.1 and 1.5 using substantially different methods. The two papers are complementary, each performing slightly better in different ranges and for different types of moduli, and both achieving power savings for bilinear sums of square-root length and general moduli. The methods in [29] (which use algebraic geometry and build on Kowalski–Michel–Sawin [24] and Blomer–Milićević [3]) obtain better savings for general moduli and remove the dependency on the Ramanujan–Petersson conjecture in the application to moments of twisted cuspidal -functions. Our methods (which use non-abelian Fourier analysis and are closer to the work of Shkredov [38, 37]) perform better and in longer ranges for specific classes of moduli (see Examples˜1.3 and 7.2), can handle more general supports of the sequences (intervals or other additively-structured subsets of ), and find an application to exceptional-spectrum large sieve inequalities (Corollary˜1.6).
Finally, we note that our methods might also lead to bounds for other exponential sums with or structure, such as
where is a suitable Möbius transformation (and the sum is restricted to those such that is well-defined). Related methods for could be worth investigating as well.
1.3. Applications
As our first application, we prove an asymptotic for the averaged second moment of modular -functions twisted by primitive Dirichlet characters modulo , where the modulus is arbitrary. Blomer–Milićević [1] established such an asymptotic for most moduli, more specifically whenever is not close to a prime or to a product of two primes of the same size. The missing ingredient in these cases has been precisely a power-saving bound for bilinear forms with Kloosterman sums modulo , where both sums have length . The case of prime moduli was established by Kowalski–Michel–Sawin [24, Theorem 1.5], and the remaining case can now be handled using our Theorem˜1.1 (essentially in the setting from Example˜1.3).
To state this application, we introduce some quick notation as in [3]. Given , we write
for the number of primitive characters modulo ; this vanishes if and only if , and is otherwise of size . We write for a sum over all primitive characters modulo . Also following [3], given an -function , we write for the product over all local factors at primes dividing ; thus for example .
Theorem 1.5.
Let be fixed holomorphic cuspidal newforms for with even weights and Hecke eigenvalues normalized as in ˜8.1. Provided that , one has the asymptotic
with a main term of
where is a constant depending only on and
Remark.
Similar results can be obtained for Maass cusp forms, with some care in removing the dependency on the Ramanujan conjecture; see also [29]. The case of (non-cuspidal) Eisenstein series reduces to the result of Young [47] on fourth moments of Dirichlet -functions for prime moduli, extended to all moduli by Wu [45].
Our second application concerns the exceptional spectrum of the hyperbolic Laplacian on , consisting of Maass cusp forms of level with eigenvalues . Selberg’s eigenvalue conjecture [34], one of the central open problems in the theory of automorphic forms, states that this exceptional spectrum is empty. However, unconditionally, exceptional forms often produce the worst contribution in applications of the Kuznetsov trace formula [11, 27] to analytic number theory problems [12, 43, 7, 32], losing exponential factors in the parameter . The best known pointwise bound is Kim–Sarnak [23, Appendix 2], but on-average results can also lead to savings in the -aspect, sometimes enough to match the conditional results [33, 17]. Following Deshouillers–Iwaniec [11], these on-average results often take the shape of large sieve inequalities for the Fourier coefficients of exceptional Maass forms, incorporating factors of . While improvements are now possible [32] for exceptional-spectrum large sieve inequalities with special sequences , the savings in the -aspect for arbitrary sequences have been limited to , due to Deshouillers–Iwaniec [11, Theorem 5]. In fact, obtaining any power saving for arbitrary sequences when is as hard as proving Selberg’s eigenvalue conjecture [32, §2].
In Theorem˜9.4, we overcome this barrier at if is suitably-composite and is not too large, using Theorem˜7.1. We note that in many applications [4, 28, 12, 9, 10], the level is a product of two factors of similar sizes, and . We state below a particular case of Theorem˜9.4, for . We point the reader to Section˜9 for more background and notation.
Corollary 1.6.
For reference, [11, Theorem 5] of Deshouillers–Iwaniec would include a factor of (which is when ) in the left-hand side, so Corollary˜1.6 wins a factor of in this case.
Remark.
It follows from the more general Theorem˜9.4 that one can relax the condition that is square-free when some averaging over levels with (and ) is available. The sequence inside the large sieve may depend on in this case, unlike in [11, Theorem 6].
1.4. Acknowledgements
The author is deeply grateful to Valentin Blomer, James Maynard, Sary Drappeau, Philippe Michel, Emmanuel Kowalski, and Ilya Shkredov for helpful comments and discussions. This work was supported by the ERC Advanced Grant 101054336 and Germany’s Excellence Strategy grant EXC-2047/1-390685813. For a part of the duration of this project, the author was also supported by an EPSRC Scholarship, as well as a Campus France Scholarship.
2. Outline
2.1. Structure of the paper
Our proof of Theorem˜1.2 has three main steps:
-
I
(Fourier analysis). In Section˜4 (particularly, Proposition˜4.10), we relate matrices of Kloosterman sums to the Fourier transform of certain functions at a special representation of . This involves Fourier analysis on both abelian (, ) and non-abelian () groups, and a Möbius inversion process for representations of .
-
II
(Amplification). In Section˜5 (particularly, Proposition˜5.5), we upper bound the spectral norm of the above Fourier coefficients by a weighted count of solutions to an equation in , with . This is where our non-abelian amplification argument comes in.
-
III
(Combinatorics). In Section˜6 (particularly, Proposition˜6.4), we analyze this counting problem using elementary arguments. In particular, we do not rely on expansion techniques.
In Section˜7, we combine these ingredients to deduce Theorem˜1.2 and its variations. The applications to moments of twisted cuspidal -functions and large sieve inequalities for exceptional cusp forms are handled in Sections˜8 and 9, respectively.
For the rest of this section, we give a brief informal overview of our argument, ignoring various technical details. We will use the symbols ‘’, ‘’ for identities and inequalities that are ‘morally’ true (and can be made rigorous with minor modifications, such as including factors).
2.2. First steps: Fourier analysis
Let us focus on the balanced case . We begin by considering the complex matrix
where with . Our task is to bound the operator norm by less than , to beat ˜1.2. We extend to a matrix, and multiply it on both sides by the unitary matrix , which preserves the operator norm and essentially amounts to taking a Fourier transform in the variables. Letting , a truncated version of Poisson summation yields
By inserting a few more rows and columns, we can in fact work over the projective line rather than . The matrices and then extend to and , where and are the usual generators of (see ˜3.13), and
is the -dimensional permutation representation corresponding to the action of on by Möbius transformations. It then remains to bound the spectral norm of the matrix
by less than . In this form, our task is actually impossible: the matrix above decomposes as a direct sum corresponding to the irreducible representations inside , one of which is the trivial representation—and this contributes exactly one singular value of size . Other small-dimensional subrepresentations of are also problematic for similar reasons.
This is where the coprimality constraint comes in. Incorporating this weight into the matrix and expanding it by Möbius inversion ultimately results in a ‘sifted’ representation,
where is essentially obtained by removing from the contribution of all subrepresentations isomorphic to
for . Although is not irreducible in general, it has the key property that all of its irreducible subrepresentations are large, of dimension (see Proposition˜4.6).
2.3. The key step: Amplification
We are left to bound the spectral norm of the non-abelian Fourier coefficient
where is any irreducible subrepresentation of . A natural approach is to use the trace method, i.e., to bound the top singular value by an even moment of all singular values, and then to expand the latter as a trace; this brings in the character .
One can then attempt to use non-abelian Fourier analysis by summing over all irreducible characters of . However, this sum must somehow amplify the contribution of compared to other irreducible characters of , especially the small-dimensional ones—otherwise, our construction of by eliminating various subrepresentations from will have been useless.
If was an abelian character of , i.e., a Dirichlet character, then following the ideas of Duke–Friendlander–Iwaniec [13], one could weigh the sum by an amplifier of the shape
where is some set of positive integers (e.g., the primes in a dyadic interval). This has size when , and should typically obey square-root cancellation when .
Inspired by this, we construct a general amplifier for irreducible representations of a finite non-abelian group —which is to the best of our knowledge the first instance of such a construction, and which might find applications to other problems. We set
where denotes the Frobenius norm of a map (which is the norm of its singular values), and is a well-chosen subset of . It is most convenient to pick to be a normal subgroup of ; note that when , this is only possible if is composite. We will in fact pick
for a suitable divisor of . The result of this amplification argument for the sixth moment of singular values is a bound of the shape (see Proposition˜5.1)
| (2.1) |
To go any further, we need to know the typical size of the character on , based on the information that . This is a somewhat challenging computation involving Clifford theory, and depends on the factorizations of and ; see Lemmas˜5.4 and 5.2.
Let us now focus on the case when is the square of a prime and ; we naturally pick . It turns out that typically has size on , and roughly at . To beat ˜1.2, it essentially remains to bound
| (2.2) |
2.4. Final steps: Combinatorics
The estimates in ˜2.2 amount to counting the number of solutions to the system of congruences
with . For the generic solutions, one can expect each congruence to cut down the total number of solutions by the size of the modulus—but one must also account for certain diagonal solutions where some . A careful but elementary analysis (which becomes more involved when the modulus is arbitrary) shows that these congruences have solutions modulo and solutions modulo ; see Proposition˜6.4. Both of these counts are sharp, and save a factor of over the bounds required in ˜2.2. This saving is ultimately raised to the power in ˜2.1 (since we considered a sixth moment of singular values), and putting everything together yields
as in Example˜1.3. We note that for the other case from Example˜1.3, one can simplify the amplification argument by noting that all irreducible characters of are tensor products of irreducible characters of and , but the end result is the same.
2.5. Comments on prime moduli
When is a prime and , the amplifier from Section˜2.3 reduces to the ‘trivial’ choice
since . In this setting, ˜2.1 (with replaced by another even integer ) reads
This was observed, using a somewhat different language, by Shkredov [37, proofs of Lemmas 22 and 53]. Shkredov then relied on an -flattening lemma [37, Theorem 50], which stems from a result of Helfgott [19], to bound the right-hand side above by for a quite large value of depending on . This leads to a bound for bilinear (Type II) sums of Kloosterman sums with prime moduli [38, (6) of Theorem 4], with a power saving of , where .
To obtain a more quantitatively-relevant power saving, competitive with [24, 25], one must use a smaller value of ; one would then need to solve a counting problem with few variables , as in Section˜2.4. We do not know how to do this, but such an approach could produce good results when, e.g., . In particular, assuming ˜6.2 for , one could prove a non-trivial bound for bilinear sums of Kloosterman sums with prime moduli and sequences of lengths ; interestingly, the same limit at appears in the results of Kowalski–Michel–Sawin [25], so our work reaffirms the difficulty of this barrier.
Alternatively, to obtain non-trivial results at prime moduli, it might be possible to use a different choice of subset in the construction of the amplifier from Section˜2.4. Indeed, although a normal subgroup is the most natural choice for , it is possible that another conjugation-invariant subset might produce a useful amplifier when normal subgroups are not available.
3. Preliminaries
3.1. Analytic and arithmetic notation
We use the standard asymptotic notation from analytic number theory, indicating dependencies of implicit constants on a parameter through subscripts. In particular, and both mean for some constant depending only on ; means , means ; means as ; is equivalent to the statement that for all . With this notation, the divisor bound reads .
We use the notation for both indicator functions of sets and truth values ( or ) of statements ; we also abbreviate for singletons. We write for the range , for the norm of a sequence for some (or ), and for . Given a positive integer , we let (resp., ) be the sets of integers (resp., positive integers) divisible by , and be the inverse of modulo (here may be implied from context, e.g., in an exponential phase ). We use and be the Möbius and Euler totient functions. Given , we write for their greatest common divisor (and similarly for more positive integers), and for the greatest divisor of whose prime factors all divide . We write when a prime power exactly divides a positive integer, meaning that but . We will reserve the letter for functions on , and for functions on . We denote the Fourier transform of an function by
| (3.1) |
In particular, if for some and , then a change of variables yields
| (3.2) |
and the Poisson summation identity reads
| (3.3) |
Given a map between finite-dimensional complex Hilbert spaces, we write its operator norm as
| (3.4) |
On the Hilbert space equipped with the Euclidean norm, we define as the norm of singular values of a map (or a matrix) , for . In particular, we have
| (3.5) |
where denotes the adjoint (conjugate transpose) of . We quickly record the following simple fact about projections and operator norms.
Lemma 3.1.
Let be a finite-dimensional complex Hilbert space, be a subspace, and be the orthogonal projection onto . Suppose that is an invariant subspace of a linear map (i.e., the restriction is well-defined). Then .
Proof.
By definition, we have and . Since with for all with , we have . On the other hand, for each with , we have , so . ∎
3.2. Bounds for Kloosterman sums
We now recall the Ramanujan and Weil bounds for Kloosterman sums, as well as some results of Kowalski–Michel–Sawin [24] and Blomer–Milićević [3].
Lemma 3.2 (Ramanujan bound).
For and , one has
Proof.
This is a classical result which follows from Möbius inversion. ∎
Lemma 3.3 (Weil bound).
For and , one has
Proof.
This is [21, Corollary 11.12] followed by the divisor bound. ∎
For the sake of completeness, we give a quick proof of the trivial bound from ˜1.2.
Proof of ˜1.2.
The second bound implicit in ˜1.2, with a term of , follows immediately from Lemma˜3.3 and Cauchy–Schwarz. For the first bound implicit in ˜1.2, we eliminate the constraint by Möbius inversion and use the identity to write
Now apply Cauchy–Schwarz in the sum over , and complete the sum over to get
Expanding the square and the Kloosterman sums, then performing the sum over , one reaches
Finally, complete the sum over , expand the square, and perform the sum over to obtain
Putting these bounds together completes our proof. ∎
Theorem 3.4 (Kowalski–Michel–Sawin [24]).
Let be a prime and be such that and . Then for any complex sequences , and any , one has
Proof.
This is [24, Theorem 1.1] with and swapped. ∎
Remark.
The constraint from Theorem˜3.4 can be removed in light of the trivial bound ˜1.2. Indeed, if , then
so the bound ˜1.2 is better. Similarly, if , then
Theorem 3.5 (Blomer–Milićević [3]).
Let such that and is odd. Then for any complex sequences and such that for all , and any , one has
Proof.
Dyadically summing instances of [3, Theorem 5] with in loc. cit. replaced by , one obtains the bound333[3, Theorem 5] does not include an -scalar inside the Kloosterman sum, but it holds in this slightly more general form with the same proof, and it is in fact applied this way in [3, p. 471, after (4.2)].
The desired bound now follows from Cauchy–Schwarz in the shape
(Since can be chosen to attain equality in this Cauchy–Schwarz step, Theorem˜3.5 is in fact a restatement of [3, Theorem 5].) ∎
3.3. Fourier analysis on finite groups
Here we recall some general facts and notation from representation theory on finite groups; we point the reader to [35, 16, 42, 20] for more background. Let be a finite group with identity element . A (unitary) representation of is a homomorphism
where is a finite-dimensional complex Hilbert space and is the set of unitary transformations of . In particular, is the identity transformation on . We write444Given a choice of orthonormal basis of , one may of course represent the transformations for as matrices in .
for the dimension of . We say that two representations , are isomorphic iff there is an invertible linear map such that for all (since we normalize all representations to be unitary, the map can also be taken unitary).
Example 3.6.
We write for the zero representation given by , and for the trivial representation given by . Any action of on a finite set induces a permutation representation , defined by for , . The regular representation is the permutation representation induced by the action by left-multiplication on , so .
Given two representations and , we write and for their direct sum and product. The operations and have identity elements and respectively (up to isomorphism). Given , we write
for all nonnegative integers ; when , we interpret this as the zero representation . We use a similar notation for repeated direct sums of linear maps.
An invariant subspace of a representation is a subspace of such that for all . For such , we define by for all , which is automatically unitary, and we say that is a subrepresentation of . One can decompose ; conversely, if then and are isomorphic to subrepresentations of .
We say that a representation of is irreducible iff it is nonzero and has no nonzero subrepresentations other than itself. We write for a complete set of irreducible representations of up to isomorphism, which always includes the trivial representation . Any representation of has a unique decomposition (up to permutation and isomorphism) into irreducible representations,
| (3.6) |
where is called the multiplicity of inside . In particular, .
Given two finite groups and representations and , we write for the representation of given by
The irreducible representations of are (up to isomorphism) precisely those of the form where and [35, §3.2].
Notation 3.7.
If are as above, and is a group isomorphic to by a fixed implicit map (such as ˜3.16), we also use the notation to describe representations of .
A character is any function of the form , where is a representation of ; note that characters are constant on conjugacy classes, that and , and that isomorphic representations induce the same character. If are two representations of with characters , then and . If are representations of with characters (respectively), then . We say that is irreducible iff is, and write for the set of all irreducible characters of . The character table of satisfies the following orthogonality relations.
Lemma 3.8 (Character orthogonality).
One has
| (3.7) | ||||
| (3.8) |
Proof.
See, e.g., [16, Theorem 2.12 and Exercise 2.21]. ∎
It follows from ˜3.7 and 3.6 that for an arbitrary character of , one has
| (3.9) |
We may also restrict a representation and its character to a subgroup , to obtain a representation of with character . If is irreducible, is not necessarily irreducible. When is a normal subgroup, the structure of can be better understood using Clifford theory [6].
Lemma 3.9 (Clifford).
Let be a group, be a normal subgroup, and be an irreducible representation. Then there exist positive integers with , and non-isomorphic irreducible representations of dimension , all lying in the same orbit of the action of by conjugation (i.e., for and ), such that
Proof.
See, e.g., [20, Theorem 6.5]. ∎
Given a function and a (not necessarily irreducible) representation , we define the Fourier coefficient by
| (3.10) |
This obeys , where denotes the convolution of two functions . In particular, if , the irreducible representations (which are all -dimensional since is abelian) are of the shape for . In this case, we write
| (3.11) |
Lemma 3.10.
Let , be a representation, and . Then one has
Proof.
By ˜3.6, there exists a unitary map (from to the direct sum of ’s irreducible invariant subspaces) such that for any ,
using some implicit ordering of . But then, by ˜3.10, we have
and the conclusion follows from the fact that the multiset of singular values of a direct sum of matrices is the union of the multisets of singular values of those matrices. ∎
3.4. Facts about
Let . Recall the special linear groups and of matrices in (resp., ) with determinant , and the projective special linear groups,
| (3.12) |
When the group , , or is understood from context, we write
| (3.13) |
which satisfy the relations , and in the case of or , . Note that and generate .
Notation 3.11 (Projective line).
For , we recall the projective line
where is the equivalence relation generated by for . We write the equivalence class of as , and we will typically use the letters to denote projective points in , reserving for elements of . For , we write the natural map which reduces both entries modulo as .
The group (and, through it, ) acts on by
| (3.14) |
One can think of as with a few additional points, which must be included to obtain a well-defined action. Indeed, any projective point can be written as for some and with , and thus . In particular, one can embed by , and via this embedding, the generators from ˜3.13 act on elements of by
We now briefly go over a few facts about the subgroups and representations of .
Notation 3.12 (Reduction mod ).
Given a positive integer with , we denote by
the natural epimorphism which ‘reads’ the entries of modulo . We write
for the congruence subgroup given by the kernel of this map (consisting of matrices of the form , where one may view the entries of as elements of ).
Lemma 3.13.
acts transitively on (i.e., there is only one orbit). In fact, for , there is a bijection between and orbits of under ,
| (3.15) |
Proof.
For any , there exist by definition with , so . Thus the action of on is transitive.
The map in ˜3.15 is well-defined since for any . It is surjective since the original map is surjective. To show that ˜3.15 is also injective, suppose for some , and we aim to show that . By the transitivity of the action of , we can find such that , so
Write for some . Since , we have , so there exist with , and in particular . Then,
where the last equality is due to the normality of . Hence , as we wanted. ∎
By the Chinese remainder theorem (), combining the maps for produces isomorphisms
| (3.16) |
for (in the products above, it is understood that only primes which divide are included, so , but we allow ). Since , it follows that
| (3.17) |
and that the irreducible representations of can be parametrized as
| (3.18) |
Now let be a prime and , and let us focus on understanding .
Definition 3.14 (Primitive representations).
A representation is called primitive iff its kernel does not contain . Equivalently (by the first isomorphism theorem), cannot be factored as for some representation of . A primitive (resp., non-primitive) character is one induced by a primitive (resp., non-primitive) representation.
Thus the primitive irreducible representations of are ‘new’ at level , much like primitive Dirichlet characters or newforms in the theory of automorphic representations. We can easily isolate the ‘maximal’ non-primitive component of a representation using the following lemma.
Lemma 3.15.
Let be a representation and
Then is non-primitive, and is isomorphic to a direct sum of primitive irreducible representations.
Proof.
The fact that and thus are an invariant subspaces of follows quickly from the fact that , so and are well-defined. By definition, for all , so the kernel of includes , i.e., is non-primitive.
Now let be any irreducible subrepresentation of , where . Since and , we can find some , and thus some such that . But then , so the kernel of does not contain , i.e., is primitive. ∎
The primitive irreducible representations of are fairly complicated, but the following lemma will suffice for our purposes. This generalizes the classical spectral-gap result for .
Lemma 3.16.
Any primitive irreducible representation of has .
Proof.
Complete tables with the dimensions of irreducible representations of , including the case , were given by Nobs–Wolfart [31, p. 525] (who refer to primitive representations in the sense of our Lemma˜3.16 as having ‘level ’). For odd primes , these had been classified by Shalika [36, §4.3], Tanaka [41], and Kutzko [26].
For a more direct proof of the lower bound via Clifford theory, we refer to Bourgain–Gamburd’s [5, Lemma 7.1] (this assumes is odd, but an analogous argument applies if ). To summarize their argument when is even, Bourgain–Gamburd apply (a variant of) Lemma˜3.9 with and , to decompose into irreducible representations , all lying in the same orbit under -conjugation. But is abelian, so , and correspond to -conjugate elements . Moreover, the primitivity condition that does not contain implies that . It follows that
where is the centralizer of in . It thus remains to bound for , which follows from an explicit matrix computation [5, Claim 7.1] (minor modifications are needed here if , but these only incur a constant-factor loss). ∎
4. Representations and Kloosterman matrices
Here we connect matrices of Kloosterman sums modulo to Fourier analysis on .
4.1. The relevant representations
When digesting the notation below, the reader should keep in mind the informal outline from Section˜2.2. We will first define the simpler representations which are connected to matrices of Kloosterman sums , and then the more relevant subrepresentations which correspond to adding the restriction . In fact, the subspace will be constructed by sifting out ‘old’ subspaces isomorphic to for .
Definition 4.1 (Permutation representations of the projective action).
For , we denote the permutation representation of induced by the action ˜3.14 on by
| (4.1) |
and its character by . Hence is the space of functions , equipped with the standard inner product, and for , . In particular, for any , one has .
Definition 4.2 (Invariant subspaces).
For with , define
In particular, . Thus is the space of complex-valued functions on which are constant on orbits of , so by Lemma˜3.13,
| (4.2) |
Lemma 4.3.
For with , is an invariant subspace of . In fact, using Notation˜3.12, we have
Proof.
The fact that is an invariant subspace follows immediately from the normality of . Now let be the invertible linear map from ˜4.2, which relies on the bijection from ˜3.15. Then for any , one can easily check that : both maps take the basis vector to the -normalized function in which is only nonzero on the orbit . ∎
In light of Lemma˜4.3, we will need to remove the contribution of ‘old’ representations to . To this end, it will be helpful to adopt the following convention for tensor products.
Notation 4.4 (Ordered tensor products).
If are such that and , then the Chinese Remainder Theorem gives by , so . Since tensor products of vector spaces are defined up to isomorphism, it is not a great abuse of notation to write
In particular, given and , we view as a function on with values . This notation extends to tensor products of subspaces , (so ), and of linear transformations , .
We note that with the conventions from Notations˜3.7 and 4.4, for with , the product is precisely the permutation representation (this is a genuine equality, not just an isomorphism). Moreover, a tensor product of invariant subspaces of and gives an invariant subspace of , and in fact for , , and . Iterating this yields the factorizations
| (4.3) |
and, more generally, for ,
| (4.4) |
Finally, we can define the representations .
Definition 4.5 (Sifted representations).
For a prime power , we let be the orthogonal complement of inside (which is an invariant subspace of ). For , we define
Proposition 4.6 (Decomposition of sifted representations).
For any , one has
| (4.5) |
Moreover, each is isomorphic to a nonempty direct sum of primitive irreducible representations, and is isomorphic to a direct sum of irreducible representations of dimensions .
Proof.
The factorization in ˜4.5 follows immediately from ˜4.3 and 4.5. The fact that is isomorphic to a direct sum of primitive irreducible representations is precisely the content of Lemma˜3.15, wherein and . One can easily construct a function on which is not constant on orbits of , so , and thus .
Now write each as a direct sum of primitive irreducible representations of (up to isomorphism), and expand the tensor product in ˜4.5. This expresses as a direct sum of representations (potentially with repetitions) of the shape
which are irreducible, and have dimensions by Lemma˜3.16 and the divisor bound. Since , the number of these representations is at most . ∎
We now briefly analyze the orthogonal projections onto invariant subspaces of . It will turn out that the projection onto can be obtained by a Möbius-inversion-type process.
Definition 4.7 (Special projections).
For with , we let be the orthogonal projections onto , respectively . In particular, is the identity map on .
Lemma 4.8.
For , one has
| (4.6) |
In particular, commutes with for any . The matrix representation of this map with respect to the standard basis of has entries
| (4.7) |
The proof of Lemma˜4.8 is left to Appendix˜A.
Lemma 4.9.
For , one has
Proof.
The factorization as a tensor product follows immediately from Definitions˜4.5 and 4.7. Now for a prime power , recall that is the identity map on and is the orthogonal projection onto , so the orthogonal projection onto can be written as
It follows from this and ˜4.6 that
as claimed. ∎
4.2. The Kloosterman matrix
Here we finally relate the abstract discussion in the preceding subsections to the classical Kloosterman sums.
Proposition 4.10 (From Kloosterman matrices to Fourier coefficients).
Let , , and be the complex matrix with entries
| (4.8) |
Consider the function given by
| (4.9) |
where and are as in ˜3.13. Then one has the inequality of operator norms
Remark.
Proof of Proposition˜4.10.
Let be the unitary matrix with entries . By expanding the Kloosterman sums, we have
for any . We then expand the indicator by Möbius inversion and Fourier analysis,
and evaluate the sums over to obtain
where we substituted , . Switching divisors , swapping sums, and evaluating the sum over (which gives either or solutions), we reach
| (4.10) |
Let us keep this in mind. Separately, by ˜4.9 and 4.5, we have
and thus by Lemma˜3.1,
| (4.11) |
where is the map
By Lemma˜4.9 and the commutativity claim in Lemma˜4.8, we can further write
By Definitions˜4.1 and 4.7, we can represent this map as a matrix in with entries
| (4.12) |
for ; compare this to ˜4.10. We will show that restricting the matrix to those rows and columns indexed by (by the canonical embedding ) yields precisely the matrix . Indeed, using the notation above, if , then , , and we have if and only if the equation
has solutions in and . On the one hand, the existence of such solutions implies that , so . On the other hand, if , then one can take and to obtain a solution. It follows that
and then by comparing ˜4.10 and 4.12, we find that
Since removing some rows and columns of a matrix can only decrease its spectral norm, we conclude that
which, together with ˜4.11, completes our proof. ∎
Corollary 4.11.
Let with , , and be intervals of lengths , . Let be the matrix indexed by and , with entries
| (4.13) |
Let and , . Then there exist complex weights such that for the function given by
| (4.14) |
one has
| (4.15) |
Remark.
Given ˜4.14, one can apply the triangle inequality for the operator norm to obtain
| (4.16) | ||||
since (as the norm of a unitary map). Plugging this into ˜4.15 recovers the trivial bound from ˜1.2. Our task in the later sections will therefore be to establish some power-saving spectral cancellation in the sum over from ˜4.16.
Proof of Corollary˜4.11.
Let us write , , and , for some . Since , we may identify , with their images in . Let be a smooth function supported in , such that and for , and define by
| (4.17) |
Since , we have and (viewing these as functions on ). But scaling a row or a column of a matrix by a constant in can only decrease its spectral norm, so with the notation from ˜4.8 we get
So from Proposition˜4.10, it follows that
| (4.18) |
and it remains to compute . For , we obtain from ˜4.17, ˜3.11, ˜3.3, and ˜3.2 that
and similarly for (with in place of ). Plugging this into ˜4.9, we conclude that
Using the Schwarz decay of , we can discard the contribution of the terms with or to , up to an error of . Choosing
5. The amplification argument
Recall that Propositions˜4.10 and 4.11 reduce the problem of bounding bilinear forms with Kloosterman sums to that of bounding Fourier coefficients of certain functions on at a certain representation. One can then reduce to irreducible subrepresentations via Lemma˜3.10. To use Fourier analysis, we will pass to a sum over all irreducible representations of —but to avoid a critical loss, we insert an amplifier weight in this sum, as outlined in Section˜2.3. Making this work in a non-abelian setting is the key step in our argument.
5.1. Introducing the amplifier
We state the results in this subsection in fair generality, since they may be useful in other contexts. We recall the notation for Schatten norms from Section˜3.1.
Proposition 5.1 (Non-abelian amplification).
Let be a finite group, , , , , and be an even positive integer. Then one has
Remark.
In comparison, expanding by ˜3.10 and 3.5 yields
The upper bound from Proposition˜5.1 replaces with a function proportional to , normalized so that equality is attained when . The requirement that be irreducible is crucial.
We first prove Proposition˜5.1 using character orthogonality, which resembles more classical amplification arguments, and then give a more conceptual sketch of proof via induced representations. The first proof has the advantage that it may be generalized by considering other amplifiers.
Proof of Proposition˜5.1 via character orthogonality.
Write and for the function , so that . By considering the function
where there are factors in the convolution, so that , we see that it suffices to prove the desired result when .
Let be the indicator function of , and consider the amplifier given at with by
where we implicitly used that is a group. In particular, is a positive integer multiple of , due to ˜3.9. It follows from this and nonnegativity that
Expanding via ˜3.10 and 3.5 and then using ˜3.8, the sum over above becomes
where denotes the conjugacy class of in . Since can be partitioned into such conjugacy classes by normality, and since characters are constant on conjugacy classes, it follows that
This settles the case , thus completing our proof. ∎
Remark.
After reducing to the case , one could also attempt to use the triangle inequality for Schatten norms and Cauchy–Schwarz, in the shape
This argument does not use that is normal or that is irreducible, but it produces a weaker bound in general. Indeed, compared to Proposition˜5.1, the bound above loses a factor of
which is a (potentially large) positive integer by ˜3.9; note that although is irreducible on , it is usually not irreducible on (e.g., when is the trivial subgroup, the factor lost is ). This is a discrepancy which does not arise in the abelian setting, when all irreducible characters are -dimensional—which is why classical amplification arguments with Dirichlet characters can often be re-formulated by applying Cauchy–Schwarz to a sum over residues (see, e.g., [28]).
Proof sketch of Proposition˜5.1 using induced representations.
Starting from , consider the restricted representation , and then the induced representation
This acts by translation on the space
so . It can be shown using ˜3.9 that is a subrepresentation of , with
and therefore, by Lemma˜3.10,
To complete the proof, one can expand via ˜3.10 and 3.5, and use the Frobenius formula
for all , where the last equality uses the normality of . ∎
Remark.
In the particular case when is the trivial subgroup, the induced representation in the proof above is (isomorphic to) the regular representation . In this case, the conclusion of Proposition˜5.1 simply reads
5.2. Passing to a counting problem
We now return to the setting when and for some . The goal is to pass from the spectral norm in ˜4.15 to a count of solutions to a certain equation in . After applying Proposition˜5.1, we will need an upper bound for , and a lower bound for the denominator . For the specific characters and from Section˜4, such bounds are given in the following results.
Lemma 5.2.
Let , , and be the largest divisor of such that
Let be the largest positive integer such that . Then one has
| (5.1) |
Remark.
The bound in ˜5.1 is sharp in terms of and , as can be seen by taking . In particular, if is a prime power, then has fixed points of the shape , where .
The proof of Lemma˜5.2 is left to Appendix˜A.
Proposition 5.3 (Lower bound for squared character sums).
Let and be an irreducible character inside . Let be such that , , and . Then one has
Remark.
The lower bound in Proposition˜5.3 wins a factor of about over the ‘trivial’ bound of due to ˜3.9. This is because typically has size when (note that when , one has and ).
The proof of Proposition˜5.3 reduces to a local computation (i.e., for a prime power) given below, which builds on Lemma˜3.16. Indeed, one can rephrase Lemma˜3.16 as follows: if is primitive, then . Using a bit of Clifford theory, we can generalize this to averages of over the congruence subgroups , for some values of .
Lemma 5.4.
Let be a prime power and be such that either or . Let be a primitive irreducible character of . Then
Remark.
We expect the bound in Lemma˜5.4 to be sharp, and to actually hold for all . This might follow from a more careful study of for , and it would imply Proposition˜5.3 (as well as Theorem˜1.2) for more flexible factorizations .
The proof of Lemma˜5.4 is also left to Appendix˜A, a key ingredient being the fact that is abelian if .
Proof of Proposition˜5.3.
By (the proof of) Proposition˜4.6, is isomorphic to a direct sum of representations of the shape
| (5.2) |
where each is a primitive irreducible representation of . This gives the decomposition of into irreducible representations, so any irreducible representation occurring inside must be of the form in ˜5.2. Now let and for such a representation. From ˜3.16 and fact that for , it follows that
One can then apply Lemma˜5.4 to obtain
Note that the hypothesis on from Lemma˜5.4 is satisfied because whenever is a prime dividing with and , one of the following holds:
-
(1)
One has and , so , or
-
(2)
One has and , so .
The desired conclusion then follows from the divisor bound. ∎
Finally, we can state the result of our amplification argument.
Proposition 5.5 (From Fourier coefficients to a counting problem).
Let , , , and be as in ˜4.14. Let be such that , , and . Then for any even positive integer , one has
where both variables in the maxima are understood to be integers (note that ).
Proof.
By Proposition˜4.6, is a sum of irreducible representations . By Lemma˜3.10, it suffices to prove the desired upper bound for each . The loss factor of is acceptable here (but if one is only interested in the spectral norm, so , there is actually no loss at this step).
For each irreducible representation with , we apply Propositions˜5.1 and 5.3 with to obtain
where . In fact, by Proposition˜5.1, the sum above is nonnegative if one replaces with any irreducible character of . Summing over all such characters with weight (which is at least when ), we find that
Here is the original permutation representation from Definition˜4.1. In light of Lemma˜5.2, we ought to split the sum above based on the largest such that for some with ; note that by ˜3.12, this is equivalent to the equation in . Then from the triangle inequality, Lemma˜5.2, and the divisor bound for , we find that
| (5.3) |
Now recalling ˜4.14, we can expand
The conclusion follows by plugging this into ˜5.3, and noting that as vary in , the difference varies in , each value being attained times. ∎
6. Counting solutions in
We now develop the final ingredient towards Theorem˜1.2, as outlined in Section˜2.4. Given with even, , and , we will count solutions in to the system
| (6.1) |
In particular, the ranges of alone produce the trivial bound for the number of solutions. Focusing on the case , below are some classes of solutions to ˜6.1:
- .
-
.
Integer solutions (2). If , ˜6.1 becomes
This has solutions, which supersedes the diagonal contribution from .
-
.
Generic terms. The product can take values in . If each matrix is attained roughly the same number of times (and such equidistribution ought to happen for large enough ), this gives an expected number of solutions.
These heuristics can be formalized to produce a lower bound, imposing a limitation on our methods.
Lemma 6.1.
Let with even, , and . The number of solutions to ˜6.1 is at least
| (6.2) |
Proof.
The lower bound by the first term in ˜6.2 follows by considering the aforementioned integer solutions with and . To obtain a lower bound by the second term in ˜6.2, we let , write for or depending on whether is odd or even, and apply Cauchy–Schwarz to obtain
One can rewrite the last equation in as
Comparing this with ˜6.1, recalling that , and noting that takes each value in at most times (and similarly for ), we conclude that the desired count of solutions is at least
Since , this completes our proof. ∎
Optimistically, we may expect the lower bound in Lemma˜6.1 to be essentially sharp.
Conjecture 6.2.
Let with even. For all and , the number of solutions to ˜6.1 is at most
| (6.3) |
Remark.
When is large enough in terms of (so in particular, the first term in ˜6.3 dominates), ˜6.2 becomes a statement about , which follows from a somewhat tedious combinatorial computation. When is large enough in terms of (so in particular, the second term in ˜6.3 dominates), ˜6.2 can be attacked using expansion methods; see [37, Lemma 53 and Theorem 50] for the case when is prime. However, note that ˜6.3 saves at best over the trivial bound of , and this saving becomes in our final bounds; therefore, using a large value of ultimately produces a quantitatively-weak power saving. On the other hand, if is too small, then combining ˜6.2 with Proposition˜5.5 gives information about a small moment of singular values, which produces a weak bound for the top singular value.
Lemma 6.3.
˜6.2 holds if or .
Proof.
When , the equation in ˜6.1 reads , which implies the entry-wise congruence
for some with . This actually forces , and . Since , we obtain choices of , which matches the bound from ˜6.3.
When , the equation in ˜6.1 reads , which translates to
for some with . Since there are such values555This follows from a local computation via Hensel’s lemma, and the divisor bound. of , we may fix up to an acceptable loss. To establish the desired bound of for the number of solutions with and , we split into three cases.
Case 1: . This forces , and each choice of induces a unique residue of . Since , this gives a total of solutions.
Case 2: . This forces , and each choice of induces a unique residue of . Since , this gives a total of solutions, and recall .
Case 3: . Then the congruence fixes the residue of , leaving possible values of , each of which gives choices of by the divisor bound. This gives a total of solutions. ∎
Proposition 6.4 (Combinatorial count for ).
Let , , and . The number of solutions to ˜6.1 with is at most
| (6.4) |
Remark.
˜6.2 would replace the second term in ˜6.4 with . In particular, Proposition˜6.4 establishes ˜6.2 when and either or .
Proof of Proposition˜6.4.
To simplify the exposition, we focus on the case . The proof is almost completely unchanged when are arbitrary. We may then write the equation in ˜6.1 (with ) as
A short computation brings this to the entry-wise congruence
for some with . As before, since there are possible values of , we may as well regard as fixed. Since both sides have determinant , this is actually a system of three congruences:
| (6.5) |
Our argument now requires some casework.
Case 1: One has for some . Since the original equation can also be written as in (viewing indices modulo ), we may assume without loss of generality that , up to potentially swapping and in the final bound (so we momentarily forget that ). So let us say , which reduces ˜6.5 to
| (6.6) |
Subcase 1.1: One has . Then for any values of and , the system in ˜6.6 leaves only possibilities for (since ). This gives solutions.
Subcase 1.2: One has . Then for any values of and , the system in ˜6.6 leaves only possibilities for . This gives solutions.
Subcase 1.3: One has . Then the first congruence in ˜6.6 fixes , leaving possibilities for the nonzero integer , each of which gives possible values for by the divisor bound. Once and are fixed, each value of produces final solutions. This gives solutions.
From Case 1, we obtain a total number of solutions of
which is once we remember that and . This is acceptable in ˜6.4.
Case 2: One has for all . We fix up to an acceptable loss. Since , there are ways to pick the nonzero integer , each of which gives ways to pick by the divisor bound.
Once are fixed, we pick subject to the system
| (6.7) |
which follows from ˜6.5. We can do this in either of the following ways:
Finally, once are fixed, ˜6.5 determines the residues of and modulo , so there are choices of . From Case 2, we obtain a total number of solutions of
| (6.8) | |||
To bound the sum over , we write
and the term can be omitted since . Plugging this into ˜6.8 gives a total count of
Since , the final term of can be omitted. The bound above now becomes
After expanding the expression inside the maximum, each term is either strictly increasing, constant, or strictly decreasing in , so each term is maximized when or . It follows that we can bound the maximum over , up to a constant, by looking only at the extreme points and . This gives a total count of
where, to reach the last line, we used that . This establishes the desired bound. ∎
Finally, we may remove the restriction that from the counting results in this section up to some additional factors.
Corollary 6.5.
Proof.
Since the equation in ˜6.1 only depends on the residues of , we may as well count solutions to the system
| (6.11) |
and multiply the final count by a factor of (indeed, each solution to ˜6.11 induces at most this many solutions to ˜6.1 with the same residues modulo , and all solutions to ˜6.1 can be obtained this way).
If , Lemma˜6.3 applied for and gives a total number of solutions of
and the bound in ˜6.9 follows by noting that .
Similarly, if , Proposition˜6.4 applied for and gives a total count of
We note that the third and fourth terms in the first parenthesis above can be ignored: their contribution to the final bound is , which is superseded by the contribution of from the third term in the second parenthesis. This gives a total count of
| (6.12) |
This bounded by ˜6.10, noting that is the geometric mean of and . ∎
7. Bilinear forms with Kloosterman sums
We now combine the work in Sections˜4, 5 and 6, to deduce our main results from Theorems˜1.2 and 1.1.
7.1. Composite moduli
Here we prove a generalization of Theorem˜1.2, which allows for larger values of . We state our upper bound in two ways, to facilitate comparison with ˜1.2.
Theorem 7.1.
Let for some with and , and be the largest integer with . Let be intervals of lengths , , with666The assumption that is only included to shorten the statement of Theorem 7.1; one can of course swap and in the bilinear sum, up to swapping and in the upper bound. . Then for any complex sequences , and , one has
Example 7.2.
Suppose . The smallest value of for which Theorem˜7.1 can beat the Weil bound in ˜1.2 is , attained, e.g., when where are distinct primes with . The largest value of for which Theorem˜7.1 can beat the Fourier-theoretic bound in ˜1.2 is , attained when has a divisor such that is square-free.
Remark.
Additional savings are possible in Theorem˜7.1 in the unbalanced range . Firstly, the bound in ˜6.10 can be refined to ˜6.12, but we omit this optimization for the sake of simplicity. Secondly and more substantially, bounding the largest singular value of an matrix by the sixth moment of its singular values (as we do) can be particularly lossy if , since then the singular values often exhibit concentration near their maximum; one can try to amend this by subtracting a suitable main term from the sixth moment, as in [18, Lemma 4.2].
Proof of Theorem˜7.1.
Let and , . Since , we have . Let be an even positive integer.
Then by the characterization of operator norms from ˜3.4, Corollary˜4.11 (using the notation from ˜4.13 and 4.14), the fact that for any linear map , and then Proposition˜5.5, we have that
| (7.1) | ||||
where
The inner sum is a count of solutions to an equation of the type ˜6.1 with replaced by , so we can estimate using Corollary˜6.5. We will make use of the following quick fact. Recalling that is the maximal positive integer with , we claim that for all integers , one has
| (7.2) |
Indeed, when one appends a prime to , the numerator can increase by at most , so the expression cannot increase if , and the maximum is attained when . On the other hand, if , then clearly , and the maximum is attained when .
Remark.
Using ˜6.9 with instead of ˜6.10 with in the proof above leads to a final bound of
which is weaker than Theorem˜7.1 in the main ranges of interest.
Proof of Theorem˜1.2.
Note that the result holds trivially if . Since and the result is symmetric in , we can assume without loss of generality that . One can then apply Theorem˜7.1, and since , the upper bound becomes
The first term can be omitted in light of the bound (since ). ∎
7.2. Near-prime moduli
Building towards an unconditional result for general moduli, we need to slightly develop the result of Kowalski–Michel–Sawin [24] from Theorem˜3.4.
Corollary 7.3 (Kowalski–Michel–Sawin bounds for near-primes).
Let where is a prime, , and . Let be integers such that . Then for any complex sequences and , and any , one has
Proof.
Using the twisted multiplicativity of Kloosterman sums and splitting the sums over according to their residues modulo , we can write
| (7.3) | ||||
where
First, we consider the contribution to ˜7.3 of those with (and ) or (and ). By the Weil and Ramanujan bounds from Lemmas˜3.3 and 3.2, this contribution is
Plugging this into ˜7.3 and applying the Weil bound from Lemma˜3.3, we find that
| (7.4) | |||
The last bilinear sum over is now almost in the correct shape to apply Theorem˜3.4. Indeed, splitting it according to the residues of and modulo , we can rewrite it as
| (7.5) |
where
and is defined similarly. Note that the last sum above contains at most one term, so
and similarly for . Applying Theorem˜3.4 (and the remark that follows it) for a bilinear sum with lengths and , and using the monotonicity of the bound from Theorem˜3.4 in and , we obtain
Plugging this into ˜7.5 and ˜7.4, and applying Cauchy–Schwarz to the sum over , we find that
Finally, recalling that and (which imply ), the last term can be omitted, and we arrive at the desired bound. ∎
7.3. General moduli
Finally, we prove a generalization of Theorem˜1.1.
Theorem 7.4.
Let , with . Let and . Then for any complex sequences , and any , one has
| (7.6) | ||||
Moreover, if for all (so ), then
| (7.7) | ||||
Remark.
The bound ˜7.7 actually holds without the assumption that . Indeed, Theorem˜7.4 covers the case . If , then applying the (first) bound from ˜1.2 for the sequences , given by and leads to the bound
so ˜7.7 still holds. If , then applying the (first) trivial bound from ˜1.2 for the sequences , given by and leads to the bound
so ˜7.7 still holds. An analogous argument covers the remaining case .
Proof of Theorem˜7.4.
We begin by noting a quick consequence of Theorem˜7.1. Suppose has a factorization with and such that . Then by combining Theorem˜7.1 (applied for , instead of ) with the bounds and then , we get
| (7.8) | ||||
Note that this bound is increasing with , and that the right-hand side of ˜7.6 supersedes . In particular,
A quick computation shows that since , one has
Therefore, the right-hand side of ˜7.7 supersedes . We now split into cases depending on the factorization of the modulus .
Case 1: is divisible by a maximal prime power . Then let us write , where is not necessarily a prime, but and .
Subcase 1.1: One has . Then we can apply Corollary˜7.3 (with replaced by ), which gives the bound
The first term here is superseded by the third term in ˜7.6 and the second term in ˜7.7, since and . The second term here appears directly in both ˜7.6 and 7.7.
Subcase 1.2: One has . Then we let , , and , which gives a valid decomposition to use in our Theorem˜7.1. Moreover, since , we have , so , and thus
From ˜7.8 we thus obtain an upper bound of , which is acceptable in both ˜7.6 and 7.7 since .
Case 2: All prime powers have .
Subcase 2.1: All prime powers have . Then we set , and construct by a greedy algorithm. Initially, we take . For each prime power , we append to the smaller of and . Note that throughout this process, and cannot differ by a factor larger than . In the end, if , we swap and . This produces a factorization with and
Then ˜7.8 gives a bound of , which is acceptable in both ˜7.6 and 7.7.
Subcase 2.2: The largest prime power dividing is some . Then we let , , and , and ˜7.8 gives an acceptable bound of once again.
Subcase 2.3: The largest prime power dividing is some . On the one hand, ˜7.8 gives a bound of , which is acceptable in ˜7.6; this completes the proof of ˜7.6.
Now assume (still within Subcase 2.3) that for all , and we aim to establish ˜7.7.
-
•
If , then writing , we can factorize with , , and . Here , since implies . But then ˜7.8 gives an acceptable bound of .
-
•
If , then we can use in Theorem˜3.5. Since , this gives the bound
Since and , the first and the last terms in the parenthesis above are superseded by the term from ˜7.7. The second term appears directly in ˜7.7.
This covers all cases. ∎
Proof of Theorem˜1.1.
As before, we can assume without loss of generality that since the result is trivial when . Then the result follows by applying Theorem˜7.4 with , and using the optimal choices in ˜7.6, respectively in ˜7.7. ∎
7.4. Averaging over moduli
Finally, let us prove a generalization of Corollary˜1.4.
Corollary 7.5.
Let for some with and , and be the largest integer with . Let and be intervals of lengths , , with . Let be complex sequences, and for each , let , be such that , for all . Then one has
Proof of Corollary˜7.5.
Throughout this proof, we will use the notation
for any . In particular, the assumption of the present Corollary˜7.5 takes . Note that and for any , and that when .
We can of course assume without loss of generality that , since otherwise the sum over is empty. For each with , we consider the sum
where the last equality follows from the identity . From the triangle inequality and the bound , we find that
| (7.9) |
where
We aim to apply Theorem˜7.1 (with ) to bound each sum , and this requires a suitable factorization of the modulus . There are two ways to construct this from the assumed factorization , which correspond to placing ‘most’ of the factor into or into .
Method 1. For each with , consider the factorization
which has and . We find that
so Theorem˜7.1 gives
Therefore,
After applying Cauchy–Schwarz to the last sum, it remains to bound the sums and , both of which are . In particular, for the second sum, we can write
From this and ˜7.9, we conclude that
Finally, the last sum is easily bounded by using Cauchy–Schwarz and the divisor bound. This establishes the bound from Corollary˜7.5 with the first term from the minimum.
Method 2. For each with , consider the factorization
which satisfies and . We find that
so Theorem˜7.1 gives
The second bound from Corollary˜7.5 now follows similarly as before from ˜7.9, since the sum over ‘washes out’ the factor . ∎
Proof of Corollary˜1.4.
This follows from Corollary˜7.5 analogously to how Theorem˜1.2 follows from Theorem˜7.1. ∎
8. Moments of twisted modular -functions
Here we prove Theorem˜1.5, by inserting our bounds for bilinear forms with Kloosterman sums into the proofs from [3]. We begin by restating ˜7.7 in a shape more similar to [3, Theorem 5].
Corollary 8.1.
Let with . Let , , , and be a sequence with for all . Then one has
Proof.
One can of course assume without loss of generality that , and extend the sum over to include all with . By duality, it suffices to establish the bound
for any sequence . But this is precisely the content of ˜7.7 with replaced by ; the remark after Theorem˜7.4 allows us to ignore the constraint . ∎
We can now prove an analogue of [3, Proposition 7]. We use the same normalization as in [3, (2.3)] for the Hecke eigenvalues of a holomorphic cuspidal newform for , so that
| (8.1) |
In particular, the Deligne bound [8] reads
| (8.2) |
Proposition 8.2.
Let , with , with , and let , be the Hecke eigenvalues of two (fixed) holomorphic cuspidal newforms for . Let be functions supported in with derivatives , and denote
| (8.3) |
Then for any , one has
| (8.4) |
Proof.
We closely follow the proof in [3, §4]. In particular, we decompose where is maximal with . The bound [3, (4.2)] reads
where is a transform of as in [3, (2.10)] (coming from an application of the Voronoi summation formula). Using the rapid decay of , we may truncate the sum over at
up to an acceptable loss. Note that the resulting sum over vanishes unless . From Corollary˜8.1, ˜8.2, and the divisor bound, we conclude that
Plugging in the definition of , we see that the expression inside the maximum is non-decreasing in and non-increasing in . Writing
we find that
which reduces to the desired bound. ∎
We can now prove the desired asymptotic for twisted moments of modular -functions.
Proof of Theorem˜1.5.
Let and . We closely follow the proof in [3, §3], making no changes to the main term analysis from [3, §3.1]. Treating the off-diagonal term as in [3, §3.2], it remains to establish the bound
| (8.5) |
for all and with , using the notation from ˜8.3. As in [3, §3.3], we can easily discount the contribution of the range using [3, (3.12)], so let us assume that . We will rely on the bounds
| (8.6) | ||||
| (8.7) |
from [3, (3.6) and (3.11)], as well as on our Proposition˜8.2 (instead of [3, Proposition 7]). First, the trivial bound ˜8.6 establishes ˜8.5 unless
| (8.8) |
so let us assume that we are in this range. We now split into cases depending on the size of .
9. Large sieve for exceptional cusp forms
Here we prove a generalization of Corollary˜1.6, which requires some background from the spectral theory of automorphic forms. We recall [11] that for , the congruence subgroup contains those matrices in with bottom-left entries divisible by . Each cusp of the the fundamental domain is equivalent to a fraction of the form , where , , , and ; in particular, the cusp at is equivalent to . To such a cusp, one can associate a scaling matrix with , and via these scaling matrices, functions on can be Fourier expanded around .
The discrete spectrum of the hyperbolic Laplacian is parametrized by Maass cusp forms: these are smooth functions which are eigenfunctions of , vanish at all cusps of , and are square-integrable with respect to the Petersson inner product. Following the normalization of Deshouillers–Iwaniec [11], we write the Fourier expansion of at around a cusp (with scaling matrix ) as
where is a Whittaker function as in [11, p. 264]. Altering the choice of scaling matrix results in multiplying the Fourier coefficients by an exponential phase , for some uniform .
The Kuznetsov trace formula [11, 27], as well as the large sieve inequalities that derive from it, involve an orthonormal basis of Maass cusp forms. The following notation will therefore be useful.
Notation 9.1.
Let , be a cusp of equivalent777The assumption that is equivalent to is true in most applications (note that this includes the cusp at ), and only made for convenience; one can prove similar results at arbitrary cusps with small adjustments. to for some with , and be any scaling matrix for . Consider an orthonormal basis of Maass cusp forms for , with:
-
.
Laplacian eigenvalues and spectral parameters ;
-
.
Fourier coefficients around the cusp , using the scaling matrix .
Proposition 9.2.
Assume Notation˜9.1, let , and let be a complex sequence. Let be a smooth function supported in , with and . Then there exists (depending only on , ) such that
| (9.1) | ||||
Proof.
This is [32, Corollary I], which follows from the Kuznetsov trace formula and the regular-spectrum large sieve inequalities of Deshouillers–Iwaniec [11, Theorem 2]. We have implicitly used [32, Lemma B] to write down the Kloosterman sums and -supports for cusps equivalent to for some (the latter condition is written as in loc. cit.). Note that we incur factors of and since we do not assume a special scaling matrix (as we may), but this will be irrelevant in our computations since the sequence is arbitrary. ∎
In the right-hand side of ˜9.1, the sum over is really supported on due to the -weight, and it vanishes if with a large enough implied constant. Deshouillers–Iwaniec used this simple observation to deduce the following result, which combines [11, Theorems 2 and 5].
Theorem 9.3 (Deshouillers–Iwaniec [11]).
Until now, if , Theorem˜9.3 has been the state-of-the-art exceptional-spectrum large sieve bound for general sequences and a single group ; the same is true if one averages over levels and allows the sequence to depend on .
We can now achieve an improvement of Theorem˜9.3 when has a factorization as in Theorem˜7.1, and similar results can be deduced for arbitrary levels using Theorem˜7.4. We require a coprimality constraint for technical reasons, but this is usually harmless in applications. The resulting power savings are relatively small, but serve as a proof of concept that Theorem˜9.3 is not a fundamental barrier.
Theorem 9.4 (Large sieve for composite levels).
Assume Notation˜9.1, let , and let be a complex sequence supported on . Suppose that with and , and let be the largest integer with . Then ˜9.2 holds for any positive
| (9.3) |
Proof of Theorem˜9.4.
We may assume without loss of generality that
| (9.4) |
since otherwise the result follows from Theorem˜9.3. We apply Proposition˜9.2 with a choice of supported on , then separate variables in the smooth weight via two-dimensional Fourier inversion, as in [32, Proof of Theorem 13], to arrive at
The sequences , in the supremum arise by incorporating exponential phases into , partly from the choice of the scaling matrix , and partly due to the separation of variables. The suprema are of course attained by some sequences , supported on , so we can apply Corollary˜7.5 with and , to obtain
We conclude by noting that
and
This covers the range in ˜9.4. ∎
Proof of Corollary˜1.6.
If has a divisor such that is square-free, then we can take and in Theorem˜9.4, so ˜9.2 holds for any positive
If additionally (as Corollary˜1.6 assumes), then we can take , since this is only larger by a factor of than the second minimum above. ∎
Appendix A Some necessary computations in
Here we fill in some details involving explicit matrix computations, subgroups, and characters of . In particular, we prove Lemmas˜4.8, 5.4 and 5.2.
Proof of Lemma˜4.8.
The first equality from ˜4.6 follows immediately from ˜4.4. Now let denote the map in the (extreme) right-hand side of ˜4.6; we will show that .
The fact that is a subgroup quickly implies that is self-adjoint and that , so is an orthogonal projection. Moreover, one has for any , so any has , which shows . Conversely, if , so for all , then clearly , which shows . Thus is the orthogonal projection onto , i.e., .
The claim about commutativity follows directly from ˜4.6 and the normality of .
Finally, let us prove ˜4.7. It follows from ˜4.6 and ˜3.17 that
| (A.1) |
where is the stabilizer of inside (indeed, once for some , all other solutions to satisfy , so ). By the normality of , we see that for any ,
so by the transitivity of the projective action,
Note that simply means for some unit . This forces , , and ; moreover, given such a choice of , any gives a solution. Since there are choices of in the kernel of , and choices of in , we find that
Proof of Lemma˜5.4.
Set and , so . Say where is primitive. By ˜3.9, we have
By Lemma˜3.9, contains irreducible representations of , each with multiplicity , for some positive integers . Thus
and in light of ˜3.17, it remains to show that
| (A.2) |
If , this is a trivial statement. Suppose now that . Then
is abelian, since for . In fact, this shows that
In particular, all irreducible representations of are -dimensional, and can be expressed as
| (A.3) |
Now equating dimensions and using Lemma˜3.16, we find that
| (A.4) |
We will finish by finding an upper bound on . By the conclusion of Lemma˜3.9, all non-isomorphic representations in the decomposition of lie in the same orbit of ’s action by conjugation. For and as in ˜A.3, we have
In other words, the action of by conjugation on irreducible representations of corresponds to conjugation of the underlying matrices . It follows that is at most the maximal size of an orbit in the set
under conjugation by , or equivalently by . Since conjugation preserves the determinant , we find that
Writing , we further get
where we substituted . Now given , write for some and . The equation then implies
for some with , and some , . There are choices of , and for every choice of , there are at most choices of (since is fixed). Putting these counts together, we obtain
Combining this with ˜A.4 establishes the desired bound from ˜A.2. ∎
Proof of Lemma˜5.2.
From ˜4.3, it follows that . Working locally at a prime , with say and , we will establish the bound
| (A.5) |
for all such that is the largest -power for which . Given ˜A.5, the desired bound in ˜5.1 follows from the divisor bound.
Since is a permutation map, equals the number of fixed points of in , i.e., the number of solutions in to . Let us write and for some integers with and . Scaling both entries of by a unit in , we can assume without loss of generality that or ; in fact, replacing if necessary, we may assume that . Then the equality means that for some , one has
This gives the quadratic congruence
Now from our assumption that for some with , we know that , , and (since ). In fact, is the largest -power with this property (otherwise, we could pick some such that ). Therefore, letting , and , we find that
| (A.6) |
where are not all divisible by . It now remains to show that this equation has
solutions in ; every such solution will have lifts to , inducing a total of fixed points of .
Case 1. . Then given any two solutions of ˜A.6, we can subtract the two equalities to obtain
| (A.7) |
Let . By the pigeonhole principle, we must have or . Since , either option uniquely determines in terms of . So there can be at most
solutions in .
Case 2. . Given the previous case, we can assume . Then , so from ˜A.7 we find that , forcing only one solution in .
Case 3. . Then ˜A.6 implies , and by substituting , we reduce to the case . ∎
References
- [1] Valentin Blomer, Étienne Fouvry, Emmanuel Kowalski, Philippe Michel, and Djordje Milićević. On moments of twisted -functions. Amer. J. Math., 139(3):707–768, 2017.
- [2] Valentin Blomer, Étienne Fouvry, Emmanuel Kowalski, Philippe Michel, Djordje Milićević, and Will Sawin. The second moment theory of families of -functions—the case of twisted Hecke -functions. Mem. Amer. Math. Soc., 282(1394):v+148, 2023.
- [3] Valentin Blomer and Djordje Milićević. The second moment of twisted modular -functions. Geom. Funct. Anal., 25(2):453–516, 2015.
- [4] Enrico Bombieri, John B. Friedlander, and Henryk Iwaniec. Primes in arithmetic progressions to large moduli. Acta Math., 156(3-4):203–251, 1986.
- [5] Jean Bourgain and Alex Gamburd. Expansion and random walks in . I. J. Eur. Math. Soc. (JEMS), 10(4):987–1011, 2008.
- [6] A. H. Clifford. Representations induced in an invariant subgroup. Ann. of Math. (2), 38(3):533–550, 1937.
- [7] Régis de La Bretèche and Sary Drappeau. Niveau de répartition des polynômes quadratiques et crible majorant pour les entiers friables. J. Eur. Math. Soc., 22(5):1577–1624, 2020.
- [8] Pierre Deligne. La conjecture de Weil. I. Inst. Hautes Études Sci. Publ. Math., (43):273–307, 1974.
- [9] J.-M. Deshouillers and H. Iwaniec. Power mean values of the Riemann zeta function. Mathematika, 29(2):202–212, 1982.
- [10] J.-M. Deshouillers and H. Iwaniec. Power mean-values for Dirichlet’s polynomials and the Riemann zeta-function. II. Acta Arith., 43(3):305–312, 1984.
- [11] Jean-Marc Deshouillers and Henryk Iwaniec. Kloosterman sums and Fourier coefficients of cusp forms. Invent. Math., 70(2):219–288, 1982.
- [12] Sary Drappeau, Kyle Pratt, and Maksym Radziwiłł. One-level density estimates for Dirichlet -functions with extended support. Algebra Number Theory, 17(4):805–830, 2023.
- [13] William Duke, John Friedlander, and Henryk Iwaniec. Bilinear forms with Kloosterman fractions. Invent. Math., 128(1):23–43, 1997.
- [14] Étienne Fouvry, Emmanuel Kowalski, and Philippe Michel. Algebraic trace functions over the primes. Duke Math. J., 163(9):1683–1736, 2014.
- [15] Étienne Fouvry, Emmanuel Kowalski, Philippe Michel, and Will Sawin. Lectures on applied -adic cohomology. In Analytic methods in arithmetic geometry, volume 740 of Contemp. Math., pages 113–195. Amer. Math. Soc., [Providence], RI, [2019] ©2019.
- [16] William Fulton and Joe Harris. Representation theory, volume 129 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1991. A first course, Readings in Mathematics.
- [17] Lasse Grimmelt and Jori Merikoski. On the greatest prime factor and uniform equidistribution of quadratic polynomials. Preprint, arXiv:2505.00493, 2025.
- [18] Larry Guth and James Maynard. New large value estimates for Dirichlet polynomials. Ann. of Math., to appear. Preprint, arXiv:2405.20552, 2024.
- [19] H. A. Helfgott. Growth and generation in . Ann. of Math. (2), 167(2):601–623, 2008.
- [20] I. Martin Isaacs. Character theory of finite groups. AMS Chelsea Publishing, Providence, RI, 2006. Corrected reprint of the 1976 original [Academic Press, New York; MR0460423].
- [21] Henryk Iwaniec and Emmanuel Kowalski. Analytic number theory, volume 53. American Mathematical Society, Providence, RI, 2021.
- [22] Kerr, Bryce and Shparlinski, Igor E. and Wu, Xiaosheng and Xi, Ping. Bounds on bilinear forms with Kloosterman sums. J. Lond. Math. Soc. (2), 108(2):578–621, 2023.
- [23] Henry H. Kim. Functoriality for the exterior square of and the symmetric fourth of . J. Amer. Math. Soc., 16(1):139–183, 2003. With Appendix 1 by Dinakar Ramakrishnan and Appendix 2 by Kim and Peter Sarnak.
- [24] Emmanuel Kowalski, Philippe Michel, and Will Sawin. Bilinear forms with Kloosterman sums and applications. Ann. of Math. (2), 186(2):413–500, 2017.
- [25] Emmanuel Kowalski, Philippe Michel, and Will Sawin. Stratification and averaging for exponential sums: bilinear forms with generalized Kloosterman sums. Ann. Sc. Norm. Super. Pisa Cl. Sci. (5), 21:1453–1530, 2020.
- [26] Philip C. Kutzko. The characters of the binary modular congruence group. Bull. Amer. Math. Soc., 79:702–704, 1973.
- [27] Nikolai V. Kuznetsov. The Petersson conjecture for cusp forms of weight zero and the Linnik conjecture. Sums of Kloosterman sums. Mat. Sb. (N.S.), 111(153)(3):334–383, 479, 1980.
- [28] James Maynard. Primes in Arithmetic Progressions to Large Moduli I: Fixed Residue Classes. Mem. Amer. Math. Soc., 306(1542), 2025.
- [29] Djordje Milićević, Xinhua Qin, and Xiaosheng Wu. Bilinear forms with Kloosterman sums and moments of twisted -functions. arXiv preprint, November 2025.
- [30] Nikolay G. Moshchevitin and Ilya D. Shkredov. On a modular form of Zaremba’s conjecture. Pacific J. Math., 309(1):195–211, 2020.
- [31] Alexandre Nobs and Jürgen Wolfart. Die irreduziblen Darstellungen der Gruppen , insbesondere . II. Comment. Math. Helv., 51(4):491–526, 1976.
- [32] Alexandru Pascadi. Large sieve inequalities for exceptional Maass forms and the greatest prime factor of . Forum Math. Pi, to appear. Preprint, arXiv:2404.04239, 2025.
- [33] Alexandru Pascadi. On the exponents of distribution of primes and smooth numbers. Preprint, arXiv:2505.00653, 2025.
- [34] Atle Selberg. On the estimation of Fourier coefficients of modular forms. In Proc. Sympos. Pure Math., Vol. VIII, pages 1–15. Amer. Math. Soc., Providence, RI, 1965.
- [35] Jean-Pierre Serre. Linear representations of finite groups, volume Vol. 42 of Graduate Texts in Mathematics. Springer-Verlag, New York-Heidelberg, french edition, 1977.
- [36] Joseph A. Shalika. Representation of the two by two unimodular group over local fields. In Contributions to automorphic forms, geometry, and number theory, pages 1–38. Johns Hopkins Univ. Press, Baltimore, MD, 2004.
- [37] I. D. Shkredov. On asymptotic formulae in some sum-product questions. Trans. Moscow Math. Soc., 79:231–281, 2018.
- [38] I. D. Shkredov. Modular hyperbolas and bilinear forms of Kloosterman sums. J. Number Theory, 220:182–211, 2021.
- [39] Igor E. Shparlinski. On sums of Kloosterman and Gauss sums. Trans. Amer. Math. Soc., 371(12):8679–8697, 2019.
- [40] Igor E. Shparlinski and Tianping Zhang. Cancellations amongst Kloosterman sums. Acta Arith., 176(3):201–210, 2016.
- [41] Shunichi Tanaka. Irreducible representations of the binary modular congruence groups . J. Math. Kyoto Univ., 7:123–132, 1967.
- [42] Audrey Terras. Fourier analysis on finite groups and applications, volume 43 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1999.
- [43] Berke Topacogullari. The shifted convolution of generalized divisor functions. Int. Math. Res. Not. IMRN, (24):7681–7724, 2018.
- [44] Jie Wu and Ping Xi. Arithmetic exponent pairs for algebraic trace functions and applications. Algebra Number Theory, 15(9):2123–2172, 2021. With an appendix by Will Sawin.
- [45] Xiaosheng Wu. The fourth moment of Dirichlet -functions at the central value. Math. Ann., 387(3-4):1199–1248, 2023.
- [46] Ping Xi. Ternary divisor functions in arithmetic progressions to smooth moduli. Mathematika, 64(3):701–729, 2018.
- [47] Matthew P. Young. The fourth moment of Dirichlet -functions. Ann. of Math. (2), 173(1):1–50, 2011.