Bounds for monochromatic solutions to
Abstract.
Let be a sufficiently large positive integer, and let . Then any -colouring of contains a monochromatic copy of with .
Contents
- 1 Introduction
- 2 Diophantine sets and averages
- 3 Diophantine properties of almost primes
- 4 Fourier decomposition of a majorant for the primes
- 5 An inverse theorem
- 6 Averaging projections and orthogonality
- 7 Proof of the main theorem
- 8 Further remarks
- A Properties of averages
- B An exponential sum estimate over the primes
1. Introduction
The key result in this work is an effective bound for -colourings of the natural numbers containing a monochromatic copy of .
Theorem 1.1.
There is a constant such that the following holds. Let be an integer and let . Then any -colouring of contains a monochromatic copy of with .
Remarks.
The constant is effectively computable. Furthermore, minor tweaks to the numerics in our arguments would allow one to replace with a slightly smaller constant. However, to obtain a “small” constant (less than 10, say) would appear to require new ideas. Finally we make no effort to compute an actual value of . Due to arguments regarding the possible existence of a Siegel zero in Appendix B (among other reasons), to do so would be rather painful.
In the other direction, for all there is an -colouring of with no monochromatic with when , and therefore Theorem 1.1 is at most one logarithm from the optimal result. To obtain such a colouring, use colour for where , , and any colour for . The point here is that and so if with then . We never have or .
We remark that obtaining effective bounds for the pattern has been raised by both the first author [GreOp, Problem 22] and by Richter [Ric25, Question 7.2].
1.1. Previous results
Theorem 1.1 guarantees the existence of infinitely many pairs given a fixed -colouring of . To see this, suppose that we have found such monochromatic pairs , . We modify our colouring of to an -colouring in which are given distinct colours, different to the original , and then use Theorem 1.1 to find a further pair . (Alternatively one may observe that our proof of Theorem 1.1 may be trivially modified to give many monochromatic pairs as for a fixed value of .)
This existential statement was first proven in a celebrated paper of Moreira [Mor17]; furthermore Moreira in fact guarantees a monochromatic pattern of the form . This result represents substantial progress towards Hindman’s conjecture that any -colouring of contains a monochromatic copy of . Recently there has been further important progress towards Hindman’s conjecture in various settings. Bowen [Bow25] has proven that any -colouring of contains infinitely many copies of . Bowen and Sabok [BS24] have proven that any -colouring of contains a copy of and Alweiss [Alw23] extended this to patterns of the form where ranges over all nontrivial subsets. Additionally Alweiss [Alw24] has given an alternate proof of the result of Moreira. However even when restricting to the proofs of Moreira and Alweiss give at least tower-type bounds due to highly recursive Ramsey type arguments. We remark that while the main argument of Moreira is purely qualitative, he indicates in [Mor17, Section 5] a variant argument using van der Waerden’s theorem (or Szemerédi’s theorem) which does give explicit finite bounds when used with appropriate bounds for Szemerédi’s theorem due to Gowers [Gow01].
Recently, Richter [Ric25] provided a quite different, more analytic, proof of Moreira’s result about . The argument of Richter is quite infinitary in flavour and gives no bounds. However, as will be discussed shortly, our methods in this paper are very strongly influenced by those of Richter.
One may additionally compare Theorem 1.1 with bounds for certain Schur-type equations. For instance, for the configuration , bounds of the form are known due to work of Cwalina and Schoen [CS17]. Note that this (by restricting to powers of ) gives an essentially double-exponential bound for . Furthermore for more general linear systems , bounds of the form are proven in generality by Sanders [San20], and good control on the implicit constant for many systems may be found in work of Chapman and Prendiville [CP20].
1.2. Proof outline
Our work draws heavily on recent beautiful work of Richter [Ric25]; many of the ideas presented in this section are drawn from this work.
Logarithmic averages play a central role, so we define these before turning to an outline of the proof. If is a finite set of positive integers and if is a function, we write
We write as a shorthand for (and similarly for higher iterates). We will often use this notation when .
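For readers who prefer a computational picture, the following is a minimal sketch of a logarithmic average. The displayed definition is not visible in this copy, so the sketch assumes the standard convention (each element n of the finite set is weighted by 1/n, and the weights are normalised to sum to 1); the names `log_average` and `iterated_log_average` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def log_average(A, f):
    """Logarithmic average of f over a finite nonempty set A of positive integers.

    Assumed convention: weight each n in A by 1/n, then normalise.
    """
    A = list(A)
    total_weight = sum(Fraction(1, n) for n in A)
    weighted_sum = sum(Fraction(1, n) * f(n) for n in A)
    return weighted_sum / total_weight


def iterated_log_average(A, B, f):
    """Iterated logarithmic average of f(a, b): first over b in B, then over a in A."""
    return log_average(A, lambda a: log_average(B, lambda b: f(a, b)))


if __name__ == "__main__":
    # f(n) = n on {1, 2, 4}: the weighted sum is 3 and the total weight is 7/4.
    print(log_average([1, 2, 4], lambda n: n))
    # A constant function has logarithmic average equal to that constant.
    print(iterated_log_average([2, 3], [5, 7], lambda a, b: 1))
```

Exact rational arithmetic is used only so that small examples can be checked by hand; floating point would serve equally well for experimentation.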
Suppose now that is an -colouring of in which we seek to find a monochromatic pair . The colour class in which this pair will be found is identified right at the very start of the proof. We take to be a fixed set of “highly divisible” numbers; the precise set we take is , where for appropriate constants . By the pigeonhole principle there is some which contains many multiples of elements of in the sense that for at least elements . We will find the desired configuration in this colour class, which we fix for the rest of the argument.
The next key idea, which follows [Ric25] very closely, is to locate a “rich” set of pairs in . This is done using a variant of arguments of Ahlswede, Khachatrian and Sárközy [AKS99] and Davenport and Erdős [DE36]. This argument involves the choice of various auxiliary sets of primes (for details see Section 7.1) and a key component is Elliott’s inequality from multiplicative number theory (given in Lemma A.5 in the form we shall need). The output of this argument is many instances of the inequality
| (1.1) |
for some fixed and many with and associated where , and where the sets of primes can be chosen at many different scales. (The precise statement we are sketching here may be found at (7.6).) This provides the aforementioned rich source of configurations , here with and .
The main business of the proof is a kind of deformation of the patterns to the desired . To describe how this works, fix an instance of (1.1) (that is, fix and the sets of primes). Set . We will then consider two “projections” and , both of which average over progressions. They are defined by
where here and . (The actual choice of parameters depends on the scale of the sets of primes ; the details are given at (7.7)). One should think of as being bounded in terms of , whereas the lengths grow with .
The small projection is chosen so that we may run the following argument, starting from (1.1). First, via a kind of maximal function argument, we replace (1.1) by
| (1.2) |
Details of this argument may be found in Lemma 6.5.
Then, we use the almost-periodicity property to replace (1.2) by
| (1.3) |
(note here that is an integer by the highly divisible nature of the set ). In order for this almost-periodicity property to hold, the small projection must be chosen appropriately: must divide and must be sufficiently long.
Leaving (1.3) aside for the moment, the technical heart of the proof is then an argument to the effect that (for an appropriate choice of the large projection ) we have
| (1.4) |
Supposing that this has been established, imagine that we additionally have
| (1.5) |
(in an sense). Combining (1.3), (1.4), and (1.5) then gives, assuming the various uses of work in our favour, that
Recalling that , it then follows that for some choice of and we have . This is the desired configuration , with and .
Whilst (1.5) will not be true in general (the projections are quite different in scale), an “energy-chaining” or arithmetic regularity type of argument can be used to show that (1.5) does hold for at least one scale of primes . This part of the argument can be thought of as a quantitative version of the existence of projections in Hilbert space, specifically of the decomposition into locally aperiodic and locally quasiperiodic functions which is important in Richter’s work. This connection between existence of projections in Hilbert space and regularity lemmas is by now well established; see e.g. [Tao07, Section 2].
The remaining part of the argument is then to justify (1.4). This is done via a general study of averages
| (1.6) |
where in our setting. Here, we consider arbitrary -bounded functions , and the key question of interest is the “inverse question” of what can be said if (1.6) is at least in magnitude for some . Our main result on this topic, Proposition 5.1, is an inverse theorem for this question. It concludes that under such a hypothesis (and with suitable assumptions on the sets of primes) the function is biased along progressions to some modulus and length comparable (in logarithmic scale) to the largest of the primes . The statement (1.4) follows very quickly from this inverse theorem (see Lemma 6.3 for the argument).
This inverse theorem, Proposition 5.1, is the most novel part of our paper. Whilst it is in a sense a quantitative, finitary version of [Ric25, Theorem 3.5], it is not a direct translation of that result, which would appear to be far too weak for our purposes. The key difference when unwinding the argument in [Ric25] in finitary language is that the latter finds bias along progressions with size depending on while ours depends only on . The proof of Proposition 5.1 is lengthy, and involves a Fourier analytic argument combined with Cauchy–Schwarz manœuvres inspired by certain “concatenation” results in the additive combinatorics literature, for instance [PP24, Pel20]. Ultimately it is these concatenation ideas which eliminate the dependence on . Key further ingredients are:
- Quantitative diophantine approximation results (Lemma 2.2);
- “Log-free” exponential sum estimates for certain arithmetic sets, specifically sets of “almost primes”, as well as the sets of squares of the elements of such sets (Section 3);
- Construction of a majorant for the primes with a certain Fourier decomposition (Section 4), in order to avoid the constant in our main result being ineffective due to possible Siegel zeros.
1.3. Acknowledgments
BG is supported by Simons Investigator Award 376201. This research was conducted during the period MS served as a Clay Research Fellow.
1.4. Notation
At various points, for brevity it will be expedient to use the following notation. If is a function and if , we write . If is some further integer parameter, by we mean .
By a dyadic interval we mean any subset of of the form . We will occasionally abuse notation by writing when we really mean , for some .
When we say that a parameter (for instance ) is “sufficiently small” we mean that for some absolute which we do not explicitly specify, and analogously if we say that is “sufficiently large” we mean that for some absolute constant . It is important to remark that are absolute and do not depend on the number of colours (otherwise our results would have little content). Throughout the paper the letter will always denote a sufficiently large integer parameter.
We write for the greatest common divisor of and for the lowest common multiple.
2. Diophantine sets and averages
The purpose of this section is to bound certain averages that will appear in the arguments of the next section, where our key technical result is established. The averages in question will be of the form
where is contained in some dyadic interval, or the analogous average with in place of the logarithmic average. The main result of the section is Lemma 2.4 below.
In our applications the set will have a useful arithmetic property, namely that it satisfies a “log-free Weyl-type estimate”. The precise definition we will use is the following.
Definition 2.1.
Let be parameters. Let be a set of integers. Suppose that whenever and , then there is some natural number , , such that . Then we say that is -diophantine.
Remarks.
Note that the definition is invariant under translation of . In applications the parameter will be comparable to the diameter of , but it is convenient not to simply set , since this would lead to unnecessary estimations of the diameter of in some situations. Being diophantine with (for some ) is a common property of sets of integers. For instance, (the log-free variant of) Weyl’s inequality asserts that the set of th powers in is -diophantine with appropriate parameters ; the set of th powers of primes in is also -diophantine for some . In fact, we will use the latter fact in our argument; for the proof see Lemma B.2.
Before turning to the statement and proof of the main results, we isolate the following lemma, which is of a standard type in the analysis of exponential sums. A proof of this particular variant may be found in [Gre25, Lemma C.1] (we have changed some dummy variables to avoid conflicts with the present paper).
Lemma 2.2.
Suppose that and that is an integer. Suppose that are positive real numbers satisfying , and suppose that there are at least elements for which . Suppose that . Then there is some positive integer such that .
We next give the definition of certain norms describing bias of functions along arithmetic progressions.
Definition 2.3.
Let be a function. Let and be parameters. Set
| (2.1) |
and
| (2.2) |
The logarithmic norm (2.1) will play the more prominent role in our analysis, with the uniform norm (2.2) being relegated to a more modest technical role in Lemma 2.6. We record that, roughly speaking, we have if and that (for a precise statement, see Lemma A.6). In particular for fixed the information that is large becomes weaker as becomes smaller. We are now ready for the first main result of the section, which could potentially have other applications.
Lemma 2.4.
Let be a sufficiently small positive parameter and . Let be -diophantine with , and let be a parameter. Suppose that and that . Let be any positive integer with . Let be -bounded and suppose that we have
| (2.3) |
Then there exists , , such that .
Remark.
Note here that may depend on , but we are free to specify subject to the stated upper bound condition.
Proof.
Throughout the proof we assume that is sufficiently small without further comment. The proof is Fourier-analytic; closely related arguments have appeared as base cases for various “concatenation” results (see e.g. [PP24, Lemma 5.3] or [Pel20, Lemma 5.4]). By (A.2) applied with we have
which for brevity we write
with the understanding that is considered with multiplicity. By Cauchy–Schwarz this gives that
By a further application of (A.2), followed by the triangle inequality, we have that
Denote
By a simple averaging argument we have
| (2.4) |
For the time being, let be fixed. Defining by for and otherwise, we have from the definition of that
Note here that , using here that . Taking the Fourier expansion and applying the triangle inequality, this gives
| (2.5) |
where
with
| (2.6) |
Now by bounding the terms trivially by and using that
we have . From this, Cauchy–Schwarz and Parseval it follows that
Integrating over , we see that the contribution to (2.5) from is negligible for sufficiently large, that is to say
Therefore, bounding the geometric series part of trivially by ,
In particular, for some we have
Using the AM–GM inequality with and , it follows that
| (2.7) |
By Parseval’s inequality we have , and so for sufficiently small we have
| (2.8) |
To proceed further we need to analyse the for which . Suppose in the following discussion that has this property. Recalling that the definition of is (2.6), it follows that . Writing , we see that , where denotes the natural weighted probability measure on . Since pointwise, it follows that .
We now apply the diophantine assumption on . We conclude that for each there is some nonzero such that . By further refining the set of (to a set of size ) we may assume that does not depend on . Denote this common value by .
Now we apply Lemma 2.2, taking , and . One can check that the conditions of Lemma 2.2 are consequences of the hypothesised lower bounds on and , provided is large enough. The conclusion of the lemma is then that there is some such that . Taking , we see that and .
It follows from this analysis and (2.7) that , where is the set of all for which for some with . Since the measure of is , there is some such that
| (2.9) |
By refining we may, using (2.4), find such that
| (2.10) |
and such that, for all , the corresponding all have the same value of ; that is, for all . Writing out the definition of the Fourier transform, we have from (2.9) that
Recall that is a given parameter, satisfying . By the properties of , we have
Substituting gives
which implies that
by the bound on and (A.1). Dropping the dashes on and swapping the order of the averages gives
By Cauchy–Schwarz, it follows that
Recall that we have this for all . However, the quantity on the left is non-negative for all . Taking the logarithmic average over (and recalling (2.10)) we obtain
Recalling that , and taking the average to the inside, this is
Applying (A.2) to the inner average for each (and using the assumed bound on ) we may drop the -average, obtaining
This is equivalent to the stated result. ∎
Lemma 2.5.
There is an absolute constant such that the following holds. Fix and . Let be a set which is -diophantine, and let be a parameter. Let be a further sufficiently large parameter. Suppose that and that . Let be a positive integer with . Let be -bounded and suppose that we have
| (2.11) |
Then there exists , , such that .
Proof.
Rather than Lemma 2.5 itself, we will need the following iterated variant. Here we use the notation for difference operators described in Section 1.4.
Lemma 2.6.
There are absolute constants and such that the following holds. Fix and . For suppose that is a set which is -diophantine, and let be a parameter. Let be a sufficiently large parameter and suppose that and that . Let be positive integers with . Let be -bounded and suppose that
| (2.12) |
Then there exist , , such that
Remark.
We have only stated a version with two difference operators (which will involve one iteration of Lemma 2.5), since this is what we will need later. A similar argument gives a version with difference operators.
Proof.
By an averaging argument, there are at least triples such that
Since, for any , the average over is non-negative, we have
For each such triple, this is exactly the hypothesis (2.11) of Lemma 2.5 with (and replaced by and by ). The conclusion of Lemma 2.5 is then that there exists such that . Squaring and writing out, this gives
| (2.13) |
By pigeonhole, we may pass to a set of triples such that is independent of . Since the expression on the left in (2.13) is always nonnegative, we may average over all , obtaining
For at least pairs , we have
For each such pair, this is again the hypothesis (2.11) of Lemma 2.5, now with , replaced by , and again with . Another application of Lemma 2.5 gives that there exists such that , provided that is sufficiently large that the relevant conditions on and are satisfied. Squaring and writing out, this gives
| (2.14) |
Passing to a further subset of pairs , we may assume that does not depend on . Since the expression on the left in (2.14) is non-negative for all , we obtain the desired result by averaging over . ∎
3. Diophantine properties of almost primes
The main result of this section, Lemma 3.2, is a vital technical ingredient in our later arguments. Roughly, it states that sets such as and are diophantine (see Definition 2.1) with suitable parameters, where are dyadically localised sets of primes. We first note a general lemma for “bilinear” exponential sums.
Lemma 3.1.
Let . Let , and let and be sets with for . Suppose that, for some , we have . Then either for some , or else there is some , , such that .
Remark.
We will only need the cases , in which case of course the exponents may be taken to be absolute constants.
Proof.
Write the condition as
By two applications of the Cauchy–Schwarz inequality, we obtain
To handle this, we use the “log-free” multidimensional Weyl inequality [GT14, Proposition 2.2]; we remark that the published version of that paper omits the necessary constraint that be sufficiently large. ∎
We now proceed to the main technical lemma of the section. Although we will only need this lemma for , it is no harder to prove it for general .
Lemma 3.2.
Let . Then there is a constant such that the following holds. Let be a natural number and let . Let be a sequence of integers such that the intervals are disjoint. Set , and suppose the condition is satisfied. For each , suppose we are given a parameter satisfying and define to be the set of primes satisfying , and set . Then is -diophantine.
Proof.
We first note that the case is also true and is essentially a standard result about exponential sums over powers of primes. We in fact need (a slight generalisation of) this result in our proof. Since it is hard to find an appropriate reference with the log-free bound that we require, we give this in Lemma B.2.
Suppose now that . Without loss of generality, assume . Let and suppose that
| (3.1) |
We must show that there is some such that
| (3.2) |
We try applying Lemma 3.1 with and . Define sets , by and (the stated containments are easily verified). Set . Since , it follows from the prime number theorem with classical error term (see e.g. [IK-book, Section 5.6]) that we have for some absolute . Therefore
using in this last step that . Applying Lemma 3.1, and noting that , it follows that either
| (3.3) |
or else there is some with
| (3.4) |
We leave aside (3.3) for now, and assume that (3.4) holds. If then (3.2) follows immediately (with equal to twice the exponent). Therefore we may suppose henceforth that . In particular, (3.4) gives (after doubling the implied constant in the exponents) that
Thus for some , with
| (3.5) |
We now return to the original sum (3.1). By pigeonhole, there is a choice of such that . By Lemma B.2 (that is, essentially the case of the present lemma) it follows that there is some such that . Since , this means that is within of , an integer multiple of . However, we may also note using (3.5) and the bound that
Here, in the penultimate step we used that (and so ), and in the last step we invoked the assumption and the upper bound (and assumed is large enough). Since (for the aforementioned reasons) the only possible integer multiple of that can be near is , and therefore and . Dividing through by , we obtain . Note also that since all prime factors of are at least , and therefore . Finally it follows that , which is the desired conclusion (3.2).
It remains to analyse the “small parameter” case (3.3), that is to say . The assumption certainly implies that . Therefore (assuming large enough) we have . It follows that
| (3.6) |
As before, (3.1) implies that there is some such that . By Lemma B.2 it follows that there is some such that . Taking , we then have (using (3.6))
and (using (3.6) again) , which is once again the desired conclusion (3.2). ∎
4. Fourier decomposition of a majorant for the primes
In this section we give another technical ingredient for our later arguments. Here is the main result.
Lemma 4.1.
Let be a large parameter. Then there is a function with
| (4.1) |
for all primes and
| (4.2) |
such that the following is true. Let be a constant. For any parameter , , there is a -periodic function satisfying
| (4.3) |
together with a decomposition (where the sum over is finite) with the following properties. First, the function is small in in the sense that
| (4.4) |
Second, the functions are reasonably bounded in sup norm in the sense that
| (4.5) |
for all . Finally, denoting we have the estimate
| (4.6) |
Here, all implied constants may depend on but are effectively computable.
Proof.
We take to be a Selberg-type majorant for the primes. Rather than describe the construction explicitly here, we can just refer to [green-tao-selberg, Proposition 3.1], which provides the relevant properties. Taking in that proposition (thus the singular series as defined in [green-tao-selberg] is ) and , we can take , where is the function constructed in [green-tao-selberg, Proposition 3.1]. The desired majorant property (4.1) is a consequence of [green-tao-selberg, Equation (3.1)]. The bound (4.2) is an absolutely standard fact about the Selberg sieve. It could be deduced within the framework of [green-tao-selberg] by summing [green-tao-selberg, Equation (3.3)] over , and discarding the negligible contribution from all frequencies except . On the Fourier side we have (see [green-tao-selberg, Equation (7.7)])
It is shown in [green-tao-selberg, Proposition 7.1], following Ramaré and Ruzsa [ramare-ruzsa], that
with supported on squarefrees with and . Set , for some to be specified below, and finally set
for and
It is then clear that is -periodic and that . We now define by “thresholding” the , specifically by setting
for . Set . The bound (4.5) is then immediate.
Next we establish (4.4). For this, we will use the moment estimates
| (4.7) |
for and for some constants , which we will establish below. Indeed, taking in (4.7) yields
| (4.8) |
uniformly for , and similarly
| (4.9) |
provided is chosen large enough (depending only on ). The desired estimate (4.4) is now immediate from the triangle inequality, the dominant contribution being from (4.8) with values . (Here we use the assumption that to guarantee that the contribution from (4.9) is insignificant.)
Now we establish (4.6). It is enough to show that
| (4.10) |
for , since the desired estimate then follows using the bounds on the implicit in the definitions of these functions. To show (4.10), it suffices to show the non-thresholded estimates
| (4.11) |
for , from which (4.10) follows using (4.8) and (4.9). From the definition of , summing the geometric series and the bound , we have
Since the fractions are -separated, the contribution from all except at most one will be (crudely) . For the fraction closest to , we have the trivial bound , which is by the divisor bound, and (4.11) (and therefore (4.6)) follows.
We now return to establish the moment estimate (4.7). An ingredient in the proof will be the (standard) estimate
| (4.12) |
To prove this, observe that the LHS is .
Turning to (4.7) itself, it suffices to prove the general estimate
| (4.13) |
for , where
the are supported on squarefrees and . To prove such an estimate, we first write in physical space using Kluyver’s identity for Ramanujan sums. This gives
| (4.14) |
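As a sanity check on the tool used in this step, recall that Kluyver's identity in its standard form expresses the Ramanujan sum c_q(n) (the sum of e(an/q) over residues a coprime to q) as the sum of d·μ(q/d) over common divisors d of q and n. The following Python sketch compares the two sides numerically for small arguments. The function names are illustrative and not taken from the paper, and the precise way the identity is applied in (4.14) may differ from this bare form.

```python
import cmath
from math import gcd


def mobius(n):
    """Moebius function, computed by trial division (fine for small n)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def ramanujan_sum_direct(q, n):
    """c_q(n) as a sum of e(an/q) over 1 <= a <= q with gcd(a, q) = 1."""
    total = sum(
        cmath.exp(2j * cmath.pi * a * n / q)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )
    return round(total.real)  # the sum is an integer; rounding removes float error


def ramanujan_sum_kluyver(q, n):
    """c_q(n) via Kluyver's identity: sum of d * mu(q/d) over d dividing gcd(q, n)."""
    g = gcd(q, n)  # note gcd(q, 0) = q
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)


if __name__ == "__main__":
    for q in range(1, 13):
        row = [ramanujan_sum_kluyver(q, n) for n in range(q)]
        assert row == [ramanujan_sum_direct(q, n) for n in range(q)]
        print(q, row)
```

The divisor-sum side involves only integers, which is what makes a physical-space expansion of this kind convenient to bound term by term.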
Now we have
| (4.15) |
by (4.12). Now observe that
| (4.16) |
In the middle step here the key point was that the number of representations of as is at most , and in the penultimate step that for all . Combining (4.14), (4.15), and (4.16) gives (4.13), and so (4.7) follows.
We remark that from the first bound in (4.3) and the -periodicity of (or by direct proof) we have
| (4.17) |
for any interval of length .
Remarks.
It is possible to establish an analogue of Lemma 4.1 with equal to the von Mangoldt function itself, taking and the to be suitable Cramér approximants to the von Mangoldt function and . The details necessary to accomplish this may be found in [Gre05], though the context there was different. There are some advantages to this, for instance is non-negative and subject to good - and -bounds. The drawback of proceeding this way is that the bounds are ineffective due to an application of the Siegel-Walfisz theorem. This can be corrected via the introduction of appropriate “Siegel-modified Cramér approximants” as in [TT25] but this is quite technical. By passing to a suitable majorant as in Lemma 4.1 we can avoid all Siegel zero issues entirely.
5. An inverse theorem
In this section we explore the consequences of an assumption
| (5.1) |
where are -bounded, consists of primes, of almost primes and is some parameter. The reason for being interested in such an assumption was sketched in Section 1.2 and will be further apparent in Section 7.
The aim is to show that (5.1) implies that is large for suitable parameters . (Recall from Definition 2.3 the definition of these norms.) This result is directly inspired by [Ric25, Theorem 3.5], a connection we shall elaborate upon later. Here is the technical statement of our main result.
Proposition 5.1.
There is an absolute constant such that the following holds. Let be a sufficiently small parameter and let . Suppose that and . Let be parameters with and . Suppose that satisfies , and that all prime factors of are less than . Let denote the set of primes in and suppose that is a set of “almost primes” of the following form: , where are disjoint intervals, all with , and the range over all primes in for . Set . Suppose we have (5.1). Then we have for any with .
Remarks.
For the rest of the section we write , thus the lower bound on is . Any sufficiently small absolute constant would do here. More generally, several of the assumptions on parameters are made so as to be comfortable for the required application and we do not claim these conditions are tight. For instance, the lower bound could be for an appropriate .
5.1. Setting up the proof of the inverse theorem
The proof of Proposition 5.1 is somewhat lengthy. We prepare the ground by defining some key parameters and observing simple preliminary bounds. In the proof are absolute constants, with assumed to be sufficiently large and assumed sufficiently large in terms of . We will write .
Next we point out some consequences of the (somewhat elaborate) conditions on parameters in the statement of Proposition 5.1. First, the are enormously larger than powers of (and a fortiori powers of ). Indeed whilst , using here the assumption that .
Second, we have
| (5.2) |
for any fixed constant (assuming sufficiently large in terms of ). This is easily confirmed using the assumptions , and , and will be used (twice) to verify the key condition in Lemma 3.2.
Third and finally, we note that all the parameters are significantly smaller than , and one has for example , which will be used several times in the analysis to assert that error terms coming from (A.2) are negligible.
Next we record the fact that, under the stated conditions, the elements of are almost pairwise coprime. If is any finite set of positive integers, we define , where is the gcd of . This is a measure of the pairwise coprimality of elements of ; note that always, and that if is small then we expect the elements of to be mostly coprime. Recall that (though this is irrelevant to the following lemma).
Lemma 5.2.
Under the conditions of Proposition 5.1, we have .
Proof.
If is a set of primes and then unless , and so if we denote by the set of primes in we have
| (5.3) |
Now since , it follows from Mertens’ theorem (see e.g. [Kou19, Theorem 5.4]) and (5.3) that we have . It follows that
As we said, the proof of Proposition 5.1 is lengthy. Moreover, the logic is somewhat complicated, since it is difficult to state self-contained intermediate lemmas. For reference we summarise the proof structure now.
-
-
-
-
- We then proceed directly from (5.10) to the desired conclusion via a quite lengthy (but linear) sequence of manipulations.
5.2. Proof of the inverse theorem
We turn now to the proof of Proposition 5.1.
Proof.
Throughout the proof we will freely use the fact that is sufficiently large and that is sufficiently small. The starting assumption is (5.1). We start by removing the function using essentially the same manipulation as in [Ric25, Theorem 5.2]. First observe that, for each , an application of Lemma A.4 yields
Averaging over (and using the upper bound ) gives
Write for the expression on the left, and for the first expression on the right; thus with . Now the assumption is that . By Cauchy–Schwarz (since is -bounded) we have . Therefore , using the -boundedness of to estimate the second term. That is,
Using Cauchy–Schwarz on the inner sum (and the -boundedness of ) gives
We now pass to a non-logarithmic average in the variable, on a suitable dyadic interval. To do this, first partition into intervals with . By averaging, there is some such for which
Let be such that . We introduce the majorant from Lemma 4.1. Since the logarithmic weight varies by a factor at most on , it follows that
Expanding out the square gives
| (5.4) |
The next technical reduction is to replace the cutoff with , which we do using the fact that the elements of are mostly coprime due to Lemma 5.2. Let us justify this carefully. Since are -bounded and , the error in making this switch in the LHS of (5.4) is bounded up to a constant factor by
| (5.5) |
We have the pointwise bound and therefore
using in the last step that are much smaller than . It follows that (5.5) is bounded above by . Using the pointwise bound , this in turn is bounded by . By Lemma 5.2, we see that (5.5) is bounded by . Therefore, as claimed, we may replace (5.4) by
| (5.6) |
The reason for having replaced (5.4) with (5.6) is that we may now invoke Lemma A.4 (with ) to conclude that
| (5.7) |
We now apply Lemma 4.1 with parameter and constant . Observe that the required inequality
| (5.8) |
is true and follows from the choice of parameters, using here that .
Let be the -periodic function as in that lemma. Our aim is to replace in (5.7) by .
From (5.7) and the triangle inequality, one of the following two statements holds:
| (5.9) |
or
| (5.10) |
We analyse these two possibilities in turn. In the analysis we will use several times that
| (5.11) |
which follows from (4.3) and the triangle inequality (since is non-negative).
Analysis of (5.9). We begin by dyadically localising (the two copies of) the set . Recall that . Since , we can decompose each as a disjoint union of intervals , each of the form for some satisfying . We then have a corresponding decomposition , where . Note that, since , each is contained in a dyadic interval. By averaging, there are and such that
| (5.12) |
For notational brevity, write and . As are each contained in dyadic intervals, we can remove the logarithmic averaging to obtain
| (5.13) |
The next several manipulations leading to (5.14) are straightforward and are aimed at replacing the logarithmic average over by an ordinary average on an appropriate subinterval. We first discard the contribution from small values of . Set (say). Writing (5.13) as
(where is the harmonic sum), using (5.11) we see that the contribution to the LHS from is bounded by , using here that .
Since , it follows that we may replace (5.13) by
We now break into intervals whose lengths satisfy . By pigeonhole there exists such an interval for which
The weight varies by at most on and so, using (5.11), we can justify replacing with a uniform average , thus obtaining
| (5.14) |
Our plan now is to use the decomposition from Lemma 4.1 in order to obtain a contradiction from (5.14). To do this, we claim that for a general function we have
| (5.15) |
Here, . Assuming the claim for now, we see that the LHS of (5.14) is bounded above by
by (4.4), (4.6), and (4.5), assuming here that is sufficiently large and noting (5.8). This contradicts (5.14), recalling here that , that is sufficiently large in terms of , and additionally recalling here our assumption (in Proposition 5.1) that . That is (assuming the claim (5.15)) we cannot have (5.9), and therefore (5.10) holds.
Proof of claim (5.15). The first bound is trivial, but the second is a somewhat involved task. By homogeneity, we may assume that . Thus if the second bound in (5.15) does not hold, we have
| (5.16) |
where we are free to choose an absolute and , thus in particular
| (5.17) |
It therefore suffices to show that the assumption (5.16) and the inclusion (5.17) imply that
| (5.18) |
since this immediately contradicts the definition of . The remainder of the proof of claim (5.15) is devoted to this task.
Suppose that and , where . Set and . Since (by one of the assumptions of Proposition 5.1) and we have . Let be integers with , and substitute , in (5.16). This gives
| (5.19) |
Now observe that , and also we have the crude bound
using here that all of are , that and that . It follows using Lemma A.1 that for each fixed we may replace the -average in (5.19) by , and the -average by , at the cost of changing the inner sum in (5.19) by . Doing this, averaging over and dropping the dashes on for clarity, we obtain
Therefore there exists such that
This implies that
with being -bounded functions; here we have absorbed the absolute value as a unit complex number into . We now apply Cauchy–Schwarz twice, and replace the dummy variable by , to obtain that
(The notation used here is described in Section 1.4.)
The expression on the LHS is the same as the one in (2.12), with . In order to apply Lemma 2.6, we need the sets to have suitable diophantine properties. Such a statement is precisely the content of Lemma 3.2. To see this, recall that by definition we have , where each interval has the form for some and some , which we may assume to be the smallest prime in . Note that we always have , and since and . We now apply Lemma 3.2 with . The required condition in that lemma follows using (5.2). Thus Lemma 3.2 gives that is -diophantine, for some absolute constant . Similarly, is -diophantine.
We may now apply Lemma 2.6 with for , , , and for . To apply that lemma we need to verify, for , the three conditions and . The first condition holds comfortably using and the choice of parameters. The second condition holds even more comfortably using and the choice of parameters. Finally, the third condition holds using that , provided is large enough; larger than is sufficient.
The conclusion of Lemma 2.6 gives that for any with , there are , such that
| (5.20) |
Set . The expression on the left in (5.20) is closely related to the Gowers -norm of (or more accurately a Gowers–Peluse norm; see [Pel20] where they are called “Gowers box norms”). Rather than appeal to any general theory of such norms, we proceed with a direct analysis using the Fourier transform. By the Fourier expansion , (5.20) is
Here, denotes the normalised probability measure on . By AM-GM and the pointwise bound we have that
Substitute , for , and integrate over . This gives (dropping the dashes)
| (5.21) |
We claim that
| (5.22) |
By AM-GM and symmetry it suffices to prove that
The triple ranges uniformly over as ranges over and so it is enough to show that . This, however, follows immediately using the bound . The claim (5.22) is therefore proven. From this and (5.21) we immediately have (if is sufficiently large). Since by Parseval, it follows that and so . In this last step we used the fact (5.17) that ; what is written is then true if is chosen sufficiently small. This completes the proof that the claims (5.16) and (5.17) imply (5.18), and hence finishes the proof of the claim (5.15).
As explained just before the statement of claim (5.15), it now follows that (5.10) holds. The remainder of the proof of Proposition 5.1 consists of the analysis of this case.
Analysis of (5.10). We first recall the statement, which is (after a mild reordering of the averaging operators)
| (5.23) |
The advantage of having the function in place of is that the former is invariant under shifts by . This is by construction (Lemma 4.1); recall here that . For fixed , in the inner average over and in (5.23) we substitute and for some and then average over all . By the periodicity of we obtain
| (5.24) |
Fix . By (A.2), crude bounds for the parameters, and (4.3), the error in replacing the average over by is
so we may make this replacement without affecting (5.24). Moreover, by applying (A.1) and the bound , the error in then replacing the average over by is , so we may again make the replacement without affecting (5.24). (In the chain of inequalities here we used that , that and that is much larger than fixed powers of and , cf. remarks in Section 5.1.) Having made these two replacements we drop the dashes on for clarity, thereby arriving at
By the triangle inequality, we obtain
Applying Cauchy–Schwarz, we obtain
Using the pointwise bound and the fact (Lemma 5.2) that , as well as the bound (see (4.3)), we see that the contribution from pairs with can be ignored. Thus
Since is invariant under shifts by , we may introduce an additional average obtaining
where here ; note that is much larger than 1 by the choice of parameters. Apart from the invariance of under translation by , the key point here is that, for each fixed , the shifted average differs from the original one by at most by (4.17) (and a similar term corresponding to the edge effects near ).
In the display above, consider the average over (for fixed ). The point now is that, from the point of view of logarithmic averages, may be regarded as essentially just varying over . More precisely, applying Lemma A.3 with , , and , we may replace the above with
| (5.25) |
Let us comment on the application of Lemma A.3. First, we used that and are coprime. That follows from the assumption that all prime factors of are less than , and that follows using that is much larger than . The error terms and resulting from the application of Lemma A.3 are all by simple verifications using the choice of parameters, the key point being that is much larger than , but is much smaller than .
Applying (A.2), we may remove the shifts in (5.25), allowing us to decouple the average over and thus obtain via another application of (4.3) that
| (5.26) |
We may remove the condition (losing a further factor of 2 in the implicit constant) exactly as before, obtaining
To analyse this, we will eventually use the diophantine nature of suitable sets , applying Lemma 3.2 in the case . To prepare the ground, we must again foliate into appropriate “subdyadic products” as we did in the analysis of (5.9) leading to (5.12). With notation exactly the same as in that analysis, we may locate and such that
Note here that we were able to replace the logarithmic average over the variables by a uniform average since these are now dyadically localised, and each -average is nonnegative. Suppose that and , where . Without loss of generality, . Pigeonholing in , we see that there is some such that
By Lemma A.2 with modulus , this gives
where . For each fixed , the inner average is of the form (2.3), with and replaced by . We showed in Lemma 3.2 (with ) that is -diophantine (the condition in that lemma follows using (5.2)), and so by translation invariance of the notion of diophantine, the same is true of . Observe that . Thus we may aim to apply Lemma 2.4 with , , , and replaced by . There are three conditions to be checked, namely that , and that .
The first condition, involving , is immediate from and the parameter hierarchy. The second condition, involving , is also immediate. For the third condition note that and is much smaller than .
Thus the appeal to Lemma 2.4 is indeed valid, and we are free to take any in this application. Recalling that , the conclusion of Lemma 2.4 is that for each there is such that . By pigeonhole there is a set of values of such that does not depend on . Denote this common value by (which is of course not the same quantity as in the application of Lemma A.2 above). It follows that
that is to say
A further application of Lemma A.2 then yields
which is the statement
Finally, let be as in the statement of Proposition 5.1. Set and . Note that . Therefore by Lemma A.6 we have
where the error terms can be estimated crudely bearing in mind the comments in Section 5.1 (essentially, is much smaller than but much larger than all other variables). This concludes the proof of Proposition 5.1. ∎
6. Averaging projections and orthogonality
In the introduction we discussed certain “projection” operators . In this section we introduce the general class of such operators and establish some of their basic properties.
Definition 6.1.
Let be a function. Suppose that . Then we define
Whilst we informally think of these maps as projections, this is not quite accurate as . The first observation we require is that has an almost periodicity property.
Lemma 6.2.
Let be a 1-bounded function. Let . Then, for any we have
Proof.
The LHS may be expanded as . The result then follows from (A.1). ∎
A crucial feature of the maps is that they essentially preserve the -norms (see Definition 2.3). Indeed we have the following lemma.
Lemma 6.3.
Let and . Then for which is -bounded, we have that .
Proof.
First recall that by Definition 2.3 we have
| (6.1) |
Note that by (A.1) we have
The desired result follows immediately upon taking in (6.1). ∎
We next require an approximate Pythagoras relation for projections .
Lemma 6.4.
Let be parameters with and . Let be a -bounded function. We have that
Proof.
For brevity we write and .
We first expand the LHS as
| (6.2) |
Expanding the definitions, we have
Substitute ; then, dropping the dash on , we see from (A.2) that this is
which equals
Now by (A.1) (using here that ) we have
Therefore, putting these observations together we obtain
Taking complex conjugates and adding, we obtain
Now by a further application of (A.2),
and by Cauchy–Schwarz this is at least
It follows that
Substituting into (6.2) gives the lemma. ∎
We now give the “maximal function” argument which was hinted at in the introduction where we explained how to move from (1.1) to (1.2).
Lemma 6.5.
Let be non-negative -bounded functions. Let and let be positive integer parameters with . Suppose that . Then .
Proof.
Write for brevity. Set and denote . Then since and pointwise we have
Therefore we are done if we can show that , that is to say
| (6.3) |
Write . Since pointwise, we have pointwise, and so if then we have . It follows using (A.2) and Cauchy–Schwarz that
It follows that , so the claim (6.3) follows due to the choice of . ∎
We note a corollary under the same conditions which is good for taking averages, namely that for any
| (6.4) |
Indeed, if we write then (6.4) is trivial for , while for it follows from Lemma 6.5.
7. Proof of the main theorem
We are now ready to prove our main result, Theorem 1.1. The reader may find it helpful to revisit the overview given in the introduction.
7.1. Setting up parameters.
We begin by defining parameters and scales to be used in the proof.
Let be the number of colours; we will fix this for the remainder of the proof and we may assume it is sufficiently large. Let be a suitable large positive integer (independent of ), recall that , and set
| (7.1) |
where here is the constant in Proposition 5.1. Define
| (7.2) |
We now define a doubly-indexed sequence of positive integer scales by
| (7.3) |
Note that we have the crude bounds
| (7.4) |
provided is large enough. We will also use the auxiliary scales defined by the same formula (7.3). For and , define to be the set of primes satisfying . We note that with this choice of parameters we have, by Mertens’ theorem, .
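As a numerical illustration of the kind of Mertens-type estimate invoked here, the following Python sketch compares the sum of reciprocals of the primes in an interval with the prediction $\log(\log y_2/\log y_1)$ given by Mertens’ theorem. The endpoints `1000` and `10**6` and the helper names are hypothetical choices made purely for illustration; they are not the scales defined in (7.3).

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... ; lengths must match for slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def reciprocal_prime_sum(lo, hi):
    """Sum of 1/p over primes p with lo < p <= hi."""
    return sum(1.0 / p for p in primes_up_to(hi) if p > lo)


def mertens_prediction(lo, hi):
    """Main term log(log hi / log lo) predicted by Mertens' theorem."""
    return math.log(math.log(hi) / math.log(lo))


if __name__ == "__main__":
    lo, hi = 1000, 10**6  # illustrative endpoints only (hi = lo**2)
    print(reciprocal_prime_sum(lo, hi), mertens_prediction(lo, hi))
```

With `hi = lo**2` the predicted value is $\log 2$, and the two printed numbers should agree up to an error that decays as the endpoints grow.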
7.2. Positivity for
The first step of the proof is to isolate the colour class in which we will eventually find our configuration , and to show that it is rich in configurations . This is a mild variant of [Ric25, Theorem 3.6], which itself is related to results of Ahlswede, Khachatrian and Sárközy [AKS99] and Davenport and Erdős [DE36].
Consider an -colouring . For each we have
where here denotes the harmonic sum. The last bound here follows (comfortably) using (7.4). By summing over all and an appeal to the pigeonhole principle, there is some colour class such that
which implies that for at least elements . Fix a set of such elements. We fix the colour class for the remainder of the proof.
By repeated applications of Lemma A.5, we have
for any , any and for any . Note here that the error term arising from this repeated application of Lemma A.5 is dominated by .
Let the elements of be . Then, applying the above with and summing over , we obtain
(Note here that, for the term with index , we can include the extra averages over with no change to the expression.) By Cauchy–Schwarz it follows that
Since , if is large enough we may exclude the pairs of indices with at the loss of at most a factor . By symmetry we are also free to only include the pairs with (at the loss of another factor of 2), and we thereby obtain
| (7.5) |
By another repeated application of Lemma A.5 we have
for each pair with . From this and (7.5), it follows (again assuming large enough) that
Recall that this is true for all . By pigeonhole, for each there is some such that
Pass to a subset of size at least such that does not depend on , and denote by the common value of these . Writing and , we then have
| (7.6) |
for all . Fix this choice of (and hence of and the function ) for the rest of the proof. Define also to be the elements of except the largest one; thus .
7.3. Proof of the main theorem
We think of pairs (with and ) as “scales” in the proof. Associated to any scale will be a pair of “projection” operators in the sense of Definition 6.1. Define . Note that is an integer (in fact it equals ).
For each pair there will be two important projection operators , namely
| (7.7) |
Here, denotes the next largest element in after , which exists since . We informally refer to these as the “small” and “large” projections associated to .
We first apply the small projection operator to (7.6) using Lemma 6.5, or more accurately (6.4). Taking there, we have
| (7.8) |
Here, and below, is shorthand for . Now observe that by Lemma 6.2 we have
| (7.9) |
The key points to observe here in applying Lemma 6.2 are that by the definitions of the s, and also
The inequalities here are all very comfortably true (when is large); we have , that for all , and that , all of which follow using (7.4). From (7.8) and (7.9) we have
This, recall, is for all . Summing over all these gives
| (7.10) |
Suppose we had a similar result with replaced by , that is
| (7.11) |
In particular, for some choice of and we would then have
Taking and (and recalling that ) we then have , and the proof is complete.
It remains to prove that we do indeed have (7.11). As described in the introduction, we deduce it from (7.10) in two steps. First, we replace the “small” projections in (7.10) by the “large” projections . The error in making this replacement is
| (7.12) |
By (A.2) and the crude bounds , this is
| (7.13) |
where
For the rest of the proof (as in Lemma 6.4) we use the notation and . Using (7.1) and (7.4), the error term in (7.13) is seen to be . Thus (7.13) is
By Cauchy–Schwarz and the -boundedness of the functions , this is bounded above by
| (7.14) |
For each we apply Lemma 6.4 with , , and , obtaining
| (7.15) |
To explain the last line here, we can bound the first error term by using (7.4). The second error term can be bounded using and the fact that , which can be verified using the definitions (7.2) and (7.3).
Summing (7.15) over gives
Recalling the definitions (7.7) of the two projection operators, we see that the bracketed sum has considerable cancellation; the only uncancelled positive terms are the terms from scales which are not of the form for some other scale , that is to say with or ; thus the bracketed sum is bounded by . It follows that (7.14) is bounded by
using here that and .
If the constant is chosen large enough, this means that (7.12) is small compared with the RHS of (7.10).
To summarise so far, we have replaced the “small” projections in (7.10) by the “larger” ones at the loss of only the quality of the implied constant, that is to say we have shown
To complete the proof of (7.11) (and hence of Theorem 1.1) we now replace the copies of by itself. For this we can work one value of at a time; thus it is enough to show that, for each ,
| (7.16) |
(Here again). To prove this we use Proposition 5.1. Indeed, we note that the LHS of (7.16) is of the form
(which is exactly the expression in (5.1)) where , , , , and .
Note here that every element of has just one representation in this product since all primes in are much smaller than those in , and so on, and so is the same thing as .
The setup for the application of Proposition 5.1 requires some discussion. We address the various requirements in the statement of that proposition in turn.
- The parameter will be . Note , so the condition is satisfied due to the choices (7.1).
- We will take (the aim being to show that the LHS of (7.16) is at most ). The conditions and are then immediately checked.
- We take and . For notational consistency with Proposition 5.1, write for . Thus, by definition, is the set of primes in the interval , which is exactly the situation in Proposition 5.1. By (7.3) and the choice of parameters we have . (This is essentially the “pinch point” for the analysis; for the main result to have the stated exponent of 50 we need here.)
- We take , . The condition is implied by (7.4), if is large enough.
- We take , . Note here that and , as required, using here that . The condition follows immediately from (7.4).
- The condition follows from (7.4) and the fact that .
- That all prime factors of are less than is immediate from the lower bound .
Suppose that (7.16) does not hold. By the above discussion we are in a position to apply Proposition 5.1. Note that in the conclusion there is, with our choice of parameters, exactly the same as in (7.1). Since , we may take the parameter in Proposition 5.1 to be . The conclusion of Proposition 5.1 is then that
(Here we observed from the various definitions that .) However, this is contrary to Lemma 6.3, which asserts that the LHS is , which is enormously smaller. This contradiction shows that we indeed have (7.16), and all of the required statements are proven.
8. Further remarks
We end the main body of the paper with a series of remarks regarding the bounds obtained for the pattern and related patterns.
First of all, we comment that there are two different ways in which the double exponential bound in the main theorem seems hard to improve using anything like the methods of this paper. The first is that it seems difficult to avoid the need to define a highly divisible set such as the set in (7.2), and any such definition seems to immediately lead to elements of double exponential size in . Second, the hierarchy of scales (7.4) needed to be chosen with in order that the primes in this range satisfy , which is crucial in the application of Proposition 5.1. It is possible to show using arguments somewhat related to those in [Tao24] that one cannot do appreciably better by choosing an alternative set to the primes. In particular, when applying Lemma A.5 with an alternate set of integers , the error term is dominated by and one can prove that for any set one has .
Next we make some comments on the potential for extending the underlying analytic method to handle the pattern (for which partition regularity was established by Moreira [Mor17], but with essentially no bounds). Presumably any such approach would require one to (at least) prove an inverse theorem establishing some structure assuming that
| (8.1) |
where is a suitable set of almost primes (compare here with (5.1)). The following two rather different examples suggest this may be far from straightforward.
- Suppose first that . Let be an arbitrary sequence of unit complex numbers, and define if is the least prime in which divides , and otherwise. Set . Assuming that , the (logarithmic) proportion of for which is negligible. Now observe that if the least prime factor of in is less than . On average over , one expects this to happen half the time. If, on the other hand, the least prime factor of is then we have , and typically we expect cancellation of this when summed over . Examples of this type therefore give (8.1) with , but with only having rather weak structure.
- Now suppose that , and for some . One may then observe that . If is the scale of then this is for .
Even with an inverse theorem for (8.1) in hand, it is far from clear how the other arguments of the paper might be modified.
Appendix A Properties of averages
In this appendix we assemble simple properties of (mostly) logarithmic averages. Throughout the appendix we assume to avoid trivialities. For , denotes the harmonic sum ; we do not require to be an integer. The first lemma concerns the behaviour of averages (both uniform and logarithmic) under shifts.
Lemma A.1.
Let be a 1-bounded function and let . Then
| (A.1) |
and, if ,
| (A.2) |
Proof.
(A.1) is straightforward. For (A.2), we may suppose else the result is trivial. Without loss of generality we may suppose is positive, since the case negative follows from the positive case. We have
| (A.3) |
The second sum on the right is , whilst the third is since . Finally, the first sum on the right is bounded above by . Since , and , the first sum on the right in (A.3) is bounded by . Putting all this together, the result follows. ∎
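The three-sum decomposition used in this proof can be checked numerically. The Python sketch below compares the logarithmic average of a 1-bounded function with that of its shift by $h$, and tests the difference against the elementary bound $(2H_h + h/(N-h+1))/H_N$, valid for $1 \le h \le N$, which comes from estimating each of the three sums trivially. The explicit constants in this bound, the test function and the helper names are assumptions made for illustration and are not taken from the statement of Lemma A.1.

```python
import math


def harmonic(n):
    """Harmonic sum H_n = sum_{k <= n} 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


def log_average(f, N, shift=0):
    """Logarithmic average (1/H_N) * sum_{n <= N} f(n + shift) / n."""
    return sum(f(n + shift) / n for n in range(1, N + 1)) / harmonic(N)


def shift_error_bound(N, h):
    """Elementary bound for |E^log f(n+h) - E^log f(n)| when 1 <= h <= N.

    The three sums in the decomposition contribute at most H_h, h/(N-h+1)
    and H_h respectively (before dividing by H_N), for 1-bounded f.
    """
    return (2 * harmonic(h) + h / (N - h + 1)) / harmonic(N)


def example_function(n):
    """A hypothetical 1-bounded test function with no obvious periodicity."""
    return math.cos(0.37 * n * n)


if __name__ == "__main__":
    N, h = 20000, 7
    diff = abs(log_average(example_function, N, h) - log_average(example_function, N))
    print(diff, shift_error_bound(N, h))
```

The bound is of order $(\log h)/\log N$, so logarithmic averages tolerate shifts that are much larger than those tolerated by uniform averages at the same accuracy.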
Next we give a result about splitting into residue classes.
Lemma A.2.
Let be -bounded. Let . Then
Proof.
We may suppose since the result is trivial otherwise. The LHS may be expanded as
The change if we replace in the denominator by is bounded above by
which is acceptable. If we make this change, the resulting expression is
The two error terms are , and this concludes the proof. ∎
We also need the following related result.
Lemma A.3.
Let be coprime positive integers and let be a further positive integer parameter. Let be a 1-bounded function. Then
Proof.
Clearly we may assume that , as the result is trivial otherwise. If we replace by , the LHS changes by at most . It therefore suffices to consider the case . In this case we establish the result without the error term. We have , where denotes the unique element of congruent to , and . By (A.2), we have
However, since ranges over as ranges over any interval of length ,
The result now follows from Lemma A.2. ∎
The next result states that logarithmic averages are essentially preserved under dilations. This is standard and appears, for instance, as [Ric25, Lemma 2.1].
Lemma A.4.
Let be -bounded and let . Then
Proof.
When the result is trivial, so suppose . By definition,
This is bounded by , and the result follows. ∎
We next require the logarithmic version of Elliott’s inequality. The proof is exactly that given in [Ric25, Corollary 2.3] modulo tracking error terms.
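The mechanism behind dilation invariance can be illustrated numerically. The Python sketch below uses the exact identity $\sum_{n \le N,\, d \mid n} f(n)/n = \frac{1}{d}\sum_{m \le N/d} f(dm)/m$, so that $d$ times the logarithmic average of $1_{d \mid n} f(n)$ differs from the logarithmic average of $f(dm)$ only through the terms with $N/d < m \le N$, which contribute at most $(H_N - H_{\lfloor N/d \rfloor})/H_N \le (1 + \log d)/H_N$. The normalisation chosen here, the test function and the helper names are assumptions for illustration and need not match the precise formulation of Lemma A.4.

```python
import math


def harmonic(n):
    """Harmonic sum H_n = sum_{k <= n} 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


def dilation_sides(f, N, d):
    """Return (d * E^log_{n<=N} 1_{d|n} f(n),  E^log_{m<=N} f(d*m))."""
    H = harmonic(N)
    lhs = sum(d * f(n) / n for n in range(d, N + 1, d)) / H
    rhs = sum(f(d * m) / m for m in range(1, N + 1)) / H
    return lhs, rhs


def dilation_error_bound(N, d):
    """Exact size of the discarded tail N/d < m <= N for 1-bounded f."""
    return (harmonic(N) - harmonic(N // d)) / harmonic(N)


def example_function(n):
    """A hypothetical 1-bounded test function."""
    return math.sin(math.sqrt(n))


if __name__ == "__main__":
    N, d = 50000, 11
    lhs, rhs = dilation_sides(example_function, N, d)
    print(abs(lhs - rhs), dilation_error_bound(N, d))
```

Since the discarded tail has logarithmic weight about $\log d$ out of a total of about $\log N$, the two sides agree whenever $d$ is much smaller than any fixed power of $N$.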
Lemma A.5.
Let be a finite set of primes, all bounded by . Let be -bounded. We have that
Proof.
By Lemma A.4 applied with , for each we have
Therefore by Cauchy–Schwarz we have
By [Ric25, Proposition 2.2], we have that
and the result follows. ∎
We end with a proposition regarding the behaviour of under replacing by a multiple or shrinking the interval .
Lemma A.6.
Suppose that and that . Then
Appendix B An exponential sum estimate over the primes
In this appendix we prove a log-free exponential sum estimate for the von Mangoldt function with polynomial phase.
Lemma B.1.
Let and . Suppose that
| (B.1) |
Then there is some such that
| (B.2) |
Proof.
We proceed via the weaker result with (B.2) replaced by
| (B.3) |
This is a standard application of the method of Type I/II sums. However, some sources in the literature such as [Har81] lose factors of instead of a power of via an invocation of the divisor bound in the proof of Weyl’s inequality. This loss can be avoided with a little care, but it is hard to find a convenient source in the literature. One may find an essentially equivalent argument (with the polynomial phase replaced by a general nilsequence) in [GT-mobius]. The key point is that [GT-mobius, Proposition 3.1] holds verbatim if the Möbius function is replaced by . This is established in a standard fashion as in the proof of [GT-mobius, Proposition 3.1] (which is outsourced to [GT08, Section 4], which itself is derivative of standard expositions such as [IK-book, Chapter 13]) by using Vaughan’s identity for rather than the variant for . One may now run the arguments of [GT-mobius, Section 3]; in this context most of the language of nilmanifolds is redundant since is a nilsequence on the abelian torus . In particular the “complexity” parameter is simply . The conclusion of [GT-mobius, Section 3] is then that, starting from (B.1), and setting , there is some such that we have ; this is exactly (B.3) (noting here that the can be upgraded to at the expense of worsening exponents since ).
If then (B.3) immediately implies (B.2) (after adjusting the exponents ). To complete the proof of Lemma B.1, it therefore suffices to handle the case . In this case, from (B.3) we certainly have and . In this case one can obtain an asymptotic for the exponential sum using the Siegel–Walfisz theorem on the distribution of in progressions . These arguments are carried out in detail in work of Hua [Hua38]. Summarising briefly, the main term of this asymptotic at will be where satisfies an appropriate van der Corput estimate and satisfies . The assumption (B.1) therefore forces both and . ∎
Finally we give the case of (a slight generalisation of) Lemma 3.2, which was used in the proof of the case of that result. This can be quickly deduced from Lemma B.1 as a consequence of partial summation.
Lemma B.2.
Let and . Suppose that
| (B.4) |
Then there is some such that and .
Proof.
The result is trivial if (say), so suppose this is not the case. We may replace the assumption (B.4) by
(The loss of a further factor of 2 here comes from the essentially negligible contribution of the prime power support of ). Now for we have . Substituting in and applying the triangle inequality gives
By further applications of the triangle inequality and it follows that
The desired conclusion (3.2) now follows from Lemma B.1. ∎
B.1. Effectivity
The proof outline above for Lemma B.1 gives ineffective bounds due to the invocation of the Siegel–Walfisz theorem in Hua’s work. However, one can replace this with a version of the prime number theorem in progressions incorporating an additional correction term for a potential Siegel zero such as [IK-book, Equation (5.71)]. Specifically, for and we have
where here is some quadratic Dirichlet character for which has a Siegel zero . The Siegel zero term introduces a secondary main term in Hua’s asymptotic formula, now of the form , where and . These terms satisfy similar estimates to in the Hua analysis, allowing us to draw an analogous conclusion.