
Bounds for monochromatic solutions to $\{x+y,xy\}$

Ben Green Mathematical Institute, Andrew Wiles Building, Radcliffe Observatory Quarter, Woodstock Rd, Oxford OX2 6QW, UK ben.green@maths.ox.ac.uk and Mehtaab Sawhney Department of Mathematics, Columbia University, New York, NY 10027 m.sawhney@columbia.edu
Abstract.

Let $r$ be a sufficiently large positive integer, and let $N\geqslant\exp\exp(r^{50})$. Then any $r$-colouring of $[N]$ contains a monochromatic copy of $\{x+y,xy\}$ with $x>y>2$.

1. Introduction

The key result of this work is an effective bound on how large $N$ must be in order that every $r$-colouring of $[N]$ contains a monochromatic copy of $\{x+y,xy\}$.

Theorem 1.1.

There is a constant $r_0$ such that the following holds. Let $r\geqslant r_0$ be an integer and let $N\geqslant\exp\exp(r^{50})$. Then any $r$-colouring of $[N]:=\{1,\ldots,N\}$ contains a monochromatic copy of $\{x+y,xy\}$ with $x>y>2$.

Remarks.

The constant $r_0$ is effectively computable. Furthermore, minor tweaks to the numerics in our arguments would allow one to replace $50$ with a slightly smaller constant. However, obtaining a ‘small’ constant (less than 10, say) would appear to require new ideas. Finally, we make no effort to compute an actual value of $r_0$. Due to arguments regarding the possible existence of a Siegel zero in Appendix B (among other reasons), doing so would be rather painful.

In the other direction, for all $r$ there is an $r$-colouring of $[N]$ with no monochromatic $\{x+y,xy\}$ with $x>y>2$ when $N=\frac{1}{2}(3^{r}+7)$, and therefore Theorem 1.1 is at most one logarithm from the optimal result. To obtain such a colouring, use colour $i$ for $[a_i,a_{i+1})$, where $a_i:=\frac{1}{2}(3^{i}+9)$, $i=0,\dots,r-1$, and any colour for $\{1,2,3,4\}$. The point here is that $a_{i+1}=3(a_i-3)$, and so if $x+y\in[a_i,a_{i+1})$ with $x,y\geqslant 3$ then $xy\geqslant 3(x+y-3)\geqslant a_{i+1}$, the first inequality being a rearrangement of $(x-3)(y-3)\geqslant 0$. We never have $x+y$ or $xy\in\{1,2,3,4\}$.
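The colouring in this remark is easy to test by brute force for small $r$. The following Python sketch is illustrative only: the function names `lower_bound_colouring` and `monochromatic_pairs` are our own, and the choice of colour $0$ for $\{1,2,3,4\}$ is arbitrary (the remark allows any colour).

```python
def lower_bound_colouring(r):
    """Return (colour, N) for the colouring in the remark, with N = (3**r + 7) / 2."""
    # a_0 = 5 and a_{i+1} = 3 * (a_i - 3); 3**i + 9 is always even.
    a = [(3**i + 9) // 2 for i in range(r + 1)]
    N = (3**r + 7) // 2                      # N = a_r - 1
    colour = {n: 0 for n in range(1, 5)}     # {1, 2, 3, 4}: any colour works, we pick 0
    for i in range(r):
        for n in range(a[i], a[i + 1]):      # colour i on the interval [a_i, a_{i+1})
            colour[n] = i
    return colour, N


def monochromatic_pairs(colour, N):
    """List all (x, y) with x > y > 2, xy <= N and x + y, xy of the same colour."""
    hits = []
    y = 3
    while y * (y + 1) <= N:                  # need some x > y with x * y <= N
        for x in range(y + 1, N // y + 1):
            # x + y <= x * y <= N here, so both keys exist
            if colour[x + y] == colour[x * y]:
                hits.append((x, y))
        y += 1
    return hits


if __name__ == "__main__":
    for r in range(1, 8):
        colour, N = lower_bound_colouring(r)
        print(r, N, len(monochromatic_pairs(colour, N)))
```

The search is exhaustive over all admissible $(x,y)$ with $xy\leqslant N$, so an empty list confirms the claim for that value of $r$.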

We remark that the problem of obtaining effective bounds for the pattern $\{x+y,xy\}$ has been raised by both the first author [GreOp, Problem 22] and by Richter [Ric25, Question 7.2].

1.1. Previous results

Theorem 1.1 guarantees the existence of infinitely many monochromatic pairs $\{x+y,xy\}$ given a fixed $r$-colouring of $\mathbf{N}$. To see this, suppose that we have found $d$ such monochromatic pairs $\{x_i+y_i,x_iy_i\}$, $i=1,\dots,d$. We modify our colouring of $\mathbf{N}$ to an $(r+2d)$-colouring in which $x_1,\dots,x_d,y_1,\dots,y_d$ are given distinct colours, different to the original $r$, and then use Theorem 1.1 to find a further pair $\{x_{d+1}+y_{d+1},x_{d+1}y_{d+1}\}$. (Alternatively one may observe that our proof of Theorem 1.1 may be trivially modified to give many monochromatic pairs as $N\rightarrow\infty$ for a fixed value of $r$.)

This existential statement was first proven in a celebrated paper of Moreira [Mor17]; in fact, Moreira guarantees a monochromatic pattern of the form $\{x,x+y,xy\}$. This result represents substantial progress towards Hindman's conjecture that any $r$-colouring of $\mathbf{N}$ contains a monochromatic copy of $\{x,y,x+y,xy\}$. Recently there has been further important progress towards Hindman's conjecture in various settings. Bowen [Bow25] has proven that any $2$-colouring of $\mathbf{N}$ contains infinitely many monochromatic copies of $\{x,y,x+y,xy\}$. Bowen and Sabok [BS24] have proven that any $r$-colouring of $\mathbf{Q}^{\neq 0}$ contains a monochromatic copy of $\{x,y,x+y,xy\}$, and Alweiss [Alw23] extended this to patterns of the form $\{\sum_{i\in S}x_i,\prod_{i\in S}x_i\}$ where $S\subseteq[k]$ ranges over all nontrivial subsets. Additionally, Alweiss [Alw24] has given an alternate proof of the result of Moreira. However, even when restricting to $\{x+y,xy\}$, the proofs of Moreira and Alweiss give at least tower-type bounds due to highly recursive Ramsey-type arguments. We remark that while the main argument of Moreira is purely qualitative, he indicates in [Mor17, Section 5] a variant argument using van der Waerden's theorem (or Szemerédi's theorem) which does give explicit finite bounds when used with appropriate bounds for Szemerédi's theorem due to Gowers [Gow01].

Recently, Richter [Ric25] provided a quite different, more analytic, proof of Moreira's result about $\{x+y,xy\}$. The argument of Richter is quite infinitary in flavour and gives no bounds. However, as will be discussed shortly, our methods in this paper are very strongly influenced by those of Richter.

One may additionally compare Theorem 1.1 with bounds for certain Schur-type equations. For instance, for the configuration $\{x,y,x+y\}$, bounds of the form $\exp(r^{O(1)})$ are known due to work of Cwalina and Schoen [CS17]. Note that this (by restricting to powers of $2$) gives an essentially double-exponential bound for $\{x,y,xy\}$. Furthermore, for more general linear systems $A$, bounds of the form $\exp\exp(r^{O_A(1)})$ are proven in generality by Sanders [San20], and good control on the implicit constant $O_A(1)$ for many systems may be found in work of Chapman and Prendiville [CP20].

1.2. Proof outline

Our work draws heavily on recent beautiful work of Richter [Ric25]; many of the ideas presented in this section are drawn from this work.

Logarithmic averages play a central role, so we define these before turning to an outline of the proof. If $\mathcal{N}$ is a finite set of positive integers and if $f:\mathcal{N}\rightarrow\mathbf{C}$ is a function, we write

\[ \mathbb{E}_{n\in\mathcal{N}}^{\log}f(n):=\frac{\sum_{n\in\mathcal{N}}f(n)/n}{\sum_{n\in\mathcal{N}}1/n}. \]

We write $\mathbb{E}_{n_1\in\mathcal{N}_1,n_2\in\mathcal{N}_2}^{\log}$ as a shorthand for $\mathbb{E}_{n_1\in\mathcal{N}_1}^{\log}\mathbb{E}^{\log}_{n_2\in\mathcal{N}_2}$ (and similarly for higher iterates). We will often use this notation when $\mathcal{N}=[N]=\{1,\dots,N\}$.
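To make the definition concrete, here is a minimal Python sketch of the logarithmic average and its two-variable iterate. The function names `log_average` and `iterated_log_average` are our own, and floating-point arithmetic is used purely for illustration.

```python
def log_average(f, ns):
    """Logarithmic average of f over the finite set ns of positive integers:
    (sum of f(n)/n) divided by (sum of 1/n)."""
    ns = list(ns)
    weights = [1.0 / n for n in ns]
    return sum(w * f(n) for w, n in zip(weights, ns)) / sum(weights)


def iterated_log_average(f, ns1, ns2):
    """Shorthand E^log_{n1 in ns1, n2 in ns2} f(n1, n2), i.e. the iterated average."""
    return log_average(lambda n1: log_average(lambda n2: f(n1, n2), ns2), ns1)


if __name__ == "__main__":
    evens = lambda n: 1.0 if n % 2 == 0 else 0.0
    # On [4]: (1/2 + 1/4) / (1 + 1/2 + 1/3 + 1/4) = 9/25
    print(log_average(evens, range(1, 5)))
```

Unlike the uniform average, the logarithmic average gives weight $1/n$ to $n$, so it is much less sensitive to what happens at the top end of $[N]$.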

Suppose now that $[N]=A_1\cup\cdots\cup A_r$ is an $r$-colouring of $[N]=\{1,\dots,N\}$ in which we seek to find a monochromatic pair $x+y,xy$. The colour class in which this pair will be found is identified right at the very start of the proof. We take $B_0$ to be a fixed set of $r^{O(1)}$ ‘highly divisible’ numbers; the precise set we take is $B_0:=\{V^{4^i}:i=1,2,\dots,r^{C_1}\}$, where $V=(r^{C_2})!$ for appropriate constants $C_1,C_2$. By the pigeonhole principle there is some $A=A_\ell$ which contains many multiples of elements of $B_0$, in the sense that $\mathbb{E}_{n\in[N]}^{\log}1_A(bn)\gg 1/r$ for $\gg r^{C_1-1}$ elements $b\in B_0$. We will find the desired configuration $\{x+y,xy\}$ in this colour class, which we fix for the rest of the argument.

The next key idea, which follows [Ric25] very closely, is to locate a ‘rich’ set of pairs $\{x,xy\}$ in $A$. This is done using a variant of arguments of Ahlswede, Khachatrian and Sárközy [AKS99] and Davenport and Erdős [DE36]. This argument involves the choice of various auxiliary sets of primes (for details see Section 7.1), and a key component is Elliott's inequality from multiplicative number theory (given in Lemma A.5 in the form we shall need). The output of this argument is many instances of the inequality

\[ \mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log}1_A(bn)1_A(b'np_1\cdots p_k)\gg r^{-O(1)} \tag{1.1} \]

for some fixed $b\in B_0$ and many $b'\in B_0$ with $b<b'$ and associated $k$ where $2\leqslant k\ll r^{O(1)}$, and where the sets $\mathscr{P}_i$ of primes can be chosen at many different scales. (The precise statement we are sketching here may be found at (7.6).) This provides the aforementioned rich source of configurations $\{x,xy\}$, here with $x:=bn$ and $y:=\frac{b'}{b}p_1\cdots p_k$, so that $xy=b'np_1\cdots p_k$.

The main business of the proof is a kind of deformation of the patterns $x,xy$ to the desired $x+y,xy$. To describe how this works, fix an instance of (1.1) (that is, fix $b'$ and the sets $\mathscr{P}_i$ of primes). Set $f(n):=1_A(bn)$. We will then consider two ‘projections’ $\Pi^{\operatorname{sml}}f$ and $\Pi^{\operatorname{lrg}}f$, both of which average over progressions. They are defined by

\[ \Pi^{\operatorname{sml}}f(n):=\mathbb{E}_{h,h'\in[H]}f(n+q(h-h'))\qquad\mbox{and}\qquad\Pi^{\operatorname{lrg}}f(n):=\mathbb{E}_{h,h'\in[\tilde{H}]}f(n+\tilde{q}(h-h')), \]

where $q\mid\tilde{q}$ and $H>\tilde{H}$. (The actual choice of parameters depends on the scale of the sets of primes $\mathscr{P}_i$; the details are given at (7.7).) One should think of $q,\tilde{q}$ as being bounded in terms of $r$, whereas the lengths $H,\tilde{H}$ grow with $N$.
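The two projections differ only in their parameters, so a single routine illustrates both. The sketch below is a naive $O(H^2)$ implementation under the assumption that $f$ is given as a Python function defined on all integers (extended by zero if necessary); the name `projection` is ours.

```python
def projection(f, q, H):
    """Return the function n -> E_{h, h' in [H]} f(n + q * (h - h')).

    With (q, H) this models the small projection; with the tilde parameters
    it models the large one. f must accept arbitrary integers.
    """
    def projected(n):
        total = 0.0
        for h in range(1, H + 1):
            for h_prime in range(1, H + 1):
                total += f(n + q * (h - h_prime))
        return total / H**2
    return projected


if __name__ == "__main__":
    periodic = lambda n: 1.0 if n % 3 == 0 else 0.0
    smoothed = projection(periodic, q=3, H=5)
    # A q-periodic function is left unchanged by averaging along multiples of q.
    print([smoothed(n) for n in range(6)])
```

The example shows the basic mechanism: functions that are periodic modulo $q$ are fixed by the projection, while functions oscillating along progressions of common difference $q$ are damped.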

The small projection $\Pi^{\operatorname{sml}}$ is chosen so that we may run the following argument, starting from (1.1). First, via a kind of maximal function argument, we replace (1.1) by

\[ \mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log}\Pi^{\operatorname{sml}}f(n)1_A(b'np_1\cdots p_k)\gg r^{-O(1)}. \tag{1.2} \]

Details of this argument may be found in Lemma 6.5.

Then, we use the almost-periodicity property $\Pi^{\operatorname{sml}}f(n)\approx\Pi^{\operatorname{sml}}f(n+\frac{b'}{b^2}p_1\cdots p_k)$ to replace (1.2) by

\[ \mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log}\Pi^{\operatorname{sml}}f\big(n+\frac{b'}{b^2}p_1\cdots p_k\big)1_A(b'np_1\cdots p_k)\gg r^{-O(1)}. \tag{1.3} \]

(Note here that $b'/b^2$ is an integer by the highly divisible nature of the set $B_0$.) In order for this almost-periodicity property to hold, the small projection $\Pi^{\operatorname{sml}}$ must be chosen appropriately: $q$ must divide $\frac{b'}{b^2}p_1\cdots p_k$ and $H$ must be sufficiently long.
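The integrality of $b'/b^2$ comes from the exponents: if $b=V^{4^i}$ and $b'=V^{4^j}$ with $j>i$, then $4^j\geqslant 4\cdot 4^i>2\cdot 4^i$. The following Python sketch checks this on a toy version of $B_0$; the value $V=3!$ and the number of elements are stand-ins chosen for illustration, not the parameters used in the proof.

```python
from math import factorial


def highly_divisible_set(V, count):
    """Return [V**(4**i) for i = 1, ..., count], a toy model of the set B_0."""
    return [V ** (4 ** i) for i in range(1, count + 1)]


def shifts_are_integral(B0):
    """Check that b' / b**2 is an integer for all b < b' in the increasing list B0."""
    return all(b_prime % (b * b) == 0
               for i, b in enumerate(B0)
               for b_prime in B0[i + 1:])


if __name__ == "__main__":
    B0 = highly_divisible_set(factorial(3), 4)   # hypothetical small parameters
    print(shifts_are_integral(B0))
```

By contrast, a set such as $\{V^2,V^3\}$ fails the check, which is why the exponents are taken to grow geometrically with ratio $4$.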

Leaving (1.3) aside for the moment, the technical heart of the proof is then an argument to the effect that (for an appropriate choice of the large projection $\Pi^{\operatorname{lrg}}$) we have

\[ \begin{aligned} \mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log} & \Pi^{\operatorname{lrg}}f\big(n+\frac{b'}{b^2}p_1\cdots p_k\big)1_A(b'np_1\cdots p_k) \\ & \approx\mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log}f\big(n+\frac{b'}{b^2}p_1\cdots p_k\big)1_A(b'np_1\cdots p_k). \end{aligned} \tag{1.4} \]

Supposing that this has been established, imagine that we additionally have

\[ \Pi^{\operatorname{sml}}f\approx\Pi^{\operatorname{lrg}}f \tag{1.5} \]

(in an $\ell^2$ sense). Combining (1.3), (1.4) and (1.5) then gives, assuming the various uses of $\approx$ work in our favour, that

\[ \mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log}f\big(n+\frac{b'}{b^2}p_1\cdots p_k\big)1_A(b'np_1\cdots p_k)\gg r^{-O(1)}. \]

Recalling that $f(n)=1_A(bn)$, it then follows that for some choice of $n$ and $p_1,\dots,p_k$ we have $bn+\frac{b'}{b}p_1\cdots p_k,\ b'np_1\cdots p_k\in A$. This is the desired configuration $\{x+y,xy\}$, with $x=bn$ and $y=\frac{b'}{b}p_1\cdots p_k$.

Whilst (1.5) will not be true in general (the projections $\Pi^{\operatorname{sml}},\Pi^{\operatorname{lrg}}$ are quite different in scale), an ‘energy-chaining’ or arithmetic regularity type of argument can be used to show that (1.5) does hold for at least one scale of primes $\mathscr{P}_1,\dots,\mathscr{P}_k$. This part of the argument can be thought of as a quantitative version of the existence of projections in Hilbert space, specifically of the decomposition into locally aperiodic and locally quasiperiodic functions which is important in Richter's work. This connection between existence of projections in Hilbert space and regularity lemmas is by now well established; see e.g. [Tao07, Section 2].

The remaining part of the argument is then to justify (1.4). This is done via a general study of averages

\[ \mathbb{E}_{n\in[N],p_1\in\mathscr{P}_1,\dots,p_k\in\mathscr{P}_k}^{\log}f_1(n+\lambda p_1\cdots p_k)f_2(np_1\cdots p_k), \tag{1.6} \]

where $\lambda=b'/b^2$ in our setting. Here, we consider arbitrary $1$-bounded functions $f_1,f_2$, and the key question of interest is the ‘inverse question’ of what can be said if (1.6) is at least $\delta$ in magnitude for some $\delta>0$. Our main result on this topic, Proposition 5.1, is an inverse theorem for this question. It concludes that under such a hypothesis (and with suitable assumptions on the sets $\mathscr{P}_i$ of primes) the function $f_1$ is biased along progressions to some modulus $\tilde{q}=\lambda\lfloor\delta^{-C}\rfloor!$ and length $H$ comparable (in logarithmic scale) to the largest of the primes in the sets $\mathscr{P}_i$. The statement (1.4) follows very quickly from this inverse theorem (see Lemma 6.3 for the argument).

This inverse theorem, Proposition 5.1, is the most novel part of our paper. Whilst it is in a sense a quantitative, finitary version of [Ric25, Theorem 3.5], it is not a direct translation of that result, which would appear to be far too weak for our purposes. The key difference when unwinding the argument in [Ric25] in finitary language is that the latter finds bias along progressions with size depending on $\mathscr{P}_i$, while ours depends only on $\delta$. The proof of Proposition 5.1 is lengthy, and involves a Fourier-analytic argument combined with Cauchy–Schwarz manœuvres inspired by certain ‘concatenation’ results in the additive combinatorics literature, for instance [PP24, Pel20]. Ultimately it is these concatenation ideas which eliminate the dependence on $\mathscr{P}_i$. Key further ingredients are:

  • Quantitative diophantine approximation results (Lemma 2.2);

  • ‘Log-free’ exponential sum estimates for certain arithmetic sets, specifically sets $\mathscr{P}'=\{p_2\cdots p_k:p_2\in I_2,\dots,p_k\in I_k\}$ of ‘almost primes’, as well as the sets of squares of the elements of such sets (Section 3);

  • Construction of a majorant for the primes with a certain Fourier decomposition (Section 4), in order to avoid the constant $r_0$ in our main result being ineffective due to possible Siegel zeros.

1.3. Acknowledgments

BG is supported by Simons Investigator Award 376201. This research was conducted during the period MS served as a Clay Research Fellow.

1.4. Notation

At various points, for brevity it will be expedient to use the following notation. If $f:\mathbf{Z}\rightarrow\mathbf{C}$ is a function and if $h,h'\in\mathbf{Z}$, we write $\Delta_{(h,h')}f(x):=f(x+h)\overline{f(x+h')}$. If $\lambda$ is some further integer parameter, by $\Delta_{\lambda(h,h')}f$ we mean $\Delta_{(\lambda h,\lambda h')}f$.

By a dyadic interval we mean any subset of $\mathbf{N}$ of the form $\{n:Y\leqslant n<2Y\}$. We will occasionally abuse notation by writing $[H]$ when we really mean $[\lfloor H\rfloor]$, for some $H\in\mathbf{R}_{\geqslant 1}$.

When we say that a parameter (for instance $\delta$) is ‘sufficiently small’ we mean that $\delta\leqslant\delta_0$ for some absolute $\delta_0$ which we do not explicitly specify, and analogously if we say that $N$ is ‘sufficiently large’ we mean that $N\geqslant N_0$ for some absolute constant $N_0$. It is important to remark that $\delta_0,N_0$ are absolute and do not depend on the number of colours $r$ (otherwise our results would have little content). Throughout the paper the letter $N$ will always denote a sufficiently large integer parameter.

We write $(x,y)$ for the greatest common divisor of $x,y$ and $[x,y]$ for the lowest common multiple.

2. Diophantine sets and averages

The purpose of this section is to bound certain averages that will appear in the arguments of the next section, where our key technical result is established. The averages in question will be of the form

\[ \mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{s\in S,\,t,t'\in[T]}f(n+ts)\overline{f(n+t's)}=\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{s\in S,\,t,t'\in[T]}\Delta_{s(t,t')}f(n), \]

where $S\subset\mathbf{N}$ is contained in some dyadic interval, or the analogous average with $\mathbb{E}_{n\in[N]}$ in place of the logarithmic average. The main result of the section is Lemma 2.4 below.

In our applications the set $S$ will have a useful arithmetic property, namely that it satisfies a ‘log-free Weyl-type estimate’. The precise definition we will use is the following.

Definition 2.1.

Let $L,L',D$ be parameters. Let $S$ be a set of integers. Suppose that whenever $\delta\in(0,\frac{1}{2})$ and $|\mathbb{E}_{s\in S}e(\theta s)|\geqslant\delta$, there is some natural number $q$, $q\leqslant(L'/\delta)^{L}$, such that $\|q\theta\|_{\mathbf{R}/\mathbf{Z}}\leqslant(L'/\delta)^{L}/D$. Then we say that $S$ is $(L,L',D)$-diophantine.

Remarks.

Note that the definition is invariant under translation of $S$. In applications the parameter $D$ will be comparable to the diameter of $S$, but it is convenient not to simply set $D:=\operatorname{diam}(S)$, since this would lead to unnecessary estimations of the diameter of $S$ in some situations. Being diophantine with $D\asymp\operatorname{diam}(S)$ (for some $L,L'$) is a common property of sets of integers. For instance, (the log-free variant of) Weyl's inequality asserts that the set of $j$th powers in $[D]$ is $(L,L',D)$-diophantine with appropriate parameters $L,L'\ll_j 1$; the set of $j$th powers of primes in $[D]$ is also $(L,L',D)$-diophantine for some $L,L'\ll_j 1$. In fact, we will use the latter fact in our argument; for the proof see Lemma B.2.
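The quantity $|\mathbb{E}_{s\in S}e(\theta s)|$ in Definition 2.1 is straightforward to compute numerically. The Python sketch below does so for the set of squares, as an illustration of the phenomenon the definition captures: the average is large at a rational with small denominator such as $\theta=1/3$. The helper names `e` and `exp_sum` are ours, and the example makes no attempt to verify the diophantine property itself.

```python
import cmath


def e(x):
    """The additive character e(x) = exp(2 * pi * i * x)."""
    return cmath.exp(2j * cmath.pi * x)


def exp_sum(S, theta):
    """Return |E_{s in S} e(theta * s)| for a finite collection S of integers."""
    S = list(S)
    return abs(sum(e(theta * s) for s in S)) / len(S)


if __name__ == "__main__":
    squares = [m * m for m in range(1, 301)]
    # Squares are 0, 1, 1 modulo 3 in equal proportions for m = 1..300, so the
    # average at theta = 1/3 has modulus |1 + 2 e(1/3)| / 3 = 1 / sqrt(3).
    print(exp_sum(squares, 1 / 3))
    # Translating S only multiplies the average by a unimodular constant.
    print(exp_sum([s + 17 for s in squares], 1 / 3))
```

The second print illustrates the translation invariance noted in the remark above.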

Before turning to the statement and proof of the main results, we isolate the following lemma, which is of a standard type in the analysis of exponential sums. A proof of this particular variant may be found in [Gre25, Lemma C.1] (we have changed some dummy variables to avoid conflicts with the present paper).

Lemma 2.2.

Suppose that $\alpha\in\mathbf{R}$ and that $T\geqslant 1$ is an integer. Suppose that $\delta_1,\delta_2$ are positive real numbers satisfying $\delta_2\geqslant 32\delta_1$, and suppose that there are at least $\delta_2T$ elements $t\in[T]$ for which $\|\alpha t\|_{\mathbf{R}/\mathbf{Z}}\leqslant\delta_1$. Suppose that $T\geqslant 16/\delta_2$. Then there is some positive integer $q\leqslant 16/\delta_2$ such that $\|\alpha q\|_{\mathbf{R}/\mathbf{Z}}\leqslant\delta_1\delta_2^{-1}T^{-1}$.
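As a sanity check on the shape of Lemma 2.2, one can test it by brute force on a single example. The Python sketch below is illustrative only: the helper names and the parameters ($\alpha=\frac{1}{5}+5\cdot 10^{-7}$, $T=1000$, $\delta_1=10^{-3}$, $\delta_2=0.19$) are our own choices, picked so that the hypotheses hold with room to spare in floating-point arithmetic.

```python
def dist_to_nearest_int(x):
    """Return the distance from x to the nearest integer."""
    return abs(x - round(x))


def check_lemma(alpha, T, delta1, delta2):
    """Return (hypotheses_hold, q), where q is the least positive integer
    q <= 16 / delta2 with ||alpha q|| <= delta1 / (delta2 * T), or None."""
    good = sum(1 for t in range(1, T + 1)
               if dist_to_nearest_int(alpha * t) <= delta1)
    hypotheses_hold = (delta2 >= 32 * delta1
                       and good >= delta2 * T
                       and T >= 16 / delta2)
    bound = delta1 / (delta2 * T)
    q_max = int(16 / delta2)
    witness = next((q for q in range(1, q_max + 1)
                    if dist_to_nearest_int(alpha * q) <= bound), None)
    return hypotheses_hold, witness


if __name__ == "__main__":
    # alpha is close to 1/5, so the 200 multiples of 5 in [1000] all satisfy
    # ||alpha t|| <= 5e-4, and the lemma should recover a small q (here q = 5).
    print(check_lemma(0.2 + 5e-7, 1000, 1e-3, 0.19))
```

The point of the lemma is visible in the example: many $t$ with $\alpha t$ close to an integer force $\alpha$ itself to be very close to a rational with small denominator.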

We next give the definition of certain norms describing bias of functions along arithmetic progressions.

Definition 2.3.

Let $f:\mathbf{Z}\rightarrow\mathbf{C}$ be a function. Let $q\in\mathbf{N}$ and $H\in\mathbf{N}$ be parameters. Set

\[ \|f\|_{U^{1}_{\log}[N;q,H]}^{2}:=\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}f(n+hq)\big|^{2}=\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h,h'\in[H]}\Delta_{q(h,h')}f(n) \tag{2.1} \]

and

\[ \|f\|_{U^{1}[N;q,H]}^{2}:=\mathbb{E}_{n\in[N]}\big|\mathbb{E}_{h\in[H]}f(n+hq)\big|^{2}=\mathbb{E}_{n\in[N]}\mathbb{E}_{h,h'\in[H]}\Delta_{q(h,h')}f(n). \tag{2.2} \]

The logarithmic norm (2.1) will play the more prominent role in our analysis, with the uniform norm (2.2) being relegated to a more modest technical role in Lemma 2.6. We record that, roughly speaking, we have $\|f\|_{U^{1}_{\log}[N;q,H]}\lessapprox\|f\|_{U^{1}_{\log}[N;\tilde{q},\tilde{H}]}$ if $q\mid\tilde{q}$ and $\tilde{H}\tilde{q}<Hq$ (for a precise statement, see Lemma A.6). In particular, for fixed $q$ the information that $\|f\|_{U^{1}_{\log}[N;q,H]}$ is large becomes weaker as $H$ becomes smaller. We are now ready for the first main result of the section, which could potentially have other applications.
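The two expressions for each norm in Definition 2.3 can be compared numerically. The following Python sketch computes the squared norm both ways; the function names are ours, the implementation is a direct transcription of the formulas rather than an efficient one, and $f$ is assumed to be a Python function on the integers.

```python
def u1_norm_sq(f, N, q, H, logarithmic=True):
    """Squared U^1 norm: average over n in [N] of |E_{h in [H]} f(n + h q)|^2,
    with weights 1/n if logarithmic, and uniform weights otherwise."""
    num, den = 0.0, 0.0
    for n in range(1, N + 1):
        w = 1.0 / n if logarithmic else 1.0
        inner = sum(f(n + h * q) for h in range(1, H + 1)) / H
        num += w * abs(inner) ** 2
        den += w
    return num / den


def u1_norm_sq_via_delta(f, N, q, H, logarithmic=True):
    """The same quantity written as an average of
    Delta_{q(h,h')} f(n) = f(n + q h) * conj(f(n + q h'))."""
    num, den = 0.0, 0.0
    for n in range(1, N + 1):
        w = 1.0 / n if logarithmic else 1.0
        avg = sum(f(n + q * h) * complex(f(n + q * hp)).conjugate()
                  for h in range(1, H + 1)
                  for hp in range(1, H + 1)) / H**2
        num += w * avg
        den += w
    return num / den


if __name__ == "__main__":
    sign = lambda n: (-1) ** n
    print(u1_norm_sq(sign, 50, 1, 2))   # no bias along progressions of difference 1
    print(u1_norm_sq(sign, 50, 2, 4))   # full bias along progressions of difference 2
```

The function $(-1)^n$ shows why the modulus $q$ matters: it has vanishing norm for $q=1$ and even $H$, but norm $1$ for $q=2$.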

Lemma 2.4.

Let $\delta$ be a sufficiently small positive parameter and let $L,L',D\geqslant 1$. Let $S\subset\mathbf{Z}$ be $(L,L',D)$-diophantine with $S\subset[-4D,4D]$, and let $T\in\mathbf{N}$ be a parameter. Suppose that $D,T\geqslant(L'/\delta)^{8L}$ and that $\frac{\log TD}{\log N}\leqslant(\delta/L')^{50L}$. Let $H$ be any positive integer with $H\leqslant(\delta/L')^{50L}TD$. Let $f:\mathbf{N}\rightarrow\mathbf{C}$ be $1$-bounded and suppose that we have

\[ \mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{t,t'\in[T]}\mathbb{E}_{s\in S}f(n+ts)\overline{f(n+t's)}\geqslant\delta. \tag{2.3} \]

Then there exists $q\in\mathbf{N}$, $q\leqslant(L'/\delta)^{8L}$, such that $\|f\|_{U^{1}_{\log}[N;q,H]}\geqslant(\delta/L')^{25L}$.

Remark.

Note here that $q$ may depend on $f$, but we are free to specify $H$ subject to the stated upper bound condition.

Proof.

Throughout the proof we assume that $\delta_0$ is sufficiently small without further comment. The proof is Fourier-analytic; closely related arguments have appeared as base cases for various ‘concatenation’ results (see e.g. [PP24, Lemma 5.3] or [Pel20, Lemma 5.4]). By (A.2) applied with $h=(t-t')s$ we have

\[ \mathbb{E}_{n\in[N]}^{\log}f(n)\mathbb{E}_{t,t'\in[T]}\mathbb{E}_{s\in S}\overline{f(n+(t-t')s)}\geqslant\delta/2, \]

which for brevity we write

\[ \mathbb{E}_{n\in[N]}^{\log}f(n)\mathbb{E}_{u\in[T]-[T]}\mathbb{E}_{s\in S}\overline{f(n+us)}\geqslant\delta/2, \]

with the understanding that $[T]-[T]$ is considered with multiplicity. By Cauchy–Schwarz this gives that

\[ \mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{u,u'\in[T]-[T]}\mathbb{E}_{s,s'\in S}f(n+us)\overline{f(n+u's')}\geqslant\delta^{2}/4. \]

By a further application of (A.2), followed by the triangle inequality, we have that

\[ \mathbb{E}_{n\in[N]}^{\log}\Big|\mathbb{E}_{h\in[TD]-[TD]}\mathbb{E}_{u,u'\in[T]-[T]}\mathbb{E}_{s,s'\in S}f(n+h+us)\overline{f(n+h+u's')}\Big|\geqslant\delta^{2}/8. \]

Denote

\[ \mathcal{N}_0:=\big\{n\in[N]:\big|\mathbb{E}_{h\in[TD]-[TD]}\mathbb{E}_{u,u'\in[T]-[T]}\mathbb{E}_{s,s'\in S}f(n+h+us)\overline{f(n+h+u's')}\big|\geqslant\delta^{2}/16\big\}. \]

By a simple averaging argument we have

\[ \mathbb{E}^{\log}_{n\in[N]}1_{\mathcal{N}_0}(n)\geqslant\delta^{2}/16. \tag{2.4} \]

For the time being, let $n\in\mathcal{N}_0$ be fixed. Defining $g_n:\mathbf{Z}\rightarrow\mathbf{C}$ by $g_n(m)=f(n+m)$ for $|m|\leqslant 16TD$ and $0$ otherwise, we have from the definition of $\mathcal{N}_0$ that

\[ \big|\mathbb{E}_{h\in[TD]-[TD]}\mathbb{E}_{u,u'\in[T]-[T]}\mathbb{E}_{s,s'\in S}g_n(h+us)\overline{g_n(h+u's')}\big|\geqslant\delta^{2}/16. \]

Note here that $|h+us|,|h+u's'|\leqslant 16TD$, using here that $S\subset[-4D,4D]$. Taking the Fourier expansion $g_n(m)=\int_{\mathbf{R}/\mathbf{Z}}\widehat{g}_n(\theta)e(\theta m)\,d\theta$ and applying the triangle inequality, this gives

\[ \int_{(\mathbf{R}/\mathbf{Z})^{2}}\big|\widehat{g}_n(\theta)\widehat{g}_n(\theta')\big|K(\theta,\theta')\,d\theta\,d\theta'\geqslant\delta^{2}/16, \tag{2.5} \]

where

\[ K(\theta,\theta'):=\big|\mathbb{E}_{h\in[TD]-[TD]}e\big((\theta-\theta')h\big)\psi(\theta)\psi(\theta')\big| \]

with

\[ \psi(\theta):=\mathbb{E}_{u\in[T]-[T]}\mathbb{E}_{s\in S}e(\theta us). \tag{2.6} \]

Now by bounding the $\psi(\cdot)$ terms trivially by $1$ and using that

\[ |\mathbb{E}_{h\in[TD]-[TD]}e((\theta-\theta')h)|=|\mathbb{E}_{h\in[TD]}e((\theta-\theta')h)|^{2}\ll(TD)^{-2}\lVert\theta-\theta'\rVert_{\mathbf{R}/\mathbf{Z}}^{-2}, \]

we have $K(\theta,\theta')\ll\min(1,(TD)^{-2}\lVert\theta-\theta'\rVert_{\mathbf{R}/\mathbf{Z}}^{-2})$. From this, Cauchy–Schwarz and Parseval it follows that

\[ \int_{\mathbf{R}/\mathbf{Z}}\big|\widehat{g}_n(\theta)\widehat{g}_n(\theta+\alpha)\big|K(\theta,\theta+\alpha)\,d\theta\ll\Big(\int_{\mathbf{R}/\mathbf{Z}}|\widehat{g}_n(\theta)|^{2}\,d\theta\Big)(TD)^{-2}\|\alpha\|_{\mathbf{R}/\mathbf{Z}}^{-2}\ll(TD)^{-1}\|\alpha\|_{\mathbf{R}/\mathbf{Z}}^{-2}. \]
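The kernel identity used above, namely that averaging $e(\theta h)$ over $[M]-[M]$ with multiplicity gives the squared modulus of the average over $[M]$, can be checked numerically, together with the standard decay bound $|\mathbb{E}_{h\in[M]}e(\theta h)|\leqslant 1/(2M\|\theta\|_{\mathbf{R}/\mathbf{Z}})$. The Python sketch below is illustrative; the function names are ours and $M$ plays the role of $TD$.

```python
import cmath


def e(x):
    """The additive character e(x) = exp(2 * pi * i * x)."""
    return cmath.exp(2j * cmath.pi * x)


def avg_interval(theta, M):
    """Return E_{h in [M]} e(theta h)."""
    return sum(e(theta * a) for a in range(1, M + 1)) / M


def avg_difference_set(theta, M):
    """Return E_{h in [M] - [M]} e(theta h), with [M] - [M] taken with
    multiplicity, i.e. averaging over all pairs (a, b) in [M]^2 with h = a - b."""
    return sum(e(theta * (a - b))
               for a in range(1, M + 1)
               for b in range(1, M + 1)) / M**2


def dist_to_nearest_int(x):
    """Return the distance from x to the nearest integer."""
    return abs(x - round(x))


if __name__ == "__main__":
    theta, M = 0.137, 40
    print(abs(avg_difference_set(theta, M)), abs(avg_interval(theta, M)) ** 2)
    print(1 / (2 * M * dist_to_nearest_int(theta)) ** 2)   # upper bound for both
```

The first line prints two equal numbers (up to rounding), and the second prints the bound $(2M\|\theta\|)^{-2}$ that dominates them.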

Integrating over $\alpha\in\mathbf{R}/\mathbf{Z}$, we see that the contribution to (2.5) from $\|\alpha\|_{\mathbf{R}/\mathbf{Z}}\geqslant C\delta^{-2}/TD$ is negligible for $C$ sufficiently large, that is to say

\[ \int_{\|\theta-\theta'\|_{\mathbf{R}/\mathbf{Z}}\leqslant C\delta^{-2}/TD}\big|\widehat{g}_n(\theta)\widehat{g}_n(\theta')\big|K(\theta,\theta')\,d\theta\,d\theta'\geqslant\delta^{2}/32. \]

Therefore, bounding the geometric series part of $K$ trivially by $1$,

\[ \int_{\|\theta-\theta'\|_{\mathbf{R}/\mathbf{Z}}\leqslant C\delta^{-2}/TD}\big|\widehat{g}_n(\theta)\widehat{g}_n(\theta')\psi(\theta)\psi(\theta')\big|\,d\theta\,d\theta'\geqslant\delta^{2}/32. \]

In particular, for some $\alpha\in\mathbf{R}/\mathbf{Z}$ we have

\[ \int_{\mathbf{R}/\mathbf{Z}}\big|\widehat{g}_n(\theta)\widehat{g}_n(\theta+\alpha)\psi(\theta)\psi(\theta+\alpha)\big|\,d\theta\gg\delta^{4}TD. \]

Using the AM–GM inequality $x^{2}+y^{2}\geqslant 2xy$ with $x=|\widehat{g}_n(\theta)\psi(\theta)|$ and $y=|\widehat{g}_n(\theta+\alpha)\psi(\theta+\alpha)|$, it follows that

\[ \int_{\mathbf{R}/\mathbf{Z}}|\widehat{g}_n(\theta)|^{2}|\psi(\theta)|^{2}\,d\theta\gg\delta^{4}TD. \tag{2.7} \]

By Parseval's identity we have $\int_{\mathbf{R}/\mathbf{Z}}|\widehat{g}_n(\theta)|^{2}\,d\theta\ll TD$, and so for sufficiently small $c_1$ we have

\[ \int_{|\psi(\theta)|\geqslant c_1\delta^{2}}|\widehat{g}_n(\theta)|^{2}|\psi(\theta)|^{2}\,d\theta\gg\delta^{4}TD. \tag{2.8} \]

To proceed further we need to analyse the $\theta$ for which $|\psi(\theta)|\geqslant c_1\delta^{2}$. Suppose in the following discussion that $\theta$ has this property. Recalling the definition (2.6) of $\psi$, it follows that $\mathbb{E}_{u\in[T]-[T]}\big|\mathbb{E}_{s\in S}e(\theta us)\big|\geqslant c_1\delta^{2}$. Writing $\mathcal{U}:=\{u\in[T]:\big|\mathbb{E}_{s\in S}e(\theta us)\big|\geqslant c_1\delta^{2}/2\}$, we see that $\mu_{[T]-[T]}(\mathcal{U}\cup-\mathcal{U})\gg\delta^{2}$, where $\mu_{[T]-[T]}$ denotes the natural weighted probability measure on $[T]-[T]$. Since $\mu_{[T]-[T]}(x)\leqslant 1/T$ pointwise, it follows that $|\mathcal{U}|\gg\delta^{2}T$.

We now apply the diophantine assumption on $S$. We conclude that for each $u\in\mathcal{U}$ there is some nonzero $q_u\ll(L'/\delta)^{2L}$ such that $\|q_uu\theta\|_{\mathbf{R}/\mathbf{Z}}\ll(L'/\delta)^{2L}/D$. By further refining the set of $u$ (to a set of size $\gg(\delta/L')^{4L}T$) we may assume that $q_u$ does not depend on $u$. Denote this common value by $q_0$.

Now we apply Lemma 2.2, taking $\alpha=q_0\theta$, $\delta_2\gg(\delta/L')^{4L}$ and $\delta_1=(L'/\delta)^{2L}/D$. One can check that the conditions of Lemma 2.2 are consequences of the hypothesised lower bounds on $D$ and $T$, provided $C$ is large enough. The conclusion of the lemma is then that there is some $q\ll(L'/\delta)^{4L}$ such that $\|\alpha q\|_{\mathbf{R}/\mathbf{Z}}\ll(L'/\delta)^{6L}/TD$. Taking $q':=qq_0$, we see that $q'\ll(L'/\delta)^{6L}$ and $\|\theta q'\|_{\mathbf{R}/\mathbf{Z}}\ll(L'/\delta)^{8L}/TD$.

It follows from this analysis and (2.8) that $\int_{\theta\in\Theta}|\widehat{g}_n(\theta)|^{2}\,d\theta\gg\delta^{4}TD$, where $\Theta$ is the set of all $\theta$ for which $\|\theta q\|_{\mathbf{R}/\mathbf{Z}}\leqslant(L'/\delta)^{8L}/TD$ for some $q\in\mathbf{N}$ with $q\ll(L'/\delta)^{6L}$. Since the measure of $\Theta$ is $\ll(L'/\delta)^{14L}/TD$, there is some $\theta_n\in\Theta$ such that

\[ |\widehat{g}_n(\theta_n)|\gg(\delta/L')^{18L}TD. \tag{2.9} \]

By refining 𝒩0\mathcal{N}_{0} we may, using ˜2.4, find 𝒩1βŠ‚π’©0\mathcal{N}_{1}\subset\mathcal{N}_{0} such that

𝔼n∈[N]log​1𝒩1​(n)≫(Ξ΄/Lβ€²)8​L\mathbb{E}_{n\in[N]}^{\log}1_{\mathcal{N}_{1}}(n)\gg(\delta/L^{\prime})^{8L} (2.10)

and such that, for all nβˆˆπ’©1n\in\mathcal{N}_{1}, the corresponding ΞΈn\theta_{n} all have the same value of qq; that is, β€–q​θn‖𝐑/𝐙β‰ͺ(Lβ€²/Ξ΄)8​L/T​D\|q\theta_{n}\|_{\mathbf{R}/\mathbf{Z}}\ll(L^{\prime}/\delta)^{8L}/TD for all nβˆˆπ’©1n\in\mathcal{N}_{1}. Writing out the definition of the Fourier transform, we have from ˜2.9 that

|𝔼|m|β©½16​T​D​gn​(m)​e​(ΞΈn​m)|≫(Ξ΄/Lβ€²)18​L.\Big|\mathbb{E}_{|m|\leqslant 16TD}g_{n}(m)e(\theta_{n}m)\Big|\gg(\delta/L^{\prime})^{18L}.

Recall that $H \in \mathbf{N}$ is a given parameter, satisfying $H \leqslant (L'/\delta)^{50L}$. By the properties of $\theta_n$, we have

\[ \Big|\mathbb{E}_{h \in [H]} \mathbb{E}_{|m| \leqslant 16TD}\, g_n(m) e(\theta_n(m - qh))\Big| \gg (\delta/L')^{18L}. \]

Substituting $m' := m - qh$ gives

\[ \Big|\mathbb{E}_{h \in [H]} \mathbb{E}_{-16TD - qh \leqslant m' \leqslant 16TD - qh}\, g_n(m' + qh) e(\theta_n m')\Big| \gg (\delta/L')^{18L}, \]

which implies that

\[ \Big|\mathbb{E}_{h \in [H]} \mathbb{E}_{|m'| \leqslant 16TD}\, g_n(m' + qh) e(\theta_n m')\Big| \gg (\delta/L')^{18L} \]

by the bound on $H$ and (A.1). Dropping the dashes on $m'$ and swapping the order of the averages gives

\[ \Big|\mathbb{E}_{|m| \leqslant 16TD}\, e(\theta_n m)\, \mathbb{E}_{h \in [H]}\, g_n(m + qh)\Big| \gg (\delta/L')^{18L}. \]

By Cauchy–Schwarz, it follows that

\[ \mathbb{E}_{|m| \leqslant 16TD} \mathbb{E}_{h,h' \in [H]}\, g_n(m + qh) \overline{g_n(m + qh')} \gg (\delta/L')^{36L}. \]

Recall that we have this for all $n \in \mathcal{N}_1$. However, the quantity on the left is non-negative for all $n$. Taking the logarithmic average over $n$ (and recalling (2.10)) we obtain

\[ \mathbb{E}^{\log}_{n \in [N]} \mathbb{E}_{|m| \leqslant 16TD} \mathbb{E}_{h,h' \in [H]}\, g_n(m + qh) \overline{g_n(m + qh')} \gg (\delta/L')^{44L}. \]

Recalling that $g_n(m) = f(n+m)$, and taking the $n$ average to the inside, this is

\[ \mathbb{E}_{|m| \leqslant 16TD} \mathbb{E}_{h,h' \in [H]} \mathbb{E}^{\log}_{n \in [N]}\, f(n + m + qh) \overline{f(n + m + qh')} \gg (\delta/L')^{44L}. \]

Applying (A.2) to the inner average for each $m$ (and using the assumed bound on $\frac{\log TD}{\log N}$) we may drop the $m$-average, obtaining

\[ \mathbb{E}_{h,h' \in [H]} \mathbb{E}^{\log}_{n \in [N]}\, f(n + qh) \overline{f(n + qh')} \gg (\delta/L')^{44L}. \]

This is equivalent to the stated result. ∎
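The Cauchy–Schwarz step in the proof above can be sanity-checked numerically. The following is a minimal sketch and not part of the argument: the function names, the random test function built by `random_one_bounded`, and the small parameter values are illustrative assumptions. It compares $|\mathbb{E}_m e(\theta m)\mathbb{E}_h g(m+qh)|^2$ with $\mathbb{E}_m \mathbb{E}_{h,h'} g(m+qh)\overline{g(m+qh')}$ for a 1-bounded $g$.

```python
import cmath
import random

def e(x):
    """e(x) = exp(2*pi*i*x), the standard additive character."""
    return cmath.exp(2j * cmath.pi * x)

def twisted_average(g, theta, q, M, H):
    """E_{|m| <= M} e(theta*m) E_{h in [H]} g(m + q*h)."""
    total = 0j
    for m in range(-M, M + 1):
        inner = sum(g(m + q * h) for h in range(1, H + 1)) / H
        total += e(theta * m) * inner
    return total / (2 * M + 1)

def correlation_average(g, q, M, H):
    """E_{|m| <= M} E_{h,h' in [H]} g(m+q*h) * conj(g(m+q*h')).

    For fixed m the double average over (h, h') equals |E_h g(m+q*h)|^2,
    so it is computed in that form; it is real and non-negative.
    """
    total = 0.0
    for m in range(-M, M + 1):
        inner = sum(g(m + q * h) for h in range(1, H + 1)) / H
        total += abs(inner) ** 2
    return total / (2 * M + 1)

def random_one_bounded(seed, lo, hi):
    """A hypothetical 1-bounded test function on the integers lo..hi."""
    rng = random.Random(seed)
    table = {x: e(rng.random()) * rng.random() for x in range(lo, hi + 1)}
    return lambda x: table[x]

if __name__ == "__main__":
    g = random_one_bounded(0, -100, 200)
    lhs = abs(twisted_average(g, 0.3, 3, 40, 10)) ** 2
    rhs = correlation_average(g, 3, 40, 10)
    print(lhs <= rhs + 1e-12)
```

The inequality holds for every 1-bounded $g$ and every $\theta$; the proof uses it in the direction that a large twisted average forces a large correlation average.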

Lemma 2.5.

There is an absolute constant $\delta_0$ such that the following holds. Fix $\delta \in (0, \delta_0]$ and $L, L', D \geqslant 1$. Let $S \subset [-4D, 4D]$ be a set which is $(L, L', D)$-diophantine, and let $T \in \mathbf{N}$ be a parameter. Let $X$ be a further sufficiently large parameter. Suppose that $D, T \geqslant (L'/\delta)^{8L}$ and that $TD \leqslant (\delta/L')^{50L} X$. Let $H$ be a positive integer with $H \leqslant (\delta/L')^{50L} TD$. Let $f : \mathbf{N} \rightarrow \mathbf{C}$ be $1$-bounded and suppose that we have

\[ \mathbb{E}_{n \in [X]} \mathbb{E}_{t,t' \in [T]} \mathbb{E}_{s \in S}\, f(n + ts) \overline{f(n + t's)} \geqslant \delta. \tag{2.11} \]

Then there exists $q \in \mathbf{N}$, $q \leqslant (L'/\delta)^{8L}$, such that $\|f\|_{U^1[X;q,H]} \geqslant (\delta/L')^{25L}$.

Proof.

The same proof works essentially verbatim, except that the three applications of (A.2) are replaced by appeals to (A.1), using each time the assumption $\frac{TD}{X} \leqslant (\delta/L')^{50L}$ rather than a bound on $\frac{\log TD}{\log X}$ as in the logarithmic case. ∎

Rather than Lemma 2.5 itself, we will need the following iterated variant. Here we use the notation for difference operators $\Delta_{(h,h')}$ described in Section 1.4.

Lemma 2.6.

There are absolute constants $\delta_0 < 1$ and $C = C_{2.6}$ such that the following holds. Fix $\delta \in (0, \delta_0]$ and $L, L', D_1, D_2 \geqslant 1$. For $i = 1, 2$ suppose that $S_i \subset [-4D_i, 4D_i]$ is a set which is $(L, L', D_i)$-diophantine, and let $T_i$ be a parameter. Let $X$ be a sufficiently large parameter and suppose that $D_i, T_i \geqslant (L'/\delta)^{CL^2}$ and that $T_i D_i \leqslant (\delta/L')^{CL^2} X$. Let $H_1, H_2$ be positive integers with $H_i \leqslant (\delta/L')^{CL^2} T_i D_i$. Let $\psi : \mathbf{N} \rightarrow \mathbf{C}$ be $1$-bounded and suppose that

\[ \mathbb{E}_{n \in [X,2X),\, t_1,t_1' \in [T_1],\, t_2,t_2' \in [T_2],\, s_1 \in S_1,\, s_2 \in S_2}\, \Delta_{s_1(t_1,t_1')} \Delta_{s_2(t_2,t_2')} \psi(n) \geqslant \delta. \tag{2.12} \]

Then there exist $q_1, q_2 \in \mathbf{N}$, $q_i \leqslant (L'/\delta)^{CL^2}$, such that

\[ \mathbb{E}_{n \in [2X],\, h_1,h_1' \in [H_1],\, h_2,h_2' \in [H_2]}\, \Delta_{q_1(h_1,h_1')} \Delta_{q_2(h_2,h_2')} \psi(n) \geqslant (\delta/L')^{CL^2}. \]

Remark.

We have only stated a version with two difference operators (which will involve one iteration of Lemma 2.5), since this is what we will need later. A similar argument gives a version with $k$ difference operators.

Proof.

By an averaging argument, there are at least $\delta |S_2| T_2^2/2$ triples $(s_2, t_2, t_2')$ such that

\[ \mathbb{E}_{n \in [X,2X),\, t_1,t_1' \in [T_1],\, s_1 \in S_1}\, \Delta_{s_1(t_1,t_1')} \big(\Delta_{s_2(t_2,t_2')} \psi\big)(n) \geqslant \delta/2. \]

Since, for any $n, s_1$, the average over $t_1, t_1'$ is non-negative, we have

\[ \mathbb{E}_{n \in [2X],\, t_1,t_1' \in [T_1],\, s_1 \in S_1}\, \Delta_{s_1(t_1,t_1')} \big(\Delta_{s_2(t_2,t_2')} \psi\big)(n) \geqslant \delta/4. \]

For each such triple, this is exactly the hypothesis (2.11) of Lemma 2.5 with $f = \Delta_{s_2(t_2,t_2')} \psi$ (and $\delta$ replaced by $\delta/4$ and $X$ by $2X$). The conclusion of Lemma 2.5 is then that there exists $q = q(s_2, t_2, t_2') \leqslant (L'/\delta)^{O(L)}$ such that $\|\Delta_{s_2(t_2,t_2')} \psi\|_{U^1[2X;q,H_1]} \geqslant (\delta/L')^{O(L)}$. Squaring and writing out, this gives

\[ \mathbb{E}_{n \in [2X]} \mathbb{E}_{h_1,h_1' \in [H_1]}\, \Delta_{q(h_1,h_1')} \big(\Delta_{s_2(t_2,t_2')} \psi\big)(n) \geqslant (\delta/L')^{O(L)}. \tag{2.13} \]

By pigeonhole, we may pass to a set of $(\delta/L')^{O(L)} |S_2| T_2^2$ triples $(s_2, t_2, t_2')$ such that $q_1 = q(s_2, t_2, t_2')$ is independent of $s_2, t_2, t_2'$. Since the expression on the left in (2.13) is always non-negative, we may average over all $(s_2, t_2, t_2') \in S_2 \times [T_2] \times [T_2]$, obtaining

\[ \mathbb{E}_{h_1,h_1' \in [H_1]} \mathbb{E}_{n \in [2X],\, t_2,t_2' \in [T_2],\, s_2 \in S_2}\, \Delta_{s_2(t_2,t_2')} \big(\Delta_{q_1(h_1,h_1')} \psi\big)(n) \geqslant (\delta/L')^{O(L)}. \]

For at least $(\delta/L')^{O(L)} H_1^2$ pairs $(h_1, h_1')$, we have

\[ \mathbb{E}_{n \in [2X],\, t_2,t_2' \in [T_2],\, s_2 \in S_2}\, \Delta_{s_2(t_2,t_2')} \big(\Delta_{q_1(h_1,h_1')} \psi\big)(n) \geqslant (\delta/L')^{O(L)}. \]

For each such pair, this is again the hypothesis (2.11) of Lemma 2.5, now with $f = \Delta_{q_1(h_1,h_1')} \psi$, $\delta$ replaced by $(\delta/L')^{O(L)}$, and again with $X$ replaced by $2X$. Another application of Lemma 2.5 gives that there exists $q = q(h_1, h_1') \leqslant (L'/\delta)^{O(L^2)}$ such that $\|\Delta_{q_1(h_1,h_1')} \psi\|_{U^1[2X;q,H_2]} \geqslant (\delta/L')^{O(L^2)}$, provided that $C$ is sufficiently large that the relevant conditions on $D_2, T_2$ and $T_2 D_2/X$ are satisfied. Squaring and writing out, this gives

\[ \mathbb{E}_{n \in [2X]} \mathbb{E}_{h_2,h_2' \in [H_2]}\, \Delta_{q(h_2,h_2')} \Delta_{q_1(h_1,h_1')} \psi(n) \geqslant (\delta/L')^{O(L^2)}. \tag{2.14} \]

Passing to a further subset of $(\delta/L')^{O(L^2)} H_1^2$ pairs $(h_1, h_1')$, we may assume that $q_2 = q(h_1, h_1')$ does not depend on $(h_1, h_1')$. Since the expression on the left in (2.14) is non-negative for all $q$, we obtain the desired result by averaging over $h_1, h_1'$. ∎
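The non-negativity used repeatedly in the proof above comes from the fact that an average of difference operators over a pair $(h, h')$ is a squared modulus. The sketch below illustrates this under a stated assumption: the definition of $\Delta_{(h,h')}$ lives in Section 1.4, which is not reproduced here, so the form $\Delta_{(h,h')}f(n) = f(n+h)\overline{f(n+h')}$ with $\Delta_{q(h,h')} = \Delta_{(qh,qh')}$ is an assumption chosen to match the averages in (2.13). All function names are hypothetical.

```python
def delta(f, h, hp):
    """Assumed difference operator: (Delta_{(h,h')} f)(n) = f(n+h) * conj(f(n+h'))."""
    return lambda n: f(n + h) * f(n + hp).conjugate()

def averaged_delta(f, n, q, H):
    """E_{h,h' in [H]} Delta_{q(h,h')} f(n), computed by brute force."""
    total = 0j
    for h in range(1, H + 1):
        for hp in range(1, H + 1):
            total += delta(f, q * h, q * hp)(n)
    return total / (H * H)

def squared_mean(f, n, q, H):
    """|E_{h in [H]} f(n + q*h)|^2, the closed form of the same average."""
    mean = sum(f(n + q * h) for h in range(1, H + 1)) / H
    return abs(mean) ** 2

if __name__ == "__main__":
    # A deterministic 1-bounded test function (illustrative only).
    f = lambda x: complex((x * x) % 7 - 3, (3 * x) % 5 - 2) / 5
    print(averaged_delta(f, 2, 3, 7), squared_mean(f, 2, 3, 7))
```

Because the two quantities agree, discarding or adding values of the outer variables can only be done in the direction the proof uses: terms are non-negative, so enlarging the range of averaging loses at most a constant factor.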

3. Diophantine properties of almost primes

The main result of this section, Lemma 3.2, is a vital technical ingredient in our later arguments. Roughly, it states that sets such as $\{p_1 \cdots p_k : p_i \in \mathscr{P}_i\}$ and $\{p_1^2 \cdots p_k^2 : p_i \in \mathscr{P}_i\}$ are diophantine (see Definition 2.1) with suitable parameters, where the $\mathscr{P}_i$ are dyadically localised sets of primes. We first note a general lemma for ‘bilinear’ exponential sums.

Lemma 3.1.

Let $j \in \mathbf{N}$. Let $\delta \in (0, \frac{1}{2})$, and let $S_1 \subset [N_1]$ and $S_2 \subset [N_2]$ be sets with $|S_i| = \sigma_i N_i$ for $i = 1, 2$. Suppose that, for some $\theta \in \mathbf{R}/\mathbf{Z}$, we have $|\mathbb{E}_{s_1 \in S_1, s_2 \in S_2}\, e(\theta s_1^j s_2^j)| \geqslant \delta$. Then either $N_i \leqslant (\sigma_1 \sigma_2 \delta)^{-O_j(1)}$ for some $i \in \{1, 2\}$, or else there is some $q \in \mathbf{N}$, $q \leqslant (\delta \sigma_1 \sigma_2)^{-O_j(1)}$, such that $\|q\theta\|_{\mathbf{R}/\mathbf{Z}} \leqslant (\delta \sigma_1 \sigma_2)^{-O_j(1)} (N_1 N_2)^{-j}$.

Remark.

We will only need the cases $j = 1, 2$, in which case of course the exponents may be taken to be absolute constants.

Proof.

Write the condition as

\[ \big|\mathbb{E}_{n_1 \in [N_1],\, n_2 \in [N_2]}\, 1_{S_1}(n_1) 1_{S_2}(n_2)\, e(\theta n_1^j n_2^j)\big| \geqslant \delta \sigma_1 \sigma_2. \]

By two applications of the Cauchy–Schwarz inequality, we obtain

\[ \mathbb{E}_{n_1,n_1' \in [N_1],\, n_2,n_2' \in [N_2]}\, e\big(\theta (n_1^j - n_1'^{\,j})(n_2^j - n_2'^{\,j})\big) \geqslant (\delta \sigma_1 \sigma_2)^4. \]

To handle this, we use the ‘log-free’ multidimensional Weyl inequality [GT14, Proposition 2.2]; we remark that the published version of that paper omits the necessary constraint that $\min(N_i)$ be sufficiently large. ∎

We now proceed to the main technical lemma of the section. Although we will only need this lemma for $j = 1, 2$, it is no harder to prove it for general $j$.

Lemma 3.2.

Let $j \in \mathbf{N}$. Then there is a constant $L_j \geqslant 1$ such that the following holds. Let $k \geqslant 2$ be a natural number and let $\delta \in (0, \frac{1}{2})$. Let $M_1, \ldots, M_k$ be a sequence of integers such that the intervals $[M_i, (1 + \frac{1}{4k}) M_i)$ are disjoint. Set $Q := k^k \prod_{i=1}^k \log M_i$, and suppose the condition $\min_i M_i > Q^{L_j}$ is satisfied. For each $i$, suppose we are given a parameter $\eta_i$ satisfying $\frac{1}{8k} \leqslant \eta_i \leqslant \frac{1}{4k}$ and define $\mathscr{P}_i$ to be the set of primes satisfying $M_i \leqslant p < M_i(1 + \eta_i)$, and set $S := \{p_1^j \cdots p_k^j : p_i \in \mathscr{P}_i\}$. Then $S$ is $(L_j, k, (M_1 \cdots M_k)^j)$-diophantine.

Proof.

We first note that the case $k = 1$ is also true and is essentially a standard result about exponential sums over powers of primes. We in fact need (a slight generalisation of) this result in our proof. Since it is hard to find an appropriate reference with the log-free bound that we require, we give this in Lemma B.2.

Suppose now that $k \geqslant 2$. Without loss of generality, assume $M_1 > M_2 > \cdots > M_k \geqslant 3$. Let $\theta \in \mathbf{R}/\mathbf{Z}$ and suppose that

\[ \big|\mathbb{E}_{p_1 \in \mathscr{P}_1, \dots, p_k \in \mathscr{P}_k}\, e(\theta p_1^j \cdots p_k^j)\big| \geqslant \delta. \tag{3.1} \]

We must show that there is some $q \in \mathbf{N}$ such that

\[ q \leqslant (k/\delta)^{L_j} \quad\mbox{and}\quad \|q\theta\|_{\mathbf{R}/\mathbf{Z}} \leqslant (k/\delta)^{L_j} (M_1 \cdots M_k)^{-j}. \tag{3.2} \]

We try applying Lemma 3.1 with $N_1 := 2\prod_{i \leqslant k : i \operatorname{even}} M_i$ and $N_2 := 2\prod_{i \leqslant k : i \operatorname{odd}} M_i$. Define sets $S_1 \subset [N_1]$, $S_2 \subset [N_2]$ by $S_1 := \prod_{i \leqslant k : i \operatorname{even}} \mathscr{P}_i$ and $S_2 := \prod_{i \leqslant k : i \operatorname{odd}} \mathscr{P}_i$ (the stated containments are easily verified). Set $\sigma_i := |S_i|/N_i$. Since $M_i > Q > k^k$, it follows from the prime number theorem with classical error term (see e.g. [IK-book, Section 5.6]) that we have $|\mathscr{P}_i| \geqslant c M_i/(k \log M_i)$ for some absolute $c > 0$. Therefore

\[ \sigma_1 \sigma_2 = \frac{1}{4} \prod_{i=1}^k \frac{|\mathscr{P}_i|}{M_i} \geqslant \Big(\frac{c}{2k}\Big)^k \prod_{i=1}^k \frac{1}{\log M_i} \geqslant \Big(\frac{c}{2}\Big)^k Q^{-1} \gg Q^{-2}, \]

using in this last step that $Q > k^k$. Applying Lemma 3.1, and noting that $N_1 \leqslant N_2$, it follows that either

\[ N_1 \leqslant (Q/\delta)^{O_j(1)} \tag{3.3} \]

or else there is some $q \in \mathbf{N}$ with

\[ q \leqslant (Q/\delta)^{O_j(1)} \quad\mbox{and}\quad \|\theta q\|_{\mathbf{R}/\mathbf{Z}} \leqslant (Q/\delta)^{O_j(1)} (M_1 \cdots M_k)^{-j}. \tag{3.4} \]

We leave aside (3.3) for now, and assume that (3.4) holds. If $\delta \leqslant 1/Q$ then (3.2) follows immediately (with $L_j$ equal to twice the $O_j(1)$ exponent). Therefore we may suppose henceforth that $\delta \geqslant 1/Q$. In particular, (3.4) gives (after doubling the implied constant in the exponents) that

\[ q \leqslant Q^{O_j(1)} \quad\mbox{and}\quad \|\theta q\|_{\mathbf{R}/\mathbf{Z}} \leqslant Q^{O_j(1)} (M_1 \cdots M_k)^{-j}. \]

Thus $\theta = \frac{a}{q} + \theta'$ for some $a \in \mathbf{Z}$, with

\[ |\theta'| \leqslant Q^{O_j(1)} (M_1 \cdots M_k)^{-j}. \tag{3.5} \]

We now return to the original sum (3.1). By pigeonhole, there is a choice of $t = p_2^j \cdots p_k^j$ such that $\big|\mathbb{E}_{p_1 \in \mathscr{P}_1}\, e(\theta t p_1^j)\big| \geqslant \delta$. By Lemma B.2 (that is, essentially the case $k = 1$ of the present lemma) it follows that there is some $q_0 \leqslant (k/\delta)^{O_j(1)}$ such that $\|\theta t q_0\|_{\mathbf{R}/\mathbf{Z}} \leqslant (k/\delta)^{O_j(1)} M_1^{-j}$. Since $\theta' = \theta - a/q$, this means that $\theta' t q_0$ is within $(k/\delta)^{O_j(1)} M_1^{-j}$ of $-a t q_0/q$, an integer multiple of $1/q$. However, we may also note using (3.5) and the bound $p_2 \cdots p_k \leqslant 3 M_2 \cdots M_k$ that

\[ |\theta' t q_0| \leqslant Q^{O_j(1)} (M_1 \cdots M_k)^{-j} \cdot (3 M_2 \cdots M_k)^j \cdot (k/\delta)^{O_j(1)} \leqslant Q^{O_j(1)} M_1^{-j} < \frac{1}{2q}. \]

Here, in the penultimate step we used that $\delta \geqslant 1/Q$ (and so $k/\delta \leqslant Q^2$), and in the last step we invoked the assumption $M_1 > Q^{L_j}$ and the upper bound $q \leqslant Q^{O_j(1)}$ (and assumed $L_j$ is large enough). Since $(k/\delta)^{O_j(1)} M_1^{-j} \leqslant Q^{O_j(1)} M_1^{-j} < \frac{1}{2q}$ (for the aforementioned reasons) the only possible integer multiple of $1/q$ that $\theta' t q_0$ can be near is $0$, and therefore $|\theta' t q_0| \leqslant (k/\delta)^{O_j(1)} M_1^{-j}$ and $q \mid a t q_0$. Dividing through by $t q_0$, we obtain $|\theta'| \leqslant (k/\delta)^{O_j(1)} (M_1 \cdots M_k)^{-j}$. Note also that $(q, t) = 1$ since all prime factors of $t$ are at least $M_k > Q^{L_j} \geqslant q$, and therefore $q \mid a q_0$. Finally it follows that $\|\theta q_0\|_{\mathbf{R}/\mathbf{Z}} = \|\theta' q_0\|_{\mathbf{R}/\mathbf{Z}} \leqslant |\theta'| q_0 \leqslant (k/\delta)^{O_j(1)} (M_1 \cdots M_k)^{-j}$, which is the desired conclusion (3.2).

It remains to analyse the ‘small parameter’ case (3.3), that is to say $N_1 \leqslant (Q/\delta)^{O_j(1)}$. The assumption $\min_i M_i > Q^{L_j}$ certainly implies that $N_1 > Q^{L_j}$. Therefore (assuming $L_j$ large enough) we have $N_1 \leqslant \delta^{-O_j(1)}$. It follows that

\[ M_2 \cdots M_k = \prod_{\substack{i \leqslant k \\ i \operatorname{even}}} M_i \cdot \prod_{\substack{i \leqslant k-1 \\ i \operatorname{even}}} M_{i+1} \leqslant \prod_{\substack{i \leqslant k \\ i \operatorname{even}}} M_i \cdot \prod_{\substack{i \leqslant k-1 \\ i \operatorname{even}}} M_i \leqslant N_1^2 \leqslant \delta^{-O_j(1)}. \tag{3.6} \]

As before, (3.1) implies that there is some $t = p_2^j \cdots p_k^j$ such that $|\mathbb{E}_{p_1 \in \mathscr{P}_1}\, e(\theta t p_1^j)| \geqslant \delta$. By Lemma B.2 it follows that there is some $q_0 \leqslant (k/\delta)^{O_j(1)}$ such that $\|\theta t q_0\|_{\mathbf{R}/\mathbf{Z}} \leqslant (k/\delta)^{O_j(1)} M_1^{-j}$. Taking $q := t q_0$, we then have (using (3.6))

\[ q \leqslant (M_2 \cdots M_k)^j (k/\delta)^{O_j(1)} \leqslant (k/\delta)^{O_j(1)}, \]

and (using (3.6) again) $\|\theta q\|_{\mathbf{R}/\mathbf{Z}} \leqslant (k/\delta)^{O_j(1)} M_1^{-j} \leqslant (k/\delta)^{O_j(1)} (M_1 \cdots M_k)^{-j}$, which is once again the desired conclusion (3.2). ∎
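The containments $S_1 \subset [N_1]$, $S_2 \subset [N_2]$ and the ordering $N_1 \leqslant N_2$ used in the proof can be checked on a toy example. The following is a minimal sketch only: the helper names and the small values of $M_i$ in the usage example are illustrative assumptions, chosen far below the range $\min_i M_i > Q^{L_j}$ required by the lemma.

```python
import math
from itertools import product

def is_prime(n):
    """Trial-division primality test (adequate for small n)."""
    if n < 2:
        return False
    return all(n % r for r in range(2, math.isqrt(n) + 1))

def primes_in_window(M, eta):
    """Primes p with M <= p < M * (1 + eta)."""
    return [p for p in range(M, math.ceil(M * (1 + eta))) if is_prime(p)]

def split_products(Ms, etas):
    """Return (S1, N1, S2, N2): products over even and odd 1-based indices."""
    windows = [primes_in_window(M, eta) for M, eta in zip(Ms, etas)]
    indexed = list(enumerate(zip(Ms, windows), start=1))
    even = [mw for i, mw in indexed if i % 2 == 0]
    odd = [mw for i, mw in indexed if i % 2 == 1]

    def build(parts):
        N = 2 * math.prod(M for M, _ in parts)
        S = {math.prod(choice) for choice in product(*(w for _, w in parts))}
        return S, N

    S1, N1 = build(even)
    S2, N2 = build(odd)
    return S1, N1, S2, N2

if __name__ == "__main__":
    k = 3
    S1, N1, S2, N2 = split_products([1000, 400, 150], [1 / (4 * k)] * k)
    print(max(S1) <= N1, max(S2) <= N2, N1 <= N2)
```

The factor $2$ in the definition of $N_1, N_2$ absorbs the product of the window widths, since $(1 + \frac{1}{4k})^k < e^{1/4} < 2$.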

4. Fourier decomposition of a majorant for the primes

In this section we give another technical ingredient for our later arguments. Here is the main result.

Lemma 4.1.

Let $X$ be a large parameter. Then there is a function $\tilde{\Lambda} : [X, 2X) \rightarrow \mathbf{R}_{\geqslant 0}$ with

\[ \tilde{\Lambda}(p) \gg \log X \tag{4.1} \]

for all primes $p \in [X, 2X)$ and

\[ \mathbb{E}_{x \in [X,2X)}\, \tilde{\Lambda}(x) \ll 1 \tag{4.2} \]

such that the following is true. Let $c \in (0, 1)$ be a constant. For any parameter $Q \in \mathbf{N}$, $Q \leqslant \log X$, there is a $(Q!)$-periodic function $\Lambda_{\operatorname{per}}$ satisfying

\[ \mathbb{E}_{x \in [X,2X)}\, |\Lambda_{\operatorname{per}}(x)| \ll (\log Q)^{O(1)} \quad\mbox{and}\quad \|\Lambda_{\operatorname{per}}\|_\infty \ll Q^2 \tag{4.3} \]

together with a decomposition $\tilde{\Lambda} - \Lambda_{\operatorname{per}} = \sum_i g_i + h$ (where the sum over $i$ is finite) with the following properties. First, the function $h$ is small in $\ell^1$ in the sense that

\[ \mathbb{E}_{x \in [X,2X)}\, |h(x)| \ll Q^{-1}. \tag{4.4} \]

Second, the functions $g_i$ are reasonably bounded in sup norm in the sense that

\[ \sum_i \|g_i\|_\infty \ll (\log X)^{O_c(1)}. \tag{4.5} \]

Finally, denoting $\|\widehat{f}\|_\infty := \sup_{\theta \in \mathbf{R}/\mathbf{Z}} \big|\sum_{x \in [X,2X)} f(x) e(\theta x)\big|$ we have the estimate

\[ \sum_i \|\widehat{g}_i\|_\infty^{c}\, \|g_i\|_\infty^{1-c} \ll X^c Q^{-c/4}. \tag{4.6} \]

Here, all implied constants may depend on $c$ but are effectively computable.

Proof.

We take $\tilde{\Lambda}$ to be a Selberg-type majorant for the primes. Rather than describe the construction explicitly here, we can just refer to [green-tao-selberg, Proposition 3.1], which provides the relevant properties. Taking $F(n) = n$ in that proposition (thus the singular series $\mathfrak{S}_F$ as defined in [green-tao-selberg] is $\asymp 1$) and $R := X^{1/10}$, we can take $\tilde{\Lambda} = \beta$, where $\beta$ is the function constructed in [green-tao-selberg, Proposition 3.1]. The desired majorant property (4.1) is a consequence of [green-tao-selberg, Equation (3.1)]. The bound (4.2) is an absolutely standard fact about the Selberg sieve. It could be deduced within the framework of [green-tao-selberg] by summing [green-tao-selberg, Equation (3.3)] over $n \in [X, 2X)$, and discarding the negligible contribution from all frequencies except $a/q = 0$. On the Fourier side we have (see [green-tao-selberg, Equation (7.7)])

\[ \tilde{\Lambda}(n) = \Big(\sum_{q \leqslant R} \frac{\mu(q)}{\phi(q)}\Big)^{-1} \Big(\sum_{q \leqslant R} \frac{\mu(q)}{\phi(q)} \sum_{(a,q)=1} e\big(\frac{an}{q}\big)\Big)^2. \]

It is shown in [green-tao-selberg, Proposition 7.1], following Ramaré and Ruzsa [ramare-ruzsa], that

\[ \tilde{\Lambda}(n) = \sum_{q \leqslant R^2} c_q \sum_{(a,q)=1} e(an/q) \]

with $c_q$ supported on squarefrees with $q \leqslant R^2$ and $|c_q| \ll \tau(q)^2/q$. Set $i_0 := \lfloor \log_2 Q \rfloor$, $i_1 := \lfloor A \log_2 \log X \rfloor$ for some $A = A(c)$ to be specified below, and finally set

\[ \Lambda_{\operatorname{per}}(n) := \sum_{q \leqslant 2^{i_0}} c_q \sum_{(a,q)=1} e(an/q), \qquad f_i(n) := \sum_{2^i < q \leqslant 2^{i+1}} c_q \sum_{(a,q)=1} e(an/q) \]

for $i_0 \leqslant i < i_1$ and

\[ f_{i_1}(n) := \sum_{2^{i_1} < q \leqslant R^2} c_q \sum_{(a,q)=1} e(an/q). \]

It is then clear that $\Lambda_{\operatorname{per}}$ is $(Q!)$-periodic (every $q$ in its definition satisfies $q \leqslant 2^{i_0} \leqslant Q$) and that $\tilde{\Lambda} - \Lambda_{\operatorname{per}} = \sum_i f_i$. We now define $g_i, g_i'$ by ‘thresholding’ the $f_i$, specifically by setting

\[ g_i(n) := f_i(n) 1_{|f_i(n)| \leqslant 2^{ic/2}}, \qquad g_i'(n) := f_i(n) 1_{|f_i(n)| > 2^{ic/2}} \]

for $i_0 \leqslant i \leqslant i_1$. Set $h := \sum_i g_i'$. The $\ell^\infty$ bound (4.5) is then immediate.

Next we establish (4.4). For this, we will use the moment estimates

\[ \mathbb{E}_{x \in [X,2X)}\, |f_i(x)|^m \ll_m i^{C_m}, \quad i \in [i_0, i_1), \quad\mbox{and}\quad \mathbb{E}_{x \in [X,2X)}\, |f_{i_1}(x)|^m \ll_m (\log X)^{C_m} \tag{4.7} \]

for $m \in \mathbf{N}$ and for some constants $C_m$, which we will establish below. Indeed, taking $m = \lceil 4/c \rceil$ in (4.7) yields

\[ \mathbb{E}_{x \in [X,2X)}\, |g_i'(x)| \leqslant 2^{ci(1-m)/2}\, \mathbb{E}_{x \in [X,2X)}\, |f_i(x)|^m \ll 2^{ci(1-m)/2}\, i^{C_m} \ll 2^{-i}, \tag{4.8} \]

uniformly for $i \in [i_0, i_1)$, and similarly

\[ \mathbb{E}_{x \in [X,2X)}\, |g_{i_1}'(x)| \leqslant (\log X)^{cA(1-m)/2}\, \mathbb{E}_{x \in [X,2X)}\, |f_{i_1}(x)|^m \ll (\log X)^{cA(1-m)/2 + C_m} \ll (\log X)^{-A} \tag{4.9} \]

provided $A$ is chosen large enough (depending only on $c$). The desired estimate (4.4) is now immediate from the triangle inequality, the dominant contribution being from (4.8) with values $i \approx i_0$. (Here we use the assumption that $Q \leqslant \log X$ to guarantee that the contribution from (4.9) is insignificant.)
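The first inequality in each of (4.8) and (4.9) is the elementary tail bound $|f| 1_{|f| > \lambda} \leqslant \lambda^{1-m} |f|^m$, averaged over $x$. A minimal numerical sketch follows; the function names and the sample data are hypothetical, and the threshold `lam` plays the role of $2^{ic/2}$.

```python
def threshold_split(values, lam):
    """Split f into g = f * 1_{|f| <= lam} and g' = f * 1_{|f| > lam}."""
    small = [v if abs(v) <= lam else 0.0 for v in values]
    large = [v if abs(v) > lam else 0.0 for v in values]
    return small, large

def mean(xs):
    """Arithmetic mean, standing in for the average E_x."""
    return sum(xs) / len(xs)

def tail_bound(values, lam, m):
    """Return (E|g'|, lam^(1-m) * E|f|^m); the first never exceeds the second."""
    _, large = threshold_split(values, lam)
    lhs = mean([abs(v) for v in large])
    rhs = lam ** (1 - m) * mean([abs(v) ** m for v in values])
    return lhs, rhs

if __name__ == "__main__":
    # Deterministic sample data (illustrative only).
    values = [((i * 37) % 101 - 50) / 10 for i in range(200)]
    print(tail_bound(values, 1.5, 4))
```

Choosing $m$ large relative to $1/c$ is what turns a polynomial moment bound into the exponential decay $2^{-i}$ in (4.8).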

Now we establish (4.6). It is enough to show that

\[ \|\widehat{g_i}\|_\infty \ll X 2^{-3i/4} \tag{4.10} \]

for $i \in [i_0, i_1]$, since the desired estimate then follows using the $\ell^\infty$ bounds on the $g_i$ implicit in the definitions of these functions. To show (4.10), it suffices to show the non-thresholded estimates

\[ \|\widehat{f_i}\|_\infty \ll X 2^{-3i/4} \tag{4.11} \]

for $i \in [i_0, i_1]$, from which (4.10) follows using (4.8) and (4.9). From the definition of $f_i$, summing the geometric series and using the bound $|c_q| \ll \tau(q)^2/q$, we have

\[ \Big|\sum_{x \in [X,2X)} f_i(x) e(\theta x)\Big| \ll \sum_{2^i < q \leqslant R^2} \frac{\tau(q)^2}{q} \sum_{(a,q)=1} \min\big(X, \|\theta - a/q\|_{\mathbf{R}/\mathbf{Z}}^{-1}\big). \]

Since the fractions $a/q$ are $R^{-4}$-separated, the contribution from all except at most one $a/q$ will be (crudely) $\ll R^2 \cdot R^4 \ll X 2^{-i}$. For the fraction $a/q$ closest to $\theta$, we have the trivial bound $\ll X \tau(q)^2/q$, which is $\ll X q^{-3/4} \ll X 2^{-3i/4}$ by the divisor bound, and (4.11) (and therefore (4.6)) follows.
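The separation claim above rests on the fact that two distinct reduced fractions with denominators at most $B$ differ by at least $1/B^2$ in $\mathbf{R}/\mathbf{Z}$; with $B = R^2$ this is the $R^{-4}$-separation. The following sketch verifies it by brute force for small $B$. The function names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from math import gcd

def reduced_fractions(bound):
    """All reduced fractions a/q in [0, 1) with 1 <= q <= bound, sorted."""
    fracs = {Fraction(a, q)
             for q in range(1, bound + 1)
             for a in range(q)
             if gcd(a, q) == 1}
    return sorted(fracs)

def min_gap(bound):
    """Smallest distance in R/Z between two distinct such fractions."""
    fracs = reduced_fractions(bound)
    gaps = [b - a for a, b in zip(fracs, fracs[1:])]
    gaps.append(1 - fracs[-1] + fracs[0])  # wrap-around gap across 0
    return min(gaps)

if __name__ == "__main__":
    for B in (5, 10, 20):
        print(B, min_gap(B), min_gap(B) >= Fraction(1, B * B))
```

The reason is simply that $|a/q - a'/q'| = |aq' - a'q|/(qq') \geqslant 1/(qq')$ whenever the fractions are distinct.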

We now return to establish the moment estimate (4.7). An ingredient in the proof will be the (standard) estimate

\[ \sum_{P^+(d) \leqslant Q} \frac{\tau(d)^C}{d} \ll_C (\log Q)^{2^C}. \tag{4.12} \]

To prove this, observe that the LHS is $\prod_{p \leqslant Q} \big(1 + \frac{2^C}{p} + \frac{3^C}{p^2} + \dots\big) \ll_C \prod_{p \leqslant Q} \big(1 + \frac{1}{p}\big)^{2^C}$.
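The Euler product comparison behind (4.12) can be explored numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, each Euler factor is truncated after `terms` summands, and floating-point arithmetic is used throughout. For fixed $C$ the ratio of the two products stays bounded as $Q$ grows, which is what the $\ll_C$ asserts.

```python
def euler_factor(p, C, terms=200):
    """Truncation of sum_{k >= 0} tau(p^k)^C / p^k = sum_{k >= 0} (k+1)^C / p^k."""
    return sum((k + 1) ** C * (1.0 / p) ** k for k in range(terms))

def primes_up_to(Q):
    """Primes p <= Q by trial division (fine for small Q)."""
    return [p for p in range(2, Q + 1)
            if all(p % r for r in range(2, int(p ** 0.5) + 1))]

def smooth_sum(Q, C):
    """sum over Q-smooth d of tau(d)^C / d, evaluated as an Euler product."""
    total = 1.0
    for p in primes_up_to(Q):
        total *= euler_factor(p, C)
    return total

def comparison_product(Q, C):
    """prod_{p <= Q} (1 + 1/p)^(2^C), the right-hand product in the text."""
    total = 1.0
    for p in primes_up_to(Q):
        total *= (1 + 1.0 / p) ** (2 ** C)
    return total

if __name__ == "__main__":
    for Q in (10, 100, 1000):
        print(Q, smooth_sum(Q, 2) / comparison_product(Q, 2))
```

For $C = 1$ each factor equals $(1 - 1/p)^{-2}$, so the ratio is $\prod_{p \leqslant Q} (1 - p^{-2})^{-2}$, which increases to $\zeta(2)^2$.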

Turning to (4.7) itself, it suffices to prove the general estimate

\[ \mathbb{E}_{x \in [X,2X)}\, |f(x)|^m \ll (\log Q)^{O_{m,B}(1)}, \tag{4.13} \]

for $m \in \mathbf{N}$, where

\[ f(n) = \sum_{P^+(q) \leqslant Q} c_q \sum_{(a,q)=1} e\big(\frac{an}{q}\big), \]

the $c_q$ are supported on squarefrees and $|c_q| \leqslant \tau(q)^B/q$. To prove such an estimate, we first write $f$ in physical space using Kluyver’s identity $\sum_{(a,q)=1} e(an/q) = \sum_{d \mid (n,q)} d \mu(q/d)$ for Ramanujan sums. This gives

\[ f(n) = \sum_{\substack{P^+(d) \leqslant Q \\ d \mid n}} d \sum_{\substack{d \mid q \\ P^+(q) \leqslant Q}} \mu\big(\frac{q}{d}\big) c_q = \sum_{\substack{P^+(d) \leqslant Q \\ d \mid n}} \lambda_d, \quad\mbox{where}\quad \lambda_d := d \sum_{\substack{d \mid q \\ P^+(q) \leqslant Q}} \mu\big(\frac{q}{d}\big) c_q. \tag{4.14} \]

Now we have

\[ |\lambda_d| \leqslant d \sum_{\substack{d \mid q \\ P^+(q) \leqslant Q}} |c_q| \leqslant \sum_{P^+(k) \leqslant Q} \frac{\tau(kd)^B}{k} \leqslant \tau(d)^B \sum_{P^+(k) \leqslant Q} \frac{\tau(k)^B}{k} \ll \tau(d)^B (\log Q)^{2^B} \tag{4.15} \]

by (4.12). Now observe that

\begin{align*}
\mathbb{E}_{n \in [X,2X)} \Big(\sum_{\substack{P^+(d) \leqslant Q \\ d \mid n}} \tau(d)^B\Big)^m &= \mathbb{E}_{n \in [X,2X)} \sum_{P^+(d_1), \dots, P^+(d_m) \leqslant Q} \big(\tau(d_1) \cdots \tau(d_m)\big)^B 1_{[d_1, \dots, d_m] \mid n} \\
&\leqslant \mathbb{E}_{n \in [X,2X)} \sum_{P^+(d_1), \dots, P^+(d_m) \leqslant Q} \tau([d_1, \dots, d_m])^{mB}\, 1_{[d_1, \dots, d_m] \mid n} \\
&\leqslant \mathbb{E}_{n \in [X,2X)} \sum_{P^+(d) \leqslant Q} \tau(d)^{mB+m}\, 1_{d \mid n} \\
&\ll \sum_{P^+(d) \leqslant Q} \frac{\tau(d)^{mB+m}}{d} \ll (\log Q)^{O_{B,m}(1)}. \tag{4.16}
\end{align*}

In the middle step here the key point was that the number of representations of $d$ as $[d_1, \dots, d_m]$ is at most $\tau(d)^m$, and in the penultimate step that $\mathbb{E}_{n \in [X,2X)} 1_{d \mid n} \ll 1/d$ for all $d$. Combining (4.14), (4.15), and (4.16) gives (4.13), and so (4.7) follows.
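The counting fact used in the middle step, that $d$ has at most $\tau(d)^m$ representations as a least common multiple $[d_1, \dots, d_m]$, holds because each $d_i$ must divide $d$. A brute-force check for small $d$ is sketched below; the function names are hypothetical.

```python
from itertools import product
from math import gcd

def divisors(d):
    """All positive divisors of d."""
    return [k for k in range(1, d + 1) if d % k == 0]

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a // gcd(a, b) * b

def lcm_representations(d, m):
    """Number of m-tuples (d_1, ..., d_m) of positive integers with lcm equal to d."""
    count = 0
    # Every d_i in such a tuple divides d, so it suffices to search the divisors.
    for tup in product(divisors(d), repeat=m):
        current = 1
        for x in tup:
            current = lcm(current, x)
        if current == d:
            count += 1
    return count

if __name__ == "__main__":
    for d in (6, 12, 30):
        tau = len(divisors(d))
        print(d, lcm_representations(d, 2), tau ** 2)
```

The bound $\tau(d)^m$ is crude but sufficient here, since it only raises the exponent of $\tau(d)$ from $mB$ to $mB + m$ in (4.16).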

The final task is to establish ˜4.3. The first statement is immediate from ˜4.13 and Cauchy–Schwarz. For the second statement (which is rather crude) one can proceed directly from the definition of Ξ›per​(n)\Lambda_{\operatorname{per}}(n) using |cq|β‰ͺ1|c_{q}|\ll 1. ∎

We remark that from the first bound in (4.3) and the $Q!$-periodicity of $\Lambda_{\operatorname{per}}$ (or by direct proof) we have

\[ \mathbb{E}_{x\in I}\Lambda_{\operatorname{per}}(x)\ll(\log Q)^{O(1)} \tag{4.17} \]

for any interval $I$ of length $Q!$.

Remarks.

It is possible to establish an analogue of Lemma 4.1 with $\tilde{\Lambda}$ equal to the von Mangoldt function itself, taking $\Lambda_{\operatorname{per}}$ and the $g_{i}$ to be suitable Cramér approximants to the von Mangoldt function and $h=0$. The details necessary to accomplish this may be found in [Gre05], though the context there was different. There are some advantages to this, for instance $\Lambda_{\operatorname{per}}$ is non-negative and subject to good $\ell^{1}$- and $\ell^{\infty}$-bounds. The drawback of proceeding this way is that the bounds are ineffective due to an application of the Siegel–Walfisz theorem. This can be corrected via the introduction of appropriate ‘Siegel-modified Cramér approximants’ as in [TT25] but this is quite technical. By passing to a suitable majorant as in Lemma 4.1 we can avoid all Siegel zero issues entirely.

5. An inverse theorem

In this section we explore the consequences of an assumption

\[ \big|\mathbb{E}_{n\in[N],p\in\mathscr{P},p^{\prime}\in\mathscr{P}^{\prime}}^{\log}f_{1}(n+\lambda pp^{\prime})f_{2}(\lambda npp^{\prime})\big|\geqslant\delta \tag{5.1} \]

where $f_{1},f_{2}:\mathbf{N}\rightarrow\mathbf{C}$ are $1$-bounded, $\mathscr{P}$ consists of primes, $\mathscr{P}^{\prime}$ of almost primes and $\lambda\in\mathbf{N}$ is some parameter. The reason for being interested in such an assumption was sketched in Section 1.2 and will become further apparent in Section 7.

The aim is to show that (5.1) implies that $\|f_{1}\|_{U^{1}_{\log}[N;q,H]}$ is large for suitable parameters $q,H$. (Recall from Definition 2.3 the definition of these norms.) This result is directly inspired by [Ric25, Theorem 3.5], a connection we shall elaborate upon later. Here is the technical statement of our main result.

Proposition 5.1.

There is an absolute constant $C\in\mathbf{N}$ such that the following holds. Let $\delta\in\mathbf{R}$ be a sufficiently small parameter and let $k\in\mathbf{N}$. Suppose that $\max(k,1/\delta)\leqslant\log\log N$ and $k\leqslant\delta^{-10}$. Let $P_{1},P_{2},P^{\prime}_{1},P^{\prime}_{2}$ be parameters with $\exp\exp((\log\log N)^{1/10})\leqslant P^{\prime}_{1}<P^{\prime}_{2}<P_{1}<P_{2}<\exp((\log N)^{1/4})$ and $P_{1}\geqslant(P^{\prime}_{2})^{10}$. Suppose that $\lambda\in\mathbf{N}$ satisfies $\lambda\leqslant\exp((\log N)^{1/4})$, and that all prime factors of $\lambda$ are less than $P^{\prime}_{1}$. Let $\mathscr{P}$ denote the set of primes in $[P_{1},P_{2})$ and suppose that $\mathscr{P}^{\prime}\subset[P^{\prime}_{1},P^{\prime}_{2})$ is a set of ‘almost primes’ of the following form: $\mathscr{P}^{\prime}=\{p_{1}\cdots p_{k}:p_{\ell}\in I_{\ell}\}$, where $I_{1},\dots,I_{k}\subset[P^{\prime}_{1},P^{\prime}_{2})$ are disjoint intervals, all with $\log\log(\max(I_{\ell}))-\log\log(\min(I_{\ell}))\geqslant k\delta^{-4.1}$, and the $p_{\ell}$ range over all primes in $I_{\ell}$ for $\ell\in[k]$. Set $V:=\lfloor\delta^{-C}\rfloor!$. Suppose we have (5.1). Then we have $\|f_{1}\|_{U^{1}_{\log}[N;\lambda V,H]}\gg\delta^{O(1)}$ for any $H\in\mathbf{N}$ with $H\leqslant P_{1}^{1/8}$.

Remarks.

For the rest of the section we write $\varepsilon_{0}:=\frac{1}{10}$, thus the lower bound on $\log\log(\max(I_{\ell}))-\log\log(\min(I_{\ell}))$ is $k\delta^{-4-\varepsilon_{0}}$. Any sufficiently small absolute constant $\varepsilon_{0}$ would do here. More generally, several of the assumptions on parameters are made so as to be comfortable for the required application and we do not claim these conditions are tight. For instance, the lower bound $k\delta^{-4-\varepsilon_{0}}$ could be $k\delta^{-4}(\log(1/\delta))^{C}$ for an appropriate $C$.

5.1. Setting up the proof of the inverse theorem

The proof of Proposition 5.1 is somewhat lengthy. We prepare the ground by defining some key parameters and observing simple preliminary bounds. In the proof $C_{1}<C_{2}$ are absolute constants, with $C_{1}$ assumed to be sufficiently large and $C_{2}$ assumed sufficiently large in terms of $C_{1}$. We will write $Q:=\lfloor\delta^{-C_{2}}\rfloor$.

Next we point out some consequences of the (somewhat elaborate) conditions on parameters in the statement of Proposition 5.1. First, the $P^{\prime}_{i},P_{i}$ are enormously larger than powers of $Q!$ (and a fortiori powers of $\delta^{-O(1)}$). Indeed $P^{\prime}_{1}\geqslant\exp\exp((\log\log N)^{1/10})$ whilst $Q!\leqslant\exp(\delta^{-O(C_{2})})\leqslant\exp((\log\log N)^{O(C_{2})})$, using here the assumption that $1/\delta\leqslant\log\log N$.

Second, we have

\[ P^{\prime}_{1}>(k\log P^{\prime}_{2})^{kL} \tag{5.2} \]

for any fixed constant $L$ (assuming $N$ sufficiently large in terms of $L$). This is easily confirmed using the assumptions $P^{\prime}_{1}>\exp\exp((\log\log N)^{1/10})>\exp((\log\log N)^{3})$, $P^{\prime}_{2}\leqslant N$ and $k\leqslant\log\log N$, and will be used (twice) to verify the key condition in Lemma 3.2.
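For completeness, here is one way the verification of (5.2) might be written out. This is a sketch of the routine calculation rather than the authors' own wording, and the shorthand $M:=\log\log N$ is introduced only for this display. It uses $k\leqslant M$ and $\log\log P^{\prime}_{2}\leqslant\log\log N=M$.

```latex
\begin{align*}
\log\big((k\log P'_2)^{kL}\big)
  &= kL\big(\log k+\log\log P'_2\big)
   \leqslant LM(\log M+M)\leqslant 2LM^{2},\\
\log P'_1 &> (\log\log N)^{3}=M^{3}>2LM^{2}
  \qquad\text{once } M>2L.
\end{align*}
```

Comparing the two lines gives (5.2) as soon as $N$ is large enough in terms of $L$.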

Third and finally, we note that all the $P_{i}$ parameters are significantly smaller than $N$, and one has for example $\frac{\log P_{2}}{\log N}\ll\delta^{10}$, which will be used several times in the analysis to assert that error terms coming from (A.2) are negligible.

Next we record the fact that, under the stated conditions, the elements of $\mathscr{P}^{\prime}$ are almost pairwise coprime. If $\mathcal{N}$ is any finite set of positive integers, we define $\gamma(\mathcal{N}):=\mathbb{E}_{n,n^{\prime}\in\mathcal{N}}^{\log}(n,n^{\prime})-1$, where $(n,n^{\prime})$ is the gcd of $n,n^{\prime}$. This is a measure of the pairwise coprimality of elements of $\mathcal{N}$; note that $\gamma(\mathcal{N})\geqslant 0$ always, and that if $\gamma(\mathcal{N})$ is small then we expect the elements of $\mathcal{N}$ to be mostly coprime. Recall that $\varepsilon_{0}:=\frac{1}{10}$ (though this is irrelevant to the following lemma).

Lemma 5.2.

Under the conditions of Proposition 5.1, we have $\gamma(\mathscr{P}^{\prime})\leqslant\delta^{4+\varepsilon_{0}/2}$.

Proof.

If $\mathscr{P}_{*}$ is a set of primes and $p,p^{\prime}\in\mathscr{P}_{*}$ then $(p,p^{\prime})=1$ unless $p=p^{\prime}$, and so if we denote by $\mathscr{P}_{\ell}$ the set of primes in $I_{\ell}$ we have

\[ \gamma(\mathscr{P}_{\ell})=\Big(\sum_{p\in\mathscr{P}_{\ell}}\frac{1}{p}\Big)^{-2}\sum_{p\in\mathscr{P}_{\ell}}\frac{p-1}{p^{2}}<\Big(\sum_{p\in\mathscr{P}_{\ell}}\frac{1}{p}\Big)^{-1}. \tag{5.3} \]

Now since $\log\log(\max(I_{\ell}))-\log\log(\min(I_{\ell}))\geqslant k\delta^{-4-\varepsilon_{0}}$, it follows from Mertens’ theorem (see e.g. [Kou19, Theorem 5.4]) and (5.3) that we have $\max_{\ell}\gamma(\mathscr{P}_{\ell})\leqslant 2\delta^{4+\varepsilon_{0}}/k$. It follows that

\[ \begin{aligned} \gamma(\mathscr{P}^{\prime}) &=\mathbb{E}^{\log}_{p_{1},p^{\prime}_{1}\in\mathscr{P}_{1},\dots,p_{k},p^{\prime}_{k}\in\mathscr{P}_{k}}(p_{1}\cdots p_{k},\,p^{\prime}_{1}\cdots p^{\prime}_{k})-1=\prod_{\ell=1}^{k}\mathbb{E}^{\log}_{p_{\ell},p^{\prime}_{\ell}\in\mathscr{P}_{\ell}}(p_{\ell},p^{\prime}_{\ell})-1\\ &=\prod_{\ell=1}^{k}(1+\gamma(\mathscr{P}_{\ell}))-1\leqslant\Big(1+\frac{2\delta^{4+\varepsilon_{0}}}{k}\Big)^{k}-1\leqslant e^{2\delta^{4+\varepsilon_{0}}}-1\leqslant\delta^{4+\varepsilon_{0}/2}. \end{aligned} \]

∎
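The two identities behind this proof, namely the closed form (5.3) for a set of primes and the multiplicativity of $1+\gamma$ over disjoint prime intervals, can be sanity-checked exactly on toy data. The sketch below is only illustrative: the function names and the tiny intervals are our own choices, and we assume that $\mathbb{E}^{\log}$ denotes the average with weights proportional to $1/n$.

```python
from fractions import Fraction
from itertools import product
from math import gcd


def gamma(ns):
    """gamma(N) = E^log_{n,n' in N} gcd(n,n') - 1, with weights 1/n (assumed)."""
    ns = list(ns)
    total_weight = sum(Fraction(1, n) for n in ns)
    weighted = sum(Fraction(gcd(a, b), a * b) for a in ns for b in ns)
    return weighted / total_weight**2 - 1


def gamma_primes_closed_form(primes):
    """Right-hand expression in (5.3), valid when all elements are primes."""
    s1 = sum(Fraction(1, p) for p in primes)
    s2 = sum(Fraction(p - 1, p * p) for p in primes)
    return s2 / s1**2


def primes_in(lo, hi):
    """Primes in [lo, hi) by trial division (toy sizes only)."""
    return [n for n in range(max(lo, 2), hi)
            if all(n % d for d in range(2, int(n**0.5) + 1))]


# Two disjoint toy intervals standing in for I_1 and I_2.
P1 = primes_in(2, 10)
P2 = primes_in(11, 30)
PRODUCTS = [p * q for p, q in product(P1, P2)]

if __name__ == "__main__":
    print(gamma(P1), gamma(P2), gamma(PRODUCTS))
```

Exact rational arithmetic is used so that the identities hold with equality rather than up to rounding.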

As we said, the proof of Proposition 5.1 is lengthy. Moreover, the logic is somewhat complicated, since it is difficult to state self-contained intermediate lemmas. For reference we summarise the proof structure now.

  • We proceed directly from the assumption (5.1) via a series of steps to show that either (5.9) or (5.10) below holds.

  • We then aim to show that (5.9) leads to a contradiction. This is first done subject to an unproven claim (5.15).

  • Claim (5.15) is proven by contradiction. This task is quickly reduced to showing that statements (5.16) and (5.17) imply (5.18), which is then a somewhat lengthy undertaking.

  • At this point we have confirmed that (5.9) cannot hold. Therefore (by the first bullet point) (5.10) holds.

  • We then proceed directly from (5.10) to the desired conclusion via a quite lengthy (but linear) sequence of manipulations.

5.2. Proof of the inverse theorem

We turn now to the proof of Proposition˜5.1.

Proof.

Throughout the proof we will freely use the fact that $N$ is sufficiently large and that $\delta$ is sufficiently small. The starting assumption is (5.1). We start by removing the function $f_{2}$ using essentially the same manipulation as in [Ric25, Theorem 5.2]. First observe that, for each $p,p^{\prime}$, an application of Lemma A.4 yields

\[ \mathbb{E}_{n\in[N]}^{\log}f_{1}(n+\lambda pp^{\prime})f_{2}(\lambda npp^{\prime})=\mathbb{E}_{n\in[N]}^{\log}p^{\prime}\mathbf{1}_{p^{\prime}\mid n}f_{1}\Big(\frac{n}{p^{\prime}}+\lambda pp^{\prime}\Big)f_{2}(\lambda np)+O\Big(\frac{\log P_{2}}{\log N}\Big). \]

Averaging over $p^{\prime}$ (and using the upper bound $\frac{\log P_{2}}{\log N}\ll\delta^{3}$) gives

\[ \mathbb{E}^{\log}_{n\in[N],p^{\prime}\in\mathscr{P}^{\prime}}f_{1}(n+\lambda pp^{\prime})f_{2}(\lambda npp^{\prime})=\mathbb{E}^{\log}_{n\in[N],p^{\prime}\in\mathscr{P}^{\prime}}p^{\prime}\mathbf{1}_{p^{\prime}\mid n}f_{1}\Big(\frac{n}{p^{\prime}}+\lambda pp^{\prime}\Big)f_{2}(\lambda np)+O(\delta^{3}). \]

Write $g(p)$ for the expression on the left, and $\tilde{g}(p)$ for the first expression on the right; thus $\tilde{g}(p)=g(p)+\varepsilon(p)$ with $|\varepsilon(p)|\ll\delta^{3}$. Now the assumption is that $|\mathbb{E}^{\log}_{p\in\mathscr{P}}g(p)|\geqslant\delta$. By Cauchy–Schwarz (since $g$ is $1$-bounded) we have $\mathbb{E}^{\log}_{p\in\mathscr{P}}|g(p)|^{2}\geqslant\delta^{2}$. Therefore $\mathbb{E}^{\log}_{p\in\mathscr{P}}|\tilde{g}(p)|^{2}\geqslant\delta^{2}-2\mathbb{E}^{\log}_{p\in\mathscr{P}}|\varepsilon(p)||g(p)|-\mathbb{E}^{\log}_{p\in\mathscr{P}}|\varepsilon(p)|^{2}\geqslant\delta^{2}/2$, using the $1$-boundedness of $g$ to estimate the second term. That is,

\[ \mathbb{E}^{\log}_{p\in\mathscr{P}}\Big|\mathbb{E}_{n\in[N],p^{\prime}\in\mathscr{P}^{\prime}}^{\log}p^{\prime}\mathbf{1}_{p^{\prime}\mid n}f_{1}\Big(\frac{n}{p^{\prime}}+\lambda pp^{\prime}\Big)f_{2}(\lambda np)\Big|^{2}\geqslant\delta^{2}/2. \]

Using Cauchy–Schwarz on the inner sum (and the $1$-boundedness of $f_{2}$) gives

\[ \mathbb{E}_{p\in\mathscr{P}}^{\log}\mathbb{E}_{n\in[N]}^{\log}\Big|\mathbb{E}_{p^{\prime}\in\mathscr{P}^{\prime}}^{\log}p^{\prime}\mathbf{1}_{p^{\prime}\mid n}f_{1}\Big(\frac{n}{p^{\prime}}+\lambda pp^{\prime}\Big)\Big|^{2}\geqslant\delta^{2}/2. \]

We now pass to a non-logarithmic average in the $p$ variable, on a suitable dyadic interval. To do this, first partition $[P_{1},P_{2})$ into intervals $I$ with $\frac{3}{2}\leqslant\max(I)/\min(I)\leqslant 2$. By averaging, there is some such $I$ for which

\[ \mathbb{E}_{p\in\mathscr{P}\cap I}^{\log}\mathbb{E}_{n\in[N]}^{\log}\Big|\mathbb{E}_{p^{\prime}\in\mathscr{P}^{\prime}}^{\log}p^{\prime}\mathbf{1}_{p^{\prime}\mid n}f_{1}\Big(\frac{n}{p^{\prime}}+\lambda pp^{\prime}\Big)\Big|^{2}\geqslant\delta^{2}/2. \]

Let $X$ be such that $I\subset[X,2X)$. We introduce the majorant $\tilde{\Lambda}$ from Lemma 4.1. Since the logarithmic weight $\frac{1}{p}$ varies by a factor at most $2$ on $I$, it follows that

\[ \mathbb{E}_{x\in[X,2X)}\tilde{\Lambda}(x)\mathbb{E}_{n\in[N]}^{\log}\Big|\mathbb{E}_{p^{\prime}\in\mathscr{P}^{\prime}}^{\log}p^{\prime}\mathbf{1}_{p^{\prime}\mid n}f_{1}\Big(\frac{n}{p^{\prime}}+\lambda xp^{\prime}\Big)\Big|^{2}\gg\delta^{2}. \]

Expanding out the square gives

\[ \mathbb{E}_{x\in[X,2X)}\tilde{\Lambda}(x)\mathbb{E}_{n\in[N],p_{1}^{\prime}\in\mathscr{P}^{\prime},p_{2}^{\prime}\in\mathscr{P}^{\prime}}^{\log}p_{1}^{\prime}p^{\prime}_{2}\mathbf{1}_{[p^{\prime}_{1},p^{\prime}_{2}]\mid n}f_{1}\Big(\frac{n}{p^{\prime}_{1}}+\lambda xp^{\prime}_{1}\Big)\overline{f_{1}\Big(\frac{n}{p^{\prime}_{2}}+\lambda xp^{\prime}_{2}\Big)}\gg\delta^{2}. \tag{5.4} \]

The next technical reduction is to replace the cutoff $\mathbf{1}_{[p^{\prime}_{1},p^{\prime}_{2}]\mid n}$ with $\mathbf{1}_{p^{\prime}_{1}p^{\prime}_{2}\mid n}$, which we do using the fact that the elements of $\mathscr{P}^{\prime}$ are mostly coprime due to Lemma 5.2. Let us justify this carefully. Since $f_{1}$ is $1$-bounded and $\mathbb{E}_{x\in[X,2X)}\tilde{\Lambda}(x)\ll 1$, the error in making this switch in the LHS of (5.4) is bounded up to a constant factor by

\[ \mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}p_{1}^{\prime}p_{2}^{\prime}\big|\mathbf{1}_{[p_{1}^{\prime},p_{2}^{\prime}]\mid n}-\mathbf{1}_{p_{1}^{\prime}p_{2}^{\prime}\mid n}\big|. \tag{5.5} \]

We have the pointwise bound $\big|\mathbf{1}_{[p_{1}^{\prime},p_{2}^{\prime}]\mid n}-\mathbf{1}_{p_{1}^{\prime}p_{2}^{\prime}\mid n}\big|\leqslant 2\cdot\mathbf{1}_{(p_{1}^{\prime},p_{2}^{\prime})\neq 1}\mathbf{1}_{[p^{\prime}_{1},p^{\prime}_{2}]\mid n}$ and therefore

\[ \mathbb{E}^{\log}_{n\in[N]}\big|\mathbf{1}_{[p_{1}^{\prime},p_{2}^{\prime}]\mid n}-\mathbf{1}_{p_{1}^{\prime}p_{2}^{\prime}\mid n}\big|\leqslant 2\cdot\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})\neq 1}\mathbb{E}_{n\in[N]}^{\log}\mathbf{1}_{[p^{\prime}_{1},p^{\prime}_{2}]\mid n}\leqslant\frac{4\cdot\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})\neq 1}}{[p^{\prime}_{1},p^{\prime}_{2}]}, \]

using in the last step that $p^{\prime}_{1},p^{\prime}_{2}$ are much smaller than $N$. It follows that (5.5) is bounded above by $4\mathbb{E}_{p_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}^{\prime}}^{\log}(p^{\prime}_{1},p^{\prime}_{2})\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})\neq 1}$. Using the pointwise bound $(p^{\prime}_{1},p^{\prime}_{2})\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})\neq 1}\leqslant 2((p^{\prime}_{1},p^{\prime}_{2})-1)$, this in turn is bounded by $8\mathbb{E}_{p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}((p^{\prime}_{1},p^{\prime}_{2})-1)=8\gamma(\mathscr{P}^{\prime})$. By Lemma 5.2, we see that (5.5) is bounded by $O(\delta^{4})$. Therefore, as claimed, we may replace (5.4) by

\[ \Big|\mathbb{E}_{x\in[X,2X)}\tilde{\Lambda}(x)\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}p_{1}^{\prime}p^{\prime}_{2}\mathbf{1}_{p^{\prime}_{1}p^{\prime}_{2}\mid n}f_{1}\Big(\frac{n}{p^{\prime}_{1}}+\lambda xp^{\prime}_{1}\Big)\overline{f_{1}\Big(\frac{n}{p^{\prime}_{2}}+\lambda xp^{\prime}_{2}\Big)}\Big|\gg\delta^{2}. \tag{5.6} \]

The reason for having replaced (5.4) with (5.6) is that we may now invoke Lemma A.4 (with $q=p^{\prime}_{1}p^{\prime}_{2}$) to conclude that

\[ \Big|\mathbb{E}_{x\in[X,2X)}\tilde{\Lambda}(x)\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \tag{5.7} \]

We now apply Lemma 4.1 with parameter $Q=\lfloor\delta^{-C_{2}}\rfloor$ and constant $c:=1/4C_{1}$. Observe that the required inequality

\[ Q\leqslant\log X \tag{5.8} \]

is true and follows from the choice of parameters, using here that $X\geqslant P_{1}$.
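One way to spell out this verification is as follows. This is a sketch under the hypotheses of Proposition 5.1, with the shorthand $M:=\log\log N$ used only in this display; it relies on $1/\delta\leqslant M$ and $P_{1}>P^{\prime}_{1}\geqslant\exp\exp(M^{1/10})$.

```latex
\begin{align*}
Q &\leqslant \delta^{-C_2}\leqslant M^{C_2},\\
\log X &\geqslant \log P_1>\log P'_1\geqslant \exp\big(M^{1/10}\big)\geqslant M^{C_2}
  \qquad\text{once } M^{1/10}\geqslant C_2\log M.
\end{align*}
```

The final condition holds for all $N$ sufficiently large in terms of the absolute constant $C_{2}$.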

Let $\Lambda_{\operatorname{per}}$ be the $Q!$-periodic function as in that lemma. Our aim is to replace $\tilde{\Lambda}$ in (5.7) by $\Lambda_{\operatorname{per}}$.

From (5.7) and the triangle inequality, one of the following two statements holds:

\[ \Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}, \tag{5.9} \]

or

\[ \Big|\mathbb{E}_{x\in[X,2X)}\Lambda_{\operatorname{per}}(x)\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \tag{5.10} \]

We analyse these two possibilities in turn. In the analysis we will use several times that

\[ \mathbb{E}_{x\in[X,2X)}|(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)|\ll(\log Q)^{O(1)}, \tag{5.11} \]

which follows from (4.3) and the triangle inequality (since $\tilde{\Lambda}$ is non-negative).

Analysis of (5.9). We begin by dyadically localising (the two copies of) the set $\mathscr{P}^{\prime}$. Recall that $\mathscr{P}^{\prime}=\{p_{1}\cdots p_{k}:p_{i}\in I_{i}\}$. Since $\max(I_{i})/\min(I_{i})\geqslant 10$, we can decompose each $I_{i}$ as a disjoint union of intervals $I_{i,j}$, each of the form $[Y,(1+\eta_{i,j})Y]$ for some $\eta_{i,j}$ satisfying $\frac{1}{8k}\leqslant\eta_{i,j}\leqslant\frac{1}{4k}$. We then have a corresponding decomposition $\mathscr{P}^{\prime}=\bigcup_{j_{1},\dots,j_{k}}\mathscr{P}^{\prime}_{j_{1},\dots,j_{k}}$, where $\mathscr{P}^{\prime}_{j_{1},\dots,j_{k}}:=\{p_{1}\cdots p_{k}:p_{i}\in I_{i,j_{i}}\}$. Note that, since $(1+\frac{1}{4k})^{k}<2$, each $\mathscr{P}^{\prime}_{j_{1},\dots,j_{k}}$ is contained in a dyadic interval. By averaging, there are $\vec{j}=(j_{1},\dots,j_{k})$ and $\vec{j}^{\prime}=(j^{\prime}_{1},\dots,j^{\prime}_{k})$ such that

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}^{\prime}_{\vec{j}},p^{\prime}_{2}\in\mathscr{P}_{\vec{j}^{\prime}}^{\prime}}^{\log}\Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\mathbb{E}_{n\in[N]}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \tag{5.12} \]

For notational brevity, write $\mathscr{P}^{\prime}_{1}:=\mathscr{P}^{\prime}_{\vec{j}}$ and $\mathscr{P}^{\prime}_{2}:=\mathscr{P}^{\prime}_{\vec{j}^{\prime}}$. As $\mathscr{P}_{1}^{\prime},\mathscr{P}^{\prime}_{2}$ are each contained in dyadic intervals, we can remove the logarithmic averaging to obtain

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}_{2}}\Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\mathbb{E}_{n\in[N]}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \tag{5.13} \]

The next several manipulations leading to (5.14) are straightforward and are aimed at replacing the logarithmic average over $n$ by an ordinary average on an appropriate subinterval. We first discard the contribution from small values of $n$. Set $N^{\prime}:=e^{(\log N)^{3/4}}$ (say). Writing (5.13) as

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}_{2}}\Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\sum_{n\in[N]}\frac{1}{n}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}H_{N} \]

(where $H_{N}:=\sum_{n\leqslant N}\frac{1}{n}$ is the harmonic sum) and using (5.11), we see that the contribution to the LHS from $n\leqslant N^{\prime}$ is bounded by $H_{N^{\prime}}(\log Q)^{O(1)}<\delta^{10}H_{N}$, using here that $1/\delta\leqslant\log\log N$.
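The inequality $H_{N^{\prime}}(\log Q)^{O(1)}<\delta^{10}H_{N}$ can be unpacked as follows. This is a sketch only, with the shorthand $M:=\log\log N$ used just for this display; it relies on $H_{N}\geqslant\log N$, $H_{N^{\prime}}\leqslant 1+\log N^{\prime}$, $Q\leqslant\delta^{-C_{2}}$ and $1/\delta\leqslant M$.

```latex
\begin{align*}
\frac{H_{N'}}{H_N} &\leqslant \frac{1+(\log N)^{3/4}}{\log N}\leqslant 2(\log N)^{-1/4}=2e^{-M/4},\\
(\log Q)^{O(1)} &\leqslant (C_2\log M)^{O(1)},\qquad \delta^{10}\geqslant M^{-10},\\
2e^{-M/4}(C_2\log M)^{O(1)} &< M^{-10}\qquad\text{for } M \text{ sufficiently large.}
\end{align*}
```

The exponential saving $e^{-M/4}$ comfortably beats the polynomial losses in $M$, which is why the specific exponent $3/4$ in the definition of $N^{\prime}$ is not important.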

Since $\frac{H_{N}}{H_{N}-H_{N^{\prime}}}\approx 1$, it follows that we may replace (5.13) by

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}_{2}}\Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\mathbb{E}_{n\in[N^{\prime},N]}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \]

We now break $[N^{\prime},N]$ into intervals $I$ whose lengths satisfy $e^{(\log N)^{1/2}}\leqslant|I|\leqslant 2e^{(\log N)^{1/2}}$. By pigeonhole there exists such an interval for which

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}_{2}^{\prime}}\Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\mathbb{E}_{n\in I}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \]

The weight $1/n$ varies by a factor of at most $1+O(|I|\cdot N^{\prime-1})$ on $I$ and so, using (5.11), we can justify replacing $\mathbb{E}^{\log}_{n\in I}$ with a uniform average $\mathbb{E}_{n\in I}$, thus obtaining

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}^{\prime}_{2}}\Big|\mathbb{E}_{x\in[X,2X)}(\tilde{\Lambda}-\Lambda_{\operatorname{per}})(x)\mathbb{E}_{n\in I}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. \tag{5.14} \]

Our plan now is to use the decomposition $\tilde{\Lambda}-\Lambda_{\operatorname{per}}=\sum_{i}g_{i}+h$ from Lemma 4.1 in order to obtain a contradiction from (5.14). To do this, we claim that for a general function $\psi:[X,2X)\rightarrow\mathbf{C}$ we have

\[ \begin{aligned} \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}_{2}}&\Big|\mathbb{E}_{x\in[X,2X)}\psi(x)\mathbb{E}_{n\in I}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\\ &\ll\min\Big(\mathbb{E}_{x\in[X,2X)}|\psi(x)|,\ \frac{k}{X^{c}}\|\widehat{\psi}\|_{\infty}^{c}\|\psi\|_{\infty}^{1-c}+(\log X)^{-C_{2}}\|\psi\|_{\infty}\Big). \end{aligned} \tag{5.15} \]

Here, $\widehat{\psi}(\theta)=\sum_{x\in[X,2X)}\psi(x)e(-\theta x)$. Assuming the claim for now, we see that the LHS of (5.14) is bounded above by

\[ \ll\frac{k}{X^{c}}\sum_{i}\|\widehat{g_{i}}\|_{\infty}^{c}\|g_{i}\|_{\infty}^{1-c}+(\log X)^{-C_{2}}\sum_{i}\|g_{i}\|_{\infty}+\mathbb{E}_{x\in[X,2X)}|h(x)|\ll kQ^{-c/4} \]

by (4.4), (4.6) and (4.5), assuming here that $C_{2}$ is sufficiently large and noting (5.8). This contradicts (5.14), recalling here that $Q=\lfloor\delta^{-C_{2}}\rfloor$, that $C_{2}$ is sufficiently large in terms of $C_{1}$, and additionally recalling our assumption (in Proposition 5.1) that $k\leqslant\delta^{-10}$. That is (assuming the claim (5.15)) we cannot have (5.9), and therefore (5.10) holds.

Proof of claim (5.15). The first bound is trivial, but the second is a somewhat involved task. By homogeneity, we may assume that $\|\psi\|_{\infty}=1$. Thus if the second bound in (5.15) does not hold, we have

\[ \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}_{2}}\Big|\mathbb{E}_{x\in[X,2X)}\psi(x)\mathbb{E}_{n\in I}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}\big)}\Big|\geqslant\tau/\tau_{0}, \tag{5.16} \]

where we are free to choose an absolute $\tau_{0}$ and $\tau:=\frac{k}{X^{c}}\|\widehat{\psi}\|_{\infty}^{c}+(\log X)^{-C_{2}}$, thus in particular

\[ \tau\in[(\log X)^{-C_{2}},\tau_{0}]. \tag{5.17} \]

It therefore suffices to show that the assumption (5.16) and the inclusion (5.17) imply that

\[ \|\widehat{\psi}\|_{\infty}=\sup_{\theta\in\mathbf{R}/\mathbf{Z}}|\widehat{\psi}(\theta)|\geqslant(\tau/k)^{1/c}X, \tag{5.18} \]

since this immediately contradicts the definition of $\tau$. The remainder of the proof of claim (5.15) is devoted to this task.

Suppose that $\mathscr{P}^{\prime}_{1}\subset[Y_{1},2Y_{1}]$ and $\mathscr{P}^{\prime}_{2}\subset[Y_{2},2Y_{2}]$, where $P^{\prime}_{1}\leqslant Y_{1},Y_{2}\leqslant P^{\prime}_{2}$. Set $T_{1}:=\lfloor\tau^{C_{1}}X/Y_{1}\rfloor$ and $T_{2}:=\lfloor\tau^{C_{1}}X/Y_{2}\rfloor$. Since $X\geqslant P_{1}>(P^{\prime}_{2})^{10}\geqslant Y_{i}^{10}$ (by one of the assumptions of Proposition 5.1) and $\tau^{C_{1}}\geqslant(\log X)^{-C_{1}C_{2}}$ we have $T_{1},T_{2}\geqslant X^{1/2}\geqslant 1$. Let $t_{1},t_{2}$ be integers with $|t_{i}|\leqslant T_{i}$, and substitute $n:=n^{\prime}-\lambda p^{\prime}_{1}t_{2}-\lambda p^{\prime}_{2}t_{1}$, $x:=x^{\prime}+p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2}$ in (5.16). This gives

\[ \begin{aligned} \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}_{2}^{\prime}}\Big|&\mathbb{E}_{x^{\prime}\in[X,2X)-p^{\prime}_{1}t_{1}-p^{\prime}_{2}t_{2}}\mathbb{E}_{n^{\prime}\in I+\lambda(p^{\prime}_{1}t_{2}+p^{\prime}_{2}t_{1})}f_{1}\big(n^{\prime}p^{\prime}_{2}+\lambda x^{\prime}p^{\prime}_{1}+\lambda(p_{1}^{\prime 2}-p_{2}^{\prime 2})t_{1}\big)\\ &\times\overline{f_{1}\big(n^{\prime}p^{\prime}_{1}+\lambda x^{\prime}p^{\prime}_{2}+\lambda(p_{2}^{\prime 2}-p_{1}^{\prime 2})t_{2}\big)}\psi(x^{\prime}+p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2})\Big|\geqslant\tau. \end{aligned} \tag{5.19} \]
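The algebra behind this change of variables is easy to get wrong by a sign, so a quick mechanical check may be helpful. The sketch below is purely illustrative (the function names are ours): it substitutes $n=n^{\prime}-\lambda p^{\prime}_{1}t_{2}-\lambda p^{\prime}_{2}t_{1}$ and $x=x^{\prime}+p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2}$ into the two arguments $np^{\prime}_{2}+\lambda xp^{\prime}_{1}$ and $np^{\prime}_{1}+\lambda xp^{\prime}_{2}$ and compares with the arguments of $f_{1}$ displayed in (5.19), on random integer inputs.

```python
import random


def shifted_arguments(n_new, x_new, p1, p2, lam, t1, t2):
    """Arguments of the two copies of f_1 after substituting for n and x."""
    n = n_new - lam * p1 * t2 - lam * p2 * t1
    x = x_new + p1 * t1 + p2 * t2
    return n * p2 + lam * x * p1, n * p1 + lam * x * p2


def claimed_arguments(n_new, x_new, p1, p2, lam, t1, t2):
    """Arguments of f_1 as displayed in (5.19)."""
    first = n_new * p2 + lam * x_new * p1 + lam * (p1**2 - p2**2) * t1
    second = n_new * p1 + lam * x_new * p2 + lam * (p2**2 - p1**2) * t2
    return first, second


def check(trials=1000, seed=0):
    """Compare both expressions on random integer inputs (exact arithmetic)."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = [rng.randint(-10**6, 10**6) for _ in range(7)]
        if shifted_arguments(*args) != claimed_arguments(*args):
            return False
    return True


if __name__ == "__main__":
    print(check())
```

The point of the substitution is visible in the output of `claimed_arguments`: the first argument depends on $t_{1}$ only and the second on $t_{2}$ only, because the cross terms $\lambda p^{\prime}_{1}p^{\prime}_{2}t_{i}$ cancel.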

Now observe that $|p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2}|\ll\tau^{C_{1}}X$, and also we have the crude bound

\[ |\lambda(p^{\prime}_{1}t_{2}+p^{\prime}_{2}t_{1})|\leqslant|\lambda|P^{\prime}_{2}X\leqslant e^{3(\log N)^{1/4}}\ll\tau^{10}|I|, \]

using here that all of $|\lambda|,P^{\prime}_{2},X$ are $\leqslant e^{(\log N)^{1/4}}$, that $|I|\geqslant e^{(\log N)^{1/2}}$ and that $\tau\geqslant(\log X)^{-C_{2}}>(\log N)^{-C_{2}}$. It follows using Lemma A.1 that for each fixed $p^{\prime}_{1},p^{\prime}_{2}$ we may replace the $x^{\prime}$-average in (5.19) by $\mathbb{E}_{x^{\prime}\in[X,2X)}$, and the $n^{\prime}$-average by $\mathbb{E}_{n^{\prime}\in I}$, at the cost of changing the inner sum in (5.19) by $O(\tau^{2})$. Doing this, averaging over $t_{1},t_{2}$ and dropping the dashes on $x^{\prime},n^{\prime}$ for clarity, we obtain

\[ \begin{aligned} \mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}_{2}^{\prime}}\Big|&\mathbb{E}_{t_{1}\in[T_{1}],t_{2}\in[T_{2}],x\in[X,2X),n\in I}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}+\lambda(p_{1}^{\prime 2}-p_{2}^{\prime 2})t_{1}\big)\\ &\times\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}+\lambda(p_{2}^{\prime 2}-p_{1}^{\prime 2})t_{2}\big)}\psi(x+p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2})\Big|\geqslant\tau/2. \end{aligned} \]

Therefore there exists nn such that

𝔼p1β€²βˆˆπ’«1β€²,p2β€²βˆˆπ’«2β€²|\displaystyle\mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}_{2}^{\prime}}\Big| 𝔼t1∈[T1],t2∈[T2],x∈[X,2​X)​f1​(n​p2β€²+λ​x​p1β€²+λ​(p1′⁣2βˆ’p2′⁣2)​t1)\displaystyle\mathbb{E}_{t_{1}\in[T_{1}],t_{2}\in[T_{2}],x\in[X,2X)}f_{1}\big(np^{\prime}_{2}+\lambda xp^{\prime}_{1}+\lambda(p_{1}^{\prime 2}-p_{2}^{\prime 2})t_{1}\big)
Γ—f1​(n​p1β€²+λ​x​p2β€²+λ​(p2′⁣2βˆ’p1′⁣2)​t2)¯ψ(x+p1β€²t1+p2β€²t2)|β©ΎΟ„/2.\displaystyle\qquad\qquad\qquad\times\overline{f_{1}\big(np^{\prime}_{1}+\lambda xp^{\prime}_{2}+\lambda(p_{2}^{\prime 2}-p_{1}^{\prime 2})t_{2}\big)}\psi(x+p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2})\Big|\geqslant\tau/2.

This implies that

𝔼p1β€²βˆˆπ’«1β€²,p2β€²βˆˆπ’«2β€²,t1∈[T1],t2∈[T2],x∈[X,2​X)​F1​(p1β€²,p2β€²,x,t1)​F2​(p1β€²,p2β€²,x,t2)β€‹Οˆβ€‹(x+p1′​t1+p2′​t2)β©ΎΟ„/2\displaystyle\mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}_{2}^{\prime},t_{1}\in[T_{1}],t_{2}\in[T_{2}],x\in[X,2X)}F_{1}(p_{1}^{\prime},p_{2}^{\prime},x,t_{1})F_{2}(p_{1}^{\prime},p_{2}^{\prime},x,t_{2})\psi(x+p^{\prime}_{1}t_{1}+p^{\prime}_{2}t_{2})\geqslant\tau/2

with FiF_{i} being 11-bounded functions; here we have absorbed the absolute value as a unit complex number into F1​(p1β€²,p2β€²,x,t1)F_{1}(p_{1}^{\prime},p_{2}^{\prime},x,t_{1}). We now apply Cauchy–Schwarz twice, and replace the dummy variable xx by nn, to obtain that

𝔼p1β€²βˆˆπ’«1β€²,p2β€²βˆˆπ’«2β€²,t1,t1β€²βˆˆ[T1],t2,t2β€²βˆˆ[T2],n∈[X,2​X)​Δp1′​(t1,t1β€²)​Δp2′​(t2,t2β€²)β€‹Οˆβ€‹(n)β©Ύ(Ο„/2)4.\mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}_{2}^{\prime},t_{1},t_{1}^{\prime}\in[T_{1}],t_{2},t^{\prime}_{2}\in[T_{2}],n\in[X,2X)}\Delta_{p^{\prime}_{1}(t_{1},t_{1}^{\prime})}\Delta_{p_{2}^{\prime}(t_{2},t_{2}^{\prime})}\psi(n)\geqslant(\tau/2)^{4}.

(The notation used here is described in Section 1.4.)

The expression on the LHS is the same as the one in ˜2.12, with Si=𝒫iβ€²S_{i}=\mathscr{P}^{\prime}_{i}. In order to apply Lemma˜2.6, we need the sets 𝒫1β€²,𝒫2β€²\mathscr{P}^{\prime}_{1},\mathscr{P}^{\prime}_{2} to have suitable diophantine properties. Such a statement is precisely the content of Lemma˜3.2. To see this, recall that by definition we have 𝒫1β€²={p1​⋯​pk:pi∈Iiβ€²}\mathscr{P}^{\prime}_{1}=\{p_{1}\cdots p_{k}:p_{i}\in I^{\prime}_{i}\}, where each interval Iiβ€²I^{\prime}_{i} has the form [Mi,(1+Ξ·i)​Mi)[M_{i},(1+\eta_{i})M_{i}) for some Ξ·i∈(1/8​k,1/4​k)\eta_{i}\in(1/8k,1/4k) and some MiM_{i}, which we may assume to be the smallest prime pip_{i} in Iiβ€²I^{\prime}_{i}. Note that we always have P1β€²β©½Miβ©½P2β€²P^{\prime}_{1}\leqslant M_{i}\leqslant P^{\prime}_{2}, and Y1β©½M1​⋯​MkY_{1}\leqslant M_{1}\cdots M_{k} since M1​⋯​Mkβˆˆπ’«1β€²M_{1}\cdots M_{k}\in\mathscr{P}^{\prime}_{1} and 𝒫1β€²βŠ‚[Y1,2​Y1]\mathscr{P}^{\prime}_{1}\subset[Y_{1},2Y_{1}]. We now apply Lemma˜3.2 with j=1j=1. The required condition mini⁑Mi>QL1\min_{i}M_{i}>Q^{L_{1}} in that lemma follows using ˜5.2. Thus Lemma˜3.2 gives that 𝒫1β€²\mathscr{P}^{\prime}_{1} is (L1,k,Y1)(L_{1},k,Y_{1})-diophantine, for some absolute constant L1L_{1}. Similarly, 𝒫2β€²\mathscr{P}^{\prime}_{2} is (L1,k,Y2)(L_{1},k,Y_{2})-diophantine.

We may now apply Lemma˜2.6 with Si=𝒫iβ€²S_{i}=\mathscr{P}^{\prime}_{i} for i=1,2i=1,2, Ξ΄=(Ο„/2)4\delta=(\tau/2)^{4}, L=L1L=L_{1}, Lβ€²=kL^{\prime}=k and Di=YiD_{i}=Y_{i} for i=1,2i=1,2. To apply that lemma we need to verify, for i=1,2i=1,2, the three conditions Di,Tiβ©Ύ(Lβ€²/Ξ΄)C2.6​L2D_{i},T_{i}\geqslant(L^{\prime}/\delta)^{C_{\operatorname{\ref{lem:input-concat-2-iter}}}L^{2}} and Ti​Diβ©½(Lβ€²/Ξ΄)C2.6​L2​XT_{i}D_{i}\leqslant(L^{\prime}/\delta)^{C_{\operatorname{\ref{lem:input-concat-2-iter}}}L^{2}}X. The first condition holds comfortably using Yiβ©ΎP1β€²Y_{i}\geqslant P^{\prime}_{1} and the choice of parameters. The second condition holds even more comfortably using Tiβ©ΎX1/2β©ΎP11/2T_{i}\geqslant X^{1/2}\geqslant P_{1}^{1/2} and the choice of parameters. Finally, the third condition holds using that Ti​Di≍τC1​XT_{i}D_{i}\asymp\tau^{C_{1}}X, provided C1C_{1} is large enough; larger than 4​L1​C2.64L_{1}C_{\operatorname{\ref{lem:input-concat-2-iter}}} is sufficient.

The conclusion of Lemma˜2.6 gives that for any H1,H2∈𝐍H_{1},H_{2}\in\mathbf{N} with Hiβ©½(Ο„/k)2​C1​XH_{i}\leqslant(\tau/k)^{2C_{1}}X, there are q1,q2∈𝐍q_{1},q_{2}\in\mathbf{N}, qiβ©½(k/Ο„)O​(1)q_{i}\leqslant(k/\tau)^{O(1)} such that

𝔼n∈[2​X],h1,h1β€²βˆˆ[H1],h2,h2β€²βˆˆ[H2]​Δq1​(h1,h1β€²)​Δq2​(h2,h2β€²)β€‹Οˆβ€‹(n)≫(Ο„/k)O​(1).\mathbb{E}_{n\in[2X],h_{1},h^{\prime}_{1}\in[H_{1}],h_{2},h^{\prime}_{2}\in[H_{2}]}\Delta_{q_{1}(h_{1},h^{\prime}_{1})}\Delta_{q_{2}(h_{2},h^{\prime}_{2})}\psi(n)\gg(\tau/k)^{O(1)}. (5.20)

Set $H_1=H_2=H:=\lfloor(\tau/k)^{2C_1}X\rfloor$. The expression on the left in (5.20) is closely related to the Gowers $U^2$-norm of $\psi$ (or more accurately a Gowers–Peluse norm; see [Pel20] where they are called “Gowers box norms”). Rather than appeal to any general theory of such norms, we proceed with a direct analysis using the Fourier transform. By the Fourier expansion $\psi(n)=\int_{\mathbf{R}/\mathbf{Z}}\widehat{\psi}(\theta)e(n\theta)\,d\theta$, (5.20) is

\[
\begin{aligned}
\int\widehat{\psi}(\theta_1)\overline{\widehat{\psi}(\theta_2)}\,&\overline{\widehat{\psi}(\theta_3)}\widehat{\psi}(\theta_4)\widehat{\mu_{[2X]}}(-\theta_1+\theta_2+\theta_3-\theta_4)\widehat{\mu_{[H]}}(q_1(-\theta_1+\theta_3))\widehat{\mu_{[H]}}(q_1(-\theta_2+\theta_4))\\
&\times\widehat{\mu_{[H]}}(q_2(-\theta_1+\theta_2))\widehat{\mu_{[H]}}(q_2(-\theta_3+\theta_4))\,d\theta_1d\theta_2d\theta_3d\theta_4\gg(\tau/k)^{O(1)}.
\end{aligned}
\]

Here, $\mu_{[M]}$ denotes the normalised probability measure on $[M]$. By AM-GM and the pointwise bound $|\widehat{\mu_{[2X]}}|\leqslant 1$ we have that

\[
\begin{aligned}
\int\sum_{j=1}^{4}|\widehat{\psi}(\theta_j)|^4\,&\Big|\widehat{\mu_{[H]}}(q_1(-\theta_1+\theta_3))\widehat{\mu_{[H]}}(q_1(-\theta_2+\theta_4))\\
&\times\widehat{\mu_{[H]}}(q_2(-\theta_1+\theta_2))\widehat{\mu_{[H]}}(q_2(-\theta_3+\theta_4))\Big|\,d\theta_1d\theta_2d\theta_3d\theta_4\gg(\tau/k)^{O(1)}.
\end{aligned}
\]

Substitute $\theta'_i=-\theta_i+t$, for $t\in\mathbf{R}/\mathbf{Z}$, and integrate over $t$. This gives (dropping the dashes)

\[
\begin{aligned}
\Big(4\int|\widehat{\psi}(t)|^4\,dt\Big)&\int\Big|\widehat{\mu_{[H]}}(q_1(\theta_1-\theta_3))\widehat{\mu_{[H]}}(q_1(\theta_2-\theta_4))\\
&\times\widehat{\mu_{[H]}}(q_2(\theta_1-\theta_2))\widehat{\mu_{[H]}}(q_2(\theta_3-\theta_4))\Big|\,d\theta_1d\theta_2d\theta_3d\theta_4\gg(\tau/k)^{O(1)}.
\end{aligned}
\tag{5.21}
\]

We claim that

\[ \int\Big|\widehat{\mu_{[H]}}(q_1(\theta_1-\theta_3))\widehat{\mu_{[H]}}(q_1(\theta_2-\theta_4))\widehat{\mu_{[H]}}(q_2(\theta_1-\theta_2))\widehat{\mu_{[H]}}(q_2(\theta_3-\theta_4))\Big|\,d\theta_1d\theta_2d\theta_3d\theta_4\ll H^{-3}. \tag{5.22} \]

By AM-GM and symmetry it suffices to prove that

\[ \int\Big|\widehat{\mu_{[H]}}(q_1(\theta_1-\theta_3))\widehat{\mu_{[H]}}(q_1(\theta_2-\theta_4))\widehat{\mu_{[H]}}(q_2(\theta_1-\theta_2))\Big|^{4/3}d\theta_1d\theta_2d\theta_3d\theta_4\ll H^{-3}. \]

The triple $(q_1(\theta_1-\theta_3),q_1(\theta_2-\theta_4),q_2(\theta_1-\theta_2))$ ranges uniformly over $(\mathbf{R}/\mathbf{Z})^3$ as $(\theta_1,\theta_2,\theta_3,\theta_4)$ ranges over $(\mathbf{R}/\mathbf{Z})^4$, and so it is enough to show that $\int|\widehat{\mu_{[H]}}(\theta)|^{4/3}d\theta\ll H^{-1}$. This, however, follows immediately using the bound $|\widehat{\mu_{[H]}}(\theta)|\ll\min\big(1,H^{-1}\|\theta\|_{\mathbf{R}/\mathbf{Z}}^{-1}\big)$. The claim (5.22) is therefore proven. From this and (5.21) we immediately have $\int|\widehat{\psi}(t)|^4dt\geqslant(\tau/k)^{O(1)}H^3\gg(\tau/k)^{7C_1}X^3$ (if $C_1$ is sufficiently large). Since $\int|\widehat{\psi}(t)|^4dt\leqslant\|\widehat{\psi}\|_\infty^2\int|\widehat{\psi}(t)|^2dt$ and $\int|\widehat{\psi}(t)|^2dt\ll X$ by Parseval, it follows that $\|\widehat{\psi}\|_\infty\gg(\tau/k)^{7C_1/2}X$ and so $\|\widehat{\psi}\|_\infty\geqslant(\tau/k)^{4C_1}X=(\tau/k)^{1/c}X$. In this last step we used the fact (5.17) that $\tau\leqslant\tau_0$; what is written is then true if $\tau_0$ is chosen sufficiently small. This completes the proof that the claims (5.16) and (5.17) imply (5.18), and hence finishes the proof of the claim (5.15).
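The one-dimensional estimate $\int|\widehat{\mu_{[H]}}(\theta)|^{4/3}d\theta\ll H^{-1}$ used in the proof of (5.22) can be sanity-checked numerically. The sketch below is an illustration only, not part of the argument: it assumes the standard closed form $|\widehat{\mu_{[H]}}(\theta)|=|\sin(\pi H\theta)|/(H|\sin(\pi\theta)|)$, and the helper names `mu_hat_abs` and `l43_integral` are hypothetical. It approximates the integral by a midpoint rule and prints $H$ times the result, which should stay bounded as $H$ grows.

```python
import math

def mu_hat_abs(H: int, theta: float) -> float:
    """|Fourier transform at theta| of the uniform probability measure on {1,...,H}."""
    return abs(math.sin(math.pi * H * theta)) / (H * abs(math.sin(math.pi * theta)))

def l43_integral(H: int, points_per_lobe: int = 64) -> float:
    """Midpoint-rule approximation of the integral of |mu_hat|^(4/3) over R/Z."""
    M = points_per_lobe * H          # resolve each oscillation of width ~1/H
    total = 0.0
    for j in range(M):
        theta = (j + 0.5) / M        # midpoints never hit theta = 0
        total += mu_hat_abs(H, theta) ** (4.0 / 3.0)
    return total / M

# H * integral should remain bounded, consistent with integral << 1/H.
for H in (8, 64, 512):
    print(H, H * l43_integral(H))
```

The exponent $4/3$ matters here: the same computation with exponent $1$ would grow like $\log H$ after scaling by $H$, which is why the AM-GM step above reduces to three factors raised to the power $4/3$.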

As explained just before the statement of claim (5.15), it now follows that (5.10) holds. The remainder of the proof of Proposition 5.1 consists of the analysis of this case.

Analysis of (5.10). We first recall the statement, which is (after a mild reordering of the averaging operators)

|𝔼p1β€²,p2β€²βˆˆπ’«β€²log​𝔼h∈[X,2​X)​Λper​(h)​𝔼n∈[N]log​f1​(n​p2β€²+λ​h​p1β€²)​f1​(n​p1β€²+λ​h​p2β€²)Β―|≫δ2.\Big|\mathbb{E}^{\log}_{p_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}^{\prime}}\mathbb{E}_{h\in[X,2X)}\Lambda_{\operatorname{per}}(h)\mathbb{E}_{n\in[N]}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda hp^{\prime}_{1}\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda hp^{\prime}_{2}\big)}\Big|\gg\delta^{2}. (5.23)

The advantage of having the function Ξ›per\Lambda_{\operatorname{per}} in place of Ξ›~\tilde{\Lambda} is that the former is invariant under shifts by Q!Q!. This is by construction (Lemma˜4.1); recall here that Q=βŒŠΞ΄βˆ’C2βŒ‹Q=\lfloor\delta^{-C_{2}}\rfloor. For fixed p1β€²,p2β€²p^{\prime}_{1},p^{\prime}_{2}, in the inner average over hh and nn in ˜5.23 we substitute n:=nβ€²βˆ’Q!​λ​p2′​tn:=n^{\prime}-Q!\lambda p^{\prime}_{2}t and h:=hβ€²+Q!​p1′​th:=h^{\prime}+Q!p^{\prime}_{1}t for some tβˆˆπ™t\in\mathbf{Z} and then average over all t∈[P11/2]t\in[P_{1}^{1/2}]. By the periodicity of Ξ›per\Lambda_{\operatorname{per}} we obtain

|𝔼t∈[P11/2]𝔼p1β€²,p2β€²βˆˆπ’«β€²log𝔼hβ€²βˆˆ[X,2​X)βˆ’Q!​p1′​tΞ›per(hβ€²)𝔼nβ€²βˆˆ[N]+Q!​λ​p2′​tlog\displaystyle\Big|\mathbb{E}_{t\in[P_{1}^{1/2}]}\mathbb{E}^{\log}_{p_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}^{\prime}}\mathbb{E}_{h^{\prime}\in[X,2X)-Q!p^{\prime}_{1}t}\Lambda_{\operatorname{per}}(h^{\prime})\mathbb{E}_{n^{\prime}\in[N]+Q!\lambda p^{\prime}_{2}t}^{\log} f1​(n′​p2β€²+λ​h′​p1β€²+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))\displaystyle f_{1}\big(n^{\prime}p^{\prime}_{2}+\lambda h^{\prime}p^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)
Γ—f1​(n′​p1β€²+λ​h′​p2β€²)Β―|≫δ2.\displaystyle\times\overline{f_{1}\big(n^{\prime}p^{\prime}_{1}+\lambda h^{\prime}p^{\prime}_{2}\big)}\Big|\gg\delta^{2}. (5.24)
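The change of variables leading to (5.24) rests on two algebraic identities: under $n=n'-Q!\lambda p'_2t$ and $h=h'+Q!p'_1t$, the first argument of $f_1$ gains the term $\lambda tQ!(p_1^{\prime 2}-p_2^{\prime 2})$ while the second argument is unchanged. The following sketch checks both identities on random integers. The function name `check_shift_identity` and the sampling ranges are hypothetical, and `Qf` merely stands in for $Q!$.

```python
import random

def check_shift_identity(trials: int = 1000, seed: int = 0) -> bool:
    """Verify the two identities behind the substitution n = n' - Q! lam p2 t, h = h' + Q! p1 t."""
    rng = random.Random(seed)
    for _ in range(trials):
        lam = rng.randint(-50, 50)
        Qf = rng.randint(1, 10**6)            # placeholder for Q!
        p1, p2 = rng.randint(2, 10**4), rng.randint(2, 10**4)
        n1 = rng.randint(1, 10**9)            # plays the role of n'
        h1 = rng.randint(1, 10**6)            # plays the role of h'
        t = rng.randint(1, 10**3)
        n = n1 - Qf * lam * p2 * t
        h = h1 + Qf * p1 * t
        # First argument picks up lam * t * Q! * (p1^2 - p2^2).
        if n * p2 + lam * h * p1 != n1 * p2 + lam * h1 * p1 + lam * t * Qf * (p1**2 - p2**2):
            return False
        # Second argument is invariant: the cross terms cancel.
        if n * p1 + lam * h * p2 != n1 * p1 + lam * h1 * p2:
            return False
    return True

print(check_shift_identity())
```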

Fix $t,p'_1,p'_2$. By Lemma A.2, crude bounds for the parameters, and (4.3), the error in replacing the average over $n'$ by $\mathbb{E}^{\log}_{n'\in[N]}$ is

\[ \ll\frac{\log(Q!|\lambda|P'_2P_1^{1/2})}{\log N}\mathbb{E}_{h'}|\Lambda_{\operatorname{per}}(h')|\ll(\log N)^{-1/2}\mathbb{E}_{h'}|\Lambda_{\operatorname{per}}(h')|\ll(\log N)^{-1/2}\log(1/\delta)^{O(1)}\lll\delta^{10}, \]

so we may make this replacement without affecting (5.24). Moreover, by applying Lemma A.1 and the bound $\|\Lambda_{\operatorname{per}}\|_\infty\leqslant Q^2$, the error in then replacing the average over $h'$ by $\mathbb{E}_{h'\in[X,2X)}$ is $\ll Q^2\cdot\frac{Q!P'_1P_1^{1/2}}{X}\ll P_1^{-1/4}\ll\delta^{10}$, so we may again make the replacement without affecting (5.24). (In the chain of inequalities here we used that $P'_1\leqslant P'_2\leqslant P_1^{1/10}$, that $X\geqslant P_1$ and that $P_1$ is much larger than fixed powers of $Q!$ and $\delta^{-1}$, cf. remarks in Section 5.1.) Having made these two replacements we drop the dashes on $n',h'$ for clarity, thereby arriving at

|𝔼t∈[P11/2]​𝔼p1β€²,p2β€²βˆˆπ’«β€²log​𝔼h∈[X,2​X)​Λper​(h)​𝔼n∈[N]log​f1​(n​p2β€²+λ​h​p1β€²+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))​f1​(n​p1β€²+λ​h​p2β€²)Β―|≫δ2.\Big|\mathbb{E}_{t\in[P_{1}^{1/2}]}\mathbb{E}^{\log}_{p^{\prime}_{1},p_{2}^{\prime}\in\mathscr{P}^{\prime}}\mathbb{E}_{h\in[X,2X)}\Lambda_{\operatorname{per}}(h)\mathbb{E}_{n\in[N]}^{\log}f_{1}\big(np^{\prime}_{2}+\lambda hp^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1}\big(np^{\prime}_{1}+\lambda hp^{\prime}_{2}\big)}\Big|\gg\delta^{2}.

By the triangle inequality, we obtain

𝔼h∈[X,2​X)​|Ξ›per​(h)|​𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²log​|𝔼t∈[P11/2]​f1​(n​p2β€²+λ​h​p1β€²+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))|≫δ2.\mathbb{E}_{h\in[X,2X)}|\Lambda_{\operatorname{per}}(h)|\mathbb{E}_{n\in[N],p_{1}^{\prime},p_{2}^{\prime}\in\mathscr{P}^{\prime}}^{\log}\Big|\mathbb{E}_{t\in[P_{1}^{1/2}]}f_{1}\big(np^{\prime}_{2}+\lambda hp^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\Big|\gg\delta^{2}.

Applying Cauchy–Schwarz, we obtain

𝔼h∈[X,2​X)|Ξ›per(h)|𝔼t,tβ€²βˆˆ[P11/2]𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²logf1(\displaystyle\mathbb{E}_{h\in[X,2X)}|\Lambda_{\operatorname{per}}(h)|\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}f_{1}\big( np2β€²+Ξ»hp1β€²+Ξ»tQ!(p1′⁣2βˆ’p2′⁣2))\displaystyle np^{\prime}_{2}+\lambda hp^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)
Γ—f1​(n​p2β€²+λ​h​p1β€²+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))¯≫δ4.\displaystyle\times\overline{f_{1}\big(np^{\prime}_{2}+\lambda hp^{\prime}_{1}+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\gg\delta^{4}.

Using the pointwise bound 𝟏(p1β€²,p2β€²)β‰ 1β©½(p1β€²,p2β€²)βˆ’1\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})\neq 1}\leqslant(p^{\prime}_{1},p^{\prime}_{2})-1 and the fact (Lemma˜5.2) that γ​(𝒫′)β©½Ξ΄4+Ξ΅0/2\gamma(\mathscr{P}^{\prime})\leqslant\delta^{4+\varepsilon_{0}/2}, as well as the bound 𝔼h∈[X,2​X)​|Ξ›per​(h)|β‰ͺlogO​(1)⁑(1/Ξ΄)\mathbb{E}_{h\in[X,2X)}|\Lambda_{\operatorname{per}}(h)|\ll\log^{O(1)}(1/\delta) (see ˜4.3), we see that the contribution from pairs with (p1β€²,p2β€²)β‰ 1(p^{\prime}_{1},p^{\prime}_{2})\neq 1 can be ignored. Thus

𝔼h∈[X,2​X)​|Ξ›per​(h)|​𝔼t,tβ€²βˆˆ[P11/2]​𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²logβ€‹πŸ(p1β€²,p2β€²)=1\displaystyle\mathbb{E}_{h\in[X,2X)}|\Lambda_{\operatorname{per}}(h)|\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})=1} f1​(n​p2β€²+λ​h​p1β€²+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))\displaystyle f_{1}\big(np^{\prime}_{2}+\lambda hp^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)
Γ—f1​(n​p2β€²+λ​h​p1β€²+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))¯≫δ4.\displaystyle\times\overline{f_{1}\big(np^{\prime}_{2}+\lambda hp^{\prime}_{1}+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\gg\delta^{4}.

Since Ξ›per\Lambda_{\operatorname{per}} is invariant under shifts by Q!Q!, we may introduce an additional average obtaining

𝔼h∈[X,2​X),hβ€²βˆˆ[Xβ€²],t,tβ€²βˆˆ[P11/2]​|Ξ›per​(h)|​𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²logβ€‹πŸ(p1β€²,p2β€²)=1\displaystyle\mathbb{E}_{h\in[X,2X),h^{\prime}\in[X^{\prime}],t,t^{\prime}\in[P_{1}^{1/2}]}|\Lambda_{\operatorname{per}}(h)|\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})=1}
Γ—f1(np2β€²+Ξ»(h+Q!hβ€²)p1β€²+Ξ»tQ!(p1′⁣2βˆ’p2′⁣2))f1​(n​p2β€²+λ​(h+Q!​hβ€²)​p1β€²+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))¯≫δ4,\displaystyle\times f_{1}\big(np^{\prime}_{2}+\lambda(h+Q!h^{\prime})p^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1}\big(np^{\prime}_{2}+\lambda(h+Q!h^{\prime})p^{\prime}_{1}+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\gg\delta^{4},

where here Xβ€²:=⌊δ5​X/Q!βŒ‹X^{\prime}:=\lfloor\delta^{5}X/Q!\rfloor; note that Xβ€²X^{\prime} is much larger than 1 by the choice of parameters. Apart from the invariance of Ξ›per\Lambda_{\operatorname{per}} under translation by Q!Q!, the key point here is that, for each fixed hβ€²h^{\prime}, the shifted average differs from the original one by at most Xβˆ’1βˆ‘h∈[X,X+X′​Q!]|Ξ›per(h)|β‰ͺX′​Q!Xlog(1/Ξ΄)O​(1)X^{-1}\sum_{h\in[X,X+X^{\prime}Q!]}|\Lambda_{\operatorname{per}}(h)|\ll\frac{X^{\prime}Q!}{X}\log(1/\delta)^{O(1)} by ˜4.17 (and a similar term corresponding to the edge effects near 2​X2X).

In the display above, consider the average over n,hβ€²n,h^{\prime} (for fixed h,t,tβ€²,p1β€²,p2β€²h,t,t^{\prime},p^{\prime}_{1},p^{\prime}_{2}). The point now is that, from the point of view of logarithmic averages, n​p2β€²+λ​Q!​h′​p1β€²np^{\prime}_{2}+\lambda Q!h^{\prime}p^{\prime}_{1} may be regarded as essentially just varying over [N][N]. More precisely, applying Lemma˜A.3 with q=p2β€²q=p^{\prime}_{2}, b=λ​Q!​p1β€²b=\lambda Q!p^{\prime}_{1}, H:=Xβ€²=⌊δ5​X/Q!βŒ‹H:=X^{\prime}=\lfloor\delta^{5}X/Q!\rfloor and f​(x):=f1​(x+λ​h​p1β€²+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))​f1​(x+λ​h​p1β€²+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))Β―f(x):=f_{1}(x+\lambda hp^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2}))\overline{f_{1}(x+\lambda hp^{\prime}_{1}+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2}))}, we may replace the above with

𝔼h∈[X,2​X)​|Ξ›per​(h)|​𝔼t,tβ€²βˆˆ[P11/2]​𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²log\displaystyle\mathbb{E}_{h\in[X,2X)}|\Lambda_{\operatorname{per}}(h)|\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log} 𝟏(p1β€²,p2β€²)=1​f1​(n+λ​h​p1β€²+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))\displaystyle\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})=1}f_{1}\big(n+\lambda hp^{\prime}_{1}+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)
Γ—f1​(n+λ​h​p1β€²+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))¯≫δ4.\displaystyle\times\overline{f_{1}\big(n+\lambda hp^{\prime}_{1}+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\gg\delta^{4}. (5.25)

Let us comment on the application of Lemma A.3. First, we used that $q=p'_2$ and $b=\lambda Q!p'_1$ are coprime. That $(p'_2,p'_1)=1$ is enforced by the indicator $\mathbf{1}_{(p'_1,p'_2)=1}$; that $(p'_2,\lambda)=1$ follows from the assumption that all prime factors of $\lambda$ are less than $P'_1$; and that $(p'_2,Q!)=1$ follows using that $P'_1$ is much larger than $\delta^{-C_2}$. The error terms $O\big(\frac{\log q+\log bH}{\log N}\big)$ and $O\big(\frac{q}{H}\big)$ resulting from the application of Lemma A.3 are both $\ll\delta^{10}$ by simple verifications using the choice of parameters, the key point being that $H>P_1^{1/2}$ is much larger than $q$, but $bH<P_2^2$ is much smaller than $N$.

Applying ˜A.2, we may remove the λ​h​p1β€²\lambda hp^{\prime}_{1} shifts in ˜5.25, allowing us to decouple the average over hh and thus obtain via another application of ˜4.3 that

𝔼t,tβ€²βˆˆ[P11/2]​𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²logβ€‹πŸ(p1β€²,p2β€²)=1​f1​(n+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))​f1​(n+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))¯≫δ4+Ξ΅0/4.\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}\mathbf{1}_{(p^{\prime}_{1},p^{\prime}_{2})=1}f_{1}\big(n+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1}\big(n+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\gg\delta^{4+\varepsilon_{0}/4}. (5.26)

We may remove the condition (p1β€²,p2β€²)=1(p^{\prime}_{1},p^{\prime}_{2})=1 (losing a further factor of 2 in the implicit constant) exactly as before, obtaining

𝔼t,tβ€²βˆˆ[P11/2]​𝔼n∈[N],p1β€²,p2β€²βˆˆπ’«β€²log​f1​(n+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))​f1​(n+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))¯≫δ4+Ξ΅0/4>Ξ΄5.\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}_{n\in[N],p_{1}^{\prime},p^{\prime}_{2}\in\mathscr{P}^{\prime}}^{\log}f_{1}\big(n+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1}\big(n+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\gg\delta^{4+\varepsilon_{0}/4}>\delta^{5}.

To analyse this, we will eventually use the diophantine nature of suitable sets {(pβ€²)2:pβ€²βˆˆπ’«β€²}\{(p^{\prime})^{2}:p^{\prime}\in\mathscr{P}^{\prime}\}, applying Lemma˜3.2 in the case j=2j=2. To prepare the ground, we must again foliate into appropriate β€˜subdyadic products’ as we did in the analysis of ˜5.9 leading to ˜5.12. With notation exactly the same as in that analysis, we may locate 𝒫1β€²:=𝒫jβ†’β€²\mathscr{P}^{\prime}_{1}:=\mathscr{P}^{\prime}_{\vec{j}} and 𝒫2β€²:=𝒫jβ†’β€²β€²\mathscr{P}^{\prime}_{2}:=\mathscr{P}^{\prime}_{\vec{j}^{\prime}} such that

𝔼t,tβ€²βˆˆ[P11/2]​𝔼n∈[N]log​𝔼p1β€²βˆˆπ’«1β€²,p2β€²βˆˆπ’«2′​f1​(n+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))​f1​(n+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))Β―β©ΎΞ΄5.\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}^{\log}_{n\in[N]}\mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}^{\prime}_{1},p^{\prime}_{2}\in\mathscr{P}^{\prime}_{2}}f_{1}\big(n+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1}\big(n+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\geqslant\delta^{5}.

Note here that we were able to replace the logarithmic average over the piβ€²p^{\prime}_{i} variables by a uniform average since these are now dyadically localised, and each t,tβ€²t,t^{\prime}-average is nonnegative. Suppose that 𝒫1β€²βŠ‚[Y1,2​Y1]\mathscr{P}^{\prime}_{1}\subset[Y_{1},2Y_{1}] and 𝒫2β€²βŠ‚[Y2,2​Y2]\mathscr{P}^{\prime}_{2}\subset[Y_{2},2Y_{2}], where P1β€²β©½Y1,Y2β©½P2β€²P^{\prime}_{1}\leqslant Y_{1},Y_{2}\leqslant P^{\prime}_{2}. Without loss of generality, Y1β©ΎY2Y_{1}\geqslant Y_{2}. Pigeonholing in p2β€²p^{\prime}_{2}, we see that there is some p2β€²p^{\prime}_{2} such that

𝔼t,tβ€²βˆˆ[P11/2]​𝔼n∈[N]log​𝔼p1β€²βˆˆπ’«1′​f1​(n+λ​t​Q!​(p1′⁣2βˆ’p2′⁣2))​f1​(n+λ​t′​Q!​(p1′⁣2βˆ’p2′⁣2))Β―β©ΎΞ΄5.\mathbb{E}_{t,t^{\prime}\in[P_{1}^{1/2}]}\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}^{\prime}_{1}}f_{1}\big(n+\lambda tQ!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1}\big(n+\lambda t^{\prime}Q!(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\geqslant\delta^{5}.

By Lemma˜A.2 with modulus q=λ​Q!q=\lambda Q!, this gives

𝔼a∈{0,1,…,λ​Q!βˆ’1}​𝔼n∈[N]log​𝔼p1β€²βˆˆπ’«1β€²,t,tβ€²βˆˆ[P11/2]​f1,a​(n+t​(p1′⁣2βˆ’p2′⁣2))​f1,a​(n+t′​(p1′⁣2βˆ’p2′⁣2))Β―β©ΎΞ΄6\mathbb{E}_{a\in\{0,1,\dots,\lambda Q!-1\}}\mathbb{E}^{\log}_{n\in[N]}\mathbb{E}_{p_{1}^{\prime}\in\mathscr{P}^{\prime}_{1},t,t^{\prime}\in[P_{1}^{1/2}]}f_{1,a}\big(n+t(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)\overline{f_{1,a}\big(n+t^{\prime}(p^{\prime 2}_{1}-p^{\prime 2}_{2})\big)}\geqslant\delta^{6}

where $f_{1,a}(n):=f_1(\lambda Q!n+a)$. For each fixed $a$, the inner average is of the form (2.3), with $S:=\{p_1^{\prime 2}-p_2^{\prime 2}:p'_1\in\mathscr{P}'_1\}$ and $\delta$ replaced by $\delta^6$. We showed in Lemma 3.2 (with $j=2$) that $S+p_2^{\prime 2}=\{p_1^{\prime 2}:p'_1\in\mathscr{P}'_1\}$ is $(L_2,k,Y_1^2)$-diophantine (the condition $\min_iM_i>Q^{L_2}$ in that lemma follows using (5.2)), and so by translation invariance of the notion of diophantine, the same is true of $S$. Observe that $S\subset[-4Y_1^2,4Y_1^2]$. Thus we may aim to apply Lemma 2.4 with $S=\{p_1^{\prime 2}-p_2^{\prime 2}:p'_1\in\mathscr{P}'_1\}$, $T:=\lfloor P_1^{1/2}\rfloor$, $(L,L',D)=(L_2,k,Y_1^2)$, and $\delta$ replaced by $\delta^6$. There are three conditions to be checked, namely that $D,T\geqslant(L'/\delta^6)^{8L}=(k/\delta^6)^{8L_2}$, and that $\frac{\log TD}{\log N}\leqslant(\delta^6/k)^{50L_2}$.

The first condition, involving D=Y12D=Y_{1}^{2}, is immediate from Y1β©ΎP1β€²Y_{1}\geqslant P^{\prime}_{1} and the parameter hierarchy. The second condition, involving T=⌊P11/2βŒ‹T=\lfloor P_{1}^{1/2}\rfloor, is also immediate. For the third condition note that T​Y12β©ΎTβ©ΎP11/2TY_{1}^{2}\geqslant T\geqslant P_{1}^{1/2} and (Ξ΄6/k)50​L2(\delta^{6}/k)^{50L_{2}} is much smaller than P11/4P_{1}^{1/4}.

Thus the appeal to Lemma 2.4 is indeed valid, and we are free to take $H=P_1^{1/4}$ in this application. Recalling that $k\leqslant\delta^{-10}$, the conclusion of Lemma 2.4 is that for each $a$ there is $q_a\leqslant(k/\delta)^{O(1)}\leqslant\delta^{-O(1)}$ such that $\|f_{1,a}\|_{U^1_{\log}[N;q_a,P_1^{1/4}]}\gg\delta^{O(1)}$. By pigeonhole there is a set of $\geqslant\delta^{O(1)}\lambda Q!$ values of $a$ on which $q_a$ does not depend on $a$. Denote this common value by $q$ (which is of course not the same quantity as in the application of Lemma A.2 above). It follows that

\[ \mathbb{E}_{a\in\{0,1,\dots,\lambda Q!-1\}}\|f_{1,a}\|^2_{U^1_{\log}[N;q,P_1^{1/4}]}\gg\delta^{O(1)}, \]

that is to say

𝔼a∈{0,1,…,λ​Q!βˆ’1}​𝔼n∈[N]log​𝔼h,hβ€²βˆˆ[P11/4]​f1,a​(n+q​h)​f1,a​(n+q​hβ€²)Β―β©ΎΞ΄O​(1).\mathbb{E}_{a\in\{0,1,\dots,\lambda Q!-1\}}\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h,h^{\prime}\in[P_{1}^{1/4}]}f_{1,a}(n+qh)\overline{f_{1,a}(n+qh^{\prime})}\geqslant\delta^{O(1)}.

A further application of Lemma˜A.2 then yields

𝔼n∈[N]log​𝔼h,hβ€²βˆˆ[P11/4]​f1​(n+λ​h​q​Q!)​f1​(n+λ​h′​q​Q!)¯≫δO​(1),\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h,h^{\prime}\in[P_{1}^{1/4}]}f_{1}(n+\lambda hqQ!)\overline{f_{1}(n+\lambda h^{\prime}qQ!)}\gg\delta^{O(1)},

which is the statement

β€–f1β€–Ulog1​[N;λ​q​Q!,P11/4]2≫δO​(1).\|f_{1}\|^{2}_{U^{1}_{\log}[N;\lambda qQ!,P_{1}^{1/4}]}\gg\delta^{O(1)}.

Finally, let Hβ©½P11/8H\leqslant P_{1}^{1/8} be as in the statement of Proposition˜5.1. Set C:=2​C2C:=2C_{2} and V:=βŒŠΞ΄βˆ’CβŒ‹!V:=\lfloor\delta^{-C}\rfloor!. Note that q​Q!∣VqQ!\mid V. Therefore by Lemma˜A.6 we have

\[ \|f_1\|_{U^1_{\log}[N;\lambda V,H]}\geqslant\|f_1\|_{U^1_{\log}[N;\lambda qQ!,P_1^{1/4}]}-O\Big(\frac{\log|P_1^{1/4}\lambda qQ!|}{\log N}\Big)-O\Big(\frac{HV}{P_1^{1/4}qQ!}\Big)\gg\delta^{O(1)}, \]

where the error terms can be estimated crudely bearing in mind the comments in Section˜5.1 (essentially, P1P_{1} is much smaller than NN but much larger than all other variables). This concludes the proof of Proposition˜5.1. ∎

6. Averaging projections and orthogonality

In the introduction we discussed certain ‘projection’ operators $\Pi^{\operatorname{sml}},\Pi^{\operatorname{lrg}}$. In this section we introduce the general class of such operators and establish some of their basic properties.

Definition 6.1.

Let f:𝐙→𝐂f:\mathbf{Z}\rightarrow\mathbf{C} be a function. Suppose that q,H∈𝐍q,H\in\mathbf{N}. Then we define

Ξ q,H​f​(n):=𝔼h,hβ€²βˆˆ[H]​f​(n+q​(hβˆ’hβ€²)).\Pi_{q,H}f(n):=\mathbb{E}_{h,h^{\prime}\in[H]}f(n+q(h-h^{\prime})).

Whilst we informally think of these maps as projections, this is not quite accurate as in general $\Pi_{q,H}\Pi_{q,H}f\neq\Pi_{q,H}f$. The first observation we require is that $\Pi_{q,H}f$ has an almost periodicity property.

Lemma 6.2.

Let f:𝐙→𝐂f:\mathbf{Z}\rightarrow\mathbf{C} be a 1-bounded function. Let q,H∈𝐍q,H\in\mathbf{N}. Then, for any hh we have

Ξ q,H​f​(n+q​h)=Ξ q,H​f​(n)+O​(|h|H).\Pi_{q,H}f(n+qh)=\Pi_{q,H}f(n)+O(\frac{|h|}{H}).
Proof.

The LHS may be expanded as $\mathbb{E}_{h_1,h'_1\in[H]}f(n+q(h+h_1-h'_1))$. The result then follows from Lemma A.1. ∎
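Both the failure of idempotence and the almost periodicity of Lemma 6.2 are easy to observe on small examples. The sketch below implements $\Pi_{q,H}$ directly from Definition 6.1 in exact rational arithmetic. The names `Pi`, `delta` and `sign_pattern` are hypothetical, and the constant $2$ in the check $|\Pi_{q,H}f(n+qh)-\Pi_{q,H}f(n)|\leqslant 2|h|/H$ is one admissible value of the implied constant for $1$-bounded $f$ (an assumption of this illustration; the lemma itself only asserts $O(|h|/H)$).

```python
from fractions import Fraction

def Pi(f, q, H):
    """Return n -> E_{h,h' in [H]} f(n + q(h - h')), computed exactly."""
    def averaged(n):
        total = sum(Fraction(f(n + q * (h - hp)))
                    for h in range(1, H + 1) for hp in range(1, H + 1))
        return total / (H * H)
    return averaged

def delta(n):
    """Indicator of n = 0, a convenient 1-bounded test function."""
    return 1 if n == 0 else 0

def sign_pattern(n):
    """An arbitrary +-1 valued test function (hypothetical choice)."""
    return 1 if (n * n // 7) % 2 == 0 else -1

# Non-idempotence: applying Pi_{1,2} twice to delta changes the value at 0.
once = Pi(delta, 1, 2)
twice = Pi(once, 1, 2)
print(once(0), twice(0))  # prints: 1/2 3/8

# Almost periodicity along the progression n + q*h.
g = Pi(sign_pattern, 3, 20)
print(max(abs(g(5 + 3 * h) - g(5)) for h in range(-5, 6)))
```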

A crucial feature of the maps $\Pi_{q,H}$ is that they essentially preserve the $U^1_{\log}$-norms (see Definition 2.3). Indeed we have the following lemma.

Lemma 6.3.

Let q∈𝐍q\in\mathbf{N} and Hβ€²β©½HH^{\prime}\leqslant H. Then for f:𝐙→𝐂f:\mathbf{Z}\to\mathbf{C} which is 11-bounded, we have that β€–Ξ q,H′​fβˆ’fβ€–Ulog1​[N;q,H]β‰ͺHβ€²/H\|\Pi_{q,H^{\prime}}f-f\|_{U^{1}_{\log}[N;q,H]}\ll H^{\prime}/H.

Proof.

First recall that by Definition 2.3 we have

β€–gβ€–Ulog1​[N;q,H]2=𝔼n∈[N]log​|𝔼h∈[H]​g​(n+h​q)|2.\|g\|_{U^{1}_{\log}[N;q,H]}^{2}=\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}g(n+hq)\big|^{2}. (6.1)

Note that by Lemma A.1 we have

𝔼h∈[H]​Πq,H′​f​(n+h​q)=𝔼h∈[H],h1β€²,h2β€²βˆˆ[Hβ€²]​f​(n+q​(h+h1β€²βˆ’h2β€²))=𝔼h∈[H]​f​(n+h​q)+O​(Hβ€²H).\mathbb{E}_{h\in[H]}\Pi_{q,H^{\prime}}f(n+hq)=\mathbb{E}_{h\in[H],h^{\prime}_{1},h^{\prime}_{2}\in[H^{\prime}]}f(n+q(h+h^{\prime}_{1}-h^{\prime}_{2}))=\mathbb{E}_{h\in[H]}f(n+hq)+O\Big(\frac{H^{\prime}}{H}\Big).

The desired result follows immediately upon taking g=fβˆ’Ξ q,H′​fg=f-\Pi_{q,H^{\prime}}f in ˜6.1. ∎

We next require an approximate Pythagoras relation for projections $\Pi_{q,H},\Pi_{q',H'}$.

Lemma 6.4.

Let q,qβ€²,H,Hβ€²q,q^{\prime},H,H^{\prime} be parameters with q∣qβ€²q\mid q^{\prime} and Hβ€²β©½HH^{\prime}\leqslant H. Let f:𝐙→𝐂f:\mathbf{Z}\rightarrow\mathbf{C} be a 11-bounded function. We have that

𝔼n∈[N]log​|Ξ qβ€²,H′​f​(n)βˆ’Ξ q,H​f​(n)|2⩽𝔼n∈[N]log​|Ξ qβ€²,H′​f​(n)|2βˆ’π”Όn∈[N]log​|Ξ q,H​f​(n)|2+O​(log⁑q′​Hlog⁑N+q′​Hβ€²q​H).\mathbb{E}_{n\in[N]}^{\log}\big|\Pi_{q^{\prime},H^{\prime}}f(n)-\Pi_{q,H}f(n)\big|^{2}\leqslant\mathbb{E}_{n\in[N]}^{\log}\big|\Pi_{q^{\prime},H^{\prime}}f(n)\big|^{2}-\mathbb{E}_{n\in[N]}^{\log}\big|\Pi_{q,H}f(n)\big|^{2}+O\Big(\frac{\log q^{\prime}H}{\log N}+\frac{q^{\prime}H^{\prime}}{qH}\Big).
Proof.

For brevity we write ⟨g1,g2⟩:=𝔼n∈[N]log​g1​(n)​g2​(n)Β―\langle g_{1},g_{2}\rangle:=\mathbb{E}_{n\in[N]}^{\log}g_{1}(n)\overline{g_{2}(n)} and β€–gβ€–2:=⟨g,g⟩=𝔼n∈[N]log​|g​(n)|2\|g\|^{2}:=\langle g,g\rangle=\mathbb{E}_{n\in[N]}^{\log}|g(n)|^{2}.

We first expand the LHS as

β€–Ξ qβ€²,H′​fβ€–2+β€–Ξ q,H​fβ€–2βˆ’βŸ¨Ξ qβ€²,H′​f,Ξ q,H​fβŸ©βˆ’βŸ¨Ξ qβ€²,H′​f,Ξ q,H​f⟩¯.\|\Pi_{q^{\prime},H^{\prime}}f\|^{2}+\|\Pi_{q,H}f\|^{2}-\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle-\overline{\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle}. (6.2)

Expanding the definitions, we have

⟨Πqβ€²,H′​f,Ξ q,H​f⟩=𝔼n∈[N]log​𝔼h1,h2∈[H],h1β€²,h2β€²βˆˆ[Hβ€²]​f​(n+q′​(h1β€²βˆ’h2β€²))​f​(n+q​(h1βˆ’h2))Β―.\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle=\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h_{1},h_{2}\in[H],h^{\prime}_{1},h^{\prime}_{2}\in[H^{\prime}]}f(n+q^{\prime}(h^{\prime}_{1}-h^{\prime}_{2}))\overline{f(n+q(h_{1}-h_{2}))}.

Substitute $n=n'+qh_2-q'h'_1$; then, dropping the dash on $n'$, we see from Lemma A.2 that this is

\[ \mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h_1,h_2\in[H],h'_1,h'_2\in[H']}f(n+qh_2-q'h'_2)\overline{f(n+qh_1-q'h'_1)}+O\Big(\frac{\log q'H}{\log N}\Big), \]

which equals

𝔼n∈[N]log​|𝔼h∈[H],hβ€²βˆˆ[Hβ€²]​f​(n+q​hβˆ’q′​hβ€²)|2+O​(log⁑q′​Hlog⁑N).\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H],h^{\prime}\in[H^{\prime}]}f(n+qh-q^{\prime}h^{\prime})\big|^{2}+O\Big(\frac{\log q^{\prime}H}{\log N}\Big).

Now by Lemma A.1 (using here that $q\mid q'$) we have

𝔼h∈[H],hβ€²βˆˆ[Hβ€²]​f​(n+q​hβˆ’q′​hβ€²)=𝔼h∈[H]​f​(n+q​h)+O​(q′​Hβ€²q​H).\mathbb{E}_{h\in[H],h^{\prime}\in[H^{\prime}]}f(n+qh-q^{\prime}h^{\prime})=\mathbb{E}_{h\in[H]}f(n+qh)+O\Big(\frac{q^{\prime}H^{\prime}}{qH}\Big).
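The role of the divisibility $q\mid q'$ in this step is that $q'h'$ is then a multiple of $q$, so subtracting it merely shifts the range of $h$ by $(q'/q)h'\leqslant q'H'/q$ places. For a $1$-bounded $f$ this changes the average by at most $2q'H'/(qH)$, which is one admissible value of the implied constant (an assumption of this illustration, not a statement from the text). The sketch below checks this on a hypothetical test function `f`, with hypothetical helper names, in exact arithmetic.

```python
from fractions import Fraction

def f(n: int) -> int:
    """An arbitrary 1-bounded test function (hypothetical choice, values +-1)."""
    return 1 if (n * n // 5) % 2 == 0 else -1

def single_average(n, q, H):
    """E_{h in [H]} f(n + q h)."""
    return Fraction(sum(f(n + q * h) for h in range(1, H + 1)), H)

def double_average(n, q, H, qp, Hp):
    """E_{h in [H], h' in [H']} f(n + q h - q' h')."""
    total = sum(f(n + q * h - qp * hp)
                for h in range(1, H + 1) for hp in range(1, Hp + 1))
    return Fraction(total, H * Hp)

def worst_gap(q, H, qp, Hp, n_range):
    """Largest discrepancy between the two averages over the given n."""
    assert qp % q == 0, "the shift argument needs q | q'"
    return max(abs(double_average(n, q, H, qp, Hp) - single_average(n, q, H))
               for n in n_range)

q, H, qp, Hp = 2, 60, 6, 3
print(worst_gap(q, H, qp, Hp, range(100, 140)), Fraction(2 * qp * Hp, q * H))
```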

Therefore, putting these observations together we obtain

⟨Πqβ€²,H′​f,Ξ q,H​f⟩=𝔼n∈[N]log​|𝔼h∈[H]​f​(n+q​h)|2+O​(q′​Hβ€²q​H+log⁑q′​Hlog⁑N).\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle=\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}f(n+qh)\big|^{2}+O\Big(\frac{q^{\prime}H^{\prime}}{qH}+\frac{\log q^{\prime}H}{\log N}\Big).

Taking complex conjugates and adding, we obtain

⟨Πqβ€²,H′​f,Ξ q,H​f⟩+⟨Πqβ€²,H′​f,Ξ q,H​f⟩¯=2​𝔼n∈[N]log​|𝔼h∈[H]​f​(n+q​h)|2+O​(q′​Hβ€²q​H+log⁑q′​Hlog⁑N).\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle+\overline{\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle}=2\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}f(n+qh)\big|^{2}+O\Big(\frac{q^{\prime}H^{\prime}}{qH}+\frac{\log q^{\prime}H}{\log N}\Big).

Now by a further application of ˜A.2,

𝔼n∈[N]log​|𝔼h∈[H]​f​(n+q​h)|2=𝔼hβ€²βˆˆ[H]​𝔼n∈[N]log​|𝔼h∈[H]​f​(n+q​(hβˆ’hβ€²))|2+O​(log⁑q​Hlog⁑N),\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}f(n+qh)\big|^{2}=\mathbb{E}_{h^{\prime}\in[H]}\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}f(n+q(h-h^{\prime}))\big|^{2}+O\Big(\frac{\log qH}{\log N}\Big),

and by Cauchy–Schwarz this is at least

𝔼n∈[N]log​|𝔼h,hβ€²βˆˆ[H]​f​(n+q​(hβˆ’hβ€²))|2+O​(log⁑q​Hlog⁑N)=β€–Ξ q,H​fβ€–2+O​(log⁑q​Hlog⁑N).\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h,h^{\prime}\in[H]}f(n+q(h-h^{\prime}))\big|^{2}+O(\frac{\log qH}{\log N})=\|\Pi_{q,H}f\|^{2}+O\Big(\frac{\log qH}{\log N}\Big).

It follows that

⟨Πqβ€²,H′​f,Ξ q,H​f⟩+⟨Πqβ€²,H′​f,Ξ q,H​f⟩¯⩾2​‖Πq,H​fβ€–2+O​(q′​Hβ€²q​H+log⁑q′​Hlog⁑N).\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle+\overline{\langle\Pi_{q^{\prime},H^{\prime}}f,\Pi_{q,H}f\rangle}\geqslant 2\|\Pi_{q,H}f\|^{2}+O\Big(\frac{q^{\prime}H^{\prime}}{qH}+\frac{\log q^{\prime}H}{\log N}\Big).

Substituting into (6.2) gives the lemma. ∎
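For convenience, the final substitution can be written out in full. This only restates the steps above; the shorthand $\Pi^{\prime}:=\Pi_{q^{\prime},H^{\prime}}$ and $\Pi:=\Pi_{q,H}$ is introduced here for the display and is not notation used elsewhere in the paper.

```latex
% Shorthand (ours): \Pi' := \Pi_{q',H'} and \Pi := \Pi_{q,H}.
\begin{align*}
\|\Pi^{\prime}f-\Pi f\|^{2}
 &= \|\Pi^{\prime}f\|^{2}+\|\Pi f\|^{2}
    -\big(\langle\Pi^{\prime}f,\Pi f\rangle+\overline{\langle\Pi^{\prime}f,\Pi f\rangle}\big) \\
 &\leqslant \|\Pi^{\prime}f\|^{2}+\|\Pi f\|^{2}-2\|\Pi f\|^{2}
    +O\Big(\frac{q^{\prime}H^{\prime}}{qH}+\frac{\log q^{\prime}H}{\log N}\Big) \\
 &= \|\Pi^{\prime}f\|^{2}-\|\Pi f\|^{2}
    +O\Big(\frac{\log q^{\prime}H}{\log N}+\frac{q^{\prime}H^{\prime}}{qH}\Big).
\end{align*}
```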

We now give the ‘maximal function’ argument hinted at in the introduction, where we explained how to move from (1.1) to (1.2).

Lemma 6.5.

Let f,g:𝐍→𝐂f,g:\mathbf{N}\rightarrow\mathbf{C} be non-negative 11-bounded functions. Let δ∈(0,12)\delta\in(0,\frac{1}{2}) and let H,qH,q be positive integer parameters with log⁑H​qlog⁑N<c​δ2\frac{\log Hq}{\log N}<c\delta^{2}. Suppose that 𝔼n∈[N]log​f​(n)​g​(n)β©ΎΞ΄\mathbb{E}_{n\in[N]}^{\log}f(n)g(n)\geqslant\delta. Then 𝔼n∈[N]log​(Ξ q,H​f)​(n)​g​(n)β©ΎΞ΄2/8\mathbb{E}_{n\in[N]}^{\log}(\Pi_{q,H}f)(n)g(n)\geqslant\delta^{2}/8.

Proof.

Write Ξ =Ξ q,H\Pi=\Pi_{q,H} for brevity. Set Ξ΅:=Ξ΄/4\varepsilon:=\delta/4 and denote h​(n):=1Π​f​(n)>Ξ΅h(n):=1_{\Pi f(n)>\varepsilon}. Then since 0β©½f​hβ©½10\leqslant fh\leqslant 1 and (Π​f)​h⩾Ρ​h(\Pi f)h\geqslant\varepsilon h pointwise we have

𝔼n∈[N]log​(Π​f)​(n)​g​(n)⩾𝔼n∈[N]log​f​(n)​(Π​f)​(n)​g​(n)​h​(n)⩾Ρ​𝔼n∈[N]log​f​(n)​h​(n)​g​(n).\mathbb{E}_{n\in[N]}^{\log}(\Pi f)(n)g(n)\geqslant\mathbb{E}_{n\in[N]}^{\log}f(n)(\Pi f)(n)g(n)h(n)\geqslant\varepsilon\mathbb{E}_{n\in[N]}^{\log}f(n)h(n)g(n).

Therefore we are done if we can show that 𝔼n∈[N]log​f​(n)​(1βˆ’h​(n))β©½Ξ΄/2\mathbb{E}_{n\in[N]}^{\log}f(n)(1-h(n))\leqslant\delta/2, that is to say

𝔼n∈[N]log​f​(n)​1Π​f​(n)β©½Ξ΅β©½Ξ΄/2.\mathbb{E}_{n\in[N]}^{\log}f(n)1_{\Pi f(n)\leqslant\varepsilon}\leqslant\delta/2. (6.3)

Write $F(n):=f(n)1_{\Pi f(n)\leqslant\varepsilon}$. Since $F\leqslant f$ pointwise, we have $\Pi F\leqslant\Pi f$ pointwise, and so if $F(n)\neq 0$ then we have $\Pi F(n)\leqslant\Pi f(n)\leqslant\varepsilon$. It follows, using (A.2) and Cauchy–Schwarz, that

|𝔼n∈[N]log​F​(n)|2\displaystyle\big|\mathbb{E}_{n\in[N]}^{\log}F(n)\big|^{2} =|𝔼n∈[N]log​𝔼h∈[H]​F​(n+h​q)|2+O​(log⁑H​qlog⁑N)\displaystyle=\big|\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h\in[H]}F(n+hq)\big|^{2}+O\Big(\frac{\log Hq}{\log N}\Big)
⩽𝔼n∈[N]log​|𝔼h∈[H]​F​(n+h​q)|2+O​(log⁑H​qlog⁑N)\displaystyle\leqslant\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H]}F(n+hq)\big|^{2}+O\Big(\frac{\log Hq}{\log N}\Big)
=𝔼n∈[N]log​𝔼h,hβ€²βˆˆ[H]​F​(n+h​q)​F​(n+h′​q)+O​(log⁑H​qlog⁑N)\displaystyle=\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h,h^{\prime}\in[H]}F(n+hq)F(n+h^{\prime}q)+O\Big(\frac{\log Hq}{\log N}\Big)
⩽𝔼n∈[N]log​𝔼h,hβ€²βˆˆ[H]​F​(n)​F​(n+(hβˆ’hβ€²)​q)+O​(log⁑H​qlog⁑N)\displaystyle\leqslant\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h,h^{\prime}\in[H]}F(n)F(n+(h-h^{\prime})q)+O\Big(\frac{\log Hq}{\log N}\Big)
=𝔼n∈[N]log​F​(n)​(Π​F)​(n)+O​(log⁑H​qlog⁑N)\displaystyle=\mathbb{E}^{\log}_{n\in[N]}F(n)(\Pi F)(n)+O\Big(\frac{\log Hq}{\log N}\Big)
⩽Ρ​𝔼n∈[N]log​F​(n)+Ξ΅2.\displaystyle\leqslant\varepsilon\mathbb{E}^{\log}_{n\in[N]}F(n)+\varepsilon^{2}.

It follows that 𝔼n∈[N]log​F​(n)β©½2​Ρ\mathbb{E}_{n\in[N]}^{\log}F(n)\leqslant 2\varepsilon, so the claim ˜6.3 follows due to the choice of Ξ΅\varepsilon. ∎
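The last deduction is a one-line quadratic inequality, spelled out here as a sketch. The shorthand $x$ is ours; the last line of the chain above presumably uses $F\cdot\Pi F\leqslant\varepsilon F$ pointwise and absorbs the $O(\log Hq/\log N)$ term into $\varepsilon^{2}=\delta^{2}/16$ via the hypothesis $\frac{\log Hq}{\log N}<c\delta^{2}$ with $c$ small.

```latex
% Shorthand (ours): x := \mathbb{E}^{\log}_{n\in[N]} F(n), so that 0 \le x \le 1.
\[
x^{2}\leqslant\varepsilon x+\varepsilon^{2}
\quad\Longrightarrow\quad
x\leqslant\frac{\varepsilon+\sqrt{\varepsilon^{2}+4\varepsilon^{2}}}{2}
 =\frac{1+\sqrt{5}}{2}\,\varepsilon
 <2\varepsilon=\frac{\delta}{2}.
\]
```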

We note a corollary under the same conditions which is good for taking averages, namely that for any Ξ·\eta

𝔼n∈[N]log​Π​f​(n)​g​(n)β©ΎΞ·8​𝔼n∈[N]log​f​(n)​g​(n)βˆ’Ξ·28.\mathbb{E}_{n\in[N]}^{\log}\Pi f(n)g(n)\geqslant\frac{\eta}{8}\mathbb{E}^{\log}_{n\in[N]}f(n)g(n)-\frac{\eta^{2}}{8}. (6.4)

Indeed, if we write Ξ΄:=𝔼n∈[N]log​f​(n)​g​(n)\delta:=\mathbb{E}_{n\in[N]}^{\log}f(n)g(n) then ˜6.4 is trivial for Ξ΄β©½Ξ·\delta\leqslant\eta, while for Ξ΄β©ΎΞ·\delta\geqslant\eta it follows from Lemma˜6.5.
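The two cases can be checked as follows. This is a sketch which assumes, as the definitions suggest, that $\Pi$ is an averaging operator, so that $\Pi f\geqslant 0$ whenever $f\geqslant 0$.

```latex
% Case \delta \le \eta: the right-hand side of (6.4) is non-positive,
% while the left-hand side is non-negative since \Pi f, g \ge 0.
\[
\frac{\eta}{8}\,\delta-\frac{\eta^{2}}{8}=\frac{\eta}{8}(\delta-\eta)\leqslant 0 .
\]
% Case \delta \ge \eta: Lemma 6.5 gives the lower bound \delta^2/8, and
\[
\frac{\delta^{2}}{8}-\Big(\frac{\eta}{8}\,\delta-\frac{\eta^{2}}{8}\Big)
 =\frac{1}{8}\Big(\big(\delta-\tfrac{\eta}{2}\big)^{2}+\tfrac{3}{4}\eta^{2}\Big)\geqslant 0 .
\]
```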

7. Proof of the main theorem

We are now ready to prove our main result, Theorem˜1.1. The reader may find it helpful to revisit the overview given in the introduction.

7.1. Setting up parameters.

We begin by defining parameters and scales to be used in the proof.

Let rr be the number of colours; we will fix this for the remainder of the proof and we may assume it is sufficiently large. Let C0C_{0} be a suitable large positive integer (independent of rr), recall that Ξ΅0:=110\varepsilon_{0}:=\frac{1}{10}, and set

K:=C0​r8,t:=K2,V=(⌈r4+Ξ΅0βŒ‰C)!andN:=exp⁑exp⁑(r50),K:=C_{0}r^{8},\quad t:=K^{2},\quad V=(\lceil r^{4+\varepsilon_{0}}\rceil^{C})!\quad\mbox{and}\quad N:=\exp\exp(r^{50}), (7.1)

where here CC is the constant in Proposition˜5.1. Define

B0:={V4i:i=1,2,…,K2}.B_{0}:=\{V^{4^{i}}:i=1,2,\dots,K^{2}\}. (7.2)

We now define a doubly-indexed sequence of positive integer scales (Hi,j)i∈[t],j∈[2​K](H_{i,j})_{i\in[t],j\in[2K]} by

Hi,j:=⌊exp⁑exp⁑(r25​(4​K​i+j))βŒ‹.H_{i,j}:=\lfloor\exp\exp(r^{25}(4Ki+j))\rfloor. (7.3)

Note that we have the crude bounds

exp⁑exp⁑((log⁑log⁑N)1/10)<max⁑B0<H1,1<β‹―<H1,2​K<H2,1<β‹―<Ht,2​K<e(log⁑N)1/10,\exp\exp((\log\log N)^{1/10})<\max B_{0}<H_{1,1}<\cdots<H_{1,2K}<H_{2,1}<\cdots<H_{t,2K}<e^{(\log N)^{1/10}}, (7.4)

provided rr is large enough. We will also use the auxiliary scales Hi,0H_{i,0} defined by the same formula ˜7.3. For i∈[t]i\in[t] and j∈[K]j\in[K], define 𝒫i,j\mathscr{P}_{i,j} to be the set of primes satisfying Hi,2​jβˆ’1β©½pβ©½Hi,2​jH_{i,2j-1}\leqslant p\leqslant H_{i,2j}. We note that with this choice of parameters we have, by Mertens’ theorem, βˆ‘pβˆˆπ’«i,j1p≫r25\sum_{p\in\mathscr{P}_{i,j}}\frac{1}{p}\gg r^{25}.
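The Mertens estimate can be made explicit as follows. This is a sketch that ignores the floor functions in (7.3) and uses Mertens' theorem in the standard form $\sum_{x\leqslant p\leqslant y}\frac{1}{p}=\log\log y-\log\log x+O(1/\log x)$.

```latex
\begin{align*}
\log\log H_{i,2j}-\log\log H_{i,2j-1}
 &= r^{25}(4Ki+2j)-r^{25}(4Ki+2j-1)=r^{25}, \\
\sum_{p\in\mathscr{P}_{i,j}}\frac{1}{p}
 &= \log\log H_{i,2j}-\log\log H_{i,2j-1}+O\Big(\frac{1}{\log H_{i,2j-1}}\Big)
  = r^{25}+O(1).
\end{align*}
```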

7.2. Positivity for x,x​yx,xy

The first step of the proof is to isolate the colour class in which we will eventually find our configuration {x+y,x​y}\{x+y,xy\}, and to show that it is rich in configurations {x,x​y}\{x,xy\}. This is a mild variant of [Ric25, TheoremΒ 3.6], which itself is related to results of Ahlswede, Khachatrian and SΓ‘rkΓΆzy [AKS99] and Davenport and ErdΕ‘s [DE36].

Consider an rr-colouring A1βˆͺβ‹―βˆͺAr=[N]A_{1}\cup\cdots\cup A_{r}=[N]. For each b∈B0b\in B_{0} we have

𝔼n∈[N]logβ€‹βˆ‘j=1r𝟏Aj​(b​n)=𝔼n∈[N]log​1[N]​(b​n)=HN/bHNβ©Ύ12,\mathbb{E}_{n\in[N]}^{\log}\sum_{j=1}^{r}\mathbf{1}_{A_{j}}(bn)=\mathbb{E}_{n\in[N]}^{\log}1_{[N]}(bn)=\frac{H_{N/b}}{H_{N}}\geqslant\tfrac{1}{2},

where here HNH_{N} denotes the harmonic sum. The last bound here follows (comfortably) using ˜7.4. By summing over all b∈B0b\in B_{0} and an appeal to the pigeonhole principle, there is some colour class A=AjA=A_{j} such that

βˆ‘b∈B0𝔼n∈[N]log​1A​(b​n)β©ΎK2/2​r,\sum_{b\in B_{0}}\mathbb{E}_{n\in[N]}^{\log}1_{A}(bn)\geqslant K^{2}/2r,

which implies that 𝔼n∈[N]log​1A​(b​n)β©Ύ1/4​r\mathbb{E}_{n\in[N]}^{\log}1_{A}(bn)\geqslant 1/4r for at least K2/4​rβ©ΎKK^{2}/4r\geqslant K elements b∈B0b\in B_{0}. Fix a set BβŠ‚B0B\subset B_{0} of KK such elements. We fix the colour class AA for the remainder of the proof.
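The counting step can be spelled out as follows. The set $B^{\prime}$ is notation introduced only for this sketch; we use that each average is at most $1$ and that $|B_{0}|=K^{2}$.

```latex
% Notation (ours): B' := \{ b \in B_0 : \mathbb{E}^{\log}_{n\in[N]} 1_A(bn) \ge 1/4r \}.
\[
\frac{K^{2}}{2r}
 \leqslant\sum_{b\in B_{0}}\mathbb{E}^{\log}_{n\in[N]}1_{A}(bn)
 \leqslant|B^{\prime}|\cdot 1+\big(K^{2}-|B^{\prime}|\big)\cdot\frac{1}{4r}
 \leqslant|B^{\prime}|+\frac{K^{2}}{4r},
\]
% hence |B'| \ge K^2/4r, and K^2/4r \ge K exactly when K \ge 4r,
% which holds for K = C_0 r^8.
```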

By repeated applications of Lemma˜A.5, we have

𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,jβˆˆπ’«i,jlog​1A​(b​pi,1​⋯​pi,j​n)β©Ύ1/8​r\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,j}\in\mathscr{P}_{i,j}}1_{A}(bp_{i,1}\cdots p_{i,j}n)\geqslant 1/8r

for any i∈[t]i\in[t], any jβ©½Kj\leqslant K and for any b∈Bb\in B. Note here that the error term arising from this repeated application of Lemma˜A.5 is dominated by β‰ͺKmaxi,j(βˆ‘pβˆˆπ’«i,j1p)βˆ’1/2β‰ͺKrβˆ’25/2β‰ͺrβˆ’3\ll K\max_{i,j}\big(\sum_{p\in\mathscr{P}_{i,j}}\frac{1}{p}\big)^{-1/2}\ll Kr^{-25/2}\ll r^{-3}.

Let the elements of BB be b1<β‹―<bKb_{1}<\cdots<b_{K}. Then, applying the above with b=bjb=b_{j} and summing over 1β©½jβ©½K1\leqslant j\leqslant K, we obtain

βˆ‘j=1K𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klog​1A​(bj​pi,1​⋯​pi,j​n)β©ΎK8​r.\sum_{j=1}^{K}\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}1_{A}(b_{j}p_{i,1}\cdots p_{i,j}n)\geqslant\frac{K}{8r}.

(Note here that, for the term with index jj, we can include the extra averages over 𝒫i,j+1,…​𝒫i,K\mathscr{P}_{i,j+1},\dots\mathscr{P}_{i,K} with no change to the expression.) By Cauchy–Schwarz it follows that

𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klogβ€‹βˆ‘1β©½j,jβ€²β©½K1A​(bj​pi,1​⋯​pi,j​n)​1A​(bj′​pi,1​⋯​pi,j′​n)β©Ύ2βˆ’6​(Kr)2.\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}\sum_{1\leqslant j,j^{\prime}\leqslant K}1_{A}(b_{j}p_{i,1}\cdots p_{i,j}n)1_{A}(b_{j^{\prime}}p_{i,1}\cdots p_{i,j^{\prime}}n)\geqslant 2^{-6}\big(\frac{K}{r}\big)^{2}.

Since $K=C_{0}r^{8}$, if $r$ is large enough we may exclude the $O(K)$ pairs of indices with $|j-j^{\prime}|\leqslant 1$ at the loss of at most a factor $2$. By symmetry we are also free to include only the pairs with $j>j^{\prime}$ (at the loss of another factor of $2$), and we thereby obtain

\[
\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}\sum_{\begin{subarray}{c}1\leqslant j^{\prime}<j\leqslant K\\ j\geqslant j^{\prime}+2\end{subarray}}1_{A}(b_{j^{\prime}}p_{i,1}\cdots p_{i,j^{\prime}}n)1_{A}(b_{j}p_{i,1}\cdots p_{i,j}n)\geqslant 2^{-8}\Big(\frac{K}{r}\Big)^{2}. \tag{7.5}
\]

By another repeated application of Lemma˜A.5 we have

𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klog​1A​(bj′​pi,1​⋯​pi,j′​n)​1A​(bj​pi,1​⋯​pi,j​n)\displaystyle\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}1_{A}(b_{j^{\prime}}p_{i,1}\cdots p_{i,j^{\prime}}n)1_{A}(b_{j}p_{i,1}\cdots p_{i,j}n)
=𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klog​1A​(bj′​n)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)+O​(rβˆ’3)\displaystyle\qquad\qquad=\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}1_{A}(b_{j^{\prime}}n)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)+O(r^{-3})

for each pair j,jβ€²j,j^{\prime} with j>jβ€²j>j^{\prime}. From this and ˜7.5, it follows (again assuming rr large enough) that

𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klogβ€‹βˆ‘1β©½jβ€²<jβ©½Kjβ©Ύjβ€²+21A​(bj′​n)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)β©Ύ2βˆ’9​(Kr)2.\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}\sum_{\begin{subarray}{c}1\leqslant j^{\prime}<j\leqslant K\\ j\geqslant j^{\prime}+2\end{subarray}}1_{A}(b_{j^{\prime}}n)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\geqslant 2^{-9}\big(\frac{K}{r}\big)^{2}.

Recall that this is true for all i∈[t]i\in[t]. By pigeonhole, for each ii there is some j′​(i)j^{\prime}(i) such that

𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klogβ€‹βˆ‘j=j′​(i)+2K1A​(bj′​(i)​n)​1A​(bj​pi,j′​(i)+1​⋯​pi,j​n)β©Ύ2βˆ’9​Kr2.\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}\sum_{j=j^{\prime}(i)+2}^{K}1_{A}(b_{j^{\prime}(i)}n)1_{A}(b_{j}p_{i,j^{\prime}(i)+1}\cdots p_{i,j}n)\geqslant 2^{-9}\frac{K}{r^{2}}.

Pass to a subset IβŠ‚[t]I\subset[t] of size at least t/Kt/K such that j′​(i)j^{\prime}(i) does not depend on i∈Ii\in I, and denote by jβ€²j^{\prime} the common value of these j′​(i)j^{\prime}(i). Writing b:=bjβ€²b:=b_{j^{\prime}} and f​(n):=1A​(b​n)f(n):=1_{A}(bn), we then have

𝔼n∈[N],pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klogβ€‹βˆ‘j=jβ€²+2Kf​(n)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)β©Ύ2βˆ’9​Kr2\mathbb{E}^{\log}_{n\in[N],p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}\sum_{j=j^{\prime}+2}^{K}f(n)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\geqslant 2^{-9}\frac{K}{r^{2}} (7.6)

for all i∈Ii\in I. Fix this choice of jβ€²j^{\prime} (and hence of b=bjβ€²b=b_{j^{\prime}} and the function ff) for the rest of the proof. Define also Iβˆ—:=Iβˆ–{max⁑I}I_{*}:=I\setminus\{\max I\} to be the elements of II except the largest one; thus |Iβˆ—|β©Ύ|I|/2|I_{*}|\geqslant|I|/2.

7.3. Proof of the main theorem

We think of pairs $(i,j)$ (with $i\in I_{*}$ and $j\geqslant j^{\prime}+2$) as ‘scales’ in the proof. Associated to any scale will be a pair of ‘projection’ operators in the sense of Definition 6.1. Define $Q_{j}:=\frac{b_{j}}{b^{2}}V$. Note that $Q_{j}$ is an integer (in fact it equals $V^{4^{j}-2\cdot 4^{j^{\prime}}+1}$).

For each pair (i,j)(i,j) there will be two important projection operators Ξ \Pi, namely

Ξ i,jsml:=Ξ Qjβˆ’1,Hi+,0andΞ i,jlrg:=Ξ Qj,Hi,0.\Pi^{\operatorname{sml}}_{i,j}:=\Pi_{Q_{j-1},H_{i_{+},0}}\quad\mbox{and}\quad\Pi^{\operatorname{lrg}}_{i,j}:=\Pi_{Q_{j},H_{i,0}}. (7.7)

Here, i+i_{+} denotes the next largest element in II after ii, which exists since i∈Iβˆ—=Iβˆ–{max⁑I}i\in I_{*}=I\setminus\{\max I\}. We informally refer to these as the β€˜small’ and β€˜large’ projections associated to (i,j)(i,j).

We first apply the small projection operator to ˜7.6 using Lemma˜6.5, or more accurately ˜6.4. Taking Ξ·=2βˆ’10​rβˆ’2\eta=2^{-10}r^{-2} there, we have

\[
\begin{split}
&\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}\sum_{j=j^{\prime}+2}^{K}\Pi_{i,j}^{\operatorname{sml}}f(n)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\\
&\qquad\geqslant\frac{\eta}{8}\Big(2^{-9}\frac{K}{r^{2}}\Big)-\frac{\eta^{2}}{8}K\gg\frac{K}{r^{4}}.
\end{split}
\tag{7.8}
\]
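For the record, the arithmetic behind the final bound in (7.8), with $\eta=2^{-10}r^{-2}$, is as follows; the $\eta^{2}$ term carries a factor $K$ because (6.4) is applied once for each of at most $K$ values of $j$.

```latex
\begin{align*}
\frac{\eta}{8}\cdot 2^{-9}\frac{K}{r^{2}}
 &= 2^{-10}r^{-2}\cdot 2^{-3}\cdot 2^{-9}\,\frac{K}{r^{2}} = 2^{-22}\,\frac{K}{r^{4}}, \\
\frac{\eta^{2}}{8}\,K
 &= 2^{-20}r^{-4}\cdot 2^{-3}\,K = 2^{-23}\,\frac{K}{r^{4}}, \\
\frac{\eta}{8}\cdot 2^{-9}\frac{K}{r^{2}}-\frac{\eta^{2}}{8}\,K
 &= 2^{-23}\,\frac{K}{r^{4}} .
\end{align*}
```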

Here, and below, 𝔼pi,βˆ—βˆˆπ’«i,βˆ—log\mathbb{E}^{\log}_{p_{i,*}\in\mathscr{P}_{i,*}} is shorthand for 𝔼pi,1βˆˆπ’«i,1,…,pi,Kβˆˆπ’«i,Klog\mathbb{E}^{\log}_{p_{i,1}\in\mathscr{P}_{i,1},\dots,p_{i,K}\in\mathscr{P}_{i,K}}. Now observe that by Lemma˜6.2 we have

Ξ i,jsml​f​(n)\displaystyle\Pi_{i,j}^{\operatorname{sml}}f(n) =Ξ i,jsml​f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)+O​(bjb2​pi,jβ€²+1​⋯​pi,jHi+,0)\displaystyle=\Pi_{i,j}^{\operatorname{sml}}f\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)+O\Big(\frac{b_{j}}{b^{2}}\frac{p_{i,j^{\prime}+1}\cdots p_{i,j}}{H_{i_{+},0}}\Big)
=Ξ i,jsml​f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)+O​(rβˆ’10).\displaystyle=\Pi_{i,j}^{\operatorname{sml}}f\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)+O(r^{-10}). (7.9)

The key points to observe here in applying Lemma˜6.2 are that Qjβˆ’1=bjβˆ’1b2​V∣bjb2Q_{j-1}=\frac{b_{j-1}}{b^{2}}V\mid\frac{b_{j}}{b^{2}} by the definitions of the bjb_{j}s, and also

bjb2​pi,jβ€²+1​⋯​pi,jβ©½V4K2β€‹βˆj=1KHi,2​j<V4K2​Hi,2​K2<rβˆ’10​Hi+,0.\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\leqslant V^{4^{K^{2}}}\prod_{j=1}^{K}H_{i,2j}<V^{4^{K^{2}}}H_{i,2K}^{2}<r^{-10}H_{i_{+},0}.

The inequalities here are all very comfortably true (when rr is large); we have r10<V4K2<H1,1<Hi,2​Kr^{10}<V^{4^{K^{2}}}<H_{1,1}<H_{i,2K}, that Hi,2​j2<Hi,2​(j+1)H_{i,2j}^{2}<H_{i,2(j+1)} for all jj, and that Hi,2​K4<Hi+,0H_{i,2K}^{4}<H_{i_{+},0}, all of which follow using ˜7.4. From ˜7.8 andΒ 7.9 we have

βˆ‘j=jβ€²+2K𝔼n∈[N],pi,βˆ—βˆˆπ’«i,βˆ—log​Πi,jsml​f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)≫Kr4.\sum_{j=j^{\prime}+2}^{K}\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}\Pi_{i,j}^{\operatorname{sml}}f\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\gg\frac{K}{r^{4}}.

This, recall, is for all i∈Iβˆ—i\in I_{*}. Summing over all these ii gives

βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2K𝔼n∈[N],pi,βˆ—βˆˆπ’«i,βˆ—log​Πi,jsml​f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)≫K​|I|r4.\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}\Pi_{i,j}^{\operatorname{sml}}f\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\gg\frac{K|I|}{r^{4}}. (7.10)

Suppose we had a similar result with Ξ i,jsml​f\Pi_{i,j}^{\operatorname{sml}}f replaced by ff, that is

βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2K𝔼n∈[N],pi,βˆ—βˆˆπ’«i,βˆ—log​f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)≫K​|I|r4.\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}f\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\gg\frac{K|I|}{r^{4}}. (7.11)

In particular, for some choice of i,j,pi,jβ€²+1,…,pi,ji,j,p_{i,j^{\prime}+1},\dots,p_{i,j} and nβ©Ύ3n\geqslant 3 we would then have

f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)>0.f\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)>0.

Taking x:=b​nx:=bn and y:=bjb​pi,jβ€²+1​⋯​pi,jy:=\frac{b_{j}}{b}p_{i,j^{\prime}+1}\cdots p_{i,j} (and recalling that f​(n)=1A​(b​n)f(n)=1_{A}(bn)) we then have x+y,x​y∈Ax+y,xy\in A, and the proof is complete.
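The identification of $x+y$ and $xy$ is a direct computation. In the sketch below, $P$ is shorthand (ours) for the product $p_{i,j^{\prime}+1}\cdots p_{i,j}$.

```latex
% Shorthand (ours): P := p_{i,j'+1} \cdots p_{i,j}, so x = bn and y = (b_j/b) P.
\begin{align*}
x+y &= bn+\frac{b_{j}}{b}P = b\Big(n+\frac{b_{j}}{b^{2}}P\Big),
 & 1_{A}(x+y) &= f\Big(n+\frac{b_{j}}{b^{2}}P\Big), \\
xy &= bn\cdot\frac{b_{j}}{b}P = b_{j}Pn,
 & 1_{A}(xy) &= 1_{A}(b_{j}Pn).
\end{align*}
```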

It remains to prove that we do indeed have ˜7.11. As described in the introduction, we deduce it from ˜7.10 in two steps. First, we replace the β€˜small’ projections Ξ i,jsml​f\Pi_{i,j}^{\operatorname{sml}}f in ˜7.10 by the β€˜large’ projections Ξ i,jlrg​f\Pi_{i,j}^{\operatorname{lrg}}f. The error in making this replacement is

βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2K𝔼n∈[N],pi,βˆ—βˆˆπ’«i,βˆ—log​(Ξ i,jsml​fβˆ’Ξ i,jlrg​f)​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n).\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}\big(\Pi_{i,j}^{\operatorname{sml}}f-\Pi_{i,j}^{\operatorname{lrg}}f\big)\big(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j}\big)1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n). (7.12)

By ˜A.2 and the crude bounds bjβ©½V4K2b_{j}\leqslant V^{4^{K^{2}}}, pi,βˆ—β©½Ht,2​Kp_{i,*}\leqslant H_{t,2K} this is

βˆ‘i∈Iβˆ—βˆ‘jβ©Ύjβ€²+2𝔼pi,βˆ—βˆˆπ’«i,βˆ—log​𝔼n∈[N]log​(Ξ i,jsml​fβˆ’Ξ i,jlrg​f)​(n)β€‹Οˆi,j,pi,βˆ—β€‹(n)+O​(|I|​K​log⁑(V4K2​Ht,2​K2​K)log⁑N),\sum_{i\in I_{*}}\sum_{j\geqslant j^{\prime}+2}\mathbb{E}^{\log}_{p_{i,*}\in\mathscr{P}_{i,*}}\mathbb{E}^{\log}_{n\in[N]}\big(\Pi_{i,j}^{\operatorname{sml}}f-\Pi_{i,j}^{\operatorname{lrg}}f\big)(n)\psi_{i,j,p_{i,*}}(n)+O\Big(|I|K\frac{\log(V^{4^{K^{2}}}H^{2K}_{t,2K})}{\log N}\Big), (7.13)

where

ψi,j,pi,βˆ—β€‹(n):=1A​(bj​pi,jβ€²+1​⋯​pi,j​(nβˆ’bjb2​pi,jβ€²+1​⋯​pi,j)).\psi_{i,j,p_{i,*}}(n):=1_{A}\big(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}(n-\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j})\big).

For the rest of the proof (as in Lemma˜6.4) we use the notation ⟨g1,g2⟩:=𝔼n∈[N]log​g1​(n)​g2​(n)Β―\langle g_{1},g_{2}\rangle:=\mathbb{E}_{n\in[N]}^{\log}g_{1}(n)\overline{g_{2}(n)} and β€–gβ€–2:=⟨g,g⟩=𝔼n∈[N]log​|g​(n)|2\|g\|^{2}:=\langle g,g\rangle=\mathbb{E}_{n\in[N]}^{\log}|g(n)|^{2}. Using ˜7.1 andΒ 7.4, the error term in ˜7.13 is seen to be O​(|I|​K​rβˆ’10)O(|I|Kr^{-10}). Thus ˜7.13 is

\[
\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\mathbb{E}^{\log}_{p_{i,*}\in\mathscr{P}_{i,*}}\langle\Pi_{i,j}^{\operatorname{sml}}f-\Pi_{i,j}^{\operatorname{lrg}}f,\psi_{i,j,p_{i,*}}\rangle+O(|I|Kr^{-10}).
\]

By Cauchy–Schwarz and the 11-boundedness of the functions ψ\psi, this is bounded above by

βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2K\displaystyle\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K} β€–Ξ i,jlrg​fβˆ’Ξ i,jsml​fβ€–+O​(|I|​K​rβˆ’10)\displaystyle\|\Pi_{i,j}^{\operatorname{lrg}}f-\Pi_{i,j}^{\operatorname{sml}}f\|+O(|I|Kr^{-10})
β©½(|I|​K)1/2​(βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2Kβ€–Ξ i,jlrg​fβˆ’Ξ i,jsml​fβ€–2)1/2+O​(|I|​K​rβˆ’10).\displaystyle\leqslant(|I|K)^{1/2}\Big(\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\|\Pi_{i,j}^{\operatorname{lrg}}f-\Pi_{i,j}^{\operatorname{sml}}f\|^{2}\Big)^{1/2}+O(|I|Kr^{-10}). (7.14)

For each i,ji,j we apply Lemma˜6.4 with q=Qjβˆ’1q=Q_{j-1}, qβ€²=Qjq^{\prime}=Q_{j}, H=Hi+,0H=H_{i_{+},0} and Hβ€²=Hi,0H^{\prime}=H_{i,0}, obtaining

β€–Ξ i,jlrg​fβˆ’Ξ i,jsml​fβ€–2\displaystyle\|\Pi_{i,j}^{\operatorname{lrg}}f-\Pi_{i,j}^{\operatorname{sml}}f\|^{2} β©½β€–Ξ i,jlrg​fβ€–2βˆ’β€–Ξ i,jsml​fβ€–2+O​(log⁑Qj​Hi+,0log⁑N)+O​(Qj​Hi,0Qjβˆ’1​Hi+,0)\displaystyle\leqslant\|\Pi_{i,j}^{\operatorname{lrg}}f\|^{2}-\|\Pi_{i,j}^{\operatorname{sml}}f\|^{2}+O\Big(\frac{\log Q_{j}H_{i_{+},0}}{\log N}\Big)+O\Big(\frac{Q_{j}H_{i,0}}{Q_{j-1}H_{i_{+},0}}\Big)
β©½β€–Ξ i,jlrg​fβ€–2βˆ’β€–Ξ i,jsml​fβ€–2+rβˆ’10.\displaystyle\leqslant\|\Pi_{i,j}^{\operatorname{lrg}}f\|^{2}-\|\Pi_{i,j}^{\operatorname{sml}}f\|^{2}+r^{-10}. (7.15)

To explain the last line here, we can bound the first error term by $<(\log N)^{-1/2}<r^{-20}$ using (7.4). The second error term can be bounded using $Q_{j}<\max B_{0}$ and the fact that $H_{i_{+},0}\geqslant H_{i+1,0}>r^{20}(\max B_{0})H_{i,0}$, which can be verified using the definitions (7.2) and (7.3).

Summing ˜7.15 over i,ji,j gives

βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2Kβ€–Ξ i,jlrg​fβˆ’Ξ i,jsml​fβ€–2β©½βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2K(β€–Ξ i,jlrg​fβ€–2βˆ’β€–Ξ i,jsml​fβ€–2)+O​(|I|​K​rβˆ’10).\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\|\Pi_{i,j}^{\operatorname{lrg}}f-\Pi_{i,j}^{\operatorname{sml}}f\|^{2}\leqslant\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\big(\|\Pi_{i,j}^{\operatorname{lrg}}f\|^{2}-\|\Pi_{i,j}^{\operatorname{sml}}f\|^{2}\big)+O(|I|Kr^{-10}).

Recalling the definitions ˜7.7 of the two projection operators, we see that the bracketed sum has considerable cancellation; the only uncancelled positive terms are the β€–Ξ i,jlrg​fβ€–2\|\Pi_{i,j}^{\operatorname{lrg}}f\|^{2} terms from scales (i,j)(i,j) which are not of the form (iΒ―+,jΒ―βˆ’1)(\overline{i}_{+},\overline{j}-1) for some other scale (iΒ―,jΒ―)(\overline{i},\overline{j}), that is to say with i=min⁑(I)i=\min(I) or j=Kj=K; thus the bracketed sum is bounded by |I|+K|I|+K. It follows that ˜7.14 is bounded by

\[
\leqslant(|I|K)^{1/2}\big(|I|+K+O(|I|Kr^{-10})\big)^{1/2}+O(|I|Kr^{-10})\ll C_{0}^{-1/2}r^{-4}|I|K,
\]

using here that K=C0​r8K=C_{0}r^{8} and |I|β©Ύt/K=K|I|\geqslant t/K=K.
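The cancellation and the final numerical bound can be written out as follows. This is only a sketch of the steps just described: the first line is read off from the definitions (7.7), and the second uses $|I|\geqslant K$ and $K=C_{0}r^{8}$; the remaining error terms are of lower order when $r$ is large.

```latex
% Telescoping: the small projection at scale (i,j) is the large projection
% at scale (i_+, j-1), since both equal \Pi_{Q_{j-1}, H_{i_+,0}}.
\[
\Pi^{\operatorname{sml}}_{i,j}=\Pi_{Q_{j-1},H_{i_{+},0}}=\Pi^{\operatorname{lrg}}_{i_{+},j-1},
\qquad\text{so}\qquad
\|\Pi^{\operatorname{sml}}_{i,j}f\|^{2}=\|\Pi^{\operatorname{lrg}}_{i_{+},j-1}f\|^{2}.
\]
% Main term: since |I| \ge K we have |I| + K \le 2|I|.
\[
(|I|K)^{1/2}(|I|+K)^{1/2}
 \leqslant(|I|K)^{1/2}(2|I|)^{1/2}
 =\sqrt{2}\,K^{-1/2}\,|I|K
 =\sqrt{2}\,C_{0}^{-1/2}r^{-4}\,|I|K .
\]
```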

If the constant C0C_{0} is chosen large enough, this means that ˜7.12 is small compared with the RHS of ˜7.10.

To summarise so far, we have replaced the β€˜small’ projections Ξ i,jsml\Pi^{\operatorname{sml}}_{i,j} in ˜7.10 by the β€˜larger’ ones Ξ i,jlrg\Pi^{\operatorname{lrg}}_{i,j} at the loss of only the quality of the implied constant, that is to say we have shown

βˆ‘i∈Iβˆ—βˆ‘j=jβ€²+2K𝔼n∈[N],pi,βˆ—βˆˆπ’«i,βˆ—log​Πi,jlrg​f​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)≫K​|I|r4.\sum_{i\in I_{*}}\sum_{j=j^{\prime}+2}^{K}\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}\Pi_{i,j}^{\operatorname{lrg}}f(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j})1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\gg\frac{K|I|}{r^{4}}.

To complete the proof of ˜7.11 (and hence of Theorem˜1.1) we now replace the copies of Ξ i,jlrg​f\Pi_{i,j}^{\operatorname{lrg}}f by ff itself. For this we can work one value of (i,j)(i,j) at a time; thus it is enough to show that, for each (i,j)(i,j),

𝔼n∈[N],pi,βˆ—βˆˆπ’«i,βˆ—log​(fβˆ’Ξ i,jlrg​f)​(n+bjb2​pi,jβ€²+1​⋯​pi,j)​1A​(bj​pi,jβ€²+1​⋯​pi,j​n)β©½rβˆ’4βˆ’Ξ΅0.\mathbb{E}^{\log}_{n\in[N],p_{i,*}\in\mathscr{P}_{i,*}}\big(f-\Pi_{i,j}^{\operatorname{lrg}}f\big)(n+\frac{b_{j}}{b^{2}}p_{i,j^{\prime}+1}\cdots p_{i,j})1_{A}(b_{j}p_{i,j^{\prime}+1}\cdots p_{i,j}n)\leqslant r^{-4-\varepsilon_{0}}. (7.16)

(Here $\varepsilon_{0}=\frac{1}{10}$ again.) To prove this we use Proposition 5.1. Indeed, we note that the LHS of (7.16) is of the form

𝔼n∈[N],pβˆˆπ’«,pβ€²βˆˆπ’«β€²log​f1​(n+λ​p​pβ€²)​f2​(λ​n​p​pβ€²).\mathbb{E}_{n\in[N],p\in\mathscr{P},p^{\prime}\in\mathscr{P}^{\prime}}^{\log}f_{1}(n+\lambda pp^{\prime})f_{2}(\lambda npp^{\prime}).

This is exactly the expression in (5.1), where $f_{1}=f-\Pi_{Q_{j},H_{i,0}}f$, $f_{2}(n)=1_{A}(b^{2}n)$, $\lambda=b_{j}/b^{2}$, $\mathscr{P}=\mathscr{P}_{i,j}$, $\mathscr{P}^{\prime}=\mathscr{P}_{i,j^{\prime}+1}\cdots\mathscr{P}_{i,j-1}$ and $k=j-j^{\prime}-1\in\mathbf{N}$.

Note here that every element of 𝒫′\mathscr{P}^{\prime} has just one representation in this product since all primes in 𝒫i,jβ€²+1\mathscr{P}_{i,j^{\prime}+1} are much smaller than those in 𝒫i,jβ€²+2\mathscr{P}_{i,j^{\prime}+2}, and so on, and so 𝔼pi,jβ€²+1βˆˆπ’«i,jβ€²+1,…,pi,jβˆ’1βˆˆπ’«i,jβˆ’1log\mathbb{E}^{\log}_{p_{i,j^{\prime}+1}\in\mathscr{P}_{i,j^{\prime}+1},\dots,p_{i,j-1}\in\mathscr{P}_{i,j-1}} is the same thing as 𝔼pβ€²βˆˆπ’«β€²log\mathbb{E}^{\log}_{p^{\prime}\in\mathscr{P}^{\prime}}.

The setup for the application of Proposition˜5.1 requires some discussion. We address the various requirements in the statement of that proposition in turn.

  • The parameter $k$ will be $j-j^{\prime}-1$. Note $1\leqslant k\leqslant K$, so the condition $k\leqslant\log\log N$ is satisfied due to the choices (7.1).

  • We will take $\delta:=\lceil r^{4+\varepsilon_{0}}\rceil^{-1}$ (the aim being to show that the LHS of (7.16) is at most $\delta$). The conditions $1/\delta\leqslant\log\log N$ and $k\leqslant\delta^{-10}$ are then immediately checked.

  • We take $\mathscr{P}=\mathscr{P}_{i,j}$ and $\mathscr{P}^{\prime}=\mathscr{P}_{i,j^{\prime}+1}\cdots\mathscr{P}_{i,j-1}$. For notational consistency with Proposition 5.1, write $\mathscr{P}^{\prime}_{\ell}:=\mathscr{P}_{i,j^{\prime}+\ell}$ for $\ell\in[k]$. Thus, by definition, $\mathscr{P}^{\prime}_{\ell}$ is the set of primes in the interval $I_{\ell}=[H_{i,2(j^{\prime}+\ell)-1},H_{i,2(j^{\prime}+\ell)}]$, which is exactly the situation in Proposition 5.1. By (7.3) and the choice of parameters we have $\log\log(\max(I_{\ell}))-\log\log(\min(I_{\ell}))\geqslant r^{25}>k\delta^{-4-\varepsilon_{0}}$. (This is essentially the ‘pinch point’ for the analysis; for the main result to have the stated exponent of 50 we need $(4+\varepsilon_{0})^{2}<17$ here.)

  • We take $P_{1}=H_{i,2j-1}$, $P_{2}=H_{i,2j}$. The condition $P_{2}<\exp((\log N)^{1/4})$ is implied by (7.4), if $C_{2}$ is large enough.

  • We take $P^{\prime}_{1}=H_{i,2j^{\prime}+1}$, $P^{\prime}_{2}=H^{2}_{i,2j-2}$. Note here that $\min(\mathscr{P}^{\prime})\geqslant P^{\prime}_{1}$ and $\max(\mathscr{P}^{\prime})\leqslant H_{i,2j^{\prime}+2}\cdots H_{i,2j-2}\leqslant P^{\prime}_{2}$, as required, using here that $H_{i,j}^{2}<H_{i,j+1}$. The condition $P^{\prime}_{2}\geqslant\exp\exp((\log\log N)^{1/10})$ follows immediately from (7.4).

  • The condition $\lambda\leqslant e^{(\log N)^{1/4}}$ follows from (7.4) and the fact that $\lambda\leqslant\max B_{0}$.

  • That all prime factors of $\lambda$ are less than $P^{\prime}_{1}$ is immediate from the lower bound $H_{1,1}>\max B_{0}$.
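The ‘pinch point’ inequality in the third item above can be checked by counting exponents of $r$. The following is a sketch using only $k\leqslant K=C_{0}r^{8}$, $\varepsilon_{0}=\frac{1}{10}$ and the crude bound $\lceil r^{4+\varepsilon_{0}}\rceil\leqslant 2r^{4+\varepsilon_{0}}$.

```latex
\[
k\,\delta^{-4-\varepsilon_{0}}
 \leqslant C_{0}r^{8}\cdot\big(2r^{4+\varepsilon_{0}}\big)^{4+\varepsilon_{0}}
 = 2^{4.1}\,C_{0}\,r^{8+(4.1)^{2}}
 = 2^{4.1}\,C_{0}\,r^{24.81}
 < r^{25}
\]
% for r sufficiently large in terms of C_0. The exponent 8 + (4+\varepsilon_0)^2
% stays below 25 precisely when (4+\varepsilon_0)^2 < 17.
```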

Suppose that ˜7.16 does not hold. By the above discussion we are in a position to apply Proposition˜5.1. Note that VV in the conclusion there is, with our choice of parameters, exactly the same as VV in ˜7.1. Since ⌊P11/8βŒ‹β©ΎβŒŠHi,11/8βŒ‹>Hi,02\lfloor P_{1}^{1/8}\rfloor\geqslant\lfloor H_{i,1}^{1/8}\rfloor>H_{i,0}^{2}, we may take the parameter HH in Proposition˜5.1 to be Hi,02H_{i,0}^{2}. The conclusion of Proposition˜5.1 is then that

β€–fβˆ’Ξ Qj,Hi,0​fβ€–Ulog1​[N;Qj,Hi,02]≫Kβˆ’O​(1).\|f-\Pi_{Q_{j},H_{i,0}}f\|_{U^{1}_{\log}[N;Q_{j},H_{i,0}^{2}]}\gg K^{-O(1)}.

(Here we observed from the various definitions that Qj=λ​VQ_{j}=\lambda V.) However, this is contrary to Lemma˜6.3, which asserts that the LHS is β‰ͺHi,0βˆ’1\ll H_{i,0}^{-1}, which is enormously smaller. This contradiction shows that we indeed have ˜7.16, and all of the required statements are proven.

8. Further remarks

We end the main body of the paper with a series of remarks regarding the bounds obtained for the pattern {x+y,x​y}\{x+y,xy\} and related patterns.

First of all, we comment that there are two different ways in which the double exponential bound in the main theorem seems hard to improve using anything like the methods of this paper. The first is that it seems difficult to avoid the need to define a highly divisible set such as the set B0B_{0} in ˜7.2, and any such definition seems to immediately lead to elements of double exponential size in rr. Second, the hierarchy of scales ˜7.4 needed to be chosen with log⁑log⁑(Hi,j+1)βˆ’log⁑log⁑(Hi,j)≫1\log\log(H_{i,j+1})-\log\log(H_{i,j})\gg 1 in order that the primes in this range satisfy βˆ‘pβˆˆπ’«1p≫1\sum_{p\in\mathscr{P}}\frac{1}{p}\gg 1, which is crucial in the application of Proposition˜5.1. It is possible to show using arguments somewhat related to those in [Tao24] that one cannot do appreciably better by choosing an alternative set to the primes. In particular, when applying Lemma˜A.5 with an alternate set of integers 𝒫\mathscr{P}, the error term is dominated by γ​(𝒫)1/2\gamma(\mathscr{P})^{1/2} and one can prove that for any set π’«βŠ†[2,X]\mathscr{P}\subseteq[2,X] one has γ​(𝒫)≫(log⁑log⁑X)βˆ’1\gamma(\mathscr{P})\gg(\log\log X)^{-1}.

Next we make some comments on the potential for extending the underlying analytic method to handle the pattern {x,x+y,x​y}\{x,x+y,xy\} (for which partition regularity was established by Moreira [Mor17], but with essentially no bounds). Presumably any such approach would require one to (at least) establish an inverse theorem establishing some structure assuming that

|𝔼n∈[N],pβˆˆπ’«log​f1​(n)​f2​(n+p)​f3​(n​p)|β©ΎΞ΄,\big|\mathbb{E}_{n\in[N],p\in\mathscr{P}}^{\log}f_{1}(n)f_{2}(n+p)f_{3}(np)\big|\geqslant\delta, (8.1)

where $\mathscr{P}$ is a suitable set of almost primes (compare here with (5.1)). The following two rather different examples suggest this may be far from straightforward.

  • Suppose first that $f_{2}=1$. Let $(\xi_{p})_{p\in\mathscr{P}}$ be an arbitrary sequence of unit complex numbers, and define $f_{1}(n):=\xi_{p}$ if $p$ is the least prime in $\mathscr{P}$ which divides $n$, and $f_{1}(n)=0$ otherwise. Set $f_{3}(n):=\overline{f_{1}(n)}$. Assuming that $\sum_{p\in\mathscr{P}}\frac{1}{p}\ggg 1$, the (logarithmic) proportion of $n$ for which $f_{1}(n)=0$ is negligible. Now observe that $f_{1}(n)f_{3}(pn)=1$ if the least prime factor of $n$ in $\mathscr{P}$ is less than $p$. On average over $p,n$, one expects this to happen half the time. If, on the other hand, the least prime factor of $n$ is $p^{\prime}>p$ then we have $f_{1}(n)f_{3}(pn)=\xi_{p^{\prime}}\overline{\xi_{p}}$, and typically we expect cancellation of this when summed over $p,p^{\prime}$. Examples of this type therefore give (8.1) with $\delta\approx 1/2$, but with $f_{1},f_{3}$ only having rather weak structure.

  • Now suppose that $f_{1}(n)=e(\alpha n^{2})$, $f_{2}(n)=e(-\alpha n^{2})$ and $f_{3}(n)=e(2\alpha n)$ for some $\alpha\in\mathbf{R}$. One may then observe that $f_{1}(n)f_{2}(n+p)f_{3}(np)=e(-\alpha p^{2})$. If $P$ is the scale of $\mathscr{P}$ then this is $\approx 1$ for $|\alpha|\lessapprox P^{-2}$.
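The identity in the second example is a short phase computation, written out here assuming the standard convention $e(\theta)=e^{2\pi i\theta}$.

```latex
\begin{align*}
f_{1}(n)f_{2}(n+p)f_{3}(np)
 &= e\big(\alpha n^{2}-\alpha(n+p)^{2}+2\alpha np\big) \\
 &= e\big(\alpha n^{2}-\alpha n^{2}-2\alpha np-\alpha p^{2}+2\alpha np\big) \\
 &= e(-\alpha p^{2}).
\end{align*}
```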

Even with an inverse theorem for (8.1) in hand, it is far from clear how the other arguments of the paper might be modified.

Appendix A Properties of averages

In this appendix we assemble simple properties of (mostly) logarithmic averages. Throughout the appendix we assume $N\geqslant 2$ to avoid trivialities. For $m\in\mathbf{R}_{\geqslant 1}$, $H_{m}$ denotes the harmonic sum $\sum_{n\leqslant m}\frac{1}{n}$; we do not require $m$ to be an integer. The first lemma concerns the behaviour of averages (both uniform and logarithmic) under shifts.

Lemma A.1.

Let $f:\mathbf{N}\rightarrow\mathbf{C}$ be a 1-bounded function and let $h\in\mathbf{Z}$. Then

$$\big|\mathbb{E}_{n\in[N]}f(n)-\mathbb{E}_{n\in[N]}f(n+h)\big|\ll\frac{|h|}{N} \tag{A.1}$$

and, if $h\neq 0$,

$$\big|\mathbb{E}^{\log}_{n\in[N]}f(n)-\mathbb{E}^{\log}_{n\in[N]}f(n+h)\big|\ll\frac{1+\log|h|}{\log N}. \tag{A.2}$$
Proof.

(A.1) is straightforward. For (A.2), we may suppose $|h|\leqslant N/2$, else the result is trivial. Without loss of generality we may suppose $h$ is positive, since the case $h$ negative follows from the positive case. We have

$$\sum_{n\in[N]}\frac{f(n+h)}{n}-\sum_{n\in[N]}\frac{f(n)}{n}=\sum_{m=h+1}^{N}\Big(\frac{f(m)}{m-h}-\frac{f(m)}{m}\Big)-\sum_{n=1}^{h}\frac{f(n)}{n}+\sum_{n=N-h+1}^{N}\frac{f(n+h)}{n}. \tag{A.3}$$

The second sum on the right is $\ll 1+\log h$, whilst the third is $\leqslant\log N-\log(N-h+1)+O(1)\ll 1$ since $h\leqslant N/2$. Finally, the first sum on the right is bounded above by $h\sum_{m=h+1}^{N}\frac{1}{m(m-h)}$. Since $\sum_{m=h+1}^{2h}\frac{1}{m(m-h)}\ll\frac{1}{h}\sum_{m=h+1}^{2h}\frac{1}{m-h}\ll\frac{1+\log h}{h}$, and $\sum_{m=2h+1}^{N}\frac{1}{m(m-h)}\ll\sum_{m>2h}m^{-2}\ll h^{-1}$, the first sum on the right in (A.3) is bounded by $\ll 1+\log h$. Putting all this together, the result follows. ∎
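The rearrangement (A.3) is purely combinatorial, so it can be confirmed with exact arithmetic. The sketch below does this for a hypothetical 1-bounded test function `f` (an arbitrary choice made only for illustration) and a few values of $N$ and $h$ with $1\leqslant h\leqslant N/2$; the names `lhs` and `rhs` are likewise assumptions.

```python
from fractions import Fraction

def f(n):
    # Hypothetical 1-bounded test function with values in {-1, 0, 1}.
    return (7 * n + n // 3) % 3 - 1

def lhs(N, h):
    """Left-hand side of (A.3): difference of the shifted and unshifted weighted sums."""
    return (sum(Fraction(f(n + h), n) for n in range(1, N + 1))
            - sum(Fraction(f(n), n) for n in range(1, N + 1)))

def rhs(N, h):
    """Right-hand side of (A.3): the three sums, combined with the same signs."""
    first = sum(Fraction(f(m), m - h) - Fraction(f(m), m) for m in range(h + 1, N + 1))
    second = sum(Fraction(f(n), n) for n in range(1, h + 1))
    third = sum(Fraction(f(n + h), n) for n in range(N - h + 1, N + 1))
    return first - second + third

for N, h in [(10, 1), (50, 5), (200, 17)]:
    print(N, h, lhs(N, h) == rhs(N, h))
```

The check says nothing about the size of the three sums; those bounds are the analytic content of the proof.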

Next we give a result about splitting into residue classes.

Lemma A.2.

Let $f:\mathbf{Z}\rightarrow\mathbf{C}$ be $1$-bounded. Let $q\in\mathbf{N}$. Then

$$\mathbb{E}_{a\in\{0,1,\dots,q-1\}}\mathbb{E}_{n\in[N]}^{\log}f(qn+a)=\mathbb{E}_{n\in[N]}^{\log}f(n)+O\Big(\frac{1+\log q}{\log N}\Big).$$
Proof.

We may suppose $2\leqslant q\leqslant N$ since the result is trivial otherwise. The LHS may be expanded as

$$\frac{1}{H_{N}}\sum_{a\in\{0,1,\dots,q-1\}}\sum_{n\in[N]}\frac{f(qn+a)}{qn}.$$

The change if we replace $qn$ in the denominator by $qn+a$ is bounded above by

$$\ll\frac{1}{\log N}\sup_{a}\sum_{n\in[N]}\Big|\frac{1}{n}-\frac{1}{n+a/q}\Big|\ll\frac{1}{\log N},$$

which is acceptable. If we make this change, the resulting expression is

$$\frac{1}{H_{N}}\sum_{q\leqslant n^{\prime}\leqslant qN+(q-1)}\frac{f(n^{\prime})}{n^{\prime}}=\mathbb{E}_{n\in[N]}^{\log}f(n)+O\Big(\frac{H_{q}}{H_{N}}\Big)+O\Big(\frac{H_{qN+(q-1)}-H_{N}}{H_{N}}\Big).$$

The two error terms are $\ll\frac{1+\log q}{\log N}$, and this concludes the proof. ∎

We also need the following related result.

Lemma A.3.

Let $q,b$ be coprime positive integers and let $H$ be a further positive integer parameter. Let $f:\mathbf{N}\rightarrow\mathbf{C}$ be a 1-bounded function. Then

$$\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h\in[H]}f(qn+bh)=\mathbb{E}^{\log}_{n\in[N]}f(n)+O\Big(\frac{1+\log q+\log bH}{\log N}\Big)+O\Big(\frac{q}{H}\Big).$$
Proof.

Clearly we may assume that $H\geqslant q$, as the result is trivial otherwise. If we replace $H$ by $\tilde{H}:=q\lfloor H/q\rfloor$, the LHS changes by at most $O(q/H)$. It therefore suffices to consider the case $q\mid H$. In this case we establish the result without the $O(q/H)$ error term. We have $f(qn+bh)=f(q(n+\sigma_{h})+(bh)_{q})$, where $(bh)_{q}$ denotes the unique element of $\{0,1,\dots,q-1\}$ congruent to $bh\ (\operatorname{mod}\,q)$, and $\sigma_{h}:=\frac{1}{q}(bh-(bh)_{q})$. By (A.2), we have

$$\mathbb{E}^{\log}_{n\in[N]}f(q(n+\sigma_{h})+(bh)_{q})=\mathbb{E}^{\log}_{n\in[N]}f(qn+(bh)_{q})+O\Big(\frac{1+\log bH}{\log N}\Big).$$

However, since $b$ is coprime to $q$, $(bh)_{q}$ takes each value in $\{0,1,\dots,q-1\}$ exactly once as $h$ ranges over any interval of length $q$, and $[H]$ is a disjoint union of $H/q$ such intervals. Hence

$$\mathbb{E}_{h\in[H]}\mathbb{E}^{\log}_{n\in[N]}f(qn+(bh)_{q})=\mathbb{E}_{a\in\{0,1,\dots,q-1\}}\mathbb{E}^{\log}_{n\in[N]}f(qn+a).$$

The result now follows from Lemma A.2. ∎
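The equidistribution step in the proof above (that $(bh)_{q}$ is uniform on residues when $q\mid H$ and $(b,q)=1$) is easy to confirm on small cases. The following sketch is illustrative only; the function name `residue_counts` and the parameters $q=12$, $b=5$, $H=60$ are assumptions chosen for the example.

```python
from collections import Counter
from math import gcd

def residue_counts(q, b, H):
    """Count how often each residue (b*h mod q) occurs for h in [H] = {1, ..., H}."""
    if gcd(q, b) != 1 or H % q != 0:
        raise ValueError("this sketch assumes gcd(q, b) = 1 and q | H")
    return Counter((b * h) % q for h in range(1, H + 1))

# Illustrative parameters (assumptions): every residue should occur H/q times.
counts = residue_counts(q=12, b=5, H=60)
print(sorted(counts.items()))
```

Each residue class modulo $q$ is hit exactly $H/q$ times, which is what allows the average over $h\in[H]$ to be replaced by an average over $a\in\{0,1,\dots,q-1\}$.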

The next result states that logarithmic averages are essentially preserved under dilations. This is standard and appears, for instance, as [Ric25, Lemma 2.1].

Lemma A.4.

Let $f:\mathbf{N}\to\mathbf{C}$ be $1$-bounded and let $q\in\mathbf{N}$. Then

$$\Big|\mathbb{E}_{n\in[N]}^{\log}\big(f(n)-q\mathbf{1}_{q|n}f(n/q)\big)\Big|\ll\frac{\log q}{\log N}.$$
Proof.

When $q=1$ the result is trivial, so suppose $q\geqslant 2$. By definition,

$$\mathbb{E}_{n\in[N]}^{\log}\big(f(n)-q\mathbf{1}_{q|n}f(n/q)\big)=\frac{1}{H_{N}}\sum_{n\in[N]}\frac{f(n)-q\mathbf{1}_{q|n}f(n/q)}{n}=\frac{1}{H_{N}}\Big(\sum_{n\in[N]}\frac{f(n)}{n}-\sum_{n^{\prime}\in[N/q]}\frac{f(n^{\prime})}{n^{\prime}}\Big).$$

This is bounded by $\leqslant\frac{1}{H_{N}}\big(H_{N}-H_{N/q}\big)=\frac{1}{H_{N}}(\log q+O(1))$, and the result follows. ∎
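The substitution $n=qn^{\prime}$ used in the proof above can be checked exactly on small cases. In the sketch below the test function `f` and the names `dilation_defect`, `tail` and `harmonic` are hypothetical, introduced only for illustration. It confirms that the weighted sum of $f(n)-q\mathbf{1}_{q|n}f(n/q)$ collapses to $\sum_{N/q<n\leqslant N}f(n)/n$, and that this is at most $H_{N}-H_{N/q}$ in absolute value.

```python
from fractions import Fraction

def f(n):
    # Hypothetical 1-bounded test function with values in {-1, 0, 1}.
    return (5 * n + n // 4) % 3 - 1

def dilation_defect(N, q):
    """sum over n <= N of (f(n) - q 1_{q|n} f(n/q)) / n, computed exactly."""
    total = Fraction(0)
    for n in range(1, N + 1):
        term = f(n)
        if n % q == 0:
            term -= q * f(n // q)
        total += Fraction(term, n)
    return total

def tail(N, q):
    """sum over N/q < n <= N of f(n)/n, the form reached after substituting n = q n'."""
    return sum((Fraction(f(n), n) for n in range(N // q + 1, N + 1)), Fraction(0))

def harmonic(m):
    """Harmonic sum H_m for a non-negative integer m."""
    return sum((Fraction(1, n) for n in range(1, m + 1)), Fraction(0))

for N, q in [(60, 2), (100, 7), (200, 13)]:
    d = dilation_defect(N, q)
    print(N, q, d == tail(N, q), abs(d) <= harmonic(N) - harmonic(N // q))
```

Here $H_{N/q}$ is computed as $H_{\lfloor N/q\rfloor}$, in line with the convention that the index of the harmonic sum need not be an integer.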

We next require the logarithmic version of Elliott’s inequality. The proof is exactly that given in [Ric25, Corollary 2.3] modulo tracking error terms.

Lemma A.5.

Let $\mathscr{P}$ be a finite set of primes, all bounded by $P$. Let $f:\mathbf{N}\to\mathbf{C}$ be $1$-bounded. We have that

$$\Big|\mathbb{E}_{n\in[N]}^{\log}f(n)-\mathbb{E}_{n\in[N],p\in\mathscr{P}}^{\log}f(pn)\Big|\ll\frac{\log P}{\log N}+\Big(\sum_{p\in\mathscr{P}}\frac{1}{p}\Big)^{-1/2}.$$
Proof.

By Lemma A.4 applied with $\tilde{f}(n):=f(pn)$, for each $p\in\mathscr{P}$ we have

$$\mathbb{E}_{n\in[N]}^{\log}f(pn)=p\,\mathbb{E}^{\log}_{n\in[N]}\mathbf{1}_{p|n}f(n)+O\Big(\frac{\log P}{\log N}\Big).$$

Therefore by Cauchy–Schwarz we have

$$\begin{aligned}\Big|\mathbb{E}_{n\in[N]}^{\log}f(n)-\mathbb{E}_{n\in[N],p\in\mathscr{P}}^{\log}f(pn)\Big| &=\Big|\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{p\in\mathscr{P}}^{\log}f(n)(p\mathbf{1}_{p|n}-1)\Big|+O\Big(\frac{\log P}{\log N}\Big)\\ &\leqslant\Big(\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{p\in\mathscr{P}}^{\log}(p\mathbf{1}_{p|n}-1)\big|^{2}\Big)^{1/2}+O\Big(\frac{\log P}{\log N}\Big).\end{aligned}$$

By [Ric25, Proposition 2.2], we have that

$$\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{p\in\mathscr{P}}^{\log}(p\mathbf{1}_{p|n}-1)\big|^{2}\leqslant 9\Big(\sum_{p\in\mathscr{P}}\frac{1}{p}\Big)^{-1},$$

and the result follows. ∎

We end with a lemma regarding the behaviour of $U^{1}_{\log}[N;q,H]$ under replacing $q$ by a multiple or shrinking the interval length $H$.

Lemma A.6.

Suppose that $q\mid\tilde{q}$ and that $\tilde{H}\tilde{q} < Hq < N/2$. Then

$$\|f\|_{U^{1}_{\log}[N;q,H]}\leqslant\|f\|_{U^{1}_{\log}[N;\tilde{q},\tilde{H}]}+O\Big(\frac{\log|Hq|}{\log N}\Big)+O\Big(\frac{\tilde{H}\tilde{q}}{Hq}\Big).$$
Proof.

First observe that

$$\mathbb{E}_{h\in[H]}f(n+hq)=\mathbb{E}_{h\in[H],\tilde{h}\in[\tilde{H}]}f(n+hq+\tilde{h}\tilde{q})+O\Big(\frac{\tilde{H}\tilde{q}}{Hq}\Big)$$

by the assumptions and (A.1). Substituting into the definition of $\|f\|_{U^{1}_{\log}[N;q,H]}$, we have

$$\|f\|_{U^{1}_{\log}[N;q,H]}^{2}=\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{h\in[H],\tilde{h}\in[\tilde{H}]}f(n+hq+\tilde{h}\tilde{q})\big|^{2}+O\Big(\frac{\tilde{H}\tilde{q}}{Hq}\Big).$$

By Cauchy–Schwarz,

$$\|f\|_{U^{1}_{\log}[N;q,H]}^{2}\leqslant\mathbb{E}_{n\in[N]}^{\log}\mathbb{E}_{h\in[H]}\big|\mathbb{E}_{\tilde{h}\in[\tilde{H}]}f(n+hq+\tilde{h}\tilde{q})\big|^{2}+O\Big(\frac{\tilde{H}\tilde{q}}{Hq}\Big).$$

However by (A.2), for each $h$ we have

$$\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{\tilde{h}\in[\tilde{H}]}f(n+hq+\tilde{h}\tilde{q})\big|^{2}=\mathbb{E}_{n\in[N]}^{\log}\big|\mathbb{E}_{\tilde{h}\in[\tilde{H}]}f(n+\tilde{h}\tilde{q})\big|^{2}+O\Big(\frac{\log|hq|}{\log N}\Big).$$

Averaging over $h\in[H]$ gives the result. ∎

Appendix B An exponential sum estimate over the primes

In this appendix we prove a log-free exponential sum estimate for the von Mangoldt function with polynomial phase.

Lemma B.1.

Let $m\in\mathbf{N}$ and $\varepsilon\in(0,\frac{1}{2})$. Suppose that

$$\Big|\sum_{n\leqslant X}\Lambda(n)e(n^{m}\theta)\Big|\geqslant\varepsilon X. \tag{B.1}$$

Then there is some $q\in\mathbf{N}$ such that

$$q\leqslant\varepsilon^{-O_{m}(1)}\quad\mbox{and}\quad\|\theta q\|_{\mathbf{R}/\mathbf{Z}}\leqslant\varepsilon^{-O_{m}(1)}X^{-m}. \tag{B.2}$$
Proof.

We proceed via the weaker result with (B.2) replaced by

$$q\leqslant\Big(\frac{\log X}{\varepsilon}\Big)^{O_{m}(1)}\quad\mbox{and}\quad\|\theta q\|_{\mathbf{R}/\mathbf{Z}}\leqslant\Big(\frac{\log X}{\varepsilon}\Big)^{O_{m}(1)}X^{-m}. \tag{B.3}$$

This is a standard application of the method of Type I/II sums. However, some sources in the literature such as [Har81] lose factors of $X^{o(1)}$ instead of a power of $\log X$ via an invocation of the divisor bound in the proof of Weyl’s inequality. This loss can be avoided with a little care, but it is hard to find a convenient source in the literature. One may find an essentially equivalent argument (with the polynomial phase $e(n^{m}\theta)$ replaced by a general nilsequence) in [GT-mobius]. The key point is that [GT-mobius, Proposition 3.1] holds verbatim if the Möbius function $\mu$ is replaced by $\Lambda$. This is established in a standard fashion as in the proof of [GT-mobius, Proposition 3.1] (which is outsourced to [GT08, Section 4], which itself is derivative of standard expositions such as [IK-book, Chapter 13]) by using Vaughan’s identity for $\Lambda$ rather than the variant for $\mu$. One may now run the arguments of [GT-mobius, Section 3]; in this context most of the language of nilmanifolds is redundant since $e(n^{m}\theta)$ is a nilsequence on the abelian torus $\mathbf{R}/\mathbf{Z}$. In particular the ‘complexity’ parameter $Q$ is simply $O(1)$. The conclusion of [GT-mobius, Section 3] is then that, starting from (B.1), and setting $\delta:=\varepsilon/\log X$, there is some $q\ll_{m}\delta^{-O_{m}(1)}$ such that we have $\|\theta q\|_{\mathbf{R}/\mathbf{Z}}\ll_{m}\delta^{-O_{m}(1)}X^{-m}$; this is exactly (B.3) (noting here that the $\ll$ can be upgraded to $\leqslant$ at the expense of worsening exponents since $\delta\lll 1$).

If $\varepsilon\geqslant(\log X)^{-1}$ then (B.3) immediately implies (B.2) (after adjusting the exponents $O_{m}(1)$). To complete the proof of Lemma B.1, it therefore suffices to handle the case $\varepsilon\leqslant(\log X)^{-1}$. In this case, from (B.3) we certainly have $q\leqslant(\log X)^{O_{m}(1)}$ and $\|\theta q\|_{\mathbf{R}/\mathbf{Z}}\leqslant(\log X)^{O_{m}(1)}X^{-m}$. One can then obtain an asymptotic for the exponential sum $\sum_{n\leqslant X}\Lambda(n)e(n^{m}\theta)$ using the Siegel–Walfisz theorem on the distribution of $\Lambda$ in progressions $(\operatorname{mod}\,q)$. These arguments are carried out in detail in work of Hua [Hua38]. Summarising briefly, the main term of this asymptotic at $\theta=\frac{a}{q}+\eta$ will be $\frac{X}{\phi(q)}S(a,q)\nu(\eta X^{m})$ where $\nu(y)=\int^{1}_{0}e(yx^{m})\,dx$ satisfies an appropriate van der Corput estimate and $S(a,q):=\sum_{b\in(\mathbf{Z}/q\mathbf{Z})^{*}}e(ab^{m}/q)$ satisfies $|S(a,q)|\ll_{m}q^{1/2+o_{m}(1)}$. The assumption (B.1) therefore forces both $\eta X^{m}\ll\varepsilon^{-1}$ and $q\ll_{m}\varepsilon^{-2-o_{m}(1)}$. ∎

Finally we give the case $k=1$ of (a slight generalisation of) Lemma 3.2, which was used in the proof of the case $k\geqslant 2$ of that result. This can be quickly deduced from Lemma B.1 as a consequence of partial summation.

Lemma B.2.

Let $m\in\mathbf{N}$ and $\delta,\eta\in(0,\frac{1}{2})$. Suppose that

$$\Big|\sum_{X\leqslant p<(1+\eta)X}e(p^{m}\theta)\Big|\geqslant\frac{\delta\eta X}{\log X}. \tag{B.4}$$

Then there is some $q\in\mathbf{N}$ such that $q\leqslant(\eta\delta)^{-O_{m}(1)}$ and $\|\theta q\|_{\mathbf{R}/\mathbf{Z}}\leqslant(\eta\delta)^{-O_{m}(1)}X^{-m}$.

Proof.

The result is trivial if $\delta\eta\leqslant X^{-1/10}$ (say), so suppose this is not the case. We may replace the assumption (B.4) by

$$\Big|\sum_{X\leqslant n<(1+\eta)X}\frac{\Lambda(n)}{\log n}e(n^{m}\theta)\Big|\geqslant\frac{\delta\eta X}{2\log X}.$$

(The loss of a further factor of 2 here comes from the essentially negligible contribution of the prime power support of $\Lambda$.) Now for $n\geqslant X$ we have $\frac{1}{\log n}=\frac{1}{\log X}-\int^{n}_{X}\frac{dt}{t(\log t)^{2}}$. Substituting in and applying the triangle inequality gives

$$\frac{1}{\log X}\Big|\sum_{X\leqslant n<(1+\eta)X}\Lambda(n)e(n^{m}\theta)\Big|+\int^{2X}_{X}\frac{dt}{t(\log t)^{2}}\Big|\sum_{t\leqslant n\leqslant(1+\eta)X}\Lambda(n)e(n^{m}\theta)\Big|\geqslant\frac{\delta\eta X}{2\log X}.$$

By further applications of the triangle inequality and $\int^{2X}_{X}\frac{dt}{t(\log t)^{2}}\leqslant\frac{1}{\log X}$ it follows that

$$\sup_{Y\in[X,2X]}\Big|\sum_{n\leqslant Y}\Lambda(n)e(n^{m}\theta)\Big|\geqslant\delta\eta X/16.$$

The desired conclusion (3.2) now follows from Lemma B.1. ∎
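The partial summation step in the proof above rests on the identity $\frac{1}{\log n}=\frac{1}{\log X}-\int_{X}^{n}\frac{dt}{t(\log t)^{2}}$ and on the bound $\int_{X}^{2X}\frac{dt}{t(\log t)^{2}}\leqslant\frac{1}{\log X}$. The sketch below checks both numerically with a midpoint rule; the function names `integral` and `identity_gap`, the step count and the sample values $X=100$, $n=150$ are illustrative assumptions, and the quadrature is only an approximation.

```python
import math

def integral(X, n, steps=20000):
    """Midpoint-rule approximation to the integral of dt / (t (log t)^2) over [X, n]."""
    width = (n - X) / steps
    return sum(width / (t * math.log(t) ** 2)
               for t in (X + (k + 0.5) * width for k in range(steps)))

def identity_gap(X, n):
    """|1/log n - (1/log X - integral)|, which should vanish up to quadrature error."""
    return abs(1 / math.log(n) - (1 / math.log(X) - integral(X, n)))

# Illustrative sample point (an assumption): X = 100, n = 150.
print(identity_gap(100.0, 150.0))
print(integral(100.0, 200.0), 1 / math.log(100.0))
```

The identity is just the fundamental theorem of calculus applied to $t\mapsto 1/\log t$, whose derivative is $-1/(t(\log t)^{2})$.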

B.1. Effectivity

The proof outline above for Lemma B.1 gives ineffective bounds due to the invocation of the Siegel–Walfisz theorem in Hua’s work. However, one can replace this with a version of the prime number theorem in progressions incorporating an additional correction term for a potential Siegel zero such as [IK-book, Equation (5.71)]. Specifically, for $q\leqslant e^{\sqrt{\log X}}$ and $(b,q)=1$ we have

$$\sum_{n\leqslant X:\,n\equiv b\ (\operatorname{mod}\,q)}\Lambda(n)=\frac{X}{\phi(q)}-\frac{\overline{\chi(b)}}{\phi(q)}\frac{X^{\beta}}{\beta}+O(Xe^{-c\sqrt{\log X}}),$$

where here $\chi$ is some quadratic Dirichlet character for which $L(s,\chi)$ has a Siegel zero $\beta$. The Siegel zero term introduces a secondary main term in Hua’s asymptotic formula, now of the form $-\frac{X^{\beta}}{\phi(q)}\tilde{S}(a,q)\tilde{\nu}(\eta X^{m})$, where $\tilde{S}(a,q)=\sum_{b\ (\operatorname{mod}\,q),\,(b,q)=1}\overline{\chi(b)}e(ab^{m}/q)$ and $\tilde{\nu}(y)=\int^{1}_{0}x^{\beta-1}e(x^{m}y)\,dx$. These terms satisfy similar estimates to $S,\nu$ in the Hua analysis, allowing us to draw an analogous conclusion.