Factorization and N^3LL_p+NNLO Predictions for the Higgs Cross Section with a Jet Veto

We have recently derived a factorization formula for the Higgs-boson production cross section in the presence of a jet veto, which allows for a systematic resummation of large Sudakov logarithms of the form alpha_s^n ln^m(p_T^veto/m_H), along with the large virtual corrections known to affect also the total cross section. Here we determine the ingredients entering this formula at two-loop accuracy. Specifically, we compute the dependence on the jet-radius parameter R, which is encoded in the two-loop coefficient of the collinear anomaly, by means of a direct, fully analytic calculation in the framework of soft-collinear effective theory. We confirm the result obtained by Banfi et al. from a related calculation in QCD, and demonstrate that factorization-breaking, soft-collinear mixing effects do not arise at leading power in p_T^veto/m_H, even for R=O(1). In addition, we extract the two-loop collinear beam functions numerically. We present detailed numerical predictions for the jet-veto cross section with partial next-to-next-to-next-to-leading logarithmic accuracy, matched to the next-to-next-to-leading order cross section in fixed-order perturbation theory. The only missing ingredients at this level of accuracy are the three-loop anomaly coefficient and the four-loop cusp anomalous dimension, whose numerical effects we estimate to be small.


Introduction
With firm evidence for a Higgs boson with a mass around m H = 125 GeV, the primary focus of particle physics has now shifted to the study of the properties of this new particle, in particular of its couplings. An important channel in this context is Higgs-boson production with subsequent decay into a W + W − pair, for which both ATLAS and CMS have recently reported 4σ evidence [1,2]. With a branching ratio of about 22%, this is the second largest decay channel of the Higgs boson. Because of the missing energy in the final state, the W + W − channel is not particularly well suited for a Higgs mass measurement, but it offers the possibility for a precise Higgs coupling measurement and spin studies. A challenge is posed by the large background from tt production, which, after the top-quarks decay, results in a W + W − pair in association with two b-quark jets. This background is significantly reduced by rejecting events containing jets with transverse momentum above a certain threshold p veto T , which is chosen around 25-30 GeV in current experimental analyses.
Imposing such a jet veto enhances the higher-order QCD corrections to the Higgs-boson production cross section by Sudakov logarithms of the form α n s ln m (p veto T /m H ), with m ≤ 2n. One might argue that these logarithms are not particularly large for the relevant values of p veto T ; however, even the total Higgs production rate suffers from large corrections, and additional enhancements can easily lead to unreliable predictions. For the fixed-order predictions of the production cross section with a jet veto, it was observed that there is a numerical cancellation between the negative corrections from Sudakov logarithms and the large positive virtual corrections to the total rate, which leads to artificially small scale uncertainties [3]. To avoid this cancellation and get a more reliable estimate of the theoretical uncertainties, it was subsequently proposed to add in quadrature the scale uncertainties of the total cross section and the cross section with one or more jets in the final state, which leads to an uncertainty of 17% on the Higgs production cross section with a jet veto [4]. This uncertainty is about twice as large as the experimental systematic errors and of the same size as the current statistical uncertainty. To make full use of the coming LHC data, the theoretical uncertainty should thus be reduced significantly.
There has been a lot of progress in the theoretical description of the Higgs-boson production rate with a jet veto over the past year, starting with the work [5], where it was shown that the Sudakov logarithms associated with the jet veto can be resummed at next-to-leading logarithmic (NLL) order. Subsequently, we have derived an all-order factorization theorem [6] using soft-collinear effective theory (SCET) [7][8][9][10], which allows for resummation to any desired accuracy, given the necessary perturbative input. We have also explicitly carried out the resummation at NNLL order. One of the necessary ingredients at this level of accuracy is the two-loop coefficient d veto 2 (R) of the so-called collinear anomaly [6,11], which we had extracted from partial NNLL results of [5] under the assumption that these results remained valid in the limit where the jet-radius parameter R is taken to infinity. It was subsequently shown by Banfi et al. that this assumption does not hold [12]. The limits m H → ∞ and R → ∞ do not commute, and taking the large-R limit naively one misses an R-independent term in d veto 2 (R). After correcting the value of the two-loop coefficient accordingly, there is full agreement between the NNLL results presented in [6] and [12].
The validity of factorization formula put forward in [6] was questioned by the authors of [13], who claimed that this formula breaks down unless the jet radius R is assumed to be parametrically small, such that R ∼ p veto T /m H ≪ 1. However, for small R the perturbative corrections to the cross section are enhanced by logarithms of the jet radius, and these logarithms cannot be resummed by means of the factorization formula obtained in [6]. Reference [13] concluded that one is "stuck between a rock and a hard place", because one would either face factorization-breaking corrections or large unresummed logarithms of R. However, immediately after the paper [13] appeared a NNLL resummation formula was published in [12], and it was verified numerically that it correctly predicts the relevant logarithms up to O(α 3 s ), even for R ∼ 1. To NNLL accuracy, our factorization formula precisely matches the result of [12].
The purpose of the present paper is three-fold. First, we will compute the two-loop anomaly coefficient d veto 2 (R) directly within the SCET framework. We find complete agreement with the QCD result of [5], which demonstrates explicitly and analytically that our factorization formula, which does not include soft-collinear mixing terms, is correct up to NNLL order. We provide a completely analytic result for the expansion of the two-loop anomaly coefficient in R, whose R-independent piece was only obtained in numerical form in previous papers [5]. Secondly, we will show that the soft-collinear mixing contributions obtained in [13] are absent if one ensures that the computation is done in such a way that there is no double counting among the different momentum regions in the effective theory. This double counting is avoided from the beginning if loop and phase-space integrals in the effective theory are properly expanded in the different momentum regions. If the jet measure is left unexpanded, as was done in [13], then non-zero soft-collinear mixing contributions can arise in individual integrals, but they cancel if the necessary subtractions are performed to remove the soft-collinear overlap regions from the integrals. We discuss these issues in detail, give arguments that factorization breaking will also not arise at higher logarithmic accuracy, and conclude that all the available evidence indicates that the factorization theorem proposed in [6] is valid to all orders. Thirdly, we present an updated and improved phenomenological analysis of the Higgs-boson production cross section with a jet veto. Since the two-loop anomaly coefficient turns out to be numerically large for the values of R used by the experimental collaborations, the predictions obtained at NNLL order still suffer from significant scale uncertainties. We show that all the ingredients required to increase the accuracy to N 3 LL order are either already known or can be extracted numerically, except for the three-loop anomaly coefficient and the four-loop cusp anomalous dimension. We estimate the effect of these missing coefficients and find that they only have a small numerical impact on the results. We thus obtain predictions with N 3 LL p accuracy, where the subscript "p" (for "partial") indicates that two of the ingredients for a complete N 3 LL calculation are yet unknown. We also include power-suppressed terms by matching our results to the NNLO fixed-order cross section, finding that these power corrections are numerically small. This indicates that the expansion about small p veto T is well behaved at the experimentally relevant values of the jet-veto scale.
Our paper is organized as follows: We first review in Section 2 the factorization theorem for the cross section with a jet veto and collect the necessary perturbative ingredients. In Section 3, we then discuss the clustering of particles in different momentum regions and show that factorization-breaking terms are absent. After this general discussion, we present the explicit calculation of the two-loop anomaly coefficient d veto 2 (R) in Section 4. The numerical extraction of the two-loop beam functions and the fixed-order matching are discussed in Section 5. With these ingredients at hand, we present in Section 6 our numerical results for the jet-veto cross section for Higgs production at the LHC. Our conclusions are summarized in Section 7. In the Appendix, we give some details on the analytic calculation of the two-loop anomaly coefficient as an expansion in the jet-radius parameter R.
2 Factorization theorem for the jet-veto cross section Using arguments based on SCET, we have shown in [6] that the Higgs-boson production cross section defined with a jet veto p jet T < p veto T can be factorized, to all orders in perturbation theory and at leading power in the small ratio p veto T /m H , in a way that separates the short-distance scales m t and m H from the scale p veto T of the jet veto. We work with the usual class of sequential recombination jet algorithms [15], with distance measure where n = 1 corresponds to the k T algorithm [16,17], n = 0 to the Cambridge-Aachen algorithm [18,19], and n = −1 to the anti-k T algorithm [20]. The two particles with the smallest distance are combined into a new "particle", whose momentum is the sum of the momenta of the parent particles. If the smallest distance is d iB , then particle i is considered a jet and removed from the list. The procedure is iterated until all particles are grouped into jets, i.e., the algorithm is inclusive. In the following, the jet-radius parameter is assumed to obey the inequalities and we work in the limit where λ = p veto T /m H is a small expansion parameter. Then these inequalities are satisfied as long as R is treated as an O(1) number, independent of λ. For too small values of R (meaning R ∼ λ or smaller), large logarithms ln n R arise, which would require a special treatment. These "clustering logarithms" have a complicated structure in higher orders [21,22], and it is currently not understood how to resum them. For too large R (meaning R ∼ ln(1/λ) or larger), on the other hand, the factorization formula breaks down.
The factorization formula is obtained by factorizing the contributions of hard, collinear, anti-collinear, and soft modes in SCET. Denoting by y the rapidity of the Higgs boson in the proton-proton center-of-mass frame, one first derives the preliminary result The Wilson coefficient C t = 1 + O(α s ) arises when one approximates the fermion-loop contribution to the gluon fusion amplitude by an effective, local Hgg operator, as is routinely done in calculations of the Higgs-boson production amplitude. The hard matching coefficient C S = 1 + O(α s ) appears when the scalar two-gluon operator is matched onto a corresponding operator in SCET [23]. Both coefficients are known to three-loop order in perturbation theory, but for our purposes we only need the two-loop expressions derived in [24,25] and [23,26], respectively. The resulting expressions can also be found in eqs. (12) and (17) of [23]. The emissions of (anti-)collinear and soft gluons, which are then grouped into jets according to the jet algorithm, are accounted for by the beam functions B c , Bc and the soft function S in the factorization theorem (3). Besides the veto scale, these functions also depend on the jet definition and in particular on the jet-radius parameter R. This dependence is suppressed in our notation. The collinear matrix element relevant for Higgs production reads [6] where A c⊥ denotes the gauge-invariant collinear gluon field in SCET. The matrix element in the second line is exactly the same as that entering the definition of the standard parton distribution function (PDF) for the gluon. The only difference is that the sum over intermediate states in (5) is constrained by the jet veto, whose effect is encoded in a "measurement function" M veto , which depends on the momenta {p c } of the particles in the final state. Likewise, the soft function is defined as with d R = N 2 c − 1. It involves Wilson lines of soft gluon fields in the adjoint representation, integrated along the beam directions n andn.
Like in the case of the transverse-position dependent PDFs studied in [11], the presence of a measurement function probing parton transverse momenta leads to additional light-cone (or rapidity) divergences, which are not regularized in dimensional regularization. The sums over collinear states X c in (5) and soft states X s in (6) are therefore regularized analytically. To this end, we use the phase-space regularization prescription of [27], which amounts to replacing the usual phase-space measure by where y = 1 2 ln(k + /k − ) = ln(k + /k T ) and k T = | k ⊥ |. The regularization softens the light-cone singularities arising in the evaluation of the matrix elements. It introduces a new scale ν, which plays an analogous role to the scale µ entering in dimensional regularization.
Once the light-cone singularities in the (anti-)collinear and soft functions have been regularized, they show up as poles in the analytic regulator α, which cancel in the product of the three matrix elements in (3). However, after the cancellation large logarithms of the scale ratio m H /p veto T arise, which need to be resummed to all orders in perturbation theory. This effect has been called the "collinear factorization anomaly" [11]. The resummation of the anomalous logarithms can be accomplished by means of solving simple differential equations, which express the fact that the product of the three functions must be regularization independent [6]. One finds that where the anomalous dependence on the hard scale m H is now explicit. Compared with [6], we have extracted a factor e h A (p veto T ,µ) from each collinear function, which is chosen such that the remaining functionB g (ξ, p veto T ) is renormalization-group (RG) invariant. We have also absorbed the square root of the soft function into the collinear matrix elements. (In the regularization scheme adopted here, S(p veto T , µ) = 1 to all orders in perturbation theory, so this last step is trivial.) The exponents F gg and h A obey the RG equations [11,28] where without loss of generality we can impose the normalization condition h A (p veto T , p veto T ) = 0. In (9), Γ A cusp is the cusp anomalous dimension in the adjoint representation, and γ g denotes the anomalous dimension of the collinear gluon field as defined in [29]. For our analysis we require the three-loop expression for the anomaly exponent F gg and the two-loop result for h A . Solving the evolution equations (9), we obtain where we have defined the abbreviations a s = α s (µ)/(4π) and L ⊥ = 2 ln(µ/p veto T ). The coefficients Γ A n , γ g n , and β n appear in the perturbative expansions of the anomalous dimensions and β-function, defined as As long as the veto scale p veto T is in the perturbative domain, one can match the beam functionB g appearing in (8) onto standard PDFs, which is accurate up to hadronic corrections suppressed by powers of Λ QCD /p veto T . The matching coefficients are connected by the simple rescaling relation to the functions I g←i (z, p veto T , µ) computed at one-loop order in [6]. We find where P (1) g←i (z) are the one-loop DGLAP splitting functions. The explicit one-loop calculations of F gg and I g←i performed in [6] show that (in the MS scheme) At two-loop order, the anomaly coefficient d veto 2 (R) can be extracted from results presented in [12]. One finds that where the expansion of f (R) for small R reads, in numerical form, 1 In the following section we will reproduce this expression based on a two-loop calculation in SCET, which relies on the structure of the factorization formula (3). The fact that we will reproduce the above expression exactly provides a non-trivial test of our factorization theorem at two-loop order. The three-loop coefficient d veto 3 (R) in (10) is presently still unknown and will be estimated in Section 4 below, where we will also extract the two-loop corrections to the beam functionsB g (ξ 1 , p veto T ) in (12) in numerical form. We can now rewrite the jet-veto cross section from (3) in the final, factorized form where we have introduced the RG-invariant hard function which contains all dependence on the short-distance scales m t and m H . The dependence on rapidity is carried only by the beam functionsB g (ξ 1,2 , p veto T ). Note that, due to the collinear anomaly, it is not possible to factorize the dependence on the jet-veto scale p veto T in the hard functionH. However, it is possible to resum all large logarithms in the ratio m H /p veto T consistently, to all orders in perturbation theory. To this end, one chooses a low factorization scale µ ∼ p veto T in the factorization formula (18). Then the kernel functionsĪ g←i required to compute the beam functionB g can be calculated in fixed-order perturbation theory. Likewise, the fixed-order expressions for F gg and h A in (10) are sufficient. On the other hand, the matching coefficients C t and C S need to be computed in RG-improved perturbation theory. They can be evolved from the high matching scales µ ∼ m t and µ 2 ∼ −m 2 H , where the matching calculations are performed, down to lower scales µ ∼ p veto T using RG equations. We will require the resulting expressions at next-to-next-to-leading order (NNLO) in RG-improved perturbation theory, which is equivalent to N 3 LL accuracy. The corresponding expressions can be found in eqs. (20) and (22) of [23], with further details given in the Appendix of [30].
All objects in the factorization formula (18) are defined in a RG-invariant way, i.e. they are formally independent of the factorization scale µ. As is common practice, we can use the residual µ dependence arising when the expressions (12) and (19) are evaluated at some fixed order in perturbation theory as an indicator of the remaining perturbative uncertainties. This can be done for each of these objects separately, not just for the total cross section. We also note that the expression for the hard function becomes particularly simple if one adopts the default scale choice µ = p veto T on the right-hand side of (19). In this casē 3 Jet clustering, multipole expansion, and zero bins We now analyze the factorization properties of the jet-veto cross section using the formalism of SCET, in which highly energetic particles aligned with the colliding protons are described in terms of collinear and anti-collinear quark and gluon fields, and soft particles emitted from the beam jets are described in terms of soft fields. The effective theory implements an expansion of scattering amplitudes in powers of the small parameter λ ∼ p veto T /m H , where the jet veto sets the characteristic size of all transverse momenta in the process. We introduce two light-like reference vectors n µ andn µ (satisfying n ·n = 2) parallel to the beam axis and decompose all 4-vectors in the light-cone basis spanned by these vectors, The different types of modes relevant to our discussion are characterized by the scalings of their momenta (p + , p − , p ⊥ ) with powers of λ, namely p µ c ∼ m H (λ 2 , 1, λ) for collinear particles, p μ c ∼ m H (1, λ 2 , λ) for anti-collinear particles, and p µ s ∼ m H (λ, λ, λ) for soft particles. Hence, the particles in these three categories have transverse momenta of order the jet veto, but very different rapidities. The scaling of these modes is displayed graphically in Figure 1. In addition, the cross section receives contributions from the hard momentum region p µ h ∼ m H (1, 1, 1), where we do not distinguish between m H and m t . These corrections are purely virtual and are integrated out in the construction of the effective theory. One may also worry about the contributions from modes with smaller virtualities, p 2 ≪ (p veto T ) 2 . For example, an on-shell soft mode, which accidentally is closely aligned with the beam axis, would have momentum scaling ∼ m H (λ 2 , λ, λ 3/2 ). This mode has a rapidity lying in between that of collinear and soft modes. Indeed, it may also be regarded as a collinear mode whose minus component is accidentally small. The important point is that, because of their small transverse momenta, such modes play no role for the total transverse momentum of a jet. Therefore, an arbitrary number of them can be emitted, and their effect cancels out in the factorization theorem. This is in analogy with the cancellation of ultrasoft modes in the factorization theorem for the Drell-Yan cross section at small transverse momentum [11].
As explained in [6], the jet clustering algorithm does not group particles with different momentum scalings (collinear, anti-collinear, or soft) into the same jet. The reason is that, generically, the rapidity difference between two such particles are such that ∆y ij ∼ ln(m H /p veto T ), which by assumption is much larger than R, see (2). As a consequence, in the jet algorithm (1) the distance measure d ij for two such particles is always larger than the minimum of d iB and d jB . Since the soft and (anti-)collinear modes have the same virtuality, they live along the hyperbola in the (p + , p − ) plane shown in Figure 1, and their precise separation along this hyperbola is to some extent arbitrary. The fact that these modes differ by large rapidities gives rise to large logarithms, which are accounted for by the collinear anomaly. In complete analogy with the construction of the SCET Lagrangian, where based on the generic scalings of the fields one does not include soft-collinear interaction terms, it is unnecessary to consider the degenerate case where a collinear and a soft mode near the boundary are clustered into a single jet. Since there are no enhancements of the cross section in these power-suppressed phase-space regions, boundary effects do not contribute at leading power. Only in corners of the phase space, e.g. when a soft emission becomes collinear to the beam, soft and collinear radiation can be clustered into the same jet. However, since the cross section does not exhibit additional singularities in the corresponding limit, such configurations only give rise to power-suppressed contributions.
The argument just presented has been challenged in [13], where it was argued that the clustering of soft and collinear modes near the boundary of phase space that separates them in rapidity does give rise to leading-order contributions to the cross section starting at NNLL order, unless the jet-radius parameter R is taken to be parametrically much smaller than 1. If this was true, then there would be no region in parameter space where our factorization theorem (18) would be useful, since for parametrically small values of R it does not accomplish the resummation of ln n R terms. The argument presented in [13] was backed up by a calculation of a particular soft-collinear clustering contribution, which was found to be non-zero and provided a leading-power contribution to the cross section proportional to R 2 and R 4 , hence the claim that these contributions are suppressed only for parametrically small R. We will now demonstrate in detail that these findings are not in conflict with our factorization formula.
SCET is based on the method of regions. Loop and phase-space integrations are split into different momentum regions in a systematic manner dictated by the structure of the effective Lagrangian. One could separate the different regions using cutoffs, as indicated graphically in Figure 1, but this is impractical because it would spoil gauge invariance in the individual sectors of the effective theory. Instead, one uses dimensional regularization to handle the appearing singularities. It is then crucially important to perform the calculation of SCET diagrams using a multipole expansion, which expands out components of particle momenta (or position vectors) that are parametrically suppressed with respect to the leading ones in a given interaction or propagator [10]. Only if this is done consistently, there is no double counting of momentum configurations. The reason is that, once we consider the contribution from a certain momentum region to an integral, any expansion of the integrand around another limit will leave us with a scaleless integral, which is zero in dimensional (or, more generally, analytic) regularization. In the present case, it is important that one performs the multipole expansion not only for the integrands of loop or phase-space integrals, but also for the measurement functions M veto in the definitions (5) and (6).
If one does not perform the multipole expansion consistently, then there arise contributions from the double counting of overlapping momentum regions, which must be subtracted by hand in order to obtain the correct result. In the SCET community these subtraction terms are referred to as "zero-bin subtractions" [31]. The problem with the argument presented in [13] is that the soft-collinear clustering contribution was calculated without performing the multipole expansion, but the relevant zero-bin subtractions were not evaluated. We will argue that these zero-bin subtractions exactly cancel the soft-collinear clustering term, so that one recovers the same result as before. When the multipole expansion is performed consistently, the contributions in which different modes are clustered by the jet algorithm are simply zero.
We will now illustrate these statements with the help of a simple example, which demonstrates our point without involving the technical complexities of the real calculation. In Section 4, we will perform the calculation of the two-loop anomaly coefficient d veto 2 (R) in the context of SCET, considering only the contributions of two collinear or two soft emissions, in accordance with our factorization formula (3), according to which the jet veto must be applied separately in each sector of SCET. The fact that in this way we reproduce the result extracted from [12] proves that there are no missing contributions at this order. We thus do not confirm the statement made in [13] that soft-collinear mixing terms contribute to the cross section at O(α 2 s ). Moreover, since our arguments are completely general and not tied to a particular order in the perturbative expansion, they support our claim that the factorization formula (18) remains valid also in higher orders in perturbation theory.
Consider the following simple rapidity integral with integrand 1 and the constraint that two particles with rapidities y c and y are clustered into one jet: Here y c ≫ 1 is the fixed rapidity of a collinear particle, and we assume that R = O(1). The θ-function plays the role of the measurement function M veto in the definitions (5) and (6). Even though this integral is extremely simple, it captures the main features of the integrals we will encounter in the computation of the C 2 F term in Section 4.2, since the relevant part of the amplitude for this color structure only depends on the transverse momentum. The only difference to the trivial example integral (22) is that one also integrates over the rapidity of the second emission and the azimuthal angles, which then also enter the θ-function constraint. In the context of SCET we should evaluate the integral as a sum over contributions from different momentum regions. In each region we must multipole expand the argument of the θ-function according to the rules of the effective theory. For the purposes of our discussion we will consider for the moment only the contributions from collinear and soft partons; as we discuss below, adding the anti-collinear region would not change the argument. A collinear particle has momentum scaling (λ 2 , 1, λ). In the collinear region both rapidities scale the same, y ∼ y c ∼ ln(1/λ) ≫ 1, but their difference is O(1). There is thus nothing to expand in the argument of the θ-function, and we get I c = I for the collinear-collinear clustering term. A soft particle has momentum scaling (λ, λ, λ) and hence y = O(1). It follows that in the argument of the θ-function (y − y c ) 2 = O(ln(1/λ)) is parametrically larger than R 2 = O(1). We must therefore perform the multipole expansion The higher-order terms in the expansion will contain derivatives of δ-functions, but because the arguments are always non-zero the entire expression on the right-hand side just vanishes. At this point one may worry about double counting, since in the collinear-collinear contribution we have also integrated over the region where the collinear particle becomes soft. One should therefore subtract the contribution from the soft-collinear overlap region (the "zero bin"), which otherwise would be counted twice. However, after the multipole expansion this overlap contribution vanishes, I (cs) = 0, for the same reason that the soft-collinear contribution vanishes. Alternatively, one could evaluate the contributions from the two regions without performing the multipole expansion. Then obviously both regions yield the same contribution, I c = I s = I. But now the double-counted soft-collinear overlap contribution is also non-zero, and indeed I (cs) = I is equal to the soft-collinear contribution. The final result is I c + I s − I (cs) = I, as it should be. In analogy with the findings of [13] the soft-collinear clustering term is non-zero in this case, but its contribution is precisely cancelled by the zero-bin subtraction.
When also the anti-collinear region is included, we still only need to compute the contribution I c if the multipole expansion is performed consistently, but when working with subtractions the procedure gets more complicated. The general expression for three momentum regions reads The last term Ic cs describes the double overlap region, where the momentum can simultaneously be part of any region. It is obtained by expanding the integrand in the limit where the momentum scales as (λ 2 , λ 2 , λ). It has to be added back, since the other three subtractions would remove this region from the integral. The general systematics of subtractions was studied in detail in [32], as a step towards a proof of the method of regions. In our simple example, all of the above contributions are equal to the original integral I. Since the momenta in the double overlap region and in the (cc) contribution scale in the same way, the last two terms in (24) are identical for any given integral, and the general expression simplifies to While useful to map the integrals in dimensional regularization onto standard integrals, the subtraction procedure is extremely cumbersome in practice. For the two-emission case, for example, one would start off with 25 momentum configurations, since each of the two momenta can be in any of the regions or overlap regions in (25). In addition to the proliferation of regions, another drawback of the subtraction method is fact that the integrals are no longer homogenous in the expansion parameter λ, so that in general one will need to reexpand the final result in λ after integrating. It may appear strange at first sight that we had to expand the argument of the θ-function in (22) in powers of ln(1/λ), not in powers of λ. This distinction is however meaningless. Instead of (23) we may equally well write for a soft particle. The multipole expansion is now an expansion in powers of λ. Indeed, one can always rewrite the rapidity integrals in terms of integrals over components of lightcone momenta. For example, denoting the collinear reference momentum by k and the soft momentum by p, we have y c = ln(k + /k T ) and y = ln(p + /p T ), and hence the phase-space constraint can be rewritten in the form In the last step we have indicated the scalings of the various soft and collinear momentum components. Neglecting higher-order terms in λ, we obtain which vanishes, since each light-cone component of an on-shell momentum is positive. We can further think of the θ-functions of momentum components as the discontinuities of some propagators. This clearly shows that the multipole expansion in (23) is not different from multipole expansions of propagators in ordinary SCET loop or phase-space integrals, and it makes it clear that power-suppressed terms, which are expanded out, are governed by powers of λ, not powers of 1/ ln(1/λ). We finish this section with an important remark. The structure of the first θ-function in (26) suggests that some of the power-suppressed terms may be accompanied by a factor e R . Because we treat R as an O(1) parameter, also e R is not a parametrically large quantity, so even if such terms exist, their presence would not upset the structure of the factorization formula (18). The question of the numerical size of power-suppressed corrections must be separated from the issue of parametrically enhanced corrections. The outcome of our discussion is that, for R = O(1), there are no contributions to the cross section arising from soft-collinear clustering terms, which would upset the factorization formula. In our framework all soft contributions are purely scaleless. In physical terms, this means that the soft contributions can effectively be absorbed into the (anti-)collinear fields. The structure of relation (8) implies that this is indeed possible. The same happens for the transverse-momentum spectrum of electroweak bosons [6,11], and also in all SCET I applications where a separate mode with (λ, λ, λ) scaling is not needed to describe the physics. Nevertheless, there are power-corrections to our factorization formula from subleading terms in the effective Lagrangian and subleading SCET operators. In Section 6 we will study their numerical impact by matching our results to the cross section computed in fixed-order perturbation theory. We will find that even for R = 1 the power corrections remain small; indeed, we will not find any numerical evidence for the existence of e R -enhanced power corrections.

Two-loop computation of the anomaly exponent
We now turn to the computation of the two-loop anomaly exponent d veto 2 (R) in (10). According to the factorization formula (18), this quantity can be obtained from a perturbative computation of the collinear and soft matrix elements defined in (5) and (6). Instead of the beam function B c,g for a gluon, we will in the following consider the analogous function for a collinear quark, defined as (28) This function would appear in the calculation of the jet-veto cross section for a quark-initiated process such as Z-boson production at the LHC. As we will explain below, the result for d veto 2 (R) relevant for Higgs production can be obtained from the corresponding coefficient for Z production by replacing C F → C A , but distinguishing the two color factors will make it easier to organize the calculation. The gauge-invariant gluon and quark fields in the matrix elements in (5) and (28) are related to the usual QCD fields by [8,9,33] where W (x) is a straight Wilson line along then direction from −∞ to x. We use the standard QCD Lagrangian to evaluate the collinear matrix element in (28), since the collinear SCET Lagrangian is equivalent to it (see e.g. [10,34]). Some representative examples of two-loop diagrams contributing to this matrix element are shown in Figure 2.
In order to extract the anomaly coefficient d veto 2 (R), we can evaluate the collinear matrix elements (5) and (28) with partonic instead of hadronic external states. In addition, we also need to calculate the soft function defined in (6), which involves products of soft Wilson lines along the two beam directions. For the Higgs case, these are Wilson lines in the adjoint representation, while the fundamental representation is relevant for the case of Z-boson production. The normalization factor becomes d R = N c in the latter case, such that S(p veto T , µ) = 1 at lowest order in perturbation theory. In the analytic regularization scheme based on the prescription (7), the soft function is given by scaleless integrals of the type which vanish by definition. The reason is that the integral over the total rapidity y t of the emitted soft gluons is not constrained by the jet veto. It follows that S(p veto T , µ) = 1 to all orders in this regularization scheme. In principle, it is thus sufficient to evaluate the collinear functions B c,q and B c,q , and since the divergences in the analytic regulator must cancel in the product of these functions, calculating the left-or right-collinear function would be sufficient in practice.
In our calculation we will, however, adopt a different strategy. It has been shown in [6] that one obtains a non-zero soft function if one imposes different jet vetoes for the left-and rightmoving particles. The anomalous large logarithms in the soft function are then tied, via the anomaly equations, to those in the collinear beam functions. Extracting the coefficient d veto 2 (R) from the soft function offers the advantage that the relevant Wilson-line diagrams are simpler to compute than the loop diagrams for the collinear functions. Instead of imposing different jet vetoes for left-and right-moving particles, we can generate a non-trivial soft function by Table 1: Structure of the divergences of two-emission diagrams arising when one uses the regulator (31), which distinguishes left-and right-moving particles. We denote collinear particles by c, anti-collinear ones byc, and soft ones by s.
using different analytic regulators for them. To this end, we generalize the regularization prescription in (7) by replacing After the multipole expansion, the collinear function only involves the regulator β, while the anti-collinear function is regularized by α. The cancellation of divergences between the soft and (anti-)collinear functions then proceeds in the way shown schematically in Table 1.
Because of the structure of the cancellations, the computation of the divergence of a single function is again sufficient, and with the regulator (31) we can work with the soft instead of the (anti-)collinear functions. For convenience, we will perform the extraction of the color structures C F C A and C F T F n f of the anomaly coefficient d veto 2 (R) from the computation of the soft function using the split regulator (31), while we will extract the C 2 F part from the collinear functions with the original form (7) of the regulator.
A second simplification of the computation is achieved by using the fact that the two-loop anomaly coefficient for the transverse-momentum spectrum of an electroweak boson B (with B = H, Z, γ * , W ± ) is known [11]. Since the jet algorithm only has an effect for two and more emissions, the difference between the jet-veto cross section σ veto (p veto T ) and σ B (p veto T ), the boson q T spectrum integrated up to a momentum scale p veto T , starts at O(α 2 s ) and involves only contributions from two realemission diagrams at this order. This observation was used in [5] to extract the R-dependent part of the NNLL order corrections, and the logarithmically-enhanced terms in the above difference were given explicitly in [12]. On the partonic level, the logarithmically-enhanced two-loop terms have the form whereŝ is the partonic center-of-mass energy squared. In order to obtain the full two-loop anomaly coefficient, we then use the relation [6] d veto where C B = C A for Higgs production and C B = C F for Z-boson production. The ζ 3 term arises from the Fourier integral present in the factorization formula for the boson q T spectrum. The anomaly coefficient relevant for the transverse-momentum spectrum, was extracted in [11]. We will find that the quantity ∆d veto 2 (R) defined in (33) can be further decomposed as ∆d veto where f (R) vanishes for R → ∞, and for R < π it can be approximated by the numerical expression given in (17). In this way, we recover the result (16) once we set C B = C A for the Higgs-boson case.
The real-emission QCD diagrams contributing to ∆σ(p veto T ) are free of infrared singularities and can be evaluated in d = 4 space-time dimensions. However, the effective-theory diagrams will continue to suffer from light-cone divergences, and thus the analytic regularization has to be kept. The two-emission measurement function relevant for the difference ∆σ(p veto T ) reads where p = k + q is the total momentum of the two emissions with momenta k and q, and ∆R = ∆y 2 + ∆φ 2 is their angular separation.
There is a price one has to pay when working with the difference ∆σ(p veto T ) instead of the jet-veto cross section itself. With the measurement function M ∆ (p veto T ), contributions arise from clustered particles that have large angular separations ∆R > R. In contrast to the jetveto cross section, ∆σ(p veto T ) does get contributions from particles from the different sectors and we will therefore need to evaluate those contributions. The physics reason is that collinear and soft particles both contribute equally to the q T spectrum of the electroweak boson. The mixing contributions are R independent and only arise for the C 2 F color structure. Their presence is the reason why we evaluate this part with the the standard form (7) of the analytic regulator, for which the soft region is absent. We then only need to compute the mixing contribution involving one collinear and one anti-collinear particle.

Evaluation of the C F C A and C F T F n f terms
To extract the contribution of these two color structures to d veto 2 (R), we compute the twoloop soft function with the split analytic regulator (31). Up to the choice of the regulator, the corresponding computation is identical to what was done in [13], with one important difference: this paper claimed that the factorization of the cross section would only hold for R → 0 and the computation was only performed in this limit. As we have demonstrated in Section 3, the factorization formula (18) also holds at finite R = O(1), and we must therefore recover the full QCD result of [5] from our computation of the soft function.
In order to perform the calculation, one needs the two-emission soft amplitude squared, which is given in compact form in Appendix C of [35] and can also be found in [13]. One then parameterizes the integration over the two-particle phase space in terms of angles, rapidities, and transverse momenta, introducing the variables The integration over the total rapidity y t then gives rise to a divergence of the form whose coefficient is the collinear anomaly. The divergence only arises if both emissions are either to the left or two the right, and the two terms would cancel if we were to set α = β.
The integration over p T can be performed analytically, which leads to the result For a given value of R, the remaining integrations can be performed numerically. To obtain an analytic form of the result, we have expanded the integrand in powers of R, as was done in [5]. Details of the calculation can be found in Appendix A. Translating the divergence in the analytic regulator into the anomalous logarithm according to the structures shown in Table 1, and using relation (33), we obtain where the first few expansion coefficients are given by In Appendix A, we present analytic expressions for the expansion coefficients up to O(R 10 ).
Our results for the coefficients c i L and c i n with n = 2, 4, 6 agree with the findings of [5]. Our analytic expressions for the coefficients c i 0 are new, and they are in agreement with the numerical values reported in [5]. 2 4.2 Evaluation of the C 2 F term The computation of the C 2 F part is complicated by the fact that quadratic divergences in the analytic regulator as well as mixing terms between the different sectors arise. Rewriting θ(∆R − R) = 1 − θ(R − ∆R), we note that both problems do not affect the second term on the right, which can thus be treated exactly in the same way as the C F C A and C F T F n f contributions studied in the previous section. This second term, which contains all R dependence, corresponds to the independent-emission piece computed in [5]. Proceeding in the same way as above, we confirm their result This leaves the R-independent piece arising from the rewriting of the θ-function. As discussed earlier, this part involves the mixing between the different sectors of the effective theory, but only because we consider the cross-section difference in (32) instead of the jet-veto cross section itself. We will compute it using the standard form (7) of the analytic regulator, for which the soft contributions are absent. The general form of this contribution in terms of rapidity, azimuthal angle, and transverse momentum is Here A ij (k, l) is the squared amplitude for two emissions in the appropriate momentum regions. The measurement function only involves transverse momenta and is thus the same in all regions. The function ∆ ij (k, l) gives the multipole expansion of the Higgs-boson on-shell constraint (p 1 + p 2 − k − l) 2 = m 2 H in the relevant momentum region. For example, one has in the partonic center-of-mass system. We now consider each contribution in turn.  We begin with the A cc contribution. The diagrams relevant for the C 2 F color structure are shown in Figure 3. We are only interested in the light-cone singularities of these diagrams, which result in divergences in the analytic regulator α. Therefore only diagrams with at least one Wilson-line emission can contribute. The light-cone singularities arise from the region of the integrand in which the large minus-components of the collinear momenta tend to zero, i.e. when these particles become soft. In the limit where the momentum k becomes soft, the C 2 F part of the squared collinear amplitude takes the form where the one-emission soft and collinear amplitudes squared are As we will see, only the region where both emissions become soft gives rise to a 1/α divergence in (45). In the double soft limit, the squared amplitude reduces to With this simple form, the integration over the light-cone components becomes trivial. It has the form We have inserted an upper cutoff Λ ∼ m H in the k − integral, since we are only interested in the divergences arising at small k − . Changing variables to p T = k T + l T and ξ = k T /l T , and integrating over the total transverse momentum p T , the integral in (45) becomes where in the last step we have replaced the cutoff scale Λ by the Higgs mass, since we know from power counting that the full integral scales in this way. Note that, upon performing the double integral, one finds that the expression in the second line is of order O(α), so that the final result has only a single pole in α even though the light-cone integrations have produced a double pole. This O(α) suppression is also the reason why only the double soft limit is divergent. After subtracting the double-soft part from the total contribution ∆σ cc (p T )| R−indep. , the lightcone integrations for the single-soft contribution (48) give only rise to a single pole, and since A c (q) has the same transverse-momentum dependence as A s (q), the O(α) suppression of the transverse-momentum integration then renders the integral finite. We conclude that only the double soft region gives rise to a divergence, so that (52) is indeed the full result. Next, we consider the contribution ∆σcc(p veto T ). Its structure is basically the same as above, except that the analytic regulator is now attached to the large momentum component, so that the light-cone integrations give In contrast to (51), the integral over transverse momentum is not affected by the regulator α. The transverse-momentum integration associated with this term can thus be obtained by taking the α → 0 limit in the second line of (52). But we have seen above that this integral is of O(α), and hence it follows that This leaves us with the mixed contribution ∆σc c (p veto T ). Since the SCET Lagrangian does not contain any interactions coupling collinear and anti-collinear particles, the squared amplitude is a product where the first-order collinear amplitude squared was given in (49) above, and the anti-collinear amplitude squared Ac(k) is obtained from A c (k) by interchanging k + and k − . Expanding the result in the soft limit, performing the integrations over the light-cone momentum components using (51) and (53) in the two sectors, and evaluating the integrals over transverse momenta as in (52), we get Summing the different contributions, we finally obtain (57) The cancellation of the divergence provides a check on our computation. The resulting contribution to the anomaly coefficient derived from (33) is Interestingly, this term exactly cancels the ζ 3 term which arose in (34) from the Fourier integral in the expansion of the boson q T spectrum. In the discussion above, we have exploited the fact that the light-cone singularities arise when the collinear particles become soft, and that the soft parts of the amplitudes can be factorized off. The structure of this factorization can be understood by splitting the collinear gluon field A c into a collinear and an ultrasoft gluon field, A c → A c + A us . This ultrasoft field describes collinear particles in the limit where their large light-cone momentum components become small, k − ∼ εm H ≪ m H . Its other light-cone component scales as k + ∼ λ 2 , and is therefore softer than the soft mode in the factorization formula (3). For ε ∼ λ 2 , this mode would be the standard ultrasoft gluon, but the relative scaling of ε and λ is not important in the following. Decoupling the ultrasoft gluon, the collinear quark field matches onto The ultrasoft Wilson line Y † n (x) arises from the substitution A c → A c + A us in the collinear Wilson line W † (x), while the second ultrasoft Wilson line arises after decoupling the ultrasoft gluons from the collinear quark field ψ. These ultrasoft contributions are scaleless in our regularization scheme, so we did not need to include them explicitly. But as we have shown above, we can use their structure to extract the divergences in the analytic regulator. Relation (59) is also the underlying reason why the cancellation of the divergences between the different sectors works: they all reduce to (ultra)soft Wilson lines in the singular limit. Since the Wilson lines arising for quarks and gluons only differ in their color representation, we can obtain the gluon result from the quark result computed above by replacing C F → C A .
We now have computed all the ingredients required to present the complete result for the two-loop anomaly coefficient d veto 2 (R). Combining (34) and (36), we obtain with For the Higgs case, with C B = C A , this reproduces the numerical result given in (17).

Two-loop beam functions and fixed-order matching
The one remaining unknown two-loop ingredient to the factorization theorem (18) is the twoloop beam functionB g (ξ, p veto T ) defined in (8). In (12) we have matched this function onto standard PDFs, and we have then presented the one-loop expressions for the kernel functions I g←i . For our analysis we will extract the two-loop contributions toB g numerically. At the same time, we will match our resummed expression for the jet-veto cross section with the corresponding fixed-order expression at O(α 2 s ). In this way, we extract terms that are powersuppressed in the small ratio p veto T /m H . Once this is done, our result not only resums the large logarithms of m H /p veto T at N 3 LL p order, but it also accounts for all two-loop corrections. At fixed order in perturbation theory, the two-loop result for the Higgs cross section with a jet veto can be obtained by running the codes FeHiP [36] or HNNLO [37,38]. These Monte-Carlo programs compute the production cross section at O(α 2 s ), with arbitrary cuts on the final state. In the following, we use HNNLO with MSTW2008NNLO PDFs [39] and α s (m Z ) = 0.1171. In order to extract the product of the two beam functions with twoloop precision, we compute the cross section integrated over rapidity and divide it by the perturbative expansion for the hard functionH defined in (19). This yields the reduced cross sectionσ withσ where τ = m H / √ s and y max = ln(1/τ ). The quantityσ ∞ contains the leading-power contribution and is proportional to the convolution of the two beam functions. The remainder ∆σ = O(p veto T /m H ) in (62) contains the power corrections to the reduced cross section. The rationale for considering the reduced cross section is that, in the factorization formula (18), all large logarithms are resummed in the RG-invariant hard functionH (provided we choose µ ∼ p veto T ). The reduced cross section obtained whenH is factored out has a well-behaved perturbative expansion, and it can thus be extracted from numerical fixed-order codes.
We now exploit the fact that the leading-power reduced cross sectionσ ∞ depends on m H only through the ratio m H / √ s, which enters in the arguments of the beam functions and in σ 0 (p veto T ). If we compute the reduced cross section for a very large value of m H , keeping the ratio m H / √ s fixed at its physical value, the power corrections will become negligibly small and we directly obtain the quantityσ ∞ , and from it the two-loop beam functions. Repeating the analysis with the physical value m H = 125 GeV, we are then able to extract the powersuppressed contribution ∆σ. In practice, we run the program HNNLO at a fixed value of As a validation, we have performed the numerical extraction of the beam functions and power corrections also at NLO, where the expression forB g (ξ, p veto T ) is known analytically, see (12). The two upper bands in Figure 4 show the leading-power reduced cross sectionσ ∞ , while the two lower bands show the power-suppressed contribution ∆σ. In all cases we show NLO bands obtained by varying the factorization scale in the region p veto T /2 < µ < 2p veto T . The blue bands show the numerical result extracted from the procedure just described, while the red bands give the exact result, obtained by using the analytic expressions (14) for the calculation of the beam functions. We observe that the numerical method reproduces the analytical results with good accuracy. The small difference is shown by the very narrow orange band in the figure. This band is equal to the power corrections at m H = 500 GeV, which are very small but non-zero. In our final matched results, we add back the power-suppressed terms toσ(p veto T ), so that the small residual power corrections remaining at m H = 500 GeV do not change our predictions for the cross section. We separate out the power corrections in order to assess their relative size and to be able to vary the scale µ independently forσ ∞ and ∆σ. The small scale uncertainty of the fixed-order cross section is known to be due to a cancellation of different types of large corrections. In order to avoid such accidental cancellations, we separate the different parts of the calculation and vary their scales independently.
At NNLO the numerics become more challenging, in particular at the high value m H = 500 GeV. In the left two plots in Figure 5 perform 20 independent runs of the HNNLO program, each producing 3·10 8 events. Every run generates a histogram forσ(p veto T ) with the selected parameter values and takes approximately 10 hours to complete, so that the total computing time would amount to 2000 days on a single processor core. Despite the large number of events, the statistical uncertainties on the extracted values in Figure 5 are not completely negligible. To determine the default value and the scale variation at a given value of p veto T , we fit a third-order polynomial in ln µ to the numerical data. The resulting fit functions, together with their uncertainties, are shown in the left two panels of Figure 5. In both cases, the quality of the fit is excellent (χ 2 /dof ≈ 0.8). From the fit in µ, we extract the default value for the cross section and the upper and lower edges of the scale-variation band, for each value of p veto T . In a last step, we first multiply by the prefactorH evaluated at its default scale µ = p veto T and then fit a third-order polynomial in ln p veto T to the leading-power cross-section results, and a fourth-order polynomial in p veto T to the power corrections. We do not include a constant term in the fit to ∆σ, since the power corrections must vanish for p veto T → 0. The fitted curves are shown in the third plot in Figure 5. Once again, the upper curves show the leading-power cross sectionσ ∞ (p veto T ) together with its scale-uncertainty band, while the lower ones show the corresponding results for the power corrections ∆σ(p veto T ). The fact that the scale variation at NNLO turns out to be larger than at NLO can be traced back to the presence of rather large, R-dependent two-loop corrections in the beam functions. This will be discussed in more detail in Section 6. We have also used other forms of fit functions and find compatible results. However, employing too many fit parameters would cause the fit to follow the statistical fluctuations of the numerical results. As a further cross check, we have also computed the p veto T dependence using the MCFM code [40] instead of HNNLO, finding results consistent with the ones presented here.
It is interesting to look at the dependence of the power corrections on the jet-radius parameter R. From (26), one would naively expect that the power corrections can be enhanced by factors of e R , as mentioned near the end of Section 3. However, numerically we see no evidence for such an effect. Indeed, as can be seen from Figure 6, we find a very moderate dependence on the jet radius. The relative size of the power corrections, ∆σ(p veto T )/σ(p veto T ) = ∆σ(p veto T )/σ(p veto T ), turns out to be almost independent of R in the range 0.2 < R < 1.

Numerical predictions for the LHC
We are now in a position to present our final results for the jet-veto cross section and the veto efficiency for Higgs-boson production in gluon fusion at the LHC. In order to obtain the highest possible accuracy at present, we combine resummed results at N 3 LL p order with fixedorder results at NNLO in perturbation theory. The only missing ingredients for a complete resummation with N 3 LL accuracy are the four-loop coefficient Γ A 3 of the cusp anomalous dimension and the three-loop coefficient d veto 3 (R) in the anomaly exponent F gg in (10). Both quantities enter via the RG-invariant hard function defined in (19). For the four-loop cusp anomalous dimension, we use the Padé approximation valid for n f = 5. A corresponding estimate works very well one order lower, where one has Γ A 2 = 538.2 and (Γ A 1 ) 2 /Γ A 0 = 572.7. The largest effect of Γ A 3 occurs at low p veto T values. However, even at the very low value p veto T = 10 GeV, switching off the four-loop cusp anomalous dimension would increase the cross section by only 0.1%, so that the uncertainty associated with Γ A 3 is negligibly small. The contribution of the unknown three-loop anomaly coefficient d veto 3 (R) to the cross section is of the form (α s /π) 3 ln(p veto T /m H ). Generically, we would expect this type of contribution to be small in the range of p veto T values we consider, since the logarithm ln(p veto T /m H ) is not large enough to fully compensate the suppression by a factor of α s /π. However, we have seen in Section 4 that the anomaly coefficient is enhanced at small R by factors of ln R. The leading-color part of the two-loop coefficient can be well approximated as d veto 2 (R) ≈ 2 (4C A ) 2 ln(2/R). Motivated by this, we will estimate the quantity d veto 3 (R) as and vary the overall prefactor in the range −4 < κ < 4. The result of this variation on the cross section is shown in Figure 7. The above ansatz encodes the correct logarithmic scaling at small R, and we believe it provides a generous estimate for all R values considered in our work. Even at R = 1 our estimate for d veto 3 (R) is still more than six times larger than the three-loop cusp anomalous dimension Γ A 2 . Nevertheless, the resulting effect is seen to be very small for larger values of R. Also for smaller values, such as R = 0.4, the associated uncertainty is lower than the scale uncertainty. While a full computation of d veto 3 (R) looks difficult, we believe that a determination of the coefficient of the leading logarithm should be feasible. The double logarithm arises from diagrams with three collinear emissions, which involve two propagators that are nearly on-shell.

Scale uncertainties
We now proceed to explore the perturbative uncertainties in the resummed predictions for the jet-veto cross section, as estimated by scale variations. We obtain predictions for the cross section integrated over rapidity by using the RG-improved result for the hard functionH in (19) and multiplying it with the reduced cross section in (62), which we have extracted with two-loop accuracy and including power corrections. Since the Sudakov logarithms exponentiate, it is natural to perform the perturbative expansion of the hard function in the exponent, i.e. to expand lnH instead ofH itself. For this reason, we do not perform an additional expansion after multiplying the reduced cross section byH. To estimate the residual scale uncertainties of our predictions, we independently vary the hard matching scales µ t and µ h , at which the Wilson coefficients C t and C S in (3) are calculated (for details see [23]), as well as the factorization scale µ, by factors of 2 about their default values µ t = m t , µ 2 h = −m 2 H , and µ = p veto T . We then obtain individual error estimates for the hard functionH and for the reduced cross sectionσ ∞ and its power corrections ∆σ. The error associated with the hard function also includes the uncertainty arising due to the unknown value of d veto 3 (R), which we estimate by scanning κ over the interval between −4 and 4. Beyond NLL order, the sensitivity to variations of the hard scales µ t and µ h is so small that one can safely neglect it compared to the effect of the µ variation. For instance, at p veto T = 20 GeV the µ h variation is ±0.3% and the µ t variation +0.1 −0.2 % at N 3 LL level (±1% in both cases at NNLL order). Since the quantitiesH andσ are RG invariant, it seems reasonable to assume that their residual scale uncertainties are uncorrelated. We therefore combine the errors inH,σ ∞ , and ∆σ in quadrature to obtain our final error estimates.
In addition to the matching and factorization scales, one can also consider a variation of the logarithms associated with the collinear anomaly. This can be done by rewriting the anomaly factor in (19) in the form For ν ∼ p veto T , the second factor on the right-hand can be expanded in fixed-order perturbation theory, after which some higher-order ν dependence is left over. For example, at NNLL order the one-loop expression for F gg in (10) is sufficient for the second factor, while the two-loop expression is needed for the first one beause of the large logarithm. The variation from changing ν by a factor of 2 about the default value ν = p veto T is ±10% at NNLL order, while it vanishes by definition at N 3 LL order if d veto 3 (R) = 0 and µ = p veto T , assuming the expansion is performed for the logarithm ofH(m t , m H , p veto T ), as we do. If insteadH(m t , m H , p veto T ) itself is expanded, then the variation is ±3%. The type of scale variation considered here can be formalized in an RG framework [41,42], in which the change in ν reshuffles contributions between the soft and collinear functions. However, in contrast to the standard RG, there is no physical coupling constant involved in the running, since the different contributions live at the same virtuality. Furthermore, the individual contributions are strongly scheme dependent. With our regulator, all perturbative corrections to the soft function vanish, while the regulator put forward in [42] leads to a non-zero soft function. For these reasons, we do not believe that the ν variation provides much insight into the size of higher-order corrections, and we therefore do not include it in our error budget. Figure 8 shows our predictions for the leading-power cross section for three different values of the jet-radius parameter R. The colored bands refer to the predictions obtained at NLL, NNLL, and N 3 LL p order. Consider first the right-most panel, which corresponds to the relatively large value R = 0.8. In this case we observe a reduction of the scale uncertainties as we increase the accuracy of the resummation. While the NLL and NNLL bands do not quite overlap, they are at least near each other. The N 3 LL p band overlaps with the NNLL band and counteracts to some extent the large enhancement seen at NNLL order. All in all, it appears that the impact of higher-order effects is roughly in accordance with the error estimates from lower-order results, suggesting that the perturbative series is reasonably well behaved. Unfortunately, the quality of the expansion deteriorates as one lowers the jet radius R. The size of the corrections and the uncertainties obtained at NNLL and N 3 LL p order both increase with decreasing R. For R = 0.2, the NNLL band is as broad as (or even broader than) the NLL band, and there is a rather substantial gap between them. The origin of the large scale dependence of the NNLL order bands at small R can be traced back to the behavior of the two-loop anomaly coefficient d veto 2 (R) given in (16), which is plotted in Figure 9 in units of the coefficient d A 2 appearing in the resummation formula for the transverse-momentum distribution of Higgs bosons at low q T ≪ m H [43]. Whereas d veto 2 (R)/d A 2 is of modest size for R 0.8, this ratio quickly increases as R decreases, and it reaches a very large value d veto 2 (R)/d A 2 ≈ 8.7 for R = 0.2. The origin of this effect can be understood from the presence of the ln R term in the expression for the function f (R) in (17), which becomes large for such small values of the jet radius. Note that the d veto 2 (R) term first appears at NNLL order, and that the µ dependence of the running coupling in the anomaly term contained in the hard functionH in (19) only gets compensated at N 3 LL order. For p veto T = 25 GeV and R = 0.2, the exponent approximately equals 17α 2 s (µ). Since the NLL band completely misses this genuine source of large scale dependence, it underestimates the perturbative uncertainties for small R. To reduce the scale variations of the NNLL band, it is necessary to perform the resummation at N 3 LL p order, as we do in the present work. The fact that the green bands in Figure 8 are narrower than at NNLL order and fall between the NLL and NNLL bands gives us confidence that at N 3 LL p order, and for R ≥ 0.4 not too small, one captures the main corrections and obtains reliable predictions and error estimates.
In order to substantiate this claim, we study the scale variations of the different ingredients in the factorization formula (18) separately. The top panels in Figure 10 show the residual scale dependence of the RG-invariant hard functionH at different orders in perturbation theory. We observe a very large correction when going from NLL to NNLL order, whereas the impact of yet higher-order corrections is seen to be small. Indeed, comparing with Figure 8, we see that this dependence of the hard function explains the scale variations of the cross section shown in Figure 8 at NLL and NNLL order. At N 3 LL p order, the large scale uncertainty related to the d veto 2 (R) term in (68) gets compensated by including the three-loop terms in the anomaly exponent. From that point on, the remaining scale variation of the hard function is very small, even if we simultaneously vary the coefficient d veto 3 (R) according to our estimate (66). This latter effect is illustrated by the difference between the green and blue bands in the plots. The bottom panels in Figure 10 show the scale variation of the leading-power reduced cross section σ ∞ (p veto T ) in (62). Once again large R-dependent two-loop corrections arise, which increase with decreasing R. However, in the present case these corrections are contained in the beam functions and count as N 3 LL order effects, because the reduced cross section does not contain large logarithms of p veto T /m H . As a result, while the one-loop corrections are seen to be small, at two-loop order the reduced cross section receives large negative corrections, whose size is not anticipated by the small scale dependence of the one-loop result. In addition, also the scale uncertainties increase when these corrections are included, especially at low R values. Indeed, it is the residual scale dependence of the beam functions at two-loop order which dominates the scale uncertainty in our final result for the cross section (cf. Figure 8). The figures show that, once again, these large two-loop effects strongly increase with decreasing R. For the anomaly coefficient, we had found that the remaining higher-order corrections are moderate once the leading ln R-enhanced terms are in place, and we believe that the same is true for the beam functions. In order to check that this is indeed the case, it would be necessary to compute the leading three-loop corrections to the beam functions in the limit of very small R -a task that is well beyond the scope of the present work.

Predictions for the jet-veto cross section
In the last step, we now add the power-suppressed corrections to our resummed predictions for the jet-veto cross section, thereby extending the accuracy of our results to N 3 LL p +NNLO. Our final predictions for the cross section are depicted in Figure 11. The red and green hatched bands show the cross section and its uncertainty at NNLL+NLO and N 3 LL p +NNLO accuracy, respectively. Also shown in the bottom half of each panel are the contributions from powersuppressed effects, which are seen to remain small even for the largest p veto T values considered here. The bands in the figure account for the scale variations and, at N 3 LL p order, include the uncertainty due to the unknown coefficient d veto 3 (R). While the uncertainty bands obtained at different orders do not quite overlap, they lie close to each other. Given the discussion above, this is the best we could have hoped for. The scale uncertainties increase significantly as R is lowered to smaller values. To have good theoretical control, one should either try to increase the value of R above 0.4 in the experimental analyses, or to resum the ln R-enhanced terms in the cross section. For a larger jet-radius parameter R = 0.8 and an intermediate jet-veto scale p veto T = 25 GeV, the cross section is predicted with an accuracy of about ±6%. For smaller R = 0.4, the uncertainties increase to about ±11%.
In Figure 12, we compare our findings to the results obtained by Banfi et al. (BMSZ) in [12]. Their results have NNLL+NNLO accuracy and are available through a public code JetVHeto [44], which was used to produce the purple bands shown in the figure. An important difference to our approach is that BMSZ consider the efficiency ǫ(p veto T ) = σ(p veto T )/σ tot instead of the cross section itself. Their goal was to divide out the large corrections affecting both σ(p veto T ) and the total cross section. We will explain below why we prefer not to follow the same strategy. The perturbative expansion of ǫ(p veto T ) is not unique. BMSZ consider in their paper three schemes, which translate into three different schemes for how to include the matching corrections. One can either first expand the numerator and the denominator in the formula for ǫ(p veto T ) and then take their ratio (default scheme (a)), expand ǫ(p veto T ) itself (scheme (c)), or consider 1 − ǫ(p veto T ) and separately expand numerator and denominator for this quantity (scheme (b)). The purple bands in Figure 12 show the scale uncertainty of the BMSZ results, which they obtain by first varying µ f and µ r by a factor 2 about the default value m H /2, while keeping 1/2 < µ f /µ r < 2 and the resummation scale Q at its default value, and then varying the resummation scale Q, while keeping µ f and µ r fixed at their default values. The bands shown in the figure are the envelope of these variations. The difference between the three matching schemes shown in Figure 12 is not negligible. Since the fixed-order corrections to both σ tot and σ(p veto T ) are large, the different ways of defining the efficiency ǫ(p veto T ) lead to fairly different results, despite the fact that this difference is H for the matching scale [23,45], as we do in our analysis. By now the virtual corrections to Higgs production are known to three-loop accuracy [46][47][48], and the result confirms that the higher-order corrections to |C S (−m 2 H , µ)| 2 are negligibly small for a time-like scale choice. Even for the standard choice µ 2 = +m 2 H , the three-loop corrections are only about 4%. The part which suffers from these large corrections is thus known very precisely, with sub-percent accuracy. The uncertainty on the fixed-order total cross section is larger, of order 10%, because the real-emission corrections are not as well known as the virtual part. Dividing by the total cross section therefore increases the uncertainty on the prediction and should better be avoided.
To compare our results to those of BMSZ, we have divided our prediction for σ(p veto T ) by the central value of the resummed total cross section σ tot = 19.66 +2.8%+7.8% −0.8%−7.5% pb obtained in [49], which is a state-of-the-art calculation using the same resummed expression for C S (−m 2 H , µ) as we do. The first uncertainty is due to scale variations, whereas the second one is the 90% C.L. error due to the combined PDF and α s variations. For comparison, we note that the LHC Higgs Cross Section Working Group adopts the value σ tot = 19.52 +7.2%+7.5% −7.8%−6.9% pb [50]. Our results are shown by the green bands in Figure 12. Note that we do not include an additional  Figure 12: Comparison of our result for the jet-veto efficiency (green) to the results of BMSZ (purple) obtained in the three different matching schemes used in [12].
uncertainty from the errors on the total cross section, because in order to compare with an experimental cross-section measurement we would have to multiply the efficiency with this same value of the total cross section. We observe that our results are numerically quite similar to the BMSZ results obtained in scheme (b), even though conceptually scheme (a) is closer to our procedure. Note that for their final prediction BMSZ use scheme (a) and its scale variation as the default. As an additional uncertainty, they consider the difference between the default values obtained in the three schemes. As a result, their final uncertainty band is given by the upper edge of the band in scheme (a) and the central curve in scheme (b). The uncertainty band so derived is smaller than the envelope of the bands obtained in the three schemes.
Numerical results for the jet-veto cross section and efficiency are given in Table 2 for two different values of R. In addition to the uncertainties shown in the table, one has to account for the PDF and α s errors. The combined relative uncertainty due to these input parameters is very similar to that obtained for the total cross section. The table shows that it would be beneficial to increase the jet radius compared to the presently used values R ≈ 0.4. On the experimental side, this would increase the sensitivity to the underlying event and pile-up, but these effects could be mitigated by resorting to techniques such as the one proposed in [51].

Conclusion
Using methods from effective field theory, we have obtained precise predictions for the Higgsboson production cross section in the presence of a jet veto, in which both Sudakov logarithms of the form α n s ln m (p veto T /m H ) arising due to the veto as well as the large virtual corrections affecting also the total cross section are resummed in a systematic way. To demonstrate the  validity of the factorization formula (18) for the cross section, which was first derived in [6], we have determined all its ingredients at two-loop order and have increased the logarithmic accuracy of the resummation to (partial) N 3 LL order. In particular, we have computed the two-loop anomaly coefficient d veto 2 (R) and have obtained a fully analytic expression for its series expansion valid for R < π, including the constant term, which previously was only known in numerical form. Our result agrees with the expression obtained in [5] through a calculation in QCD. Contrary to claims in the literature [13], we find that even for R = O(1) softcollinear mixing contributions, which would break factorization, are absent. This establishes factorization at NNLL accuracy.
In addition to the explicit two-loop calculation, we have discussed in detail why such softcollinear mixing terms are absent also in higher orders of the perturbative expansion. That they do not arise becomes manifest if the multipole expansion in the effective theory is properly implemented, i.e., if power-suppressed terms are consistently expanded away at the integrand level. Since SCET does not include hard cutoffs to separate the soft and collinear momentum regions, this multipole expansion is necessary in order to avoid double counting of certain momentum configurations. If, as in [13], the multipole expansion is only performed at the Lagrangian level, but not for the measurement function which defines the observable, then soft-collinear mixing terms arise in the overlap of the soft and collinear regions. However, we have shown that they cancel against the contribution from the overlap region. Our analysis thus reinforces the validity of the factorization theorem proposed in [6].
We have extended the phenomenological analysis of the Higgs-boson production cross section with a jet veto to N 3 LL p accuracy, where the subscript "p" stands for "partial" and indicates that two perturbative coefficients -the four-loop cusp anomalous dimension and the three-loop coefficient of the collinear anomaly -are currently still missing to complete the resummation at this order. The motivation for going beyond NNLL order is that the two-loop anomaly coefficient d veto 2 (R) turns out to be numerically quite large. For example, at R = 0.4 it is almost six times larger than the corresponding coefficient arising in the resummation formula for the transverse-momentum spectrum of the Higgs boson. This gives rise to a significant scale uncertainty of the NNLL result. Furthermore, for small values of the jet-radius parameter R, the scale variation cannot be trusted as an estimator of the effect of higher-order corrections. This is because the three-loop coefficient d veto 3 (R) is enhanced by two powers of ln R, while the scale-dependent pieces at two-loop order involve at most a single logarithm of R. In view of this fact, we have estimated the impact of the three-loop coefficient and find that it is relatively small, even with a generous estimate for the coefficient of the double-logarithmic term. The numerical impact of the other missing ingredient, the four-loop cusp anomalous dimension, is negligibly small. The most important contribution at N 3 LL order arises from the two-loop beam functions, which we have extracted numerically using the fixed-order code HNNLO. As for the two-loop anomaly coefficient, we find that the two-loop perturbative corrections to the beam functions are rather large, especially in the region of small R, where they are logarithmically enhanced. For R < 0.8, the two-loop corrections are larger than estimated by the one-loop scale variation, and also the scale uncertainty of the two-loop beam functions is larger than at one-loop order. It constitutes the main uncertainty of our final results for the cross section. On the other hand, since the leading ln R-dependent corrections to both the exponent of the collinear anomaly and the beam functions are included in our N 3 LL p +NNLO results, we expect that yet higher-order corrections would turn out to be small. Nevertheless, it would be interesting and important to compute the ln R-enhanced terms at three-loop order (and perhaps even beyond). We believe that it should be possible to calculate the leading logarithmic part of the three-loop anomaly coefficient d veto 3 (R), which would reduce the uncertainty of our predictions especially at low values of R.
In addition to performing the resummation of the jet-veto cross section at leading power in p veto T /m H , we have matched our resummed results to the full fixed-order expression for the cross section computed at NNLO, which has allowed us to also include power-suppressed terms. The size of these power corrections serves as an important check of the quality of the expansion in the small ratio p veto T /m H , which is the basis for our factorization formula. Numerically, the power corrections turn out to be quite small for the relevant values of p veto T . Importantly, we find no evidence for a growth of the power corrections with increasing R. A crucial element of our analysis is that we have separated different sources of theoretical uncertainties. This avoids accidental cancellations and furthermore allows us to investigate the uncertainty associated with each source individually. In our final results, we have added in quadrature the uncertainties in the hard function (which contains the resummation of all large logarithms), the beam functions, and the power corrections to the cross section.
Our results significantly improve the accuracy of the Higgs-boson production cross section with a jet veto. Detailed numerical predictions can be found in Table 2. It would be interesting to perform the resummation of large logarithms also for the W + W − production cross section, which is the main background to the H → W + W − signal in the presence of a jet veto. The W + W − cross section is furthermore used to search for anomalous gauge couplings, and also in this case it is necessary to impose a jet veto. Likewise, Sudakov logarithms should also be resummed for the cross sections for Higgs production in association with one or two tagged jets. The resummation is more challenging in the higher jet bins, due to the presence of nonglobal logarithms. However, a recent study at NLL order suggests that the numerical effect of the non-global logarithms is likely to be small [52,53].
we first perform the integral over t and then integrate over y. In this way, we obtain where δI i (R 2 ) is a power series in R 2 , and the constant term will be irrelevant for our discussion. The quantities ∆A i denote the reduced matrix elements A i in (A.3) with their leading singularities (for ∆φ, ∆y → 0) subtracted. The first derivative ∂ R 2 yields a δ-distribution, ∂ R 2 θ(r 2 − R 2 ) = −δ(r 2 − R 2 ), and for R < π the radial integral then simply sets r = R in the integrand. It is then straightforward to show that