Performance of Low Mode Averaging on Twisted-Mass Fermion Ensembles at the physical pion mass point
Abstract
We study the performance of low-mode averaging (LMA) on twisted-mass fermion ensembles at near-physical quark masses, assessing both its theoretical framework and practical cost-effectiveness in modern lattice QCD. In particular, we present a numerical study of light-quark meson and baryon observables. For mesons, we analyse two-point functions, including the vector–vector correlator relevant for the hadronic vacuum polarisation contribution to the muon anomalous magnetic moment, comparing two implementations of LMA: an exact approach based on explicit low modes and an approximate, high-statistics variant using multigrid techniques. For baryons, we restrict to the exact approach and study both two- and three-point functions, quantifying the resulting noise and cost reductions at large Euclidean times. In addition, we compute the eigenvalue density of the massless Wilson operator and determine the renormalised chiral condensate via the Banks–Casher relation, obtaining for isospin-symmetric QCD at a scale in the scheme, with an uncertainty dominated by the chiral extrapolation. Additionally, from the pion-mass dependence of , we extract the scale-independent low-energy constant .
I Introduction
As computational techniques, the inclusion of more complex observables and higher precision in lattice QCD continue to advance, reducing statistical uncertainties has become essential for meaningful comparisons with experimental data. Computing correlation functions at large Euclidean time is central to these efforts, both for isolating hadronic ground states, such as in nucleon matrix elements [66, 17, 40, 1, 12, 53, 37, 3, 39, 65, 6], and for applications of spectral reconstruction methods [8, 41, 60, 9, 33], including those used to determine the long-distance hadronic vacuum polarization contribution to the muon anomalous magnetic moment [25, 38, 23, 20, 16, 64]. In both settings, state-of-the-art simulations at physical quark masses demand reliable control of correlators at source–sink time separations exceeding 2 fm or even 3 fm. The exponential degradation of the signal-to-noise ratio at such Euclidean time separations, in particular for baryon observables, renders the long-distance regime one of the most critical and computationally demanding aspects of modern lattice calculations. A key strategy to address this challenge is low-mode averaging (LMA), which targets the infrared (IR) sector of the Dirac spectrum governing long-distance physics and enhances its precision by explicitly isolating its contribution.
Low-mode averaging has a long history in lattice QCD. Its first applications in the early 2000s [61, 49, 34, 35, 51] identified it as a promising variance-reduction technique, although its impact was then limited by the fact that lattice QCD simulations used relatively small lattice volumes and unphysical quark masses. Subsequently, LMA was combined with the truncated solver method (TSM) [18] to form all-mode averaging (AMA) [22, 63], thanks to their complementarity: LMA improves the IR sector of the Dirac spectrum, while TSM efficiently suppresses fluctuations originating from ultraviolet (UV) modes. Owing to its conceptual simplicity and ease of implementation, TSM without LMA quickly became widely adopted. It is important to note, however, that TSM alone predominantly improves short-distance observables and has little impact at large Euclidean time separations, as its action is largely confined to the UV sector [20].
The landscape shifted markedly with the advent of simulations performed near the physical pion mass, hereafter referred to as the physical point. In this regime, as the light quark mass is reduced, the condition number of the Dirac operator grows rapidly, rendering inversions with standard Krylov solvers prohibitively expensive. This development revived interest in LMA and, more broadly, in deflation techniques [59, 56]. These methods also exploit low-lying eigenmodes to accelerate solver convergence [58]; however, their substantial setup costs—due to the computation of exact eigenvectors, which scales poorly with the volume—limit their applicability. A major turning point was the successful deployment of multigrid solvers for Wilson-type fermions [26, 13, 46, 44], including variants for twisted mass [4, 5], overlap [27], domain-walls [28], and staggered [29] fermions. Multigrid algorithms feature substantially cheaper setup costs and can reduce inversion times by up to two orders of magnitude compared to standard Krylov methods. This breakthrough proved transformative for simulations at physical quark masses [15, 42, 7], effectively displacing exact-deflation solvers and TSM techniques. With Dirac operator inversions no longer the dominant component of the overall computational cost, the benefits of loose stopping criteria within TSM largely vanished. Moreover, TSM does not integrate well with the MG setup, as the solve time depends highly nonlinearly on the target residual.
The successful deployment of multigrid algorithms brings back the focus on LMA and motivates the present work. In the current computational regime, characterised by light quark masses, affordable inversions, and a growing demand for precision at long Euclidean time separations, LMA alone re-emerges as a particularly well-suited noise-reduction technique. By directly targeting the IR modes responsible for long-distance physics, it is especially relevant for modern high-precision lattice QCD calculations. In this work, we consider two alternative realisations of LMA: The first approach is the original formulation based on exact all-to-all treatments obtained by explicitly computing a subset of low-lying eigenvectors of the Dirac operator; this approach allows the IR contribution to correlation functions to be evaluated exactly on each gauge configuration, fully saturating the associated statistical fluctuations. We refer to this method as exact LMA (exLMA). The second approach, explored more recently, employs approximate but high-statistics implementations based on multigrid techniques [47, 52, 45], which we denote as multigrid LMA (mgLMA). The aim of this paper is to present and discuss the optimisation and performance of these low-mode averaging strategies using modern physical-point twisted mass ensembles, drawing on our practical experience.
The remainder of this work is organised as follows. In section II, we introduce the LMA procedure and discuss a key feature, namely, in commonly used correlation-function setups, LMA improves only the IR contribution, leaving the UV sector and mixed IR–UV terms unaffected. This has important implications for both performance and optimisation. In section III, we introduce the twisted-mass operator and its properties, and discuss the expected scaling of LMA-based approaches, identifying the physical volume and the quark mass as the dominant dependences. Combining these observations with a conceptual understanding of LMA, we argue that LMA is effective primarily on observables involving only light-quark propagators. Numerical results follow: In section IV, we present results for exLMA applied to quark-bilinear correlation functions on several physical-point ensembles, in section V, we present the corresponding results obtained with mgLMA, which should become advantageous for large physical volumes, and in section VI, we present results for baryon correlation functions using exLMA, quantifying the improvements achieved in both two- and three-point functions and highlighting the observable-dependent nature of the overall gain. In section VII, we summarise and conclude.
II LMA improvement of IR part in common applications
Let be a projector onto an infrared (IR) subspace of the Dirac operator , and define the complementary ultraviolet (UV) projector as . This induces the decomposition of the all-to-all propagator as
| (1) |
The role of is to enable an exact evaluation of the IR contribution , while the remaining UV component is estimated stochastically. To this end, we introduce a set of stochastic sources and define
| (2) |
In practical implementations, the resolution of the identity holds within the desired subset of lattice sites and/or spin/colour indices via dilution of the stochastic source.
An approximate all-to-all propagator is then defined as
| (3) |
which corresponds to the standard exact low-mode averaging (exLMA) construction [49]. In this approach, the IR contribution is known exactly all-to-all, whereas the UV part is entirely determined by the stochastic estimator, to be compared with the standard stochastic approach, where the full propagator is computed stochastically on the same set of sources
| (4) |
Using the definitions above, an alternative but numerically identical approach to compute the LMA propagator can be defined as
| (5) |
Both here and in eq. 3, is computed exactly; however, in eq. 3 the stochastic source is first projected onto the UV subspace and then propagated, whereas in eq. 5 the full stochastic propagator is computed and the stochastic IR contribution is subsequently subtracted.
The procedural distinction between these approaches becomes more pronounced at the level of correlation functions. Consider a generic correlation function , obtained by contracting and reducing several propagators. We define five realisations of using the propagators introduced above:
-
The reference correlation function obtained by contracting standard stochastic propagators .
- and
-
The same correlation function, whereas the stochastic propagators are restricted to their IR or UV parts, or , with contractions performed identically to .
-
A conceptually and computationally distinct quantity. Here, the exact IR propagator is employed to construct the full all-to-all correlation function, eliminating stochastic noise and any restriction to a lattice subset, such that the variance is solely due to gauge fluctuations.
-
The correlation function constructed from as defined in eq. 3. Upon contraction, it generates terms, where is the number of quark propagators involved in the contraction, two of which correspond to the decomposition of into two pieces, and . Therefore, among the resulting terms in the correlator, only two correspond to the purely IR and purely UV contributions, and , defined above; the remaining terms mix IR and UV components and need to be computed individually.
A central statement of this work is that, although contains several mixed IR–UV contributions in which exact and stochastic components appear simultaneously, in commonly used computational setups LMA improves only the purely IR contribution . All remaining terms continue to be limited by stochastic sources, even when the IR contribution is treated exactly in every term in which it appears. Specifically, we show that the following equality holds.
The proof is presented in appendix A. For disconnected loops, the relation in eq. 6 follows directly from eq. 5, and we will demonstrate it explicitly for spin-diluted quark-bilinear (meson) two-point functions and point-source baryon two- and three-point functions. In these cases, the specific properties of the stochastic sources employed are used to establish the relation exactly. The equality in eq. 6 is highly non-trivial, and we discuss its practical implications in what follows.
II.1 Discussion on eq. 6
A direct contraction of propagators built from as in eq. 3 produces purely IR and purely UV contributions, along with several mixed IR–UV terms. In practice, this typically requires dedicated contraction kernels and considerable implementation effort. By contrast, the right-hand side (r.h.s) of eq. 6 allows for a much simpler procedure: compute the full stochastic correlator, subtract its stochastic IR component, and replace it with an improved—ideally exact—estimate. Moreover, even in lattice setups where the two sides of eq. 6 are not identical, the construction in the r.h.s. still provides a valid and more efficient realisation of LMA, with double counting of the IR contribution properly removed.
We note that both approaches, left- and right-hand sides of eq. 6, have been employed in the literature. In fact, the formulation based on the full dates back to the original LMA works [49], while the second strategy was introduced in the context of combining LMA with AMA [22], as it naturally enables independent improvement of the IR contribution via LMA and the UV contribution via TSM. To our knowledge, however, it has not been emphasized that, in certain commonly used computational setups, the two procedures are in fact exactly equivalent111Actually, in parts of the literature, the full correlation function is sometimes regarded as superior because the mixed terms are assumed to yield additional improvement since their IR part is computed exactly. This may indeed happen in setups where eq. 6 does not hold, e.g. as discussed in Ref. [51] for baryons computed with stochastic sources. For the cases considered here, however, we demonstrate that the two constructions are identical.. Recognising this equivalence or, more generally, adopting the second strategy in place of the first, has several important consequences:
- Noise reduction
-
When eq. 6 holds, it makes explicit that LMA improves only the purely IR contribution . All UV contributions, as well as any IR–UV mixing, are governed by stochastic noise. This observation is crucial for identifying the class of observables, Euclidean-time regions, and quark-mass regimes in which LMA can deliver significant variance reduction and meaningful performance gains. In particular, noise reduction is expected only where dominates the correlator. In general, this is not straightforward to identify, as the extent of this region depends sensitively on the number of deflated modes, as we will demonstrate with our numerical results.
- Reduced contraction costs
-
Evaluating the r.h.s. of eq. 6 is considerably easier and more efficient than computing through explicitly expanding it in terms of the IR and UV contractions. The correlators and are expressed in terms of ordinary stochastic contractions differing only in the contracted propagator, while is the only genuinely new object. Effectively, the IR-UV contraction terms—with being the number of propagators involved in the contraction—are replaced by only three, since all interference contributions are avoided.
- Post-production LMA
-
If the stochastic sources are reproducible—i.e., the same exact stochastic source can be generated in independent runs by e.g. storing the point-source coordinates or the random seeds—then stochastic correlators can be improved a posteriori. Suppose has already been computed using the full stochastic propagator, and one subsequently wishes to apply LMA. Then and can be evaluated independently, without recomputing the full propagator, but using only the exact IR information.
- Optimised production
-
The separation implied by eq. 6 naturally suggests an optimised computational strategy, particularly at physical quark masses. The evaluation of requires Dirac matrix inversions, for which modern multigrid solvers are optimal but exhibit poor strong scaling, and are therefore best run on as few nodes as possible. By contrast, the computation of the IR sector involves a large number of exact eigenvectors; while memory-intensive, it scales efficiently with the number of nodes. The correlators and can thus be computed in a separate stage on large node counts, decoupled from the full Dirac operator inversions, provided that the stochastic sources are reproducible. On the other hand, in the direct construction, the UV projection forces inversions and IR modes to be treated simultaneously within a single workflow, often requiring execution on more nodes than optimal due to the increased memory demands.
III LMA Dependence on Quark Mass and Physical Volume
Another important aspect of LMA concerns its computational cost and, in particular, its scaling with the quark mass and the physical volume. The scaling of the Dirac spectrum with the volume is well understood: in the IR region, the number of modes scales with the physical volume , rather than with the number of lattice points [19, 55, 59, 49, 50]. By contrast, the scaling with the quark mass is less well understood and may depend on the discretisation of the Dirac operator used in the analysis. In several works [61, 49, 34, 51], it has been observed that the computational gain from using LMA decreases as the quark mass increases, although this has typically been reported as an empirical finding rather than derived from theoretical considerations. Here, we show that the twisted-mass fermion discretisation provides a particularly transparent framework to understand this mechanism and yields a simple theoretical explanation. In particular, we demonstrate that maintaining constant the LMA induced noise reduction as the quark mass increases requires proportionally increasing the number of deflated modes.
III.1 Twisted Mass fermions and eigen-decomposition
We first briefly summarise the key properties of Wilson twisted-mass fermions that are relevant for discussing their spectrum and deriving the mass dependence of LMA. In the twisted basis, i.e., the one in which the mass parameter couples to a pseudoscalar density, the maximally-twisted Wilson–Dirac operator with a quark mass reads as222For simplicity in the following discussion, we set the Wilson parameter to ; the case is completely equivalent.
| (7) |
where is the massless -Hermitian Wilson–Dirac operator, consisting of the hopping term and a diagonal mass term tuned to its critical value, i.e. to vanishing residual mass, with an optional clover term included. The associated Hermitian operator
| (8) |
has eigenpairs satisfying
| (9) |
It follows that the equivalent twisted-mass Dirac operator reads as
| (10) |
and since the mass term enters multiplied by the identity operator, it shares the same eigenvectors of , while its eigenvalues are shifted into the complex plane by ,
| (11) |
These vectors are also eigenmodes of the positive-definite squared operator,
| (12) |
The above relation highlights a central property of the twisted-mass operator, namely, the parameter provides an infrared cutoff that prevents the appearance of near-zero modes [43]. As a consequence, at non-zero mass, the squared twisted operator is always positive definite, with condition number , and its inverse is well defined.
Another important property of the twisted mass operator is that its spectrum is independent of the value of , being entirely determined by the massless Hermitian Wilson operator . As a result, the cost of the eigensolver does not depend on the quark mass, and the same set of deflated modes can be employed for any value of the twisted mass parameter . This contrasts with, e.g., the Wilson operator and other discretisations, where the eigenvectors must be recomputed for each quark mass. The inverse of the twisted mass Dirac operator then follows directly from the spectral decomposition of as
| (13) |
where the modes are ordered by magnitude and the IR propagator, , is obtained by restricting the sum to the first modes.
III.2 Spectral density
We now examine the spectral densities of the massless Wilson Dirac operator, as well as those of the twisted-mass operator at the light quark mass. It has been shown in Refs. [19, 55, 59, 49, 50] that, on sufficiently large-volume ensembles—i.e., where the spectrum is dense and finite-volume effects are negligible—the spectral density of the massless Wilson Dirac operator, , integrated over an interval with in the IR region, is proportional to both the interval width and the physical volume . Consequently, once these factors are factored out, the normalised density is approximately constant, namely
| (14) |
where is physical volumes, is a constant related to the chiral condensate in the chiral limit [19], see section III.4, and is a smooth function, providing subleading corrections across physically relevant IR eigenvalue intervals.
We confirm this behaviour using six near-physical-pion-mass ensembles generated by the Extended Twisted Mass Collaboration (ETMC) at four different lattice spacings and three different physical volumes (see Ref. [64]). Our results are summarised in table 1 and shown in fig. 1. In the left panel, we show the dependence of the normalised density as a function of the eigenvalue norm . For our ensembles, we determine by performing a linear fit of in the interval with the simulated light quark mass and . We observe that its values have a mild dependence on the lattice spacing and are approximately constant with respect to the volume.
The number of eigenvalues up to a threshold is then given by
| (15) |
where, in the last (approximated) step of eq. 15, we neglected the very mild higher-order dependence in of the spectral density. We show this expectation in fig. 2 for the aforementioned ETMC ensembles. Due to the scarcity of modes in the near-zero region, the r.h.s. of eq. 15 does not adequately describe the measured at small . In fig. 2, we then depict
| (16) |
where defines the lower bound of the fit region of . The bottom panel shows the relative measured–predicted deviation, which remains within in the low-lying part of the spectrum.
| 48 | [MeV] | [MeV] | ||||||||
| B48 | 0.07967 | 3.82 | 7.25 | 0.00072 | 3.7 | 276 | 267 | 13 | 14 | |
| B64 | 0.07967 | 5.10 | 7.25 | 0.00072 | 3.686(17) | 275.51(82) | 266.6(3.2) | 44 | 45 | |
| B96 | 0.07967 | 7.65 | 7.25 | 0.00072 | 3.7 | 276 | 267 | 232 | 229 | |
| C80 | 0.06799 | 5.44 | 6.94 | 0.00060 | 3.658(18) | 275.94(60) | 267.4(3.1) | 55 | 56 | |
| D96 | 0.05686 | 5.46 | 7.34 | 0.00054 | 3.632(20) | 277.08(59) | 268.0(3.3) | 62 | 61 | |
| E112 | 0.04883 | 5.47 | 6.89 | 0.00044 | 3.608(16) | 277.03(49) | 268.5(3.0) | 58 | 58 | |
| Continuum limit | 278.16(77) | 269.5(4.5) | ||||||||
Given the smoothness of the massless Wilson operator, the density of the spectrum of the twisted operator follows immediately. Wilson eigenvalues with are mapped into the region
| (17) |
producing an accumulation of modes in the interval . For eigenvalues much larger than , the spectrum resumes its approximately linear growth in . This pattern is illustrated in the right panel of fig. 1 at the physical light-quark mass for each ensemble given in Table 1. This leads directly to the central observation of this section. As the quark mass increases, all Wilson modes with are compressed into a narrow region of the twisted-mass spectrum and therefore contribute uniformly to its infrared part. In table 1, we report the measured and predicted value for according to
| (18) |
Indeed, given the above considerations, the integrated density of eigenvalues in of eq. 18 grows linearly with .
III.3 Discussion on eq. 18
The previous analysis of the twisted mass operator shows that to keep constant LMA performances, i.e. to deflate exactly a given region of the spectral density of the operator, the cost grows linearly with the quark mass and with the volume. This occurs because all infrared modes contribute comparably due to their similar eigenvalue magnitude induced by the mass shift and increased density with the volume. This scaling can be expressed as
| (19) |
Although this argument is derived for the twisted mass operator, where the mass shift explicitly accumulates eigenvalues, the conclusion is not specific to this regularisation. For instance, in the Wilson case, the eigenvalues are not similarly clustered but still populate the region near the mass shift threshold [59]. Since different lattice discretisations describe the same infrared physics, the density of modes relevant for low-energy observables must exhibit the same parametric dependence. Therefore, it is natural to expect that the linear scaling with the multiplicatively renormalisable quark-mass parameter (here ) and the spacetime volume holds more generally across fermion formulations. This leads to several important consequences as discussed below.
- LMA computational cost and memory usage grow as
-
Given the scaling of and considering that the cost of lattice operations scales at least with the number of lattice points , the computational cost and memory footprint of exact LMA scale as . This scaling renders the method rapidly impractical on large volumes due to both larger resource requirements and poor parallel scaling of algorithms, in particular, multigrid solvers. The LMA further deteriorates as the quark mass increases. In particular, if modes are required to achieve a given performance at the light-quark mass , maintaining the same efficiency would require roughly times more modes at the strange mass and times more at the charm mass. Given that already at the physical light-quark mass on a volume several hundred modes are needed for a significant gain, extending exLMA to heavier quarks quickly becomes computationally prohibitive. These considerations imply that LMA is most effective at light-quark masses, where the number of required modes remains manageable, and the balance between cost and variance reduction is most favourable.
- Contraction costs for all-to-all correlation functions scale with
-
Contraction costs can also become rapidly prohibitive with the number of propagators, , involved. While exact eigenvectors enable the construction of all-to-all correlation functions, the associated cost typically scales as , leading to a steep increase already for a moderate number of propagators—see e.g. fig. 9 for baryon contractions. As our results will demonstrate, this scaling effectively renders all-to-all baryon correlators () already impractical for light-quark masses and moderate volumes, and, as volume or quark mass increase, fully exact all-to-all correlation functions become generically unfeasible, except for simpler observables, such as single-quark loops with . In practice, this limitation can be alleviated by replacing all-to-all contractions with stochastic estimators, since the signal typically saturates at moderate statistics. Other possible strategies not considered here include sparsening [36, 57, 30].
- Gain of LMA versus stochastic approaches decreases with
-
Assuming that a stochastic propagator samples the Dirac spectrum approximately uniformly, then the fraction of infrared modes it captures increases with the quark mass, since the IR region relevant for long-distance physics grows proportionally to for a given spectrum. Stochastic estimators, therefore, become progressively more efficient at sampling the physically relevant modes as the mass increases. At the same time, the cost of LMA rises linearly with , as the number of eigenmodes that must be treated explicitly increases accordingly. Thus, the two approaches scale in opposite manners, with LMA becoming more expensive, while stochastic estimators improve in quality. As a result, the relative advantage of LMA over stochastic methods rapidly decreases with increasing quark mass, scaling approximately as . Consequently, for sufficiently heavy quarks, stochastic approaches become markedly more efficient than LMA.
- LMA performance is limited by the heaviest quark mass in the correlation function
-
The previous remarks apply to correlators with a single quark mass. A more subtle situation arises for mixed correlators involving both light and heavy quarks, with masses and , such as the kaon two-point function. Suppose one deflates modes that are determined to be sufficient for achieving a good performance at , and does not scale this number for the heavier propagator, i.e., instead of the larger value one still uses —this is particularly advantageous for the twisted mass discretization, where the same eigenvectors can be used for all quark masses, see eq. 13. Then the corresponding propagators can be written as
(20) where the infrared and ultraviolet parts are separated and labelled according to whether they are treated exactly (ex) or stochastically (). For the heavy propagator, the full infrared region of size is split into an exact part for the first modes, , and a stochastic remainder, .
Considering then the infrared contribution to a bilinear correlator, one obtains
(21) where, in the first equality, the IR contraction is split into two contributions (one purely exact and one mixed); and then, in the second equality, the mixed exact–stochastic term is effectively rendered stochastic since we assume eq. 6 to hold. Consequently, only the light–light part of the correlator benefits from LMA, while the remaining contribution does not. Namely, using eq. 6, the full correlator can be written as
(22) making explicit that only the fully exact part is improved. This means that if all eigenvalues contribute approximately equally, the improved fraction of the correlator scales as , since we expect modes to contribute in , but only have been deflated in . For example, for a kaon at physical quark masses, this yields a suppression of order of the noise reduction, significantly reducing the overall effects of LMA.
The above considerations strongly indicate that, in modern lattice QCD applications, LMA is beneficial primarily for correlation functions composed exclusively of light-quark propagators, which thus define the focus of this work.
III.4 On the Banks–Casher relation and the chiral condensate
As a by-product of the analysis presented in section III.2, the extracted values of enable a determination of the renormalised chiral condensate via the Banks–Casher relation [19]. The chiral condensate is related to the disconnected contribution of the scalar insertion, which can be written as
| (23) |
where denotes the properly subtracted (mixing with and removed) scalar light-quark density.
In the infinite-volume limit, the sum can be replaced by an integral. Assuming a constant spectral density up to a cutoff , one obtains
| (24) |
where we used
| (25) |
The renormalised chiral condensate is then defined as
| (26) |
where denotes the spectral density in the chiral limit, i.e. for vanishing sea and valence quark masses. Owing to the relation between the spectral density of the twisted-mass operator and that of the massless Wilson operator, the extracted are already in the valence chiral limit, but correspond to a finite sea quark mass, MeV.
Taking the continuum limit of and comparing with independent determinations of , we find values in the expected range, see fig. 3, albeit in mild tension with our earlier result from ETMC21 [2]. In that work, was extracted using chiral perturbation theory relations for the pion mass across multiple ensembles, including heavier pion masses and a controlled extrapolation to the chiral limit. The observed discrepancy can therefore be attributed to residual sea-quark mass effects in the present determination, since other lattice errors appear to be well under control, including finite size corrections. Indeed, by comparing three significantly different volumes for the B ensemble yields consistent spectral densities, see table 1.
To quantify these residual sea-quark mass effects, one may employ the chiral expansion of the subtracted quark scalar operator vacuum expectation value [48],
| (27) |
where [12] is the next-to-leading-order low-energy constant governing the quark-mass dependence of , while denotes a less constrained contact term entering scalar correlation functions. Owing to the relation between the scalar insertion and the spectral density via the trace of modes, this expression directly translates into
| (28) |
This relation suggests two possible strategies. One may either adopt an external input for to correct for pion-mass effects in the spectral density, or, alternatively, determine directly from our spectral-density data combined with an independent continuum limit determination of from ETMC21 [2].
Regarding the first approach, a prediction for in three-flavour chiral perturbation theory at is available in Ref. [10], determined in the large limit. Converting this to the scale-independent [48], and assigning a conservative uncertainty of approximately [21], one obtains . The corresponding value, represented as a blue dot in fig. 3, can then be used to correct the chiral condensate for chiral effects. The resulting continuum-limit result is
| (29) |
as shown in fig. 3 with very good agreement with our previous ETMC21 determination.
Following the second strategy instead, by comparing to ETMC21 to determine the chiral dependence, we find
| (30) |
IV Results on Exact LMA for Quark Bilinear Correlation Functions
In this section, we present results for two-point bilinear correlation functions computed on four physical-point ETMC ensembles, the parameters of which are listed in table 1 and the statistics used are reported in table 2. These correlators were originally generated to study the long-distance contribution to the muon anomalous magnetic moment [16, 64], but were constructed for generic Dirac structures. This setup allows us to report more broadly on the achieved noise reduction across a range of physically relevant correlation functions.
| Ensemble | |||
|---|---|---|---|
| B64 | 1300 | 400 | 1024 |
| C80 | 590 | 530 | 1600 |
| D96 | 570 | 530 | 960 |
| E112 | 670 | 530 | 1344 |
IV.1 LMA of quark-bilinear correlation function
We consider a generic meson two-point correlation function, computed using a backwards-running propagator via -hermiticity, 333In the case of TM fermions, one has , where is the Wilson parameter. In this section, we adopt the canonical quark basis, where the -quark mass term is and the -quark reads . , defined as
| (31) |
where denotes the spacetime coordinates. The operator acts on the lattice coordinates, projecting the propagator onto the timeslice and momentum . While the two propagators may in general have different masses, here we restrict to light quarks, where the benefits from LMA are greatest as discussed in the previous section. For the twisted-mass operator, the propagators can be generated with either equal or opposite signs of the Wilson parameter multiplying the light-quark twisted mass , hence in eq. 7, corresponding either to twisted-mass (TM) or Osterwalder–Seiler (OS) regularisations. Both setups are tested, yielding comparable improvements. Therefore, results are presented for TM or OS regularisation, as specified below.
Regarding momentum boosts, we restrict to the zero-boost case, , since non-zero momenta would require dedicated inversions on momentum stochastic sources. Concerning the Dirac structures, several choices vanish at zero momentum, in particular those with unmatched spatial Dirac indices. Also, channels with pseudoscalar or scalar structures ( or ) rapidly saturate the stochastic noise, and the correlation functions are subject only to gauge noise. In these cases, no significant improvement from LMA is observed, as the correlator on a single configuration is already saturated with a small number of stochastic sources. In the following, we therefore restrict to non-vanishing vector, axial, and tensor structures, where a significant noise-to-signal ratio is observed.
The correlation functions have been computed either stochastically or via all-to-all methods exploiting the exact low-mode eigenvectors, as follows from eq. 6. For the stochastic part, we employ momentum-projected stochastic sources with time and spin dilution,
| (32) |
where are spin indices, and colour indices. Subscripts label independent sources, each requiring a separate inversion. In practice, we invert over all four Dirac components , while is varied together with a new stochastic sample , and we take . These sources allow us to approximate the term
| (33) |
in eq. 31 leading to
| (34) |
which corresponds to an inner product of the propagated stochastic sources . The stochastic IR contribution is obtained analogously as
| (35) |
where is exactly the same source, but propagated using only the IR part of the spectrum.
The remaining ingredient is the exact all-to-all IR contribution. Explicitly inserting the eigenmode decomposition of given in eq. 13, in the correlation function yields
| (36) |
From a computational perspective, it is more efficient to evaluate inner products of eigenvectors, obtaining
| (37) |
requiring only independent contractions to be computed by exploiting
| (38) |
and then assembling the all-to-all IR correlation functions during post-processing.
IV.2 Error reduction with the number of modes
The most critical aspect in optimising LMA performance is determining how many modes to deflate. As discussed in section III.2, for the twisted mass operator, one needs at least as many modes as those within the region , since all eigenvalues in this interval are of comparable magnitude. For the B64 ensemble, which is used for the testing in this section, this corresponds to roughly 45 modes at the simulated light-quark mass, as reported in table 1.
Results for different numbers of deflated modes are shown in fig. 4 for a subset of about 50 configurations of the B64 ensemble, where the noise reduction is quantified as
| (39) |
i.e., as the ratio between the statistical error of the fully stochastic correlator and that of the LMA-improved one, cf. eq. 6. We observe that 100 eigenvectors, namely, more than twice the minimal estimate, are still insufficient to yield any noticeable improvement. A clear noise reduction appears only for significantly larger numbers of modes, predominantly at large Euclidean time separations. The error saturates if one uses around 400–500 eigenvectors for most correlation functions, as shown in fig. 4. In particular, this is true for the vector–vector correlator that enters the study of the muon g-2. This motivates the choice of 400 modes used for the B64 ensemble.
Scaling with the volume then leads to about 530 modes for the C80, D96, and E112 ensembles, which have a larger linear extent, corresponding to a increase in volume. Results for all ensembles are depicted in fig. 6. Several remarks are in order here:
- Dependence on Euclidean time and number of modes
-
The observed behaviour as a function of the number of modes and Euclidean time can be understood by examining the different contributions entering in eq. 6. This is illustrated in the left panel of fig. 5, where, for the two-point vector-vector correlator, we show the fully stochastic correlator , its IR part , their difference (the residual),
(40) and the all-to-all IR contribution . At sufficiently large Euclidean time separations, a cross-over between the IR and residual contributions is observed, around 1.5 fm for this case, indicating that beyond this point the full correlator becomes dominated by the IR part. The location and even the existence of this cross-over depends on the number of deflated modes. With too few modes, the IR part lacks sufficient information to describe the ground state and never dominates. With enough modes, the asymptotic state is fully captured by the IR contribution, and as more modes are included, the cross-over shifts to shorter Euclidean times as excited states are progressively incorporated. Based on the results shown in fig. 4 for the B64 ensemble, we conclude that around 400-500 modes are necessary to fully describe the ground-state and reach maximal gain at large Euclidean time.
- Asymptotic noise reduction at large Euclidean time
-
The asymptotic gain at large Euclidean time is determined by how much more precisely the IR contribution is computed compared to its stochastic estimate. In the present setup, where all-to-all correlators are used, is known exactly up to gauge noise. By contrast, the stochastic estimate is dominated by the stochastic noise and its error depends on the number of noise sources. In the regime where errors of the stochastic estimate scale ideally, one expects
(41) so that the gain decreases as the number of stochastic sources is increased. Consequently, the number of stochastic sources should be chosen as small as possible while still ensuring that the residual contribution is under control, thereby maximising the gain in the IR-dominated region while controlling the stochastic noise in the UV region. This observation also explains the difference in the asymptotic gain between fig. 4 and fig. 6: in the former, a smaller number of stochastic sources was used, yielding an error reduction of about 4.5, whereas the final production, performed according to table 2, results in an error reduction of about 3.5 for the vector-vector case.
- Dependence on the Dirac structure
-
We observe that all correlation functions affected by a significant noise-to-signal ratio exhibit very similar noise-reduction properties and, in particular, the vector–vector, axial–axial, and tensor–tensor channels. Exceptions are correlation functions built from scalar, pseudoscalar, and the temporal components of the vector and axial currents. As discussed above, correlation functions involving scalar or pseudoscalar operators saturate very rapidly with stochastic noise, eliminating any gain from the LMA approach. For and , we do observe an improvement, although it is less pronounced than for the spatial components and does not saturate at large Euclidean time (see 6). This suggests that a larger number of modes would be required to fully capture the correlator. Finally, exhibits a distinctive behaviour, namely the improvement decreases significantly as the continuum limit is approached. Our interpretation is that the stochastic noise is decreasing towards the continuum limit, thereby diminishing the relative gain from LMA. This is consistent with the fact that the asymptotically has the pseudoscalar pion as its ground state; as the continuum limit is approached, this correlator increasingly resembles a pseudoscalar correlation function, for which no signal-to-noise problem is present.
V Results on Multigrid LMA for Quark Bilinear Correlation Functions
We now move to results produced using multigrid LMA (mgLMA), which belongs to the class of inexact LMA approaches [59]. Here, the setup cost is reduced by avoiding the exact computation of eigenvectors, while exploiting local coherence to capture multiple low modes simultaneously. A successful inexact deflation strategy, therefore, offers clear advantages for LMA, as it targets both a reduction in setup cost and a decrease in the number of required vectors, with the additional prospect of improved scaling with the volume and the quark mass. In particular, algebraic multigrid (AMG) has emerged as one of the most effective realisations of inexact deflation and is nowadays widely used as a preconditioner in solvers for many fermion discretisations at the physical point [26, 13, 46, 4, 27, 28, 29]. Therefore, it is natural to expect that its success as a preconditioner can translate directly into an effective LMA strategy. However, as we will discuss in this section, our experience indicates that this connection is not straightforward and requires further investigation.
V.1 Multigrid operators and their properties
In algebraic multigrid, a coarse operator is defined in terms of prolongation and restriction operators as
| (42) |
where denotes the fine operator. In lattice QCD, is also referred to as the little Dirac operator [59]. While this construction generalises straightforwardly to multiple levels, here we restrict the discussion to the finest and coarsest levels only. Accordingly, denotes the operator on the coarsest grid, and and represent the composite restriction and prolongation operators mapping directly between the finest and coarsest grids.
To define multigrid-based projectors, one typically requires that and preserve the identity,
| (43) |
with the identity on the coarse space. This leads to the following projectors acting on the fine level:
| (44) |
The operator simply projects the identity onto the coarse subspace, while the remaining operators incorporate the coarse inverse and define suitable projections for the quark propagator. For instance, using one can decompose the propagator as
| (45) |
An equivalent representation follows from the other projectors, finding
| (46) |
The connection with exact deflation becomes explicit if and are chosen as the left and right IR eigenvector matrices of , respectively:
| (47) |
Thus, the multigrid propagator approaches the IR propagator to the extent that and accurately capture the low-mode subspace. When specialising this general AMG framework to lattice QCD, additional symmetry-preserving structures are typically imposed:
-
•
To preserve the sparsity of the Dirac operator, and act on space-time aggregates of the lattice, ensuring that retains a nearest-neighbor structure. This is also in line with the property of local coherence [59], namely, eigenvectors restricted to aggregates still maintain significant overlap with the low-mode subspace.
-
•
To preserve -hermiticity and related symmetries, and are constructed to act separately on left- and right-handed spin components, enforcing -compatibility:
(48) This allows one to relate left and right eigenspaces and consistently choose
(49) It also follows that -hermiticity is preserved at the coarse level,
(50) allowing the definition of a Hermitian coarse operator
(51) -
•
The -compatibility is particularly crucial for twisted-mass fermions [4], as it ensures that the twisted-mass term retains its linear form at all multigrid levels:
(52) where denotes the massless Wilson operator. The coarse operator, therefore, inherits the same structure as the fine twisted-mass operator, i.e., the twisted mass provides a low-mode cutoff, while the eigenvectors remain those of the massless Wilson operator. In practice, this allows a prolongation operator constructed at a given to be reused for both signs and arbitrary values of , which is advantageous when employing multigrid as a preconditioner for twisted-mass simulations and observables [4, 5, 15].
V.2 From multigrid preconditioning to multigrid LMA
The properties discussed above closely mirror those of exact LMA, but within a more general multigrid framework. It is important to identify, however, where the construction of multigrid preconditioning deviates from the construction of a multigrid-based LMA approach. In particular:
- Multigrid preconditioning is dynamic
-
A central feature of multigrid solvers for lattice QCD is their adaptivity. In a typical multigrid correction, the inverse of is computed only to very low accuracy, and this approximate correction is then combined with a smoother. Additionally, the solver dynamically cycles among levels, often via K-cycles, using nested Krylov methods to determine whether to recurse further to coarser grids or return to finer levels. This adaptivity optimises the computational effort by adjusting the coarse-level work according to the current condition number and convergence state. As a result, the effective multigrid correction operator is not fixed, but evolves during the solve. Such strategies necessitate combining the preconditioner with a flexible Krylov solver.
- LMA requires static operators
-
In contrast, LMA demands a strictly static construction where all sources of adaptivity must be removed so that the multigrid-based IR propagator defines a fixed operator. In particular, the inverse of must be computed to high (effectively exact) precision, rather than approximated. This is especially challenging for twisted-mass fermions, where the coarse operator is typically highly ill-conditioned; even achieving a relatively loose tolerance (e.g. a relative residual) can already require hundreds of iterations [4]. A practical resolution is to perform exact deflation on the coarsest grid, which is the strategy adopted here. We define
(53) so that the exact LMA construction is carried out on the coarsest grid. This significantly reduces the cost, both because of the smaller system size and because fewer eigenvectors are required, with the prolongation operator effectively lifting the coarse IR subspace to the fine level.
- Different hyper-parameters and tuning
-
As a consequence, the set of hyperparameters and their tuning strategy differs substantially from standard multigrid preconditioning. Since LMA excludes any dynamic components, only parameters entering the definition of the coarse operator remain relevant, i.e., the aggregation size between levels, the number of null vectors, and the number of IR modes treated exactly on the coarse grid, . As we will see, in practice, the optimal values of these parameters also differ significantly from the corresponding ones that are optimal in multigrid preconditioning. Additionally, the realisation of optimal multigrid LMA will differ significantly with the kind of fermion discretisation, in the same way multigrid preconditioning does for Wilson [26, 13, 46], twisted-mass [4], overlap [27], domain-walls [28], and staggered [29] fermions. Thus, each of these requires dedicated studies.
V.3 Multigrid LMA of quark-bilinear correlation functions
As given in eq. 53, our strategy combines multigrid operators with exact deflation to construct the propagator . A first consequence of this choice is that the identity in eq. 6 no longer holds. This is due to the fact that involves a nested projection, whereas the identity would apply to either the full or the exact . Nevertheless, eq. 6 can still be used as a guiding principle to define an improved estimator by embedding it in a control variates framework:
| (54) |
Here is chosen to minimise the variance of at each time slice. In this formulation, acts as a low-noise control observable, while provides a stochastic estimator with the same expectation value. Their difference therefore has vanishing mean but non-trivial correlation with , enabling a variance reduction without introducing bias: when is strongly correlated with , a substantial variance reduction can be achieved up to the limit of perfect correlation, where the stochastic estimators cancel. In a setup where eq. 6 holds, one expects . However, as shown in the right panel of fig. 7, this is not exactly true, and we find that the optimised deviates from unity by less than in the large time limit. This would lead to a further reduction in the statistical error, which is, though, only about and thus overall negligible.
In the case of mgLMA, since the stochastic source is coarsened before applying the IR component of the multigrid propagator, the coefficient is expected to reflect the reduced dimensionality. The resulting values of for the results presented below are shown in the left panel of fig. 7. Their relatively large magnitude is consistent with this expectation, while exhibiting a mild, non-trivial time dependence, as well as a weak dependence on the observable considered. Overall, the behaviour is smooth and quite reasonable.
As in exact LMA, the two additional correlation functions in eq. 54 are built from the deflated propagator. The correlator has the same contraction structure and stochastic sources as , but with the propagator replaced by , while is evaluated at higher statistics. Several strategies are possible for the latter. Ideally, one would compute an all-to-all correlator, but already for bilinear correlators, the all-to-all construction becomes demanding:
| (55) |
as it requires constructing the coarse operator for each choice of time, momentum, and Dirac structure.
A possible improvement is to design the prolongation operator to be compatible with quark bilinear structures, which is particularly natural at zero momentum. If does not aggregate in the time direction and is fully compatible with spin structures (i.e. -compatible), one obtains
| (56) |
In this case, one can follow the same strategy as in eq. 37 and write
| (57) |
While promising and computationally efficient for bilinear observables, this approach requires additional implementation effort and sacrifices generality, and is therefore not pursued here. Instead, we approximate through high-statistics sampling,
| (58) |
leveraging the efficient application of the coarse-level correction via QUDA together with optimised accumulated contractions.
V.4 Using QUDA for multigrid LMA
Let us now emphasise that the efficient testing and implementation of this multigrid LMA strategy is greatly facilitated by the use of the QUDA library [31, 14, 32]. QUDA provides key capabilities, including efficient setup and application of multigrid operators, as well as deflation of low modes on the coarse grid. In practice, the multigrid propagator is obtained by invoking the multigrid preconditioner with a specific choice of parameters: no smoother iterations are applied, and a single V-cycle is performed, corresponding to a direct traversal to the coarsest grid and back without intermediate iterations. On the coarsest level, a deflated solver is employed with zero iterations, such that only the infrared (IR) component of the operator is effectively applied. This yields an extremely efficient application of the preconditioner, requiring only a fraction of a second per propagator. For the setup phase, instead, standard strategies are used. An additional feature of QUDA exploited here is the use of the even–odd preconditioned operator, rather than the standard Dirac operator, at all levels of the multigrid hierarchy.
V.5 Results on the B96 ensemble
The motivation for employing multigrid LMA on large-volume ensembles stems from the unfavourable scaling of exact LMA. To illustrate this, we consider representative numbers from this work. For the B64 ensemble, production runs typically use 4 or 8 GPU nodes, depending on workload and memory constraints. With exact LMA, however, storing the full set of eigenvectors already requires running on 16 nodes. While this corresponds to a factor of 2–4 increase in node count, the overall scaling remains close to ideal at this volume, since the evaluation of deflated correlation functions is still dominated by full-size operator applications using the strategy discussed in section II. As a result, the overhead associated with the larger node count remains manageable for B64.
The situation changes dramatically for larger volumes. The B96 ensemble, with approximately five times the volume, is normally executed on GPU nodes. Scaling exact deflation to this case would require roughly five times more eigenvectors to maintain comparable performance, i.e. about modes. Combined with the increased lattice size, this leads to a prohibitive memory footprint, namely, storing these eigenvectors would require on the order of GPUs with 64 GB memory, corresponding to at least nodes in a realistic setup. This represents an order-of-magnitude increase over the baseline node count and would severely impact both resource usage and parallel efficiency, effectively negating the computational advantage of exact LMA.
In contrast, multigrid LMA achieves a satisfactory noise reduction with significantly reduced resources. In this work, we obtain a moderate improvement for the B96 ensemble using only 54 nodes. This requires a careful retuning of the multigrid setup. In particular, standard multigrid parameters are insufficient. As in exact LMA, a large number of modes is needed to accurately capture the low-energy subspace. In the multigrid context, this is controlled by the aggregation strategy, which determines the size of the coarse grid and, thus, the dimensionality of the coarse operator relative to the fine Dirac operator.
The parameters used are summarised in table 3. Conventional multigrid preconditioning typically employs a three-level hierarchy with aggregation factors of between the fine and first coarse level and between subsequent levels (up to minor variations, as in the present case, where the lattice size favors aggregation factors of ). For LMA, however, we find that a much finer aggregation is required. In particular, the overall aggregation between the fine and coarsest level is reduced by a factor of , leading to a correspondingly larger coarse operator. Despite this, the coarse operator remains smaller than the fine operator by a factor of once spin aggregation and the use of null vectors are taken into account. On this enlarged coarse operator, we compute low modes to construct the multigrid-based IR subspace.
| Type | Cycle | Smoother | L1 aggreg. | L2 aggreg. | ||
|---|---|---|---|---|---|---|
| mgLMA | V-cycle | No | 32 | 6000 | ||
| Precond. | K-cycle | Yes | 24 | 4000 |
| Ensemble | |||
|---|---|---|---|
| B96 | 490 | 11200 | 768 |
From our tests, the most critical parameter in this setup is the aggregate size, which determines the coarse lattice resolution. If the aggregates are too large, no significant noise reduction is observed, which is analogous to retaining too few low modes in standard LMA. Conversely, overly small aggregates lead to costs comparable to exact LMA, eliminating any practical advantage. A careful tuning is therefore required. By contrast, we observe only a mild dependence on the number of low modes. The choice of 6000 modes should be regarded as conservative, corresponding to the maximum number that can be accommodated without degrading performance, i.e. within the available device memory.
It is also worth noticing that the resulting multigrid setup differs substantially from the one used for preconditioning and is also significantly less efficient if used for that purpose. Therefore, we again split the production of correlation functions into two independent runs: one performed using the standard node count and the multigrid setup optimised for preconditioning, used to produce the full correlation functions ; and another using the modified setup employed for the multigrid-based correlation functions and .
Results for this setup are shown in fig. 8, where we focus on the statistical error of the various correlators entering , defined in eq. 54. The remaining two contributions are
| (59) |
which sum to . Assuming these two contributions are uncorrelated, i.e. in the regime where stochastic noise dominates, the quadratic sum of their individual errors should reproduce the statistical error of . This assumption holds well in all cases except at short Euclidean times, where stochastic noise saturates, and for , which constitutes a special case due to the conserved current.
Guided by this, the number of sources (11200 vs 768) for the improved MG estimator is chosen such that its error remains smaller than that of the corresponding residual contribution. In this way, the residual term dominates the total error while maintaining a balanced computational cost for the MG component. In fig. 8, we also display the noise reduction for both the total and residual components, where the former reflects the achieved gain (about 2 in all cases), while the latter indicates the potential asymptotic improvement (about 3 in most cases) as the MG estimator error is further reduced. Within the range of stochastic sources considered here, no saturation of the MG estimator noise is observed, leaving room for further error reduction, for instance, by increasing the number of sources or by employing all-to-all estimators as in eq. 55. Regarding the latter, as can be seen in table 3, the current choice of aggregation does not coarsen in the time direction and would thus enable an efficient evaluation of the all-to-all contribution in eq. 57, provided that a fully -compatible (i.e., spin-diluted) multigrid setup is employed, which is not currently the case. Our findings strongly support pursuing this approach for quark bilinear correlation functions, as it would allow a substantial improvement in noise reduction (from a factor of about 2 to about 3, comparing the red and orange points in fig. 8) at significantly lower cost to compute the improved estimator, since the high-statistics multigrid correlation function would not need to be computed.
VI Results on Exact LMA for Nucleon two- and three-point functions
Returning to the exact LMA, we present and discuss results for nucleon two- and three-point correlation functions. After a brief introduction of the correlators, we examine the exact all-to-all contribution, which proves prohibitively expensive due to the large number of eigenvectors required, and thus focus on high-statistics estimates. We then report results for both two- and three-point functions, analysing the gain from LMA and its dependence on the choice of current as well as on the temporal separation.
VI.1 Nucleon two- and three-point functions with LMA
The nucleon interpolating field commonly used is
| (60) |
where and are the up- and down-quark interpolating fields, is the charge conjugation matrix and is the totally antisymmetric Levi-Civita tensor. After performing the Wick contractions, the nucleon two-point function with generic spatial momenta , reads as
| (61) |
where the unpolarized positive-parity projector and the tensor are defined by
| (62) |
Inserting the LMA propagator of eq. 3 yields a purely stochastic UV contribution, an exact IR contribution, and six additional mixed IR–UV terms. As shown in appendix A, when point sources are used, the purely stochastic and mixed contributions combine to reproduce the standard correlator with the stochastic IR correlator subtracted. Consequently, eq. 6 holds, and only the IR correlator needs to be computed in addition to the standard one.
The evaluation of the exact all-to-all IR correlator turns to be computationally demanding. Here we outline the most efficient strategy we have identified. We begin by defining the Fourier-transformed contraction of three eigenvectors over colour and two spin indices,
| (63) |
and its conjugate
| (64) |
with repeated indices implicitly summed. The exact IR correlator can then be written as
| (65) |
We first note that the quantity is symmetric under the exchange of , having
| (66) |
and, thus, with independent entries. But this number can be further reduced by noticing that in we need an antisymmetric contribution under the exchange of , namely
| (67) |
Indeed,
| (68) | ||||
| (69) |
This offers, to our knowledge, the most computationally efficient approach for all-to-all baryon correlation functions, requiring contractions, but nevertheless scaling with . This cubic scaling of the contractions renders the computational cost prohibitive for large volumes, such as the one considered in this work, where at least are required. Indeed, as shown in fig. 9, the runtime to compute the exact IR correlators already approaches one day of wall time for . Thus, all-to-all correlation functions are realistically unfeasible via this approach.
A simple way to address this issue is to still estimate the IR contribution stochastically, but with higher precision. To this end, we adopt the same strategy used for the standard correlator, evaluating both the full correlator and the IR part from point sources, while increasing the number of sources used for the latter. This leads to the definition
| (70) |
The resulting LMA improved correlation function is then
| (71) |
We restrict ourselves to point sources, as these are the preferred choice for the computation of three-point correlation functions that we also analyse in this work. In particular, they allow one to freely inject a momentum boost at the source, and, when combined with sequential inversions through a fixed sink, enable efficient calculations of form factors and related observables. We therefore define the nucleon three-point function with momentum transfer as
| (72) |
where the current is defined by
| (73) |
corresponding to scalar, axial, and tensor operator insertions, respectively, and depends on the Dirac structure. We refer to the combination as isoscalar and to the combination as isovector. After renormalisation and in the large-time limit, the ratio of three-point to two-point functions yields the nucleon charges
| (74) |
at zero momentum transfer , and more generally, at finite , to form factors.
VI.2 Error reduction in two- and three-point functions
To investigate the performance of LMA for baryon correlators, we generate results using the B48 and B64 ensembles (see table 1 for the simulation parameters). The maximal statistics collected are reported in table 4. Since these are test runs rather than production runs, the number of configurations is limited. However, we use production-level statistics for both the eigenvectors and the number of sources.
| B48 | 20 | 1500 | 100 | 2000 |
|---|---|---|---|---|
| B64 | 50 | 1000 | 100 | 2000 |
In fig. 10, we present the LMA-improved nucleon effective mass for the B48 (left panel) and B64 (right panel) ensembles at maximal statistics, varying the number of eigenvectors. For the B48 ensemble, the gain saturates at approximately , which, when scaled with the physical volume, would correspond to about 1600 eigenvectors for B64. However, due to computational limitations, we restrict ourselves to a maximum of 1000 eigenvectors for the B64 ensemble, which nonetheless still yields a significant improvement. Importantly, we highlight that the saturation occurs at roughly four times more eigenvectors than in the meson case, indicating that a substantially larger number of modes is required to efficiently resolve nucleon observables compared to the corresponding mesons.
Based on this observation for the B64 ensemble, we study LMA-improved three-point functions using . We consider source–sink time separations 16 and 24, corresponding to 1.3 fm and 1.9 fm, respectively. While essentially no improvement is observed at the smallest separation, a clear reduction in statistical noise emerges at the largest one. In fig. 11, we compare the standard and LMA-improved ratios of three- to two-point functions at 24 that lead to the isoscalar axial charge. In the improved case, we observe a reduction of the statistical uncertainty by approximately a factor of 2.5 in the plateau region, with an even larger improvement in the corresponding plateau fit result. The infrared and residual contributions, shown separately in Fig. fig. 11, exhibit comparable statistical uncertainties. This indicates that the chosen number of eigenmodes achieves an approximately optimal balance between the infrared and residual sectors. Further reduction of only one component would therefore lead to diminishing returns, as the total uncertainty would become dominated by the other.
VI.3 Computational gain of LMA in nucleon three-point functions
We now quantify the efficiency of the LMA procedure for baryons by taking computational costs into account. To this end, we define the cost ratio
| (75) |
where and denote the per-source costs of computing the standard correlator and its infrared component, respectively. In this expression, we neglect the cost associated with the computation of eigenvectors, assuming that it is amortised over their reuse in multiple observables. Consequently, it does not enter the cost comparison for a single measurement. The ratio, therefore, depends only on the relative cost of constructing the two correlator contributions. While the contraction costs are identical, the propagator part is significantly cheaper in the IR case, since it is directly computed from the known eigenvectors. In our numerical estimates, based on our costs, we take , i.e. the standard correlator is assumed to be five times more expensive than the IR component per source.
Next, we define the error gain as
| (76) |
assuming that the stochastic noise, i.e., the variance between independent point sources, dominates over gauge noise. Furthermore, assuming that the infrared and residual contributions are weakly correlated,
| (77) |
The error gain is therefore controlled by the ratio , and is consequently time-dependent, observable-dependent, and sensitive to the number of eigenvectors included in the infrared sector. By analysing in our data as a function of fixing to the largest available statistics, we extract this ratio for several nucleon charges at and 24. The results are reported in table 5, showing that the error gain almost saturates at 100% for certain charges at , while it is much reduced for other charges or at shorter distances.
In the regime where stochastic noise dominates over gauge noise, represents directly the increase in statistics required to achieve the same precision using the standard approach. This allows us to quantify the overall computational gain as
| (78) |
| 50.7% | 42.3% | 78.5% | 82.7% | 32.6% | 23.4% | |
| 87.6% | 99.2% | 99.0% | 99.8% | 55.5% | 71.5% |
We analyse this gain for the three-point correlation functions of axial, scalar, and tensor operator insertions, considering both isovector and isoscalar contributions. In fig. 12, we present the cost-gain defined in eq. 78 for the scalar operator at source-sink separations and . As expected, we find that the cost-gain increases with increasing , while, conversely, the statistical precision at fixed and deteriorates. The corresponding results for the axial and tensor operators are shown in fig. 13 only for . In these cases, the cost-gain is reduced compared to the scalar insertion, indicating that the efficiency gain is observable-dependent. In all cases, the isoscalar combination exhibits the most pronounced improvement with respect to the isovector one.
VII Conclusions
In this work, we have demonstrated that low-mode averaging (LMA) on physical-point ensembles yields a consistent noise reduction by a factor of three to five relative to standard stochastic estimators at large source–sink time separations for observables dominated by stochastic noise. This improvement translates into an order-of-magnitude reduction in the required statistics and, when setup and solver costs are taken into account, to a substantial overall decrease in computational expense.
At the same time, the magnitude of the gain and the optimal implementation of LMA are strongly observable dependent. In addition, they depend on the lattice physical volume and, potentially, on the quark mass, although the latter was not explored here, as all results are obtained at the physical point. Consequently, systematic tuning is required for each application. In particular, we find that baryonic correlators require significantly more low modes than mesonic ones. Furthermore, different operators benefit to varying degrees, and the choice between multigrid-based and exact LMA depends sensitively on the cost–benefit balance for a given setup, especially after accounting for volume scaling and memory constraints.
Beyond these numerical results, which are necessarily tied to the specific setup of our calculations and therefore partly qualitative, we emphasise three general conclusions of this work:
-
•
As presented in section II, and extended to the multigrid formulation in section V, we motivate the application of LMA based on the generic estimator
(79) Here, denotes the standard stochastic correlator, the corresponding correlator with propagators restricted to the IR sector, and an improved estimator of the same IR contribution, ideally computed all-to-all or, more generally, at higher statistics. The control-variate coefficient becomes essential whenever eq. 6 is not satisfied, and in particular in the multigrid LMA setup, where evaluating correlators on a coarser grid modifies their normalisation and must be accounted for. Alternatively, one may set . This formulation offers several computational advantages, which are discussed in section II.1.
-
•
As presented in section III, the spectrum of the massless Wilson operator is analysed, demonstrating a very smooth and stable spectrum for the considered physical quark mass ensembles. From chiral perturbation theory relations, the chiral condensate is extracted in the chiral limit for at a renormalisation scale of 2 GeV, with the result given in eq. 29. From its pion-mass dependence, we also determine the low-energy constant in two-flavour chiral perturbation theory for the first time from lattice QCD, with the corresponding value reported in eq. 30.
-
•
As also discussed in section III, the twisted-mass fermion discretisation exhibits particularly useful properties for the application of LMA. In this formulation, the spectrum depends trivially on the quark mass, since the operator at any mass value shares exactly the same eigenvectors. This allows us to predict the mass dependence of the spectrum straightforwardly and, crucially, to reuse the same eigenvectors for propagators at arbitrary quark masses. As a result, the cost of computing the IR component of correlation functions can be significantly reduced: once the necessary eigenvector inner products have been computed, they can be recombined with different eigenvalues without additional eigenvector calculations. A disadvantage, however, is that computing eigenvectors numerically at heavy quark mass requires essentially the same effort as at light quark mass, since in both cases one is effectively computing the eigenvectors of the same massless Wilson operator. This issue has also been identified previously in the context of multigrid solvers, where twisted-mass fermions were found to be unique compared to Wilson fermions because of the severe ill-conditioning of the coarse operator. Twisted-mass fermions therefore provide both an advantageous setup for LMA and a distinctive framework to which it is difficult to extend conclusions about the applicability of LMA made for other discretisations.
Acknowledgments
We thank the QUDA developers, in particular Kate Clark and Evan Weinberg, for useful discussions on the multigrid preconditioning software. We thank all members of the ETM Collaboration for the most enjoyable collaboration. We thank in particular Giuseppe Gagliardi, Marco Garofalo, and Bartosz Kostrzewa for the exchange of ideas, technical info and feedback that were valuable in developing the LMA methodology within the framework of the ETMC project. We also thank Johan Bijens for providing references and comments on .
C. A and A. E, and S. B. acknowledge, respectively, support from the projects EXCELLENCE/0524/0459 (IMAGE-N) and EXCELLENCE/0524/0017 (MuonHVP), co-financed by the European Regional Development Fund and the Republic of Cyprus through the Research and Innovation Foundation within the framework of the Cohesion Policy Programme “THALIA 2021-2027”. R. F. and F. M. are supported by the Italian Ministry of University and Research (MUR) under the grant PNRR-M4C2-I1.1-PRIN 2022-PE2 Non-perturbative aspects of fundamental interactions, in the Standard Model and beyond F53D23001480006 funded by E.U.- NextGenerationEU. F. S. is supported by ICSC – Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union -NextGenerationEU and by Italian Ministry of University and Research (MUR) project FIS 00001556. C.S. is supported under the AQTIVATE EJD from the European Union’s research and innovation programme under the Marie Skłodowska-Curie Doctoral Networks action and Grant Agreement No 101072344. We gratefully acknowledge CINECA and the EuroHPC Joint Undertaking for granting access to the Leonardo Supercomputer. Computing time on Leonardo Booster was allocated through the Extreme Scale Access Call (grant EHPC-EXT-2024E01-027), and additional GPU resources were provided under the INFN-LQCD123 initiative. We acknowledge the Swiss National Supercomputing Centre (CSCS) access to Alps through the Chronos programme under project ID CH15. The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time on the GCS Supercomputer JUWELS [54] at Jülich Supercomputing Centre (JSC).
Appendix A Demonstration of validity of eq. 6 for the correlation functions considered in this work
A.1 Quark loops
Consider a generic quark loop of the form
| (80) |
where projects onto a momentum boost and time-slice and is an arbitrary Dirac matrix. In this case, the validity of eq. 6 is immediate, but already illustrates the mechanism at work. Indeed, since for quark loops the observable depends linearly on the propagator, employing from eq. 5 into eq. 80 and exploiting the linearity of the trace immediately yields
| (81) |
which is eq. 6 applied to eq. 80. Here, each loop on the right-hand side is computed as the trace of the corresponding propagator , , or .
A.2 Meson two-point correlation functions with stochastic sources
Consider a generic meson two-point correlation function, computed using a backwards-running propagator via the -hermiticity, 444In the case of TM fermions, one needs to recall that , where is the Wilson parameter. , defined as
| (82) |
Then if satisfies
| (83) |
which holds for commonly used spin- and time()-diluted stochastic sources (or point sources), then the two-point function read as
| (84) |
Proof
Using eq. 3 and eq. 4, we write and explicitly in terms of the IR, UV parts and mixed terms, namely
| (85) | ||||
| (86) |
where we recognise in each separately the two pure IR terms , and a common purely UV term , namely
| (87) |
From these definitions, eq. 84 follows, if the mixed terms are equal to each other, namely
| (88) |
where on the left we have and on the right . These equalities hold, and we show them explicitly for the first one, with the second following similarly. This then concludes our proof. Indeed,
| (89) |
where, in the first equality, we relied on the fact that is idempotent. In the second equality, we used that by construction and that spin-and time-diluted sources are used, thus commuting with the other terms. These are indeed the properties we have requested the stochastic sources to satisfy in eq. 83. If these are not satisfied, e.g., if the source is not spin-diluted, then eq. 6 would not hold.
A.3 Baryon two-point correlation functions with point sources
The final example we consider is a baryon two-point correlation function, for which we show that eq. 6 holds when point sources are employed.
For simplicity, we consider a baryon correlation function of the form
| (90) |
where denotes a projector and is antisymmetric in both spin and colour indices. For a positive-parity spin- baryon, these tensors take the form given in eq. 62.
We emphasise that eq. 90 is not completely generic. In particular, after performing the Wick contractions, additional terms involving different index orderings and quark-flavour structures may arise, as in the nucleon interpolator of eq. 61. Nevertheless, the expression above is sufficiently representative for discussing the validity of eq. 6. We also note that at the moment, we have not yet employed stochastic sources or point sources.
Let us now discuss the effect of employing into eq. 90, where the UV part is computed using a point-source with coordinates . This will then produce eight contracted terms built from the following sets of quark propagators,
| (91) |
The fully point-source correlation function, i.e. where is used, has exactly the same structure, whereas is replaced by . Thus, to demonstrate that eq. 6 holds for baryon correlation functions, namely
| (92) |
we need to proceed similarly to the meson case, showing that the mixed IR–UV terms are the same in both cases, namely
| (93) |
This is straightforward to show because the presence of at least one point source makes the sum over source coordinates collapse to a single point in any of such contractions. Let us show it explicitly for one case, considering e.g.
| (94) |
Since is non-zero only at , then the sum over collapses to the coordinate for all propagators, yielding to the equality above.
References
- [1] (2025) Nucleon charges and -terms in lattice QCD. Phys. Rev. D 111 (5), pp. 054505. External Links: 2412.01535, Document Cited by: §I.
- [2] (2021) Quark masses using twisted-mass fermion gauge ensembles. Phys. Rev. D 104 (7), pp. 074515. External Links: 2104.13408, Document Cited by: Figure 3, §III.4, §III.4.
- [3] (2024) Nucleon axial and pseudoscalar form factors using twisted-mass fermion ensembles at the physical point. Phys. Rev. D 109 (3), pp. 034503. External Links: 2309.05774, Document Cited by: §I.
- [4] (2016) Adaptive Aggregation-based Domain Decomposition Multigrid for Twisted Mass Fermions. Phys. Rev. D 94 (11), pp. 114509. External Links: 1610.02370, Document Cited by: §I, 3rd item, 3rd item, item LMA requires static operators, item Different hyper-parameters and tuning, §V.
- [5] (2019) Multigrid approach in shifted linear systems for the non-degenerated twisted mass operator. Comput. Phys. Commun. 236, pp. 51–64. External Links: 1805.09584, Document Cited by: §I, 3rd item.
- [6] (2025-07) Proton and neutron electromagnetic form factors from lattice QCD in the continuum limit. External Links: 2507.20910 Cited by: §I.
- [7] (2018) Simulating twisted mass fermions at physical light, strange and charm quark masses. Phys. Rev. D 98 (5), pp. 054518. External Links: 1807.00495, Document Cited by: §I.
- [8] (2023) Probing the Energy-Smeared R Ratio Using Lattice QCD. Phys. Rev. Lett. 130 (24), pp. 241901. External Links: 2212.08467, Document Cited by: §I.
- [9] (2024) Inclusive Hadronic Decay Rate of the Lepton from Lattice QCD: The u¯s Flavor Channel and the Cabibbo Angle. Phys. Rev. Lett. 132 (26), pp. 261901. External Links: 2403.05404, Document Cited by: §I.
- [10] (2000) Two point functions at two loops in three flavor chiral perturbation theory. Nucl. Phys. B 568, pp. 319–363. External Links: hep-ph/9907264, Document Cited by: §III.4.
- [11] (2022) FLAG Review 2021. Eur. Phys. J. C 82 (10), pp. 869. External Links: 2111.09849, Document Cited by: Figure 3.
- [12] (2026) FLAG review 2024. Phys. Rev. D 113 (1), pp. 014508. External Links: 2411.04268, Document Cited by: §I, §III.4.
- [13] (2010) Adaptive multigrid algorithm for the lattice Wilson-Dirac operator. Phys. Rev. Lett. 105, pp. 201602. External Links: 1005.3043, Document Cited by: §I, item Different hyper-parameters and tuning, §V.
- [14] (2011-09) Scaling lattice QCD beyond 100 GPUs. In International Conference for High Performance Computing, Networking, Storage and Analysis, External Links: 1109.2935, Document Cited by: §V.4.
- [15] (2018) Multigrid accelerated simulations for Twisted Mass fermions. EPJ Web Conf. 175, pp. 02002. External Links: 1710.06198, Document Cited by: §I, 3rd item.
- [16] (2026-03) An update on the HVP contribution to in isoQCD from ETMC. In 42th International Symposium on Lattice Field Theory, External Links: 2603.03120 Cited by: §I, §IV.
- [17] (2023) Octet baryon isovector charges from Nf=2+1 lattice QCD. Phys. Rev. D 108 (3), pp. 034512. External Links: 2305.04717, Document Cited by: §I.
- [18] (2010) Effective noise reduction techniques for disconnected loops in Lattice QCD. Comput. Phys. Commun. 181, pp. 1570–1583. External Links: 0910.3970, Document Cited by: §I.
- [19] (1980) Chiral Symmetry Breaking in Confining Theories. Nucl. Phys. B 169, pp. 103–125. External Links: Document Cited by: §III.2, §III.2, §III.4, §III.
- [20] (2025) Hadronic Vacuum Polarization for the Muon g-2 from Lattice QCD: Long-Distance and Full Light-Quark Connected Contribution. Phys. Rev. Lett. 135 (1), pp. 011901. External Links: 2412.18491, Document Cited by: §I, §I.
- [21] (2026) Private communication. Cited by: §III.4.
- [22] (2013) New class of variance-reduction techniques using lattice symmetries. Phys. Rev. D 88 (9), pp. 094503. External Links: 1208.4349, Document Cited by: §I, §II.1.
- [23] (2026) Hybrid calculation of hadronic vacuum polarization in muon g 2 to 0.48%. Nature 653 (8114), pp. 373–377. External Links: 2407.10913, Document Cited by: §I.
- [24] (2023) The chiral condensate of Nf = 2 + 1 QCD from the spectrum of the staggered Dirac operator. JHEP 11, pp. 013. External Links: 2308.01303, Document Cited by: Figure 3.
- [25] (2021) Leading hadronic contribution to the muon magnetic moment from lattice QCD. Nature 593 (7857), pp. 51–55. External Links: 2002.12347, Document Cited by: §I.
- [26] (2008) Adaptive Multigrid Algorithm for Lattice QCD. Phys. Rev. Lett. 100, pp. 041601. External Links: 0707.4018, Document Cited by: §I, item Different hyper-parameters and tuning, §V.
- [27] (2016) Multigrid Preconditioning for the Overlap Operator in Lattice QCD. Numer. Math. 132 (3), pp. 463–490. External Links: 1410.7170, Document Cited by: §I, item Different hyper-parameters and tuning, §V.
- [28] (2020) Multigrid for chiral lattice fermions: Domain wall. Phys. Rev. D 102 (9), pp. 094517. External Links: 2004.07732, Document Cited by: §I, item Different hyper-parameters and tuning, §V.
- [29] (2018) Multigrid algorithm for staggered lattice fermions. Phys. Rev. D 97 (11), pp. 114513. External Links: 1801.07823, Document Cited by: §I, item Different hyper-parameters and tuning, §V.
- [30] (2025) Aspects of propagator sparsening in lattice QCD. Phys. Rev. D 111 (11), pp. 114514. External Links: 2501.05404, Document Cited by: item Contraction costs for all-to-all correlation functions scale with .
- [31] (2010) Solving Lattice QCD systems of equations using mixed precision solvers on GPUs. Comput. Phys. Commun. 181, pp. 1517–1528. External Links: 0911.3191, Document Cited by: §V.4.
- [32] (2016-12) Accelerating lattice QCD multigrid on GPUs using fine-grained parallelization. In International Conference for High Performance Computing, Networking, Storage and Analysis, External Links: 1612.07873, Document Cited by: §V.4.
- [33] (2025) Inclusive semileptonic decays of the Ds meson: A first-principles lattice QCD calculation. Phys. Rev. D 112 (5), pp. 054503. External Links: 2504.06063, Document Cited by: §I.
- [34] (2004) Improving meson two point functions in lattice QCD. Comput. Phys. Commun. 159, pp. 185–191. External Links: hep-lat/0401011, Document Cited by: §I, §III.
- [35] (2005) Chiral properties of two-flavor QCD in small volume and at large lattice spacing. Phys. Rev. D 72, pp. 054503. External Links: hep-lat/0506021, Document Cited by: §I.
- [36] (2021) Sparsening algorithm for multihadron lattice QCD correlation functions. Phys. Rev. D 104 (3), pp. 034502. External Links: 1908.07050, Document Cited by: item Contraction costs for all-to-all correlation functions scale with .
- [37] (2022) Isovector axial form factor of the nucleon from lattice QCD. Phys. Rev. D 106 (7), pp. 074503. External Links: 2207.03440, Document Cited by: §I.
- [38] (2025) The hadronic vacuum polarization contribution to the muon g 2 at long distances. JHEP 04, pp. 098. External Links: 2411.07969, Document Cited by: §I.
- [39] (2024) Electromagnetic form factors of the nucleon from Nf=2+1 lattice QCD. Phys. Rev. D 109 (9), pp. 094510. External Links: 2309.06590, Document Cited by: §I.
- [40] (2024) Improved analysis of isovector nucleon matrix elements with Nf=2+1 flavors of O(a) improved Wilson fermions. Phys. Rev. D 109 (7), pp. 074507. External Links: 2402.03024, Document Cited by: §I.
- [41] (2023) Inclusive hadronic decay rate of the lepton from lattice QCD. Phys. Rev. D 108 (7), pp. 074513. External Links: 2308.03125, Document Cited by: §I.
- [42] (2018) Simulation of an ensemble of twisted mass clover-improved fermions at physical quark masses. EPJ Web Conf. 175, pp. 02003. External Links: 1712.09579, Document Cited by: §I.
- [43] (2001) Lattice QCD with a chirally twisted mass term. JHEP 08, pp. 058. External Links: Document, hep-lat/0101001 Cited by: §III.1.
- [44] (2013-07) An adaptive aggregation based domain decomposition multilevel method for the lattice wilson dirac operator: multilevel results. External Links: 1307.6101 Cited by: §I.
- [45] (2025-09) Using orthogonal projectors in multigrid multilevel Monte Carlo for trace estimation in lattice QCD. External Links: 2509.11424 Cited by: §I.
- [46] (2014) Adaptive Aggregation-Based Domain Decomposition Multigrid for the Lattice Wilson–Dirac Operator. SIAM J. Sci. Comput. 36 (4), pp. A1581–A1608. External Links: 1303.1377, Document Cited by: §I, item Different hyper-parameters and tuning, §V.
- [47] (2022) A Multilevel Approach to Variance Reduction in the Stochastic Estimation of the Trace of a Matrix. SIAM J. Sci. Comput. 44 (4), pp. A2536–A2556. External Links: 2108.11281, Document Cited by: §I.
- [48] (1984) Chiral Perturbation Theory to One Loop. Annals Phys. 158, pp. 142. External Links: Document Cited by: §III.4, §III.4.
- [49] (2004) Low-energy couplings of QCD from current correlators near the chiral limit. JHEP 04, pp. 013. External Links: hep-lat/0402002, Document Cited by: §I, §II.1, §II, §III.2, §III.
- [50] (2009) Chiral symmetry breaking and the Banks-Casher relation in lattice QCD with Wilson quarks. JHEP 03, pp. 013. External Links: 0812.3638, Document Cited by: §III.2, §III.
- [51] (2006) Low-mode averaging for baryon correlation functions. PoS LAT2005, pp. 132. External Links: hep-lat/0510011, Document Cited by: §I, §III, footnote 1.
- [52] (2025) Multigrid low-mode averaging. Phys. Rev. D 111 (7), pp. 074508. External Links: 2412.06347, Document Cited by: §I.
- [53] (2024) Nucleon isovector axial form factors. Phys. Rev. D 109 (1), pp. 014503. External Links: 2305.11330, Document Cited by: §I.
- [54] (2021) JUWELS Cluster and Booster: Exascale Pathfinder with Modular Supercomputing Architecture at Juelich Supercomputing Centre. Journal of large-scale research facilities 7 (A138). External Links: Document, Link Cited by: Acknowledgments.
- [55] (1992) Spectrum of Dirac operator and role of winding number in QCD. Phys. Rev. D 46, pp. 5607–5632. External Links: Document Cited by: §III.2, §III.
- [56] (2010) Overlap Valence on 2+1 Flavor Domain Wall Fermion Configurations with Deflation and Low-mode Substitution. Phys. Rev. D 82, pp. 114501. External Links: 1005.5424, Document Cited by: §I.
- [57] (2021) Field sparsening for the construction of the correlation functions in lattice QCD. Phys. Rev. D 103 (1), pp. 014514. External Links: 2009.01029, Document Cited by: item Contraction costs for all-to-all correlation functions scale with .
- [58] (2007) Deflation acceleration of lattice QCD simulations. JHEP 12, pp. 011. External Links: 0710.5417, Document Cited by: §I.
- [59] (2007) Local coherence and deflation of the low quark modes in lattice QCD. JHEP 07, pp. 081. External Links: 0706.2298, Document Cited by: §I, §III.2, §III.3, §III, 1st item, §V.1, §V.
- [60] (2025) Smeared -ratio in isospin symmetric QCD with Low Mode Averaging. PoS LATTICE2024, pp. 446. External Links: 2502.03187, Document Cited by: §I.
- [61] (2001) On the low fermionic eigenmode dominance in QCD on the lattice. Phys. Rev. D 64, pp. 114509. External Links: hep-lat/0106016, Document Cited by: §I, §III.
- [62] (In preparation) Quark masses and non-perturbative renormalisation of quark bilinear operators with Wilson-clover twisted mass fermions. Cited by: Table 1.
- [63] (2015) Covariant approximation averaging. Phys. Rev. D 91 (11), pp. 114511. External Links: 1402.0244, Document Cited by: §I.
- [64] (In preparation) The hadronic-vacuum-polarisation contribution to the muon anomalous magnetic moment in isospin-symmetric lattice QCD with twisted-mass fermion. Cited by: §I, §III.2, Table 1, §IV.
- [65] (2024) Nucleon form factors in Nf=2+1 lattice QCD at the physical point: Finite lattice spacing effect on the root-mean-square radii. Phys. Rev. D 109 (9), pp. 094505. External Links: 2311.10345, Document Cited by: §I.
- [66] (2020) Lattice QCD Determination of . PoS CD2018, pp. 020. External Links: 1912.08321, Document Cited by: §I.