
The RBC/UKQCD light quark physics program: Algorithms, methods and performance


Taku Izubuchi / Chulwoo Jung, Brookhaven National Laboratory, for the RBC/UKQCD collaborations


Introduction Action & Algorithm Performance Reweighting Aux. Det. Conclusions

Introduction

The RBC/UKQCD collaborations have been studying Domain Wall Fermion (DWF) configurations, which we believe achieve an optimal balance between preservation of chiral symmetry and practicality.

Quenched:
  2002: PRD65:014504, PRD66:014504
  2003: PRD68:114506
  2004: PRD69:074504, PRD69:074502
  2006: PRD73:094507

2f:
  2005: PRD72:114505

(2+1)f:
  2007: PRD76:014504, PRD75:114501, arXiv:0507.2340
  2008: PRD78:114509, PRD77:014509, PRD78:054510, PRL 100:032001


Currently, (2+1)f DWF configurations are available at 2 lattice spacings and in 2 volumes:

β = 2.13, a ∼ (1.73 GeV)^−1 ∼ 0.114 fm, a mres = 0.0031

L/a           ms a    ml a    ms/ml   mPS a   τ(MD)     Accept.
16^3×32×16    0.04    0.01    3.3     0.247   4000      57%
                      0.02    1.86    0.325   4000      56%
                      0.03    1.3     0.387   7500      82%
24^3×32×16    0.04    0.005   5.4     0.192   8980      73%
                      0.01    3.3     0.242   8540      70%
                      0.02    1.86    0.323   2800      71%
                      0.03    1.3     0.388   2800      72%

β = 2.25, a ∼ (2.34 GeV)^−1 ∼ 0.084 fm, a mres = 0.00066

L/a           ms a    ml a    ms/ml   mPS a   τ(MD)     Accept.
32^3×64×32    0.03    0.004   7.9     0.128   3428×2    72%
                      0.006   5.5     0.153   3825×2    76%
                      0.008   4.2     0.172   2965×2    73%
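The pairs of inverse lattice spacing (in GeV) and lattice spacing (in fm) quoted above are related by ħc ≈ 0.1973 GeV fm. The following is a small sketch of that conversion; the helper names `spacing_fm` and `mass_mev` are illustrative and not taken from the slides.

```python
# hbar*c in GeV*fm, used to convert between GeV^-1 and fm.
HBARC_GEV_FM = 0.1973269804

def spacing_fm(inv_a_gev):
    """Lattice spacing in fm from the inverse lattice spacing in GeV."""
    return HBARC_GEV_FM / inv_a_gev

def mass_mev(am, inv_a_gev):
    """Convert a mass in lattice units (a*m) to MeV."""
    return am * inv_a_gev * 1000.0

# Inverse lattice spacings quoted for the two ensembles.
for beta, inv_a in [(2.13, 1.73), (2.25, 2.34)]:
    print(f"beta = {beta}: a = {spacing_fm(inv_a):.3f} fm")
```

The same factor converts any dimensionful quantity in the table, for example `mass_mev(0.128, 2.34)` for the lightest pseudoscalar on the β = 2.25 ensemble.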


Action & Algorithm

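Gauge action: Iwasaki action

$$
S_G[U] = -\frac{\beta}{3}\,\mathrm{Re}\,\mathrm{Tr}\Big[(1-8c_1)\sum_{x;\mu<\nu} U_\mu(x)\,U_\nu(x+\hat\mu)\,U_\mu^\dagger(x+\hat\nu)\,U_\nu^\dagger(x)
+ c_1\sum_{x;\mu\neq\nu} U_\mu(x)\,U_\mu(x+\hat\mu)\,U_\nu(x+2\hat\mu)\,U_\mu^\dagger(x+\hat\mu+\hat\nu)\,U_\mu^\dagger(x+\hat\nu)\,U_\nu^\dagger(x)\Big],
\qquad c_1 = -0.331
$$

Fermion action: Domain Wall Fermion with 5D preconditioning

$$
D^{\mathrm{dwf}}_{x,s;x',s'}(M_5,m_f) = \delta_{s,s'}\,D^{\parallel}_{x,x'}(M_5) + \delta_{x,x'}\,D_{s,s'}(m_f)
$$

$$
D^{\parallel}_{x,x'}(M_5) = \frac{1}{2}\sum_{\mu=1}^{4}\Big[(1-\gamma_\mu)\,U_{x,\mu}\,\delta_{x+\hat\mu,x'} + (1+\gamma_\mu)\,U^\dagger_{x',\mu}\,\delta_{x-\hat\mu,x'}\Big] + (M_5-4)\,\delta_{x,x'}
$$

$$
D_{s,s'}(m_f) = \frac{1}{2}\Big[(1-\gamma_5)\,\delta_{s+1,s'} + (1+\gamma_5)\,\delta_{s-1,s'} - 2\,\delta_{s,s'}\Big]
- \frac{m_f}{2}\Big[(1-\gamma_5)\,\delta_{s,L_s-1}\,\delta_{0,s'} + (1+\gamma_5)\,\delta_{s,0}\,\delta_{L_s-1,s'}\Big]
$$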

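$$
D(m_f) = D_{\mathrm{DWF}}^\dagger(M_5,m_f)\,D_{\mathrm{DWF}}(M_5,m_f)
$$

$$
\int dU\,d\psi\; e^{-(S_G[U]-S_F[U,\psi]+S_{PV}[U,\psi])} = \int dU\; e^{-S_G[U]}\,\det\left[\frac{D(m_s)^{1/2}\,D(m_f)}{D(1)^{3/2}}\right]
$$

$$
\det\left[\frac{D(m_s)^{1/2}\,D(m_f)}{D(1)^{3/2}}\right]
= \det\left[\frac{D(m_s)}{D(1)}\right]^{3/2}\det\left[\frac{D(m_f)}{D(m_s)}\right]
\sim \left\{\det R_{1/2}\left[\frac{D(m_s)}{D(1)}\right]\right\}^{3}\det\left[\frac{D(m_f)}{D(m_s)}\right]
$$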

An Omelyan integrator with λ = 0.22 is used.
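As a minimal sketch of what such an integrator looks like, the following Python applies a second-order minimum-norm (Omelyan) step with λ = 0.22 to a one-dimensional harmonic oscillator. The toy Hamiltonian, the position-first ordering of the sub-steps and the names `omelyan_step`, `force` and `energy` are assumptions for illustration; the production code integrates gauge fields and is not shown in these slides.

```python
def omelyan_step(q, p, dt, force, lam=0.22):
    """One second-order minimum-norm (Omelyan) step.

    Position-first ordering is an assumption; the slides only quote lambda.
    """
    q = q + lam * dt * p
    p = p + 0.5 * dt * force(q)
    q = q + (1.0 - 2.0 * lam) * dt * p
    p = p + 0.5 * dt * force(q)
    q = q + lam * dt * p
    return q, p

def force(q):
    # Toy model: harmonic oscillator with H = p^2/2 + q^2/2.
    return -q

def energy(q, p):
    return 0.5 * p * p + 0.5 * q * q

n_steps, dt = 16, 1.0 / 16   # unit-length trajectory in 16 steps
q, p = 1.0, 0.0
h0 = energy(q, p)
for _ in range(n_steps):
    q, p = omelyan_step(q, p, dt, force)
dH = energy(q, p) - h0       # energy violation that enters the accept/reject test

# Reversibility check: flip the momentum and integrate back to the start.
qb, pb = q, -p
for _ in range(n_steps):
    qb, pb = omelyan_step(qb, pb, dt, force)

print(f"dH = {dH:.2e}, returned to q = {qb:.12f}, p = {pb:.1e}")
```

Each step evaluates the force twice, and the symmetric arrangement of the sub-steps is what makes the update reversible.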

∆t(gauge) : ∆t(Rational Quotient) : ∆t(Quotient) = 1 : 6 : 6

CG: Quotient inversion
MInv: Multimass inversion for Rational Quotient
GF: Gauge force
RF: Rational force
HF: Quotient force

A typical sequence of routines called for a 16-step trajectory is

6 MInv + 1 CG + [[12 GF + [3 MInv + 2 RF]×3]×2 + 12 GF + 1 CG + HF]×32
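To see how many times each routine is called per trajectory, the bracketed sequence can be expanded mechanically. The sketch below reads the expression literally as written above; the helper `scale` is a hypothetical name introduced only for this illustration.

```python
from collections import Counter

def scale(counts, k):
    """Repeat a block of routine calls k times."""
    return Counter({name: n * k for name, n in counts.items()})

inner = scale(Counter(MInv=3, RF=2), 3)                   # [3 MInv + 2 RF] x 3
middle = scale(Counter(GF=12) + inner, 2)                 # [12 GF + inner] x 2
outer = scale(middle + Counter(GF=12, CG=1, HF=1), 32)    # [... + 12 GF + 1 CG + HF] x 32
total = Counter(MInv=6, CG=1) + outer                     # leading 6 MInv + 1 CG

print(dict(total))
```

Under this reading a trajectory contains 582 multimass inversions, 33 CG inversions, 1152 gauge force, 384 rational force and 32 quotient force evaluations.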

[Figure: time histories of the topological charge Qtop versus molecular dynamics time (0 to 5000), each panel paired with a normalized Qtop histogram. Legend: RHMC 0, RHMC I, RHMC II.]

Performance

Red numbers show the routines which are duplicated on different 5th dimension slices.

24^3×64×16 (ms = 0.04), local volume = 6^3×2×8, 4096-node QCDOC

                       ml = 0.03   ml = 0.02               ml = 0.01   ml = 0.005
Routine                time(s)     time(s)     MFlops/s    time(s)     time(s)     MFlops/s
MInv                   1225        1213        221         1195        1367        225
CG                     173         223         273         370         634         258
GF                     60          60          257         62          73          250
RF                     218         218         36          232         274         34
HF                     10          10          4.5         10          12          4.5
Total time (s)         1941        1983                    2124        2635
Total flops (×10^12)   1366        1411                    1557        2006

32^3×64×16 (ms = 0.03), QCDOC

                       ml = 0.006              ml = 0.004
Local volume           8^3×2×8                 4^3×8×16
Routine                time(s)     MFlops/s    time(s)     MFlops/s
MInv                   5062        172         4263        205
CG                     1964        213         2038        268
GF                     214         256         104         263
RF                     1130        25          939         28
HF                     39          4.3         10          16
Total time (s)         9035                    7733


32^3×64×16 (ms = 0.03), BG/P

                       2048×4 core BG/P                    4096×4 core BG/P
Local volume           4^4×16                              4^4×8
                       ml = 0.008  ml = 0.006              ml = 0.004
Routine                time(s)     time(s)     MFlops/s    time(s)     MFlops/s
MInv                   851         950         443         563         374
CG                     320         433         464         342         391
GF                     34          39          350         45          302
RF                     142         162         83          306         23
HF                     2           3           30          12          4
Total time (s)         1406        1646                    1355
Total flops (×10^12)   4450        5260                    5868

                       48^3×64×16 DWF              32^3×64×32 AuxDet DWF
                       ms = 0.03                   ms = 0.045
                       2048×2 core BG/L            4096×4 core BG/P
                       ml = 0.002                  ml = 0.0042
Local volume           6×12^2×2×16                 4^4×16
Routine                time(s)     MFlops/s        time(s)     MFlops/s
MInv                   44254       261             3460        479
CG                     31280       286             1974        518
CG (AuxDet)                                        223         375
GF                     2179        231             68          346
RF                     6256        58              630         68
HF                     39          37              28          6
Total time (s)         84851                       6760


Observations

Fermion inversions consume ≥80% of the time and of the total flops.

While the total flops grow by a factor of ∼2.9 between the 24^3 and 32^3 lattices, the 48^3 simulation is so far more expensive than 32^3 by a factor of >10. It appears that the increasing volume requires more tuning of running parameters such as stopping conditions. Parameter tuning is ongoing, and a cost reduction of up to a factor of 2 is expected.
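The quoted growth factor can be checked against the performance tables. The sketch below assumes that the comparison uses the lightest-mass entry of each table (24^3 at ml = 0.005 and 32^3 at ml = 0.004); the slides do not say which entries were compared.

```python
# Total flops per trajectory, taken from the performance tables (units of 10^12).
flops_24 = 2006.0   # 24^3 x 64 x 16, ml = 0.005, QCDOC (assumed reference point)
flops_32 = 5868.0   # 32^3 x 64 x 16, ml = 0.004, BG/P (assumed reference point)

flops_ratio = flops_32 / flops_24

# Naive 4D volume scaling between the two lattices, for comparison.
volume_ratio = (32**3 * 64) / (24**3 * 64)

print(f"flops ratio  = {flops_ratio:.2f}")
print(f"volume ratio = {volume_ratio:.2f}")
```

With these entries the flops grow somewhat faster than the 4D volume, which is plausible given that the lighter quark mass also differs between the two runs.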

A 5D preconditioning scheme is employed instead of 4D. While 4D preconditioning makes it possible to use the more general (Moebius) formalism, 5D preconditioning allows an efficient implementation of the Dirac operator even when the 5th dimension is spread over more than one node. This has been very useful for running DWF simulations on massively parallel machines such as the IBM Blue Gene while keeping the local volume as symmetric as possible.


Example: from Aux. Det. DWF 32^3×64×32

DWF CG, local volume 4^4×16: 527 MFlops/core
DWF CG, local volume 8×4×2^2×32: 437 MFlops/core

A mixed-precision scheme, in which all the projected spinors are kept in single precision, is used for the MD steps. This has had very little effect on acceptance while giving a 10-15% performance boost. This may be explained by the fact that the error introduced by the single-precision Dirac operator is not correlated with the lowest eigenvectors of the Dirac operator.


Reweighting of dynamical strange quark

Motivation: The lattice spacing is a nontrivial function of β, and one does not know it until it is measured on thermalized configurations. While multiple ensembles with different light quark masses are typically generated to extrapolate to the chiral limit, it is in principle possible to simulate at the physical strange quark mass if one guesses the lattice spacing correctly.

In practice, ensembles with different strange quark masses are needed to interpolate (simple linear interpolation or SU(3) ChPT). It would save a lot of computing resources if this could be avoided. Reweighting of the light quark to approach the chiral limit has been tried by various groups (Hasenfratz et al., PRD78 014515 (2008); Lüscher et al., arXiv:0810.0946). Here we try to apply the same technique to the strange quark of DWF (2+1)f ensembles.


Reweighting: Basics

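$$
w = \det\left(\frac{D_2^\dagger D_2}{D_1^\dagger D_1}\right)^{1/2} = \det(\Omega)^{-1/2},
\qquad \Omega = D_2^{-1}\,D_1\,D_1^\dagger\,(D_2^\dagger)^{-1}
$$

$$
D_1 = D(m_l, m_s), \qquad D_2 = D(m_l, m_s')
$$

$$
w = \frac{\int d\xi\; e^{-\xi^\dagger \Omega^{1/2} \xi}}{\int d\xi\; e^{-\xi^\dagger \xi}}
= \left\langle e^{-\xi^\dagger(\Omega^{1/2}-1)\xi} \right\rangle
$$

Observables for the reweighted ensemble are then calculated by

$$
\langle O \rangle (m_s') = \frac{\sum_i O[U_i]\, w_i}{\sum_i w_i}
$$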
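The two ingredients above, a Gaussian-noise estimate of an inverse determinant and a weighted ensemble average, can be illustrated on a toy problem. In the sketch below a small random Hermitian matrix `a` stands in for Ω^{1/2}; the matrix, its eigenvalues, the sample count and the function names `noisy_inverse_det` and `reweighted_mean` are all assumptions for illustration, not the collaboration's code.

```python
import numpy as np

def noisy_inverse_det(a, n_samples, rng):
    """Estimate 1/det(a) as <exp(-xi^dag (a - 1) xi)> over complex Gaussian xi."""
    n = a.shape[0]
    # Normalized so that <xi xi^dag> = 1, matching the weight exp(-xi^dag xi).
    xi = (rng.standard_normal((n_samples, n))
          + 1j * rng.standard_normal((n_samples, n))) / np.sqrt(2.0)
    quad = np.einsum("si,ij,sj->s", xi.conj(), a - np.eye(n), xi).real
    return np.exp(-quad).mean()

def reweighted_mean(obs, weights):
    """Reweighted average sum_i O_i w_i / sum_i w_i."""
    obs = np.asarray(obs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.sum(obs * weights) / np.sum(weights)

rng = np.random.default_rng(0)

# Toy Hermitian matrix with eigenvalues close to 1 (assumed, for a low-noise test).
eigs = np.array([0.9, 0.95, 1.05, 1.1])
q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
a = (q * eigs) @ q.conj().T

estimate = noisy_inverse_det(a, 200_000, rng)
exact = 1.0 / np.prod(eigs)
print(f"stochastic 1/det = {estimate:.4f}, exact = {exact:.4f}")

# Weighted average of a toy observable measured on three configurations.
print(reweighted_mean([1.0, 2.0, 3.0], [1.0, 1.0, 2.0]))
```

The estimator has finite variance only when all eigenvalues of the matrix exceed 1/2, which is one reason to keep the mass shift per reweighting step small.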


Noise reduction

We could think of different ways of evaluating w_i:

$$
w_i = \sqrt{e^{-\xi^\dagger(\Omega[U_i]-1)\xi}} \qquad (2)
$$

or

$$
w_i = \left\langle e^{-\xi^\dagger(\sqrt{\Omega[U_i]}-1)\xi} \right\rangle \qquad (3)
$$

(2) can be evaluated with a very slight modification of the quotient (2-flavor) part of the DWF evolution and needs only 1 CG, while (3) uses the rational quotient part and needs 2 multimass inversions.

However, (2) is a biased estimator which converges only when evaluated multiple times, while (3) is unbiased.
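The bias can be made concrete with a toy model in which Ω is diagonal, so that ξ†(Ω − 1)ξ reduces to a sum over |ξ_i|². The eigenvalues, sample size and variable names below are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

omega = np.array([0.6, 0.8, 1.25, 1.6])   # toy eigenvalues of Omega (assumed)
target = np.prod(omega) ** -0.5           # det(Omega)^(-1/2)

n_samples = 100_000
# For complex Gaussian noise with <|xi_i|^2> = 1, |xi_i|^2 is exponential with mean 1.
xi2 = rng.exponential(1.0, size=(n_samples, omega.size))

x = np.exp(-(xi2 * (omega - 1.0)).sum(axis=1))            # exp(-xi^dag (Omega - 1) xi)
y = np.exp(-(xi2 * (np.sqrt(omega) - 1.0)).sum(axis=1))   # exp(-xi^dag (sqrt(Omega) - 1) xi)

single_hit = np.sqrt(x).mean()   # Eq. (2) evaluated with one noise vector at a time
many_hits = np.sqrt(x.mean())    # Eq. (2) with the noise average taken before the root
unbiased = y.mean()              # Eq. (3)

print(f"target      = {target:.4f}")
print(f"single hit  = {single_hit:.4f}")
print(f"many hits   = {many_hits:.4f}")
print(f"Eq. (3)     = {unbiased:.4f}")
```

In this toy setting the single-hit form of (2) falls systematically below the target, because the square root of an average is not the average of square roots, while averaging over many noise vectors before taking the root recovers it.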


Also, we can split the determinant into multiple terms. We split by using intermediate masses:

$$
w_i = \prod_{j=1}^{N} \frac{1}{n}\sum_{k=1}^{n} e^{-\xi_{jk}^\dagger\left(\sqrt{\Omega_j[U_i]}-1\right)\xi_{jk}} \qquad (4)
$$

$$
\Omega_j = D(m_j)^{-1}\,D(m_{j-1})\,D(m_{j-1})^\dagger\,(D(m_j)^\dagger)^{-1},
\qquad m_0 = m_s,\quad m_N = m_s'
$$

While this requires more inversions per measurement, each term has a smaller condition number, which helps to reduce the overall noise. It also gives the reweighting factors for the intermediate masses automatically.

Reweighting parameters:

Volume     ms     m's     N    n
32^3×64    0.03   0.025   10   2
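A sketch of the bookkeeping behind Eq. (4) follows. It assumes evenly spaced intermediate masses, which the slides do not state, and it takes the per-step noise samples as given numbers rather than computing them from Dirac operator inversions; `intermediate_masses` and `cumulative_weights` are hypothetical helper names.

```python
import numpy as np

def intermediate_masses(m_s, m_s_prime, n_steps):
    """Masses m_0 = m_s, ..., m_N = m_s'. Even spacing is an assumption."""
    return np.linspace(m_s, m_s_prime, n_steps + 1)

def cumulative_weights(step_samples):
    """Combine per-step noise samples into reweighting factors.

    step_samples[j, k] holds exp(-xi_jk^dag (sqrt(Omega_j) - 1) xi_jk) for one
    gauge configuration, with j = 1..N mass steps and k = 1..n noise vectors.
    Entry j of the result is the weight for reweighting from m_0 to m_j.
    """
    step_means = np.asarray(step_samples, dtype=float).mean(axis=1)
    return np.cumprod(step_means)

# Parameters from the table: ms = 0.03, m's = 0.025, N = 10.
masses = intermediate_masses(0.03, 0.025, 10)
print(masses)

# Toy samples for N = 2 steps and n = 2 noise vectors (made-up numbers).
toy = np.array([[1.0, 3.0],
                [0.5, 0.5]])
weights = cumulative_weights(toy)
print(weights)
```

The last entry of `weights` is w_i of Eq. (4), and the earlier entries are the factors for the intermediate masses that come for free.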

[Figure: Heff = −1/2 log(Det(D(0.025)/D(0.03))) versus trajectory (2700 to 3300). Legend: Rational Quotient, 1 step × 40; Quotient, 10 steps × 4; Rational Quotient, 10 steps × 2.]

Reweighting factors for different methods of evaluation. Rational Quotient: Eq. (3); Quotient: Eq. (2).

[Figure: two panels versus trajectory (0 to 4000), for m's = 0.027 and m's = 0.025; 32^3×64, ms = 0.03, ml = 0.004.]

[Figure: Omega mass, 32^3×64, ms = 0.03; horizontal axis from 0 to 0.009. Legend: Box 0.03; Box 0.03, reweighted to 0.025; Box 0.025; Box 0.025, reweighted to 0.025.]

[Figure: pseudoscalar and nucleon panels versus reweighted masses (0.024 to 0.032); 32^3×64, ms = 0.03, ml = 0.004.]

[Figure: Omega mass, 32^3×64, ms = 0.03, ml = 0.004. Legend: No reweighting; 0.025.]

[Figure: Omega propagator (ms = 0.025, t = 14), 32^3×64, ms = 0.03, ml = 0.004. Legend: No reweighting; 0.025.]

[Figure: Pseudoscalar propagator (ms = 0.004, t = 26), 32^3×64, ms = 0.03, ml = 0.004. Legend: No reweighting; 0.025.]


DWF with Auxiliary Determinant

Renfrew et al., arXiv:0902.2587

Motivation: Dislocations, which induce chiral symmetry breaking, are the biggest hurdle for Ginsparg-Wilson fermions at larger lattice spacing. Suppressing dislocations is crucial for DWF studies of quantities which require large lattice spacing and/or large volume. Examples:

QCD thermodynamics
Nucleon matrix elements
Weak matrix elements (K → ππ)

Various approaches:

Change the gauge action to suppress dislocations (DBW2, ...)
Use link smearing to suppress the coupling between dislocations and fermions (HYP, stout, ...)
Use an additional fermion action to suppress dislocations. γ5 Dw(−M5) has been suggested by Vranas.

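Problems:

Too strong suppression near −M5 → blocks topology tunneling
Enhancement for larger eigenvalues

Use a ratio of Dirac operators with imaginary Wilson masses to control the suppression of eigenvalues near −M5 while preserving larger eigenvalues.

$$
W(M_5,\epsilon_f,\epsilon_b)
= \frac{\det\left[D_W(-M_5+i\epsilon_f\gamma_5)^\dagger\, D_W(-M_5+i\epsilon_f\gamma_5)\right]}{\det\left[D_W(-M_5+i\epsilon_b\gamma_5)^\dagger\, D_W(-M_5+i\epsilon_b\gamma_5)\right]}
= \frac{\det\left[D_W(-M_5)^\dagger D_W(-M_5)+\epsilon_f^2\right]}{\det\left[D_W(-M_5)^\dagger D_W(-M_5)+\epsilon_b^2\right]}
= \prod_i \frac{\lambda_i^2+\epsilon_f^2}{\lambda_i^2+\epsilon_b^2}
$$

$$
\sim 1 \ \text{for}\ \lambda_i \gg \epsilon_b,\epsilon_f, \qquad \sim \epsilon_f^2/\epsilon_b^2 \ \text{for}\ \lambda_i \ll \epsilon_f .
$$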

Compared toQ

[Figure: suppression factor versus |λ| (0 to 0.2). Legend: εf = 0.040, εb = 0.50; εf = 0.005, εb = 0.50; εf = 0.010, εb = 0.100.]

Suppression factor F_i = (λ_i^2 + εb^2)/(λ_i^2 + εf^2) versus |λ| for εf/εb of 0.001/0.10,
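The per-eigenvalue behaviour of the auxiliary determinant is simple to tabulate. The sketch below evaluates the weight contributed by one eigenvalue λ and the suppression factor F, its inverse, for parameter pairs that appear in the figure legend; the function names are illustrative, not from the slides.

```python
def weight_factor(lam, eps_f, eps_b):
    """Contribution of one eigenvalue to W: (lam^2 + eps_f^2) / (lam^2 + eps_b^2)."""
    return (lam**2 + eps_f**2) / (lam**2 + eps_b**2)

def suppression_factor(lam, eps_f, eps_b):
    """F = (lam^2 + eps_b^2) / (lam^2 + eps_f^2), the inverse of the weight."""
    return 1.0 / weight_factor(lam, eps_f, eps_b)

for eps_f, eps_b in [(0.040, 0.50), (0.005, 0.50), (0.010, 0.100)]:
    row = ", ".join(
        f"F({lam:.2f}) = {suppression_factor(lam, eps_f, eps_b):.1f}"
        for lam in (0.0, 0.05, 0.2)
    )
    print(f"eps_f = {eps_f}, eps_b = {eps_b}: {row}")
```

Small eigenvalues are suppressed by up to εb²/εf², while eigenvalues well above εb are left essentially untouched, since the weight tends to 1 there.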

Extensive parameter searches have been done on small volumes. A factor of 5-7 decrease in mres is observed after the scales are matched by locating the transition temperature.

Lattices: 16^3×8×32, 16^4×32

no weighting factor    εf/εb = 0.01/0.10    εf/εb = 0.005/0.50    β = 1.75, εb = 0.50
β      mres            β      mres          β      mres           εf     mres
                       1.875  0.0101(5)                           0.040  0.0025(1)
                       1.900  0.0072(3)                           0.020  0.0018(1)
1.95   0.0252(1)       1.925  0.0054(4)                           0.005  0.0014(10)
2.00   0.0102(1)       1.950  0.0035(3)     1.750  0.0015(1)
2.05   0.0046(2)       1.975  0.0026(4)     1.800  0.0009(3)
2.08   0.0022(2)       2.000  0.0020(2)     1.850  0.0003(4)
2.11   0.0011(1)
2.14   0.0007(1)

Possible run plan

Aux. Det. β = 1.75, a ∼ (1.4 GeV)^−1, εf/εb = 0.02/0.5, mres ∼ 0.0019

L/a           ms a    ml a     L (fm)   mPS (MeV)   τ(MD)   Accept.
32^3×64×32    0.045   0.0042   4.5      250         200     85%


Conclusions

The RBC/UKQCD/LHP collaborations have generated 32^3×64×16 DWF configurations, which allow for more accurate continuum extrapolations. We greatly benefited from the newly available IBM BG/P at Argonne. Analysis is ongoing.

Reweighting for the strange quark appears to be practical even for 32^3×64 DWF configurations. We are working on a full analysis with reweighted data.

What's the next step? → We are exploring the auxiliary determinant, which improves chiral symmetry at larger lattice spacings. Algorithm tuning is ongoing.
