Curves
Juan Manuel Garcia Garcia (1) and Rolando Menchaca Garcia (2)
(1) Department of Computer Systems, Instituto Tecnologico de Morelia, Morelia, Mexico
(2) Center for Computing Research, Instituto Politecnico Nacional, Mexico City, Mexico
Abstract. Given a positive integer $n$ and a point $P$ on an elliptic curve $E$, the computation of $nP$, that is, the result of adding the point $P$ to itself $n$ times, called scalar multiplication, is the central operation of elliptic curve cryptosystems. We present an algorithm that, using $p$ processors, can compute $nP$ in time $O(\log n + H(n)/p + \log p)$, where $H(n)$ is the Hamming weight of $n$. Furthermore, if this algorithm is applied to Koblitz curves, the running time can be reduced to $O(H(n)/p + \log p)$.
1 Introduction
Elliptic curve cryptosystems were first proposed by Koblitz [10] and Miller [15]. Their main attraction is the high security they offer per bit of key length. It is possible to construct elliptic curve cryptosystems over a smaller field of definition than comparable public-key cryptosystems based on integer factorization or on the discrete logarithm problem in finite fields, such as RSA [17]. Elliptic curve cryptosystems with a 160-bit key are believed to have the same security as, for example, RSA with a 1024-bit key.
Several fast implementations of elliptic curve cryptosystems have been reported [8,7,18,4,5]. The security of elliptic curve cryptosystems is based on the difficulty of the elliptic curve discrete logarithm problem, and the main operation of those cryptosystems is exponentiation, also known as scalar multiplication. For a general survey of exponentiation methods see [6]. In order to obtain fast elliptic curve exponentiation, three factors have been considered: the field of definition [4][18], addition chains [4,12,16,18] and optimal coordinate systems [3,5]. Special elliptic curves with efficient arithmetic have also been considered [11][20][21]. In this paper we propose the application of parallel processing to speed up exponentiation on elliptic curves.
There already exist parallel algorithms for exponentiation of integers modulo an $m$-bit integer [2,1,13] and also for exponentiation in the fields GF($2^m$) [22] and GF($q^m$) [...] algorithm could lead to efficient implementations of elliptic curve cryptosystems on multiprocessor computers.
2 Binary Method
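We have an elliptic curve $E$ and a point $P \in E$. We want to compute $nP$ for some nonnegative integer $n$. Let $n = (b_{N-1} b_{N-2} b_{N-3} \cdots b_2 b_1 b_0)_2$ be the binary representation of $n$, where
$$N = \lfloor \log n \rfloor + 1 \tag{1}$$
and logarithms are taken to base 2. Then
$$n = b_{N-1} 2^{N-1} + b_{N-2} 2^{N-2} + \cdots + b_2 2^2 + b_1 2 + b_0. \tag{2}$$
Using the Horner expansion, the last expression becomes
$$n = ((\cdots(b_{N-1} 2 + b_{N-2})2 + \cdots + b_1)2 + b_0). \tag{3}$$
From (3) we have
$$nP = 2(2(\cdots 2(2(b_{N-1}P) + b_{N-2}P) + \cdots) + b_1 P) + b_0 P. \tag{4}$$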
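The last expression suggests a way to compute $nP$. We can define the succession of points $P_1, \ldots, P_N$ as follows. Let
$$P_1 = P, \qquad P_i = \begin{cases} 2P_{i-1} & \text{if } b_{N-i} = 0, \\ 2P_{i-1} + P & \text{if } b_{N-i} = 1, \end{cases} \tag{5}$$
for $i = 2, \ldots, N$. We can easily observe that $P_N = nP$.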
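The recurrence (5) can be sketched in Python as follows. This is only an illustration under stated assumptions: the toy curve $y^2 = x^3 + 2x + 3$ over GF(97), the base point (3, 6), and the names `ec_add` and `binary_method` are chosen for this example and do not come from the paper. Real cryptosystems use far larger fields. The modular inverse via `pow(x, -1, m)` assumes Python 3.8 or later.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(FIELD_P); illustrative parameters only.
FIELD_P, A, B = 97, 2, 3
INFINITY = None  # identity element of the point group


def ec_add(p1, p2):
    """Add two affine points (or INFINITY) on the toy curve."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return INFINITY  # P + (-P), including doubling a point with y = 0
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % FIELD_P, -1, FIELD_P)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P)
    slope %= FIELD_P
    x3 = (slope * slope - x1 - x2) % FIELD_P
    y3 = (slope * (x1 - x3) - y1) % FIELD_P
    return (x3, y3)


def binary_method(n, point):
    """Left-to-right double-and-add following recurrence (5)."""
    if n == 0:
        return INFINITY
    bits = bin(n)[2:]            # b_{N-1} ... b_0, with b_{N-1} = 1
    acc = point                  # P_1 = P
    for bit in bits[1:]:         # i = 2, ..., N
        acc = ec_add(acc, acc)   # doubling: 2 * P_{i-1}
        if bit == "1":
            acc = ec_add(acc, point)  # extra addition when b_{N-i} = 1
    return acc                   # P_N = nP
```

Calling `binary_method(n, (3, 6))` should agree with adding the point to itself `n` times through repeated calls to `ec_add`.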
The method outlined above, which obtains $nP$ by computing the terms $P_1, \ldots, P_N$, is known as the binary algorithm [9]. From (5) we can deduce that to obtain the term $P_i$ we need one doubling and, if the corresponding bit of the binary representation of $n$ is one, one addition. This computation is made $N - 1$ times, where $N = \lfloor \log n \rfloor + 1$. If $H(n)$ is the binary Hamming weight of $n$, that is, the number of bits set to one in the binary representation of $n$, then we need $H(n) - 1$ point additions besides the $N - 1$ doublings. Then we can state the following theorem.
Theorem 1. If $P$ is a point on an elliptic curve $E$ and $n$ is a positive integer, computing $nP$ by the binary method requires $\lfloor \log n \rfloor$ doublings and $H(n) - 1$ point additions.
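If we assume that the time taken to add two elliptic curve points is almost the same as the time taken to double a point, then (Corollary 1) the binary method computes $nP$ in time
$$(\lfloor \log n \rfloor + H(n) - 1)\,t,$$
where $t$ is the time taken by a single elliptic curve operation.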
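As a quick check of Theorem 1 and of the operation count above, the following sketch counts the doublings and additions performed by the binary method without doing any curve arithmetic. The function name `count_operations` is an assumption made for this example.

```python
def count_operations(n):
    """Return (doublings, additions) used by the binary method for n >= 1."""
    bits = bin(n)[2:]        # b_{N-1} ... b_0
    doublings = 0
    additions = 0
    for bit in bits[1:]:     # one iteration per term P_2, ..., P_N
        doublings += 1
        if bit == "1":
            additions += 1
    return doublings, additions


def hamming_weight(n):
    """Number of bits set to one in the binary representation of n."""
    return bin(n).count("1")
```

For example, $n = 45 = (101101)_2$ has $N = 6$ and $H(n) = 4$, so the count is 5 doublings and 3 additions, that is, 8 curve operations in total.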
Some improvement to the binary method can be obtained by the use of redundant number systems. Morain and Olivos [16] observed that on elliptic curves the inverse of a point can be obtained at no cost. For curves $y^2 = x^3 + Ax + B$ over GF($p$) with $p > 3$, the inverse of $(x, y)$ is $(x, -y)$ and, for $y^2 + xy = x^3 + Ax^2 + B$ over GF($2^m$), the inverse is $(x, x + y)$. Then, we can consider representations
$$n = \sum_{i=0}^{N-1} c_i 2^i \tag{6}$$
with $c_i \in \{-1, 0, 1\}$ for $i = 0, \ldots, N-1$. A nonadjacent form (NAF) is a representation with $c_i c_{i+1} = 0$ for $0 \le i < N-1$. Since the NAF in general has fewer nonzeros than the binary representation, the advantage of using it is a smaller number of additions. Morain and Olivos [16] showed that the expected number of nonzeros in a NAF is $N/3$.
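A NAF can be computed digit by digit from the least significant end. The sketch below uses the standard rule of choosing the digit $\pm 1$ that makes the remainder divisible by 4. The function name `naf` is an assumption for this example, and the result may be one digit longer than the binary expansion of $n$.

```python
def naf(n):
    """Return the NAF digits c_0, c_1, ... (least significant first) of n >= 0."""
    digits = []
    while n > 0:
        if n & 1:
            c = 2 - (n % 4)   # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            n -= c            # n is now divisible by 4, so the next digit is 0
        else:
            c = 0
        digits.append(c)
        n >>= 1
    return digits
```

For instance, `naf(7)` gives `[-1, 0, 0, 1]`, that is $7 = 2^3 - 1$, which has two nonzero digits where the binary expansion $(111)_2$ has three.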
3 The $p$th-Order Binary Method
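First, we divide the binary representation of $n$ into $\lceil N/p \rceil$ blocks of $p$ bits each, and then we split $n$ into $s_0, s_1, \ldots, s_{p-1}$ such that the binary representation of each $s_i$ is formed by $\lceil N/p \rceil$ blocks of $p$ bits set to 0, except for the $i$-th bit of each block, which has the same value as the $i$-th bit of the corresponding block of the binary representation of $n$. Thus, we can define $s_i$, for $i = 0, \ldots, p-1$, as follows:
$$\begin{aligned}
s_0 &= b_0 + b_p 2^p + b_{2p} 2^{2p} + \cdots + b_{(\lceil N/p \rceil - 1)p}\, 2^{(\lceil N/p \rceil - 1)p} \\
s_1 &= b_1 2 + b_{p+1} 2^{p+1} + b_{2p+1} 2^{2p+1} + \cdots + b_{(\lceil N/p \rceil - 1)p+1}\, 2^{(\lceil N/p \rceil - 1)p+1} \\
&\;\;\vdots \\
s_{p-1} &= b_{p-1} 2^{p-1} + b_{2p-1} 2^{2p-1} + b_{3p-1} 2^{3p-1} + \cdots + b_{p\lceil N/p \rceil - 1}\, 2^{p\lceil N/p \rceil - 1}
\end{aligned} \tag{7}$$
That is,
$$s_i = \sum_{j=0}^{\lceil N/p \rceil - 1} b_{jp+i}\, 2^{jp+i} \tag{8}$$
for $i = 0, 1, \ldots, p-1$, where we consider padding bits $b_k = 0$ for $N - 1 < k \le p\lceil N/p \rceil - 1$. The set of integers $\{s_0, s_1, \ldots, s_{p-1}\}$ can be easily obtained from $n$ by AND-ing it with the appropriate mask. It is clear from (2) and (7) that
$$n = s_0 + s_1 + \cdots + s_{p-1} \tag{9}$$
and then we can compute $nP$ as the sum of $s_0P, s_1P, \ldots, s_{p-1}P$.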
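The splitting in (8) amounts to masking $n$ with a bit pattern that selects every $p$-th bit. A minimal sketch, in which the name `split_scalar` is an assumption made for this example:

```python
def split_scalar(n, p):
    """Split n into [s_0, ..., s_{p-1}] as in (8).

    s_i keeps exactly the bits of n at positions congruent to i modulo p.
    """
    length = max(n.bit_length(), 1)       # N
    blocks = -(-length // p)              # ceil(N / p)
    # Mask with a one at positions 0, p, 2p, ..., (blocks - 1) * p.
    base_mask = sum(1 << (j * p) for j in range(blocks))
    return [n & (base_mask << i) for i in range(p)]
```

With $n = 45 = (101101)_2$ and $p = 2$ this gives $s_0 = 5 = (000101)_2$ and $s_1 = 40 = (101000)_2$, and $s_0 + s_1 = 45$ as required by (9).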
The algorithm then proceeds in two phases:
1. Given $s_0, s_1, \ldots, s_{p-1}$ and using $p$ processors, compute in parallel the scalar multiplications
$$s_0P, s_1P, \ldots, s_{p-1}P,$$
running the binary method on each processor.
2. Associative fan-in: Using $\lceil p/2 \rceil$ processors, compute in parallel
$$nP = s_0P + s_1P + \cdots + s_{p-1}P.$$
To do this, first we separate the total sum as
$$\sum_{i=0}^{p-1} s_i = \sum_{i=0}^{\lceil p/2 \rceil - 1} s_i + \sum_{i=\lceil p/2 \rceil}^{p-1} s_i,$$
and then both halves can be computed in parallel. We proceed on each half in the same way recursively until we only need to compute a single addition of two elliptic curve points. Clearly, in the deepest level of the recursion we have to do at most $\lceil p/2 \rceil$ additions in parallel, and to obtain the total sum we have to make $\lceil \log p \rceil$ recursive steps.
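The fan-in phase can be sketched as a tree reduction. The version below pairs neighbours level by level instead of recursing on halves, which is an assumption of this example. Both schedules take $\lceil \log p \rceil$ levels, and the additions within one level are independent, so they could be handed to separate processors. The name `fan_in_sum` and the generic `add` argument are also chosen for illustration.

```python
def fan_in_sum(terms, add):
    """Pairwise tree reduction of a non-empty list.

    Returns (total, levels), where levels is the number of parallel steps.
    """
    level = list(terms)
    levels = 0
    while len(level) > 1:
        # All additions in this list comprehension are independent of each other.
        nxt = [add(level[k], level[k + 1]) for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # odd element is carried to the next level
        level = nxt
        levels += 1
    return level[0], levels
```

With integers standing in for curve points, `fan_in_sum([5, 40], lambda a, b: a + b)` returns `(45, 1)`; passing a point-addition function as `add` would sum the points $s_iP$ in the same way.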
We can observe that if $p = 1$ then our algorithm reduces to the ordinary binary algorithm. In what follows, we analyze the running time of the $p$th-order binary method.
Theorem 2. The running time of the $p$th-order binary algorithm is, in the best case,
$$\lfloor \log n \rfloor + \lceil H(n)/p \rceil + \lceil \log p \rceil - 1$$
and, in the worst case,
$$\lfloor \log n \rfloor + H(n) - 1.$$
Proof. First, we observe that the set of integers $\{s_0, s_1, \ldots, s_{p-1}\}$ can be obtained from $n$ in negligible time. In phase one of the algorithm, each processor executes the binary algorithm to compute $s_iP$, where $i$ is the processor's index. From Corollary 1, it requires $\lfloor \log s_i \rfloor + H(s_i) - 1$ steps to do this computation. Now, since $b_{N-1} = 1$, one of the processors has to compute $N - 1$ doublings, and if we suppose that all the processors must finish their computation before phase two can start, then the running time of phase one will be
$$T_1 = N + H^{*} - 2, \tag{10}$$
where
$$H^{*} = \max_{0 \le i \le p-1} H(s_i). \tag{11}$$
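Since we have that
$$H(n) = \sum_{i=0}^{p-1} H(s_i), \tag{12}$$
the maximum $H^{*}$ of (11) satisfies
$$\lceil H(n)/p \rceil \le H^{*} \le H(n). \tag{13}$$
In the best case, all the bits set to one in the binary representation of $n$ are evenly distributed among the integers $\{s_i\}$, so the load is perfectly balanced across all the processors, and then we have
$$H^{*} = \left\lceil \frac{H(n)}{p} \right\rceil. \tag{14}$$
From (14), (10) and (1) we have that, in the best case, the time of phase one is given by
$$T_1 = \lfloor \log n \rfloor + \lceil H(n)/p \rceil - 1. \tag{15}$$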
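In phase two of our algorithm, the sum $s_0P + \cdots + s_{p-1}P$ is computed using the associative fan-in algorithm in
$$T_2 = \lceil \log p \rceil \tag{16}$$
steps, and then the total running time for the best case will be
$$T = T_1 + T_2 = \lfloor \log n \rfloor + \lceil H(n)/p \rceil + \lceil \log p \rceil - 1. \tag{17}$$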
In the worst case $H^{*} = H(n)$, which implies by the definition of $H^{*}$ in (11) that there exists some index $k$, $0 \le k \le p-1$, such that $H(s_k) = H(n)$ and, from (12), $H(s_i) = 0$ for $i \ne k$, and then $s_i = 0$ for $i \ne k$. From (9) we have that $n = s_k$ and then $s_kP = nP$. Thus $nP$ is computed in phase one by processor $k$. Therefore the running time will be the same as that stated in Corollary 1.
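To make the quantities in the proof concrete, the following sketch evaluates the formulas (10), (11) and (16) for a given $n$ and $p$ and compares them with the best-case expression (17). It only evaluates the cost model of the proof and does not time a real implementation. The function names are assumptions made for this example.

```python
def ceil_log2(p):
    """ceil(log2 p) for an integer p >= 1."""
    return (p - 1).bit_length()


def model_times(n, p):
    """Return (T1, T2) as given by (10) and (16) for n >= 1 and p processors."""
    N = n.bit_length()                                   # (1)
    # H(s_i): number of one bits of n at positions i, i + p, i + 2p, ...
    weights = [sum((n >> k) & 1 for k in range(i, N, p)) for i in range(p)]
    h_star = max(weights)                                # (11)
    return N + h_star - 2, ceil_log2(p)


def best_case(n, p):
    """Best-case running time (17)."""
    h = bin(n).count("1")
    return (n.bit_length() - 1) + -(-h // p) + ceil_log2(p) - 1
```

For $n = 63 = (111111)_2$ and $p = 3$ the one bits are perfectly balanced: $H^{*} = 2$, $T_1 = 6$, $T_2 = 2$, so $T = 8$, which matches (17), whereas the sequential binary method needs $5 + 6 - 1 = 10$ steps.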
Some improvement can be obtained if we use the NAF as input to our algorithm instead of the binary representation of $n$. Since the NAF has fewer nonzeros, we can obtain the same running times with fewer processors. But before this we can see that a bottleneck in our algorithm is caused by the number of doublings needed in the first phase, represented by the linear term in $N$ in (10), which is independent of the number $p$ of processors.
In the derivation of the running time of our algorithm, it was assumed that the time needed for a doubling and for a point addition is almost the same, but in general it depends on the coordinate system in which the elliptic curve is represented. The best-known coordinate systems are affine and projective coordinates [19]. Two other coordinate systems, Jacobian coordinates and Chudnovsky Jacobian coordinates, were proposed in [3]. The efficiency of Jacobian coordinates for elliptic curve exponentiation was discussed in [4]. A modified Jacobian coordinate system, which gives faster doublings than affine, projective, Jacobian and Chudnovsky Jacobian coordinates, was proposed in [5]. We can then use this modified Jacobian coordinate system to partially overcome the bottleneck of doublings in our algorithm.
4 The $p$th-Order $\tau$-ary Method
The binary anomalous curves, proposed by Koblitz [11], have been used to obtain efficient implementations of elliptic curve cryptosystems [14][20][21]. These curves are
$$E_1 : y^2 + xy = x^3 + x^2 + 1 \tag{18}$$
and
$$E_2 : y^2 + xy = x^3 + 1 \tag{19}$$
over GF($2^m$). For these curves it can be easily verified that if the point $(x, y)$ is on the curve, so is the point $(x^2, y^2)$. Then we can define their Frobenius automorphisms as $\varphi(x, y) = (x^2, y^2)$, which corresponds to multiplication by $\tau = (1 + \sqrt{-7})/2$ on $E_1$ and by $\tau = (-1 + \sqrt{-7})/2$ on $E_2$. Using a normal basis of GF($2^m$), the multiplication by $\tau$ requires only two cyclic shifts and can therefore be done in negligible time.
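Both values of $\tau$ given above satisfy $\tau^2 = \mu\tau - 2$, with $\mu = 1$ on $E_1$ and $\mu = -1$ on $E_2$, which follows directly from $\tau = (\mu + \sqrt{-7})/2$. The sketch below uses this relation for exact arithmetic on elements $a + b\tau$ stored as integer pairs `(a, b)`. The pair representation and the names `tau_mul` and `norm` are assumptions made for this example.

```python
import cmath


def tau_mul(u, v, mu):
    """Multiply a + b*tau by c + d*tau using tau^2 = mu*tau - 2."""
    a, b = u
    c, d = v
    return (a * c - 2 * b * d, a * d + b * c + mu * b * d)


def norm(u, mu):
    """Norm of a + b*tau, namely a^2 + mu*a*b + 2*b^2."""
    a, b = u
    return a * a + mu * a * b + 2 * b * b


def tau_value(mu):
    """Complex value of tau = (mu + sqrt(-7)) / 2, for numeric checks only."""
    return (mu + cmath.sqrt(-7)) / 2
```

In this representation $\tau$ itself is the pair `(0, 1)`, and `norm((0, 1), mu)` evaluates to 2 for both curves.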
Since $\tau$ is an element of norm 2 in the Euclidean domain $\mathbb{Z}[\tau]$, any integer $n$ has a representation as
$$n = \sum_{i=0}^{\infty} c_i \tau^i \tag{20}$$
for $c_i \in \{0, \pm 1\}$. For any $n \in \mathbb{Z}[\tau]$ a representation (20) is called a NAF if $c_i \in \{0, \pm 1\}$ and $c_i c_{i+1} = 0$ for all $i \ge 0$. Solinas [20] showed the existence of a NAF of minimal weight by giving an algorithm that computes the NAF directly. If $\tau \mid n$, then $c_0 = 0$. Otherwise, $\tau^2$ divides either $n + 1$ or $n - 1$ and the NAF ends in $(0, -1)$ or $(0, 1)$, respectively. This process is repeated on $n/\tau$, $(n+1)/\tau^2$ or $(n-1)/\tau^2$. Because $\varphi^m(x, y) = (x^{2^m}, y^{2^m}) = (x, y)$, any two representations which agree modulo $\tau^m - 1$ will yield the same endomorphism on the curve. Based on this property, Meier and Staffelbach [14] showed the following:
Theorem 3. Every $n \in \mathbb{Z}[\tau]$ has a representation
$$n \equiv \sum_{i=0}^{m-1} c_i \tau^i \pmod{\tau^m - 1}, \tag{21}$$
with $c_i \in \{0, \pm 1\}$.
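The recoding rule described before Theorem 3 ($c_0 = 0$ when $\tau$ divides $n$, otherwise a digit $\pm 1$ chosen so that $\tau^2$ divides the remainder) can be sketched as follows. The test modulo 4 is the formulation commonly attributed to Solinas [20]; it is used here as an assumption rather than quoted from the paper. The sketch does not perform the reduction modulo $\tau^m - 1$ of (21). Elements $r_0 + r_1\tau$ are passed as two integers, $\mu = 1$ for $E_1$ and $\mu = -1$ for $E_2$, and the function names are hypothetical.

```python
def tau_naf(r0, r1, mu):
    """Return the tau-adic NAF digits (least significant first) of r0 + r1*tau."""
    digits = []
    while r0 != 0 or r1 != 0:
        if r0 % 2:                           # tau does not divide the element
            u = 2 - ((r0 - 2 * r1) % 4)      # +1 or -1, making the rest divisible by tau^2
            r0 -= u
        else:
            u = 0
        digits.append(u)
        # Divide by tau: (r0 + r1*tau) / tau = (r1 + mu*r0/2) - (r0/2)*tau
        r0, r1 = r1 + mu * (r0 // 2), -(r0 // 2)
    return digits


def tau_eval(digits, mu):
    """Evaluate sum(c_i * tau^i) by Horner's rule; returns (a, b) for a + b*tau."""
    a, b = 0, 0
    for c in reversed(digits):
        a, b = -2 * b + c, a + mu * b        # multiply by tau, then add the digit
    return a, b
```

For example, `tau_naf(2, 0, 1)` returns `[0, -1, 0, -1]`, that is $2 = -\tau - \tau^3$ on $E_1$, which can be confirmed with `tau_eval`.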
The binary method can be extended in a natural way to a $\tau$-ary method based on the representation of $n$ given by (21). Now we can define a parallel $p$th-order $\tau$-ary method.
Using (21) we can define the set $\{s_0, \ldots, s_{p-1}\} \subset \mathbb{Z}[\tau]$, where
$$s_i = \sum_{j=0}^{\lceil m/p \rceil - 1} c_{jp+i}\, \tau^{jp+i} \tag{22}$$
for $i = 0, 1, \ldots, p-1$.
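1. Using $p$ processors, compute in parallel
$$s_0P, s_1P, \ldots, s_{p-1}P,$$
running the $\tau$-ary algorithm on each processor.
2. Associative fan-in: Using $\lceil p/2 \rceil$ processors, compute
$$\sum_{k=0}^{p-1} s_k P$$
with the associative fan-in algorithm, in $\lceil \log p \rceil$ steps.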
If we define $H(n)$ as the number of nonzeros in the representation given by (21), we can state the following theorem:
Theorem 4. The running time of the $p$th-order $\tau$-ary method is, in the best case,
$$\lceil H(n)/p \rceil + \lceil \log p \rceil - 1$$
and
$$H(n) - 1$$
in the worst case.
Proof. The proof of this theorem is the same as that of Theorem 2, except that the time for phase one will be
$$T_1 = H^{*} - 1, \tag{23}$$
where $H^{*}$, the maximum of $H(s_i)$ over $0 \le i \le p-1$, is defined as in (11).
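The splitting (22) and the cost model of Theorem 4 can be illustrated on a list of digits $c_0, c_1, \ldots$ taken from a representation such as (21). The sketch keeps each $s_i$ as a digit list with zeros in the positions owned by other processors. The digit list in the usage note and the function names are assumptions made for this example.

```python
def split_digits(digits, p):
    """Processor i keeps digit c_k when k = i (mod p) and zero elsewhere, as in (22)."""
    return [[c if k % p == i else 0 for k, c in enumerate(digits)]
            for i in range(p)]


def weight(digits):
    """Number of nonzero digits."""
    return sum(1 for c in digits if c != 0)


def model_time(digits, p):
    """T1 + T2 with T1 = H* - 1 from (23) and T2 = ceil(log2 p)."""
    h_star = max(weight(part) for part in split_digits(digits, p))
    return (h_star - 1) + (p - 1).bit_length()
```

For the hypothetical digit list `[1, -1, 1, 1, -1, 1, 1, -1]`, which has 8 nonzeros, the model gives 7 steps for $p = 1$, 4 steps for $p = 2$ and 3 steps for $p = 4$, in agreement with the best-case expression of Theorem 4.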
Meier and Staffelbach [14] conjecture, based on experimental evidence, that on average half of the $c_i$ will be nonzero. So we can expect that when $p \approx m/2$ the running time will be close to $\log H(n)$. Further improvements can be made to the $\tau$-ary representation to reduce the number of nonzeros [21], so a smaller number of processors can bring a similar speed-up in the $p$th-order $\tau$-ary method.
5 Conclusions
In this paper we have discussed an extension of the binary algorithm which, by the use of parallel processing, can speed up the computation of scalar multiplications on elliptic curves. We have also discussed some ways to improve this algorithm, and we have shown how it can be especially efficient for Koblitz curves.
References
1. M. Adleman and K. Kompella, Using smoothness to achieve parallelism, Proceedings of the 20th ACM Symposium on the Theory of Computing, pp. 528-538, 1988.
2. P.W. Beame, S.A. Cook and H.J. Hoover, Log depth circuits for division and related problems, SIAM Journal on Computing, vol. 15, pp. 994-1003, 1986.
3. D.V. Chudnovsky and G.V. Chudnovsky, Sequences of numbers generated by addition in formal groups and new primality and factorization tests, Advances in Applied Mathematics, vol. 7, pp. 385-434, 1986.
4. H. Cohen, A. Miyaji and T. Ono, Efficient elliptic curve exponentiation, Advances in Cryptology - Proceedings of ICICS'97, LNCS 1334, pp. 282-290, 1997.
5. H. Cohen, A. Miyaji and T. Ono, Efficient elliptic curve exponentiation using mixed coordinates, Advances in Cryptology - ASIACRYPT'98, LNCS 1514, pp. 51-65, 1998.
6. D.M. Gordon, A survey of fast exponentiation methods, Journal of Algorithms, vol. 27, pp. 129-146, 1998.
7. J. Guajardo and C. Paar, Efficient algorithms for elliptic curve cryptosystems, Advances in Cryptology - CRYPTO'97, LNCS 1294, pp. 342-356, 1997.
8. G. Harper, A. Menezes and S. Vanstone, Public-key cryptosystems with very small key lengths, Advances in Cryptology - EUROCRYPT'92, LNCS 658, pp. 163-173, 1993.
9. D.E. Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, Addison-Wesley, 2nd ed., 1981.
10. N. Koblitz, Elliptic curve cryptosystems, Mathematics of Computation, vol. 48, pp. 203-209, 1987.
11. N. Koblitz, CM-curves with good cryptographic properties, Advances in Cryptology - CRYPTO'91, LNCS 576, pp. 279-287, 1992.
12. K. Koyama and T. Tsuruoka, Speeding up elliptic cryptosystems by using a signed binary window method, Advances in Cryptology - CRYPTO'92, LNCS 740, pp. 345-357, 1993.
13. C.H. Lim and P.J. Lee, More flexible exponentiation with precomputation, Advances in Cryptology - CRYPTO'94, LNCS 389, pp. 95-107, 1994.
14. W. Meier and O. Staffelbach, Efficient multiplication on certain nonsupersingular elliptic curves, Advances in Cryptology - CRYPTO'92, LNCS 740, pp. 333-344, 1993.
15. V. Miller, Uses of elliptic curves in cryptography, Advances in Cryptology - CRYPTO'85, LNCS 218, pp. 417-426, 1986.
16. F. Morain and J. Olivos, Speeding up the computations on an elliptic curve using addition-subtraction chains, Inform. Theor. Appl., 24, pp. 531-543, 1990.
17. R. Rivest, A. Shamir and L. Adleman, A method for obtaining digital signatures and public key cryptosystems, Communications of the ACM, vol. 21, no. 2, pp. 120-126, 1978.
18. R. Schroeppel, H. Orman, S. O'Malley and O. Spatscheck, Fast key exchange with elliptic curve systems, Advances in Cryptology - CRYPTO'95, LNCS 963, pp. 43-56, 1995.
19. J.H. Silverman, The Arithmetic of Elliptic Curves, GTM 106, Springer Verlag, 1986.
20. J. Solinas, An improved algorithm for arithmetic on a family of elliptic curves, Advances in Cryptology - CRYPTO'97, LNCS 1294, pp. 357-371, 1997.
[...] GF($2^n$), SIAM Journal on Computing, vol. 19, pp. 711-717, 1990.
23. J. von zur Gathen, Efficient and optimal exponentiation in finite fields, [...]