2012-01-3.presentationx
Ofer Schwarz, Winter 2012-2013
Advisor: Barukh Ziv
Elliptic Curve Cryptography
The EC Discrete Logarithm problem and Pollard's Rho attack
Background
ECDLP; The ECDLP attack; Project goals
Elliptic Curves
• Elliptic curves may be defined over any field
• Solutions $(x, y)$ to the equation
$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$
• Obtain a simpler equation through a change of variables
  o Over $\mathbb{F}_p$: $y^2 = x^3 + ax + b$
  o Over $\mathbb{F}_{2^m}$: $y^2 + xy = x^3 + ax^2 + b$
• Define an additive group structure using geometry
  o The "point at infinity" serves as the unit (identity) element
Calculating $(x_3, y_3) = (x_1, y_1) + (x_2, y_2)$ over $\mathbb{F}_p$ (for $x_1 \neq x_2$):
$\lambda = \frac{y_2 - y_1}{x_2 - x_1}$
$x_3 = \lambda^2 - x_1 - x_2$
$y_3 = \lambda (x_1 - x_3) - y_1$
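The chord formulas above can be sketched in a few lines of Python. This is only an illustration, not the project's C++ library: the function name `ec_add`, the use of `None` for the point at infinity, and the tangent-slope branch for doubling (which the slide does not show) are assumptions added here. The toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ is a common textbook example.

```python
# Affine points are (x, y) tuples; None stands for the point at infinity.
def ec_add(p1, p2, a, p):
    """Add two points on y^2 = x^3 + a*x + b over F_p (p prime, p > 3)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = infinity (also covers doubling when y = 0)
    if p1 == p2:
        # Doubling: slope of the tangent line (not shown on the slide).
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        # Generic case: slope of the chord, as in the formulas above.
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

# Toy example on y^2 = x^3 + 2x + 2 over F_17 with P = (5, 1).
P = (5, 1)
double_P = ec_add(P, P, 2, 17)         # (6, 3)
triple_P = ec_add(double_P, P, 2, 17)  # (10, 6)
```

The modular inverse via `pow(v, -1, p)` needs Python 3.8 or later.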
ECDLP
• Elliptic Curve Discrete Logarithm Problem
• Computational hardness of DLP is the basis for many cryptographic systems (e.g., DSA, ElGamal)
• Given a finite field $F$,
• An elliptic curve $E$ over $F$,
• A point $P \in E(F)$ of order $n$ [$nP = \infty$],
• And another point $Q = kP \in \langle P \rangle$
• The problem: find $k$
ECDLP using collisions
• The idea: find $(a_1, b_1)$, $(a_2, b_2)$ such that
$a_1 P + b_1 Q = a_2 P + b_2 Q$
• Then we have $(a_1 - a_2) P = (b_2 - b_1) Q = (b_2 - b_1) k P$, so $k = (a_1 - a_2)(b_2 - b_1)^{-1} \bmod n$ when $b_2 - b_1$ is invertible modulo $n$
• Simple method to find a collision: birthday paradox
  o Very heavy memory requirements
• Pollard's Rho attack: same time, negligible memory
• The means: random functions
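As a small worked example of the collision step, the sketch below recovers $k$ from two coefficient pairs. The function name `dlog_from_collision` is hypothetical, and the code assumes the order $n$ is prime so that any nonzero difference $b_2 - b_1$ is invertible.

```python
def dlog_from_collision(a1, b1, a2, b2, n):
    """Return k with Q = k*P, given a1*P + b1*Q == a2*P + b2*Q.

    Returns None when b1 == b2 (mod n), since such a collision
    carries no information about k. Assumes n is prime.
    """
    db = (b2 - b1) % n
    if db == 0:
        return None
    return (a1 - a2) * pow(db, -1, n) % n

# Toy check with n = 19 and secret k = 7:
# both pairs satisfy a + 7*b == 0 (mod 19), so they describe the same point.
print(dlog_from_collision(3, 5, 12, 1, 19))  # prints 7
```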
Pollard's Rho
• Iterating any function over a finite set produces finite chains
• Each chain has a cycle, and a collision: $x \neq y$ such that $f(x) = f(y)$
• In a random function:
  o Expected tail length ≈ $\sqrt{\pi n/8}$
  o Expected cycle length ≈ $\sqrt{\pi n/8}$
• Use any cycle-detection method
  o E.g., Floyd's algorithm: ~$3\sqrt{n}$ EC operations
• Use a specific family of functions for which, given $X = aP + bQ$, it is easy to find $a', b'$ s.t. $f(X) = a'P + b'Q$
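Floyd's method can be sketched generically, independent of the curve. The function name `floyd_meeting_point` and the toy iteration function are illustrative assumptions; each loop step costs three evaluations of `f`, which is where the factor 3 in the operation count comes from.

```python
def floyd_meeting_point(f, x0):
    """Find the first i >= 1 with x_i == x_{2i}, where x_{j+1} = f(x_j).

    Returns (x_i, i). Uses constant memory: only two values are kept.
    """
    tortoise = f(x0)      # x_1
    hare = f(f(x0))       # x_2
    i = 1
    while tortoise != hare:
        tortoise = f(tortoise)   # one step
        hare = f(f(hare))        # two steps
        i += 1
    return tortoise, i

# Toy function on {0, ..., 10}: 0 -> 1 -> 2 -> 5 -> 4 -> 6 -> 4 -> ...
# The tail has length 4 and the cycle {4, 6} has length 2.
value, index = floyd_meeting_point(lambda x: (x * x + 1) % 11, 0)
```

In the rho attack the two walks that meet also carry their coefficients $(a, b)$, so the meeting point directly yields the two linear combinations needed for a collision.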
Additive walks
• Partition the curve into disjoint subsets $S_1, \dots, S_r$
  o E.g., according to the least significant $\log_2 r$ bits of the $x$ coordinate
• Choose random integers $a_i, b_i$ for $i = 1, \dots, r$
• For $X \in S_i$, define $f(X) = X + a_i P + b_i Q$
• For the starting element, choose a random $aP + bQ$
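Putting the additive walk together with Floyd's cycle detection gives a complete toy attack. The sketch below is a minimal illustration on a 19-element group, not the project's implementation: all function names, the choice `r = 4`, the partition rule `x mod r`, and the restart on a useless collision are assumptions made here, and $n$ is assumed prime.

```python
import random

def ec_add(p1, p2, a, p):
    """Affine addition on y^2 = x^3 + a*x + b over F_p; None is infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, pt, a, p):
    """Double-and-add scalar multiplication."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, pt, a, p)
        pt = ec_add(pt, pt, a, p)
        k >>= 1
    return result

def pollard_rho_ecdlp(P, Q, n, a, p, r=4, rng=random):
    """Find k with Q = k*P, where P has prime order n, via an additive walk."""
    while True:  # restart with fresh randomness after a useless collision
        # One random step R_i = a_i*P + b_i*Q per subset S_i.
        steps = []
        for _ in range(r):
            ai, bi = rng.randrange(n), rng.randrange(n)
            Ri = ec_add(ec_mul(ai, P, a, p), ec_mul(bi, Q, a, p), a, p)
            steps.append((ai, bi, Ri))

        def f(state):
            X, ca, cb = state  # invariant: X == ca*P + cb*Q
            i = 0 if X is None else X[0] % r  # subset index from x coordinate
            ai, bi, Ri = steps[i]
            return (ec_add(X, Ri, a, p), (ca + ai) % n, (cb + bi) % n)

        a0, b0 = rng.randrange(n), rng.randrange(n)
        X0 = ec_add(ec_mul(a0, P, a, p), ec_mul(b0, Q, a, p), a, p)
        start = (X0, a0, b0)
        # Floyd's cycle detection, comparing only the points.
        tortoise, hare = f(start), f(f(start))
        while tortoise[0] != hare[0]:
            tortoise, hare = f(tortoise), f(f(hare))
        db = (hare[2] - tortoise[2]) % n
        if db != 0:
            return (tortoise[1] - hare[1]) * pow(db, -1, n) % n

# Toy curve y^2 = x^3 + 2x + 2 over F_17; P = (5, 1) has order 19.
P = (5, 1)
Q = ec_mul(7, P, 2, 17)
k = pollard_rho_ecdlp(P, Q, 19, 2, 17, rng=random.Random(1))
```

Because the next point depends only on the current point, the walk on points is a deterministic function and Floyd's method always terminates; the coefficients simply ride along.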
Pohlig-Hellman reduction
• Assume $n = p_1^{e_1} \cdots p_s^{e_s}$
• Reduces ECDLP of order $n$ to $e_i$ instances of order $p_i$ for $i = 1, \dots, s$
• Uses the Chinese remainder theorem and the group structure
• Significance: ECDLP of order $n$ is only as hard as ECDLP in a subgroup whose order is the largest prime factor of $n$
• Usually the parameters are chosen so that $n$ is prime
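The reduction can be illustrated in a few lines. To keep the sketch short it deliberately swaps in the multiplicative group modulo a small prime instead of a curve, and it solves each prime-order instance by brute force rather than by Pollard's Rho; the function name `pohlig_hellman` and the toy parameters are assumptions. The structure (one digit per prime-order instance, then the Chinese remainder theorem) is the same for a curve group.

```python
def pohlig_hellman(g, h, p, factorization):
    """Solve g^k = h (mod p), where g generates the group of order n = p - 1.

    factorization maps each prime q dividing n to its exponent e.
    """
    n = p - 1
    residues, moduli = [], []
    for q, e in factorization.items():
        gamma = pow(g, n // q, p)  # element of order q
        x = 0                      # k mod q^e, built digit by digit in base q
        for j in range(e):
            # Remove the known digits, then project into the order-q subgroup.
            target = pow(h * pow(g, -x, p) % p, n // q ** (j + 1), p)
            # One DLP instance of order q (brute force stands in for rho).
            digit = next(d for d in range(q) if pow(gamma, d, p) == target)
            x += digit * q ** j
        residues.append(x)
        moduli.append(q ** e)
    # Combine the residues with the Chinese remainder theorem.
    k, m = 0, 1
    for r_i, m_i in zip(residues, moduli):
        t = (r_i - k) * pow(m, -1, m_i) % m_i
        k, m = k + m * t, m * m_i
    return k

# Toy example: 2 generates the multiplicative group mod 37, of order 36 = 2^2 * 3^2.
recovered = pohlig_hellman(2, pow(2, 29, 37), 37, {2: 2, 3: 2})
```

Here the order-36 problem is solved with two instances of order 2 and two of order 3, matching the $e_i$ instances of order $p_i$ in the list above.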
Project goals
• Implement a generic EC arithmetic library
• Implement the ECDLP attack
• Research and implement various improvements and optimizations for the attack
• Ultimate goal: solve 64-bit ECDLP (i.e., $n \approx 2^{64}$)
Improvements and optimizations
Nivasch's algorithm; Montgomery trick and distinguished point method; Negation map
1. Nivasch's algorithm
• Cycle detection using stacks
• The idea: find the smallest value in the cycle
  o Keep a stack of values encountered so far
  o For each new value, remove all values larger than it
  o Stack is ordered by $(x_i, i)$, increasing in both
• Improvement: use $K$ stacks, with partitioning
  o Look for the smallest value on the cycle in each subset separately
• Expected runtime: $\left(1 + \frac{1}{2(K+1)}\right)\sqrt{\pi n/2}$
• Expected memory: $O(K \log n)$
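A single-stack version of the idea can be sketched as follows. The function name `nivasch_cycle` is an assumption, and the values only need to be totally ordered (integers here; coordinate tuples would also work). The $K$-stack variant would keep one such stack per partition class.

```python
def nivasch_cycle(f, x0):
    """Single-stack Nivasch cycle detection.

    Returns (smallest value on the cycle, cycle length).
    The stack holds (value, index) pairs that increase in both components.
    """
    stack = []
    x, i = x0, 0
    while True:
        # Discard every stored value larger than the new one.
        while stack and stack[-1][0] > x:
            stack.pop()
        # If the new value is already on top, it was seen before: a cycle.
        if stack and stack[-1][0] == x:
            return x, i - stack[-1][1]
        stack.append((x, i))
        x = f(x)
        i += 1

# Toy function on {0, ..., 10}: 0 -> 1 -> 2 -> 5 -> 4 -> 6 -> 4 -> ...
smallest, length = nivasch_cycle(lambda x: (x * x + 1) % 11, 0)
```

Unlike Floyd's method, each step evaluates `f` only once, which is consistent with the reduced iteration count reported for this optimization.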
2. The Montgomery trick
• Inversion is the most expensive field operation
• Compute several inversions simultaneously
• The trick: use accumulating products:
$a_i^{-1} = \left(\prod_{j=1}^{i-1} a_j\right) \cdot \left(\prod_{j=1}^{M} a_j\right)^{-1} \cdot \left(\prod_{j=i+1}^{M} a_j\right)$
• Substitute $M$ inversions with $3(M-1)$ multiplications and 1 inversion
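A minimal sketch of simultaneous inversion modulo a prime, using the usual running-product formulation (prefix products forward, then one inversion that is unwound backward) rather than evaluating the product formula literally. The function name `batch_inverse` is an assumption; all inputs must be nonzero modulo `p`.

```python
def batch_inverse(values, p):
    """Invert every element of values modulo the prime p with one inversion."""
    m = len(values)
    # prefix[i] = values[0] * ... * values[i-1] (mod p)
    prefix = [1] * (m + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] * v % p
    # The single inversion: inverse of the product of all values.
    running_inv = pow(prefix[m], -1, p)
    inverses = [0] * m
    for i in range(m - 1, -1, -1):
        # running_inv is the inverse of values[0] * ... * values[i].
        inverses[i] = prefix[i] * running_inv % p
        running_inv = running_inv * values[i] % p
    return inverses

# Example: invert 2, 3 and 5 modulo 17 in one pass.
inv = batch_inverse([2, 3, 5], 17)
```

In the attack, the values being inverted would be the denominators $x_2 - x_1$ of the slopes from several walks advanced in lockstep.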
Local parallelization
• Montgomery's trick requires several parallel instances (all running locally)
• Naïve parallelization of $M$ instances only results in a $\sqrt{M}$ speedup
• The distinguished point method yields a speedup factor of $M$
• The result: we can use Montgomery's trick without losing efficiency!
Distinguished points
• Pollard's Rho chains may intersect
• Use the same function in all instances
• Keep a hash table of points
• Only insert "distinguished" points
• Common method: the $d$ least significant bits of the $y$ coordinate are all 0
• Gives the same speedup factor, but saves a factor of $2^d$ in memory
3. Negation map
• Method for improving the attack by a factor of $\sqrt{2}$
• The idea: given a point $P \in E(F)$, it's very easy to calculate $-P$
  o In prime curves: $-(x, y) = (x, -y)$
• The idea: "group" each point and its negative as a single element
  o E.g., use the one with an even $y$ coordinate
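Choosing the representative with an even $y$ coordinate can be sketched as below. The function name `canonical` is an assumption, and the code relies on $p$ being an odd prime, so that $y$ and $p - y$ have opposite parity whenever $y \neq 0$.

```python
def canonical(point, p):
    """Map P and -P to one representative: the one whose y (in [0, p)) is even.

    None stands for the point at infinity, which is its own negative.
    """
    if point is None:
        return None
    x, y = point
    return (x, y) if y % 2 == 0 else (x, (-y) % p)

# On a curve over F_17, (5, 1) and (5, 16) are negatives of each other;
# both map to (5, 16).
rep = canonical((5, 1), 17)
```

Applying `canonical` after every step of the walk makes it operate on the pairs $\{P, -P\}$; the coefficients $(a, b)$ have to be negated whenever the representative is flipped.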
Fruitless cycles
• Problem with the negation map in additive walks
• If $X \in S_i$ and $f(X) = -(X + a_i P + b_i Q) \in S_i$, then
$f(f(X)) = -(X + a_i P + b_i Q) + a_i P + b_i Q = -X$
• "Fruitless" because the linear combination is the same
• Happens with probability $\frac{1}{2r}$ at every step ($r$ = partition factor)
• Longer even-length cycles are also possible
  o Their probability decreases exponentially with cycle length
Resolving fruitless cycles
• The simplest idea actually works: just check!
• Check for 2-cycles every $w_2$ steps
  o When calculating $X_j = f(X_{j-1})$ for $j \equiv 0 \pmod{w_2}$
  o Check if $X_{j-1} = X_{j-3}$
  o If so, define $X_j = 2 \cdot \min\{X_{j-1}, X_{j-2}\}$
  o Still easy to calculate the linear combination
• Do the same for larger even lengths
  o Analysis shows that the optimal checking interval is $w_c \approx r^{c/2}$
  o Only need to check up to $c_{\max} = \frac{\log n}{\log r}$
Implementation and results
EC arithmetic library; Collision library; Challenges and results
Curve arithmetic library
• Generic EC arithmetic library in C++
• Support for various curves and algorithms
  o Extensible syntax that allows adding more curves and algorithms
• Fast field arithmetic using GMP and NTL
  o Incl. complex operations, e.g., Chinese remainders, modular square roots
Collision library
• Generic (templated) C++ library for finding collisions
• Only need to supply the function
• Currently implemented:
  o Floyd's algorithm
  o Nivasch's stack algorithm
  o Distinguished point method for parallelization
Challenges
• 4 ECDLP challenges of increasing difficulty
  o 30, 40, 50 and 64 bits
• 1 extra challenge with non-prime order for testing the Pohlig-Hellman reduction
Results!
• 64-bit challenge solved in ~16 hours, ~$2^{31}$ iterations
• Results from previous group: 60 bits in 5-6 days
• Best result to date: 112 bits in 3.5 months
  o Used a cluster of 218 PlayStation 3 consoles
  o Single-Instruction, Multiple-Data architecture
  o Heavy optimizations on all levels
Results!
[Charts: "Average time", runtime in seconds (log scale) vs. challenge bits (30, 40, 50, 64); "Average function calls", log2(#calls) vs. challenge bits (30, 40, 50, 64)]
Optimization tests
• Check every improvement against the vanilla version
• Nivasch: 2.16 times fewer iterations, 1.4 speedup factor
• Montgomery: 1.43 speedup factor for 40 bits, 1.33 factor for 30 bits
• Negation map: 1.1 times fewer iterations, no speedup
  o (Actually about 1.07 times slower)
Improvement ideas
• Distributed attack
• Low-level optimizations
  o Integer arithmetic
  o Field arithmetic (probably harder since NTL is very good at that)
  o In-place operations instead of constructors and copying
• Use SIMD architecture (e.g., GPUs)
The End