
IELM 511: Information System design
Introduction
Part 1. ISD for well structured data – relational and other DBMS
Info storage (modeling, normalization)
Info retrieval (Relational algebra, Calculus, SQL)
DB integrated API’s
Part 2. ISD for systems with non-uniformly structured data
Basics of web-based IS (www, web2.0, …)
Markup’s, HTML, XML
Design tools for Info Sys: UML
Part 3. (subset of)
API’s for mobile apps
Security, Cryptography
IS product lifecycles
Algorithm analysis, P, NP, NPC
Agenda
The mathematical basis for RSA encryption
Modulo mathematics: +; *; ^
How RSA is implemented
Proof of correctness of RSA
Concluding remarks
Need for RSA
In the last lecture, we saw the use of (shared) private key cryptography
Example: E-banking (you may need to physically get password)
Shared key cryptography does not solve all communication problems:
Examples: Secure E-commerce (how did you exchange password with
Amazon? with Yahoo shopping ?)
We also saw the need for a public-key private-key
encryption systems (digital signatures, secure transmission)
In this lecture, we look at the theoretical basis for the RSA algorithm,
which is used (in some form or other) in public-private key cryptography
The theoretical basis for the RSA algorithm: Number theory, Algorithms
Modulo mathematics
Given an integer m and positive integer n,
m mod n is the smallest nonnegative integer r such that for some integer q
m = nq + r
Examples:
27 mod 3 = 0 [since 27 = 3*9 + 0]
27 mod 4 = 3 [since 27 = 4*6 + 3]
-27 mod 4 = 1 [since -27 = 4*(-7) + 1]
Note: this definition works for positive and negative m
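A minimal Python sketch of this definition, assuming n > 0. The function name mod_nonneg is illustrative and not from the lecture. Python's built-in % operator already returns this nonnegative remainder when n is positive.

```python
def mod_nonneg(m: int, n: int) -> int:
    """Smallest nonnegative r such that m = n*q + r for some integer q."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    q = m // n  # floor division: q = -7 for m = -27, n = 4
    return m - n * q


print(mod_nonneg(27, 3))   # 0
print(mod_nonneg(27, 4))   # 3
print(mod_nonneg(-27, 4))  # 1
```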
Modulo ring
Zn is the set of integers {0, 1, . . . , n − 1} with two operators:
addition modulo n, denoted +n:
i +n j = (i + j) mod n
multiplication modulo n, denoted: *n:
i *n j = (i * j) mod n
Exercises:
Prove that +n and *n satisfy the commutative property;
Prove that *n distributes over +n
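Before attempting the proofs, the properties can be checked exhaustively for a small n. The sketch below is a numerical sanity check only, not a proof; the helper names add_mod, mul_mod and check_ring_properties are assumptions made for illustration.

```python
from itertools import product


def add_mod(i, j, n):
    """Addition modulo n (the +n operator)."""
    return (i + j) % n


def mul_mod(i, j, n):
    """Multiplication modulo n (the *n operator)."""
    return (i * j) % n


def check_ring_properties(n):
    """Test commutativity and distributivity on every element of Z_n."""
    zn = range(n)
    commutative = all(
        add_mod(i, j, n) == add_mod(j, i, n)
        and mul_mod(i, j, n) == mul_mod(j, i, n)
        for i, j in product(zn, repeat=2)
    )
    distributive = all(
        mul_mod(i, add_mod(j, k, n), n)
        == add_mod(mul_mod(i, j, n), mul_mod(i, k, n), n)
        for i, j, k in product(zn, repeat=3)
    )
    return commutative and distributive


print(check_ring_properties(12))  # True
```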
An insecure private key scheme: +n
In all discussion, we will assume that a message is a lower-case English text
message (with 26 characters)
In most encoding/decoding, we will use the notation a = 0; b = 1; … z =25
Scheme:
Secret key: integer k
Encode: Replace each letter x by x' = (x +26 k) = (x + k) mod 26.
Decode: Replace each letter x' by (x' –26 k) = (x' – k) mod 26.
Notes:
1. (x' – k) can be negative [hence the usefulness of our mod definition!]
2. Exercise: show that indeed ( (x +26 k) –26 k ) = x
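The scheme can be sketched in Python as follows, assuming the message contains only the letters a to z. The names encode and decode are illustrative.

```python
import string

ALPHABET = string.ascii_lowercase  # a = 0, b = 1, ..., z = 25


def encode(message: str, k: int) -> str:
    """Replace each letter x by (x + k) mod 26."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in message)


def decode(cipher: str, k: int) -> str:
    """Replace each letter x' by (x' - k) mod 26."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - k) % 26] for ch in cipher)


secret = encode("hello", 3)
print(secret)             # khoor
print(decode(secret, 3))  # hello
```

Note that (x' - k) may be negative before the mod is taken; the % operator in Python still returns a value in 0..25.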
An insecure private key scheme: +n
Scheme:
Secret key: integer k
Encode: Replace each letter x by x' = (x +26 k) = (x + k) mod 26.
Decode: Replace each letter x' by (x' – k) mod 26.
Q: Why is this scheme insecure ?
Answer:
A scheme is insecure if an efficient algorithm exists that can decrypt an
encrypted message without knowledge of the key, k
In our scheme, k can have any value (infinite possibilities), BUT
To find k, how many values do we need to try ? Only 26 (k = 0, 1, …, 25).
Why ? i mod n = (i + tn) mod n for all integers t, so the keys k and k + 26t encode every letter identically.
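A short sketch of the resulting attack, assuming lower-case messages: an eavesdropper simply tries the 26 possible shifts and picks the one that reads as English. The function names shift and brute_force are hypothetical.

```python
def shift(text: str, k: int) -> str:
    """Shift each lower-case letter by k positions modulo 26."""
    return "".join(chr((ord(ch) - ord("a") + k) % 26 + ord("a")) for ch in text)


def brute_force(cipher: str):
    """Return all 26 candidate plaintexts, indexed by the guessed key."""
    return [shift(cipher, -k) for k in range(26)]


# Keys that differ by a multiple of 26 give the same ciphertext.
print(shift("hello", 3) == shift("hello", 29))  # True

candidates = brute_force("khoor")
print(candidates[3])  # hello
```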
So +n does not work; how about *n ?
Scheme:
1. Code the message into (a series of) number(s): Message = M
2. Private key: integers a,n
3. Encode: f_{a,n}(M) = (a *n M) = (a * M) mod n.
4. Decode: ??
For this scheme, we need an inverse for multiplication mod n, namely
some function g_{a,n}(X) = a^(-1) *n X such that g_{a,n}(f_{a,n}(M)) = M.
Question: Is there some such function g( ) ?
In other words, we are looking for a definition of a multiplicative inverse.
Crypto scheme using *n …
[Figure: multiplication table for Z12; rows indexed by a, columns by M, entries f_{a,n}(M) = (a *n M)]
Suppose: (a, n, M) = (4, 12, 3)
4 * 3 mod 12 = 0
Recipient gets message = 0; from the Z12 table, row a = 4, there are four possible values of M.
⇒ Impossible to decrypt!
Crypto scheme using *n …
[Figure: multiplication table for Z12; rows indexed by a, columns by M, entries f_{a,n}(M) = (a *n M)]
Second try: (a, n, M) = (5, 12, 7)
5 * 7 mod 12 = 11
Only one entry = 11 in the Z12 table, row a = 5
⇒ Recipient decrypts M = 7 !
Conclusion: This scheme works iff all entries in row a of the Zn table are
unique (and indeed, are a permutation of the set {0, 1, …, n-1})
Question: which combination of values n, a have this property ?
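One way to explore this question numerically is to build each row of the Zn table and test whether it is a permutation. This is a small sketch; the names row and decodable_keys are made up for illustration.

```python
def row(a, n):
    """Row a of the Z_n multiplication table: (a * M) mod n for M = 0..n-1."""
    return [(a * m) % n for m in range(n)]


def decodable_keys(n):
    """Values of a whose row is a permutation of {0, 1, ..., n-1}."""
    return [a for a in range(n) if sorted(row(a, n)) == list(range(n))]


print(row(4, 12))          # [0, 4, 8, 0, 4, 8, 0, 4, 8, 0, 4, 8]
print(decodable_keys(12))  # [1, 5, 7, 11]
```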
Primes, Relative primes, and GCD's in *n
A number > 1 is called a prime if it can only be divided by itself or 1
with no remainder.
Given two numbers, a and b, we define gcd( a, b) as the largest integer that
divides both a and b without remainder.
Two numbers, a and b, are called relatively prime if gcd( a, b) = 1.
Examples:
2, 3, 5, 7 .. are prime numbers
How many prime numbers are there?
gcd( 12, 3) = 3
gcd( 12, 5) = 1
Given prime number p, what is gcd( p, n) = ?
Primes, Relative primes, and GCD's in *n
A useful theorem and corollary
Theorem 1. Given two positive integers j, k, gcd(j, k) = 1 iff
there are integers x and y such that jx + ky = 1.
Corollary 2. For any positive integer n, an element a ∈ Zn
has a multiplicative inverse if and only if gcd(a, n) = 1.
How to compute gcd( a, b): Euclid's method
Lemma 3. Let j, k, q, and r be nonnegative integers such that k = jq + r. Then gcd(j, k)
= gcd(r, j).
Proof:
case 1. r = 0
gcd( r, j) = gcd( 0, j) = j (since everything divides 0), and
k = jq, therefore gcd( k, j) = j
case 2. r > 0
(i) let d be a common factor of j and k ⇒ ∃ integers x, y > 0 such that
j = xd and k = yd;
yd = xdq + r ⇒ r = d( y – xq) ⇒ d is a factor of r.
(ii) let d be a common factor of r, j ⇒ ∃ integers x, y > 0 such that
r = dx and j = dy;
k = dyq + dx = d( yq + x) ⇒ d is a common factor of k, j.
From (i) and (ii), d is a common factor of r, j iff it is a common factor of j, k, which
implies that gcd( j, k) = gcd( r, j).
How to compute gcd( a, b): Euclid's method
Lemma 3. Let j, k, q, and r be nonnegative integers such that k = jq + r, then
gcd(j, k) = gcd(r, j).
Algorithm gcd( k, j)
// 0 ≤ j < k
If (j = 0) return( k)
Else
    r = k mod j; // therefore k = jq + r
    return gcd(j, r)
Example:
gcd( 235, 141)
iteration 1: gcd( 235, 141): k = 235; j = 141; r = k mod j = 235 – 1 * 141 = 94
iteration 2: gcd( 141, 94): k = 141; j = 94; r = 141 - 1 * 94 = 47
iteration 3: gcd( 94, 47) : k = 94; j = 47; r = 94 – 2 * 47 = 0
iteration 4. gcd( 47, 0): returns 47.
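Euclid's method translates almost line for line into Python. The sketch below assumes nonnegative integer inputs and mirrors the recursive form of the algorithm.

```python
def gcd(k: int, j: int) -> int:
    """Euclid's method: gcd(k, j) = gcd(j, k mod j), with gcd(k, 0) = k."""
    if j == 0:
        return k
    r = k % j  # therefore k = j*q + r for some integer q
    return gcd(j, r)


print(gcd(235, 141))  # 47
print(gcd(12, 5))     # 1
```

Python's standard library also provides math.gcd, which could be used instead of a hand-written version.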
Can we use *n and its inverse to design Asymmetric keys?
Not quite – such a mechanism is not secure.
First, let's look at the scheme that works: RSA
RSA (named after Profs. Rivest, Shamir & Adleman) was proposed in the 1970s at MIT
It is the basis of almost all eCommerce security today
Main idea:
- The public key, Kp, provides a mechanism to encode the Message
- Given Kp and encrypted message M* = rsa( Kp, M) we cannot efficiently compute Kp^(-1)
- The secret key, Ks, provides an efficient means to compute Kp^(-1)
Before studying the theory behind RSA, let's first see how RSA functions.
The RSA scheme
1. Select two large prime numbers, p and q
2. Let n = pq; let T = ( p - 1)( q - 1)
3. Select a large prime, e (e != 1), such that gcd( e, T) = 1
4. Calculate d = e^(-1) mod T
5. The public key, Kp is (n ,e)
6. The secret key, Ks is d
Notes:
Large prime: a prime number with 150 digits or more (later we shall see why)
Is T prime ?
In step 3, e is selected so that e, T are relatively prime.
RSA: usage and security
Suppose Alice wants to send Bob a message, x ( 0 < x < n)
1. Alice gets Bob's public key, (n, e)
2. Alice computes x* = x^e mod n
3. Alice sends x* to Bob.
Bob wants to decrypt the message received from Alice:
1. Bob looks up his secret key, d
2. Bob computes x** = (x*)^d mod n
Claim: x** = x = original message that Alice wants to send.
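A toy run of key generation, encryption and decryption, using deliberately tiny primes so the numbers can be checked by hand. The values p = 61, q = 53, e = 17 and the function names are assumptions chosen for illustration; real keys use primes with 150 or more digits. The three-argument pow with exponent -1 requires Python 3.8 or later.

```python
from math import gcd

# Key generation (Bob)
p, q = 61, 53          # toy primes, far too small for real use
n = p * q              # 3233
T = (p - 1) * (q - 1)  # 3120
e = 17
assert gcd(e, T) == 1  # e and T must be relatively prime
d = pow(e, -1, T)      # d = e^(-1) mod T (Python 3.8+)


def encrypt(x: int, e: int, n: int) -> int:
    """Alice: x* = x^e mod n, using Bob's public key (n, e)."""
    return pow(x, e, n)


def decrypt(x_star: int, d: int, n: int) -> int:
    """Bob: x** = (x*)^d mod n, using his secret key d."""
    return pow(x_star, d, n)


x = 65
x_star = encrypt(x, e, n)
print(d)                      # 2753
print(x_star)                 # 2790
print(decrypt(x_star, d, n))  # 65
```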
To prove that RSA works, we need to prove the following:
1. Correctness: (x^e mod n)^d mod n = x
2. Security:
2.1. A party who knows n, e, and M^e mod n, but not p, q, or d cannot compute M
2.2. A party who knows n (public key) cannot find its factors p, q (otherwise they
could easily calculate d!)
Multiplicative inverse modulo n
RSA involves the following step:
… 4. Calculate d = e^(-1) mod T
What is e^(-1) ?
In Zn, we say that a^(-1) is the multiplicative inverse of a (!= 0) iff a *n a^(-1) = a^(-1) *n a = 1
Does such an inverse always exist ? If so, how can we compute it ?
Example (the values listed are for Z12; a dash means no inverse exists):
a      | 1  2  3  4  5  6  7  8  9  10  11
a^(-1) | 1  -  -  -  5  -  7  -  -  -   11
Computing the multiplicative inverse
Recall Theorem 1.
Given two positive integers j, k, gcd(j, k) = 1 iff
there are integers x and y such that jx + ky = 1.
We need a solution to: a *n x = 1, which is the same as
ax mod n = 1
⇒ ax = qn + r (for some integer q, and r = 1),
⇒ ax + (-q)n = 1
Claim: If a ∈ Zn, and x, y are integers such that ax + ny = 1, then a^(-1) = x mod n
Proof (sketch):
a *n x = (a *n x) +n (n *n y)    [since n *n y = 0]
= (ax + ny) mod n    [since (s + t) mod n = (s mod n + t mod n) mod n]
= 1
Exercise: prove this
Computing the multiplicative inverse..
To solve: a *n x = 1, we need to find two integers x, y such that (ax + ny) mod n =1
The following algorithm, with inputs a, n, solves for x (if it exists):
Algorithm gcd_xy( k, j)
// 0 < j < k
// returns: [x, y, gcd( j, k)] such that jx + ky = gcd( j, k)
1. If k = jq for some integer q, return [x = 1, y = 0, gcd( j, k) = j];
2. Else
3.   r = k mod j; // therefore k = jq + r
4.   q = (k – r)/j
5.   [x', y', gcd( r, j)] = gcd_xy( j, r) // so rx' + jy' = gcd( r, j)
6.   return [x = y' – qx', y = x', gcd( r, j)]
Exercise: prove that step 6 returns the correct values of x, y
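A Python sketch of the extended algorithm and of the inverse computation built on it. It keeps the slide's convention that gcd_xy(k, j) returns x, y with j*x + k*y = gcd(j, k), and assumes j > 0. The wrapper mod_inverse is a hypothetical helper, not part of the lecture.

```python
def gcd_xy(k: int, j: int):
    """Return (x, y, g) such that j*x + k*y == g == gcd(j, k); assumes j > 0."""
    if k % j == 0:
        return 1, 0, j  # j*1 + k*0 = j
    r = k % j           # therefore k = j*q + r
    q = (k - r) // j
    x1, y1, g = gcd_xy(j, r)  # r*x1 + j*y1 = g
    # Substitute r = k - j*q:  j*(y1 - q*x1) + k*x1 = g
    return y1 - q * x1, x1, g


def mod_inverse(a: int, n: int) -> int:
    """Multiplicative inverse of a in Z_n, if gcd(a, n) = 1."""
    a %= n
    if a == 0:
        raise ValueError("0 has no multiplicative inverse")
    x, _, g = gcd_xy(n, a)
    if g != 1:
        raise ValueError("no inverse: gcd(a, n) != 1")
    return x % n


print(gcd_xy(12, 5))          # (5, -2, 1), since 5*5 + 12*(-2) = 1
print(mod_inverse(5, 12))     # 5
print(mod_inverse(17, 3120))  # 2753
```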
Correctness of RSA
1. Select two large prime numbers, p and q
2. Let n = pq; let T = ( p - 1)( q - 1)
3. Select a large prime, e (e != 1), such that gcd( e, T) = 1
4. Calculate d = e^(-1) mod T
5. The public key, Kp is (n ,e)
6. The secret key, Ks is d
We need to prove that: (x^e mod n)^d mod n = x
We will use the following: For any a ∈ Zn and non-negative integers i, j
(a) (a^i mod n) *n (a^j mod n) = a^(i+j) mod n
(b) (a^i mod n)^j mod n = a^(ij) mod n
and
Fermat's little theorem:
Let p be a prime number. Then, for every nonzero a ∈ Zp, a^(p−1) mod p = 1.
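These facts can be spot-checked numerically before relying on them in the proof. The sketch below is a sanity check under small inputs, not a proof; the names check_power_rules and fermat_holds are illustrative.

```python
def check_power_rules(a: int, i: int, j: int, n: int) -> bool:
    """Check rules (a) and (b) for one choice of a, i, j, n."""
    rule_a = (pow(a, i, n) * pow(a, j, n)) % n == pow(a, i + j, n)
    rule_b = pow(pow(a, i, n), j, n) == pow(a, i * j, n)
    return rule_a and rule_b


def fermat_holds(p: int) -> bool:
    """True if a^(p-1) mod p == 1 for every nonzero a in Z_p."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


print(check_power_rules(7, 5, 9, 12))  # True
print(fermat_holds(13))                # True
print(fermat_holds(12))                # False: 12 is not prime
```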
Correctness of RSA…
primes: p, q
n = pq
T = ( p - 1)( q - 1)
e chosen such that gcd( e, T) = 1
d = e^(-1) mod T
We first prove that for prime p (or q), x mod p = x^(ed) mod p
ed mod T = 1 ⇒ there is some integer k such that ed = 1 + kT
x^(ed) mod p
= x^(1 + k(p-1)(q-1)) mod p
= x (x^(k(q-1)))^(p-1) mod p
case 1. x^(k(q-1)) is a multiple of p
⇒ x is a multiple of p (since p is prime)
⇒ x^(ed) mod p = 0 = x mod p
case 2. x^(k(q-1)) is not a multiple of p
⇒ (x^(k(q-1)))^(p-1) mod p = 1 (Fermat's little theorem)
⇒ x^(ed) mod p = x * 1 mod p = x mod p
Hence x^(ed) mod p = x mod p, and likewise x^(ed) mod q = x mod q (for prime numbers p, q)
⇒ x^(ed) – x is divisible by p (and by q) ⇒ x^(ed) – x = ip = jq ⇒ x^(ed) – x is also divisible by pq [why?]
⇒ x^(ed) – x = m (pq) = m n for some integer m
⇒ x^(ed) = mn + x. Therefore, for 0 ≤ x < n, x^(ed) mod n = x
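The argument can be checked exhaustively for a small key, including the messages that are multiples of p or q (case 1 of the proof). This is only a sketch under assumed toy values p = 5, q = 11, e = 3; the function name is hypothetical, and pow with exponent -1 needs Python 3.8 or later.

```python
def check_rsa_correctness(p: int, q: int, e: int) -> bool:
    """Verify x^(ed) = x modulo p, q and n for every x in 0..n-1."""
    n = p * q
    T = (p - 1) * (q - 1)
    d = pow(e, -1, T)  # d = e^(-1) mod T
    for x in range(n):
        if pow(x, e * d, p) != x % p:       # x^(ed) mod p = x mod p
            return False
        if pow(x, e * d, q) != x % q:       # x^(ed) mod q = x mod q
            return False
        if pow(pow(x, e, n), d, n) != x:    # (x^e mod n)^d mod n = x
            return False
    return True


print(check_rsa_correctness(5, 11, 3))  # True
```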
Security of RSA
primes: p, q
n = pq
T = ( p - 1)( q - 1)
e chosen such that gcd( e, T) = 1
d = e^(-1) mod T
To show that RSA is secure, we need some guarantee that
2.1. A party who knows n, e, and M^e mod n, but not p, q, or d cannot compute M
2.2. A party who knows n (public key) cannot find its factors p, q (otherwise they
could easily calculate d!)
Given n, e, and M^e mod n,
Can we work backwards and compute M ?
There is no known efficient algorithm to compute the e-th root of a number mod n.
[note: if n was always fixed, we could use a computer to build up a look-up
decrypting sheet!]
Given n (public key) can we find its factors p, q, and use them to compute T, and
then use e to compute d ?
So far, there is no known efficient algorithm to factorize a number.
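To see why the size of the primes matters, the sketch below recovers the secret key from a toy public key by trial division. The function names and the values n = 3233, e = 17 are assumptions for illustration; the running time of trial division grows roughly with the square root of n, which makes it hopeless for the key sizes used in practice.

```python
def factor_semiprime(n: int):
    """Trial division: return (p, q) with p <= q and p * q == n."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no nontrivial factor")


def recover_secret_key(n: int, e: int) -> int:
    """Compute d from the public key (n, e) by factoring n first."""
    p, q = factor_semiprime(n)
    T = (p - 1) * (q - 1)
    return pow(e, -1, T)  # Python 3.8+


print(factor_semiprime(3233))        # (53, 61)
print(recover_secret_key(3233, 17))  # 2753
```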
Discussion
RSA is currently the basis for almost all secure eCommerce
Examples:
banks (e.g. try hsbc.com, standardchartered.com.hk, …)
signed emails (e.g. HKUST's ITSC)
Once RSA has established a secure communication channel, two way
symmetric encryption is used, usually some variant of DES,
which is a block cipher algorithm.
Three important mathematicians whose works were used in this lecture:
Euclid (300 BC )
Fermat (17th century)
Euler (18th century)
References and Further Reading
Simon Singh, The Code Book, pub. Anchor press, 2000
PDF article giving brief introduction to RSA maths (Utah State, Prof Moon)
Wikipedia cryptography portal
Prof Deng Xiaotie/Prof Frances Yao’s lecture notes (City Univ, HK)
Prof M. Golin's lecture notes (CSE, HKUST)
Next: final exams