Skip to main content

A Practical Approach to the RSA Algorithm

So you're curious about the RSA algorithm, you like Internet technologies and mathematics?
This is a short article on how the RSA Algorithm works, with practical examples.

Preface and Disclaimer
Obviously the article itself and the code here presented are suitable for a learning activity only and in case you use them, you do it at your own responsibility.

I based this on the 1977's article from Rivest, Shamir and Adleman, (from which initial letters the algorithm takes the name), and ran the code on a Mac with python 2.6.1.

The code presented here is also not optimized. The reader is invited to suggest better implementations (in languages other than python as well).


Mathematics concepts you need to know
- What a 'prime number' is. (A integer number which can be divided evenly only by 1 and itself. e.g. 1, 2, 3, 5, 7, 11, ..., 47, ...)
- What the 'modulus' (here referred as 'mod') operation is. ('A mod B' is the reminder of A divided B. e.g. 7 mod 3 = 1)
- What 'factoring a number' means. (Finding the list of prime numbers that multiplied give that number. e.g. Factoring 12 leads to 2, 2, 3, because 2 * 2 * 3 = 12)

Usage of RSA
This algorithm is used to define a cryptographic system which uses a Public Key to encrypt messages, and a Private Key to decrypt them.

If Alice wants to send encrypted data to Bob, Alice encrypts the data with Bob's Public Key.
Only Bob, with his Private Key, can decrypt the data.

Security is the core of this algorithm, for which, if the keys are properly chosen, it's not possible, or so computationally hard to become impractical, to decrypt the data unless you have Bob's private key.

For this reason, the Public Key can be exchanged from Bob to Alice through an insecure transport, e.g. public Internet.


The mathematics behind it
The Public Key is a pair of integer numbers: e and n. (You can associate 'e' with 'e'ncryption).
The Private Key is a pair of integer numbers: d and n, (You can associate 'd' with 'd'ecryption).

'n' is computed as the product of two large prime numbers, p and q.
'd' is another prime number, bigger than p and q and 'relatively prime to p and q', for which exists an integer 'e', with this property:

e * d = 1 (mod fi)

where

fi = (p - 1) * (q - 1)

that is, the product of d and e, mod fi, equals 1, and said in other terms, 'd is the multiplicative inverse of e, modulus fi'.

'd is relatively prime to p and q' means that the greatest common divisor between d and fi is 1.

For the demonstration please refer to the original article and its related references.

If these conditions are met, then given a number M (< n) to be encrypted, Alice can encrypt it by computing:

C = (M exp e) mod n
Alice then sends C to Bob.
Once Bob receives C, he can decrypt it (because he knows the Private Key (d, n)) with:

D = (C exp d) mod n

D (the decrypted number) equals M (the original number).

As you can see, after having chosen the key components (n, d and e), encryption and decryption are conceptually easy (although consider that security relies on n being an extremely large number - say 200 digits).


Why is RSA secure

The algorithm is secure because to compute d, the Private Key, it's not sufficient to know 'n' and 'e' (which are included in the Public Key).
This is because d can be computed only knowing p and q and to know them, an attacker should retrieve them from n, i.e. it should 'factor n'.
It can be proved that 'factor n', when p and q are large, is so hard to be considered impossible (in a reasonable amount of... years).

A practical example
I'll use small values for p and q, just for the purpose of having simple computations that can be done with a normal calculator.
Again, remember that the security here relies on using very large prime numbers to compute n.

In fact, let's take the example from the original article:

p = 47
q = 59
d = 157
n = p * q = 2773
fi = (p - 1) * (q - 1) = 46 * 58 = 2668

With this values,
e = 17

So say Alice wants to encrypt the number 920 (associated through a dictionary to a combination of characters for example):

C = (M exp e) mod n = 948

Alice will then send 948 to Bob.

Bob will try to decrypt C with d:

D = (C exp d) mod n = 920

D = M, success!


How to compute d and e
The tricky bit, as you may have guessed, was to compute d and e.
Given p and q, and hence n, d was chosen from a set of candidates (the 'keyspaceì').
e was computed from d, p and q.

To experiment on this, first I wrote the algorithm to compute e given p, q and k:


def compute_bn(xn, x0, x1):
 an = -1
 bn = (xn - (an * x0)) / x1
 return bn

def compute_e(p, q, d):
 x0 = (p - 1) * (q - 1)
 x1 = d

 xnm2 = x0
 xnm1 = x1

 while True:
   xn = xnm2 % xnm1

   if (xn == 0):
     return compute_bn(xnm1, x0, x1)

  xnm2 = xnm1
  xnm1 = xn

 return 0



then I cycled through prime numbers in a certain range looking for values of d which gave an integer value for e:


#! /usr/bin/python

from rsa import *

def find_p_q_d_in_limit(starting_p, starting_q, limit):
 M = 920
 for p in range(starting_p, limit):
   if is_prime(p):
     for q in range(starting_q, limit):
       if is_prime(q):
          for d in range(starting_q + 1, limit):
            if (d > p and d > q and is_prime(d)):
              if (compute_and_check(p, q, d, M) == 0):
                print "p: ", p, ", q: ", q, ", d: ", d

find_p_q_d_in_limit(31, 37, 1000)


where compute_and_check(p, q, d, M) is:


def compute_and_check(p, q, d, M):
 n = p * q
 e = compute_e(p, q, d)
 Mexpe = M ** e
 C = Mexpe % n
 D = (C ** d) % n

 if M == D:
   print "p: ", p, ", q: ", q, ", d: ", d,
   return 0

 return 1



and is_prime(N):


def is_prime(N):
  i = 2
  while (i < N):
    if (N % i == 0):
      return False
    else:
      i += 1
  return True 



With small values of p and q I've got a list of possible candidates for d, e.g.:

[...]
p:  47 , q:  59 , d:  157
p:  47 , q:  61 , d:  251
p:  47 , q:  79 , d:  97
p:  47 , q:  97 , d:  631
p:  47 , q:  97 , d:  773
p:  47 , q:  101 , d:  107
[...]

and then given p = 47 and q = 59 used d = 157 as in the original example.

The key of this process is the computation of e given d, p and q, which was done with these two functions:


def compute_bn(xn, x0, x1):
 an = -1
 bn = (xn - (an * x0)) / x1
 return bn

def compute_e(p, q, d):
 x0 = (p - 1) * (q - 1)
 x1 = d

 xnm2 = x0
 xnm1 = x1

 while True:

   xn = xnm2 % xnm1

   if (xn == 0):
     return compute_bn(xnm1, x0, x1)

  xnm2 = xnm1
  xnm1 = xn

 return 0



compute_e(p, q, d) gives the value of e using a variation of the Euclidean algorithm to compute the Greatest Common Divisor, as suggested by R, S and A in their article.

Encrypting
So at the end given you have a number to encrypt (M) and a public key (e, n) you can just call:

$ python encrypt.py 920 17 2773

encrypt.py


#! /usr/bin/python

 import sys
 from rsa import *

 M = int(sys.argv[1]);
 e = int(sys.argv[2]);
 n = int(sys.argv[3]);

 C = encrypt_M(M, e, n)

 print "C: ", C


with a very simple function (defined inside the module called rsa.py):


def encrypt_M(M, e, n):
 return pow(M, e, n)


(note here that I'm using a standard python function, pow(), which does all the 'M ** e mod n' computation for us)

Decrypting
When you have an encrypted number C, and the private key (d, n) you can just call:

$ python decrypt.py 948 157 2773

decrypt.py:

#! /usr/bin/python

 import sys
 from rsa import *

 C = int(sys.argv[1]);
 d = int(sys.argv[2]);
 n = int(sys.argv[3]);

 D = decrypt_C(C, d, n)

 print "D: ", D


with a very simple function (defined inside the module called rsa.py)


def decrypt_C(C, d, n):
 return pow(C, d, n)


Input is not sanity checked: you can/should add it depending on your needs.

Feedback is welcome!

Popular posts from this blog

Troubleshooting TURN

  WebRTC applications use the ICE negotiation to discovery the best way to communicate with a remote party. I t dynamically finds a pair of candidates (IP address, port and transport, also known as “transport address”) suitable for exchanging media and data. The most important aspect of this is “dynamically”: a local and a remote transport address are found based on the network conditions at the time of establishing a session. For example, a WebRTC client that normally uses a server reflexive transport address to communicate with an SFU. when running inside the home office, may use a relay transport address over TCP when running inside an office network which limits remote UDP targets. The same configuration (defined as “iceServers” when creating an RTCPeerConnection will work in both cases, producing different outcomes.

VoIP calls encoded with SILK: from RTP to WAV

SILK is a codec defined by Skype, but can be found in many VoIP clients, like CSipSimple . It comes in different flavours (sample rates and frame sizes), from narrowband (8 KHz) to wideband (24 KHz). Since Wireshark doesn't allow you to decode an RTP stream carrying SILK frames, I was curious to find a programmatic way to do it. In fact, this has also allowed to me to earn a "tumbleweed" badge in stackoverflow . You may argue that a Wireshark plugin would be the right solution, but that's probably for another day. Initially I thought it was sufficient to read the specification for RTP payload when using SILK ; the truth is that I had to reverse engineer a solution by looking at SILK SDK's test vectors. There, I discovered that a file containing SILK audio doesn't have the file header indicated in the IETF draft ("!#SILK"), but a slightly different one ("!#SILK_V3"). More importantly, each encoded frame is not preced...

Decrypt SDES SRTP from pcap

If you have a pcap file with encrypted RTP (SDES SRTP) and have access to the SIP signalling to see the keys, these instructions will help you decrypt the RTP payload and save it as raw audio. Optionally, depending on the codec, you can then import the raw audio in Wireshark and save it as an audio file. Steps