Skip to main content

A Practical Approach to the RSA Algorithm

So you're curious about the RSA algorithm, you like Internet technologies and mathematics?
This is a short article on how the RSA Algorithm works, with practical examples.

Preface and Disclaimer
Obviously the article itself and the code here presented are suitable for a learning activity only and in case you use them, you do it at your own responsibility.

I based this on the 1977's article from Rivest, Shamir and Adleman, (from which initial letters the algorithm takes the name), and ran the code on a Mac with python 2.6.1.

The code presented here is also not optimized. The reader is invited to suggest better implementations (in languages other than python as well).


Mathematics concepts you need to know
- What a 'prime number' is. (A integer number which can be divided evenly only by 1 and itself. e.g. 1, 2, 3, 5, 7, 11, ..., 47, ...)
- What the 'modulus' (here referred as 'mod') operation is. ('A mod B' is the reminder of A divided B. e.g. 7 mod 3 = 1)
- What 'factoring a number' means. (Finding the list of prime numbers that multiplied give that number. e.g. Factoring 12 leads to 2, 2, 3, because 2 * 2 * 3 = 12)

Usage of RSA
This algorithm is used to define a cryptographic system which uses a Public Key to encrypt messages, and a Private Key to decrypt them.

If Alice wants to send encrypted data to Bob, Alice encrypts the data with Bob's Public Key.
Only Bob, with his Private Key, can decrypt the data.

Security is the core of this algorithm, for which, if the keys are properly chosen, it's not possible, or so computationally hard to become impractical, to decrypt the data unless you have Bob's private key.

For this reason, the Public Key can be exchanged from Bob to Alice through an insecure transport, e.g. public Internet.


The mathematics behind it
The Public Key is a pair of integer numbers: e and n. (You can associate 'e' with 'e'ncryption).
The Private Key is a pair of integer numbers: d and n, (You can associate 'd' with 'd'ecryption).

'n' is computed as the product of two large prime numbers, p and q.
'd' is another prime number, bigger than p and q and 'relatively prime to p and q', for which exists an integer 'e', with this property:

e * d = 1 (mod fi)

where

fi = (p - 1) * (q - 1)

that is, the product of d and e, mod fi, equals 1, and said in other terms, 'd is the multiplicative inverse of e, modulus fi'.

'd is relatively prime to p and q' means that the greatest common divisor between d and fi is 1.

For the demonstration please refer to the original article and its related references.

If these conditions are met, then given a number M (< n) to be encrypted, Alice can encrypt it by computing:

C = (M exp e) mod n
Alice then sends C to Bob.
Once Bob receives C, he can decrypt it (because he knows the Private Key (d, n)) with:

D = (C exp d) mod n

D (the decrypted number) equals M (the original number).

As you can see, after having chosen the key components (n, d and e), encryption and decryption are conceptually easy (although consider that security relies on n being an extremely large number - say 200 digits).


Why is RSA secure

The algorithm is secure because to compute d, the Private Key, it's not sufficient to know 'n' and 'e' (which are included in the Public Key).
This is because d can be computed only knowing p and q and to know them, an attacker should retrieve them from n, i.e. it should 'factor n'.
It can be proved that 'factor n', when p and q are large, is so hard to be considered impossible (in a reasonable amount of... years).

A practical example
I'll use small values for p and q, just for the purpose of having simple computations that can be done with a normal calculator.
Again, remember that the security here relies on using very large prime numbers to compute n.

In fact, let's take the example from the original article:

p = 47
q = 59
d = 157
n = p * q = 2773
fi = (p - 1) * (q - 1) = 46 * 58 = 2668

With this values,
e = 17

So say Alice wants to encrypt the number 920 (associated through a dictionary to a combination of characters for example):

C = (M exp e) mod n = 948

Alice will then send 948 to Bob.

Bob will try to decrypt C with d:

D = (C exp d) mod n = 920

D = M, success!


How to compute d and e
The tricky bit, as you may have guessed, was to compute d and e.
Given p and q, and hence n, d was chosen from a set of candidates (the 'keyspaceì').
e was computed from d, p and q.

To experiment on this, first I wrote the algorithm to compute e given p, q and k:


def compute_bn(xn, x0, x1):
 an = -1
 bn = (xn - (an * x0)) / x1
 return bn

def compute_e(p, q, d):
 x0 = (p - 1) * (q - 1)
 x1 = d

 xnm2 = x0
 xnm1 = x1

 while True:
   xn = xnm2 % xnm1

   if (xn == 0):
     return compute_bn(xnm1, x0, x1)

  xnm2 = xnm1
  xnm1 = xn

 return 0



then I cycled through prime numbers in a certain range looking for values of d which gave an integer value for e:


#! /usr/bin/python

from rsa import *

def find_p_q_d_in_limit(starting_p, starting_q, limit):
 M = 920
 for p in range(starting_p, limit):
   if is_prime(p):
     for q in range(starting_q, limit):
       if is_prime(q):
          for d in range(starting_q + 1, limit):
            if (d > p and d > q and is_prime(d)):
              if (compute_and_check(p, q, d, M) == 0):
                print "p: ", p, ", q: ", q, ", d: ", d

find_p_q_d_in_limit(31, 37, 1000)


where compute_and_check(p, q, d, M) is:


def compute_and_check(p, q, d, M):
 n = p * q
 e = compute_e(p, q, d)
 Mexpe = M ** e
 C = Mexpe % n
 D = (C ** d) % n

 if M == D:
   print "p: ", p, ", q: ", q, ", d: ", d,
   return 0

 return 1



and is_prime(N):


def is_prime(N):
  i = 2
  while (i < N):
    if (N % i == 0):
      return False
    else:
      i += 1
  return True 



With small values of p and q I've got a list of possible candidates for d, e.g.:

[...]
p:  47 , q:  59 , d:  157
p:  47 , q:  61 , d:  251
p:  47 , q:  79 , d:  97
p:  47 , q:  97 , d:  631
p:  47 , q:  97 , d:  773
p:  47 , q:  101 , d:  107
[...]

and then given p = 47 and q = 59 used d = 157 as in the original example.

The key of this process is the computation of e given d, p and q, which was done with these two functions:


def compute_bn(xn, x0, x1):
 an = -1
 bn = (xn - (an * x0)) / x1
 return bn

def compute_e(p, q, d):
 x0 = (p - 1) * (q - 1)
 x1 = d

 xnm2 = x0
 xnm1 = x1

 while True:

   xn = xnm2 % xnm1

   if (xn == 0):
     return compute_bn(xnm1, x0, x1)

  xnm2 = xnm1
  xnm1 = xn

 return 0



compute_e(p, q, d) gives the value of e using a variation of the Euclidean algorithm to compute the Greatest Common Divisor, as suggested by R, S and A in their article.

Encrypting
So at the end given you have a number to encrypt (M) and a public key (e, n) you can just call:

$ python encrypt.py 920 17 2773

encrypt.py


#! /usr/bin/python

 import sys
 from rsa import *

 M = int(sys.argv[1]);
 e = int(sys.argv[2]);
 n = int(sys.argv[3]);

 C = encrypt_M(M, e, n)

 print "C: ", C


with a very simple function (defined inside the module called rsa.py):


def encrypt_M(M, e, n):
 return pow(M, e, n)


(note here that I'm using a standard python function, pow(), which does all the 'M ** e mod n' computation for us)

Decrypting
When you have an encrypted number C, and the private key (d, n) you can just call:

$ python decrypt.py 948 157 2773

decrypt.py:

#! /usr/bin/python

 import sys
 from rsa import *

 C = int(sys.argv[1]);
 d = int(sys.argv[2]);
 n = int(sys.argv[3]);

 D = decrypt_C(C, d, n)

 print "D: ", D


with a very simple function (defined inside the module called rsa.py)


def decrypt_C(C, d, n):
 return pow(C, d, n)


Input is not sanity checked: you can/should add it depending on your needs.

Feedback is welcome!

Popular posts from this blog

Troubleshooting TURN

  WebRTC applications use the ICE negotiation to discovery the best way to communicate with a remote party. I t dynamically finds a pair of candidates (IP address, port and transport, also known as “transport address”) suitable for exchanging media and data. The most important aspect of this is “dynamically”: a local and a remote transport address are found based on the network conditions at the time of establishing a session. For example, a WebRTC client that normally uses a server reflexive transport address to communicate with an SFU. when running inside the home office, may use a relay transport address over TCP when running inside an office network which limits remote UDP targets. The same configuration (defined as “iceServers” when creating an RTCPeerConnection will work in both cases, producing different outcomes.

Extracting RTP streams from network captures

I needed an efficient way to programmatically extract RTP streams from a network capture. In addition I wanted to: save each stream into a separate pcap file. extract SRTP-negotiated keys if present and available in the trace, associating them to the related RTP (or SRTP if the negotiation succeeded) stream. Some caveats: In normal conditions the negotiation of SRTP sessions happens via a secure transport, typically SIP over TLS, so the exchanged crypto information may not be available from a simple network capture. There are ways to extract RTP streams using Wireshark or tcpdump; it’s not necessary to do it programmatically. All this said I wrote a small tool ( https://github.com/giavac/pcap_tool ) that parses a network capture and tries to interpret each packet as either RTP/SRTP or SIP, and does two main things: save each detected RTP/SRTP stream into a dedicated pcap file, which name contains the related SSRC. print a summary of the crypto information exchanged, if available. With ...

Testing SIP platforms and pjsip

There are various levels of testing, from unit to component, from integration to end-to-end, not to mention performance testing and fuzzing. When developing or maintaining Real Time Communications (RTC or VoIP) systems,  all these levels (with the exclusion maybe of unit testing) are made easier by applications explicitly designed for this, like sipp . sipp has a deep focus on performance testing, or using a simpler term, load testing. Some of its features allow to fine tune properties like call rate, call duration, simulate packet loss, ramp up traffic, etc. In practical terms though once you have the flexibility to generate SIP signalling to negotiate sessions and RTP streams, you can use sipp for functional testing too. sipp can act as an entity generating a call, or receiving a call, which makes it suitable to surround the system under test and simulate its interactions with the real world. What sipp does can be generalised: we want to be able to simulate the real world tha...