International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, Volume 2, Issue 2, February 2012)
A New Polybit Shuffling Encryption and Decryption Algorithm Based on N Dimensional Encryption-Decryption Matrix
Harinandan Tunga1
Department of Computer Sc. & Engineering, RCC Institute of Information Technology Kolkata, West Bengal, India
Abstract - This paper presents a polybit shuffling encryption and decryption algorithm based on an N-dimensional encryption-decryption matrix, an attempt to improve on the Classical Playfair Cipher. The algorithm retains the simplicity of the Classical Playfair Cipher while increasing robustness against crypto-attack many times over. Because it operates on chunks of several bits or bytes rather than on characters, it can encrypt any type of file, including binary files. The encryption-decryption matrix is filled first with the keywords (passwords) and then with the remaining possible chunk values. Encryption is performed on N chunk values at a time as a block, which gives a higher level of encryption and security. Each block is encrypted by applying a chosen linear transformation function to the position co-ordinates of its N chunks; the transformation function can be chosen according to the required speed of encryption and the degree of security needed. Decryption is equally simple: the reverse transformation function is applied to the encrypted file, again using N chunks as a block. Finally, it may be noted that simply increasing the dimension length M and the number of dimensions N counters advances in computational power and thus more advanced crypto-attacks.
Keywords - Brute force attack, Crypto-attack, Playfair cipher, Polybit Shuffling, Small Sub Group Attack, Transformation function.
I. INTRODUCTION
Cryptography plays an important role in data security: it provides a mechanism that hides data from intruders while allowing the authentic recipient to recover it exactly. This paper describes a very simple technique that addresses the drawbacks of the Classical Playfair Cipher while maintaining its simplicity. It proposes a polybit shuffling algorithm based on an N-dimensional encryption-decryption matrix that encrypts any type of file, not only text files, because it uses chunks of several bits or bytes instead of characters. N can be chosen freely depending on the requirement, and a very flat distribution curve is obtained for encrypted text files. The transformation function used to encrypt or decrypt can be chosen depending on the speed of encryption required and the degree of security needed. The algorithm also minimizes the probability of a chunk being mapped to itself after encryption.
Though the implementation of this algorithm is simple, it is practically impossible to decrypt an encrypted file without the keys. Crypto-analysis of this algorithm with probability of success 0.5 needs $C \times (M^N)!$ trials, where C is the number of possible transformation functions used, M is the length of each dimension and N is the number of dimensions.
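To give a sense of scale, the short sketch below evaluates the quoted trial count for the smallest byte-oriented configuration, M = 4 and N = 4. The function name `trial_count` and the default C = 1 are illustrative assumptions; the factorial term is the number of ways of arranging the M^N distinct chunk values in the matrix.

```python
import math


def trial_count(m: int, n: int, c: int = 1) -> int:
    """Evaluate c * (m**n)!, the trial count quoted in the text.

    c = 1 (a single transformation function) is an assumption made
    only to keep the illustration simple.
    """
    return c * math.factorial(m ** n)


trials = trial_count(4, 4)  # 4 x 4 x 4 x 4 matrix, i.e. 256!
print("decimal digits in 256!:", len(str(trials)))
```

Even with a single transformation function the count has several hundred decimal digits, and it grows rapidly as M or N increases.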
II. RELATED WORK
The Playfair cipher or Playfair square is a manual symmetric encryption technique and was the first literal digraph substitution cipher. The scheme was invented in 1854 by Charles Wheatstone, but bears the name of Lord Playfair, who promoted the use of the cipher. In the substitution method, letters in the plain text are replaced by other letters or symbols, while in the transposition method characters in the plain text are swapped to get the cipher text [1][2][4]. Well-known examples of substitution ciphers are the Caesar cipher, Playfair cipher, Hill cipher and Vigenere cipher [4]. Martin E. Hellman, Bailey W. Diffie, and Ralph C. Merkle [5][4] describe how a cryptographic apparatus and method can work. Martin E. Hellman et al. [10] worked on Public Key Cryptography. J. H. Ellis et al. [6][3] made valuable contributions in the field of Non-Secret Encryption. William Stallings et al. [2] have written on cryptography and network security.
III. THE PROPOSED ALGORITHMS
A. Algorithm of the Dimensional Encryption-Decryption Matrix
Steps -
1. Decide on the number of dimensions and the length of each dimension of the N-dimensional matrix. For example, if we choose dimension (N) = 4 and the length of each dimension (M) = 4 then we get a 4 x 4 x 4 x 4 matrix, which can hold all 1-byte values (from 00H to FFH or 0D to 255D) exactly, as 4 x 4 x 4 x 4 = 256.
2. The dimension N can be any positive integer greater than one i.e. N ≥ 2. It can be an odd number or an even one. For higher values of N the security will also increase.
3. The length of each dimension M can be any positive integer that is a power of 2 i.e. $M = 2^i$, $i \ge 2$. For higher values of M the security will also increase.
4. Compute the chunk size i.e. the number of bits or bytes that will be present in a single chunk. The chunk length will be $C = \log_2 S$ bits where $S = M^N$; therefore $C = N \times \log_2 M$.
For example, if we choose N = 4 and M = 4 i.e. a 4 x 4 x 4 x 4 matrix then $S = 4^4 = (2^2)^4 = 2^8$, therefore C = 8 bit = 1 byte. If we choose N = 4 and M = 8 i.e. an 8 x 8 x 8 x 8 matrix then $S = 8^4 = (2^3)^4 = 2^{12}$, therefore C = 12 bit. If we choose N = 5 and M = 4 i.e. a 4 x 4 x 4 x 4 x 4 matrix then $S = 4^5 = (2^2)^5 = 2^{10}$, therefore C = 10 bit. Prepare a list of all the possible chunk values. If C = 8 bit then the list contains all the values in the range [ 0 to $2^8 - 1$ ] i.e. [ 0 to 255 ]. If C = 12 bit then the list contains all the values in the range [ 0 to $2^{12} - 1$ ] i.e. [ 0 to 4095 ]. If C = 10 bit then the list contains all the values in the range [ 0 to $2^{10} - 1$ ] i.e. [ 0 to 1023 ]. If M is not of the form $2^i$ then the value C will not be an integer but a floating point number. Therefore M must be a power of 2 (i.e. $M = 2^i$, $i \ge 2$).
5. Because the algorithm works at the bit level, it is capable of encrypting any type of binary file. Decide on the number of keywords to be used. The keywords must consist of chunk values from the list, and no chunk value may appear more than once across all the keywords. The recommended number of keywords is equal to the dimension (N) of the matrix. Thus for a 4 x 4 x 4 x 4 matrix 4 keywords are recommended. The keywords can be of arbitrary length, but the total length of all the keywords must be less than or equal to the number of possible chunk values, otherwise the chunk values cannot be distinct among the keywords. For example, the total length of the 4 keywords of a 4 x 4 x 4 x 4 matrix will always be less than or equal to 256. Longer keywords strengthen the security against crypto-attack.
6. Decide a seed which will state in which fashion the encryption matrix will be filled up i.e. row-wise, column-wise, spirally, diagonally etc.
7. Create an N-dimensional matrix with the decided dimension and fill it up first with the keywords along the selected dimension and then with rest of the chunk values of the list in order in predetermined fashion as decided in the previous step.
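The chunk-size rule in steps 3 and 4 can be checked with a few lines of Python. This is a minimal sketch; the function name `chunk_bits` is hypothetical, and the validation simply mirrors the constraints stated above (N ≥ 2 and M a power of two with i ≥ 2).

```python
def chunk_bits(m: int, n: int) -> int:
    """Return the chunk size C = N * log2(M) in bits.

    Raises ValueError when N < 2 or when M is not 2**i with i >= 2,
    the cases the steps above rule out.
    """
    if n < 2:
        raise ValueError("N must be at least 2")
    if m < 4 or m & (m - 1) != 0:
        raise ValueError("M must be a power of two, 2**i with i >= 2")
    # For a power of two, bit_length() - 1 equals log2(M) exactly.
    return n * (m.bit_length() - 1)


for m, n in [(4, 4), (8, 4), (4, 5)]:
    print(f"M={m}, N={n}: C={chunk_bits(m, n)} bits, {m ** n} chunk values")
```

The three cases printed are the ones worked out in step 4: 8, 12 and 10 bits respectively.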
B. Algorithm of Encryption of the Data
Steps -
1. After the preparation of the encryption-decryption matrix the encryption process starts.
2. Each chunk in the N-dimension matrix can now be identified by their position (p0,p1, …, pn-1) using
3. N chunks are taken from the file to be encrypted as a group at a time, starting from the beginning of the input stream of the input file. In the last group, it may be required to append up to N-1 extra predefined chunk values to make it also a group of N chunks.
4. Positions of the N chunks in a group are marked in terms of their co-ordinate positions in the N-dimensional space using the encryption-decryption matrix.
5. The position of every chunk is transformed in all N dimensions by an amount that is some linear function of the position co-ordinates of all the chunks in the group. The transformation is such that the new position also points to a chunk value within the encryption-decryption matrix.
6. Using more than one transformation function strengthens the security, as it increases the confusion. If more than one transformation function is used, an extra keyword is needed to identify which transformation function is to be used.
7. Get the values of the N chunks from the encryption-decryption matrix as per their new transformed co-ordinates and write these new values in place of the corresponding N chunks in the output stream for the output ciphered file.
8. Repeat the process till the end of the input file is reached. Then put the extra information like how many chunks are padded etc. in the output stream to help the decryption process at the receiver end.
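The grouping and padding described in steps 3 and 8 might look like the sketch below. The helper name `split_into_blocks` and the choice of 0 as the predefined padding chunk are assumptions made for illustration; the text does not fix either.

```python
from typing import List, Tuple


def split_into_blocks(chunks: List[int], n: int,
                      pad_value: int = 0) -> Tuple[List[List[int]], int]:
    """Group chunks into blocks of n, padding the last block if needed.

    Returns the blocks and the number of padding chunks appended, which
    the sender would record in the output stream so that the receiver
    can strip them after decryption. pad_value = 0 is an assumed choice
    of "predefined chunk value".
    """
    pad_count = (-len(chunks)) % n  # between 0 and n - 1
    padded = chunks + [pad_value] * pad_count
    blocks = [padded[i:i + n] for i in range(0, len(padded), n)]
    return blocks, pad_count


blocks, pad_count = split_into_blocks(list(b"Hello"), 4)
print(blocks, pad_count)
```

With N = 4, a five-chunk input yields two blocks and three padding chunks, the maximum of N-1 mentioned in step 3.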
C. Algorithm of Decryption of the Data
Steps -
1. After the preparation of the encryption-decryption matrix the decryption process takes place. (The unique matrix is formed from the keywords made available to the receiving end before the actual transmission starts.)
2. During decryption of the encrypted file we again take N chunks at a time as the input stream from the encrypted file.
3. Positions of the N chunks in a group are marked in terms of their co-ordinate positions again in the N– dimensional space used for decryption.
4. Each chunk position is again given a reverse transformation i.e. opposite of the transformation suffered during encryption in order to get back the original N-tuple coordinates.
5. If more than one transformation function is used, the proper reverse transformation function is selected by the extra keyword mentioned above, which identifies a pair of transformation functions: the forward function and its corresponding reverse function.
6. Get the values of N chunks from the encryption-decryption matrix as per their new transformed co-ordinates and replace the new values of corresponding N chunks in the output stream of the output decrypted file.
7. Repeat the process till the end of the file for decryption is reached.
8. Remove the extra appended chunks, if any, in order to get back the exact original file. (In many cases, such as text, picture or audio files, this step may be skipped as the appended chunks have little effect on the significance of the file.)
IV. ILLUSTRATION OF THE ALGORITHM
A. Illustration of the Algorithm of Encryption-decryption Matrix
Let us consider an example using the 4-Dimension algorithm with N = 4, M = 4, i.e. the encryption-decryption matrix has dimensions 4 x 4 x 4 x 4. The encryption matrix holds chunks of 8 bits (1 byte) and can therefore contain the values 00H to FFH. This enables us to encrypt any type of binary file rather than being confined to text files. Each byte position in the encryption matrix has co-ordinates (Wi,Xi,Yi,Zi). I use 4 keywords: “Ma5klG”, “n9QTs4v3”, “IpibUrRH” and “Y,!71xyZ”. (For better understanding the keywords are shown in ASCII characters.)
Now the 1st keyword in hexadecimal representation is: Ma5klG = 4D 61 35 6B 6C 47
The 2nd keyword in hexadecimal representation is: n9QTs4v3 = 6E 39 51 54 73 34 76 33
The 3rd keyword in hexadecimal representation is: IpibUrRH = 49 70 69 62 55 72 52 48
The 4th keyword in hexadecimal representation is: Y,!71xyZ = 59 2C 21 37 31 78 79 5A
Table 1: The 4-D encryption-decryption matrix using keywords “Ma5klG”, “n9QTs4v3”, “IpibUrRH” and “Y,!71xyZ”
B. Illustration of the Algorithm of Encryption of the Data
Now suppose we want to encrypt the sentence: “This algorithm is better”. The length of the data is 24 chunks, which is divisible by the block size N = 4, so no padding with extra chunks is needed. This text in hexadecimal representation is: 54 68 69 73 20 61 6C 67 6F 72 69 74 68 6D 20 69 73 20 62 65 74 74 65 72. Since this is the 4-Dimension algorithm, we take the first block of 4 characters V1(W1,X1,Y1,Z1), V2(W2,X2,Y2,Z2), V3(W3,X3,Y3,Z3) and V4(W4,X4,Y4,Z4), which are 54 68 69 73.
The co-ordinates of these bytes in the matrix are:
54 = (1, 0, 0, 3)
68 = (1, 2, 0, 2)
69 = (2, 0, 0, 2)
73 = (1, 0, 1, 0)
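The following sketch rebuilds a matrix that is consistent with these co-ordinates. The fill order is an assumption inferred from the co-ordinates and cipher values quoted in this illustration, not something the text states: keyword k is written starting at position (k, 0, 0, 0) with the last co-ordinate varying fastest, and the byte values not used by any keyword fill the remaining cells in ascending order. The names `build_matrix` and `coords` are hypothetical.

```python
M, N = 4, 4  # length of each dimension, number of dimensions
KEYWORDS = [b"Ma5klG", b"n9QTs4v3", b"IpibUrRH", b"Y,!71xyZ"]


def build_matrix(keywords, m=M, n=N):
    """Return the matrix as a flat list of m**n chunk values.

    Assumed fill order (inferred from the worked example): keyword k
    starts at co-ordinate (k, 0, ..., 0); unused values fill the gaps
    in ascending order.
    """
    size = m ** n
    slab = m ** (n - 1)  # cells sharing one value of the first co-ordinate
    used = [v for kw in keywords for v in kw]
    used_set = set(used)
    if len(keywords) > m or any(len(kw) > slab for kw in keywords):
        raise ValueError("keywords do not fit the assumed layout")
    if len(used_set) != len(used):
        raise ValueError("chunk values must be unique across keywords")
    flat = [None] * size
    for k, kw in enumerate(keywords):
        flat[k * slab:k * slab + len(kw)] = list(kw)
    rest = (v for v in range(size) if v not in used_set)
    return [next(rest) if v is None else v for v in flat]


def coords(index, m=M, n=N):
    """Convert a flat index to (W, X, Y, Z), first co-ordinate most significant."""
    digits = []
    for _ in range(n):
        index, d = divmod(index, m)
        digits.append(d)
    return tuple(reversed(digits))


matrix = build_matrix(KEYWORDS)
position = {value: coords(i) for i, value in enumerate(matrix)}
for byte in bytes.fromhex("54686973"):
    print(f"{byte:02X} = {position[byte]}")
```

Under this assumed layout the lookup returns the four co-ordinates listed above for 54, 68, 69 and 73.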
Now we use a transformation function, in which every co-ordinate position of a chunk is replaced by the diagonally next co-ordinate position of the immediate next chunk in the block of 4 chunks in a circular manner. Suppose the co-ordinates of the bytes are:
V1 = (W1,X1,Y1,Z1)
V2 = (W2,X2,Y2,Z2)
V3 = (W3,X3,Y3,Z3)
V4 = (W4,X4,Y4,Z4)
Thus W1 is replaced by X2, X1 is replaced by Y2, Y1 is replaced by Z2 and Z1 remains unchanged, so V’1(W’1,X’1,Y’1,Z’1) becomes (X2,Y2,Z2,Z1). After encryption V1, V2, V3 and V4 become V’1, V’2, V’3, V’4, which have co-ordinates as follows:
V’1 = (W’1,X’1,Y’1,Z’1) = (X2,Y2,Z2,Z1)
V’2 = (W’2,X’2,Y’2,Z’2) = (X3,Y3,Z3,Y1)
V’3 = (W’3,X’3,Y’3,Z’3) = (X4,Y4,Z4,X1)
V’4 = (W’4,X’4,Y’4,Z’4) = (W4, W3, W2, W1)
For our example
V1 = (1, 0, 0, 3)
V2 = (1, 2, 0, 2)
V3 = (2, 0, 0, 2)
V4 = (1, 0, 1, 0)
After encryption these become [ See Picture 1 ]
V’1 = (2, 0, 2, 3) = 93
V’2 = (0, 0, 2, 0) = 02
V’3 = (0, 1, 0, 0) = 0A
V’4 = (1, 2, 1, 1) = 6F
Thus 54 68 69 73 after encryption becomes 93 02 0A 6F.
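The transformation used in this illustration acts purely on co-ordinates, so it can be sketched without the matrix itself. The function name `encrypt_coords` is hypothetical; the mapping is copied from the four V’ expressions given above.

```python
def encrypt_coords(block):
    """Apply the illustrated forward transformation to one block.

    `block` is a list of four (W, X, Y, Z) tuples V1..V4; the result is
    the list V'1..V'4 of transformed co-ordinates.
    """
    (w1, x1, y1, z1), (w2, x2, y2, z2), (w3, x3, y3, z3), (w4, x4, y4, z4) = block
    return [
        (x2, y2, z2, z1),  # V'1
        (x3, y3, z3, y1),  # V'2
        (x4, y4, z4, x1),  # V'3
        (w4, w3, w2, w1),  # V'4
    ]


plain = [(1, 0, 0, 3), (1, 2, 0, 2), (2, 0, 0, 2), (1, 0, 1, 0)]
for v in encrypt_coords(plain):
    print(v)
```

Each output tuple is a permutation of input co-ordinates, so every transformed position stays inside the 4 x 4 x 4 x 4 matrix, as the encryption steps require.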
Picture 1 Transformation of the co-ordinates of the plaintext data for encryption.
Similarly, the entire data after encryption becomes 93 02 0A 6F 00 0B 52 6E 0F 03 A6 77 9A A0 04 72 A0 07 8C 67 A7 8D 10 9D
C. Illustration of the Algorithm of Decryption of the Encrypted Data
The decryption logic for this encryption is also simple. Here every co-ordinate position of a chunk is replaced by the diagonally previous co-ordinate position of the immediate previous chunk in the block of 4 chunks in a circular manner.
The co-ordinates of the 4 bytes of the first block of ciphered data in the matrix are:
93 = (2, 0, 2, 3)
02 = (0, 0, 2, 0)
0A = (0, 1, 0, 0)
6F = (1, 2, 1, 1)
Suppose the co-ordinate positions of the encrypted data are:
V’1 = (W’1,X’1,Y’1,Z’1)
V’2 = (W’2,X’2,Y’2,Z’2)
V’3 = (W’3,X’3,Y’3,Z’3)
V’4 = (W’4,X’4,Y’4,Z’4)
Thus after the reverse transformation W’1 is replaced by Z’4, X’1 is replaced by Z’3, Y’1 is replaced by Z’2 and Z’1 remains unchanged. Thus V”1(W”1,X”1,Y”1,Z”1) becomes (Z’4,Z’3,Z’2,Z’1).
Thus V’1, V’2, V’3 and V’4 after decryption become V”1, V”2, V”3, V”4 which have coordinates as follows:
V”1 = (W”1,X”1,Y”1,Z”1) = (Z’4,Z’3,Z’2,Z’1) = (W1,X1,Y1,Z1)
V”2 = (W”2,X”2,Y”2,Z”2) = (Y’4,W’1,X’1,Y’1) = (W2,X2,Y2,Z2)
V”3 = (W”3,X”3,Y”3,Z”3) = (X’4,W’2,X’2,Y’2) = (W3,X3,Y3,Z3)
V”4 = (W”4,X”4,Y”4,Z”4) = (W’4,W’3,X’3,Y’3) = (W4,X4,Y4,Z4)
It can be easily seen from the decryption that W”1 = W1, X”1 = X1, Y”1 = Y1, Z”1 = Z1. Thus after decryption we get back the original data.
In the case of my example this becomes [ See Picture 2 ]
V”1 = (1, 0, 0, 3) = 54
V”2 = (1, 2, 0, 2) = 68
V”3 = (2, 0, 0, 2) = 69
V”4 = (1, 0, 1, 0) = 73
Thus we get back from the encrypted data 93 02 0A 6F the original data 54 68 69 73. [ See Table 1 for the co-ordinates of ciphered data shown in OUTLINE style and decrypted data shown in UNDERLINE style ].
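The reverse transformation can be sketched in the same co-ordinate-only style. The function name `decrypt_coords` is hypothetical; the mapping follows the four V” expressions given above, with the primed co-ordinates as input.

```python
def decrypt_coords(block):
    """Apply the illustrated reverse transformation to one ciphered block.

    `block` is a list of four (W', X', Y', Z') tuples V'1..V'4; the
    result is the list V"1..V"4 of recovered co-ordinates.
    """
    (w1, x1, y1, z1), (w2, x2, y2, z2), (w3, x3, y3, z3), (w4, x4, y4, z4) = block
    return [
        (z4, z3, z2, z1),  # V"1
        (y4, w1, x1, y1),  # V"2
        (x4, w2, x2, y2),  # V"3
        (w4, w3, x3, y3),  # V"4
    ]


cipher = [(2, 0, 2, 3), (0, 0, 2, 0), (0, 1, 0, 0), (1, 2, 1, 1)]
for v in decrypt_coords(cipher):
    print(v)
```

Applied to the ciphered co-ordinates of the first block, this returns the original co-ordinates of 54, 68, 69 and 73.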
Picture 2 Transformation of the co-ordinates of the ciphered data for decryption
V. EXPERIMENTAL RESULTS
This section gives some experimental results of “Polybit Shuffling Encryption based on 4-Dimensional matrix” on different types of files. The results were generated by 4-Dimension Polybit Shuffling Encryption with a 4 x 4 x 4 x 4 matrix and with four keywords “44 53 62 71 80 91 A2 B3 C4 D5 E6 F7 E8 D9 CA BB”, “AC 9D 8E 7F 6E 5D 4C 3B 2A 19 08 17 26 35 46 57”, “68 79 8A 9B AA B9 C8 D7 C6 B5 A4 93 82 73 64 55”, “66 77 88 99 A8 B7 A6 95 84 75 86 97 DB EC FD EE”, with the matrix generation seed “46357” and transformation function choosing seed “26462”.
Picture 3. Frequency distribution of original text Picture 4. Frequency distribution of encrypted text
Picture 5. Original BMP image Picture 6. Encrypted BMP image
Picture 7. 0.5 second duration of original WAVE file Picture 8. 0.5 second duration of encrypted WAVE file
The analysis of my Polybit shuffling algorithm based on an N-Dimensional matrix shows its advantages over earlier Playfair algorithms. I discuss the 4-Dimensional algorithm in particular. The main features of this algorithm are:
I. In the case of the 4-Dimension algorithm, just by taking a 4 x 4 x 4 x 4 matrix we can cover all 256 possible byte patterns, thereby ensuring encryption of any kind of file and not only text files. By extending the dimension we can cover more patterns consisting of a larger number of bits to fit varied requirements.
II. If the keywords are randomly generated then the matrix generated will contain values in a random fashion. The randomness is increased by enabling another key (the seed) which states in what fashion the matrix will be filled i.e. row wise, column wise, spiral wise, diagonally etc. The encryption matrix thus depends not only on the keywords but also on the seed used for filling. This makes cracking the encrypted file more difficult, because greater randomness in the encryption-decryption matrix reduces the probability of success of a Small Sub-group Attack to nearly zero. Moreover, the option of using a variable-dimension matrix makes the algorithm stronger against crypto-attack.
III. This algorithm overcomes the problem of character stuffing in Classical Playfair and Double Playfair Algorithm.
For example, in my 4-Dimension algorithm, even if we encrypt TTTT i.e. 54 54 54 54 in hex, we get 79 F3 21 A5 after encryption. Thus frequency analysis of the encrypted message becomes nearly impossible.
V. The size of the input file is not a problem. We may only need to stuff up to 3 predefined bytes into the last block of data to make it a block of 4. These added bytes can easily be removed from the file after decryption. The algorithm is very fast, and it can be made to work faster by choosing a simpler transformation function. Thus a simple transformation function may be used for video file encryption, where the time of encryption is a major concern. A more complex transformation function can be used for text files, where security is a greater concern than speed of encryption.
VI. The algorithm minimizes the probability of a chunk being mapped to itself after encryption. For example, suppose the 4-Dimension algorithm is encrypting a text file and the byte “65” is being encrypted. The probability of “65” being mapped to the same value “65” is $1/M^N$, where M is the length of each dimension and N is the number of dimensions of the matrix. For a 4 x 4 x 4 x 4 matrix this is 1/256, which is much better than the Classical Playfair Algorithm, for which it is 1/25. For an 8 x 8 x 8 x 8 x 8 matrix in the 5-Dimension algorithm the probability decreases to 1/32768. Thus the output encrypted file becomes sparser.
VI. CONCLUSION
The Polybit shuffling encryption algorithm solves the problems faced by the Classical Playfair algorithm and the Double Playfair algorithm but retains their simplicity and elegance. Moreover, since it uses chunks instead of characters, it is capable of encrypting any type of file, such as image, text and sound files. The frequency distribution of the encrypted file is much smoother than with previous algorithms, which makes it strong against crypto-attack using frequency analysis. For large values of N this algorithm is practically impossible to crack for the coming few years. My extended work on using the Diffie-Hellman key exchange technique for matrix formation will allow the algorithm to work on any channel with guaranteed security.
References
[1] Thomas H. Corman, Charles E. Lieserson and Ronald L. Rivest, Introduction to Playfair Algorithm, Prentice-Hall of India, 2nd edition, 2000.
[2] William Stallings, Cryptography and network security, Prentice Hall, 2nd edition 2000.
[3] Non-Secret Encryption Using a Finite Field MJ Williamson, January 21, 1974 [11] Thoughts on Cheaper Non-Secret Encryption MJ Williamson, August 10, 1976.
[4] New Directions in Cryptography W. Diffie and M. E. Hellman, IEEE Transactions on Information Theory, vol. IT-22, Nov. 1976, pp: 644-654.
[5] Cryptographic apparatus and method Martin E. Hellman, Bailey W. Diffie, and Ralph C. Merkle, U.S. Patent #4,200,770, 29 April 1980.
[6] The History of Non-Secret Encryption JH Ellis 1987.
[7] The First Ten Years of Public-Key Cryptography Whitfield Diffie, Proceedings of the IEEE, vol. 76, no. 5, May 1988, pp: 560-577.
[8] Menezes, Alfred; van Oorschot, Paul; Vanstone, Scott (1997). Handbook of Applied Cryptography Boca Raton, Florida: CRC Press. ISBN 0-8493-8523-7.
[9] Singh, Simon (1999) The Code Book: the evolution of secrecy from Mary Queen of Scots to quantum cryptography New York: Doubleday ISBN 0-385-49531-5.
[10] An Overview of Public Key Cryptography Martin E. Hellman, IEEE Communications Magazine, May 2002, pp:42-49.
[11] Menezes, Alfred; van Oorschot, Paul; Vanstone, Scott (1997). Handbook of Applied Cryptography Boca Raton, Florida: CRC Press. ISBN 0-8493-8523-7.
[12] New Directions in Cryptography W. Diffie and M. E. Hellman, IEEE Transactions on Information Theory, vol. IT-22, Nov. 1976, pp: 644-654, http://citeseer.ist.psu.edu/340126.html.