Integrating and Securing Video, Audio and Text
Using Quaternion Fourier Transform
M. I. Khalil
Princess Norah Bent Abdurrahman University, Faculty of Computer and Information Sciences, Information Technology (networks) Department, Riyadh, Kingdom of Saudi Arabia
Abstract: The rapid growth of communication technology has driven a rapidly rising demand for internet connectivity, which in turn has led to an upsurge in information security research. Cryptography plays a significant role in securing and verifying information exchanged over public communication channels. This paper introduces a novel approach for combining video, audio and text signals into a single architecture and securing it prior to transmission. The approach embeds the color components of each pixel of the video signal in a quaternion number, whose fourth component carries either an audio sample or textual data. The array of quaternion numbers corresponding to a video frame is converted to the frequency domain using the quaternion Fourier transform and then multiplied by the quaternion Fourier transform of a digital image. Hereby, the selected digital image is used as a complicated secret. The resulting signal is transmitted and, when received, the video, audio and text signals are extracted by applying simple quaternion mathematics to the received signal and a copy of the digital image. A second level of complexity can be added by applying one of the well-known cryptographic techniques (symmetric or asymmetric) to the samples of the transmitted signal. The suggested approach is implemented in Matlab, and the extracted signals are compared with the original ones using performance metrics. The obtained results show that the proposed approach is robust and more secure against cryptanalysis attacks without affecting the used bandwidth of the communication channel.
Keywords: Quaternions, Cryptography, Multimedia Synchronization, Encryption, Decryption.
1. Introduction
Because networks have limited bandwidth and are shared by millions of users, real-time and bursty multimedia traffic faces unpredictable delay and availability. This created an imperative need for real-time protocols for multimedia networking. The Real-time Transport Protocol (RTP) and the Resource Reservation Protocol (RSVP) were developed by the Internet Engineering Task Force (IETF) and form the foundation of real-time services. Such internet services provide the means to transmit real-time multimedia data (video, audio and textual data) across networks. RTP is usually implemented within the application. For real-time systems such as multimedia networking to take off, sufficient security facilities must be provided [1]. Cryptography is an indispensable tool for ensuring the secrecy and/or authenticity of information exchanged in the presence of adversaries [2-5]. It is a framework that lies at the intersection of mathematics, computer science, and electrical engineering. It is based on various mathematical algorithms and techniques that cooperate to block adversaries and to achieve information and computer network security. Cryptography is used extensively in many fields, such as electronic commerce, banking applications such as Automated Teller Machines (ATMs), and computer passwords. The techniques used in cryptographic systems are typically classified into two generic categories, symmetric-key and public-key:
1- Symmetric (Secret-key) cryptosystems [6] where only one key is used for both encryption and decryption operations, and it is of two types: block ciphers and stream ciphers.
2- Asymmetric (Public-key) cryptosystems [8] where pairs of different but mathematically related keys are used: a private key and a public key. The public key is disseminated widely and used for encryption, while the private key is used for decryption.
The RSA encryption algorithm, which is used in digital signatures and secure key exchange, is an example of an asymmetric encryption technique, while the Data Encryption Standard (DES), RC6 and Blowfish are among the most widely used symmetric encryption techniques [7].
Most cryptographic encryption techniques are devoted to textual data, while few techniques exist for multimedia data, especially audio. Most multimedia encryption techniques add a specific noise signal to the data before transmission; this noise is removed at the receiver to obtain the original signal. An example of audio encryption is the scheme presented by Raghunandhan K. R. et al., in which two layers of security are used. In the first stage, the audio signal is processed with a transposition cipher. In the second stage, modular multiplication is used as a substitution cipher, for which the key is generated using Pseudo Random Number Generation (PRNG) [9,10]. The frequency-domain representation of multimedia data (audio, video) is widely used in many encryption and decryption methods. In this case, the DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform) or DWT (Discrete Wavelet Transform) can be used to transform the time-domain signal to the frequency domain before one of the encryption algorithms is applied [11, 13].
This method differs from the traditional one in two significant aspects. First, the secret key does not have a single value; instead, it consists of a huge set of keys. Second, the mathematical manipulation used in the encryption and decryption processes is different. Each video frame, along with the concurrent segments of the audio and textual signals, is converted to the frequency domain using the quaternion Fourier transform and then multiplied by the quaternion Fourier transform of a selected digital image. Hereby, the selected digital image is used as a complicated secret. The video, audio and textual signals are then extracted at the receiver side using simple quaternion mathematics. The performance of the introduced approach is estimated using performance metrics. The rest of this paper is organized as follows: Section 2 reviews some of the significant research in the field of multimedia combination, transmission and encryption. Section 3 illustrates the basics of quaternion mathematics. The proposed method is introduced in Section 4, while the implementation and experimental results are presented in Section 5. Conclusions are drawn in Section 6.
2. Literature Review
Multimedia involves multiple modalities: text, audio, images, drawings, animation, and video. Multimedia and networking technologies have significantly affected several daily activities such as e-learning, video teleconferencing, distributed lectures, tele-medicine and web browsing. RealAudio, and similar products that followed for both audio and video, allow streaming over the internet. Streaming means that the audio or video file is played in real time on the user's machine. Products are available that support streaming of various audio and video formats, including MPEG, AVI and QuickTime, and some tools can stream from a standard web server using the HTTP protocol. RTP (Real-time Transport Protocol) was developed by the Internet Engineering Task Force as an alternative. RTP typically runs over UDP to transport streaming data across networks and synchronize multiple streams. Multimedia applications with audio and video need a synchronization scheme if the streams are transmitted and processed independently [14]. Kuo et al. [15] present a scheme that guarantees synchronized playback of audio and video streams according to the Real-time Transport Protocol (RTP). In this scheme, time-stamps on the audio and video streams are used to determine the actual playback time and thereby achieve synchronization. Robin [16] presents an outline of the audio synchronization requirement. The major purpose is to find a phase relationship between the audio and video signals based on the number of audio samples per video frame, while keeping that number an integer so that each audio packet stays synchronized with its corresponding video frame. Robin [16] also provides a time-stamp approach for synchronizing audio and video signals. Lienhart et al. [17] present a synchronization scheme for audio and video streams to be used in wireless applications. Time-stamping information is obtained at A/D conversion time and embedded in the transmitted stream.
This information is then used at the destination to convert the sampling rates of the audio and video streams and therefore synchronize the streams accordingly. The synchronization in [17] is for a wireless communication system and is not
digital image is used as a complicated key and as a cover for the audio signal. Each sample of the audio signal is combined with the values of the three color components of a pixel fetched from the cover image, yielding a quaternion number. The absolute value of this quaternion number is then transmitted and, when received, the original value of the audio sample can be extracted using simple quaternion mathematics. P. M. Rubesh Anand et al. used the quaternion Julia set to generate real-time symmetric keys for cryptography. The number of iterations, the complex number and the control value are the parameters that determine the dynamically varying structure of the quaternion Julia image. These parameters are initialized in the proposed model of symmetric key generation when communication between hosts is established. The model generates a variable-length, dynamic, one-time key from the quaternion Julia image to encrypt or decrypt data without any exchange of the key. The time stamp used during the initialization process makes the quaternion Julia image different in real time. The instantaneous key is generated at the hosts independently and synchronously, which increases the complexity of cryptanalysis [29]. The above-mentioned works address three significant multimedia issues: tools for combining multimedia, challenges of multimedia transmission, and techniques and applications of multimedia content encryption. The proposed approach deals with all three aspects: multimedia combining, synchronization and encryption.
3. Quaternions Mathematics
Quaternions, or hypercomplex numbers, were discovered by Hamilton in 1843 [13, 21, 24], but for a long time afterwards they were not widely embraced. A quaternion has four components, one real and three imaginary. Quaternions are suited to describing both three- and four-dimensional geometry. They are a generalization of complex numbers and combine by the normal rules of algebra, with the exception that multiplication is not commutative. The usual notation, extended from that of the complex numbers, is:
$$q = w + x\,i + y\,j + z\,k \tag{1}$$
Where w, x, y and z are real, and i, j and k are imaginary units or complex operators that obey the following rules:
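$$i^2 = j^2 = k^2 = -1 \tag{2}$$
$$ij = k, \quad jk = i, \quad ki = j \tag{3}$$
$$ji = -k, \quad kj = -i, \quad ik = -j \tag{4}$$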
The signs in the products of different operators are shown in Eq. 2, 3 and 4. Multiplying any pair of these operators in the clockwise (cyclic) sequence i, j, k, i produces a positive product, while multiplying any pair in the anti-clockwise sequence produces a negative product. The quaternion conjugate is defined as:
$$\bar{q} = w - x\,i - y\,j - z\,k \tag{5}$$
In addition, the modulus of a quaternion is given by:
$$|q| = \sqrt{w^2 + x^2 + y^2 + z^2} \tag{6}$$
Define the real and imaginary parts of q as:
$$\Re(q) = w, \qquad \Im(q) = x\,i + y\,j + z\,k \tag{7}$$
The inverse of a non-zero quaternion is:
$$q^{-1} = \frac{\bar{q}}{|q|^2} \tag{8}$$
In quaternion mathematics, a pure quaternion is a quaternion with zero real part, while a quaternion with unit modulus is called a unit quaternion. A quaternion can be regarded as consisting of a vector part and a scalar part. The vector part of a quaternion (see Eq.1) has three components associated with the i, j, and k operators:
$$V(q) = x\,i + y\,j + z\,k \tag{9}$$
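The rules in Eq. 2 to Eq. 8 can be checked numerically. The following is a minimal Python sketch, not part of the original Matlab implementation; the tuple layout `(w, x, y, z)` and the function names `qmul`, `qconj`, `qnorm` and `qinv` are illustrative assumptions.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,  # real part
        pw * qx + px * qw + py * qz - pz * qy,  # i part (jk = i, kj = -i)
        pw * qy - px * qz + py * qw + pz * qx,  # j part (ki = j, ik = -j)
        pw * qz + px * qy - py * qx + pz * qw,  # k part (ij = k, ji = -k)
    )

def qconj(q):
    """Conjugate: negate the three imaginary components (Eq. 5)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def qnorm(q):
    """Modulus of a quaternion (Eq. 6)."""
    return math.sqrt(sum(c * c for c in q))

def qinv(q):
    """Inverse of a non-zero quaternion: conjugate over squared modulus (Eq. 8)."""
    n2 = sum(c * c for c in q)
    if n2 == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(c / n2 for c in qconj(q))

# The three imaginary units as quaternions.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

ij = qmul(I, J)  # equals K
ji = qmul(J, I)  # equals -K, showing non-commutativity
q = (1.0, 2.0, 3.0, 4.0)
identity = qmul(q, qinv(q))  # approximately (1, 0, 0, 0)
```

Storing a quaternion as a plain tuple keeps the sketch dependency-free; an array-based layout would be the natural choice for whole frames.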
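The traditional quaternion Fourier transform (QFT) [21,22] is only defined for real- or quaternion-valued signals over the domain $\mathbb{R}^2$, while the quaternion discrete Fourier transform (QDFT) for an $M \times N$ signal $f(m,n)$ can be defined, based on the quaternion exponential and the non-commutative property of quaternion multiplication, as three different types:
a) The two-sided QDFT:
$$F(u,v) = \frac{1}{\sqrt{MN}} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} e^{-\mu 2\pi \frac{mu}{M}}\, f(m,n)\, e^{-\mu 2\pi \frac{nv}{N}} \tag{10}$$
b) The left-sided QDFT:
$$F(u,v) = \frac{1}{\sqrt{MN}} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} e^{-\mu 2\pi \left(\frac{mu}{M} + \frac{nv}{N}\right)}\, f(m,n) \tag{11}$$
c) The right-sided QDFT:
$$F(u,v) = \frac{1}{\sqrt{MN}} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n)\, e^{-\mu 2\pi \left(\frac{mu}{M} + \frac{nv}{N}\right)} \tag{12}$$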
μ is any unit pure quaternion.
(13)
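As an illustration of the left-sided transform, the sketch below evaluates the double sum directly in Python, using the identity $e^{\mu\theta} = \cos\theta + \mu\sin\theta$ for a unit pure quaternion μ. It is a naive implementation meant only for tiny arrays, not the fast QFFT used in the paper. The axis `MU = (i + j + k)/√3`, the $1/\sqrt{MN}$ scaling and the function names are assumptions made for this example.

```python
import math

# Assumed transform axis: mu = (i + j + k) / sqrt(3), a unit pure quaternion.
MU = (0.0, 1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def qexp(mu, theta):
    """exp(mu * theta) = cos(theta) + mu * sin(theta) for a unit pure mu."""
    s = math.sin(theta)
    return (math.cos(theta), mu[1] * s, mu[2] * s, mu[3] * s)

def qdft_left(f, mu=MU, inverse=False):
    """Naive left-sided QDFT (or its inverse) of a 2-D list of quaternions."""
    M, N = len(f), len(f[0])
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / math.sqrt(M * N)
    out = []
    for u in range(M):
        row = []
        for v in range(N):
            acc = [0.0, 0.0, 0.0, 0.0]
            for m in range(M):
                for n in range(N):
                    theta = sign * 2 * math.pi * (m * u / M + n * v / N)
                    # The kernel multiplies the sample from the left.
                    term = qmul(qexp(mu, theta), f[m][n])
                    acc = [a + t for a, t in zip(acc, term)]
            row.append(tuple(scale * a for a in acc))
        out.append(row)
    return out

# A 2 x 2 test signal: one quaternion per sample.
signal = [[(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)],
          [(0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]]
spectrum = qdft_left(signal)
restored = qdft_left(spectrum, inverse=True)
```

Because all kernels share the same axis μ they commute with one another, which is why applying the same routine with the opposite sign recovers the input.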
Figure 1. The block diagram of encryption algorithm
4. The Proposed System
4.1 Encryption Process
Like any cryptographic system, the proposed method consists of two complementary parts: the encryption process at the transmitter side and the decryption process at the receiver side. The proposed method involves preliminary operations in which video frames are acquired sequentially, either from a video device (camera) or from a multimedia file. At the same time, the audio samples are acquired from an audio device (microphone) while a text is retrieved. The block diagram of the encryption process of the proposed approach is shown in Fig.1. It is based on including the simultaneously acquired data (video, audio and textual data) in one quaternion number. Each pixel of the acquired frame is decomposed into its r, g and b (red, green and blue) components, whose values are used to create a quaternion number with zero real part:
$$q = r\,i + g\,j + b\,k \tag{14}$$
For a video frame of size $W \times H$ and a frame rate $F_r$ (expressed in frames per second, FPS), the number of quaternion numbers yielded for each frame is $N_q = W \times H$. At the same time, for an audio signal with sampling rate $F_s$, the number of audio samples generated within the duration of one frame is $N_a = F_s / F_r$. To avoid losing audio samples, it is required that $N_a \le N_q$, and if a text string of length $L_t$ is to be included in this message, then:
$$N_a + L_t \le N_q \tag{15}$$
Under this condition, the acquired audio samples and the accompanying text can share the real component w of the generated quaternion numbers. Accordingly, for each frame, a determined number of the generated quaternions carry the audio samples, while another group of those quaternions is dedicated to the text string:
$$q = w + r\,i + g\,j + b\,k, \qquad w \in \{\text{audio sample},\ \text{text character}\} \tag{16}$$
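The packing step can be sketched in Python as follows. This is only an illustration of the idea, not the author's Matlab code: the row-major placement (audio samples first, then character codes), the zero padding, and the names `pack_frame` and `unpack_frame` are assumptions, and the counts are passed explicitly instead of being stored in the frame. The example numbers (640 x 480 pixels, 25 FPS, 44100 Hz audio) are likewise hypothetical.

```python
def pack_frame(pixels, audio, text):
    """Build an H x W grid of quaternions (w, r, g, b).

    pixels: H x W nested list of (r, g, b) values
    audio:  audio samples that fall within this frame
    text:   string whose character codes share the w component
    """
    H, W = len(pixels), len(pixels[0])
    payload = [float(a) for a in audio] + [float(ord(ch)) for ch in text]
    if len(payload) > W * H:  # capacity condition of Eq. 15
        raise ValueError("audio samples plus text exceed W * H quaternions")
    payload += [0.0] * (W * H - len(payload))  # unused w slots stay zero
    return [[(payload[y * W + x],) + tuple(float(c) for c in pixels[y][x])
             for x in range(W)]
            for y in range(H)]

def unpack_frame(frame, n_audio, n_text):
    """Recover pixels, audio samples and text from a packed frame."""
    flat_w = [q[0] for row in frame for q in row]
    audio = flat_w[:n_audio]
    text = "".join(chr(round(v)) for v in flat_w[n_audio:n_audio + n_text])
    pixels = [[q[1:] for q in row] for row in frame]
    return pixels, audio, text

# Capacity check for hypothetical stream parameters.
W, H, fps, fs = 640, 480, 25, 44100
samples_per_frame = fs // fps   # 1764 audio samples per frame
capacity = W * H                # 307200 quaternions per frame

# Tiny 2 x 2 frame carrying two audio samples and a two-character text.
pixels = [[(255, 0, 0), (0, 255, 0)],
          [(0, 0, 255), (10, 20, 30)]]
frame = pack_frame(pixels, [0.5, -0.25], "Hi")
out_pixels, out_audio, out_text = unpack_frame(frame, n_audio=2, n_text=2)
```

With these example parameters a single frame offers far more w slots than one frame's worth of audio needs, which leaves ample room for text.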
A portion of the w components of the quaternions related to each frame can be reserved for basic information such as the number of audio samples, the text string length, and the start locations of the audio samples and of the text characters. For the current frame, the yielded quaternions are to be secured, and the proposed method resembles the traditional encryption method in many aspects. The set of quaternions $q_F(m,n)$ corresponding to the current frame (including audio and text) is converted to the frequency domain using the quaternion fast Fourier transform (QFFT):
$$Q_F(u,v) = \mathrm{QFFT}\{q_F(m,n)\} \tag{17}$$
Then, a selected digital image is used as a multiple-secret-key (MSK) and is cropped to coincide dimensionally with the frame. This image is converted to quaternions $q_K(m,n)$, which are then converted to the frequency domain:
$$Q_K(u,v) = \mathrm{QFFT}\{q_K(m,n)\} \tag{18}$$
The quaternions $Q_F(u,v)$ and $Q_K(u,v)$ are multiplied together element by element, yielding a set of quaternions $Q_C(u,v)$ whose components can be transmitted.
The pseudo code of the proposed encryption algorithm is shown in List.1.
4.2 Decryption Process
As shown in Fig.2, a copy of the same digital image used in the encryption process is used at the receiver side as the multiple-secret-key (MSK) and is cropped to coincide dimensionally with the frame. As previously stated, this image is converted to quaternions $q_K(m,n)$, which are then converted to the frequency domain:
$$Q_K(u,v) = \mathrm{QFFT}\{q_K(m,n)\} \tag{19}$$
Figure 2. The block diagram of decryption algorithm
When the frequency-domain ciphered quaternions $Q_C(u,v)$ corresponding to one frame have been received sequentially, they are divided by $Q_K(u,v)$ and then converted back to the time domain using the inverse quaternion fast Fourier transform (IQFFT):
$$q_F(m,n) = \mathrm{IQFFT}\{Q_C(u,v)\,Q_K^{-1}(u,v)\} \tag{20}$$
By decomposing the obtained quaternions, the video pixels, audio samples and accompanying string can be extracted, yielding the original signals. The pseudo code of the proposed decryption algorithm is shown in List.2.
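The encryption and decryption steps can be sketched end to end in Python. This is a hedged illustration under stated assumptions rather than the author's Matlab code: it uses a naive left-sided QDFT in place of the QFFT, takes μ = (i + j + k)/√3 as the transform axis, places the key on the right of the product, and undoes it by right-multiplying with the key's inverse. The names `encrypt` and `decrypt` and the random test data are hypothetical.

```python
import math
import random

# Assumed transform axis: mu = (i + j + k) / sqrt(3).
MU = (0.0, 1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def qinv(q):
    """Inverse of a non-zero quaternion: conjugate over squared modulus."""
    n2 = sum(c * c for c in q)
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)

def qdft(f, inverse=False):
    """Naive left-sided QDFT (or its inverse) of a 2-D list of quaternions."""
    M, N = len(f), len(f[0])
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / math.sqrt(M * N)
    out = []
    for u in range(M):
        row = []
        for v in range(N):
            acc = [0.0] * 4
            for m in range(M):
                for n in range(N):
                    th = sign * 2 * math.pi * (m * u / M + n * v / N)
                    s = math.sin(th)
                    kernel = (math.cos(th), MU[1] * s, MU[2] * s, MU[3] * s)
                    acc = [a + t for a, t in zip(acc, qmul(kernel, f[m][n]))]
            row.append(tuple(scale * a for a in acc))
        out.append(row)
    return out

def encrypt(frame, key_image):
    """Multiply the frame spectrum by the key spectrum, element by element."""
    QF, QK = qdft(frame), qdft(key_image)
    return [[qmul(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(QF, QK)]

def decrypt(cipher, key_image):
    """Right-divide by the key spectrum, then apply the inverse transform."""
    QK = qdft(key_image)
    QF = [[qmul(c, qinv(k)) for c, k in zip(rc, rk)]
          for rc, rk in zip(cipher, QK)]
    return qdft(QF, inverse=True)

rng = random.Random(1)
# 2 x 3 frame: w holds an audio sample, (x, y, z) hold r, g, b.
frame = [[(rng.uniform(-1, 1),) + tuple(float(rng.randrange(256)) for _ in range(3))
          for _ in range(3)] for _ in range(2)]
# Key image of the same size, stored as pure quaternions.
key = [[(0.0,) + tuple(float(rng.randrange(256)) for _ in range(3))
        for _ in range(3)] for _ in range(2)]

recovered = decrypt(encrypt(frame, key), key)
max_err = max(abs(a - b)
              for rf, rr in zip(frame, recovered)
              for qa, qb in zip(rf, rr)
              for a, b in zip(qa, qb))
```

Two practical points follow from the algebra. Because quaternion multiplication is not commutative, the inverse has to be applied on the same side as the key was applied during encryption. The division is also only defined where the key spectrum is non-zero, so a key image whose transform vanishes at some frequency (for example a constant or purely linear gradient image) would not be usable in this sketch.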
5. Implementation and Experimental Results
A new method is proposed for integrating video, audio and text signals into one unique signal and securing them prior to transmission. The system is implemented and tested in Matlab 2016. The real-time video signal is acquired from either a camera or a multimedia file. The real-time audio signal and the text stream are likewise acquired from the input peripherals. A new-frame-arrival event triggers the data acquisition process, which yields an entire video frame together with the corresponding audio samples and text stream.
List 2. Pseudo code of the decryption algorithm
The encryption process operates on the acquired data using an external digital image as a multiple-secret-key. Both the acquired data set (video, audio and text) and the multiple-secret-key image are represented as quaternion numbers and then converted separately to the frequency domain using the quaternion fast Fourier transform (QFFT). The two quaternion sets are multiplied in the frequency domain, and the obtained quaternions are disassembled into their constituent components before being transmitted sequentially over TCP or UDP.
At the receiver side, there is a copy of the multiple-secret-key image represented in the quaternion frequency domain. The received cipher data are reassembled into frames of the same size as the original video frames. For each frame, the data are represented as a set of quaternion numbers, divided by the quaternion Fourier transform of the multiple-secret-key image, and then converted back from the frequency domain. The constituent components of the yielded quaternions are extracted and further processed to reconstitute the video frame and the audio and text streams.
The video frames before encryption and after decryption are shown on the left and right sides of Fig.4 respectively. The waveforms of the input and the extracted output audio signals are shown on the left and right sides of Fig.5 respectively. A visual comparison of the input and output video frames shows that they coincide exactly. The input and output audio signals are also the same, and the extracted text is the same as the embedded string.
Figure 3. The multiple-secret-key image
Figure 4 (left side). The original video frame
Figure 5 (left side). The original audio signal
5.1 Performance evaluation
The overall performance of the proposed method is evaluated using additive white Gaussian noise (AWGN) as the channel model. To measure the performance, the original input signals (video, audio and text) are compared with the corresponding reconstructed signals after white noise is added to the transmitted signal. The mean-squared error (MSE), the average of the squared errors between actual and estimated readings, is calculated as a function of the signal-to-noise ratio (SNR), which is varied gradually from 0 to 140 dB during the measurement process. The MSE between two one-dimensional signals $x$ and $y$ of length $L$ can be defined as:
$$\mathrm{MSE} = \frac{1}{L} \sum_{n=1}^{L} \bigl(x(n) - y(n)\bigr)^2 \tag{21}$$
For two-dimensional signals $X$ and $Y$ with dimensions $M \times N$, it is defined as:
$$\mathrm{MSE} = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} \bigl(X(m,n) - Y(m,n)\bigr)^2 \tag{22}$$
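The two MSE definitions and the noisy-channel measurement can be sketched in Python as follows. The helper `add_awgn`, which scales the noise variance from the measured signal power and a target SNR in dB, is an assumption about how the channel was simulated; the sine test signal and the function names are hypothetical.

```python
import math
import random

def mse_1d(x, y):
    """Mean-squared error between two equal-length 1-D signals (Eq. 21)."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def mse_2d(X, Y):
    """Mean-squared error between two M x N signals (Eq. 22)."""
    M, N = len(X), len(X[0])
    total = sum((X[m][n] - Y[m][n]) ** 2 for m in range(M) for n in range(N))
    return total / (M * N)

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power = SNR."""
    power = sum(s * s for s in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]

rng = random.Random(0)
# Hypothetical test tone: 1000 samples of a sine wave.
clean = [math.sin(2 * math.pi * n / 50) for n in range(1000)]

# Sweep the SNR and record the resulting error, as in the MSE-versus-SNR plot.
sweep = {snr: mse_1d(clean, add_awgn(clean, snr, rng)) for snr in (0, 20, 60, 140)}
```

In this sketch the noise is added directly to the test signal, so the sweep only shows how the metric behaves; measuring the full system would require adding the noise to the transmitted cipher components and decrypting before comparison.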
Figure 4 (right side). The reconstructed video frame
The experimental results are presented in Fig.6, where the MSE for the video frames and the audio signals is shown on the left and right sides of the figure respectively. The extracted video signal suffers from a high level of distortion when the SNR is less than 50 dB, while the audio signal suffers from a high level of distortion when the SNR is less than 15 dB.
Figure 6 (left side). MSE versus SNR: video signal
6. Conclusions
This paper introduced a novel method for integrating video, audio and text signals into one unique signal and securing them prior to transmission. The proposed approach depends on simple quaternion mathematics for both signal combining/extracting and encryption/decryption. The audio and video synchronization problem is also addressed by this approach, without having to add time-stamp information to the streamed audio and video. The proposed method is implemented and tested in Matlab 2016. The real-time video, audio and text signals are acquired from either real peripheral devices or multimedia files. A new-frame-arrival event triggers the process of integrating this frame, the audio samples and the text stream into a single quaternion-type frame. The encryption process operates on this new quaternion frame using an external digital image as a multiple-secret-key, multiplying the two together in the frequency domain. The obtained quaternions are disassembled into their constituent components before being transmitted sequentially over either TCP or UDP. At the receiver side, the received quaternions are gathered to construct a frame with the same dimensions as that at the transmitter side. The quaternions of this frame are divided by the quaternion Fourier transform of the multiple-secret-key image, and the original signals can be extracted from the yielded result. The experimental results revealed successful embedding and full restoration of the signals' samples, with full synchronization between the audio, video and text components. The performance of the proposed method is estimated by measuring the mean-squared error between the original signals and the corresponding received signals, assuming that the combined encrypted signals are transmitted over an AWGN channel. The obtained results show that the proposed method is reliable and efficient.
Figure 6 (right side). MSE versus SNR: audio signal
References
[1] [MS-RTPME] - v20160714 Real-Time Transport Protocol (RTP/RTCP): Microsoft Extensions Copyright © 2016 Microsoft Corporation Release: July 14, 2016.
[2] Habutsu T., Nishio Y., Sasase I., and Morio S., “A secret key cryptosystem by iterating chaotic map,” Lect. Notes Comput. Sci., Advances in Cryptology-EuroCrypt’91, vol. 547, pp. 127-140, 1991.
[3] Pichler F. and Scharinger J., “Finite dimensional generalized baker dynamical systems for cryptographic applications,” Lect. Notes Comput. Sci., vol. 1030, pp. 465-476, 1996.
[4] T. ElGamal, “A public key cryptosystem and a signature scheme based on discrete logarithms,” in Advances in Cryptology (CRYPTO ’84), Springer, vol. 196, pp. 10–18, 1985.
[5] Daria Lavrova and Alexander Pechenkin, “Applying Correlation and Regression Analysis to Detect Security Incidents in the Internet of Things,” International Journal of Communication Networks and Information Security (IJCNIS), vol. 7, no. 3, December 2015.
[6] Yen J. C. and Guo J. I., “Efficient hierarchical image encryption algorithm and its VLSI realization,” IEE Proceedings Vis. Image Signal Process., vol. 147, no. 2, pp. 430-437, April 2000.
[7] Rkia Aouinatou, Mostafa Belkasmi, Mohamed Askali, “A Dynamic Study with Side Channel against an Identification Based Encryption,” International Journal of Communication Networks and Information Security (IJCNIS), vol. 7, no. 1, April 2015.
[8] Prashant Kumar Arya, Mahendra Singh Aswal and Vinod Kumar, “Comparative Study of Asymmetric Key Cryptographic Algorithms,” International Journal of Computer Science & Communication Networks,vol 5, no. 1, pp.17-21, 2015.
Set,” International Journal of Communication Networks and Information Security (IJCNIS), vol. 5, no. 3, December 2013.
[10] Raghunandhan K R, Radhakrishna Dodmane, Sudeepa K B, Ganesh Aithal, “Efficient Audio Encryption Algorithm For Online Applications Using Transposition And Multiplicative Non-Binary System,” International Journal of Engineering Research & Technology, vol. 2, no. 6, June 2013.
[11] Ali Al-Ataby and Fawzi Al-Naima, “A Modified High Capacity Image Steganography Technique Based on Wavelet Transform ,” The International Arab Journal of Information Technology, Vol. 7, No. 4, October 2010
[12] Sheetal Sharma and Lucknesh Kumar, “Encryption of an Audio File on Lower Frequency Band for Secure Communication,” International Journal of Advanced Research in Computer Science and Software Engineering, v. 3, no. 7, July 2013.
[13] Sangwine, S., Ell, T.A., "Hypercomplex Fourier Transforms of Color Images," IEEE International Conference on Image Processing (ICIP), vol. 1, pp. 137–140, 2001.
[14] Mary Mikhali et al., “An Online System for Synchronization Processing of Video and Audio Signals,” 1-4244-0038-4 2006 IEEE, CCECE/CCGEI, Ottawa, May 2006.
[15] Chia-Chen Kuo, Ming-Syan Chen, and Jeng-Chun Chen, “An adaptive transmission scheme for audio and video synchronization based on real-time transport protocol,” International Conference on Multimedia (ICME), pp. 403 - 406 , August 2001.
[16] Michael Robin, “The audio synchronization concept,” Tech. Rep., Miranda, 1999.
[17] Rainer Lienhart, Igor Kozintsev, and Stefan Wehr, “Universal synchronization scheme for distributed audio-video capture on heterogeneous computing platforms,” Proceedings of the ACM International Conference on Multimedia (ICME), November 2003.
[18] DOI: 10.1109/ICIP.2000.899482 • Source: IEEE Xplore Conference: Image Processing, 2000. Proceedings. 2000 International Conference on, Volume: 3
[19] Raul Peña, Alfonso Ávila, David Muñoz, and Juan Lavariega, “A Data Hiding Technique to Synchronously Embed Physiological Signals in H.264/AVC Encoded Video for Medicine Healthcare,” BioMed Research International Volume 2015 (2015), Article ID 514087, 10 pages, http://dx.doi.org/10.1155/2015/514087
[20] K. Geetha and P. Vanitha Muthu, “Implementation of ETAS (Embedding Text in Audio Signal) Model to Ensure Secrecy,” International Journal on Computer Science and Engineering, vol. 02, no. 04, pp. 1308-1313, 2010.
[21] Bihan, N.L., Sangwine, S.J., "Quaternion principal component analysis of color images," IEEE International Conference on Image Processing, vol, 1, pp. 809–812, 2003.
[22] Ell T.A., "Quaternion-Fourier transforms for analysis of two dimensional linear time-invariant partial differential systems," in Proc. 32nd Con. Decision Contr., pp. 1830-1841, Dec. 1993.
[23] M.I.Khalil, “Applying Quaternion Fourier Transforms for Enhancing Color Images,” I.J. Image, Graphics and Signal Processing, vo. 2, pp. 9-15.,1793-8201, 2012.
[24] M. I. Khalil, “Quaternion-based Encryption/Decryption of Audio Signal Using Digital Image as a Variable Key,” International Journal of Communication Networks and Information Security (IJCNIS), vol. 9, no. 2, August 2017.
[25] Nidhi S. Kulkarni, Balasubramanian Raman, Indra Gupta, “Recent Advances in Multimedia Signal Processing and Communications,” pp. 417-449, Part of the Studies in Computational Intelligence book series (SCI, volume 231).
[26] Sridhar C. and R. R. Sedamkar, “A Novel Idea on Multimedia Encryption Using Hybrid Crypto Approach,” Procedia Computer Science, vol. 79, pp. 293-298, 2016.
[27] Mariusz Dzwonkowski, Michal Papaj and Roman Rykaczewski, “A New Quaternion-Based Encryption Method for DICOM Images,” IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 4614-4622, Nov. 2015.
[28] Shaoquan Wu and Jiwu Huang, “Efficiently Self-Synchronized Audio Watermarking for Assured Audio Data Transmission,” DOI: 10.1109/TBC.2004.838265, Source: IEEE Xplore.
[29] P. M. Rubesh Anand, Gaurav Bajpai, and Vidhyacharan Bhaskar, “Real-Time Symmetric Cryptography using Quaternion Julia Set,” IJCSNS International Journal of Computer Science and Network Security, vol. 9, no. 3, March 2009.