Efficient Data Communication On Cloud by Secure
Auditing and Deduplication
Prof. Prashant Sadaphule, Priya Jawale, Rishabh Rapatwar, Uday Mahana, Chetan Magar
Department of Computer Engineering, AISSMS’s IOIT, Pune
ABSTRACT
Cloud computing has changed the internet in a way few other technologies have. It offers a convenience and capability that benefit both individuals and businesses. Large companies such as Google, Amazon and Microsoft have taken this technology to the next level by providing Google Cloud Platform, Amazon Web Services and Microsoft Azure respectively. The problem lies in the amount of data being uploaded to the cloud: the storage hardware and backup media required grow in direct proportion to the amount of data, so the simplification that businesses and enterprises enjoy comes at a cost. Most of the data on the internet is redundant, and storing replicated data on the cloud only increases the cost for the enterprise. Data deduplication is a technology that many enterprises use to get rid of redundant data, and it can significantly reduce the amount of data stored on the cloud. It aims to keep only a single copy of a particular file on the cloud while discarding the replicated copies of the same file.
This paper presents an implementation of deduplication. Alongside deduplication, it also addresses the security of data in the cloud environment. SecCloud and SecCloud+ are the systems that help us achieve both deduplication and data integrity.
Keywords
Auditing, Cloud Computing, Cloud Storage, Data Deduplication, Data integrity.
INTRODUCTION
Cloud computing is growing at a rapid pace across the globe. Although the technology is more than a decade old, it has not lost its appeal. Most of what we do on the internet is supported by cloud services, even though this is hard to notice because most of the work happens behind the scenes. The music we listen to, the videos we watch and the online games we play are all hosted on this technology. It lets us manage, store and process data over the internet. The cloud is not hosted on local servers but on a network of remote servers on the internet, which gives us the freedom to access any cloud-hosted application from a personal computer. Cloud services remove the need to buy large racks of storage hardware and the cooling equipment for them, cut electricity costs and save space for an enterprise or organization. These benefits come at a cost, however. The storage hardware and backup media required grow in direct proportion to the amount of data. Recent surveys show that a lot of data on the internet is redundant, which raises the question of why one should pay to store redundant data. Data deduplication is a specialized technique for eliminating replicated copies of data, and it can significantly reduce the amount of data stored on the cloud. It allows only a single copy of a particular file to be stored on the cloud while discarding the replicated copies of the same file. Reviews of deduplication in enterprise use report it to be reliable and cost effective.
The other problem that cloud users face is integrity auditing. In this architecture the data is stored in a domain unknown to the client or end user: it is transferred over the internet to the data centres of a cloud service provider, and clients have no control over it. This raises concerns about the integrity of the data. This paper presents techniques to achieve data integrity.
SecCloud and SecCloud+ are the systems that help us achieve data deduplication as well as data integrity. These are not standard industry terms that everyone is aware of, but IT professionals and enterprises looking for more efficiency may find them worth knowing. SecCloud generates data tags before uploading and reduces the overhead for the user as well as the auditor. Alongside deduplication, it provides a Proof of Ownership protocol, through which the client proves that it actually owns the target file. Clients also expect the system to meet their security requirements. SecCloud+ provides everything SecCloud does and additionally protects the confidentiality of the file. It uses a key that cannot be derived from the contents of the file, which prevents dictionary attacks.
RELATED WORK
Proof of Ownership (POW) lets a client establish that it actually owns a particular file, since the file is stored in data centres that the client has no knowledge of. However, using Proofs of Retrievability (POR) and Provable Data Possession (PDP) techniques contradicts the privileges of POW. The proposed schema allows a deduplication check for both the files and the authentication tags. Shai Halevi et al. [7] proposed a Proof of Ownership (POW) scheme in which the client proves to the cloud that it holds the file. Merkle trees and specific encodings perform the security check and reduce the overhead of deduplication on the client. N. Vidhya et al. [8] introduce a Cloud Storage Service (CSS) that manages storage and maintenance.
PROPOSED SYSTEM
The proposed system aims to achieve data integrity and data deduplication, which the existing system lacks. The existing system has many flaws and uses schemes that do not keep up with the growing usage of cloud services. Clients do not feel in control of their data, because it is transferred over the internet to an unknown data centre, and this raises concerns about the integrity of the data. In addition, storing redundant copies of data serves no purpose. Two systems, SecCloud and SecCloud+, help us achieve both data integrity and deduplication.
SecCloud establishes an environment in which the user can verify that he owns the file and that the file is safe. This is handled by the Proof of Ownership protocol, which runs between the client and the cloud. SecCloud also generates data tags before a file is uploaded, which reduces the computational load on the user and the auditor.
The SecCloud+ schema, besides providing integrity and deduplication, also provides confidentiality of the file. It uses a key that cannot be derived from the contents of the file, which prevents dictionary attacks.
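To illustrate why a key derived from file contents is exposed to dictionary attacks, the sketch below contrasts a content-derived key with a randomly generated one. It is a minimal illustration under stated assumptions, not the SecCloud+ key scheme: the class and method names (`KeyDerivationSketch`, `contentDerivedKey`, `randomKey`) are hypothetical, and random generation is only one assumed way of obtaining a key that is independent of the file.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.crypto.KeyGenerator;

class KeyDerivationSketch {
    // Content-derived key: anyone who can guess the file can recompute the key.
    static byte[] contentDerivedKey(byte[] fileContents) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(fileContents);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Random key (assumed alternative): independent of the file contents,
    // so guessing the contents does not reveal the key.
    static byte[] randomKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey().getEncoded();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] actual = "salary: 50000".getBytes(StandardCharsets.UTF_8);
        byte[] guess = "salary: 50000".getBytes(StandardCharsets.UTF_8);
        // A correct guess of the contents reproduces the content-derived key.
        System.out.println(Arrays.equals(contentDerivedKey(actual), contentDerivedKey(guess)));
        // Two random keys are unrelated to the contents and to each other.
        System.out.println(Arrays.equals(randomKey(), randomKey()));
    }
}
```

An attacker who holds a list of likely files can hash each candidate and test the resulting key, which is the dictionary attack the text refers to; a key that does not depend on the contents gives no such shortcut.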
The proposed system implements deduplication. It also provides auditing techniques to manage the storage of data and takes care of the data integrity discussed above. It gives the user confidence that he has control over the files being uploaded to the cloud. In addition, file access and download are secured through a file key and a secret key delivered via OTP.
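The two-key download check can be sketched as follows. Per the OTP generation module described later in the paper, the file key stays the same for a given file while the secret key changes on every download. This is a minimal sketch under assumptions, not the authors' code: the class `DownloadKeySketch`, its methods, the 8-character key length and the character pool are all hypothetical, the keys are held in memory rather than in the SQL database, and the email delivery step is omitted.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

class DownloadKeySketch {
    // Assumed character pool; the paper only says letters, digits and special characters.
    private static final String CHARS =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789@#$%&*";
    private static final SecureRandom RND = new SecureRandom();

    private final Map<String, String> fileKeys = new HashMap<>();          // fixed per file
    private final Map<String, String> pendingSecretKeys = new HashMap<>(); // fresh per download

    private static String randomKey(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(CHARS.charAt(RND.nextInt(CHARS.length())));
        }
        return sb.toString();
    }

    // Called once when the file is uploaded.
    String registerFile(String fileName) {
        String fileKey = randomKey(8);
        fileKeys.put(fileName, fileKey);
        return fileKey;
    }

    // Called on every download request; the result would be emailed to the user.
    String issueSecretKey(String fileName) {
        String secretKey = randomKey(8);
        pendingSecretKeys.put(fileName, secretKey);
        return secretKey;
    }

    // Both keys must match; the secret key is consumed whether or not the check passes.
    boolean verify(String fileName, String fileKey, String secretKey) {
        String expectedFileKey = fileKeys.get(fileName);
        String expectedSecretKey = pendingSecretKeys.remove(fileName);
        if (expectedFileKey == null || expectedSecretKey == null
                || fileKey == null || secretKey == null) {
            return false;
        }
        return sameBytes(expectedFileKey, fileKey) && sameBytes(expectedSecretKey, secretKey);
    }

    private static boolean sameBytes(String a, String b) {
        return MessageDigest.isEqual(a.getBytes(StandardCharsets.UTF_8),
                                     b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        DownloadKeySketch keys = new DownloadKeySketch();
        String fileKey = keys.registerFile("report.pdf");
        String secretKey = keys.issueSecretKey("report.pdf");
        System.out.println(keys.verify("report.pdf", fileKey, secretKey)); // first use succeeds
        System.out.println(keys.verify("report.pdf", fileKey, secretKey)); // reuse is rejected
    }
}
```

Removing the secret key on every verification attempt is a design choice of this sketch: it makes the key single-use, so a leaked email cannot be replayed for a second download.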
The system consists of the following three entities:
• Cloud Client (Users)
• Auditor
• Cloud
The client (cloud user) has to register with the system to access his profile, and then logs in with the appropriate credentials. The user profile gives the user the privilege to upload files, download files and share files with other users of the system. To upload, the cloud user selects a file and submits it. The file is not stored on the cloud at this point; it is first sent to the auditor for verification. The auditor activates the file, and only then is it uploaded to the cloud. The user can download the files he has uploaded, but every time he has to enter the file key and the secret key, which are sent via OTP to his registered email ID. Once the user enters both keys correctly, the file is downloaded to his system. The user can also share the files he has uploaded with other users of the system: he selects the file he wants to share, selects the recipient among the registered cloud users, and sends the file. The recipient then has to pass the same security check to download the file, entering the file key and secret key that he receives through OTP on his registered email ID. When he enters both keys, the file is downloaded to his system.
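The upload path described above, in which a file waits for the auditor and reaches the cloud only after activation, can be sketched as a small state model. This is an illustrative sketch only; the class `UploadWorkflowSketch`, its methods and the in-memory map standing in for the file table are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UploadWorkflowSketch {
    enum Status { PENDING_AUDIT, STORED }

    // Hypothetical in-memory stand-in for the system's file table.
    private final Map<String, Status> files = new LinkedHashMap<>();

    // Client side: a submitted file is queued for the auditor, not stored yet.
    void submit(String fileName) {
        files.put(fileName, Status.PENDING_AUDIT);
    }

    // Auditor side: activation moves a pending file to the cloud.
    void activate(String fileName) {
        if (files.get(fileName) != Status.PENDING_AUDIT) {
            throw new IllegalStateException("No pending upload named " + fileName);
        }
        files.put(fileName, Status.STORED);
    }

    // Only activated files can be downloaded or shared.
    boolean isDownloadable(String fileName) {
        return files.get(fileName) == Status.STORED;
    }

    public static void main(String[] args) {
        UploadWorkflowSketch cloud = new UploadWorkflowSketch();
        cloud.submit("report.pdf");
        System.out.println(cloud.isDownloadable("report.pdf")); // false: waiting for the auditor
        cloud.activate("report.pdf");
        System.out.println(cloud.isDownloadable("report.pdf")); // true: activated and stored
    }
}
```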
ALGORITHMS
OTP generation algorithm
Simple Mail Transfer Protocol (SMTP) is used to send mail. It is an application-layer protocol that runs over TCP/IP. An SMTP authenticator checks the username and password of the sending email ID. The OTP is sent from an email ID registered on the system to the recipient's email ID.
The algorithm uses built-in functions from packages that Java provides.
In this paper we generate the OTP for security and file sharing purposes.
1. Declare String chars containing a-z, A-Z, 0-9 and special characters.
2. Random rnd = new SecureRandom()
3. final int PW_LENGTH = x
4. for (int i = 0; i < PW_LENGTH; i++)
5.     pass.append(chars.charAt(rnd.nextInt(chars.length())))
6. Generate email: String from = "emailid"
7. String password = "pass"
8. RequestDispatcher rd = request.getRequestDispatcher("/EmailServlet")
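Steps 1 to 5 above can be written as a small runnable Java class. This is a sketch under assumptions: the class name `OtpSketch`, the exact set of special characters and the length of 8 used in `main` are not given in the paper. The email and servlet steps (6 to 8) are left out because they depend on the application's mail configuration.

```java
import java.security.SecureRandom;

class OtpSketch {
    // Step 1: character pool. The special characters are an assumed selection.
    static final String CHARS =
        "abcdefghijklmnopqrstuvwxyz"
        + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        + "0123456789"
        + "@#$%&*";

    // Step 2: cryptographically strong random source.
    private static final SecureRandom RND = new SecureRandom();

    // Steps 3 to 5: pick `length` characters uniformly from the pool.
    static String generate(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        StringBuilder pass = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            pass.append(CHARS.charAt(RND.nextInt(CHARS.length())));
        }
        return pass.toString();
    }

    public static void main(String[] args) {
        // Prints a different 8-character OTP on each run.
        System.out.println(generate(8));
    }
}
```

`SecureRandom` is used rather than `java.util.Random` because an OTP must not be predictable from earlier outputs.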
Hash algorithm for file level deduplication:
1. Start
2. Declare variable
3. Initialize variable
4. Read the file name
5. Read the file name till the end of file title
6. Generate NAME from strBUFF[FILENAMESIZE]
7. if (FirstFile)
8.     Consider node as root element
9.     Inc FileCtr
10. Else
11.     Search the generated FileNAME in BST
12.     If (Find NAME == True)
13.         Compute the node
14.         Add the node to a linked list
15.         Change the Endlink of SLL
16.     Else
17.         Add the node in BST
18.         Inc The FileCounter
19. Calculate Deduplication Ratio
20. Display the Result for each file iteration
21. END
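The steps above can be sketched in Java using `TreeMap`, which is a balanced binary search tree, with a linked list per name to hold duplicates. This is a minimal sketch, not the authors' implementation: the class `NameDedupSketch` and its methods are hypothetical, and because the paper does not define the deduplication ratio, the sketch assumes it is the number of files submitted divided by the number of files actually stored.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.TreeMap;

class NameDedupSketch {
    // BST keyed by file name; each node keeps the uploaders that share that name.
    private final TreeMap<String, List<String>> index = new TreeMap<>();
    private int totalFiles = 0;

    // Returns true if the name was already in the tree (duplicate).
    boolean add(String fileName, String uploader) {
        totalFiles++;
        List<String> owners = index.get(fileName);
        if (owners == null) {
            // New name: add a node to the BST (steps 8 and 17).
            owners = new LinkedList<>();
            owners.add(uploader);
            index.put(fileName, owners);
            return false;
        }
        // Known name: append to the node's linked list (steps 13 to 15).
        owners.add(uploader);
        return true;
    }

    int uniqueFiles() {
        return index.size();
    }

    // Assumed definition: files submitted / files stored.
    double deduplicationRatio() {
        return index.isEmpty() ? 0.0 : (double) totalFiles / index.size();
    }

    public static void main(String[] args) {
        NameDedupSketch dedup = new NameDedupSketch();
        dedup.add("a.txt", "alice");
        dedup.add("b.txt", "bob");
        dedup.add("a.txt", "carol"); // duplicate name, not stored again
        System.out.println(dedup.deduplicationRatio()); // 1.5 (3 submitted, 2 stored)
    }
}
```

Matching on the file name alone follows the steps as listed; it treats two different files with the same name as duplicates and misses identical files with different names.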
IMPLEMENTATION MODULE
User Module
- In this module the user registers on the system with his username, password, first name, last name, mobile number and email ID. If a user tries to register with a username that already exists, the registration is denied and the user has to choose a different username. All registration details are stored in an SQL database. After logging in, the user can upload a file, download a file, share a file or view shared files.
Auditor Module
- In this module the auditor logs in with the username and password that are predefined by the system. The auditor has the privilege to view the users registered on the cloud. He can audit the space used as well as the deduplicated files, and he can see which files are duplicates and which are not. He activates a file for uploading to the cloud.
File uploading
- The user logs in to his profile using the credentials he provided when registering. He selects a file to upload from his local machine. The file is not uploaded to the cloud directly but waits for auditor verification. Once the auditor verifies and activates the file, it is uploaded to the cloud.
File downloading
- In this module the user can download files that he has already uploaded to the cloud. As a security measure he has to enter the file key and the secret key, which are sent to his registered email ID via OTP. When the user enters both keys correctly, the file is downloaded to his machine.
File sharing
- In this module one user can share his file with another user of the cloud system. The user selects the file he wants to share and the user he wants to share it with, and sends the file. The receiving user has to follow the file downloading protocol to download the file.
Secure auditing protocol
- In this module, the auditor logs in to the system using his username and password. When a user tries to upload a file, the file is sent to the auditor for verification, and it is uploaded to the cloud only after the auditor verifies it. The auditor checks which files are duplicates and which are not, and he has access to bar charts for auditing the deduplication. The auditor has the privilege to approve files uploaded by the user, taking into consideration the used space and the duplicated files. When the auditor approves a duplicate file for upload, the replicated file is not stored on the cloud; only a link is generated for the user to access the existing copy.
Deduplication
- When a user tries to upload a file to the cloud, the file is sent to the auditor for verification. When the auditor activates the file, it is uploaded to the cloud. A table of the files uploaded to the cloud is maintained. Every time a file is sent for auditor verification, its name is checked against the names of the already uploaded files in the table. If the file name already exists, the file is shown to the auditor as a duplicate. The file is not a duplicate if the table has no entry for the file name that the user wants to upload.
OTP generation
- This module is used when a user wants to download a file. The user has to enter the file key and the secret key to download the file. The file key is a random key generated at the time the user uploads the file. This key alone is not enough to download the file. The user also needs the secret key, which is a combination of random characters including special characters. These keys are sent to the registered email ID of the user via OTP. The file key remains the same every time the user wants to download a particular file, but the secret key changes every time. This security measure helps protect the user's files from attacks.
EXPECTED RESULTS
File uploading
Fig 1. File upload
Deduplication
Fig 2. Deduplication
Auditing space
Fig 3. Auditing space
CONCLUSION
and deduplication of data. SecCloud helps audit the integrity of data stored in racks of storage devices at data centres. With the help of the Proof of Ownership protocol, SecCloud gives clients the privilege to access their files, so that they feel in control of the files they store at unknown data centres across the globe. The use of the SecCloud system reduces the computational overhead for the user as well as the auditor. The efficiency of the overall system improves, especially while uploading a file, because SecCloud generates data tags before uploading. It also makes the auditing phases efficient. SecCloud+ allows integrity auditing and deduplication to be applied to encrypted data for added security. The OTP generation we implemented helps protect the system from security threats.
REFERENCES
[1] P. S. Sadaphule, R. Rapatwar, U. Mahana, P. Jawale and C. Magar, "A Survey on Auditing Encrypted Data and Deduplication in Cloud Environment", International Journal of Advance Engineering and Research Development, Volume 4, Special Issue 5, Dec. 2017.
[2] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson and D. Song, "Provable Data Possession at Untrusted Stores", Proc. 14th ACM Conf. Computer and Comm. Security (CCS'07), pp. 598-609, 2007.
[3] Q. Wang, C. Wang, K. Ren, W. Lou and J. Li, "Enabling Public Auditability and Data Dynamics for Storage Security in Cloud Computing", IEEE Trans. Parallel Distributed Systems, vol. 22, no. 5, pp. 847-859, May 2011.
[4] Yan Cong Wang, Sherman S.M. Chow, Qian Wang, Kui Ren and Wenjing Lou, "Privacy-Preserving Public Auditing for Secure Cloud Storage".
[5] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph and Randy Katz, "A View of Cloud Computing", Communications of the ACM (CACM), Vol. 53, No. 4, April 2010.
[6] Jiawei Yuan and Shucheng Yu, "Secure and Constant Cost Public Cloud Storage Auditing with Deduplication", IEEE Conference on Communications and Network Security (CNS), Oct. 2013.
[7] Shai Halevi, Danny Harnik, Benny Pinkas and Alexandra Shulman-Peleg, "Proofs of Ownership in Remote Storage Systems", 18th ACM Conference on Computer and Communications Security, Oct. 2011.
[8] N. Vidhya and P. Jegathesh, "Secure File Sharing of Dynamic Audit Services in Cloud Storage", International Journal of Research in Engineering and Technology, Volume 03, Issue 05, May 2014.
[9] M. Vanitha, Ar. Sivakumaran and L. Priyadharshini, "A Study on Secure Storage of Dynamic Audit Services in Cloud", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 1, Issue 1, July 2012.