Research Article
August
2017
Computer Science and Software Engineering
ISSN: 2277-128X (Volume-7, Issue-8)
Study of Various Multi Keyword Search in Cloud Computing
Hitesh Pardesi, Naveen Kumari
Punjabi University Regional Centre of Information and Technology Management, Mohali, Punjab, India
DOI: 10.23956/ijarcsse/V7I8/0143
Abstract— With the advent of cloud computing, data owners are motivated to outsource their complex data management systems from local sites to the commercial public cloud for greater flexibility. Accordingly, enabling an encrypted cloud data search service is of paramount importance. Considering the large number of data users and documents in the cloud, it is necessary to allow multiple keywords in the search request and to return documents in the order of their relevance to these keywords. Related works on searchable encryption focus on single keyword search or Boolean keyword search, and rarely sort the search results. In this paper, we describe and address the challenging problem of privacy-preserving multi-keyword ranked search over encrypted data in cloud computing (MRSE).
Keywords— Cloud Computing, Multi Keyword Search, Clustering in Cloud
I. INTRODUCTION
Cloud computing has been considered a new model of enterprise IT infrastructure, which can organize huge computing, storage and application resources, and enable users to enjoy ubiquitous, convenient and on-demand network access to a shared pool of configurable computing resources with great efficiency and minimal economic overhead.[2] Attracted by these appealing features, both individuals and enterprises are motivated to outsource their data to the cloud, instead of purchasing software and hardware to manage the data themselves. Despite the various advantages of cloud services, outsourcing sensitive information (such as e-mails, personal health records, company finance data, government documents, etc.) to remote servers brings privacy concerns.[2] The cloud service providers (CSPs) that keep the data for users may access users' sensitive information without authorization. A general approach to protecting data confidentiality is to encrypt the data before outsourcing. However, this incurs a huge cost in terms of data usability, because plaintext keyword search no longer works on the encrypted data.
Fig 1: Privacy preserved Multi keyword Search[2]
II. CLOUD COMPUTING ENTITIES
Cloud providers and consumers are the two main entities in the business market. In addition, service brokers and resellers are two emerging service-level entities in the Cloud world. These are discussed as follows.
Cloud Providers: These include Internet service providers, telecommunications companies, and large business process outsourcers that provide either the media (Internet connections) or the infrastructure (hosted data centers) that enable consumers to access cloud services. Service providers may also include systems integrators that build and support data centers hosting private clouds, and they offer different services (e.g., SaaS, PaaS, IaaS, etc.) to the consumers, the service brokers or the resellers.
ISSN(E): 2277-128X, ISSN(P): 2277-6451, DOI: 10.23956/ijarcsse/V7I8/0143, pp. 318-325
Cloud Resellers: Resellers can become an important factor in the Cloud market when Cloud providers expand their business across continents. Cloud providers may choose local IT consultancy firms or resellers of their existing products to act as "resellers" for their Cloud-based products in a particular region.
Cloud Consumers: End users belong to the category of Cloud consumers. However, Cloud service brokers and resellers can also belong to this category as soon as they are customers of another Cloud provider, broker or reseller.
III. MULTI-KEYWORD SEARCH
A multi-keyword ranked search scheme enables accurate, efficient and secure search over encrypted mobile cloud data. Security analysis has shown that multi-keyword search schemes can achieve confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealment of the access pattern of the search user.
Within this framework, an efficient index is used to further improve search efficiency, and the blind storage system is adopted to conceal the access pattern of the search user. The framework develops searchable encryption for multi-keyword ranked search over the stored data. Specifically, considering the large number of outsourced documents in the cloud, it uses the relevance score and k-nearest neighbor techniques to build an efficient multi-keyword search scheme that returns ranked search results according to their accuracy.
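To make the idea of relevance-ranked results concrete, the following is a minimal plaintext sketch, not the scheme of any cited paper. It assumes a simple TF-IDF weighting (the exact weighting formula, and the names build_relevance_vectors and ranked_search, are illustrative assumptions) and returns the top-k documents for a multi-keyword query. No encryption is involved here.

```python
import heapq
import math
from collections import Counter


def build_relevance_vectors(docs):
    """docs maps doc_id -> list of keywords; returns doc_id -> {keyword: weight}.

    The weight is an assumed TF-IDF variant: tf * log(1 + N / df).
    """
    n = len(docs)
    df = Counter()
    for words in docs.values():
        df.update(set(words))  # document frequency counts each doc once
    vectors = {}
    for doc_id, words in docs.items():
        tf = Counter(words)
        vectors[doc_id] = {
            w: (count / len(words)) * math.log(1 + n / df[w])
            for w, count in tf.items()
        }
    return vectors


def ranked_search(vectors, query_keywords, k):
    """Score each document by summing the weights of the query keywords
    it contains, then return the ids of the k highest-scoring documents."""
    hits = []
    for doc_id, vec in vectors.items():
        score = sum(vec.get(w, 0.0) for w in query_keywords)
        if score > 0:
            hits.append((score, doc_id))
    return [doc_id for _, doc_id in heapq.nlargest(k, hits)]
```

For example, with three small documents and the query keywords "cloud" and "search", the document containing both keywords is ranked first.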
IV. RELATED STUDY
Hongwei Li et al. [1] developed searchable encryption for multi-keyword ranked search over the stored data. Specifically, considering the large number of outsourced documents in the cloud, they use the relevance score and k-nearest neighbor techniques to build an efficient multi-keyword search scheme that returns ranked search results according to their accuracy. Within this framework, an efficient index further improves search efficiency, and the blind storage system conceals the access pattern of the search user. Security analysis shows that the scheme achieves confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealment of the access pattern of the search user.
K.S. Saravanan and S. Karthika [2] presented a secure multi-keyword ranked search scheme over encrypted cloud data, which at the same time supports dynamic update operations such as deletion and insertion of documents. Specifically, the vector space model and the widely used TF-IDF model are combined in index construction and query generation. A special tree-based index structure is built, and a "Greedy Depth-first Search" algorithm is proposed to provide efficient multi-keyword ranked search. The secure kNN algorithm is used to encrypt the index and query vectors while still ensuring accurate relevance score calculation between encrypted index and query vectors. To resist statistical attacks, phantom terms are added to the index vector to blind the search results. Owing to the special tree-based index structure, the proposed scheme achieves sub-linear search time and handles the deletion and insertion of documents flexibly.
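The secure kNN step can be illustrated with a toy sketch of the well-known vector-splitting construction: a secret bit string decides how index and query vectors are split, and two secret invertible matrices hide the shares while preserving the inner product. This is an assumption about the general technique, not code from [2]; it omits dimension extension and the phantom terms mentioned above, uses tiny integer matrices, and is not secure as written. All function names are hypothetical.

```python
import random
from fractions import Fraction


def invert(m):
    """Exact Gauss-Jordan inverse; raises ValueError if m is singular."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def random_invertible(n, rng):
    """Draw small random integer matrices until one is invertible."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue


def keygen(n, rng):
    """Secret key: split indicator S plus two invertible matrices."""
    s = [rng.randint(0, 1) for _ in range(n)]
    return s, random_invertible(n, rng), random_invertible(n, rng)


def split(vec, s, split_when, rng):
    """Split entries where the key bit equals split_when; copy the rest."""
    a, b = [], []
    for x, bit in zip(vec, s):
        if bit == split_when:
            r = Fraction(rng.randint(-10, 10))
            a.append(r)
            b.append(x - r)
        else:
            a.append(Fraction(x))
            b.append(Fraction(x))
    return a, b


def mat_vec(m, v):
    return [sum(Fraction(x) * y for x, y in zip(row, v)) for row in m]


def encrypt_index(p, key, rng):
    s, (m1, _), (m2, _) = key
    p1, p2 = split(p, s, 1, rng)
    transpose = lambda m: [list(c) for c in zip(*m)]
    return mat_vec(transpose(m1), p1), mat_vec(transpose(m2), p2)


def trapdoor(q, key, rng):
    s, (_, m1_inv), (_, m2_inv) = key
    q1, q2 = split(q, s, 0, rng)
    return mat_vec(m1_inv, q1), mat_vec(m2_inv, q2)


def score(enc_index, td):
    """Computed by the server; equals the plaintext inner product p . q."""
    return sum(x * y for half in zip(enc_index, td) for x, y in zip(*half))
```

The server only sees the encrypted pairs, yet score(encrypt_index(p, key, rng), trapdoor(q, key, rng)) equals the inner product of p and q, which is what allows ranking without decryption.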
Veerraju Gampala and Sreelatha Malempati [3] proposed an efficient multi-keyword synonym query over encrypted cloud data that retrieves the top-k scored documents. The vector space model and the TF-IDF model are used for index construction and query generation. The kNN algorithm is used to encrypt the index and query vectors. A special tree called the Balanced M-way Search (BMS) Tree is constructed for indexing, and a Depth First Search Technique (DFST) algorithm is proposed to achieve efficient multi-keyword synonym ranked search. The efficiency and accuracy of the DFST algorithm are illustrated with an example; with the BMS tree, it takes sub-linear time complexity.
Raghavendra S et al. [4] note that it is essential to encrypt sensitive data before transferring it to the cloud server in order to maintain privacy and security. All traditional searchable symmetric encryption (SSE) schemes make users search over the entire index file. The authors propose the Domain and Range Specific Multi-keyword Search (DRSMS) scheme, which minimizes the search time and the index storage space. The scheme adopts a collection sort technique to split the index file into D Domains and R Ranges. The Domain is based on the length of the keyword; the Range splits within the domain based on the first letter of the keyword. A mathematical model is used to search over the encrypted indexed keywords in a way that eliminates information leakage. An ordered search is used to select the range within the domain with time complexity O(RlogD), and a linear search is used to find the keyword within the range with O(R). The space complexity of the index storage is O(NT × 3) and the search time complexity is O(1)+O(RlogD)+O(R), while the time complexity of indexing is O(NT × 3).
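The domain and range partitioning described above can be sketched on plaintext keywords as follows. This is only an illustration of the layout (domain = keyword length, range = first letter); the encrypted matching model of [4] is not reproduced, and the use of Python's bisect for the ordered range lookup, as well as the function names, are assumptions.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(keywords):
    """Return {length: (sorted_first_letters, buckets)}.

    Each domain groups keywords of one length; inside a domain, each
    range (bucket) holds the keywords sharing one first letter.
    Keywords are assumed to be non-empty strings.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for kw in keywords:
        grouped[len(kw)][kw[0]].append(kw)
    index = {}
    for length, ranges in grouped.items():
        letters = sorted(ranges)
        index[length] = (letters, [ranges[letter] for letter in letters])
    return index


def lookup(index, kw):
    """Pick the domain directly, the range by ordered search,
    then scan the range linearly."""
    letters, buckets = index.get(len(kw), ([], []))
    pos = bisect_left(letters, kw[0])
    if pos == len(letters) or letters[pos] != kw[0]:
        return False
    return kw in buckets[pos]
```

Only one small bucket is scanned per query, rather than the whole index file, which is the source of the reduced search time claimed for this kind of layout.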
Ajay kumar Narayankar et al. [5] first proposed a basic idea for MRSE based on secure inner product computation, and then gave two significantly improved MRSE schemes to achieve various stringent privacy requirements in two different threat models. To improve the search experience of the data search service, these two schemes are further extended to support more search semantics. The authors also established a set of strict privacy requirements for such a secure cloud data utilization system. Among the various multi-keyword semantics, they chose the efficient similarity measure of "coordinate matching," i.e., as many matches as possible, to capture the relevance of data documents to the search query.
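Coordinate matching can be illustrated on plaintext data with a short sketch: documents and queries become binary vectors over a keyword dictionary, and their inner product counts the query keywords that a document contains. The function names and the sample dictionary are illustrative assumptions; in MRSE the same inner product is computed over encrypted vectors.

```python
def to_binary_vector(keywords, dictionary):
    """1 where the dictionary keyword is present, 0 otherwise."""
    present = set(keywords)
    return [1 if word in present else 0 for word in dictionary]


def coordinate_match(doc_vector, query_vector):
    """Inner product of two binary vectors = number of matching keywords."""
    return sum(d * q for d, q in zip(doc_vector, query_vector))
```

With the dictionary ["cloud", "index", "privacy", "search"], a document containing "cloud" and "search" scores 2 against the query {"cloud", "privacy", "search"}.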
up (PRMSM) has been presented. To enable cloud servers to perform secure search without knowing the actual data of either the keywords or the trapdoors, a novel secure search protocol is systematically constructed. The scheme also ranks the search results while preserving the privacy of the relevance scores between keywords and documents.
Jeniphar Francis et al. [8] presented a secure multi-keyword ranked search scheme over encrypted cloud data, which at the same time supports dynamic update operations such as deletion and insertion of documents. Specifically, the vector space model and the widely used TF×IDF model are combined in index construction and query generation. The method builds a special tree-based index structure and proposes a "Greedy Depth-first Search" algorithm to provide efficient multi-keyword ranked search. The secure kNN algorithm is used to encrypt the index and query vectors while still ensuring accurate relevance score calculation between encrypted index and query vectors. To resist statistical attacks, phantom terms are added to the index vector to blind the search results. Owing to the special tree-based index structure, the proposed scheme achieves sub-linear search time and handles the deletion and insertion of documents flexibly.
Shikha Rani and Shanky Rani [9] described different encryption techniques and possible security solutions, and additionally sought to reduce distributed storage so as to decrease its overhead. Cloud computing is the latest technology for providing on-demand service access and computing resources over the internet. It is a collection of a shared pool of information and resources that makes up a cloud.
Keerthana G et al. [10] developed an application for improving cloud security using a partition and encryption approach. First, the file from the user is taken and divided into a number of parts. After partitioning, all file parts are encrypted and then sent to different cloud servers. When the user requires the data back, it is fetched from the cloud servers and decrypted. After decryption, the parts are merged and the data is returned to the user. The goal is for the application to have a simple user interface for the user's convenience. The efficient method proposed for providing security is the use of hybrid cryptography for more secure sending and receiving of data.
S.No | Keyword Based Searching | Clustering | Algorithm | Remarks
1 | Yes | Yes | IBE | User uploads the index and the encrypted file to the cloud server
2 | Yes | Yes | Symmetric key encryption | User uploads the cluster index, document index and encrypted document
4 | Yes | No | AES | User outsources the index and the encrypted file to the cloud server
5 | Yes | No | ECC | User uploads the file with its index after the encryption process
10 | Yes | No | OPSE | User uploads the index and the encrypted files to the cloud server
V. PRIVACY REQUIREMENTS FOR MULTI RANKED KEYWORD SEARCHING
To strengthen the security of the encrypted information sent by the client to the cloud server, the fundamental objective is to encrypt and decrypt the information securely, with less time and less cost in both the encryption and decryption processes.[9]
The data may be disclosed or modified through unauthorized access, so special care must be taken to protect sensitive data. Secure storage must be achieved in cloud computing, and we therefore adopt cryptographic techniques for it. The data is encrypted by the data owner before it is uploaded to the cloud. The major feature of cryptographic storage is that it accomplishes the security properties described below.
Fig 2: Cloud Encryption Strategy (diagram labels: Data Owner, Data, Encryption Methods, User, Cloud)
The above diagram represents encrypted cloud storage. The data owner applies cryptographic methods to the sensitive data to protect it from unauthorized access, and then uploads the encrypted data to the cloud environment. An authorized user can download the required file and decrypt it. The strength of cryptographic cloud storage depends mostly on two factors: confidentiality and integrity.
Confidentiality: Cryptographic cloud storage provides confidentiality as its main characteristic. The information is encrypted with advanced cryptographic techniques, and thus its secrecy is maintained.
Integrity: Cloud storage provides integrity for the data and thus prevents unauthorized parties from modifying it.
Searchable Encryption: At a high level, a searchable encryption scheme encrypts the content of a search index so that it is hidden from everyone except a party that holds the authorized tokens. A search index is generated from a collection of files and may be either a full-text index or a keyword index. The index is encrypted under the searchable encryption scheme in such a way that
(i) given a token for a keyword, the pointers to the encrypted files that contain the keyword can be retrieved; (ii) without a token, the contents of the index remain hidden.
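The two properties above can be sketched with a toy keyword index in which each keyword is replaced by a keyed token. This is a simplified illustration under stated assumptions, not a complete searchable encryption scheme: tokens are derived with HMAC-SHA256 from Python's standard library, the file pointers are left unencrypted for brevity (a real scheme would protect them as well), and the function names are hypothetical.

```python
import hashlib
import hmac


def make_token(key: bytes, keyword: str) -> str:
    """Derive the search token for a keyword; requires the secret key."""
    message = keyword.lower().encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def build_index(key: bytes, files):
    """files maps file_id -> iterable of keywords.

    Returns token -> sorted list of file pointers. Keywords never
    appear in the index, only their tokens.
    """
    index = {}
    for file_id, keywords in files.items():
        for keyword in set(keywords):
            index.setdefault(make_token(key, keyword), []).append(file_id)
    return {token: sorted(ids) for token, ids in index.items()}


def search(index, token: str):
    """Run by the server: look up the pointers for a presented token."""
    return index.get(token, [])
```

A user who holds the key computes make_token(key, "cloud") and sends only that token; a party without the key cannot produce a matching token and learns nothing from the index keys about which keywords they stand for.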
VI. KNAPSACK PROBLEM
The knapsack problem is a well-known class of optimization problems that seeks to maximize the profit of the items placed in a knapsack without exceeding its capacity[1]. The 0-1 Knapsack Problem (0-1 KP) is widely studied because of its real-world applications, such as finding the least wasteful way to cut raw materials, selecting investments and portfolios, and selecting assets for asset-backed securitization. In recent years generalizations of the knapsack problem have been studied and many algorithms have been suggested [12]. Optimization approaches for solving the multi-objective 0-1 Knapsack Problem are among them, and the literature contains numerous papers on the 0-1 Knapsack Problem and on algorithms for solving it. The 0-1 KP is very well known and appears in real life in many different applications. The solution of the 0-1 KP can be viewed as the result of a sequence of decisions [12]. The decision version of the 0-1 KP is NP-complete (NP stands for nondeterministic polynomial time).
The knapsack problem asks for an optimal solution. The 0-1 knapsack problem cannot be solved by the greedy method, because a greedy choice may fail to fill the knapsack to capacity, and the empty space lowers the effective value per pound of the load; before making a choice, we must compare the solution to the subproblem in which the item is included with the solution to the subproblem in which it is excluded. The fractional knapsack problem, in contrast, can be solved by the greedy method. The aim is to fill the knapsack so that the total weight of the selected items does not exceed the capacity of the knapsack and the total profit of the contained objects is maximized. Each item i has a weight and a profit pi, and the knapsack has capacity C. The problem is called the 0-1 problem because each item is either taken or ignored: the value of xi is 1 if the item is placed in the knapsack, and 0 if the item is ignored or not selected.
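The difference between the greedy method and an exact method can be seen in a short sketch. It assumes integer weights and uses the textbook dynamic-programming solution for the 0-1 case alongside a profit-per-weight greedy rule; the function names and the sample instance are illustrative and are not taken from the proposed work.

```python
def knapsack_01(items, capacity):
    """Exact 0-1 knapsack by dynamic programming.

    items is a list of (profit, weight) pairs with positive integer
    weights. Returns (best_profit, x) where x[i] is 1 if item i is taken.
    """
    n = len(items)
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i, (profit, weight) in enumerate(items, start=1):
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]              # exclude item i
            if weight <= c:                           # include item i
                best[i][c] = max(best[i][c],
                                 best[i - 1][c - weight] + profit)
    x, c = [0] * n, capacity
    for i in range(n, 0, -1):                         # trace back choices
        if best[i][c] != best[i - 1][c]:
            x[i - 1] = 1
            c -= items[i - 1][1]
    return best[n][capacity], x


def greedy_01(items, capacity):
    """Greedy by profit/weight ratio, whole items only (not optimal)."""
    total, x = 0, [0] * len(items)
    order = sorted(range(len(items)),
                   key=lambda i: items[i][0] / items[i][1], reverse=True)
    for i in order:
        profit, weight = items[i]
        if weight <= capacity:
            x[i] = 1
            capacity -= weight
            total += profit
    return total, x


def greedy_fractional(items, capacity):
    """Greedy by ratio with fractions allowed (optimal for this variant)."""
    total = 0.0
    for profit, weight in sorted(items, key=lambda it: it[0] / it[1],
                                 reverse=True):
        take = min(weight, capacity)
        total += profit * take / weight
        capacity -= take
        if capacity == 0:
            break
    return total
```

For the items (profit, weight) = (60, 10), (100, 20), (120, 30) and capacity C = 50, the greedy rule on whole items reaches a profit of 160 and leaves 20 units of capacity unused, while the exact solution takes the last two items for a profit of 220; allowing fractions, the greedy rule reaches 240.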
VII. PROPOSED WORK
In this section, we propose the detailed Secure Optimized Multi-Keyword Ranked Search (SOMRS). Since the encrypted documents and the index z are both stored in the blind storage system, we first provide the general construction of the blind storage system.
Start
Implement cloud architecture
Create various data sized blocks
Construct Blind storage and key gen
Avoid knapsack problem using IGreedy approach
Encrypt database storage using generated keys
Apply Role of time domain based access model
Generate tuples for data retrieval by cloud
Retrieve documents
Generate and validate results
Moreover, since the SOMRS aims to eliminate the risk of sharing the key that is used to encrypt the documents with all search users, and to solve the knapsack problem as given in [1], we modify the construction of blind storage and leverage the ciphertext policy attribute-based encryption (CP-ABE) technique in the EMRS. However, the specific construction of CP-ABE is out of the scope of this paper and we only give a simple indication here. The SOMRS consists of the following phases: System Setup, Construction of Blind Storage, Encrypted Database Setup, Trapdoor Generation, Efficient and Secure Search, and Retrieve Documents from Blind Storage.
As described in the flow chart, the cloud architecture is built first. Here we consider 8 virtual machines to store, encrypt and search the data on the server that is requested by the user, as described in fig 4.1.
After the cloud architecture is set up, the next step is to divide the data into blocks so that the data can be encrypted and the blind storage scheme applied.
The main emphasis is then placed on the knapsack problem, which causes the results requested by the user to be inaccurate.
To avoid the knapsack problem, the IGreedy algorithm is used; it sorts the blocks of data along with their indexing so that searching is efficient.
Next, the role-based time domain access model is applied for data privacy and security on the cloud.
Finally, the user retrieves the documents, and the results are generated and validated.
4.5.1 Algorithm
for each keyword ω ∈ W do
    Set t to an empty list
    for each document d_i containing the keyword ω do
        Get the associated vector P of d_i
        Avoid Knapsack problem
        Dsc = ABE_i(id_i || K_i || x)
        Append the tuple (Dsc, P) to t which was provided by greedy extraction
    end for
    z[ω] = t
end for
return z
Where
D is the collection of m documents
C is the collection of n encrypted documents
W is the keyword dictionary of length j
p, P is the relevance vector and its encrypted form
q, Q is the query vector and its encrypted form
w is a conjunctive keyword set for a search request
B is an array of nb blocks of mb bits each
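The control flow of the index construction can be sketched as below. The sketch rests on several loudly labeled assumptions: abe_encrypt is a hypothetical stand-in that provides no security (the paper leaves the CP-ABE construction out of scope), select is a hypothetical hook for the unspecified "avoid knapsack problem / greedy extraction" step and defaults to passing documents through unchanged, and id, key and x are treated as opaque byte strings.

```python
def abe_encrypt(plaintext: bytes, policy: str) -> bytes:
    """HYPOTHETICAL placeholder for CP-ABE encryption.

    It only prefixes the policy so the data flow can be followed;
    it offers no confidentiality whatsoever.
    """
    return policy.encode("utf-8") + b"|" + plaintext


def build_index(dictionary, documents, policy, select=lambda docs: docs):
    """Build the index z: keyword -> list of (descriptor, vector) tuples.

    documents is a list of dicts with the assumed fields
    'id', 'key', 'x' (bytes), 'keywords' (set of str) and 'P' (vector).
    select stands in for the greedy extraction step, whose details
    the text does not specify.
    """
    z = {}
    for keyword in dictionary:
        t = []
        matching = [d for d in documents if keyword in d["keywords"]]
        for doc in select(matching):
            descriptor = abe_encrypt(
                b"||".join([doc["id"], doc["key"], doc["x"]]), policy)
            t.append((descriptor, doc["P"]))
        z[keyword] = t
    return z
```

Each dictionary keyword ends up with one tuple per matching document, mirroring the loop structure of the algorithm; keywords that match no document map to an empty list.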
VIII. RESULTS
Space Utilization
Fig 3: Space Utilization
Efficiency
Fig 4: Efficiency
Efficiency may be defined as the ratio of output to input. From Fig 4 it is clear that the proposed algorithm is more efficient than the existing algorithm. This difference is due to the greedy approach used in the proposed encryption. Efficiency is higher in greedy-based searching because the greedy approach always tends to fully utilize the resources given to it.
Encryption overheads
Fig 5: Encryption Overheads
From the figure it is clear that the encryption overheads are higher for the existing algorithm, because its encryption has no optimization support and therefore does not tend to fully utilize the resource vector given to it.
Cipher Size
Fig 6: Cipher Size
Over Utilization
Fig 7: Over Utilization
From the figure it is clear that over-utilization is higher for the existing algorithm, because its encryption has no optimization support and therefore does not tend to fully utilize the resource vector given to it.
Time Complexity
Fig 8: Time Complexity
Here the time complexity is calculated for the existing algorithm and for the greedy-based proposed algorithm. From the figure it is clear that the greedy-based proposed algorithm performs its operations in less time than the existing algorithm, because the greedy algorithm always tends to perform more operations within a given resource set.
IX. CONCLUSION
This paper focused on various data search techniques across cloud servers. First, the need for cloud data storage and its importance were discussed. The paper then examined the importance of searching over cloud data. After that, we discussed various searching techniques for cloud data. Every technique has its advantages and disadvantages. To overcome the disadvantages, a multi-keyword search scheme with synonym query is used.
REFERENCES
[1] Hongwei Li, Dongxiao Liu, Yuanshun Dai, Tom H. Luan, Xuemin Shen, "Enabling Efficient Multi-Keyword Ranked Search Over Encrypted Mobile Cloud Data Through Blind Storage", IEEE Transactions On Emerging Topics In Computing, ISSN: 2168-6750, Volume 3, No: 1, March 2015, pp: 127-138
[2] K.S.Saravanan, S. Karthika, "A Secure and Dynamic Multi-keyword Ranked Search Scheme over Encrypted Cloud Data", International Journal of Advanced Research in Computer and Communication Engineering, ISSN (Online) 2278-1021, ISSN (Print) 2319 5940, Vol. 5, Issue 2, February 2016, pp: 244-247
[3] Veerraju Gampala, Sreelatha Malempati, "An efficient Multi-Keyword Synonym Ranked Query over Encrypted Cloud Data using BMS Tree", International Journal of Applied Engineering Research, ISSN 0973-4562, Vol: 11, No: 1, 2016, pp: 738-743
[4] Raghavendra S, Geeta C M, Rajkumar Buyya, Venugopal K R, S S Iyengar, L M Patnaik, "DRSMS: Domain
[5] Ajay kumar Narayankar, Gajanan Rathod, Sanket Londhe, Ashish Wankhade, M.A. Ansari, "A Review on Privacy-Preserving Multi Keyword Ranked Search over Encrypted Cloud Data", International Journal of Innovative Research in Science, Engineering and Technology, ISSN(Online) : 2319-8753, ISSN (Print) : 2347-6710, Vol. 5, Issue 3, March 2016, pp: 3532-3537
[6] V. Goutham, B.Shyla Reddy, Krishna Manasa, "An proficient and Confidentiality-Preserving MultiKeyword Ranked Search over Encrypted Cloud Data", International Journal of Computer Applications Technology and Research, ISSN: - 2319-8656, Vol: 5, Issue 7, 2016, pp: 478 – 483
[7] M. Veerabrahma Chary, N. Sujatha, "A Novel Additive Multi-Keyword Search for Multiple Data Owners in Cloud Computing", International Journal of Computer Engineering In Research Trends, ISSN: 2349-7084, Volume 3, Issue 6, June-2016, pp. 308-313
[8] Jeniphar Francis, Ruchika Bansod, Chetna Getme, Priyanka Bagde, "A Secure and Encrypted Cloud Data with Multi-keyword Rank Search and Revocation of User", International Journal of Advanced Research in Computer Science and Software Engineering, ISSN: 2277 128X, Volume 6, Issue 9, September 2016, pp: 367-369
[9] Shikha Rani, Shanky Rani, "Data Security in Cloud Computing Using Various Encryption Techniques", International Journal of Modern Computer Science, ISSN: 2320-7868, Volume 4, Issue 3, June 2016, pp: 163-166
[10] Keerthana G, Prabu S, Swarnalatha P, "An Efficient Data Security in Cloud Computing using Cryptography", International Journal of Advanced Research in Computer Science and Software Engineering, ISSN: 2277 128X, Volume 6, Issue 5, May 2016, pp: 654-660
[11] Gayathri M R, K. Srinivas, "Efficient Keyword Search Techniques over Cloud Data in Cloud Computing", International Journal of Science and Advanced Technology, ISSN 2228386, Volume 5 No 7 July 2015, pp: 1-5
[12] Neha Mahajan, V.M.Barkade, "Survey on Recent Keyword Search Techniques on Outsource Encrypted Big Data", International Journal of Innovative Research in Computer and Communication Engineering, ISSN(Online): 2320-9801 ISSN (Print): 2320-9798, Vol. 5, Issue 1, January 2017, pp: 441-445
[13] Jabeen Akkalkot, S. Shanmug Priya, "A Survey On Keywordbased Search Mechanism For Data Stored In