MULTIKEYWORD RANKED SEARCH OVER CLOUD DATA

Academic year: 2020



T. Nagamani (Assistant Professor), J. Vaishali (BE-CSE), D. Nivedha (BE-CSE)

Bannari Amman Institute of Technology

Abstract-With the advent of cloud computing, data owners are motivated to outsource their complex data management systems from local sites to the commercial public cloud for greater flexibility and economic savings. To protect data privacy, however, sensitive data has to be encrypted before outsourcing, which makes traditional data utilization based on plaintext keyword search obsolete. Thus, enabling an encrypted cloud data search service is of paramount importance. Considering the large number of data users and documents in the cloud, the search service must allow multi-keyword queries and provide result similarity ranking to meet the need for efficient data retrieval. Related works on searchable encryption focus on single keyword search or Boolean keyword search, and rarely rank the search results. We first propose a basic scheme for multi-keyword ranked search over encrypted cloud data (MRSE) using secure inner product computation, and then significantly improve it to meet different privacy requirements under two levels of threat models. The Incremental High Utility Pattern Transaction Frequency Tree (IHUPTF-Tree) is designed according to the transaction frequency (in descending order) of items to obtain a compact tree.

Using high utility patterns, the items can be arranged efficiently. A tree structure is used to sort the items, and the frequent patterns are obtained from the sorted items. The frequent pattern items are retrieved from the database using a hybrid tree (H-Tree) structure. The crucial information is compressed to form a compact data structure, which reduces memory usage and, in turn, search time, so execution becomes faster. Finally, the frequent pattern items that satisfy the threshold value are displayed.

Keywords-Cloud computing, searchable encryption, privacy-preserving, keyword search, ranked search

I. INTRODUCTION

Cloud computing is the long-dreamed vision of computing as a utility, where cloud customers can remotely store their data in the cloud so as to enjoy on-demand, high-quality applications and services from a shared pool of configurable computing resources. Its great flexibility and economic savings are motivating both individuals and enterprises to outsource their local complex data management systems to the cloud. To protect data privacy and combat unwanted accesses in the cloud and beyond, sensitive data, for example, e-mails, personal health records, photo albums, tax documents, and financial transactions, may have to be encrypted by data owners before outsourcing to the commercial public cloud. This, however, makes the traditional data utilization service based on plaintext keyword search obsolete. The trivial solution of downloading all the data and decrypting it locally is clearly impractical, due to the huge bandwidth cost in cloud-scale systems. Moreover, aside from eliminating local storage management, storing data in the cloud serves no purpose unless the data can be easily searched and utilized. Thus, exploring a privacy-preserving and effective search service over encrypted cloud data is of vital importance. Considering the potentially large number of on-demand data users and the huge amount of outsourced documents in the cloud, this problem is particularly challenging, as it is extremely difficult to also meet the requirements of performance, system usability, and scalability.

On the one hand, to meet the need for effective data retrieval, the large number of documents requires the cloud server to perform result relevance ranking instead of returning undifferentiated results. Such a ranked search system enables data users to find the most relevant information quickly, rather than burdensomely sorting through every match in the content collection. Ranked search can also elegantly eliminate unnecessary network traffic by sending back only the most relevant data, which is highly desirable in the “pay-as-you-use” cloud paradigm.

For privacy protection, however, such a ranking operation should not leak any keyword-related information. On the other hand, to improve search result accuracy and to enhance the user's search experience, the ranking system must also support multi-keyword search, as single keyword search often yields far too coarse results. As a common practice indicated by today’s web search engines (e.g., Google search), data users tend to provide a set of keywords rather than a single one to indicate their search interest. Each keyword in the search request helps narrow down the search result further. “Coordinate matching”, i.e., as many matches as possible, is an efficient similarity measure among such multi-keyword semantics to refine the result relevance, and has been widely used in the plaintext information retrieval (IR) community. However, applying it in an encrypted cloud data search system remains a very challenging task because of inherent security and privacy obstacles, including strict requirements such as data privacy, index privacy, keyword privacy, and many others.

In the literature, searchable encryption is a helpful technique that treats encrypted data as documents and allows a user to securely search through a single keyword and retrieve documents of interest. However, direct application of these approaches to a secure large-scale cloud data utilization system is not necessarily suitable, as they are developed as cryptographic primitives and cannot accommodate high service-level requirements such as system usability, user search experience, and easy information discovery. Although some recent designs support Boolean keyword search in an attempt to enrich search flexibility, they are still not adequate to provide users with acceptable result ranking functionality. Our early works were aware of this problem and provide solutions to secure ranked search over encrypted data, but only for queries consisting of a single keyword. How to design an efficient encrypted data search mechanism that supports multi-keyword semantics without privacy breaches remains a challenging open problem.

In this paper, for the first time, we define and solve the problem of multi-keyword ranked search over encrypted cloud data (MRSE) while preserving strict system-wise privacy in the cloud computing paradigm. Among various multi-keyword semantics, we choose the efficient similarity measure of “coordinate matching,” i.e., as many matches as possible, to capture the relevance of data documents to the search query. Specifically, we use “inner product similarity” [6], i.e., the number of query keywords appearing in a document, to quantitatively evaluate the similarity of that document to the search query. During index construction, each document is associated with a binary vector that records which keywords it contains.
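As a plaintext illustration of the similarity measure described above, the following minimal sketch scores documents by the inner product of binary keyword vectors, which equals the number of query keywords each document contains. The function names, the sample dictionary, and the sample documents are assumptions made for illustration; no encryption is applied here.

```python
from typing import Dict, List, Set, Tuple


def to_binary_vector(keywords: Set[str], dictionary: List[str]) -> List[int]:
    """Return a 0/1 vector with one position per dictionary keyword."""
    return [1 if word in keywords else 0 for word in dictionary]


def inner_product(a: List[int], b: List[int]) -> int:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_documents(
    docs: Dict[str, Set[str]],
    query: Set[str],
    dictionary: List[str],
    top_k: int,
) -> List[Tuple[str, int]]:
    """Rank documents by coordinate matching and return the top_k results."""
    query_vector = to_binary_vector(query, dictionary)
    scores = {
        doc_id: inner_product(to_binary_vector(words, dictionary), query_vector)
        for doc_id, words in docs.items()
    }
    # Highest score first; ties are broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Hypothetical sample data, for illustration only.
dictionary = ["cloud", "privacy", "search", "tax"]
docs = {
    "d1": {"cloud", "search"},
    "d2": {"cloud", "privacy", "search"},
    "d3": {"tax"},
}
print(rank_documents(docs, {"cloud", "privacy", "search"}, dictionary, top_k=2))
# → [('d2', 3), ('d1', 2)]
```

Returning only the top-k entries mirrors the point made earlier that ranked search avoids sending back every match.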

II. RELATED WORK

Wei Zhang et al. [2014]: With the advent of cloud computing, it has become increasingly popular for data owners to outsource their data to public cloud servers while allowing data users to retrieve these data. To address privacy concerns, a model for secure search over encrypted cloud data is proposed. To enable cloud servers to perform secure search without knowing the actual data of either keywords or trapdoors, a novel secure search protocol is constructed systematically. To rank the search results and preserve the privacy of relevance scores between keywords and files, the authors propose a novel Additive Order and Privacy Preserving Function family. Extensive experiments on real-world datasets confirm the efficacy and efficiency of the proposed schemes [2].

Jun Xu et al. [2012]: To protect user privacy, sensitive data needs to be encrypted before outsourcing to the cloud, which makes effective data retrieval a very difficult task. A novel order-preserving encryption (OPE) based ranked search scheme over encrypted cloud data is proposed, which uses the encrypted keyword frequency to rank the results and provides accurate results via a two-step ranking strategy. The first step coarsely ranks the documents by coordinate matching, i.e., it classifies the documents according to the number of query terms each document contains. In the second step, for each category obtained in the first step, a fine ranking is performed by adding up the encrypted scores [3].
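The two-step strategy described above can be sketched as follows. This is only an illustration under stated assumptions: the integer scores stand in for order-preserving-encrypted keyword frequencies (actual OPE is not implemented), and the function name and sample data are hypothetical rather than taken from [3].

```python
from typing import Dict, List, Tuple


def two_step_rank(
    docs: Dict[str, Dict[str, int]], query_terms: List[str]
) -> List[Tuple[str, int, int]]:
    """Rank documents coarsely by matched-term count, then finely by summed score.

    docs maps a document id to {term: score}; each score is a stand-in for an
    order-preserving-encrypted keyword frequency (assumption for illustration).
    """
    ranked = []
    for doc_id, term_scores in docs.items():
        matched = [term for term in query_terms if term in term_scores]
        coarse = len(matched)  # step 1: number of query terms in the document
        fine = sum(term_scores[term] for term in matched)  # step 2: summed scores
        if coarse > 0:
            ranked.append((doc_id, coarse, fine))
    # Sort by category first, then by summed score inside each category.
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked


# Hypothetical sample data, for illustration only.
sample = {
    "a": {"cloud": 5, "search": 1},
    "b": {"cloud": 2, "search": 7},
    "c": {"cloud": 50},
}
for row in two_step_rank(sample, ["cloud", "search"]):
    print(row)
```

In this sample, document "c" has the largest single score but matches only one query term, so it is placed after both documents that match two terms.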

Orencik et al. [2012]: Cloud computing technologies become more popular every year, as many organizations outsource their data to use the robust and fast services of clouds while lowering the cost of hardware ownership. Although the benefits are welcome, privacy remains a concern that needs to be addressed. The authors propose an efficient privacy-preserving search method over encrypted cloud data that uses minhash functions. Most work in the literature supports only a single-feature search per query, which reduces effectiveness. One of the main advantages of the proposed method is its support for multi-keyword search in a single query. An effective ranking capability based on the term frequency-inverse document frequency (tf-idf) values of keyword-document pairs is also proposed [4].
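To make the tf-idf ranking idea concrete, here is a minimal plaintext sketch. It assumes one common tf-idf variant (term frequency normalized by document length, multiplied by the natural logarithm of N/df); the exact weighting used in [4] is not given in this paper, and the function name and sample documents are hypothetical.

```python
import math
from typing import Dict, List


def tf_idf_scores(docs: Dict[str, List[str]], query: List[str]) -> Dict[str, float]:
    """Score each document as the sum of tf-idf weights of the query terms."""
    total_docs = len(docs)
    terms = set(query)
    # Document frequency: number of documents containing each query term.
    doc_freq = {t: sum(1 for words in docs.values() if t in words) for t in terms}
    scores = {}
    for doc_id, words in docs.items():
        score = 0.0
        for term in terms:
            if doc_freq[term] == 0 or not words:
                continue  # term absent from the collection, or empty document
            tf = words.count(term) / len(words)
            idf = math.log(total_docs / doc_freq[term])
            score += tf * idf
        scores[doc_id] = score
    return scores


# Hypothetical sample data, for illustration only.
corpus = {
    "d1": ["cloud", "cloud", "search"],
    "d2": ["cloud", "tax"],
    "d3": ["tax", "audit"],
}
scores = tf_idf_scores(corpus, ["cloud", "search"])
print(sorted(scores, key=scores.get, reverse=True))
```

Unlike coordinate matching, this weighting favors terms that are rare across the collection, so a document matching a rare query term can outrank one matching only common terms.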

III. PROPOSED SYSTEM

The proposed system strengthens security by using the AES encryption algorithm to encrypt the data uploaded to the database. Users must register, after which they receive a user ID and password for subsequent logins. As for data privacy, the data owner can resort to traditional symmetric key cryptography to encrypt the data before outsourcing, and thereby prevent the cloud server from prying into the outsourced data. With respect to index privacy, if the cloud server deduces any association between keywords and encrypted documents from the index, it may learn the major subject of a document, or even the content of a short document [6].

Access pattern. Within the ranked search, the access pattern is the sequence of search results, where every search result is a set of documents with rank order. Specifically, the search result for the query keyword set $\widetilde{W}$ is denoted as $F_{\widetilde{W}}$, consisting of the documents ranked by their relevance to $\widetilde{W}$.


[Figure: system flow diagram. Level 0: Login; Key Words; Web page; Login; Information. Level 1: Login Title; Keyword; similarity.]

Fig 4.1: System Flow Diagram

IV. CONCLUSION

In this paper, for the first time, we define and solve the problem of multi-keyword ranked search over encrypted cloud data, and establish a variety of privacy requirements. Among various multi-keyword semantics, we choose the efficient similarity measure of “coordinate matching,” i.e., as many matches as possible, to effectively capture the relevance of outsourced documents to the query keywords, and use “inner product similarity” to quantitatively evaluate this similarity measure. To meet the challenge of supporting multi-keyword semantics without privacy breaches, we propose a basic MRSE scheme using secure inner product computation. We then give two improved MRSE schemes that achieve various stringent privacy requirements in two different threat models. We also investigate further enhancements of our ranked search mechanism, including support for more search semantics, i.e., TF × IDF, and dynamic data operations. A thorough analysis of the privacy and efficiency guarantees of the proposed schemes is given, and experiments on a real-world data set show that the proposed schemes introduce low overhead on both computation and communication.
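The idea of computing an inner product over transformed vectors, mentioned above, can be illustrated with a simplified sketch. It assumes a secret invertible matrix M, an index vector transformed as M^T p, and a query trapdoor transformed as M^{-1} q, so that their inner product equals p · q. This is not the construction of the MRSE schemes: it omits the additional randomization those schemes rely on for privacy, and all names and values below are hypothetical.

```python
import random
from fractions import Fraction
from typing import List, Tuple

Matrix = List[List[Fraction]]


def invert(m: Matrix) -> Matrix:
    """Invert a square matrix by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def keygen(n: int, rng: random.Random) -> Tuple[Matrix, Matrix]:
    """Draw random integer matrices until one is invertible; return (M, M^-1)."""
    while True:
        m = [[Fraction(rng.randint(-5, 5)) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue


def mat_vec(m: Matrix, v: List[int]) -> List[Fraction]:
    return [sum((a * b for a, b in zip(row, v)), Fraction(0)) for row in m]


def build_index(m: Matrix, p: List[int]) -> List[Fraction]:
    """Transformed document vector: M^T p."""
    return mat_vec([list(col) for col in zip(*m)], p)


def trapdoor(m_inv: Matrix, q: List[int]) -> List[Fraction]:
    """Transformed query vector: M^-1 q."""
    return mat_vec(m_inv, q)


def score(index: List[Fraction], td: List[Fraction]) -> Fraction:
    """(M^T p) . (M^-1 q) = p^T M M^-1 q = p . q"""
    return sum((a * b for a, b in zip(index, td)), Fraction(0))


m, m_inv = keygen(4, random.Random(0))
doc_vector, query_vector = [1, 0, 1, 1], [1, 1, 1, 0]
print(score(build_index(m, doc_vector), trapdoor(m_inv, query_vector)))  # → 2
```

The party holding only the transformed vectors can compute the similarity score without being handed the binary vectors directly. Exact fractions are used so that the recovered score matches the plaintext inner product without rounding error.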

In future work, we will explore checking the integrity of the rank order in the search result under the assumption that the cloud server is untrusted. We will also track the IP address of any system that accesses a file.

REFERENCES

[1] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “Privacy-Preserving Multi-Keyword Ranked Search over Encrypted Cloud Data,” Proc. IEEE INFOCOM, pp. 829-837, Apr. 2011.

[2] Wei Zhang, Sheng Xiao, Yaping Lin, Ting Zhou, and Siwang Zhou, “Secure Ranked Multi-keyword Search for Multiple Data Owners in Cloud Computing,” Proc. IEEE INFOCOM, June 2014.

[3] Jun Xu, Weiming Zhang, Ce Yang, Jiajia Xu, and Nenghai Yu, “Two-Step-Ranking Secure Multi-Keyword Search over Encrypted Cloud Data,” Proc. IEEE INFOCOM, Oct. 2012.

[4] Orencik, Kantarcioglu, and Savas, “A Practical and Secure Multi-keyword Search Method over Encrypted Cloud Data,” Proc. IEEE INFOCOM, Apr. 2012.

[5] S. Yu, C. Wang, K. Ren, and W. Lou, “Achieving Secure, Scalable, and Fine-Grained Data Access Control in Cloud Computing,” Proc. IEEE INFOCOM, 2010.

[6] S. Zerr, E. Demidova, D. Olmedilla, W. Nejdl, M. Winslett, and S. Mitra, “Zerber: r-Confidential Indexing for Distributed Documents,” Proc. 11th Int’l Conf. Extending Database Technology (EDBT ’08), pp. 287-298, 2008.

[Figure labels: Admin; Manage Category; User; Search; syn_category. Admin; Category; Add, Edit, Delete; syn_category.]


AUTHORS BIOGRAPHY

First Author: Mrs. T. Nagamani is currently working as an Assistant Professor in the Department of Computer Science and Engineering at Bannari Amman Institute of Technology. She completed her UG and PG degrees at Anna University, Chennai. She has published one paper in an international conference. Her research interest is cloud computing. She has attended three workshops and four seminars. Email: [email protected]

Second Author: Ms. J. Vaishali is currently pursuing a Bachelor of Engineering in the Department of Computer Science and Engineering at Bannari Amman Institute of Technology. Email: [email protected]

Third Author: Ms. D. Nivedha is currently pursuing a Bachelor of Engineering in the Department of Computer Science and Engineering at Bannari Amman Institute of Technology. Email: [email protected]
