Available Online at www.ijpret.com 529
INTERNATIONAL JOURNAL OF PURE AND
APPLIED RESEARCH IN ENGINEERING AND
TECHNOLOGY
A PATH FOR HORIZING YOUR INNOVATIVE WORK
“A RESULT ON SECURE ENTERPRISE SEARCH ENGINE USING CONTENT
FILTERING”
SAGAR TAYADE1, PROF. H. R. DESHMUKH2, PROF. N. S. BAND3
1. Student of Master of Engineering in (CSE), IBSS college of Engineering and Technology, Amravati, India. 2. Head of the Department of (CSE), IBSS College of Engineering and Technology, Amravati, India. 3. Assistant professor Department of (CSE), IBSS College of Engineering and Technology, Amravati, India.
Accepted Date: 05/03/2015; Published Date: 01/05/2015
Abstract: Security is particularly important for enterprise search engines. We propose a bloom filter based index to solve the security problem of enterprise search engines. Our approach maintains a single system-wide index, takes the access privilege into account in the index creation algorithm, and applies the bloom filter algorithm to compress the index. The application is an enterprise search engine in which company employees upload data and other employees search the content; the result set for a query is filtered according to the employee’s access privilege using content filtering techniques. We present a bloom filter based security index creation algorithm and the corresponding query processing and rank algorithms. Experimental results show that our index saves disk space and guarantees both meanings of security for the system at the same time.
Keywords: Bloom Filter; Security Model; Enterprise Search Engine; Access Privilege; Encryption.
Corresponding Author: MR. SAGAR TAYADE
Access Online On:
www.ijpret.com
How to Cite This Article:
Sagar Tayade, IJPRET, 2015; Volume 3 (9): 529-536
1. INTRODUCTION
Nowadays organizations are spread all over the nation, and communication between them is necessary to increase performance. An enterprise search engine plays an important role in accessing documents across organizations, but it may expose highly confidential documents to the wrong employee. A content filtering based search engine can therefore be used to increase security while keeping document access easy.
Security is particularly important for enterprise search engines [1, 2]. The security problem has two meanings. The first is that a user without access privilege to a set of documents cannot get search results from the enterprise search engine that contain any document in the set. The second is that a user cannot infer, from the search results, any information he is not allowed to access.
By using content filtering, an employee cannot abuse the search engine to leak documents. One approach is the filtering based algorithms [3, 4, 5, 6]. They maintain a single system-wide index without considering the access privilege. After getting the search results, the filtering based algorithms filter them based on the user’s access control rights and return only the subset of results that the user can access. The advantage of the filtering based algorithms is that they save disk space.
2. LITERATURE REVIEW & RELATED WORK:
Because all the statistics are calculated at runtime, performance is poor. [4] analyzes the factors affecting the performance of the filtering based approaches and gives some methods to optimize these factors. [5] and [6] propose security indexes, which incorporate the document access information into the index. [6] proposes a structure named Access Control Barrels (ACB). Each ACB represents a unique kind of access privilege. The ACBs are organized into a directed acyclic graph (DAG) according to the access privileges of documents. All documents in the system are stored in different ACBs.
with the access information of the user. By using the index to match the access information of the two parts, the performance of the filtering based approaches can be improved.
But these security indexes cannot guarantee the second meaning of the security problem for the enterprise search engines.
3. Proposed Work & Objective:
3.1 Data Mining:
Data mining refers to extracting, or mining, information from large amounts of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought online. When implemented on high-performance client/server or parallel processing computers, data mining tools can analyze massive databases to answer many questions. The information and knowledge gained can be used for applications ranging from market analysis, fraud detection, and customer retention to production control and science exploration. Data mining plays an important role in online shopping: analyzing subscribers’ data and understanding their behavior supports good decisions that increase customer acquisition and retention, which in turn yields high revenue.
3.2 Decision Engineering:
Decision engineering is a framework that unifies a number of best practices for organizational decision making.
3.3 There are two kinds of approaches to solve the security problem:
3.3.1 Filtering based algorithms:
3.3.2 Desktop search systems:
They create a unique index for each user, which contains all files accessible to the user. The desktop search systems can guarantee both meanings of the security problem.
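The per-user approach can be illustrated with a minimal sketch. The corpus, user names, and function names below are hypothetical and are not taken from the paper; the point is only that a document readable by several users is indexed once per user, which is where the extra disk space goes.

```python
from collections import defaultdict

# Hypothetical corpus: document ID -> (text, users allowed to read it).
SAMPLE_DOCS = {
    "d1": ("budget report", {"alice", "bob"}),
    "d2": ("budget merger", {"alice"}),
}


def build_per_user_indexes(docs):
    """Build one inverted index per user: user -> term -> set of document IDs."""
    indexes = defaultdict(lambda: defaultdict(set))
    for doc_id, (text, allowed_users) in docs.items():
        for user in allowed_users:
            for term in text.lower().split():
                indexes[user][term].add(doc_id)
    return indexes


def posting_count(indexes):
    """Total number of (term, document) postings stored across all user indexes."""
    return sum(len(ids) for idx in indexes.values() for ids in idx.values())


per_user = build_per_user_indexes(SAMPLE_DOCS)
```

Because each user's index holds only that user's documents, neither the results nor any statistics derived from the index depend on inaccessible documents. The cost is duplication: in this toy corpus, `d1` is stored in both users' indexes.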
3.4 Secure Enterprise search Engine has the following phases:
3.4.1 Bloom Filter:
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not.
A bloom filter (BF) is a bit vector with l bits, all of which are initially set to 0. The BF supports the membership test of whether an element x belongs to a finite set S = {x1, x2, ..., xn}. The BF uses a set of k uniform and independent hash functions h1, h2, ..., hk to map the set S to the bit vector: for each element x of S, the k bits at positions hi(x) are set to 1. To check whether an item x ∈ S, we check whether all hi(x) bits are set to 1. If not, x is definitely not a member of S. Otherwise, x is probably a member of S. A false positive occurs when an element x ∉ S has all hi(x) bits set to 1.
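The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the paper does not name its hash functions, so the k functions are simulated here by salting SHA-256 with the function index, and the class and function names are assumptions. The helper `false_positive_rate` uses the standard textbook approximation for a filter of l bits and k hash functions after n insertions.

```python
import hashlib
import math


class BloomFilter:
    """Bit vector of l bits with k hash functions (illustrative sketch)."""

    def __init__(self, l: int, k: int) -> None:
        self.l = l
        self.k = k
        self.bits = [0] * l  # all l bits are initially 0

    def _positions(self, item: str):
        # Assumption: h_1..h_k are simulated by salting SHA-256 with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.l

    def add(self, item: str) -> None:
        # Insert: set the k bits h_i(item) to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in S"; True means "probably in S".
        return all(self.bits[pos] for pos in self._positions(item))


def false_positive_rate(l: int, k: int, n: int) -> float:
    """Standard approximation (1 - e^(-k*n/l))^k after n insertions."""
    return (1.0 - math.exp(-k * n / l)) ** k


bf = BloomFilter(l=1024, k=3)
for term in ["budget", "salary", "roadmap"]:
    bf.add(term)
```

Every inserted term is always reported as present (no false negatives), while a term that was never inserted is reported as present only if all k of its bit positions happen to have been set by other terms.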
3.4.2 Foundation of the bloom filter based index:
The main idea of our index creation algorithm is to maintain a list for each term in the index. Each node in the list stores a user’s ID, the IDF value of the term for the user and some other information. Our index can overcome the shortcoming of the filtering based algorithms and save more disk space than the desktop systems.
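A rough sketch of the per-term lists is shown below. The paper does not give the node layout or the IDF formula, so this assumes idf = log(N_u / df_u), where N_u is the number of documents user u can access and df_u is the number of those documents containing the term. The corpus, user names, and function name are hypothetical, and the bloom filter compression step is not shown.

```python
import math
from collections import defaultdict

# Hypothetical corpus: document ID -> (terms, IDs of users allowed to read it).
TOY_DOCS = {
    "d1": ({"budget", "report"}, {"alice", "bob"}),
    "d2": ({"budget", "merger"}, {"alice"}),
    "d3": ({"report"}, {"alice", "bob"}),
}


def build_term_lists(docs):
    """Return term -> list of (user_id, idf) nodes, one node per user who can see the term."""
    visible_docs = defaultdict(int)                   # user -> accessible document count
    doc_freq = defaultdict(lambda: defaultdict(int))  # term -> user -> accessible docs with term
    for terms, users in docs.values():
        for user in users:
            visible_docs[user] += 1
            for term in terms:
                doc_freq[term][user] += 1

    term_lists = {}
    for term, per_user in doc_freq.items():
        # Assumption: IDF is computed only over documents the user can access.
        term_lists[term] = sorted(
            (user, math.log(visible_docs[user] / df)) for user, df in per_user.items()
        )
    return term_lists


term_index = build_term_lists(TOY_DOCS)
```

In this toy corpus the list for `merger` has no node for `bob`, and the IDF values stored for `bob` are computed without `d2`. Under the stated assumption, a score computed from these values does not vary with documents the user cannot see, whereas a single global IDF would.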
3.5 Algorithms:
3.5.1 AES Algorithm:
3.5.2 Filtering Algorithms:
Filtering based algorithms maintain a single system-wide index without considering the access privilege. After getting the search results, they filter the results based on the user’s access control rights and return only the subset of results that the user can access.
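The filtering step can be sketched as follows. This is an illustrative outline only: the corpus, the access-control representation (a set of user names per document), and the function names are assumptions, not the paper's data model.

```python
from collections import defaultdict

# Hypothetical corpus: document ID -> (text, users allowed to read it).
CORPUS = {
    "d1": ("budget report", {"alice", "bob"}),
    "d2": ("budget merger", {"alice"}),
}


def build_global_index(docs):
    """Single system-wide inverted index; access privileges are ignored here."""
    index = defaultdict(set)
    for doc_id, (text, _allowed) in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def filtered_search(index, docs, user, query):
    """Run the query on the global index, then drop documents the user cannot access."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(d for d in hits if user in docs[d][1])


global_index = build_global_index(CORPUS)
```

For example, `filtered_search(global_index, CORPUS, "bob", "budget")` matches both documents in the index but returns only `d1`, because `bob` is not in the access set of `d2`.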
3.5.3 Rank Algorithm:
Our rank algorithm uses this information to rank the query results in a way that prevents information leakage. Association rule mining [1] is a popular and well-researched method for discovering interesting relations between variables in large databases.
3.6 Modules:
Admin
Branch Admin
Upload
Download
Encryption/Decryption
Communication
Search Engine
Content Filtering
Screenshot 2: Report
Screenshot 3: Communication
Figure: Secure Enterprise Search Engine Using Content Filtering.
4. Application:
1. Company employees use the search engine to access documents; the search engine finds documents matching the entered keyword using SEO (Search Engine Optimization) techniques.
2. Registered company employees can search for any document by keyword; all employees with the specified designation can access the documents.
3. By using content filtering, an employee cannot abuse the search engine to leak documents.
4. Document leakage is prevented by content filtering, so there is no need to worry about document security.
5. CONCLUSION:
A search engine may expose highly confidential documents to the wrong employee; a content filtering based search engine can therefore be used to increase security while keeping document access easy. The result set for a query is filtered according to the employee’s access privilege. Company branch administrators and employees upload documents to the search engine with designation-wise access permissions, and employees can download the documents their access permissions allow. Company employees can also communicate with each other: incoming messages are stored in the inbox and outgoing messages in the outbox. Document leakage is prevented by content filtering, so there is no need to worry about document security.
6. REFERENCES:
1. D. Hawking, Challenges in enterprise search, in: Proc. ADC 2004.
2. D. Zhang, Y.M. Chee, A. Mondal, A.K.H. Tung, M. Kitsuregawa, Keyword search in spatial databases: towards searching by document, in: Proc. ICDE 2009, pp. 688-699.
3. P. Bailey, D. Hawking, B. Matson, Secure search in enterprise webs: tradeoffs in efficient implementation for document level security, in: Proc. CIKM 2006.
5. J. Kasprzak, M. Brandejs, M. Cuhel, Access rights in enterprise full-text search, in: Proc. ICEIS 2010.
6. A. Singh, M. Srivatsa, Efficient and secure search of enterprise file systems, in: Proc. ICWS 2007.
7. Y. Jin, Y. Wen, W. Zhang, Content routing and lookup schemes using global bloom filter for content-delivery-as-a-service, IEEE Systems Journal, vol. 8, no. 1, March 2014.