International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459,ISO 9001:2008 Certified Journal, Volume 3, Issue 12, December 2013)
Development of a Secure XML Data Warehouse: A Practical
Approach
Arjun. P. K¹, M. Mythily², M. L. Valarmathi³
¹Post Graduate Student, ²Assistant Professor, Department of CSE, Karunya Institute of Technology & Science, Coimbatore, India
³Associate Professor, Department of CSE, Govt. College of Technology, Coimbatore, India
Abstract—Data warehouses are repositories of valuable
information that provide a platform to integrate data coming from various sources and thus play a vital part in formulating the future policies of an organization. Because of this importance, security is a criterion that has to be considered right from the initial phase of development. XML is considered a better target platform for this kind of unstructured, multidimensional data because of its flexibility and usability. The practical approach proposed here is an effort to develop a secure XML data warehouse for a railway security system. This technique enabled us to specify the required security parameters as rules in XML format, which manage the information storage and retrieval process. The end result is a set of separate XML files with specified security levels and access rights that together constitute a secure XML data warehouse.
Keywords—Data Warehouse, Data, Model, Secure,
Transformation, XML
I. INTRODUCTION
Data warehouses are repositories of valuable
information. These systems act as storehouses for large amounts of data coming from heterogeneous sources in multiple forms. Every system or organization depends on information to sustain its existence. Today, information storage and processing is a challenging task, because data must not only be kept readily available but must also remain consistent, intact and secure. Various DBMS systems are available to help meet this challenge, but they provide little support for historic data. Historic data gives an organization valuable insight into how to proceed on the basis of its past experience, and plays a crucial role in decision support systems.
A data warehouse provides a platform to integrate data coming from various sources. Data cannot be expected to follow a structured format at all times. For example, before launching a new product, a company might need a thorough analysis of customer feedback or competitor statistics.
Thus it has to deal with information from various sources, such as the web, and in various formats. When dealing with privacy-related data, confidentiality must be maintained. Apart from this, the information stored in data warehouses is sensitive and plays a vital part in formulating the future policies of an organization. Because of this importance, security is a criterion that has to be considered right from the beginning. Traditional data warehouse systems follow an orthodox path in which security is considered only in the later phases of development. Security should instead be integrated in earlier development phases, such as the elicitation or design phase. For systems that deal with historic, unstructured data from heterogeneous sources, the main challenge is how to integrate information that has different security requirements and policies. This calls for a better target platform, such as XML, to manage unstructured, multidimensional data because:
• XML has been adopted as a standard in the e-world and on the web for representing and exchanging data electronically.
• A large part of the external source data for a warehouse comes from the internet, which uses XML as a standard format that facilitates the use and exchange of data.
• It helps to integrate various forms of data under a single platform and enhances interoperability.
• It helps to define user-oriented tags related to the business domain, thereby reducing complexity in representing multidimensional data.
II. DEVELOPMENT STUDY
This methodology considers the model-driven approach shown in Figure 1 and applies it to represent the different fact, dimension and base classes. The information stored in a data warehouse can be structured in a multidimensional format, that is, in a form that comprises facts and dimensions [1]. A fact represents important features or attributes of a (business) process, such as a trip, journey or sale, and a dimension provides the context in which the fact is measured (season, geographical area, time, etc.).
Figure 1: Model driven approach
A practical approach to the development of a secure data warehouse for a railway security system is considered here, with XML as the target platform. A railway security system collects, integrates and manages a repository of information about activities in and around a railway station, which is a vital source of information for departments such as the Counter Terrorism Unit and the Railway Protection Force. These departments provide security to trains and passengers and ensure the smooth functioning of the railways. They need to store a large volume of data in different formats, covering the day-to-day activities inside a station along with passenger, train, luggage and other sensitive information. The approach also helps to identify and incorporate the needs and security configuration of the railway system right from the initial phase of the design. An overview of the development methodology is given in Figure 2 [2], which initially analyzes the security requirements conceptually, without considering the target platform.
Figure 2: Model driven approach
Information related to the railway station was gathered in order to identify the facts, attributes, dimensions and other features that contribute to railway security. Modeling the fact classes and their related dimension classes gives a clear picture of the attributes that have to be considered. As mentioned above, the data warehouse is then implemented as a set of XML files that store multidimensional data in any form, taking into account the security rules, which are themselves specified as XML files.
III. IMPLEMENTATION
The above development paradigm in Figure 2 [3], which
TABLE I
JOURNEY CLASS

Dimension Class      Attributes
1) Passenger         Passenger ID, Name, Address, Ticket, CriminalRecord, RiskIndex, etc.
2) Luggage           Passenger ID, TotalWeight, NoofItems, Suspicious, etc.
3) Train             TrainNo, TrainName, TrainType, Stations, etc.
4) Ticket            PassengerId, Fare, Class, Distance, DateofJourney, TrainNo, etc.
Date (Base Class)    DateCode, Day, week, month, year
Time (Base Class)    Timecode, hour, Min, sec
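The classes in Table I could be sketched in code roughly as follows. This is only an illustrative sketch, not the implementation used in this work: the Python class layout, the field types and the choice of which attributes to include are assumptions, and only a subset of the attributes from Table I is shown.

```python
from dataclasses import dataclass
from datetime import date, time


@dataclass
class Passenger:
    """Dimension class (subset of the Table I attributes)."""
    passenger_id: str
    name: str
    risk_index: int = 0
    criminal_record: bool = False


@dataclass
class Luggage:
    """Dimension class describing a passenger's luggage."""
    passenger_id: str
    total_weight: float
    no_of_items: int
    suspicious: bool = False


@dataclass
class Train:
    """Dimension class describing a train."""
    train_no: str
    train_name: str
    train_type: str


@dataclass
class Journey:
    """Fact class: one journey, linked to its dimension records.

    The base classes Date and Time are represented here by the
    standard-library date and time types (an assumption).
    """
    passenger: Passenger
    luggage: Luggage
    train: Train
    journey_date: date
    journey_time: time
    fare: float
    distance: float
```

Each Journey record then carries its measures (fare, distance) together with references to the dimension records that give it context.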
For example, a fact class called "Journey" can be identified easily; it consists of many attributes, such as ticket, time, date, purpose, class, fare, distance, passenger ID, train ID, luggage and place. The Journey class in Table I is associated with many dimension classes, as listed. Only test data has been considered in developing our system. The security-related information shown in Figure 3 is identified and classified as follows:
• Security levels.
• Security roles.
• Security rules.
Security levels classify data into categories of importance: Top Secret (TS), Secret (S), Confidential (C) and General (G). Identifying security roles helps us to understand the classes of users who access the system for various purposes and with various privileges.
The user profile is classified into two categories, "Staff" and "Passengers". The Staff role is further classified into "Security", "Administration" and "Technicians".
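The security levels and roles described above can be sketched as an ordered enumeration plus a role table. The ordering G < C < S < TS follows the levels of importance listed in the text; the clearance assigned to each role in ROLE_CLEARANCE is a hypothetical mapping chosen for illustration, since no such table is given here, and the can_access helper is likewise an assumed name.

```python
from enum import IntEnum


class SecurityLevel(IntEnum):
    """Security levels, ordered from least to most sensitive."""
    G = 0   # General
    C = 1   # Confidential
    S = 2   # Secret
    TS = 3  # Top Secret


# Hypothetical clearance per role (an assumption for this sketch).
ROLE_CLEARANCE = {
    "Passengers": SecurityLevel.G,
    "Technicians": SecurityLevel.C,
    "Administration": SecurityLevel.S,
    "Security": SecurityLevel.TS,
}


def can_access(role: str, file_level: SecurityLevel) -> bool:
    """Return True if the role's clearance is at least the file's level.

    Unknown roles are denied access.
    """
    clearance = ROLE_CLEARANCE.get(role)
    return clearance is not None and clearance >= file_level
```

Using an ordered enumeration keeps the comparison between a role's clearance and a file's level to a single expression; a rule-specific list of permitted roles is an alternative design.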
The third kind of security information, the security rules, describes the security requirements that govern the storage and manipulation of data. The authorization rules, or access control rules [4], are then specified; they guide the storage of data into the specified XML files and restrict data access to authorized users. For example, all users who have logged in successfully are given full access to all train information, which allows us to store all train details in a separate XML file, say "Train.Xml", with the security level "General (G)". In the same way, a general security level can be specified for each XML file created, such as those for user registration, staff registration or the addition of new trains.
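The train example above might be sketched with Python's standard xml.etree.ElementTree module as follows. The element names (Trains, Train, TrainName, TrainType), the securityLevel attribute and the function names are assumptions made for this sketch; the actual layout of the generated files is not specified in the text, and the sample record used in the usage note is made-up test data.

```python
import xml.etree.ElementTree as ET


def build_train_document(trains, security_level="G"):
    """Build an in-memory XML tree of train records tagged with a security level."""
    root = ET.Element("Trains", {"securityLevel": security_level})
    for train in trains:
        node = ET.SubElement(root, "Train", {"TrainNo": train["TrainNo"]})
        ET.SubElement(node, "TrainName").text = train["TrainName"]
        ET.SubElement(node, "TrainType").text = train["TrainType"]
    return root


def readable_by_logged_in_user(root, logged_in):
    """General (G) files are readable by any user with a successful login."""
    return bool(logged_in) and root.get("securityLevel") == "G"


def save(root, path):
    """Write the tree to disk, for example to the Train.Xml file named in the text."""
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
```

For instance, build_train_document([{"TrainNo": "T001", "TrainName": "Test Express", "TrainType": "Express"}]) yields a tree whose root carries securityLevel="G", and save(root, "Train.Xml") would persist it.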
Figure 4: XML rules specified
It should also be possible to specify a security level of "Confidential" for the XML file created for passengers and a level of "Secret" for the file created for staff, according to requirements.
These rules have to be followed both when storing data entered through the front end and when accessing the stored data.
The details of a new staff member should be added to the staff.xml file and also to the secret.xml file, with a minimum security level of "Secret" unless otherwise specified. It should also be ensured that only roles authorized to access files classified as "Secret" can access the staff.xml file. External rules can also be specified; for example, the security level of a passenger can be raised if the passenger's risk index is greater than a threshold value, say 5, or if the passenger is marked as suspicious, as given below:
• Rule 1: If passenger.RiskIndex > 5 OR passenger.Suspicious = "True" then SL = TS, SR = Security.
• Rule 2: If luggage.TotalWeight > 30 OR luggage.Inspected = "False" then SL = TS, SR = Security, TTE.
Thus the information about this passenger will be stored in the Topsecret.xml file, and only users with Security Role = "Security" can access that information, as per Rule 1. The main effort here was to develop secure XML files that act as a repository for heterogeneous railway-related information. A set of test rules in XML format was generated and given as input to the system initially, as in Figure 4. The rules are specified by taking into account the attributes, and their values, of the different fact and dimension classes. Although the implementation was carried out with test values, XML files were created separately for each dimension class and security level, such as train.xml, staff.xml, passenger.xml, secret.xml and confidential.xml. A snapshot of the XML files created as output is shown in Figure 5.
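Rule 1 and Rule 2 above can be sketched as data-driven checks that decide the security level (SL), the permitted security roles (SR) and the target file for a record. This is a minimal sketch under stated assumptions: the rules are held in a Python list rather than parsed from the XML rule file, the function names classify and target_file are hypothetical, and the default level "C" for records that trigger no rule is assumed from the "Confidential" level mentioned for passenger files.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
# (class name, condition, security level SL, security roles SR)
Rule = Tuple[str, Callable[[Record], bool], str, Tuple[str, ...]]

RULES: List[Rule] = [
    # Rule 1: risky or suspicious passengers are raised to Top Secret.
    ("passenger",
     lambda r: r.get("RiskIndex", 0) > 5 or r.get("Suspicious") is True,
     "TS", ("Security",)),
    # Rule 2: heavy or uninspected luggage is raised to Top Secret.
    ("luggage",
     lambda r: r.get("TotalWeight", 0) > 30 or r.get("Inspected") is False,
     "TS", ("Security", "TTE")),
]

# File names per security level, as named in the text.
LEVEL_FILES = {"TS": "Topsecret.xml", "S": "secret.xml", "C": "confidential.xml"}


def classify(kind: str, record: Record, default_level: str = "C"):
    """Return (security level, permitted roles) for a record of the given class.

    The first matching rule wins; if none matches, the default level is
    returned with no rule-specific roles.
    """
    for rule_kind, condition, level, roles in RULES:
        if rule_kind == kind and condition(record):
            return level, roles
    return default_level, ()


def target_file(level: str) -> str:
    """Map a security level to the XML file that stores records at that level."""
    return LEVEL_FILES[level]
```

A passenger record with RiskIndex 7 would thus be classified as TS and routed to Topsecret.xml, readable only by the Security role, while a record that triggers no rule keeps the assumed default level.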
IV. CONCLUSION
A model-driven approach to developing a secure XML data warehouse for a railway system has been implemented, in which security-related requirements and information are collected and analyzed right from the initial stage. It helps to clearly identify and model the fact and dimension classes of the railway system along with their attributes. This technique enabled us to specify the required security parameters as rules in XML format, which manage the information storage and retrieval process. The end result is a set of separate XML files with specified security levels and access rights that together constitute a secure XML data warehouse. The static implementation used in this methodology can be enhanced in future by including dynamic model transformation, in which model transformations are carried out automatically.
REFERENCES
[1] B. Vela, C. Blanco, E. Fernandez-Medina, E. Marcos, A practical application of our MDD approach for modeling secure XML data warehouses, Decision Support Systems 52 (2012) 26.
[2] S. Luján-Mora, J. Trujillo, I.-Y. Song, A UML profile for multidimensional modeling in data warehouses, Data & Knowledge Engineering 59 (2006) 725–769.
[3] B. Vela, J.N. Mazon, C. Blanco, E. Fernandez-Medina, J. Trujillo, E. Marcos, Development of Secure XML Data Warehouses with QVT, Information and Software Technology (2013).
[4] E. Fernandez-Medina, M. Piattini, Designing secure databases, Information and Software Technology 47 (2005) 463–477.
[5] E. Soler, J. Trujillo, E. Fernandez-Medina, M. Piattini, Building a