the associated permanent storage media were very limited in capacity and directly attached to the computers. Data was also accessible to only a limited number of users, so the threat to data security was not high. Today, computer technology has matured: it provides superior processing power, huge storage capacity, and networking from anywhere, while demand for data storage grows exponentially. Users spread across geographies are given access to data storage devices, which has popularized NAS (Network Attached Storage) and SAN (Storage Area Network) technologies. Sensitive data traveling through worldwide networks, or stored on devices within those networks, is at risk of falling into the wrong hands, especially when it rests on storage media and devices. This paper describes the security threats to data-in-motion and data-at-rest, focusing mainly on encrypting data at rest. It covers basic
About the Author
Suraj Kumar Bhatnagar has been working with TCS since 2001 and is currently part of the Storage COE, High-Tech Practice. He holds a Bachelor’s degree in Computer Engineering from G. B. Pant University of Agriculture and Technology and a Master’s degree in Computer Science and Technology from the Indian Institute of Technology, Kanpur. Suraj has worked on systems software design and development in storage technology across several engagements with TCS. His areas of expertise are NAS, CIFS, NFS, file systems, and protocol stack development for IP and ATM networks.
Table of Contents
1. Introduction
2. The Need For Data Security
3. Securing Data
4. Layering Data Security
5. Some encryption algorithms
6. Conclusion
Introduction
Data storage has come a long way since the days of early computers. The volume of data then was negligible compared to today, and a few disks or tapes were sufficient to store it. Since most computers were standalone and only their own users had access to the data, security was not a big concern.
All this changed when computers became linked in networks. What started as small dedicated networks soon took the form of large LANs, WANs, and the World Wide Web. With the rapid growth of networking came a voluminous increase in data flow, and the security of data became a major issue. Storage technologies such as NAS, SAN, FAN, IP SAN, and virtualization have made data accessible to thousands of users across the world without their knowing where the data is physically stored. Data passes through various networks, communication protocols, and devices before it reaches the user, so data security has become increasingly important. For companies that earn their livelihood from data management, protecting data is of paramount importance. Data security is threatened in two scenarios: when data is in motion (being transferred) and when it is at rest.
Data-in-motion
Various standards are used to secure data-in-motion, such as SSL (Secure Sockets Layer), TLS (Transport Layer Security), and IPsec (Internet Protocol Security), using a combination of algorithms such as RSA, RC4, DES, and Diffie-Hellman.
Data-at-rest
Data-at-rest can be secured by providing two levels of security: controlling access to the data through access control, and encryption.
The Need For Data Security
Revenue is the yardstick for measuring everything in today’s business world. With large corporations relying on storage networks to safeguard their valuable data, a lack of security makes the storage network environment unreliable, unstable, and unavailable, which ultimately leads to loss of revenue. Storage networks must be made reliable and stable in order to support business operations. Systems that depend on storage system elements, such as databases, Web servers, and email servers, require a stable environment. Security measures increase the stability of an environment by ensuring that the network components that make up the storage environment continue to perform in both normal and abnormal conditions.
Availability is the first and foremost issue in supporting a business. Downtime can equate to loss of revenue and/or loss of production. By making the storage more secure, companies can reduce potential downtime due to unauthorized access attempts, malicious code, and other issues.
Securing Data
The best way to secure data is to restrict access to it, which is achieved through authentication and authorization. A user should be asked for authenticating information before accessing the data and should be allowed to perform only the operations for which access rights have been granted.
If the data to be accessed is on a local machine, applying access control is easy: the file system of the local machine takes care of it. If data is accessed from a remote client using protocols such as NFS, CIFS, HTTP, or FTP, however, user credentials and data need to be secured on the network. In such cases, security protocols such as SSL (Secure Sockets Layer), TLS (Transport Layer Security), and IPsec (Internet Protocol Security) are used.
If a malicious user somehow breaches these security provisions and gains access to the data, the remaining defense is to scramble the data. Encrypting the data, whether it is in motion or at rest, is therefore the next level of security, and it makes the data worthless to the attacker.
Access control
Access control is achieved by means of authentication and authorization.
Authentication is used to verify the identity of an entity and Authorization is used to determine which rights to grant to an authenticated entity.
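The distinction between the two steps can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the in-memory stores, the function names, and the choice of PBKDF2 for password hashing are all assumptions introduced here.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stores, used only for this sketch.
_USERS = {}  # username -> (salt, password hash)
_ACL = {}    # (username, resource) -> set of permitted operations

def register(username, password):
    """Store a salted PBKDF2 hash instead of the password itself."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    _USERS[username] = (salt, digest)

def authenticate(username, password):
    """Authentication: verify the identity claimed by the caller."""
    record = _USERS.get(username)
    if record is None:
        return False
    salt, expected = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, expected)

def grant(username, resource, operation):
    """Record that a user may perform an operation on a resource."""
    _ACL.setdefault((username, resource), set()).add(operation)

def authorize(username, resource, operation):
    """Authorization: check which rights the identified user holds."""
    return operation in _ACL.get((username, resource), set())

def access(username, password, resource, operation):
    """Permit an operation only if both checks succeed."""
    return authenticate(username, password) and authorize(username, resource, operation)
```

A user who presents the right password but asks for an operation that was never granted is authenticated yet not authorized, so `access` returns `False`.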
Encryption
Encryption scrambles the data so that it can be read only with the appropriate credentials or keys. It falls into two categories: encryption of “data-at-rest” and encryption of “data-in-motion”.
Encrypting data-in-motion
Encrypting “data-in-motion” hides information as data moves across the network, from the storage to the servers or back. This type of encryption has several standards, such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), and Internet Protocol Security (IPsec). Most database vendors have adopted the SSL standard and include the ability to send traffic between the client and the database over an SSL tunnel using some combination of the RSA, RC4, DES, and Diffie-Hellman algorithms.
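As a hedged illustration of protecting data-in-motion from the client side, the sketch below builds a TLS context with Python's standard `ssl` module. The function name is an assumption made for this example. SSL 3.0 and RC4 have since been deprecated, so the sketch restricts the connection to TLS 1.2 or later.

```python
import ssl

def make_client_context():
    """Build a client-side TLS context that verifies the server."""
    # The default context enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse legacy protocol versions (SSL 3.0, TLS 1.0, TLS 1.1).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

context = make_client_context()
```

A client would then wrap an ordinary TCP socket with `context.wrap_socket(sock, server_hostname=...)`, passing the name of the server it expects to reach, before sending any credentials or data.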
Encryption of data-at-rest
Encrypting “data-at-rest” secures the information stored in the database. Encrypting “data-in-motion” does nothing to protect data that is attacked at the end points, and most attacks occur against the end points, where data sits for long periods of time, rather than against data-in-motion. We therefore find ourselves in an uncomfortable situation: encryption of data-in-motion is already widely adopted, but even the most “security-conscious” database administrators have not adopted encryption of data-at-rest.
The aim of encrypting data-at-rest is to protect the data when it rests as files in file systems, as tables in databases, or as raw blocks in a SAN environment.
The next section, Layering Data Security, explains encryption of data by an application at the application layer, encryption of files or database tables at the file/record layer, and encryption of blocks at the block layer.
Layering Data Security
Data-at-rest can be secured by encrypting it at various levels. This section explains encryption at the application, file/record, and block layers.
Application-based encryption
In this type of encryption, data such as files and directories is encrypted individually at the discretion of the end user. A separate suite of applications can be developed to encrypt and decrypt data as and when required by the user. Although such software affords a high degree of flexibility in choosing the exact files to be encrypted, the process is non-transparent and cumbersome. The initiative and decision to encrypt data, as well as key management, are left to the end user. Application-based measures can require extensive coding changes, create inconsistencies across systems, and produce ongoing maintenance headaches.
File/Record based encryption
Encrypting at File System level
Managing cryptography at the file subsystem layer of the operating system brings several advantages such as transparency to users and applications, flexibility of key management and access control, good performance, and immunity from an array of attacks. Separate keys may be used to protect different file system objects that may be shared with other users on an individual basis. Some encrypting file systems are given below.
Cryptographic File System (CFS)
The CFS was the first encrypting file-system for UNIX. It is implemented as an NFS server that introduces a cryptographic layer between the virtual file system and the disk. The end user is required to manually attach an encrypted volume before using it to read or write files. Key management in the Cryptographic File System is fairly basic and uses a common pass phrase-derived mount-wide key.
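The idea of a passphrase-derived, mount-wide key can be sketched as follows. CFS's own derivation procedure is not described here, so PBKDF2-HMAC-SHA256 is used as a stand-in assumption; the function name, salt handling, and iteration count are likewise illustrative.

```python
import hashlib
import secrets

def derive_volume_key(passphrase, salt, iterations=200_000):
    """Stretch a passphrase into a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )

# The salt is stored alongside the volume and does not need to be secret.
salt = secrets.token_bytes(16)
key = derive_volume_key("correct horse battery staple", salt)
```

Because a single key protects the whole mounted volume, anyone who knows the passphrase can read every file on it, which is the limitation that per-file key schemes address.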
Figure 1: Layering Data Security. The figure maps three layers to encryption approaches: the application layer (application-based encryption), the file/record layer (database and file system; file/record-based encryption), and the block layer (block aggregation at host, network, or device; host-based, appliance-based, or device-based block encryption).
Windows Encrypting File System
Microsoft Windows provides a native Windows Encrypting File System that is tightly integrated with the NTFS file system. Its key management scheme uses different keys for different files and associates a public and private key pair with all users. This enables finer access control and provides greater flexibility to end users when sharing protected data.
Dm-crypt
dm-crypt is one of the most widely used encryption mechanisms for Linux. Strictly speaking, it is a device-mapper target that encrypts whole block devices rather than a file system, but it is included here because it is typically used to protect mounted volumes. The native kernel CryptoAPI provides the encryption and decryption routines. It is a performance-efficient implementation and part of the standard Linux kernel. However, it lacks flexibility because it uses a common mount-wide key, and it addresses a narrow threat model. Sharing specific files with specific users in large organizations is therefore left unresolved by dm-crypt. This limitation makes dm-crypt suitable for most personal use but not for enterprise deployment.
eCryptfs
eCryptfs is the first attempt at designing an enterprise-class cryptographic file system for Linux. It provides an advanced key management scheme using per-file keys and user-specific keys.
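A per-file key scheme can be illustrated with a small sketch: each file has its own random key, and that key is stored once per authorized user, wrapped under the user's key. The XOR-based wrapping below is a toy construction assumed purely for illustration; it is not how eCryptfs wraps keys, and a real system would use a vetted key-wrap algorithm or public-key encryption. All names are hypothetical.

```python
import hashlib
import hmac
import secrets

def _pad(user_key, file_id):
    # Pad specific to one (user, file) pair, derived with HMAC-SHA256.
    return hmac.new(user_key, file_id.encode(), hashlib.sha256).digest()

def wrap_key(user_key, file_id, file_key):
    """Toy key wrap: XOR the 32-byte file key with a user-specific pad."""
    return bytes(a ^ b for a, b in zip(file_key, _pad(user_key, file_id)))

unwrap_key = wrap_key  # XOR is its own inverse

# Each file gets its own random key; each user holds a separate key.
file_key = secrets.token_bytes(32)
alice_key = secrets.token_bytes(32)
bob_key = secrets.token_bytes(32)

# Sharing "report.txt" with Alice and Bob stores one wrapped copy per user.
wrapped = {
    "alice": wrap_key(alice_key, "report.txt", file_key),
    "bob": wrap_key(bob_key, "report.txt", file_key),
}
```

Granting or revoking access to a single file then means adding or removing one wrapped key, without re-encrypting other files or sharing a volume-wide secret.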
Encrypting at database level
Encrypting data at the file system level encrypts files and folders. A database stores its tables as files on top of a file system and can rely on the file system to encrypt those files. This strategy has several weaknesses. Individual pieces of data cannot be encrypted selectively: the entire file is encrypted, which means all of the data is encrypted. This causes serious performance problems when reading from the database. Every time data is read, it must be decrypted whether or not it really needs to be secured, which adds significant overhead to every action performed against the database.
Another weakness of encryption at the file system level is that different pieces of data cannot be encrypted with different keys. Imagine a database shared by two or more departments within an organization. Each department needs access to some columns that are restricted from the other. This cannot be achieved with file-level encryption, because operating system file encryption encrypts the entire file, not sections of it.
Instead, the data in the tables can be encrypted column by column, with a key attached to each column, so that sections of a table are protected even when multiple departments use the same table. Moreover, the requirements for encryption can be analyzed column by column, and encryption applied only to the columns that hold sensitive data. For example, in a table of customer records containing customer ID, name, address, and credit card number, only the last column is sensitive, so only the credit card column should be encrypted for better read performance.
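The customer table example can be sketched with Python's built-in `sqlite3` module. Only the credit card column is encrypted, under its own key. The cipher here is a toy HMAC-based keystream written for this illustration because the standard library has no block cipher; it is an assumption, not a recommendation, and a real deployment would use the database's native column encryption or a vetted authenticated cipher.

```python
import hashlib
import hmac
import secrets
import sqlite3

def _keystream(key, nonce, length):
    # Toy keystream: HMAC-SHA256 over nonce || counter (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(key, plaintext):
    """Prefix a fresh 16-byte nonce, then XOR the data with the keystream."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

card_key = secrets.token_bytes(32)  # key attached to the credit_card column only

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE customer (id INTEGER, name TEXT, address TEXT, credit_card BLOB)"
)
db.execute(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    (1, "A. Customer", "1 Main Street", encrypt(card_key, b"4111111111111111")),
)

# Non-sensitive columns are read without any decryption cost.
name, stored_card = db.execute(
    "SELECT name, credit_card FROM customer WHERE id = 1"
).fetchone()
card = decrypt(card_key, stored_card)
```

A department without `card_key` can still query names and addresses, while the card number remains unreadable to it.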
Block based encryption
Host-based encryption
With host-based (or server-based) encryption, data is encrypted at the moment it is created, which provides the highest level of data security of the methods described here. Since data is encrypted at creation, it never leaves the host unencrypted; if it is intercepted, encryption renders it unreadable and worthless. Host-based encryption is highly secure and well suited to active data files. However, its implementation requires changes to the existing operating infrastructure, and encrypted data cannot be compressed effectively at the storage end. The main drawback of this approach is the additional computation power needed at the host to encrypt and decrypt the data. Another drawback is the overall cost of regularly maintaining encryption software on the host.
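The compression limitation is easy to demonstrate. In the sketch below, random bytes from `os.urandom` stand in for well-encrypted output (an assumption made so the example needs no cipher library), and `zlib` plays the role of the compressor at the storage end.

```python
import os
import zlib

plaintext = b"customer record; " * 4096       # highly repetitive data
ciphertext_like = os.urandom(len(plaintext))  # stand-in for encrypted output

compressed_plain = zlib.compress(plaintext)
compressed_cipher = zlib.compress(ciphertext_like)

# Ratio of compressed size to original size (smaller is better).
plain_ratio = len(compressed_plain) / len(plaintext)
cipher_ratio = len(compressed_cipher) / len(ciphertext_like)
```

The repetitive plaintext shrinks to a small fraction of its size, while the random-looking data does not shrink at all. If both compression and host-based encryption are wanted, the data therefore has to be compressed on the host before it is encrypted.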
Appliance-based encryption
In appliance-based encryption, data is encrypted while being transported from the creation point to its destination. This method protects data at the network level, implementing security features on LAN-connected or SAN-connected encryption appliances or switches.
Data leaves the host unencrypted and then passes through a dedicated appliance, where it is encrypted. After encryption, it enters the LAN or a storage device. The approach is simple to install and requires no changes to the existing data infrastructure, which makes it well suited as a quick, localized encryption solution. However, it is a costly option, requiring a dedicated appliance for every two to six storage devices, and it is the least scalable of the three methods: it works well as an immediate fix but grows more expensive and more difficult to manage as data volume increases.
Device-based encryption
Data can be encrypted on a disk controller or dedicated storage server, which makes it easy to validate and eliminates the performance penalty on the server. This method is easy to implement and is a good fit for mixed environments with a variety of operating systems. Device-based encryption supports data compression. Since the storage devices handle the encryption task, no changes are required to the rest of the existing data infrastructure. Decryption code is built into the data storage container, so there is no need to maintain decryption software specifically for archived data.
Although device-based encryption is easy to implement, cost-effective, and best suited to static and archived data, it is not very secure, because the data is transmitted unencrypted until it reaches the storage device. Moreover, existing storage devices need to be replaced to support the technology.
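The security difference between the three methods comes down to how far data travels before it is encrypted. The sketch below encodes that comparison using a simplified two-link path from host to storage; the model and all names in it are assumptions made for illustration.

```python
# Links that data crosses between the host and the storage device
# (a simplified two-link model assumed for this sketch).
LINKS = ["host to appliance or switch", "appliance or switch to storage device"]

# Number of leading links crossed before each method encrypts the data.
LINKS_BEFORE_ENCRYPTION = {
    "host-based": 0,       # encrypted at creation on the host
    "appliance-based": 1,  # encrypted on arrival at the appliance
    "device-based": 2,     # encrypted only at the storage device
}

def cleartext_links(method):
    """Return the links over which data travels unencrypted."""
    return LINKS[: LINKS_BEFORE_ENCRYPTION[method]]

exposure = {method: cleartext_links(method) for method in LINKS_BEFORE_ENCRYPTION}
```

Host-based encryption exposes no link, appliance-based encryption exposes the first link, and device-based encryption exposes both.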
Figure: Block-based encryption options. Host-based encryption takes place at the host server, appliance-based encryption at an encryption appliance or switch, and device-based encryption at the disk array or tape library.
Some encryption algorithms
Some encryption algorithms are explained below.
DES
This algorithm was developed by IBM for protecting computer data against possible theft or unauthorized access. DES is now considered to be insecure for many applications; this is mainly due to the 56-bit key size being too small.
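The weakness of a 56-bit key can be made concrete with some arithmetic. The attacker speed below is a hypothetical figure chosen only to make the numbers tangible; dedicated hardware can be much faster.

```python
DES_KEY_BITS = 56
keys = 2 ** DES_KEY_BITS  # size of the DES key space

# Hypothetical attacker speed, chosen only to make the arithmetic concrete.
KEYS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 86_400

worst_case_days = keys / KEYS_PER_SECOND / SECONDS_PER_DAY
average_days = worst_case_days / 2  # on average half the key space is searched

# How many times larger a 128-bit key space is than the DES key space.
factor_128 = 2 ** (128 - DES_KEY_BITS)
```

At this assumed rate a single machine exhausts the key space in under three years, and the search parallelizes trivially across machines. A 128-bit key space is larger by a factor of 2^72, which puts the same brute-force approach out of reach.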
TripleDES
This algorithm is a variation of DES. It takes a 192-bit key (24 bytes) as input and breaks it into three 64-bit DES keys; because DES ignores one parity bit in every key byte, the effective key length is 168 bits. First, DES is used to encrypt the data with the first key, then the result is decrypted with the second key, and finally DES is used to encrypt the result again with the third key.
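The encrypt-decrypt-encrypt sequence can be written compactly. The notation is introduced here for illustration: P is a plaintext block, C is the ciphertext block, and E_K and D_K denote single DES encryption and decryption under key K.

```latex
C = E_{K_3}\bigl(D_{K_2}\bigl(E_{K_1}(P)\bigr)\bigr),
\qquad
P = D_{K_1}\bigl(E_{K_2}\bigl(D_{K_3}(C)\bigr)\bigr)

\text{stored key: } 3 \times 64 = 192 \text{ bits},
\qquad
\text{effective key: } 3 \times 56 = 168 \text{ bits}
```

The middle step is a decryption rather than a second encryption so that setting K_1 = K_2 = K_3 reduces the whole operation to single DES, which keeps the scheme compatible with existing DES systems.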
Skipjack
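This algorithm was developed by the U.S. National Security Agency (NSA). It uses an 80-bit key to encrypt or decrypt 64-bit data blocks and was designed as a replacement for DES. It has been extensively cryptanalyzed since its declassification, and no practical attack on the full cipher has been published, although its 80-bit key is short by current standards.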
Blowfish
This algorithm was designed in 1993 by Bruce Schneier. It uses a variable-length key of 32 to 448 bits and was intended as a fast, free alternative to existing algorithms such as DES.
Rijndael
This algorithm was designed by Joan Daemen and Vincent Rijmen and was selected for the Advanced Encryption Standard (AES). It is highly secure and has undergone extensive cryptanalysis.
Twofish
This algorithm is Counterpane Systems’ candidate for the AES. It is a symmetric key block cipher with a block size of 128 bits and key sizes up to 256 bits. It is designed to be highly secure and highly flexible.
MARS
This algorithm was designed by IBM as a candidate for the AES. It uses a 128-bit block size and a variable key size of between 128 and 448 bits.
Serpent
This algorithm was designed by Ross Anderson, Eli Biham, and Lars Knudsen and was a candidate for the AES. It supports key sizes of 128, 192, or 256 bits.
RC6
Conclusion
Understanding the need to secure your data is the first step towards securing it. Today every detail, from personal information to corporate secrets, exists in the form of data. To the computers and networks that store and transfer it, this data is just numbers; it is for us to realize the damage it can do if it falls into the hands of an unscrupulous person. Whether the data is on your laptop, your desktop, or an organization’s storage network, it must be secured and must not fall into the hands of an unauthorized entity.
A proper access control mechanism should be enforced to secure the data. While in motion, data should be well protected: it is advisable to encrypt the data before putting it on a network, even if it passes through a secure channel.
Data on laptops, desktops, and NAS appliances can be encrypted at the file level as well as the block level. Encrypting at the file system level provides robust security: files are encrypted using per-file keys in combination with each user’s unique private key. Data held in a database should be encrypted at the column level; encrypting every column of a table is not advisable, as it decreases performance. A NAS appliance presents disk space to users as file systems and can therefore support encryption at the file system level.
If encryption at the file system level cannot be achieved, it is worth encrypting the data at the block level before writing it to disk. This can be done by a volume manager or RAID controller.
In a SAN environment, data can be encrypted at the source where it is created, by a dedicated appliance between the source server and the storage appliance, or by the storage appliance at the block level. If storage is virtualized at the switch level, it is advisable to encrypt the data with a dedicated appliance attached to the switch. If virtualization is provided at the storage level, either the SAN appliance itself or a dedicated encryption appliance placed in front of it can be used to encrypt the data.
The most important places to encrypt data are the backup appliance and backup media, whether a virtual or a physical tape library. If data has not been encrypted at the application or file system level and arrives at the tape library unencrypted, it should be encrypted before it is written to virtual or physical tapes. Backup media are the most vulnerable to theft, because data rests there for long periods of time.
Organizations with sensitive data must encrypt it at every stage of its lifecycle, whether it is on a production server at the application, file system, or database layer, or at the storage layer, which includes primary, secondary, and tertiary storage. Organizations need to choose carefully where to encrypt the data, based on where sensitive data is managed or used.
All content / information present here is the exclusive property of Tata Consultancy Services Limited (TCS). The content / information contained here is correct at the time of publishing. No material from here may be copied, modified, reproduced, republished, uploaded, transmitted, posted or distributed in any form without prior written permission from TCS. Unauthorized use of the content / information appearing here may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties.
Copyright © 2007 Tata Consultancy Services Limited