STUDY GUIDE
Data Storage Technologies
Ramūnas MARKAUSKAS
Data Storage Technologies Study Guide
Cycle: 1st level
Study program: Information Technologies
Course unit code: ITDST
Awarding institution: Department of Computer Science II, Faculty of Mathematics and Informatics
Preparation of the study guide was supported by the project "Increasing Internationality in Study Programs of the Department of Computer Science II", project number VP1–2.2–ŠMM-07-K-02-070, funded by the European Social Fund Agency and the Government of Lithuania.
Contents
Abstract
Assessment strategy
Content of the course
Literature
Exam test
Project cases
Problem solving
Study Guide: Data Storage Technologies
Page 4 of 9
Abstract
This course is intended for students who wish or need to understand the different types of storage technologies, their architectures and current technological trends. The information provided in this course is essential for IT system administrators and is key knowledge for data storage administrators, but it can also be applied in everyday life when dealing with personal storage.
The course is delivered by means of inclusive lectures, problem solving in class and individually, individual analysis of literature, presentations, project work, case analysis, data interpretation and consulting.
Assessment strategy
The assessment of the course consists of:
1. Exam test for a maximum of 4 points (5 open questions at 0.4 points each and 10 multiple-choice questions at 0.2 points each). Assessment criteria are based on the correctness of the answers. Deadline: during the exam session;
2. Project presentation / defence for a maximum of 3 points. The project is done in groups of two students. Assessment criteria are based on logical reasoning and conformity to technical requirements (up to 80%); level of presentation and oratory (up to 10%); style of presentation (up to 10%). Deadline: by the 14th lecture;
3. Class / homework presentation / defence for a maximum of 3 points (5 problems at 0.6 points each). Assessment criteria are based on one's ability to explain the logic of the solution (up to 40%) and the connection between variables and parameters of technical equipment (up to 40%); correct arithmetic operations (up to 20%). Deadline: by the 9th lecture;
4. Additionally, to earn 3 extra points, one can present / defend the laboratory work "Installation and Configuration of Virtual Data Storage". Assessment criteria are based on the fact of installation of virtual data storage (explanation up to 10%), configuration of Back-End components (fact up to 15%, explanation up to 15%), configuration of Front-End components and access (fact up to 20%, explanation up to 20%), and demonstration of access (up to 20%). Deadline: by the end of the course.
Content of the course
1. Introduction to Storage technologies [1: chapter 1]
a. Information Storage
b. Evolution of storage technology and architecture
c. Key challenges
d. Information Lifecycle
2. Storage system environment [1: chapter 2]
a. Components of storage system
b. Disk drive components
c. Disk drive performance
d. Fundamental laws for drive performance
e. Logic components of the host
d. RAID comparison
e. RAID impact on disk performance
4. Intelligent storage systems [1: chapter 4]
a. Components
b. Intelligent storage array
5. DAS and SCSI [1: chapter 5]
a. Types of DAS
b. DAS benefits and limits
c. Disk drives interfaces
d. Introduction to parallel SCSI
e. SCSI command model
6. NAS [1: chapter 7]
a. General purpose servers versus NAS devices
b. Benefits of NAS
c. NAS File I/O
d. Components of NAS
e. NAS implementations
f. NAS file sharing protocols (NFS, CIFS)
g. NAS I/O operations
h. NAS performance and availability
7. SAN [1: chapter 6]
a. Overview of Fiber Channel
b. SAN and its evolution
c. Components of SAN
d. FC connectivity
e. FC ports
f. FC architecture
g. Zoning
h. FC login types
i. FC topologies
8. IP SAN [1: chapter 8]
a. Components of iSCSI
b. iSCSI host connectivity
c. iSCSI protocol stack
d. iSCSI names and sessions
e. iSCSI error handling and security
f. FCIP
9. CAS [1: chapter 9]
a. Fixed content and archives
b. Types of archives
c. Benefits of CAS
d. CAS architecture
e. Object storage and retrieval in CAS
f. CAS examples
10. Storage, server virtualization, real world examples [1: chapter 10]
a. Forms of virtualization
b. SNIA virtualization taxonomy
e. Types of storage virtualization
11. Business continuity, Backup and recovery [1: chapter 11, 12]
a. Backup purpose
b. Backup considerations
c. Granularity
d. Recovery considerations
e. Backup methods and process
f. Backup and restore operations
g. Topologies
h. Backup in NAS
i. Backup technologies
12. Local and remote replication [1: chapter 13, 14]
13. Storage security and management [1: chapter 15, 16]
a. Storage security framework
b. Risk triad
c. Storage security domains
d. Security implementations
e. Monitoring infrastructure
f. Management activities
g. Management challenges
Literature
Required
[1] EMC Education Services, Information Storage and Management: Storing, Managing, and Protecting Digital Information. John Wiley & Sons, 2009.
Optional
EMC Education Services, Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments, 2nd ed. John Wiley & Sons, 2012.
G. Schulz, Cloud and Virtual Data Storage Networking. Taylor & Francis, 2011.
M. Gupta, Storage Area Network Fundamentals. Cisco Press, 2002.
Exam test
Example of an open question
How many milliseconds does it take for a 5,400 RPM HDD to complete one full revolution? Please give the full solution. (Answer may be found in lecture #2)
Example of a multiple-choice question
What is the minimum number of controllers for an active-active (AA) storage array? (Answer may be found in lecture #4)
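As a hint at the kind of reasoning expected, the relation between rotation speed and the time of one revolution can be sketched in Python. The function name rotation_time_ms is illustrative only and is not taken from the lecture material.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time of one full platter revolution, in milliseconds."""
    revolutions_per_second = rpm / 60.0
    return 1000.0 / revolutions_per_second


full_turn_ms = rotation_time_ms(5400)
print(f"{full_turn_ms:.2f} ms per revolution")
# Average rotational latency is commonly taken as half a revolution.
print(f"{full_turn_ms / 2:.2f} ms average rotational latency")
```

A written solution should show the same steps: convert revolutions per minute to revolutions per second, then invert.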
[A] 1 [B] 2
Project cases
Each group of two students should choose one project case and inform the lecturer by e-mail about the group members and their choice. The same case may be chosen by no more than two groups, and the solutions of those groups must be entirely different. Detailed interpretation of the cases will be done during laboratory work. Projects should be presented / defended in the form of a presentation. The presentation should consist of:
1. Introduction: presentation of current situation;
2. Architectural solution: presentation of proposed system’s architecture, requirements for the environment;
3. Proposal: presentation of real world hardware with specification and prices to fulfill the requirements.
Intensive use of schemas and graphics is desirable.
1st case
Select and present data storage equipment for an MS Exchange 2010 environment with the following parameters / requirements:
1. E-mail boxes count: 4000;
2. Average size of a box: 2 GB;
3. Expected growth of boxes count: 1% per year;
4. Expected growth of boxes size: 50% per year;
5. Part of intensively used boxes: 30%;
6. RPO: 24 hours;
7. RTO: 1 hour;
8. Period: 3 years.
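A first rough estimate of the capacity to plan for can be sketched as follows. This is only a sketch under stated assumptions: growth is compounded annually, and overheads such as database indexes, backups and RAID redundancy are ignored. The function name and its parameters are illustrative.

```python
def projected_capacity_gb(boxes: float, avg_size_gb: float,
                          count_growth: float, size_growth: float,
                          years: int) -> float:
    """Raw mailbox capacity after `years`, assuming compound annual growth."""
    boxes_at_end = boxes * (1 + count_growth) ** years
    size_at_end = avg_size_gb * (1 + size_growth) ** years
    return boxes_at_end * size_at_end


# Parameters of the 1st case: 4000 boxes of 2 GB, +1% boxes and +50% size per year, 3 years.
capacity = projected_capacity_gb(4000, 2, 0.01, 0.50, 3)
print(f"{capacity / 1000:.1f} TB of raw mailbox data (1 TB = 1000 GB assumed)")
```

The result is only a starting point; the project should also justify how the share of intensively used boxes, RPO and RTO influence the choice of equipment.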
2nd case
Select and present data storage equipment for an e-mail server of your choice with the following parameters / requirements:
1. E-mail boxes count: 500;
2. Average size of a box: 1 GB;
3. Expected growth of boxes count: 5% per year;
4. Expected growth of boxes size: 120% per year;
5. Part of intensively used boxes: 80%;
6. RPO: 5 min;
7. RTO: 0 min;
8. Period: 5 years.
3rd case
Select and present data storage equipment for an application with the following parameters / requirements:
1. IOPS: 450,000;
2. Block size: 4 KB;
3. Distribution of read / write operations: 80 / 20;
4. RPO: 0 hours;
5. RTO: 0 hours;
6. Records should be kept online at least for 3 years and may be destroyed after 10 years;
7. Period: 10 years.
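For the IOPS-driven cases it helps to translate IOPS and block size into the data throughput that the storage system and its interfaces must sustain. The sketch below assumes 1 MB = 1000 KB and a uniform block size; the function name is illustrative.

```python
def throughput_mb_s(iops: float, block_kb: float) -> float:
    """Data throughput in MB/s for a given IOPS load (assumes 1 MB = 1000 KB)."""
    return iops * block_kb / 1000.0


# Parameters of the 3rd case: 450,000 IOPS, 4 KB blocks, 80 / 20 read / write split.
total = throughput_mb_s(450_000, 4)
read_part = total * 0.80
write_part = total * 0.20
print(f"total {total:.0f} MB/s, read {read_part:.0f} MB/s, write {write_part:.0f} MB/s")
```

The same function applied to a case with a larger block size shows how quickly throughput, rather than IOPS, becomes the limiting parameter.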
4th case
Select and present data storage equipment for an application with the following parameters / requirements:
1. IOPS: 250,000;
2. Block size: 4 MB;
3. Distribution of read / write operations: 40 / 60;
4. RPO: 0 hours;
5. RTO: 0 hours;
6. Records should be kept online at least for 3 years and may be destroyed after 10 years;
7. Period: 10 years.
5th case
Select and present data storage equipment for an application with the following parameters / requirements:
1. IOPS: 50,000;
2. Block size: 16 KB;
3. Distribution of read / write operations: 90 / 10;
4. Database size: 20 TB;
5. Expected growth of database: 15% per year;
6. RPO: 0 hours;
7. RTO: 24 hours;
8. Records should be kept online at least for 5 years and may be destroyed after 25 years.
6th case
Current equipment: one AA type SAN device with 128 GB SSD connected over FCP and configured in RAID 5 mode (two RAID groups with six disks in each; one disk used as Hot Spare). Requirements:
1. IOPS: 100,000;
2. Block size: 1 MB;
3. Distribution of read / write operations: 90 / 10;
4. Database size: 600 GB;
5. Expected growth of database: 100% per year;
6. RPO: 0 hours;
7. RTO: 24 hours;
8. Data distribution by creation date:
a. Recent year – 80.0%;
b. 1 to 3 years – 15.0%;
c. 3 to 10 years – 5.0%;
d. Over 10 years – 0.0%.
9. Records should be kept online at least for 3 years and may be destroyed after 15 years.
7th case
Current equipment: one AA type SAN device with 256 GB SSD connected over FCP and configured in RAID 1 mode (six RAID groups with two disks in each; one disk used as Hot Spare). Requirements:
1. IOPS: 150,000;
2. Block size: 1 MB;
3. Distribution of read / write operations: 85 / 15;
4. Database size: 1 TB;
5. Expected growth of database: 120% per year;
6. RPO: 0 hours;
7. RTO: 24 hours;
c. 3 to 10 years – 10.0%;
d. Over 10 years – 5.0%.
9. Records should be kept online at least for 5 years and may be destroyed after 20 years.
Problem solving
Each problem should be solved in written form and defended.
1st problem
An application operates with blocks of 64 KB and has 1,000 intensive users, who generate 2 IOPS each, and 2,000 regular users, who generate 1 IOPS each. Each intensive user uses up to 400 GB and each regular user up to 75 GB of storage space. The ratio of read to write operations is 2 / 1. Management processes create an additional 20% IOPS flow. Calculate the IOPS requirements for RAID levels 1, 5 and 6.
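The general approach to such a calculation can be sketched as below. It assumes the commonly cited write penalties (2 for RAID 1, 4 for RAID 5, 6 for RAID 6), reads "2 / 1" as two reads per one write, and applies the 20% management overhead to the host load before the penalty. These are interpretations for illustration and should be checked against the lecture material; all names are illustrative.

```python
# Commonly cited back-end I/Os per host write (assumption, see lead-in).
WRITE_PENALTY = {"RAID 1": 2, "RAID 5": 4, "RAID 6": 6}


def backend_iops(host_iops: float, read_fraction: float, penalty: int) -> float:
    """Disk-side IOPS: reads pass through, each write costs `penalty` I/Os."""
    reads = host_iops * read_fraction
    writes = host_iops * (1.0 - read_fraction)
    return reads + penalty * writes


# Host load: 1,000 users x 2 IOPS + 2,000 users x 1 IOPS, plus 20% management overhead.
host_iops = (1000 * 2 + 2000 * 1) * 1.20

for level, penalty in WRITE_PENALTY.items():
    print(level, round(backend_iops(host_iops, 2 / 3, penalty)), "IOPS")
```

A defended solution should also state which interpretation of the overhead and of the read / write ratio was used, since both change the result.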
2nd problem
The manufacturer gives the following parameters of a hard disk drive: rotation speed – 15,000 RPM; external data transfer rate – 3 Gbps; internal data transfer rate – 120 MBps; average seek time – 3 ms; capacity – 2 TB. Calculate the IOPS capability of this disk if 64 KB data blocks are used.
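One common way to estimate this is to take the disk service time as the sum of average seek time, average rotational latency (half a revolution) and block transfer time, and to invert it. The sketch below follows that model and assumes that the internal transfer rate is the limiting one and that 1 MB = 1000 KB; the function name is illustrative.

```python
def hdd_iops(seek_ms: float, rpm: float, block_kb: float, internal_mb_s: float) -> float:
    """Estimate IOPS of one drive as 1 / (seek + rotational latency + transfer)."""
    # Average rotational latency: half of one revolution.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    # Transfer time of one block at the internal rate (assumes 1 MB = 1000 KB).
    transfer_ms = block_kb / (internal_mb_s * 1000.0) * 1000.0
    service_time_ms = seek_ms + rotational_latency_ms + transfer_ms
    return 1000.0 / service_time_ms


print(f"{hdd_iops(3.0, 15_000, 64, 120):.1f} IOPS at 100% utilization")
```

In practice a disk is rarely planned at full utilization, so the written solution may need to scale this figure down by the utilization level discussed in the lectures.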
3rd problem
The manufacturer gives the following parameters of a solid state drive: external data transfer rate – 6 Gbps; internal data transfer rate – 400 MBps; average seek time – 0.1 ms; capacity – 256 GB. Calculate the IOPS capability of this disk if 4 KB data blocks are used.
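A solid state drive has no rotating platters, so a simple model drops the rotational latency and keeps only the access time and the block transfer time. The sketch below treats the stated "average seek time" as the access latency and assumes 1 MB = 1000 KB; both are assumptions, and the function name is illustrative.

```python
def ssd_iops(access_ms: float, block_kb: float, internal_mb_s: float) -> float:
    """Estimate IOPS of an SSD as 1 / (access latency + transfer time)."""
    # Transfer time of one block at the internal rate (assumes 1 MB = 1000 KB).
    transfer_ms = block_kb / (internal_mb_s * 1000.0) * 1000.0
    return 1000.0 / (access_ms + transfer_ms)


print(f"{ssd_iops(0.1, 4, 400):.0f} IOPS at 100% utilization")
```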
4th problem
Calculate how many disks from the 2nd problem are needed to fulfill the requirements of the application from the 1st problem for each of the mentioned RAID types.
5th problem
Calculate how many disks from the 3rd problem are needed to fulfill the requirements of the application from the 1st problem for each of the mentioned RAID types.
Laboratory work "Installation and Configuration of Virtual Data Storage (VDS)"
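For the last two problems, the performance-driven disk count is the required back-end IOPS divided by the IOPS of one disk, rounded up. The sketch below uses placeholder input values purely for illustration, and the optional utilization factor is an assumption. A complete solution should also check the capacity requirement and take the larger of the two counts.

```python
import math


def disks_needed(required_iops: float, iops_per_disk: float,
                 utilization: float = 1.0) -> int:
    """Smallest number of disks whose combined usable IOPS covers the load."""
    usable_iops_per_disk = iops_per_disk * utilization
    return math.ceil(required_iops / usable_iops_per_disk)


# Placeholder inputs, not the official answers to the problems above.
print(disks_needed(6400, 180))        # disks planned at 100% utilization
print(disks_needed(6400, 180, 0.7))   # the same load at 70% utilization
```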
Goal
• To get to know the technological principles of data storage, configuration options and protocols in use;
• To try out in practice the process of data storage installation / configuration by using VDS.
Work flow
1. Get to know candidates for VDS installation (FreeNAS, NAS4Free, …);
2. Decide which one you will be working with;
3. Prepare a virtual machine in an environment of your own that fulfills the requirements of your chosen VDS;
4. Create 3 to 5 additional virtual disks of the smallest possible size for information storage;
5. Install the virtual machine and complete basic configuration tasks (user accounts, network configuration, …);
6. Create a RAID array of the selected level;
7. Configure VDS to be accessed over CIFS and iSCSI;
8. Access VDS over CIFS and iSCSI;