• No results found

Database System. Session 1 Main Theme Introduction to Database Systems Dr. Jean-Claude Franchitti

N/A
N/A
Protected

Academic year: 2021

Share "Database System. Session 1 Main Theme Introduction to Database Systems Dr. Jean-Claude Franchitti"

Copied!
33
0
0

Loading.... (view fulltext now)

Full text

(1)

1

Database Systems

Session 1 – Main Theme

Introduction to Database Systems

Dr. Jean-Claude Franchitti

New York University

Computer Science Department

Courant Institute of Mathematical Sciences

Presentation material partially based on textbook slides Fundamentals of Database Systems (6thEdition)

by Ramez Elmasri and Shamkant Navathe Slides copyright © 2011

2

2 Introduction to Database SystemsIntroduction to Database Systems

Agenda

1

1 Instructor and Course IntroductionInstructor and Course Introduction

3

3 Database System ArchitectureDatabase System Architecture 4

(2)

3

Profile

-¾ 27 years of experience in the Information Technology Industry, including twelve years of experience working for leading IT consulting firms such as Computer Sciences Corporation

¾ PhD in Computer Science from University of Colorado at Boulder ¾ Past CEO and CTO

¾ Held senior management and technical leadership roles in many large IT Strategy and Modernization projects for fortune 500 corporations in the insurance, banking, investment banking, pharmaceutical, retail, and information management industries

¾ Contributed to several high-profile ARPA and NSF research projects

¾ Played an active role as a member of the OMG, ODMG, and X3H2 standards committees and as a Professor of Computer Science at Columbia initially and New York University since 1997

¾ Proven record of delivering business solutions on time and on budget

¾ Original designer and developer of jcrew.com and the suite of products now known as IBM InfoSphere DataStage

¾ Creator of the Enterprise Architecture Management Framework (EAMF) and main contributor to the creation of various maturity assessment methodology

¾ Developed partnerships between several companies and New York University to incubate new methodologies (e.g., EA maturity assessment methodology developed in Fall 2008), develop proof of concept software, recruit skilled graduates, and increase the companies’ visibility

Who am I?

How to reach me?

Cell (212) 203-5004

Email [email protected]

AIM, Y! IM, ICQ jcf2_2003

MSN IM [email protected]

LinkedIn http://www.linkedin.com/in/jcfranchitti

Come on…what else did you expect?

(3)

5

What is the class about?

ƒ

Course description and syllabus:

»http://www.nyu.edu/classes/jcf/CSCI-GA.2433-001 »http://cs.nyu.edu/courses/fall11/CSCI-GA.2433-001/

ƒ

Textbooks:

» Fundamentals of Database Systems (6thEdition)

Ramez Elmasri and Shamkant Navathe Addition Wesley

ISBN-10: 0-1360-8620-9, ISBN-13: 978-0136086208 6thEdition (04/10)

Icons / Metaphors

Common Realization

Information

Knowledge/Competency Pattern

Governance

Alignment

Solution Approach

(4)

7

Course Objectives

„ Gain understanding of fundamental concepts of state-of-the art

databases (more precisely called: Database Management Systems)

„ Get to know some of the tools used in the design and

implementations of databases

„ Know enough so that it is possible to read/skim a database system

manual and

„ Start designing and implementing small data bases

„ Start managing and interacting with existing small to large databases

„ Experiment and practice with industry leading vendor solutions:

„ CA’s Erwin for design of relational database

„ Oracle, IBM DB2, and other DB products for writing relational queries

„

Methodology used for modeling a business application

during the database design process, focusing on

entity-relationship model and entity entity-relationship diagrams

„

Relational model and implementing an entity-relationship

diagram

„

Relational algebra (using SQL syntax)

„

SQL as data manipulation language

(5)

9

„ Physical design of the database using various file organization and

indexing techniques for efficient query processing

„ Concurrency Control

„ Recovery

„ Query execution

„ Data warehouses

„ Additional topics may be covered as time allows, these topics are

covered in greater depth in other courses but PowerPoint presentations for them will still be provided

„ The course material is partially derived from the textbook slides and

material covered as part of the Database Systems course offered at NYU Courant in previous semesters

Key Material Covered (2/2)

Software Requirements

„

Microsoft Windows XP (Professional Ed.) / Vista / 7

„

Software tools will be available from the Internet or from the

course Web site under demos as a choice of freeware or

commercial tools

„

Database Modeling Tools

„

Database Management Software Tools

„

etc.

(6)

11

2

2 Introduction to Database SystemsIntroduction to Database Systems

Agenda

1

1 Instructor and Course IntroductionInstructor and Course Introduction

3

3 Database System ArchitectureDatabase System Architecture 4

4 Summary and ConclusionSummary and Conclusion

Section Outline

ƒ

Introduction

ƒ

An Example

ƒ

Characteristics of the Database Approach

ƒ

Actors on the Scene

ƒ

Workers behind the Scene

(7)

13

Overview

ƒ

Traditional database applications

ƒ Store textual or numeric information

ƒ

Multimedia databases

ƒ Store images, audio clips, and video streams digitally

ƒ

Geographic information systems (GIS)

ƒ Store and analyze maps, weather data, and satellite images

ƒ

Data warehouses and online analytical processing (OLAP)

systems

ƒ Extract and analyze useful business information from very large databases

ƒ Support decision making

ƒ

Real-time and active database technology

ƒ Control industrial and manufacturing processes

Introduction (1/3)

ƒ

Database

ƒ Collection of related data

ƒ Known facts that can be recorded and that have implicit meaning ƒ Mini-world or Universe of Discourse (UoD)

ƒ Represents some aspect of the real world

ƒ Logically coherent collection of data with inherent meaning ƒ Built for a specific purpose

ƒ

Example of a large commercial database

ƒ Amazon.com

ƒ

Database management system (DBMS)

ƒ Collection of programs

ƒ Enables users to create and maintain a database

ƒ

Defining a database

ƒ Specify the data types, structures, and constraints of the data to be stored

(8)

15

Introduction (2/3)

ƒ

Meta-data

ƒ Database definition or descriptive information

ƒ Stored by the DBMS in the form of a database catalog or dictionary

ƒ

Manipulating a database

ƒ Query and update the database miniworld ƒ Generate reports

ƒ

Sharing a database

ƒ Allow multiple users and programs to access the database simultaneously

ƒ

Application program

ƒ Accesses database by sending queries to DBMS

ƒ

Query

ƒ Causes some data to be retrieved

Introduction (3/3)

ƒ

Transaction

ƒ May cause some data to be read and some data to be written into the database

ƒ

Protection includes:

ƒ System protection ƒ Security protection

ƒ

Maintain the database system

(9)

17

Database System Environment

High-Level Example - Metadata

ƒ

UNIVERSITY database

ƒ Information concerning students, courses, and grades in a university environment

ƒ

Data records

ƒ STUDENT ƒ COURSE ƒ SECTION ƒ GRADE_REPORT ƒ PREREQUISITE

ƒ

Specify structure of records of each file by specifying data

type for each data element

ƒ String of alphabetic characters ƒ Integer

(10)

19

High-Level Example – Database Implementation (1/2)

ƒ Construct UNIVERSITY database

ƒ Store data to represent each student, course, section, grade report, and prerequisite as a record in appropriate file

ƒ Relationships among the records

ƒ Manipulation involves querying and updating

ƒ Examples of queries:

ƒ Retrieve the transcript

ƒ List the names of students who took the section of the ‘Database’ course offered in fall 2008 and their grades in that section

ƒ List the prerequisites of the ‘Database’ course

ƒ Examples of updates:

ƒ Change the class of ‘Smith’ to sophomore

ƒ Create a new section for the ‘Database’ course for this semester ƒ Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last semester

ƒ Phases for designing a database:

ƒ Requirements specification and analysis ƒ Conceptual design

ƒ Logical design ƒ Physical design

(11)

21

Characteristics of the Database Approach

ƒ

Traditional file processing

ƒ Each user defines and implements the files needed for a specific software application

ƒ

Database approach

ƒ Single repository maintains data that is defined once and then accessed by various users

ƒ

Main characteristics of database approach

ƒ Self-describing nature of a database system

ƒ Insulation between programs and data, and data abstraction ƒ Support of multiple views of the data

ƒ Sharing of data and multiuser transaction processing

Self-Describing Nature of a Database System

ƒ

Database system

contains complete

definition of structure

and constraints

ƒ

Meta-data

ƒ Describes structure of the database

ƒ

Database catalog

used by:

ƒ DBMS software

ƒ Database users who need information about database structure

(12)

23

Insulation Between Programs and Data

ƒ

Program-data independence

ƒ Structure of data files is stored in DBMS catalog separately from access programs

ƒ

Program-operation independence

ƒ Operations specified in two parts:

• Interface includes operation name and data types of its arguments • Implementation can be changed without affecting the interface

ƒ

Data abstraction

ƒ Allows program-data independence and program-operation independence

ƒ

Conceptual representation of data

ƒ Does not include details of how data is stored or how operations are implemented

ƒ

Data model

ƒ Type of data abstraction used to provide conceptual representation

(13)

25

Support of Multiple Views of the Data

ƒ

View

ƒ Subset of the database

ƒ Contains virtual data derived from the database files but is not explicitly stored

ƒ

Multiuser DBMS

ƒ Users have a variety of distinct applications ƒ Must provide facilities for defining multiple views

Sharing of Data and Multiuser Transaction Processing

ƒ

Allow multiple users to access the database at the same

time

ƒ

Concurrency control software

ƒ Ensure that several users trying to update the same data do so in a controlled manner

• Result of the updates is correct

ƒ

Online transaction processing (OLTP) application

ƒ

Transaction

ƒ Central to many database applications

ƒ Executing program or process that includes one or more database ƒ Isolation property

• Each transaction appears to execute in isolation from other transactions

ƒ Atomicity property

• Either all the database operations in a transaction are executed or none are

(14)

27

Actors on the Scene

ƒ Database administrators (DBA) are responsible for:

ƒ Authorizing access to the database

ƒ Coordinating and monitoring its use

ƒ Acquiring software and hardware resources

ƒ Database designers are responsible for:

ƒ Identifying the data to be stored

ƒ Choosing appropriate structures to represent and store this data

ƒ End users

ƒ People whose jobs require access to the database

ƒ Types

• Casual end users

• Naive or parametric end users • Sophisticated end users • Standalone users

ƒ System analysts

ƒ Determine requirements of end users

ƒ Application programmers

ƒ Implement these specifications as programs

Workers behind the Scene

ƒ

DBMS system designers and implementers

ƒ Design and implement the DBMS modules and interfaces as a software package

ƒ

Tool developers

ƒ Design and implement tools

ƒ

Operators and maintenance personnel

ƒ Responsible for running and maintenance of hardware and software environment for database system

(15)

29

Advantages of Using the DBMS Approach (1/3)

ƒ Controlling redundancy

ƒ Data normalization

ƒ Denormalization

• Sometimes necessary to use controlled redundancy to improve the performance of queries

ƒ Restricting unauthorized access

ƒ Security and authorization subsystem

ƒ Privileged software

ƒ Providing persistent storage for program objects

ƒ Complex object in C++ can be stored permanently in an object-oriented

DBMS

ƒ Impedance mismatch problem

• Object-oriented database systems typically offer data structure compatibility

ƒ Providing storage structures and search techniques for efficient query processing

ƒ Indexes

ƒ Buffering and caching

ƒ Query processing and optimization

Advantages of Using the DBMS Approach (2/3)

ƒ

Providing backup and recovery

ƒ Backup and recovery subsystem of the DBMS is responsible for recovery

ƒ

Providing multiple user interfaces

ƒ Graphical user interfaces (GUIs)

ƒ

Representing complex relationships among data

ƒ May include numerous varieties of data that are interrelated in many ways

ƒ

Enforcing integrity constraints

ƒ Referential integrity constraint

• Every section record must be related to a course record

ƒ Key or uniqueness constraint

• Every course record must have a unique value for Course_number

ƒ Business rules

(16)

31

Advantages of Using the DBMS Approach (3/3)

ƒ

Permitting inferencing and actions using rules

ƒ Deductive database systems

• Provide capabilities for defining deduction rules

• Inferencing new information from the stored database facts

ƒ Trigger

• Rule activated by updates to the table

ƒ Stored procedures

• More involved procedures to enforce rules

ƒ

Additional implications of using the database approach

ƒ Reduced application development time ƒ Flexibility

ƒ Availability of up-to-date information ƒ Economies of scale

A Brief History of Database Applications (1/2)

ƒ

Early database applications using hierarchical

and network systems

ƒ

Large numbers of records of similar structure

ƒ

Providing data abstraction and application

flexibility with relational databases

ƒ

Separates physical storage of data from its conceptual

representation

(17)

33

A Brief History of Database Applications (2/2)

ƒ

Interchanging data on the Web for e-commerce using

XML

ƒ Extended markup language (XML) primary standard for

interchanging data among various types of databases and Web pages

ƒ

Extending database capabilities for new applications

ƒ Extensions to better support specialized requirements for applications

ƒ Enterprise resource planning (ERP) ƒ Customer relationship management (CRM)

ƒ

Databases versus information retrieval

ƒ Information retrieval (IR)

• Deals with books, manuscripts, and various forms of library-based articles

When Not to Use a DBMS

ƒ

More desirable to use regular files for:

ƒ Simple, well-defined database applications not expected to change at all

ƒ Stringent, real-time requirements that may not be met because of DBMS overhead

ƒ Embedded systems with limited storage capacity ƒ No multiple-user access to data

(18)

35

Summary

ƒ

Database

ƒ Collection of related data (recorded facts)

ƒ

DBMS

ƒ Generalized software package for implementing and maintaining a computerized database

ƒ

Several categories of database users

ƒ

Database applications have evolved

ƒ Current trends: IR, Web

2

2 Introduction to Database SystemsIntroduction to Database Systems

Agenda

1

1 Instructor and Course IntroductionInstructor and Course Introduction

3

3 Database System ArchitectureDatabase System Architecture 4

(19)

37

Section Outline

ƒ

Data Models, Schemas, and Instances

ƒ

Three-Schema Architecture and Data

Independence

ƒ

Database Languages and Interfaces

ƒ

The Database System Environment

ƒ

Centralized and Client/Server Architectures

for DBMSs

ƒ

Classification of Database Management Systems

Database System Concepts and Architecture

ƒ

Basic client/server DBMS architecture

ƒ

Client module

(20)

39

Data Models, Schemas, and Instances

ƒ

Data abstraction

ƒ

Suppression of details of data organization and storage

ƒ

Highlighting of the essential features for an improved

understanding of data

ƒ

Data model

ƒ

Collection of concepts that describe the structure of a

database

ƒ

Provides means to achieve data abstraction

ƒ

Basic operations

ƒ Specify retrievals and updates on the database

ƒ

Dynamic aspect or behavior of a database application

ƒ Allows the database designer to specify a set of valid operations allowed on database objects

Categories of Data Models (1/2)

ƒ High-level or conceptual data models

ƒ Close to the way many users perceive data

ƒ Low-level or physical data models

ƒ Describe the details of how data is stored on computer storage media

ƒ Representational data models

ƒ Easily understood by end users

ƒ Also similar to how data organized in computer storage

ƒ Entity

ƒ Represents a real-world object or concept

(21)

41

Categories of Data Models (2/2)

ƒ Relational data model

ƒ Used most frequently in traditional commercial DBMSs

ƒ Object data model

ƒ New family of higher-level implementation data models

ƒ Closer to conceptual data models

ƒ Physical data models

ƒ Describe how data is stored as files in the computer

ƒ Access path

• Structure that makes the search for particular database records efficient

ƒ Index

• Example of an access path

• Allows direct access to data using an index term or a keyword

Schemas, Instances, and Database State (1/2)

ƒ Database schema ƒ Description of a database ƒ Schema diagram ƒ Displays selected aspects of schema ƒ Schema construct

ƒ Each object in the schema ƒ Database state or snapshot ƒ Data in database at a particular moment in time

(22)

43

Schemas, Instances, and Database State (2/2)

ƒ

Define a new database

ƒ Specify database schema to the DBMS

ƒ

Initial state

ƒ Populated or loaded with the initial data

ƒ

Valid state

ƒ Satisfies the structure and constraints specified in the schema

ƒ

Schema evolution

ƒ Changes applied to schema as application requirements change

Three-Schema Architecture and Data Independence

ƒ Internal level

ƒ Describes physical storage structure of the database

ƒ Conceptual level

ƒ Describes structure of the whole database for a community of users

ƒ External or view level

(23)

45

Data Independence

ƒ

Capacity to change the schema at one level of a database

system

ƒ Without having to change the schema at the next higher level

ƒ

Types:

ƒ Logical ƒ Physical

DBMS Languages

ƒ

Data definition language (DDL)

ƒ Defines both schemas

ƒ

Storage definition language (SDL)

ƒ Specifies the internal schema

ƒ

View definition language (VDL)

ƒ Specifies user views/mappings to conceptual schema

ƒ

Data manipulation language (DML)

ƒ Allows retrieval, insertion, deletion, modification

ƒ

High-level or nonprocedural DML

• Can be used on its own to specify complex database operations concisely

• Set-at-a-time or set-oriented

ƒ

Low-level or procedural DML

(24)

47

DBMS Interfaces

ƒ

Menu-based interfaces for Web clients or browsing

ƒ

Forms-based interfaces

ƒ

Graphical user interfaces

ƒ

Natural language interfaces

ƒ

Speech input and output

ƒ

Interfaces for parametric users

ƒ

Interfaces for the DBA

The Database System Environment

ƒ

DBMS component modules

ƒ Buffer management

ƒ Stored data manager ƒ DDL compiler

ƒ Interactive query interface

• Query compiler • Query optimizer

ƒ Pre-compiler

(25)

49

Component Modules of a DBMS

Database System Utilities

ƒ

Loading

ƒ Load existing data files

ƒ

Backup

ƒ Creates a backup copy of the database

ƒ

Database storage reorganization

ƒ Reorganize a set of database files into different file organizations

ƒ

Performance monitoring

(26)

51

Tools, Application Environments, and Communications Facilities

ƒ

CASE Tools

ƒ

Data dictionary (data repository) system

ƒ Stores design decisions, usage standards, application program descriptions, and user information

ƒ

Application development environments

ƒ

Communications software

Centralized and Client/Server Architectures for DBMSs

ƒ

Centralized DBMSs Architecture

ƒ All DBMS functionality, application program execution, and user interface processing carried out on one machine

(27)

53

Basic Client / Server Architectures (1/2)

ƒ

Servers with specific functionalities

ƒ

File server

• Maintains the files of the client machines.

ƒ

Printer server

• Connected to various printers; all print requests by the clients are forwarded to this machine

ƒ

Web servers or e-mail servers

ƒ

Client machines

ƒ

Provide user with:

• Appropriate interfaces to utilize these servers • Local processing power to run local applications

Basic Client/Server Architectures (2/2)

ƒ

Client

ƒ

User machine that provides user interface capabilities

and local processing

ƒ

Server

ƒ

System containing both hardware and software

ƒ

Provides services to the client machines

(28)

55

Sample Two-Tier Client / Server Architecture

Two-Tier Client/Server Architectures for DBMSs

ƒ

Server handles

ƒ

Query and transaction functionality related to SQL

processing

ƒ

Client handles

ƒ

User interface programs and application programs

ƒ

Open Database Connectivity (ODBC)

(29)

57

Three-Tier and n-Tier Architectures for Web Applications

ƒ

Application server or Web server

ƒ

Adds intermediate layer between client and the

database server

ƒ

Runs application programs and stores business rules

ƒ

N-tier

ƒ

Divide the layers between the user and the stored data

further into finer components

Application Server or Web Server

(30)

59

Classification of Database Management Systems

ƒ Data model

ƒ Relational

ƒ Object

ƒ Hierarchical and network (legacy)

ƒ Native XML DBMS ƒ Number of users ƒ Single-user ƒ Multiuser ƒ Number of sites ƒ Centralized ƒ Distributed ƒ Homogeneous ƒ Heterogeneous ƒ Cost ƒ Open source

ƒ Different types of licensing

Classification of Database Management Systems

ƒ

Types of access path options

ƒ

General or special-purpose

(31)

61

Section Summary

ƒ

Concepts used in database systems

ƒ

Main categories of data models

ƒ

Types of languages supported by DMBSs

ƒ

Interfaces provided by the DBMS

ƒ

DBMS classification criteria:

ƒ Data model, number of users, number of sties, access paths, cost

2

2 Introduction to Database SystemsIntroduction to Database Systems

Agenda

1

1 Instructor and Course IntroductionInstructor and Course Introduction

3

3 Database System ArchitectureDatabase System Architecture 4

(32)

63

Course Assignments

„ Individual Assignments

„ Reports based on case studies / class presentations

„ Textbook problem sets

„ Project-Related Assignments

„ All assignments (other than the individual assessments) will

correspond to milestones in the course project

Assignments & Readings

ƒ Readings

» Slides and Handouts posted on the course web site

(33)

65

Next Session: Relational Data Model and Relational Database Constraints

References

Related documents

ISS.A LAN MSN 0359-0442 Temporary revisions Temporary revisions... ISS.A LAN MSN 0167 Temporary revisions

ƒ One of the most important and interesting features of functional programming languages is that functions are first-class values.. ƒ The means that programs can create new functions

Write a short report that documents your findings and recommendations with respect to selection criteria in support of page-based development environments for application

Organizing ideas around a currency crisis model can bring various advantages in understanding sources of financial instability such as: self-fulfilling

May it be your will YHVH, God of Hosts, enthroned on theKerubim, Shadai, lofty King, who will oversee with a merciful eye the one who carries this amulet, and will command the

“Voice” is a characteristic of verbs which indicates the relation of the verb’s action to its subject. In other word, voice is the form a verb takes to indicate whether the subject

Through management soft skills-based Islamic education is expected to restore Islam Indonesia on Islam very famous identity steeped in local culture and uphold the values of

Android platform includes the popular open source SQLite database which has been used with great success as on-disk file format that allows the developer to handle data in a