• No results found

Introduction to Databases

N/A
N/A
Protected

Academic year: 2021

Share "Introduction to Databases"

Copied!
31
0
0

Loading.... (view fulltext now)

Full text

(1)

Introduction to Databases

Marek Rychly

[email protected] Strathmore University, @iLabAfrica &

Brno University of Technology, Faculty of Information Technology

Advanced Databases and Enterprise Systems 24 August 2015

(2)

Outline

1

Course Content

Lectures and Assessments

Evaluation

Background on the Course

2

Introduction to Databases

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

3

Theoretical Background

Sets and Relations

Functions

(3)

Course Content

Introduction to Databases Theoretical Background

Lectures and Assessments

Evaluation

Background on the Course

Lectures and Assessments

1st Introduction to Databases

2nd Relational Database Model

(1stAssessment: define relations, keys, and constraints) 3rd Relational Algebra and Calculus

(2ndAssessment: make expressions in relational algebra/calculus) 4th–5th SQL

(3rdAssessment: create database objects and define several queries)

6th Transactions and Transaction Management 7th Database Analysis and Design

8th Entity-Relationship Modelling 9th Normalization and Database Design

(4thAssessment: make an ER model and perform the normalization) 10th Database Performance

11th Data warehousing and Data-mining;

Post-relational Databases

12th Current developments in database technology;

Distributed, Object-oriented, and NoSQL Databases

(4)

Course Content

Introduction to Databases Theoretical Background

Lectures and Assessments

Evaluation

Background on the Course

Core Reading Materials

T. Connolly, T.M. Connolly, and C.E. Beg.

Database Systems: A Practical Approach to Design,

Implementation, and Management

.

Pearson Education, 2014.

ISBN 9780132943260.

URL

http://www.pearsonhighered.com/educator/product/

9780132943260.page#dw_resources

.

M. Kifer, A.J. Bernstein, and P.M. Lewis.

Database Systems: An Application-oriented Approach

.

Pearson/Addison-Wesley, 2006.

ISBN 9780321268457.

URL

http://www.pearsonhighered.com/educator/product/

9780321268457.page#dw_resources

.

(5)

Course Content

Introduction to Databases Theoretical Background

Lectures and Assessments

Evaluation

Background on the Course

Evaluation

50%

assessments

(each result has to be submitted to Marek Rychly <[email protected]> by, at latest, two weeks after the assessment specification in a lecture)

1

st

Assessment by 14 September 2015

2

nd

Assessment by 21 September 2015

3

rd

Assessment by 28 September 2015

4

th

Assessment by 16 October 2015

5

th

Assessment by 26 October 2015

50%

final exam

(6)

Course Content

Introduction to Databases Theoretical Background

Lectures and Assessments Evaluation

Background on the Course

Prior Knowledge and Skills

The course does not require any prior knowledge or skill,

however, the following may help you to understand the lectures:

1

basic mathematical background

(sets, relations, functions)

2

basics of software engineering

(software development process and its phases, software life-cycle) 3

software modelling

(data modelling, Class Diagram in Unified Modelling Language) 4

concepts of object-oriented programming

(encapsulation, composition, inheritance) 5

operation system design principles

(file system organization / data blocks, memory management / paging) 6

concepts of concurrent computation

(7)

Course Content

Introduction to Databases Theoretical Background

Lectures and Assessments Evaluation

Background on the Course

Lecturer Introduction

RNDr. Marek Rychly, Ph.D.

assistant professor at BUT FIT

Brno University of Technology

Faculty of Information Technology

Brno, Czech Republic, Central Europe

over 10 years of lecturing experience

(programming, cryptography, databases, SW architectures)

over 25 of scientific conference and journal publications

(dynamic architectures, SOA, Web services, distributed systems)

a researcher in several national and international projects

a happily married man with two small kids :-)

http://www.fit.vutbr.cz/homes/rychly/.en https://www.linkedin.com/in/rychly

(8)

Course Content

Introduction to Databases Theoretical Background

Lectures and Assessments Evaluation

Background on the Course

Students Introduction? (optional)

What is your name? Where are you from?

What is your educational background?

(in general, in IT, any database experiences, etc.)

What is your professional background?

(a job title, why do you need this course, etc.)

What do you want to focus on in this course?

(see the list of lectures on page 4)

Is there anything you would like to skip?

(because you already know about that or because you are not interested in) Please send your (very short) introduction [email protected] subject “ADES Student Introduction”. Knowing your background and priorities will help me to

(9)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

How to Store Data? File-based Decentralized Storage

each application site/domain manages its own data

(a sales dept. manages “sales” data, a contract dept. manages “contract” data)

data saved to/loaded from local files by individual applications

(one file per one application instance)

data processing and querying is done by individual applications

(the application interprets the data)

possible data transfers between site/domains are done manually

(10)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

File-based Decentralized Storage II

(11)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

File-based Decentralized Storage III

Advantages and Disadvantages

+

easy to implement

(an application just save data to or load data from a file)

+

full control over data

(local data storage is easy to control, secure, and backup)

separation and isolation of data

(it is more difficult to access data from different locations or to combine them)

duplication of data

(the same data stored in different locations, difficult to control/synchronize them)

program-data dependence

(structure of the data stored is limited by a related application/program, it is difficult to modify the structure without modification of the program)

incompatible file formats

(different applications store data in different formats, e.g., little/big endian)

fixed queries/proliferation of application programs

(data are queried by their applications which implement just limited/currently required set of queries, it is difficult to implement another queries)

(12)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

How to Store Data? Database Approach

Definition (Database, DB)

A shared collection of logically related data, and a description of this

data (i.e., meta-data, a system catalogue), designed to meet the

information needs of an organization (to answer queries).

Definition (Database Management System, DBMS, DB server)

A software system that enables users to define, create, maintain, and

control access to the database (it manages data and meta-data).

Definition (Database Application Program, DB client)

A computer program that interacts with the database by issuing an

appropriate request (typically an SQL statement) to the DBMS.

(13)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

Database (DB)

An Example of Entity-Relationship Diagram (entities, attributes, relationships)

(14)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

Database Management System and App. Program

DB Clients Accessing Data by Means of Centralized DBMS

(15)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

Database Approach

Advantages and Disadvantages

+

centralized approach

(data are stored in one location, without duplicities, no separation/isolation)

+

data abstraction

(structure of the data is not affected by related applications/programs, it respects well-established practices/normal forms)

+

standardized interface

(DBMS provides a standardized interface to access data, e.g, SQL 2003)

+

performance/security tuning

(DBMS is optimized for performance and security, applications need not to care)

single point of failure

(failure on a DBMS server affects all its clients; solved by replication/distribution)

limited scalability

(scalability by upgrading a DBMS server or a cluster, which is not trivial)

issues related to concurrent access to/modification of data

(multiple concurrent modifications of single data need to be serialized, long-running modifications may require locking the data)

(16)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model Database Architecture

DBMS Environment

1

hardware

(HW servers where DBMSs are running, communication infrastructure, storage) 2

software

(DBMS software, e.g., Oracle 12c, Microsoft Server, PostgreSQL, MySQL, etc.)

3

data

(stored data and their meta-data) 4

procedures

(procedures manipulating the data, invoked on demand or triggered automatically) 5

people

(a data/DB administrator, logical/physical DB designers app. developer, end-user)

(17)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

The Three-Level ANSI-SPARC Architecture

(18)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

Three Types of Database Schema in ANSI-SPARC

1

External level/External Schemas

(different external views of the database for different users/applications)

data displayed by an application to its users (tables, forms)

results of user queries of data in the database (SELECTs)

database views (predefined SELECTs)

2

Conceptual level/Conceptual Schema/Logical model

(structure of data stored in the database, describes what the data is stored)

all entities, their attributes, and their relationships

keys and constraints on the data

security and integrity information

3

Internal level/Internal schema/Physical model

(physical representation of the database, describes how the data is stored)

storage space allocation for data and indexes

exact data-types for all attributes and their storage size

data encoding, compression, and encryption

Changes in lower levels do not affect levels above.

(19)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

Three Types of Database Schema in ANSI-SPARC

Example of the Database Schema in Three Different Levels

(20)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

Data Model and Data Modelling

Definition (Data model)

A high-level description of a database schema as an integrated collection of

concepts for describing and manipulating data, relationships between data,

and constraints on the data in an organization.

a data model consists of three components

1

structure of the data (objects/entities, attributes, relationships)

2

operations manipulating the data (methods, event handlers)

3

integrity constraints (for correct and accurate data)

a data model respect different levels of ANSI-SPARC architecture

(external, conceptual/logical, and internal/physical data models)

the process of construction of a data model is data modelling

there exists different types of data models (i.e., data modelling)

(21)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

Types of Data Model and Data Modelling

there were proposed several categories of data models:

1

object-based data models

(e.g., entity-relationship, object, functional, and semantic data models) 2

record-based data models

(e.g., relational, network, and hierarchical data models) 3

physical data models

(e.g., unifying model, frame model)

we will focus on

entity-relationship

and

relational

data models

(these are the most popular models in current databases and DBMS)

moreover, we will focus on

conceptual modelling

(22)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

A Sample Database Schema in Relational Data Model

(23)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

Classification of DBMS

DBMSs can be classified based on several criteria

based on data model

network, hierarchical

(traditional models not commonly used due to their complexity)

rational, object-relational

(well-known DBMSs like Oracle, MS SQL Server, DB2, PostgreSQL, MySQL/MariaDB, etc.)

object-oriented, XML, document-oriented, key-value, etc.

(suitable for particular applications)

based on user numbers

(single-user, multi-user)

based on database distribution

(24)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems

Database Schema and Data Model

Database Architecture

Eight Functions of Full-scale DBMS (by Codd, 1982)

1

data storage, retrieval, and update (in different external views)

2

a user-accessible system catalogue (meta-data)

3

transaction support (atomicity, consistency, integrity, durability)

4

concurrency control services (serialization of concurrent requests)

5

recovery services (data consistency even on failure)

6

authorization services (security)

7

support for data communication (requests as comm. messages)

8

integrity services (with respect to defined constraints)

(25)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems Database Schema and Data Model

Database Architecture

System Architecture with DBMS

DBMS implemented as a database server in multi-tier architecture.

Two-tier client-server architecture:

1

client

(user interface and partially also business and data processing logic) 2

database server

(data validation and access, partially business and data processing logic)

Three-tier client-server architecture:

1

client

(user interface, no business and data processing logic, i.e. it is a “thin client”) 2

application server

(business and data processing logic) 3

database server

(data validation and access)

The app. server may implement a TP Monitor to access DBMS.

(Transaction Processing Monitor provides transaction routing, distribution, load balancing, funnelling, and increased reliability by employing multiple DBMSs)

(26)

Course Content

Introduction to Databases

Theoretical Background

From File-Based Storage to Database Management Systems Database Schema and Data Model

Database Architecture

Transaction Processing Monitor

Definition (Transaction Processing Monitor, TP Monitor)

A program that controls data transfer between clients and servers to provide a

consistent environment, particularly for on-line transaction processing (OLTP).

transaction routing

(directs transactions/requests to a particular DBMS, to increase scalability, perform load-balancing, or increase reliability a primary DBMS server is inaccessible)

managing distributed transactions

(performs operations across several heterogeneous DBMSs in single transactions)

funnelling

(establishes a small number of network connections to DBMSs which are utilized/reused for incoming requests of a large number of clients as needed)

(27)

Course Content Introduction to Databases

Theoretical Background

Sets and Relations

Functions

Sets and Set Operations

Definition (Set)

A finite or infinite collection of objects in which order has no

significance and multiplicity is (generally) also ignored.

a

2

A

is used to denote that

a

is an element of a set

A

.

A

is a subset of

B

,

A

B

, iff all members of

A

are also members of

B

.

(e.g.,f2;3g f1;2;3;4;5g; an empty set, denoted;, is a subset of all sets)

fx

jp(x

)g

is a set of elements

x

s.t. a property

p

is satisfied by

x

.

(e.g.,fn2Njn<=3gorfn2Zjnis eveng)

A

[

B

= f

x

j

x

2

A

or

x

2

B

g

is an union of sets

A

and

B

.

(e.g.,f1;2;3g [ f3;4;5g = f1;2;3;4;5g)

A

\

B

= f

x

j

x

2

A

and

x

2

B

g

is an intersection of sets

A

and

B

.

(e.g.,f1;2;3g \ f3;4;5g = f3g)

B

n

A

= fxjx

2

B

and

x

=2

Ag

is a relative complement of

A

in

B

.

(28)

Course Content Introduction to Databases

Theoretical Background

Sets and Relations

Functions

Cartesian Product and Relations

Definition (Cartesian Product)

The Cartesian product of two sets

A

and

B

, denoted

A

B

(also called

the cross product), is defined to be the set of all points

(

a

;

b

)

where

a

2

A

and

b

2

B

; i.e.,

A

B

= f(

a

;

b

)j

a

2

A

and

b

2

B

g

.

Definition (Relation)

A relation is any subset of a Cartesian product.

For example, let

A

= f

1

;

2

;

3

g;

B

= f

4

;

5

g

C

=

A

B

= f(

1

;

4

); (

1

;

5

); (

2

;

4

); (

2

;

5

); (

3

;

4

); (

3

;

5

)g

D

= f(

a

;

b

)j(

a

;

b

) 2

C

and

a

=

2

g

Then, C is a Cartesian product of

A

and

B

,

(29)

Course Content Introduction to Databases

Theoretical Background

Sets and Relations

Functions

Functions

Definition (Function)

A function is a relation that

uniquely

associates members of one set

with members of another set. Formally, function

f

from

A

to

B

, denoted

f

:

A

!

B

,

uniquely

associates every

a

2

A

with object

f

(a) 2

B

.

For example,

f

:

N

!

N;

f

(x

) =

x

2

;

f

is a function.

(it is a relationf= f(1;1); (2;4); (3;9); (4;16); : : :g)

g

: f

1

;

2

g ! f

3

g;

g

(

1

) =

3 and

g

(

2

) =

3;

g

is a function.

(it is a relationg= f(1;3); (2;3)g)

h

: f

3

g ! f

1

;

2

g;

h

(

3

) =

1 and

h

(

3

) =

2;

h

is

not

a function.

(30)

Summary

Summary

DBMSs are trying to solve issue of file-based storage.

(issues: separation, isolation, duplication, program-data dependence, formats, etc.)

Database schema can be viewed from multiple levels.

(external, conceptual/logical, internal/physical; corresponds to DB schema models)

The most common approaches to data model/modelling are

entity-relationship (ER) model

(based on entities, their attributes and relationships)

relational model

(based on mathematical relations)

conceptual model

(based on the conceptual level of ANSI-SPARC architecture)

In the next lecture:

rational model, key and constraints

(31)

Thanks

Thank you for your attention!

Marek Rychly <[email protected]>

References

Related documents

This paper introduces a new model for the mean backscattered power waveform acquired by a radar altimeter operating in synthetic aperture radar mode, as well as an associated

The NEA Group Term Life, Dependent Term Life, NEA Accidental Death &amp; Dismemberment, NEA Guaranteed Issue Life, and NEA Complimentary Life Insurance are issued by The

A database management system provides efficient, convenient, and safe multi-user storage and access to massive amounts of persistent data.. Efficient - Able to handle large data

To date, private equity firms have focused on different stages of a portfolio company’s evolution, entering at an early stage with venture capital funds, later for growth capital,

 Database design in E-R model usually converted to design in the relational model (coming up next) which is used for storage and processing. Relational Model

 Introduction to Data Storage with SQL Server  Managing Storage for System Databases  Managing Storage for User Databases  Moving Database Files..  Configuring the

Introduction to Data Storage with SQL Server Managing Storage for System Databases Managing Storage for User Databases Moving Database Files.. Configuring the Buffer

Introduction to Data Storage with SQL Server Managing Storage for System Databases Managing Storage for User Databases Moving Database Files.. Configuring the Buffer