• No results found

The process of database development. Logical model: relational DBMS. Relation

N/A
N/A
Protected

Academic year: 2021

Share "The process of database development. Logical model: relational DBMS. Relation"

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

Relational Databases and SQL

○ Basic Concepts

○ The 3rd normal form

Structured Query Language (SQL)

Graça Abrantes

The process of database development

Reality (Universe of Discourse)

Conceptual model (e.g. Entity-Relationship model)

2

Logical model (e.g. Relational model)

Implementation

Logical model: relational DBMS

Relational DBMS are based on a set of theoretical concepts introduced on 1970 by E. F. Codd. p y

Advantages of relational DBMS:

simplicityof concepts

– the concepts have formal definitions

• quick learning

• generally adopted by different software development companies

3

– they support adequately the representation of most characteristics of reality

• entities, objects, phenomena

• relationships among them

Relation

Relationis the basic concept of relational databases

A relation is defined byy o one schemaand o one table

 A schema is defined by:

o the relation name o the name of each attribute o thedatatypeof eachattribute

4

o the datatype of each attribute o ...

A relational database is a set of relations

o thedatabase schema is the set of relation schema of every database table

(2)

Table – example:

Table and schema - examples

Relation schema:

soil (FID, Shape, AREA, PERIMETER, CODSOLO, NOME, SUBNOME, ESPECIFI)

5

SUBNOME, ESPECIFI)

Other examples of relation schema:

Camping (code, name, capacity, owner) River (FID, Shape, length, code, name, type)

Attributes

An attribute defines a property of an object, entity, phenomena

phenomena

An attribute Aitakes values in a set Di, called the attribute domain

the domain specifies the set of values that the attribute can take

Given U={A A A } a relation R over U is a

6

Given U={A1, A2,...,An}, a relation R over U is a subsetof the cartesian product D1x D2x ... x Dn.

Every tuple of this cartesian product is called an instanceof relation R.

Table

The set of instances of relation R is a tablein which - the instancesare also called rowsor records

7

the instancesare also called rowsor records

o orgeographic objects in GIS

- the attributesare also called columnsor fields

Remarks

the values of one attribute belong all to the same domain;

one attribute value must be atomic;

one relation cannotcontain equal instances;

the orderby which the instances are in a table is meaningless;

instances may have some attributes without value; in this case the attribute is called optional; when, given an instance, an optional attribute does not have one value, we say that its value isnull;

8

value is null;

the names (or identifiers) of attributes belonging to the same relation schema must be unique inside this schema

(3)

Key(s) of a relation

A set of attributes that takes different values on every instance is a primary keyof the relation

th l f i k i l ifi i t ithi

the value of a primary key uniquely specifies an instance within a table

a composite key is a key made up of two or more attributes

Within a relation, the primary key is the minimum subset of attributesby which every instance are uniquely specified

natural attributes are sometimes good primary keys

often an artificial attribute is assigned to an object in order to

9

often, an artificial attribute is assigned to an object in order to identify it uniquely; this kind of keys have no intrinsic meaning but they are useful to uniquely identify every instance of the relation

for instance, in a table of data about students at a school, they might all be assigned a student ID

one table may have several candidate keys but it must have one and only one primary key

Primary keys and Foreign Keys

On relational DBMS, the relationships among entities are represented in tables using common attributes.

i k f

– Example: Primary key of

table soilDiss$

Primary key of table soil

Foreign key in table soil

10

A foreign key is an attribute (or a set of attributes) in a relation that matches the primary key of another relation.

Relational model: 1st normal form

The value of each attribute is atomic.

E l i t d f

Example: instead of

use the 1NF

name type border

Tejo major maritime, land Mondego major maritime

11

name type maritime border

land border

Tejo major Yes Yes

Mondego major Yes No

Given a relation R, an attribute Aiin R is said to functionally determine another attribute Aj also

Relational model – Functional dependency

functionally determine another attribute Aj, also in R, (written Ai→ Aj) if, and only if, each Ai value is associated with precisely one Ajvalue.

Functional dependency

12

(4)

Relational model: 2nd normal form

A table is in 2NF if and only if it is in 1NF and no non-prime attribute is dependent on any proper p p y p p subset of any candidate key of the table.

a non-prime attribute of a table is an attribute that is not a part of any candidate key of the table

a table is in 2NF if and only if it is in 1NF and every non-prime attribute of the table is dependent on the whole of a candidate key

13

whole of a candidate key.

when a 1NF table has no composite candidate keys, the table is automatically in 2NF

the general case concerning GIS tables

A table is in 3NF if and only if it is in 2NF and no non-prime attribute is dependent on any other non-

Relational model: 3rdnormal form

prime attribute.

For instance, table conc_1998 is not in 3NF

Functional dependency

14

Normalization

A table is in 3NF if every non-key attribute provides a fact about the key, the whole key, and nothing but the key.

Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining

relationships between them.

Database normalization is the process of organizing the attributes and tables of a relational database to minimize

15

redundancy and dependency

the objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships

Example: 2FN 3FN

Example: given the following table verifying 2NF

to replace it with 2 tables:

16

(5)

Example:

Suppose that you have to design a database schema concerning remarkable trees in order to be stored as g vector data in a GIS.

For each tree it is required to record its common name, scientific name, family, height and trunk diameter.

In the study area there is around one thousand remarkable trees and several species are represented

17

p p

by more than one tree.

What relation schema best fits these requirements?

Structured Query Language (SQL)

SQLis a standard programming language (ANSI - 1986) for managing data held in a relational 1986) for managing data held in a relational DBMS

data insert

query

update and delete

18

schema creation and modification

data access control

The SELECTstatement

The most common operation in SQL is the query, which is performed with the SELECT statement.

SELECT retrieves data from one (or more) tables. Standard SELECT statements have no persistent effects on the database

the SELECT statement can have persistent effects when an output table is also mentioned to store the SELECT statement result

The SELECT statement retrieves a subset of a given table

19

according to a given logical condition

a logical condition is an expression that returns either TRUE or FALSE

the SELECT statement retrieves the instances for which the given logical condition returns TRUE

SELECT By Attributes (ArcMap)

20

(6)

The WHERE clause

SELECT attribute1, attribute2, .... (or *) FROM table1, table 2, ...

WHERE condition;

this statement retrieves the instances that hold the condition

elementary conditions are built using relational operators (<,

<=, >, >=, =, <>) and other operators (IN LIKE

21

operators (IN, LIKE, BETWEEN)

conditions may also be built using elementary conditions and logical operators (NOT, AND, OR).

The WHERE clause (examples)

 select * from soil where “CODE” >=800

 select * from soil

where “CODE” >400 and“CODE” <=700

 select * from soil

where not(“CODE” >400 and“CODE” <=700)

 select * from soil

where “CODE” <=400 or“CODE” >700) (1) Remark: this statement is equivalent to the previous statement

 select * from NUTSII where “NAME”like‘A%’

22

 select from NUTSII where NAME like A%

 select * from rivers

where "TYPE" =‘major' or

“description" in( 'land border' , ‘maritime border' )

 select * from NUTSII where “name” =‘’

The WHERE clause in GIS

 In GIS the condition of a SELECT statement may also use spatial operatorssuch as intersect, are within a distance of, contain, are within, touch the boundary of, ...

23

Joining tables

The FROM clause specifies the names of the tables containing the records to select.

When the FROM clause refers to more than one table, the SELECT statement executes the cartesian product (joining) of the referred tables

each tuple of the cartesian product is composed of one row from each of these tables.

 The WHERE clause of a SELECT statement that

24

 The WHERE clause of a SELECT statement that joins tables is used to constrain the resulting subset of the cartesian product

those tuples in which a foreign key value is equal to a primary key value.

(7)

SELECT *

FROM NUTS_1998, AddedValue Example (joining tables):

WHERE DTCC=code;

where code is the primary key of table AddedValue and DTCC is a foreign key of table NUTS_1998

25

Another example:

Suppose that you want to create a GIS vector theme concerning agricultural cultures in a given region.

For each land parcel it is required to record (i) the species common name For each land parcel, it is required to record (i) the species common name, (ii) the species scientific name, (iii) the average yield of each species in the region, (iv) the seeding or planting dates and (v) the parcel area.

The study area is a region with a high level of parcel disaggregation.

1. What GIS data structure best suits these requirements?

2. Design a suitable database schema.

3. Explain the main advantages of your schema.

4. Suppose that it is necessary to get information concerning the first

26

pp y g g

seeding or planting date of each species in the region. How to do it?

References

Related documents

Mitomycin-C application and conjunctival autografting are the two most commonly used adjunctive treatment methods in pterygium surgery (18).. Employing postoperative

[4] SAE Vehicle Network for Multiplexing and Data Communications Standards Committee, SAE J2178/3, “Class B Data Communication Network Messages: Frame IDs for Single Byte Forms

Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides

Accordingly, this study is conducted to re-examine an intercultural competence model developed by Deardorff in 2004 by exploring the nature of

The production process includes synchronizing files (platform and product assets) in the desired version from SCM, substitute variations of variation points,

Your social worker or care manager will talk to you about your back up plan and make sure this is written down as part of your assessment and shared with other the people who

A high resolution tunable diode laser spectrometer has been installed in the LENS II shock tunnel at CUBRC to study the effect of nozzle exhaust plume interaction with the flow