Views and Indexing. CSCI 4380 Database Systems

(1)

Views and

Indexing

(2)

Views

•

A view is a query.

•

Views can be anonymous

SELECT * FROM

(SELECT id, name, username, password FROM student

WHERE password is null) R

WHERE R.id = 5 ;

•

The relation R above is an anonymous view (it is not known

outside of this query).

•

This query is combined with the remaining query to find the

(3)

Views optimization

•

For example, after optimization the query below

SELECT * FROM

(SELECT id, name, username, password FROM student

WHERE password is null) R

WHERE R.id = 5 ;

•

may become:

SELECT *

(4)

Views optimization

• It is best to use an anonymous view if the query cannot

be written without it.

• Otherwise, the optimizer may loose some optimizations

and rewritings of the query when views are used.

(5)

(non anonymous) Views

• Views can also be given a name. This allows them to be

used in many different queries.

CREATE VIEW nullpwd(id, name, username, password) AS

SELECT id, name, username, password FROM student

(6)

Views

• Attributes can be renamed in views.

CREATE VIEW facultyWCourse(faculty_id, name) AS

SELECT DISTINCT f.id, f.name FROM faculty f, class c

(7)

Using views in queries

• Views can be used in any query as if they were a table.

• Remember, views are just queries. No tuples are stored

for them.

SELECT *

FROM nullpwd WHERE id = 5 ;

• When executing this query, the query processor first

takes the query definition and replaces the query name

with its definition (getting the expression in slide 2).

(8)

Why use views?

•

Creating views allows the system designer to customize the application code so that:

•

The functionality for different users can be built on top of views.

•

For example, faculty cannot access financial

information of students and can only the information about the students who are currently taking a course from them.

•

Solution: Create a view for the students in a specific class which only includes the relevant attributes. The application code will be built on top of this view.

(9)

Other uses of views

•

Views can also be used to insert/update/delete tuples

instead of the table they are based on.

•

This builds on the philosophy of building functionality based on views.

•

However, this is only possible for a very restricted subset of views, called updatable views.

•

Updatable views are such that each tuple in the view maps to one and only one tuple in the table it is based on.

(10)

Updatable views

•

A view is updatable if:

•

It has only one table T in its from clause

•

It contains all attributes in that cannot be null

(11)

Updatable views

•

Example:

CREATE VIEW nullpwd(id, name, username, password) AS

SELECT id, name, username, password FROM student

WHERE password is null ;

(12)

Updatable views

•

Note: nullpwd does not store any tuples. This expression allows only those tuples of student that are accessible through view to be updated.

•

Furthermore, after the update, the resulting tuple may not even be in the view (unless the view is created with the CHECK OPTION).

(13)

Why use views?

•

Using views to create functionality hides data complexity from developers.

•

Also, if the data model changes, the application code does not have to change as long as the new model can be mapped to the same view.

(14)

Why not use views?

•

Writing a query using views may hide some

optimizations from the database, creating sub optimal query plans.

(15)

INDEXING

• Views do not improve performance,

sometimes they may even cause a loss of

performance.

• One way to improve performance is store

(cache) the result of some queries in the

database.

• Indexes are exactly that, cached results of

queries.

(16)

Indexing example

SELECT CrsCode, Grade

FROM Transcript

WHERE studId = 5 AND semester = ‘Fall’ AND year

= 2010

Answering this query requires reading all of

transcript from disk to find the matching tuples to

return. Note that the matching tuples will be few, 4-5

in this case. However, transcript is large.

(17)

Cost Analysis

• Let us some basic cost analysis.

• Suppose transcript is stored on disk in 100 disk pages.

Then, answering this query requires reading 100 disk

pages.

• Suppose instead we had an index that allowed us to

find all the matching tuples. Example:

• Index on Transcript (studid, semester, year)

• Then, answering this query will cost:

(18)

Indexing

• Indexes are just query results stored explicitly.

• They are also stored on disk, but can be cheaper to

use because:

• They have fewer disk pages as they store only a

subset of the attributes in the relation.

• They are stored in a way to make it easy to find

queries on specific values in the index (we will see

how).

• For now, we can assume that querying the index is

small for most queries.

(19)

Index cost/benefit analysis

• Indices are good if

• they reduce the cost of frequently asked

queries

• the reduction is considerable

• Indices must be kept up to date when the

tables change

• Indices increase the cost insert/update/delete

operations (at least one extra disk page access

(20)

What are good indices

• A good index will reduce the total number

of matching tuples to 1 or a few

• Examples: attributes in a key

• An index on

student(id)

will greatly

improve queries like

(21)

What are good indices

•

If the underlying relation is sorted with respect to some

attribute, then an index on that attribute will help performance.

•

Suppose, transcript tuples are sorted by semester, year.

•

Create an index on Transcript(semester, year)

•

Given the query:

SELECT CrsCode, Grade FROM Transcript WHERE semester = ‘Fall’ AND year = 2010

use the index to find the first tuple for Fall/2010, and then scan the Transcript relation starting from that point.