Topic List
Week 8: Databases and Tables Week 9: Databases II and ERD Week 10: Business Intelligence Week 11: Business Intelligence II Week 12: Enterprise Systems
Week 8 Databases and Tables
What is a database, and how does a relational database organize data?
Database definition
• A collection of data organised to service many applications at the same time by storing and managing data so that they appear to be in one location
• A group of related fields
• A collection of related information. The information held in the database is stored in an organised way so that specific items can be selected and retrieved quickly
What do databases do beyond storage?
• Data represented as one or more two-dimensional tables with columns and rows
• Search, add, amend, delete records
• Multi-user access, distributed access, speed, data quality, security, space efficiency
• Relational databases enable data to be stored within a number of different tables
All IS, whether it is an ERP system, or Facebook rely on a core repository to keep the data.
• The database is critical to the storage of transaction data that is vital to business operation.
• The database now can be used to provide this huge volume of accumulated transaction data for further analysis – which in turn produces this new, highly valued commodity of business intelligence.
What are the characteristics of a relational database and how to structure one?
The concept of the relational model is very simple:
A relational database presents its data to the user one, or more two-dimensional tables. This simply means that each table is shown to a user, human or software, as rows and columns.
For a relational database to work efficiently, there will always be multiple tables. The idea of one huge table within a relational database is simply not workable.
Organise data into two-dimensional tables (relations) with columns and rows One table for each entity:
E.g., CUSTOMER, SUPPLIER, PART, SALES
Fields (columns) store data representing an attribute Rows store data for separate records, or tuples
Key field: uniquely identifies each record (primary key)
Entity: is anything of interest to the users of the database. An entity is a person, place, thing or event to be recorded in the database. Entities will almost always occur as the nouns, or naming words, in the user’s description of their work e.g. in a university environment, a user may speak of students, courses, lecturers, and degrees. They are not an instance (e.g. an individual person).
Attribute: a specific characteristic or quality of a particular entity e.g. for a table on “Students”, characteristics would include student_name and student_address to distinguish one instance of a student from another we give them a different name, address, and so on. A unique one, or unique combination of two or more attributes will form the primary key of our entity, and later the table in the database.
Record: a group of related fields (e.g. a row), an instance of something.
Field: A field describe some detail or characteristic of a record (e.g. a single cell).
Relational database keys – Primary keys
• A field (e.g. Student_number) that uniquely identifies each record in a table.
• A column or group of columns that identifies a unique row in a table.
• Must be unique, there can be no duplication of attributes in the column chosen as the primary key for the table
• Usually the primary key is the name of the table or an identifier e.g. ID, reference, code etc.
• There is usually one primary key per table; unless several columns/fields are used – this is called a compound primary key.
• Each table must have a primary key – it is not possible to create a table without specifying a primary key.
• The database overall is not operational without a primary key for each table.
Relational database keys – Foreign keys
• They are needed to link or form relationship between the tables within the database
• The foreign key always links to a primary key in another table
• There may be 0, 1 or several foreign keys in a table
• A field that is used to link tables, by linking to a primary key in another table
• The foreign key is used to link tables together by referring to the primary key in another database table.
• Primary keys are reasonably simple. A primary key must be unique, there can be no duplication of attributes in the column chosen as the primary key for the table. Foreign keys are a little more obscure as a concept
What are the principles of a database management system?
• Allows users to communicate with a database
• Software for creating, storing, organizing, and accessing data from a database • Provides general purpose tools and utilities for producing and extracting data.
• Enables non-technical users to access data.
• Allows users to create new records in the database, update existing records in the database, read and display existing records in the database, and finally, delete an existing record or records within the database.
• Separates the logical and physical views of the data
• Logical view: how end users view data
• Physical view: how data are actually structured and organized
• Examples: Microsoft Access, D B 2, Oracle Database, Microsoft S Q L Server, MySQL, IBM, SAP, Teradata. Microsoft Access is a popular DBMS for personal computers.
• For a user – which is either a human user or a software user, to communicate or talk to the database, both the database and the user must speak the same language.
Week 9 Databases / ERD
Data model – a ‘blueprint’ to build a database
The logical design models the database from a business perspective.
The physical design is concerned with how data are organized and structured on physical storage media, such as a hard disk.
Data model and entity relationship diagram
Data model: A diagram that represents the main items of interest to us – we call these entities in the database - and the relationships that connect the entities (the main items of interest)
Analysts/Designers use an entity-relationship diagram or model (ERD)
There are several variations on the ERD theme (e.g. UML) but we use “crows foot notation”
The data model (ERD) rely heavily upon visualisation and not upon a textual, word-based description.
This is because we want to show diagrams of our data and the ways our data will be used.
Visualisation is a great way to understand our design concepts, just like with the deliverable – the
‘blueprint’, of the architect.
ERDs are intuitively simple, with some practice. At its foundation, an ERD has the entity and the relationship.
Why data modelling and ERD?
Business analysts review ERDs to make sure they reflect business needs. e.g.:
Should comments on videos be by one member only? That would work, but suppose an database used the same approach to record students visiting a professors office and the topics they discussed.
Is it reasonable for each office visit to be by only one students? What if students working in a group came together? A many-to-many relationship between students and office visits might be better.
Understanding the structure of a database is a good basis for developing it.
Business analysts and database designers use ERDs to communicate their ideas when they design database for business needs.
You will use databases during your career. You will use DBMS, work with database administrators, and therefore you need to know how they fit into business and information systems picture.
Communicating ideas through an ERD
Analysts use ERD to communicate their ideas when they are designing a database.
It helps to visualise and show other stakeholders the rules inscribed in the database.
Changing ideas on paper/in a model is much easier than changing a flaw after the database has been built.
The more feedback the better the refined ERD will be.
Just like an architect getting feedback on their design.
ERD definitions
Entity: is anything of interest to the users of the database. Entities will almost always occur as the nouns, or naming words, in the user’s description of their work e.g. in a university environment, a user may speak of students, courses, lecturers, and degrees. They are not an instance (e.g. an individual person). Entities are rectangles.
Attribute: a specific characteristic or quality of a particular entity e.g. for a table on “Students”, characteristics would include student_name and student_address. A unique one, or unique combination of two or more attributes will form the primary key of our entity, and later the table in the database.
Record: a group of related fields (e.g. a row), an instance of something.
Field: A field describe some detail or characteristic of a record (e.g. a single cell).
Foreign key: A field that is used to link tables, by linking to a primary key in another table
Relationship: connects two entities. Relationships are almost always defined by the business rules of the organization. The line signifies that the entities may be used together in the operational database.
Relationships are lines that connect two entities.
Relationships and cardinality
Notations are used to describe the minimum and maximum numbers involved in the relationship.
This ‘number’ is referred to as the cardinality of the relationship.
The minimum cardinality is represented by a zero or a single vertical line. This zero or vertical line is sometimes called the ‘modality’ or the ‘existence’ of the relationship.
This first zero or vertical line is followed by either a single vertical line or a ‘crows foot’ notation - three diverging lines. This represents the maximum cardinality, that is, the range zero up to many.
Together, the two notations represent the cardinality.
The symbol nearer the entity gives the maximum cardinality of the relationship.
The further symbol from the entity gives the minimum cardinality. If the minimum is one, there must always be an entity of that type in this relationship. If the minimum is zero, having one is optional Cardinality
Number of entities in a relationship (min to max) Represented by a zero or a single (vertical) line Followed by a single (vertical) line or ‘crows foot’
How do we build an ERD?
First, we development the preliminary data model. We identify entities and relationships – we engage all parties involved in the project. With this information, the business analyst creates the preliminary data model. This model will be further developed as we proceed to analyse and confirm the information we need.
The preliminary data model does not show the detailed structure of attributes and the data types of the attributes. Indeed, the preliminary data model does not show attributes at all. And at this early stage, we are not even thinking about primary and foreign keys.
The preliminary data model may also show relationships that will need to be reconsidered or
‘redesigned’ before the physical build can be performed (e.g. many to many relationships). We shall come back to this issue – remember, the process is iterative!
Let’s consider a simple a few examples – (i) football/soccer, (ii) video streaming and (iii) a motor vehicle ownership/insurance context which needs a database implementation.
ERDs – further analysis of entities and relationships. Entity types:
domain’ and ‘linking’ entities
A ‘domain’ entity type is an entity that describes a core business element of a database e.g. ‘Employee’
and ‘Course’ are core elements in a university database.
Domain entity types are very common in any database. However we must be very careful of how we link these entities.
The relationship that links ‘Employee’ and ‘Course’ in our example is a ‘many to many’ connection.
This relationship cannot be implemented in any Relational DBMS – and therefore we should not leave this relationship in the ERD.
Many to many relationships present implementation problems
Linking (the book refers to them as “join tables” or “intersection relation”)
The solution to any ‘many to many’ relationship is to introduce a ‘linking’ entity – sometimes called a
‘bridging’ or ‘associative’ entity.
This linking entity changes the cardinality of the relationships – in our example the two relationships now become ‘one to many’. This relationship can now be implemented in our DBMS.
The composite primary key and the foreign keys that now operate within the linking entity. These are formed from the two primary keys of the ‘domain’ entities.