Databases and Information Management
Reading:
Laudon & Laudon
chapter 5
Additional Reading:
O'Brien & Marakas
chapters 3-4
Outline
Database Approach to Data Management
Database Management Systems
Improving Business Performance and Decision Making
Data Warehouse
Data Marts
Business Intelligence
¾ Database
Collection of related files containing records on people, places, or things
Prior to digital databases, businesses used file cabinets with paper files
¾ Entity
Generalized category representing a person, place, or thing on which we store and maintain information
Example → SUPPLIER, PART
¾ Attributes
Specific characteristics of each entity
Example → SUPPLIER: name, address
Example → PART: description, unit price, supplier
¾Organize Data into 2D Tables
Tables → Relations with columns and rows
One table for each entity
Example → CUSTOMER, SUPPLIER, PART, SALES
Fields (columns) store data representing an attribute
Rows store data for separate records
Key field: Uniquely identifies each record
Primary key:
One field in each table
Cannot be duplicated
Provides unique identifier for all information in any row
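The table and primary-key ideas above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the slide's actual table: only Supplier_Number comes from the slides, while the other column names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# One table per entity; each column (field) stores one attribute.
conn.execute("""
    CREATE TABLE SUPPLIER (
        Supplier_Number INTEGER PRIMARY KEY,  -- key field: unique per row
        Supplier_Name   TEXT NOT NULL,        -- assumed column name
        Supplier_City   TEXT                  -- assumed column name
    )
""")

# Each row is one record (sample values are made up).
conn.execute("INSERT INTO SUPPLIER VALUES (8259, 'Example Supplier A', 'Dayton')")
conn.execute("INSERT INTO SUPPLIER VALUES (8261, 'Example Supplier B', 'Cleveland')")

# A second row that reuses an existing primary key value is rejected.
try:
    conn.execute("INSERT INTO SUPPLIER VALUES (8259, 'Duplicate', 'Nowhere')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

supplier_count = conn.execute("SELECT COUNT(*) FROM SUPPLIER").fetchone()[0]
print(duplicate_rejected, supplier_count)
```

The DBMS, not the application, enforces uniqueness of the primary key, which is why the duplicate insert fails.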
¾ Relational Database Table
Relational Database
A relational database organizes data in the form of two-dimensional tables. Illustrated here is a table for the entity SUPPLIER showing how it represents the entity and its attributes. Supplier_Number is the key field.
¾ Part Table
Data for the entity PART have their own separate table. Part_Number is the primary key and Supplier_Number is the foreign key, enabling users to find related information from the SUPPLIER table about the supplier for each part.
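As a rough sketch of how the foreign key lets users find supplier information for a part, the example below uses Python's sqlite3 module. Part_Number and Supplier_Number come from the slides; the other column names, the sample rows, and the helper `supplier_for_part` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SUPPLIER (
        Supplier_Number INTEGER PRIMARY KEY,
        Supplier_Name   TEXT
    );
    CREATE TABLE PART (
        Part_Number     INTEGER PRIMARY KEY,
        Part_Name       TEXT,
        Unit_Price      REAL,
        Supplier_Number INTEGER REFERENCES SUPPLIER(Supplier_Number)  -- foreign key
    );
    INSERT INTO SUPPLIER VALUES (8259, 'Example Supplier A');
    INSERT INTO SUPPLIER VALUES (8261, 'Example Supplier B');
    INSERT INTO PART VALUES (137, 'Door latch', 22.00, 8259);
    INSERT INTO PART VALUES (150, 'Door seal', 6.00, 8261);
""")

def supplier_for_part(part_number):
    """Follow the foreign key from a PART row to its SUPPLIER row."""
    row = conn.execute(
        """SELECT SUPPLIER.Supplier_Name
           FROM PART JOIN SUPPLIER
             ON PART.Supplier_Number = SUPPLIER.Supplier_Number
           WHERE PART.Part_Number = ?""",
        (part_number,),
    ).fetchone()
    return row[0] if row else None

print(supplier_for_part(137))
```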
¾Establishing Relationships
Entity-relationship diagram → Used to clarify table relationships in a relational database
Relational database tables may have:
One-to-one relationship
One-to-many relationship
Many-to-many relationship
Requires creating a table (join table, or intersection relation) that links the two tables to join information
¾A Simple Entity Relationship Diagram
Relationship between SUPPLIER and PART
¾ Sample Order Report
The shaded areas show which data came from the SUPPLIER, LINE_ITEM, and ORDER tables. The database does not maintain data on Extended Price or Order Total because they can be derived from other data in the tables.
¾ Final Database Design with Sample Records
The final design of the database for suppliers, parts, and orders has four tables. The LINE_ITEM table is a join table that eliminates the many-to-many relationship between ORDER and PART.
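A minimal sketch of the join-table idea, using Python's sqlite3 module: LINE_ITEM holds one row per order and part pair, and Extended Price and Order Total are computed on demand rather than stored. The table is named ORDERS here because ORDER is a reserved word in SQL; that name, the column names, and the sample rows are assumptions, and SUPPLIER is left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PART (
        Part_Number INTEGER PRIMARY KEY,
        Part_Name   TEXT,
        Unit_Price  REAL
    );
    CREATE TABLE ORDERS (
        Order_Number INTEGER PRIMARY KEY,
        Order_Date   TEXT
    );
    -- Join table: one row per (order, part) pair.
    CREATE TABLE LINE_ITEM (
        Order_Number  INTEGER REFERENCES ORDERS(Order_Number),
        Part_Number   INTEGER REFERENCES PART(Part_Number),
        Part_Quantity INTEGER,
        PRIMARY KEY (Order_Number, Part_Number)
    );
    INSERT INTO PART VALUES (137, 'Door latch', 22.00);
    INSERT INTO PART VALUES (150, 'Door seal', 6.00);
    INSERT INTO ORDERS VALUES (3502, '2024-01-15');
    INSERT INTO LINE_ITEM VALUES (3502, 137, 10);
    INSERT INTO LINE_ITEM VALUES (3502, 150, 20);
""")

def order_report(order_number):
    """Return (line rows, order total); both are derived, not stored."""
    lines = conn.execute(
        """SELECT PART.Part_Number, LINE_ITEM.Part_Quantity, PART.Unit_Price,
                  LINE_ITEM.Part_Quantity * PART.Unit_Price AS Extended_Price
           FROM LINE_ITEM JOIN PART
             ON LINE_ITEM.Part_Number = PART.Part_Number
           WHERE LINE_ITEM.Order_Number = ?
           ORDER BY PART.Part_Number""",
        (order_number,),
    ).fetchall()
    total = sum(line[3] for line in lines)
    return lines, total

lines, total = order_report(3502)
print(lines, total)
```

Because one order can list many parts and one part can appear on many orders, each side relates to LINE_ITEM one-to-many, and no table has to hold a repeating list of values.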
¾ Entity-Relationship Diagram for the Database with four Tables
¾Normalization
Process of streamlining complex groups of data to
Minimize redundant data elements
Minimize awkward many-to-many relationships
Increase stability and flexibility
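To illustrate what minimizing redundant data elements means in practice, here is a small pure-Python sketch. It assumes a flat list in which supplier details are repeated on every part row and splits it into two smaller tables. The field names, sample values, and the `normalize` helper are hypothetical.

```python
# Unnormalized rows: supplier details are repeated on every part row.
flat_rows = [
    {"part": 137, "part_name": "Door latch", "supplier": 8259,
     "supplier_name": "Example Supplier A"},
    {"part": 145, "part_name": "Side mirror", "supplier": 8259,
     "supplier_name": "Example Supplier A"},
    {"part": 150, "part_name": "Door seal", "supplier": 8261,
     "supplier_name": "Example Supplier B"},
]

def normalize(rows):
    """Split flat rows into a SUPPLIER table and a PART table."""
    suppliers = {}  # Supplier_Number -> name, stored once per supplier
    parts = []      # each part keeps only the supplier's key
    for row in rows:
        suppliers[row["supplier"]] = row["supplier_name"]
        parts.append((row["part"], row["part_name"], row["supplier"]))
    return suppliers, parts

suppliers, parts = normalize(flat_rows)
print(suppliers)
print(parts)
```

After the split, a supplier's name lives in exactly one place, so changing it is a single update instead of one update per part.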
¾ Referential Integrity Rules
Used by relational databases to ensure that relationships between coupled tables remain consistent
Example → When one table has a foreign key that points to another table, you may not add a record to the table with the foreign key unless there is a corresponding record in the linked table
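The rule in the example can be sketched with Python's sqlite3 module. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other DBMS products enforce them by default. The sample rows and the `add_part` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT)"
)
conn.execute("""
    CREATE TABLE PART (
        Part_Number     INTEGER PRIMARY KEY,
        Part_Name       TEXT,
        Supplier_Number INTEGER NOT NULL REFERENCES SUPPLIER(Supplier_Number)
    )
""")
conn.execute("INSERT INTO SUPPLIER VALUES (8259, 'Example Supplier A')")

def add_part(part_number, name, supplier_number):
    """Insert a part; return False if the DBMS rejects the row."""
    try:
        conn.execute(
            "INSERT INTO PART VALUES (?, ?, ?)",
            (part_number, name, supplier_number),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_part(137, "Door latch", 8259))  # supplier 8259 exists
print(add_part(150, "Door seal", 9999))   # no supplier 9999 in SUPPLIER
```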
¾DBMS
Specific type of software for creating, storing,
organizing, and accessing data from a database
Separates the logical and physical views of the data
Logical view → How end users view data
Physical view → How data are actually structured and
organized
Examples of DBMS → Microsoft Access, DB2, Oracle Database, Microsoft SQL Server, MySQL (open source)
¾ HRD Database with Multiple Views
Tables can be combined to deliver the data that users require
Requirement → The two tables share a common data element
¾Operations of a Relational DBMS
Select
Creates a subset of all records meeting stated criteria
Join
Combines relational tables to present the user with more information than is available from individual tables
Project
Creates a subset consisting of columns in a table
Permits user to create new tables containing only desired information
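The three operations map directly onto SQL, as the following sketch with Python's sqlite3 module suggests. The tables follow the SUPPLIER and PART examples used earlier; the column names other than the two keys and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                       Unit_Price REAL, Supplier_Number INTEGER);
    INSERT INTO SUPPLIER VALUES (8259, 'Example Supplier A'),
                                (8261, 'Example Supplier B');
    INSERT INTO PART VALUES (137, 'Door latch', 22.00, 8259),
                            (145, 'Side mirror', 12.00, 8261),
                            (150, 'Door seal', 6.00, 8261);
""")

# Select: keep only the rows that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM PART WHERE Part_Number IN (137, 150) ORDER BY Part_Number"
).fetchall()

# Join: combine PART and SUPPLIER rows through the shared Supplier_Number.
joined = conn.execute(
    """SELECT PART.Part_Number, PART.Part_Name, SUPPLIER.Supplier_Name
       FROM PART JOIN SUPPLIER
         ON PART.Supplier_Number = SUPPLIER.Supplier_Number
       ORDER BY PART.Part_Number"""
).fetchall()

# Project: keep only the columns of interest.
projected = conn.execute(
    "SELECT Part_Number, Part_Name FROM PART ORDER BY Part_Number"
).fetchall()

print(selected)
print(joined)
print(projected)
```

In practice a single query usually combines all three: a WHERE clause selects rows, a JOIN brings in the related table, and the column list projects the result.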
¾ Three Basic Operations of a Relational DBMS
Database Management Systems
¾Capabilities of DBMS
Data Definition Capabilities
Specify the structure of the database content
Data Dictionary
Automated or manual file storing definitions of data elements and their characteristics
Query and Data Reporting
Data manipulation language → Structured Query Language (SQL)
Microsoft Access query-building tools
Report generation, example → Crystal Reports
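As a hedged illustration of data definition and of looking up stored definitions of data elements, the sketch below defines a table with Python's sqlite3 module and then asks SQLite to describe it through `PRAGMA table_info`. The column names and the `describe` helper are assumptions; other DBMS products expose this metadata through their own catalogs or tools.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: declare the structure of the table.
conn.execute("""
    CREATE TABLE SUPPLIER (
        Supplier_Number INTEGER PRIMARY KEY,
        Supplier_Name   TEXT NOT NULL,
        Supplier_City   TEXT
    )
""")

def describe(table):
    """List (field name, declared type, is primary key) for a table.

    The table name is inserted into the statement directly, so it must
    come from trusted code, never from user input.
    """
    # PRAGMA table_info returns: cid, name, type, notnull, dflt_value, pk
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type, bool(pk)) for _, name, col_type, _, _, pk in rows]

for field in describe("SUPPLIER"):
    print(field)
```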
¾ Access Data Dictionary Features
¾ Example of SQL Query
¾ An Access Query
¾ Object-Oriented Database
Relational DBMS → Designed for structured data in rows and columns
Not well suited to graphics-based or multimedia applications
Object-oriented Database
OODBMS → Stores data and procedures that act on those data as objects to be retrieved and shared
Usage → Manage multimedia components, Java applets for the Web
Relatively slow compared to relational DBMS
Hybrid object-relational DBMS → Provides capabilities of both types
¾ Databases
Improve performance and decision making
Tools:
Data warehousing
Multidimensional data analysis
Data mining
Utilizing Web interfaces to databases
¾Data Warehouse
Database that stores current and historical data that
may be of interest to decision makers
Consolidates and standardizes data from many
systems, operational and transactional databases
Data can be accessed but not altered
¾Data Marts
Subset of a data warehouse that is highly focused and isolated for a specific population of users
Can be constructed more quickly at lower cost
Example → A company might develop a marketing and sales data mart to deal with customer information
¾ Components of Data Warehouse
Using Database to Improve Performance
The data are combined with data from external sources and reorganized into a central database designed for management reporting and analysis. The information directory provides users with information about the data available in the warehouse.
¾ Business Intelligence
Tools for consolidating, analyzing, and providing access to large
amounts of data to improve decision making
Software for database reporting and querying
Tools for multidimensional data analysis (online analytical processing)
Data Mining
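Multidimensional analysis means viewing the same facts along different dimensions. The pure-Python sketch below totals hypothetical sales figures by any combination of product, region, and month. The data, the `DIMENSIONS` mapping, and the `summarize` helper are all assumptions made for illustration; real OLAP tools work on far larger data sets with precomputed aggregates.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units sold).
sales = [
    ("Nuts", "East", "Jan", 120),
    ("Nuts", "West", "Jan", 80),
    ("Bolts", "East", "Jan", 200),
    ("Nuts", "East", "Feb", 100),
    ("Bolts", "West", "Feb", 150),
]

# Position of each dimension inside a fact tuple.
DIMENSIONS = {"product": 0, "region": 1, "month": 2}

def summarize(facts, *dims):
    """Total units sold, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[d]] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

by_product_region = summarize(sales, "product", "region")
by_month = summarize(sales, "month")
print(by_product_region)
print(by_month)
```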
¾Data Mining
Finds hidden patterns, relationships in large databases
and infers rules from them to predict future behavior
Types of Information
Associations → Occurrences linked to a single event
Example → Coke is bought with chips 65% of the time, but 85% of the time when Coke is on promotion
Sequences → Events linked over time
Example → A house purchase is followed by a new refrigerator within two weeks 65% of the time, and by an oven within one month 45% of the time
Classifications → Patterns describing the group to which an item belongs
Example → Characteristics of customers who are likely to leave, so that retention campaigns can target them
Clusters → Discovering as yet unclassified groupings
Forecasting → Uses series of values to forecast future values
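The percentages quoted for associations are what data mining literature usually calls the confidence of a rule: of the transactions containing one item, the share that also contain another. A minimal sketch follows, with made-up baskets and a hypothetical `confidence` helper; it is not a full association-mining algorithm.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"chips", "coke"},
    {"chips", "coke", "salsa"},
    {"chips"},
    {"coke", "bread"},
    {"chips", "coke"},
]

def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)

print(confidence(baskets, "chips", "coke"))
```

Running the same calculation on baskets recorded during a promotion and comparing the two figures gives the kind of contrast described in the associations example.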
¾Data Mining
Applications in all functional areas of business
Government and scientific applications
Usage
Patterns in customer data → Identifying profitable customers or targets for one-to-one marketing campaigns
Predictive analysis → Uses data mining techniques, historical data, and assumptions about future conditions to predict outcomes of events, such as the probability a customer will respond to an offer or purchase a specific product
¾Privacy Concerns
Data mining can be used to create a detailed data image of each individual
¾Crime Fighting Weapon or Threat to Privacy?
¾Questions
What are the benefits of DNA databases?
What problems do DNA databases pose?
Who should be included in a national DNA database?
Should it be limited to convicted felons? Explain your answer.
Who should be able to use DNA databases?
Case Study – DNA Databases
¾ Databases and the web
Information from internal databases → Made available to customers
View product catalog, place order
Requests arrive as HTML commands → Translated into SQL for DBMS processing (database server)
Software that makes this possible
Web server
Application servers or CGI
Database server
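The chain from browser request to SQL and back can be sketched in a few lines of Python. The function below plays the role of the application server: it reads a parameter from the request URL, runs a parameterized SQL query against the database, and returns HTML. The PRODUCT table, the `/catalog` URL, and the `handle_catalog_request` helper are all hypothetical; a real deployment would sit behind an actual web server.

```python
import sqlite3
from html import escape
from urllib.parse import urlparse, parse_qs

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (Product_Id INTEGER PRIMARY KEY, Name TEXT,
                          Category TEXT, Price REAL);
    INSERT INTO PRODUCT VALUES (1, 'Door latch', 'hardware', 22.00),
                               (2, 'Door seal', 'hardware', 6.00),
                               (3, 'Floor mat', 'interior', 15.00);
""")

def handle_catalog_request(url):
    """Application-server step: turn a browser request into SQL, then into HTML."""
    params = parse_qs(urlparse(url).query)
    category = params.get("category", [""])[0]
    # Parameterized query: user input is never pasted into the SQL text.
    rows = conn.execute(
        "SELECT Name, Price FROM PRODUCT WHERE Category = ? ORDER BY Name",
        (category,),
    ).fetchall()
    items = "".join(f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows)
    return f"<ul>{items}</ul>"

print(handle_catalog_request("/catalog?category=hardware"))
```

The customer only ever sees the web page; the internal database and its SQL interface stay unchanged behind the application layer.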
Advantages of using the Web to access internal databases
Much less training needed for employees
Few or no changes in internal databases
Savings over redesigning and rebuilding legacy systems
¾ Policies and Procedures for Data Management
Information Policy
Organization’s rules → Sharing, disseminating, acquiring, classifying, and inventorying information
Example → Who has the right to change or view sensitive employee data
Data Administration
Database design and management group responsible for defining and organizing the structure and content of the database, and maintaining the database
Specific policies and procedures for data management
Responsibilities → Developing information policy, defining and organizing structure and content of database, planning for data, data dictionary development, overseeing logical database design
¾ Ensuring Data Quality
Poor Data Quality
Major problem for successful customer relationship management
About 20% of US mail and packages are returned because of incorrect names or addresses
Why Data Quality Problems?
Redundant and inconsistent data produced by multiple systems
Data input errors → Major source of data quality problems
Data Quality Audit
Structured survey of the accuracy and completeness of data
Data Cleansing
Detects and corrects incorrect, incomplete, improperly formatted, and redundant data
Specialized data cleansing software → Automatically surveys data files, corrects errors in the data, and integrates data into a company-wide format
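A toy version of a cleansing pass is sketched below in pure Python: it fixes formatting, sets aside incomplete rows, and drops redundant copies. The record layout, the five-digit US ZIP code rule, and the `cleanse` helper are assumptions for illustration; commercial cleansing software applies far richer rules.

```python
def cleanse(records):
    """Return (clean records, rejected records) from raw customer rows."""
    clean, rejected, seen = [], [], set()
    for rec in records:
        # Fix formatting: collapse whitespace, standardize capitalization.
        name = " ".join(rec.get("name", "").split()).title()
        postcode = rec.get("zip", "").strip()
        # Incomplete or badly formatted rows are set aside for review.
        if not name or not (len(postcode) == 5 and postcode.isdigit()):
            rejected.append(rec)
            continue
        key = (name, postcode)
        if key in seen:  # redundant copy of a record already kept
            continue
        seen.add(key)
        clean.append({"name": name, "zip": postcode})
    return clean, rejected

raw = [
    {"name": "  alice  smith ", "zip": "45402"},
    {"name": "Alice Smith", "zip": "45402 "},  # duplicate after cleanup
    {"name": "Bob Jones", "zip": "4540"},      # ZIP code too short
    {"name": "", "zip": "44101"},              # missing name
]
clean, rejected = cleanse(raw)
print(clean)
print(len(rejected))
```

Counting the rejected rows against the total is also a crude form of the data quality audit described above.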