Designing and Building
Access Database Systems
Edition 7, October 2009
Mark Gregory
Designing and Building Access
Database Systems
Mark Gregory
École Supérieure de Commerce de Rennes
ESC Rennes School of Business, France
Previously,
Edition 0 School of Computing and Engineering, University of Huddersfield
March 2000
Edition 7 ESC Rennes School of Business,
France October 2009
Work on this document started when Mark Gregory was working at the University of Huddersfield, UK in 1999/2000. Some of the material included was originally written by my then colleague Dr. Steve Wade, who is still at Huddersfield. Other portions draw on work by Dr. Ken Lunn, who has moved on to the IT Directorate of the UK National Health Service.
The Sixth Edition was a very substantial revision of the previous year’s edition, as is this Seventh Edition.
DESIGNING AND BUILDING ACCESS DATABASE SYSTEMS
...
2
MARK GREGORY
...
2
ÉCOLE SUPÉRIEURE DE COMMERCE DE RENNES
...
2
ESC RENNES SCHOOL OF BUSINESS, FRANCE
...
2
PREVIOUSLY,
...
2
Edition 0 ... 2
School of Computing and Engineering, University of Huddersfield ... 2
March 2000 ... 2
Edition 7 ... 2
ESC Rennes School of Business, France ... 2
October 2009 ... 2
1. INTRODUCTION: WHO IS THIS DOCUMENT FOR?
...
13
1.1. Preface ... 13
1.2. Skills required ... 13
1.3. The aims of the remainder of this document ... 13
1.4. The structure of this document and how to use it ... 14
1.5. About Learning Access ... 15
1.5.1. Starter: the naïve user: Level 1 ... 15
1.5.2. The thinking user: Level 2 ... 15
1.5.3. The competent (“power”) user: Level 3 ... 15
1.5.4. Advanced: the programmer or systems integrator: Level 4 ... 15
1.5.5. What is the relevance of MS Access skills? ... 15
1.6. Testing yourself ... 15
1.6.1. What you should already know ... 16
1.6.2. What you should already be able to do or need now to learn ... 16
1.6.3. A checklist of more advanced skills ... 17
1.6.4. Further help on learning to learn ... 17
1.7. Conventions Used in this document ... 17
1.8. Limitations ... 17
1.9. Acknowledgements ... 18
2. A BRIEF INTRODUCTION TO DATABASES
...
19
2.1. Databases (bases de données) and how they are designed ... 19
2.2. Models used by systems analysts ... 19
2.3. Simple model of data processing ... 19
2.4. Why study Databases? ... 20
2.4.1. They are used in every significant BIS ... 20
2.4.2. They are at the heart of ... 20
2.5. Background ... 20
2.6. How to store data ... 20
2.7. What is a database? ... 21
2.8. Why keep data in different tables? ... 21
2.9. Learning a minimum about database ... 23
2.10. Basic concepts ... 23
2.11. Entity (type, class) ... 24
2.12. An example: Students by Programme ... 25
2.13. Attribute ... 25
2.14. Primary and Foreign Keys ... 25
2.15. Why have foreign keys? ... 26
2.16. Entity occurrences – student ... 26
2.17. Entity occurrences – programme ... 26
2.18. Queries ... 26
2.19. Students and Modules ... 27
2.20. Three Vital Rules ... 27
2.21. Resolving Many-to-Many Relationships ... 27
2.22. Towards a more complete entity relationship attribute model ... 30
2.23. Example Query – definition ... 31
2.24. Example Query – results ... 31
2.25. What is a query? (French ‘une requête’) ... 32
2.26. Data Dictionary ... 32
2.27. Entity-Relationship Diagrams (ERD) ... 32
2.29. Relationship: meaning and characteristics ... 33
2.30. Why it’s important to relate entities ... 34
2.31. Degree of relationship (simplified) ... 34
2.32. Simple database design ... 34
3. A SIMPLE METHODOLOGY FOR DESIGNING MICROSOFT ACCESS
DATABASES
...
35
3.1. Introduction – background to the methodology ... 35
3.1.1. Database, or something else? ... 35
3.1.2. What is a methodology? ... 35
3.1.3. Assumptions ... 36
3.1.4. Introduction to modelling business information systems: why we have chosen certain techniques ... 36
3.1.5. What we’re trying to achieve together ... 37
3.1.6. Business Process Modelling: Documenting a Business Process ... 37
3.1.7. Why have we chosen the techniques we have? ... 37
3.1.8. Business Process Modelling ... 38
3.1.9. SSADM ... 38
3.1.10. MERISE ... 39
3.2. Feasibility study ... 39
3.3. Set out Project Terms of Reference ... 40
3.4. Analyse the needs of users ... 40
3.4.1. Identify business processes using a high-level Use Case diagram ... 40
3.4.2. Identify detailed requirements for a process to be computerised: carry out Process Modelling ... 40
3.5. Decide the purpose and basic contents of the database – Data Modelling ... 41
3.5.1. Basic Constructs of ER Modelling ... 41
3.5.2. Deciding entity types ... 41
3.5.3. Entities ... 41
3.5.4. Relationships ... 42
3.5.5. Fields: What are the attributes of each entity? ... 43
3.5.6. Data type: Domain ... 44
3.5.7. Identify Domains ... 44
3.5.8. Classifying Relationships ... 45
3.5.9. Keys: primary and secondary (“foreign”) ... 46
3.5.10. Normalisation ... 48
3.5.11. ER Notation ... 49
3.5.12. Online tutorial ... 50
3.5.13. DFDs and ERDs – why both? How are they linked? ... 51
3.5.14. Why BOTH Data and Process models? ... 51
3.6. Cross-check: entity life history ... 51
3.6.1. Cross-check DFD and ERA ... 51
3.6.2. Time dimension ... 51
3.7. Model User <-> System Interactions ... 52
3.8. Define required outputs: reports, forms, queries ... 52
3.9. How will Input / Update be carried out (Forms etc.)? ... 52
3.10. Work through your design on paper, whiteboard, etc. ... 52
3.11.1. System data processing ... 53
3.12. Define (“design” in Access terms) the database: Build a prototype ... 53
3.13. Refine / iterate / implement ... 53
3.14. Test the database ... 53
3.15. Obtain User Feedback ... 53
3.16. Refine the system by Iteration ... 54
4. PUTTING DATABASE DESIGN THEORY INTO PRACTICE
...
54
4.1. Design aids ... 54
4.2. An Exercise ... 54
4.3. Achieving real competence in Database Design ... 55
4.3.1. Documented scenarios ... 55
4.3.2. Suggested but undocumented scenarios ... 56
4.3.3. Further study ... 56
5. MORE ABOUT DATABASES
...
56
5.1. What is a database? ... 56
5.2. The history of databases ... 57
5.3. Implementing data models in MS Access ... 57
5.4. What is a database management system? ... 58
5.5. First challenge: database design ... 58
5.6. Second challenge: database implementation ... 59
5.7. An inductive approach ... 59
5.8. What Is a Database? ... 59
5.9. What Is a DBMS? ... 60
SECTION 2 – USING MICROSOFT ACCESS TO BUILD GOOD DATABASES
....
61
6. INTRODUCTION TO MICROSOFT ACCESS
...
61
6.1. What is a database management system? ... 61
6.1.1. Software which manages a database ... 61
6.1.2. Implements entities as tables, maintaining and enforcing relationships ... 61
6.1.3. Deals with all the component disc files ... 61
6.1.4. Provides functions such as ... 61
6.1.5. An approachable programming language ... 61
6.2. Important facilities of more advanced DBMS ... 61
6.3.1. Other RDBMS ... 63
6.4. Why we want business students to learn Access ... 63
6.4.1. The relative ease-of-use of MS Access ... 64
6.4.2. MS Access is easily obtained ... 64
6.4.3. MS Access supports usable programming languages ... 64
7. MS ACCESS IMPLEMENTATION OF DATA MODELS
...
64
7.1. Tables, one per entity type ... 64
7.2. Fields, one per attribute ... 64
7.3. Records, one per entity occurrence ... 64
7.4. Attribute types in MS Access ... 65
7.5. Permitted data types in MS Access ... 65
7.5.1. Use of Number or Currency fields ... 67
7.5.2. Storing telephone numbers ... 67
7.5.3. Controlling data entry formats with masks ... 67
7.6. Keys ... 67
7.6.1. Candidate keys ... 67
7.6.2. Primary key ... 67
7.6.3. Multi-part primary keys ... 68
7.6.4. Entity integrity rule ... 68
7.6.5. Foreign keys ... 68
7.7. Relationships ... 68
7.7.1. Relationships and linking: Enforcing referential integrity where appropriate ... 70
7.8. System outputs ... 70
7.8.1. Queries ... 70
7.8.2. Reports ... 70
7.8.3. Forms ... 70
7.9. System inputs ... 71
7.9.1. Forms, sub-forms and their use with 1: M and M: N relationships ... 71
7.9.2. Field-specific validation checks ... 71
7.9.3. Using relational integrity to carry out inter-table validation checks ... 71
7.9.4. Table-level checks on forms ... 72
7.10. Implementing processes ... 72
7.10.1. Data processing in Access ... 72
7.10.2. Functional elements in Access ... 72
7.11. System data transformations ... 73
7.11.1. Append and Update queries ... 73
7.11.2. Macros ... 73
7.11.3. Visual Basic for Applications (VBA) modules inside Access ... 73
7.11.4. Visual Basic programs outside Access ... 73
8. WAYS IN WHICH TO LEARN MORE MS ACCESS
...
74
8.1. Sample databases and applications included with Microsoft Access ... 74
8.1.1. NorthWind Traders sample database (English edition) / Les Comptoirs (édition française) ... 74
SECTION 3 – THE ANYTOWN DISTANCE LEARNING BUSINESS SCHOOL
EXAMPLE
...
75
9. EXAMPLE SCENARIO: ANYTOWN DISTANCE LEARNING BUSINESS
SCHOOL
...
75
10. BACKGROUND: STUDYING
...
75
11. A CLOSER LOOK INTO "MANAGING STUDENTS"
...
75
12. THE PROCESS OF DECIDING WHAT HAPPENS TO STUDENTS
...
76
13. COURSE REVIEW
...
76
14. SIMPLIFYING ASSUMPTIONS
...
77
15. EXTERNAL ENTITIES
...
77
16. PROCESSES
...
77
16.1. Process Applicants ... 77
16.2. Admit students to Course – Course Enrolment ... 77
16.3. Register students on core and optional modules ... 77
16.4. Teach and assess a module ... 77
16.5. Prepare for and hold exam board (jury) ... 77
Collect together the results for all students for all modules they have been studying ... 77
Review module results in exam board ... 77
Decide student status in exam board ... 77
16.6. Review Course ... 77
17. DOCUMENTS
...
78
17.1. Course Description ... 78
17.1.1. List of Modules ... 78
17.2. Management Reports ... 78
18. ENTITY AND ATTRIBUTE LISTS
...
78
19. EXAMPLE STUDENT RECORD REPORT
...
79
20. ANYTOWN HIGH-LEVEL USE CASE DIAGRAM
...
80
22. LEVEL 1 DFD
...
82
23. EXAMPLE LEVEL 2 DFD
...
83
24. DATA DICTIONARY
...
84
24.1. Data dictionary for Anytown Business School ... 84
25. ANYTOWN ER DIAGRAM
...
96
26. ANYTOWN SYSTEM IMPLEMENTATION
...
97
27. TERMINOLOGY ASSOCIATED WITH DATA MODELLING AND DATABASE
DESIGN
...
97
28. REFERENCES
...
98
28.1. Basics of structured analysis ... 98
28.2. Database theory ... 98
28.3. DataFlow Diagrams (DFDs) ... 98
28.4. Entity relationship modelling ... 98
28.5. Use Case ... 98
28.6. Basics of Object Oriented Analysis and Design (OOAD) ... 99
29. APPENDIX 1 BUSINESS PROCESS ANALYSIS USING USE CASE
ANALYSIS
...
100
29.1. What is a Use Case Diagram? ... 100
29.2. What to do if a use case diagram won’t fit on a single page? ... 103
29.3. Finding Use Cases ... 103
29.4. Naming Use Cases ... 104
29.5. Describing Use Cases ... 105
29.6. Using Use Cases to identify System Inputs and Outputs ... 105
29.7. Other resources for learning about Use Cases ... 105
30. APPENDIX 2 DATA FLOW DIAGRAMS
...
106
30.1. What are Data Flow Diagrams (DFDs)? ... 106
30.2. Why use Data Flow Diagrams? ... 106
30.4. The components of a DFD ... 107
30.5. What appears on a DFD? ... 108
30.5.1. Listing the elements of a DFD ... 109
30.6. The Data Flow Diagram Symbols – SSADM Notation ... 109
30.7. Making a Data Flow Diagram: a Top-Down Approach ... 109
30.8. The elements of a DFD ... 110
30.9. Creating DFDs ... 110
30.10. First List the Elements of the Data Flow Diagram ... 112
30.11. Drawing the Context Diagram ... 112
30.12. Expanding a context diagram to give a level 1 DFD ... 112
30.13. Questions to ask yourself ... 113
30.14. Rules for DFDs ... 113
30.15. Some points on logical DFDs ... 113
30.16. Supporting documentation ... 113
30.17. Summary: “levelled” DFDs ... 114
31. APPENDIX 3 WHEN TO USE A SPREADSHEET, AND WHEN TO USE A
DATABASE
...
115
31.1. Introduction ... 115
31.2. Spreadsheets versus databases ... 115
31.2.1. What spreadsheets are good at ... 115
31.2.2. What databases are better at ... 115
31.2.3. Using spreadsheets and database together ... 116
31.2.4. Summary ... 116
31.3. What to do if your spreadsheet skills are weak ... 117
31.4. What to do if your database skills are weak ... 117
31.5. Conclusion ... 118
31.6. Acknowledgements – bibliography for Appendix 31 ... 118
1. APPENDIX 4: REASONS WHY A DATABASE IS TO BE PREFERRED TO A
SPREADSHEET - SPREADSHEET DOES NOT EQUAL DATABASE
...
119
1.1. More Than a List ... 119
1.2. Create the Database ... 120
2. APPENDIX 5: ACCESS HINTS - DESIGNING FOR USE
...
124
2.1. Getting more help ... 124
2.2. Unlocking the power of many-to-many relationships ... 124
2.3. Some difficulties associated with forms and subforms and how to overcome them ... 128
2.4. Subform not updated ... 128
2.5. Detail subform does not show the subset of records based on the value of the current master form record ... 130
3. APPENDIX
6: NORMALISATION
...
133
3.1. Introduction to Normalisation ... 133 3.2. Introduction ... 133 3.3. Preliminary remarks ... 134 3.4. Terminology ... 134 3.4.1. Records ... 134 3.4.2. Field names ... 134 3.4.3. Keys ... 1343.5. The various stages of normalisation ... 137
3.5.1. Convert data into unnormalised form (UNF, 0NF) ... 137
3.5.2. Convert UNF into First Normal Form (1NF) ... 137
3.5.3. Convert 1NF into Second normal form (2NF) ... 137
3.5.4. Convert 2NF into Third normal form (3NF) ... 137
3.6. Further normalisation ... 138
3.7. A full example of normalisation ... 138
3.7.1. Step 1 - Convert data into UNF ... 138
3.7.2. Step 2 - Convert data into 1NF ... 139
3.7.3. Step 3 - Convert data into 2NF ... 139
3.7.4. Step 4 - Convert data into 3NF ... 140
3.8. Normalisation: A Summary ... 141
3.9. Normalisation complements top-down entity-relationship modelling ... 141
3.10. What is achieved by normalisation? ... 142
3.11. How is normalisation used in practice? ... 142
3.12. Still confused? ... 142
3.13. Some questions with which to check your understanding ... 142
4. APPENDIX 7 INSTALLING AND USING MICROSOFT VISIO
...
146
4.1. Introduction ... 146
4.2. Visualize complex information to better understand it ... 146
4.4. Creating DFDs using Visio ... 147 4.5. Installing SSADM support ... 147
5. APPENDIX 8 STRUCTURED WALKTHROUGHS, A WAY TO IMPROVE THE
QUALITY OF ANALYSIS
...
149
5.1. How to seek for perfection! Improving the quality of our work ... 149 5.2. References for Appendix 8 ... 150
1.
Introduction: Who is this document for?
1.1. Preface
The booklet aims to help you learn how to design and build applications using Microsoft Access. This document is written to be read and understood as you are working on your own design and build experiments.
This Access database design and implementation document is a higher-level self-instruction booklet; it is assumed that you are already a fairly competent Access user.
If you need to learn how to use Microsoft Access, please see section 1.6 for further advice.
1.2. Skills required
Modern relational database management systems such as Microsoft Access have been designed to enable users to get as far as is reasonably possible without needing software design and construction (“programming”) skills. Four basic levels of skill can be recognised in database use. These are:
♦ LEVEL 1 – Database User
Straightforward data input, amendment and querying, such as might be undertaken by a clerical or professional worker who is expected to capture and use data as a small part of their every day work;
♦ LEVEL 2 – Database Builder
Basic database implementation skills, including design of simple databases and implementation of the design as a series of tables, queries and reports; such skills might be anticipated in a professional worker in an office environment who has some responsibility for the basic information systems (IS) needed in that office, but whose primary job responsibility is not IS-oriented;
♦ LEVEL 3 – Database Administrator
Real database design and implementation competence. You would expect this in an information systems professional. But this same higher level of competence may also be found in certain business-oriented individuals who take a real pride in using computers to their full potential. Such individuals are sometimes referred to as power users. The work of such an individual includes serving the needs of other clerical and professional office workers by undertaking detailed analysis, design and implementation work and creating systems usable by other office workers and business professionals.
♦ LEVEL 4 – Database Professional Expert user with programming skills.
1.3. The aims of the remainder of this document
The main aim of asking you to work through the remainder of this document is to link the following topics:
♦ To help you to learn the principles of modelling
information systems in an experiential, problem-oriented way
and not just a theoretical one. (Corresponds to Level 1 above)
to give you reasonable skills in the analysis and
construction of effective, albeit small-scale, computerised business information systems. (Corresponds to Level 2
above)
♦ If you want really to exploit the power of databases and systems and / or you aspire to the challenge of managing information systems professionals: To help you reach the
point where you can analyse a user's requirements, design them a solution, and refine the solution by means of building a working prototype in MS Access.
(Corresponds to Level 3 above)
♦ If you are a budding IS professional, or wish to become a systems analyst or consultant: This document is a starting point only – you will need specific additional training and
experience. (Corresponds to Level 4 above)
Note that material which only applies to Level 3 or above is shown in grey-background Arial Narrow, like this paragraph.
1.4. The structure of this document and how to use it
This document consists of:
♦ Section 1 – The Principles of Database
∗ Data modelling using ERM
∗ Databases
♦ Section 2 – Using Microsoft Access to build good databases
∗ System implementation using Microsoft Access
♦ Section 3 – The Anytown Distance Learning Business School example
∗ A fully worked example of the analysis and design of a system for a virtual enterprise
∗ However, please note that this example does not enter into the business-oriented aspects of the assignment you are doing
♦ Appendix 1 Use Case analysis
∗ User interaction modelling using Use Case scenarios
♦ Appendix 2 Data Flow Diagrams
∗ Process modelling using DFD data modelling
♦ Appendix 3 When to use a spreadsheet, and when to use a database
♦ Appendix 4 Reasons why a database is to be preferred to a spreadsheet
♦ Appendix 5: Access Hints – Designing for Use
♦ Appendix 6: Normalisation
♦ Appendix 7 Installing and using Microsoft Visio
quality of your analysis
All readers should start with Section 1. Then read the rest of the document, but ignoring this kind of text. Later, reread the document including text like this.
1.5. About Learning Access
You should work your way through the following stages:
1.5.1. Starter: the naïve user: Level 1
You should already be at (or perhaps beyond) this stage. If you aren’t – learn how to use Access now! This Designing and Building Access Database Systems guide cannot help you to learn these basic skills, which you are assumed already to have – but may have to acquire, revise and practise them, at the same time as you are reading this booklet. You’ll find a checklist just below, in section 1.6.
1.5.2. The thinking user: Level 2
The business specialist who nevertheless thinks carefully about how s/he can best use a computer to help them to get their work done, or who spots a new application area or ICT-related business opportunity.
Working through this document, and using the facilities of each Office programme just a little bit more each time, should get you to about this stage.
1.5.3. The competent (“power”) user: Level 3
This is the person who becomes known as the person to whom to talk when no-one else in the department or office seems to know what to do! This is the person who has mastered spreadsheets and uses them frequently, and who knows when to use a database.
1.5.4.
Advanced: the programmer or systems integrator: Level
4
Further competence in Access will require you to begin to use the power of the VBA programming language and to understand SQL. This subject is beyond the scope of this module, and is NOT expected in ESC business students.
1.5.5. What is the relevance of MS Access skills?
The main reason for advising business students to learn Access is that it is possible using Access to build reasonably powerful Information Systems (IS) with a tool which is reasonably straightforward (if not always easy!). In effect, you are building a Prototype system using an End User Computing tool. At the same time, you are consolidating what you have been taught in first and second year modules. All this is essential if you are to achieve the learning outcomes of the module that you are studying.
Facility in Access is itself a marketable skill. You should find it much easier to obtain certain internships or placements as a result of the fact that you know industry standard software like Access. In addition, Access is a reasonably complete
implementation of the theoretical relational database model originally defined by Edgar Codd and popularised by many authors (notably, Chris Date—see Date 2003). Relational databases are a very powerful way to structure data and to be able to get the information you need as a future manager.
In this section, we summarise what we consider to be the basic knowledge and ability you need to have in Microsoft Access.
The most important first step is to take a first step! Get hold of a copy of Access and start to
use it. As you do so, tick off the various things on the list below. You can start reading this
Designing and Building Access Database Systems guide in parallel, but please understand that you cannot understand what is in this book without actually testing your practical ability and knowledge.
1.6.1. What you should already know
We have already revised or introduced the following concepts:
♦ What are Tables?
♦ Designing a Table
♦ Keys: primary and secondary (“foreign”)
♦ Relationships and linking
1.6.2. What you should already be able to do or need now to learn
You should aim at the following practical competences, which you may have acquired in the first year at ESC Rennes, or which you may now need to learn:
Competence Tick when
you can do this Fundamental skills ∗Starting Access ∗Creating a database ∗Creating a Table ∗Adding Data ∗Creating a Query
∗Adding a second table
∗Linking tables with a relationship – establishing foreign keys
Forms – basic concepts
∗Creating a form based on a table using the form-building assistant / wizard
∗Changing the design of the form
∗Adding records using a form Reports – basic concepts
∗Creating a report
∗Creating a report based on a query Relationships
∗Creating a relationship between tables
∗Creating a query which uses linked tables
one-to-many (1:M) relationships
∗Many-to-many relationships and multi-part primary keys
Forms – more advanced use
∗Using list and combo boxes
∗Combo boxes (zones de liste déroulantes) and subforms (sous-formulaires)
∗Creating a subform (sous-formulaire)
∗Inserting a subform into a main form
∗Subforms of subforms
∗Adding record navigation buttons
1.6.3.A checklist of more advanced skills
If you are aiming at Level 3 or Level 4 competence, you will need to achieve:
Competence Tick
when you can do this ∗Competence in update and append
queries
∗Advanced data validation techniques
∗Basic competence in Visual Basic programming
∗Basic competence in SQL (structured query language)
∗Dealing with problematic many-to-many relationships
1.6.4. Further help on learning to learn
See appendix 1 for a very basic introduction to Microsoft Access, and appendix 31.4 for some suggested websites.
1.7. Conventions Used in this document
Points which are significant only to more advanced users are indicated like this paragraph.
VITAL POINTS are indicated like this!1.8. Limitations
This document is aimed at people who are comparatively new to systems analysis and design, and who are not aiming to be experts in that field. It therefore aims to be useful and usable without necessarily being totally complete. Where a conflict exists between being totally comprehensive (but unnecessarily difficult), and being comprehensible and straightforward, the second approach is adopted. The aim is to exclude material which is extraneous in the
sense that most business people, and indeed many analysts, do not need to consider it. Therefore complex issues such as ternary relationships are ignored. Instead, the main issues are concentrated on and the reader is encouraged to understand them and to apply them. Once the reader is comfortable with the approach adopted in this document and has achieved some real competence in database design and implementation, he/she can read more advanced texts and tackle the more difficult issues. Until then, the slightly simplified (but never facile) approach adopted in this document is a sensible compromise.
Microsoft Access is very unusual as a database management package in that it is intended both to be very useful to people who are new to database, and also to offer the full power of a
programmable system to more advanced users. Microsoft Access aims to make the gap between intermediate and advanced use as small as it can be, because it provides both macros and Visual Basic for Applications (VBA). Macros can be used to automate repetitive sequences of commands or instructions, such as those needed to open a form without having first explicitly to open the database window. VBA is a full programming language, and it can be used for relatively complex tasks such as advanced field validation, and also for dealing with anticipated errors and automatically recovering from them. VBA is a programming language which is based on Microsoft's Visual Basic system. In Office 97 and beyond, the same VBA language is used in all the major Microsoft applications, Word, Excel and Access.
Although this document does not assume familiarity with Visual Basic, certain more advanced uses of Access do require awareness of such Visual Basic concepts as functions (sub programs which return a result) and many more advanced features of Access – things like validation rules - use Visual Basic syntax.
Business students should NOT normally attempt to master the VBA programming language.
However, at certain points in this document, VBA is used to illustrate more advanced techniques.
1.9. Acknowledgements
I should like to thank:
♦ Former Huddersfield colleagues Dr. Steve Wade and Dr. Ken Lunn
♦ ESC Rennes colleagues, notably Dr. Renaud Macgilchrist
♦ Previous ESC Rennes students
The following students gave me permission to reuse parts of their excellent work on the Anytown Business School group case. I have incorporated this case as a worked example in this document, and made significant use of these students’ work:
Marine CORRE; Marie GALATAUD; Emmanuelle HAMEURY; Naïla MALTI
SECTION 1 - THE PRINCIPLES OF DATABASE
2.
A brief introduction to databases
2.1. Databases (bases de données) and how they are designed
In this chapter, we give consideration to databases: what they are, to some extent how they are used, and to a limited extent how they are designed.
The subject matter here involves a specialised vocabulary, and a degree of complexity. Many of the ideas surrounding database are on first encounter quite strange, but they quickly become intuitive if you combine a study of the theory of database with an attempt to make them work in practice. So: stick with that approach, learn a little then try it out!
2.2. Models used by systems analysts
These are examples only!
♦ Interaction models
∗ UCDs (Use Case Diagrams)
♦ Process models
∗ DFDs (Data Flow Diagrams)
♦ Data models
∗ ERA (Entity Relationship Attribute diagram)
2.3. Simple model of data processing
This diagram, which shows the structure of a data processing system (a synonym for business information system), highlights the central importance of the database as the place where data is stored and from which it is retrieved.
Data Processing
System
Sourc
e
Recipien
t
Data
Information
Store
Data
Retrieve
Data
2.4. Why study Databases?
2.4.1. They are used in every significant BIS
♦ Store details of orders, customers etc.
♦ Support product catalogue in B2C applications
2.4.2. They are at the heart of
♦ “Database marketing”
♦ CRM – customer relationship management
♦ ERP – enterprise resource planning
2.5. Background
Some understanding of what a database is, how it is used, and (to a greater or lesser extent) how databases are designed is essential to understanding electronic business.
Businesses are systems; they use Information Systems, which are based on Information and Communications Technology.
Example: any e-commerce company provides a Web window onto its internal catalogue: which is a web page connected to a database.
Every stakeholder needs information from the business. They generally obtain this as information presented on forms (screens), reports and dynamic web-pages (webpages which show the current contents of a database and permit stockholders to update that database).
2.6. How to store data
♦ Data is stored in tables: 2-dimensional structures
♦ In MS Office terms:
∗ Word tables (also PowerPoint)
∗ Excel worksheets
∗ Access tables
♦
The 2-dimensional table which follows was created in WordRelative strengths and weaknesses of Word, Excel and Access for storing data
Method Advantages Disadvantages
Word
Processing: e.g. Word
Simple, well understood by people with weak computing skills No formulae (or only very rudimentary ones)
Excellent formatting options Tables are not related in any way Can only be updated by one person at a time. The data in a table has no “structure” known to the computer.
Spreadsheet: e.g.
Excel Some degree of structure – cells organised into rows and columns, with links possible between the cells Persistent data is not safe. Very powerful data manipulation using formulae Size limits – 65535 rows (until
Office 2007).
Separate tables can be held in different worksheets No design methodology or coherence – it is possible and easy to mix data up in a way which makes it impossible to
find, update and relate.
Items of data can be related together using lookup formulae such as VLOOKUP (RECHERCHEV) and HLOOKUP
(RECHERCHEH)
Poor support for queries – searching is slow, and the lookup formulae are far from being intuitive.
Can only be updated by one person at a time
Database: e.g.
Access Each kind of data is stored by the database management system (DBMS) in its own separate table. The tables are related together in accordance with the Relational data model – this gives coherence to the collection of tables, which is the whole database
More difficult to use and to learn (at first)
Very powerful data structuring and querying. In fact a query is just a results table which combines together selected data from more than one stored table. The database program enables the user to say what data they need and they construct a query which precisely specifies what data is to be retrieved into the results table
Requires thoughtful use and advance planning
Safer persistent data (though less safe than bigger, more powerful
DBMS programs like Microsoft SQL Server, ORACLE etc) Access databases are not directly web-accessible Is multi-user: that is, more than one person at a time can change
(update) the database
Since every record in a table has the same basic structure, it is much easier and / or more cost-effective to process complete sets of records under program control
But the programming language within Access, VBA, is too difficult and / or inappropriate for most business users to learn.
Figure 1 Comparative strengths and weaknesses of data storage in two dimensional tables: Microsoft Office tools
2.7. What is a database?
♦ A linked collection of tables (tables)
♦ Each table containing data about a single kind of thing
♦ Data in the separate tables can be combined ("joined") to answer user needs for information
2.8. Why keep data in different tables?
♦ Company A uses a single table named orders to record orders they receive, while Company B uses a relational database with two tables: orders and customers.
♦ When a customer places an order with Company A, a new record (or row) in the table orders is created.
♦ Because Company A has only one table of data, all the information pertaining to that order must be put into a single record: the customer’s general information, such as name and address, is stored in the same record as the order information, such as product description, quantity, and price. If customers place more than one order, their general information will need to be re-entered and thus duplicated for each order they place.
Order
number Customer name Customer address Product code Product description Unit of sale
Price
per unit Quantity Amount
O001 GREGORY Mark 1 La Rue P001 Apples kg 0,80 € 2,5 2,00 €
O002 GREGORY Mark 1 La Rue P876 Oranges kg 0,90 € 1 0,90 €
O003 MACGILCHRIST
Renaud 1 La Croix P001 Apples kg 0,80 € 3 2,40 €
O004 GREGORY Mark 1 La Rue P001 Apples kg 0,80 € 2 1,60 €
O005 MACGILCHRIST
Renaud 1 La Croix P876 Oranges kg 1,05 € 1,5 1,58 €
O006 GREGORY Mark 11 La Rue P001 Apples kg 0,90 € 2 1,80 €
O007 GOT Guillaume 1 L’Avenue P001 Apples kg 0,90 € 3 2,70 €
O008 GREGORY Mark 99 Le Chemin P876 Oranges kg 0,90 € 1,5 1,35 €
♦ Whenever there is duplicate data, as in the case above, many inconsistencies may arise when users try to query the
database. Additionally, a customer’s change of address might require the database manager to find all records in orders that the customer placed, and change the address data for each one.
♦ Company B is much better off with its relational database. Each of its customers has one and only one record of general information stored in the table customers. Each customer’s record is identified by a unique customer code which will serve as the relational key. When a customer orders from Company B, the record in orders need contain only a reference to the customer’s code, because all of the customer’s general information is already stored in customers.
♦ Indeed, Company B might go further and introduce a product table. It then has:
∗ CUSTOMER table Customer
number Customer name Customer address
C001 GREGORY Mark 1 La Rue
C002 MACGILCHRIST
Renaud 1 La Croix
C003 GOT Guillaume 1 L'Avenue
∗ Changed address Mistype d address
∗ PRODUCT table Product
code Product description Unit of sale Standard price per unit
P001 Apples kg 0,80 €
P876 Oranges kg 0,90 €
∗ ORDER table Order
number Customer number Product code Actual price per unit Quantity Amount O001 C001 P001 0,80 € 2,5 2,00 € O002 C001 P876 0,90 € 1 0,90 € O003 C002 P001 0,80 € 3 2,40 € O004 C001 P001 0,80 € 2 1,60 € O005 C002 P876 1,05 € 1,5 1,58 € O006 C001 P001 0,90 € 2 1,80 € O007 C003 P001 0,90 € 3 2,70 € O008 C001 P876 0,90 € 1,5 1,35 €
♦ This still isn’t perfect, since Orders and their Details continue to be mixed together in one table. 1
2.9. Learning a minimum about database
Every business student needs to know about and understand:
♦ Database principles
∗ Tables
∗ Queries
∗ A query is a results table
♦ Introduction to database design
♦ How to use a sample database
♦ How databases are used - ERP, CRM etc.
2.10. Basic concepts
♦ Entity: class of thing about which data is stored
Examples: student; programme – these are tables of data.
♦ Occurrence: a single instance of an entity
Example: ETU2004987 Smith, John – this is one record in the table of data.
♦ Attribute: a single fact that describes, qualifies or is otherwise
a property of an entity
Example: Programme name, value for John Smith: MA International Business
♦ Key: attribute(s) which uniquely identify a single occurrence of
an entity
Example: student number uniquely identifies a Student
1
The solution here includes the introduction of a link or intersection entity, called Order
Detail. See section 2.20 for a general description of what must be done.
Price per unit is on both tables! One is standard, the other order -specific.
♦ Relationship: a logical connection or dependency between
two entities
Example: any one programme has many students; any one student is on precisely one programme: we say that a one to many relationship exists between programme and student
2.11. Entity (type, class)
An entity represents a class of objects, usually in the real world. Synonyms for entity include class and type.
♦ Entities are of importance to the area of business being investigated
♦ They are objects about which data is stored
♦ Represented as boxes on an Entity Relationship Model (ERM)
♦ Examples:
∗ Student
∗ Programme
An entity has a number of different data attributes or properties, that is, facts about the thing. For example a student will have a student number, a last name, a first name, and a
2.12. An example: Students by Programme
Programme
PK
Programme code
Programme name
LMD level
Leader
Student
PK
Student no
Surname
Forenames
FK1 Programme code
In the diagram, the two rectangular boxes represent entity types. Here, they are programme and student. They are represented as different entity types because they represent different things in the real world. At least in theory, a programme could exist without any students. Almost by definition, a student is on a programme of some kind, but it is clear that
programme and student are not the same things. It is equally clear that they are related. The diagram represents this relationship by using a line with a crow's foot at one end of it. The end of the crow's foot represents the many end of a one to many relationship, often represented simply as 1: M
It is necessary to have an additional attribute on the student which links the student to its owning programme. On the sample data provided with the diagram, we have shown Annabelle Leuchars as being a student on the Executive MBA, by including the Programme code in the Student table. Programme code is a foreign key, which links the Student back to her Programme.
2.13. Attribute
An attribute is a Property of an entity, a single fact about the entity. An entity type will normally have several different attributes, one (or occasionally more) of which uniquely identifies every instance of the entity type. The identifying attribute or group of attributes is called for the primary key for the entity type.
The Attributes of Programme are Programme Code (primary key), Programme Name, and Programme leader
The Attributes of Student are Number (primary key), First name, Last name, Programme Code (foreign key)
Programme code has to be present as a foreign key in the student entity in order to represent the relationship which exists between programme and student.
2.14. Primary and Foreign Keys
Entity relationship
diagram Sample data
Programme code Programme name LMD level
Leader
PGE Programme Grande
École M RIVET Philippe
EMBA Executive MBA M MINDAY
Don
Student no Surname Forenames Programme code
20099234 Leuchars Annabelle EMBA
20099235 Dromsky Pierre-Charles PGE
♦ Primary key is an attribute or combination of attributes which uniquely identifies an entity occurrence
♦ To make a link between the Many (child) end of a relationship and its One (parent) end, the Primary key of the One end is repeated in the Many end
♦ In the Many entity, it is known as the Foreign Key
♦ What are Foreign Keys? A foreign key is an attribute that completes a relationship by identifying the parent entity. Foreign keys provide a method for maintaining integrity (coherence, consistency) in the data. Every relationship in the data model must be supported by a foreign key.
♦ Identifying Foreign Keys: Every dependent and subtype entity in the data model must have a foreign key for each relationship in which it participates. Foreign keys are formed in dependent and subtype entities by migrating the entire primary key from the parent entity.
2.15. Why have foreign keys?
♦ One particular programme has many students on it. That group of students (we call the group an entity type or table or set – the words are synonyms) is defined by having the same programme code
♦ The programme code of programme is of course the Primary key
♦ It is also, in the student table, the Foreign key which links each student to the programme of which s/he is a part
♦ Remember that the foreign key repeats at the many end, the primary key of the one end of the one to many relationship. This is essential if the database management software is to be able to link back together the students on a given programme, or to look up the details of the programme for a given student.
2.16. Entity occurrences – student
Having decided the general attributes of an entity type, it is then possible to store records relating to occurrences of the entity, typically in the real world. So, in the example above, details of three students are given, two on one programme, one on another.
2.17. Entity occurrences – programme
In the example above, details of two programmes are given.
2.18. Queries
The purpose of a database is to enable users to get the specific information they need. This can be done using queries. Queries are both useful in themselves, and also are used as the basis for reports and for forms.
♦ To answer a question like: who is programme leader for a given student? We can get all the necessary information by a query on both tables - programme and student
attribute of programme, and definitely NOT of student!
To answer a question like: who is programme leader for a given student? we can get all the necessary information by a query on both tables - programme and student. This is the work of the relational database management system software (RDBMS). A user of the database formulates a query, and the RDBMS goes away to look up details of occurrences in both entity types, joining the answers together as a result presented to the user.
2.19. Students and Modules
In this diagram, the two rectangular boxes represent entity types. Here, they are Module and Student. The relationship is Many-to-Many. The diagram represents this relationship by using a line with a crow's foot at both ends of it. The end of the crow's foot represents the many end of the many to many relationship, often represented simply as M:M or M:N
This model reflects the empirical observations that: 1. Any one student studies many modules 2. Any one module has many students
Many-to-Many relationships are very common. They are also problematical – this is because actual database management systems like Access (and almost all others) cannot support Many-to-Many relationships directly.
However, by following simple rules, it is possible to eliminate many-to-many relationships.
2.20. Three Vital Rules
1. An attribute can only hold a single fact
∗ If the name of an attribute is a list (e.g. something in the plural, like Student Qualifications) this is a sign that another entity is needed
2. The Primary key of the One end of a One to Many relationship also appears as a foreign key in the Many end
3. A Many to Many relationship can be resolved into two One to Many relationships, both going to a Link (or Intersection) entity type
These are Rules – there is no need to question them, just to apply them!
2.21. Resolving Many-to-Many Relationships
Resolving Many-to-Many
Relationships
Module
Student
Student
Registration or Result
Module
The many-to-many relationship is removed by:
♦ Introducing a link or intersection entity
♦ Drawing 1-to-many links FROM each original entity TO the new one
∗ Note that the primary key of EACH parent entity becomes part of the COMPOUND primary key of the link entity
♦ Here, the primary key of Module is Module Code, and that of Student is Student No. Both become the compound primary
key of the Registration entity
Student
PK
Student no
Surname
Forenames
Module
PK
Module code
Module name
Module leader
Module Registration
PK,FK1 Module code
PK,FK2 Student no
Module result
Page 28 of 150Module code Student no Module result
IS402E 20099234 A
IS402E 20099235 B
IS402E 20099897 C
♦ Note that there is only one primary key, made up of two attributes
∗ Neither Module code nor Student no are unique in the Module registration table – but the combination is unique
Unless a student is allowed to do a module a second time, in which case it is necessary to add a further attribute, usually a date, to the compound primary key in order to make it unique again:
Module code Student no Date Module result
IS402E 20099234 2009 A
IS402E 20099235 2009 B
IS402E 20099897 2009 C
OB401E 20099234 2009 Fx
2.22. Towards a more complete entity relationship attribute model
As analysis proceeds, the model is gradually refined and improved. Still incomplete, it might look like this:
Programme PK Programme code
Programme name LMD level
Programme leader surname Programme leader forenames
Student PK Student no Student surname Student forenames FK1 Programme code Student gender Student birthdate Qualification PK Qualification Award PK,FK1 Qualification PK,FK2 Student no Award result Module PK Module code Module name Module Registration PK,FK1 Module code
PK,FK1 Module year semester PK,FK2 Student no
Module grade Module mark
Module Operation PK,FK1 Module code
PK Module year semester Module leader
Note that this model has introduced a number of changes:
♦ There is greater precision in the attribute names chosen
♦ We wish to record a student’s qualifications, so we have introduced Qualification
♦ Because a many-to many relationship exists between student and qualification, an intermediate (link) entity has been
introduced; we observe that in the real world a specific award is give to each student who qualifies in something, so we’ve called the link entity Award
♦ We observe that many modules are offered and “run” (that is, they occur and are taught) for several years in succession (and sometimes in more than one semester in a year), and further that in some cases students take a module one year, fail it, and do it again in a subsequent year; therefore we introduce a Module Operation, the run of a module in a given year and semester
♦ The model remains incomplete but it’s now good enough to be worth prototyping (building and testing) in Access – so that we can check that it meets our needs for storing data and (above all) retrieving information in a very flexible way
2.23. Example Query – definition
Suppose we want to show a list of the students and the programme they are on, in ascending order of student last name. The slide is a screen shot indicating how this query is constructed in the Microsoft Access RDBMS.
I created it using the query design wizard (assistant) in Access. In the simple query wizard, I specified fields from the student table and from the programme table which I wished to appear in the result. Here, I wanted a list of students with details of the programme they are following. The screen shot shows the resulting query: it indicates that there are two tables which are joined together in the preparation of the result, and it also indicates which fields take part in the result.
2.24. Example Query – results
This slide shows the results of running (executing) the query. The results of the query have the form of a table, in which the columns are the attributes from the participating tables and the rows are the result records.
How has Access created this result? Probably something like this: it reads each record in the student table. One of the attributes of student is the programme code. Programme code is the foreign key in the student table; it is also the primary key in the programme table. Access looks up the details from the programme corresponding to the programme code for each student record. In effect, it joins together the two tables on the basis of the linking foreign key.
2.25. What is a query?
(French ‘une requête’)
♦ A query is a response to a question formulated by a user of the database
♦ A query takes the form of a table, in this case, a results table
∗ Technically, data tables are sets or relations
∗ So are the results of a query
♦ The power of a relational database is that it treats all stored data and derived information as what mathematicians call sets (sometimes called relations)
∗ Therefore database software is “simply” a computerised implementation of mathematical set manipulation
2.26. Data Dictionary
A store of data about data, a dictionary is a database used by analysts, programmers etc. or by you as you design a database. I find it useful to use a spreadsheet for this purpose, and you will find an example dictionary implemented as a spreadsheet for the Anytown case later in this document.
2.27. Entity-Relationship Diagrams (ERD)
Entity-Relationship Diagrams show:
♦ Entity types = the kind of things data is collected about in the database
∗ Entities = the specific things data is collected about
♦ Relationship = the way specific entities of one type are related to specific entities of the other type
♦ Attributes = the specific data items of interest stored for each entity type
♦ ERDs help determine what kinds of data will be included in a database, and how the database will be structured
♦ They are an excellent communication medium between users communication medium and developers
♦ Provided with Microsoft Access – usually to be found under help menu, Sample databases. It includes table like:
∗ Product
∗ Order
∗ Customer
∗ Supplier
∗ Purchase order
The screen-shot shows a slightly-improved version of NorthWind.
2.29. Relationship: meaning and characteristics
♦ A link between entities which is significant for this type of system
♦ Degree of relationship
♦ Optionality: does this relationship have to hold?
♦ Name: link phrase E.g. Customer places orders.
∗ Two names?
A relationship can be named in either of two directions, depending on which entity you start from. Thus:
(i) Customer places orders
2.30. Why it’s important to relate entities
Construction of the query in the previous example was eased because a proper design process had been undertaken in order to determine what entity types would be represented in the database, and how they would be related. This design process resulted in the simple entity relationship model presented in section 2.22. A line was used to link the two entity boxes together; this line had a crow's foot at the many end of the one to many relationship which analysis indicated exists between programme and student.
So one of the most important results of analysis is to establish what entity types are, and how they are related. Relationships are links between entities which are significant for this type of information and which are normally true in reality. A relationship can have a name: actually, it can have two, one read from one end of the link, the other from the other end. In the earlier example of Student and Programme, we can recognise two relationships - "student is on programme" and "programme enrols student". The relationship is said to have a degree of 1:M, which can be read one to many. In this case, the relationship is said to be mandatory: that is to say, a student is not a student if they are not on a programme, and a programme is not a programme if it has no students. (The second assertion may not always be the case, and it is possible to represent a relationship as being optional from one or both ends.)
2.31. Degree of relationship (simplified)
The degree of a relationship is an indication at the end being considered of whether more than one occurrence can be associated with one entity occurrence at the other
♦ There are three basic possibilities:
∗ One to one: 1:1
∗ One to many: 1:M
∗ Many to many: M:N
For more information concerning the degree of a relationship, please see 3.5.4
2.32. Simple database design
Design means deciding:
♦ What are the main entities?
♦ What are the attributes of each entity?
∗ What is the data type of each attribute?
∗ Validation rules
♦ Keys: primary and “foreign”
♦ Relationships and linking
Before building, for example, a Microsoft Access database, it is essential to carry out at least some database design. These are the main points that have got to be addressed in even the most informal design exercise.
The aim is to identify all the main entities and to give them the appropriate attributes. Having done that, consideration has to be given to the kind of data which each attribute will actually hold. Broadly speaking, the type of data is either numeric, text or something more special-purpose like a date. If text, consideration has to be given to the maximum number of characters that can be stored. Data which appears to be numeric may not in fact be so: telephone numbers, for example, must be stored as text, as indeed should code numbers like student numbers.
Note the usefulness of a simple data dictionary here. If you decide that a customer number is to be five letters (as it is in NorthWind), then it needs to be five letters everywhere it is used. You record that decision in the data dictionary.
It is absolutely critical to identify the primary key for each entity type, and to ensure that
there is a foreign key at the many end of any one to many relationship which is discovered
as you think about how the entity types are related.
3.
A Simple Methodology for Designing Microsoft Access
Databases
A methodology is a coherent set of methods, linked by a common underlying philosophy. The methodology I suggest for designing and building access databases is described in this section, and elaborated on in the remainder of this document. Some of the specific methods are derived from the British SSADM, Structured Systems Analysis and Design Methodology; and one UML2 technique, that of Use Case analysis, is employed. However, the Simple Methodology presented here is greatly simplified to make it more appropriate to business use.
3.1. Introduction – background to the methodology
3.1.1.Database, or something else?
As the writer of the biblical book Ecclesiastes wrote nearly three millennia ago, "There is nothing new under the sun"! Whatever, it is certainly the case that most standard information handling problems encountered in business are common to more than one business. It is therefore very likely that you will be able to find a system which has been written for someone else but which is (more or less) directly applicable to an information systems requirement you are analysing. If you can find a packaged solution which is (more or less) applicable to your company or to a client whom you are advising, then you can save yourself a lot of effort and the client a lot of money.
But what if there is no such package, or if it really doesn't suit the needs of the client / user?
A Database may be, and often is, appropriate in many contexts - but still consider alternatives, such as spreadsheets, for more straightforward or small-scale work, or where system users are already highly familiar with spreadsheets.
Spreadsheets are great in some contexts, and there is immense power in advanced spreadsheet packages like Microsoft Excel and Lotus 1-2-3. 3 And as we will begin to see as we become more knowledgeable about information systems modelling and more experienced in our use of relational databases, there are deficiencies also in the relational database approach. For this reason, newer techniques such as object oriented systems analysis and design are beginning to be used by IS/IT professionals. But for now, what you must do is work hard to get familiar with systems analysis using the structured approach, and database design using the relational approach.
3.1.2.
What is a methodology?
A methodology is a set of methods for tackling a particular class of problem; the methods should be linked by a coherent philosophy and be consistent with one another. Formal (mathematical) and semi-formal (strictly defined) methodologies have been defined for the analysis, design and construction of information systems. However, they are often too rigid, too prescriptive or quite simply too long-winded to be useful for people who are still learning the basics of the craft and who are tackling relatively small problems. The approach adopted in the rest of this document is methodical, but does NOT follow any one specific methodology; instead, it follows a simplified methodology of my own. 4 So, if you think a
full-2UML is the Unified Modelling Language, a set of notations largely used by information systems
professionals and particularly associated with a style of programming called Object Oriented or OO. The only UML notation we employ in this module is the Use Case diagram, UCD.
3 However, it is a serious error to use a spreadsheet when a database is necessary. Please see appendices 31