SQL Server 2008
Core Skills
Confucius
• I hear and I forget
• I see and I remember • I do and I understand
Core Skills Syllabus
• Theory of relational databases • SQL Server tools • Getting help • Data warehouses • Simple queries • Aggregate queries • Accreditation
Relational Model
• Edgar Codd, IBM (1969) • Based on set theory
• Normalised structure
• Related data stored in tables • Each table has a primary key
• Tables linked using foreign keys
• Data manipulated using Structured Query
Key Concepts
• Table
– Primary Key
• Unique column or combination of columns
– Foreign key
• Primary key of one table stored in another table
– Index
• Internal catalogue enabling faster queries
– Column / Field
• Column of data items of a specified data type
– Data Type
• Text
• Numeric • Date/time • Binary
Key Concepts
• View
– Stored select query
– May be used to restrict access to certain fields (eg salary) – Used like tables
• System function
• Calculations and string manipulation
• User-defined function
– Encapsulated logic – Input parameters – Output value
• Stored procedure
– Encapsulated logic or process – Input / output parameters
Key Concepts
• Parameter
– A value passed to / from a process
• Stored procedure / function
• Trigger
– Automatic process that happens when data is changed • Transaction
– All or nothing process
• Cash machine withdrawal
Normalisation: Why Bother?
• Mirror real world objects • Save space
– Very important 1970s
• Maximise flexibility of querying • Reduce risk of anomalies
SQL Terminology
• SQL
– Structured Query Language
• T-SQL
– Transact SQL (Microsoft extension) – Programming language
• DML
– Data Manipulation Language
• select / insert / update / delete
• DDL
– Data Definition Language
Object Naming
• Server.Database.Schema.Object
– [rptdev.nwtis.nhs.uk].Minerva.common.Date
• Don't need to specify all parts if using defaults • Defaults
– Server: the current server
– Database: the current database
Schema
• Logical database partition
• Used to enforce security over groups of objects – common • All users – restricted • Administrators – etl
Key Tools
• SSMS SQL Server Management Studio – 75% of your time spent here!
– DDL / DML
• Business Intelligence Development Studio
– SSAS SQL Server Analysis Services
• Data cubes
– SSRS SQL Server Reporting Services
• 25% of your time spent here • Reports / dashboards
– SSIS SQL Server Integration Services
Help
• Phone a friend
– Ask an expert (colleague)
• SQL Server 2008 Books Online
– Syntax for DDL / DML – Examples most helpful
EBNF
• Extended Backus-Naur Form
• Transact SQL Syntax Conventions • <label> ::= label is defined as • <A><B> A then B
• <A>|<B> A or B
• [<A>] A is optional
Function / Operator Reference
• Will use time and again... – Date and time functions
• dateadd / datetdiff / datename • getdate
– String functions
• left / right / substring • charindex
– Data types
• cast / convert
– Operators
• like % _ [xyz] [x-z] [^xyz] [^x-z] • in ('x', 'y', 'z')
Schemas: Disambiguation
• Logical partition of a database
– Server.Database.Schema.Object
• Model
– Diagram
– Textual eg EBNF
• Plan or design of a database
– Star schema
Typical Data Warehouse Structure
• Fact tables
– Main data tables – SUS IP / OP / AE
• Dimension tables
– Look-up tables
– Admission Method – GP
Dimensions
• Star dimensions
– Simple look-up tables • Admission method • Ethnic Category
• Snowflake dimensions
– Hierarchies
• GP - Practice - Cluster - PCT - SHA • Postcode - Electoral Ward - Town
Surrogate Keys
• Combine related dimension tables
• Natural key
– PAS / NHS Data
Dictionary / OCS code
• Surrogate key
Comments
• Use to improve readability of a script • Ignored by SQL Server parser
• Single line format
– --this is a single line comment – Everything after "--" ignored
• Multi line
Simple Queries
select<columns>
from <fact table>
<join type> join <dimension table> on <joined columns>
where
<filters>
order by
Select Modifiers
• select top x [percent]
– Restrict to x results – Useful for testing
• select distinct
– Unique combinations – Similar to group by
Aliasing
• Renaming objects within a query • Table
– from A_Long_Table_Name a
• Column
– select PQRSZ123 as 'Easier_To_Read_Name'
The Three Joins
• inner join
– Matches from both tables
• left (outer) join
– All matches from first table – Matches from second (if any) – Use left join in most cases
• union [all]
– All matches from both tables – No duplicates! (unless all)
Filtering
• where clause• and filters
– All filters in combination
• or filters
– Either / or filters
– in (A,B,C) shorthand for A or B or C
– Use brackets for clarity if used with and
• Comparators
Subqueries
• Use view like a table
• Use a select query like a table
select
<columns>
from <fact table>
<join type> join <select query>|<view> on <joined columns>
where
<filters>
order by
Case Statement
• Create new columns on-the-fly • Recoding values
case
when Day in ('Saturday','Sunday') then 'Weekend'
else 'Working Day'
Aggregate Select Query
select
<grouped columns> <literal columns>
<aggregated columns>
from <fact table>
<join type> join <dimension table> on <joined columns> where <simple filters> group by <grouped columns> having <aggregate filters> order by <columns>;
Aggregate Functions
• sum– Numeric values summed
• count
– Number of records
• min / max
– Lowest / highest values
• avg
– Mean
• floor / ceiling
Time-limited Test
• Simple select
– Use of filters (where) – Join types (inner / outer) – Sorting (order by)
– Union
• Aggregate select
– Aggregate functions (count / min / max etc) – Aggregate filters (having)
• Use of system functions
– Datetime (datediff / dateadd etc) – String (left / right etc)