CISC 432/CMPE 432/CISC 832 Advanced Database Systems

Academic year: 2021

(1)

CISC 432/CMPE 432/CISC 832

Advanced Database Systems

(2)

Course Info

Instructor: Patrick Martin

Goodwin Hall 630 613 533 6063

[email protected]

Office Hours: Wednesday 11:00 – 1:00 or by appointment

Schedule: Slot 22 (Monday 3:30 – 4:30, Wednesday 2:30 – 3:30 and Thursday 4:30 – 5:30)

(3)

Assumed Background

• Previous course in DBMS or equivalent experience

• Relational model, SQL, relational algebra, schema design

(4)

Reference Materials

Textbook:

Database System Concepts (6th Edition) by A. Silberschatz, H. Korth and S. Sudarshan, McGraw-Hill.

(5)

Learning Outcomes

• Apply optimization algorithms to SQL queries to produce efficient query plans.

• Apply concurrency control and recovery algorithms to sample transaction workloads to ensure ACID properties are maintained.

• Assess the use of relational DBMSs and NoSQL systems for different types of data and applications.

• Apply a NoSQL system to the creation of a sample database.

• Apply the MapReduce framework to a sample big data problem.

• Evaluate the use of a big data approach to a sample application.

(6)

Marking Schemes

• Undergraduate students (432)

– 4 assignments (60%).

• Due dates: Oct 10, Oct 24, Nov 14, Nov 28

– 2 term tests (30%).

(7)

Marking Schemes (Cont.)

• Graduate students (832)

– 4 assignments (40%).

• Due dates: Oct 10, Oct 24, Nov 14, Nov 28

– 2 term tests (20%).

– Class participation (10%).

– Term project (30%).
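
As a worked illustration of the graduate weighting above, the following sketch combines component marks into a course average. Only the weights come from the marking scheme; the component names and the sample marks are hypothetical, and every mark is assumed to be already expressed as a percentage.

```python
# Weights (in percent) taken from the graduate (832) marking scheme.
GRAD_WEIGHTS = {
    "assignments": 40,
    "term_tests": 20,
    "participation": 10,
    "term_project": 30,
}

def course_average(marks, weights=GRAD_WEIGHTS):
    """Return the weighted average of component marks, each out of 100."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    return sum(weights[name] * marks[name] for name in weights) / 100

# Hypothetical marks, each expressed as a percentage.
sample = {"assignments": 85, "term_tests": 70, "participation": 90, "term_project": 80}
print(course_average(sample))  # 81.0
```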

(8)

Marking Schemes (Cont.)

• Grading Method

– In this course, some components will be graded using numerical percentage marks. Other components will receive letter grades, which for purposes of calculating your course average will be translated into numerical equivalents using the Faculty of Arts and Science approved scale (see below). Your course average will then be converted to a final letter grade according to Queen’s Official Grade Conversion Scale.

• Late Policy

– Assignments should be handed in by 4:00 pm on the day they are due. Late assignments are subject to a 10% per day late penalty, with weekends counted as one day. Late assignments will not be accepted beyond 5 days past the date due.
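
The late policy can be read as simple arithmetic. The sketch below is one possible interpretation, not an official rule: it works at whole-day granularity (ignoring the 4:00 pm deadline), assumes the 10% is deducted from the earned mark, assumes the 5-day cutoff is counted in penalty days, and returns 0.0 for work that is no longer accepted. The function names and the sample dates are hypothetical.

```python
from datetime import date, timedelta

def late_days(due, submitted):
    """Count penalty days after `due`, with each weekend counted as one day."""
    days = 0
    current = due
    while current < submitted:
        current += timedelta(days=1)
        saturday_already_counted = current - timedelta(days=1) > due
        if current.weekday() == 6 and saturday_already_counted:
            continue  # Sunday is folded into the Saturday before it
        days += 1
    return days

def penalized_mark(raw_mark, due, submitted, rate_percent=10, cutoff_days=5):
    """Apply the per-day penalty; return 0.0 once the cutoff is passed."""
    days = late_days(due, submitted)
    if days > cutoff_days:
        return 0.0
    return raw_mark * (100 - rate_percent * days) / 100

due = date(2024, 3, 1)  # a Friday, chosen only for illustration
print(late_days(due, date(2024, 3, 4)))           # 2 (weekend + Monday)
print(penalized_mark(80, due, date(2024, 3, 4)))  # 64.0
```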

(9)

Class Participation Rubric

• Frequency and Quality

– 9 – 10 marks: Attends class regularly and always contributes to the discussion by raising thoughtful questions, analyzing relevant issues, building on others’ ideas, synthesizing across readings and discussions, expanding the class’ perspective.

– 7 – 8 marks: Attends class regularly and sometimes contributes to the discussion in the aforementioned ways.

– 5 – 6 marks: Attends class regularly but rarely contributes to the discussion in the aforementioned ways.

– < 5 marks: Attends class regularly but never contributes to the discussion in the aforementioned ways.

• Readings and Surveys

– 9 – 10 marks: Reads all of the assigned readings, demonstrates understanding of the material, completes all required surveys.

– 7 – 8 marks: Reads most of the assigned readings, demonstrates some understanding of the material, completes all required surveys.

– 5 – 6 marks: Reads some of the assigned readings, completes all required surveys.

– < 5 marks: Reads some of the assigned readings, completes some required surveys.

(10)

Academic Integrity

• Academic integrity is constituted by the five core fundamental values of honesty, trust, fairness, respect and responsibility (see www.academicintegrity.org). These values are central to the building, nurturing and sustaining of an academic community in which all members of the community will thrive. Adherence to the values expressed through academic integrity forms a foundation for the "freedom of inquiry and exchange of ideas" essential to the intellectual life of the University (see the Senate Report on Principles and Priorities).

• Students are responsible for familiarizing themselves with the regulations concerning academic integrity and for ensuring that their assignments conform to the principles of academic integrity. Information on academic integrity is available in the Arts and Science Calendar (see Academic Regulation 1), on the Arts and Science website (see http://www.queensu.ca/artsci/sites/default/files/Academic%20Regulations.pdf), and from the instructor of this course.

• Departures from academic integrity include plagiarism, use of unauthorized materials, facilitation, forgery and falsification, and are antithetical to the development of an academic community at Queen's. Given the seriousness of these matters, actions which contravene the regulation on academic integrity carry sanctions that can range from a warning or the loss of grades on an assignment to the failure of a course to a requirement to withdraw from the

(11)

Course Content

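[Figure: Big Data, divided into three kinds of data and the systems that manage them]

• Structured data: RDBMS (distributed, parallel, column-oriented, multi-tenant, NewSQL)

• Unstructured data: NoSQL (key-value, document, graph, column-oriented)

• Streaming data: DSMS

• Focus issues across all three: data model, query performance, consistency, scalability, availability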

(12)

Course Schedule

• Structured Data (weeks 1 – 7)

– Relational DBMSs

• Query processing, query optimization, transaction management, concurrency control, recovery, BI & data warehousing, column-oriented storage, database architectures, SaaS & multi-tenant DBs

(13)

Course Outline (cont)

• Unstructured Data (weeks 8 – 11)

– NoSQL database systems, MapReduce technology for analytics

• Streaming Data (week 12)
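
The outline names MapReduce without describing it. As a rough orientation only, the sketch below imitates the map, shuffle and reduce phases of the programming model in a single Python process, using word counting as the sample problem. It is not how Hadoop or any specific framework is implemented, and all function names are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(k, v) for k, v in groups.items())

docs = ["big data big ideas", "data streams"]
print(word_count(docs))  # {'big': 2, 'data': 2, 'ideas': 1, 'streams': 1}
```

In a real deployment the map and reduce calls would run in parallel on different machines; here they run sequentially so the data flow is easy to follow.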

(14)

Focus Issues

• Data model:

– language and logical constructs that determine the logical structure of a database and consequently how data is stored, organized, and manipulated.
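
To make the idea of a data model concrete, the sketch below stores the same facts twice: once under the relational model (flat tables linked by keys, queried with SQL through Python's built-in sqlite3 module) and once as a single nested document of the kind a document-oriented NoSQL system might hold. The table names, the student and the course labels are hypothetical.

```python
import json
import sqlite3

# Relational model: data lives in flat tables linked by keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolment (student_id INTEGER REFERENCES student(id), course TEXT);
    INSERT INTO student VALUES (1, 'Ada');
    INSERT INTO enrolment VALUES (1, 'COURSE A');
    INSERT INTO enrolment VALUES (1, 'COURSE B');
""")
rows = conn.execute(
    "SELECT s.name, e.course FROM student s "
    "JOIN enrolment e ON e.student_id = s.id ORDER BY e.course"
).fetchall()
print(rows)  # [('Ada', 'COURSE A'), ('Ada', 'COURSE B')]

# Document model: the same facts nested inside one self-contained record.
document = {"id": 1, "name": "Ada", "courses": ["COURSE A", "COURSE B"]}
print(json.dumps(document))
```

The relational version needs a join to reassemble the student's courses, while the document version keeps them together at the cost of duplicating course information across students.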

(15)

Focus Issues (cont)

• Query performance

– Efficiency of a DBMS, typically expressed with metrics like query response time and/or query throughput

• Consistency

– Guarantee concerning state of a data item when accessed
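
The two query performance metrics named above can be measured directly. The following sketch, which assumes an in-memory SQLite database and a hypothetical table `t`, times repeated executions of one query and reports the mean response time and the throughput. Actual figures depend on the machine, so no expected output is shown.

```python
import sqlite3
import time

def measure(conn, sql, repetitions=100):
    """Return (mean response time in seconds, throughput in queries/second)."""
    start = time.perf_counter()
    for _ in range(repetitions):
        conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    return elapsed / repetitions, repetitions / elapsed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO t (value) VALUES (?)", [(i,) for i in range(10_000)])

response_time, throughput = measure(conn, "SELECT COUNT(*) FROM t WHERE value % 7 = 0")
print(f"mean response time: {response_time:.6f} s, throughput: {throughput:.0f} queries/s")
```

With a single client issuing queries one after another, throughput is just the reciprocal of mean response time. The two metrics diverge once many clients run queries concurrently.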

(16)

Focus Issues (cont)

• Scalability

– Ability of a system to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth

• Availability

– Proportion of time that requests received by a system receive a response.
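
One simple way to estimate availability as defined above is to divide the number of requests that were answered by the number received. The sketch below uses that request-count reading, which is an assumption; the function name and the sample counts are hypothetical.

```python
def availability(requests_received, requests_answered):
    """Fraction of received requests that got a response."""
    if requests_received == 0:
        raise ValueError("no requests received")
    return requests_answered / requests_received

print(availability(10_000, 9_990))  # 0.999
```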

(17)

What is Big Data?

• 7 definitions of big data (http://timoelliott.com/blog/2013/07/7-definitions-of-big-data-you-should-know-about.html)

– 3 V’s: Volume, Velocity, Variety

– New scale-out technologies like Hadoop and NoSQL

– Data distinctions: generated from transactions versus interactions (e.g., a click on a web page) and observations (e.g., sensors)

(18)

What is Big Data?

• 7 definitions (cont)

– Signals: data is timely and facilitates real-time business decisions

– Opportunity: analyzing data that was previously ignored because of technology limitations

– Metaphor: the process of helping the planet grow a nervous system, in which we are just another sensor

(19)

Related Buzzwords

• Data analytics (Business Intelligence)

– “Analytics is the process of obtaining an optimal and realistic decision based on existing data.”

– “Data analytics (DA) is the science of examining raw data with the purpose of

(20)

Related Buzzwords

• Cloud computing

– “A networking solution in which everything — from computing power to computing

infrastructure, applications, business processes to personal collaboration — is delivered as a service wherever and whenever you need.”

(21)

Related Buzzwords

• NoSQL (Not Only SQL)

– NoSQL encompasses a wide range of technologies and architectures that are not based on the relational model and that seek to solve the scalability and big data performance issues of relational databases.

– E.g., Cassandra, BigTable, SimpleDB, MongoDB, Voldemort

(22)

Related Buzzwords

• NewSQL

– Encompasses solutions aimed at bringing to the relational model the benefits of horizontal scalability and fault tolerance provided by NoSQL solutions
