• No results found

CSE692: Adv Topics in CS Systems Design for Massive Data. Fall 2007

N/A
N/A
Protected

Academic year: 2021

Share "CSE692: Adv Topics in CS Systems Design for Massive Data. Fall 2007"

Copied!
26
0
0

Loading.... (view fulltext now)

Full text

(1)

CSE692: Adv Topics in CS

Systems Design for Massive Data

(2)

CSE692 Fall 2007

Outline

✦ Administrative Information ✦ Course Schedule

✦ Grading

✦ About the instructor ✦ Discussion

(3)

CSE692 Fall 2007

Administrative Information

✦ CSE692: Systems Design for Massive Data ✦ 3-credit, Tu Th 12:50-2:10pm, CS 1441

✦ http://www.cs.sunysb.edu/~cse692/ ✦ course mailing list cse692@cs.

✦ Instructor: Qin (Christine) Lv ✦ CS 1411, 2-8426, qlv@cs.

✦ Office hours: Tu 2:10-3pm

(4)

CSE692 Fall 2007

Course Summary

✦ Efficient systems for managing and exploring massive amounts of digital data

✦ Systems

✦ search systems, storage systems, P2P ✦ Algorithms

✦ bloom filters, sketching, indexing ✦ Applications

✦ multimedia, bioinformatics, scientific data

(5)

CSE692 Fall 2007

Course Schedule

✦ CSE692 course website

✦ Tentative schedule, subject to change ✦ Suggestions for other topics, readings? ✦ Sign up for presentations

(6)

CSE692 Fall 2007

Outline

✦ Administrative Information ✦ Course Schedule

✦ Grading

✦ About the instructor ✦ Discussion

(7)

CSE692 Fall 2007

Grading

✦ Paper review (20%) ✦ Class participation (20%) ✦ Paper presentation (20%) ✦ Course project (40%) 7

(8)

CSE692 Fall 2007

Paper Review (20%)

✦ Each class

✦ 2-3 papers in the reading list ✦ pick one paper to review

✦ due at 5pm the day before

✦ first review due at 5pm 9/5 (tomorrow)

(9)

CSE692 Fall 2007

Paper Review (20%)

✦ A paper review may contain the following ✦ What is this paper about?

✦ What are the strengths and weaknesses? ✦ Are the evaluations convincing?

✦ Can you improve their technique?

✦ Can you apply their technique to other domains/problems?

✦ Any other observations/questions

(10)

CSE692 Fall 2007

Class Participation (20%)

✦ Class attendance

✦ Read all papers in reading list

✦ be prepared to answer questions in class ✦ Participate in discussions

✦ listen to others, ask questions, tell us your thoughts

✦ presentation style, suggestions

(11)

CSE692 Fall 2007

Paper Presentation (20%)

✦ One or two presentations per student ✦ Also counts as a paper review

✦ 45-minute presentation + discussion

✦ motivation, background, related work ✦ main techniques & evaluations

✦ possible applications, improvements

✦ Meet w/ instructor one day before class

(12)

CSE692 Fall 2007

Course Project (40%)

✦ No midterm or final exam ✦ Important Dates

✦ proposal due: Oct 18 ✦ checkpoint: Nov 20 ✦ presentation: Dec 11

✦ final report due: Dec 13 ✦ Start early!

(13)

CSE692 Fall 2007

Course Project (40%)

✦ A self-contained project related to this course’s topics

✦ Work alone

✦ get instructor’s permission if work in pairs ✦ Possible project ideas will be posted at

course website

✦ Students may also pick their own topics ✦ Talk to instructor

(14)

CSE692 Fall 2007

Project Proposal (Oct 18)

✦ A 15-minute presentation ✦ motivation ✦ literature survey ✦ your technique ✦ how to evaluate ✦ milestones

✦ Submit a 3-page project proposal

(15)

CSE692 Fall 2007

Project Checkpoint (Nov 20)

✦ A 15-minute presentation

✦ proposal review: motivation, your technique, evaluation, milestones ✦ what you’ve achieved so far

✦ what remains to be done ✦ Submit a progress report

✦ updated version of your initial proposal ✦ highlight your progresses

(16)

CSE692 Fall 2007

Project Presentation (Dec 11)

✦ A 25-minute presentation

✦ motivation, literature survey, your

technique, evaluation, conclusions, future work

✦ Presentations will be peer-reviewed ✦ technical depth, evaluations

✦ presentation: style, clarity

(17)

CSE692 Fall 2007

Final Project Report (Dec 13)

✦ Follow the format of a regular research

paper

✦ title, abstract

✦ introduction, related work ✦ main technique, evaluation ✦ conclusion, future work

✦ references

(18)

CSE692 Fall 2007

Academic Honesty

✦ Be personally accountable for all your submitted work.

✦ No academic dishonesty will be tolerated. ✦ Any suspected instance will be reported to

the Computer Science Department and may be forwarded to the Graduate School.

(19)

CSE692 Fall 2007

Outline

✦ Administrative Information ✦ Grading

✦ Course Schedule

✦ About the instructor ✦ Discussion

(20)

CSE692 Fall 2007

Qin (Christine) Lv

✦ 2000: B.E. in Computer Science &

Technology, Tsinghua University, China

✦ 2006: Ph.D. in Computer Science, Princeton University, USA

✦ 2006-2007: Postdoc, Princeton University ✦ Sep. 2007:

✦ Assistant Professor, Stony Brook Univ.

(21)

CSE692 Fall 2007

Research Interests

✦ Develop efficient systems for managing and exploring massive amounts of digital data

✦ Search systems, data management,

distributed systems, storage systems, networking

✦ Systems, algorithms, applications

(22)

CSE692 Fall 2007

Research Projects

✦ Networking

✦ self-organized (ad-hoc) networks

✦ performance monitoring & optimization ✦ Peer-to-peer networks

✦ search, replication, heterogeneity ✦ Storage systems

✦ content-addressable, distributed B-tree

(23)

CSE692 Fall 2007

Research Projects

✦ CASS: Content-Aware Search Systems ✦ feature-rich data: audio, video, digital

photos, genomic data, scientific sensor data, ...

✦ content-based similarity search

✦ L1 sketching, multi-probe LSH indexing,

Ferret toolkit

(24)

CSE692 Fall 2007

Research Projects

✦ Massive Data Systems Lab

✦ distributed search systems

✦ similarity-aware storage systems ✦ data stream processing systems ✦ healthcare data management

✦ scientific data management ✦ and many more

(25)

CSE692 Fall 2007

Discussion

✦ Class meeting time ✦ conflicts?

✦ twice a week or once a week? ✦ Other topics to cover?

✦ Questions? Suggestions? ✦ Sign up for presentations

(26)

CSE692 Fall 2007

Next Lecture (9/6)

✦ Data, data, data

✦ MyLifeBits ✦ Memex

✦ How much information? 2003 ✦ Paper review due at 5pm, 9/5

✦ Class starts at 1:10pm

References

Related documents

Silene acaulis were found in montane to subalpine zones from 1320 to 2700 m a.s.l. Five gall midge species occur only in the alpine zone, viz. Rhopalomyia ruebsaameni indu- cing

Setelah desain dari sistem yang akan dibuat sudah disetujui baik itu oleh pengguna dan analis, maka pada tahap ini programmer mengembangkan desain menjadi suatu program.

In- terestingly, both the removal of haem from haemopexin and HasA addi- tionally use steric hindrance to displace the haem group from its high- af finity binding site, while no

If individuals truly reduce their labour supply once social assistance benefits become more generous, the employment rate of 30 years old (on census week) should be unusually

This report describes the development and delivery of a new bicycle and pedestrian engineering design course offered at Portland State University (PSU) during the winter term of

MOOCs began to gain public recognition as an alternative to traditional higher education and online classes around 2012 and are still gaining popularity, in part due to two new

Drawing on pure altruistic theories (see, e.g., Clotfelter, 1997), Fischbacher, Gächter and Fehr (2001) reveal that most individuals in their sample do not act as free riders in

A wellness program focusing on physical activity may contribute to improving health, weight reduction, and reducing chronic diseases and other health conditions related to a