• No results found

Introduction and Overview

N/A
N/A
Protected

Academic year: 2021

Share "Introduction and Overview"

Copied!
73
0
0

Loading.... (view fulltext now)

Full text

(1)

Introduction and Overview

Introduction to Computer Vision I

CSE 152A

(2)

• We’ll begin with some introductory

material …

• … and end with

– Syllabus

– Organizational materials

– Wait list

(3)

What is computer vision?

Add camera as input device to computer.

(4)

Computer Vision

• An interdisciplinary field that deals with how

computers can be made to gain high-level

understanding from digital images or videos

• Other common definitions:

– In computer vision, we are trying to...describe the

world that we see in one or more images and to

reconstruct its properties (Szeliski)

– Computing properties of the 3-D world from one or

more digital images (Trucco and Verri)

– The construction of explicit, meaningful description of

physical objects from images (Ballard and Brown)

– Extracting descriptions of the world from pictures or

sequences of pictures (Forsyth and Ponce)

– To make useful decisions about real physical objects

and scenes based on sensed images (Stockman and

Shapiro)

(5)

Computer Vision

• An interdisciplinary field that deals with how

computers can be made to gain high-level

understanding from digital images or videos

• Engineering perspective

– Computer vision seeks to automate tasks that the

human visual system can do

(6)
(7)

Why is this hard?

What is in this image?

1. A hand holding a man?

2. A hand holding a mirrored sphere?

3. An Escher drawing?

4. A 1935 self portrait of Escher

• Interpretations are ambiguous

• The forward problem (graphics) is well-posed

• The “inverse problem” (vision) is not

(8)

Underestimates

• “640K ought to be enough for

anybody.”

– Bill Gates, 1981

• “... in three to eight years we will

have a machine with the general

intelligence of an average human

being ... The machine will begin to

educate itself with fantastic speed. In

a few months it will be at genius

level and a few months after that its

powers will be incalculable ...”

(9)
(10)

Should Computer Vision follow from our

understanding of Human Vision?

Yes

&

No

1.

Who would ever be crazy enough to even try creating machine vision?

2.

Human vision “works”, and copying is easier than creating.

3.

Secondary benefit – in trying to mimic human vision, we learn about it.

1.

Why limit oneself to human vision when there is even greater diversity in

biological vision

2.

Why limit oneself to biological vision when there may be greater diversity in

sensing mechanism?

3.

Biological vision systems evolved to provide functions for “specific” tasks

and “specific” environments. These may differ for machine systems

4.

Implementation – hardware is different, and synthetic vision systems may use

different techniques/methodologies that are more appropriate to computational

(11)

Hermann Grid

Scan your eyes over the figure. Do you see the gray spots at the

intersections? Stare at one of them and it will disappear.

(12)

How many red X’s are there?

Type it in the chat when you

(13)
(14)

How many red X’s are there?

Type it in the chat when you

(15)
(16)

The Near Future: Ubiquitous Vision

• Digital video has become very

inexpensive.

• It’s widely embedded in cell

phones, cars, games, etc.

• 99.9% of digitized video isn’t

seen by a person.

• That doesn’t mean that only

0.1% is important!

• And there’s an enormous

amount of image and video

content on the internet…

(17)

Applications: touching your life

• Optical Character

Recognition

• Football

• Movies

• Surveillance

• HCI – hand gestures

• Aids to the blind

• Face recognition &

biometrics

• Road monitoring

• Industrial inspection

• Virtual Earth; street view

• Robotic control

• Autonomous driving

• Space: planetary

exploration, docking

• Medicine – pathology,

surgery, diagnosis

• Microscopy

• Military

• Remote Sensing

• Digital photography

• Google Goggles

• Video games

(18)

Vision to explore the world

Image from Microsoft Virtual Earth

(see also: Google Earth)

(19)

Vision to explore other worlds

• Panorama stitching

• 3D terrain modeling

• Obstacle detection

• Position tracking

• For more, read “Computer Vision on Mars“ by Matthies et al.

(20)

Vision to recognize objects

– Point & Find, Nokia

– SnapTell.com (now Amazon)

– Google Photos

– Apple Photos

(21)
(22)
(23)

Object recognition (in supermarkets)

LaneHawk by EvolutionRobotics (now part of iRobot)

“A smart camera is flush-mounted in the checkout lane, continuously watching

for items. When an item is detected and recognized, the cashier verifies the

quantity of items that were found under the basket, and continues to close the

transaction. The item can remain under the basket, and with LaneHawk, you are

assured to get paid for it… “

(24)

Amazon Go

1. Turn-style entry. Consumer

scans in with Amazon App on

smartphone

2. Consumer goes around the

store, picks up items, adds to

bag, shops like normal

(25)

Vision to look at people

• Digital cameras, phones, Facebook, Google

Photos, Snapchat etc.

Face Detection

(26)
(27)

Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” Read the

story

1984

Age 12

2002

Age 30

(28)

Login without a password…

Fingerprint scanners on

(29)
(30)
(31)
(32)
(33)
(34)

With great power comes great

(35)

The Matrix movies, ESC Entertainment, XYZRGB, NRC

Vision for entertainment:

(36)

Vision for entertainment:

motion capture

Facial

motion

capture

(37)

Vision for entertainment

• Football first down line

(38)

Vision for augmented reality

• AR Toolkit

• Blippar

• Magic Leap

(39)

Vision for augmented reality

• Text detection, localization, and translation,

then render with similar font

(40)

Vision for augmented reality

(41)
(42)

Vision for smart cars

• Mobileye

(43)
(44)

Vision-based interaction (and games)

Nintendo Wii has

camera-based IR

tracking built in.

Digimask

: put your face on a 3D avatar.

Xbox

Kinect

(45)
(46)
(47)
(48)

Vision from your perspective

(49)

Vision for medicine

Image guided surgery

Grimson et al., MIT

3D imaging

MRI, CT

(50)

Vision for Science

Molecular Reconstruction from

(51)

Vision to bridge communication

between physical and digital worlds:

Optical Character Recogntion (OCR)

Digit recognition, AT&T labs

http://www.research.att.com/~yann/

Or more recent, see blog post about

Dropbox OCR

License plate readers

http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

(52)

Scene Text: Text Recognition in the Wild

COCO-Text

A large-Scale Scene Text Dataset

https://bgshih.github.io/cocotext/

(53)
(54)
(55)

Video understanding

• Video classification

• Activity recognition

• Video segmentation

• Activity detection

(56)

Computer Vision

• An interdisciplinary field that deals with

how computers can be made to gain

high-level understanding from digital images or

videos

(57)

Related Fields

Computer

(58)

Related Fields

Deep Learning

Computer

Graphics

(59)

Four Rs of computer vision

• Reprojection

– Rendering a scene from a different view, under

different illumination, under different surface

properties, etc.

• Reconstruction

– Multiple view geometry, structure from motion,

shape from X (where X is texture, shading,

contour, etc.), etc.

• Registration

– Tracking, alignment, optical flow, correspondence,

etc.

• Recognition

(60)

Rudiments: The implied fifth R

• Image filtering

• Edge detection

• Interest point detection

• Probability

• Statistics

• Linear algebra

• Projective geometry

• Optics

• Fourier analysis

• Sampling

• Algorithms

• Photometry

• Physics of color

• Human vision

• Psychophysics

• Performance

evaluation

(61)

CSE 152A topics

• Geometric image

formation

• Photometric image

formation

• Photometric stereo

• Image filtering

• Edge detection and

corner detection

• Feature extraction

• Stereo

• Structure from motion

• Model fitting

• Optical flow and

motion

• Recognition,

detection, and

classification

• Neural networks

• Color

• Perception

(62)

Syllabus

• Instructor: Ben Ochoa

• TAs: Jiajian Fu, Gaurav Nakum, and Weijian

Xu

• Course is on Canvas

• Course website

https://cseweb.ucsd.edu/classes/wi21/cse152A-a/

• 18 lecture meetings (recorded)

– 2 university holidays (Jan 18 and Feb 15)

• Weekly discussion section (recorded)

• Class discussion

(63)

Syllabus

• Grading

– 5 homework assignments (65% of grade)

• By hand and programming using Python

• Late policy: 15% grade reduction for each 12 hours late

– Will not be accepted 72 hours after the due date

– 4 quizzes and final exam (35% of grade)

• Quiz two days after assignments are due, except

assignment 0

• Final exam has 5 parts

– First 4 parts correspond to quizzes and can increase your

grade

– Piazza

• Ask (and answer) questions using Piazza, not email

• Good participation could raise your grade (e.g., raise a

(64)

Textbook (optional)

• Computer Vision: Algorithms and

Applications, 2nd edition (in preparation)

– Richard Szeliski

(65)

Textbook (optional)

• Digital Image Processing,

4th edition

– Rafael C. Gonzalez and

Richard E. Woods

• See book website

– Corrections and clarifications

– Review material

• Linear systems

• Matrices and vectors

• Probability

(66)

Textbook (optional)

• Introductory Techniques for 3-D Computer

Vision

(67)

Textbook (optional)

• Multiple View Geometry in Computer

Vision, 2nd edition

– Richard Hartley and Andrew Zisserman

• Download the corrections and errata

(68)

Textbook (optional)

• Deep Learning

– Ian Goodfellow, Yoshua Bengio, and Aaron

Courville

(69)

Collaboration Policy

It is expected that you complete your academic

assignments on your own and in your own words

and code. The assignments have been developed

by the instructor to facilitate your learning and to

provide a method for fairly evaluating your

knowledge and abilities (not the knowledge and

abilities of others). So, to facilitate learning, you

are authorized to discuss assignments with

others; however, to ensure fair evaluations, you

are not authorized to use the answers developed

by another, copy the work completed by others in

the past or present, or write your academic

(70)

Academic Integrity Policy

Integrity of scholarship is essential for an

academic community. The University expects

that both faculty and students will honor this

principle and in so doing protect the validity

of University intellectual work. For students,

this means that all academic work will be

done by the individual to whom it is assigned,

without unauthorized aid of any kind.

(71)

Academic Integrity Violation

If the work you submit is determined to be

other than your own, you will be reported to

the Academic Integrity Office for violating

UCSD's Policy on Integrity of Scholarship. In

accordance with the CSE department

academic integrity guidelines, students found

committing an academic integrity violation

will receive an F in the course.

(72)

Wait List

• Number of enrolled students is limited by

– Size of room

– Number of TAs

• General advice

– Wait for as long as you can

• Concurrent enrollment (Extension) students

have lowest priority

• And, if you are going to drop the class, please

officially drop it to make room for others

(73)

Next Lecture

• Geometric image formation

• Reading:

– Szeliski

References

Related documents

The forces acting on the powdered material stored in a hopper tend to (1) compact the powder (i.e., reduce its bulk density), and (2) the shear stresses in the material tend to

Here, we additionally tested whether stem cell therapy alone or in combination with sRAGE targeting human cells (i.e., human neural progenitor cells or hNP1),

If a clinician focuses on self-care outside of work, takes coffee and lunch breaks during work hours, gets enough sleep, seeks support and supervision from coworkers, engages in

 The Carolina Permanente Medical Group, Department of Family Practice, Chapel Hill NC, 1993-2000..  Practicing Physician, Orange Family Medical Center, Hillsborough NC

In this region, the tape spring starts to unravel laterally (Fig. 11c), followed by a dynamic latching of the tip. Therefore, it is convenient to neglect the final uncoiling

Eight areas, which may be described as “best practice in fraud prevention”, are examined: fraud awareness and education; management of fraud control; personnel monitoring;

Overall,  the  picture  obtained  from  Table  2  suggests  that  the  primary  factors  which  distinguish  between  states  with  an  increasing  MFR  are  a 

We have analysed four sets of labour-market experiences. The …rst was the transition from compulsory school to various states: continuing their education, …nding a job,