(1)

03: What You Need to Know About Big Data: Understanding and Better Utilizing Data Analytics

Trainer(s):

Mike Holland

NYU Center for Urban Science and Progress

http://cusp.nyu.edu

Timothy Savage

NYU Center for Urban Science and Progress

http://cusp.nyu.edu

Alan Mitchell

KPMG

www.kpmg.com

Stephen C. Beatty

KPMG

www.kpmg.com

(2)

Mike Holland Tim Savage March 7, 2015

What You Need to Know about Big Data

(3)

Applied Sciences NYC

“Applied Sciences NYC is the City’s unparalleled opportunity to build or expand world-class applied sciences and engineering campuses in New York City. We are seeking to dramatically expand our capacity in the applied sciences to maintain our global competitiveness and create jobs. These campuses would not only enrich the City’s existing research capabilities, but also lead to innovative ideas that can be commercialized, catalyzing hundreds of spinoff companies and increasing the probability that the next high growth company – a Google, Amazon, or Facebook – will emerge in New York City.”

New York City Economic Development Corporation

The NYU-led Center for Urban Science and Progress, a multi-sector research and education collaborative, was announced on April 23, 2012.

(4)

Big Cities + Big Data

• Informatics capabilities are exploding

– Storage, transmission, analysis

• Proliferation of static and mobile sensors

• Internet of things

Global network traffic, 30% CAGR

• The world is urbanizing

• Cities are the loci of consumption, economic activity, and innovation

Cities are the cause of our problems
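To give a feel for the 30% CAGR quoted above for global network traffic, here is a minimal Python sketch of what a constant compound annual growth rate implies. The function names are illustrative and not from the slides; the only input taken from the deck is the 30% figure, and real traffic growth will not be perfectly constant.

```python
import math


def project(value_now: float, cagr: float, years: int) -> float:
    """Compound a quantity forward at a constant annual growth rate."""
    return value_now * (1.0 + cagr) ** years


def doubling_time(cagr: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + cagr)


CAGR = 0.30  # the 30% figure quoted on the slide

# Doubling time in years, and the growth multiple after a decade.
print(round(doubling_time(CAGR), 2))
print(round(project(1.0, CAGR, 10), 2))
```

Under this assumption traffic doubles roughly every two and a half years and grows more than tenfold in a decade.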

(5)

GRADUATE PROGRAMS IN APPLIED URBAN SCIENCE AND INFORMATICS

DEGREE: Master of Science

LENGTH: One year, three semesters (full-time)

CLASS SIZE: Approx. 60 students

(6)

Projects for the City & State

• City Lights

• Building Informatics

• Urban Soundscape

• Neuroeconomics of Decision Making

• Economic Mapping

• Greener Greater Buildings Plan

• MTA Bus Driver Optimization

• MTA Origin/Destination Study

• New York City Police Department 911/311

• Trash Informatics

• Parks Attendance & Utilization

• Property Ownership Records Assessment

• School Property Use Assessment

• Taxi Visualization

• Transit Operations

(7)

Properly acquired, integrated, and analyzed, data can

Take government beyond imperfect understanding

Better (and more efficient) operations, better planning, better policy

Improve governance and citizen engagement

Enable the private sector to develop new services for citizens, governments, firms

Enable a revolution in the social sciences

Environment: Meteorology, pollution, noise, flora, fauna

People: Relationships, location, economic/communications activities, health, nutrition, opinions, …

Infrastructure: Condition, operations

(8)
(9)

Urban Data Sources: Acquire, Integrate, Use

Novel Technologies

• Visible, infrared and spectral imagery

• RADAR, LIDAR

• Gravity and magnetic

• Seismic, acoustic

• Ionizing radiation, biological, chemical

• …

Sensors

• Personal (location, activity, physiological)

• Fixed in situ sensors

• Crowd sourcing (mobile phones, …)

• Choke points (people, vehicles)

Organic Data Flows

• Administrative records (census, permits, …)

• Transactions (sales, communications, …)

• Operational (traffic, transit, utilities, health system, …)

• Social media (Twitter, Facebook, blogs, …)

(10)

Privacy, Big Data, and the Public Good: Frameworks for Engagement

The book identifies ways in which vast new sets of data on human beings can be collected, integrated, and analyzed to improve urban systems and quality of life while protecting confidentiality. Sponsored by CUSP, the American Statistical Association, its Privacy and Confidentiality subcommittee, and the Research Data Centre of the German Federal Employment Agency.

Editors: Julia Lane, American Institutes for Research; Victoria Stodden, Columbia; Stefan Bender, The German Federal Employment Agency; Helen Nissenbaum, NYU

Chapter Authors

Alessandro Acquisti, Carnegie Mellon University; Cynthia Dwork, Microsoft; Peter Elias, University of Warwick; Robert Goerge, UChicago; Alan Karr, National Institute of Statistical Sciences and Jerry Reiter, Duke University; Steve Koonin and Michael Holland, CUSP; Frauke Kreuter, U-MD and Richard Peng, Johns Hopkins; Carl Landwehr, George Washington University; Helen Nissenbaum and Solon Barocas, NYU; Paul Ohm, Colorado; Alexander Pentland, et al., MIT; Kathy Strandburg, NYU; Victoria Stodden, Columbia; John Wilbanks, Sage Bionetworks/Kauffman Foundation.

Visit dataprivacybook.org.

(11)
(12)

Overview

• Yellow cab data for 2009-2013 covers almost 800 million trips, which is nearly impossible to manage, explore, visualize, and analyze with existing tools

Objective & Goal

• Build scalable, usable tools that can be used by experts and non-experts

• Work with relevant city agencies on development & deployment of the technology

Status

• Initial deployment of TaxiVis at NYC Taxi & Limousine Commission and Department of Transportation
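One reason a data set of this size strains ordinary tools is that it does not fit in memory. The sketch below is not TaxiVis; it only illustrates the general idea of streaming records one at a time and keeping a small running summary. The column name `pickup_datetime`, the timestamp format, and the three sample rows are assumptions made up for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def trips_per_hour(rows, time_field="pickup_datetime"):
    """Count trips by hour of day while reading one record at a time.

    `rows` is any iterable of dict-like records, so memory use stays
    constant no matter how many trips are streamed through.
    """
    counts = Counter()
    for row in rows:
        ts = datetime.strptime(row[time_field], "%Y-%m-%d %H:%M:%S")
        counts[ts.hour] += 1
    return counts


# Tiny in-memory stand-in for a trip file (values are illustrative only).
SAMPLE = """pickup_datetime,fare_amount
2011-08-27 08:15:00,9.5
2011-08-27 08:40:00,12.0
2011-08-27 23:05:00,7.0
"""

counts = trips_per_hour(csv.DictReader(io.StringIO(SAMPLE)))
print(sorted(counts.items()))
```

For a real file, the same function could be fed `csv.DictReader(open(path, newline=""))` so that rows are read lazily from disk.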

Freire, Silva, Vo, et al.

Analysis of

(13)

Taxis as Sensors for Manhattan

Taxis are sensors that can provide unprecedented insight into city life: economic activity, human behavior, mobility patterns, …

April 2011: Taxi drivers petitioning TLC for higher fares to compensate for rising gasoline prices.

August 2011:  Hurricane Irene

October 2012: Hurricane Sandy

(14)

Urban Observatory

PERSISTENT and SYNOPTIC ANALYTICS for URBAN SCIENCE

(15)

Photo by Tyrone Turner/National Geographic

Other synoptic modalities: Hyperspectral, RADAR, LIDAR, Gravity, Magnetic, …

Manhattan in the Thermal IR

199 Water Street

Built 1993 :: 998,000 sq ft

electricity, natural gas, steam

(16)
(17)

Plumes of Opportunity

Background subtraction:

• registration to reference image

• form 10 absolute difference images from surrounding frames

• construct the minimum difference image pixel by pixel
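The background subtraction steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' pipeline: it assumes the frames are already registered to the reference image, that they are grayscale, and that the "surrounding frames" are the five before and five after the target. The function name and array layout are hypothetical.

```python
import numpy as np


def min_difference_image(frames: np.ndarray, index: int, n_neighbors: int = 10) -> np.ndarray:
    """Background-subtract frame `index` from a (T, H, W) stack of registered frames.

    Forms absolute difference images against up to `n_neighbors` surrounding
    frames (half before, half after; fewer near the ends of the stack) and
    keeps the pixel-by-pixel minimum.
    """
    half = n_neighbors // 2
    neighbors = [i for i in range(index - half, index + half + 1)
                 if i != index and 0 <= i < len(frames)]
    target = frames[index].astype(np.float64)
    # Shape (k, H, W): one absolute difference image per neighboring frame.
    diffs = np.abs(frames[neighbors].astype(np.float64) - target)
    return diffs.min(axis=0)


# Synthetic check: a static scene, a transient in the target frame,
# and an unrelated transient in one neighboring frame.
frames = np.full((11, 4, 4), 10.0)
frames[5, 1, 1] = 50.0   # feature present only in the target frame
frames[3, 2, 2] = 99.0   # feature present only in a neighbor
result = min_difference_image(frames, 5)
print(result)
```

Taking the minimum rather than the mean means that only features absent from every neighboring frame survive, so a transient in a single neighbor does not leak into the result.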

Plume identification and tracking:

• denoise background subtracted image

• identify excess/deficit in luminosity space

• cross check object location in color space

• localization and probability weighted tracking of centroids
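A rough sketch of the first, second, and last identification steps follows. The slides do not specify the denoiser, the detection threshold, or the weighting, so the 3x3 mean filter, the 3-sigma cut, and the intensity-weighted centroid here are all assumptions; the color-space cross check and frame-to-frame tracking are omitted, and only excess luminosity is handled.

```python
import numpy as np


def box_denoise(img: np.ndarray) -> np.ndarray:
    """3x3 mean filter with edge padding (a stand-in for the unspecified denoiser)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    total = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + h, dx:dx + w]
    return total / 9.0


def weighted_centroid(img: np.ndarray, n_sigma: float = 3.0):
    """Return the (row, col) intensity-weighted centroid of excess pixels, or None."""
    smooth = box_denoise(img.astype(np.float64))
    threshold = smooth.mean() + n_sigma * smooth.std()
    mask = smooth > threshold          # pixels with excess luminosity
    if not mask.any():
        return None
    weights = np.where(mask, smooth, 0.0)
    rows, cols = np.indices(img.shape)
    total = weights.sum()
    return (float((rows * weights).sum() / total),
            float((cols * weights).sum() / total))


# Synthetic background-subtracted image with one bright blob centered at (10, 12).
img = np.zeros((21, 21))
img[9:12, 11:14] = 90.0
centroid = weighted_centroid(img)
print(centroid)
```

A deficit search would mirror this with a lower threshold, and tracking would link centroids across consecutive frames.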

Upcoming use cases:

• plume rate

• urban winds

• carbon vs steam emissions

• TOO (triggered) observations

Figure panels: raw image; background subtracted

(18)

Street Environment: Attention, Distraction, and Interaction Dynamics

(19)

Source: Dobler, et al.

(20)
(21)

Federal Open Data Policies

(22)

http://nys-its.github.io/open-data-handbook/OpenDataHandbook.pdf

http://catalog.data.gov/dataset?organization_type=City+Government#topic=cities_navigation

(23)

Source: Barbosa, Luciano, et al. "Structured open urban data: understanding the landscape." Big data 2.3 (2014): 144-154.

(24)

Cities and States with Chief Data Officers

Blue signifies a state-level officer, green signifies a local-level officer, and yellow signifies an officer in education.

Source: Steve Towns, Which States and Cities Have Chief Data Officers?, govtech.com, June 13, 2014

(25)

Open Data Can Lead to Open Innovation

A consortium of public sector transit agencies, commercial firms, nonprofits, academic researchers, and interested individuals

Real-time arrival predictions

94% reported increased or greatly increased satisfaction with public transit

Significant decrease in actual wait time per user, and an even greater decrease in perceived wait time

78% of riders reported increased walking, a significant public health benefit

http://onebusaway.org/

(26)

Core City Services Include…

Human Services ($826B): Health, Education, Social Services

General Government ($245B): Planning, Public Buildings, Financial Admin, Community Development

Safety ($180B): Emergency Mgmt, Courts, Jails, Police, Fire

Streets ($397B): Sanitation, Utilities, Parks, Roads

We need to understand:

• How does data flow within agencies?

• How interoperable can data be?

• What data can be shared, and how is it shared to support delivery of city services?

Local Gov’t. Expenditures: U.S. Census Bureau, 2012 Census of Governments: Surveys of State

(27)
(28)

Tools

Data acquisition and synthesis

Exploration and data “mining”

Formulation of meaningful policy questions

(29)


(30)

Picture merges an image captured from video, a 3-D LIDAR map of NYC, the PLUTO (Primary Land Use Tax Lot Output) database, and LL84 Energy Benchmarking data
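Merging sources like these usually comes down to joining records on a shared identifier. The sketch below is an illustration only: it assumes both tables carry a common lot identifier (called `bbl` here), and the field names, lot IDs, and numbers are invented for the example rather than taken from PLUTO or LL84.

```python
def join_on_lot(pluto_rows, ll84_rows, key="bbl"):
    """Inner-join two lists of dict records on a shared lot identifier."""
    by_lot = {row[key]: row for row in pluto_rows}
    merged = []
    for row in ll84_rows:
        lot = by_lot.get(row[key])
        if lot is not None:           # keep only lots present in both sources
            merged.append({**lot, **row})
    return merged


# Toy records (hypothetical IDs and values, not real city data).
pluto = [
    {"bbl": "LOT-A", "floor_area_sqft": 1000.0},
    {"bbl": "LOT-B", "floor_area_sqft": 2500.0},
]
ll84 = [
    {"bbl": "LOT-A", "site_energy_kbtu": 80000.0},
    {"bbl": "LOT-C", "site_energy_kbtu": 5000.0},
]

merged = join_on_lot(pluto, ll84)
for rec in merged:
    # Derived field that needs both sources: energy use per unit floor area.
    rec["kbtu_per_sqft"] = rec["site_energy_kbtu"] / rec["floor_area_sqft"]
print(merged)
```

Once records are joined per lot, a derived quantity such as energy use per square foot can be attached to the building geometry for display.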

(31)


(32)

TaxiVis: Interactive Visual Exploration of NYC Taxi Records

(33)
(34)


(35)


(36)

Uses of Data Analytics

Regulatory compliance

Targeted enforcement

Improved understanding of municipal 

(37)

Some Examples

Regulatory compliance

Targeted enforcement

Improved understanding of municipal 

(38)


(39)

Apartment Fires in the Bronx and Brooklyn

20,000+ complaints/year of unsafe illegal conversions

Department of Buildings: 200 building inspectors for 900,000 buildings, relying on expert judgment to prioritize

Historically, only 8% of inspections found serious violations

Strongest predictors of unsafe illegal conversion:

• Whether the building is current on its property taxes (data at Department of Finance)

• Whether banks have filed any mortgage foreclosures (data at Office of Court Administration)

Teaming Fire Marshals up with Building Inspectors

Firefighters are 15 times more likely to die responding to a fire in an illegal conversion than in other fires

Vacate orders jumped to more than 70%

Source: Mike Flowers, “Beyond Open Data: The Data-Driven City” in Beyond Transparency: Open Data and the Future of Civic Innovation, Brett Goldstein, Lauren Dyson, Eds.; San Francisco, CA:
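The idea of using cross-agency records to decide which complaints to inspect first can be sketched as a simple ranking. This is not the city's actual model: the equal-weight score, the field names, and the sample queue are assumptions for illustration, and a real system would fit weights from past inspection outcomes.

```python
from dataclasses import dataclass


@dataclass
class Complaint:
    building_id: str          # hypothetical identifier
    tax_delinquent: bool      # from property tax records
    foreclosure_filed: bool   # from court records


def risk_score(c: Complaint) -> int:
    """Count how many of the two predictors named on the slide are present."""
    return int(c.tax_delinquent) + int(c.foreclosure_filed)


def prioritize(complaints, capacity):
    """Return the `capacity` highest-scoring complaints; ties keep arrival order."""
    ranked = sorted(complaints, key=risk_score, reverse=True)
    return ranked[:capacity]


queue = [
    Complaint("B1", tax_delinquent=False, foreclosure_filed=False),
    Complaint("B2", tax_delinquent=True, foreclosure_filed=True),
    Complaint("B3", tax_delinquent=True, foreclosure_filed=False),
    Complaint("B4", tax_delinquent=False, foreclosure_filed=False),
]

first_visits = prioritize(queue, capacity=2)
print([c.building_id for c in first_visits])
```

With limited inspectors, the point is the ordering: buildings showing both warning signs are visited before those showing none.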

(40)


(41)

cusp.nyu.edu

NYUCUSP

@NYU_CUSP
