
Big Data Security and Privacy

Copyright © 2014, Novetta Solutions, LLC. All rights reserved.

AFCEA CyberSecurity Symposium 2014

Kevin T. Smith, Novetta Solutions June 25, 2014

(2)

Big Data

With the increase in computing power, electronic devices & accessibility to the Internet, more data than ever is being produced, collected and transmitted.

Interesting Facts*:

• Facebook collects 250 terabytes a day

• Digital data production worldwide doubled in 2009 to 1 zettabyte (1 million petabytes)

• Worldwide digital production is expected to reach 7.9 zettabytes in 2015 and 35 zettabytes in 2020

*Stats from Thomson Reuters & InfoQ, http://www.infoq.com/news/2013/12/HadoopUsage

Organizations have recognized the power of data analysis, but are struggling to manage the massive amounts of information they have.

(3)

Securing Big Data – Why Should We Care?

Regulatory, Access Control & Releasability Concerns

Regulatory - Many organizations are required to enforce access control & privacy restrictions on data sets (HIPAA, privacy laws), or face steep penalties and fines

Access Control - U.S. Government organizations are required to provide access control based on Need-to-Know & formal authorization credentials

Releasability - Big Data brings new challenges related to data management, & organizations are struggling to understand what results they can release without unintentionally disclosing information

Insider Threat / Threats on Availability

How do you control access to your analytics? Many deployments are unsecured

“Your data is only a distributed delete away”

Mismanagement of Data Sets & Breaches are Costly

AOL Research “Data Valdez Incident”: listed as one of CNN/Money’s “Dumbest Moments in Business”; $5M settlement + $100 to each member at the time + $50 “to any member concerned”

Netflix Contest & “Anonymized Data Set”: class action lawsuit, $9M settlement

PlayStation (2011): experts predict costs to Sony between $2.4 and $2.6 billion


(4)

What makes Securing Big Data Different?

Unique Challenges to Big Data Analytics

Distributed Security:

When data and processing are distributed to a cluster, there are many moving parts to secure with respect to confidentiality, integrity, and availability. This often makes developing & configuring security on these systems complex.

Combination of Different Sources:

Big Data analytics solutions are great at bringing many data sources together & doing analytics on their combination. Given that each data source may have its own access control security policy, how do you enforce security policies on the combination of these data sources?

Aggregation & Differential Privacy:

When you combine different sources of data, you may discover “connections” between those data sources that disclose more information than you intended, potentially violating access control and privacy policies.

Unintended Deduction from Large Data Sets:

Data sets are typically so large that it is often difficult to determine what may be deduced from them that may disclose sensitive information.
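
One conservative way to approach the "Combination of Different Sources" question is to make a combined record require every authorization that any of its sources required. The sketch below illustrates that idea only; the names `Record`, `combine` and `can_read` are hypothetical, and the union rule is an assumption chosen for illustration, not a policy prescribed by this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    value: str
    required: frozenset  # authorizations a reader must hold (hypothetical model)


def combine(a: Record, b: Record) -> Record:
    """Join two records; the result requires the union of both sources' authorizations."""
    return Record(f"{a.value}|{b.value}", a.required | b.required)


def can_read(user_auths: set, record: Record) -> bool:
    """A user may read a record only if they hold every required authorization."""
    return record.required <= user_auths


# Two hypothetical sources, each with its own access control policy.
hr = Record("salary=100k", frozenset({"HR"}))
med = Record("diagnosis=flu", frozenset({"MEDICAL"}))
joined = combine(hr, med)

print(can_read({"HR"}, hr))                 # user cleared for HR data only
print(can_read({"HR"}, joined))             # same user, combined record
print(can_read({"HR", "MEDICAL"}, joined))  # user cleared for both sources
```

The union rule is deliberately strict: it can over-restrict derived results, so a real deployment may need a policy authored specifically for the combination.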

(5)

Deduction & Differential Privacy Example

Could a data analyst working for Commissioner Gordon deduce that Batman is Bruce Wayne?
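
A minimal sketch of how such a deduction could happen: neither data set names Batman's identity, but correlating them does. All data below is invented for illustration, and `rank_candidates` is a hypothetical helper, not part of any tool discussed in this talk.

```python
# Hypothetical data set A: nights on which the masked vigilante was sighted.
sightings = {"06-01", "06-03", "06-07", "06-12"}

# Hypothetical data set B: nights each citizen was absent from public events.
absences = {
    "Bruce Wayne": {"06-01", "06-03", "06-07", "06-12"},
    "Harvey Dent": {"06-03", "06-09"},
    "Oswald Cobblepot": {"06-01", "06-20"},
}


def rank_candidates(sightings, absences):
    """Score each person by the fraction of sightings that coincide with their absences."""
    scores = {
        name: len(sightings & nights) / len(sightings)
        for name, nights in absences.items()
    }
    # Highest overlap first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_candidates(sightings, absences)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Each data set may be releasable on its own; the sensitive fact only emerges from the join.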

(6)

To Complicate the Matter…

Most Data Analytics Tools were designed without Security In Mind.

Example: Apache Hadoop

Originally No Security Model

– No authentication of users or services

– Anyone can submit arbitrary code to be executed

– Anyone could add data to, delete data from, or read data from the distributed file system

– You could write a service that impersonated a Hadoop service.

– Later, after authorization was added, a user could still be impersonated with a command-line switch

2009 Yahoo! Security Retrofit

– Resulting Security Model is Complex

– Configuration is Complex

– No Data at Rest Encryption

– Kerberos-Centric

– Limited Authorization Capabilities

– Easy to Mess Up if You Don’t Know What You are Doing

Things Are Changing, But They are Changing Slowly!

– An Alphabet Soup of Secure Distributions, Vendor Add-Ons & Security-Focused Companies

– Companies releasing Hadoop distros are taking security seriously (see recent press releases: Cloudera/Gazzang, Hortonworks/XA Secure)

– Much activity in open source movements like Project Rhino & projects like Apache Sentry

(7)
(8)

Air Gap & Isolation Approaches

- Network isolation in various forms is used in lieu of security in “closed networks”

- Import/export is problematic

- Accidents may still happen

- Does not solve issues related to differential privacy and authorization (AuthZ)

(9)

Augmenting Analytic Security with Other Tools

Ex: Apache Accumulo

Cell-level access control via visibility labels

By default, uses its own database for users & credentials

Can be extended in code to use other Identity & Access Management infrastructure

Find your analytics tool's limitations & complement your solution with other tools and libraries.

The example here shows building a security layer over Hadoop…
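
The cell-level access control idea mentioned for Apache Accumulo can be illustrated with a toy evaluator: each cell carries a boolean label expression, and a reader sees the cell only if their authorizations satisfy it. This is a simplified sketch and not Accumulo's API or parser. The function `is_visible`, the label grammar (`&`, `|`, parentheses, with `&` binding tighter than `|`) and the treatment of an empty label as visible to everyone are assumptions made for illustration.

```python
import re


def _tokenize(expr):
    # Split a label into names, operators and parentheses; other characters are ignored.
    return re.findall(r"[A-Za-z0-9_]+|[&|()]", expr)


def is_visible(expr, auths):
    """Return True if the set of authorizations satisfies the label expression."""
    tokens = _tokenize(expr)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse the right side so the position advances
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_term()
            result = result and rhs
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in auths  # a bare name is satisfied if the user holds it

    if not tokens:
        return True  # assumption: an empty label means no restriction
    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result


# Hypothetical cell label and two hypothetical users.
label = "SECRET&(HR|FINANCE)"
print(is_visible(label, {"SECRET", "HR"}))
print(is_visible(label, {"HR"}))
```

In a layered design like the one this slide describes, a check of this kind would run in the data store or a service in front of it, so analytics code never sees cells the caller is not authorized for.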

(10)

Differential Privacy & Deduction

Many approaches are in the Academic Sphere

Cynthia Dwork from Microsoft Research is one of the leading researchers

Lots of University Work

Lots of Math involved.

I’m involved in more practical solutions (with no math):

Determining access control policies up front & applying that policy

Determining entities that should not resolve (Batman + Bruce Wayne) & including this in the security of the system

Sometimes this involves an aggregation filter component to prevent the resolution of entities

We will still need to follow the academic research in this area.
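
As a rough illustration of the aggregation filter idea, the sketch below drops any proposed entity resolution that appears on a list of pairs that must never be linked. It is an assumption about how such a component might look, not the design used in any actual system; `make_aggregation_filter` and `resolve` are hypothetical names.

```python
def make_aggregation_filter(forbidden_pairs):
    """Build a predicate that rejects entity pairs that must never be resolved together."""
    # frozenset makes the check order-independent: (a, b) and (b, a) are the same pair.
    blocked = {frozenset(pair) for pair in forbidden_pairs}

    def allow(a, b):
        return frozenset((a, b)) not in blocked

    return allow


def resolve(candidate_links, allow):
    """Keep only the candidate links that the filter permits."""
    return [(a, b) for a, b in candidate_links if allow(a, b)]


# Policy decided up front: these two entities must not resolve.
allow = make_aggregation_filter([("Batman", "Bruce Wayne")])

# Hypothetical output of an entity resolution step.
candidates = [
    ("Bruce Wayne", "Batman"),
    ("Clark Kent", "C. Kent"),
]
released = resolve(candidates, allow)
print(released)
```

A fixed block list only covers links that were anticipated in advance, which is one reason the academic work on differential privacy remains relevant.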

(11)

Final Thoughts – General Guidance

Every Security Approach Is Different – Security is a Journey, Not a Destination

Know Your Security Requirements

Understand your security requirements & policies related to access to data

Know The Security Policies of Your Data:

Understand the security policies of your data so that you can enforce them

Know Your Tools & Their Limitations

Understand, from an in-depth perspective, how to successfully meet your security goals

Understand the limitations of your tools & augment your solutions with other approaches

Understand the Unique Challenges of Big Data Security

Combination of Different Sources & Resulting Policies

Aggregation and Differential Privacy (Netflix Contest)

Unintended Disclosure (The Batman Problem)
