(1)

Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved

Data Security in Hadoop

(2)

What is Data Security?

Data security for Hadoop lets you administer a single policy to authenticate users, authorize data access, protect data at rest and in motion, and keep a record of interactions with that data.

It also lets you coordinate enforcement of this policy across the entire Hadoop stack.

(3)

What is Security?

Administration: Central management & consistent security

Authentication: Authenticate users and systems

Authorization: Provision access to data

Audit: Maintain a record of data access

Data Protection: Protect data at rest and in motion

(4)

Why Security?

Security needs are changing

• YARN unlocks the data lake

• Multi-tenant: Multiple applications for data access

• Changing and complex compliance environment

• ETL of non-sensitive data can yield sensitive data

Summer 2014: 65% of clusters host multiple workloads

Fall 2013: Largely siloed deployments with single-workload clusters

(5)

HDP delivers comprehensive security for Hadoop

[Figure: HDP 2.1 (Hortonworks Data Platform) architecture. Governance & Integration: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS). Data Access (batch, interactive & real-time): Script (Pig), Search (Solr), SQL (Hive, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), In-Memory (Spark), Others (ISV Engines), Tez. Data Management: YARN (Data Operating System) and HDFS (Hadoop Distributed File System) across nodes 1 to N. Operations: Provision, Manage & Monitor (Ambari, Zookeeper), Scheduling (Oozie).]

COMPREHENSIVE SECURITY

Meet all security requirements across authentication, authorization, audit & data protection

CENTRAL ADMINISTRATION

Provide one location for administering security policies and for viewing and managing audit across the platform

CONSISTENT INTEGRATION

Integrate with other security and identity management systems, for compliance with IT policies

SECURITY: Authentication, Authorization, Accounting, and Data Protection, applied across Storage (HDFS), Resources (YARN), Access (Hive, …), Pipeline (Falcon), and Cluster (Knox)

(6)

Start of 2014: Hadoop Data Security Capabilities

Requirements: Authenticate, authorize, provide auditability of data access, and protect data at rest and in motion

Administration (central management & consistent security): Did not exist

Authentication (authenticate users and systems): Kerberos & Apache Knox

Authorization (provision access to data): Fragmented across components

• Hive: ATZ-NG

• HDFS: ACLs

• Sentry (interactive SQL & Search)

Audit (maintain a record of data access): Fragmented across components

Data Protection (protect data at rest and in motion): 3rd-party add-ons, integration with Apache Falcon & OS capabilities

Start of 2014: State of Hadoop Security

• Largely disjoint patchwork

• A few projects address spot needs

• No coverage across all workloads

• No central administration and

(7)

May 2014: Hortonworks Acquires XA Secure

Founded in 2013, XASecure provides an enterprise-ready, cross-platform security layer built from the ground up for Hadoop, providing centralized capabilities around data security, authorization, auditing and overall governance.

Hortonworks acquired XA Secure

The acquisition accelerates delivery against the enterprise requirements for central security administration and enforcement across all Hadoop workloads: batch, interactive SQL, and real-time

(8)

Current State of Security in HDP

HDP provides central administration and coordinated enforcement of security policy across the entire Hadoop ecosystem of projects.

Administration (central management & consistent security): Central administration

Authentication (authenticate users and systems): Kerberos & Apache Knox

Authorization (provision access to data): Authorization for HDFS, Hive, HBase, etc.

Audit (maintain a record of data access): Compliance controls

Data Protection (protect data at rest and in motion): 3rd-party add-ons, integration with Apache Falcon & OS capabilities

With additional XA Secure features, HDP is a leader in Hadoop Security

NEW FEATURES

• Centralized security policy enforcement

• Granular access control across HDFS, Hive, and HBase

• Universal audit tracking

• Compliance conformance controls

(9)

Central Security Administration

HDP Advanced Security

• Delivers a ‘single pane of glass’ for the security administrator

• Centralizes administration of security policy

• Ensures consistent coverage across the entire Hadoop stack

Apache Argus

All delivered in Open Source under the governance of the

(10)

Authentication

For more than 20 years, Kerberos has been the de facto standard for strong authentication … no other option exists.

What does Kerberos do?

• Establishes identity for clients, hosts and services

• Prevents impersonation; passwords are never sent over the wire

• Integrates with enterprise identity management tools such as LDAP & Active Directory

• Enables more granular auditing of data access and job execution

The design and implementation of Kerberos security in native Apache Hadoop was delivered by Hortonworker Owen O’Malley in 2010
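As a rough illustration of what turning on Kerberos looks like on the Hadoop side, the sketch below renders a core-site.xml fragment using only Python's standard library. The two property names are standard Hadoop settings; the helper name render_core_site is hypothetical, and a real deployment also needs principals, keytabs and per-service settings that are not shown here.

```python
import xml.etree.ElementTree as ET

# Standard Hadoop properties that switch a cluster from "simple" to Kerberos
# authentication and enable service-level authorization checks.
KERBEROS_PROPERTIES = {
    "hadoop.security.authentication": "kerberos",
    "hadoop.security.authorization": "true",
}


def render_core_site(properties):
    """Render a dict of properties as a Hadoop-style <configuration> XML string.

    Hypothetical helper for illustration only; it is not part of Hadoop or HDP.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


core_site_xml = render_core_site(KERBEROS_PROPERTIES)
print(core_site_xml)
```

Generating the fragment from a dictionary keeps the security-relevant settings in one reviewable place, which is in the spirit of the single-policy approach described earlier.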

(11)

Perimeter Security with Apache Knox

Incubated and led by Hortonworks, Apache Knox provides a simple and open framework for Hadoop perimeter security.

• Single, simple point of access for a cluster

• Central controls ensure consistency across one or more clusters

• Integrated with existing systems to simplify identity maintenance

Features:

• Eliminates SSH “edge node”

• Central API management

• Central audit control

• Simple service-level authorization

• SSO integration – Siteminder and OAM*

• LDAP & AD integration

• Single Hadoop access point

• REST API hierarchy

• Consolidated API calls
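To make the "single Hadoop access point" idea concrete, here is a minimal sketch that builds a WebHDFS URL routed through a Knox gateway. It assumes the commonly documented Knox URL layout (gateway port 8443, gateway path "gateway", then a topology name); the host knox.example.com, the topology name "default" and the helper knox_webhdfs_url are placeholders, not values from this presentation.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, topology, hdfs_path, op,
                     gateway_port=8443, gateway_path="gateway"):
    """Build a WebHDFS URL that goes through a Knox gateway.

    Hypothetical helper: clients address one gateway endpoint instead of
    individual cluster hosts. Port and gateway path defaults are assumptions.
    """
    encoded_path = quote(hdfs_path.lstrip("/"))
    query = urlencode({"op": op})
    return (f"https://{gateway_host}:{gateway_port}/{gateway_path}/"
            f"{topology}/webhdfs/v1/{encoded_path}?{query}")


url = knox_webhdfs_url("knox.example.com", "default", "/tmp", "LISTSTATUS")
print(url)
```

Every REST call then shares the same host, port and URL hierarchy, which is what allows API management and auditing to be applied in one place.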

(12)

Authorization and Audit

Authorization

Fine-grained access control:

• HDFS – Folder, File

• Hive – Database, Table, Column

• HBase – Table, Column Family, Column

Audit

Extensive user access auditing in HDFS, Hive and HBase:

• IP address

• Resource type / resource

• Timestamp

• Access granted or denied

Control access into system

Flexibility in defining policies
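The relationship between fine-grained policies and audit records can be sketched as follows. This is a toy model, not the XA Secure or Apache Argus API: the Policy and Authorizer classes, their fields and the sample users are all hypothetical. It only shows a resource hierarchy (for example Hive database, table, column) being matched by prefix, with each decision logged using the audit fields listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    resource_type: str       # "hdfs", "hive", or "hbase"
    resource_prefix: tuple   # e.g. ("sales", "orders") = Hive database, table
    users: frozenset         # users the policy grants access to


@dataclass
class Authorizer:
    policies: list
    audit_log: list = field(default_factory=list)

    def check(self, user, ip, resource_type, resource):
        """Return True if a policy covers the resource; always log the decision."""
        granted = any(
            p.resource_type == resource_type
            and user in p.users
            and tuple(resource[:len(p.resource_prefix)]) == p.resource_prefix
            for p in self.policies
        )
        # Audit record fields mirror the slide: IP, resource, timestamp, result.
        self.audit_log.append({
            "user": user,
            "ip": ip,
            "resource_type": resource_type,
            "resource": "/".join(resource),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "result": "granted" if granted else "denied",
        })
        return granted


# Column-level Hive policy: alice may read only sales.orders.amount.
authz = Authorizer([Policy("hive", ("sales", "orders", "amount"), frozenset({"alice"}))])
ok = authz.check("alice", "10.0.0.5", "hive", ("sales", "orders", "amount"))
denied = authz.check("alice", "10.0.0.5", "hive", ("sales", "orders", "card_number"))
print(ok, denied, len(authz.audit_log))
```

A shorter prefix such as ("sales", "orders") would grant the whole table, which is how one policy model can cover database, table and column granularity. Denied requests are logged as well, so the audit trail covers every access attempt.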

(13)

Data Protection

HDP allows you to apply data protection policy at three different layers across the Hadoop stack

Layer | What? | How?

Storage | Encrypt data while it is at rest | Partners, OS-level encryption, custom code

Transmission | Encrypt data as it moves | Supported in HDP 2.1

(14)

Thank You!
