Academic year: 2021


POWER PROTECT PROMOTE

Meaning Based Computing: Managing the Avalanche of Unstructured Data


Accelerated unstructured information growth

• Every user in an enterprise is now an information producer and consumer, resulting in:

– Increasing costs of information storage

– Difficulty accessing specific information

– End user expectations not being met

Proliferation of Personal Devices

Growth of Email, Collaboration & Social Networking in the Enterprise


Content is now also interactive


Increasing complexity of unstructured information

- Convergence of Physical and Digital Information

- Information is everywhere – on paper and tapes, on edge devices, on-premises servers and now in the cloud

- 1,800 exabytes of new data expected in 2011 (Kahn Consulting 2010)

- 200+ billion emails per day (Kahn Consulting 2010)

- Information silos

• Proliferation of media types and formats

- Globalization of business

• Distributed architectures and organizations

• Virtualization

• Business velocity

• RTO/RPO expectations are tightening up


Unstructured information is risk

- More intrusive regulations

• 20,000+ records retention laws

• Higher expectations to be able to find information quickly

• Tougher sanctions for failure to comply (including incarceration)

• Reputational risk for high-profile breaches

• Growing understanding by regulators and legislators of what technology can achieve

- Enterprise information a target of legal discovery

• eRecords defined - FRCP Rule 26; ESI & metadata are records

• Spoliation judgments are growing

• There were more eDiscovery sanction cases in 2009 than in all years prior to 2005 combined (Duke Law report)


Evolving eDiscovery challenges

• Includes all electronic content:

– even social media, audio, video and SMS texts

• Risks and costs associated with handing off client data between systems

• Must locate, access and manage all enterprise data

• Greater number of multi-party litigation and multilingual cases

• Growing judicial scrutiny and larger, more punitive spoliation sanctions

“The average cost of a single case can easily exceed $1.75 million.” — Gartner


“On average, businesses spend $4 million per year for a $1 billion in revenue company.” — Cohasset


Information management pain points for organizations

• Failure to recover from failure

• Inconsistent retention and disposition

• Ad-hoc discovery

• Inefficient business processes

• Email management

• Unprotected laptops

• Complex backup process

• Over-preservation of information

• Storage sprawl

• Inadvertent deletion

• Poor access controls

• Increasing litigation risks

• Expensive to access information

• Ad-hoc tape management

• What is a record?

• Information leakage

• Service continuity

• CAPEX of disk for backup

• Unable to extract business value from


Evolution: need for automated records management

ALL information must be managed, not just “records”

• Manual methods are neither scalable nor consistent

• Evolving regulatory and statutory requirements are causing constant process changes

• Greater reliance on cloud solutions


• In the face of an ongoing electronic information explosion, information management becomes an imperative. Organizations need to understand how (and why) information flows throughout their organization, from creation to destruction…

• What information does your organization really have?

• What information should you keep?

• What should you dispose of, and when?


Why should any organization care about unstructured data management?

• Reduce the cost of storing information

• Reduce the cost of using information

• Lower the risk of not meeting eDiscovery responsibilities

• Lower the risk of not meeting compliance requirements


Information management: easy concept, difficult to accomplish

• In large corporate infrastructures, information can be stored anywhere, in many forms and revisions, with little or no central control


An example of the problem…

MB of new sent and received email created each day for each user

Users

GB of new email volume created each month

TB of new email created each year

15

500


Employee managed information is the hardest to manage

1. Receive and create information

2. Information is stored in one or more storage locations

3. Employees use and re-use information

4. Employees send/forward/copy information so others can also use it

5. Information is managed(?) for later use

6. Information is determined to be valueless and is removed from the infrastructure - maybe

Now consider this cycle happens with every single employee…


An example

• Each employee has 24 GB on local hard disk and removable media

• 22,000+ employees world-wide

• 528 TB of employee-controlled information, company-wide

– Is the data your employees hold locally useful, or a risk to your organization?

• Compliance requirements

• Discovery requirements

• IP security
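The company-wide total in this example follows from simple multiplication. A minimal sketch of the arithmetic, assuming the 24 GB figure is a per-employee average and that decimal units apply (1 TB = 1,000 GB); the function name is illustrative and not from the presentation:

```python
def employee_controlled_tb(gb_per_employee: float, employees: int) -> float:
    """Total employee-held data in TB, assuming 1 TB = 1,000 GB."""
    return gb_per_employee * employees / 1000


# Figures from the slide: 24 GB per employee, 22,000 employees.
total_tb = employee_controlled_tb(24, 22_000)
print(f"{total_tb:.0f} TB")  # 528 TB
```

Because the slide says "22,000+", the 528 TB figure is best read as a lower bound.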


What’s the true cost of information?

• At its most basic, companies exist to generate and use information to produce revenue and profit…

• What constitutes the cost of generating and using information?

– Employee fully loaded salaries

– Data infrastructure: workstations, laptops, storage, internet, support…


The ROI of information

• What is the ROI of information generation?

– Revenue per employee?

– Profit per employee?


Problem #1: What information do I keep?

• Do you have a realistic information retention schedule?

• Who or what makes the real-time decision to keep or delete?

• Who or what decides how long an item should be kept and where?

• And how do I ensure information I don’t want is actually deleted?

These are the most basic questions organizations must be able to answer. If you rely on employees to make these decisions, have you trained them correctly, and have you included an audit process to ensure they are following the policies?
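One way to take the keep-or-delete decision out of individual employees' hands is to encode the retention schedule as data and apply it automatically, logging each decision for audit. The sketch below is only an illustration of that idea: the categories, retention periods, `Item` fields, and legal-hold flag are all hypothetical placeholders, not part of the presentation and not legal guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention schedule: category -> years to keep.
RETENTION_YEARS = {"contract": 7, "invoice": 5, "general_email": 1}


@dataclass
class Item:
    name: str
    category: str
    created: date
    legal_hold: bool = False


def disposition(item: Item, today: date) -> str:
    """Return 'keep', 'delete', or 'review' for one item."""
    if item.legal_hold:
        return "keep"  # items under legal hold are never deleted
    years = RETENTION_YEARS.get(item.category)
    if years is None:
        return "review"  # no rule in the schedule: route to a person
    expires = item.created + timedelta(days=365 * years)
    return "keep" if today < expires else "delete"


def audit(items, today):
    """Record every decision so the policy can be checked later."""
    return [(item.name, disposition(item, today)) for item in items]
```

Uncategorized items are routed to review rather than deleted, which keeps the automated path conservative.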


Total cost of management of information (TCM)

• Reduce the cost of owning / storing information by including intelligence to help with the millions of decisions that must be made daily

• Instead of “keep or delete everything” or “one size fits all”…

[Diagram labels: Enterprise Information; “Look Into”; “Look Across”; Intelligence; Compliant Data; Risk-Relevant Data; Transactional Data; Critical Information; Relevant Information; Cost Effective, Tiered Storage; Delete]


Problem #2: Can I find the Information when I need it?

• Can you find a specific piece of information… quickly?

• Can you find all information related to a specific topic or concept?

• No matter where it’s stored?

Saving information is easy; finding specific information is very difficult. Retained information you can’t find and use when you need to is valueless… and


Intelligent management of enterprise information

Really knowing what you have…

• Reduce risk associated with information and enhance compliance

• Reduce the total cost of managing information

• Enable true sharing of information

• Maximize the business value of the information


ECM - Move and manage all records into one repository

[Diagram labels: On-Premise Infrastructure; Employee-Controlled Edge Data; File Servers; Email Servers; SharePoint Servers; Other Systems; Cloud Storage; ECM Systems; Yes; No]


Store and access in place – utilize existing system indexes

[Diagram labels: On-Premise Infrastructure; Employee-Controlled Edge Data; File Servers; Email Servers; SharePoint Servers; Other Systems; Cloud Storage; Index (×5); No indexes available; Search capabilities of each application]


Store and manage in place – common index

[Diagram labels: Super Indexer; On-Premise Infrastructure; Employee-Controlled Edge Data; File Servers; Email Servers; SharePoint Servers; Other Systems; Cloud Storage]
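The common-index approach, in which content stays in its original repository while a single index spans all of them, can be sketched as follows. This is a minimal keyword-based illustration under assumed inputs: the repository names, document IDs, and texts are invented, and a plain inverted index stands in for whatever indexing technology the presenter has in mind.

```python
from collections import defaultdict


def build_common_index(repositories):
    """Map each term to the (repository, doc_id) pairs that contain it."""
    index = defaultdict(set)
    for repo_name, documents in repositories.items():
        for doc_id, text in documents.items():
            for term in text.lower().split():
                index[term].add((repo_name, doc_id))
    return index


def search(index, query):
    """Return the locations that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical repositories; the documents are never moved, only indexed.
repositories = {
    "file_server": {"q3.docx": "quarterly retention report"},
    "email_server": {"msg-17": "please send the retention schedule"},
    "sharepoint": {"wiki-4": "backup schedule for laptops"},
}
index = build_common_index(repositories)
print(sorted(search(index, "retention")))
```

A single query returns hits from both the file server and the email server, which is the property that distinguishes a common index from the per-application search shown on the previous slide.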


Meaning-Based Computing

• Advanced discovery & investigations

• Rich user search

• Automatic records classification

• Supervision

• Surveillance

Gain control, manage cost

Email, Social, Video, IM, Audio, Files, SharePoint


Manual Processes

• Personalization from forms/questionnaires, geodemographic profiling

• Aggregate content, tag & categorize

• Hypertext links to similar content

• Searching for information

• E-mailing information to relevant recipients

• Reformatting for multi-channel delivery, e.g. PDF to XML

• Answering customer inquiries via a helpdesk

Information sources: Notes, News Feeds, Email, Internet, Database, Files, Document Management, XML, Audio/Media

Information Theory and Bayesian Inference: Integration Through Understanding

Process Automation: Aggregation, Automatic categorization, Hyperlinking, Profiling, Personalization, Collaboration, Delivery, Retrieval, Routing, Alerting
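The slide names Bayesian inference as a basis for automatic categorization but does not describe the method. As a rough illustration of the general idea only, the sketch below uses a textbook naive Bayes classifier over word counts; it is a swapped-in standard technique, not the product's algorithm, and the training texts and category names are invented.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        word_counts[category].update(text.lower().split())
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Pick the category with the highest posterior log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category, counts in word_counts.items():
        score = math.log(doc_counts[category] / total_docs)  # prior
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = category, score
    return best


# Hypothetical training examples.
examples = [
    ("invoice payment due", "finance"),
    ("payment received for invoice", "finance"),
    ("lawsuit discovery request", "legal"),
    ("court discovery deadline", "legal"),
]
word_counts, doc_counts = train(examples)
print(classify("invoice payment", word_counts, doc_counts))
```

Once items are categorized this way, the category can drive the routing, alerting, and retention steps listed above without a person tagging each item.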


Thank You

Chris Fellers
