
Text Analytics Beginner’s Guide

Extracting Meaning from Unstructured Data


Text Analytics | Use Cases | Terms | Trends | Scenario | Resources

©2013 Angoss Software Corporation. All rights reserved.

Contents

Text Analytics
Use Cases
Terms
Trends
Scenario
Resources


Text Analytics

Powerful trends in social media, e-discovery, customer service (call center transcriptions of voice calls, customer complaint emails and instant messaging) and customer-centric business strategies are driving IT leaders to consider text analytics as a business tool.

The transformed information from text analytics can be combined with structured data (e.g., sales and demographic data) and analyzed using various business intelligence or predictive and automated discovery techniques.
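As a minimal sketch of this combination step, the Python below joins text-derived features to structured customer records on a shared customer ID. The record layout, the field names (`region`, `total_spend`, `avg_sentiment`, `top_theme`) and all values are hypothetical; they are not taken from any Angoss product.

```python
# Hypothetical structured data (e.g., sales and demographics), keyed by customer ID.
structured = {
    "C001": {"region": "East", "total_spend": 120.50},
    "C002": {"region": "West", "total_spend": 45.00},
}

# Hypothetical output of a text analytics step over each customer's feedback.
text_features = {
    "C001": {"avg_sentiment": 0.4, "top_theme": "fast delivery"},
}

def combine(structured_rows, text_rows):
    """Left-join text-derived features onto structured rows by customer ID."""
    empty = {"avg_sentiment": None, "top_theme": None}
    combined = {}
    for customer_id, row in structured_rows.items():
        merged = dict(row)                                 # copy, do not mutate the input
        merged.update(text_rows.get(customer_id, empty))   # fill gaps with None
        combined[customer_id] = merged
    return combined

for customer_id, row in combine(structured, text_features).items():
    print(customer_id, row)
```

Customers with no analyzed text keep their structured fields and receive `None` for the text-derived ones, so downstream analysis can decide how to treat the gaps.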

Successful companies today listen to and understand what customers are saying, and act on that feedback by using text analytics to incorporate the voice of the customer (VOC) into business strategies for sales, marketing and customer service.


What is Text Analytics?

Text analytics is the process of analyzing unstructured text, extracting relevant information, and transforming it into structured information that can be leveraged in various ways.

Text analytics describes a set of linguistic, statistical and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research or investigation.


Structured Data

Today, 80% of business information originates in unstructured data, primarily text with no identifiable structure, although structured data continues to be the primary source for business intelligence.


Unstructured Data

INTERNAL

• Emails
• Customer Surveys
• Documents
• Call Center Notes
• Claims Records
• Customer Forms
• Customer Letters

EXTERNAL

• Blogs
• Social Media
• Tweets
• Online Forums
• Articles / Reports
• Web


Use Cases

Text Analytics transforms unstructured data into structured data for analysis to help...

• Monitor and analyze brand reputation

• Determine purchase behavior

• Identify product issues

• Summarize surveys, customer reviews

• Improve customer service and customer experience management

• Understand customer feedback

• Improve customer retention

• Predict and reduce churn

• Identify and reduce claims fraud

• Develop cross-sell, upsell strategies

• Design next best offer strategies


Marketing

• Voice of customer
• Social media analysis
• Churn analysis
• Market research
• Survey analysis

Business

• Competitive intelligence
• Document categorization
• Human resources
• Records retention
• Risk analysis
• Website navigation

Industry-Specific

• News feeds analysis
• Fraud detection
• E-discovery
• Warranty analysis
• Medical research


Terms

1. Entity: “Who, where, when” is being discussed?

2. Theme: “What” are the important words?

3. Classification: “What” are the important concepts?

4. Sentiment: “How” is the conversation going? Is it positive or negative?

Or…given a collection of text, text analytics tells you who, where, when, what, and how so that you can figure out ‘why’.


Entity

“Who, where, when” is being discussed?

Yahoo wants to make its Web e-mail service a place you never want to – or more importantly – have to leave to get your social fix.

The company on Wednesday is releasing an overhauled version of its Yahoo Mail Beta client that it says is twice as fast as the previous version, while managing to tack on new features like an integrated Twitter client, rich media previews and a more full-featured instant messaging client.

Yahoo says this speed boost should be especially noticeable to users outside the U.S. with latency issues, due mostly to the new version making use of the company's cloud computing technology. This means that if you're on a spotty connection, the app can adjust its behavior to keep pages from timing out, or becoming unresponsive.

Besides the speed and performance increase, which Yahoo says were the top users requests, the company has added a very robust Twitter client, which joins the existing social-sharing tools for Facebook and Yahoo.

Entity      Type
Yahoo       Company
Twitter     Company
Facebook    Company
U.S.        Place
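One simple way to produce a table like this is a dictionary (gazetteer) lookup. The sketch below is illustrative only: the gazetteer contents and the function name `extract_entities` are assumptions, the sample is an abridged excerpt of the article above, and commercial tools typically rely on trained models rather than a fixed list.

```python
import re

# Hypothetical gazetteer: maps a known name to its entity type.
GAZETTEER = {
    "Yahoo": "Company",
    "Twitter": "Company",
    "Facebook": "Company",
    "U.S.": "Place",
}

def extract_entities(text, gazetteer=GAZETTEER):
    """Return {entity: (type, mention_count)} for gazetteer names found in text."""
    found = {}
    for name, entity_type in gazetteer.items():
        # Match the name only when it is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        count = len(re.findall(pattern, text))
        if count:
            found[name] = (entity_type, count)
    return found

# Abridged excerpt of the article quoted above.
sample = (
    "Yahoo says this speed boost should be especially noticeable to users "
    "outside the U.S. with latency issues. The company has added a very robust "
    "Twitter client, which joins the existing social-sharing tools for "
    "Facebook and Yahoo."
)

for name, (entity_type, count) in extract_entities(sample).items():
    print(f"{name:10} {entity_type:8} mentions={count}")
```

A lookup like this only finds names it already knows, which is why it serves as a starting point rather than a complete entity extractor.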


Theme

“What” are the important words being used?

Themes found in the Yahoo Mail article above:

Theme                        Score
Cloud computing technology   4.11
E-mail service               2.672
Top users requests           2.669
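The guide does not say how these theme scores are calculated. As a stand-in, the sketch below uses a RAKE-style heuristic: candidate phrases are runs of non-stopwords, each word is scored as its degree divided by its frequency, and a phrase scores the sum of its words. The stopword list, function names and sample text are assumptions, and the numbers will not match the scores shown in the guide.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"the", "a", "of", "to", "is", "uses", "powers"}

def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text into runs of consecutive non-stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current = []
        for word in re.findall(r"[a-z][a-z'-]*", fragment):
            if word in stopwords:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def score_phrases(phrases):
    """Score each phrase as the sum of degree/frequency over its words."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # phrase length counts the word itself
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}

text = ("Cloud computing technology powers the mail service. "
        "The mail service uses cloud computing.")
scores = score_phrases(candidate_phrases(text))
for phrase, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{phrase:30} {score:.2f}")
```

Longer phrases built from words that usually appear in multi-word runs score highest, which is why "cloud computing technology" ranks above "mail service" in this sample.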


Classification/Concepts

“What” are the important, high-level concepts?

Concepts assigned to the Yahoo Mail article above:

Concept                 Score
Software and Internet   0.56
Social Media            0.60
Technology              0.49
Business                0.72
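To illustrate what a concept score can mean, the sketch below scores each concept by the fraction of its keywords that appear in the text. The keyword lists and the function name `concept_scores` are hypothetical, and real classifiers usually rely on trained models or curated taxonomies, so this is not how the scores above were produced.

```python
import re

# Hypothetical keyword lists per concept (assumptions for illustration).
CONCEPT_KEYWORDS = {
    "Social Media": {"twitter", "facebook", "social", "sharing"},
    "Technology": {"cloud", "computing", "latency", "app"},
    "Business": {"company", "users", "service", "release"},
}

def concept_scores(text, concept_keywords=CONCEPT_KEYWORDS):
    """Score each concept as the share of its keywords present in the text (0 to 1)."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {
        concept: len(keywords & tokens) / len(keywords)
        for concept, keywords in concept_keywords.items()
    }

sample = ("The company added a Twitter client next to its Facebook tools, "
          "using cloud computing.")
for concept, score in concept_scores(sample).items():
    print(f"{concept:15} {score:.2f}")
```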


Sentiment

“How” is the conversation going? Positive or negative?

Sentiment scores for the Yahoo Mail article above:

Entity                       Sentiment
Yahoo                        0.534
Twitter                      0.48
Facebook                     0.534

Concept                      Sentiment
Software and Internet        0.0
Social Media                 0.48
Technology                   0.49

Theme                        Sentiment
Cloud computing technology   1.3
Mail service                 0.16
Top user requests            0.83
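A common baseline for sentiment is a word lexicon: each known word carries a polarity weight, a sentence scores the average weight of its matched words, and an entity scores the average over the sentences that mention it. The sketch below follows that idea; the lexicon, weights, scale and sample sentences are all assumptions and do not reproduce the scores shown in the guide.

```python
import re

# Hypothetical polarity lexicon (weights are illustrative assumptions).
LEXICON = {
    "fast": 1.0, "robust": 1.0, "rich": 0.5,
    "issues": -0.5, "spotty": -1.0, "unresponsive": -1.0,
}

def sentence_sentiment(sentence, lexicon=LEXICON):
    """Average polarity of lexicon words in the sentence; 0.0 if none match."""
    words = re.findall(r"[a-z]+", sentence.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

def entity_sentiment(sentences, entity, lexicon=LEXICON):
    """Average sentiment of the sentences that mention the entity, or None."""
    scores = [
        sentence_sentiment(s, lexicon)
        for s in sentences
        if entity.lower() in s.lower()
    ]
    return sum(scores) / len(scores) if scores else None

sentences = [
    "Yahoo Mail is twice as fast as the previous version.",
    "The company has added a very robust Twitter client.",
    "Users on a spotty connection saw pages become unresponsive.",
]
for entity in ("Yahoo", "Twitter", "Facebook"):
    print(entity, entity_sentiment(sentences, entity))
```

Returning `None` for an entity that is never mentioned keeps "no evidence" distinct from a neutral score of 0.0.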


Trends

1. Social media analytics adoption drives text analytics.

2. Analytics moves beyond sentiment analysis.

3. The market begins to get the connection between text and Big Data.

4. Marrying structured and unstructured data becomes more popular.

5. The cloud becomes more popular for text analytics.

Text Analytics Victory Index Report, January, 2013


Scenario

Book Reviews: Customer Feedback

An online book retailer tracks customer feedback by analyzing reviews and comments from online forums and social media.

The retailer uses Angoss KnowledgeREADER™ to extract meaning from the text, discovering what is being discussed and how it is discussed (the sentiment, positive or negative), and to answer:

• What are customers saying on a regional basis?
• How frequently do certain entities, themes and topics occur?
• Which themes and topics occur together, and are related?
• How is sentiment trending over time?
• What is the context of what is being discussed at the document level?


Sentiment Dashboard

• Sentiment breakdown across all reviews
• Sentiment distribution across all documents
• Sentiment distribution for Top 10 topics, themes and entities


Comparison Analysis

The retailer can compare overall sentiment across stores, or isolate individual topics, themes, entities and phrases to determine how those items are discussed in different regions.

For example, you can see that the topic “Technology” is viewed more negatively in Store 2, but it is also discussed more frequently there.


Trend Analysis

By isolating topics, themes, entities or phrases, the retailer can examine how frequently they were mentioned.

They can also view how customer sentiment regarding these terms changed alongside the frequency of their occurrence.
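The calculation behind a trend view like this can be sketched as a group-by over time: for one term, count the mentions per month and average the sentiment of the reviews that mention it. The review records, the scores and the function name `term_trend` below are hypothetical.

```python
from collections import defaultdict

# Hypothetical review records: (ISO date, text, sentiment score).
reviews = [
    ("2013-01-05", "Great plot, weak ending", 0.2),
    ("2013-01-20", "The ending felt rushed", -0.4),
    ("2013-02-11", "Loved the ending", 0.8),
    ("2013-02-15", "Beautiful cover art", 0.6),
]

def term_trend(records, term):
    """Return {month: (mention_count, average_sentiment)} for reviews mentioning term."""
    buckets = defaultdict(list)
    for date, text, sentiment in records:
        if term.lower() in text.lower():
            buckets[date[:7]].append(sentiment)  # "YYYY-MM" prefix of the date
    return {
        month: (len(scores), sum(scores) / len(scores))
        for month, scores in sorted(buckets.items())
    }

for month, (count, avg) in term_trend(reviews, "ending").items():
    print(month, count, round(avg, 2))
```

Reading the two values side by side shows whether a term is being discussed more often, more favorably, or both.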


Association Discovery

Using the Association Map, the retailer can visually determine the frequency with which certain terms occur, and how closely they relate to other terms used in customer reviews.

The retailer can quickly assess how well certain subjects are received, and how much relative interest their customers have in those subjects.
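The guide does not describe how the Association Map is computed. One plausible basis, sketched below under that assumption, is document-level co-occurrence: count how many reviews contain each term, how many contain each pair of terms, and use the Jaccard ratio as a measure of how closely two terms relate. The term lists and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(doc_terms):
    """Count documents per term and documents per unordered term pair."""
    term_freq = Counter()
    pair_freq = Counter()
    for terms in doc_terms:
        unique = sorted(set(terms))               # one count per document
        term_freq.update(unique)
        pair_freq.update(combinations(unique, 2))
    return term_freq, pair_freq

def jaccard(term_a, term_b, term_freq, pair_freq):
    """Share of documents containing both terms among those containing either."""
    both = pair_freq[tuple(sorted((term_a, term_b)))]
    either = term_freq[term_a] + term_freq[term_b] - both
    return both / either if either else 0.0

# Hypothetical terms already extracted from four reviews.
docs = [
    ["plot", "characters", "ending"],
    ["plot", "ending"],
    ["price", "shipping"],
    ["plot", "price"],
]
term_freq, pair_freq = cooccurrence(docs)
print(term_freq.most_common(3))
print(round(jaccard("plot", "ending", term_freq, pair_freq), 2))
```

Term frequency would drive how prominent a term appears, and the pairwise ratio would drive how strongly two terms are linked.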


Document Summary

Individual terms can be isolated, as well as the sentences and documents that reference them – giving you a detailed look at the context used in reviews.

Each text record can be completely isolated for a full examination of the content and sentiment contained within.


Decision Tree

KnowledgeREADER can be used to analyze the output of your text analysis with structured data, and use data mining and predictive analytics techniques to expand customer insights.

In this example, the retailer has created a Decision Tree that allows them to determine the price breakdown across book genres. The Decision Tree uses ‘High’ and ‘Low’ price brackets to segment genres.

The retailer can now determine if there is a correlation between price, genre and overall sentiment. They may use these insights to inform product inventory or pricing decisions.

Decision Tree: price bracket split on rank_1_topic. High and Low percentages are shares within each node; the Total percentage is the node's share of all 188,805 records.

High              Low                Total                Topics
26,820 (14.21%)   161,985 (85.79%)   188,805 (100.00%)    All records (root)
8,662 (10.94%)    70,489 (89.06%)    79,151 (41.92%)      null, Automotive, Hotels, Video Games, Weather
5,906 (17.57%)    27,711 (82.43%)    33,617 (17.81%)      Advertising, Aviation, Education, Investing, Law, Religion
3,037 (13.43%)    19,583 (86.57%)    22,620 (11.98%)      Agriculture, Art, Biotechnology, Crime, Disasters, Food, Politics, Space, Sports
1,136 (9.16%)     11,270 (90.84%)    12,406 (6.57%)       Banking, Beverages, Marriage, Real Estate, Renewable Energy, Robotics, Travel
1,862 (19.87%)    7,509 (80.13%)     9,371 (4.96%)        Business, Economics, Mobile Devices
889 (15.18%)      4,968 (84.82%)     5,857 (3.10%)        Elections, Fashion, Intellectual Property, Labor, Popular Culture
551 (23.93%)      1,752 (76.07%)     2,303 (1.22%)        Environment, Social Media
92 (30.77%)       207 (69.23%)       299 (0.16%)          Hardware
853 (11.83%)      6,356 (88.17%)     7,209 (3.82%)        Health, Traditional Energy
3,064 (21.83%)    10,971 (78.17%)    14,035 (7.43%)       Science, Technology, War
768 (39.65%)      1,169 (60.35%)     1,937 (1.03%)        Software and Internet
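The percentages in the Decision Tree figures above follow directly from the counts. As a short worked check, the sketch below recomputes them for three nodes whose counts are copied from the tree; the function name `shares` is an illustrative choice.

```python
# (label, high_count, low_count) copied from three nodes of the Decision Tree above.
nodes = [
    ("All records", 26_820, 161_985),
    ("Hardware", 92, 207),
    ("Software and Internet", 768, 1_169),
]
ROOT_TOTAL = 188_805  # total number of records at the root

def shares(high, low, root_total=ROOT_TOTAL):
    """Return (% High in node, % Low in node, node's % of all records)."""
    total = high + low
    return (
        round(100 * high / total, 2),
        round(100 * low / total, 2),
        round(100 * total / root_total, 2),
    )

for label, high, low in nodes:
    print(label, shares(high, low))
```

The High and Low shares are computed within a node, while the third value compares the node to the whole data set. That is why a small node such as Hardware can show a High share of 30.77% while holding only 0.16% of all records.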


Strategy Tree

KnowledgeREADER can be used to build and deploy predictive strategies with Strategy Trees.

Here, the retailer has identified segments based on price and genre. In addition, they can track key metrics that drive store performance.

Combined with the text analysis output, this measures the average sentiment, rating, sale price and the most common themes discussed in each segment.

By associating a treatment with each segment, the retailer can automatically assign specific actions or activities to each segment.

Now, the book retailer can quickly turn insight into action.

Strategy Tree: segment metrics and treatments

All records: Total 188,805 (100.00%); Avg Rating 4.20; Avg Sale Price $15.05; Avg Sentiment 0.22; Most Common Phrase: great

Split on word_count:

• word_count = null: Total 17 (0.01%); Avg Rating 4.65; Avg Sale Price $13.24; Avg Sentiment null; Most Common Phrase: null; Treatment: Ignore
• word_count in [1,5644]: Total 188,788 (99.99%); Avg Rating 4.20; Avg Sale Price $15.05; Avg Sentiment 0.22; Most Common Phrase: great

Split on rating (for word_count in [1,5644]):

• rating in [1,4]: Total 75,890 (40.19%); Avg Rating 3.01; Avg Sale Price $14.71; Avg Sentiment 0.12; Most Common Phrase: great; Treatment: Ignore
• rating = 5: Price High 16,414 (14.54%), Low 96,484 (85.46%); Total 112,898 (59.80%); Avg Rating 5.00; Avg Sale Price $15.27; Avg Sentiment 0.30; Most Common Phrase: wonderful

Split on rank_1_topic (for rating = 5):

• null, Beverages, Hotels, Real Estate, Video Games, Weather: Price High 5,208 (10.67%), Low 43,581 (89.33%); Total 48,789 (25.84%); Avg Rating 5.00; Avg Sale Price $13.45; Avg Sentiment 0.34; Most Common Phrase: wonderful; Treatment: E-Mail BOGO
• Advertising, Aviation, Business, Economics: Price High 1,211 (21.10%), Low 4,529 (78.90%); Total 5,740 (3.04%); Avg Rating 5.00; Avg Sale Price $18.65; Avg Sentiment 0.30; Most Common Phrase: great; Treatment: E-Mail New Hot Reads
• Agriculture, Art, Crime, Disasters, Health, Space, Sports, Traditional Energy: Price High 1,755 (13.67%), Low 11,079 (86.33%); Total 12,834 (6.80%); Avg Rating 5.00; Avg Sale Price $15.32; Avg Sentiment 0.24; Most Common Phrase: wonderful; Treatment: E-Mail Buy 3 Get 4th Free
• Automotive, Banking, Marriage, Renewable Energy, Robotics, Travel: Price High 618 (9.61%), Low 5,813 (90.39%); Total 6,431 (3.41%); Avg Rating 5.00; Avg Sale Price $12.40; Avg Sentiment 0.26; Most Common Phrase: wonderful; Treatment: E-Mail BOGO
• Biotechnology, Elections, Science, Technology, War: Price High 1,819 (22.68%), Low 6,200 (77.32%); Total 8,019 (4.25%); Avg Rating 5.00; Avg Sale Price $17.98; Avg Sentiment 0.23; Most Common Phrase: wonderful; Treatment: E-Mail 25% Off Coupon
• Education, Intellectual Property, Labor, Law, Religion: Price High 3,725 (18.14%), Low 16,815 (81.86%); Total 20,540 (10.88%); Avg Rating 5.00; Avg Sale Price $17.05; Avg Sentiment 0.27; Most Common Phrase: wonderful; Treatment: E-Mail New Hot Reads
• Environment, Hardware: Price High 369 (25.31%), Low 1,089 (74.69%); Total 1,458 (0.77%); Avg Rating 5.00; Avg Sale Price $20.97; Avg Sentiment 0.24; Most Common Phrase: wonderful; Treatment: E-Mail 25% Off Coupon
• Fashion, Food, Investing, Mobile Devices, Politics, Popular Culture: Price High 1,295 (15.99%), Low 6,802 (84.01%); Total 8,097 (4.29%); Avg Rating 5.00; Avg Sale Price $16.66; Avg Sentiment 0.29; Most Common Phrase: wonderful; Treatment: E-Mail New Hot Reads
• Social Media, Software and Internet: Price High 414 (41.82%), Low 576 (58.18%); Total 990 (0.52%); Avg Rating 5.00; Avg Sale Price $25.10; Avg Sentiment 0.34; Most Common Phrase: great; Treatment: E-Mail 25% Off Coupon
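The treatment logic in the Strategy Tree above can be read as a small set of rules: ignore records with no word count, ignore ratings below 5, and otherwise pick a treatment from the top-ranked topic. The sketch below encodes that reading. The function name `assign_treatment` is hypothetical, and the topic mapping is deliberately partial, with one topic copied from each leaf of the tree.

```python
# Partial topic-to-treatment mapping, one topic copied from each leaf above.
TOPIC_TREATMENT = {
    "Hotels": "E-Mail BOGO",
    "Business": "E-Mail New Hot Reads",
    "Sports": "E-Mail Buy 3 Get 4th Free",
    "Travel": "E-Mail BOGO",
    "Science": "E-Mail 25% Off Coupon",
    "Law": "E-Mail New Hot Reads",
    "Hardware": "E-Mail 25% Off Coupon",
    "Food": "E-Mail New Hot Reads",
    "Social Media": "E-Mail 25% Off Coupon",
}

def assign_treatment(word_count, rating, top_topic):
    """Follow the tree's splits: word_count, then rating, then top-ranked topic."""
    if word_count is None:        # word_count = null branch
        return "Ignore"
    if rating < 5:                # rating in [1,4] branch
        return "Ignore"
    # rating = 5: the treatment depends on the topic; None if the topic is not mapped here.
    return TOPIC_TREATMENT.get(top_topic)

print(assign_treatment(None, 5, "Travel"))        # Ignore
print(assign_treatment(120, 3, "Travel"))         # Ignore
print(assign_treatment(120, 5, "Travel"))         # E-Mail BOGO
print(assign_treatment(80, 5, "Social Media"))    # E-Mail 25% Off Coupon
```

Keeping the rules in the same order as the tree's splits makes it easy to compare the code against the tree when segments or treatments change.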


Angoss KnowledgeREADER

KnowledgeREADER is an industry-first software application that brings a new age of integrated customer intelligence by combining visual text discovery and sentiment analysis with the power of predictive analytics.

Now, customer intelligence professionals and marketers can easily understand and model customer feedback without relying on data analysts.

KnowledgeREADER delivers unparalleled customer intelligence and voice of the customer insights to support customer experience management, above and beyond what text analytics users have come to expect.

Learn more about KnowledgeREADER


Resources

Video

Quick Tour of KnowledgeREADER

Articles

Voice of the Customer, How to Move Beyond Listening to Action
Text Analytics Categorization and Concept Topics
Text Analytics Phrase and Theme Extraction
Text Analytics Sentiment Extraction
Text Analytics Named Entity Extraction

Brochure

KnowledgeREADER

Web

KnowledgeREADER


About Angoss

Angoss Software Corporation is a global leader in delivering business intelligence software and predictive analytics to businesses looking to improve performance across sales, marketing and risk. With a suite of desktop, client-server and big data software products and Cloud solutions, Angoss delivers powerful approaches to turn information into actionable business decisions and competitive advantage. Angoss software products and solutions are user-friendly and agile, making predictive analytics accessible and easy to use.

For more information visit www.angoss.com.
