A Monthly Double-Blind Peer Reviewed Refereed Open Access International e-Journal - Included in the International Serial Directories.

GE-International Journal of Management Research (GE-IJMR) ISSN: (2321-1709)

GE-International Journal of Management Research

Vol. 4, Issue 7, July 2016 IF- 4.88 ISSN: (2321-1709)

© Associated Asia Research Foundation (AARF)

Website: www.aarf.asia Email : [email protected] , [email protected]

USING SOCIAL MEDIA DATA SET AS A KEY INPUT TO ECONOMIC INDICATORS

Mr. Harish Kamath, B.Sc., MBA (Ph.D.)

Research Scholar

ISBR Research Centre

(Recognized by the University of Mysore)

No. 107, Electronics City – Phase I

Near Infosys, Behind BSNL Telephone Exchange

Bangalore 560100

Dr. Noor Firdoos Jahan, Ph.D

Professor

R. V. Institute of Management

CA 17, 26th Main, 36th Cross

4th T Block, Jayanagar

Bangalore 560041

ABSTRACT

Economic indicators and reports are published periodically, primarily by government agencies. These statistics are vital for building the right strategy for economic growth, and economic forecasting is hard without reliable statistics on key economic indicators. The accuracy of the data is key to decision making, yet obtaining the right data in sufficient volume is always a challenge for statisticians. A small volume of data can yield a different index value than a large dataset. Economic indicators will be more accurate with the right volume of data and the right contextual data. Social media holds a huge collection of real-time data. Patterns identified in certain trends can be good pointers for building key economic indicators. This paper focuses on using the social media dataset as one of the key inputs for deriving key economic indicators.

KEYWORDS-JEL: A13, D81, E51, G21, G32, C81, C82, E24, J60

1. Introduction

This paper examines the current method of deriving one key economic indicator as a case study. It also suggests a conceptual model for building the infrastructure needed to integrate social media data into a larger dataset. The scope of this paper is to examine the validity of using the social media dataset as a key input to improve the accuracy of key economic indicators, and to build a conceptual model for one sample economic indicator to show how this objective can be achieved. Listing all possible key indicators is outside the scope of the paper.

2. Method

We will look at one key economic indicator, the unemployment rate, and explain what the social media dataset is and how it can help. We take the research on the topic “Using Social Media to Measure Labor Market Flows” (Dolan, 2014) as a baseline for this paper and extend it.

3. Social Media dataset

Social media is a term for a group of internet-based applications. Internet-based technologies enable communication channels dedicated to user-generated input, interaction, content sharing and collaboration. Social media includes popular networking websites such as Facebook, Instagram, Twitter, Pinterest, Google+, LinkedIn and YouTube; which application is used depends on the user’s choice. The dataset will contain the conversation [...] behavior. Extrapolating the data to the subject of interest is critical to the accuracy of the outcome. The volume, real-time nature and contextual richness of the data, together with the involvement of a majority of the population, make the dataset more relevant. According to Global Web Index (GWI), people spend 28% of their online time on social media activities and about 13% on microblogging. The following image summarizes the involvement of the population (Chaffey, 2016).

Image 1. User Base of Social Media. Source: SmartInsights.com

There is apprehension about how the use of this data could compromise privacy. The dataset is used without personal context. To illustrate, if person A tweets “I lost job”, the content of the tweet is used without knowing who posted it. Location and time information remain important, but they are detached from the individual’s identity.
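As a minimal sketch of this detachment, assuming a hypothetical record layout (the RawPost and AnonymousPost types and their field names are illustrative and do not come from any platform's API), the author identifier can be dropped before analysis while the content, location and time are retained:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RawPost:
    user_id: str          # hypothetical field identifying the author
    text: str
    location: str
    posted_at: datetime


@dataclass(frozen=True)
class AnonymousPost:
    text: str
    location: str
    posted_at: datetime


def strip_identity(post: RawPost) -> AnonymousPost:
    """Keep content, location and time; drop the author identifier."""
    return AnonymousPost(post.text, post.location, post.posted_at)


# Illustrative record only; the location and timestamp are made up.
raw = RawPost("person_A", "I lost job", "Bangalore", datetime(2016, 7, 1, 9, 30))
anonymous = strip_identity(raw)
print(anonymous)
```

Only the anonymous records would be passed on to the later filtering and sentiment steps. Free text can still contain names or other identifying details, which this sketch does not attempt to remove.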

4. Limitations

Social media datasets can be used to analyse various aspects of human behavior, but the truthfulness of the data cannot be verified. That is the downside of the dataset. There have been several commercial uses of social media datasets. This may add up to a certain percentage of error, which needs to be accommodated in the model.

5. Economic Indexes

An economic indicator is any statistic related to macro- or microeconomics, such as [...] well the economy is doing and how well the economy is going to do in the future. This information is critical for economists, businesses and lawmakers to make decisions and build a winning strategy (Moffatt, 2016).

An economic indicator has three dimensions: economic index trend vs. economy trend, frequency of data, and timing.

Economic index trend vs. economy trend. There are primarily two types.

1. Pro-cyclic: economic indexes that move in tandem with the economy. For example, if GDP is moving upwards, the economy is moving in the same direction.

2. Counter-cyclic: the economy moves in the opposite direction to the economic index. The best example is the unemployment rate: if this index rises, the economy is moving in the opposite direction.
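One way to make this distinction operational, offered here only as a sketch and not as a method from the paper, is to look at the sign of the correlation between changes in the index and changes in a measure of the economy. The function names and the numbers below are assumptions made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cyclicality(index_changes, economy_changes):
    """Label an index by the sign of its co-movement with the economy."""
    r = pearson(index_changes, economy_changes)
    if r > 0:
        return "pro-cyclic"
    if r < 0:
        return "counter-cyclic"
    return "undetermined"


# Made-up period-over-period changes, for illustration only.
economy = [0.5, 0.8, -0.3, -0.6, 0.4]
unemployment = [-0.2, -0.3, 0.4, 0.5, -0.1]
print(cyclicality(unemployment, economy))
```

With these toy series the unemployment changes move against the economy in every period, so the label is counter-cyclic, in line with the example in the text.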

Frequency of data. Economic indexes can be classified into three categories based on the timescale.

Leading: an economic indicator published ahead of time, before the event occurs. These are primarily predictions, for example expected GDP growth and stock indexes. They are predictions of the future from a model based on historical data points. Various parameters determine the projection, and a change in any one of them can cause the actual value to deviate from it.

Lagged: an economic indicator based on historical data, usually derived from recent history. Accuracy is higher here because the events have already happened and the model provides the key statistics. The unemployment rate is a lagged indicator.

Coincident: an economic indicator that simply moves at the same time as the economy does. It is mostly real-time data. Gross Domestic Product is a coincident indicator.

Time based. In most countries GDP figures are released quarterly and the unemployment rate is released monthly. Some economic indicators, such as the SENSEX and Nifty, are made [...]

Sometimes it brings a lot of value if a lagged economic indicator can be made leading. For example, if the unemployment rate can be estimated ahead of time, state agencies can be better prepared for unemployment insurance. Though this is not the focus, it is a natural outcome of this model.

6. Case study of Unemployment rate

This case study considers how the US Department of Labor calculates the unemployment rate (Bureau of Labor Statistics, 2015). About 60,000 eligible households are considered for the sample dataset for this survey, which corresponds to about 110,000 individuals. The sample is selected so that it represents the entire population of the US. About 2,000 geographically separate areas are chosen as sampling units.

Every month, 15,000 of the households in the sample are replaced so that no household is interviewed for more than 4 consecutive months. Census Bureau employees interview the 60,000 eligible sample households with relevant job-related questions, and the responses feed in as the dataset for deriving the unemployment rate using a defined model.
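Taking the two figures quoted above at face value, the rotation arithmetic can be checked directly. This is only the arithmetic implied by the text, not a description of the full survey design.

```latex
\[
\frac{15{,}000}{60{,}000} = 25\% \ \text{of the sample replaced each month},
\qquad
\frac{60{,}000}{15{,}000} = 4 \ \text{months for a full turnover of the sample}.
\]
```

A full turnover every 4 months is consistent with the stated limit of 4 consecutive monthly interviews per household.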

This is a lagged method: the data is historical rather than real-time because of the time it takes to process the entire dataset.

Now let us look at how the social media dataset can add value. As the Twitter-based dataset has already been explored in the research by Antenucci et al., we will look at using Facebook, RSS news feeds, microblogging and community blogs.

7. Model

Sentiment analysis is one of the best methods followed in the industry for social media data. The conceptual model for extracting the index is explained step by step below.

Step 1. Collect filtered data from all related channels such as Facebook, Twitter, RSS feeds and microblogging sites. For unemployment rate calculations, keywords such as “lost job”, “looking for new job”, “jobless” and “unemployment” could be used to filter the dataset. Once we have the filtered dataset, the next step is sentiment analysis. Note that there is no specific need for regional sampling; the filter will be applied only at the larger level, i.e. the [...]
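The keyword filter in Step 1 can be sketched as a plain case-insensitive substring match. This is a minimal illustration under stated assumptions: the posts are already available as strings, the function names are hypothetical, and the sample posts are made up.

```python
# Keywords taken from Step 1 of the text.
KEYWORDS = ("lost job", "looking for new job", "jobless", "unemployment")


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def filter_posts(posts, keywords=KEYWORDS):
    """Keep only the posts that mention at least one keyword."""
    return [post for post in posts if matches_keywords(post, keywords)]


# Made-up posts, for illustration only.
posts = [
    "Jobless for three months now",
    "Looking for new job in Bangalore",
    "Great weather today",
]
filtered = filter_posts(posts)
print(filtered)
```

Exact substring matching is deliberately naive. A post such as "I lost my job" does not contain the literal phrase "lost job", so a practical filter would need a broader keyword list or some form of phrase normalization.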

Step 2. Sentiment analysis classifies each row of the dataset as “positive sentiment”, “negative sentiment” or “neutral sentiment”. Positive sentiments are keywords and context that add to the count in the direction of the analysis. In this case, any keyword that suggests or confirms joblessness is a positive sentiment. The filtered data may carry positive, negative or neutral sentiment, so we need to understand the context of the keywords before performing the sentiment analysis. To illustrate, consider a dataset of 3 rows, filtered on the keyword “job”.

1. “I lost my job and looking for new one quickly” (Twitter message from person Y)

2. “I got a new job. Feeling excited” (Facebook post by person X)

3. “It’s observed that job market will be stable” (from news feeds)

Of these, the first statement suggests a job loss, which is what we are interested in counting, so it is termed a positive sentiment. The second is a negative count: we have to deduct one from the unemployment count. Hence it is a negative sentiment. The third statement contains the word “job” but carries no definite sentiment. It is neither positive nor negative, hence a neutral sentiment.
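The three-way classification of the example rows can be sketched with a simple cue-phrase rule. This is an assumption-laden stand-in for a real sentiment model: the cue lists and the function name are hypothetical, and positive here follows the paper's convention of counting towards joblessness.

```python
# Hypothetical cue phrases; a real system would use a trained sentiment model.
JOB_LOSS_CUES = ("lost my job", "lost job", "jobless", "laid off")
JOB_GAIN_CUES = ("got a new job", "got a job", "hired")


def classify(text):
    """Return +1 (suggests job loss), -1 (suggests job gain) or 0 (neutral)."""
    lowered = text.lower()
    if any(cue in lowered for cue in JOB_LOSS_CUES):
        return 1
    if any(cue in lowered for cue in JOB_GAIN_CUES):
        return -1
    return 0


# The three example rows from the text.
rows = [
    "I lost my job and looking for new one quickly",
    "I got a new job. Feeling excited",
    "It's observed that job market will be stable",
]
scores = [classify(row) for row in rows]
net_count = sum(scores)
print(scores, net_count)
```

Summing the scores gives the net contribution of the batch to the unemployment count: the first row adds one, the second deducts one, and the third contributes nothing.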

Step 3. Track this number separately while the current method of deriving the unemployment rate continues.

Step 4. The deltas between the actual unemployment rate, the traditional model and the model using the social media dataset are compared continuously, and machine learning algorithms are used to look for any relationship in the deltas.
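As a sketch of Step 4, the deltas can be computed per period and then examined for a relationship. The paper does not name a specific algorithm, so ordinary least-squares regression is used here as the simplest stand-in. All numbers are made up for illustration and the function names are hypothetical.

```python
def deltas(actual, estimate):
    """Per-period difference between an estimate and the actual rate."""
    return [e - a for a, e in zip(actual, estimate)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up monthly rates (percent), for illustration only.
actual = [5.0, 5.2, 5.1, 4.9]
traditional = [5.1, 5.2, 5.0, 5.0]
social = [5.4, 5.7, 5.5, 5.2]

traditional_delta = deltas(actual, traditional)
social_delta = deltas(actual, social)

# Does the social media delta vary systematically with the actual rate?
slope, intercept = fit_line(actual, social_delta)
print(traditional_delta, social_delta, slope, intercept)
```

A slope clearly different from zero would suggest that the social media estimate drifts with the level of the actual rate, which is the kind of relationship a correction step could exploit.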

Step 5. If it is more accurate, use the right mix of data from the traditional model and social media. Continue to use social media as a standalone index if it produces better accuracy.
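One possible reading of the data mix in Step 5 is a weighted blend of the two estimates, with the weight chosen to minimize the error against the actual rate. The sketch below assumes mean absolute error as the accuracy measure and a coarse grid search; both choices, the function names and the numbers are illustrative assumptions.

```python
def mean_abs_error(actual, estimate):
    """Average absolute difference between actual and estimated rates."""
    return sum(abs(a - e) for a, e in zip(actual, estimate)) / len(actual)


def blend(traditional, social, weight):
    """Weighted mix: weight on social media, (1 - weight) on traditional."""
    return [weight * s + (1 - weight) * t for t, s in zip(traditional, social)]


def best_weight(actual, traditional, social, steps=10):
    """Grid-search the social media weight in [0, 1] with the lowest error."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: mean_abs_error(actual, blend(traditional, social, w)),
    )


# Made-up monthly rates (percent), for illustration only.
actual = [5.0, 5.2, 5.1, 4.9]
traditional = [5.2, 5.4, 5.3, 5.1]
social = [4.8, 5.0, 4.9, 4.7]
weight = best_weight(actual, traditional, social)
print(weight)
```

A weight of 1.0 would correspond to using social media as a standalone index, and a weight of 0.0 to keeping the traditional model alone. In this toy data the two sources err in opposite directions by the same amount, so an equal mix performs best.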

Step 6. Use a feedback mechanism to observe and correct the delta by introducing an error correction index. The error correction index is derived from the delta between the actual data and the derived index. The diagram below illustrates the concept. The error index will be conceptualized on a weighted average model to ensure each source [...] multiple sources. This model is at a conceptual stage. Improving the indexes will be a continuous process.

[Diagram labels: Social Media data, Sentiment Analysis, Weighted Scoring Model, Real-time System, Feedback]

Image 2. Build feedback system
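The feedback loop can be sketched as a running correction that is updated each time an actual figure becomes available. Since the model is only conceptual in the paper, the exponentially weighted average below is one assumed form of the weighted average idea; the smoothing value, function names and numbers are illustrative.

```python
def corrected_estimate(raw_estimate, correction):
    """Apply the current error correction index to a derived index value."""
    return raw_estimate - correction


def update_correction(correction, raw_estimate, actual, smoothing=0.5):
    """Move the correction toward the latest observed delta.

    This is an exponentially weighted average of past deltas; `smoothing`
    is an assumed tuning parameter, not a value from the paper.
    """
    delta = raw_estimate - actual
    return (1 - smoothing) * correction + smoothing * delta


# Made-up (derived index, actual rate) pairs, for illustration only.
history = [(5.5, 5.0), (5.7, 5.2), (5.6, 5.1)]
correction = 0.0
for raw, actual in history:
    correction = update_correction(correction, raw, actual)

# Correct the next derived value before the actual figure is published.
next_estimate = corrected_estimate(5.4, correction)
print(correction, next_estimate)
```

Each new actual figure nudges the correction toward the most recent delta, so a persistent bias in the social media index is gradually absorbed while older errors carry less weight.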

8. Conclusion

The same step-by-step process can be used to derive other economic indicators, such as the level of new business startups, consumer confidence and the consumer satisfaction index, from the social media dataset. It is important to understand that this approach is best suited to cases where human behavior, reactions or similar patterns contribute directly to the economic indexes.

References

1. Bureau of Labor Statistics. (2015, October 15). Labor Force Statistics from the Current Population Survey. Retrieved from Bureau of Labor Statistics, available at: http://www.bls.gov/cps/cps_htgm.htm

2. Chaffey, D. (2016, April 21). Global social media research summary 2016. Retrieved from Smartinsights, available at: http://www.smartinsights.com/social-media-marketing/social-media-strategy/new-global-social-media-research/

3. Dolan, A. (2014). Using Social Media to Measure Labor Market Flows.

4. Moffatt, M. (2016, May 12). Beginner's Guide to Economic Indicators. Retrieved from Economics.about.com, available at: http://economics.about.com/cs/businesscycles/a/economic_ind.htm

5. Wikipedia. (2016, May 12). Sentiment analysis. Retrieved from Wikipedia, available at:
