Business Challenges and Research Directions of Management Analytics in the Big Data Era

Academic year: 2021

Abstract

Big data analytics has been embraced as a disruptive technology that will reshape business intelligence, particularly marketing intelligence, which has traditionally relied on market surveys to understand consumer behavior and guide product design. In this paper, we investigate how big data analytics will affect the landscape of business intelligence, leading to big data intelligence. Rooted in the recent literature, we delineate the business opportunities and managerial challenges brought forward by the emergence of big data analytics and outline a number of research directions in big data intelligence for business.

Keywords: Big data, business intelligence, data analytics, data science, marketing intelligence

1. Introduction

Recent technological revolutions such as social media enable us to generate data much faster than ever before (McAfee et al. 2012). The concept of big data has attracted enormous attention in recent years because of its great potential to generate business impact (Chen et al. 2012). The importance of big data has been recognized by companies, governments, and academia. Leading companies in big data analytics, such as Google and now Facebook, are exploring ways to create new businesses and drive more sales from big data (e.g. log data of online searches, posts and messages) (Lohr 2012). In March 2012, the Obama Administration announced a “Big Data Research and Development Initiative,” which aims to improve the ability to extract knowledge and insights from big data. At the same time, six federal agencies, including the National Science Foundation, the National Institutes of Health, the U.S. Geological Survey, the Departments of Defense and Energy, and the Defense Advanced Research Projects Agency, announced a joint R&D initiative that will invest more than $200 million to develop new big data tools and techniques (Wu et al. 2014).

“Big Data” is defined as “the amount of data just beyond technology’s capability to store, manage and process efficiently” (Kaisler et al. 2013). Data can be big in different dimensions, such as volume, velocity and variety (Zikopoulos et al. 2011). Big data research is mainly concerned with three types of challenges: storage, management and processing (Kaisler et al. 2013). In this paper, we focus on big data management, which emphasizes creating value from big data analysis. We propose a two-dimensional framework for big data management decision making (Figure 1). The first dimension concerns the method of intelligence (surveys or logs) and the second concerns the origin of data (internal or external).

2. Survey versus Log Data

Surveys and logs are the two most commonly used methods of acquiring data for business intelligence (Kaptein et al. 2013). A survey is defined as “collecting information in an organized and methodical manner about characteristics of interest from some or all units of a population using well-defined concepts, methods and procedures, and compiles such information into a useful summary form” (Statistics Canada 2010). Firms use surveys to collect data for various purposes, such as understanding the preferences and behavior of their customers. For example, Apple has been sending surveys to customers who have recently purchased an iPhone to gain feedback about their purchase and their experience with the product (Etherington 2014). Log data is generated by information systems that capture transactional records and user behavior (Jacobs 2009). For example, Walmart has started to analyze social media data to learn customers’ opinions of the company or of a particular product (Brown 2012). The two methods mainly differ in the following aspects:

Method of intelligence: Logs. Origin of data: Internal.
• Occurs as a byproduct of e-commerce
• Company controls how data is captured
• Requires more investment in data infrastructure
• Size of data is limited by e-commerce operations

Method of intelligence: Logs. Origin of data: External.
• Requires data acquisition expenses
• Reliance on external services for market intelligence
• Source of data becomes increasingly abundant
• Helps keep abreast of market intelligence

Method of intelligence: Surveys. Origin of data: Internal.
• Requires an internal marketing group
• Data capture is expensive and infrequent
• Provides intelligence targeted to business needs
• May miss market trends due to internal constraints

Method of intelligence: Surveys. Origin of data: External.
• Requires survey service expenses
• Reliance on external services for surveys of customers
• Flexibility in market intelligence spending
• Data capture is expensive and infrequent

Figure 1. Business opportunities and challenges due to big data analytics

1. Size: Since surveys are expensive to conduct, survey sample sizes are relatively small, typically ranging from fewer than one hundred to a few thousand respondents. Log data, on the other hand, may contain millions or billions of records.

2. Quality: Log data, especially data collected from external platforms, may contain noise caused by incomplete, false, or irrelevant data. Data cleaning is usually a necessary step in log data analysis. Survey data is collected with well-designed questionnaires, and its quality can be controlled by monitoring the data collection process. Systematic methods are available for quality control of survey data, but there is no standard approach to quality control of log data.

3. Frequency: Log data can be collected in real time, while a survey can only be conducted once in a while (e.g. every few weeks or months).

4. Objectives: A survey is designed around clear analysis goals. Log data is not necessarily associated with any analysis goal, as it occurs as a byproduct of e-commerce platforms.

5. Contents: Survey data tends to capture subjects’ intentions, opinions or comments. Log data records the actual past behavior of users.

6. Processing techniques: In order to analyze survey data, statistical techniques may be used to test hypotheses, for example, using regression, analysis of variance or chi-square tests. Data mining techniques, such as sentiment analysis and topic analysis, are usually used when processing log data.
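As a minimal sketch of the hypothesis testing mentioned for survey data, the following computes Pearson's chi-square statistic for a small contingency table in plain Python. The segments, the survey question and all counts are invented for illustration and do not come from the paper.

```python
# Hypothetical survey counts (made-up numbers): rows are customer segments,
# columns are answers to "Would you recommend the product?" (yes, no).
observed = [
    [30, 10],  # segment A
    [20, 20],  # segment B
]


def chi_square_statistic(table):
    """Return Pearson's chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if segment and answer were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


statistic, dof = chi_square_statistic(observed)
print(f"chi-square = {statistic:.3f}, degrees of freedom = {dof}")
```

With one degree of freedom, the conventional 5% critical value is about 3.841, so a statistic above that threshold would lead us to reject independence between segment and answer for this toy table.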

Based on the discussion above, the two methods complement each other in various business contexts. Surveys are useful when we want to collect data on phenomena that cannot be directly observed. Survey data can also be used when log data or the necessary analysis techniques and expertise are not available. Log data is preferred when real-time conclusions about users’ actual behavior are required. The two methods can be combined when we want to study the relationship between user intention and user behavior. In our opinion, both methods have advantages and disadvantages, and big data management should take both into consideration. The following scenarios illustrate how the appropriate method can be chosen.

Scenario 1: Company X designs a new product. Before the product is released, the company wants to know how well customers like it, including which features they like and which they dislike. In this case, a survey can be used to collect customer opinions, since log data is not available for a product that has not yet been released.

Scenario 2: Company Y plans to design the next generation of a product. The company wants to know how to improve the existing product and what factors affect its sales. In this case, log data such as sales records and customer reviews can be used to study the relationship between purchase behavior and product features.

3. Internal versus External Data

Nowadays companies rely not only on data from internal sources such as transaction records but also on data from external sources that help generate new insights and provide competitive advantages. Internal data comes from a firm’s business operations and is thus often limited in variety. This limitation has largely constrained firms’ ability to make useful and accurate business predictions. On the other hand, the big data era has allowed firms to collect population-level external data, mainly from Internet-based sources such as social networking web sites. For instance, financial institutions have mainly relied on customers’ financial history for credit rating. External data sources such as social networking web sites now allow them to collect customers’ non-financial information, such as university attended, club memberships, and social circles, to build more complete and accurate credit rating models.

As Figure 1 shows, internal survey data is often collected for a specific business purpose, such as investigating customers’ preferences regarding a mobile phone’s design. The data items are highly relevant to the focal purpose but often lack the ability to provide innovative insights and capture real market trends. Internal log data has more breadth than survey data but often requires more investment in data infrastructure. The size and usefulness of such data largely depend on the relevant business operations. On the other hand, external survey data may not be directly related to the firm’s business operations but can provide more novel and flexible perspectives than internal surveys. It relies on external survey services and often costs more than internal surveys. External log data often provides much more variety and abundance than the other three types of data due to its open nature. However, information overload becomes a main challenge in utilizing such data for big data analytics.

One of the most common business scenarios that depends heavily on big data analytics is recommending products to customers. Individuals nowadays are overwhelmed by the choices available online: millions of books on Amazon.com, hundreds of millions of video clips on YouTube and items on eBay. The data analytics adopted by traditional recommendation systems (RS) largely depends on internal data such as purchase records. For instance, the main type of data analytic method used in RS, collaborative filtering (CF), uses internal data on a user’s past purchase or rating behavior and on similar purchase or rating choices made by other users (often termed neighbors).

However, such internal data does not allow CF methods to tell whether a neighbor is a close friend or a stranger (Liu and Lee 2010). Sinha and Swearingen (2008) found that people preferred recommendations made by their friends to those made by CF-based recommender systems. In this case, external data such as a customer’s social network information is needed to complement CF methods. Liu and Lee (2010) used external survey data to collect the focal user’s social network information on a Chinese photo sharing and blogging web site. This network is then used to replace the nearest neighbors in CF methods when providing recommendations. Hevner et al. (2004), Massa et al. (2007), and Jamali et al. (2009) all used external trust information expressed among users to develop trust-aware CF recommender systems. Such systems are found to provide more accurate predictions. Moreover, such external data sources can also provide additional information about customers’ personal tastes in certain types of products such as movies, music and books.
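The idea of swapping nearest neighbors for friends can be illustrated with a minimal sketch. It assumes a plain user-based CF with cosine similarity and a similarity-weighted average; it is not the exact method of any of the cited studies, and the users, ratings and friendship list are hypothetical.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); stands in for internal data.
ratings = {
    "ann": {"book1": 5, "book2": 3},
    "bob": {"book1": 5, "book2": 3, "book3": 4},
    "cat": {"book1": 1, "book2": 5, "book3": 2},
}
# Hypothetical friendship lists; stands in for external social network data.
friends = {"ann": ["cat"]}


def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def nearest_neighbors(user, k=1):
    """The k users whose ratings are most similar to the given user's."""
    others = [u for u in ratings if u != user]
    others.sort(key=lambda o: cosine(ratings[user], ratings[o]), reverse=True)
    return others[:k]


def predict(user, item, neighbors):
    """Similarity-weighted average of the neighbors' ratings for the item."""
    num = den = 0.0
    for other in neighbors:
        if item in ratings[other]:
            weight = cosine(ratings[user], ratings[other])
            num += weight * ratings[other][item]
            den += weight
    return num / den if den else None


# Traditional CF: neighbors are chosen by rating similarity (internal data).
cf_prediction = predict("ann", "book3", nearest_neighbors("ann"))
# Social variant: neighbors are the user's friends (external data).
social_prediction = predict("ann", "book3", friends["ann"])
print(cf_prediction, social_prediction)
```

In this toy data the most similar user and the friend disagree about the unseen item, so the two neighbor definitions yield different predictions. Which one is more accurate is an empirical question that the sketch does not answer.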

The major challenges in integrating internal and external data for big data analytics are 1) connecting common data items across different data sources, and 2) selecting relevant data items for analysis. Returning to the credit rating example above, the common data item linking a financial institution’s internal customer records to social networking profiles is the customer name. However, because many people share the same name in population-level data sets, automatic identity matching algorithms need to be developed that draw on additional information such as date of birth (DOB) to connect an external social network profile to the right internal customer. Moreover, selecting the relevant internal and external data features for predicting credit ratings will require both human expertise and automatic data mining techniques such as deep learning methods.
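A minimal sketch of the identity matching step described above, assuming exact agreement on a normalized name plus date of birth. The record layout, field names and sample data are hypothetical, and a production matcher would likely need fuzzy comparison and further attributes.

```python
# Hypothetical internal customer records (two customers share a name).
customers = [
    {"id": 1, "name": "Wei Zhang", "dob": "1985-03-12"},
    {"id": 2, "name": "Wei Zhang", "dob": "1990-07-30"},
]
# Hypothetical external social network profiles.
profiles = [
    {"handle": "wz85", "name": "wei  zhang", "dob": "1985-03-12"},
    {"handle": "wz_x", "name": "Wei Zhang", "dob": None},
]


def normalize(name):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def match_profile(profile, records):
    """Return the id of the single record agreeing on name and DOB, else None."""
    if profile["dob"] is None:
        return None  # the name alone is ambiguous, so refuse to link
    hits = [
        record["id"]
        for record in records
        if normalize(record["name"]) == normalize(profile["name"])
        and record["dob"] == profile["dob"]
    ]
    return hits[0] if len(hits) == 1 else None


links = {p["handle"]: match_profile(p, customers) for p in profiles}
print(links)
```

Returning no link when the evidence is ambiguous is a deliberate choice in this sketch: attaching a social profile to the wrong customer would feed incorrect features into a credit rating model.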

4. Research Directions

This study suggests that company strategies and operations in marketing intelligence will become more complex as big data analytics is introduced. As a result, companies must deal with new strategic and operational challenges in order to compete in the era of big data. This in turn creates new opportunities for researchers in the areas of information systems, management sciences, marketing sciences, business intelligence, and data sciences, as outlined below:

Balance of investments in data infrastructures to support online surveys and data logs. In the future, big data intelligence will become a competitive source of market intelligence for consumer behavior and product planning, and companies will therefore need to invest in big data infrastructure, including data scientists and big data platforms.

Alignment of business operations on various data acquisition techniques. Traditionally, computing operations and marketing operations are managed separately, with computing operations supporting marketing operations. With the rise of data science operations, companies will need to restructure both, since computational intelligence will become more important than before.

Integration of survey analytics and big data analytics to increase the validity of market studies. This will require new techniques in business intelligence and marketing sciences. Researchers will have opportunities to develop new methods and applications that take advantage of social media data. For instance, how to conduct surveys in social media and confirm the survey results with social media log data will become an important topic in e-commerce research and applications.

Creation of new educational modules stemming from big data intelligence. The introduction of big data intelligence in business will transform the way marketing and sales are done. This will create a need for new courses and even new majors in business education. For instance, students in marketing majors and information systems majors will need to learn new data science techniques in order to become competent employees in big data intelligence.

References:

1. Brown, E. 2012. "Mining the social data stream for deeper customer insight," http://www.zdnet.com/.

2. Chen, H., Chiang, R. H. L., and Storey, V. C. 2012. "Business intelligence and analytics: From big data to big impact," MIS Quarterly (36:4), pp 1165-1188.

3. Statistics Canada. 2010. Survey Methods and Practices, Statistics Canada.

4. Etherington, D. 2014. "Apple Sends Out iPhone Survey, Seeks Feedback On Android, Touch ID And More," http://techcrunch.com/.

5. Hevner, A. R., March, S. T., Park, J., and Ram, S. 2004. "Design science in information systems research," MIS Quarterly (28:1), pp 75-105.

6. Jacobs, A. 2009. "The pathologies of big data," Communications of the ACM (52:8), pp 36-44.

7. Jamali, M., and Ester, M. 2009. "TrustWalker: A random walk model for combining trust-based and item-based recommendation," Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp 397-406.

8. Kaisler, S., Armour, F., Espinosa, J. A., and Money, W. 2013. "Big data: Issues and challenges moving forward," 2013 46th Hawaii International Conference on System Sciences (HICSS), IEEE, pp 995-1004.

9. Kaptein, M., Parvinen, P., and Poyry, E. 2013. "Theory vs. Data-Driven Learning in Future E-Commerce," 2013 46th Hawaii International Conference on System Sciences (HICSS), IEEE, pp 2763-2772.

10. Liu, F., and Lee, H. J. 2010. "Use of social network information to enhance collaborative filtering performance," Expert Systems with Applications (37:7), pp 4772-4778.

11. Lohr, S. 2012. "The age of big data," New York Times (11).

12. Massa, P., and Avesani, P. 2007. "Trust-aware recommender systems," Proceedings of the 2007 ACM Conference on Recommender Systems, ACM, pp 17-24.

13. McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D., and Barton, D. 2012. "Big data: The management revolution," Harvard Business Review (90:10), pp 61-67.

14. Smith, K. P., and Christakis, N. A. 2008. "Social Networks and Health," Annual Review of Sociology (34:1), pp 405-429.

15. Wu, X., Zhu, X., Wu, G.-Q., and Ding, W. 2014. "Data mining with big data," IEEE Transactions on Knowledge and Data Engineering (26:1), pp 97-107.

16. Zikopoulos, P., and Eaton, C. 2011. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill Osborne Media.
