Term Paper on Emerging Technology Trends – Big Data Page 1
Term Paper
Emerging Technology Trends
Big Data Analytics
By Purnima Dubey Srishti - ADP First year
Table of Contents
1. Introduction
2. What is Big data?
2.1 Definitions
2.2 The three Vs
2.3 Latency and its relevance for Big data
3. Why is Big data so important now?
3.1 Growth in data volume
3.2 Higher velocity of data
3.3 Greater variety of data sources
3.4 Real time data analysis
4. Big Data Applications
4.1 Example - Big data application for providing proactive patient care
5. Business models based on Big data analytics
5.1 Business Analytics as a Service (BAaaS)
5.2 Big data analytics as revenue source for Communication Service Providers (CSP)
5.3 Big data analytics as a revenue source for internet based social media and search engine providers
6. The sociological impact of Big Data
6.1 Leveraging the predictive power of big data
6.2 Privacy concerns
6.3 A new underclass
7. The Implications for User Experience Design (UXD) and Interaction Experience Design (IXD)
7.1 Crafting a better Interaction Experience
7.2 Designing for a better User Experience
8. Conclusion
1. Introduction
2. What is Big data?
Big data, as the name indicates, refers to large datasets which require specific technology for their storage and management. However, the name is also a misnomer, since it implies that whatever is not ‘Big data’ is ‘Small data’. This distinction does not hold true and, as the definitions below indicate, a better name for the concept of ‘Big Data’ would be ‘Bigger Data’.
2.1 Definitions
Roger Magoulas from O’Reilly media refers to the size aspect of Big data in his definition: “[Big data] refers to a wide range of large data sets almost impossible to manage and process using traditional data management tools – due to their size, but also their complexity.”
(Research Trends, 2012)
Big data is defined by Gartner as:
“‘Big data’ is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”
(Sicular, 2013)
The 3Vs (volume, velocity and variety), or V3, form a key part of the definition of Big data. The definition also makes apparent the relative nature of the Big data concept: it is larger, faster moving and has relatively more sources. While there is no threshold that determines how much larger or faster, the rule of thumb is that Big data is data which cannot be handled by ‘conventional’ data management technology, in terms of both storage and manipulation. This need for targeted technology highlights the fact that the potential of Big data is defined and limited by the technology used to harness it.
2.2 The three Vs
The 3Vs mentioned in the definition above need further explanation to bring out their significance in the definition of Big data.
1. Volume
Big data refers to large volumes of data. Terabytes are usually a good place to start. There are also reports of organisations storing petabytes or zettabytes of data.
2. Velocity
This is the rate at which data is increasing. Fast-moving data streams are generally related to events which occur at a high frequency, often at a per-second level.
3. Variety
This dimension of Big data is the most interesting. Variety refers to the range of sources. It also refers to qualitative variety, or what is termed ‘structured and unstructured’ data. Unstructured data is data which has not been defined in a manner which machines can use. Text, images and sound are some examples of this type of data. Big data derives its bigness from the ‘big’ complexity of unstructured data.
2.3 Latency and its relevance for Big data
Latency is the time from when the data is recorded to the time it is available to the user. Big data is of value when it is analyzed and the results are available to the user within the window of its relevance. For instance, devices in a hospital may provide data at the rate of about 1000 readings per second. This data may be relevant for only a few minutes, after which it is of no value. Any Big data analytics in this case would have to be done on the real-time data stream, and hence low latency is expected.
However, there may be situations when real-time data is analyzed along with monthly averages. In this case, we need historical data and real-time data for meaningful analytics. Here the latency expectation is mixed.
Thus, latency is linked with Big data value. The delivery of analytics within the window of relevance makes Big data valuable.
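The window-of-relevance idea can be sketched in a few lines of Python. This is a minimal illustration and not a description of any real system: the names `Reading`, `latency` and `relevant_readings`, the sample readings and the 120-second window are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    recorded_at: float  # time the device recorded the value, in seconds
    value: float


def latency(reading, available_at):
    """Latency: time from recording until the data is available to the user."""
    return available_at - reading.recorded_at


def relevant_readings(readings, available_at, window_seconds):
    """Keep only readings whose latency falls within the window of relevance."""
    return [r for r in readings if 0 <= latency(r, available_at) <= window_seconds]


# Hypothetical readings recorded at t = 0 s, 100 s and 290 s, analysed at t = 300 s.
readings = [Reading(0.0, 120.0), Reading(100.0, 122.0), Reading(290.0, 121.0)]
fresh = relevant_readings(readings, available_at=300.0, window_seconds=120.0)
print([r.recorded_at for r in fresh])  # [290.0]
```

Only the reading that reaches the user within the assumed two-minute window is kept; the older readings have lost their value by the time they are analysed.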
3. Why is Big data so important now?
Research on Big Data emerged in the 1970s but has seen an explosion of publications since 2008 (Research Trends, 2012). So why has Big data gained prominence in the past five years? The answer can be traced back to the 3Vs.
3.1 Growth in data volume
The volume of data generated has increased. There are several examples of very large datasets today, such as the Square Kilometre Array (SKA) project, a collaborative effort between countries to build a large telescope for the next-generation radio observatory. It is planned to be constructed in South Africa and Australia. When the SKA is completed, which is planned for 2024, it is expected to produce in excess of one exabyte of raw data per day (1 exabyte = 10^18 bytes), which is more than the entire daily internet traffic at present (Research Trends, 2012). Similarly, the Human Genome Project is determining the sequences of the three billion chemical base pairs that make up human DNA (Research Trends, 2012).
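To put the SKA figure in perspective, the short calculation below converts one exabyte per day into a sustained rate per second. It is a back-of-the-envelope sketch using decimal units (1 exabyte = 10^18 bytes, 1 terabyte = 10^12 bytes); the variable names are illustrative only.

```python
EXABYTE = 10**18   # bytes (decimal definition)
TERABYTE = 10**12  # bytes
SECONDS_PER_DAY = 24 * 60 * 60

# One exabyte per day, expressed as a continuous rate.
bytes_per_second = EXABYTE / SECONDS_PER_DAY
terabytes_per_second = bytes_per_second / TERABYTE

print(f"{terabytes_per_second:.1f} TB per second")  # 11.6 TB per second
```

In other words, one exabyte per day corresponds to roughly 11.6 terabytes arriving every second, around the clock.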
3.2 Higher velocity of data
Not only is the data volume increasing, the rate of increase has also risen sharply. A report by the International Data Corporation in 2010 estimated that by the year 2020 there will be 35 zettabytes (ZB) of digital data created per annum.
3.3 Greater variety of data sources
With technology becoming cheap, ‘intelligent’ devices are fairly common and their use is ubiquitous. From the mobile phone to the GPS in the car, a large variety of unstructured data is generated every second.
3.4 Real time data analysis
We have seen that a large amount of unstructured data is generated from a variety of sources. For this data to be meaningful, it has to be interpreted as it is being generated. Hence it is crucial that data analysis happens in real time. This is the real value of Big data.
4. Big Data Applications
Big data finds application in almost all walks of life. It is used to enrich user experience (e.g. adapting to user behaviour online), to predict outcomes and reduce risk (e.g. predicting the occurrence of natural disasters), to help in the diagnosis, prevention and cure of medical conditions (e.g. human gene sequencing), to make decisions (e.g. business analytics) and to fight crime (e.g. credit card fraud detection).
4.1 Example - Big data application for providing proactive patient care
1. Project details
This project was carried out at the Hospital for Sick Children in Toronto (IBM, 2013).
2. The problem
Nosocomial infection, which is contracted in hospital and is life-threatening to fragile patients such as premature infants, is very hard to detect.
“Starting 12 to 24 hours before any overt sign of trouble, almost undetectable changes begin to appear in the vital signs of infants who have contracted this infection. The indication is a pulse that is within acceptable limits, but not varying as it should—heart rates normally rise and fall throughout the day. In a baby where infection has set in, this doesn’t happen as much and the heart rate becomes too regular over time. So, while the information needed to detect the infection is present, the indication is very subtle; rather than being a single warning sign, it is a trend over time that can be difficult to spot, especially in the fast-paced environment of an intensive care unit.
The monitors continuously generate information that can give early warning signs of an infection, but the data is too large for the human mind to process in a timely manner. Consequently, the information that could prevent an infection from escalating to life-threatening status is often lost.”
3. Solution
Using retrospective data, patterns were detected and algorithms were developed to spot the pattern that signals the onset of infection: a reduction in the normal variation of the heart rate.
Monitoring-device data and integrated clinician knowledge are brought together in real time for an automated analysis.
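The kind of trend detection described above can be illustrated with a rolling measure of variability. The sketch below is not the hospital's or IBM's algorithm; it is a minimal example that assumes a plain list of heart-rate readings, a hypothetical window size and a hypothetical threshold, and flags the windows in which the standard deviation falls below that threshold.

```python
from statistics import pstdev


def low_variability_windows(heart_rates, window=5, min_stdev=1.0):
    """Return start indices of windows whose heart-rate spread is below min_stdev.

    window and min_stdev are illustrative values, not clinical thresholds.
    """
    flagged = []
    for start in range(len(heart_rates) - window + 1):
        chunk = heart_rates[start:start + window]
        if pstdev(chunk) < min_stdev:
            flagged.append(start)
    return flagged


# Made-up readings: the rate varies at first, then becomes too regular.
readings = [140, 150, 135, 155, 145, 148, 148, 149, 148, 148]
print(low_variability_windows(readings))  # [5]
```

Every individual reading here is within a plausible range, so a simple limit check would raise no alarm; only the loss of variation over a window of readings is flagged.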
5. Business models based on Big data analytics
A business model is the strategy the business employs to identify its sources of revenue, the customer base and funding. This paper describes the emerging business models for a business based on Big Data Analytics.
5.1 Business Analytics as a Service (BAaaS)
This type of business offers analytics of Big data as a service. BAaaS providers generally provide data hosting services and extract value from the hosted data using Big data analysis software. The customer comes up with questions. The BAaaS provider supplies the analytics and data visualizations which help answer the questions. A fee may be charged for a ‘one time’ analytics job, or a periodic fee may be charged for regular reports and analysis services.
5.2 Big data analytics as revenue source for Communication Service Providers (CSP)
CSPs earn revenue by matching the promotion needs of businesses with the social behaviour and location data of their users. By tracking the location details of customers who have given their approval, a CSP builds a pattern of behaviour using Big data analytics. Online activity based on public information on social media, as well as searches, can be tracked to understand the preferences of the CSP’s customers. This, when matched with location data, provides additional insights into the person’s area of influence. For example, a person X works in a certain area, visits a local cafe on a regular basis and visits bookstores on weekends. This person is also active on social media and mentions books in regular posts. This person, or people with this type of profile, would be good candidates for targeted promotion of a new bookstore with a cafe. Involving these customers in promotion campaigns helps in spreading awareness about a product or a service and creates the possibility of customer loyalty.
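The matching step in the person X example can be sketched as a simple rule. The code below is an illustrative assumption rather than how any CSP actually works: the function `matches_campaign`, the dictionary fields and the sample values are all hypothetical.

```python
def matches_campaign(profile, campaign):
    """A user matches when they have opted in, visit the campaign area,
    and share at least one interest with the campaign."""
    return (
        profile["opted_in"]
        and campaign["area"] in profile["visited_areas"]
        and bool(set(profile["interests"]) & set(campaign["interests"]))
    )


# Hypothetical data for person X: works downtown, likes books and cafes.
person_x = {
    "opted_in": True,
    "visited_areas": {"downtown"},
    "interests": {"books", "cafes"},
}
campaign = {"area": "downtown", "interests": {"books"}}

print(matches_campaign(person_x, campaign))  # True
```

The opt-in check comes first so that a customer who has not given approval is never matched, whatever their location or interests.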
5.3 Big data analytics as a revenue source for internet based social media and search engine providers
This revenue source is similar to the CSP model mentioned above. The social media provider or the search engine provider tracks the web activity along with location details. The behaviour pattern is then used to display advertisements most likely to interest the user. The advertiser pays for this service.
6. The sociological impact of Big Data
As big data usage becomes more prevalent, many decisions, from choosing the location of a business to predicting the spread of diseases, are based on big data analytics. This has several pros and cons, which are analysed here.
6.1 Leveraging the predictive power of big data
Big data analytics offers the ability to predict outcomes based on data. As a result, we are becoming more scientific and data-driven in our approach to decisions in all aspects of life. We are able to base our decisions on analysis that was earlier thought to be beyond human capability. For example, the dating site Match.com uses online conversations, comments and preferences to match men and women for dates. Police departments in the US use data on historical arrests and upcoming events such as sports events, holidays and weather forecasts to predict the occurrence and neighbourhood of crime (New York Times, 2012).
6.2 Privacy concerns
In June 2013, government documents leaked by Edward Snowden, a US citizen, brought to light a surveillance program run by the US National Security Agency (NSA). This program uses data from popular sites like Google and Facebook, and from mobile applications like the Angry Birds games, to gather information on people, their location, and what they are discussing or sharing online. It includes collecting voice call data and mapping it to online records to develop ‘patterns’ of usage. These patterns are meant to help locate the ‘exceptions’ in behaviour, which will help in identifying terrorists or in detecting changes in the behaviour of suspected terrorists. However, this approach to surveillance has drawn severe criticism from across the world, since it has a large potential for misuse (BBC News, 2014). The data is acquired without authorisation and without the knowledge or specific approval of the people concerned or of any regulatory body. It is highly probable that this ability to monitor people and know about their private lives may be misused, for example for targeting ethnic groups or political dissidents. It has since been established that data related to heads of state, such as the German chancellor Angela Merkel (BBC News - Data Protection, 2014), was knowingly collected by the NSA for years. Since she is unlikely to be a terror suspect, it is a matter of speculation why her voice calls were monitored.
Thus misuse of big data technology can result in violation of privacy and unethical exploitation. It can result in dangerous consequences if not regulated.
6.3 A new underclass
Increasingly, business decisions are made by using big data analysis to find patterns of behaviour among people residing in a geographical location. Businesses may want to target certain neighbourhoods while avoiding other areas. This behaviour may be repeated by all service providers. It raises an important question about what happens to people who do not generate online data or use credit cards frequently. Do they get marginalized? This is especially a concern since the underprivileged sections of society are frequently the ones who do not have an online presence. There is a suggestion that big data is introducing a new social underclass (A New Underclass: The People Who Big Data Leaves Behind, 2013).
7. The Implications for User Experience Design (UXD) and Interaction Experience Design (IXD)
7.1 Crafting a better Interaction Experience
Product designers are mining product usage information from thousands of customers across various events to locate areas for improvement. For example, Microsoft (MS) Office 2007 introduced the ‘Ribbon User Interface (UI)’, which was based on usage data from over 1.3 billion sessions (Danyel Fisher, 2012). Earlier, product feature and UI design was based on ‘intelligent guesswork’. However, in Office 2003, Microsoft introduced the Microsoft Office Customer Experience Improvement Program (Harris, 2006). All customers who installed MS Office 2003 were given the option to join this program. Their anonymous usage data was then uploaded to Microsoft. Some of the questions the usage data answered were: Which commands are used together the most? How many documents does the user keep open at the same time? Which commands are used more via the keyboard than the mouse? Big data analytics was used to sift through the billions of data points (an estimated 32 million command bar clicks per 90 days from MS Word users (Harris, 2006)) to help predict changes which would improve the user experience.
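One of the questions above, which commands are used together the most, can be illustrated with a simple co-occurrence count. This is a toy sketch and not Microsoft's method: the function `command_pair_counts`, the session format and the sample sessions are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def command_pair_counts(sessions):
    """Count how often each pair of commands appears in the same session."""
    counts = Counter()
    for commands in sessions:
        # Sorting the distinct commands gives each pair one canonical order,
        # and using a set counts each pair at most once per session.
        for pair in combinations(sorted(set(commands)), 2):
            counts[pair] += 1
    return counts


# Made-up sessions; real usage logs would be far larger.
sessions = [
    ["copy", "paste", "save"],
    ["copy", "paste"],
    ["bold", "save"],
]
print(command_pair_counts(sessions).most_common(1))  # [(('copy', 'paste'), 2)]
```

On real usage data the same count would run over billions of events, which is where distributed Big data tooling becomes necessary; the logic of the question stays the same.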
7.2 Designing for a better User Experience
The ‘Internet of Things’ refers to the use of objects with embedded sensors. These objects can communicate real-time information about the user and the environment via wired or wireless networks using the Internet Protocol (IP), the same protocol that connects the Internet. This network of physical objects, which may be anything from roads to pacemakers, forms the ‘Internet of Things’. The information communicated by these objects is continuous and hence high in volume. Using big data analytics, this data can be matched with data from other sources to create a meaningful picture. For example, in Japan, billboards display advertisements based on the passer-by. An ‘Internet of Things’, customised and responsive to an individual, makes the user’s experience more meaningful. This enhancement in user experience is a result of Big data analytics.
8. Conclusion
Big data analytics has given rise to possibilities beyond human capabilities in terms of storing, assimilating and synthesising large volumes of constantly changing data from multiple sources. This capability, along with the falling cost of data storage and analysis, has made information processing a non-issue, a given. What remains an area of investigation is the potential for improving the uses of data. Consequently, we are looking at a new dimension being added to our lives as we know them. Technology is set to become for human civilization what electricity is today, with big data as the enabler. The question, however, is this: having data may make us better informed, but will it also make us wise? Big data analytics can improve the human capability to understand nature and predict its changes, and to predict and avoid crime, traffic accidents and critical health concerns. It can lead to a more enriched life, with technology-driven conveniences leaving more room for personal growth. But we may also end up as a society ruled by technology. Will data overrule human intuition and emotion? Will we lose our humanness in a blind consumption of technology? Whichever way mankind proceeds, big data looks set to be the enabler.
9. Bibliography
A New Underclass: The People Who Big Data Leaves Behind. (2013, 09 12). Retrieved from http://www.fastcoexist.com/: http://www.fastcoexist.com/3017102/a-new-underclass-the-people-who-big-data-leaves-behind
BBC News - Data Protection. (2014, 02 15). Data protection: Angela Merkel proposes Europe network. Retrieved from www.bbc.co.uk: http://www.bbc.com/news/world-europe-26210053
BBC News. (2014, 01 17). US spy leaks: How intelligence is gathered. Retrieved 03 27, 2014, from www.bbc.co.uk: http://www.bbc.com/news/world-us-canada-24717495
Danyel Fisher, R. D. (2012, May). Interactions with Big Data Analytics. Retrieved from research.microsoft.com: http://research.microsoft.com/pubs/163593/inteactions_big_data.pdf
Harris, J. (2006, 04 05). Inside Deep Thought (Why the UI, Part 6). Retrieved from http://blogs.msdn.com/b/jensenh/archive/2006/04/05/568947.aspx
IBM. (2013, 03 7). Retrieved from http://www-01.ibm.com/software/success/cssdb.nsf/CS/SSAO-8BQ2D3?OpenDocument&Site=corp&cty=en_us
New York Times. (2012, 02 11). The Age of Big Data. Retrieved from http://www.nytimes.com: http://www.nytimes.com/2012/02/12/sunday-review/big-datas-impact-in-the-world.html?pagewanted=all&_r=0
Research Trends. (2012, September). Section 4: ICSU and the challenges of Big data in sciences. Retrieved from http://www.researchtrends.com: http://www.researchtrends.com/wp-content/uploads/2012/09/Research_Trends_Issue30.pdf
Sicular, S. (2013, March 27). Retrieved from http://www.forbes.com/sites/gartnergroup/2013/03/27/gartners-big-data-definition-consists-of-three-parts-not-to-be-confused-with-three-vs/