
10 St Bride Street, London EC4A 4AD

T 020 7331 2000 | F 020 7331 2040 | www.techuk.org

techUK | Representing the future

Contact:

Sureyya Cansoy

Director, Tech for Business & Consumer

T 020 7331 2049

E sureyya.cansoy@techuk.org

techuk.org | @techUK | #techUK

Science and Technology Select Committee inquiry

Social media data and real-time analytics

techUK response

April 2014

About techUK

techUK represents the companies and technologies that are defining today the world that we will live in tomorrow. More than 850 companies are members of techUK. Collectively they employ more than 500,000 people, about half of all tech sector jobs in the UK. These companies range from leading FTSE 100 companies to new innovative start-ups. The majority of our members are small and medium sized businesses.


Introduction

techUK welcomes this inquiry into social media data and real-time analytics.

The inquiry is timely as industry and policy-makers alike recognise the range and scale of the transformative opportunities afforded by the data revolution. As techUK, our mission is to:

Make the UK good for tech

• Ensure that the UK is the best place in the world for technology companies (both domestic and foreign owned) to locate and grow

Make tech good for the UK

• Ensure that the full economic potential of technology is harnessed right across the economy

Make tech good for people

• Ensure that technology is used to improve and enhance the quality of life of all consumers and citizens.

Our response to this inquiry has been informed through written responses and input from techUK members, and through a workshop held in March 2014 with 20 tech companies to explore key themes covered by the inquiry.

techUK would be pleased to engage further with the Science and Technology Select Committee and to expand on the initial answers given in this brief response to the questions raised.

Question 1: How can real-time analysis of social media data benefit the UK? What should the Government be doing to maximise these benefits?

1.1. Real time analysis of social media data has a range of benefits and Government has a role in supporting the best environment in which these benefits can be maximised. These benefits are set to grow exponentially as the sophistication of analytic tools continues to mature. In this response to Question 1, we first set out a number of definitional points, before turning to an overview of benefits for businesses, individuals and also for Government. We then outline a number of broad areas where Government has a role in helping maximise these benefits.

Definitions

1.2. For the purposes of clarity and consistency, techUK urges policy-makers to be clear about what is meant when business leaders and policy-makers use terms such as ‘social media analytics’, ‘social media data’ and ‘big data’. These terms are not strictly interchangeable, and the nuances between them need to be understood in order to appreciate the range and scale of the opportunities and challenges in this area.

1.3. ‘Social media data’ is, broadly speaking, information relating to user-generated content on the internet. Social media is posted through interactive platforms and has driven substantial change to the nature of communication between organisations, communities, and individuals.

‘Social media data’ can be thought of as ‘fully public’ or ‘network-specific’:

1.4. Fully public social media data is information gathered from user-generated content which has effectively been published in the public domain by individuals, businesses, governments and other parties. It comes from ‘non-protected’ social media accounts, such as those on Twitter, and is open for all to view. This can include blog posts, content on media sharing platforms such as flickr, as well as comments on threads, forums, and articles on news websites.

1.5. Network-specific social media data is the information derived from user-generated content which has been published on a platform which is not fully open to all. Take, for example, posts made on a Facebook account with user-set privacy settings, or in a gaming community, or on a ‘protected’ Twitter user’s account. Internal business social networks (such as Microsoft’s Yammer) will also generate network-specific social media data.

‘Social media analytics’ (SMA), in turn, is the ability to interpret and appraise that data.

1.6. We should be clear that social media data is a subset of a wider debate on ‘big data’. ‘Big data’ is described by software provider SAS as possessing five defining characteristics: volume, velocity, variety, variability and complexity.1 It includes both structured data (data held in a defined field, for example, in traditional databases or spreadsheets) and unstructured data (data which cannot be easily held in a defined field, including photos, videos, emails, social media updates, text messages, and much more). Social media is a significant source of unstructured data. What matters most in the debate around big data is not how much data there is but what intelligence we are able to derive from that data, using sophisticated data analytics tools, for the benefit of people, business and government.

1.7. The SMA market is maturing and current capabilities are increasingly sophisticated. There is a wide range of capability and an equally diverse range of commercial options relating to the provision of SMA services. The large number of solutions available can be divided into three levels of functional capability:

Low – these solutions offer simplistic search and analysis capability. They are suited to specific marketing/brand campaign management.

Medium – these solutions are more sophisticated. They offer more advanced capabilities in terms of sentiment, intent and trend analysis. They are often used on a demographic and geographical basis.

High – these solutions can cover the Low and Medium capabilities but provide much greater and highly sophisticated analytical capability and easier integration with existing software/APIs. Typically such tools have been developed by security industry companies rather than commercial and marketing firms.


The benefits

1.8. Studies have indicated that, with over 1.5 billion social media users and 80% of online users interacting with social media regularly, there is a $1.3 trillion global opportunity to be unlocked through the social media revolution.2

techUK believes that social media real-time analytics can offer significant benefits to business, individuals and the government.

Benefits for businesses

1.9. SMA provides businesses with the ability to undertake well-targeted branding and marketing campaigns, carry out brand logo monitoring, better understand public sentiment and intent, strengthen their customer engagement and enrich their existing Customer Relationship Management provisions.

1.10. It is estimated that around one third of customer spending behaviour is influenced by social media.3 It is therefore not surprising that businesses are using SMA to gain better customer insights that allow them to target products and services with unprecedented specificity. With the help of SMA, they are able to better understand consumer behaviour, attitudes and trends; tailor their services much more closely to match those; provide better customer service; and promote their products and services.

1.11. Ultimately, SMA enables businesses to work much smarter and be much closer to the consumer, which leads to a better consumer experience.

Benefits for individuals

1.12. As a result of businesses developing a stronger understanding of their customers, individuals receive products and services better suited to their needs at much more competitive prices.

1.13. From the perspective of citizens and users of public services, SMA can help provide individuals with more user-centric public service delivery.

Benefits for governments and citizens

1.14. The use of SMA can transform the way Government engages and communicates with citizens. The benefits can be grouped into several areas:

• Real time communication tool to engage with citizens, particularly in the event of a national crisis

• Real time public opinion tool to engage with citizens, enabling much quicker understanding of public opinion and reaction to Government strategies, campaigns, public engagement and results analysis (on topics such as anti-drugs, gang crime, tax policies, benefits policies, environmental planning, local planning etc.)

• Ability to inform the creation and implementation of public policy

• Enabling tailoring of public service delivery to match people’s requirements, for example online public services for key transactions such as passport applications

• As an aid to informing policing and other crime related operations, for example, witness/information identification and appeal

• As a tool to add richness to intelligence, defence and cyber security related activities

2 McKinsey Global Institute, The social economy: Unlocking value and productivity through social technologies

3 McKinsey Global Institute, The social economy: Unlocking value and productivity through social technologies

1.15. A number of these points have huge implications for cost-reduction, efficiency, and improved public services, and techUK would be pleased to provide the Committee with further information on policy areas such as health, policing, work and pensions, and more.

Question 2: How does the UK compare to other EU countries in funding for real-time big data research?

2.1 Research in the UK is strong within the field of Big Data and Social Media Analytics. However, exact spend relating to Big Data is hard to establish as there are three levels of investment to be considered: government, academic and industry. In terms of industry spend, companies tend to announce their total Research and Development investment rather than spend on a specific topic (for example big data).

2.2 The funding levels of real-time big data research across the EU are difficult to ascertain. However, anecdotally, our members believe that the level of investment in the UK is not too dissimilar to other EU countries as the UK produces some of the most widely used Big Data Analytics tools.

2.3 It should also be noted that comparing the UK investment levels to those of other EU countries is not enough. This is a global race, where there are a number of countries considered to be leading in this field (such as the US and the Asian market). We should look globally to gauge our competitiveness.

Government initiatives

2.4 The £42m Alan Turing Data Institute, announced in the Chancellor’s 2014 Budget, is a welcome boost for ongoing research in this area, and we encourage BIS as the leading department to ensure that the Institute works closely with industry.

2.5 The UK Government is to be commended for its creation of new and important centres working in this area, principally the Open Data Institute and the Connected Digital Economy Catapult. Government and industry together will continue to look at this topic through the Information Economy Council, of which techUK’s President Victor Chavez is co-chair with BIS Minister David Willetts MP.


Academic research

2.6 Academic research often focuses on what the future may hold for data analytics. This is driven by the art of the possible in terms of future algorithms, social modelling and impact. The outcomes are frequently shared and are often written using open technology standards. However, such research is often not scalable and is frequently difficult to turn into fully supportable products or commercially viable solutions. Where businesses do not have in-house research and development capabilities, they will often outsource specific research to academic institutions. In these circumstances the outcomes are not often shared publicly or with other organisations. However, there is a rich and historic culture within the academic fraternity of sharing its own internally funded ideas, innovation and concepts in the form of seminars, shared software code, published white papers and experiment/research outcome reports. For example, the Administrative Data Research Network (ADRN) has third sector data and social media data built into Phase 3.

Industry based research and development

2.7 Industry has made significant investment in research relating to big data and SMA. This type of research is ultimately driven by the need to find new commercially advantageous insights. Such research is often designed to be ‘fit for purpose’ and is not pursued to the same depth or over the same length of time as pure academic research. It is highly commercially sensitive Intellectual Property and is not often shared. This is an important point. The UK operates a healthy split between industry-based and academia-based research. While the purposes and outcomes of the two approaches are different, the existence of both types of research provides a much stronger research and development foundation for future growth. This does not prevent welcome collaboration between industry, academia and government, as already outlined.

Question 3: What are the barriers to implementing real time data analysis? Is the new Government data-capability strategy sufficient to overcome these barriers?

3.1 There are a number of barriers facing the implementation of SMA.

Public trust

3.2 It is widely observed that the public trusts government less than the private sector when it comes to handling of their data, which in turn negatively impacts their willingness to share data with government. It is therefore necessary that more effort is put into demonstrating the benefits of government SMA both for individuals and society.

Building the UK’s skills capacity for SMA

3.3 There is a skills and educational gap which is creating a lack of data specialists in the UK, as noted in the Data Capability Strategy. Anecdotal evidence from a number of techUK members suggests that those with the required skills prefer to take roles within start-ups rather than in government or established businesses where skills are currently needed. A change is needed in educational requirements for big data analytics, which are very different to current IT requirements, and Government should have a programme of re-skilling to optimise opportunities in this area.


Barriers specific to use of SMA by Government

3.4 Data sharing between different parts of the government continues to be a challenge and impacts the SMA agenda. For example, the procurement processes used by Government may lead to systems that are incompatible and difficult to integrate. Even where data is held in the same private cloud, it is often not feasible to share or provide access to complete real time data analysis.

3.5 The single biggest barrier to implementing real time data analysis is the risk-averse culture of government staff responsible for the systems and data. In part this is driven by fear of media and public opinion following several high profile cases of lost data, security breaches and general ‘mistrust’ of how government uses this type of information. In addition, differing interpretations of existing laws and regulation can also be a significant factor.

Question 4: What are the ethical concerns of using personal data and how is this data anonymised for research?

Anonymity and security

4.1 SMA, in the large majority of cases, is conducted using fully anonymised data that is provided directly by the data owners (e.g. social media channels and APIs) or using publicly available information. It is nonetheless understandable that, in an increasingly digital world, people are more security and privacy conscious. It may be possible to gain some inferred knowledge when using anonymised data, particularly by combining two sets of anonymised data. For example, when Netflix released anonymised data in 2006, researchers at the University of Texas were able to correlate this with contribution data from the Internet Movie Database and identify individuals based on their ratings.
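The risk of combining two data sets can be illustrated with a minimal sketch. It is not the method used in the Netflix case; the records, names, matching rule and thresholds below are all hypothetical and chosen only to show how overlapping attributes in an ‘anonymised’ release and a public source can point to the same person.

```python
# Hypothetical "anonymised" release: pseudonymous IDs mapped to
# (film, rating, day-of-year) records. All values are invented.
anonymised = {
    "user_0412": {("Film A", 5, 10), ("Film B", 2, 14), ("Film C", 4, 30)},
    "user_0977": {("Film A", 3, 11), ("Film D", 5, 12)},
}

# Hypothetical public reviews posted under real names on another site.
public = {
    "Alice Example": {("Film A", 5, 11), ("Film C", 4, 31)},
    "Bob Example": {("Film D", 5, 12)},
}


def matches(a, b, day_tolerance=2):
    """Same film, same rating, and dates within a few days of each other."""
    return a[0] == b[0] and a[1] == b[1] and abs(a[2] - b[2]) <= day_tolerance


def link(anonymised, public, min_overlap=2):
    """Return {real name: pseudonym} where enough public reviews match one pseudonym."""
    links = {}
    for name, reviews in public.items():
        # Count how many of this person's public reviews match each pseudonym.
        scores = {
            pseudo: sum(any(matches(r, rec) for rec in records) for r in reviews)
            for pseudo, records in anonymised.items()
        }
        best = max(scores, key=scores.get)
        # Only claim a link when the overlap reaches the (assumed) threshold.
        if scores[best] >= min_overlap:
            links[name] = best
    return links


print(link(anonymised, public))
```

In this toy example only the person with two matching reviews is linked; a single matching review is treated as too weak. The point of the sketch is that neither data set contains a name alongside the full viewing history, yet the combination can reveal it.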

4.2 However, technology can be a strong tool to enhance people’s privacy and security in the digital world and techUK would be pleased to provide the Committee with further information on this area.

Defining personal data

4.3 The definition of ‘personal data’ continues to evolve within the UK legal framework and there is an opportunity for government to simplify this. An example of this evolution of understanding around what constitutes ‘personal data’ can be seen in a recent Court of Appeal judgement confirming a person’s name constitutes personal data stating: “a name is personal data unless it is so common that without further information, such as its use in a work context, a person would remain unidentifiable despite its disclosure”.4

4 Lexology.com. Court of Appeal confirms a person's name constitutes personal data | Lexology. [Online] Available from: http://www.lexology.com/library/detail.aspx?g=ec46382e-003c-479b-8117-bd166d9c3044 [Accessed 21 Mar 2014].


Question 5: What impact is the upcoming EU Data Protection Legislation likely to have on access to social media data for research?

An evolving agenda

5.1 Technological advances have profoundly changed how individuals’ data is used and the legal frameworks in place to protect our citizens must be robust, flexible and fit-for-purpose. Security and data privacy remain key issues for the technology industry, and the industry as a whole recognises the need to reassure its customers of the security of their data. The debate on Data Protection in Europe has stimulated a great deal of concern amongst the tech industry and it is vital that the proposed legislation simultaneously protects individuals and supports innovative businesses at the forefront of the data revolution. The EU’s proposed legislation on data protection, as it currently stands, has the potential to have a hugely detrimental effect on businesses.

Legislative reform

5.2 techUK supports the EU’s commitment to reforming the current Data Protection framework to reflect the realities and needs of the new digital age. Harmonising data protection laws across all the EU member states will provide a strong backbone of support for the European and UK information economies. However, the proposed EU regulation, as it stands, is complex and far from being fit for purpose. It is also likely to have a major negative impact on SMA. There are a couple of key points that need to be considered:

Consent: The proposed regulation requires that consent must be freely given and obtained for a specific purpose. This will have significant and potentially onerous implications for SMA, technically creating the need to get consent from every individual to use their personal data for SMA.

Definition of personal data: The definition is much broader than in the existing legal framework, with cookies and IP addresses, for example, considered to be personal data. This blurs and extends the definition of personal data, potentially even to meta-data.

5.3 In terms of the specific impact of these points on SMA, the development of increasingly sophisticated analytics for wide sentiment, trend and intent analysis could be completed using anonymised data, though some data such as age and geographic location may still be needed. However, if age and location are used without the name, one could argue it is not personal data within the definition laid down by the current EU Data Protection Legislation. It should be noted that this will be different under the new proposed legislation.

5.4 techUK has recently set up a working group, consisting of both large and small tech companies with global and British perspectives, to ensure that the proposed EU regulation on data protection strengthens rather than weakens the UK information economy. We will be working closely with the UK government to achieve that, and would be happy to speak further with the Select Committee on this area.


6.1 Overall, techUK members are of the view that the current UK legislation is broadly fit for purpose. Given that the interpretation of existing law for SMA is in its early stages, and that this is a rapidly evolving area, the UK Government should be wary of introducing new legislation that curbs the ability of UK businesses to innovate. It is imperative they are not left behind. However, there may be a need to provide guidance to clarify how the legislation should be interpreted in the context of SMA. Some tech companies have suggested that they would find it useful to have an indicative list of what is considered personal data under the current legislation. Another suggestion is to consider setting up a public ombudsman or similar body to allow disputes to be handled in a sensible way through a single channel, and without recourse to the courts.
