15TH JANUARY 2014
Richard Kemp Paul Hinton
Jeremy Harris
There’s data –
and then there’s big data
IT is one of four technologies that will shape
future global developments
“information technology is entering the
big data era”
“process power and data storage are
becoming almost free”
“networks and the cloud will provide
global access and pervasive services”
“social media and cybersecurity will be
large new markets”
US National Intelligence Council’s December 2012 Report –
“Since modern data solutions have emerged, big data sets have grown exponentially in size. At the same time, the building blocks of knowledge discovery, and the software tools and best
practices available to organisations that handle big data sets, have not kept pace with such growth. So a large - and rapidly growing - gap exists between the amount of data that organisations can
accumulate and their abilities to leverage those data in a way that is useful” (NIC Report, p. 85/6)
The impact of big data is all about knowing your customer and
the competitive advantage that confers
Technology focus: Data solutions
Current status: Large data sorting and analysis is applied in various large industries, but the quantity of data accumulating is outstripping the ability of systems to leverage it efficiently.
Potential for 2030: As software and hardware developments continue, new solutions will emerge to allow considerably more data to be collected, analysed and acted on.
Issues: The greatest areas of uncertainty are the speed with which big data can be usefully and securely utilised by organisations.
Impact: Opportunities for commercial organisations and governments to “know” their customers better will increase. These customers may object to the collection of so much data.
… in the context of organisations’ big data operations …
1. Input data from
multiple sources
• public domain
• market data
• social media
• personal data
• confidential data
• licensed data
• government data
• employee data
- self-generated and derived data
2. Processing operations
• third party applications
• ‘secret sauce’ algo
• pan enterprise search
• ‘one view’ of information
• data ‘re-purposing’
3. Output data for
multiple purposes
- for internal use
• product development
• sales & mktg
• CRM
• management
• finance
- for external use
Big Data and IP
The “Big Data” Factory
[Diagram: the “Big Data” Factory. Labels: Live data feed; Internal structured data; Third party review; Third party data; Transaction data; Internal unstructured data; Big Data Storage Platform (Cloud?); Algorithm; Search engine; Database; Social media; Question; Regular report]
Understanding how IP fits in with Big Data means knowing:
what you are getting
from where
when
how
under what circumstances, and
how you’re using it
It’s worth looking at the IP position in terms of input and output data
Different types of data from many sources
Is there a licence in place? Does it matter from an IP perspective?
Not really:
– if there is no licence in place, it doesn’t mean that there are no IP rights in the
data
– even if there is a licence in place, still need to understand the IP position
– breach of licence could constitute infringement of IP, but
– the measure of damages for IP infringement is different to breach of contract
Licence might include indemnity against losses incurred as a result of
infringement of third party IP
– both from licensor to licensee and licensee to licensor
Database Right
Database Copyright
Literary Copyright
Confidence
Database Right
Is there a qualifying database?
“a collection of independent works, data or other materials arranged in a systematic or methodical way and individually accessible by electronic or other means” (Art 1(2) DD)
Does the right subsist?
“qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents ..” (Art 7(1) DD)
– who created the data?
– remember BHB v William Hill: the ‘investment in obtaining, verifying or presenting’ protects ‘resources used to seek out independent material’
– therefore, resources used for the creation of the data are not protected
– contrast with recent case, Football DataCo v Sportradar: data collected and recorded at live events; the compiler of this information had little control over it and it was therefore not ‘created’ by that person but merely ‘obtained’ by them
Would use of the data constitute an infringement?
– extraction or reutilisation of a substantial part (quantitatively or qualitatively); or
– repeated extraction or reutilisation of insubstantial parts
Substantial part:
‘quantitative’ evaluation - proportion of the volume of data lifted in relation to the total volume of the contents of the database
‘qualitative’ assessment – small part of the database which requires significant human, technical or financial investment, may amount to a substantial part evaluated qualitatively
– look at the ‘scale of investment’ in obtaining, verifying or presenting the part taken
Accordingly, you need to look at:
– how much data is being taken
– how often
– how important it is
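The ‘how much’ and ‘how often’ questions above can be logged as simple arithmetic before any legal judgment is made. The following is a minimal sketch only: the `Extraction` record, its field names and the sample figures are hypothetical, and nothing in it applies a legal threshold, because whether a part is ‘substantial’ (particularly on the qualitative limb) remains a matter of assessment.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """One occasion on which records were taken from a source database."""
    day: str            # date of the extraction (hypothetical field)
    records_taken: int  # number of records copied on that occasion


def quantitative_share(records_taken: int, total_records: int) -> float:
    """Proportion of the database's contents taken, by volume."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return records_taken / total_records


def summarise(extractions: list[Extraction], total_records: int) -> dict:
    """Collect the 'how much' and 'how often' facts for legal review."""
    shares = [quantitative_share(e.records_taken, total_records) for e in extractions]
    return {
        "occasions": len(extractions),                      # how often
        "largest_single_share": max(shares, default=0.0),   # biggest single taking
        # Simple sum of shares; it may double-count records taken more than once.
        "cumulative_share": sum(shares),
    }


# Illustrative figures only.
log = [
    Extraction("2014-01-06", 500),
    Extraction("2014-01-07", 500),
    Extraction("2014-01-08", 1000),
]
summary = summarise(log, total_records=10_000)
print(summary)
```

A log of this kind records only volume and frequency; the importance of the data taken and the investment behind it would need to be assessed separately.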
Database Copyright
Is there a qualifying database?
Same definition as for database right
Does the right subsist?
Protection if, “by reason of the selection or arrangement of the contents of the database the database constitutes the author’s own intellectual creation” (section 3A CDPA)
– what exactly is the ‘author’s own intellectual creation’?
– Football DataCo v Brittens Pools/Yahoo = an original expression of the creative freedom of the author (which is a matter for the national court to determine)
Accordingly:
– concept of ‘selection and arrangement’ does not extend to the creation of the data contained in the database
– the author must express creative ability in an original manner by making free and creative choices as to selection/arrangement
Literary Copyright
Is there a copyright work?
Original creation? Literary merit?
Skill, labour & judgment - no longer relevant
– the test is whether it constitutes the author’s own intellectual creation (Infopaq)
Infringement:
– would taking the data constitute an infringing act?
– has a substantial part been taken, whether on a quantitative or qualitative basis?
– quality is more important
Confidence
Are there obligations of confidence in place?
Is the data stated to be confidential?
Website terms?
Circumstances under which a reasonable person would understand it to be confidential?
Is IP being created in the output process?
What are you doing with the data?
– manipulation of the data in a new/different way
– incorporation with other data sets
– creation of master database
– producing reports
Is there any:
– investment in ‘obtaining’, ‘verifying’ or ‘presenting’?
– intellectual creativity in ‘selection and arrangement’?
Navigating the gap
Contracting for “Big
Data” content
Licensing requirements
Input data – standard contracts
FS “market data” contract terms
Output data - key terms
Conclusion
The “Big Data” Factory
[Diagram: the “Big Data” Factory. Labels: Live data feed; Internal structured data; Social media data; Public data; Transaction data; Internal unstructured data; Big Data Storage Platform (Cloud?); Algorithm; Search engine; Database; Social media; Question; Regular report]
Ensure that all input data can be used for all “Big Data”
purposes – ideally (i) assignment; or (ii) a broad
worldwide, irrevocable, perpetual, licence to do anything..
Assuming this is not always possible:
–What will the data be used for? By whom? Internal v
external?
–Will the data be relied upon for anything?
Licensing requirements – Input data
Plans
Change
Even if ‘no’ licence applies – access can be made subject to a licence at any time – once data obtains value this tends to happen
Many standard licences prohibit such use:
– General website terms and conditions
“You are not permitted (except where you have been given express permission to do so) to adapt or modify the Information on this Website or any part of it and the Information or any part of it may not be copied, reproduced, republished, downloaded, stored, databased, posted, broadcast or transmitted in any other way to any third parties for commercial gain.”
– They can be amended at any time..
Input data - standard contracts - social media
Contracts – your use of data
Facebook terms include…
• Only request data that you need for your application
• Must not include data in any search engine or directory without FB consent
• Cannot include user data in an advertising creative EVEN if user consents
• Cannot transfer data to advertising network
• Cannot sell data
• FB can force you to delete data if your use is “inconsistent with user expectations”
Twitter terms include…
• Not to store non-public user profile data or content
• Cannot use the Twitter API to aggregate geographic location information contained in Twitter Content
• May not use Twitter Content or other data collected from end users to create or maintain a separate status update or social network database or service
• Don’t sell access to the Twitter API or Twitter Content without Twitter’s consent
Pinterest terms include…
• They say very little… for now
• Reflects the limited data flow presently available on this platform
• Includes general restriction on use of “Pinterest Content” – you cannot use, modify, reproduce, distribute, sell, license, or otherwise exploit it without Pinterest’s permission.
Financial Services
“Market Data” = a mature data licensing market –
licensor maximises control of data and income from downstream use
“Historic” data v “Live” data
“Internal” licence v External “Distribution” licence and charge
Licensor controls any directly competing redistribution and requires direct
licence or identikit sub-licence
Use for trading on a platform/analysing v creating separate tradeable product
Do not generally enable broad
‘Big Data’ use per se but new licences and
charging mechanisms developing
Ideally = flat fee or easily calculable fee for “Big Data” use
Key issue for Licensor is that
“Big Data” does not = substitute for
original data
Typical charges based upon individual traceable licensed users
– “charges per user”
–What if upstream data only forms a small part of “Big Data” query?
–What if the enquiry is a “one-off”?
–Can use be tracked?
Input data – financial services market data – licence and
charges
Derived data:
“data of any kind containing Data or any part of it and/or resulting directly or
indirectly from the manipulation or analysis of Data (whether generated by
human or machine) whether alone or in conjunction with other data regardless
of whether or not the Data is in any way identifiable from or within such data by
any means;”
Standard positions:
– prohibited distribution of derived data
– permitted only if no part of original data shown or backwards calculable
– owned by original licensor and report/track licence and charge data use
How much can “Big Data” be relied upon?
Warranties as to accuracy
“as is” “as available” “not to be relied
upon”
–data itself perhaps not verified if from third parties
–what if created by licensor
–what if derived by calculation by licensor – can warrant
calculation accurate
–no remedy/reasonableness
“display the [Trade Marks] at all times in accordance with the
Permitted Distribution Policy solely in connection with the grant of
licence in clause [ ].”
practical or possible?
will data be individually discernible?
“The Licensee shall permit [ ] to audit and inspect:
– the Licensee's accounts, records and other information and permit it to take
copies or extracts and on demand supply copies to [ ] of such information;
– any information in the Licensee's control that relate to any Subscriber;
– access to and monitor the use of the Licensee's system used to distribute the
Data, in order to verify that the use of the Data by the Licensee is in
accordance with this Agreement and that Charges due under this Agreement
have been calculated and paid correctly.”
Is this possible/practical/desirable?
Confidentiality clauses relevant to other data - insert explicit licence for third
party auditors - but some of these will be competitors.
Alternative and lesser obligations
Often delete/purge obligations
“…the Licensee must stop using the Data and the Trade Marks and
purge its systems of all Data”
Is this possible?
Also:
–what about reports already sent out?
–record keeping for regulatory requirements?
–record keeping to enable protection of claims?
Output Data - Metered Access – key terms
Reflect upstream data input obligations accurately
Avoid sole reliance upon IP rights
– “Contract is king” - Etherton J in At the races v
BHB (2005)
The ideal is to create metered access to data subject to contract at each stage:
– Impose clear rights explicitly drafted as obligations into a contract
– Licence Clauses - set out only what may be done in detail – reserve all other rights
– Keep the contract neutral and flexible, adapting where possible by reference to other documents that can change
Post Termination Rights – do not presume the licence will be implied to terminate Regina
Glass Fibre v Schuller [1972] FSR 141 – “if the Licensee will not be able to enjoy the benefit of
“Big Data” requires significant legal input and a clear legal methodology and lead to ensure compliance and to maximise value
Back to front – understanding outputs and system capability critical before negotiating input agreements
Input data licences are likely to need careful contract review and negotiation
Output data licences will need to be carefully drafted, flexible and updated
If complex licensed rights are asserted to data within the “Big Data Factory”, system capabilities will be needed to ensure “output” compliance: (i) data tracking; (ii) limit access to or use of data; (iii) attach terms/attributions to data; and (iv) delete/remove data – how quickly/efficiently
Data strategy, policies, process and data management framework – technological solution
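The four system capabilities listed above (tracking, limiting use, attaching terms and deleting) can be pictured with a small provenance-tagged store. This is a minimal sketch under stated assumptions, not a description of any real product: the `Record` and `DataStore` classes, the source names and the permitted-use labels are all hypothetical, and the lineage is assumed to contain no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    value: str
    source: str   # upstream licensor or feed the record came from
    terms: str    # attribution or licence notice attached to the data
    derived_from: list[str] = field(default_factory=list)  # ids of input records


class DataStore:
    def __init__(self, allowed_uses: dict[str, set[str]]):
        # Hypothetical policy table: source name -> uses its licence permits.
        self.allowed_uses = allowed_uses
        self.records: dict[str, Record] = {}

    def add(self, rec_id: str, record: Record) -> None:
        self.records[rec_id] = record

    def lineage(self, rec_id: str) -> list[Record]:
        """The record itself plus every stored record it was derived from."""
        rec = self.records[rec_id]
        out = [rec]
        for parent in rec.derived_from:
            if parent in self.records:
                out.extend(self.lineage(parent))
        return out

    def sources_of(self, rec_id: str) -> set[str]:
        """(i) data tracking: every upstream source a record depends on."""
        return {r.source for r in self.lineage(rec_id)}

    def may_use(self, rec_id: str, use: str) -> bool:
        """(ii) limit use: allowed only if every upstream source permits it."""
        return all(use in self.allowed_uses.get(s, set())
                   for s in self.sources_of(rec_id))

    def notices(self, rec_id: str) -> set[str]:
        """(iii) attach terms: notices that must travel with an output."""
        return {r.terms for r in self.lineage(rec_id)}

    def purge_source(self, source: str) -> list[str]:
        """(iv) delete: remove a source's records and anything derived from them."""
        doomed = [rid for rid in self.records if source in self.sources_of(rid)]
        for rid in doomed:
            del self.records[rid]
        return sorted(doomed)


store = DataStore({"exchange_feed": {"internal"}, "in_house": {"internal", "external"}})
store.add("p1", Record("price 101.5", "exchange_feed", "Exchange data: internal use only"))
store.add("t1", Record("trade 9", "in_house", "Company confidential"))
store.add("r1", Record("weekly report", "in_house", "Company confidential",
                       derived_from=["p1", "t1"]))

external_ok = store.may_use("r1", "external")   # report inherits the feed's restriction
report_notices = store.notices("r1")
purged = store.purge_source("exchange_feed")    # e.g. on termination of the feed licence
print(external_ok, sorted(report_notices), purged, sorted(store.records))
```

The point of the design is that restrictions and notices follow derived outputs automatically, so a purge obligation on one upstream licence can be traced through to the reports built on it. Reports already sent out, and records that must be kept for regulatory reasons, fall outside what such a store can handle on its own.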
15TH JANUARY 2014 Richard Kemp
Big data
regulatory aspects
information management
CCF Data Protection
– UK’s ICO has powers to fine up to £500,000
– current progress of draft Data Protection Regulation
– LIBE European Parliament Committee compromise amendments approved on
21 October 2013
– At the moment, fines of up to the greater of €100m or 5% of worldwide turnover
– Likely in force date 2015
– Most Enterprise and many SME orgs have in place formal data protection
compliance policies & processes
– can be used as the basis for ‘big data’ compliance?
Different legal areas where regulatory duties around data arise
1 - generic
Financial services
MiFID equity trading rules – pre- & post- trade data – transaction reporting
Market data
– source/exchange rules
MiFID II – will extend to other asset classes
Market Abuse, Capital Adequacy Directive requirements
– alleged LIBOR, forex market manipulation
Air Transport Industry (ATI)
Passenger Name Record (PNR) data
Fares data
– GDS (Amadeus, Worldspan, etc)
– Airline websites
Mobile check in, etc
Different legal areas where regulatory duties around data arise
2 – sector specific
Professional services
e.g. legal services
regulatory client confidentiality rules
rules on conflicts of interest
privilege rules
– litigation privilege
– legal professional privilege
Healthcare
clinical outcomes data
– aggregated and anonymised?
sensitive personal data
Different legal areas where regulatory duties around data arise
2 – sector specific
Articles 101 and 102 EU Treaty (Chapters I & II Competition Act 1998)
– Competition authorities becoming more vigilant around commercial practices for
supplying and licensing data
– Article 101/Chapter I concerned with anti-competitive agreements
– Article 102/Chapter II concerned with abusive conduct by market powerful orgs
– Two cases around securities identifiers in the financial services area
– S&P & CUSIPs, Thomson Reuters and RICs
– Markit & CDSs (Credit Default Swaps)
Different legal areas where regulatory duties around data arise
3 – competition law
step 1: risk assessment
– structured process to review/assess/report/remediate
– involve all parts of the business
– establish all the types of data your organisation is using & their sources
– where does the data come from? what consents were obtained/are needed?
– what legal wrappers apply to all this data – IPR, contract, regulatory, etc
– what processes do these data undergo?
– what does your organisation use these data for?
– data protection environment is reasonably mature – use this as a start point?
step 2: strategy statement
– the start point that everything is referable back to
– high level statement of organisation’s goals relating to big data
– list stakeholders - top management and all parts of the business bought in
– rationale, scope, governance, etc
step 3: policy statement
– next level down
– people context: stakeholder groups, their interests & how they are achieved
– steering group, working party, compliance officers, etc
– project plan – scope, responsibilities, timelines, etc
– state tools to be used (mix of IT/system measures & processes & procedures)
– approvals, etc
step 4: processes and procedures
– applicable to all staff – tie in to HR policies, etc
– proportionate processes & procedures to be followed
– IT system/measures & how they’re to be used
– awareness training
Our People
Paul Hinton, Commercial Partner, 020 7710 1623, [email protected]
Jeremy Harris, Commercial Partner, 020 7710 1658, [email protected]
Richard Kemp, Senior Partner, 020 7710 1610, [email protected]
If you would like to attend either of the following events, please contact [email protected]:
– 5th February – HR Forum – Tweaking your business for success in 2014
– 2nd April – There’s data…and then there’s big data (re-run)
We will also be running events on the following topics over the next couple
of months – we will contact you with further information:
– Cloud
– Data Protection
Kemp Little LLP is a limited liability partnership registered in England and Wales (registered number: OC300242) and is authorised and regulated by the Solicitors Regulation Authority. Its registered office is Cheapside House, 138 Cheapside, London EC2V 6BJ. A list of members is open to inspection at the registered office.
KEMP LITTLE Cheapside House 138 Cheapside London EC2V 6BJ TEL +44 (0) 20 7600 8080 FAX +44 (0) 20 7600 7878 — kemplittle.com