Big Data Study
Office of Science and Technology Policy
Eisenhower Executive Office Building
1650 Pennsylvania Avenue, NW
Washington, D.C. 20502
VIA EMAIL [email protected]
Re: Big Data Study, Document Number 2014-04660
March 31, 2014
Dear Ms. Wong,
Thank you for the opportunity to provide public comment in response to your comprehensive review of “big data” and its implications for privacy, the economy, and public policy. Access (https://www.accessnow.org) is a global organization dedicated to defending and extending the digital rights of users at risk around the world. Access works through its Policy, Technology, and Advocacy teams to achieve this mission. Access provides thought leadership and policy recommendations to the public and private sectors to ensure the internet’s continued openness and universality, and mobilizes an action-focused global community of nearly half a million users from more than 185 countries. Access also operates a 24/7 digital security helpline that provides real-time direct technical assistance to users around the world.
I. The Challenges of "Big Data"
The growth in large-scale collection, retention, transfer, and analysis of personal data places everyone’s privacy at risk. All types of organizations (consumer-facing companies, third-party data brokers, government agencies, and others) develop comprehensive profiles, at times containing identifying information, such as names, addresses, and phone numbers, as well as buying habits, personal interests, ethnic identities, political affiliations, marital status, credit card details, and numerous other data points.[1] Enough information is often collected that even anonymous information can be reidentified easily.[2] In one high-profile case, reporters were able to identify several anonymous users based solely on their AOL search history, which had been publicly released.[3] One user's records revealed details of her medical history and love life.
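To make the reidentification risk concrete, the following is a minimal sketch of one common approach: linking a release that contains no names to a separate public list through shared attributes. It is a simplified illustration and not the method used in the cases cited above; every record, field name, and the `reidentify` helper are hypothetical.

```python
# Hypothetical "anonymized" release: no names, but several ordinary attributes.
released = [
    {"zip": "02138", "birth_year": 1960, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
]

# Hypothetical public list (for example, a membership or voter roll) with names.
public_roll = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1960, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1975, "sex": "M"},
    {"name": "Carol Example", "zip": "02139", "birth_year": 1975, "sex": "M"},
]

# Attributes that appear in both datasets and can be used to link them.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(released, public_roll):
    """Return {name: released_record} where exactly one person matches a record."""
    index = {}
    for person in public_roll:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])

    matches = {}
    for record in released:
        key = tuple(record[field] for field in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:  # a unique match ties the record to one named person
            matches[names[0]] = record
    return matches


matches = reidentify(released, public_roll)
```

In this toy data the first record matches exactly one named person and is therefore reidentified, while the second matches two people and stays ambiguous. The more attributes a release contains, the more likely each combination is to be unique.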
There has been an exponential increase in the amount of data collected and stored by private companies in recent years. Facebook announced in 2012 that its data center had grown 2500x since 2008. By 2012, Facebook was collecting about 180 petabytes of data per year.[4] For reference, one petabyte is the equivalent of 20 million four-drawer filing cabinets filled with text.

[1] http://www.newrepublic.com/article/115041/whatbigdatadoesanddoesntknowaboutme
[2] http://www.forbes.com/sites/adamtanner/2013/04/25/harvardprofessorreidentifiesanonymousvolunteersindnastudy/
[3] http://www.nytimes.com/2006/08/09/technology/09aol.html?pagewanted=all
Retailers, whether they operate online or off, also track customers. It is estimated that Wal-Mart processes about 1 million customer transactions every hour, feeding databases estimated to hold 2.5 petabytes of data.

“Free” services offered by companies are often possible because these practices are part of a business model that relies on interpreting high-quality data about their users in order to serve revenue-generating targeted advertising. And over the years, many of these same internet companies have “simplified” their privacy policies by eliminating granular user controls while increasing the capacity to track each and every online action.[5]

Data collection practices have been connected to specific practices that negatively impact internet users. For example, in 2012, it was discovered that some online travel booking companies, including Orbitz Worldwide Inc., were charging customers using Apple products close to 30% more for flights and hotels than visitors using Windows.[6] Such digital market manipulation leads to economic and privacy harms.[7] A recent breach of Target’s systems is estimated to have affected up to one third of all Americans.[8] Ensuring that citizens have adequate knowledge of and control over their data would greatly reduce the privacy and other human rights risks associated with big data. Currently, comprehensive standards apply to medical and financial data, but not to other types of sensitive information.
It is not only private entities where data collection has skyrocketed. Recent revelations have shown that US government intelligence agencies have been implementing programs to collect personal information and communications of users around the world at unprecedented levels.
Some of these programs are implemented through legal processes, which compel companies to produce user information that the companies have otherwise collected for their own purposes.
These collection programs are overseen by the secret FISA Court, which issues orders requiring production while preventing companies from publicly revealing that the collection has occurred.
Under other programs, often authorized under Section 702 of the FISA Amendments Act and Executive Order 12333, the US is tapping fiber optic cables directly (BLARNEY, OAKSTAR, STORMBREW, FAIRVIEW),[9] breaking into the private links between corporate data centers (e.g., MUSCULAR),[10] or collecting the content of a whole country’s phone calls (e.g., MYSTIC/RETRO).[11] Given the preponderance of attacks on the US Government, this mass surveillance places a tremendous number of users and a tremendous amount of user data at risk.

[4] https://www.facebook.com/notes/facebookengineering/underthehoodschedulingmapreducejobsmoreefficientlywithcorona/10151142560538920
[5] http://mattmckeon.com/facebookprivacy/
[6] http://online.wsj.com/news/articles/SB10001424052702304458604577488822667325882?mg=reno64wsj&url=http%3A%2F%2Fonline.wsj.com%2Farticle%2FSB10001424052702304458604577488822667325882.html
[7] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2309703
[8] http://www.nytimes.com/2014/01/11/business/targetbreachaffected70millioncustomers.html
II. The Problem of Unauthorized Access
Poor security practices around collected data have led to unauthorized access to and use of personal information, compromising users around the world. Data breaches are increasing in frequency. Last year saw the highest total of breached records to date, according to a report by Risk Based Security.[12] In one incident, attackers obtained records with email addresses and passwords from around 152 million Adobe accounts.[13] In another, approximately 110 million Target accounts, equivalent to about a third of the US population, were affected.[14] While the Adobe and Target breaches are two of the largest known breaches to date, data continues to be compromised with such great frequency that these incidents account for only a small portion of the total data that is known to have been exposed in 2013. Indeed, last year 2,164 data breach incidents were reported worldwide, exposing 822 million records. Attacks against US entities accounted for nearly half of all breaches globally.[15]

Unauthorized access to user data is not a new problem. For the past 12 years, identity theft has been the biggest source of complaints to the Federal Trade Commission,[16] which underlines that the identity and finances of citizens are consistently at risk due to needless collection practices and insufficient security practices employed by companies online. The economic impact of data breaches, and the accompanying reputational and legal fallout, is undoubtedly huge. Target spent $61 million in breach-related costs in the first three months after the breach, a figure that experts estimate may grow to as high as $1 billion.[17] Target’s data breach is expected to be so expensive, in part, because it revealed data placing credit at risk.[18] That might be good for credit monitoring agencies, but it can create everyday challenges for victims when they try to get a mortgage, get a credit card, or buy a car. Data breaches are also particularly expensive in the US for the companies that lost records or had them stolen. In 2012, companies paid on average $188 per lost or stolen record. That equated to about $5.4 million in losses for each entity with a data breach.[19]

[9] http://www.washingtonpost.com/business/economy/thensaslideyouhaventseen/2013/07/10/32801426e8e611e2aa9fc03a72e2d342_story.html
[10] https://www.accessnow.org/blog/2013/11/01/nsahacksinternetcompanydatacenters
[11] https://www.accessnow.org/blog/2014/03/20/nsabulkcollectionisoutofcontrol
[12] https://www.riskbasedsecurity.com/reports/2013DataBreachQuickView.pdf
[13] http://www.reuters.com/article/2013/11/07/usadobecyberattackidUSBRE9A61D220131107
[14] http://www.nytimes.com/2014/01/11/business/targetbreachaffected70millioncustomers.html
[15] https://www.riskbasedsecurity.com/reports/2013DataBreachQuickView.pdf
[16] http://www.ftc.gov/newsevents/pressreleases/2012/02/ftcreleasestopcomplaintcategories2011
[17] http://www.reuters.com/article/2014/02/26/ustargetresultsidUSBREA1P0WC20140226
[18] http://www.usnews.com/news/articles/2014/03/26/jackpottargetdatatheftvictimsbecomeacreditagencygoldmine

Governments also take advantage of insecure data. While the surveillance programs discussed above often operate under a system of compelled production, others skip official channels and, instead, use back doors. One such program is "Upstream," alluded to in slides released in June 2013 and later confirmed by government officials. Upstream collection takes data right off the "backbone" of the internet: the wires over which information is transmitted from computer to computer. Further revelations have brought to light backbone collection by US and other governments of remotely activated webcam feeds,[20] email contact lists,[21] and information on internal company networks.[22] It has also been revealed that the government has acted to preserve these collection programs by undermining data security standards.[23]

Unauthorized access or use of information by governments, as well as private actors, fundamentally threatens the internet as we know it. The world’s largest internet companies build their business models around user trust in the networks that transmit, and the entities that store, their personal data. Google’s Chief Legal Officer, David Drummond, has said, “Our business depends on the trust of our customers.” More acutely at risk, U.S.-based cloud computing firms spoke out after losing business following last summer’s NSA revelations, and fear losing up to $35 billion in worldwide contracts as European regulators look to tighten restrictions on the cloud. Trust is also eroded when the NSA shares data with government agencies not dealing with foreign intelligence. For example, the NSA has provided evidence to the DEA, which then uses “parallel construction,” whereby agents find alternative grounds to justify arrests and skirt legal challenges.[24] The rule of law is threatened when legal limitations fail to uphold even the narrow privacy protections that exist.
III. The Role of Data Security
As data are transferred from entity to entity, they become increasingly vulnerable, with more points at which unauthorized parties may be able to gain access to those data and use them for unintended purposes. Bad actors may compromise the financial or physical safety of users, and governments could use personal information to target dissidents, stifle speech, or influence political outcomes.

[19] https://www4.symantec.com/mktginfo/whitepaper/053013_GL_NA_WP_Ponemon2013CostofaDataBreachReport_daiNA_cta72382.pdf
[20] http://www.foxnews.com/tech/2014/02/27/ukusspieshackedwebcamsmillionsyahoousers/
[21] http://www.washingtonpost.com/world/nationalsecurity/nsacollectsmillionsofemailaddressbooksglobally/2013/10/14/8e58b5be34f911e380c67e6dd8d22d8f_story.html
[22] http://www.washingtonpost.com/world/nationalsecurity/nsainfiltrateslinkstoyahoogoogledatacentersworldwidesnowdendocumentssay/2013/10/30/e51d661e416611e38b74d89d714ca4dd_story.html
[23] http://www.wired.com/2013/09/nsabackdooredandstolekeys/
[24] http://www.washingtonpost.com/blogs/theswitch/wp/2013/08/05/thensaisgivingyourphonerecordstothedeaandthedeaiscoveringitup/
Access has attempted to move the global conversation on security of big data forward. In March 2014, Access released the Data Security Action Plan.[25] In creating the Data Security Action Plan, Access considered what common-sense practices were needed to mitigate the extreme risk posed by the increasing amounts of data stored online. The Action Plan consists of seven steps that companies should take to protect their users. The seven steps are:
1. Implement strict encryption measures on all network traffic;
2. Execute verifiable practices to effectively secure user data stored at rest;
3. Maintain the security of credentials and provide robust authentication safeguards;
4. Promptly address known, exploitable vulnerabilities;
5. Use algorithms that follow security best practices;
6. Enable or support the use of client-to-client encryption; and
7. Provide user education tools on the importance of digital security hygiene.
All entities should support the implementation of these security measures on all relevant data and networks under their control. Widespread adoption would benefit all internet users around the world, and would raise the floor on minimally acceptable data security practices. If we fail to consider data security in the debate on big data public policy, we are standardizing unacceptable risks for users, companies, and the public at large.
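As one illustration of what step 3 (maintaining the security of credentials) can look like in practice, the sketch below stores a salted, deliberately slow hash of each password instead of the password itself, so that a breach of the stored records does not directly expose users' passwords. It is a minimal example using Python's standard library, not a description of the Action Plan's requirements; the function names and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; real deployments should follow current guidance.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Derive a salted PBKDF2-HMAC-SHA256 hash; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every stored credential
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Only the salt and digest are stored, never the password itself.
salt, digest = hash_password("correct horse battery staple")
```

A later login attempt is checked with `verify_password(attempt, salt, digest)`. Because each record has its own salt and each guess costs many hash iterations, stolen records are far more expensive to turn back into usable passwords.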
IV. Conclusion
To mitigate the harms of data breach and misuse and to build user trust, the White House should consider what steps are necessary to protect user data. Companies should take proactive steps to protect user data. Specifically, this means adopting privacy-centered approaches to the collection and processing of user data, including: data minimization to limit collection of data where possible; ensuring that data is collected and stored for strictly defined purposes, and not used in a way that is incompatible with those purposes; and applying appropriate security measures to data both in transit and at rest.
Accordingly, Access calls on the government to bolster data protection standards, promote data security, and continue to foster a robust discussion on best practices.
Thank you for the opportunity to provide comment as part of this Big Data Study. For more information, please visit https://www.accessnow.org or contact the authors of this comment, Amie Stepanovich and Drew Mitnick, at [email protected] and [email protected] respectively.
[25] More information is available at https://encryptallthethings.net