Specifying Requirements Using Commitment, Privilege, and Right Analysis.


ABSTRACT

SCHMIDT, JESSICA YOUNG. Specifying Requirements Using Commitment, Privilege, and Right (CPR) Analysis. (Under the direction of Annie I. Antón.)

Organizations have many documents, including policy documents (e.g., privacy policies and terms of use) and data use agreements (DUAs), with which their software must comply. Requirements engineers must incorporate these documents into the requirements phase of software in order to build in compliance from the start. In the United States, the Federal Trade Commission is empowered to monitor organizations’ compliance with the practices expressed in the organizations’ policy documents. Therefore, organizations must ensure that their software systems comply with these documents, which must serve as a primary source of compliance requirements early on in the software lifecycle. Regulations created pursuant to the U.S. Health Insurance Portability and Accountability Act specify that a DUA must exist for certain uses and disclosures of protected health information as a limited data set. For compliance reasons, it is important for requirements engineers to ask for and evaluate DUAs, as they are legally binding on the parties.


Specifying Requirements Using Commitment, Privilege, and Right (CPR) Analysis

by

Jessica Young Schmidt

A dissertation submitted to the Graduate Faculty of North Carolina State University

in partial fulfillment of the requirements for the Degree of

Doctor of Philosophy

Computer Science

Raleigh, North Carolina

2012

APPROVED BY:

Julia B. Earp

Munindar P. Singh

Laurie Williams

Annie I. Antón


DEDICATION


BIOGRAPHY

Jessica Young Schmidt grew up in Natural Bridge—a gorgeous little town in the mountains of Virginia. Since leaving the Shenandoah Valley for Raleigh, NC and Arlington, MA, she has realized just how lucky she was to grow up in such an amazing place. She was spoiled by the gorgeous views and tons of space to play. All of which led to her love of the outdoors— hiking, kayaking, mountain biking, running, snowboarding, fishing, playing soccer, and basically anything that lets her be outside with friends and family. Last year she married her husband Matt, a fellow NC State Computer Science Ph.D., on her family’s farm surrounded by family, friends, and, of course, the mountains.


ACKNOWLEDGEMENTS

The brick walls are there for a reason. The brick walls are not there to keep us out.

The brick walls are there to give us a chance to show how badly we want something. Because the brick walls are there to stop the people who don’t want it badly enough.

They’re there to stop the other people.

- Randy Pausch, The Last Lecture

Just keep swimming. Just keep swimming.

Just keep swimming, swimming, swimming. What do we do? We swim, swim.

- Dory, Finding Nemo

I believe these two quotes accurately explain my journey through grad school. It all started with the brick wall: my desire to get a Ph.D. and teach in hopes of inspiring students the way my professors inspired me. The journey wasn’t always easy, but there was a purpose, and sometimes the only thing I could do was just keep swimming...

First, I want to thank my family and friends, who have always been there for me and will always be my top priority. Without them, life wouldn’t be nearly as fun, enjoyable, or worthwhile. I don’t regret any times I spent with them, even if at times I probably should have been working. They have made me the person I am today, and I love them dearly.

To Matt: You are absolutely amazing! I cannot even begin to say how thankful I am that you were persistent in getting me to come out with the group for drinks and in asking me out on a date without really asking me out on a date. You are the best thing that happened to me while at State. You managed to keep me sane, or at least somewhat sane, through grad school and everything else that has happened over the past four years. You were there for it all—success, excitement, screaming, crying—and always knew the right thing to say. And because you are as big of a nerd as I am, whenever I hit a roadblock with my research, you were there to talk through it with me. I love you!


To Karla: Thanks for being such an amazing little sister! You excel at everything you do, which makes me incredibly proud and motivates me to be better.

To Cody: During tough times, you were exactly what the family needed.

To the Young, Wood, Gibson, Schmidt, and Steltenpohl families: I feel extremely lucky to have such large and loving families. Thank you for your support!

To Emily, Ashleigh, Kara, Amy, Haley, and Jess: Not only did you make my time at Roanoke amazing, but years later you are still my closest friends. You always remind me that having fun and being goofy are important because “everybody is somebody else’s weirdo.”

To Paul, Kelly, Brandon, April, Prachi, and my other Raleigh friends: Thanks for making Raleigh so enjoyable!

To Team LTS: Thanks for making my summer in Maryland so fun.

Next, I want to thank all of the people at NC State who supported me over the years. These individuals helped me with research, teaching, and grad school in general.

To Dr. Antón: Thank you for your advice, support, and encouragement over the past five years. You helped me to become a more confident researcher and presenter. I really appreciate all of the time you have spent helping me with my research, especially the time you spent performing goal analysis for my comparison study. Thank you as well for all of the semesters that you funded me as your RA.

To Dr. Earp: Thank you for all of the time you spent performing goal analysis for my comparison study and for always being available to answer any research questions I had, especially those involving statistics.

To Dr. Singh: Thank you for your advice and expertise in helping me to improve my research, particularly my definitions.

To Dr. Williams: Thank you for your advice and encouragement in applying CPR analysis within the development of the health system prototype. I also wanted to thank you for the semesters you funded me as an RA.


To Aaron: On my first day at State, you were tasked with showing me around the lab and introducing me to the research of the group. Five years later... and you are still an amazing resource to me—always answering my questions, as silly as they may be. Thanks for being a great labmate and friend!

To Paul: You were already in law school by the time I started at State, but you have been a ginormous help to me. Thank you so much for putting up with all of my emails that contained countless legal questions. Your help has proved to be essential for positioning my work and showing the relation to legal concepts. Thanks!

To Sarah: Thank you for all of your guidance with teaching CSC 116. Without you, the experience would not have gone as well as it did.

To Dr. Thuente and Dr. Reeves: Thank you for providing me with the opportunity to teach.

To Margery and Carol: Thank you for all of your help. Your help and advice have saved me countless hours of headaches over these past five years.

I also want to thank my professors at Roanoke who not only persuaded me to major in Computer Science but also inspired me to get my Ph.D. and pursue a career in teaching.


TABLE OF CONTENTS

List of Tables

List of Figures

Chapter 1 Introduction

1.1 Motivation

1.1.1 Policy Documents

1.1.2 Data Use Agreements

1.2 Related Work

1.2.1 Policies and Requirements

1.2.2 Goal-Based Analysis

1.2.3 Traceability

1.2.4 Compliance

1.2.5 Natural Language

1.2.6 Commitments

1.2.7 User Studies

1.3 Overview of Remaining Chapters

Chapter 2 CPR Analysis Theory and Methodology

2.1 Theory of Commitments, Privileges, and Rights

2.1.1 Classification Terminology

2.1.2 Rationale Terminology

2.1.3 Terminology Comparison

2.2 CPR Analysis Methodology

2.2.1 Parse

2.2.2 Classify

2.2.3 Operationalize

2.3 Summary

Chapter 3 CPR Analysis Heuristics

3.1 Parse

3.2 Classify

3.2.1 Item Heuristic (IH)

3.2.2 Helper Heuristics (HH)

3.2.3 Classifying Heuristics (CH)

3.2.4 Documenting Attribute Heuristics (DAH)

3.2.5 Documenting Rationale Heuristics (DRH)

3.2.6 Forming CPR Heuristic (FH)

3.3 Operationalize

Chapter 4 Multi-Case Studies

4.1 Formative Multi-Case Study

4.1.1 Research Questions

4.1.2 Documents Analyzed

4.1.3 Results

4.2 Summative Multi-Case Study

4.2.1 Research Questions

4.2.2 Documents Analyzed

4.2.3 Results

4.3 Threats to Validity

4.4 Summary

Chapter 5 Examining CPR Analysis for Ensuring Compliance

5.1 Methodology

5.1.1 Goal-based Analysis

5.1.2 Differences Between Analysis Approaches

5.2 Preserving Context and Intended Meaning

5.3 Case Study Results

5.3.1 Language

5.3.2 Specificity

5.3.3 Modality

5.5 Discussion

Chapter 6 Empirical User Study

6.1 Experimental Design

6.1.1 Design

6.1.2 Subjects

6.1.3 Materials

6.1.4 Procedure

6.1.5 Measurements

6.1.6 Analysis Tests

6.2 Pilot Study

6.3 Data Analysis

6.3.1 Subject Responses to Open-ended Questions for Analysis Tasks

6.3.2 Results of Empirical Evaluation

6.5 Summary

Chapter 7 CPR Analysis within the Development of a Health System Prototype

7.1 Overview of HealthSystem

7.2 Requirements

7.2.1 Software Requirements

7.2.2 Operational Requirements

7.3 DUA Template

7.4 Summary

Chapter 8 Conclusion

8.1 Chapter Summaries

8.2 Contributions

8.2.1 Primary Contributions

8.2.2 Demonstrated Experiences

8.3 Future Work

References

Appendices

Appendix A CPR Analysis Heuristics

A.1 Parsing Heuristics

A.2 Classification Heuristics

A.3 Operationalization Heuristics

Appendix B User Study Materials & Data


LIST OF TABLES

Table 6.1 Subject Demographics

Table 6.2 Subject Software Engineering Experience

Table 6.3 Artifacts Submitted by Subjects

Table 6.4 Possible Coding Combinations for Related Pairs

Table 6.5 Possible Codings for Subject-Produced Requirements Not in Related Pair

Table 6.6 Correctness and Completeness for Conditions

Table 7.1 Requirements Counts for the HealthSystem Prototype

Table A.1 Parsing Heuristics (PH)

Table A.2 Item Heuristic (IH)

Table A.3 Helper Heuristic (HH)

Table A.4 Classifying Heuristic (CH)

Table A.5 Classifying Heuristic for Policy Documents (CH_PD) - Part 1

Table A.6 Classifying Heuristic for Policy Documents (CH_PD) - Part 2

Table A.7 Classifying Heuristics for DUAs (CH_DUA)

Table A.8 Documenting Attribute Heuristics (DAH)

Table A.9 Documenting Rationale Heuristics for Policy Documents (DRH_PD)

Table A.10 Documenting Rationale Heuristics for DUAs (DRH_DUA)

Table A.11 Forming CPR Heuristic (FH)

Table A.12 Requirement Type Heuristic (RTH)

Table A.13 Operationalizing Heuristics (OH)


LIST OF FIGURES

Figure 1.1 NIH Summary of the HIPAA §164.514(e)(4) [104]

Figure 1.2 Excluded identifiers in an LDS: 45 C.F.R. §164.514(e)(2)

Figure 2.1 CPR Analysis Methodology

Figure 2.2 Types of Items within a Document

Figure 4.1 Policy Documents Analyzed in Formative Study

Figure 4.2 DUAs Analyzed in Formative Study

Figure 4.3 Formative Study - Policy Documents Case - Types of Items

Figure 4.4 Formative Study - Policy Documents Case - Classifications

Figure 4.5 Formative Study - Policy Documents Case - Rationale

Figure 4.6 Formative Study - Policy Documents Case - Actors

Figure 4.7 Formative Study - DUA Case - Types of Items

Figure 4.8 Formative Study - DUA Case - Classifications

Figure 4.9 Formative Study - DUA Case - Rationale

Figure 4.10 Formative Study - DUA Case - Actors

Figure 4.11 Policy Documents Analyzed in Summative Study

Figure 4.12 DUAs Analyzed in Summative Study

Figure 4.13 Summative Study - Policy Document Case - Types of Items

Figure 4.14 Summative Study - Policy Document Case - Classifications

Figure 4.15 Summative Study - Policy Document Case - Rationale

Figure 4.16 Summative Study - Policy Document Case - Actors

Figure 4.17 Summative Study - DUA Case - Types of Items

Figure 4.18 Summative Study - DUA Case - Classifications

Figure 4.19 Summative Study - DUA Case - Rationale

Figure 4.20 Summative Study - DUA Case - Actors

Figure 4.21 Multi-Case Studies - Types of Items

Figure 4.22 Multi-Case Studies - Classifications

Figure 4.23 Multi-Case Studies - Rationale

Figure 4.24 Multi-Case Studies - Actors

Figure 5.1 Analyzed Policy Documents

Figure 6.1 Total Number of Subject-Produced Requirements by Condition and Subject

Figure 6.2 Total Number of Expected Requirements to which Subject-Produced Requirements Related

Figure 6.3 Correctness by Condition and Subject

Figure 6.4 Completeness by Condition and Subject

Figure 6.5 Total Number of Related Pairs with No Coding

Figure 6.6 Total Number of Related Pairs with Mul Coding

Figure 6.7 Total Number of Related Pairs with PoV/Mul Coding

Figure 6.9 Derived Requirements

Figure 6.10 Synthesized Requirements

Figure 7.1 Classifications of Classifiable Items in HealthSystem DUAs

Figure 7.2 Actors of Classifiable Items in HealthSystem DUAs

Figure 7.3 Classification/Actor Combinations of Classifiable Items in HealthSystem DUAs

Figure B.1 User Study Questionnaire

Figure B.2 Problem Analysis Questions

Figure B.3 Requirements Analysis Questions


Chapter 1

Introduction

You never have a second chance to make a good first impression.

- Anonymous

The research presented herein addresses the problem of developing compliant software systems. Specifically, given that organizations’ policy documents and data use agreements (DUAs) express commitments, privileges, and rights that can be legally binding, requirements engineers must specify these commitments, privileges, and rights as software requirements. In order to do so, requirements engineers must incorporate these documents into their requirements activities to ensure that the requirements fulfill the items within the documents. We incorporate policy documents and DUAs into the requirements phase of the software lifecycle through the use of CPR analysis. During CPR (commitment, privilege, and right) analysis, requirements engineers extract commitments, privileges, and rights from documents and operationalize them as requirements. The objective of this chapter is to provide motivation for the inclusion of policy documents and DUAs in the requirements phase of the software lifecycle and a summary of related work.

1.1 Motivation


in the software lifecycle [19]. Herein, we focus on two types of documents with which software systems must comply—policy documents and DUAs. We will now discuss each type of document and the motivation for complying with each.

1.1.1 Policy Documents

We use the term policy document to refer to any policy that an organization has. The policy documents we examined were all posted on the organizations’ websites. These policy documents describe the organizations’ practices, including how consumers’ or users’ personal information will be collected, disclosed, protected, shared, stored, and used. Organizations post different types of policy documents online, including privacy policies, privacy statements, Internet privacy statements, web privacy statements, notices of information practices, notices of privacy practices, terms of use, and terms and conditions.

In the United States, the Federal Trade Commission (FTC) is empowered to monitor organizations’ compliance with the practices expressed in their public policy documents. The FTC Act prohibits “unfair or deceptive acts or practices” [70]. This means the FTC can hold organizations liable for the statements they make in their policy documents; therefore, organizations need to comply with their policy documents [70]. To build policy-compliant systems, policy documents need to be considered as a primary source of requirements.

Recent sanctions by the FTC against organizations that fail to comply with their policies highlight the importance of actively addressing compliance as a best practice that avoids such penalties [37, 110]. The FTC has investigated policy compliance failures and brought enforcement actions against at least seventeen organizations since 2002 [37]. These noncompliance cases are an important motivation for organizations to evaluate software compliance with their policy documents.

We will discuss two enforcement examples and two investigation examples. The FTC took enforcement actions against ChoicePoint, a data broker, when the personal records of 163,000 consumers were compromised because ChoicePoint did not comply with its privacy policy [67, 110]. Specifically, ChoicePoint did not comply with the following statements in its privacy policy about who would have access to information and its credentialing process [67]:

- ChoicePoint allows access to your consumer reports only by those authorized under the FCRA.

- Every ChoicePoint customer must successfully complete a rigorous credentialing process. ChoicePoint does not distribute information to the general public and monitors the use of its public record information to ensure appropriate use.


consumer redress [67]. The settlement also required the implementation of procedures to avoid releasing consumer reports to illegitimate businesses, the creation of an information security program, and audits of the organization for 20 years.

The FTC also took enforcement actions against Gateway Learning Corporation (GLC), which sells educational tools, when the organization did not honor commitments it made to its users within its policy document, including [65, 66, 122]:

- We do not sell, rent or loan any personally identifiable information regarding our consumers with any third party unless we receive a customer’s explicit consent.

- We do not provide any personally identifiable information about children under 13 years of age to any third party for any purpose whatsoever.

Contradicting its policy document, GLC rented consumer information, which resulted in consumers receiving direct mailings and telemarketing calls [65, 66]. The FTC fined GLC the amount of money that it retained from renting user information, and the FTC mandated audits, strict record keeping of policy documents, and employee training [65, 66, 122]. To quote from the FTC decision, GLC is also prohibited from “disclosing to any third party any personal information collected on its Web site prior to the date it posted its revised privacy policy permitting third-party sharing (June 20, 2003) without the express affirmative (‘opt-in’) consent of the consumers to whom such personal information relates” [65].

The FTC investigated Facebook and Google because they did not live up to their promises. Specifically, Facebook told users that they could keep their personal information private, but Facebook shared users’ information that was marked as private [68]. Google came under FTC investigation when it launched Google Buzz, which violated some of Google’s privacy promises [69]. Facebook and Google were not charged fees to settle their investigations, but, as it did with ChoicePoint and GLC, the FTC required Facebook and Google to implement comprehensive privacy programs and to obtain audits for 20 years, among other requirements relating to consumer privacy [68, 69].

Policy documents play an important role in identifying policy compliance requirements. A policy compliance requirement is a requirement that must be met in order for an organization to comply with its policy documents. These requirements can be traced to the policy documents from which they were derived or extracted. Policy compliance requirements can be system-related or operational (system-related to business practices). Herein, we use “compliance requirement” to refer to policy compliance requirements.

1.1.2 Data Use Agreements


whom it will be disclosed [104]. Regulations adopted pursuant to the United States Health Insurance Portability and Accountability Act (HIPAA)¹ specify that a DUA must exist when certain protected health information is used or disclosed. In the context of the HIPAA, DUAs are critical, legally binding agreements that must be examined by requirements engineers in order to develop compliant software.

Under 45 C.F.R. §164.514(e) (part of the HIPAA Privacy Rule), data use agreements must be made between covered entities and limited data set recipients if a limited data set (LDS) is being used or disclosed. The LDS can be used or disclosed only for the following purposes: research, public health, or health care operations. DUAs must contain certain information, described concisely by the National Institutes of Health (NIH) in Figure 1.1.

The Privacy Rule requires a data use agreement to contain the following provisions:

- Specific permitted uses and disclosures of the limited data set by the recipient consistent with the purpose for which it was disclosed (a data use agreement cannot authorize the recipient to use or further disclose the information in a way that, if done by the covered entity, would violate the Privacy Rule).

- Identify who is permitted to use or receive the limited data set.

- Stipulations that the recipient will

  - Not use or disclose the information other than permitted by the agreement or otherwise required by law.

  - Use appropriate safeguards to prevent the use or disclosure of the information, except as provided for in the agreement, and require the recipient to report to the covered entity any uses or disclosures in violation of the agreement of which the recipient becomes aware.

  - Hold any agent of the recipient (including subcontractors) to the standards, restrictions, and conditions stated in the data use agreement with respect to the information.

  - Not identify the information or contact the individuals.

Figure 1.1: NIH Summary of the HIPAA §164.514(e)(4) [104]

An LDS is not the full data set of protected health information (PHI), nor does it qualify as de-identified health information² under the HIPAA. An LDS instead “refers to PHI that excludes 16 categories of direct identifiers and may be used or disclosed, for purposes of research, public health, or health care operations, without obtaining either an individual’s Authorization or a waiver or an alteration of Authorization for its use and disclosure, with a data use agreement” [104]. Specifically, an LDS excludes the identifiers listed in Figure 1.2 from the PHI.

¹ Pub. L. No. 104-191, 110 Stat. 1936 (1996).

- Names

- Postal address information, other than town or city, State, and zip code

- Telephone numbers

- Fax numbers

- Electronic mail addresses

- Social security numbers

- Medical record numbers

- Health plan beneficiary numbers

- Account numbers

- Certificate/license numbers

- Vehicle identifiers and serial numbers, including license plate numbers

- Device identifiers and serial numbers

- Web Universal Resource Locators (URLs)

- Internet Protocol (IP) address numbers

- Biometric identifiers, including finger and voice prints

- Full face photographic images and any comparable images

Figure 1.2: Excluded identifiers in an LDS: 45 C.F.R. §164.514(e)(2)

Simply reading the HIPAA Privacy Rule, requirements engineers would not know whether they need to care about DUAs. Given the existence of a DUA, however, a requirements engineer knows that the system must support the exchange of an LDS, along with the other terms specified in the legally binding DUA.
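To make the exclusion list in Figure 1.2 concrete, the following is a minimal sketch of how a system exchanging an LDS might drop excluded fields from a record. The field names, the `to_limited_data_set` function, and the flat record layout are hypothetical and chosen only for illustration; they are not part of the HIPAA Privacy Rule or of CPR analysis, and the sketch is not a legal compliance check.

```python
# Hypothetical field names, one per excluded identifier category in Figure 1.2.
# "street_address" stands in for postal address information; town or city,
# State, and zip code are not excluded, so those fields are kept.
EXCLUDED_IDENTIFIERS = frozenset({
    "name",
    "street_address",
    "telephone_number",
    "fax_number",
    "email_address",
    "social_security_number",
    "medical_record_number",
    "health_plan_beneficiary_number",
    "account_number",
    "certificate_license_number",
    "vehicle_identifier",
    "device_identifier",
    "url",
    "ip_address",
    "biometric_identifier",
    "full_face_photo",
})


def to_limited_data_set(record: dict) -> dict:
    """Return a copy of the record without the excluded identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in EXCLUDED_IDENTIFIERS
    }


# Illustrative record; all values are made up.
record = {
    "name": "Pat Example",
    "street_address": "1 Main St",
    "city": "Raleigh",
    "state": "NC",
    "zip_code": "27695",
    "ip_address": "192.0.2.1",
    "diagnosis_code": "J10",
}
lds_record = to_limited_data_set(record)
print(sorted(lds_record))
```

A filter of this kind covers only the identifier exclusions; the permitted purposes and recipient stipulations in the DUA itself would still have to be specified as separate requirements.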

As a result of the 2009 Health Information Technology for Economic and Clinical Health (HITECH) Act,³ which amended the HIPAA, KPMG will be auditing 150 covered entities to assess their compliance [59, 128]. During the audits, KPMG will check that the covered entities comply with HIPAA and any of their documents, including policy documents and DUAs.

DUAs play an important role in identifying contractual compliance requirements. A contractual compliance requirement is a requirement that is agreed upon by two (or more)⁴ parties and is enforceable by law.

³ Pub. L. No. 111-5, tit. XIII, 123 Stat. 115, 226 (2009).


1.2 Related Work

We now examine areas of related work for CPR analysis, including: policies and requirements; goal-based analysis; traceability; compliance; natural language; commitments; and user studies.

1.2.1 Policies and Requirements

Researchers have previously addressed the similarities and differences between policies and software requirements [21]. Policies and software requirements both convey intended outcomes, rather than fact. Policies, however, are broader in scope than software requirements. Whereas policies can govern multiple systems, software requirements are typically specified for one system. Because of the similarities between requirements and policies, policies are amenable—indeed, even well suited—to analysis using traditional requirements engineering techniques [23]. We derive software requirements from policy documents and DUAs that govern systems or websites.

1.2.2 Goal-Based Analysis

There are several goal-based approaches within requirements engineering [18, 20, 22, 24, 31, 35, 36, 56, 73, 74, 135]. Dardenne et al. developed a goal-directed approach for requirements acquisition, Knowledge Acquisition in autOmated Specification (KAOS) [56]. The approach includes three modeling levels: the domain-independent meta level, the domain level, and the instance level. Yu’s i* framework models goals and other intentional elements (tasks, resources, and softgoals) with the relationship between actors and intentional elements [135]. Giorgini et al.’s Secure Tropos framework supports security requirements modeling and analysis in terms of three relationships (ownership, trust, and delegation) between actors and services (goals, tasks, or resources) [74].

The Goal-Based Requirements Analysis Method (GBRAM) aids requirements engineers in identifying goals from requirements sources [18]. Applying the GBRAM results in goals that are operationalized as requirements using schema templates. Antón and Earp extended the GBRAM to support goal-mining (the extraction of pre-requirements goals from post-requirements text artifacts) to derive privacy- and security-related goals from requirements source documents [20]. Goal-mining has been successfully applied to analyze e-commerce, financial, and healthcare privacy policies [20, 22, 24]. Goal-mining is the goal-based approach against which we compared CPR analysis with respect to the ability of each approach to ensure compliance, as discussed in Chapter 5.


describe instances when consumer privacy may be threatened [20]. Antón and Earp developed the taxonomy primarily to aid requirements engineers in ensuring better coverage of privacy requirements by identifying vulnerabilities that new requirements need to mitigate. In contrast, CPR analysis focuses on aiding organizations in ensuring that they do what their policy documents and DUAs say they do, thus helping organizations to avoid “unfair or deceptive” practices [70], thereby ensuring compliance.

Breaux et al. proposed Semantic Parameterization, a process for representing domain descriptions in first-order predicate logic [35]. They also introduced Knowledge Transformation Language (KTL), a context-free grammar, to analyze the most frequently expressed goals in over 100 online policy documents to derive semantic models [31, 36]. Commitments, rights, and privileges are richer than goals in that they preserve more context from the documents, as we discuss in Chapter 5. Goals examine targets of achievement whereas commitments, privileges, and rights address actions that are required or entitled.

1.2.3 Traceability

Traceability is critical in requirements engineering to link the requirements for the system and the other important entities of the software [54]. To establish software compliance with all governing legal texts, traceability from relevant regulations to requirements specifications is essential [33, 109]. The traceability helps to demonstrate due diligence in a court of law [94]. Regulatory compliance has been examined using automated traceability links [26, 55]. Our work supports both forward and backward traceability across policy documents, DUAs, commitments, privileges, rights, and requirements through documentation.

1.2.4 Compliance

Requirements engineering researchers have investigated software compliance [34, 73, 94, 95, 115]. Robinson identified a need for software requirements to comply with policies and developed a framework, ReqMon, to monitor software requirements at runtime [115]. His approach focused on runtime compliance with system requirements, whereas we focus on the initial derivation of compliance requirements. Ghanavati et al. introduced an approach to track legal compliance; they connected models of business processes with legislation [73]. They used a goal-based approach, whereas we employ a CPR-based approach. They focused on checking compliance through their use of links, while we build in compliance during the requirements activities.


demonstrated improved compliance via improved security and privacy requirements for the system [94]. Maxwell and Antón developed a production rule model of the HIPAA Privacy Rule that enables analysts to query whether their requirements are compliant with the regulations [95]. It facilitates communication between requirements engineers and legal domain experts [95].

Breaux et al. performed a comparative evaluation between legal and product requirements based on Section 508 of the Rehabilitation Act Amendments of 1998,⁵ called the Accessibility Standards [34]. They compared the two requirements sets with respect to qualitative statement and phrase metrics. Statement metrics determine whether: requirements are equivalent; one requirement describes why the other should be implemented; or one requirement describes how the other should be implemented. Phrase metrics determine whether: concepts are generalized; concepts are refined; or the modality changes. Their study demonstrated limitations in existing requirements acquisition methods. Their approach identified compliance gaps between previously specified product requirements and the Accessibility Standards. They also identified additional knowledge sources to help refine legal requirements into product requirements that comply with the law.

Rather than checking the legal compliance of existing requirements or comparing two sets of requirements, within our comparison study (Chapter 5), we compare the results of two requirements engineering approaches—both of which extract requirements artifacts (CPRs and goals) from policy documents—for their ability to ensure compliance with the associated policy documents.

1.2.5 Natural Language

Berry et al. examined ambiguities in natural language software requirements specifications and legal contracts that arise from the use of ambiguous words, such as ‘all’, ‘each’, ‘every’, ‘and’, and ‘or’ [28, 29]. They discussed how to address these types of ambiguities in natural language and how to avoid them. For example, it can be ambiguous whether ‘or’ refers to the inclusive OR or the exclusive OR; to avoid this ambiguity, they suggest explicitly stating either inclusive OR or exclusive OR. As requirements engineers extract commitments, privileges, and rights from natural language documents, they should be aware of ambiguities that may exist and know how to handle them.

Goldin and Berry’s AbstFinder tool can be employed to identify abstractions in natural language documents [79]. The abstractions express a document’s main ideas by ignoring the details during requirements elicitation for negotiation with the customer. AbstFinder allows phrases to contain any number of words that can be spaced throughout the sentence in any order without concern for semantics. We are concerned with the specificity of the documents that we analyze using CPR analysis. Our CPR analysis heuristics employ natural language patterns or phrases. These phrases differ from the phrases used within AbstFinder because the word ordering and context are important for compliance requirements.

Abbott transformed natural language descriptions into computer programs by identifying patterns and parts of speech [1]. He linked descriptive noun phrases to a program’s data types and objects. Similarly, our CPR heuristics (Chapter 3) help to identify attributes by asking questions that often relate to parts of speech.

1.2.6 Commitments

Singh employs social commitments between agents for multi-party agreements [120]. He defines a commitment as follows [120]:

A commitment is a four-place relation involving a proposition (p) and three agents (x, y, and G). Let c = C(x, y, G, p) denote a commitment from x toward y in the context of G and for the proposition p. Then, x is the debtor, y the creditor, G the context group, and p the discharge condition of commitment c.

In contrast, our multi-party agreements exist between the organization, expressing its practices in policies, and the user, interacting with the organization’s policy-governed system, or between the data custodian and recipient for data use agreements. We specify the commitments, privileges, and rights in terms of actions instead of conditions because as requirements engineers, we want to know the actions that parties can or will do within the system. We discuss Singh’s definitions more in Section 2.1.3, where we discuss the justification for our terminology definitions.
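The contrast between Singh's condition-based commitment and the action-based form used here can be sketched as two small record types. This is only an illustration under stated assumptions: the class names, field names, and the example values are hypothetical and appear neither in Singh's work nor in the CPR methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocialCommitment:
    """Singh-style commitment c = C(x, y, G, p), stated as a condition."""
    debtor: str       # x, the agent who is committed
    creditor: str     # y, the agent who receives the commitment
    context: str      # G, the context group
    proposition: str  # p, the discharge condition


@dataclass(frozen=True)
class ActionCommitment:
    """Action-based commitment: an action that Party A pledges to Party B."""
    party_a: str
    party_b: str
    action: str


def as_action_commitment(c: SocialCommitment, action: str) -> ActionCommitment:
    # The debtor plays the role of Party A and the creditor that of Party B.
    # The caller supplies the action whose completion makes c.proposition hold.
    return ActionCommitment(party_a=c.debtor, party_b=c.creditor, action=action)


# Hypothetical example values, chosen only for illustration.
singh_style = SocialCommitment(
    debtor="organization",
    creditor="user",
    context="privacy notice",
    proposition="notice has been sent to the user",
)
cpr_style = as_action_commitment(singh_style, action="send notice to user")
```

The context group G has no counterpart in the action-based record, which is one reason the two forms are shown as separate types rather than one.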

Haddadi examined commitments as they relate to agents in order to form a “potential cooperation,” where agents work together to achieve a goal [84]. Haddadi discussed formation conditions or the conditions under which the commitment is formed. In our work, formation conditions relate to the rationale of the classifiable item.

1.2.7 User Studies

Antón conducted an empirical study in which she compared the GBRAM to alternative analysis methods: Object Modeling Technique (OMT) and non-method-assisted (control) [18]. Her hypothesis was that analysts performing GBRAM are better able to identify requirements than analysts using the alternative analysis methods. Antón examined the total number of requirements and total number of critical requirements (requirements that, if not met, cause the system to fail) produced by the subjects. GBRAM subjects produced significantly more critical requirements than the OMT and control subjects. The user study discussed in Chapter 6 is structured in a similar way.

Breaux completed a user study to examine the effect of his upper ontology and phrase heuristics in legal requirements acquisition by measuring completeness and consistency [30]. The requirements produced by the participants were compared against a set of expected legal requirements to determine the number of expected legal requirements produced by the participants using a set of qualitative metrics—statement and phrase. Breaux found that participants using the ontology performed better in terms of completeness and consistency than participants performing traditional practices. In our user study, we compare the subject-produced requirements against a set of expected compliance requirements (Chapter 6).

1.3 Overview of Remaining Chapters


Chapter 2

CPR Analysis Theory and Methodology

I meant what I said

And I said what I meant...

An elephant’s faithful

One hundred per cent!

- Dr. Seuss, Horton Hatches the Egg

The objective of our work is to develop an approach to analyze policy documents and data use agreements (DUAs) for requirements. Initially we attempted to extract rights and obligations from policy documents based on Breaux et al.’s work with legal texts, where a right is “an action that a stakeholder is conditionally permitted to perform” and an obligation is “an action that a stakeholder is conditionally required to perform” [36]. During this initial analysis, we discovered that Breaux et al.’s classifications (rights and obligations) were insufficient to characterize the types of items contained within policy documents that are important for requirements. Many of the items within the policy documents are actually pledges, or commitments from the organization to the user, rather than rights and obligations as found within legal texts. For example, many statements contained the phrase “we will,” which is used to express a commitment to a position or action [38].


developed new theory based on scientific analysis. Requirements engineering researchers have successfully relied on grounded theory approaches in their research of legal- and policy-based software requirements [20, 33, 35, 36, 64].

By applying a grounded theory approach to analyze policy documents and DUAs, we developed our theory of commitments, privileges, and rights (the classifications discovered through analyzing these documents) and our methodology (how to extract CPRs from policy documents [132, 134] and DUAs [119] and operationalize them as requirements). In our work, commitments reflect an actor’s pledges, whereas privileges and rights reflect actions that an actor is entitled to perform. In Section 2.1.1, we provide full definitions and discuss how commitments, privileges, and rights differ from one another.

When developing our theory and methodology, we completed two types of multi-case studies, formative and summative, which we discuss in Chapter 4. The formative multi-case study was conducted to form ideas and gain context. The summative multi-case study was conducted to validate the theory and methodology. The cases within the studies relied on two types of documents: policy documents and DUAs. During our formative study, we examined each available document three times. First, we examined the documents to understand the type of information they contained: pledges and entitlements. This analysis yielded three classifications: commitments, privileges, and rights. During the second pass through the documents, we examined the natural language phrases that signaled each classification. This list of natural language phrases formed the basis for our classifying heuristics discussed in Chapter 3 [119, 132]. During the final pass through the documents, we applied our heuristics to confirm that each item could be classified using the heuristics.

For the remainder of this chapter we discuss our theory of CPRs and the CPR analysis methodology.

2.1 Theory of Commitments, Privileges, and Rights

Our theory contains three classifications: commitment, privilege, and right, which are defined in Section 2.1.1. Commitments are pledges; privileges and rights are entitlements. Each CPR has a rationale, either internal or external; both are defined in Section 2.1.2. In Section 2.1.3, we compare CPR terminology and definitions with terminology and definitions from other researchers in the following areas: legal concepts, legal-based requirements engineering, and multi-agent agreements.

2.1.1 Classification Terminology


• A commitment of Party A with respect to Party B is an action, a, that Party A pledges to Party B.

Consider the following item from a health system prototype DUA:

Moreover, all access to the system and its data will be stored in a secure log file for auditing.

The item contains a commitment:¹

store all access to the system and its data in a secure log file for auditing

• A privilege of Party A with respect to Party B is an action, a, that Party A is entitled to perform, where a is an independent action. An independent action is one which has no complementary action² from another party (Party B) that must be completed by Party B in order for the action to take place. In other words, Party A has the privilege of performing a only if performing a does not imply that a commitment from Party B to Party A must exist.

Consider the following item from Pfizer Privacy [111]:

Generally, you can browse our Web sites without disclosing any personally identifiable information.

The item contains a privilege:

browse the organization’s websites without disclosing any personally identifiable information

The item is a privilege rather than a right because the action browse is an independent action that does not have a complementary action that must be completed by another party.

• A right of Party A with respect to Party B is an action, a, that Party A is entitled to perform, where a is a dependent action. A dependent action, a, is one which has a complementary action, b, from another party that must take place in order for a to take place. In other words, Party A has a right only if performing a implies that a commitment from Party B to Party A to perform b exists.

Consider the following item from Aetna Privacy Notices [12]:

¹ The item is classified as a commitment using CH_DUA 6 on page 39.

² Other researchers have also used complementary actions [25, 102, 103].


The Health Insurance Portability and Accountability Act of 1996 (HIPAA) Privacy Rule allows members the right to receive a notice that describes how individual health information may be used and/or disclosed and how to acquire access to this information.

The item contains a right that derives from legal text and therefore has an external rationale:

receive notice in order to describe how individual health information may be used and/or disclosed and how to acquire access to this information

The complementary action and commitment of the organization is the following because in order for the user to receive notice, the organization must send notice:

send notice to user in order to describe how individual health information may be used and/or disclosed and how to acquire access to this information

In software systems, it is important to distinguish between privileges and rights. If Party A has a right, then the system not only needs to incorporate the action, a, of Party A’s right but also the action, b, of Party B’s implied commitment that exists because a is a dependent action. The inclusion of b within the system is important because without it, Party A will not be able to perform a. For example, receive notice is a right rather than a privilege because if the user is entitled to receive a notice, it is dependent on the organization sending notice. Note that the action of the implied commitment, send, is the complementary action of receive, the action of the right. Given a privilege, the system only needs to incorporate the action, a, of the privilege without concern for actions of another party because a is independent and does not require the action of another party. An example privilege of the organization is change the terms of policy. This is a privilege because it expresses an action, changing, that the organization is entitled to perform, but that does not imply an action on the user, or other actor, because the action is independent.
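The distinction between privileges and rights can be sketched in a few lines of Python. This is a minimal illustration, assuming that the requirements engineer has already decided whether an entitled action has a complementary action; the type and function names are hypothetical and are not part of the CPR heuristics.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entitlement:
    party_a: str                                # party entitled to perform the action
    party_b: str                                # the other party
    action: str                                 # the entitled action, a
    complementary_action: Optional[str] = None  # action b required from Party B, if any


def classify(e: Entitlement) -> str:
    # A dependent action (one with a complementary action) makes the item a right;
    # an independent action makes it a privilege.
    return "right" if e.complementary_action is not None else "privilege"


def implied_commitment(e: Entitlement) -> Optional[Tuple[str, str, str]]:
    # A right implies a commitment from Party B to Party A to perform b.
    # A privilege implies nothing for the other party.
    if e.complementary_action is None:
        return None
    return (e.party_b, e.party_a, e.complementary_action)


receive_notice = Entitlement("user", "organization", "receive notice", "send notice")
browse = Entitlement("user", "organization", "browse the organization's websites")
change_terms = Entitlement("organization", "user", "change the terms of policy")
```

Under this sketch, a system built from `receive_notice` must implement both the user's action and the organization's `send notice` commitment, whereas `browse` and `change_terms` add only their own action.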

CPR analysis provides requirements engineers with an understanding of the requirements contained within policy documents and DUAs by focusing on pledges and entitlements, which are important because the pledged and entitled actions should be required and allowed, respectively, by the organization and within the system.

2.1.2 Rationale Terminology

Each classifiable item has a rationale, or basis for the classifiable item being included within the document; specifically:


• A classifiable item has an internal rationale if it delineates organizational practices or procedures as the basis for the item.

• A classifiable item has an external rationale if it delineates legal texts as the basis for the item.

2.1.3 Terminology Comparison

We now compare our terminology and definitions to those of legal concepts, legal-based requirements engineering, and multi-agent agreements. Hohfeld, who was a law professor early in the twentieth century, proposed eight fundamental legal concepts that describe legal relations: duty, right, privilege, no-right, power, liability, immunity, and disability [87]; however, herein, we focus on the three concepts that are relevant to our analysis of policy documents and DUAs: duty, right, and privilege. These correspond to our commitment, right, and privilege, respectively. Breaux et al. presented a legal-based approach to analyzing legal texts for software requirements [30, 36]. As previously discussed, originally we attempted to apply Breaux et al.’s legal-based approach [36] to policy documents but discovered that their classifications were insufficient to characterize the kinds of items within policy documents, which are inherently different from the kinds of items found in legal documents. Singh discusses agreements within multi-agent systems, employing social commitments, privileges, and rights between agents [120].

Commitment

First, we examine the terminology and definitions of legal concepts, legal-based requirements engineering, and multi-agent agreements that are relevant to our commitment classification.

Hohfeld’s duty and Breaux’s obligation legally require the actor to perform the action. Hohfeld defines a duty as follows [87]:

A duty or a legal obligation is that which one ought or ought not to do.

Breaux defines an obligation as follows [30]:

Obligation – an act that an actor is required to perform.


Singh employs social commitments between agents within multi-agent systems [120]. Singh defines a commitment as follows [120]:

A commitment is a four-place relation involving a proposition (p) and three agents (x, y, and G). Let c = C(x, y, G, p) denote a commitment from x toward y in the context of G and for the proposition p. Then, x is the debtor, y the creditor, G the context group, and p the discharge condition of commitment c.

Singh’s commitment contains two parties: debtor and creditor. “[T]he debtor is the agent who is committed, and the creditor is the agent who receives the commitment” [120]. In CPR analysis, we do not use ‘debtor’ and ‘creditor’: although the terms can have meanings outside of the financial domain, we use Party A and Party B to denote the parties more generally. Party A and Party B within our work relate to Singh’s debtor and creditor, respectively. Hohfeld and Breaux do not name the parties associated with duties and obligations.

Like our definitions, Hohfeld’s and Breaux’s definitions are stated in terms of actions that take place. Singh’s definitions are stated in terms of conditions. We specify commitments, privileges, and rights in terms of actions instead of conditions, because as requirements engineers, we want to know the actions that parties can or will do within the system [30, 36, 96, 97, 121]. Considering the condition that holds exactly when a given action has been completed, one should be able to switch between the use of conditions and actions.

Singh discusses operations on commitments: create, discharge, cancel, release, delegate, and assign [120]. Given our concern with compliance, it is important to note that if the policy document or DUA changes, the commitments also change. Similarly, within our work most of the actions are ones that the actors will complete multiple times, not just once; therefore we do not have discharge conditions. For these reasons, we do not perform operations on our commitments.

Right

We examine the terminology and definitions of legal concepts, legal-based requirements engineering, and multi-agent agreements that compare to our right classification. Like Breaux’s permission, our right is an action that the actor is entitled or permitted to perform. Breaux defines permission as follows [30]:

Permission – an act that an actor is permitted to perform.

Our right is a dependent action, a, that a party is entitled to perform that has a complementary action from another party that must be completed; this complementary action relates to a commitment from another party. Hohfeld’s right and Singh’s claim place a claim against another party; in other words, they place a duty on another party. Therefore, they are similar to our right in that they are actions that place actions on another party. Hohfeld defines right as follows [87]:

A right is one’s affirmative claim against another

Singh defines claim as follows [120]:

A claim or right is what an agent can demand from another. It is like a commitment with respect to the relevant context, which is not made explicit by Hohfeld. Thus, we have Claim(x, y, p) ≜ C(y, x, G, p).

Hohfeld’s right and duty are correlatives, which means the legal relations are equivalent. For example, “if X has a right against Y that he shall stay off the former’s land, the correlative (and equivalent) is that Y is under a duty toward X to stay off the place” [87]. A right, by our definition, does have an implied commitment for another party based on the dependent action. However, commitments do not always provide others with rights; instead, they can simply say what Party A will do without entitling Party B to do anything. For example, the organization’s commitment to comply with law does not provide the user with a right. Breaux and Antón discussed the need for implied commitments with rights, and why implied rights are unnecessary, as follows [32]:

Rights without complementary obligations are meaningless since governed parties are not required to respond to the invocation of such rights. In terms of designing and engineering software systems, these rights may be effectively ignored. On the other hand, obligations without an explicit and complementary right do have value and must be properly incorporated into system specifications.

Based on Breaux and Antón’s discussion of implied rights and because commitments do not always provide others with rights, we do not document implied rights within our work.

Privilege

We examine the terminology and definitions of legal concepts, legal-based requirements engineering, and multi-agent agreements with respect to our privilege classification.


Hohfeld defines privilege as follows [87]:

A privilege is one’s freedom from the right or claim of another

Singh defines privilege as follows [120]:

A privilege is a freedom an agent has from claims of another. In other words, it is an absence of a duty to refrain from the given act. In this sense, a privilege is the dual of a claim with the roles of the agents reversed, i.e., Priv(x, y, p) ≜ ¬Claim(y, x, ¬p).

Although our privileges are classified using a different definition than Hohfeld’s and Singh’s, our privileges would satisfy their definitions. We can show this by contradiction. To do this, we assume that there exists an action a that Party A has the privilege to perform and Party A has the commitment not to perform a. It is straightforward to see how this would be a contradiction of both Hohfeld’s and Singh’s definitions of privilege. For this situation to be possible according to our definitions, one of two cases must exist: either Party A has committed to Party B not to perform a, or Party B has the right to perform the complementary action of not performing a. The first case would be a conflict in the policy document or DUA because the document would state that Party A would have the privilege of performing a and the commitment not to perform a. In our studies we have not found an example of such a conflict, and thus can assume that such conflicts do not exist. The second case also cannot exist because it requires a complementary action for a, which by our definition of privilege cannot exist. Thus, according to our definition of privilege, if Party A has the privilege to perform a, Party A will not have the commitment not to perform a, which agrees with Hohfeld’s and Singh’s definitions of privilege.
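The conclusion of this argument can be checked against Singh's two formulas by substitution. The short derivation below is a sketch that assumes the connective in both of Singh's definitions reads "is defined as"; it uses only the symbols that appear in the quoted definitions and needs the amsmath and amssymb packages.

```latex
\begin{align*}
\mathrm{Claim}(x, y, p) &\triangleq C(y, x, G, p)
  && \text{(Singh's definition of claim)} \\
\mathrm{Claim}(y, x, \neg p) &\triangleq C(x, y, G, \neg p)
  && \text{(swap $x$ and $y$, replace $p$ by $\neg p$)} \\
\mathrm{Priv}(x, y, p) &\triangleq \neg\,\mathrm{Claim}(y, x, \neg p)
  = \neg\, C(x, y, G, \neg p)
  && \text{(Singh's definition of privilege)}
\end{align*}
```

Read with x as Party A and y as Party B, the last line says that A has a privilege toward B exactly when A has no commitment to B to refrain from the act, which is the property established above for our privileges.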

Breaux defines exclusion as follows [30]:

Exclusion – an act that an actor has no express permission to perform or that an actor is not expressly required or prohibited from performing.

Breaux also states that an exclusion is “Any action that an actor is not permitted, required, or prohibited from performing” [30]. In contrast, our privilege is an independent action that the actor is entitled, or permitted, to perform.

Summary of Comparison


Our commitments could be considered equivalent to Singh’s. Our commitments with external rationale match Hohfeld’s duty and Breaux’s obligation. Our privileges and rights would also be classified as privileges and rights, respectively, under Hohfeld’s and Singh’s definitions; however, not all of Hohfeld’s and Singh’s privileges and rights would be classified as privileges and rights, respectively, under our definitions.

2.2 CPR Analysis Methodology

CPR analysis employs a document as the input and produces a set of requirements as the output. The methodology consists of three steps (see Figure 2.1): (1) parse; (2) classify; and (3) operationalize.

2.2.1 Parse

The requirements engineer parses the source document into individual statements so that each statement may be independently analyzed. The documents are generally formatted as paragraphs and/or lists.

When considering paragraphs, the requirements engineer parses each sentence as a separate statement. To parse the documents based on sentences, we apply Manning and Schütze’s algorithm to identify the sentence boundaries [92]. We tentatively place sentence boundaries after all periods and question marks; boundaries are disqualified if the period follows an abbreviation or if the period or question mark is part of a URL or email address [92]. Each section heading is also parsed as a separate statement.

A continuation is a statement that begins in the fragment before a list and ends within a list item [30]. Herein, the fragment before a list is prepended to the beginning of each list item, forming statements with the corresponding sentences.

2.2.2 Classify

The classify step contains multiple substeps. First, the requirements engineer distinguishes between classifiable and unclassifiable items. Second, the requirements engineer classifies classifiable items as commitments, privileges, or rights. Third, the requirements engineer documents attributes for each classifiable item. Fourth, the requirements engineer documents the rationale of each classifiable item. Finally, the requirements engineer forms the commitments, privileges, and rights.

The requirements engineer examines each statement to determine whether it contains any


requirements or compliance—for example, if the statements are section headings, definitions, clarifications, or warnings; we refer to these as unclassifiable items and do not operationalize them as requirements. Other statements may contain multiple CPRs—if they contain multiple actions, conditions, and/or purposes.

The types of items within a document can be seen in Figure 2.2. Each item within the document is either classifiable or unclassifiable. A classifiable item is either a pledge or an entitlement. A pledge is a commitment, whereas an entitlement is either a privilege or a right. Because our theory and methodology were developed using a grounded theory approach, we know that all items that are important for requirements and compliance are classifiable—commitments, privileges, and rights; all other items are unclassifiable as they are not important for requirements and compliance.
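The item hierarchy described for Figure 2.2 can be sketched as a small enumeration. The names below (`ItemType`, `category`, `items_to_operationalize`) are hypothetical and serve only to illustrate the hierarchy; the classification labels in the example are assigned by hand rather than by the heuristics.

```python
from enum import Enum
from typing import List, Optional, Tuple


class ItemType(Enum):
    COMMITMENT = "commitment"
    PRIVILEGE = "privilege"
    RIGHT = "right"
    UNCLASSIFIABLE = "unclassifiable"


PLEDGES = {ItemType.COMMITMENT}
ENTITLEMENTS = {ItemType.PRIVILEGE, ItemType.RIGHT}


def is_classifiable(item_type: ItemType) -> bool:
    # A classifiable item is either a pledge or an entitlement.
    return item_type in PLEDGES or item_type in ENTITLEMENTS


def category(item_type: ItemType) -> Optional[str]:
    if item_type in PLEDGES:
        return "pledge"
    if item_type in ENTITLEMENTS:
        return "entitlement"
    return None  # unclassifiable items belong to neither category


def items_to_operationalize(items: List[Tuple[str, ItemType]]) -> List[str]:
    # Only classifiable items are carried forward to become requirements.
    return [text for text, item_type in items if is_classifiable(item_type)]


labeled = [
    ("A. Information Collected and Used by Dossia", ItemType.UNCLASSIFIABLE),
    ("browse the organization's websites", ItemType.PRIVILEGE),
    ("store all access to the system and its data in a secure log file", ItemType.COMMITMENT),
]
```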

By classifying the items, the requirements engineer can gain a greater understanding of the types of items that are expressed within policy documents and DUAs. The requirements engineer classifies each classifiable item as either a commitment, a privilege, or a right through the use of natural language patterns and heuristics, Classifying Heuristics (CH) given in Section 3.2.3. The requirements engineer also documents the attributes and rationale of each classifiable item using natural language patterns and heuristics, Documenting Attribute Heuristics (DAH) and Documenting Rationale Heuristics (DRH) given in Sections 3.2.4 and 3.2.5.

Figure 2.2: Types of Items within a Document


• The actor is the responsible stakeholder who performs the given action.

• The action is the action (verb) that the actor performs.

• The object is the item upon which the actor’s action is acting.

• The object’s source is the originator of the object.

• The target is the intended recipient of the actor’s action.

• The purpose is the reason for performing the action.

• The condition is the restriction or pre-condition on performing the action.

• The examples, or scenarios, illustrate how commitments, privileges, and rights are executed.

Values for these attributes are found by asking questions that relate to the definition of each attribute [114]. It is possible that some attributes may not have values for a given item. When an item does not include certain attributes or there are values that may be inferred, the item should be discussed with the organization to find out what is meant by, and ideally to clarify, the policy document or DUA.

To form the CPRs, we use the CPR template (FH 1 on page 49) that incorporates the identified attributes, which we documented. If an attribute does not have a value, the related portion is omitted within the template. All of the attributes are used within the templates with the exception of examples. Software engineers gain a stronger understanding of the requirements by using these examples as instances of when and/or how the CPR expressed by the requirement is used within the system.
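The attribute list and the rule that missing attributes are omitted can be sketched as follows. This is not the CPR template FH 1, which is defined elsewhere; the connecting words used in `form` ("from", "to", "in order to", "if") are assumptions inferred from the worded examples in Section 2.1.1, and the record and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClassifiableItem:
    actor: str                       # responsible stakeholder who performs the action
    action: str                      # the action (verb) that the actor performs
    obj: Optional[str] = None        # item upon which the action acts
    source: Optional[str] = None     # originator of the object
    target: Optional[str] = None     # intended recipient of the action
    purpose: Optional[str] = None    # reason for performing the action
    condition: Optional[str] = None  # restriction or pre-condition
    examples: List[str] = field(default_factory=list)  # scenarios; not used in the wording


def form(item: ClassifiableItem) -> str:
    """Word an item from its attributes, omitting any attribute without a value."""
    parts = [item.action]
    if item.obj:
        parts.append(item.obj)
    if item.source:
        parts.append(f"from {item.source}")
    if item.target:
        parts.append(f"to {item.target}")
    if item.purpose:
        parts.append(f"in order to {item.purpose}")
    if item.condition:
        parts.append(f"if {item.condition}")
    return " ".join(parts)


send_notice = ClassifiableItem(
    actor="organization",
    action="send",
    obj="notice",
    target="user",
    purpose=("describe how individual health information may be used and/or "
             "disclosed and how to acquire access to this information"),
)
```

Keeping `examples` outside the wording mirrors the statement that examples are the one attribute not used within the template.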

2.2.3 Operationalize

The requirements engineer operationalizes the CPRs into requirements using templates. Operationalize refers to translating classifiable items into requirements. The unclassifiable items are not operationalized into requirements. Others have used templates when operationalizing items into requirements [18]. This final step takes classifiable items as input and produces a set of requirements.

CPR analysis produces two types of requirements: system and operational. System requirements specify the capabilities of the system or software. Operational requirements specify “business rules or operational procedures” [20] outside of the system. We employ templates for


2.3 Summary


Chapter 3

CPR Analysis Heuristics

Penny: You’ll never guess what just happened!

Leonard: Oh, I give up!

Sheldon: I don’t guess! As a scientist, I reach conclusions based on observations and experimentation.

- The Big Bang Theory

In this chapter, we discuss the heuristics employed to perform CPR analysis. The heuristics are rules that guide requirements engineers conducting CPR analysis. We organize this chapter into sections based upon the three steps of CPR analysis: (1) parse; (2) classify; and (3) operationalize. Each section discusses the heuristics employed for that step. Where separate heuristics exist for policy documents and data use agreements (DUAs), we note these heuristics with the subscripts PD and DUA, respectively.

3.1 Parse


Given a paragraph of the document, complete the following steps to parse it into sentences:

1. Tentatively place sentence boundaries after all periods and question marks.

2. Disqualify boundaries if the period follows an abbreviation.

3. Disqualify boundaries if the period or question mark is part of a URL or email address.

The result will be a set of statements based on sentences.

PH 1: Parse Paragraph

Consider the following paragraph from Dossia’s Privacy Statement [60]:

Dossia collects the personal information you voluntarily enter into the Dossia website, including the health information you enter, or authorize others to enter, into your Personally-Controlled Health Record (PCHR). Dossia protects the privacy and security of this information as described in this Privacy Statement and uses this information to provide you with your PCHR and associated services. Except for any narrow exceptions explained in this Privacy Statement, Dossia will not disclose information in PCHRs to third parties without your explicit permission.

Using PH 1 the paragraph is parsed into three statements based on the three sentences.

1. Dossia collects the personal information you voluntarily enter into the Dossia website, including the health information you enter, or authorize others to enter, into your Personally-Controlled Health Record (PCHR).

2. Dossia protects the privacy and security of this information as described in this Privacy Statement and uses this information to provide you with your PCHR and associated services.

3. Except for any narrow exceptions explained in this Privacy Statement, Dossia will not disclose information in PCHRs to third parties without your explicit permission.
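The three steps of PH 1 can be sketched in Python as follows. This is a simplified illustration, not the algorithm of [92]: the abbreviation list and the URL and email pattern are small assumed stand-ins, and the function names are hypothetical. Applied literally, the heuristic would also split at a decimal point or miss a sentence that ends in an abbreviation, which a fuller implementation would need to handle.

```python
import re
from typing import List, Tuple

# Illustrative abbreviation list (an assumption; a real list would be much longer).
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "inc.", "dr.", "mr.", "mrs.", "ms."}
# Rough pattern for URLs and email addresses (an assumption, not a full grammar).
URL_OR_EMAIL = re.compile(r"(?:https?://|www\.)\S+|\S+@\S+")


def protected_spans(text: str) -> List[Tuple[int, int]]:
    """Character ranges covered by URLs or email addresses."""
    spans = []
    for match in URL_OR_EMAIL.finditer(text):
        # Trailing punctuation belongs to the sentence, not to the address.
        core = match.group().rstrip(".?,;:)")
        spans.append((match.start(), match.start() + len(core)))
    return spans


def in_abbreviation(text: str, index: int) -> bool:
    """True if the period at `index` sits inside a known abbreviation."""
    start = text.rfind(" ", 0, index) + 1
    end = text.find(" ", index)
    if end == -1:
        end = len(text)
    return text[start:end].lower().strip("(),;:") in ABBREVIATIONS


def parse_paragraph(paragraph: str) -> List[str]:
    text = " ".join(paragraph.split())  # normalize line breaks and runs of spaces
    spans = protected_spans(text)
    statements, start = [], 0
    # Step 1: every period or question mark is a tentative boundary.
    for match in re.finditer(r"[.?]", text):
        i = match.start()
        if in_abbreviation(text, i):  # Step 2: abbreviation
            continue
        if any(s <= i < e for s, e in spans):  # Step 3: URL or email address
            continue
        statements.append(text[start:i + 1].strip())
        start = i + 1
    tail = text[start:].strip()
    if tail:
        statements.append(tail)
    return statements
```

For the Dossia paragraph above, `parse_paragraph` returns three statements, one per sentence.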

Policy documents and DUAs tend to also contain lists. There are two types of lists within documents: lists that contain continuations and lists that do not contain continuations. A continuation is a statement that begins in the fragment before the list and ends within a list item [30]. PH 2 addresses how lists are parsed into statements for analysis.

- For lists that do not contain continuations, parse each list item using PH 1.

- For lists that contain continuations, complete the following steps:

1. Prepend the fragment before the list to each list item.

2. Use PH 1 to parse the resulting list items.


Consider the following list from Aflac’s Privacy Policy [14]:

Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and:

1. To fulfill a transaction that you have requested on the Web Sites.

2. To service your policy.

3. To investigate or handle claims.

4. As permitted or required by law by regulatory and law enforcement authorities.

5. To enforce or apply our Terms and Conditions of Use and other agreements.

6. To protect the rights, property, or safety of Aflac, aflac.com, aflacny.com, the Web Sites’ visitors, or others.

Because the list is a continuation, using PH 2, we prepend the fragment before the list (Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and) to each list item. Then we parse each list item using PH 1, resulting in the following six statements:

1. Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and To fulfill a transaction that you have requested on the Web Sites.

2. Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and To service your policy.

3. Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and To investigate or handle claims.

4. Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and As permitted or required by law by regulatory and law enforcement authorities.

5. Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and To enforce or apply our Terms and Conditions of Use and other agreements.

6. Accordingly, we will disclose Personal Information to employees, agents, or third parties only as described herein and To protect the rights, property, or safety of Aflac, aflac.com, aflacny.com, the Web Sites’ visitors, or others.
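
The list handling in PH 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not tooling from this work: the function name `parse_list` and the `has_continuation` flag are hypothetical, and the decision about whether a list contains a continuation is assumed to be made by the requirements engineer. PH 1 itself is not implemented here; the function only produces the statements that would be handed to it.

```python
def parse_list(fragment, items, has_continuation):
    """Apply PH 2 to one list and return the statements to pass on to PH 1.

    has_continuation is the requirements engineer's judgment (an assumption
    of this sketch); nothing here detects continuations automatically.
    """
    if not has_continuation:
        # No continuation: each list item already stands on its own.
        return [item.strip() for item in items]
    # Continuation: prepend the fragment (minus its trailing colon) to each item.
    lead = fragment.strip().rstrip(":").rstrip()
    return [f"{lead} {item.strip()}" for item in items]


fragment = (
    "Accordingly, we will disclose Personal Information to employees, agents, "
    "or third parties only as described herein and:"
)
items = ["To service your policy.", "To investigate or handle claims."]
statements = parse_list(fragment, items, has_continuation=True)
```

Each resulting statement repeats the full fragment, so it can be analyzed without reference to the surrounding list.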

The final parsing heuristic is for section headings. With PH 3, a section heading is considered its own statement.

Parse each section heading as a separate statement.


Using PH 3, we parse the following section heading from Dossia [60] as its own statement:

A. Information Collected and Used by Dossia

3.2 Classify

The classify step of CPR analysis contains multiple substeps. First, the requirements engineer distinguishes between classifiable and unclassifiable items by using the Item Heuristic (IH). Second, the requirements engineer classifies each classifiable item as a commitment, privilege, or right by using the Classifying Heuristics (CH). Third, the requirements engineer documents the attributes for each classifiable item by using the Documenting Attribute Heuristics (DAH). Fourth, the requirements engineer documents the rationale of each classifiable item by using the Documenting Rationale Heuristics (DRH). Finally, the requirements engineer forms commitments, privileges, and rights by using the Forming Heuristic (FH). The requirements engineer also uses Helper Heuristics (HH) for tasks within the classify step.

3.2.1 Item Heuristic (IH)

Recall that items within a document are either classifiable or unclassifiable (Figure 2.2 on page 21). The classifiable items are classified as commitments, privileges, and rights, then operationalized as requirements. The unclassifiable items are not commitments, privileges, or rights and are not operationalized as requirements. After an item is documented as unclassifiable, no other heuristics are applied to it. Based on the grounded theory analysis that was used to develop our theory and methodology, we found that each unclassifiable item in policy documents and DUAs was either a clarification, a definition, a section heading, or a warning. We employ IH 1 to determine whether an item is classifiable or unclassifiable based on the types of items in documents that are unclassifiable.

If the item is one of the following:

- Clarification: item clarifies or explains another item.

- Definition: item defines a key word or phrase.

- Section heading: item denotes the start of a section.

- Warning: item provides a warning, advisory, or suggestion.

then document the item as unclassifiable; otherwise, the item is classifiable.


The following item from Aetna’s Notice of Privacy Practices - Strategic Resource Company (SRC) [10] is unclassifiable based on IH 1 because it is a definition for health information:

By “health information,” we mean information that identifies you and relates to your medical history (i.e., the health care you receive or the amounts paid for that care).
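
One way to record the outcome of IH 1 is sketched below in Python. The names `UnclassifiableKind` and `Item` are hypothetical and introduced only for illustration. The sketch assumes the requirements engineer decides which kind, if any, applies; the code merely stores that decision and derives whether later heuristics should be applied.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UnclassifiableKind(Enum):
    """The four kinds of unclassifiable items named in IH 1."""
    CLARIFICATION = "clarification"
    DEFINITION = "definition"
    SECTION_HEADING = "section heading"
    WARNING = "warning"


@dataclass
class Item:
    text: str
    # Set by the requirements engineer; None means no IH 1 kind applies.
    unclassifiable_kind: Optional[UnclassifiableKind] = None

    @property
    def is_classifiable(self) -> bool:
        # Only classifiable items go on to the remaining heuristics.
        return self.unclassifiable_kind is None


definition = Item(
    'By "health information," we mean information that identifies you and '
    "relates to your medical history (i.e., the health care you receive or "
    "the amounts paid for that care).",
    UnclassifiableKind.DEFINITION,
)
```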

3.2.2 Helper Heuristics (HH)

We employ two Helper Heuristics to assist us during the classify step. HH 1 describes what the requirements engineer does when a classifiable item from a policy document or DUA contains multiple actions. If an item contains multiple actions, it is split into multiple items that each express a single action. This heuristic can be applied with any of the Classifying Heuristics (CH) when the item contains multiple actions.

If an item contains multiple actions, split the item into multiple items—each of which contains a single action.

HH 1: Multiple Actions

For example, if the item is in the following format: [actor] [action 1] and [action 2] [object], then we separate it into two items: [actor] [action 1] [object] and [actor] [action 2] [object]. As such, each item will only have a single action. Consider the following item from Dossia’s Privacy Statement [60]:

Dossia collects and uses identifiable information about you for enrollment, ongoing account and system administration, communications with you about your account, and internal operations.

The item contains two actions: collect and use. Using HH 1, we split it into two items that each contain a single, testable action [72, 121]:

1. Dossia collects identifiable information about you for enrollment, ongoing account and system administration, communications with you about your account, and internal operations.

2. Dossia uses identifiable information about you for enrollment, ongoing account and system administration, communications with you about your account, and internal operations.
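
The split performed by HH 1 can be illustrated with a short Python sketch. It assumes the requirements engineer has already identified the actor, the individual actions, and the remainder of the item; the function name `split_actions` and its parameters are hypothetical.

```python
def split_actions(actor, actions, rest):
    """Return one item per action, each of the form '[actor] [action] [rest]'."""
    return [f"{actor} {action} {rest}" for action in actions]


split_items = split_actions(
    "Dossia",
    ["collects", "uses"],
    "identifiable information about you for enrollment, ongoing account and "
    "system administration, communications with you about your account, and "
    "internal operations.",
)
```

Here `split_items` holds two items, one for each action, and both share the same actor and object.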


When a pronoun is present in an item, change the pronoun to the antecedent that the pronoun references.

HH 2: Pronouns

Policy documents often reference the organization with we, us, or our. Policy documents reference the user with you or your. We change the pronouns referencing the organization and the user to organization and user, respectively. In DUAs, pronouns were less prominent than in policy documents.
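
For policy documents, the pronoun substitution in HH 2 could be approximated as follows. This is a rough sketch under stated assumptions: the mapping `REPLACEMENTS` and the function `replace_pronouns` are hypothetical, the possessive forms ("organization's", "user's") are our assumption rather than part of the heuristic, and verb agreement is not adjusted.

```python
import re

# Hypothetical mapping; the possessive forms are an assumption of this sketch.
REPLACEMENTS = {
    "we": "organization",
    "us": "organization",
    "our": "organization's",
    "you": "user",
    "your": "user's",
}

# Match lowercase or capitalized pronouns only, so "US" is left alone.
PRONOUN = re.compile(r"\b(?:[Ww]e|[Uu]s|[Oo]ur|[Yy]our|[Yy]ou)\b")


def replace_pronouns(text):
    """Replace organization and user pronouns with their antecedents."""
    def swap(match):
        word = match.group(0)
        replacement = REPLACEMENTS[word.lower()]
        # Keep sentence-initial capitalization.
        return replacement.capitalize() if word[0].isupper() else replacement
    return PRONOUN.sub(swap, text)


rewritten = replace_pronouns("We will service your policy.")
```

A purely mechanical substitution like this cannot resolve pronouns whose antecedent is some other party, so the result still needs review by the requirements engineer.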

3.2.3 Classifying Heuristics (CH)

We now discuss the classifying heuristics that we employ to determine the classification of an item. These heuristics classify items based on the presence of natural language phrases. While policy documents and DUAs contain some of the same phrases, the classifications for the phrases are not identical. For this reason, each document type has its own set of heuristics.

One classifying heuristic is not document specific: CH 1 applies to both policy documents and DUAs. Because there are two types of entitlements (privileges and rights), CH 1 serves as a helper classifying heuristic that distinguishes entitlements that are privileges from entitlements that are rights. This distinction is based on the differences in definitions discussed in Section 2.1.1. CH 1 is not used alone; instead, other classifying heuristics (CHPD 1, 2, 3, 6, and 9 and CHDUA 1, 3, and 6) use it to determine whether an entitlement is a privilege or a right.

- If the entitlement contains an independent action (no complementary action exists), classify it as a privilege.

- If the entitlement contains a dependent action (a complementary action from another party that must take place exists), classify it as a right.

CH 1: Distinguish between Entitlements—Privileges and Rights
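
The decision in CH 1 reduces to a single test, sketched below in Python. The names `Entitlement` and `classify_entitlement` are hypothetical. Whether a complementary action from another party exists is assumed to be determined by the requirements engineer and passed in as a flag.

```python
from enum import Enum


class Entitlement(Enum):
    PRIVILEGE = "privilege"
    RIGHT = "right"


def classify_entitlement(has_complementary_action: bool) -> Entitlement:
    """Apply CH 1: a dependent action yields a right, an independent one a privilege."""
    if has_complementary_action:
        # Another party must perform a complementary action: dependent action.
        return Entitlement.RIGHT
    # No complementary action exists: independent action.
    return Entitlement.PRIVILEGE
```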

Classifying Heuristics for Policy Documents (CHPD)

We now examine classifying heuristics for policy documents (CHPD). The presence of modal verbs, which express modality (necessity or possibility), can be used to determine an item’s classification. CHPD 1 classifies an item based on modal verbs: can, could, do, may, might, must,