5.3 Creating the Meta-Model
5.4.3 Phase 3 Decision
The third key element, Prescription, relates to the findings on whether a practice violates privacy. Prescription asks the assessor to establish, based on the evaluation carried out in the Risk Assessment phase, whether or not privacy has been, or is likely to be, violated if the new process is implemented. This involves presenting the findings that will guide the practitioner in determining whether or not a practice or process poses a potential challenge to privacy. This, it is contended, involves making a decision as to the compatibility or non-compatibility of the information for allowing those changes or alterations in the data flow. Therefore, this has been translated in the meta-model into ’Decision’. The reason for this is that practitioners may not relate to what prescription means, whereas they will be familiar with decisions and making decisions (see Section 2.5).
Figure 5.1 shows an activity diagram depicting each phase of the meta-model and the order in which practitioners will progress through the framework (i.e. the activity flow), starting with phase 1, Explanation; moving on to phase 2, Risk Assessment; and culminating at phase 3, Decision.
With reference to the decision heuristics (DH), the final heuristic Nissenbaum asks readers to consider is whether, based on the findings made in the previous considerations, “contextual integrity recommends for or against the proposed new practices”. Again, this requires practitioners to make a decision in light of the findings and, therefore, this DH is considered in the final phase, Decision.
In the decision phase there are only two classes. These classes and their relationships are shown in Figure 5.6.
Figure 5.6: Decision - Classes
The attributes within each class will be:
Risk Assessment The Risk Assessment class will be carried over from Phase 2. This will feed into the decision element and produce an outcome class. This outcome class will hold the decision from each attribute group, contributing a score to aid the practitioner in making a decision. The relationship between the two classes is many to many as both classes contain multiple considerations that will need assessing.
Outcome The outcome class will consist of:
Finding This will be a ’to publish’ or ’not to publish’ decision.
Reason Here a series of contributing reasons can be recorded. This could, for example, refer to legal compliance such as ’Data Protection’ or a ’no privacy issues found’ reasoning.
If the decision is to publish, the following additional attributes will also need to be completed:
Mitigating Steps This category will detail any mitigating steps that need to be carried out before publication can take place. This may, for instance, include redaction or anonymisation.
Actors Recording who is responsible, accountable, consulted and informed (RACI) as part of the process helps achieve transparency and provide assurance that proper process is followed in making, implementing and enforcing decisions made. Thus, here practitioners are asked to complete a responsibility matrix (Project Management Institute 2004). This will outline who is:
• Responsible; showing who is responsible for publication;
• Accountable; depicting who is accountable if the decision is challenged or there is a problem (there can only be one person accountable in a RACI matrix);
• Consulted; showing the actors whose opinions must be sought; and
• Informed; detailing which actors need to be informed of the decision.
Time This will contain details of the regularity of publication updates.
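The Outcome class described above can be sketched as a small data structure. This is a minimal illustration only, not the thesis's implementation: the names `Finding`, `Raci`, `Outcome` and `missing_publication_details` are hypothetical, and the sketch assumes that an empty list of mitigating steps is acceptable when none are needed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Finding(Enum):
    PUBLISH = "to publish"
    DO_NOT_PUBLISH = "not to publish"


@dataclass
class Raci:
    """Responsibility matrix (hypothetical structure)."""
    responsible: List[str]
    accountable: str  # a single person: only one can be accountable
    consulted: List[str] = field(default_factory=list)
    informed: List[str] = field(default_factory=list)


@dataclass
class Outcome:
    finding: Finding
    reasons: List[str]
    # The attributes below only apply when the finding is 'to publish'.
    mitigating_steps: List[str] = field(default_factory=list)
    actors: Optional[Raci] = None
    time: Optional[str] = None  # regularity of publication updates

    def missing_publication_details(self) -> List[str]:
        """List the additional attributes not yet completed for a 'to publish' finding."""
        if self.finding is not Finding.PUBLISH:
            return []
        missing = []
        if self.actors is None:
            missing.append("actors")
        if self.time is None:
            missing.append("time")
        return missing
```

Typing `accountable` as a single string rather than a list reflects the constraint that a RACI matrix has exactly one accountable person.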
Validation - Decision
The decision phase for our hypothetical PB is where they will assess the privacy risks associated with making the dataset available as open data and identify any mitigation strategies. This will involve reviewing the privacy aspect within the strategy theme and identifying any ’hows’ applicable in the actions theme (see Figure 4.5) as follows:
1. Strategy: Mitigation Here, the PB will identify what mitigation strategies there may be available for removing or reducing the identified risks, such as obfuscating attributes that contain identifiers (see 2.4.3) prior to publication;
2. How: Data format In this section the PB will record how they believe they can best implement the mitigation strategy. For the scenario of the friendship between the data processor and subject, this could, for example, involve pseudonymisation or anonymisation of the dataset (see Section 2.4.2);
3. How: Governance Here the PB will record the outcome of the assessment to include the finding, the reasoning behind the decision, the mitigation strategies identified and whether or not these were applied and who will be responsible for what aspects of publication etc. (RACI) going forward.
5.5 Worked Example
This section illustrates how the meta-model can be applied in practice by providing a worked example for each concept discussed above. The example follows a public body practitioner (’PP’). To give the PP some context, we will give him the role of Data Officer and have him employed at the local Lending Library, applying the concepts to ascertain what privacy risks a hypothetical dataset, the ’Library Lending Register’, will pose if it were to be published in open format.
5.5.1 Explanation
For the explanation phase (see ’Phase 1, Explanation’ section in Figure 5.7), PP will record details of the Library Lending Register (i.e. the ’Open Dataset’ in Figure 5.7).
Following the meta-model, this means PP will need to record details of the attribute category groups (or columns) that are contained within the dataset being assessed. For example, if PP is assessing the Lending Register of books on loan, PP will need to note details of each attribute category group (column within the database). This could, for example, include Customer ID, name, address and number of books on loan (see Table 5.1).
Customer ID   Name          Address    Books on loan
12345         Alice Smith   A Street   5
23456         Bob Jones     B Street   1
34567         Eve Evans     E Street   2
Table 5.1: Library Lending Register - extract
In addition to capturing this information, PP will also categorise each attribute type, so that each attribute type can be assessed for potential privacy risks, i.e. for each attribute, what category group does that attribute belong to (personal identifier, quasi identifier, sensitive or non-sensitive; see ’Open Dataset’ in Figure 5.7). In this example, the name will be a direct identifier; the customer ID and address will be quasi-identifiers (they can lead back to the personal information if linked with other data) and the number of books will be non-sensitive. Thus, most of these columns contain potentially identifying information.
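The categorisation of the Library Lending Register columns can be sketched as a simple mapping. This is an illustrative sketch under stated assumptions: the names `AttributeCategory`, `register_categories` and `potentially_identifying` are hypothetical, and the ’direct identifier’ used for the name column is treated here as equivalent to the ’personal identifier’ category.

```python
from enum import Enum


class AttributeCategory(Enum):
    PERSONAL_IDENTIFIER = "personal identifier"
    QUASI_IDENTIFIER = "quasi identifier"
    SENSITIVE = "sensitive"
    NON_SENSITIVE = "non-sensitive"


# Categories assigned to each column of Table 5.1, as described in the text.
register_categories = {
    "Customer ID": AttributeCategory.QUASI_IDENTIFIER,
    "Name": AttributeCategory.PERSONAL_IDENTIFIER,
    "Address": AttributeCategory.QUASI_IDENTIFIER,
    "Books on loan": AttributeCategory.NON_SENSITIVE,
}


def potentially_identifying(categories):
    """Return the columns categorised as personal or quasi identifiers."""
    identifying = {
        AttributeCategory.PERSONAL_IDENTIFIER,
        AttributeCategory.QUASI_IDENTIFIER,
    }
    return [column for column, cat in categories.items() if cat in identifying]


print(potentially_identifying(register_categories))
# → ['Customer ID', 'Name', 'Address']
```

Three of the four columns are flagged, which matches the observation that most of the columns contain potentially identifying information.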
PP will also need to record information pertaining to who has been involved in processing the data and what their role(s) are (i.e. the ’Actors’ and ’Roles’ in Figure 5.7). This will involve looking at where the data originated from (i.e. which department), who works there and, particularly, who worked with the data within that department (the data senders and data receivers). For the Library this might include a Learning Technologist and a Librarian. In addition, PP must ascertain who the responsible data controller is for the dataset. For context, PP must capture details of how these actors relate to each other and to the data subjects (i.e. Alice, Bob and Eve, the library customers), e.g. are they personally acquainted or perhaps related? Then, PP must record what format the data is held in, e.g. a bespoke library lending database or a spreadsheet. Finally, PP will need to record how the data is transmitted, i.e. how the information flows, both internally within the library and externally (the ’Transmission Principles’ in Figure 5.7).
Once PP has recorded details of the data, actors, roles, the prevailing context, and the transmission principles, risks can be identified in light of established norms and values and any legal obligations.
5.5.2 Risk Assessment
The risk assessment phase is where PP will assess the privacy risks associated with making the dataset available as open data (see ’Phase 2, Risk Assessment’ in Figure 5.7). This will involve reviewing the information captured as part of the evaluation and identifying any risks there might be associated with the data (see ’Disclosure Risk’ in Figure 5.7). For example, if the Librarian and Eve are friends, PP would need to note this relationship as a potential risk as part of completing the risk assessment. For each of the risk assessment areas PP needs to note any associated privacy risks. The fact that these two actors are friends could have potential risk implications in a number of areas, meaning PP will need to consider the risks associated with each area:
Disclosure Risk: There could be a number of potential disclosure risks identified as a result of the friendship between the actors in this instance. For example, the relationship between the actors gives rise to a consideration about who the Librarian may share the data with and how appropriate such sharing may be. PP should also consider what the repercussions would be if the Librarian were to divulge information obtained in the course of their work to a third party such as another friend. Similarly, consideration will need to be given as to whether or not Eve has given consent to the data being processed and what that consent covers. Another risk associated with the friendship could be that the Librarian may divulge personal information about Bob (another library user) to Eve;
Norms: The friendship could result in Eve receiving preferential treatment, such as being allowed extra books on loan (discrimination risk), or being privy to confidential information about Bob (trust and confidentiality risks);
Regulations: Adherence to data protection regulations. For instance, if Bob has not given consent for his data to be shared, the divulging of the information to Eve will constitute a breach of data protection regulations;
Values: A breach of confidentiality would infringe on social and ethical norms (see ’Risk Assessment’ in Figure 5.7).
Once all of the disclosure risks have been identified in light of legal constraints, norms and values, PP will need to identify any mitigating steps that could be applied to facilitate making the data available in open format. Then, PP can assess whether, once these mitigations have been applied, any residual privacy risks could arise from publication and use this to inform the decision as to whether the Library Lending Register can be published.
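One way to picture this step is as a risk register grouped by the four assessment areas, from which the risks left unmitigated can be read off. The sketch below is hypothetical: the `Risk` class, the `mitigated` flag and both helper functions are assumptions made for illustration, and the meta-model does not prescribe any particular representation.

```python
from dataclasses import dataclass

# The four risk assessment areas used in the worked example.
AREAS = ("Disclosure Risk", "Norms", "Regulations", "Values")


@dataclass
class Risk:
    area: str
    description: str
    mitigated: bool = False  # assumed flag: set once a mitigating step covers the risk

    def __post_init__(self):
        if self.area not in AREAS:
            raise ValueError(f"unknown risk assessment area: {self.area}")


def risks_by_area(risks):
    """Group risk descriptions under each of the four assessment areas."""
    grouped = {area: [] for area in AREAS}
    for risk in risks:
        grouped[risk.area].append(risk.description)
    return grouped


def residual_risks(risks):
    """Return the risks that remain after the recorded mitigations."""
    return [risk for risk in risks if not risk.mitigated]


register = [
    Risk("Disclosure Risk", "Librarian may divulge Bob's personal information to Eve"),
    Risk("Norms", "Eve may receive preferential treatment", mitigated=True),
    Risk("Regulations", "Sharing Bob's data without consent breaches data protection"),
]
for risk in residual_risks(register):
    print(risk.area, "-", risk.description)
```

Any risks returned by `residual_risks` are those PP would weigh when deciding whether the Library Lending Register can be published.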
5.5.3 Decision
The decision phase for our Lending Library is where PP must assess the privacy risks associated with making the dataset available as open data (see ’Phase 3, Decision’ section in Figure 5.7) and, based on this, make an informed decision (’Decision’ in Figure 5.7). As part of recording the ’Outcome’ in Figure 5.7, PP will record the decision and the outcome of the assessment. This should outline what the finding from the assessment is; the reasoning behind the decision and the mitigation steps identified and whether or not these were applied and who will be responsible for what aspects of publication etc. (RACI) going forward. For the Privacy Risk Assessment carried out on the Library Lending Register, the mitigating steps could, for example, include:
• Anonymisation PP could advise that identifying attributes should be anonymised prior to publication (Samarati 2001, Lablans et al. 2015);
• Redaction PP could recommend that personal identifiers, such as names, be redacted prior to publication (ICO 2012, Pfitzmann and Hansen 2010).
Once these steps have been completed PP will have a detailed record of the outcome of the privacy assessment that includes the finding and the reason for the decision. This will enable other practitioners within the lending library to refer to the decision made and provide the organisation with quality assurance and an audit trail of decisions made.
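As a rough illustration of the mitigating steps listed above, the following sketch redacts columns from the Table 5.1 extract and replaces customer IDs with random tokens. The function names `redact` and `tokenise` are hypothetical. Note the assumption that replacing an identifier with a token is pseudonymisation rather than anonymisation: whether the result can be treated as anonymised depends on the residual re-identification risk, which this sketch does not assess.

```python
import secrets


def redact(rows, columns):
    """Return copies of the rows with the named columns removed."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


def tokenise(rows, column):
    """Replace each distinct value in `column` with a random token.

    The same original value always maps to the same token within one call,
    so records for one customer stay linked to each other.
    """
    tokens = {}
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the original rows are left unchanged
        value = row[column]
        if value not in tokens:
            tokens[value] = secrets.token_hex(8)
        new_row[column] = tokens[value]
        result.append(new_row)
    return result


lending_register = [
    {"Customer ID": "12345", "Name": "Alice Smith", "Address": "A Street", "Books on loan": 5},
    {"Customer ID": "23456", "Name": "Bob Jones", "Address": "B Street", "Books on loan": 1},
    {"Customer ID": "34567", "Name": "Eve Evans", "Address": "E Street", "Books on loan": 2},
]

# Redact the direct identifier and the address, then tokenise the customer ID.
open_version = tokenise(redact(lending_register, {"Name", "Address"}), "Customer ID")
```

Both functions return new rows rather than modifying the input, so the original register remains available for the audit trail.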
5.6 Meta-model - Conclusion
The meta-model created in this chapter shows that it is possible to break CI down into its component parts (see Figure 5.7). In doing so, the meta-model shows how, by breaking CI down into logical phases and modelling how these interlink, a decision-flow can be established which can be followed in a methodical and systematic format when making decisions about privacy risks. The applicability of this meta-model in practice was demonstrated through a worked example that applied the concepts to a hypothetical public body, a public library (Section 5.5). Thus, the first proposition (P1, see Figure 3.2), that there are existing framework(s) that singularly or through amalgamation of concepts can be adapted to provide a practical foundation for determining privacy risks, holds true. The meta-model provides a practical example of how the CI conceptual framework can be translated into a working model for applying CI in practice.
5.6.1 Next Steps
The next step will be to determine how this model can be used to inform a working framework for applying CI in practice and answer RQ1 (see Figure 3.1 or Section 1.2). To this end the next Chapter explores how the meta-model can be used to create a more detailed framework that elaborates on each phase and breaks each phase down into a series of practical questions, based on the decision heuristics. These questions can then be used to apply CI in practice.
Case Study - Contextual Integrity in practice
6.1 Introduction
It is argued in Chapter 1, Section 1.2 that existing privacy frameworks do not provide sufficient guidance on how to implement or conduct a privacy assessment, particularly in the context of open data publishing. Further, while practitioners are asked to consider context in some of these frameworks as part of the risk assessment (e.g. PIA, NIST, CI, see Chapter 2, Section 2.7), Chapter 4 confirmed there is little practical help in how to do so in practice.
In Chapter 5, proposition 1 (P1) was tested (see Figure 3.2) by taking an existing framework, Contextual Integrity (CI), and creating a visual model for how this could be adapted to work in practice. This resulted in a meta-model based on CI being created. This meta-model provided proof of concept that, in theory, CI can be adapted to provide a practical foundation for determining privacy risks (see Section 5.6). However, to demonstrate that there is little practical help available to practitioners requires more than proof of concept; it requires testing against what practitioners deal with in their daily work, which is what this case study seeks to achieve.
This Chapter presents a case study on open data publishing. In the open data scenario, the recipient cannot be specifically defined as this would be anyone who downloads the data, something which none of the previous studies have considered (see Chapter 2, Section 2.7.7). Further, none of these studies have applied CI in a practical setting using real practitioners. This case study seeks to address this by trialling CI within a real practice setting, with existing circumstances where there is no clear demarcation between the context and the phenomenon being studied (Yin 2013) and is thus devised to apply the meta-model to a real problem in order to answer RQ1 (see Figure 3.1).
be used to inform the creation of a privacy-specific decision framework that can support practitioners in making informed decisions about privacy before data is published as open data. The aim of this will be to guide practitioners through the CI framework in a step by step manner to ensure all aspects are considered within context, so that they can make an informed decision, and formulate a well-reasoned appraisal or refusal as to how and why that decision was arrived at.
The research approach for this will be a case study (see Chapter 3, Section 3.4.5) that utilises the meta-model to create a practical privacy-specific decision framework for applying the CI framework to a real problem, the publishing of public body information in open format.
The rest of this chapter is organised as follows. First, the case study protocol, detailing the method used for creating a paper prototype (see Chapter 3, Section 3.4.11) of the meta-model, is presented in Section 6.2. This is followed in Section 6.3 by a description of how the meta-model was further interpreted to create a practical questionnaire for assessing the privacy risks of publishing open data, CLIFOD. In Section 6.4, the CLIFOD questions are evaluated by a group of peers, followed by a practical trial of CLIFOD in Section 6.5. In Section 6.6, the findings from this case study are presented and the chapter concludes in Section 6.7, confirming how the resulting questionnaire provides the answer to RQ1.