
Named Graph-based provenance is not limited to simple assertions about who made the assertion and when. Such assertions, whilst true according to the open world assumption (see Section 2.4.4), do not uniquely bind an owner to an assertion, or set of assertions. Cryptographic methods such as digital signatures offer one way to uniquely bind a security principal¹² to a digital document and, coupled with digital certificates, add non-repudiation to signatures. Since it is difficult to guarantee that every RDF graph found on the Semantic Web is error free, Named Graphs and digital signatures form a good heuristic for evaluating trustworthiness. Simply asserting an RDF graph does not mean that the information it contains is reliable. Trusted metadata methods based on digital signatures are also a first step toward a basic level of trust on the Semantic Web [Bizer (2004b)]. Note that while we do not preclude the use of access control or confidentiality mechanisms on our trusted metadata, such mechanisms are beyond the scope of this thesis. Figure 4.8 gives an example of how Named Graphs can be signed. A separate Named Graph (in red) is created with the signature information, which then asserts the referred Named Graph. We will see a concrete example of this in our work on the NG4J project in Section 4.6.4.
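
The structure in Figure 4.8 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NG4J implementation: the `http://example.org/sig#` vocabulary and the function names are hypothetical, the property names merely mirror the labels in the figure, the target graph is assumed to contain no blank nodes, and the signing operation is passed in as a caller-supplied function because the asymmetric primitive (for example a private key backed by an X.509 certificate) lies outside this sketch.

```python
import hashlib
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]      # terms written in N-Triples syntax
Quad = Tuple[str, str, str, str]   # (graph name, subject, predicate, object)

EX = "http://example.org/sig#"     # hypothetical vocabulary, for illustration only


def digest_graph(triples: List[Triple]) -> str:
    """SHA-256 over the sorted N-Triples lines of a graph without blank nodes."""
    lines = sorted(f"{s} {p} {o} ." for s, p, o in set(triples))
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def build_signature_graph(sig_graph: str, target_graph: str,
                          triples: List[Triple], certificate: str,
                          sign: Callable[[bytes], str]) -> List[Quad]:
    """Return the quads of a separate Named Graph that asserts and signs the target."""
    # The caller decides how the digest is signed (assumed to be asymmetric).
    signature = sign(digest_graph(triples).encode("ascii"))
    return [
        (sig_graph, sig_graph, f"<{EX}asserts>", target_graph),
        (sig_graph, sig_graph, f"<{EX}x509Certificate>", f'"{certificate}"'),
        (sig_graph, sig_graph, f"<{EX}signatureAlgorithm>", '"supplied-by-caller"'),
        (sig_graph, sig_graph, f"<{EX}digestAlgorithm>", '"SHA-256"'),
        (sig_graph, sig_graph, f"<{EX}signature>", f'"{signature}"'),
    ]
```

Keeping the signature information in its own Named Graph means the signed graph itself is never modified, so its digest stays stable when the signature is added.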

4.4.1 Signing RDF Graphs

As we argued in Section 3.2.2, to help create a more robust trust infrastructure during distributed collaborative software development, digital signatures must become a core part of the version control workflow. Digital signatures ensure the integrity of messages and, when generated with an appropriate PKI, support non-repudiation¹³ [McCullagh and Caelli (2000); Zhou (2003)]. The goal in our work is to create an RDF digital signature framework that follows several of the principles in Tummarello et al. (2005), but builds upon Named Graphs rather than RDF reification.

[Figure 4.8 labels: dp:firstVersion, dp:replaces, dp:Wikipage, dp:Document; each signature graph lists Asserts, X.509Certificate, SignatureAlgorithm, DigestAlgorithm and Signature, and is linked to a document by "asserts".]

Figure 4.8: DP with Digital Signatures

¹² See http://www.pluralsight.com/wiki/default.aspx/Keith.GuideBook/What%20Is%20A%20Security%20Principal.html for further details.

It is important to note that non-repudiation in our framework is supported through the use of asymmetric public key cryptography. This means the onus of responsibility for protecting private keys lies in the hands of the developer or administrator of the online repository. This is why PKI-based systems such as X.509 go to the trouble of using the Certificate Authority as the trusted third party, tracking compromised certificates using a Certificate Revocation List (CRL).

Symmetric key-based systems such as Kerberos and the SAML protocol do not support non-repudiation and therefore should not be used as part of our RDF digital signature mechanism.

¹³ Non-repudiation is supported in X.509v3 with the KeyUsage (OID 2.5.29.15, http://oid.elibel.tm.fr/2.5.29.15) critical extension. The non-repudiation bit is limited to digital signatures and precludes certificate and CRL signing.

4.4.1.1 Carroll’s algorithm vs. nauty

Due to the graph-like nature of RDF and the issue of blank nodes (see Section 3.4.1.1), generating reliable and robust digital signatures is non-trivial. As we noted in Section 3.4.1.1, Carroll’s algorithm and nauty take very different approaches to graph canonicalisation. Carroll’s algorithm can be seen as a quick and cheap canonicalisation method that does not attempt to solve the isomorphism problem; Carroll (2003) goes to some lengths to state that the proposed algorithm is not intended for arbitrary RDF graphs. The nauty approach is far more elegant and satisfactorily solves the isomorphism problem; however, it is overly complex to program [Carroll (2003)] and non-polynomial. It would be unwise to rely on a non-polynomial algorithm in an RDF digital signature solution for the Semantic Web.

Figure 4.9: Comprehensive Canonical RDF Workflow

Another approach we have yet to consider is to combine Carroll’s algorithm with nauty. We could leverage the speed of Carroll’s algorithm for simple cases and use nauty in complex cases, creating a comprehensive workflow.

Figure 4.9 shows our proposed canonical RDF workflow. At first glance, this workflow seems reasonable: we modify Carroll’s algorithm to detect when it has failed, then pass the graph on to nauty to complete the canonical reordering. Even if Carroll’s algorithm produced false negatives, they would be taken care of by nauty.

One potential problem with Figure 4.9 is the expectation that the canonical representations produced by the two algorithms are identical. The algorithms take very different approaches, so it is possible in one instance for Carroll’s algorithm to succeed and the RDF graph to be signed. When the graph’s signature later comes to be verified, Carroll’s algorithm fails and passes the graph to nauty. nauty then produces a canonical reordering that is completely different to that of Carroll’s algorithm, thus breaking the signature. Appendix C.1.1 gives an example of this case; we canonicalised the WordNet NounWordSense OWL class and compared the results of each algorithm, which are shown to be very different. It is therefore important that the algorithm used to create a signature is the same as the algorithm used to verify it.
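
The failure mode can be reproduced with two toy canonicalisers. The sketch below is purely illustrative: `algorithm_a` and `algorithm_b` are hypothetical stand-ins that relabel blank nodes in opposite orders, and they are not Carroll’s algorithm or nauty. A SHA-256 digest stands in for the signature value, and the example graph is chosen so that no two triples look identical once blank nodes are masked.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # terms written in N-Triples syntax


def _mask(triple: Triple) -> Tuple[str, ...]:
    """Hide blank node names so the ordering does not depend on them."""
    return tuple("~" if t.startswith("_:") else t for t in triple)


def _canonicalise(triples: List[Triple], reverse: bool) -> bytes:
    """Toy relabelling in masked sort order; assumes masked triples are unique."""
    labels: Dict[str, str] = {}
    for triple in sorted(triples, key=_mask, reverse=reverse):
        for term in (triple[0], triple[2]):
            if term.startswith("_:") and term not in labels:
                labels[term] = f"_:c{len(labels) + 1}"
    lines = sorted(" ".join(labels.get(t, t) for t in tr) + " ." for tr in triples)
    return "\n".join(lines).encode("utf-8")


def algorithm_a(triples: List[Triple]) -> bytes:   # hypothetical stand-in
    return _canonicalise(triples, reverse=False)


def algorithm_b(triples: List[Triple]) -> bytes:   # hypothetical stand-in
    return _canonicalise(triples, reverse=True)


def digest(triples: List[Triple], canonicalise: Callable[[List[Triple]], bytes]) -> str:
    return hashlib.sha256(canonicalise(triples)).hexdigest()


P = "<http://example.org/p>"
graph = [("_:x", P, '"a"'), ("_:y", P, '"b"')]
renamed = [("_:n2", P, '"b"'), ("_:n1", P, '"a"')]  # same graph, new blank node names

signed_digest = digest(graph, algorithm_a)                            # at signing time
same_algorithm_ok = digest(renamed, algorithm_a) == signed_digest     # True
mixed_algorithms_ok = digest(renamed, algorithm_b) == signed_digest   # False
```

Each stand-in is self-consistent across blank node renaming, yet the two disagree with each other, which is exactly the situation in which a fallback at verification time invalidates a signature created by the other algorithm.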

If we were to take a pragmatic approach and choose between the two algorithms, Carroll’s algorithm offers the better choice. It is fast and able to cope with the majority of graphs we are interested in for our research. To be used in our RDF signature solution, however, we must devise a conservative approach that will guarantee a canonicalisation that can be reliably replicated in the future.

4.4.1.2 Conservative Canonicalisation

Conservative canonicalisation is an approach that accepts the existence of false negatives when using Carroll’s algorithm, i.e., if the algorithm claims it cannot canonicalise an RDF graph even though it should be able to, then we accept its conclusion. This approach may reject more graphs but should reduce the number of digital signatures that are generated and subsequently fail verification. We have therefore placed restrictions on the introduction of blank nodes to improve the reliability of our digital signature mechanism.
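
The idea of accepting false negatives can be illustrated with a deliberately simplified labelling pass. The sketch below is loosely modelled on the one-pass idea of masking blank nodes and sorting; it is an assumption-laden illustration and not the algorithm published in Carroll (2003). The names `conservative_canonicalise` and `CannotCanonicalise` are hypothetical. A blank node is labelled only if it occurs in a triple that is unique after masking; if any blank node remains unlabelled, the graph is rejected rather than guessed at.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # terms written in N-Triples syntax


class CannotCanonicalise(Exception):
    """Raised when the conservative pass leaves a blank node unlabelled."""


def _is_blank(term: str) -> bool:
    return term.startswith("_:")


def _mask(triple: Triple) -> Tuple[str, ...]:
    return tuple("~" if _is_blank(t) else t for t in triple)


def conservative_canonicalise(triples: List[Triple]) -> List[str]:
    """Return sorted canonical N-Triples lines, or raise if unsure."""
    graph = list(set(triples))                  # an RDF graph is a set of triples
    counts = Counter(_mask(t) for t in graph)
    labels: Dict[str, str] = {}
    for triple in sorted(graph, key=_mask):
        if counts[_mask(triple)] > 1:
            continue                            # ambiguous after masking: skip it
        for term in (triple[0], triple[2]):
            if _is_blank(term) and term not in labels:
                labels[term] = f"_:g{len(labels) + 1}"
    blanks = {t for tr in graph for t in tr if _is_blank(t)}
    unlabelled = blanks - set(labels)
    if unlabelled:
        # Conservative: accept a possible false negative rather than guess.
        raise CannotCanonicalise(f"{len(unlabelled)} blank node(s) left unlabelled")
    return sorted(" ".join(labels.get(t, t) for t in tr) + " ." for tr in graph)
```

For example, the graph `_:a p _:b . _:b p _:c . _:a q "x" .` is rejected by this sketch because two of its triples are indistinguishable after masking, even though a more thorough algorithm could label every node. That rejection is the kind of false negative the conservative approach tolerates.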

Our approach can be summarised as follows:

1. DP instances should consist of fully labelled RDF graphs.

3. External federated RDF that is to be signed should be analysed for blank nodes.
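
The blank node analysis mentioned in the list can be as simple as a scan over subjects and objects. The following is a minimal sketch, assuming triples are held as tuples of N-Triples terms; the function names are hypothetical and are not part of NG4J or of the DP tooling.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # terms written in N-Triples syntax


def blank_nodes(triples: Iterable[Triple]) -> Set[str]:
    """Collect blank node labels; only subjects and objects can be blank."""
    return {t for tr in triples for t in (tr[0], tr[2]) if t.startswith("_:")}


def is_fully_labelled(triples: Iterable[Triple]) -> bool:
    """True when the graph contains no blank nodes at all."""
    return not blank_nodes(triples)


def blank_node_usage(triples: List[Triple]) -> Dict[str, int]:
    """Count, per blank node, the number of triples it occurs in."""
    usage: Dict[str, int] = {}
    for tr in triples:
        for node in blank_nodes([tr]):
            usage[node] = usage.get(node, 0) + 1
    return usage
```

A fully labelled graph can be serialised, sorted and signed directly, whereas a graph that reports blank nodes would first have to pass canonicalisation or be rejected.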

Carroll’s algorithm appears to suit our needs in all cases, provided we take the conservative canonicalisation approach. We do not intend to use Sayers’ algorithm, which would otherwise introduce additional complexity when managing digital signatures and future verification. Continual publishing and updating of signatures would be too laborious for our purposes.