
The Semantic Web is Like Archaeology: It’s All About Context

2.8 Beyond the ‘layer cake’

2.8.2 The Rise of Linked Data and SPARQL

Adding to this terminological confusion is the relatively new concept of Linked Data, a term coined by Berners-Lee in 2006. Often perceived as being distinct from the Semantic Web, in reality it is just a set of best practices for publishing data in a way which makes it part of a single global information space (Bizer et al. 2008, 2). These best practices define a way of taking data out of proprietary data silos (individual databases and documents) and giving each piece of raw data its own unique address, in the form of a Uniform Resource Identifier or URI. The data thereby becomes uniquely identifiable and its location resolvable, and therefore linkable and manipulable. Linked Data was never meant to be a replacement for the Semantic Web, or to take it in a fundamentally different direction; it is merely the practical way this area of Semantic Web technology is developing (Heath 2009). That it seems to have a life of its own is because it has been so visibly successful.
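The idea of giving each piece of data its own URI, so that statements held in separate silos can be linked, can be illustrated with a minimal sketch. Every URI, property name and value below is hypothetical (placed under example.org and example.net) and is not drawn from any real dataset; statements are modelled simply as subject, predicate, object tuples rather than with a real RDF library.

```python
# Hypothetical base URIs; none of these addresses come from a real dataset.
SITE = "http://example.org/excavation/"
GAZETTEER = "http://example.net/places/"

# Each statement is a (subject, predicate, object) triple.
# dataset_a and dataset_b stand in for two separately published silos.
dataset_a = {
    (SITE + "context/1001", SITE + "vocab/foundAt", GAZETTEER + "york"),
    (SITE + "context/1001", SITE + "vocab/label", "Pit fill"),
}
dataset_b = {
    (GAZETTEER + "york", GAZETTEER + "vocab/name", "York"),
}


def follow(uri, predicate, *datasets):
    """Return the objects linked from `uri` by `predicate` in any dataset."""
    return sorted(o for ds in datasets for (s, p, o) in ds
                  if s == uri and p == predicate)


def describe(uri, *datasets):
    """Collect every (predicate, object) pair stated about `uri`."""
    return sorted((p, o) for ds in datasets for (s, p, o) in ds if s == uri)


# Because the place is identified by a shared URI, a link stated in one
# dataset can be followed into statements published in the other.
place = follow(SITE + "context/1001", SITE + "vocab/foundAt",
               dataset_a, dataset_b)[0]
print(describe(place, dataset_a, dataset_b))
```

The point of the sketch is only that the shared identifier does the linking: neither dataset needs to know how the other is stored.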

Represented in Figure 7 as the part of the image made up of interlinked circles (and resembling a swarm of bees), Linked Data has enjoyed the hype and enthusiasm the Semantic Web has been awaiting for years. Almost as a collective sigh of relief that Semantic Web developers finally have a way to show their work, Linked Data has seen rapid uptake through the W3C Linking Open Data project (World Wide Web Consortium 2010), especially by those with a mandate to make their data available, including the US and UK governments. That few resources yet exist to harvest and use that data in meaningful ways is another issue, but Linked Data is a first step in making the Semantic Web tangible, and one which seems to be catching on.

The key to developing those resources is now part of the updated layer cake as well. Now that SPARQL is available, the protocol it defines for querying Semantic Web data has been used to create ‘SPARQL endpoints’, services that accept queries over the Web. A query written in the SPARQL query language is submitted to an endpoint, which then returns the desired data. So even if full Semantic Web implementation is not yet available, Linked Data can now be queried with SPARQL, and users are finally getting to see the Semantic Web in action.
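As a hedged illustration of what submitting a query to an endpoint involves, the sketch below builds a request in the form the SPARQL Protocol describes, with the query text carried in a URL parameter named query. The endpoint address is hypothetical and the request is only constructed, never sent; the query itself uses the real FOAF namespace but is an invented example.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/sparql"

# An invented query: list up to ten names recorded with the FOAF vocabulary.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE { ?person foaf:name ?name }
LIMIT 10
"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request for `query`."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard JSON serialisation of SELECT results.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

Against a real endpoint, passing the request to urllib.request.urlopen would return the result set; that step is left out here because the address above does not exist.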

2.9 Conclusion

In December 2007, an article titled ‘The Semantic Web in Action: Corporate applications are well under way, and consumer uses are emerging’ was published in Scientific American as a follow-up to the original piece written by Tim Berners-Lee and his co-authors in 2001. During the six years between the articles, much had changed. We are now swimming in what Dale Dougherty, the vice-president of O’Reilly Media, coined Web 2.0. As much a commercial designation as a change in technology, Web 2.0 referred to the hope that something would follow the burst of the dotcom bubble in 2000, but what this was had yet to be defined (Vossen and Hagemann 2007, xi). In many ways that is still the case, but references to Web 2.0 are now generally accepted to mean the many forms of social media based on user-created Web content.

This can take several forms. Blogs are centred on an individual opinion that can be commented on by others; a wiki can organise a group of opinions or information on a particular topic; you can rate and review something you purchased on a commercial website to help influence other consumers; you can participate in a social network like Facebook; or you can help identify the contents of a photograph in Flickr using tagging. The mainstream availability of technology, combined with the coming of age of a generation who have used the Web since childhood, has created a massive surge in the Web over the last several years, moving from the ground up. The Semantic Web was a mature vision in the mind of Tim Berners-Lee long before the burst of the dotcom bubble, but it was always based on a top-down approach. Confusion about terms like Web 2.0, Web 3.0, the social web and the Semantic Web has led to the erroneous idea that these are in competition. In reality, the social web and the Semantic Web have much to contribute to each other and will leave the Web stronger in the end.

The Scientific American article written by Berners-Lee in 2001 depicted a foreign and rather unsettling world at the time it was published, even to Web enthusiasts. The idea of trusting unseen machines to help make decisions about even mundane Web interactions and information was disconcerting. Today this is no longer the case, and the contributions made by Web 2.0 account for much of the reason. The level of trust and effort we all seem willing to invest in order to collaborate with friends and strangers alike is astounding. Of course, trusting a person and trusting the automated parameters created by a group of people are not the same thing, but it is the real desire for collaboration, to make the whole greater than the sum of its parts, which sits at the heart of both Web 2.0 and the Semantic Web.

At the same time, there can be no doubt that the Semantic Web is surging ahead as well, as evidenced by the 2007 follow-up article in Scientific American. Once again, an article marks a watershed moment. Speaking about the same period of time between the publication of the previous article by Berners-Lee and the present, the authors are clear: ‘Since then the sceptics have said the Semantic Web would be too difficult for people to understand or exploit. Not so. The enabling technologies have come of age. A vibrant community of early adopters has agreed on standards that have steadily made the Semantic Web practical to use’ (Feigenbaum et al. 2007, 91).

The majority of the examples given in the article involve current uses of Semantic Web elements in the healthcare industry, but the authors also cite one of the best examples of Web 2.0 working in concert with the Semantic Web:

Consumers are also beginning to use the data language and ontologies [of the Semantic Web] directly. One example is the Friend of a Friend (FOAF) project, a decentralized social-networking system that is growing in a purely grassroots way. Enthusiasts have created a Semantic Web vocabulary for describing people’s names, ages, locations, jobs and relationships to one another and for finding common interests among them. FOAF users can post information and imagery in any format they like and still seamlessly connect it all, which MySpace and Facebook cannot do because their fields are incompatible and not open to translation (Feigenbaum et al. 2007, 93).
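How a shared vocabulary lets separately published profiles ‘seamlessly connect’ can be sketched in a few lines. The namespace and the property names name, knows and interest are those of the FOAF vocabulary; the people, addresses and topics are hypothetical, and plain tuples again stand in for a real RDF library.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical people and topics, published on two unrelated sites.
ALICE = "http://example.org/people/alice#me"
BOB = "http://example.net/bob#me"
ARCHAEOLOGY = "http://example.org/topics/archaeology"
CARTOGRAPHY = "http://example.org/topics/cartography"

site_one = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", BOB),
    (ALICE, FOAF + "interest", ARCHAEOLOGY),
}
site_two = {
    (BOB, FOAF + "name", "Bob"),
    (BOB, FOAF + "interest", ARCHAEOLOGY),
    (BOB, FOAF + "interest", CARTOGRAPHY),
}

# Both sites use the same property URIs, so combining them is a set union;
# no field-by-field translation between the two sources is needed.
merged = site_one | site_two


def interests(person, triples):
    """Return the set of topics `person` is stated to be interested in."""
    return {o for (s, p, o) in triples
            if s == person and p == FOAF + "interest"}


common = interests(ALICE, merged) & interests(BOB, merged)
print(common)
```

Finding the common interest is possible only because both profiles describe interests with the same property; two sites with incompatible fields would first need a mapping between them.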

The reason behind the success of the Web thus far will likely continue to propel it further. The willingness of Tim Berners-Lee to let the natural way people communicate flow over him and inform his thinking when he created the Web, rather than trying to create a rigid structure and asking users to conform, is ultimately in keeping with the Web 2.0 ethos and will help to propel the Semantic Web into the mainstream as well.

Figure 8: While there are still relatively few mainstream texts dedicated to the Semantic Web, the Semantic Web for Dummies was published recently, which can certainly be construed as a sign of mainstream acceptance.

The 2007 Scientific American article also addresses an issue of great concern to the further development of the Semantic Web, which is inclusion in, but never dominance of, the W3C by the commercial sector:

As applications develop, they will dovetail with research at the Web consortium and elsewhere aimed at fulfilling the Semantic Web vision. Reaching agreement on standards can be slow, and some sceptics wonder if a big company could overtake this work by promoting a set of proprietary semantic protocols and browsers. Perhaps. But note that numerous companies and universities are involved in the consortium’s semantic working groups. They realize that if these groups can devise a few well-designed protocols that support the broadest Semantic Web possible, there will be more room in the future for any company to make money from it (Feigenbaum et al. 2007, 97).

This all feels miles away from the Wild West Web of the browser wars, and a bit of law and order has clearly rolled into town. By combining Web 2.0 with a Semantic Web that is becoming more mainstream, something quite other may form. If the grassroots Web 2.0 is the stalagmite pushing its way up from the bottom of the cave and the Semantic Web is the stalactite reaching down from the cave ceiling, when they meet in the middle to form a column what then will appear? Will that be Web 3.0? Perhaps.

When asked about Web 2.0 by a reporter for the International Herald Tribune, Tim Berners-Lee:

…shrugs at the use of the term ‘Web 2.0’ - a Silicon Valley buzzword to describe the Internet since the dot-com bust of the turn of the century - he does say he sees a new level of vigour across the network… ‘People keep asking what Web 3.0 is,’ Berners-Lee said. ‘I think maybe when you’ve got an overlay of scalable vector graphics - everything rippling and folding and looking misty - on Web 2.0 and access to a semantic Web integrated across a huge space of data, you’ll have access to an unbelievable data resource’ (Shannon 2006).

All of the pieces are in place for the discipline of archaeology to begin taking advantage of this ‘unbelievable data resource’ on many different levels, and the Semantic Web is now moving beyond theory into practice. How that practice might be applied within archaeology, and specifically to the data derived from field drawing, is the subject of the rest of this thesis.

In document Seeing Triple: Archaeology, Field Drawing and the Semantic Web (Page 83-90)