• No results found

In this section we describe some of the most common graph database use cases, iden‐ tifying how the graph model and the specific characteristics of the graph database can be applied to generate competitive insight and significant business value.

Social

We are only just beginning to discover the power of social data. In their book Connec‐ ted, social scientists Nicholas Christakis and James Fowler show how, despite knowing nothing about an individual, we can better predict that person’s behavior by under‐ standing who he is connected to, than we can by accumulating facts about him.1

Social applications allow organizations to gain competitive and operational advantage by leveraging information about the connections between people, together with discrete

information about individuals, to facilitate collaboration and flow of information, and predict behavior.

As Facebook’s use of the term social graph implies, graph data models and graph data‐ bases are a natural fit for this overtly relationship-centered domain. Social networks help us identify the direct and indirect relationships between people, groups, and the things with which they interact, allowing users to rate, review, and discover each other and the things they care about. By understanding who interacts with whom, how people are connected, and what representatives within a group are likely to do or choose based on the aggregate behavior of the group, we generate tremendous insight into the unseen forces that influence individual behaviors. We discuss predictive modeling and its role in social network analysis in more detail in “Graph Theory and Predictive Modeling” on page 174.

Social relations may be either explicit or implicit. Explicit relations occur wherever social subjects volunteer a direct link—by liking someone on Facebook, for example, or in‐ dicating someone is a current or former colleague, as happens on LinkedIn. Implicit relations emerge out of other relationships that indirectly connect two or more subjects by way of an intermediary. We can relate subjects based on their opinions, likes, pur‐ chases, and even the products of their day-to-day work. Such indirect relationships lend themselves to being applied in multiple suggestive and inferential ways: we can say that A is likely to know, or like, or otherwise connect to B based on some common inter‐ mediaries. In so doing, we move from social network analysis into the realm of recom‐ mendation engines.

Recommendations

Effective recommendations are a prime example of generating end-user value through the application of an inferential or suggestive capability. Whereas line-of-business ap‐ plications typically apply deductive and precise algorithms—calculating payroll, apply‐ ing tax, and so on—to generate end-user value, recommendation algorithms are in‐ ductive and suggestive, identifying people, products, or services an individual or group is likely to have some interest in.

Recommendation algorithms establish relationships between people and things: other people, products, services, media content—whatever is relevant to the domain in which the recommendation is employed. Relationships are established based on users’ behav‐ iors, whether purchasing, producing, consuming, rating, or reviewing the resources in question. The engine can then identify resources of interest to a particular individual or group, or individuals and groups likely to have some interest in a particular resource. With the first approach, identifying resources of interest to a specific user, the behavior of the user in question—her purchasing behavior, expressed preferences, and attitudes as expressed in ratings and reviews—are correlated with those of other users in order to identify similar users and thereafter the things with which they are connected. The

2. Neo4j Spatial is an open source library of utilities that implement spatial indexes and expose Neo4j data to geotools.

second approach, identifying users and groups for a particular resource, focuses on the characteristics of the resource in question; the engine then identifies similar resources, and the users associated with those resources.

As in the social use case, making an effective recommendation depends on under‐ standing the connections between things, as well as the quality and strength of those connections—all of which are best expressed as a property graph. Queries are primarily graph local, in that they start with one or more identifiable subjects, whether people or resources, and thereafter discover surrounding portions of the graph.

Taken together, social networks and recommendation engines provide key differenti‐ ating capabilities in the areas of retail, recruitment, sentiment analysis, search, and knowledge management. Graphs are a good fit for the densely connected data structures germane to each of these areas; storing and querying this data using a graph database allows an application to surface end-user real-time results that reflect recent changes to the data, rather than precalculated, stale results.

Geo

Geospatial is the original graph use case: Euler solved the Seven Bridges of Königsberg problem by positing a mathematical theorem that later came to form the basis of graph theory. Geospatial applications of graph databases range from calculating routes be‐ tween locations in an abstract network such as a road or rail network, airspace network, or logistical network (as illustrated by the logistics example later in this chapter) to spatial operations such as find all points of interest in a bounded area, find the center of a region, and calculate the intersection between two or more regions.

Geospatial operations depend upon specific data structures, ranging from simple weighted and directed relationships, through to spatial indexes, such as R-Trees, which represent multidimensional properties using tree data structures. As indexes, these data structures naturally take the form of a graph, typically hierarchical in form, and as such they are a good fit for a graph database. Because of the schema-free nature of graph databases, geospatial data can reside in the database beside other kinds of data—social network data, for example—allowing for complex multidimensional querying across several domains.2

Geospatial applications of graph databases are particularly relevant in the areas of tel‐ ecommunications, logistics, travel, timetabling, and route planning.