Complex networks - Spatio-temporal modelling for issues in crime and security

Complex networks are mathematical structures which encode relationships between sets of discrete entities. As such, they are particularly useful in the study of complex systems, since they provide a means by which the interactions between components of a system can be represented. Indeed, since the non-trivial configuration of these interactions may itself be the source of complexity, they play a central role in the study of complex systems.

The study of networks in the context of crime is one of the primary themes of this thesis, and a number of examples will be considered. In particular, both physical and abstract networks will be considered, representing the structure of urban streets and the proximity of crimes, respectively. Since the same basic theory and terminology is common to all cases, and will be invoked frequently, a brief introduction to core network concepts will now be given. This will cover the basic structures and notation used in all discussion of networks, while more specific concepts will be introduced throughout the thesis as they become relevant.

Network science is a substantial field of research in its own right, with a number of strands: empirical study of real-world networks, modelling of network formation, and applications in real-world models. Much of this research is not relevant in the present context, and a review will not be given; instead, relevant literature will be introduced at points throughout the thesis. A number of comprehensive reviews can be found elsewhere (Newman, 2003; Boccaletti et al., 2006; Costa et al., 2011), as can those which focus on the particular sub-fields of spatial (Barthélemy, 2011) and temporal (Holme & Saramäki, 2012) networks.

Mathematical representation

From a technical perspective, the term network refers to a mathematical object which is otherwise known as a graph. Indeed, the distinction between the two terms is primarily one of context: ‘graph’ refers to the abstract object, whereas ‘network’ is more often used to imply that the structure in question represents real-world entities. The majority of terminology used to refer to networks is taken from the mathematical field of ‘graph theory’, which is an extensive and mature subject (see Bollobás, 2002).

Basic notation

A network G = (V, E) is an ensemble of vertices, V, together with the links, E, which join them.¹ The set of vertices, V = {v}, is a non-empty and countable set of N elements, and E = {e} is a set of M elements, each of which is a pair of vertices. The number of vertices, N, is referred to as the order of the network, and the number of links, M, as its size.

In order to be able to refer to elements of the network, its vertices are labelled using the integers 1, . . . , N, where the order is unimportant as long as the labelling is consistent and unique. Each particular vertex is then referred to using a subscript, v_i, or the label itself, i, depending on which is convenient; the latter convention will be adopted for the remainder of this discussion. In this way, specific links can be denoted using pair notation: if a link exists between two vertices, i and j, it is represented by the pair (i, j). When a link is present, the vertices are said to be adjacent and each is a neighbour of the other.

At this point it is necessary to note a distinction between two types of network: undirected and directed. In an undirected graph, links have no orientation, and the ordering of vertices in any link (i, j) is unimportant. In the directed case, however, the ordering is significant, and such a link (i, j) is said to exist from i to j. Directed networks can contain links in both directions between a given pair of vertices, or just one. Graphically, directionality is usually represented by adding an arrow to a link; both undirected and directed examples are shown in Figure 1.1.
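The notation above can be illustrated with a minimal Python sketch. The link lists below are read off the adjacency matrices given for G1 and G2 in Equation (1.2), and the helper names (`neighbours_undirected`, `successors`) are hypothetical, introduced only for illustration. Storing undirected links as unordered pairs and directed links as ordered pairs is one possible encoding among several.

```python
# Vertices are labelled 1..N; each link is a pair of labels.
V = {1, 2, 3, 4}

# Undirected network G1: the ordering within a pair is unimportant,
# so each link is stored as an unordered pair (a frozenset).
E1 = {frozenset(pair) for pair in [(1, 2), (1, 4), (2, 4), (3, 4)]}

# Directed network G2: the tuple (i, j) denotes a link from i to j.
E2 = {(1, 2), (2, 1), (2, 4), (4, 1), (4, 3)}


def neighbours_undirected(i, links):
    """Vertices adjacent to i in an undirected network."""
    return {j for link in links if i in link for j in link if j != i}


def successors(i, links):
    """Vertices j for which a directed link (i, j) exists."""
    return {j for (a, j) in links if a == i}


order, size = len(V), len(E1)              # N and M for G1
print(order, size)                         # 4 4
print(sorted(neighbours_undirected(4, E1)))  # [1, 2, 3]
print(sorted(successors(2, E2)))           # [1, 4]
print(frozenset((2, 1)) in E1)             # True: (2, 1) and (1, 2) coincide
print((1, 4) in E2)                        # False: only the link from 4 to 1 exists
```

In the directed case, vertices 1 and 2 are joined in both directions, whereas vertices 1 and 4 are joined in one direction only.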

¹ Vertices are also commonly referred to as nodes, and links as edges, but these terms are avoided here to prevent confusion with the concept of ‘edge effects’ and the use of ‘activity node’ within criminological theory.

Figure 1.1: Graphical representations of simple undirected and directed networks on vertices 1 to 4: (a) an undirected network, G1; (b) a directed network, G2. Circles represent vertices and the presence of a link is indicated by a line or arrow.

Two extreme cases, in terms of the presence of links, arise frequently and are identified by name. An empty network is one which contains no links (that is, E = ∅), and is simply defined by the number of vertices it contains. A complete network, on the other hand, is one in which all possible links are present. A complete undirected network therefore contains N(N − 1)/2 links (the number of pairs of vertices), whereas a complete directed network contains N(N − 1) (since two links are present for each pair; one in each direction).
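The two link counts can be checked by direct enumeration. The following is a small sketch using only the Python standard library; the function names are hypothetical, and self-links are assumed to be excluded, as in the counts quoted above.

```python
from itertools import combinations, permutations


def complete_undirected(n):
    """All unordered pairs of distinct vertices from 1..n."""
    return set(combinations(range(1, n + 1), 2))


def complete_directed(n):
    """All ordered pairs of distinct vertices from 1..n."""
    return set(permutations(range(1, n + 1), 2))


# Compare the enumerated link sets with the closed-form counts.
for n in range(1, 7):
    assert len(complete_undirected(n)) == n * (n - 1) // 2
    assert len(complete_directed(n)) == n * (n - 1)

print(len(complete_undirected(4)), len(complete_directed(4)))  # 6 12
```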

It is also possible for graphs (i.e. networks) to be derived from other networks. This can be done in a number of ways, many of which involve the notion of a subgraph. The vertices of a subgraph are a subset of the vertices of the original network, and its links are a subset of those links of the original network for which both vertices remain. Formally, then, for a network G = (V, E), a subgraph G′ = (V′, E′) is one such that V′ ⊆ V and E′ ⊆ {(i, j) | (i, j) ∈ E and i, j ∈ V′}. An induced subgraph is a particular case in which all of the original links between members of V′ are retained; that is, E′ = {(i, j) | (i, j) ∈ E and i, j ∈ V′}.
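The difference between a general subgraph and an induced subgraph can be sketched as follows. The functions `is_subgraph` and `induced_subgraph` are hypothetical helpers written for this illustration, and the example network uses the links of G1 as read from Equation (1.2), with each undirected link stored once as a tuple (i, j) with i < j.

```python
def is_subgraph(V, E, V_sub, E_sub):
    """True if (V_sub, E_sub) satisfies the subgraph conditions for (V, E)."""
    return V_sub <= V and all(
        link in E and set(link) <= V_sub for link in E_sub
    )


def induced_subgraph(V, E, V_sub):
    """Retain every original link whose two vertices both lie in V_sub."""
    if not V_sub <= V:
        raise ValueError("V_sub must be a subset of V")
    return V_sub, {(i, j) for (i, j) in E if i in V_sub and j in V_sub}


V = {1, 2, 3, 4}
E = {(1, 2), (1, 4), (2, 4), (3, 4)}   # links of G1, stored with i < j

V_sub, E_ind = induced_subgraph(V, E, {1, 2, 4})
print(sorted(E_ind))                          # [(1, 2), (1, 4), (2, 4)]

# A subgraph on the same vertices that keeps only one link is valid,
# but it is not the induced subgraph.
print(is_subgraph(V, E, V_sub, {(1, 2)}))     # True
print({(1, 2)} == E_ind)                      # False

# A link to a removed vertex cannot appear in the subgraph.
print(is_subgraph(V, E, V_sub, {(3, 4)}))     # False
```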

Adjacency matrix

The structure of a network can be expressed in a number of ways, the most convenient and common of which is provided by the adjacency matrix. This encodes all information necessary to describe a network: for the network, G = (V, E), introduced above, the adjacency matrix, A, is an N × N matrix such that

\[
a_{ij} =
\begin{cases}
1 & \text{if } (i, j) \in E \text{ (i.e. there is a link connecting } i \text{ and } j\text{)} \\
0 & \text{otherwise.}
\end{cases}
\tag{1.1}
\]

The adjacency matrices of the two networks shown in Figure 1.1, for example, are

\[
A_1 =
\begin{pmatrix}
0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 \\
1 & 1 & 1 & 0
\end{pmatrix},
\qquad
A_2 =
\begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0
\end{pmatrix}.
\tag{1.2}
\]

For undirected networks, a_ij = a_ji for all i, j ∈ V, and so these adjacency matrices are symmetric, with all information contained in the upper (or lower) triangular part. There is no such constraint for directed networks.
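As a sketch of how Equation (1.1) translates into practice, the adjacency matrices of G1 and G2 can be built from their link lists and tested for symmetry. The function names are hypothetical, matrices are held as plain nested lists, and the link lists are those implied by Equation (1.2).

```python
def adjacency_matrix(n, links, directed=False):
    """Build the N x N matrix with a_ij = 1 if (i, j) is a link, else 0.

    Vertices are labelled 1..n, so label i maps to list index i - 1.
    """
    A = [[0] * n for _ in range(n)]
    for i, j in links:
        A[i - 1][j - 1] = 1
        if not directed:
            A[j - 1][i - 1] = 1   # an undirected link is present both ways
    return A


def is_symmetric(A):
    """True if a_ij == a_ji for every pair of indices."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


A1 = adjacency_matrix(4, [(1, 2), (1, 4), (2, 4), (3, 4)])
A2 = adjacency_matrix(4, [(1, 2), (2, 1), (2, 4), (4, 1), (4, 3)], directed=True)

for row in A1:
    print(row)
print(is_symmetric(A1), is_symmetric(A2))   # True False
```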

Network metrics

Many properties of network structure can be measured, and a number of metrics are common in the empirical study of networks. Each emphasises some different aspect of structure, and their use is dependent on the particular characteristics of interest. The number of metrics employed is very large, and only the most basic will be given here; a number of others, however, will be introduced in the course of the thesis.

Many of the quantities typically studied involve the properties of individual vertices, and many concern some notion of ‘centrality’. The most common concept is that of degree, which simply measures the number of links which are incident with a particular vertex (i.e. the number of neighbours it has). For a vertex i in an undirected network, this is typically denoted k_i. In terms of the adjacency matrix, A, it is also straightforward to see that k_i is equal to Σ_j a_ij. In the directed case, analogous quantities of in-degree and out-degree are defined as the number of inward-pointing and outward-pointing links, respectively. The degree distribution is the distribution of these values for a given network, and is frequently studied.
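The degree calculations can be sketched directly from the adjacency matrices of Equation (1.2). The function names are hypothetical. Because a_ij = 1 denotes a link from i to j, row sums give out-degrees and column sums give in-degrees in the directed case; for an undirected network the two coincide.

```python
from collections import Counter

A1 = [[0, 1, 0, 1],
      [1, 0, 0, 1],
      [0, 0, 0, 1],
      [1, 1, 1, 0]]   # undirected network G1

A2 = [[0, 1, 0, 0],
      [1, 0, 0, 1],
      [0, 0, 0, 0],
      [1, 0, 1, 0]]   # directed network G2


def row_sums(A):
    """k_i = sum over j of a_ij (the out-degree if the network is directed)."""
    return [sum(row) for row in A]


def column_sums(A):
    """Sum over i of a_ij for each j (the in-degree if the network is directed)."""
    return [sum(col) for col in zip(*A)]


degrees = row_sums(A1)
out_degrees = row_sums(A2)
in_degrees = column_sums(A2)
degree_distribution = Counter(degrees)   # maps each degree to a vertex count

print(degrees)       # [2, 2, 1, 3]
print(out_degrees)   # [1, 2, 0, 2]
print(in_degrees)    # [2, 1, 1, 1]
print(sorted(degree_distribution.items()))   # [(1, 1), (2, 2), (3, 1)]
```

The degrees of G1 sum to 8, which is twice its number of links, since each undirected link is counted once at each of its two vertices.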

Other metrics concern larger-scale structures within networks, and the concept of clustering provides one such example. Clustering measures the tendency for links to form between the neighbours of a given vertex, and can be interpreted as the probability that the neighbours of a vertex should themselves be neighbours. For a given vertex, i, it is measured by the clustering coefficient, C_i, which is the ratio of the number of links between the neighbours of i to the maximum possible number of such links:

\[
C_i = \frac{q_i}{k_i (k_i - 1)/2}
\tag{1.3}
\]

where q_i is the number of links between neighbours of i, and k_i is the degree of i.

This can be averaged over all vertices of the network, providing a macro-level measure of clustering:

\[
\langle C \rangle = \frac{1}{N} \sum_i C_i
\tag{1.4}
\]
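Equations (1.3) and (1.4) can be sketched for the undirected network G1 as follows. The function names are hypothetical. One point is not specified in the text: for a vertex with fewer than two neighbours the denominator of Equation (1.3) is zero, and this sketch assumes the common convention of setting the coefficient to zero in that case.

```python
from itertools import combinations

A1 = [[0, 1, 0, 1],
      [1, 0, 0, 1],
      [0, 0, 0, 1],
      [1, 1, 1, 0]]   # undirected network G1 from Equation (1.2)


def local_clustering(A, i):
    """Clustering coefficient of the vertex at index i (label i + 1)."""
    nbrs = [j for j, a in enumerate(A[i]) if a]
    k = len(nbrs)
    if k < 2:
        return 0.0   # assumed convention: undefined ratio is taken as zero
    # q counts the links present between pairs of neighbours of i.
    q = sum(A[u][v] for u, v in combinations(nbrs, 2))
    return q / (k * (k - 1) / 2)


def average_clustering(A):
    """Mean of the local coefficients over all N vertices."""
    return sum(local_clustering(A, i) for i in range(len(A))) / len(A)


local = [local_clustering(A1, i) for i in range(4)]
print(local)                    # vertices 1 and 2 give 1.0, vertex 3 gives 0.0
print(average_clustering(A1))   # equals 7/12 under the convention above
```

Vertex 4 has three neighbours, of which only the pair (1, 2) is linked, so its coefficient is 1/3. The average depends on the zero-degree convention, because vertex 3 contributes a zero to the sum.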

This is also an example of a metric which can, perhaps more intuitively, be thought of in geometric or graphical terms. It is notable that this quantity can be defined in several ways, typically involving the counting of certain types of subgraph. In particular, the clustering coefficient can be expressed in terms of the closure of triangles in the network; that is, the probability that, when two sides of a triangle are already present, the third side will also be:

\[
C_{\triangle} = \frac{\text{number of closed triplets}}{\text{number of connected triples}}.
\tag{1.5}
\]
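A minimal sketch of the triangle-based definition in Equation (1.5) is given below for G1. It assumes that a connected triple is a vertex together with an unordered pair of its neighbours, and that the triple is closed when those two neighbours are themselves linked; under this counting each triangle contributes three closed triplets. The function name is hypothetical.

```python
from itertools import combinations

A1 = [[0, 1, 0, 1],
      [1, 0, 0, 1],
      [0, 0, 0, 1],
      [1, 1, 1, 0]]   # undirected network G1 from Equation (1.2)


def triangle_clustering(A):
    """Return (closed triplets, connected triples, their ratio)."""
    closed = connected = 0
    for i in range(len(A)):
        nbrs = [j for j, a in enumerate(A[i]) if a]
        # Each pair of neighbours of i forms a connected triple centred on i.
        for u, v in combinations(nbrs, 2):
            connected += 1
            closed += A[u][v]   # closed if the third side is also present
    ratio = closed / connected if connected else 0.0
    return closed, connected, ratio


closed, connected, ratio = triangle_clustering(A1)
print(closed, connected, ratio)   # 3 5 0.6
```

For G1 the single triangle on vertices 1, 2 and 4 yields three closed triplets out of five connected triples. This value need not coincide with the average of the vertex-level coefficients in Equation (1.4), since the two definitions weight vertices differently.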

Graphical interpretations such as these will be used a number of times within the thesis.
