2.3 Graphs of Language

2.3.2 Complex Network Growth and Models of Language Network Evolution

Having seen how language can be treated as a graph or network, we must ask how such networks evolve. Different models have been proposed for different kinds of networks. The most important ones are presented and compared here.

Random Networks (Erdős and Rényi)

In 1959, Erdős and Rényi proposed a model of random graphs. They start with n vertices and connect those with N edges with a probability P (Erdős and Rényi, 1959, p. 290). Such a network’s degree distribution follows a Poisson distribution. If we know that a graph has n vertices and N edges, the average number of edges a vertex has, its average degree, should be A = 2N/n, since every edge contributes to the degree of two vertices. The Poisson distribution then gives the probability that a vertex has a degree of A ± x, that is, a degree that deviates from the average by x. An example of an Erdős and Rényi graph is given in Fig. 8.
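The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sizes n = 100 and N = 990 are hypothetical choices (990 is roughly the number of edges expected for 100 vertices and a connectivity probability of 0.2), and the Poisson values are an approximation that improves for large, sparse graphs.

```python
import math
import random
from collections import Counter
from itertools import combinations


def random_graph(n, N, seed=None):
    """Return N distinct edges drawn uniformly at random among n vertices."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(n), 2))  # every possible undirected edge
    return rng.sample(all_pairs, N)


def degrees(n, edges):
    """Count how many edges touch each vertex."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def poisson_pmf(k, mean):
    """Probability of the value k under a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


n, N = 100, 990  # hypothetical sizes, chosen for illustration only
edges = random_graph(n, N, seed=1)
deg = degrees(n, edges)
mean_degree = sum(deg) / n  # empirical average degree
histogram = Counter(deg)

# Compare the observed share of vertices per degree with the Poisson value.
for k in sorted(histogram):
    observed = histogram[k] / n
    print(f"k={k:2d}  observed={observed:.3f}  poisson={poisson_pmf(k, mean_degree):.3f}")
```

The observed shares cluster around the average degree and fall off quickly on both sides, which is the behaviour the Poisson distribution describes.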

Because data on real-world networks were lacking, this model could not be tested extensively. The model constructs a random network that does not resemble the real-world complex networks that we have already seen and will analyze later in this thesis. It lacks the most basic features of complex networks: neither the power-law distribution, and thus the scale-free property, nor the small-world property of natural networks can be explained using this model (Barabási and Réka, 1999, p. 510).

Small-World Networks (Watts and Strogatz)

One feature of complex networks that the model described above does not predict is the small-world phenomenon. Watts and Strogatz (1998) propose a model that explains how small-world networks emerge.

Figure 8: An Erdős and Rényi random graph with 100 vertices and a connectivity probability of 0.2.

Starting from a network of n vertices with k edges per vertex, Watts and Strogatz “rewire each edge at random with probability p” (Watts and Strogatz, 1998, p. 440), where p = 0 results in an ordered network (see Fig. 9(a)) and p = 1 results in an unordered network (see Fig. 9(c)). While an ordered network has a long average path length L (a large-world network) and a high clustering coefficient C, a network with 0 < p < 1 (see Fig. 9(b)) is less clustered and has a short average path length; it is a small-world network. Even for “small p, each short cut has a highly nonlinear effect on L, contracting the distance not just between the pair of vertices that it connects, but between their immediate neighbourhoods” (Watts and Strogatz, 1998, p. 440). But this model still misses the power-law distribution that is typically found in complex networks, and it does not explain how such networks evolve, since it does not incorporate growth of any kind.
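The rewiring procedure can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, k is assumed to be even, a rewired edge keeps one end and moves the other to a vertex it is not yet connected to, and the average path length is taken over reachable pairs only, in case rewiring disconnects the graph.

```python
import random
from collections import deque


def watts_strogatz(n, k, p, seed=None):
    """Ring of n vertices, each joined to its k nearest neighbours, rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    # Visit every lattice edge (v, v + step) once; with probability p move its far end.
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


def clustering(adj):
    """Average share of a vertex's neighbour pairs that are themselves connected (C)."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # such vertices contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path distance over all reachable ordered pairs (L)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


for p in (0.0, 0.5, 1.0):
    g = watts_strogatz(20, 4, p, seed=1)
    print(p, round(clustering(g), 3), round(average_path_length(g), 3))
```

The values n = 20 and k = 4 match the graphs in Fig. 9; comparing the printed C and L for the three values of p shows how rewiring trades clustering against path length.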

Preferential Attachment (Barabási and Réka)

Network models of language allow one to analyze the supposed structure of the mind and to predict and analyze the evolution of language. Barabási and Réka (1999) describe a model of network growth using preferential attachment.¹⁸ Given an existing network, this model predicts that “new vertices in the growing network are preferentially attached to an existing vertex with a probability proportional to the degree of such a node” (Ferrer and Solé, 2001, p. 2263). This is equivalent to the rich-get-richer principle introduced before. The model leads to a power-law distribution, which is generally seen as a natural result of the way a system, like a network, grows over time. Barabási and Réka state that this indicates that “the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems” (Barabási and Réka, 1999, p. 509).¹⁹

¹⁸ Preferential attachment applies Bayesian models, which are also used, for example, to predict the growth of the mind (cf. Tenenbaum et al., 2011). Bayesian models complement network approaches and are used widely in language models in psychology, linguistics, and medicine.

(a) Probability p = 0. (b) Probability p = 0.5. (c) Probability p = 1.
Figure 9: Watts and Strogatz graphs with n = 20 and k = 4.

In formal terms, consider a network that starts with $m_0$ vertices and adds a new vertex with $m \le m_0$ edges at every time step $t$. The probability $P(k_i)$ that the new vertex will connect to an existing vertex $i$ depends on the degree $k_i$ of that vertex:

\[ P(k_i) = \frac{k_i}{\sum_j k_j} \]

This leads to a random network with $t + m_0$ vertices and $mt$ edges whose degree distribution is “following a power law with an exponent $\gamma_{\text{model}} = 2.9 \pm 0.1$” (Barabási and Réka, 1999, p. 5). An example is given in Fig. 10. Comparing this graph to an Erdős and Rényi random graph, one can see that the edges per vertex are not distributed in Poisson fashion, but that some vertices have a considerably higher degree than others. This also leads to clustering, as Fig. 10 shows.

This model predicts degree distributions that follow a power law, much like those observed in networks of natural language or, more generally, in scale-free, small-world networks.
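The growth rule can be sketched in Python as follows. This is a simplified illustration with hypothetical names and parameter values, not the original authors' code. Two assumptions are made and marked in the comments: the first new vertex picks its targets uniformly, because all initial vertices still have degree 0, and the m targets of a new vertex are forced to be distinct, which only approximates the exact proportional rule.

```python
import random
from collections import Counter


def preferential_attachment(m0, m, steps, seed=None):
    """Grow a network from m0 vertices by `steps` new vertices with m edges each."""
    if not 1 <= m <= m0:
        raise ValueError("need 1 <= m <= m0")
    rng = random.Random(seed)
    edges = []
    # One entry per edge endpoint: a uniform draw from this list selects
    # vertex i with probability k_i / sum_j k_j.
    endpoints = []
    for new in range(m0, m0 + steps):
        if endpoints:
            targets = set()
            while len(targets) < m:  # assumption: targets must be distinct
                targets.add(rng.choice(endpoints))
        else:
            # Assumption: all degrees are 0 at the start, so choose uniformly.
            targets = set(rng.sample(range(m0), m))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


m0, m, steps = 3, 2, 2000  # hypothetical parameters
edges = preferential_attachment(m0, m, steps, seed=7)
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

print("vertices:", m0 + steps, "edges:", len(edges))
print("largest degrees:", sorted(degree.values(), reverse=True)[:5])
```

In a run like this, a few early vertices typically collect far more edges than the rest, which is the rich-get-richer effect. Note that with m < m0 an initial vertex that is not chosen in the first step keeps degree 0 and can never be chosen later under this rule.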

In Fig. 11, the degree distributions of the models presented above are given. In accordance with what has been stated above, neither the degree distribution shown in Fig. 11(a) nor the one in Fig. 11(b) follows a power law. Only the model of Barabási and Réka (1999), Fig. 11(c), shows a degree distribution of the kind found in most natural networks, such as social networks, and, as will be shown in later chapters, in ontologies.

¹⁹ In protein networks, for example, scale-free distribution follows from preferential attachment in the

Figure 10: A Barabási and Réka graph with 100 vertices.

Language Network Models (Steyvers and Tenenbaum)

Steyvers and Tenenbaum compare the Barabási and Réka model to findings in natural language networks²⁰ and find that preferential attachment as proposed by Barabási and Réka (1999) does not explain the structure of semantic networks.

From a language evolution²¹ point of view, Steyvers and Tenenbaum argue that “[w]ords that enter the network early are expected to show higher connectivity” (Steyvers and Tenenbaum, 2005, p. 44). In addition, existing complex concepts in a language (i.e., those with a high connectivity or degree) are more likely to be differentiated over time. This means that complex and very broad terms tend to be differentiated into narrower terms. There are, of course, many other processes that could be discussed concerning the evolution of a natural language, but this process leads to concepts with a high connectivity (i.e., hubs or authorities) and thereby to a small-world network.

²⁰ Unfortunately, they restrict themselves to an analysis of networks and ignore the large amount of linguistic research in the areas of language evolution and language change.

²¹ Steyvers and Tenenbaum’s model could just as well be applicable to individual language acquisition, even though it cannot account for all the diverse processes that happen during the acquisition or evolution of natural languages.

(a) Degree distribution of Erdős and Rényi random graph in Fig. 8.
(b) Degree distribution of Watts and Strogatz graph in Fig. 9(c).
(c) Degree distribution of Barabási and Réka graph in Fig. 10.
Figure 11: Degree distributions of Erdős and Rényi, Watts and Strogatz, and Barabási and Réka graphs.

The authors propose two models for the growth of language networks: the first explains the growth of an undirected network, while the second explains the growth of a directed network (e.g., a semantic relationship network of words such as WordNet or other language ontologies).

The first growth model can be formulated as follows: Given a fully connected network of size n that grows over time, at each time step t, when the network has n(t) vertices, a randomly chosen vertex i is differentiated by adding a new vertex with M connections (M < n) to randomly chosen vertices in the neighborhood of i. The effect is that the “new vertex can be thought of as differentiating the existing node, by acquiring a similar but slightly more specific pattern of connectivity” (Steyvers and Tenenbaum, 2005, p. 57).

Now the probability that vertex $i$ is chosen has to be defined. The probability $P_i(t)$ is proportional to the connectivity of the vertex $i$ (i.e., its degree):

\[ P_i(t) = \frac{k_i(t)}{\sum_{l=1}^{n(t)} k_l(t)} \]

The degree of $i$ at time $t$, $k_i(t)$, is divided by the sum of the degrees of the vertices $l = 1$ to $n(t)$ (i.e., all vertices at time $t$).

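To choose a vertex $j$ in the neighborhood $H_i$ of $i$ to which the new vertex will be connected, the probability $P_{ij}(t)$ is calculated in proportion to the utility of the corresponding node. Because $H_i$ contains $k_i(t)$ vertices, giving each of them the same utility yields:

\[ P_{ij}(t) = \frac{1}{k_i(t)} \]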

This is repeated until M vertices from $H_i$ have been chosen. Then the new vertex is connected to them. These steps are repeated until the desired network size is reached. The network produced by the model can then be compared to a real-world network of the same size.
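The steps of this first model can be sketched as follows. The sketch is a simplified reading, not the authors' implementation: the function name is hypothetical, the seed network is assumed to be fully connected with M + 1 vertices so that every neighborhood holds at least M vertices, the neighborhood of i is taken to be the vertices adjacent to i, and all neighbors are given equal utility.

```python
import random


def differentiation_growth(M, size, seed=None):
    """Grow an undirected network by differentiating existing vertices."""
    rng = random.Random(seed)
    # Assumption: start from a fully connected network of M + 1 vertices.
    adj = {v: set(range(M + 1)) - {v} for v in range(M + 1)}
    while len(adj) < size:
        vertices = list(adj)
        weights = [len(adj[v]) for v in vertices]  # degrees k_i(t)
        # Choose the vertex i to differentiate with probability k_i / sum of degrees.
        i = rng.choices(vertices, weights=weights, k=1)[0]
        # Choose M distinct vertices from the neighborhood of i, all equally likely.
        targets = rng.sample(sorted(adj[i]), M)
        new = len(adj)
        adj[new] = set(targets)
        for j in targets:
            adj[j].add(new)
    return adj


network = differentiation_growth(M=3, size=50, seed=2)  # hypothetical parameters
degrees = sorted((len(nbrs) for nbrs in network.values()), reverse=True)
print("largest degrees:", degrees[:5])
```

Because the new vertex connects only to vertices that are already neighbors of i, it inherits part of the connectivity pattern of i, which is the differentiation idea described in the text.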

The second model of network growth results in a directed network. The process is very similar to the first model; the only difference lies in how the new node is connected: the vertices it will be connected to are still chosen with the probability $P_{ij}(t)$, but the direction of the connecting edges is chosen randomly.
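The only step that changes in the directed variant can be sketched as a small helper. The function name and the parameter p_toward_new are hypothetical; the default of 0.5 is an assumption made here for illustration and is not taken from the source.

```python
import random


def orient_edges(new_vertex, targets, rng, p_toward_new=0.5):
    """Turn the links of a new vertex into directed (source, target) pairs."""
    directed = []
    for j in targets:
        if rng.random() < p_toward_new:
            directed.append((j, new_vertex))  # edge points to the new vertex
        else:
            directed.append((new_vertex, j))  # edge points away from the new vertex
    return directed


rng = random.Random(0)
arcs = orient_edges(new_vertex=10, targets=[2, 5, 7], rng=rng)
print(arcs)
```

The targets themselves would be chosen exactly as in the undirected model; only the orientation of each new edge is decided by an additional random draw.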

While this model fits the examined features of language graphs, it should be mentioned that a model of the growth or evolution of a language network should also account for the loss of words (i.e., at randomly chosen time steps the network should be pruned and poorly connected concepts should be erased). A diachronic analysis of language shows not only a differentiation of words, and hence a semantic shift, but also that concepts that are poorly connected and therefore less frequently used cease to exist in the vocabulary of a single person or even a language community (cf. Pagel et al., 2013). This property is, in my opinion, missing in the model. Also, as Dorogovtsev and Mendes (2001) point out, because of semantic shift, existing concepts should at times be rewired to other concepts.

In document On link predictions in complex networks with an application to ontologies and semantics (Page 36-42)