Modularity suffers from some known drawbacks, which must be understood in order to identify its domain of applicability and the reliability of its measures.

Figure 2.2: Visualization of the steps of the Louvain algorithm. Each pass is made of two phases: one where modularity is optimized by allowing only local changes of communities, and one where the communities found are aggregated in order to build a new network of communities. The passes are repeated iteratively until no increase of modularity is possible (Blondel et al. (2008)).

We said that a high value of modularity means that the given partition is a good one. However, a large value for the modularity maximum does not necessarily mean that a graph has community structure. Random graphs are supposed to have no community structure, as the linking probability between vertices is either constant or a function of the vertex degrees, so there is no a priori bias towards special groups of vertices. Still, random graphs may have partitions with large modularity values (Fortunato (2010)). This is due to fluctuations in the distribution of edges in the graph: in some cases the edges concentrate in subsets of the network, which can then appear as communities.
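The point about random graphs can be illustrated with a small numerical experiment. The sketch below is not taken from the works cited above: it assumes modularity in its usual form Q = sum over communities of l_c/m - (d_c/2m)^2, builds an Erdős–Rényi graph, and searches for a partition by greedily moving single vertices. The function names, the graph size, the linking probability and the seed are all illustrative assumptions, and the value of Q found depends on them.

```python
import random

def random_graph(n, p, seed):
    """Erdos-Renyi G(n, p): each pair of vertices is linked with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

def modularity(edges, community):
    """Q = sum_c [ l_c/m - (d_c/2m)^2 ] for an unweighted, undirected graph."""
    m = len(edges)
    internal, degree = {}, {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

def greedy_partition(n, edges):
    """Move single vertices to a neighbouring community while Q increases.

    Q is recomputed from scratch for every candidate move: slow but easy to check.
    """
    neighbours = {u: set() for u in range(n)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {u: u for u in range(n)}  # start with one vertex per community
    best_q = modularity(edges, community)
    improved = True
    while improved:
        improved = False
        for u in range(n):
            original = community[u]
            best_c = original
            candidates = {community[v] for v in neighbours[u]} - {original}
            for c in sorted(candidates):
                community[u] = c
                q = modularity(edges, community)
                if q > best_q + 1e-12:
                    best_q, best_c = q, c
            community[u] = best_c
            if best_c != original:
                improved = True
    return community, best_q

if __name__ == "__main__":
    edges = random_graph(60, 0.05, seed=1)
    community, q = greedy_partition(60, edges)
    print(len(set(community.values())), "communities, Q =", round(q, 3))
```

Even though the edges are placed with no preference for any group of vertices, the search ends with a clearly positive Q, which is why a large modularity value alone is not evidence of community structure.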

A more fundamental issue, raised by Fortunato and Barthelemy (2007), is the so-called resolution limit, which concerns the capability of modularity to detect communities that are small compared with the graph as a whole, even when they are well defined. If the partition with maximum modularity includes communities with total degree of the order of √m or smaller, one cannot know a priori whether they are single communities or combinations of smaller, weakly interconnected communities.
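A ring of cliques is a common way to see the resolution limit at work. The sketch below assumes modularity in its usual form Q = sum over communities of l_c/m - (d_c/2m)^2; the choice of 30 cliques of 5 vertices and all function names are illustrative assumptions. It compares the partition in which each clique is its own community with the one in which adjacent cliques are merged in pairs.

```python
from itertools import combinations

def ring_of_cliques(num_cliques, clique_size):
    """Complete subgraphs arranged in a ring, adjacent cliques joined by one edge."""
    edges = []
    for c in range(num_cliques):
        first = c * clique_size
        edges.extend(combinations(range(first, first + clique_size), 2))
        # single bridge from the last vertex of this clique to the first of the next
        next_first = ((c + 1) % num_cliques) * clique_size
        edges.append((first + clique_size - 1, next_first))
    return edges

def modularity(edges, community):
    """Q = sum_c [ l_c/m - (d_c/2m)^2 ] for an unweighted, undirected graph."""
    m = len(edges)
    internal, degree = {}, {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

num_cliques, clique_size = 30, 5
edges = ring_of_cliques(num_cliques, clique_size)
n = num_cliques * clique_size

one_per_clique = {u: u // clique_size for u in range(n)}        # the "natural" partition
merged_in_pairs = {u: u // (2 * clique_size) for u in range(n)}  # adjacent cliques merged

q_single = modularity(edges, one_per_clique)
q_pairs = modularity(edges, merged_in_pairs)
print(round(q_single, 4), round(q_pairs, 4))  # 0.8758 0.8879
```

The merged partition scores higher even though each clique is as well defined as a community can be, so a modularity maximum cannot by itself tell single communities from pairs of small ones.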

The resolution limit comes from the very definition of modularity, in particular from its null model. The weak point of the null model is the implicit assumption that each vertex can interact with every other vertex, which implies that each part of the graph knows about everything else (Fortunato (2010)).

The resolution limit problem seems to be circumvented in Blondel et al. (2008) thanks to the intrinsic multi-level nature of the algorithm. Since the first phase of the method involves the displacement of single nodes from one community to another, the probability that two distinct communities can be merged by moving nodes one by one is very low. These communities may possibly be merged in the following steps, after blocks of nodes have been aggregated. However, the algorithm provides a decomposition of the network into communities for different levels of organization so that one can observe its structure with the desired resolution.
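The two phases and the resulting hierarchy of levels can be sketched in a few functions. This is a simplified illustration under stated assumptions, not the reference implementation of Blondel et al. (2008): nodes are visited in a fixed order, the graph is stored as a symmetric dictionary of weights in which a self-loop carries twice the internal weight of the community it replaces, and all function names are hypothetical.

```python
from itertools import combinations

def from_edges(edges):
    """Symmetric weighted adjacency dictionary for an unweighted edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})[v] = 1
        adj.setdefault(v, {})[u] = 1
    return adj

def modularity(adj, membership):
    """Q = sum_c [ S_in/2m - (S_tot/2m)^2 ] computed from the adjacency weights."""
    two_m = sum(sum(nb.values()) for nb in adj.values())
    inside, total = {}, {}
    for u, nb in adj.items():
        cu = membership[u]
        for v, w in nb.items():
            total[cu] = total.get(cu, 0) + w
            if membership[v] == cu:
                inside[cu] = inside.get(cu, 0) + w
    return sum(inside.get(c, 0) / two_m - (t / two_m) ** 2 for c, t in total.items())

def local_moving(adj):
    """Phase 1: move single nodes between communities while modularity increases."""
    two_m = sum(sum(nb.values()) for nb in adj.values())
    degree = {u: sum(nb.values()) for u, nb in adj.items()}
    community = {u: u for u in adj}
    total = dict(degree)  # summed degree of each community
    moved = True
    while moved:
        moved = False
        for u in adj:
            old = community[u]
            total[old] -= degree[u]  # take u out of its community
            links = {}  # weight between u and each neighbouring community
            for v, w in adj[u].items():
                if v != u:
                    links[community[v]] = links.get(community[v], 0) + w
            # gain of joining c, up to a constant factor: k_u,c - S_tot(c) * k_u / 2m
            best, best_gain = old, links.get(old, 0) - total[old] * degree[u] / two_m
            for c, w in links.items():
                gain = w - total[c] * degree[u] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            community[u] = best
            total[best] += degree[u]
            if best != old:
                moved = True
    return community

def aggregate(adj, community):
    """Phase 2: collapse each community into one node of a new weighted graph."""
    new_adj = {}
    for u, nb in adj.items():
        row = new_adj.setdefault(community[u], {})
        for v, w in nb.items():
            row[community[v]] = row.get(community[v], 0) + w
    return new_adj

def louvain(adj):
    """Repeat both phases; return one node -> community map per level."""
    levels = []
    membership = {u: u for u in adj}
    while True:
        community = local_moving(adj)
        if len(set(community.values())) == len(adj):
            break  # no node moved, so modularity cannot be increased further
        membership = {u: community[c] for u, c in membership.items()}
        levels.append(dict(membership))
        adj = aggregate(adj, community)
    return levels

# Two cliques of four nodes joined by a single edge.
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2)) + [(3, 4)]
graph = from_edges(edges)
levels = louvain(graph)
print(len(levels), round(modularity(graph, levels[-1]), 4))  # 1 0.4231
```

On this toy graph a single level is enough and the two cliques are recovered. On larger networks the returned list has one entry per pass, which is the decomposition at different levels of organization mentioned above.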

Chapter 3 Data

Since 1973, when currencies began to be traded in financial markets and their values determined by the foreign exchange market, the volume of foreign exchange trading has been growing at an impressive rate. The transaction volume in 1995 was 80 times what it was in 1973. In the 1980s electronic trading, already a part of the environment of the major stock exchanges, was adapted to the foreign exchange market.

Until recently, physicists investigated economic systems and problems only occasionally. A growing number of physicists, however, are now becoming involved in the analysis of economic systems.

Financial markets are, indeed, remarkably well-defined complex systems, which are continuously monitored, down to time scales of seconds. Further, virtually every economic transaction is recorded, and an increasing fraction of the recorded economic data is becoming accessible to interested researchers, making financial markets extremely attractive for researchers interested in developing a deeper understanding of the modeling of complex systems (Mantegna et al. (2000)).

Space is also central to the work of economic institutions, providing the framework for survey design, sample selection, data collection, tabulation, and dissemination. Geography provides meaning and context to statistical data.

Given the diversity of population, economic activities, and geographic areas considered when dealing with economic or financial datasets, a spatial framework is critical to gaining real insight into the data. Therefore, geographic area concepts, information, and statistical data must keep pace with the needs of the researchers and analysts who work to understand the changing distribution and characteristics of people, places and economy.

With this in mind, the following sections describe in detail the data used in the papers reviewed in this thesis.

3.1 Sardinian Inter-municipal Commuting Network (SMCN)

Sardinia is the second largest Mediterranean island, with an area of approximately 24,000 square kilometers and 1,600,000 inhabitants. In 1991, when the census was carried out, the island was partitioned into 375 municipalities, the second simplest body in the Italian public administration, each generally corresponding to a major urban centre (in Figure 3.1 we report the geographical distribution of the municipalities).

Figure 3.1: Geographical representation of the Sardinian inter-municipal commuting network (SMCN).

For the whole set of municipalities, the Italian National Institute of Statistics IST (1991) has issued the origin-destination table (OD) corresponding to the commuting traffic at the inter-city level. The OD is constructed from the output of a survey of the commuting behavior of Sardinian citizens. This survey refers to the daily movement from the habitual residence (the origin) to the most frequent place of employment (the destination): the data comprise both the means of transportation used and the time usually spent travelling. Hence, OD data give access to the flows of people regularly commuting among the Sardinian municipalities.
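To make the step from an origin-destination table to a network concrete, the sketch below turns a handful of OD records into a directed weighted graph. The records, the municipality labels "A", "B", "C" and the function names are invented for illustration and are not census data. Keeping the graph directed and dropping flows within a municipality are also assumptions, not choices stated in the text.

```python
from collections import defaultdict

# Hypothetical OD records: (origin, destination, daily commuters). Values are invented.
od_records = [
    ("A", "B", 120),
    ("B", "A", 30),
    ("A", "C", 15),
    ("C", "B", 40),
    ("B", "B", 500),  # residents working in their own municipality
]

def build_commuting_network(records):
    """Directed weighted graph: flows[origin][destination] = number of commuters."""
    flows = defaultdict(dict)
    for origin, destination, commuters in records:
        # keep inter-municipal flows only (assumption: the diagonal is dropped)
        if origin != destination and commuters > 0:
            flows[origin][destination] = flows[origin].get(destination, 0) + commuters
    return dict(flows)

def strengths(flows):
    """Total outgoing and incoming commuters for each municipality."""
    out_strength = {origin: sum(row.values()) for origin, row in flows.items()}
    in_strength = defaultdict(int)
    for row in flows.values():
        for destination, commuters in row.items():
            in_strength[destination] += commuters
    return out_strength, dict(in_strength)

flows = build_commuting_network(od_records)
out_strength, in_strength = strengths(flows)
print(flows)         # {'A': {'B': 120, 'C': 15}, 'B': {'A': 30}, 'C': {'B': 40}}
print(out_strength)  # {'A': 135, 'B': 30, 'C': 40}
print(in_strength)   # {'B': 160, 'C': 15, 'A': 30}
```

Each municipality becomes a vertex and each non-zero OD entry a weighted link, so the vertex strengths give the number of commuters leaving and entering each municipality.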