INTRODUCTION
Table of contents
1.1 Context of the Thesis
1.2 Research Questions and Approach
1.2.1 Problem formulation of the thesis
1.2.2 Approach and achievements
1.3 Organization of the Document
1.1 Context of the Thesis
Today, the agent-based simulation approach is increasingly used to develop simulation systems in quite different fields: (1) natural resources management, e.g., an agent-based simulation system consisting of a set of management processes (i.e., water, land, money, and labour forces) built to simulate catchment water management in the north of Thailand (Becu et al., 2003), or the agent-based modeling used to develop a water resource management and assessment system that combines spatiotemporal models of ecological indicators such as rainfall and temperature, water flow and plant growth (Gaudou et al., 2013); (2) biology, e.g., models for the study of epidemiology or the evacuation of injured persons (Amouroux et al., 2008; Dunham, 2005; Rao et al., 2009; Stroud et al., 2007), or the study of rice pest invasions (Nguyen et al., 2011; Phan et al., 2010; Truong et al., 2011); (3) economics, e.g., customer flow management (Julka et al., 2002), stock market and strategic simulation (Chen and Yeh, 2001), simulated market networks (Galtier et al., 2012), or operational management of risks and organizational design (Giannakis and Louis, 2011); and (4) sociology, e.g., a multi-agent system to discover how social actors could behave within an organization (Sibertin-Blanc et al., 2013) or an agent-based simulation model to explore rules for rural credit management (Barnaud et al., 2008).
In building such systems, we are concerned not only with the modeling-driven approach, that is, how to model and combine coupled models from different scientific fields, but also with the data-driven approach, that is, how to use empirical data collected from the target system in modeling, simulation and analysis (Edmonds and Moss, 2005; Hassan et al., 2010a, 2010b, 2008). The main idea behind such simulation systems is to combine and couple information available from various data sources with knowledge from scientific fields (such as water management, climate sciences, sociology, economics and epidemiology). Such information mainly takes the form of empirical data gathered from the target system, and these data can be used in processes such as the design, initialization, calibration and validation of models (cf. Chapters 4 and 5). That raises the question about how to manage
Indeed, the current challenge of agent-based modeling (ABM) is data management (Section 2.3.2), because of the weakness of agent-based platforms in this area, addressed in Section 2.3.3. The basic observation is that, while the design and simulation of models have benefited from advances in computer science through the widespread use of simulation platforms such as NetLogo (Wilensky, 1999) and GAMA (Taillandier et al., 2012), this is not yet the case for the management of data, which are still handled in an ad hoc manner despite advances in the management of huge datasets (data warehousing, for instance). This situation is all the more problematic given the recent tendency toward data-driven approaches in simulation, which aim at using more and more field data in simulated models (Edmonds and Moss, 2005; Hassan, 2009).
Therefore, there is a clear need for a robust solution for managing huge datasets in agent-based simulation systems: data management tools are currently needed in these systems, and database management is an important technology for them.
1.2 Research Questions and Approach
1.2.1 Problem formulation of the thesis
In my research, the first question I tackle is: "What general architecture could serve the following purposes: model and execute multi-agent simulations, manage the input and output data of simulations, integrate data from different sources, and enable the analysis of high volumes of data?" To solve this problem, I examined several research studies and solutions related to the simulation, management and analysis of big data. I argue that BI (Business Intelligence) solutions are a good way to handle and analyze big datasets. Because a BI solution contains a data warehouse, data integration tools (ETL, Extract-Transform-Load tools) and Online Analytical Processing tools (OLAP tools), it is well suited to managing, integrating, analyzing and presenting huge amounts of data (Mahboubi et al., 2010; Vasilakis and El-Darzi, 2004). My answer to the first question is the logical framework proposed in Section 3.2 of Chapter 3.
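To make the roles of the three BI components more concrete, the following minimal sketch loads hypothetical empirical and simulated records into a tiny in-memory warehouse (a small ETL step) and then aggregates them by region and data source (an OLAP-style roll-up). It is only an illustration under stated assumptions: the table names, column names, region names and density values are invented for the example, SQLite stands in for a real data warehouse, and nothing here reproduces the framework of Chapter 3.

```python
import sqlite3

# Hypothetical records: (source, region, day, insect_density).
# All names and values are illustrative assumptions, not thesis data.
EMPIRICAL_ROWS = [
    ("empirical", "district_a", 1, 120.0),
    ("empirical", "district_a", 2, 150.0),
    ("empirical", "district_b", 1, 80.0),
]
SIMULATED_ROWS = [
    ("simulated", "district_a", 1, 110.0),
    ("simulated", "district_a", 2, 160.0),
    ("simulated", "district_b", 1, 90.0),
]


def build_warehouse(conn):
    """Create a minimal star schema: one dimension table, one fact table."""
    conn.execute(
        "CREATE TABLE dim_region ("
        "region_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE fact_density ("
        "source TEXT, region_id INTEGER REFERENCES dim_region(region_id), "
        "day INTEGER, density REAL)"
    )


def load(conn, rows):
    """ETL step: map each region name to its dimension key, then load the fact."""
    for source, region, day, density in rows:
        conn.execute(
            "INSERT OR IGNORE INTO dim_region (name) VALUES (?)", (region,)
        )
        region_id = conn.execute(
            "SELECT region_id FROM dim_region WHERE name = ?", (region,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO fact_density VALUES (?, ?, ?, ?)",
            (source, region_id, day, density),
        )


def mean_density_by_region(conn):
    """OLAP-style roll-up: average density per region and per data source."""
    query = """
        SELECT r.name, f.source, AVG(f.density)
        FROM fact_density AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        GROUP BY r.name, f.source
        ORDER BY r.name, f.source
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
load(conn, EMPIRICAL_ROWS)
load(conn, SIMULATED_ROWS)
summary = mean_density_by_region(conn)
for region, source, mean_density in summary:
    print(region, source, mean_density)
```

Storing empirical and simulated observations in the same fact table, distinguished only by a source column, is one possible design choice; it lets a single query place both kinds of data side by side.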
The second problem that needs to be solved in my research is: "How to introduce DWH (data warehouse) and OLAP technologies into a multi-agent based simulation system that has to face huge amounts of data?" The solution I propose in this thesis is to improve agent-based platforms by adding new features, such as deep interactions with data warehouse systems, as presented in Section 3.3 of Chapter 3.
1.2.2 Approach and achievements
In this thesis, I propose a solution to handle the input and output of agent-based simulation models. The solution combines two aspects. The first deals with the status of the data: a suitable solution should be able to manage both empirical data gathered from the studied phenomenon or system and simulated data produced by simulations, considered as in silico experiments on the same system. The second aspect concerns the use of a Business Intelligence (BI) solution, envisaged as a system of data warehouse and analysis tools. A data warehouse includes a collection of data that supports decision-making processes (Inmon, 2005). Analysis tools may include data mining, statistical analysis, predictive analysis and so on. The services of a BI solution help us manage huge amounts of historical data and perform various analyses on such data.
I organized my research as follows:
First, I studied the state of the art on the two aspects (multi-agent simulation platforms and business intelligence solutions) and reviewed related work on the management and analysis of the input/output data of simulation models. The contribution of this first step is the state of the art of my research, which provides the background knowledge for the next steps.
Second, I proposed a conceptual framework that helps to manage the input and output of both simulation models and analysis models, and to aggregate empirical and simulated data, which can be used for calibration and validation. The result of the second step is an article concerning the global architecture of our combined framework of multi-agent simulation platform and BI solution.
Third, I implemented the logical framework on the GAMA platform and provided a use case that applies the framework to build the Brown Plant Hopper (BPH) prediction model. The contribution of this step is an article presenting the concrete implementation of the logical framework (step 2) on the GAMA platform and an application of the framework.
Fourth, I applied the framework to the calibration and validation of a simulation model to demonstrate the benefits of the proposed framework. The result of this step is an article about the application of the proposed framework to calibrate and validate an agent-based simulation model.
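As a rough illustration of what calibrating a simulation model against empirical data can look like, the sketch below runs a toy model for several candidate parameter values and keeps the value whose simulated series is closest to the observed series, measured by root-mean-square error (RMSE). Everything in it is an assumption made for the example: the geometric-growth model, the candidate growth rates and the observed counts are hypothetical, and a simple grid search stands in for whatever calibration procedure is actually used later in the thesis.

```python
import math


def simulate(growth_rate, initial=100.0, steps=5):
    """Toy stand-in for a simulation run: geometric population growth."""
    series = [initial]
    for _ in range(steps - 1):
        series.append(series[-1] * growth_rate)
    return series


def rmse(simulated, observed):
    """Root-mean-square error between a simulated and an observed series."""
    squared_errors = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared_errors) / len(observed))


def calibrate(observed, candidates):
    """Grid search: return the candidate rate with the lowest RMSE."""
    return min(
        candidates,
        key=lambda rate: rmse(
            simulate(rate, observed[0], len(observed)), observed
        ),
    )


# Hypothetical field counts and candidate parameter values.
OBSERVED = [100.0, 121.0, 143.0, 175.0, 208.0]
CANDIDATES = [1.0, 1.1, 1.2, 1.3]

best_rate = calibrate(OBSERVED, CANDIDATES)
best_error = rmse(simulate(best_rate, OBSERVED[0], len(OBSERVED)), OBSERVED)
print("best growth rate:", best_rate, "RMSE:", round(best_error, 3))
```

In a setting like the one described above, the observed and simulated series would come from the data warehouse rather than from in-code constants.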
1.3 Organization of the Document
The thesis is organized into the following six chapters:
Chapter 1: INTRODUCTION
This chapter presents the problem of data management in agent-based simulation and the approach adopted to address the research questions. In addition, it presents the important notations and the links between the chapters of the thesis, as a theoretical framework for the reader.