Dependency Provenance in Agent Based Modeling

Preprint Version

Forthcoming in IEEE eScience 2013

Peng Chen

School of Informatics and Computing Indiana University

[email protected]

Beth Plale

School of Informatics and Computing Indiana University [email protected]

Tom Evans

Department of Geography Indiana University [email protected]

Abstract: Researchers who use agent-based models (ABM) to model social patterns often focus on the model’s aggregate phenomena. However, aggregation of individuals complicates the understanding of agent interactions and the uniqueness of individuals. We develop a method for tracing and capturing the provenance of individuals and their interactions in the NetLogo ABM, and from this create a “dependency provenance slice”, which combines a data slice and a program slice to yield insights into the cause-effect relations among system behaviors. To cope with the large volume of fine-grained provenance traces, we propose use-inspired filters to reduce the amount of provenance, and a provenance slicing technique called “non-preprocessing provenance slicing” that directly queries over provenance traces without recovering all provenance entities and dependencies beforehand. We evaluate performance and utility using a well-known ecological NetLogo model called “wolf-sheep-predation”.

I. INTRODUCTION

An agent-based model (ABM) is a computerized simulation of distributed decision-makers (agents) who interact through prescribed rules. ABM has been effective in studies of complex adaptive spatial systems (CASS) [6] and ecological modeling [4] because of its ability to represent heterogeneous individuals and the interactions among them. Researchers using agent-based models often focus on the emergent outcomes of aggregated behaviors. However, methodological individualism, the fundamental philosophy of ABM, warns that this focus may yield misleading results, and advocates attention to the uniqueness of individuals and their interactions [4]. In addition, the interactions among agents are inherently associated with cause-effect relations in the transition of system states through time. These cause-effect relations are necessary for understanding complex processes, but elucidating them poses a significant challenge to the research community because of the complex dynamics in ABM [6].

The utility of provenance information has been established in many scientific domains; of most relevance to this paper are computer science [13] and geographic information systems [14]. Provenance, which is a type of metadata, is the lineage of a data product or process: its creator, contributing processes, interactions, and data sources. In the context of agent-based modeling, provenance captures state changes in individual agents and interactions through time. These state changes reveal every unique individual behavior, but produce provenance of a volume that is computationally and analytically challenging for models with large numbers of agents. Cheney et al. [10][9] argue that dependency analysis techniques used in program slicing can be a formal foundation for provenance that is intended to show how (part of) the output of a query depends on (parts of) its input. This so-called dependency provenance is different from where-provenance and data lineage, but similar to how-provenance or why-provenance [7] in that it identifies a data slice showing the input data relevant to the output data. However, we argue that it is also important to consider the part of program execution that is relevant to the data dependency: the program slice. We demonstrate that the combination of data slice and program slice, which we call a provenance slice, can explain how and why the output data depends on the input data, and can yield insight into cause-effect relations among system behaviors.

NetLogo [16] is an agent-based modeling platform in wide use in research and education worldwide. NetLogo has its own Logo programming language, containing high level primitives for performing batch operations over a group of agents. Capturing the provenance of complex system behaviors in NetLogo requires collecting information about the execution of every statement, which can generate huge amounts of fine-grained provenance that poses a big challenge for storage and subsequent querying. Pignotti et al. [12] discuss typical queries over a provenance record in ABM, and find that as the number of simulation runs and agents in the simulation increases, these queries become exponentially complex.

In this paper we propose a dependency provenance model for ABM. The utility of the approach is verified by means of a tool we built for tracing, capturing, storing and exposing the dependency provenance from NetLogo. Dependency provenance itself provides deeper understanding of an ABM simulation through answers to questions such as “How did the simulation evolve?”, “What changes occur after changing a parameter’s value?”, “How influential is a parameter?”, and “Which parameter is the most influential for a particular type of agent?”. We propose several use-inspired filters that dramatically reduce the amount of persistent storage, and propose a technique that we call non-preprocessing (NP) provenance slicing that avoids recovering entities and dependencies from the provenance traces that are irrelevant to the query. We demonstrate the utility of the techniques using a classical ABM model for ecological studies called “wolf-sheep-predation” [15].

The remainder of the paper is organized as follows: Section II reviews related work. Section III describes the dependency provenance model in ABM and Section IV discusses the capture of dependency provenance in NetLogo. Section V introduces some use-inspired filters and Section VI introduces the non-preprocessing provenance slicing technique. The experimental evaluation with the NetLogo model “wolf-sheep-predation” is presented in Section VII. Section VIII demonstrates some additional use cases of dependency provenance in analyzing agent-based simulation. Section IX concludes the paper and discusses future work.

II. RELATED WORK

Bennett [6] illustrates the importance of explicitly considering provenance in agent-based modeling through the development of a spatially explicit agent-based land use simulation framework. While their research is speculative, we implement automated provenance tracing and capture for NetLogo and demonstrate some example provenance queries that can help understand and analyze the simulation.

Systems that gather fine-grained provenance metadata must process large amounts of information, and there is some existing research on provenance filtering. SPADE [11] is an open source software platform that supports collecting, filtering, storing, and querying provenance metadata. SPADE provides a framework for implementing filters (that can be stacked in arbitrary order). A filter receives a stream of provenance graph vertices and edges, and can rewrite their annotations (in which domain-specific semantics are embedded). In contrast, our research studies how to filter provenance traces generated by probes in the agent-based model, and proposes several use-inspired filters that keep the provenance traces relevant to particular provenance queries.

Program slicing is a well-explored technique in software engineering. Intuitively, program slicing attempts to provide a concise explanation of a bug or anomalous program behavior, in the form of a fragment of the program that shows only those parts “relevant” to the bug or anomaly. Cheney [9] argues that there is a compelling analogy between program slicing and data provenance and defines the dependency provenance of an output as the set of all input fields on which it depends (a data slice). However, the data slice alone does not explain why there are dependences, thus we propose the provenance slice as a combination of data slice and its related program fragment (program slice).

Typical precise dynamic slicing needs to build a dependence graph from the program’s execution trace (preprocessing) before slicing. However, the dependence graph can be extremely large and exhaust memory, and the same holds for the provenance graph if a query needs to recover all dependencies beforehand. Zhang et al. [17] present the design and evaluation of three precise dynamic slicing algorithms called the full preprocessing (FP), no preprocessing (NP) and limited preprocessing (LP) algorithms. These algorithms differ in the relative timing of constructing the dynamic data dependence graph and its traversal for computing requested dynamic slices. The NP algorithm does not perform any preprocessing; instead, during slicing it uses demand-driven analysis that recovers dynamic dependencies and caches the recovered dependencies for potential future reuse. We derive the non-preprocessing (NP) provenance slicing technique from this algorithm.
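The demand-driven idea behind the NP algorithm can be illustrated with a minimal Python sketch. It is not the implementation of Zhang et al. [17]: the trace layout (one `(written, reads)` pair per record, with versioned entity names such as `energy@2`) and the names `recover_deps` and `np_backward_slice` are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical trace layout: (entity written, entities read to compute it).
Trace = List[Tuple[str, List[str]]]


def recover_deps(trace: Trace, entity: str, cache: Dict[str, List[str]]) -> List[str]:
    """Recover the direct dependencies of one entity on demand."""
    if entity not in cache:
        cache[entity] = []
        # Scan backward for the record that wrote this entity version.
        for written, reads in reversed(trace):
            if written == entity:
                cache[entity] = list(reads)
                break
    return cache[entity]


def np_backward_slice(
    trace: Trace, target: str, cache: Optional[Dict[str, List[str]]] = None
) -> Set[str]:
    """Collect every entity the target transitively depends on,
    without building the full dependence graph beforehand."""
    cache = {} if cache is None else cache
    visited: Set[str] = set()
    worklist = [target]
    while worklist:
        entity = worklist.pop()
        for dep in recover_deps(trace, entity, cache):
            if dep not in visited:
                visited.add(dep)
                worklist.append(dep)
    return visited
```

Passing the same `cache` dictionary to later queries reuses the dependencies already recovered, while records that no query reaches are never expanded.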

Pignotti et al. [12] investigate the role of provenance in agent-based simulation and discuss typical queries over a provenance record. They describe three types of provenance that can be recorded from a simulation: the provenance about model development, the provenance about executing the model and the provenance about simulation. The dependency provenance that we explore is close to the provenance about simulation in their classification. However, they capture the provenance of simulations as actions performed by agents in discrete-event modeling, while we trace the dependency provenance by source code instrumentation. They conclude that as the number of simulation runs and agents in the simulation increases, the queries become exponentially complex; we address this scalability problem by first using the use-inspired filters to drop irrelevant provenance traces and then applying the non-preprocessing (NP) provenance slicing technique to avoid recovering irrelevant provenance.

III. DEPENDENCY PROVENANCE IN ABM

Dependency provenance is the information relating each part of the output of a query to a set of parts of the input on which the output depends. Dependency provenance can be used to compute data slices, or summaries of the parts of the input relevant to a given part of the output [10]. We want to define dependency provenance in ABM similar to this, but we also want to compute the program slice (relevant processes) as a complement to the data slice. While the data slice tells what the relevant input data is, the program slice tells how the output data depends on the input data. Specifically, the dependency provenance in ABM that we propose contains the information of:

• All data products and their dependencies;

• Procedures associated with these dependencies.

We choose W3C PROV [5] to record the dependency provenance in ABM (specifically from the NetLogo model), because PROV allows us to express the provenance of agents and the evolution of a variable. Unlike in its predecessor OPM, an agent in PROV is also a particular type of entity, and PROV has the concept of versions. The mapping of ABM provenance to PROV is accomplished by first identifying the entities and dependencies in a NetLogo program, then mapping concepts in PROV to them (see Table I).

Note that in our mapping, the agent in ABM is mapped to an agent in PROV, because an agent in ABM has attributes and carries out actions that match the definition of an agent in PROV. We define the state of a variable as an entity to capture its evolution over time. Since a procedure in NetLogo can access global variables and any other agent without explicitly accepting them as input parameters, we need to capture the accurate dependencies at the statement level of the NetLogo program. However, we do not model the execution of a statement as an activity, as doing so can produce an overwhelming number of activities. Instead, we decide that the procedure-level information is a good abstraction to explain why the output data depends on some input data.
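As a rough illustration of treating each state of a variable as a separate entity, the following Python sketch records one write at a time and emits PROV-style relations at the procedure level. The class `ProvStore`, the `name@version` naming scheme and the tuple layout are hypothetical; they are not taken from PIN or from the PROV specification. As a simplification, a variable derived from itself yields only wasRevisionOf here, and an activity is identified by its procedure name rather than by an individual invocation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProvStore:
    # Latest version number of each variable (version 0 = initial state).
    versions: Dict[str, int] = field(default_factory=dict)
    # Recorded relations as (relation, subject, object) tuples.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def current(self, var: str) -> str:
        """Name of the entity for the variable's current state."""
        return f"{var}@{self.versions.get(var, 0)}"

    def write(self, var: str, reads: List[str], procedure: str) -> str:
        """Record that `procedure` wrote `var` after reading `reads`."""
        # Resolve the states that were read before creating the new version.
        sources = [self.current(r) for r in reads]
        self.versions[var] = self.versions.get(var, 0) + 1
        new_entity = self.current(var)
        self.relations.append(("wasGeneratedBy", new_entity, procedure))
        for name, source in zip(reads, sources):
            self.relations.append(("used", procedure, source))
            kind = "wasRevisionOf" if name == var else "wasDerivedFrom"
            self.relations.append((kind, new_entity, source))
        return new_entity
```

For instance, `ProvStore().write("energy", ["energy"], "reproduce-sheep")` mirrors the statement `set energy (energy / 2)` inside `reproduce-sheep`: a new state `energy@1` is generated by the procedure and is a revision of `energy@0`.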

TABLE I. MAPPINGS FROM PROV ONTOLOGY TO NETLOGO CONCEPTS

| Concept in PROV | Concept in NetLogo | Code example |
|---|---|---|
| Agent | Agent | breed [wolves wolf] ;; wolf is an agent |
| Activity | Execution of a procedure | ask sheep [ move . . . ] ;; move is an activity |
| Entity | State of a global/agent/local variable | set color white ;; the current state of color (an agent variable) is an entity |
| Relationship: used | 1) Procedure reads the current value of a variable | to reproduce-sheep . . . set energy (energy / 2) . . . end ;; the activity “reproduce-sheep” used the entity “energy” |
| | 2) Procedure depends on the current value of a variable | if grass? [ . . . eat-grass ;; the activity “eat-grass” used the entity “grass?” |
| Relationship: wasGeneratedBy | Procedure writes the value of a variable | to reproduce-sheep . . . set energy (energy / 2) . . . end ;; the entity “energy” was generated by activity “reproduce-sheep” |
| Relationship: wasDerivedFrom | 1) A statement reads var2 before writing var1 | set energy random (2 * wolf-gain-from-food) ;; the entity “energy” was derived from entity “wolf-gain-from-food” |
| | 2) A statement writing var1 depends on var2 | if grass? [ set energy energy - 1 ;; the entity “energy” was derived from entity “grass?” |
| Relationship: wasRevisionOf | If a variable was derived from itself | to reproduce-sheep . . . set energy (energy / 2) . . . end ;; the entity “energy” was a revision of itself |
| Relationship: wasInformedBy | A procedure is invoked inside another procedure | to go . . . ask sheep [ move . . . end ;; activity “move” was informed by activity “go” |
| Relationship: wasAssociatedWith | A procedure is invoked by an agent | ask sheep [ move ;; the activity “move” was associated with a sheep agent |
| Relationship: wasAttributedTo | Variable belongs to an agent | sheep-own [energy] ;; the entity “energy” was attributed to a sheep agent |
| Relationship: alternateOf | A variable is a reference to an agent | let prey one-of sheep-here ;; the entity “prey” is an alternate of a sheep agent (an agent is also an entity in PROV) |

IV. PROVENANCE TRACING

Provenance of a NetLogo model is captured through the process of adding probes to the model source code. These probes generate provenance traces. We propose three types of probes for this purpose, probes that log:

• Procedure invocations (Type 1),

• Write and read operations (Type 2), and

• Conditional statements (Type 3).

Type 1 probes generate information used to compute the program slice. Type 2 and Type 3 probes produce provenance about data dependencies that is used to compute the data slice. The open source tool that we built to implement our abstractions is called PIN (an acronym for “Provenance in NetLogo”) [2]. It can trace, capture, query and visualize the dependency provenance in NetLogo. It consists of four main components: a source code analyzer used to automatically add probes to the model’s source code, a NetLogo extension for capturing the provenance traces generated from probes, a non-preprocessing (NP) provenance slicing technique for computing provenance slices using provenance traces, and a visualization component for visualizing the provenance slices. Figure 1 shows how the NetLogo provenance tool works. The tool is compatible with NetLogo version 5.0.3.

Fig. 1. PIN overview: Red rectangles represent major components, and blue document charts represent input and output of each component

TABLE II. CODE SNIPPET BEFORE AND AFTER INSTRUMENTATION

Code snippet before instrumentation:

    if grass? [
      ask patches [
        set countdown random grass-regrowth-time
        set pcolor one-of [green brown]
      ]
    ]

After instrumentation:

    if grass? [
      provenance:write (word "dependsOn global grass? " grass?)
      ask patches [
        provenance:write "recordStart"
        provenance:write (word "read global grass-regrowth-time " grass-regrowth-time)
        set countdown random grass-regrowth-time
        provenance:write (word "write agent countdown " countdown)
        provenance:write "recordEnd"
        provenance:write "recordStart"
        set pcolor one-of [green brown]
        provenance:write (word "write agent pcolor " pcolor)
        provenance:write "recordEnd"
      ]
      provenance:write "End dependsOn"
    ]

Table II gives a sample instrumented code snippet. Note that the primitive “provenance:write” is implemented in PIN’s NetLogo extension to write an input string to the log file and each probe passes a provenance trace as a string into it; for each statement in the original source code, we add different probes for its reading/writing operations on different variables and use two extra probes to enclose all of them.
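To make the role of the enclosing probes concrete, the following Python sketch turns trace strings of the shape produced in Table II into “was derived from” pairs: every write inside a `recordStart`/`recordEnd` pair depends on the reads logged before it in that record and on the variables of all enclosing `dependsOn` probes. This is only a sketch under assumptions: the function name `derive_dependencies` is hypothetical, the values in the usage example are invented, and the context information that PIN’s extension attaches to each trace (for example the executing agent) is ignored.

```python
from typing import List, Tuple


def derive_dependencies(lines: List[str]) -> List[Tuple[str, str]]:
    """Return (written variable, source variable) pairs from trace lines."""
    deps: List[Tuple[str, str]] = []
    control: List[str] = []  # variables of enclosing dependsOn probes
    reads: List[str] = []    # variables read so far in the current record
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue
        if parts[0] == "recordStart":
            reads = []  # a new statement begins
        elif parts[:2] == ["End", "dependsOn"]:
            control.pop()  # leave the innermost conditional
        elif parts[0] == "dependsOn":
            control.append(parts[2])  # "dependsOn <scope> <name> <value>"
        elif parts[0] == "read":
            reads.append(parts[2])  # "read <scope> <name> <value>"
        elif parts[0] == "write":
            for source in reads + control:
                deps.append((parts[2], source))
    return deps
```

Applied to the trace that the instrumented snippet would emit for one patch, the sketch reports that `countdown` was derived from `grass-regrowth-time` (data dependency) and from `grass?` (conditional dependency), and that `pcolor` was derived from `grass?` only.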

To understand the scale of provenance traces generated and the overhead introduced by our method, we perform an empirical study using a classic NetLogo model for ecologists called “wolf-sheep-predation”. The model, designed by Uri Wilensky [15], explores the stability of predator-prey ecosystems. A system is stable if it tends to maintain itself over time, despite fluctuations in population sizes. Similarly, a system is unstable if it tends toward extinction of one or more of the species involved. The “wolf-sheep-predation” model has two variations. In the first variation, wolves and sheep wander the landscape randomly while wolves look for sheep to prey on. Each step costs a wolf energy, and wolves must eat sheep in order to replenish their energy. When a wolf runs out of energy, it dies. Each wolf or sheep has a fixed probability of reproducing at each time step. This variation produces interesting population dynamics, but is ultimately unstable. The second variation includes grass in addition to wolves and sheep. The behavior of the wolves is identical to the first variation, but this time the sheep must eat grass in order to maintain their energy; when they run out of energy they die. Once grass is eaten it will only regrow after a fixed amount of time. This variation is more complex than the first, but it is generally stable. We choose the first variation to study the scale of provenance under conditions where the model is unstable, and run it for up to 400 iterations.

TABLE III. SIZE OF PROVENANCE LOG AFTER CERTAIN NUMBER OF ITERATIONS

| Iterations | 10 | 50 | 100 | 200 | 300 | 400 |
|---|---|---|---|---|---|---|
| Number of sheep | 129 | 338 | 170 | 38 | 2,413 | 112,402 |
| Number of wolves | 63 | 82 | 349 | 1 | 1 | 116 |
| Log size | 523KB | 3.63MB | 12.1MB | 20.4MB | 35.3MB | 773MB |
| Number of traces | 20,793 | 144,590 | 475,061 | 799,677 | 1,351,128 | 27,860,316 |

In Table III, we show the size of the provenance log and the agent population after 10, 50, etc. iterations. The size of the captured provenance depends on both the number of iterations and the size of the agent population. The provenance size increases dramatically at 400 iterations, which is because the wolves almost die out and the number of sheep has increased exponentially by 400 iterations. This means that, for simulations that have a large population of agents or need to run for a long period of time, the size of provenance traces captured using our method can be overwhelming. To cope with this, we propose some use-inspired filters in Section V and evaluate their performance in Section VII.

To understand the time overhead introduced by provenance tracing and capture, we identify four performance metrics that are measured through timing information gathered during the “Model execution and provenance tracing” step (in Figure 1). These metrics are:

• Execution time: model execution;

• Message passing time: the probes pass the trace messages to the NetLogo extension by calling the primitive “provenance:write”;

• Collecting time: the NetLogo extension collects trace messages and their context information from the model;

• Writing time: the NetLogo extension writes traces into a provenance log file.

Execution time is calculated by running the model without instrumentation, and the others are computed indirectly by running simulations with different versions of the NetLogo extension and calculating the average time differences. The model runs in NetLogo 5.0.3 on a single machine with a 64-bit Windows 8 OS, 8GB memory, and a 2.53GHz dual-core Core i5 CPU. Figure 2 summarizes the results.
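The indirect computation can be sketched as successive differences between averaged run times. The paper does not spell out which extension versions were used, so the four configurations below (uninstrumented model, a no-op `provenance:write`, collect without writing, and the full extension) are an assumption, and `overhead_breakdown` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict, Sequence


def overhead_breakdown(
    baseline: Sequence[float],      # uninstrumented model
    noop_ext: Sequence[float],      # probes call an extension that does nothing
    collect_only: Sequence[float],  # extension collects traces but does not write
    full: Sequence[float],          # extension collects and writes the log
) -> Dict[str, float]:
    """Estimate each time component (ms) from averaged total run times."""
    t_exec = mean(baseline)
    t_noop = mean(noop_ext)
    t_collect = mean(collect_only)
    t_full = mean(full)
    return {
        "execution": t_exec,
        "message_passing": t_noop - t_exec,
        "collecting": t_collect - t_noop,
        "writing": t_full - t_collect,
    }
```

Each argument holds the total run times of repeated simulations in one configuration, so each component is the extra average time that the next configuration adds over the previous one.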

From Figure 2(a) we see that the “message passing time” is small compared to the “execution time” and “collecting time”, which means that we can simply turn off tracing once the capture finishes by not processing the trace messages, and this strategy can dramatically reduce the overhead. We can also use filters that perform aggregation and filtering to reduce the “writing time”, but these filters do not reduce the “collecting time”, and even introduce additional computation time. Figure 2(b) shows that the total time (including the overhead) scales linearly until 300 iterations and then increases dramatically when the population of sheep starts to increase exponentially.

Fig. 2. Provenance tracing and capture: X-axis is number of iterations run; Y-axis is time cost (ms). (a) shows where time costs lie for up to 300 iterations. (b) shows the total time for 100, 200, etc. iterations.

V. FILTERING ABM PROVENANCE

To reduce the overhead and simplify subsequent querying, we propose filters that apply intelligence to reduce provenance before it is written to log files. Some of the filters abstract statement-level traces, and others drop unrelated traces and turn off the tracing once the capture finishes. In the remainder of this section, we discuss two types of provenance filters and their example outputs.

A. Aggregation

Keeping track of fine-grained provenance provides a detailed view of the dependencies among entities in a simulation, but at the expense of additional storage and processing overhead. We can extract the fine-grained provenance from provenance traces and aggregate them into high level provenance records.

For example, one of the interactions among sheep agents and wolf agents in the model “wolf-sheep-predation” is that a wolf agent acquires a reference to a sheep agent and then kills it. We can aggregate these two steps into a single activity “kill”, or into a generic relationship “interact with”. We propose an aggregation filter that uses the generic relationship “interact with” to represent all the complex interactions among agents, since identifying less generic activities like “kill” is more complicated and may need user effort. This aggregation filter is inspired by the interest of domain scientists in studying the interactions among agents in a simulation, and we can visualize its output as a social network (see Figure 3 for an example). We implement this filter by buffering all provenance traces within an iteration at runtime and then extracting and writing the abstract provenance to the log file.
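A minimal Python sketch of this buffering strategy is shown below. The trace layout `(acting agent, operation, target)`, the operation name `"reference"` for acquiring a reference to another agent, and the class `AggregationFilter` are all hypothetical stand-ins, since the internal trace format of PIN is not described here.

```python
from typing import List, Set, Tuple

# Hypothetical trace layout: (acting agent, operation, target).
Trace = Tuple[str, str, str]


class AggregationFilter:
    """Buffer the traces of one iteration and keep only agent interactions."""

    def __init__(self) -> None:
        self._buffer: List[Trace] = []
        # Stand-in for the log file: (iteration, agent, relation, agent).
        self.log: List[Tuple[int, str, str, str]] = []

    def on_trace(self, trace: Trace) -> None:
        self._buffer.append(trace)

    def end_iteration(self, tick: int) -> None:
        """Extract one 'interact with' edge per agent pair, then reset."""
        edges: Set[Tuple[str, str]] = set()
        for actor, operation, target in self._buffer:
            if operation == "reference" and actor != target:
                edges.add((actor, target))
        for actor, target in sorted(edges):
            self.log.append((tick, actor, "interact with", target))
        self._buffer.clear()
```

The resulting edge list can be loaded directly into a graph tool to draw the kind of social network mentioned above; statement-level reads and writes never reach the log.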

B. Filtering

Inspired by the forward and backward data slices discussed in [5], we propose filtering that computes a forward provenance slice or backward provenance slice for a given variable. PIN captures provenance traces of all iterations by default, but we developed one filter that collects the provenance traces from only a single iteration. In this section, we introduce and describe the use cases and implementations of the forward, backward and single-iteration filters.

Fig. 3. An example visualization of interactions among agents in “wolf-sheep-predation”. Edges represent interactions among vertices (agents). Edges and vertices are partitioned into strongly (or weakly) connected communities and are colored accordingly. The biggest community in the center has the observer (a manager agent) and the agents that have no other interactions. Each of the smaller communities far from the center has one wolf agent and its prey.

The backward provenance slice includes the processes, input data, intermediate data and agents that are involved in the generation of an output data product. This can help users better understand why a particular result is achieved and can be used for debugging. Figure 4 visualizes an example backward provenance from the model “wolf-sheep-predation”. That backward provenance slice explains how the final value of the variable “energy” of agent “wolf 130” is generated. Three types of processes are involved in the generation: “go”, “catch-sheep” and “reproduce-wolves”. The agent variable “energy” is reduced by 1 each time the “observer” agent (global manager) invokes the “go” procedure (which means one iteration starts); it increases by the value of “wolf-gain-from-food” after catching sheep agent “sheep 26”; and it is reduced from 46 to 23 after the process “reproduce-wolves” took place.

Fig. 4. The backward provenance of data product “energy” of agent “wolf 130”.

It is usually difficult to determine the backward provenance slice of a variable before the simulation terminates, because the target variable can be directly or indirectly affected in the future by any current data or process. However, by observing the backward provenance recovered from the provenance traces of a finished simulation, we find that it resembles a linear structure rather than an upside-down tree: it consists of information about how a given variable evolves as processes run on the agent to which the variable belongs. We therefore implement an imprecise filter that drops all provenance traces that are not associated with the agent to which the target variable belongs.
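The idea of the imprecise backward filter can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Trace` record and its fields are hypothetical, since the actual trace format is not reproduced in this section.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Trace:
    # Hypothetical trace record; the real log format may differ.
    agent: str      # agent on which the procedure ran, e.g. "wolf 130"
    procedure: str  # procedure name, e.g. "go"
    variable: str   # variable written by the procedure, e.g. "energy"


def backward_filter(traces: Iterable[Trace], target_agent: str) -> Iterator[Trace]:
    """Yield only the traces associated with the agent that owns the target variable."""
    for trace in traces:
        if trace.agent == target_agent:
            yield trace


# Example: keep only what happened on "wolf 130".
log = [
    Trace("wolf 130", "go", "energy"),
    Trace("a-sheep 26", "move", "energy"),
    Trace("wolf 130", "catch-sheep", "energy"),
]
kept = list(backward_filter(log, "wolf 130"))
print([t.procedure for t in kept])
```

Because the filter only looks at the owning agent, it can run while the simulation is still producing traces, at the cost of dropping dependencies that cross agent boundaries.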

The forward provenance slice of an input data item describes the future data products that are derived from it. It can be used to understand the impact of an input parameter, or to control error propagation if an input is corrupted. Figure 5 visualizes an example forward provenance slice of a parameter named “wolf-reproduce”. Note that each “was derived from” relationship means that the source entity was used in the generation of the target entity; we do not consider any other, indirect dependency. We implement the forward provenance filter by keeping track of all data products that are derived from the input data or from previously derived data.
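A minimal sketch of this bookkeeping is given below. It assumes, purely for illustration, that each derivation is available as a `(target, sources)` pair in execution order and that data products carry hypothetical version suffixes such as `#0`; neither detail is taken from the actual implementation.

```python
def forward_filter(derivations, seed):
    """Keep derivations whose sources include the seed or a previously derived product.

    derivations: iterable of (target, sources) pairs in execution order (assumed format).
    seed: identifier of the input data whose forward provenance is wanted.
    """
    tracked = {seed}  # the seed plus every product derived from it so far
    kept = []
    for target, sources in derivations:
        if tracked.intersection(sources):
            kept.append((target, sources))
            tracked.add(target)
    return kept


# Example: only the products that trace back to "wolf-reproduce" are kept.
derivations = [
    ("wolf 130:energy#1", ["wolf 130:energy#0"]),
    ("wolf 131:energy#0", ["wolf 130:energy#1", "wolf-reproduce"]),
    ("wolf 131:energy#1", ["wolf 131:energy#0"]),
    ("a-sheep 5:energy#1", ["a-sheep 5:energy#0"]),
]
forward = forward_filter(derivations, "wolf-reproduce")
print([target for target, _ in forward])
```

The set of tracked products only grows, so a single pass over the traces suffices, but every incoming trace has to be checked against that set.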

Fig. 5. The forward provenance of global variable “wolf-reproduce”

The last filter we propose keeps the execution traces for only a single iteration; it is motivated by users’ interest in studying the various agent behaviors and the interactions among them. By recovering and visualizing the provenance from the provenance traces of a single iteration, we can discover the different agent behavior patterns. Figure 6 is an example visualization that shows six different agent behaviors within an iteration: 1) the global agent “observer” invokes “display-labels” on each wolf and sheep agent; 2) the wolf agents invoke the procedures “go”, “move” and “death”; 3) the wolf agents invoke the procedure “wolf-reproduce” in addition to the procedures “go”, “move” and “death”; 4) the sheep agents invoke the procedures “go”, “move” and “death”; 5) the sheep agents invoke the procedure “sheep-reproduce” in addition to the procedures “go”, “move” and “death”; 6) the wolf agents catch sheep agents via the procedure “catch-sheep”.

Fig. 6. An example single-iteration provenance

The implementation of the single-iteration provenance filter is simple. We ask the user to start the filtering by entering the name of the entry procedure for the next iteration (such as the procedure “go” in the model “wolf-sheep-predation”). After the simulation exits the entry procedure, the filter shuts down the tracing to minimize the overhead.
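The behavior described above amounts to a small state machine, sketched here in Python. The event format (`"enter"`/`"exit"` plus a procedure name) and the class name are assumptions made for this example; the depth counter is an added precaution in case the entry procedure appears nested in the event stream.

```python
class SingleIterationFilter:
    """Keep trace events from the next entry into `entry_procedure` until its exit."""

    def __init__(self, entry_procedure):
        self.entry = entry_procedure
        self.depth = 0      # nesting depth of the entry procedure
        self.done = False   # True once the traced iteration has finished
        self.kept = []

    def on_event(self, kind, procedure):
        if self.done:
            return  # tracing has been shut down
        is_entry = procedure == self.entry
        if self.depth == 0 and not (is_entry and kind == "enter"):
            return  # still waiting for the next iteration to start
        self.kept.append((kind, procedure))
        if is_entry and kind == "enter":
            self.depth += 1
        elif is_entry and kind == "exit":
            self.depth -= 1
            if self.depth == 0:
                self.done = True


# Example: only the first complete "go" iteration is recorded.
events = [
    ("enter", "setup"), ("exit", "setup"),
    ("enter", "go"), ("enter", "move"), ("exit", "move"), ("exit", "go"),
    ("enter", "go"), ("exit", "go"),
]
single = SingleIterationFilter("go")
for kind, procedure in events:
    single.on_event(kind, procedure)
print(single.kept)
```

Once `done` is set, every later event returns immediately, which mirrors the intent of shutting down tracing to keep the overhead low.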

In sum, we have proposed two types of filters and included visualizations of the provenance computed from the output provenance traces of these filters. We examine their performance in Section VII.

VI. NON-PREPROCESSING PROVENANCE SLICING

The existing method that captures provenance from program logs needs to recover all the entities and dependencies from the logs before it can answer queries [3]. However, our empirical study indicates that this can be infeasible for large provenance graphs. For example, the log file that holds the 20,793 provenance traces from a 10-iteration simulation of the model “wolf-sheep-predation” is only 523KB, yet recovering the full provenance (8,855 nodes and 18,330 edges) from this log file and storing it in the Neo4J [1] graph database takes 1,035,845ms (about 17 minutes) on a machine with 8GB of memory and a 2.53GHz dual-core Core i5 CPU.

However, we can avoid recovering and maintaining provenance entities and dependencies that are unnecessary for answering a specific query. To achieve this, we propose a query technique that we call non-preprocessing (NP) provenance slicing, derived from Zhang’s no preprocessing (NP) program slicing algorithm [10]. The NP provenance slicing technique employs demand-driven analysis of the provenance traces to recover dependency provenance. When a provenance query begins, we traverse the provenance traces forward (or backward) to recover the dynamic dependencies required for the provenance slice computation. For example, if we need the provenance slice for the final value of some variable v (backward provenance), we traverse the execution traces backward until the last access of the variable is found. If that value of v depends on another variable w (see the possible dependencies in Table I), we resume the traversal to also calculate the provenance slice of w.
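The backward case can be illustrated with the following sketch. It is not the authors' implementation: the trace format, a list of hypothetical `(written, reads, procedure)` tuples in execution order, is an assumption, and only "last write" dependencies are followed.

```python
def np_backward_slice(traces, variable):
    """Demand-driven backward slice over raw traces, without building a full graph.

    traces: list of (written, reads, procedure) tuples in execution order (assumed format).
    Returns the traces that contributed to the final value of `variable`.
    """
    slice_indices = set()
    # Each work item asks: which trace last wrote `var` strictly before index `upper`?
    worklist = [(variable, len(traces))]
    while worklist:
        var, upper = worklist.pop()
        for i in range(upper - 1, -1, -1):
            written, reads, _ = traces[i]
            if written == var:
                if i not in slice_indices:
                    slice_indices.add(i)
                    # Resume the backward traversal for every variable this write depended on.
                    worklist.extend((read, i) for read in reads)
                break
    return [traces[i] for i in sorted(slice_indices)]


traces = [
    ("wolf-gain-from-food", [], "setup"),
    ("wolf 130:energy", [], "setup"),
    ("a-sheep 26:energy", [], "setup"),
    ("wolf 130:energy", ["wolf 130:energy"], "go"),
    ("a-sheep 26:energy", ["a-sheep 26:energy"], "go"),
    ("wolf 130:energy", ["wolf 130:energy", "wolf-gain-from-food"], "catch-sheep"),
]
energy_slice = np_backward_slice(traces, "wolf 130:energy")
print([procedure for _, _, procedure in energy_slice])
```

Only the traces reachable from the queried variable are touched; the sheep's traces are scanned past but never turned into provenance entities.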

In essence, this algorithm performs partial preprocessing to extract the entities, activities and dependencies relevant to a provenance query. It is possible that two different query requests involve common information. In such a situation, the NP provenance slicing algorithm will recover the common information from the execution trace during both provenance slice computations. To avoid this repetitive work, we can cache the recovered entities, activities and dependencies, so that at any given point in time all entities, activities and dependencies that have been computed so far can be found in the cache. Similar to Zhang’s definition, we define two versions of this demand-driven algorithm, without caching and with caching: non-preprocessing without caching (NPwoC) provenance slicing and non-preprocessing with caching (NPwC) provenance slicing. We evaluate the performance of NPwoC provenance slicing in Section VII and find that it can be even faster than traversing the pre-recovered full provenance graph.
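One way to picture the cached variant is to memoize, per trace record, the set of records it depends on, so that a second query reaches shared dependencies without rescanning the log. The sketch below is a simplified illustration under assumed names (`CachedSlicer`, the `(written, reads, procedure)` tuple format, and the `scans` counter are all hypothetical); it uses recursion for brevity, which would need to be made iterative for very long dependency chains.

```python
class CachedSlicer:
    """Demand-driven backward slicing with a cache of already recovered dependencies."""

    def __init__(self, traces):
        self.traces = traces  # list of (written, reads, procedure) tuples (assumed format)
        self.cache = {}       # trace index -> frozenset of indices it transitively depends on
        self.scans = 0        # number of trace records visited, kept only for illustration

    def _last_write(self, var, upper):
        # Scan backward for the last trace that wrote `var` strictly before `upper`.
        for i in range(upper - 1, -1, -1):
            self.scans += 1
            if self.traces[i][0] == var:
                return i
        return None

    def _closure(self, i):
        if i in self.cache:
            return self.cache[i]  # recovered by an earlier query
        result = {i}
        for read in self.traces[i][1]:
            j = self._last_write(read, i)
            if j is not None:
                result |= self._closure(j)
        self.cache[i] = frozenset(result)
        return self.cache[i]

    def backward_slice(self, var):
        i = self._last_write(var, len(self.traces))
        return sorted(self._closure(i)) if i is not None else []


shared = [
    ("p", [], "setup"),
    ("a", ["p"], "go"),
    ("b", ["a"], "go"),
    ("c", ["a"], "go"),
]
slicer = CachedSlicer(shared)
slice_b = slicer.backward_slice("b")
slice_c = slicer.backward_slice("c")  # reuses the cached dependencies of "a"
print(slice_b, slice_c)
```

In this toy log both queries depend on the write to `a`; the second query finds its dependencies in the cache instead of scanning back to `p` again.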

VII. PERFORMANCE EVALUATION

In this section, we first evaluate the proposed filters by comparing the performance of provenance capture using these filters against the performance of a full provenance capture without any filter.

Table IV compares the size of the provenance logs generated using different filters with the size of a full provenance log. All filters dramatically reduce the storage cost. While the provenance log generated by the single-iteration filter stays essentially the same size (since the capture has finished), the logs generated by the other provenance filters grow as the simulation goes on: the log size of the aggregation filter keeps increasing because there are more and more agents; the log size of the backward provenance filter keeps increasing until the target agent dies; the log size of the forward provenance filter does not increase much after 200 iterations, because we were tracing the usage of the global variable “wolf-reproduce” by wolf agents and there are few wolf agents after 200 iterations.

TABLE IV. EVALUATION OF STORAGE COST

Number of Iterations          10       50       100      200      300      400
full provenance capture       523KB    3.63MB   12.1MB   20.4MB   35.3MB   773MB
aggregation filter            8.22KB   30.9KB   100KB    140KB    230KB    4.53MB
backward provenance filter    5.36KB   18.1KB   18.1KB   18.1KB   18.1KB   18.1KB
forward provenance filter     13.9KB   163KB    925KB    2.08MB   2.10MB   2.75MB
single-iteration filter       46.8KB   45.2KB   45.2KB   45.2KB   45.2KB   45.2KB
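To make the storage reduction concrete, the following Python sketch computes, for each filter, how many times smaller its log is than the full provenance log at each iteration count. The sizes are transcribed from Table IV. The helper names (`to_kb`, `reduction_factors`) and the convention 1 MB = 1024 KB are assumptions made here for illustration, since the paper does not state which unit convention it uses.

```python
# Log sizes transcribed from Table IV, one entry per iteration count.
ITERATIONS = [10, 50, 100, 200, 300, 400]
TABLE_IV = {
    "full provenance capture": ["523KB", "3.63MB", "12.1MB", "20.4MB", "35.3MB", "773MB"],
    "aggregation filter": ["8.22KB", "30.9KB", "100KB", "140KB", "230KB", "4.53MB"],
    "backward provenance filter": ["5.36KB", "18.1KB", "18.1KB", "18.1KB", "18.1KB", "18.1KB"],
    "forward provenance filter": ["13.9KB", "163KB", "925KB", "2.08MB", "2.10MB", "2.75MB"],
    "single-iteration filter": ["46.8KB", "45.2KB", "45.2KB", "45.2KB", "45.2KB", "45.2KB"],
}


def to_kb(size):
    """Convert a string such as '3.63MB' to kilobytes (assumes 1 MB = 1024 KB)."""
    if size.endswith("MB"):
        return float(size[:-2]) * 1024
    if size.endswith("KB"):
        return float(size[:-2])
    raise ValueError(f"unrecognized size: {size}")


def reduction_factors(table, baseline="full provenance capture"):
    """Return baseline size divided by filtered size, per filter and iteration count."""
    base = [to_kb(s) for s in table[baseline]]
    return {
        name: [b / to_kb(s) for b, s in zip(base, sizes)]
        for name, sizes in table.items()
        if name != baseline
    }


factors = reduction_factors(TABLE_IV)
for name, row in factors.items():
    print(name, dict(zip(ITERATIONS, (round(f, 1) for f in row))))
```

A factor above 1 means the filtered log is smaller than the full log. The factors grow with the iteration count because the full log grows much faster than any of the filtered logs.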

Figure 7 shows the average time of provenance capture with different filters and without any filter (the full provenance capture), which confirms the analysis in Section IV. While the single-iteration filter significantly reduces the time overhead, the backward provenance filter does not reduce it much, and the forward provenance filter and the aggregation filter have even higher overhead than the full provenance capture. However, we argue that this can be

