7.4 Fault tolerance
7.5.1 Exchanged Message Evaluation
The first conducted test aims to measure the amount of traffic incoming to and outgoing from the mobile phone when using different numbers of worker nodes - one to eight - and different distributions of the data directory - centralized on the mobile, distributed across all the nodes of the infrastructure and distributed only across the worker nodes. The placement of the data directory has a significant impact on the kind and number of messages transferred from and to the mobile. Table 7.3 shows which types of message are sent from and received by the mobile device depending on the deployment of the data directory. Regardless of the locations of the data directory, every time that the mobile device fetches a data value located at a remote node, it sends a data value request to the remote node and receives the value; and vice versa when a remote node needs a value hosted by the mobile. In case of a centralized data directory on the mobile, the device receives the creation notifications of every value remotely computed and receives and replies all the existence or sources queries required by the workers. Conversely, when the directory is located only on the worker nodes, the mobile publishes the creation of data values – locally computed or received from a remote node –, subscribes for the existence and sources of values and receives their corresponding notifications. In the case where the phone has a share of the data directory and the other parts are distributed among the workers, the interactions with the directory depend on the specific hashcode of the value and the responsibilities of the mobile device. If the mobile node is responsible for the hashcode of one data value, it receives any remote data creation notification; otherwise, it sends creation notifications for those values locally computed. Similarly, the subscriptions to existence/sources and their corresponding notifications also depend on specific data hashcode.
Table 7.4 presents the number of messages and the number of bytes transferred in to and out from the mobile device during a low-resolution execution according to the number of workers used and the same data directory placements as in the previous table – only in the phone node (Mobile), shared among the workers (Workers) or across the whole infrastructure (Mobile + Workers). Although the scheduler in the frontend of the Cloud Platform is aware of the data locality, tasks
Mobile Mobile+Workers Workers Creation notification Received Received/Sent Sent Existence request Received Received/Sent Sent Existence response Sent Received/Sent Received Sources request Received Received/Sent Sent Sources response Sent Received/Sent Received Value request Received/Sent Received/Sent Received/Sent Value transmission Received/Sent Received/Sent Received/Sent
Table 7.3: Direction of each type of message according to the placement of the data directory: centralized on the mobile device (mobile), hosted by the worker nodes (Workers) or shared among all the nodes composing the infrastructure included the mobile (Mobile+Workers).
running on one node of the infrastructure may have dependencies with values generated on other nodes. The more nodes being part of the infrastructure, the less likely one node is to host all the values that a task requires; thus, nodes fetch values from other peers more often. Besides, as the infrastructure grows, the smaller the local share of the directory gets; and hence, the more queries to the data directory require information stored on other nodes. When the mobile device hosts the data directory – either the whole of it or just a share –, the number of messages processed by it increases as the size of the infrastructure does.
Mobile Mobile + Workers Workers
number of messages input bytes output bytes number of messages input bytes output bytes number of messages input bytes output bytes 1 worker 856 124,275 137,387 799 94,753 162,626 765 83,073 168,123 2 workers 961 136,051 148,213 886 97,854 179,228 765 83,984 168,123 4 workers 1,008 140,384 153,054 971 111,861 185,599 765 84,575 168,123 8 workers 1,016 140,525 154,587 1,098 136,211 199,820 765 84,906 168,123 Table 7.4: Number of messages and number of bytes received/transmitted by the mobile during a low-resolution execution according to the size of the underlying infrastructure and the nodes hosting the data directory (the mobile device, the worker nodes or shared across the whole infrastructure.
Distributing the data directory among all the nodes, including the mobile device, may enforce the master to interact with remote nodes to notify every locally created/accessed value and, besides, to reply queries from other nodes fetching values. If the mobile manages the hashes corresponding to all the values locally accessed, it only needs to reply the existence and sources requests from other nodes. Otherwise, if it manages none of the values it accesses, it needs to submit a creation notification for every value creation, request existence/sources requests corresponding to the values it fetches and to reply to queries from remote nodes to the controlled values. Besides, it assumes part of the traffic to forward to other nodes of the ring. Compared with the centralized approach, the option of distributing the data directory among all the nodes can either reduce or increase the number of messages.
7.5. EVALUATION
directory just to fetch remote data and to notify the local creation of data values. In this case, the number of messages depends on the application itself rather than on the infrastructure. The size of the output data – queries and notifications – remains constant, but the input size may change depending on the number of sources for the accessed values. The more nodes being part of the infrastructure, the more likely they are to grow. When the directory is deployed only atop worker nodes, the number of messages and the size of the input communications is always smaller than in the other deployments; the more nodes the infrastructure has, the more significative this reduction is.
Table 7.5 shows the same information included in Table 7.4 but for a high-resolution execution. Despite the bigger number of messages and the larger number of bytes transferred in to and out from the mobile device, the conclusions extracted from it are the same as with the low- resolution test case. When the mobile device hosts the data directory, either partially or totally, the number of messages and the number of transferred bytes grows along with the infrastructure; while they remain almost constant when the data directory is placed on the workers. Sharing the data directory across the whole infrastructure may increase or decrease the number of exchanged messages depending on the hashcode set associated to each node when compared to the centralized approach.
Mobile Mobile + Workers Workers
number of messages input bytes output bytes number of messages input bytes output bytes number of messages input bytes output bytes 1 worker 6,238 912,052 1,007,647 5,712 678,521 1,174,913 5,525 604,123 1,222,494 2 workers 7,817 1,108,741 1,225,531 6,635 737,570 1,330,608 5,525 612,284 1,222,494 4 workers 7,254 1,017,989 1,114,466 8,244 956,214 1,552,528 5,525 614,921 1,222,494 8 workers 7,437 1,037,932 1,135,8892 8,963 1,044,699 1,632,062 5,525 616,648 1,222,494 Table 7.5: Number of messages and number of bytes received/transmitted by the mobile during a high-resolution execution according to the size of the underlying infrastructure and the nodes hosting the data directory (the mobile device, the worker nodes or shared across the whole infrastructure.