How/When to Transmit Data/Updates


2.1.3 How/When to Transmit Data/Updates

One of the most important aspects of data updates in edge networks is that the process should be as simple as possible, without requiring a globally strict replication and consistency scheme, because the communication channels used by nodes near the edge of the network are highly volatile. Furthermore, since edge devices are also expected to be unpredictable with regard to their mobility and needs, updates to shared data objects should be applied opportunistically whenever possible, with the goal of achieving high performance while minimizing the number of messages flooded through the mesh to reach a consistent view of each item.

One of the challenges faced during this research was tied to this particular topic: most of the proposed solutions do not offer data updates at the edge or, when they do, the operation is a full overwrite masked as an update. This is not ideal in this type of ecosystem, since the entire data item must be sent whenever an update is triggered.

The following frameworks represent the most interesting approaches, found during this research, related to the transmission of information.

HDUM [27]: one example of an edge-optimized solution is the Heuristic Data Update Mechanism, an opportunistic data update mechanism for peer-to-peer edge networks. In this proposal, the authors note that the network is partitioned into multiple distinct geographical groups, each containing various Member Nodes (MN) and at least one leader, the Group Node (GN). The GN is elected among all local MNs based on its storage capacity, bandwidth, computing power and retention time.

Periodically, each GN broadcasts a beacon message with a limited hop count in order to collect all the shared data items in its group. When an MN receives that beacon packet, it automatically determines which group it belongs to. Although this approach is conceptually and programmatically very simple to implement, it has a major drawback: the routing messages may flood the network. One solution proposed by the authors is to maintain a holder list in each node, containing the set of all neighbors that currently store a particular data item; this list needs to be refreshed from time to time. Nonetheless, another issue arises in alternative modern solutions when updates are performed in distinct areas of the same infrastructure: messages must be sent throughout the entire network for each node to maintain a consistent view of the data it holds, which wastes bandwidth and degrades performance.

In order to minimize the number of messages transmitted, the proposed solution combines two distinct approaches:

• Safe-time Derivation: a prediction of when a neighboring node will leave radio range and become disconnected, used to schedule the best time (the safe time) to update and verify the consistency of the local data against the data stored in the disconnecting node. To achieve this, each node exchanges its own location and speed (GPS information) with its neighbor nodes when it joins the network.

• Relative Ability Filter: a best-effort estimate of a given node’s stability (its relative ability), using gathered information such as its network connectivity, its access frequency to a specific data item and its residual battery capacity (the authors considered the last one to be the most important). That is, if the relative ability of a given node is below a certain threshold, it will not receive any data updates. A low value means that the device might be leaving soon because its battery is running out, that it cannot spread data efficiently, or that it scarcely accesses the data item, so updating its local copy is not urgent.

By combining the above estimates, HDUM is able to optimize the update process by executing it only when it is worth the additional network load and by choosing the most convenient time to perform the action. This proposal is particularly interesting due to its novelty, but also because it fully leverages the mobility and context-awareness characteristics of edge devices.
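The two estimates can be illustrated with a minimal sketch. It is not the formulation used in [27]: we assume straight-line motion at constant velocity in a 2D plane, a circular radio range, inputs normalised to [0, 1], and illustrative weights in which the battery counts most. The names `NodeState`, `safe_time`, `relative_ability` and `update_deadline`, as well as the weights and the threshold, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeState:
    x: float              # position (metres)
    y: float
    vx: float             # velocity (metres per second)
    vy: float
    connectivity: float   # normalised to [0, 1]
    access_freq: float    # normalised to [0, 1]
    battery: float        # normalised to [0, 1]


def safe_time(a: NodeState, b: NodeState, radio_range: float) -> float:
    """Seconds until b leaves a's radio range, assuming constant velocities."""
    dx, dy = b.x - a.x, b.y - a.y
    dvx, dvy = b.vx - a.vx, b.vy - a.vy
    c = dx * dx + dy * dy - radio_range ** 2
    if c > 0:
        return 0.0                      # already out of range
    qa = dvx * dvx + dvy * dvy
    if qa == 0:
        return math.inf                 # no relative motion: never leaves
    qb = 2 * (dx * dvx + dy * dvy)
    # Largest root of |d + dv * t| = radio_range; real because c <= 0.
    return (-qb + math.sqrt(qb * qb - 4 * qa * c)) / (2 * qa)


def relative_ability(n: NodeState) -> float:
    """Weighted stability score; the weights are assumptions, battery first."""
    return 0.25 * n.connectivity + 0.25 * n.access_freq + 0.5 * n.battery


def update_deadline(me: NodeState, neighbor: NodeState,
                    radio_range: float, threshold: float = 0.5) -> Optional[float]:
    """None if the neighbor is filtered out, else the time left to update it."""
    if relative_ability(neighbor) < threshold:
        return None
    return safe_time(me, neighbor, radio_range)
```

For instance, a neighbor 50 m away moving outwards at 10 m/s with a 100 m radio range yields a safe time of 5 seconds, whereas a neighbor whose score falls below the threshold is skipped regardless of its trajectory.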

REAP [28]: REAP, in turn, is a reliable and efficient ad-hoc network protocol that leverages the Publish/Subscribe mechanism and notifies edge nodes of data submissions by means of delta compression. When a REAP producer transmits a new message to the network, it starts by checking its cache of previously published documents. Then, the MultiDiff algorithm proposed by the authors identifies matching character sequences between the stored documents and the new one and, by choosing the best subset of documents for the matching algorithm, creates the smallest possible patch that must be applied to those documents in order to generate the new one. If no relevant character sequences are found, the document is transmitted as a whole after applying a zlib [34] compression pass.
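The general idea of delta encoding against a cache, with a compressed fallback, can be sketched as follows. This is not the authors' MultiDiff algorithm: as a simplification, the sketch picks a single reference document rather than a subset, uses Python's `difflib.SequenceMatcher` to find matching character sequences, and falls back to zlib whenever the patch is not smaller than the compressed document. The function names, the patch representation and the `COPY_COST` constant are assumptions made for illustration.

```python
import zlib
from difflib import SequenceMatcher

COPY_COST = 8  # assumed number of bytes needed to encode one (start, end) reference


def make_patch(ref, new):
    """Describe `new` as copies of ranges of `ref` plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, ref, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))
        # "delete" needs no entry: the skipped range is simply never copied
    return ops


def patch_size(ops):
    """Rough size estimate of a patch in bytes."""
    return sum(COPY_COST if op[0] == "copy" else len(op[1].encode()) for op in ops)


def apply_patch(ref, ops):
    return "".join(ref[op[1]:op[2]] if op[0] == "copy" else op[1] for op in ops)


def encode(cache, new):
    """Return ("patch", doc_id, ops) or ("full", zlib_bytes) for `new`."""
    full = zlib.compress(new.encode())
    best = None
    for doc_id, ref in cache.items():
        ops = make_patch(ref, new)
        if best is None or patch_size(ops) < patch_size(best[1]):
            best = (doc_id, ops)
    if best is not None and patch_size(best[1]) < len(full):
        return ("patch", best[0], best[1])
    return ("full", full)


def decode(cache, msg):
    if msg[0] == "full":
        return zlib.decompress(msg[1]).decode()
    _, doc_id, ops = msg
    return apply_patch(cache[doc_id], ops)
```

When consecutive documents share most of their content, as in periodic sensor readings, the patch reduces to a handful of copy references and a few literal characters; with an empty cache the producer has no choice but to send the compressed document.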

A successfully generated patch references one or more documents in addition to carrying the new data content. Because edge networks may deliver messages out of order, a subscriber that receives a patch may have to wait if it does not yet hold some of the documents the patch references. In this case, the patch is set aside until the missing documents arrive, at which point it is retried.
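A minimal sketch of this deferral on the subscriber side is given below. The `Subscriber` class and the patch representation, a list of `("copy", doc_id, start, end)` and `("data", text)` operations, are hypothetical and only serve to illustrate the behaviour described above.

```python
class Subscriber:
    """Applies patches, parking those whose referenced documents are missing."""

    def __init__(self):
        self.docs = {}      # doc_id -> content
        self.pending = []   # (new_id, ops) patches waiting for references

    def on_document(self, doc_id, content):
        self.docs[doc_id] = content
        self._retry_pending()

    def on_patch(self, new_id, ops):
        if self._try_apply(new_id, ops):
            self._retry_pending()   # the new document may unblock others
        else:
            self.pending.append((new_id, ops))

    def _try_apply(self, new_id, ops):
        needed = {op[1] for op in ops if op[0] == "copy"}
        if not all(ref in self.docs for ref in needed):
            return False
        self.docs[new_id] = "".join(
            self.docs[op[1]][op[2]:op[3]] if op[0] == "copy" else op[1]
            for op in ops
        )
        return True

    def _retry_pending(self):
        progress = True
        while progress:             # repeat until no parked patch can be applied
            progress = False
            for item in list(self.pending):
                if self._try_apply(*item):
                    self.pending.remove(item)
                    progress = True
```

A patch for `d2` that arrives before `d1` is stored in `pending` and applied as soon as `d1` is received.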

The key mechanism in the REAP protocol lies in the decision of how many and which documents should be cached in order to be referenced inside patch messages. The authors propose two distinct approaches: a time-bounded or a size-bounded window. In the first specification, the producer’s MultiDiff instance will attempt to use the documents sent within the last k seconds to calculate a patch. For instance, if a window of 60 seconds is specified, a full document will only need to be sent every minute. Therefore, if a new document is published every second, only the first packet will contain the full document and the next 59 will contain small patches. With the second approach, the producer can only use the last n messages sent for the diff. Regardless of the implementation used, a small number of bytes is attached to each packet to inform consumers when to flush their document caches.
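The two window policies can be sketched as follows. This is our reading of the description above rather than the REAP implementation: the time-bounded window is modelled as restarting with a full document once k seconds have elapsed since the last full document, and the size-bounded window as a fixed-length queue. Both class names are hypothetical.

```python
from collections import deque


class TimeBoundedWindow:
    """Documents sent within k seconds of the last full document are patchable."""

    def __init__(self, k_seconds):
        self.k = k_seconds
        self.window_start = None
        self.docs = []

    def publish(self, now, doc):
        """Return "full" or "patch" for a document published at time `now`."""
        if self.window_start is None or now - self.window_start >= self.k:
            self.window_start = now
            self.docs = [doc]       # consumers would flush their caches here
            return "full"
        self.docs.append(doc)
        return "patch"


class SizeBoundedWindow:
    """Only the last n documents sent may be referenced by a patch."""

    def __init__(self, n):
        self.docs = deque(maxlen=n)   # oldest entries are dropped automatically

    def publish(self, doc):
        kind = "patch" if self.docs else "full"
        self.docs.append(doc)
        return kind
```

With `TimeBoundedWindow(60)` and one document per second, the call at second 0 returns `"full"`, the calls at seconds 1 to 59 return `"patch"`, and the call at second 60 starts a new window with another full document.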

We considered this specification to be worth studying due to its similarity with our proposed solution, specifically the Publish/Subscribe module. The fact that this framework uses deltas that refer to previously sent files in order to build a newly created object is a rather compelling approach to greatly decreasing message size. In spite of this, we recognize that the authors’ approach is highly vulnerable to packet loss and would not perform as well on storage networks where the structure and specification of the published content vary widely. In the latter case, the algorithm would be unable to create patches and would fall back to a compressed whole-file transfer most of the time, which would be inefficient.

CoPro-CoCache [30] is another implementation of a cloud service at the edge, materialized as a framework that allows infrastructure nodes to collaborate on video streaming caching and processing. The proposed solution is based on a single MEC server in each location that acts as a caching and transcoding node. Furthermore, the servers keep track of the most popular videos, each of which is evenly distributed among all the APs at a predetermined bitrate quality (tending towards higher bitrates). However, when a user requests one of those videos at a lower quality, whether because of network congestion, low battery percentage or diminished processing capabilities, the associated MEC is responsible for applying an on-the-fly transcoding function to respond with the desired quality. The node is able to do this by checking whether its local storage contains the requested video or by asking its neighboring MEC servers, without the need to communicate with the cloud infrastructure and obtain a new video file for the given quality.
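The request path described above can be sketched as follows, under simplifying assumptions: each server stores at most one bitrate per video, transcoding is only possible from a higher bitrate to a lower one, and the local server is checked before its neighbors. The `MecServer` class and its methods are hypothetical and do not reflect the interface of [30].

```python
class MecServer:
    """Caching and transcoding node that cooperates with neighboring servers."""

    def __init__(self, name):
        self.name = name
        self.store = {}       # video_id -> cached bitrate (kbps)
        self.neighbors = []   # other MecServer instances

    def find_source(self, video_id, wanted_kbps):
        """Look locally first, then at neighbors, for a usable cached copy."""
        for server in [self] + self.neighbors:
            cached = server.store.get(video_id)
            if cached is not None and cached >= wanted_kbps:
                return server.name, cached
        return None

    def serve(self, video_id, wanted_kbps):
        """Return (source, delivered_kbps, transcoded) for a request."""
        found = self.find_source(video_id, wanted_kbps)
        if found is None:
            return ("cloud", wanted_kbps, False)    # last resort
        source, cached = found
        return (source, wanted_kbps, cached > wanted_kbps)
```

A request for a 720 kbps version of a video cached at 4000 kbps on a neighbor is answered from that neighbor with transcoding, and only a video absent from every nearby server is fetched from the cloud.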

One of the major takeaways from the CoPro-CoCache specification that we acknowledge as an asset to our system lies in the idea of adaptively creating smaller representations, in size and quality, of a data item. Furthermore, we could implement an iteration of this proposal in our solution to place specific quality-diminished data items (e.g. based on popularity) inside each infrastructure AP. This approach would greatly increase system responsiveness on requests for those particular objects while, at the same time, providing a mechanism that would eventually send the object in its original quality to the user at a later, more opportune time.

In document A Persistent Publish/Subscribe System for Mobile Edge Computing (Page 43-46)