Thin Client Computing

Cloud-Based Desktop Services for Thin Clients

Lien Deboosere, Bert Vankeirsbilck, Pieter Simoens, Filip De Turck, Bart Dhoedt, and Piet Demeester

Ghent University

Cloud computing and ubiquitous network availability have renewed people’s interest in the thin client concept. By executing applications in virtual desktops on cloud servers, users can access any application from any location with any device. For this to be a successful alternative to traditional offline applications, however, researchers must overcome important challenges. The thin client protocol must display audiovisual output fluidly, and the server executing the virtual desktop should have sufficient resources and ideally be close to the user’s current location to limit network delay. From a service provider viewpoint, cost reduction is also an important issue.

Cloud computing services [1], such as Amazon’s Elastic Compute Cloud, are widely available today, offering computing resources on demand. Thanks to such advances and ubiquitous network availability, the thin client computing paradigm is enjoying increasing popularity. Originally intended for wired LAN environments [2], this paradigm is repeating its success in a mobile context. A study from ABI Research forecasts a US$20 billion turnover surrounding services directly associated with mobile cloud computing by the end of 2014. Clearly, when applications are offloaded, the mobile terminal only needs to present audiovisual output to users and convey user input to remote servers, considerably reducing the client device’s computational complexity. Consequently, applications can run as-is, without requiring (many) scaled-down versions for mobile devices.

Several popular applications, such as Google Docs and Microsoft Live, already execute on servers in the cloud. The ability to access applications in the cloud is referred to as software as a service (SaaS), whereas hosting a virtual desktop (VD) is referred to as desktop as a service (DaaS). We can categorize DaaS implementations according to where the VD is executed (locally or remotely) or to the method of accessing the VD’s output (browser or thin client protocol, such as the Remote Desktop Protocol or virtual network computing). With mobile users specifically, VDs are executed remotely due to resource constraints. To enable such users to access existing OSs and applications, we can employ a thin client protocol to visually render the output of applications executed by a VD.

Current DaaS deployments, such as the VMware Virtual Desktop Infrastructure, are concentrated mainly in corporate environments. The availability of (virtual) computing resources distributed over the network lets providers offer desktop services in mobile wide area network (WAN) environments.

Here, we discuss solutions that address the challenges providers face in offering cloud-based desktop services. We look at how to both improve users’ experience and reduce providers’ costs in offering the service. We also present a system architecture for offering efficient desktop services in the cloud.

System Architecture

Existing cloud platforms fulfill the hardware requirements for implementing DaaS. However, an emerging category of mobile applications (including augmented reality, rich sensing, and multimedia editing) poses stringent requirements on delays. Current cloud management systems can’t meet user expectations for these applications, especially in terms of latency. A clear need exists for novel cloud management algorithms that consider the specific requirements of mobile thin client computing. Our proposed system architecture implements such algorithms in the service manager’s self-management component. The manager can be implemented as part of existing cloud management systems such as OpenNebula, OpenStack, and Eucalyptus.

Figure 1 shows our architecture. Simplified OS image management (that is, reusing an OS image among users to reduce the storage per user) and application management are essential for the service to scale. Our system builds a VD from a shared golden image from the OS database and augments it with personal settings, for example by using a copy-on-write solution with UnionFS (http://unionfs.filesystems.org). Multilayer VDs reduce the complexity of upgrading the golden image without causing broken dependencies or conflicts [3].

To improve DaaS usability, we could combine DaaS with application virtualization technologies such as Softricity and Microsoft App-V. The system would then dynamically deliver applications to the user’s VD without having to install, configure, and update them. This approach further reduces the complexity of upgrading golden images because applications aren’t installed in the user’s VD and thus can’t be broken.

Figure 1. System architecture for enabling cloud-based desktop services for thin clients. Users connect via a thin client device (a smartphone, tablet PC, PDA, netbook, or minimal- or zero-state device) to their remote applications executed in a virtual desktop. The service manager’s self-management component covers optimizations to improve the user experience and decrease service provider costs. [Diagram components: thin client users; thin client protocol; service manager with self-management (overbooking, allocation, consolidation, relocation); application virtualization service; OS image and profile database; monitoring framework; data centers with hosts 1 to H running virtual desktops.]
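The layering of a shared golden image under a per-user delta can be illustrated with a small in-memory analogy. This is only a sketch: it uses Python's `collections.ChainMap` as a stand-in for a union file system, and the function names `build_vd` and `upgrade` as well as the sample paths are hypothetical, not part of the architecture described here.

```python
from collections import ChainMap


def build_vd(golden_image, personal_layer):
    """Stack a writable personal layer on top of a shared golden image.

    Reads fall through to the golden image; writes land in the personal
    layer only, which mimics copy-on-write behavior.
    """
    return ChainMap(personal_layer, golden_image)


def upgrade(vd, new_golden_image):
    """Swap the shared lower layer while keeping the user's delta intact."""
    return ChainMap(vd.maps[0], new_golden_image)


golden_v1 = {"/etc/os-release": "v1"}   # hypothetical shared image content
delta = {}                               # hypothetical per-user delta
vd = build_vd(golden_v1, delta)
vd["/home/user/.settings"] = "theme=dark"  # stored in the delta, not the image
vd = upgrade(vd, {"/etc/os-release": "v2"})
```

Because the golden image is never modified by a user's writes, one image can back many VDs, and an upgrade only replaces the shared layer.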

User Experience

We must consider two aspects to maintain or improve user experience: high performance of the thin client protocol (that is, crisp interactivity and fluid audiovisual output) and sufficient allocated resources on the server side so that applications respond quickly. For mobile users, reducing energy consumption on the client device is also important.

Crisp Interactivity

Acceptable interaction-delay bounds depend on the application at hand. For office automation applications, delays of up to 150 ms are tolerable [4], whereas for multimedia applications such as video games, users are already affected by interaction delays higher than 80 ms. The result of user input can be seen only after at least one round-trip time (RTT), so we should address delay for critical applications using a proximate server. Every time users connect to the service, the system must select a data center that can be reached fast enough from the users’ current location. Inside the data center, the system should select an appropriate server based on the expected resource requirements of the user’s applications (determined via the user’s profile and the current load on the servers, as discussed later).

Due to user mobility, guaranteeing delay bounds implies that a VD might have to migrate to another server. In practice, the system continuously monitors the desktop service and, for example, when the RTT exceeds a predefined boundary based on the type of active applications, relocates the user’s VD to a more suitable host. The system performs this relocation using live migration [5].
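The selection and monitoring steps described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the 150 ms and 80 ms bounds come from the text, while the class and function names are hypothetical, and comparing the RTT directly against the bound is a simplifying assumption (the RTT is only a lower limit on the interaction delay).

```python
from dataclasses import dataclass

# Interaction-delay bounds in milliseconds, taken from the text above.
DELAY_BOUND_MS = {"office": 150.0, "multimedia": 80.0}


@dataclass
class DataCenter:
    name: str
    rtt_ms: float  # measured round-trip time from the user's current location


def delay_bound(active_apps):
    """Return the strictest bound among the active application types."""
    return min(DELAY_BOUND_MS[app] for app in active_apps)


def select_data_center(candidates, active_apps):
    """Pick the lowest-RTT data center that meets the bound, or None."""
    bound = delay_bound(active_apps)
    feasible = [dc for dc in candidates if dc.rtt_ms <= bound]
    return min(feasible, key=lambda dc: dc.rtt_ms) if feasible else None


def needs_relocation(current_rtt_ms, active_apps):
    """True when the monitored RTT exceeds the bound for the active apps."""
    return current_rtt_ms > delay_bound(active_apps)
```

A monitoring loop would call `needs_relocation` periodically and, when it returns true, rerun `select_data_center` before triggering a live migration.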

Storing a VD’s UnionFS delta file system on network storage equipment reduces a service provider’s cost of relocating the VD to copying the current active memory or, in the worst case (that is, migration across data centers), copying the VD’s delta file system. During the migration process, both the original and target host require resources. In a heavily loaded system, such double-resource reservations can lead the system to reject new user requests while causing substantial network traffic for the memory-copying process. Thus, the system should relocate VDs only when doing so achieves a valuable improvement for them or their customers.

Fluid Audiovisual Output

Multimedia content has been a stumbling block for thin client computing for years, especially in mobile WAN environments where bandwidth availability is limited and expensive. This is mainly because the same coding is applied to static (text editing, for example) and dynamic (as with video games) content. Recently, both researchers and industry have proposed several bandwidth optimizations for thin client protocols. One important innovation implements a channel for redirecting multimedia in its original format to the client (as with Citrix SpeedScreen) [6], at least when the appropriate codec is available on the client device. This approach is valid only for playing multimedia streams, not for displaying high-motion output from an application (such as a video game). We’ve evaluated a thin client protocol optimization that encodes applications’ high-motion output with a video codec and switches to a thin client protocol to encode low-motion output [7]; Table 1 shows this approach’s feasibility for some popular mobile devices. In these experiments, we played a full-screen video on the server and streamed it to the client. Given that the bottleneck in the live-encoding process is the server’s CPU, we reached higher frame rates for smaller screen resolutions. Using a GPU’s processing power on the server side could improve the frame rate.
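The text does not state how the hybrid protocol decides between the two encoders. As a rough sketch, assuming the decision is based on the fraction of pixels that changed between consecutive frames, the switch could look like this. The threshold value, the function names, and the frame representation (lists of rows of pixel values) are all hypothetical.

```python
def changed_fraction(prev, curr):
    """Fraction of pixels that differ between two equally sized frames."""
    total = sum(len(row) for row in curr)
    changed = sum(
        p != c
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )
    return changed / total if total else 0.0


def choose_encoder(prev, curr, threshold=0.3):
    """Use a video codec for high-motion updates, the thin client protocol otherwise.

    The threshold of 0.3 is an illustrative assumption, not a measured value.
    """
    if changed_fraction(prev, curr) >= threshold:
        return "video_codec"
    return "thin_client_protocol"
```

A production protocol would likely add hysteresis so that the encoder does not flip back and forth around the threshold.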

Table 1. Thin client computing performance on popular mobile devices.

Device              Screen resolution   Available codecs                  Streaming frame rate (frames per second)
iPhone 4            640 × 960           H264, MPEG-4, M-JPEG              27
Samsung Galaxy S    800 × 480           H263, H264, MPEG-4, WMV, VC-1     23
iPad                1024 × 768          H264, MPEG-4, M-JPEG              20

Another approach is to cache important output sequences such as the desktop view and menu items to reduce both the required bandwidth and the interaction delay [8]. A complete overview of recent thin client protocol optimizations is available elsewhere [9].

Resource Allocation

In the data center, the allocation algorithm must find a suitable host to satisfy an arriving user request, as Figure 2a shows. From the customer’s viewpoint, the least-utilized host is preferable, whereas the provider prefers the host resulting in the least resource fragmentation (that is, the best-fit host) because this can reduce energy consumption. The resources a user needs are specified in a service-level agreement (SLA). To observe the aforementioned balance, the allocation algorithm attributes a penalty α for each request that receives too few resources, and a penalty β related to resource fragmentation (that is, to the amount of nonreserved resources on this host). The algorithm selects the host with the lowest penalty to handle the user request. Figure 2b shows the influence of the ratio α/β on the probability of SLA violations for a simulation with 10 hosts and an average utilization of 90 percent. In this context, an SLA violation implies that the user applications receive fewer resources than requested. When α/β increases (that is, when SLA violations are expensive), the allocation algorithm can reduce the probability of SLA violations by 10 percent. An SLA violation as defined here might not be noticeable or obstructive for the user experience because it might just take a bit longer for the user applications to execute a task.
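The trade-off between the two penalties can be made concrete with a small sketch. The exact cost function is not given in the text, so the linear form below (α per missing resource unit plus β per unit left unreserved after placement) is an assumption, and the `Host` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    capacity: float
    reserved: float

    @property
    def free(self):
        return self.capacity - self.reserved


def placement_cost(host, demand, alpha, beta):
    """Assumed linear cost: alpha per missing unit, beta per unit left unreserved."""
    shortfall = max(0.0, demand - host.free)  # resources the request would miss
    leftover = max(0.0, host.free - demand)   # fragmentation after placement
    return alpha * shortfall + beta * leftover


def select_host(hosts, demand, alpha, beta):
    """Return the host with the lowest combined penalty for this request."""
    return min(hosts, key=lambda h: placement_cost(h, demand, alpha, beta))
```

With a high α/β ratio the sketch behaves like a best-fit policy that avoids any shortfall; with a low ratio it packs hosts tightly even at the cost of underserving a request.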

For scalability, we can’t assume that every user has a dedicated profile. Rather, VDs’ resource requirements should be clustered offline into a finite number of profiles. At subscription time the system assigns a user one of these profiles. An online clustering algorithm such as a decentralized clustering algorithm [10] could map the current resource requirements of a user’s VD to one of the cluster profiles. This online mapping can let the system adapt the current resource allocation or even the user’s profile when appropriate.
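As a simple stand-in for the online mapping step, the sketch below assigns a VD's current usage to the nearest profile centroid. This is plain nearest-centroid matching, not the decentralized clustering algorithm cited above, and the profile names and centroid values are hypothetical.

```python
import math

# Hypothetical profile centroids as (CPU units, memory units).
PROFILES = {"normal": (10.0, 1.0), "heavy": (25.0, 4.0)}


def nearest_profile(usage, profiles=PROFILES):
    """Return the name of the profile whose centroid is closest to the usage."""
    return min(profiles, key=lambda name: math.dist(usage, profiles[name]))
```

If the returned profile differs from the one assigned at subscription time over a sustained period, the system could treat that as a signal to adapt the allocation or the profile.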

If the current resource requirements don’t correspond to the user’s profile (for example, the system detects bursts of SLA violations), the cloud management component can decide, based on the user’s SLA, to adapt the resource allocation to current needs. If more resources are required and sufficient resources are available on the current host, it simply allocates these additional resources. A problem arises, however, when the current host can’t update its resource reservation to the desired level. In this case the system can take one of two actions: it can relocate the user’s VD to a host with sufficient free resources or relocate other VDs from the current host until sufficient resources are freed. The preferred choice depends on several factors, such as users’ SLA contracts and a VD’s memory consumption, which determines the time required to finish its live migration.

Figure 2. Simulation results. (a) The allocation algorithm selects a suitable host to satisfy an arriving user request, thereby balancing the provider’s gains by decreasing energy consumption (and hence limiting the number of active servers) and the penalties related to customers’ unsatisfied resource demands. (b) We evaluated our proposed allocation algorithm in a scenario with 10 hosts and an average utilization of 90 percent. The cost-based algorithm proposed in the main text shows a decrease of 10 percent in service-level agreement (SLA) violations. An SLA violation means that the user applications receive fewer resources than requested. [Panel (b) plots the average probability of SLA violations (percent) against log(α/β) for the cost-based and random allocation algorithms.]

Battery Autonomy

Limited battery drain is important for mobile users. Because computing power shifts to the network, we could expect a small battery drain; on the other hand, the continuous wireless network connection is a huge battery consumer.

Several approaches exist for reducing a wireless network connection’s energy consumption, such as cross-layer optimization [9]. Even with this adaptation, offloading all applications isn’t justifiable in terms of reducing energy consumption. Thus, we propose weighing the advantages of offloading an application to a remote server versus locally executing it. One solution between these two extremes is to offload parts of the applications and render them at remote servers while executing the other parts locally, which could also reduce interaction delay [11].

Service Provider Costs

A service provider’s most important challenge is satisfying customers while minimizing costs. We focus on optimizing the number of users a single host can serve and minimizing energy consumption in the cloud.

Number of Users

Depending on the targeted user experience, resources should remain in reserve on the infrastructure. Of course, reserving worst-case resource needs will lead providers to overprovision cloud resources. The planning guide from Citrix [12] suggests assigning at most 10 normal VDs or four heavy VDs to a single host. Given that mobile device screen resolutions are growing closer to those of regular screens, the difference in resource requirements for hosting a VD for a mobile or for a fixed user is negligible. So, the Citrix study is also valid in today’s mobile context. When more VDs are assigned to a host, the performance degradation depends on the type of applications executed in those VDs [13]. Thus, the amount of allocated resources should depend on the applications users will likely execute, as specified in their profiles.

It’s important to share resources to optimize utilization in the context of shared Internet hosting platforms [14]. Based on the observation that a VD’s resource requirement varies significantly and depends on many factors, such as multiple active applications, we can use a resource overbooking technique in a VD computing context.

Figure 3a illustrates our overbooking technique, which exploits the shared resource platform the host uses to execute VDs. In our approach, the provider reserves part of the expected resource requirements according to the adopted overbooking degree. We define the overbooking degree as the probability of not being able to satisfy a user’s request. The host’s resource scheduler ensures that a VD can always consume at least the reserved resources. The host collects nonconsumed resources in its resource pool. VDs requesting more resources than reserved can receive additional resources from this pool. As Figure 3b shows, when the overbooking degree increases, the utilization of the host by the served VDs also increases. In this case, fewer resources are reserved and hence the probability of SLA violations increases.

Figure 3. Overbooking technique. (a) Nonconsumed reserved resources are collected in the host’s resource pool to be shared among virtual desktops (VDs) requesting more resources than reserved. (b) The simulation results (averaged over 15 simulations) consider a fully reserved host with normal VDs requesting resources (based on the planning guide from Citrix [12]) according to a normal distribution N(μ, σ²) with μ taken from N(10, 3.5) and σ² taken from N(3.5, 2/3 × 3.5). [Panel (a) shows VDs i and j at time stamp t adding unused reserved resources to, or requesting resources from, the resource pool (RP) of host H. Panel (b) plots average utilization (percent) and average probability of SLA violations (percent) against the overbooking degree (percent).]
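The overbooking technique described above can be sketched in two parts: deriving a reservation from the overbooking degree, and sharing unused reservations through the host's resource pool. Both parts rest on assumptions not stated in the text: the reservation is taken as the (1 − degree) quantile of a normally distributed demand, and the pool is handed out in list order. The function names are hypothetical.

```python
from statistics import NormalDist


def reservation(mu, sigma, overbooking_degree):
    """Reserve the (1 - degree) quantile of an assumed normal demand.

    With this choice, a request exceeds the reservation with probability
    equal to the overbooking degree (which must lie strictly between 0 and 1).
    """
    return NormalDist(mu, sigma).inv_cdf(1.0 - overbooking_degree)


def share_pool(reserved, requested):
    """Grant each VD its request up to its reservation, then hand out the
    pooled unused reservations to VDs that ask for more, in list order."""
    granted = [min(r, q) for r, q in zip(reserved, requested)]
    pool = sum(r - g for r, g in zip(reserved, granted))
    for i, q in enumerate(requested):
        extra = min(q - granted[i], pool)
        granted[i] += extra
        pool -= extra
    return granted
```

For example, with three VDs that each reserve 10 units and request 4, 14, and 15 units, the first VD leaves 6 units in the pool; the second receives its full 14 and the third ends up with 12, which is more than its reservation but less than its request.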

The provider can assign different overbooking degrees to VDs with different profiles or SLA contracts. As emphasized previously, user experience isn’t determined only by the resource allocation for the user’s VD, but also by audiovisual quality and the interaction delay with the application. To globally optimize user experience and resource allocation, future research should examine how to couple the resource allocation strategy with the thin client protocol settings in a global framework.

Energy Cost

To achieve a green cloud-based desktop service, providers should implement a consolidation algorithm to adapt the online host pool to the current system load. This algorithm must predict the (near) future system load to determine the required number of hosts. The time between two iterations of the consolidation algorithm is called the time window. During a time window, the algorithm collects monitoring information and, based on the assumption that the system load during the next time window will vary in a similar way, predicts the system load via linear extrapolation (see Figure 4a).
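The prediction step can be sketched as a two-point linear extrapolation. The text does not give the exact formula, so extrapolating from the loads of the previous and current time windows and converting the result into a host count by dividing by a uniform host capacity are assumptions; the function names are hypothetical.

```python
import math


def predict_next_load(previous, current):
    """Linear extrapolation: assume the load changes by the same amount again."""
    return max(0.0, current + (current - previous))


def hosts_required(predicted_load, host_capacity):
    """Smallest number of equally sized hosts that can carry the predicted load."""
    return math.ceil(predicted_load / host_capacity)
```

For instance, a load that grew from 40 to 55 units is predicted to reach 70 units in the next window, which would require three hosts of 25 units each.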

When additional hosts are required, the system simply puts them online. When redundant hosts are found, more elaboration is required to decide which hosts should go offline. Idle hosts are naturally the best choice for going offline because no VDs must be relocated before this can occur. If there aren’t enough idle hosts, the algorithm sorts the hosts by ascending number of VDs. To minimize the number of relocations, the algorithm tries to relocate the VDs from the hosts in list order. When it can’t relocate all VDs on a host to other hosts, relocating any of them is pointless, and the algorithm should continue with the next host down the list, until sufficient hosts are put offline or no hosts remain on the list. When the real system load appears to be higher than expected, the monitoring framework notices this unfavorable situation and requests that the cloud management component take appropriate action.
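The host-selection procedure described above can be sketched as follows. The text fixes the ordering (idle hosts first, then ascending number of VDs) and the all-or-nothing rule per host, but not how target hosts are chosen, so the first-fit placement, the single scalar VD size, and the uniform host capacity are assumptions. The function name is hypothetical.

```python
def plan_consolidation(hosts, capacity, surplus):
    """Decide which hosts to switch off and which VDs to relocate.

    hosts: dict mapping host name -> list of VD sizes.
    Returns (offline_hosts, moves), where each move is (source, target, size).
    """
    placement = {h: list(vds) for h, vds in hosts.items()}
    offline, moves = [], []
    # Ascending VD count puts idle hosts (zero VDs) first.
    order = sorted(placement, key=lambda h: len(placement[h]))
    for name in order:
        if len(offline) >= surplus:
            break
        # Current load of every host that stays online and is not the candidate.
        load = {h: sum(v) for h, v in placement.items()
                if h != name and h not in offline}
        plan = []
        for vd in placement[name]:
            # First-fit target choice is an assumption.
            dest = next((t for t in load if load[t] + vd <= capacity), None)
            if dest is None:
                plan = None  # not all VDs fit elsewhere: skip this host entirely
                break
            load[dest] += vd
            plan.append((vd, dest))
        if plan is None:
            continue
        for vd, dest in plan:
            placement[dest].append(vd)
            moves.append((name, dest, vd))
        placement[name] = []
        offline.append(name)
    return offline, moves
```

A host is only emptied when every one of its VDs finds a target, which avoids relocations that would not free a host.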

The simulation results in Figure 4b are from a scenario with realistic user behavior (that is, a daily cycle of user requests according to the Lublin model [15]). These results show that a large potential exists for saving energy at the cost of a small increase in SLA violations. In this scenario, providers can save up to 36.6 percent in energy for an additional 1.7 percent in SLA violations.

Figure 4. A consolidation algorithm. To reduce the energy consumption of servers in the cloud, the algorithm adapts the number of online servers to the system load. This results in a small increase in SLA violations. (a) The algorithm can predict the system load in the next time window. (b) Our simulation considers a daily cycle of arrivals by two user types in a ratio of 3 normal to 1 heavy user, with an average resource request distribution of N(10, 3.5) and N(25, 5), respectively. [Panel (a) shows the previous utilization u_{τ−1}, the current utilization u_τ, and the maximum utilization u_{max,τ} in time window τ, together with the predicted utilization u_{τ+1} and predicted maximum utilization u_{max,τ+1} in time window τ+1. Panel (b) compares energy consumption (kWh) and average probability of SLA violations (percent) with the consolidation algorithm disabled and enabled.]

Existing optimizations of thin client protocols and desktop services each focus on a specific part of the user experience. Currently, we can quantify user experience of thin-client-based virtual desktops offline only via a slow-motion benchmarking technique [13]. Clearly, we need a novel, objective metric that represents the global user experience along with online measurement methodologies. Future research should be devoted to integrating relevant thin client protocol optimizations with resource allocation strategies to achieve the best user experience. To further improve user experience, we should extend the cloud management algorithms presented here to operate on interconnected data centers, for example by relocating virtual desktops from overloaded to less loaded data centers.

Acknowledgments

Lien Deboosere’s and Bert Vankeirsbilck’s research is funded by a PhD grant from the Institute for the Promotion of Innovation through Science and Technology, Flanders (IWT Vlaanderen).

References

1. R. Buyya et al., “Cloud Computing and Emerging IT Platforms: Vision, Hype, and Reality for Delivering Computing as the 5th Utility,” Future Generation Computer Systems, vol. 25, no. 6, 2009, pp. 599–616.

2. A. Lai and J. Nieh, “On the Performance of Wide-Area Thin-Client Computing,” ACM Trans. Computer Systems, vol. 24, no. 2, 2006, pp. 175–209.

3. R. Schwarzkopf et al., “Multi-Layered Virtual Machines for Security Updates in Grid Environments,” Proc. 35th EUROMICRO Conf. Internet Technologies, Quality of Service, and Applications, IEEE Press, 2009, pp. 563–570.

4. N. Tolia, D.G. Andersen, and M. Satyanarayanan, “Quantifying Interactive User Experience on Thin Clients,” Computer, vol. 39, no. 3, 2006, pp. 46–52.

5. F. Travostino et al., “Seamless Live Migration of Virtual Machines over the MAN/WAN,” Future Generation Computer Systems, vol. 22, no. 8, 2006, pp. 901–907.

6. “SpeedScreen Latency Reduction Explained,” white paper, Citrix Systems, Dec. 2000.

7. P. Simoens et al., “Design and Implementation of a Hybrid Remote Display Protocol to Optimize Multimedia Experience on Thin Client Devices,” Proc. Australasian Telecomm. Networks and Applications Conf., IEEE Press, 2008, pp. 391–396.

8. B. Vankeirsbilck et al., “Bandwidth Optimization for Mobile Thin Client Computing through Graphical Update Caching,” Proc. Australasian Telecomm. Networks and Applications Conf., IEEE Press, 2008, pp. 385–390.

9. P. Simoens et al., “Remote Display Solutions for Mobile Cloud Computing,” Computer, vol. 44, no. 8, 2011, pp. 46–53.

10. A. Quiroz et al., “Towards Autonomic Workload Provisioning for Enterprise Grids and Clouds,” Proc. IEEE/ACM Int’l Conf. Grid Computing, IEEE Press, 2009, pp. 50–57.

11. Y. Lu, S. Li, and H. Shen, “Virtualized Screen: A Third Element for Cloud-Mobile Convergence,” IEEE Multimedia, vol. 18, no. 2, 2011, pp. 4–11.

12. “XenDesktop Planning Guide: Hosted VM-Based Resource Allocation,” white paper CTX12277, Citrix, 2010.

13. A. Berryman et al., “VDBench: A Benchmarking Toolkit for Thin-Client-Based Virtual Desktop Environments,” Proc. 2nd IEEE Int’l Conf. Cloud Computing Technology and Science, IEEE Press, 2010, pp. 480–487.

14. B. Urgaonkar, P. Shenoy, and T. Roscoe, “Resource Overbooking and Application Profiling in a Shared Internet Hosting Platform,” ACM Trans. Internet Technology, vol. 9, no. 1, 2009, pp. 1–45.

15. U. Lublin and D.G. Feitelson, “The Workload on Parallel Supercomputers: Modeling the Characteristics of Rigid Jobs,” J. Parallel and Distributed Computing, vol. 63, no. 11, 2003, pp. 1105–1122.

Lien Deboosere is an IT business analyst at Melexis NV. At the time of this research, she was in the Department of Information Technology at Ghent University, Belgium. Her main research interest is the design of architectures for wide-area thin client computing. Deboosere has a PhD in computer science engineering from Ghent University. Contact her at [email protected].

Bert Vankeirsbilck is a PhD researcher in the Internet Based Communication Networks and Services research group, Department of Information Technology (INTEC), Ghent University, Belgium. His main research interest is the execution of resource-intensive applications on mobile devices through thin client technology. Vankeirsbilck has an MSc in computer science engineering from Ghent University. Contact him at bert.vankeirsbilck@intec.ugent.be.

Pieter Simoens is a post-doctoral researcher with the Department of Information Technology at Ghent University, Belgium. His current work focuses on smart clients and cloud computing; his research activities are combined with a mandate as doctoral assistant at Ghent University. Simoens has a PhD in computer science engineering from Ghent University. Contact him at [email protected].

Filip De Turck is a professor in the Department of Information Technology at Ghent University, Belgium. His main research interests include scalable software architectures for telecommunication network and service management, performance evaluation, and design of new telecommunication services. De Turck has a PhD in electronic engineering from Ghent University. He’s on the program committee of several conferences and a regular reviewer for conferences and journals in the telecommunication services field. Contact him at [email protected].

Bart Dhoedt is a professor in the Department of Information Technology at Ghent University, Belgium. His research interests include software engineering, distributed (autonomic) systems, grid and cloud computing, and thin client computing. Dhoedt has a PhD in optoelectronics from Ghent University. Contact him at bart. [email protected].

Piet Demeester is a professor of communication networks in the Department of Information Technology at Ghent University, Belgium, where he heads the Internet Based Communication Networks and Services (IBCN) research group that’s part of the Interdisciplinary Institute for Broadband Technology (IBBT). Demeester has a PhD in photonics from Ghent University. He’s a fellow of IEEE. Contact him at [email protected]; www.ibcn.ugent.be.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.
