Sybase Unwired Platform Performance and Tuning


white paper

October 2010


TABLE OF CONTENTS

Introduction
Sybase Unwired Platform 1.5.2 Architecture
Synchronization Paradigms
Replication-Based Synchronization (RBS)
Message-Based Synchronization (MBS)
Replication-based Synchronization Performance
Synchronization Scenarios
Observations and Application
Observations of the Sample Model Performance
Tuning Recommendation
Message-based Synchronization Performance
Synchronization Scenarios
Observations and Application
Observations of the Sample Model Performance
Tuning for MBS
Tuning Recommendations
Tuning Process
RBS and MBS Comparison
Strength and Weakness
Performance Comparison
Recommendation
Determination Criteria
Validation
Appendix
MBS Tuning Information
Inbound and Outbound Queues for MBS
Change Detection for Package


INTRODUCTION

Sybase Unwired Platform 1.5.2 Architecture

Figure 1: Sybase Unwired Platform Architecture

Sybase Unwired Platform 1.5.2 adheres to the same high-level architecture as earlier versions of Unwired Platform. It is a technology refresh for some of the modules, to improve performance and scalability. At the same time, message-based synchronization has been added to support mobile applications that run over always-on or mostly-on connections. The messaging services module in the above figure includes both replication- and message-based synchronization. The data services module has been redesigned to provide more flexibility in handling mobile data with various characteristics and to yield higher performance and scalability.

Figure 2 on the following page shows the deployment architecture for a typical Sybase Unwired Platform installation. Behind the relay server farm are the Device Management server (Afaria®) and the Unwired Server, providing both replication- and message-based synchronization. The Unwired Platform cluster can host a number of domains for multitenancy, each with its own set of mobile business object (MBO) packages and administrative controls. The hosted Sybase Unwired Platform will access the tenants’ enterprise information system (EIS) via a secured protocol. Data change notifications (DCNs) from the EIS, which update the MBO cache, are sent through the HTTP(S) protocol and distributed to the Sybase Unwired Platform cluster via a software or hardware load balancer.


Figure 2: Deployment Architecture

[Figure 2 callouts: Devices communicate with the relay server over HTTP or HTTPS. The relay server farm (IIS or Apache) sits in the DMZ between two firewalls; an additional relay server is optional for HA. SUP servers (development cluster and production cluster) connect outbound to the relay server farm over HTTP(S). HA is available for the SUP servers. Field devices connect to domains, which host MBO packages. MBOs are deployed to the production server. EIS connectivity: JDBC, Web services, SAP JCo; data change notifications arrive over HTTP(S). Administration: Sybase Control Center, SUP administration console.]

Synchronization Paradigms

Replication-Based Synchronization (RBS)

Figure 3: RBS Data Flow

[Figure 3 components: message-based synchronization, replication-based synchronization, notifier, Mobile Middleware Services, data services, staging data store, data sources.]

Replication-based synchronization has two distinct operation and data transfer phases: upload and download. These phases are conducted as transactions to protect the integrity of synchronization. Replication-based synchronizations are carried out within a single data transfer session, allowing efficient exchanges of large payloads. In Figure 3, RBS Data Flow, solid arrows illustrate the path of an upload and broken arrows the download phase. During upload, a transaction is initiated and the device operations to be replayed are transferred as a set of rows. They are interpreted and executed by the operation replay logic within Mobile Middleware Services (MMS). Depending on the operations’ cache settings, the cache may be invalidated or updated within the same transaction as the EIS system of record. The upload transaction completes only after all replay operations are executed (successfully or not).

If the upload is successful, the download transaction starts, downloading data to make the database on the device consistent with the state of the server’s cache. During download, depending on the state of the cache, data services may have to initiate a READ operation on the EIS to first fill the cache and calculate the difference between new and existing data.
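The two phases described above can be sketched as a small in-memory simulation. This is a minimal illustration and not Unwired Platform code: the `Cache` class, the `synchronize` function, and the dictionary used as the device store are hypothetical names, the cache is assumed to be updated (rather than invalidated) by each replayed operation, and the EIS is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """Hypothetical stand-in for the server-side cache (row id -> value)."""
    rows: dict = field(default_factory=dict)

    def replay(self, op):
        """Apply one device operation; return True on success."""
        kind, key, value = op
        if kind == "create" and key not in self.rows:
            self.rows[key] = value
            return True
        if kind == "update" and key in self.rows:
            self.rows[key] = value
            return True
        if kind == "delete" and key in self.rows:
            del self.rows[key]
            return True
        return False


def synchronize(device_rows, pending_ops, cache):
    """Run one upload phase followed by one download phase."""
    # Upload phase: every queued operation is replayed before the phase ends,
    # whether or not an individual operation succeeds.
    results = [cache.replay(op) for op in pending_ops]

    # Download phase: compute the difference between cache and device state.
    upserts = {k: v for k, v in cache.rows.items() if device_rows.get(k) != v}
    deletes = [k for k in device_rows if k not in cache.rows]

    device_rows.update(upserts)
    for key in deletes:
        del device_rows[key]
    return results, upserts, deletes


cache = Cache({1: "order A", 2: "order B"})
device = {1: "order A"}
ops = [("update", 1, "order A (revised)"), ("delete", 9, None)]
results, upserts, deletes = synchronize(device, ops, cache)
```

After the call, the device dictionary holds the same rows as the cache, and `results` records which replayed operations succeeded.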

Message-Based Synchronization (MBS)

Message-based synchronization is conducted via the reliable exchange of messages, trading off between efficiency and immediacy. Some messages arrive earlier than in replication-based synchronization because the payload is spread out over many relatively small messages that are delivered individually, as opposed to being delivered as a single large download. However, an always-on, or almost-always-on connection is required to achieve the benefit of near real-time data exchange.

Figure 4: MBS Data Flow

Reliable messaging requires transactions for both uploading and downloading individual messages to devices. As a result, in the current release it is relatively expensive to transfer large amounts of information via messaging. Incoming messages from the devices are either synchronization protocol directives or payload (operations to be replayed with associated data). Downloaded messages are either operation results or imports from the server that represent the server-side state. Each message containing operations is executed on the server, and the result is returned in a message to the device. Mobile Middleware Services execute change detection for each subscribed device on a schedule and push out new messages as necessary.


REPLICATION-BASED SYNCHRONIZATION PERFORMANCE

Synchronization Scenarios

Replication-based synchronization (RBS) is based on differencing technology, whereby only updates from the mobile client to the server are integrated with data in the mobile middleware cache and pushed to the EIS. In the download operation, EIS delta changes are determined for, and supplied to, the specific client.

In RBS scenarios, there are three main points at which to consider performance: the client, the mobile middleware server, and the EIS. Performance is most affected by the mobile data model, the tuning of the mobile server, and the ability of the EIS to service the mobile model in response to client requests.

Replication-based synchronization use cases center around three primary phases in moving data between server and client:

• Initial synchronization – where data is moved from the back-end enterprise system, through the mobile middleware, and finally onto the device storage. This phase represents the point where a new device is put into service or data is reinitialized, and therefore represents the largest movement of data of all the scenarios. In these performance test scenarios, the device-side file may be purged to represent a fresh synchronization, or preserved to represent specific cache refreshes on the server.

• Incremental synchronization – involves create, update, and delete (CUD) operations on the device, where some data is already populated on the device and changes made to that data must be reconciled with the middleware and the back-end enterprise system. When create and update operations occur, changes may be pushed through the middleware cache to the back end, reads may occur to properly represent the system of record (for example, when data on the back end is reformatted), or both. This scenario represents incremental delta changes to and from the system of record.

• Data change notification (DCN) – changes to the back-end data are pushed to the mobile middleware and then reconciled with the device data. DCN is typically observed in the context of additional changes to the device data so that changes from both device and back end are simultaneously impacting the mobile middleware cache.

Figure 5 is a representative data model for all of the replication use cases in this document. Its objective is to demonstrate performance characteristics of various data synchronization loads in the context of a fairly simple example that can be extrapolated to more complex scenarios.


In this model, the employee, which is related to a personalization key that is set on the device, defines a partial subset of data that is downloaded for a device. Since the synchronization parameter for sales order is tied to the employee ID, the cache for sales orders and sales order line items is perfectly partitioned by the employee ID (sales_rep). In other words, no two devices share sales data; each device downloads, on demand, the sales data only for the employee who is related to the specific device. Other related MBOs also provide specific data associated with sales orders, for example, sales order line items. When an employee synchronizes with the mobile server, the server loads the sales data for that employee based on the employee ID, and automatically locates and synchronizes all the other MBO data related to those orders through primary key-foreign key relationships.
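The partitioning described above can be illustrated with a short sketch. Everything here is hypothetical: the `EIS_ORDERS` and `EIS_LINE_ITEMS` lists stand in for the back-end data, and `PartitionedCache` is an illustrative class, not part of Unwired Platform. It assumes one cache partition per employee ID, loaded on demand the first time that employee's device synchronizes.

```python
# Hypothetical EIS contents: sales orders carry the sales_rep (employee ID),
# and line items point at their order through a foreign key.
EIS_ORDERS = [
    {"order_id": 100, "sales_rep": 7},
    {"order_id": 101, "sales_rep": 7},
    {"order_id": 200, "sales_rep": 8},
]
EIS_LINE_ITEMS = [
    {"item_id": 1, "order_id": 100},
    {"item_id": 2, "order_id": 101},
    {"item_id": 3, "order_id": 200},
]


class PartitionedCache:
    """Illustrative cache with one partition per employee ID."""

    def __init__(self):
        self.partitions = {}
        self.eis_reads = 0  # counts on-demand loads from the EIS

    def partition_for(self, employee_id):
        # Load the partition on demand the first time this employee syncs.
        if employee_id not in self.partitions:
            self.eis_reads += 1
            orders = [o for o in EIS_ORDERS if o["sales_rep"] == employee_id]
            order_ids = {o["order_id"] for o in orders}
            # Follow the order -> line item relationship.
            items = [i for i in EIS_LINE_ITEMS if i["order_id"] in order_ids]
            self.partitions[employee_id] = {"orders": orders, "line_items": items}
        return self.partitions[employee_id]


cache = PartitionedCache()
rep7 = cache.partition_for(7)
rep8 = cache.partition_for(8)
cache.partition_for(7)  # second sync for employee 7 is served from the cache
```

Because no order belongs to two employees, the two partitions never overlap, and a repeat synchronization for the same employee does not trigger another EIS read.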

The product MBO is considered common reference data; the entire contents are downloaded for each client once during initial synchronization. This MBO also supports bulk-download capacity tests, because the amount of downloaded data varies with the number of products downloaded to devices. Once the data is loaded into the Sybase Unwired Platform cache, it is reloaded only if changes are made to the reference data in the EIS or if the device invokes a create, update, or delete operation on the product catalog.

In each performance scenario, the latency effects of the EIS system are minimized by using an in-memory database that avoids reading data from disks that might be competing with Unwired Server disk requests. The Unwired Server infrastructure is run on three tiers; the EIS is on one tier, the consolidated database/cache is on the middle tier, and the Unwired Servers are on the front tier. A relay server farm between the client device and the Unwired Servers balances requests across all servers in the cluster.

• Initial sync use case, no client data present – vary the number of users and the data size on initial sync. This use case primes the Unwired Platform cache for each client/sales representative, then removes the client’s data. This measurement demonstrates the time the server takes to look in its cache for partitioned and reference data and return it to the client. In this synchronization, the client requests server-side updates; the server determines the last time the client synchronized (which in this case is never), looks in the client’s sales order partition based on the employee ID, locates the reference data, and returns the employee’s sales orders, line items, and the common reference data.

• Subsequent sync use case, no data changes – vary the number of users and data size on initial sync. This use case primes the cache for each client/sales representative and also for the reference data, so that the EIS data load is measured before repeatedly executing and measuring data synchronizations (without any data changes to client or server) for each client. In this synchronization, the client requests server-side updates; the server determines the last time the client synchronized, checks the currency of the client’s sales order partition (based on the employee ID) and of the reference data, and returns nothing because neither client nor server data has changed since the client last synchronized.

• Incremental sync use case, client create, update, delete (CUD) – vary the number of users and data size after initial sync. This use case assumes a primed Unwired Platform cache and client. The effect of this synchronization is that the client updates one to ten sales orders and pushes those changes via parameter updates to the server cache and EIS, and subsequently reads the results of the operations’ EIS responses back to the client. The EIS response may include changes to the client updates.

• Incremental sync use case, DCN – vary the number of users and the data size after initial sync. This use case assumes a primed Unwired Platform cache and client. In these synchronizations, the test client first acts as the source of a server-side update, sending an update to the server via an HTTP operation; then, acting in the role of a device, it synchronizes with the server and checks for valid order changes.


Observations and Application

In reality, each of these read/write synchronization scenarios occurs in differing proportions during daily life cycles of a mobile application, and can be estimated individually and then aggregated accordingly to determine the total load on the server and individual device performance characteristics.
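As a rough sketch of the "estimate individually, then aggregate" approach, the function below weights a per-synchronization cost by each scenario's share of traffic. The function name, the traffic volume, and the cost figures are placeholders invented for illustration; only the idea of combining per-scenario measurements comes from the text, and real figures must come from your own tests.

```python
def aggregate_load(syncs_per_hour, mix, cost_seconds):
    """Return estimated server-seconds of work per hour for a scenario mix.

    mix maps scenario name -> fraction of all synchronizations (sums to 1).
    cost_seconds maps scenario name -> measured server time per synchronization.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("scenario fractions must sum to 1")
    return sum(syncs_per_hour * share * cost_seconds[name]
               for name, share in mix.items())


# Placeholder figures for illustration only; substitute your own measurements.
mix = {"initial": 0.05, "subsequent": 0.95}
cost_seconds = {"initial": 20.0, "subsequent": 0.5}
load = aggregate_load(1000, mix, cost_seconds)
```

With these placeholder figures, the small share of initial synchronizations still accounts for most of the estimated work, which is why the two scenarios are worth measuring separately.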

For example, initial synchronization occurs when a new device is deployed or when a device is reinitialized. The amount of data transferred to each device and the number of concurrent devices that are synchronizing are two important parameters that affect performance at this stage. This activity typically occurs once or rarely, but it nevertheless puts a large load on the system, as all data for that device must be collected on the server, encrypted, transmitted, decrypted, and stored on the device (while potentially being re-encrypted with that device’s key).

After initial synchronization, a mobile application user operating within an “occasionally connected” environment performs his or her daily work tasks offline, and typically makes several changes to device data while invoking operations. These operations are queued up and at some point, either incrementally during the day or at the end of the day, the user synchronizes all of these changes with the server. The amount of data transferred depends on the number of changes and the associated data that accompanies the business transaction to the mobile server. As the device uploads its operations to the Unwired Server, the server incorporates the EIS changes (which may include reformatted or adorned EIS inserts or updates originally made by the device), and finally sends the result back to the device.

Changes that are queued up by the Unwired Server for synchronization include DCN pushes from the back end (EIS), scheduled cache refreshes to data partitions, and changes made by other mobile users that share data partitions. A data partition is defined by keys or parameters supplied by the device that serve to filter data sets within the MBO model. When MBO keys or parameters that are passed from a device match the data entered by other device users, the Unwired Platform cache data is shared among device users in a common cache partition. A properly designed and partitioned model defines narrow slices of EIS information that are relevant to its user so the least amount of data is transmitted to each device and shared information is rarely updated by devices.

Observations of the Sample Model Performance

The example model used in the performance cases partitions the sales orders, along with the related line items, so that any one device receives only its sales orders. This reduces the amount of data that must be sent to any particular device, which consequently reduces the amount of time for the server to locate, bundle, and transmit data to the device. Common reference data that rarely changes is synchronized on a schedule, so that the server can quickly determine the last time a device synchronized and compare that time to the last refresh of the reference data. This avoids unnecessary comparisons of individual data elements within the reference data partition.

Each performance use case provides visibility into aspects of operating overhead for mobilization; however, the cost of your EIS is not considered. When developing your mobile applications, center the bulk of performance considerations on proper model partitioning and EIS performance.

Initial Synchronization


If the synchronization and load parameters are both bound to the same personalization key, the cache is partitioned accordingly and loaded on demand when a device with that parameter/personalization key is synchronized. If the load parameter defines a batch load (for example, a “select * from…” statement), then the synchronization key can be used to filter data in the cache so that some subset of server-side cached data is returned to the user.

The initial synchronization is nearly always the most costly phase in terms of time to complete and load on the server. Fortunately, this type of synchronization typically occurs only on new device deployment or the occasional device reset. There are two major aspects to this synchronization phase — the Unwired Platform cache load time and the movement of the data to the device. Reference data is often bulk-loaded on a schedule so cache load time does not coincide with device synchronization, and each device does not affect the state of the reference data.

In our initial synchronization use case, the reference data has already been loaded into the cache when devices synchronize, and individual partitions are loaded as each device synchronizes. This initial synchronization requires that the user’s partition be loaded from the EIS and that the partition data and all reference data be transferred to the device. It is highly unlikely that all devices will be initially synchronized simultaneously, so when planning for capacity, take some small portion of this load into account in production scenarios alongside incremental synchronization. Some clients choose to dedicate an Unwired Server specifically to initial synchronization to predictably manage the load on Unwired Servers handling subsequent synchronizations.

Subsequent Synchronization

If a device user synchronizes without making any changes to existing data or invoking server operations, synchronization time includes only the cost of the algorithm to check the last synchronization time against the time the cache was last updated. This synchronization is a baseline cost and useful when comparing the cost of changes and data transferred between device and server. The total load on the server can be characterized as baseline synchronizations where a user is simply looking for changes on the server. Compare the cost of this operation against cases where the user has made changes, or where the data changed on the server; you can extrapolate the estimated costs based on the amount of data transferred.
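The baseline check described above amounts to a timestamp comparison. The sketch below is an assumption about the shape of that check, not the server's actual algorithm; `needs_download` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def needs_download(last_device_sync, last_cache_update):
    """Return True when the cache changed after the device last synchronized."""
    if last_device_sync is None:  # the device has never synchronized
        return True
    return last_cache_update > last_device_sync


cache_updated = datetime(2010, 10, 1, 8, 0)
never_synced = needs_download(None, cache_updated)
stale_device = needs_download(cache_updated - timedelta(hours=1), cache_updated)
current_device = needs_download(cache_updated + timedelta(minutes=5), cache_updated)
```

Only the last case, where the device synchronized after the most recent cache update, corresponds to the no-change baseline; the other two require data to be transferred.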

Incremental Synchronization – CUD

In our use cases, the EIS cost is assumed to be negligible because the database is running in-memory. If possible, this type of measurement — where the EIS latency is removed from the equation — can be useful in determining the raw processing power required to support device synchronization (server cache maintenance overhead and data transmission costs).

As previously discussed, the EIS will almost certainly account for the bulk of the device synchronization response time. A slow EIS will also tie up resources in the Unwired Server with the potential to further impede devices that are competing for resources in connection pools, and so on.

The complexity of the mobile model, measured by the number of relationships between MBOs, has a significant impact on create, update, and delete operation performance. Shared partitions among users or complex locking scenarios involving EIS operations can become a major performance concern during device update operations. Cache and EIS updates are accomplished within the scope of a single transaction, so other users in the same partition are locked out during an update of that partition. Consider denormalizing the model if complex relationships cause performance concerns.

Incremental Cache Update – DCN


behavior is application-specific but can be determined by analyzing the mobile data model to understand how much of the data is shared. In our use cases, none of the DCN data is shared, as sales order data is specific to each client. However, if the reference data in our example (the product data) is updated, all devices observe those changes as they synchronize.

There are several ways to materially improve DCN performance. First, by using a load-balancer between the EIS and the Unwired Server, DCNs can be efficiently applied across an Unwired Platform cluster, as each node in the cluster helps to parse incoming payloads. Combining multiple updates into a single batch also has positive impacts on performance. Finally, running DCNs from a multithreaded source can parallelize updates. Figure 6 illustrates four clients (threads) updating a cluster. In this case, there is a diminishing return beyond three to four clients, in large part due to the nature of the model. Different models exhibit different performance characteristics when applying updates, so proper analysis of application behavior is important.

Figure 6: DCN Concurrent (Threaded) Update Sample
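Two of the techniques above, batching updates and sending them from a multithreaded source, can be sketched as follows. This is an illustration under stated assumptions: `push_dcns`, `batches`, and `fake_send` are hypothetical names, the batch size of 50 is arbitrary, and the `send` callable stands in for whatever transport delivers one DCN request (it does not reproduce the actual DCN request format).

```python
from concurrent.futures import ThreadPoolExecutor


def batches(updates, batch_size):
    """Split a list of updates into consecutive batches."""
    return [updates[i:i + batch_size] for i in range(0, len(updates), batch_size)]


def push_dcns(updates, send, batch_size=50, workers=4):
    """Send batched updates from several threads; return per-batch results.

    `send` is a caller-supplied function that delivers one batch (for example,
    one HTTP request through the load balancer) and returns a result.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches(updates, batch_size)))


def fake_send(batch):
    # Stand-in for the real transport; reports how many updates it received.
    return len(batch)


updates = [{"order_id": n, "status": "shipped"} for n in range(120)]
sent = push_dcns(updates, fake_send)
```

The default of four workers mirrors the four clients in the sample; the right worker count and batch size depend on the model and should be found by measurement.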

Tuning Recommendation

RBS tuning can best be understood in the context of the overall system topology. The relationships among the Unwired Platform mobile middleware, the consolidated database (CDB), and the EIS connectivity are shown in Figure 7, “Tuning Topology,” along with their major tuning points. Tuning is a function of providing the highest throughput while limiting CPU and memory consumption to reasonable operational levels.

To achieve this balance, maximize bandwidth between the relay server and its outbound enablers and the first point of interaction with the core middleware. In our performance analysis, for subsequent synchronizations with large payloads greater than 4MB, we needed to increase the bandwidth between the relay server and the Unwired Platform cluster nodes by increasing the number of relay server outbound enablers (RSOEs) and increasing the shared memory of the relay servers.


Working back from the thread count, we find the connection pools to the various storage points and the EIS. Provide as many resources as are available as you work back to the EIS. Do not limit the connection pools unless there is a requirement to do so relating to the storage medium. Each tuning point is described in the table below, along with the default and sample production settings.

Figure 7: Tuning Topology

There are many configuration properties in the server, but only a few have the greatest impact on production performance and are discussed here. The table includes default configurations as well as the scenario configurations. Default configurations are usable for small installations, for example, developer configurations.

The speed of the consolidated database (CDB) storage and storage controllers is the single most important factor in providing good system performance. Staging of mobile data is performed within the CDB in a persistent manner such that if a device synchronizes, it receives changes only since the last synchronization, and these changes are often provided without impacting the EIS. The CDB database server itself is largely self-tuning, although typical database maintenance is essential for proper performance. Database maintenance is beyond the scope of this document.

The production settings listed in this table are generally suitable for production scenarios that use CDB configurations with eight or more cores. Notes in the table describe configurations where it makes sense to consider alternative settings.


MOBILE MIDDLEWARE COMPONENT

Setting: Unwired Platform replication thread count
Function and recommendation: Controls the concurrent number of threads that service devices for synchronization. This setting controls the amount of CPU used by the Unwired Platform and CDB tiers. If processor usage on either the CDB or the Unwired Server is excessively high, you can use this setting to throttle the number of requests and limit contention for resources. Low settings decrease parallelism; high settings may cause undue contention for resources.
Default: 10 per server
Production setting: 12 per node in a 2-node cluster (or 24 for a single node)

Setting: Synchronization cache size
Function and recommendation: The maximum amount of memory the server uses for holding table data related to device users, network buffers, cached download data, and other structures used for synchronization. When the server has more data than can be held in this memory pool, the data is stored on disk.
Default: 50MB
Production setting: 1024MB (1G)

Setting: JVM minimum heap size (DJC_JVM_MINHEAP)
Function and recommendation: The minimum memory allocated to the differencing and cache management functions of the server. To change this setting on a service, uninstall the service, make the change, and reinstall the service.
Default: 64MB
Production setting: 2048MB (2G)

Setting: JVM maximum heap size (DJC_JVM_MAXHEAP)
Function and recommendation: The maximum memory allocated to the differencing and cache management functions of the server. To change this setting on a service, uninstall the service, make the change, and reinstall the service.
Default: 256MB
Production setting: 4096MB (4G)

Setting: Relay Server shared memory configuration
Function and recommendation: Settings for shared memory buffer. Remember that shared memory must account for concurrent connections, especially with large payloads.
Default: 10MB
Production setting: 2048MB

Setting: Relay Server/RSOE
Function and recommendation: With SQL Anywhere® 11, each RSOE provides an upload and download channel to the relay server.
Default: 1 RSOE
Production setting: 3 RSOE per Unwired Platform node for initial sync > 4MB

Note: The synchronization differencing algorithms are a key feature of RBS; this technology runs in the JVM. You must provide adequate memory to these components. If these algorithms are memory starved, the JVM spends an inordinate amount of time garbage collecting memory, and synchronizations back up in the internal queues. You can monitor process memory usage with tools like Sysinternals Process Explorer to determine the actual amount of memory in use by Unwired Platform, and adjust the JVM heap size accordingly.

CONSOLIDATED DATABASE (CDB) COMPONENT

Setting: CDB thread count (corresponds to the SQL Anywhere -gn option)
Function and recommendation: Sets the maximum multiprogramming level of the database server. It limits the number of tasks (both user and system requests) that the database server can execute concurrently. If the database server receives an additional request while at this limit, the new request must wait until an executing task completes.
Default: 20
Production setting: 200

Setting: CDB initial cache size (corresponds to the SQL Anywhere -c option; sqlanywhereoptions.ini file)
Function and recommendation: Sets the initial memory reserved for caching database pages and other server information. The more cache memory that can be given the server, the better its performance. The size is the amount of memory, in bytes. Use “k”, “m”, or “g” to specify units of kilobytes, megabytes, or gigabytes, respectively. The unit “p” is a percentage of the available physical memory.
Default: 24MB
Production setting: 8GB

Setting: CDB disk configuration
Function and recommendation: Isolate the data and log on distinct physical disks. Of the two files, the log receives the most activity. See the SQL Anywhere documentation to run a command similar to the following, which sets the transaction log to a different disk:

dblog -t "D:\DbTxlog\cdb.log" "C:\SUP\CDb\data\default.db"

Use disk controllers and SAN infrastructure with write-ahead caching in addition to high-speed disk spindles or static memory disks. The same database must support all cluster members. The faster these database drives perform, the better the performance on the entire cluster.
Default: 1 disk
Production setting: 2 disks, 10K RPM

EIS CONNECTIVITY COMPONENT

Setting: Default (CDB) connection pool maximum size
Function and recommendation: Sets the maximum (JDBC) connection pool size for connecting to the CDB from each cluster member. In the example production configuration, each Unwired Server is allowed more connections than the number of threads set in the CDB thread configuration. This ratio helps to ensure that the CDB database server is the control point for limiting resource contention in the cluster.

In a server configured with the CDB connection pool settings set to 0, the actual number of connections used will correspond to the number of Unwired Server threads in use. The number of internal connections in use at any one time per thread varies, although fewer than four is typical.
Default: 100
Production setting: 0 (unlimited)

Setting: EIS connection pool maximum size
Function and recommendation: The sample database was used as the EIS in the production analysis use cases, having been loaded with large test data sets on separate physical hardware, and being run in-memory. In a real-world scenario, it is highly likely that the EIS will introduce the majority of delays in mobile synchronization. Ensure that adequate EIS resources are allocated for servicing the mobile infrastructure.

The actual number of connections necessary varies, based on the maximum number of Unwired Platform threads in use at any time and the duration it takes for the EIS to respond. If possible, allocate a connection for each thread (or leave the setting unbounded).

If you must limit the number of EIS connections in the pool to a lower number than the number of Unwired Platform threads and you experience timeouts, you may need to adjust the EIS connection timeout values; however, these connections will impede other threads competing for EIS connections. When you add a new EIS connection, the “Max Pool Size” property may not be present but you can add it in the management console, using the <ADD NEW PROPERTY> item.
Default: 10
Production setting: 0 (unlimited)

Note: Because the RBS model uses staging and replication technology, as well as using a differencing algorithm to determine what to synchronize to each device, the CDB is one of the most critical components for performance in terms of processing power, memory, and disk performance. The CDB must also be scaled vertically (on larger hardware) because it supports all of the nodes of the cluster.
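The connection pool guidance in the table notes that, with an unlimited pool, connection usage follows the number of Unwired Server threads in use, with fewer than four internal connections per thread being typical. A rough upper bound for replication threads can therefore be worked out as below. The function name is hypothetical, the factor of four is taken as a ceiling from that remark, and connections used by message-based synchronization or monitoring are not counted.

```python
def max_cdb_connections(nodes, threads_per_node, connections_per_thread=4):
    """Upper-bound estimate of CDB connections opened by replication threads."""
    return nodes * threads_per_node * connections_per_thread


# Sample production values from the table: 2 nodes, 12 replication threads each.
estimate = max_cdb_connections(nodes=2, threads_per_node=12)
single_node = max_cdb_connections(nodes=1, threads_per_node=24)
```

Both sample configurations give the same bound because they run the same total number of replication threads; use the figure only as a starting point when sizing the CDB.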

The screen illustrations below show the location of some of the major configuration settings affecting performance.


Figure 9: JVM Settings

In the CDB default settings, the thread count of the database is controlled by a configuration file called install-sup-sqlany11.bat located in:

{install location}\Servers\SQLAnywhere11\data

The CDB_THREADCOUNT configuration setting controls the –gn setting for SQL Anywhere on the data tier. The “default” connection pool is the connection pool on the Unwired Platform tier that is used to communicate with the CDB data tier for both RBS and MBS.


RbS Tuning Process

To tune a production system:

1. Install the system components on distinct server tiers—the relay server, the mobile middleware, the CDB, monitoring.

2. Isolate the monitoring database from the CDB server and turn monitoring off in both the domain and the default configuration (see footnote 2).

3. Shut down the cluster, including the CDB.

4. Use the SQL Anywhere utility dblog.exe to place the CDB database and its transaction log on isolated, fast storage.

5. Set the CDB thread count and initial cache size to an appropriately high value according to the configuration table.

a. Stop the CDB.

b. If your database is running as a service, run:

{install location}\Servers\SQLAnywhere11\data\install-sup-sqlany11.bat remove

c. Edit the properties.

d. If your database is running as a service, run:

{install location}\Servers\SQLAnywhere11\data\install-sup-sqlany11.bat auto hostname_primary hostname 5200

e. Restart the CDB.

6. Restart the Unwired Servers.

7. In the console, under package connections, adjust the connection pools to high values using Max Pool Size. If possible, use a value of 0 (unlimited connections) for the default, monitoring, and any EIS connection pools. If the Max Pool Size property does not exist on a connection, use the management console to add it.

8. For each Unwired Server configuration:

a. Adjust the synchronization cache size.

b. Adjust the replication thread count for the entire cluster, so each node has an equal thread count (see the configuration table). This setting is used again later in the process to throttle total system capacity.

c. Under the general settings, performance configuration, adjust the thread stack size and JVM heap sizes to the desired values (see the configuration table).

9. Restart the servers.

10. Run a typical application maximum mixed-load test on the server (5% initial sync, 95% subsequent sync) and note the CDB CPU load. Perform this measurement directly against the Unwired Servers (default port 2480), rather than against the relay server, to ascertain the maximum throughput of the Unwired Platform and CDB tiers.

a. If the load is too high, decrease the replication thread count on each server, restart the server, and again observe the CDB CPU load.

b. If the CDB CPU load is low, note the client response times of similar syncs (initial to initial, subsequent to subsequent, similar payload, cache policy, and so on) and try increasing the replication thread count in small increments. Restart the server and repeat this process; the goal is to get the synchronization response time as low as possible for the same number of clients and types of synchronizations. Once the relative client response time is as low as possible without further increasing the replication thread count, you have reached the configuration that yields the maximum throughput. In other words, tune the thread count to yield the best overall response times. Usually, this thread count number is small.

c. Repeat this process with the actual EIS and mobile applications to confirm the results; EIS latencies and cache policies can change the dynamics of the server.

11. Introduce the relay server farm and re-run your analysis.

12. Turn on and configure only the monitoring you really need.
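The search in step 10 can be sketched as a simple loop. This is only an illustration: the `measure` hook, the 85% CPU ceiling, and the upper bound on threads are assumptions made for the example, not Unwired Platform settings or APIs. In practice, `measure` stands for the manual cycle of changing the replication thread count, restarting the servers, and re-running the mixed-load test.

```python
from typing import Callable, Tuple


def tune_replication_threads(
    measure: Callable[[int], Tuple[float, float]],
    start_threads: int,
    step: int = 1,
    max_cpu: float = 0.85,   # assumed ceiling for "load is too high"
    max_threads: int = 64,   # assumed upper bound for the search
) -> int:
    """Return the thread count with the lowest observed response time.

    measure(threads) is a hypothetical hook that applies the thread count,
    runs the load test, and returns (cdb_cpu_load, mean_response_seconds).
    """
    threads = start_threads
    cpu, response = measure(threads)

    # Step 10a: back off while the CDB CPU load is too high.
    while cpu > max_cpu and threads > step:
        threads -= step
        cpu, response = measure(threads)

    # Step 10b: raise the count in small increments while response time
    # keeps improving and the CDB stays below the CPU ceiling.
    best_threads, best_response = threads, response
    while threads + step <= max_threads:
        threads += step
        cpu, response = measure(threads)
        if cpu > max_cpu or response >= best_response:
            break
        best_threads, best_response = threads, response
    return best_threads
```

The loop stops at the first thread count that no longer improves response time, which mirrors the guidance that the best value is usually small.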

2 To ensure both domain and default monitoring are off, check the property files in the following location and ensure that “status”, “enabled”, and “autoStart” are all set to false: {install location}\UnwiredPlatform\Servers\UnwiredServer\Repository\Instance\com\sybase\sup\server\monitoring\MonitoringConfiguration\domainLogging.properties
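The footnote's check can be scripted. The sketch below assumes that the properties file uses plain `key=value` lines and that the three keys appear under exactly the names quoted in the footnote; neither assumption is confirmed here, so adjust the key names to match the actual file.

```python
from pathlib import Path
from typing import Dict, List

# Key names taken from the footnote; their exact spelling in the file is assumed.
MONITORING_KEYS = ("status", "enabled", "autoStart")


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def monitoring_still_on(props: Dict[str, str]) -> List[str]:
    """Return the monitoring keys that are not explicitly set to false."""
    return [k for k in MONITORING_KEYS if props.get(k, "").lower() != "false"]


def check_file(path: str) -> List[str]:
    """Read a properties file and report keys that still enable monitoring."""
    return monitoring_still_on(parse_properties(Path(path).read_text()))
```

An empty list from `check_file` means all three keys are set to false; a missing key is reported as not disabled, so that it is reviewed by hand.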


Once you have obtained a balance between optimal response times for a certain configuration with a representative number of concurrent clients, adding additional clients (load) may increase response times, although the system throughput should remain relatively stable. The number of concurrent users is a function of total users: normally, 5 to 10% of the total user population is expected to be concurrent. For example, 200 concurrent users represents a total user population of between 2,000 (at 10% concurrency) and 4,000 (at 5% concurrency), based on this metric.
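The concurrency rule of thumb is simple arithmetic, shown here as a small sketch. The function names are illustrative only; the 5% and 10% bounds are the ones given above.

```python
from typing import Tuple


def total_population_range(
    concurrent_users: int, low: float = 0.05, high: float = 0.10
) -> Tuple[int, int]:
    """Return (smallest, largest) total population implied by a concurrent load."""
    return round(concurrent_users / high), round(concurrent_users / low)


def expected_concurrency(
    total_users: int, low: float = 0.05, high: float = 0.10
) -> Tuple[int, int]:
    """Return (fewest, most) concurrent users expected for a total population."""
    return round(total_users * low), round(total_users * high)


print(total_population_range(200))   # (2000, 4000)
print(expected_concurrency(2000))    # (100, 200)
```

Use the second function when sizing a load test from a known user population, and the first when interpreting a test that was run with a fixed number of concurrent clients.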

Figure 11: Representative Response versus Throughput

Monitoring Configuration

Normally, monitoring in a cluster configuration occurs at two levels: the cluster and the domain. Sybase recommends that you turn off monitoring when assessing cluster performance, and apply monitoring as a final step. Excessive monitoring has a detrimental impact on the performance of each server in the cluster. Monitoring is performed on separate threads in the server, so monitoring requests do not directly impact Unwired Platform performance. However, these background threads have an overall impact on the server such that, in a default monitoring configuration, you can expect a three percent impact on response times and throughput.

In a production system, isolate the monitoring database from other cluster components. Leaving the monitoring database on the same machine as the CDB unduly impacts end-to-end performance. Allocate adequate hardware to accommodate a moderately busy monitoring database that may be used by more than one cluster. Put procedures in place to purge data from this store on a regular basis. If a monitoring database is shared among more than one cluster, size the monitoring server accordingly. Set the size of connection pools to the monitoring database from each Unwired Server in accordance with the configuration for the CDB connection pools, which is normally unlimited.

Message-Based Synchronization Performance

Synchronization Scenarios


The following scenarios are relevant to considering the performance of MBS.

• Subscription – a mobile application subscribes to an MBS package to receive data as “import” messages from the server. This is similar to initial synchronization in RBS. Upon the receipt of the subscription request from the device, the server checks security and executes a query against the consolidated database (CDB) to retrieve the data set, which it then turns into a series of import messages to be sent to the device. The maximum message size is currently fixed at 20KB. Increasing the maximum allowable size would allow the same amount of import data with fewer messages. However, in some devices, large message size can trigger resource exhaustion and failures. In Sybase Unwired Platform version 1.5.2, you cannot change the maximum message size.

• Subsequent Synchronization – device-side data changes due to the user interacting with a mobile application. It is important that you understand the unit of change in MBS. An operation with an associated tree of objects with a containment (or composite) relationship is considered a unit of change. Changes are wrapped in a message with the appropriate operation type: create or update. (The delete operation requires only the primary key that identifies the root of the tree to be transmitted, rather than the entire object tree). Upon receipt of the message, the server replays the operation against the EIS and returns a replay result message with one or more import messages that reflect the new state of the data. Unlike RBS, the unit of change is pushed to the device as soon as the changes are ready, so subsequent synchronization is occurring in the form of many messages. As a result, the synchronization happens over time as a stream of messages.

• Incremental Synchronization – the difference between RBS and MBS is best described as pull versus push. With RBS, the client decides when to synchronize any new data on the server side. With EIS-initiated data change notifications (DCNs) with an RBS synchronization, the cache in the CDB is updated and maintained on the server until a client initiates synchronization. As MBS uses push to send out the updates as import messages to the devices, each DCN normally causes updates to device data over the MBS channels. The MBS implementation in Sybase Unwired Platform version 1.5.2 pushes out changes on a configurable schedule within the server. The main reason for this behavior is the concern for activity storms arising from batched DCNs where many granular changes on the cache cause flurries of messages that might be better consolidated first on the server. Currently, most enterprise information systems are not event driven and DCNs are batched. Hence, having batched triggers directly driving an event based model can decrease efficiency without increasing data freshness.

Observations and Application

MBS uses the same data model as RBS. The life cycle of an MBS application starts with subscription to an MBS package exposed by the Unwired Server. If subscription is successful, import messages flow from the server to the device, where they are processed and inserted into the mobile database that corresponds to the subscribed package. The MBS programming model is asynchronous, which means the application can take advantage of partial data without having to wait for the end of import. However, the application use case determines whether partial data can be leveraged. If the application must wait for all data to be imported, the subscription or initial synchronization turns into a synchronous event for the application, and the user is notified when all changes have been imported. Unless connectivity is robust and of high capacity, manage user expectations when using a very large initial data set. Results of CUD operations are returned asynchronously and applications should not attempt to block and wait for the response. Connectivity may not be available, and a blocking wait can render the application unusable. Because of this, the response time between CUD request and result is not typically a gating factor for the user. Most mobile applications that employ MBS should have a degree of fire-and-forget characteristics when it comes to CUD operations.
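The fire-and-forget pattern described above can be illustrated with a generic sketch. This is not the Unwired Platform client API: `CudOutbox`, `send`, and `on_result` are hypothetical names, and a real MBS client delivers results through its own messaging layer. The point is only that submitting an operation returns immediately and the result arrives later through a callback.

```python
import queue
import threading
from typing import Callable, Dict


class CudOutbox:
    """Hypothetical device-side outbox: queue CUD operations without blocking."""

    def __init__(
        self,
        send: Callable[[Dict], Dict],
        on_result: Callable[[Dict, Dict], None],
    ) -> None:
        self._pending: "queue.Queue" = queue.Queue()
        self._send = send            # stand-in for the transport (assumed)
        self._on_result = on_result  # called when a replay result arrives
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, operation: Dict) -> None:
        """Queue the operation and return at once; the caller never waits."""
        self._pending.put(operation)

    def _drain(self) -> None:
        # Background thread: send each queued operation, then report its result.
        while True:
            operation = self._pending.get()
            if operation is None:  # sentinel placed by close()
                return
            self._on_result(operation, self._send(operation))

    def close(self) -> None:
        """Flush queued operations and stop the background thread."""
        self._pending.put(None)
        self._worker.join()
```

The application keeps working after `submit` and updates its display when `on_result` fires, which is the behavior the asynchronous programming model expects.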

Observations of the Sample Model Performance


Initial Synchronization

Like RBS, the subscription phase is the most expensive or time consuming portion of the life cycle, especially for a large initial data set. However, compared to RBS, MBS can take longer to populate the data on the device database because of the aggregation of fine-grained messages. This latency is due to the additional cost of providing reliability to the delivery of every message. In addition, a significant amount of processing is incurred on both the server and the client side to marshal messages. While this marshaling may not necessarily impact the server, it places a heavy burden on the device, especially if the device is on the low end of the performance spectrum. The message import is also limited by the device’s nonvolatile storage write performance.

Testing has indicated that one Unwired Server node is capable of sustaining around 70 messages per second to devices. A second Unwired Server node provides an additional 60% increase in delivery capability. This increase is because the second MBS node increases message marshaling and transmission capability for the entire device population.

Device processing capacity limits the size of each message to 20KB. Due to serialization of data to JSON, a roughly 2X increase in size is observed within our data model, i.e. 4MB worth of data is represented by 9MB of serialized information for packaging into messages. As a result, the number of import messages is in the low 400s. The 2X factor is only an estimate for a moderately complex application; the true cost depends on the number of attributes (and the datatypes) of each MBO in the model.
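The message count can be estimated from the data size, the serialization factor, and the 20KB limit. The sketch below is a rough estimate only: it assumes 1KB is 1,024 bytes and that every message is filled to the limit, and the function name is illustrative.

```python
import math

MAX_MESSAGE_BYTES = 20 * 1024  # 20KB message limit stated in the text


def estimate_import_messages(
    raw_bytes: int,
    json_factor: float = 2.0,  # rough serialization overhead from the text
    max_message_bytes: int = MAX_MESSAGE_BYTES,
) -> int:
    """Estimate how many import messages a data set needs after serialization."""
    serialized_bytes = raw_bytes * json_factor
    return math.ceil(serialized_bytes / max_message_bytes)


print(estimate_import_messages(4 * 1024 * 1024))  # 410
```

With 4MB of data and a 2X factor, the estimate is 410 messages, in line with the low 400s reported for the sample model. A model with more attributes per MBO would need a larger factor.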

Approach initial synchronization carefully, and follow a sound rollout policy for new MBS mobile applications. The easiest approach is to spread the rollout over a period of time to avoid a large number of devices attempting to download large amounts of data at once. Excessive load not only slows the rollout, but may also impact performance for existing MBS applications. Apart from Unwired Platform capacity and connectivity bandwidth, the actual time required to perform the initial synchronization depends mostly on the capability of the device.

Subsequent Synchronization

The subsequent synchronization scenario addresses mobile devices sending CUD requests to Unwired Platform and receiving responses and updates as a result of the operations. In general, the limiting factor for performance is on the EIS as the CUD requests are replayed. You may experience Unwired Server performance issues if requests are spread over a number of back-end systems. If the EIS response time is large, you may need to allocate additional threads to replay requests concurrently against the EIS. Tuning is required for each environment based on:

• The number of concurrent CUD operations that an EIS can handle without causing degradation

• The number of simultaneously active EIS operations the load is spread across

• The average response time for various EIS operations

• Lock contention points in the mobile model that exist due to shared data partitions

Also consider the total application landscape when tuning for optimal performance, as Unwired Platform does not assign threads on a per package basis. All threads are available to service CUD operations for any of the deployed MBS packages.

Incremental Synchronization – DCN


An important consideration for incremental synchronization is how frequently the Unwired Server checks for changes to each MBS package. The higher the check frequency, the higher the cost, but data freshness may not actually increase. The gating factor is the frequency of updates from the EIS. If the cache is configured to refresh at midnight, after the EIS performs certain batch processing, it is most efficient to configure the scheduler to check for changes sometime after the cache refresh is completed. For an EIS that uses DCN to update the cache, the amount and frequency of changes depends on the business process. Understanding the data flow pattern from EIS to cache is crucial to determine how frequently the Unwired Server is to check for changes to be pushed to the devices.

Another aspect is the number of devices that will be supported by the MBS package. To determine if push messages are to be generated for a device, the server executes download SQL queries against the cache. The cost of the query depends on the complexity of the download logic (number of MBO relationships) in addition to synchronization timestamp comparison. For a large number of devices, this cost can be considerable. Set the check interval so that a new check does not start before the previous one is completed.

In general, it is better to batch multiple updates in a single notification to reduce overhead. However, a batch that is too large may impact device synchronization response due to contention. A storm of DCNs can create significant work for devices. The tradeoff is efficiency versus performance (responsiveness).

Tuning for MBS

Tuning Recommendations

There is no difference between RBS and MBS in terms of tuning for the back-end EIS/CDB. However, instead of replication thread count, the MBS processing limit (throttle) is determined by the number of inbound message queues. Any Unwired Server within a cluster can process messages on inbound queues. The number of inbound queues is equal to the number of parallel MBS package requests that the cluster can concurrently service.

The limiting factor in achieving good performance is the Messaging Services bulk-download to devices. Cache refresh and the initial load of data from EIS may still take a long time, depending on the amount of data being extracted. The repackaging of a large amount of data into multiple messages affects the total throughput limit of Messaging Services. In addition, a large number of context switches are required to package and deliver data to devices. These switches span multiple operating environments: CDB, Mobile Middleware Services, Messaging Services, network connectivity, messaging client on device, mobile database, and so on.


MOBILE MIDDLEWARE COMPONENT

Setting: Number of inbound queues
Function and recommendation: The concurrent number of threads that service MBS package requests from devices. If processor utilization on either the CDB or the Unwired Server is excessively high, you can throttle the number of requests and limit contention for resources using this setting. Low settings decrease parallelism; high settings may cause undue contention for resources. This is a cluster-wide setting.
Default: 5 per cluster
Production setting: 100 per cluster in a 2-node cluster

Setting: Number of outbound queues
Function and recommendation: The number of outbound queues between Unwired Platform and Messaging Services. Messages from these queues are eventually persisted by the JmsBridge component into a per-device queue belonging to the Messaging Services. Subscription requests can generate a large number of outbound messages and temporarily cause backups in the queues. Since multiple devices can share the same output queue, a larger number of queues can reduce the delay in getting the messages to the intended devices.
Default: 25 per cluster
Production setting: 100 per cluster in a 2-node cluster

Setting: Check interval for MBS package
Function and recommendation: Specifies the interval at which the system checks an MBS package to see if there are changes to be pushed to the devices. Align this setting with when the cache is being refreshed or updated. If the cache is refreshed every 4 hours, the check interval should also be 4 hours. However, if the cache is only updated via DCN, align the check interval with the DCN interval accordingly. The check/push generation algorithm is scheduled to be enhanced in future versions. Currently, do not set the interval to less than 10 minutes.
Default: 10 min
Production setting: 10 min

Setting: Number of push processing queues
Function and recommendation: Number of queues (threads) that execute the change detection function (download SQL execution) to determine if there are changes to be pushed to the device. The number of queues determines the maximum concurrency for change detection. All MBS packages share the same set of push processing queues.
Default: 25
Production setting: 25 per cluster in a 2-node cluster

Setting: JVM minimum heap size (DJC_JVM_MINHEAP)
Function and recommendation: The minimum memory allocated to the differencing and cache management functions of the server. To change this setting on a service, uninstall the service, make the change, and reinstall the service.
Default: 512MB
Production setting: 2048MB (2GB)

Setting: JVM maximum heap size (DJC_JVM_MAXHEAP)
Function and recommendation: The maximum memory allocated to the differencing and cache management functions of the server. To change this setting on a service, uninstall the service, make the change, and reinstall the service.
Default: 2048MB
Production setting: 6144MB (6GB)

Setting: Relay Server configuration
Function and recommendation: Settings for shared memory buffer.
Default: 10MB
Production setting: 2048MB

During the packaging of data into messages, serialization consumes a significant amount of memory. Avoid garbage collection activities, especially during MBS application rollout, to prevent “choking.”

Tuning Process


The maximum messaging throughput with the described configuration is approximately 70 messages per second in a wired environment. If the calculated throughput for the test is below this number, it is likely that the connection method (as opposed to the server environment) is the limiting factor. In this case, more devices can be supported without any degradation in server performance.

Once you reach the maximum throughput, the number of devices performing the initial subscription is the maximum one server can handle. At this point, you can add another server to the cluster for additional message processing power (up to a 60% increase). CUD operations issued by mobile applications have a throughput much lower than that of initial subscription, so normally, no specific tuning for that scenario is required.
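The throughput figures above support a rough estimate of how long an initial subscription rollout takes when the server is the limiting factor. The sketch uses only the two data points given in the text (about 70 messages per second for one node, up to 60% more with a second node); the function names are illustrative, and no figure is assumed for larger clusters.

```python
def cluster_throughput(
    nodes: int, single_node_rate: float = 70.0, second_node_gain: float = 0.60
) -> float:
    """Messages per second for a 1- or 2-node cluster, per the figures in the text."""
    if nodes == 1:
        return single_node_rate
    if nodes == 2:
        return single_node_rate * (1.0 + second_node_gain)
    raise ValueError("the text gives throughput figures for 1 or 2 nodes only")


def subscription_seconds(devices: int, messages_per_device: int, nodes: int = 1) -> float:
    """Best-case time to deliver all import messages to all subscribing devices."""
    return devices * messages_per_device / cluster_throughput(nodes)


# Example: 100 devices, each needing about 410 import messages.
print(round(subscription_seconds(100, 410, nodes=1) / 60, 1), "minutes on one node")
print(round(subscription_seconds(100, 410, nodes=2) / 60, 1), "minutes on two nodes")
```

This is a lower bound: device processing speed and connection quality usually make the real rollout slower, which is why spreading the rollout over time is recommended.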

RBS and MBS Comparison

Strengths and Weaknesses

The RBS and MBS synchronization approaches both have advantages, and are applicable to differing usage scenarios. RBS is designed for “occasionally connected” use, while MBS is designed for “mostly connected” scenarios. It is important to recognize the differences between these technologies and select the approach that best matches the usage scenarios or requirements of the EIS.

APPROACH STRENGTHS WEAKNESSES

RBS Exceptional in batch processing of transactions, typified in scenarios where users synchronize at the beginning and end of the day (or some similar pattern) with occasional incremental synchronization of changes between these major synchronization points. Some application developers may find the synchronous nature of RBS more familiar and simpler.

Device memory usage and performance is optimized when downloading data — the data is not parsed before insertion into the local device data store.

The entire synchronization process is treated as a unit; until the synchronization is complete, none of the updates are available.

Depending on the amount of data, synchronizations can be relatively short in duration. Users can synchronize manually, incorporate polling from the device to determine changes to server data and to push device changes, or use an out-of-band nudge to indicate to the device that it is time to synchronize based on server-side data changes, for example, SMS.

If many small updates are going to be applied to the EIS during the course of the day, setting up and deleting connections to the server may prove to be expensive in terms of battery use. Some incremental use is acceptable.

MBS Can provide a more interactive user experience by providing small incremental updates throughout the course of the day. Updates to data can appear without user intervention or background synchronization.

The MBS style of asynchronous programming is typically considered to be more flexible in terms of device-side interaction. For example, partial updates appear over the course of time, which means the user can work with that data while the user interface continues to update.

MBS has the potential to scale well in scenarios involving the use of SAP® Data Orchestration Engine (DOE) relative to SAP JCO approaches. With DOE, no data is cached in the Unwired Server, and the SAP DOE is optimized for SAP interactions. Only MBS offers the option of using DOE, which is the preferred approach when communicating with SAP.


Performance Comparison

Since both MBS and RBS performance analysis uses the same data model and data, we can determine how they behave in different use cases. The result confirms that RBS handles large amounts of data, whether during initial or subsequent download, much more efficiently and in a more timely manner. For a very large data set, there may be differences of several orders of magnitude between MBS and RBS. Besides the obvious protocol differences that result in a structural performance gap, the constraints of device platforms amplify it even further. Most device platforms do not respond well to extra context switches, interrupts, deserialization, and writes to nonvolatile storage. You must know the synchronization data volume. Performance analysis illustrates that MBS can accommodate 2 - 4MB with reasonable response time, and 4 - 8MB as tolerable. Consider data sizes of 8MB and larger only if the user is not actively waiting for the push to be completed, for example, if using overnight synchronization.

Frequent CUD requests reconciled with the server are handled more efficiently with MBS as frequent RBS connections incur significant overhead.

Recommendation

Determination Criteria

To use SAP DOE, MBS is the only available synchronization option. If you are not using SAP, base your decision on how much data is being transferred throughout the course of the day and how many times you expect your users to synchronize. You can defer this decision until deployment time, when you need to generate client-side objects. It is possible to use a common model for RBS and MBS; however, the device-side programming style is different for each.

Use the following flowchart as an initial guide to determine the use of synchronization protocols:

Figure: Synchronization Paradigm Decision Chart

Decision points (each answered Yes or No): Occasionally Connected Connectivity; High Data Volume; Occasional or Infrequent Sync; Data Freshness not Required or Synchronization on Schedule. Outcomes: Consider RBS; Consider MBS. Note in chart: If device platform does not support RBS, use MBS instead.
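The decision chart can be approximated in code. The mapping below is an assumption, since the chart's branches are not fully recoverable from the text: a Yes at any of the four decision points is taken to suggest RBS, and four No answers to suggest MBS. The SAP DOE rule and the fallback for device platforms without RBS support come from the text. The function and parameter names are illustrative.

```python
def suggest_paradigm(
    uses_sap_doe: bool,
    occasionally_connected: bool,
    high_data_volume: bool,
    infrequent_sync: bool,
    freshness_not_required: bool,
    device_supports_rbs: bool = True,
) -> str:
    """Return "RBS" or "MBS" as a starting point, to be validated by prototyping."""
    # From the text: SAP DOE is available only with MBS.
    if uses_sap_doe:
        return "MBS"
    # Assumed reading of the chart: any Yes points toward RBS.
    prefers_rbs = any(
        [occasionally_connected, high_data_volume, infrequent_sync, freshness_not_required]
    )
    # From the chart note: fall back to MBS if the device platform lacks RBS support.
    if prefers_rbs and device_supports_rbs:
        return "RBS"
    return "MBS"
```

Treat the result as an initial guide only; the validation step that follows still applies.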


Validation

The key question to ask is whether you are getting the data that is required by the application within a reasonable time. If the answer is yes, then whatever protocol you select should satisfy the requirement. There are secondary factors such as cost, connectivity, user perception, scalability, and so on, that can bias your choice. If you are familiar with your own use cases, validate your decision by applying the characteristics of the two synchronization protocols through prototyping. During development, simulate the use cases by using sample data within a relational database to observe message flow and measure response time with actual devices that are to be deployed with the application. Sybase recommends that you use the actual device when prototyping, so you can fully understand what users will eventually experience. You can tune the server side, but there is little you can do on the device besides getting a more capable one. A development process and analysis plan can reveal serious architecture concerns by using a real device.

Appendix

MBS Tuning Information

Inbound and Outbound Queues for MBS

Figure 12: Inbound and Outbound Queues for MBS

The inbound and outbound message queue settings are on the General tab of the server configuration. Expand the “Hide optional properties” control. For a production deployment, Sybase recommends a setting of 100 for each queue. Increase the number of inbound queues to prevent long-running requests from blocking processing of queued requests. While a message that has been removed from a queue is being processed by the server, no other messages behind it in that queue can be serviced. Thus, the more inbound queues there are, the less likely it is that requests from other devices will be blocked behind a long-running one.

Change Detection for Package


Figure 13: Push Processing Queues

Push Processing Queues

The number of processing queues determines the parallelism of change detection execution. There is one thread per queue. When the MBS package is scheduled to run change detection, each device subscribed to the package is serviced by one of these threads through execution of the download function. More queues means more devices can be concurrently serviced. A cluster with multiple nodes can support additional queues, as all servers in the cluster share the queues. To tune this setting, adjust the pushJobWorkerThreadsCount property. This property is in:

{install location}\Servers\UnwiredServer\Repository\Instance\com\sybase\djc\server\ApplicationServer\<host>.properties

After changing this property, execute updateProps.bat. See the description of the update properties utility in the System Administration document for Sybase Unwired Platform 1.5.2.
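When choosing the number of push processing queues, it helps to estimate how long one change detection pass takes and compare that with the check interval for the package. The sketch below is a simplification: it assumes every device's download query takes the same time and that all queues work fully in parallel. The per-device query time must be measured in your own environment; the names are illustrative.

```python
import math


def change_detection_seconds(
    devices: int, queues: int, seconds_per_device: float
) -> float:
    """Rough duration of one change detection pass over all subscribed devices."""
    rounds = math.ceil(devices / queues)  # one thread per queue, one device at a time
    return rounds * seconds_per_device


def interval_is_safe(
    devices: int, queues: int, seconds_per_device: float, check_interval_seconds: float
) -> bool:
    """True if a pass should finish before the next scheduled check starts."""
    return change_detection_seconds(devices, queues, seconds_per_device) < check_interval_seconds


# Example: 5,000 devices, 25 queues, 0.5 s per download query, 10-minute interval.
print(change_detection_seconds(5000, 25, 0.5))   # 100.0
print(interval_is_safe(5000, 25, 0.5, 600))      # True
```

If the estimate approaches the check interval, either lengthen the interval or add queues, keeping in mind that more queues place more concurrent query load on the CDB.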

JVM Heap Size

Use Sybase Control Center to change the JVM heap size for each server in the cluster. If Unwired Platform is running on a 32-bit system, do not set the maximum heap higher than 1.8GB. On a 64-bit system, set the maximum heap size based on the amount of physical memory available on the system, taking into account the portion required by the OS and other programs.


www.sybase.com

Sybase, Inc.

Worldwide Headquarters One Sybase Drive Dublin, CA 94568-7902 U.S.A

1 800 8 sybase