IBM Endpoint Manager for Software Use Analysis Version 9. Scalability Guide. Version 3


This edition applies to versions 9.1 and 9.0 of IBM Endpoint Manager for Software Use Analysis (product number 5725-F57) and to all subsequent releases and modifications until otherwise indicated in new editions.

© Copyright IBM Corporation 2002, 2014.


Scalability Guidelines
Introduction
Scanning and uploading scan data
Extract, Transform, Load (ETL)
Decision flow
Planning and installing Software Use Analysis
Hardware requirements
Network connection and storage throughput
Dividing the infrastructure into scan groups
Good practices for running scans and imports
Plan the scanning schedule
Avoid scanning when it is not needed
Limit the number of computer properties that are to be gathered during scans
Ensure that scans and imports are scheduled to run at night
Run the initial import
Review import logs
Maintain frequent imports
Disable collection of usage data
Make room for end-of-scan-cycle activities
Configuring the application and its database for medium and large environments
Configuring the transaction logs size
Configuring the transaction log location
Increasing Java heap size
Preventive actions
Limiting the number of scanned signature extensions
Recovering from accumulated scans
Cleaning up high volume of Software Use Analysis scans uploaded since the last import
Verifying the removal of scan data
IBM PVU considerations
REST API considerations
Improving user interface performance
Using relays to increase the performance of IBM Endpoint Manager
Reducing the Endpoint Manager server load
Appendix. Executive summary
Notices
Trademarks


Scalability Guidelines

This guide is intended to help system administrators plan the infrastructure of IBM® Endpoint Manager for Software Use Analysis and to provide recommendations for configuring the Software Use Analysis server to achieve optimal performance. It explains how to divide computers into scan groups, schedule software scans, and run data imports. It also provides information about other actions that can be undertaken to avoid low performance.

Introduction

IBM Endpoint Manager clients report data to the Endpoint Manager server that stores the data in its file system or database. The Software Use Analysis server periodically connects to the Endpoint Manager server and its database, downloads the stored data and processes it. The process of transferring data from the Endpoint Manager server to the Software Use Analysis server is called Extract, Transform, Load (ETL). By properly scheduling scans and distributing them over the computers in your infrastructure, you can reduce the length of the ETL process and improve its performance.

Scanning and uploading scan data

To evaluate whether particular software is installed on an endpoint, you must run a scanner. It collects information about files with particular extensions, package data, and software identification tags. It also gathers information about the running processes to measure software usage. The software scan data must be transferred to the Endpoint Manager server from which it can be later on imported to Software Use Analysis.

To discover software that is installed on a particular endpoint and collect its usage, you must first install a scanner by running the Install Scanner fixlet. After the scanner is successfully installed, the Initiate Software Scan fixlet becomes relevant on the target endpoint. The following types of scans are available:

Catalog-based scan

In this type of scan, the Software Use Analysis server creates scanner catalogs that are sent to the endpoints. The scanner catalogs do not include signatures that can be found based on the list of file extensions or entries that are irrelevant for a particular operating system. Based on those catalogs, the scanner discovers exact matches and sends its findings to the Endpoint Manager server. This data is then transferred to the Software Use Analysis server.

File system scan

In this type of scan, the scanner uses a list of file extensions to create a list of all files with those extensions on an endpoint.

Package data scan

In this type of scan, the scanner searches the system registry (Windows) or package management system (Linux, UNIX) to gather information about packages that are installed on the endpoints. Then, it returns the findings to the Endpoint Manager server where the discovered packages are compared with the software catalog. If a particular package matches an entry in the catalog, the software is discovered.


Application usage statistics

In this type of scan, the scanner gathers information about processes that are running on the target endpoints.

Software identification tags scan

In this type of scan, the scanner searches for software identification tags that are delivered with software products.

Run the catalog-based, file system, package data, and software identification tags scans on a regular basis because they are responsible for software discovery. The application usage statistics scan gathers usage data and can be disabled if you are not interested in this information.

When the status of the Initiate Software Scan fixlet shows complete (100%), the scan was successfully initiated. It does not mean that the relevant data was already gathered. After the scan finishes, the Upload Software Scan Results fixlet becomes relevant on the targeted endpoint, which means that the relevant data was gathered from the endpoint. When you run this fixlet, the scan data is uploaded to the Endpoint Manager server. It is then imported to Software Use Analysis during the Extract, Transform, Load (ETL) process.

Extract, Transform, Load (ETL)

Extract, Transform, Load (ETL) is a database process that combines three functions to transfer data from one database to another. The first stage, Extract, reads and extracts data from various source systems. The second stage, Transform, converts the data from its original form into a form that meets the requirements of the target database. The last stage, Load, saves the new data into the target database, which completes the transfer.

In Software Use Analysis, the Extract stage involves extracting data from the Endpoint Manager server. The data includes information about the infrastructure, installed agents, and detected software. ETL also checks whether a new software catalog is available, gathers information about the software scan and files that are present on the endpoints, and collects data from VM managers.

The extracted data is then transformed to a single format that can be loaded to the Software Use Analysis database. This stage also involves matching raw data with the software catalog, calculating processor value units (PVUs), processing the capacity scan, and converting information that is contained in the XML files. After the data is extracted and transformed, it is loaded into the database and can be used by Software Use Analysis.

[Figure: Extract, Transform, and Load. Endpoint Manager clients (on Windows, Linux, and UNIX, and on Linux on System z) provide scan data (catalog-based, file system, capacity, package, software identification tags), usage data, and VM manager data (Windows and Linux x86/x64 only) to the Endpoint Manager server, which keeps raw scan files in its file system and information about files in its database. Over a high-speed network connection, the Software Use Analysis server runs three stages. 1. Extract: infrastructure information, installed agents, software, scan data files, use data files, capacity data files, package data files, and files with VM manager information. 2. Transform: information from the XML files is processed, data is transformed to a single format, raw data is matched with the software catalog, PVU and RVU values are calculated, and the capacity scan is processed. 3. Load: data is loaded into the Software Use Analysis database tables.]

The heaviest load on the Software Use Analysis server occurs during ETL when the following actions are performed:

• A large number of small files is retrieved from the Endpoint Manager server (Extract).
• Many small and medium files that contain information about installed software packages and process usage data are parsed (Transform).
• The database is populated with the parsed data (Load).

At the same time, Software Use Analysis prunes large volumes of old data that exceed the data retention period.

The performance of the ETL process depends on the number of scan files, usage analyses, and package analyses that are processed during a single import. The main bottleneck is storage performance because many small files must be read, processed, and written to the Software Use Analysis database in a short time. By properly scheduling scans and distributing them over the computers in your infrastructure, you can reduce the length of the ETL process and improve its performance.


Decision flow

To avoid performance issues, divide the computers in your infrastructure into scan groups and set the scan schedule accordingly. Start by creating a benchmark scan group on which you can try different configurations to achieve an optimal import time. After the import time is satisfactory for the benchmark group, you can divide the rest of your infrastructure into similar scan groups.

Start by creating a single scan group that will be your benchmark. When you are satisfied with the performance that you achieve for this group, you will create other scan groups on its basis. The size of the scan group might vary depending on the size of your infrastructure; however, avoid creating a group larger than 20 000 endpoints.

Scan the computers in this scan group. When the scan finishes, upload its results to the Endpoint Manager server and run an import. Check the import time and decide whether it is satisfactory. For information about running imports, see section “Good practices for running scans and imports”.

If you are not satisfied with the import time, check the import log and try one of the following actions:

• If the import of raw file system scan data or package data takes longer than one third of the ETL time and the volume of the data is large (a few million entries), create a smaller group. For additional information, see section “Dividing the infrastructure into scan groups”.
• If the import of raw file system scan data or package data takes longer than one third of the ETL time but the volume of the data is low, fine-tune the hardware. For information about processor and RAM requirements as well as network latency and storage throughput, see section “Planning and installing Software Use Analysis”.
• If processing of usage data takes an excessive amount of time and you are not interested in collecting usage data, disable gathering of usage data. For more information, see section “Disable collection of usage data”.

After you adjust the first scan group, run the software scan again, upload its results to the Endpoint Manager server, and run an import.
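The three checks above can be sketched as a small helper. This is only an illustration: the function name, its parameters, and the 3 000 000 default used to stand in for "a few million entries" are assumptions, not part of Software Use Analysis. The timings and entry counts would have to be read manually from the import log.

```python
def suggest_actions(raw_import_seconds, etl_seconds, raw_entries,
                    usage_is_slow=False, usage_needed=True,
                    large_volume_threshold=3_000_000):
    """Return tuning suggestions that follow the decision rules in this section.

    large_volume_threshold is a hypothetical stand-in for "a few million entries".
    """
    suggestions = []
    # Import of raw file system scan or package data dominates the ETL time.
    if raw_import_seconds > etl_seconds / 3:
        if raw_entries >= large_volume_threshold:
            suggestions.append("create a smaller scan group")
        else:
            suggestions.append("fine-tune hardware")
    # Usage processing is slow and the usage data is not needed.
    if usage_is_slow and not usage_needed:
        suggestions.append("disable gathering of usage data")
    return suggestions


# Raw data import took 4000 s of a 9000 s ETL with about 15 million entries.
print(suggest_actions(4000, 9000, 15_000_000))
```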

When you achieve a satisfactory import time, decide whether you want a shorter scan cycle. For example, if your environment consists of 42 000 endpoints and you created a scan group of 6000 endpoints, your scan cycle lasts seven days (assuming that you create seven equal groups). To shorten the scan cycle, try increasing the number of computers in a scan group, for example, to 7000, which shortens the scan cycle to six days. After you increase the scan group size, observe the import time to ensure that performance remains at an acceptable level.

When you are satisfied with the performance of the benchmark scan group, create the remaining groups. Schedule scans so that they fit into your preferred scan cycle. Then, schedule the import of data from the Endpoint Manager server. Observe the import time. If it is not satisfactory, adjust the configuration as you did in the benchmark scan group. When you achieve suitable performance, plan for end-of-cycle activities.
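The arithmetic behind this example can be written down as a short sketch. It assumes that exactly one scan group is scanned per day and that the last group may be smaller than the others; the function name is illustrative only.

```python
import math


def scan_cycle_days(total_endpoints, group_size):
    """Number of daily scan groups needed to scan every endpoint once."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return math.ceil(total_endpoints / group_size)


print(scan_cycle_days(42_000, 6_000))  # 7
print(scan_cycle_days(42_000, 7_000))  # 6
```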


Use the following diagram to get an overview of actions and decisions that you will have to undertake to achieve optimal performance of Software Use Analysis.

[Figure: Decision flow. Plan and install Software Use Analysis; create a scan group (up to 20 000 computers); initiate the scan and upload scan results; run an import and check its time. If the import time is not satisfactory, fine-tune hardware (if possible), create a smaller scan group, or disable gathering of usage data (if you do not need it), and then scan and import again. If the import time is satisfactory and you want a shorter scan cycle, increase the size of the scan group. Otherwise, create the remaining scan groups, schedule the scans to fit into the scan cycle, and schedule daily imports. If the import time is still satisfactory, plan for end-of-cycle activities; if it is not, apply the same adjustments.]

Planning and installing Software Use Analysis

Your deployment architecture depends on the number of endpoints that you want to have in your audit reports.

For information about the Endpoint Manager requirements, see Server requirements available in Software Use Analysis documentation.

Hardware requirements

If you already have the Endpoint Manager server in your environment, plan the infrastructure for the Software Use Analysis server. The Software Use Analysis server stores its data in a dedicated DB2® database.

The following tables are applicable for environments with the following configuration parameters: a weekly software scan, daily imports, and 60 applications that are installed on an endpoint (on average).

Table 1. Processor and RAM requirements for Software Use Analysis

Environment size                Topology                              Processor                   Memory
------------------------------  ------------------------------------  --------------------------  --------
Small environment               1 server:
(up to 5 000 endpoints)           IBM Endpoint Manager, Software      At least 2.5 GHz - 4 cores  8 GB
                                  Use Analysis, and DB2
Medium environment              2 or 3 servers:
(5 000 - 50 000 endpoints)*       IBM Endpoint Manager                2-3 GHz - 4 cores           16 GB
                                  Software Use Analysis and DB2       At least 2 GHz - 4 cores    24 GB
Large environment               3 servers:
(more than 50 000 endpoints)**    IBM Endpoint Manager                2-3 GHz - 4-16 cores        16-32 GB
                                  Software Use Analysis               At least 2 GHz - 8 cores    16 GB
                                  DB2                                 At least 2 GHz - 16 cores   64 GB

Medium environment: A distributed environment is advisable. If you separate DB2 from Software Use Analysis, the DB2 server should have at least 16 GB RAM.

*For environments with up to 35 000 endpoints, there is no requirement to create scan groups. If you have more than 35 000 endpoints in your infrastructure, you must create scan groups. For more information, see section “Dividing the infrastructure into scan groups”.

**For larger environments, scan groups are required.
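As a quick planning aid, the size classes and the scan group threshold from Table 1 can be expressed in a few lines. This is a sketch, not product functionality: the function name is made up, and treating exactly 5 000 endpoints as a small environment is an assumption because the table ranges overlap at that value.

```python
def classify_environment(endpoints):
    """Return (size class, whether scan groups are required) per Table 1."""
    if endpoints <= 5_000:        # assumption: the 5 000 boundary counts as small
        size = "small"
    elif endpoints <= 50_000:
        size = "medium"
    else:
        size = "large"
    # Scan groups are required above 35 000 endpoints.
    scan_groups_required = endpoints > 35_000
    return size, scan_groups_required


print(classify_environment(40_000))  # ('medium', True)
```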

Medium-size environments

You can use virtual environments for this deployment size, but it is advisable to have dedicated resources for processor, memory, and virtual disk allocation. The virtual disk that is allocated for the virtual machine should have dedicated RAID storage, with dedicated input-output bandwidth for that virtual machine.

Large environments

For large deployments, use dedicated hardware. For optimum performance, use a DB2 server that is dedicated to Software Use Analysis and is not shared with Endpoint Manager or other applications.

Additionally, you might want to designate a separate disk that is attached to the computer where DB2 is installed to store the database transaction logs. You might need to do some fine-tuning based on the

[Figure: Installation topology by environment size. After you plan and prepare for installation, install or reuse IBM Endpoint Manager, and then install Software Use Analysis and DB2 on one computer or on two computers, depending on whether the environment is small (up to 5 000 endpoints), medium (5 000 - 50 000 endpoints), or large (50 000 - 250 000 endpoints). A separate disk or storage might be necessary.]

Network connection and storage throughput

The Extract, Transform, Load (ETL) process extracts a large amount of scan data from the Endpoint Manager server, processes it on the Software Use Analysis server, and saves it in the DB2 database. The following two factors affect the time of the import to the Software Use Analysis server:

Gigabit network connection

Because of the nature of the ETL imports, you are advised to have at least a gigabit network connection between the Endpoint Manager, Software Use Analysis, and DB2 servers.

Disk storage throughput

For large deployments, you are advised to have dedicated storage, especially for the DB2 server. The expected disk speed for writing data is approximately 400 MB/second.

Dividing the infrastructure into scan groups

It is critical for Software Use Analysis performance that you properly divide your environment into scan groups and then schedule scans in those scan groups accurately. If the configuration is not well-balanced, you might experience long import times.

For environments larger than 35 000 endpoints, divide your endpoints into separate scan groups. The system administrator can then set a different scanning schedule for every scan group in your environment.

Example

If you have 60 000 endpoints, you can create six scan groups (every group containing 10 000 endpoints). The first scan group has the scanning schedule set to Monday, the second to Tuesday, and so on. Using this configuration, every endpoint is scanned once a week. At the same time, the Endpoint Manager server receives data only from 1/6 of your environment daily and for every daily import the Software Use Analysis server needs to process data only from 10 000 endpoints (instead of 60 000 endpoints). This environment configuration shortens the Software Use Analysis import time.
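The split in this example can be sketched as follows. The round-robin assignment and the function name are illustrative choices only; in practice, scan groups are created as computer groups in the Endpoint Manager console, and you would typically group endpoints by criteria that suit your infrastructure.

```python
def assign_scan_groups(endpoint_ids, days):
    """Distribute endpoints over one scan group per day, in near-equal sizes."""
    groups = {day: [] for day in days}
    for index, endpoint in enumerate(endpoint_ids):
        # Round-robin keeps group sizes within one endpoint of each other.
        groups[days[index % len(days)]].append(endpoint)
    return groups


days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
groups = assign_scan_groups(range(60_000), days)
for day, members in groups.items():
    print(day, len(members))  # each day: 10000
```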

The image below presents a scan schedule for an infrastructure that is divided into six scan groups. You might achieve such a schedule after you implement recommendations that are contained in this guide. The assumption is that both software scans and imports of scan data to Software Use Analysis are scheduled to take place at night, while uploads of scan data from the endpoints to the Endpoint Manager server occur during the day.

If you have a powerful server computer and a longer import time is not problematic, you can create fewer scan groups with a greater number of endpoints in the Endpoint Manager console. Remember to monitor the import log to analyze the amount of data that is processed and the time it takes to process it.

For information about how to create scan groups, see the topic Computer groups that is available in Endpoint Manager documentation.

Good practices for running scans and imports

After you enable the Software Use Analysis site in your Endpoint Manager console, you should carefully plan the scanning activities and their schedule for your deployment.

Plan the scanning schedule

After you find the optimal size of the scan group, set the scanning schedule, that is, how often the software scan runs on an endpoint. The most common scanning schedule is weekly so that every endpoint is scanned once a week. If your environment has more than 100 000 endpoints, consider performing scans less frequently, for example monthly.

Avoid scanning when it is not needed

The frequency of scans depends both on how often software products change on the endpoints in your environment and also on your reporting needs. If you have systems in your environment that have dynamically-changing software, you can group such systems into a scan group (or groups) and set more frequent scans, for example once a week. The remaining scan groups that contain computers with a more stable set of software can be scanned less frequently, for example once a month.

Limit the number of computer properties that are to be gathered during scans

By default, the Software Use Analysis server includes four primary computer properties from the Endpoint Manager server that is configured as the data source: Computer Name, DNS Name, IP address, and Operating System. Imports can be substantially longer if you specify more properties to be extracted from the Endpoint Manager database and copied into the Software Use Analysis database during each data import. As a good practice, limit the number of computer properties to 10 (or fewer).

Ensure that scans and imports are scheduled to run at night

Some actions in the Software Use Analysis user interface cannot be processed when an import is running. Thus, try to schedule imports for times when the application administrator and Software Asset Manager are not using Software Use Analysis or after they have finished their daily work.

Run the initial import

It is a good practice to run the first (initial) import before you schedule any software scans and activate any analyses.

Examples of when imports can be run:

• The first import uploads the software catalog from the installation directory to the application and extracts the basic data about the endpoints from the Endpoint Manager server.
• The second import can be run after the scan data from the first scan group is available in the Endpoint Manager server.
• The third import should be started after the scans from the second scan group are finished, and so on.

Review import logs

Review the following INFO messages in the import log to check how much data was transferred during an ETL.

Each entry lists the information that the message is about, the items specified in the import log, and their description.

1. Infrastructure

   Computer items
      The total number of computers in your environment. A computer is a system with an Endpoint Manager agent that provides data to Software Use Analysis.

2. Software and hardware

   SAM::ScanFile items
      The number of files that have input data for the following items:
      • File system scan information (SAM::FileFact items)
      • Catalog-based scan information (SAM::CitFact items)
      • Software identification tag scan information (SAM::IsotagFact items)

   SAM::FileFact items
      The total count of information pieces about files from all computers in your environment (contained in the processed scan files).

   SAM::CitFact items
      The total count of information pieces from catalog-based scans (contained in the processed scan files).

   SAM::IsotagFact items
      The total count of information pieces from software identification tag scans (contained in the processed scan files).

3. Installed packages

   SAM::PackageFact items
      The total count of information pieces about Windows packages that have been gathered by the package data scan.

   SAM::UnixPackageFact items
      The total count of information pieces about UNIX packages that have been gathered by the package data scan.

4. Software usage

   SAM::AppUsagePropertyValue items
      The total number of processes that were captured during scans on the systems in your infrastructure.

Example:

INFO: Computer items: 15000
INFO: SAM::AppUsagePropertyValue items: 4250
INFO: SAM::ScanFile items: 30000
INFO: SAM::FileFact items: 15735838
INFO: SAM::IsotagFact items: 0
INFO: SAM::CitFact items: 149496
INFO: SAM::PackageFact items: 406687
INFO: SAM::UnixPackageFact items: 1922564
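To compare these counts between imports, the INFO messages can be collected with a short script. The sketch below assumes that each message has the form shown in the example ("INFO: <name> items: <number>"); real log lines may carry additional fields such as timestamps, which the search-based pattern tolerates. The function and variable names are illustrative.

```python
import re

# Matches "INFO: <name> items: <number>" anywhere in the text.
INFO_ITEM = re.compile(r"INFO:\s*(?P<item>.+?) items:\s*(?P<count>\d+)")

SAMPLE_LOG = """\
INFO: Computer items: 15000
INFO: SAM::AppUsagePropertyValue items: 4250
INFO: SAM::ScanFile items: 30000
INFO: SAM::FileFact items: 15735838
INFO: SAM::IsotagFact items: 0
INFO: SAM::CitFact items: 149496
INFO: SAM::PackageFact items: 406687
INFO: SAM::UnixPackageFact items: 1922564
"""


def parse_import_counts(log_text):
    """Return a mapping of item name to the count reported in the import log."""
    return {m.group("item"): int(m.group("count"))
            for m in INFO_ITEM.finditer(log_text)}


counts = parse_import_counts(SAMPLE_LOG)
for item, count in counts.items():
    print(f"{item}: {count}")
```

A large SAM::FileFact count relative to the number of computers is one sign that the volume of raw file system scan data is high.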

Maintain frequent imports

After the installation, imports are scheduled to run once a day. Do not change this configuration. However, you might want to change the hour when the import starts. If your import is longer than 24 hours, you can:

• Improve the scan groups configuration.
• Preserve the current daily import configuration because Software Use Analysis handles overlapping imports gracefully. If an import is running, no other import is started.

Disable collection of usage data

Software usage data is gathered by the Application Usage Statistics analysis. If the analysis is activated, usage data is gathered from all endpoints in your infrastructure. However, the data is uploaded to the Endpoint Manager server only for the endpoints on which you run software scans. For the remaining endpoints, the data is stored on the endpoint until you run the software scan.

About this task

If you do not need usage data or the deployment phase is not finished, do not activate the analysis. It can be activated later on, if needed. If the analysis is already activated, but you decide that processing of usage data takes too much time or you are not interested in usage statistics, disable the analysis.

Procedure

1. Log in to the Endpoint Manager console.

2. In the navigation tree, open the IBM Endpoint Manager for Software Use Analysis v9 > Analyses.

3. In the upper-right pane, right-click Application Usage Statistics, and click


Make room for end-of-scan-cycle activities

Plan to have an import from SmartCloud Control Desk through IBM Tivoli® Integration Composer at the end of a 1- or 2-week cycle. Include in your end-of-scan-cycle activities the catalog update and the time for extracting Software Use Analysis compliance reports.

Configuring the application and its database for medium and large environments

To avoid performance issues in medium and large environments, configure the location of the transaction log and adjust the log size. Apart from that, you can also adjust the Java™ heap size.

Configuring the transaction logs size

If your environment consists of many endpoints, increase the transaction logs size to improve performance.

About this task

The transaction logs size can be configured through the LOGFILSIZ DB2 parameter that defines the size of a single log file. To calculate the value that can be used for this parameter, you must first calculate the total disk space that is required for transaction logs in your specific environment and then divide it, thus obtaining the size of one transaction log. The required amount of disk space depends on the number of endpoints in your environment and the number of endpoints for which new scan results are available and processed during the data import.

Procedure

1. Use the following formula to calculate the disk space for your transaction logs:

   <The number of endpoints> x 1 MB + <the number of endpoints for which new scan results are imported> x 1 MB + 1 GB

2. Divide the result, expressed in gigabytes, by 0.00054 to obtain the size of a single transaction log file.

3. Run the following command to update the transaction log size in your database. Substitute value with the size of a single transaction log.

   UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ value

Example

Calculating the single transaction log size for 100 000 endpoints and 15 000 scan results:

   100 000 x 1 MB + 15 000 x 1 MB + 1 GB = 114 GB
   114 / 0.00054 = 211111

   UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ 211111
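The calculation can be scripted as shown in the following sketch. Two points are assumptions inferred from the worked example rather than stated in the procedure: megabytes are converted to gigabytes with 1 GB = 1024 MB, and the total is rounded up to a whole gigabyte, which is what reproduces the 114 GB figure. The function name is illustrative.

```python
import math


def logfilsiz_value(total_endpoints, endpoints_with_new_scans):
    """Compute the LOGFILSIZ value from the formula in the procedure."""
    # 1 MB per endpoint plus 1 MB per endpoint with new scan results.
    total_mb = total_endpoints * 1 + endpoints_with_new_scans * 1
    # Assumption: binary units and rounding up, then the fixed 1 GB is included.
    total_gb = math.ceil(total_mb / 1024 + 1)
    # Divide the total (in GB) by 0.00054, as the procedure states.
    return round(total_gb / 0.00054)


value = logfilsiz_value(100_000, 15_000)
print(f"UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ {value}")
```

For 100 000 endpoints and 15 000 scan results, this yields 211111, which matches the example.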

Configuring the transaction log location

To increase database performance, move the DB2 transaction log to a file system that is separate from the DB2 file system.

About this task

Medium environments: Strongly advised


Procedure

To move the DB2 transaction log to a file system that is separate from the DB2 file system, update the DB2 NEWLOGPATH parameter for your Software Use Analysis database:

UPDATE DATABASE CONFIGURATION FOR SUADB USING NEWLOGPATH value

Where value is a directory on a separate disk (different from the disk where the DB2 database is installed) where you want to keep the transaction logs. This configuration is strongly advised.

Increasing Java heap size

The default settings for the Java heap size might not be sufficient for medium and large environments. If your environment consists of more than 5000 endpoints, increase the memory available to Java client processes by increasing the Java heap size.

Procedure

1. Go to the <INSTALL_DIR>/wlp/usr/servers/server1/ directory and edit the jvm.options file.

2. Set the maximum Java heap size (Xmx) to one of the following values, depending on the size of your environment:

   • For medium environments (5000 - 50 000 endpoints), set the heap size to 6144m.
   • For large environments (over 50 000 endpoints), set the heap size to 8192m.

3. Restart the Software Use Analysis server.
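The choice of value in step 2 can be summarized in a small helper. This sketch only produces the option line in the standard JVM form (-Xmx followed by the size); it does not edit jvm.options, and the function name is hypothetical.

```python
def max_heap_option(endpoints):
    """Return the -Xmx line for jvm.options, or None to keep the default."""
    if endpoints > 50_000:      # large environment
        return "-Xmx8192m"
    if endpoints > 5_000:       # medium environment
        return "-Xmx6144m"
    return None                 # small environment: default heap size


print(max_heap_option(30_000))  # -Xmx6144m
```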

Preventive actions

Turn off scans if the Software Use Analysis server is to be unavailable for a few days due to routine maintenance or scheduled backups.

If imports of data from Endpoint Manager to Software Use Analysis are not running, the unprocessed scan data is accumulated on the Endpoint Manager server. After you turn on the Software Use Analysis server, a large amount of data will be processed leading to a long import time. To avoid prolonged imports, turn off scans for the period when the Software Use Analysis server is not running.

Limiting the number of scanned signature extensions

The scanner scans the entire infrastructure for files with particular extensions. For some extensions, the discovered files are matched against the software catalog before the scan results are uploaded to the Endpoint Manager server. This ensures that only information about files that produce matches is uploaded.

For other extensions, the scan results are not matched against the software catalog on the side of the endpoint. They are all uploaded to the Endpoint Manager server. Thus, you avoid rescanning the entire infrastructure when you import a new catalog or add a custom signature. The new catalog is matched against the information that is available on the server. However, this behavior might cause large amounts of information about files that do not produce matches to be uploaded to the server, which might in turn lead to performance issues during the import.

To reduce the amount of information that is uploaded to the server, limit the list of file extensions that are not matched against the software catalog on the side of the endpoint.

Procedure

1. Stop the Software Use Analysis server by running the following command:

/etc/init.d/wlpserver stop

/etc/init.d/SUAserver stop

2. To limit the number of extensions that are not matched against the software catalog on the side of the endpoint, edit the following files. They are in the <SUA_install_dir>\wlp\usr\servers\server1\apps\tema.war\WEB-INF\domains\sam\config directory.

   • In the file_names_all.txt file, leave the following extension:

     \.ear$

   • In the file_names_unix.txt file, leave the following extensions:

     \.sh$ \.bin$ \.pl$ \.ear$ \.SH$ \.BIN$ \.PL$ \.EAR$

   • In the file_names_windows.txt file, leave the following extensions:

     \.exe$ \.sys$ \.com$ \.ear$ \.ocx$

Note: Do not remove file extensions that you used to create custom signatures. They are likely to produce matches with the software catalog, so they can be uploaded to the Endpoint Manager server.

3. Start the Software Use Analysis server by running the following command:

/etc/init.d/wlpserver start

/etc/init.d/SUAserver start

4. Upload the software catalog. If a new version of the catalog is available, upload the new version. If it is not available yet, reupload the catalog that is currently imported to Software Use Analysis.

• If you are using Software Knowledge Base Toolkit for catalog management and upload a new version of the catalog, see: Updating the software catalog in Software Knowledge Base Toolkit.

• If you are using Software Knowledge Base Toolkit but reupload the same software catalog, perform the following steps:

a. Log in to Software Use Analysis.

b. In the top navigation bar, click Management > Catalog Servers.

• If you are using the built-in catalog management functionality that is available in Software Use Analysis, perform the following steps:

a. Optional: Download the software catalog by using the Software Catalog Download fixlet. Download the compressed file that contains only the software catalog in the XML format.

b. In the navigation bar, click Management > Catalog Upload.

c. Click Browse and select the IBMSoftwareCatalog_canonical_2.0_form_date.zip file.

d. To upload the file, click Upload.

What to do next

After you modify the list of extensions that are uploaded directly to the server, wait for the scheduled import or run it manually. During the import, the changes are propagated to the specified endpoints. After the next scheduled scan and import, the changed list of file extensions is used. More file extensions are matched against the software catalog on the side of the endpoint and the import time is shorter.
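The trimming in step 2 can be sketched in Python as follows. This is a minimal illustration only: the assumption that each file holds one pattern per line, the `trim_patterns` helper, and the sample file content are hypothetical and are not part of the product.

```python
def trim_patterns(lines, keep):
    """Return only the pattern lines that appear in `keep`, in their original order."""
    keep_set = set(keep)
    stripped = (raw.strip() for raw in lines)
    return [line for line in stripped if line in keep_set]


# Patterns that the procedure says to leave in file_names_windows.txt.
WINDOWS_KEEP = [r"\.exe$", r"\.sys$", r"\.com$", r"\.ear$", r"\.ocx$"]

# Hypothetical current content of the file (one pattern per line is an assumption).
current = [r"\.exe$", r"\.dll$", r"\.sys$", r"\.jar$", r"\.com$", r"\.ear$", r"\.ocx$"]

trimmed = trim_patterns(current, WINDOWS_KEEP)
print(trimmed)
```

If you created custom signatures that rely on other extensions, add their patterns to the keep list before you trim the file.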

Recovering from accumulated scans

To recover from a situation in which a large amount of scan data has accumulated, clean up the Software Use Analysis scans and then verify that the data was removed correctly.

Cleaning up high volume of Software Use Analysis scans uploaded since the last import

To clean up accumulated scan data, use an SQL query that removes scan file entries from the BFEnterprise tables.

Procedure

1. Back up the BFEnterprise and tem_analytics databases.

2. Back up and remove all files from the UploadManager/sha1 directory on the Endpoint Manager server.

3. To delete Software Use Analysis scan file entries from the BFEnterprise tables, run the following SQL statements against the BFEnterprise database:

use BFEnterprise

delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

4. Run a Software Use Analysis import.
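Before you run the delete statements, it can be useful to count the rows that the filter matches. The following Python sketch shows the idea against an in-memory SQLite table that stands in for dbo.uploads. SQLite, the table layout, and the sample file names are assumptions for illustration only; the real tables are in the BFEnterprise database.

```python
import sqlite3

# The same filter that the delete statements in the procedure use.
SCAN_FILTER = "(FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')"


def count_scan_rows(conn, table):
    """Count the rows in `table` that the delete statement would remove."""
    # `table` is interpolated directly, so pass only trusted table names.
    query = f"SELECT COUNT(*) FROM {table} WHERE {SCAN_FILTER}"
    return conn.execute(query).fetchone()[0]


# Stand-in table with hypothetical file names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (FileName TEXT)")
conn.executemany(
    "INSERT INTO uploads VALUES (?)",
    [
        ("itsitsearch_0_7423114.xml.bz2",),
        ("citsearch_0_16389405.xml.bz2",),
        ("unrelated_upload.txt",),
    ],
)

matching = count_scan_rows(conn, "uploads")
print(matching)
```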

What to do next

You can either:

• Restore the Software Use Analysis scans to the sha1 directory in small chunks and run an import after each chunk. For example, restore 10 000 scans, run an import, restore the next 10 000 scans, run another import, and so on. Software Use Analysis then imports the scans incrementally, 10 000 at a time. While you restore the scans to the sha1 directory, the Endpoint Manager FillDB process monitors changes in the UploadManager\sha1 directory and updates the dbo.uploads and dbo.uploads_availability tables.

• Perform the following steps:

1. Rescan computers.

2. Upload new scan files.

3. Run an import.

Repeat the procedure for every 10 000 computers.

Tip: Avoid importing more than 10 000 or 20 000 scans within one Software Use Analysis import. Such an import takes 10 - 15 hours even on systems that meet the Software Use Analysis hardware requirements.

Verifying the removal of scan data

After you remove the scan data, verify whether the removal was successful.

Procedure

1. Stop all fixlets that scan and upload data.

2. Back up the Endpoint Manager databases:

SQL Server

BFEnterprise and BESReporting

In SQL Server Management Studio, right-click the database and select Tasks > Back Up....

DB2

BFENT and BESREPOR

If you get the SQL1015N The database is in an inconsistent state. SQLSTATE = 55025 error when you attempt to back up the DB2 database, check whether the database is in an inconsistent state by running the following command: db2 get db cfg for database_name.

Note: Verify whether the parameter All committed transactions have been written to disk is set to NO.

If the database is in an inconsistent state, run the following commands:

a. db2 restart database DATABASE_NAME

b. db2 force applications all

c. db2stop

d. db2start

3. Move all files in the UploadManager/sha1 directory on the Endpoint Manager server to a backup location.

• Linux: install_dir/UploadManagerData/BufferDir/sha1 (for example, install_dir = /var/opt/BESServer)

• Windows: install_dir\UploadManagerData\BufferDir\sha1 (for example, c:\Program Files (x86)\BigFix Enterprise\BESServer\UploadManagerData\BufferDir\sha1)

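4. Delete scan file entries from the BFEnterprise tables:

SQL Server

use BFEnterprise

delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

DB2

connect to BFENT

delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')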

5. Run a Software Use Analysis import.

6. Move a subset of the sha1 folders from the backup location from step 3 back to the original sha1 folder.

• The sha1 directory contains folder names that match the last 2 digits of the endpoint (computer) ID. The folder names can be used to identify a specific endpoint or endpoints from a specific scan group.

• The Presentation Debugger can be used to run session relevance queries to retrieve computer information. The following sample queries show how to retrieve the information for computers that are associated with sha1 folders:

– (id of it, hostname of it, name of it, ip addresses of it, operating system of it) of bes computers whose (id of it mod 100 = name_of_sha1_subdirectory_moved)

returns the following when "name_of_sha1_subdirectory_moved" is replaced with "14":

7423114, nc9048149178.tivlab.austin.ibm.com, nc9048149178, 9.48.149.178, Linux Red Hat Enterprise Server 6.4 (2.6.32-358.el6.x86_64)

– (id of it, hostname of it, name of it, ip addresses of it, operating system of it) of bes computers whose ((id of it mod 100 = name_of_sha1_subdirectory_moved) or (id of it mod 100 = name_of_sha1_subdirectory_moved))

returns the following when "name_of_sha1_subdirectory_moved" is replaced with "14" and "5":

7423114, nc9048149178.tivlab.austin.ibm.com, nc9048149178, 9.48.149.178, Linux Red Hat Enterprise Server 6.4 (2.6.32-358.el6.x86_64)

16389405, nc038067.tivlab.raleigh.ibm.com, NC038067, 9.42.38.67, Win2003 5.2.3790

7. Verify that new rows are created in the BFEnterprise tables after you copy the sha1 folders back to the original location.

SQL Server:

use BFEnterprise

select * from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

select * from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

DB2:

connect to BFENT

select * from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

select * from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')

8. Run a Software Use Analysis import to load endpoint data into Software Use Analysis.

9. Repeat steps 6 through 8 until all of the original sha1 folders have been imported into Software Use Analysis.

10. Restart all fixlets that were stopped in step 1.
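The relationship between computer IDs and sha1 folder names that step 6 relies on (the ID modulo 100) can be sketched in Python as follows. The helper names are hypothetical. The first two IDs come from the sample query output; the third ID is invented for illustration.

```python
from collections import defaultdict


def sha1_folder(computer_id):
    """Return the sha1 subfolder number for a computer ID (its last two digits)."""
    return computer_id % 100


def group_by_folder(computer_ids):
    """Map each sha1 subfolder number to the computer IDs that are stored in it."""
    groups = defaultdict(list)
    for computer_id in computer_ids:
        groups[sha1_folder(computer_id)].append(computer_id)
    return dict(groups)


# 7423114 and 16389405 appear in the sample output; 1200014 is a made-up ID.
groups = group_by_folder([7423114, 16389405, 1200014])
print(groups)
```

Such a grouping can help you decide which subfolders to move back together, for example all folders that belong to one scan group.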

IBM PVU considerations

If you need to generate PVU reports for IBM compliance purposes, the best practice is to generate the report at least monthly.

For organizations that span continents, IBM compliance rules require that the License Metric Tool regions be applied, which in turn requires separate Software Use Analysis deployments. For more information, see Virtualization Capacity License Counting Rules.

REST API considerations

You can use REST API for software licensing information to retrieve large amounts of data that is related to computer systems, software instances, and license usage in your environment. Such information can then be passed to other applications for further processing and analysis.

Although using single API requests to retrieve data only from a selected subset of computers does not greatly impact the performance of Software Use Analysis, this is not true when retrieving data in bulk for all your computer systems at the same time. Such an action requires the processing of large amounts of data and it always influences the application performance.

In general, do not run API requests together with other performance-intensive tasks, such as software scans or data imports. The number of users who are logged in to the application and the number of actions that are performed in the web user interface during the REST API calls also decrease the performance.

Important: Each time you want to retrieve data through REST API, ensure that the use of Software Use Analysis is at a moderate level, so that the extra workload resulting from REST API does not overload the application and create performance problems.

When you retrieve data in bulk, you can also make several API requests and use the limit and offset parameters to paginate your results instead of retrieving all the data at the same time:

• Use the limit parameter to specify the number of retrieved results:

https://hostname:port/api/sam/computer_systems?token=token&limit=100000

• If you limit the first request to 100 000 results, append the offset=100000 parameter to the next request to omit the records that you already retrieved:

https://hostname:port/api/sam/computer_systems?token=token&limit=100000&offset=100000

Note: The limit and offset parameters can be omitted if you are retrieving data from up to about 50 endpoints. For environments with approximately 200 000 endpoints, you are advised to retrieve data in pages of 100 000 rows for computer systems, 200 000 rows for software instances, and 300 000 rows for license usage.
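The pagination that is described above can be sketched in Python as follows. The `page_urls` helper is hypothetical, and it assumes that you know the approximate total number of rows in advance. The sketch only builds the request URLs with the limit and offset parameters; it does not contact the server.

```python
from urllib.parse import urlencode


def page_urls(base_url, token, total_rows, limit):
    """Build one request URL per page of at most `limit` rows."""
    urls = []
    for offset in range(0, total_rows, limit):
        params = {"token": token, "limit": limit}
        if offset:
            # The first page needs no offset; later pages skip the rows already retrieved.
            params["offset"] = offset
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


# Placeholder host, port, and token, as in the examples above.
urls = page_urls("https://hostname:port/api/sam/computer_systems", "token", 250_000, 100_000)
for url in urls:
    print(url)
```

Requesting the pages one after another, rather than in parallel, keeps the extra load on the application lower.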

Improving user interface performance

If you experience problems with the performance of the user interface while working with reports, increase the number of rows that are loaded into the user interface. Adjust the number of rows to the size of your environment.

About this task

When you open a report, 50 rows of data are loaded to the user interface by default. When you scroll past those 50 rows, the next 50 rows must be loaded. To improve the response time of the user interface, you can increase the number of rows that are loaded to the user interface.


Procedure

1. Stop the Software Use Analysis server.

2. On the computer where the Software Use Analysis server is installed, go to the sua_installation_dir\TEMA\work\tema\webapp\javascripts\report_components directory and open the grid.js file.

3. Find the following lines:

$.widget("bigfix.grid", {options:

{pageSize: 50, gridOptions: {} }

4. Increase the value of the pageSize parameter according to the size of your environment.

Table 2. Number of rows loaded into Software Use Analysis user interface

Environment size             Value of the pageSize parameter
10 000 - 15 000 computers    800
15 000 - 30 000 computers    1600
over 30 000 computers        2500

5. Start the Software Use Analysis server.

6. Clear the cache in the web browser.
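The choice of the pageSize value from Table 2 can be expressed as a simple lookup, sketched here in Python. The `recommended_page_size` function is hypothetical. Treating the upper bound of each range as inclusive, and keeping the default of 50 for environments below 10 000 computers, are assumptions, because the table does not state them.

```python
def recommended_page_size(computers):
    """Return the pageSize value from Table 2 for the given number of computers."""
    if computers < 10_000:
        return 50    # default value; Table 2 gives no recommendation for this size
    if computers <= 15_000:
        return 800
    if computers <= 30_000:
        return 1600
    return 2500


for size in (5_000, 12_000, 20_000, 40_000):
    print(size, recommended_page_size(size))
```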

Using relays to increase the performance of IBM Endpoint Manager

To take advantage of the speed and scalability that is offered by IBM Endpoint Manager, it is often necessary to tune the settings of the Endpoint Manager deployment.

A relay is a client that is enhanced with a relay service. It performs all client actions to protect the host computer, and in addition, delivers content and software downloads to child clients and relays. Instead of requiring every networked computer to directly access the server, relays can be used to offload much of the burden. Hundreds of clients can point to a relay for downloads, which in turn makes only a single request to the server. Relays can connect to other relays as well, further increasing efficiency.

Reducing the Endpoint Manager server load

For all but the smallest Endpoint Manager deployments (< 500 Endpoint Manager clients), a primary Endpoint Manager relay should be set for each Endpoint Manager client even if they are not in a remote location.

The reason for this is that the Endpoint Manager server performs many tasks including:

• Gathering new Fixlet content from the Endpoint Manager server

• Distributing new Fixlet content to the clients

• Accepting and processing reports from the Endpoint Manager clients

• Providing data for the Endpoint Manager consoles

• Sending downloaded files (which can be large) to the Endpoint Manager client, and much more.

By using Endpoint Manager relays, the burden of communicating directly with every client is effectively moved to a different computer (the Endpoint Manager relay computer), which frees the Endpoint Manager server to do other tasks. If the relays are not used, you might observe that performance degrades significantly when an action with a download is sent to the Endpoint Manager server (you might even see errors).

Setting up Endpoint Manager relays in appropriate places and correctly configuring clients to use them is the change that has the highest impact on performance. To configure a relay, you can:

• Allow the clients to auto-select their closest Endpoint Manager relay.

• Manually configure the Endpoint Manager clients to use a specific relay.

For more information, see Managing relays.


Appendix. Executive summary

Table 3. Summary of the scalability best practices

Step Activities

1. Environment planning

Review the summary information that matches your environment size:

Small

• Up to 5 000 endpoints

• Software Use Analysis and DB2 installed on the same server

• Scan groups (optional)

Medium

• 5 000 - 50 000 endpoints

• Scan groups (advisable)

• Software Use Analysis and DB2 installed on separate computers

• It is possible to use virtual environments for this deployment size, however it is advisable to have dedicated resources for processor, memory, and virtual disk allocation.

Large

• 50 000 - 250 000 endpoints

• Scan groups (required)

• Software Use Analysis and DB2 installed on separate computers, dedicated storage for DB2

• Fine-tuning might be required

2. Good practices for creating scan groups

• Plan the scan group size

• Create a benchmark scan group

• Check the import time and decide whether it is satisfactory

• When you achieve an import time that is satisfactory, decide whether you want to have a shorter scan cycle.

• When you are satisfied with the performance of the benchmark scan group, create the remaining groups.

3. Good practices for running scans, imports and uploads

• Run the initial import before scanning the environment

• Plan the scanning schedule

• Avoid scanning when it is not needed

• Limit the number of computer properties to the ones that are relevant for software inventory management

• Ensure that scans and imports are scheduled to run at night

• Disable gathering of usage data in the initial rollout phase

• Carefully plan for gathering of usage data in large environments. Testing is required.

• Configure regular imports (Daily imports are advisable.)

• Review import logs

• Maintain frequent imports

• Ensure that scans and imports are run at night

• Run imports once a day

• Configure upload schedule (daily)

4. End of cycle activities

• Regularly import a new software catalog, for example monthly

• Periodically import data from SmartCloud Control Desk through IBM Tivoli Integration Composer, for example at the end of the 1 or 2-week cycle


Notices

This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan, Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
2Z4A/101
11400 Burnet Road
Austin, TX 79758 U.S.A

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this information and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement, or any equivalent agreement between us.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.


Privacy policy considerations

IBM Software products, including software as a service solutions, (“Software Offerings”) may use cookies or other technologies to collect product usage information, to help improve the end user experience, to tailor interactions with the end user or for other purposes. In many cases no personally identifiable information is collected by the Software Offerings. Some of our Software Offerings can help enable you to collect personally identifiable information. If this Software Offering uses cookies to collect personally identifiable information, specific information about this offering’s use of cookies is set forth below.

This Software Offering does not use cookies or other technologies to collect personally identifiable information.

If the configurations deployed for this Software Offering provide you as customer the ability to collect personally identifiable information from end users via cookies and other technologies, you should seek your own legal advice about any laws applicable to such data collection, including any requirements for notice and consent.

For more information about the use of various technologies, including cookies, for these purposes, See IBM’s Privacy Policy at http://www.ibm.com/privacy and IBM’s Online Privacy Statement at http://www.ibm.com/privacy/details the section entitled “Cookies, Web Beacons and Other Technologies” and the “IBM Software Products and Software-as-a-Service Privacy Statement” at
