IBM Endpoint Manager for Software Use Analysis
Version 9
Scalability Guide
Version 3
This edition applies to versions 9.1 and 9.0 of IBM Endpoint Manager for Software Use Analysis (product number 5725-F57) and to all subsequent releases and modifications until otherwise indicated in new editions.
© Copyright IBM Corporation 2002, 2014.
Contents
Scalability Guidelines
Introduction
Scanning and uploading scan data
Extract, Transform, Load (ETL)
Decision flow
Planning and installing Software Use Analysis
Hardware requirements
Network connection and storage throughput
Dividing the infrastructure into scan groups
Good practices for running scans and imports
Plan the scanning schedule
Avoid scanning when it is not needed
Limit the number of computer properties that are to be gathered during scans
Ensure that scans and imports are scheduled to run at night
Run the initial import
Review import logs
Maintain frequent imports
Disable collection of usage data
Make room for end-of-scan-cycle activities
Configuring the application and its database for medium and large environments
Configuring the transaction logs size
Configuring the transaction log location
Increasing Java heap size
Preventive actions
Limiting the number of scanned signature extensions
Recovering from accumulated scans
Cleaning up high volume of Software Use Analysis scans uploaded since the last import
Verifying the removal of scan data
IBM PVU considerations
REST API considerations
Improving user interface performance
Using relays to increase the performance of IBM Endpoint Manager
Reducing the Endpoint Manager server load
Appendix. Executive summary
Notices
Trademarks
Scalability Guidelines
This guide is intended to help system administrators plan the infrastructure of IBM® Endpoint Manager for Software Use Analysis and to provide recommendations for configuring the Software Use Analysis server to achieve optimal performance. It explains how to divide computers into scan groups, schedule software scans, and run data imports. It also provides information about other actions that can be undertaken to avoid low performance.
Introduction
IBM Endpoint Manager clients report data to the Endpoint Manager server that stores the data in its file system or database. The Software Use Analysis server periodically connects to the Endpoint Manager server and its database, downloads the stored data and processes it. The process of transferring data from the
Endpoint Manager server to the Software Use Analysis server is called Extract, Transform, Load (ETL). By properly scheduling scans and distributing them over the computers in your infrastructure, you can reduce the length of the ETL process and improve its performance.
Scanning and uploading scan data
To evaluate whether particular software is installed on an endpoint, you must run a scanner. It collects information about files with particular extensions, package data, and software identification tags. It also gathers information about the running processes to measure software usage. The software scan data must be transferred to the Endpoint Manager server from which it can be later on imported to Software Use Analysis.
To discover software that is installed on a particular endpoint and collect its usage, you must first install a scanner by running the Install Scanner fixlet. After the scanner is successfully installed, the Initiate Software Scan fixlet becomes relevant on the target endpoint. The following types of scans are available:
Catalog-based scan
In this type of scan, the Software Use Analysis server creates scanner catalogs that are sent to the endpoints. The scanner catalogs do not include signatures that can be found based on the list of file extensions or entries that are irrelevant for a particular operating system. Based on those catalogs, the scanner discovers exact matches and sends its findings to the Endpoint Manager server. This data is then transferred to the Software Use Analysis server.
File system scan
In this type of scan, the scanner uses a list of file extensions to create a list of all files with those extensions on an endpoint.
Package data scan
In this type of scan, the scanner searches the system registry (Windows) or package management system (Linux, UNIX) to gather information about packages that are installed on the endpoints. Then, it returns the findings to the Endpoint Manager server where the discovered packages are compared with the software catalog. If a particular package matches an entry in the catalog, the software is discovered.
Application usage statistics
In this type of scan, the scanner gathers information about processes that are running on the target endpoints.
Software identification tags scan
In this type of scan, the scanner searches for software identification tags that are delivered with software products.
Run the catalog-based, file system, package data, and software identification tags scans on a regular basis because they are responsible for software discovery. The application usage statistics scan gathers usage data and can be disabled if you are not interested in this information.
When the status of the Initiate Software Scan fixlet shows complete (100%), it indicates that the scan was successfully initiated. It does not mean that the relevant data was already gathered. After the scan finishes, the Upload Software Scan Results fixlet becomes relevant on the targeted endpoint. It means that the relevant data was gathered from the endpoint. When you run this fixlet, the scan data is uploaded to the Endpoint Manager server. It is then imported to Software Use Analysis during the Extract, Transform, Load (ETL) process.
Extract, Transform, Load (ETL)
Extract, Transform, Load (ETL) is a database process that combines three functions to transfer data from one database to another. The first stage, Extract, reads and extracts data from various source systems. The second stage, Transform, converts the data from its original form into a form that meets the requirements of the target database. The last stage, Load, saves the new data into the target database, which completes the transfer.
In Software Use Analysis, the Extract stage involves extracting data from the Endpoint Manager server. The data includes information about the infrastructure, installed agents, and detected software. ETL also checks whether a new software catalog is available, gathers information about the software scan and files that are present on the endpoints, and collects data from VM managers.
The extracted data is then transformed to a single format that can be loaded to the Software Use Analysis database. This stage also involves matching raw data with the software catalog, calculating processor value units (PVUs), processing the capacity scan, and converting information that is contained in the XML files. After the data is extracted and transformed, it is loaded into the database and can be used by Software Use Analysis.
[Figure: Extract, Transform, and Load. Endpoint Manager clients (on Windows, Linux, and UNIX, and on Linux on System z) send scan data (catalog-based, file system, capacity, package, and software identification tags), usage data, and VM manager data through a relay to the Endpoint Manager server, which keeps raw scan files in its file system and information about files in its database. Over a high-speed network connection, the Software Use Analysis server then (1) extracts infrastructure information, installed agents, and software data files, (2) transforms the data to a single format, processes the XML files, matches raw data with the software catalog, calculates PVU and RVU values, and processes the capacity scan, and (3) loads the data into the Software Use Analysis database tables.]
The heaviest load on the Software Use Analysis server occurs during ETL when the following actions are performed:
• A large number of small files is retrieved from the Endpoint Manager server (Extract).
• Many small and medium files that contain information about installed software packages and process usage data are parsed (Transform).
• The database is populated with the parsed data (Load).
At the same time, Software Use Analysis prunes large volumes of old data that exceeds its data retention period.
The performance of the ETL process depends on the number of scan files, usage analyses, and package analyses that are processed during a single import. The main bottleneck is storage performance because many small files must be read, processed, and written to the Software Use Analysis database in a short time. By properly scheduling scans and distributing them over the computers in your infrastructure, you can reduce the length of the ETL process and improve its performance.
Decision flow
To avoid performance issues, divide the computers in your infrastructure into scan groups and set the scan schedule accordingly. Start by creating a benchmark scan group on which you can try different configurations to achieve an optimal import time. After the import time is satisfactory for the benchmark group, divide the rest of your infrastructure into analogous scan groups.
Start by creating a single scan group that will be your benchmark. When you are satisfied with the performance that you achieve for this group, you will create other scan groups on its basis. The size of the scan group can vary depending on the size of your infrastructure; however, avoid creating a group larger than 20 000 endpoints.
Scan the computers in this scan group. When the scan finishes, upload its results to the Endpoint Manager server and run an import. Check the import time and decide whether it is satisfactory. For information about running imports, see section “Good practices for running scans and imports” on page 8.
If you are not satisfied with the import time, check the import log and try undertaking one of the following actions:
• If the import of raw file system scan data or package data takes longer than one third of the ETL time and the volume of the data is large (a few million entries), create a smaller group. For additional information, see section “Dividing the infrastructure into scan groups” on page 7.
• If the import of raw file system scan data or package data takes longer than one third of the ETL time but the volume of the data is low, fine-tune the hardware. For information about processor and RAM requirements as well as network latency and storage throughput, see section “Planning and installing Software Use Analysis” on page 6.
• If processing of usage data takes an excessive amount of time and you are not interested in collecting usage data, disable gathering of usage data. For more information, see section “Disable collection of usage data” on page 10.
After you adjust the first scan group, run the software scan again, upload its results to the Endpoint Manager server, and run an import.
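The first two rules above can be sketched as a small helper. This is only an illustration, not part of the product: the function name, its parameters, and especially the `large_volume_entries` threshold are assumptions, because the guide quantifies a large volume only as "a few million entries".

```python
def suggest_action(raw_import_seconds, etl_seconds, entry_count,
                   large_volume_entries=3_000_000):
    """Apply the two raw-data rules from the decision flow.

    large_volume_entries is a hypothetical cut-off for "a few million
    entries"; adjust it to what you observe in your own import logs.
    """
    # Rule precondition: raw scan or package import takes more than
    # one third of the whole ETL time.
    if raw_import_seconds * 3 <= etl_seconds:
        return "no action suggested by these rules"
    if entry_count >= large_volume_entries:
        return "create a smaller scan group"
    return "fine-tune hardware"


print(suggest_action(5000, 9000, 15_000_000))
print(suggest_action(5000, 9000, 200_000))
```

The durations and entry counts come from the import log; the usage-data rule is left out because the guide does not give a measurable criterion for it.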
When you achieve a satisfactory import time, decide whether you want a shorter scan cycle. For example, if your environment consists of 42 000 endpoints and you created a scan group of 6000 endpoints, your scan cycle lasts seven days (assuming that you create seven equal groups). To shorten the scan cycle, try increasing the number of computers in a scan group, for example, to 7000. This shortens the scan cycle to six days. After you increase the scan group size, observe the import time to ensure that performance remains at an acceptable level.
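The arithmetic behind this example can be checked with a short sketch. It assumes that one scan group is scanned per day, as in the example; the function name is illustrative only.

```python
def scan_cycle_days(total_endpoints, group_size):
    """Days needed to scan every endpoint once, one scan group per day."""
    # Ceiling division: a partially filled last group still needs a day.
    return -(-total_endpoints // group_size)


print(scan_cycle_days(42_000, 6_000))  # seven groups, seven days
print(scan_cycle_days(42_000, 7_000))  # six groups, six days
```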
When you are satisfied with the performance of the benchmark scan group, create the remaining groups. Schedule scans so that they fit into your preferred scan cycle. Then, schedule the import of data from the Endpoint Manager server. Observe the import time. If it is not satisfactory, adjust the configuration as you did in the benchmark scan group. When you achieve suitable performance, plan for end-of-cycle activities.
Use the following diagram to get an overview of actions and decisions that you will have to undertake to achieve optimal performance of Software Use Analysis.
[Figure: Decision flow. Plan and install Software Use Analysis, create a scan group (up to 20 000 computers), initiate the scan and upload scan results, then run an import and check its time. If the import time is not satisfactory, fine-tune hardware (if possible), create a smaller scan group, or disable gathering of usage data (if you do not need it). If it is satisfactory and you want a shorter scan cycle, increase the size of the scan group and check whether the import time is still satisfactory. Then create the remaining scan groups, schedule the scans to fit into the scan cycle, schedule daily imports, and plan for end-of-cycle activities.]
Planning and installing Software Use Analysis
Your deployment architecture depends on the number of endpoints that you want to have in your audit reports.
For information about the Endpoint Manager requirements, see Server requirements available in Software Use Analysis documentation.
Hardware requirements
If you already have the Endpoint Manager server in your environment, plan the infrastructure for the Software Use Analysis server. The Software Use Analysis server stores its data in a dedicated DB2® database.
The following tables are applicable for environments with the following configuration parameters: a weekly software scan, daily imports, and 60 applications that are installed on an endpoint (on average).
Table 1. Processor and RAM requirements for Software Use Analysis

Environment size              Topology                           Processor                   Memory
Small environment             1 server:
Up to 5 000 endpoints           IBM Endpoint Manager, Software   At least 2.5 GHz - 4 cores  8 GB
                                Use Analysis, and DB2
Medium environment            2/3 servers:
5 000 - 50 000 endpoints*       IBM Endpoint Manager             2-3 GHz - 4 cores           16 GB
                                Software Use Analysis and DB2    At least 2 GHz - 4 cores    24 GB
                              A distributed environment is advisable. If you separate DB2 from
                              Software Use Analysis, the DB2 server should have at least 16 GB RAM.
Large environment             3 servers:
More than 50 000 endpoints**    IBM Endpoint Manager             2-3 GHz - 4-16 cores        16-32 GB
                                Software Use Analysis            At least 2 GHz - 8 cores    16 GB
                                DB2                              At least 2 GHz - 16 cores   64 GB

*For environments with up to 35 000 endpoints, there is no requirement to create scan groups. If you have more than 35 000 endpoints in your infrastructure, you must create scan groups. For more information, see section “Dividing the infrastructure into scan groups” on page 7.
**For larger environments, scan groups are required.
Medium-size environments
You can use virtual environments for this deployment size, but it is advisable to have dedicated resources for processor, memory, and virtual disk allocation. The virtual disk that is allocated for the virtual machine should have dedicated RAID storage, with dedicated input-output bandwidth for that virtual machine.
Large environments
For large deployments, use dedicated hardware. For optimum
performance, use a DB2 server that is dedicated to Software Use Analysis and is not shared with Endpoint Manager or other applications.
Additionally, you might want to designate a separate disk that is attached to the computer where DB2 is installed to store the database transaction logs. You might need to do some fine-tuning based on the
[Figure: Planning and installation by environment size. Small (up to 5 000 endpoints) and medium (5 000 - 50 000 endpoints): install or reuse IBM Endpoint Manager, then install Software Use Analysis and DB2 on one computer. Large (50 000 - 250 000 endpoints): install or reuse IBM Endpoint Manager, then install Software Use Analysis and DB2 on two computers (a Software Use Analysis server and a DB2 server); a separate disk or storage might be necessary.]
Network connection and storage throughput
The Extract, Transform, Load (ETL) process extracts a large amount of scan data from the Endpoint Manager server, processes it on the Software Use Analysis server, and saves it in the DB2 database. The following two factors affect the time of the import to the Software Use Analysis server:
Gigabit network connection
Because of the nature of the ETL imports, you are advised to have at least a gigabit network connection between the Endpoint Manager, Software Use Analysis, and DB2 servers.
Disk storage throughput
For large deployments, you are advised to have dedicated storage, especially for the DB2 server. The expected disk speed for writing data is approximately 400 MB/second.
Dividing the infrastructure into scan groups
It is critical for Software Use Analysis performance that you properly divide your environment into scan groups and then schedule scans in those scan groups accurately. If the configuration is not well-balanced, you might experience long import times.
For environments larger than 35 000 endpoints, divide your endpoints into separate scan groups. The system administrator can then set a different scanning schedule for every scan group in your environment.
Example
If you have 60 000 endpoints, you can create six scan groups (every group
containing 10 000 endpoints). The first scan group has the scanning schedule set to Monday, the second to Tuesday, and so on. Using this configuration, every
endpoint is scanned once a week. At the same time, the Endpoint Manager server receives data only from 1/6 of your environment daily and for every daily import
the Software Use Analysis server needs to process data only from 10 000 endpoints (instead of 60 000 endpoints). This environment configuration shortens the
Software Use Analysis import time.
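The example above can be sketched in a few lines. This is an illustration only: in practice, scan groups are computer groups that you create in the Endpoint Manager console. The round-robin assignment, the function name, and the choice of Monday through Saturday for the six groups are assumptions.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def build_schedule(endpoint_ids, days=DAYS):
    """Spread endpoints evenly over one scan group per scanning day."""
    schedule = {day: [] for day in days}
    for index, endpoint_id in enumerate(endpoint_ids):
        # Round-robin keeps the groups within one endpoint of each other.
        schedule[days[index % len(days)]].append(endpoint_id)
    return schedule


schedule = build_schedule(range(60_000))
for day, group in schedule.items():
    print(day, len(group))
```

With 60 000 endpoints and six scanning days, each daily import has to process scan data from only 10 000 endpoints.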
The image below presents a scan schedule for an infrastructure that is divided into six scan groups. You might achieve such a schedule after you implement
recommendations that are contained in this guide. The assumption is that both software scans and imports of scan data to Software Use Analysis are scheduled to take place at night, while uploads of scan data from the endpoints to the Endpoint Manager server occur during the day.
If you have a powerful server computer and a longer import time is not problematic, you can create fewer scan groups with a greater number of endpoints in the Endpoint Manager console. Remember to monitor the import log to analyze the amount of data that is processed and the time it takes to process it.
For information about how to create scan groups, see the topic Computer groups that is available in Endpoint Manager documentation.
Good practices for running scans and imports
After you enable the Software Use Analysis site in your Endpoint Manager console, you should carefully plan the scanning activities and their schedule for your deployment.
Plan the scanning schedule
After you find the optimal size of the scan group, set the scanning schedule, that is, how often the software scan runs on an endpoint. The most common scanning schedule is weekly so that every endpoint is scanned once a week. If your environment has more than 100 000 endpoints, consider performing scans less frequently, for example monthly.
Avoid scanning when it is not needed
The frequency of scans depends both on how often software products change on the endpoints in your environment and also on your reporting needs. If you have systems in your environment that have dynamically-changing software, you can group such systems into a scan group (or groups) and set more frequent scans, for example once a week. The remaining scan groups that contain computers with a more stable set of software can be scanned less frequently, for example once a month.
Limit the number of computer properties that are to be gathered during scans
By default, the Software Use Analysis server includes four primary computer properties from the Endpoint Manager server that is configured as the data source: Computer Name, DNS Name, IP address, and Operating System. Imports can be substantially longer if you specify more properties to be extracted from the Endpoint Manager database and copied into the Software Use Analysis database during each data import. As a good practice, limit the number of computer properties to 10 (or fewer).
Ensure that scans and imports are scheduled to run at night
Some actions in the Software Use Analysis user interface cannot be processed when an import is running. Thus, try to schedule imports when the application administrator and Software Asset Manager are not using Software Use Analysis or after they finish their daily work.
Run the initial import
It is a good practice to run the first (initial) import before you schedule any software scans and activate any analyses.
Examples of when imports can be run:
• The first import uploads the software catalog from the installation directory to the application and extracts the basic data about the endpoints from the Endpoint Manager server.
• The second import can be run after the scan data from the first scan group is available in the Endpoint Manager server.
• The third import should be started after the scans from the second scan group are finished, and so on.
Review import logs
Review the following INFO messages in the import log to check how much data was transferred during an ETL.
The entries below list, for each numbered area of information, the items specified in the import log and their description.

1. Infrastructure
   Computer items: The total number of computers in your environment. A computer is a system with an Endpoint Manager agent that provides data to Software Use Analysis.

2. Software and hardware
   SAM::ScanFile items: The number of files that have input data for the following items:
      • File system scan information (SAM::FileFact items)
      • Catalog-based scan information (SAM::CitFact items)
      • Software identification tag scan information (SAM::IsotagFact items)
   SAM::FileFact items: The total count of information pieces about files from all computers in your environment (contained in the processed scan files).
   SAM::CitFact items: The total count of information pieces from catalog-based scans (contained in the processed scan files).
   SAM::IsotagFact items: The total count of information pieces from software identification tag scans (contained in the processed scan files).

3. Installed packages
   SAM::PackageFact items: The total count of information pieces about Windows packages that have been gathered by the package data scan.
   SAM::UnixPackageFact items: The total count of information pieces about UNIX packages that have been gathered by the package data scan.

4. Software usage
   SAM::AppUsagePropertyValue items: The total number of processes that were captured during scans on the systems in your infrastructure.

Example:
INFO: Computer items: 15000
INFO: SAM::AppUsagePropertyValue items: 4250
INFO: SAM::ScanFile items: 30000
INFO: SAM::FileFact items: 15735838
INFO: SAM::IsotagFact items: 0
INFO: SAM::CitFact items: 149496
INFO: SAM::PackageFact items: 406687
INFO: SAM::UnixPackageFact items: 1922564
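To compare these counts between imports, you can pull them out of the log with a short script. The following is a minimal sketch that assumes the INFO lines look exactly like the example above; the real import log might contain timestamps or other fields, and the function name is illustrative only.

```python
import re

# Matches entries such as "INFO: SAM::FileFact items: 15735838".
INFO_LINE = re.compile(r"INFO:\s*(?P<name>[\w:]+) items:\s*(?P<count>\d+)")


def parse_item_counts(log_text):
    """Return a dict that maps each item name to its count."""
    return {m.group("name"): int(m.group("count"))
            for m in INFO_LINE.finditer(log_text)}


sample = """\
INFO: Computer items: 15000
INFO: SAM::AppUsagePropertyValue items: 4250
INFO: SAM::ScanFile items: 30000
INFO: SAM::FileFact items: 15735838
INFO: SAM::IsotagFact items: 0
INFO: SAM::CitFact items: 149496
INFO: SAM::PackageFact items: 406687
INFO: SAM::UnixPackageFact items: 1922564
"""

counts = parse_item_counts(sample)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Tracking values such as the SAM::FileFact count per import helps you judge whether a scan group is too large for the time window that you have for the import.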
Maintain frequent imports
After the installation, imports are scheduled to run once a day. Do not change this configuration. However, you might want to change the hour when the import starts. If your import is longer than 24 hours, you can:
• Improve the scan groups configuration.
• Preserve the current daily import configuration because Software Use Analysis handles overlapping imports gracefully. If an import is running, no other import is started.
Disable collection of usage data
Software usage data is gathered by the Application Usage Statistics analysis. If the analysis is activated, usage data is gathered from all endpoints in your
infrastructure. However, the data is uploaded to the Endpoint Manager server only for the endpoints on which you run software scans. For the remaining endpoints, the data is stored on the endpoint until you run the software scan.
About this task
If you do not need usage data or the deployment phase is not finished, do not activate the analysis. It can be activated later on, if needed. If the analysis is already activated, but you decide that processing of usage data takes too much time or you are not interested in usage statistics, disable the analysis.
Procedure
1. Log in to the Endpoint Manager console.
2. In the navigation tree, open IBM Endpoint Manager for Software Use Analysis v9 > Analyses.
3. In the upper-right pane, right-click Application Usage Statistics, and click
Make room for end-of-scan-cycle activities
Plan to have an import from SmartCloud Control Desk through IBM Tivoli®
Integration Composer at the end of a 1- or 2-week cycle. Include in your
end-of-scan-cycle activities the catalog update and the time for extracting Software Use Analysis compliance reports.
Configuring the application and its database for medium and large environments
To avoid performance issues in medium and large environments, configure the location of the transaction log and adjust the log size. Apart from that, you can also adjust the Java™ heap size.
Configuring the transaction logs size
If your environment consists of many endpoints, increase the transaction logs size to improve performance.
About this task
The transaction logs size can be configured through the LOGFILSIZ DB2 parameter that defines the size of a single log file. To calculate the value that can be used for this parameter, you must first calculate the total disk space that is required for transaction logs in your specific environment and then divide it, thus obtaining the size of one transaction log. The required amount of disk space depends on the number of endpoints in your environment and the number of endpoints for which new scan results are available and processed during the data import.
Procedure
1. Use the following formula to calculate the disk space for your transaction logs:
   <The number of endpoints> x 1 MB + <the number of endpoints for which new scan results are imported> x 1 MB + 1 GB
2. Divide the result, expressed in GB, by 0.00054 to obtain the size of a single transaction log file.
3. Run the following command to update the transaction log size in your database. Substitute value with the size of a single transaction log.
   UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ value
Example
• Calculating the single transaction log size for 100 000 endpoints and 15 000 scan results:
   100 000 x 1 MB + 15 000 x 1 MB + 1 GB = 114 GB
   114 / 0.00054 = 211111
   UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ 211111
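The calculation can be scripted as follows. This sketch assumes that 1 GB equals 1024 MB and that the total is rounded up to a whole number of GB; the guide does not state these conventions, but they reproduce the 114 GB and 211111 figures from the example. The function name is illustrative only.

```python
import math


def logfilsiz(total_endpoints, endpoints_with_new_scans):
    """Return (total log space in GB, LOGFILSIZ value) per the formula."""
    # 1 MB per endpoint + 1 MB per endpoint with new scan results + 1 GB.
    total_mb = total_endpoints * 1 + endpoints_with_new_scans * 1 + 1024
    # Assumption: convert with 1024 MB per GB and round up to whole GB.
    total_gb = math.ceil(total_mb / 1024)
    return total_gb, int(total_gb / 0.00054)


total_gb, value = logfilsiz(100_000, 15_000)
print(total_gb, value)
print(f"UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ {value}")
```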
Configuring the transaction log location
To increase database performance, move the DB2 transaction log to a file system that is separate from the DB2 file system.
About this task
Medium environments: Strongly advised
Procedure
To move the DB2 transaction log to a file system that is separate from the DB2 file system, update the DB2 NEWLOGPATH parameter for your Software Use Analysis database:
UPDATE DATABASE CONFIGURATION FOR SUADB USING NEWLOGPATH value
Where value is a directory on a separate disk (different from the disk where the DB2 database is installed) where you want to keep the transaction logs. This configuration is strongly advised.
Increasing Java heap size
The default settings for the Java heap size might not be sufficient for medium and large environments. If your environment consists of more than 5000 endpoints, increase the memory available to Java client processes by increasing the Java heap size.
Procedure
1. Go to the <INSTALL_DIR>/wlp/usr/servers/server1/ directory and edit the jvm.options file.
2. Set the maximum Java heap size (Xmx) to one of the following values, depending on the size of your environment:
   • For medium environments (5000 - 50 000 endpoints), set the heap size to 6144m.
   • For large environments (over 50 000 endpoints), set the heap size to 8192m.
3. Restart the Software Use Analysis server.
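The heap size rule from step 2 can be expressed as a small helper that returns the line to place in the jvm.options file. The function name is hypothetical, and the sketch treats environments of up to 5000 endpoints as keeping the default heap size, which is an interpretation of the thresholds above.

```python
def xmx_option(endpoint_count):
    """Return the -Xmx line for jvm.options, or None to keep the default."""
    if endpoint_count > 50_000:
        return "-Xmx8192m"   # large environment
    if endpoint_count > 5_000:
        return "-Xmx6144m"   # medium environment
    return None              # small environment: default heap size


print(xmx_option(20_000))
print(xmx_option(120_000))
```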
Preventive actions
Turn off scans if the Software Use Analysis server is to be unavailable for a few days due to routine maintenance or scheduled backups.
If imports of data from Endpoint Manager to Software Use Analysis are not running, the unprocessed scan data is accumulated on the Endpoint Manager server. After you turn on the Software Use Analysis server, a large amount of data will be processed leading to a long import time. To avoid prolonged imports, turn off scans for the period when the Software Use Analysis server is not running.
Limiting the number of scanned signature extensions
The scanner scans the entire infrastructure for files with particular extensions. For some extensions, the discovered files are matched against the software catalog before the scan results are uploaded to the Endpoint Manager server. This ensures that only information about files that produce matches is uploaded.
For other extensions, the scan results are not matched against the software catalog on the side of the endpoint. They are all uploaded to the Endpoint Manager server. Thus, you avoid rescanning the entire infrastructure when you import a new catalog or add a custom signature. The new catalog is matched against the information that is available on the server. However, this behavior might cause large amounts of information about files that do not produce matches to be uploaded to the server, which in turn might lead to performance issues during the import.
To reduce the amount of information that is uploaded to the server, limit the list of file extensions that are not matched against the software catalog on the side of the endpoint.
Procedure
1. Stop the Software Use Analysis server by running the following command:
/etc/init.d/wlpserver stop
/etc/init.d/SUAserver stop
2. To limit the number of extensions that are not matched against the software catalog on the side of the endpoint, edit the following files. They are in the <SUA_install_dir>\wlp\usr\servers\server1\apps\tema.war\WEB-INF\domains\sam\config directory.
   • In the file_names_all.txt file, leave the following extension:
     \.ear$
   • In the file_names_unix.txt file, leave the following extensions:
     \.sh$ \.bin$ \.pl$ \.ear$ \.SH$ \.BIN$ \.PL$ \.EAR$
   • In the file_names_windows.txt file, leave the following extensions:
     \.exe$ \.sys$ \.com$ \.ear$ \.ocx$
Note: Do not remove file extensions that you used to create custom signatures. They are likely to produce matches with the software catalog, so they can be uploaded to the Endpoint Manager server.
3. Start the Software Use Analysis server by running the following command:
/etc/init.d/wlpserver start
/etc/init.d/SUAserver start
4. Upload the software catalog. If a new version of the catalog is available, upload the new version. If it is not available yet, reupload the catalog that is currently imported to Software Use Analysis.
v If you are using Software Knowledge Base Toolkit for catalog management
and upload a new version of the catalog, see: Updating the software catalog in Software Knowledge Base Toolkit.
v If you are using Software Knowledge Base Toolkit but reupload the same
software catalog, perform the following steps:
a. Log in to Software Use Analysis.
b. In the top navigation bar, click Management > Catalog Servers.
v If you are using the built-in catalog management functionality that is available in Software Use Analysis, perform the following steps:
a. Optional: Download the software catalog by using the Software Catalog Download fixlet. Download the compressed file that contains only the software catalog in the XML format.
b. In the navigation bar, click Management > Catalog Upload.
c. Click Browse and select the IBMSoftwareCatalog_canonical_2.0_form_date.zip file.
d. To upload the file, click Upload.
What to do next
After you modify the list of extensions that are uploaded directly to the server, wait for the scheduled import or run it manually. During the import, the changes are propagated to the specified endpoints. After the next scheduled scan and import, the changed list of file extensions is used. More file extensions are matched against the software catalog on the side of the endpoint and the import time is shorter.
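The entries in the file_names_*.txt files are written as regular expressions that are anchored to the end of the file name. The following Python sketch illustrates which file names the trimmed lists would select. It assumes that each entry is applied as a case-sensitive regular expression search against the file name; the function name and the sample file names are hypothetical and are not part of the product.

```python
import re

# Patterns copied from the trimmed file_names_windows.txt and
# file_names_unix.txt lists in the procedure. Treating every entry as a
# case-sensitive regular expression is an assumption of this sketch.
WINDOWS_PATTERNS = [r"\.exe$", r"\.sys$", r"\.com$", r"\.ear$", r"\.ocx$"]
UNIX_PATTERNS = [r"\.sh$", r"\.bin$", r"\.pl$", r"\.ear$",
                 r"\.SH$", r"\.BIN$", r"\.PL$", r"\.EAR$"]


def is_uploaded_unmatched(file_name, patterns):
    """Return True if file_name matches at least one pattern in the list."""
    return any(re.search(pattern, file_name) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical file names, used only to show the effect of the lists.
    for name in ["setup.exe", "readme.txt", "install.SH", "app.ear"]:
        print(name,
              is_uploaded_unmatched(name, WINDOWS_PATTERNS),
              is_uploaded_unmatched(name, UNIX_PATTERNS))
```

Under this assumption, the UNIX list contains both lowercase and uppercase variants because the patterns are case-sensitive.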
Recovering from accumulated scans
To recover from a situation when you have a large amount of accumulated scan data, you need to clean up the Software Use Analysis scans and then verify if the data was removed correctly.
Cleaning up high volume of Software Use Analysis scans uploaded since the last import
To clean up accumulated scan data, use an SQL query that removes scan file entries from the BFEnterprise tables.
Procedure
1. Back up the BFEnterprise and tem_analytics databases.
2. Back up and remove all files from the UploadManager/sha1 directory on the
Endpoint Manager server.
3. To delete Software Use Analysis scan files entries from BFEnterprise tables, run the following SQL statement against the BFEnterprise database:
use BFEnterprise
delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
4. Run a Software Use Analysis import.
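Before you run a delete statement, it can help to see which rows the LIKE filter selects. The following Python sketch applies the same filter to an in-memory SQLite table. The table is only a stand-in for dbo.uploads: the single-column layout and the sample file names are assumptions, and SQLite pattern matching can differ from SQL Server collation rules.

```python
import sqlite3

# In-memory stand-in for the uploads table. The real table is in the
# BFEnterprise database and has more columns than this sketch assumes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (FileName TEXT)")
sample_names = [
    "itsitsearch_0_7423114.xml.bz2",  # hypothetical scan file name
    "citsearch_0_16389405.xml.bz2",   # hypothetical scan file name
    "other_upload.log",               # hypothetical unrelated upload
]
conn.executemany("INSERT INTO uploads VALUES (?)",
                 [(name,) for name in sample_names])

# The same filter that the procedure uses in its delete statements.
SCAN_FILTER = "FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%'"


def count_scan_rows(connection):
    """Count the rows that the scan filter selects (a dry run)."""
    query = f"SELECT COUNT(*) FROM uploads WHERE {SCAN_FILTER}"
    return connection.execute(query).fetchone()[0]


def delete_scan_rows(connection):
    """Delete the rows that the scan filter selects; return how many."""
    cursor = connection.execute(f"DELETE FROM uploads WHERE {SCAN_FILTER}")
    return cursor.rowcount


if __name__ == "__main__":
    print("rows selected:", count_scan_rows(conn))
    print("rows deleted:", delete_scan_rows(conn))
```

Comparing the count with the number of deleted rows is a simple check that the filter removes only the scan file entries.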
What to do next
You can either:
v Restore Software Use Analysis scans in the sha1 directory in small chunks and run imports. That is, restore 10 000 scans, run an import, restore the next 10 000 scans, run another import, and so on. Software Use Analysis incrementally imports 10 000 scans at a time. While you restore the scans in the sha1 directory, the Endpoint Manager FillDB process monitors changes in the UploadManager\sha1 directory and updates the dbo.uploads and dbo.uploads_availability tables.
v Perform the following steps:
1. Rescan computers.
2. Upload new scan files.
3. Run an import.
Repeat the procedure for every 10 000 computers.
Tip: Avoid importing more than 10 000 or 20 000 scans within one Software Use
Analysis import. Such an import takes 10 - 15 hours even on systems that meet the Software Use Analysis hardware requirements.
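The chunked restore can be scripted. The following Python sketch moves backed-up scan files into the sha1 directory in batches and pauses after each batch so that an import can be run. The function name, the directory arguments, and the generator-based pause are assumptions made for illustration; the sketch does not start imports itself.

```python
import shutil
from pathlib import Path


def restore_in_batches(backup_dir, sha1_dir, batch_size=10_000):
    """Move scan files from backup_dir to sha1_dir, one batch at a time.

    The generator yields the size of each batch after it is moved, so the
    caller can run an import before the next batch is restored.
    """
    backup_dir, sha1_dir = Path(backup_dir), Path(sha1_dir)
    # Collect the files first so that moving them does not affect iteration.
    files = sorted(p for p in backup_dir.rglob("*") if p.is_file())
    for start in range(0, len(files), batch_size):
        batch = files[start:start + batch_size]
        for source in batch:
            # Keep the subfolder layout that the backup copy had.
            target = sha1_dir / source.relative_to(backup_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(source), str(target))
        yield len(batch)
```

A caller would loop over restore_in_batches(backup, sha1) and run a Software Use Analysis import after each yielded batch before continuing.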
Verifying the removal of scan data
After you remove the scan data, verify whether the removal was successful.
Procedure
1. Stop all fixlets that scan and upload data.
2. Back up the Endpoint Manager databases:
SQL Server
BFEnterprise and BESReporting
In SQL Server Management Studio, right-click the database and select Tasks > Back Up....
DB2
BFENT and BESREPOR
If you get the SQL1015N The database is in an inconsistent state. SQLSTATE = 55025 error when attempting to back up the DB2 database, check whether the database is in an inconsistent state by running the following command: db2 get db cfg for database_name.
Note: Verify whether the parameter All committed transactions have been written to disk is set to NO.
If the database is in an inconsistent state, run the following commands:
a. db2 restart database database_name
b. db2 force applications all
c. db2stop
d. db2start
3. Move all files in the UploadManager/sha1 directory on the Endpoint Manager server to a backup location.
v Linux: /install_dir/UploadManagerData/BufferDir/sha1 (for example, install_dir = /var/opt/BESServer)
v Windows: install_dir\UploadManagerData\BufferDir\sha1 (for example, install_dir = c:\Program Files (x86)\BigFix Enterprise\BESServer)
4. Delete scan file entries from the BFEnterprise tables:
SQL Server:
use BFEnterprise
delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
DB2:
connect to BFENT
delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
5. Run a Software Use Analysis import.
6. Move a subset of the sha1 folders from the backup location from step 3 back
to the original sha1 folder.
v The sha1 directory contains folder names that match the last 2 digits of the endpoint (computer) ID. The folder names can be used to identify a specific endpoint or endpoints from a specific scan group.
v The Presentation Debugger can be used to run session relevance queries to retrieve computer information. The following sample queries show how to retrieve the information for computers that are associated with sha1 folders:
– The following query:
(id of it, hostname of it, name of it, ip addresses of it, operating system of it) of bes computers whose (id of it mod 100 = name_of_sha1_subdirectory_moved)
returns the following when "name_of_sha1_subdirectory_moved" is replaced with "14":
7423114, nc9048149178.tivlab.austin.ibm.com, nc9048149178, 9.48.149.178, Linux Red Hat Enterprise Server 6.4 (2.6.32-358.el6.x86_64)
– The following query:
(id of it, hostname of it, name of it, ip addresses of it, operating system of it) of bes computers whose ((id of it mod 100 = name_of_sha1_subdirectory_moved) or (id of it mod 100 = name_of_sha1_subdirectory_moved))
returns the following when the two occurrences of "name_of_sha1_subdirectory_moved" are replaced with "14" and "5":
7423114, nc9048149178.tivlab.austin.ibm.com, nc9048149178, 9.48.149.178, Linux Red Hat Enterprise Server 6.4 (2.6.32-358.el6.x86_64)
16389405, nc038067.tivlab.raleigh.ibm.com, NC038067, 9.42.38.67, Win2003 5.2.3790
7. Verify that new rows are created in the BFEnterprise tables after copying the sha1 folders back to the original location.
SQL Server:
use BFEnterprise
select * from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
select * from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
DB2:
connect to BFENT
select * from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
select * from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
8. Run a Software Use Analysis import to load endpoint data into Software Use
Analysis.
9. Repeat steps 6 through 8 until all of the original sha1 folders have been imported into Software Use Analysis.
10. Restart all fixlets that were stopped in step 1.
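The relationship between computer IDs and sha1 folder names that step 6 relies on (the computer ID modulo 100) can also be computed outside the Presentation Debugger. The following Python sketch groups computer IDs by the folder that would hold their scans. The function names are hypothetical, and the sketch assumes that folder names are not zero-padded, as the "5" example in step 6 suggests.

```python
from collections import defaultdict


def sha1_folder_for(computer_id):
    """Return the sha1 subfolder name for a computer ID (ID modulo 100).

    Assumption: the folder name is not zero-padded, so ID 16389405 maps
    to "5" rather than "05".
    """
    return str(computer_id % 100)


def group_by_folder(computer_ids):
    """Group computer IDs by the sha1 subfolder that holds their scans."""
    groups = defaultdict(list)
    for computer_id in computer_ids:
        groups[sha1_folder_for(computer_id)].append(computer_id)
    return dict(groups)


if __name__ == "__main__":
    # The two IDs come from the sample query results in step 6.
    print(group_by_folder([7423114, 16389405]))
```

Such a grouping can help decide which sha1 folders to move back together, for example all folders that belong to one scan group.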
IBM PVU considerations
If you need to generate PVU reports for IBM compliance purposes, the best practice is to generate the report at least monthly.
From the IBM compliance perspective, organizations that span continents must apply the License Metric Tool regions, which requires separate Software Use Analysis deployments. For more information, see Virtualization Capacity License Counting Rules.
REST API considerations
You can use REST API for software licensing information to retrieve large amounts of data that is related to computer systems, software instances, and license usage in your environment. Such information can then be passed to other applications for further processing and analysis.
Although using single API requests to retrieve data only from a selected subset of computers does not greatly impact the performance of Software Use Analysis, this is not true when retrieving data in bulk for all your computer systems at the same time. Such an action requires the processing of large amounts of data and it always influences the application performance.
In general, do not use API requests together with other performance-intensive tasks, such as software scans or data imports. Each user who is logged in to the application and each action that is performed in the web user interface during the REST API calls also decreases the performance.
Important: Each time you want to retrieve data through REST API, ensure that the use of Software Use Analysis is at a moderate level, so that the extra workload resulting from REST API does not overload the application and create performance problems.
When you retrieve data in bulk, you can also make several API requests and use the limit and offset parameters to paginate your results instead of retrieving all the data at the same time:
v Use the limit parameter to specify the number of retrieved results:
https://hostname:port/api/sam/computer_systems?token=token&limit=100000
v If you limit the first request to 100 000 results, append the next request with the offset=100000 parameter to omit the records that you already retrieved:
https://hostname:port/api/sam/computer_systems?token=token&limit=100000&offset=100000
Note: The limit and offset parameters can be omitted if you are retrieving data from up to about 50 000 endpoints. For environments with approximately 200 000 endpoints, you are advised to retrieve data in pages of 100 000 rows for computer systems, 200 000 rows for software instances, and 300 000 rows for license usage.
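The pagination that is described above can be generated programmatically. The following Python sketch builds one request URL per page by using the limit and offset parameters. The function name is hypothetical, the host, port, and token values are the placeholders from the examples above, and the sketch assumes that the total number of rows is known in advance. It only builds the URLs and does not send any request.

```python
from urllib.parse import urlencode


def page_urls(base_url, token, total_rows, page_size=100_000):
    """Build one URL per page by using the limit and offset parameters."""
    urls = []
    for offset in range(0, total_rows, page_size):
        params = {"token": token, "limit": page_size}
        if offset:
            # The first page needs no offset; later pages skip the rows
            # that the earlier pages already retrieved.
            params["offset"] = offset
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    base = "https://hostname:port/api/sam/computer_systems"
    for url in page_urls(base, "token", total_rows=250_000):
        print(url)
```

Sending the requests one after another, rather than in parallel, keeps the extra load on the application lower.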
Improving user interface performance
If you experience problems with the performance of the user interface while working with reports, increase the number of rows that are loaded into the user interface. Adjust the number of rows to the size of your environment.
About this task
When you open a report, 50 rows of data are loaded to the user interface by default. When you scroll past those 50 rows, the next 50 rows must be loaded. To improve the response time of the user interface, you can increase the number of rows that are loaded to the user interface.
Procedure
1. Stop the Software Use Analysis server.
2. On the computer where the Software Use Analysis server is installed, go to the sua_installation_dir\TEMA\work\tema\webapp\javascripts\report_components directory and open the grid.js file.
3. Find the following lines:
$.widget("bigfix.grid", {options:
{pageSize: 50, gridOptions: {} }
4. Increase the value of the pageSize parameter according to the size of your environment.
Table 2. Number of rows loaded into Software Use Analysis user interface
Environment size              Value of the pageSize parameter
10 000 - 15 000 computers     800
15 000 - 30 000 computers     1600
over 30 000 computers         2500
5. Start the Software Use Analysis server.
6. Clear the cache in the web browser.
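The mapping in Table 2 can be expressed as a small lookup. The following Python sketch returns the pageSize value for a given number of computers. The function name is hypothetical. Two points are assumptions because the table does not define them: environments with fewer than 10 000 computers keep the default of 50 rows, and the shared boundary values (15 000 and 30 000) fall into the lower range.

```python
def recommended_page_size(computers):
    """Return the pageSize value from Table 2 for an environment size."""
    if computers > 30_000:
        return 2500
    if computers > 15_000:
        return 1600
    if computers >= 10_000:
        return 800
    # Assumption: below 10 000 computers the default of 50 rows is kept.
    return 50


if __name__ == "__main__":
    for size in (8_000, 12_000, 20_000, 45_000):
        print(size, recommended_page_size(size))
```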
Using relays to increase the performance of IBM Endpoint Manager
To take advantage of the speed and scalability that is offered by IBM Endpoint Manager, it is often necessary to tune the settings of the Endpoint Manager deployment.
A relay is a client that is enhanced with a relay service. It performs all client actions to protect the host computer, and in addition, delivers content and software downloads to child clients and relays. Instead of requiring every networked computer to directly access the server, relays can be used to offload much of the burden. Hundreds of clients can point to a relay for downloads, which in turn makes only a single request to the server. Relays can connect to other relays as well, further increasing efficiency.
Reducing the Endpoint Manager server load
For all but the smallest Endpoint Manager deployments (< 500 Endpoint Manager clients), a primary Endpoint Manager relay should be set for each Endpoint Manager client even if they are not in a remote location.
The reason for this is that the Endpoint Manager server performs many tasks including:
v Gathering new Fixlet content from the Endpoint Manager server
v Distributing new Fixlet content to the clients
v Accepting and processing reports from the Endpoint Manager clients
v Providing data for the Endpoint Manager consoles
v Sending downloaded files (which can be large) to the Endpoint Manager client,
and much more.
By using Endpoint Manager relays, the burden of communicating directly with every client is effectively moved to a different computer (the Endpoint Manager
relay computer), which frees the Endpoint Manager server to do other tasks. If the relays are not used, you might observe that performance degrades significantly when an action with a download is sent to the Endpoint Manager server (you might even see errors).
Setting up Endpoint Manager relays in appropriate places and correctly configuring clients to use them is the change that has the highest impact on performance. To configure a relay, you can:
v Allow the clients to auto-select their closest Endpoint Manager relay.
v Manually configure the Endpoint Manager clients to use a specific relay.
For more information, see Managing relays.
Appendix. Executive summary
Table 3. Summary of the scalability best practices
Step Activities
1. Environment planning
Review the summary information that matches your environment size:
Small
v Up to 5 000 endpoints
v Software Use Analysis and DB2 installed on the same server
v Scan groups (optional)
Medium
v 5 000 - 50 000 endpoints
v Scan groups (advisable)
v Software Use Analysis and DB2 installed on separate computers
v It is possible to use virtual environments for this deployment size, however it is advisable to have dedicated resources for processor, memory, and virtual disk allocation.
Large
v 50 000 - 250 000 endpoints
v Scan groups (required)
v Software Use Analysis and DB2 installed on separate computers, dedicated storage for DB2
v Fine-tuning might be required
2. Good practices for creating scan groups
v Plan the scan group size
v Create a benchmark scan group
v Check the import time and decide whether it is satisfactory
v When you achieve an import time that is satisfactory, decide whether you want to have a shorter scan cycle.
v When you are satisfied with the performance of the benchmark scan group, create the remaining groups.
3. Good practices for running scans, imports and uploads
v Run the initial import before scanning the environment
v Plan the scanning schedule
v Avoid scanning when it is not needed
v Limit the number of computer properties to the ones that are relevant for software inventory management
v Ensure that scans and imports are scheduled to run at night
v Disable gathering of usage data in the initial rollout phase
v Carefully plan for gathering of usage data in large environments. Testing is required.
v Configure regular imports (Daily imports are advisable.)
v Review import logs
v Maintain frequent imports
v Ensure that scans and imports are run at night
v Run imports once a day
v Configure upload schedule (daily)
4. End of cycle activities
v Regularly import a new software catalog, for example monthly
v Periodically import data from SmartCloud Control Desk through IBM Tivoli Integration Composer, for example at the end of the 1 or 2-week cycle
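The environment planning step of Table 3 can be summarized as a lookup by endpoint count. The following Python sketch returns the size category, the scan group guidance, and the DB2 placement for a given number of endpoints. The class and function names are hypothetical, and assigning the shared boundary values (5 000 and 50 000) to the smaller category is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizingAdvice:
    size: str
    scan_groups: str
    db2_placement: str


def sizing_advice(endpoints):
    """Map an endpoint count to the planning guidance in Table 3."""
    if endpoints <= 5_000:
        return SizingAdvice("small", "optional",
                            "same server as Software Use Analysis")
    if endpoints <= 50_000:
        return SizingAdvice("medium", "advisable", "separate computer")
    if endpoints <= 250_000:
        return SizingAdvice("large", "required",
                            "separate computer with dedicated storage")
    # The table does not describe environments above 250 000 endpoints.
    raise ValueError("endpoint count is outside the range covered by Table 3")


if __name__ == "__main__":
    print(sizing_advice(30_000))
```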
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:
Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan, Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:
IBM Corporation
2Z4A/101
11400 Burnet Road
Austin, TX 79758 U.S.A
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.
The licensed program described in this information and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement, or any equivalent agreement between us.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Privacy policy considerations
IBM Software products, including software as a service solutions, (“Software Offerings”) may use cookies or other technologies to collect product usage information, to help improve the end user experience, to tailor interactions with the end user or for other purposes. In many cases no personally identifiable information is collected by the Software Offerings. Some of our Software Offerings can help enable you to collect personally identifiable information. If this Software Offering uses cookies to collect personally identifiable information, specific information about this offering’s use of cookies is set forth below.
This Software Offering does not use cookies or other technologies to collect personally identifiable information.
If the configurations deployed for this Software Offering provide you as customer the ability to collect personally identifiable information from end users via cookies and other technologies, you should seek your own legal advice about any laws applicable to such data collection, including any requirements for notice and consent.
For more information about the use of various technologies, including cookies, for these purposes, see IBM’s Privacy Policy at http://www.ibm.com/privacy and IBM’s Online Privacy Statement at http://www.ibm.com/privacy/details the section entitled “Cookies, Web Beacons and Other Technologies” and the “IBM Software Products and Software-as-a-Service Privacy Statement” at