The system analyst discovery process

Chapter 6. Reports and decision information usage

6.3 The discovery process

6.3.1 The system analyst discovery process

First, we will establish the scope of our data search by asking the following questions, which will help the system analyst focus on resolving the problem.

Some typical questions follow:

• Which mail systems have CPU workload problems?

• Which systems have a high average CPU run queue length?

• Which mail systems have high memory utilization?

• Which systems have a high memory page scan rate?

• Which mail systems have high network utilization?

• What is the average forecast mail delivery time?

Analyzing the views and information that these questions generate allows the system analyst to identify whether the mail servers have any response- or workload-related problems. The system analyst needs to do the following:

• Find out why the response from the Lotus Notes mail servers is poor

• Analyze the information

• Consider solution options

• Present the proposed technical solution to the IT Manager for a final decision on any changes or technology investment that may be necessary to resolve the problem.

To begin the decision process, the system analyst will use the Tivoli Decision Support Discovery Interface, select both the Server Performance Prediction Discovery Guide and the Domino Management Discovery Guide, and choose the role of systems analyst.

6.3.1.1 Which mail systems have CPU workload problems?

After selecting the All System Metrics report, we will filter the information to see only the Lotus Notes Servers. The resulting graph, as shown in Figure 98 on page 171, displays the Lotus Notes server performance metrics by CPU utilization. This view shows us the monthly average percentage CPU busy time of the Lotus Notes Mail servers: nickel, desdemona, cypress, and burnet. The graph shows several CPU metrics, from which it is clear that the server nickel has high average CPU percentage busy, system time, and user time utilization. This could be an indication that the server has performance-related problems. The graph also shows that all the other servers are under less stress.

Figure 98. Lotus Notes mail servers by CPU utilization

6.3.1.2 Which systems have a high average CPU run queue length?

We will find the answer to this question by selecting the Server Performance Prediction Discovery Guide and then selecting the Busiest Systems report. In this view, as shown in Figure 99 on page 172, we can look at the busiest systems based on the average daily run queue length metric for each system. The run queue length metric is the number of processes that are ready to run (processes not waiting for Input/Output or user input) that the system cannot dispatch until it has free processor cycles. This is a key metric for determining processor load and is measured in average number of waiting processes. From the graph, we can see that the server nickel has a high run queue length. The report also shows us that the other servers have average to normal workload characteristics.
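The ranking behind a report of this kind can be sketched in a few lines of Python. This is only an illustration of averaging run queue samples and sorting the servers by the result; the sample values and the function name `busiest_systems` are hypothetical and do not come from Tivoli Decision Support.

```python
from statistics import mean

# Hypothetical run queue samples (processes ready to run but waiting for a
# free processor), one value per sampling interval. Illustrative data only.
samples = {
    "nickel": [6, 8, 7, 9, 10],
    "desdemona": [1, 0, 2, 1, 1],
    "cypress": [2, 1, 1, 2, 2],
    "burnet": [0, 1, 1, 0, 1],
}

def busiest_systems(run_queue_samples):
    """Return (server, average run queue length) pairs, busiest first."""
    averages = {name: mean(values) for name, values in run_queue_samples.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

ranking = busiest_systems(samples)
for server, average in ranking:
    print(f"{server:10s} {average:.1f}")
```

With this made-up data, nickel appears at the top of the list, which mirrors the situation the report describes.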

Figure 99. Lotus Notes mail servers daily average run queue length

6.3.1.3 Which mail systems have high memory utilization?

Using the All System Metrics report in the Server Performance Prediction Discovery Guide and filtering on By Physical Memory, we can see, as shown in Figure 100 on page 173, the memory utilization of the Lotus Notes servers. The report shows that the server nickel has high memory utilization and that the usage for all the other servers is moderate to low.

Figure 100. Lotus Notes mail servers by memory utilization

6.3.1.4 Which systems have a high memory page scan rate?

By selecting the Systems That Need More Memory report from the Server Performance Prediction Discovery Guide, the system analyst can drill down and retrieve information from all Lotus Notes Servers. Figure 101 on page 174 highlights systems with physical memory from 32 MB up to 64 MB where the page scan rate is exceptionally high.

The page-scan rate is presented in terms of pages scanned per second. In order to evaluate this metric, you need to take into account the amount of physical memory on the system. It is known that the server nickel has 64 MB of physical memory installed. A scan rate of 1000 pages/second is considered very high on a system with 64 MB of physical memory but not on one with 256 MB of physical memory. We can also see that the server nickel has a scan rate of nearly 1000 pages per second. This can be regarded as high for this amount of memory and will need to be corrected.
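One way to make the dependency on physical memory explicit is to normalize the scan rate by the installed memory. The following Python sketch does this; the threshold of 10 pages per second per MB is an assumption chosen only so that the two cases mentioned above (64 MB and 256 MB) fall on opposite sides of it, and the function names are hypothetical.

```python
# ASSUMPTION: an illustrative cut-off, not a value taken from the product
# or from any tuning guide.
HIGH_SCAN_RATE_PER_MB = 10.0

def scan_rate_per_mb(pages_per_second, physical_memory_mb):
    """Normalize the page scan rate by the installed physical memory."""
    return pages_per_second / physical_memory_mb

def needs_more_memory(pages_per_second, physical_memory_mb,
                      threshold=HIGH_SCAN_RATE_PER_MB):
    """Flag a system whose normalized scan rate exceeds the threshold."""
    return scan_rate_per_mb(pages_per_second, physical_memory_mb) > threshold

# The same raw scan rate evaluated against two memory sizes.
print(scan_rate_per_mb(1000, 64), needs_more_memory(1000, 64))
print(scan_rate_per_mb(1000, 256), needs_more_memory(1000, 256))
```

Under this assumed threshold, 1000 pages/second is flagged on a 64 MB system but not on a 256 MB system, which matches the reasoning in the text.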

Figure 101. Lotus Notes mail servers that need more memory

6.3.1.5 Which mail systems have high network utilization?

By selecting All System Metrics from the Server Performance Prediction Discovery Guide, the system analyst can drill down and retrieve information on all Lotus Notes Servers. Filtering on network utilization displays the network activity. Figure 102 on page 175 highlights the systems with high network activity. The servers desdemona, cypress, and burnet have relatively low network utilization, while that of nickel is high.

Previously, we found that nickel had a high CPU utilization; this, coupled with high network activity, is an indication of an under-provisioned system.

Figure 102. Lotus Notes mail servers by network utilization

6.3.1.6 What is the average forecast mail delivery time?

From the Domino Management Discovery Guide and the When might servers begin experiencing problems report, the system analyst can filter by mail server and then by the Mail.AverageDeliveryTime measure. Figure 103 on page 176 shows the forecast of the average and peak mail delivery times of the server nickel. The forecast shows that both the average and the peak mail delivery times increase over the next 30, 60, and 90 days. Because the SLA requires all mail to be delivered within 20 seconds, nickel is forecast to exceed the SLA within 30 days.
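To illustrate how a forecast can be compared with an SLA limit, the following Python sketch fits a straight line to past delivery times and estimates when the trend crosses 20 seconds. The text does not state which forecasting method the product uses, so the least-squares line here is a swapped-in simplification, and the measurements and function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def days_until_breach(days, delivery_seconds, sla_seconds):
    """Days after the last sample until the fitted trend reaches the SLA.

    Returns None when the trend is flat or improving.
    """
    slope, intercept = linear_fit(days, delivery_seconds)
    if slope <= 0:
        return None
    crossing_day = (sla_seconds - intercept) / slope
    return max(0.0, crossing_day - days[-1])

# Hypothetical weekly averages of Mail.AverageDeliveryTime, in seconds.
days = [0, 7, 14, 21, 28]
average_delivery = [12.0, 13.0, 14.0, 15.0, 16.0]
SLA_SECONDS = 20.0  # SLA limit stated in the text

remaining = days_until_breach(days, average_delivery, SLA_SECONDS)
print(f"Estimated days until the SLA is exceeded: {remaining:.0f}")
```

With this made-up series, which grows by one second per week, the trend reaches the 20-second limit roughly four weeks after the last sample.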

Figure 103. Lotus Notes mail server - forecasted average mail delivery time

6.3.1.7 The system analyst’s conclusions and suggestions

Based on the information gathered earlier in this section, the system analyst will deliver a report to the IT Manager that identifies the cause of the problem and recommends a course of action.

The following conclusions can be drawn from the discovery process:

• The servers desdemona, cypress, and burnet are operating within normal parameters and are meeting the SLA.

• The server nickel is overloaded and under-provisioned; that is, its CPU is inadequate for the workload.

• The Lotus Notes mail service is currently operating at capacity, and the response problem affects only the customers served by the overloaded server nickel.

• The SLA is forecast to be exceeded within 30 days because the workload on server nickel is increasing.
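The cross-checking of indicators that leads to these conclusions can be sketched as follows. Every threshold and metric value in this Python example is hypothetical and chosen for illustration; only the server names come from the text.

```python
# ASSUMPTION: illustrative thresholds, not values defined by the product.
THRESHOLDS = {
    "cpu_busy_pct": 80.0,
    "run_queue_length": 4.0,
    "memory_used_pct": 85.0,
    "network_util_pct": 60.0,
}

def exceeded_indicators(metrics, thresholds=THRESHOLDS):
    """Return the names of the metrics that exceed their thresholds."""
    return [name for name, limit in thresholds.items()
            if metrics.get(name, 0.0) > limit]

# Hypothetical monthly averages for two of the mail servers.
servers = {
    "nickel": {"cpu_busy_pct": 92.0, "run_queue_length": 8.0,
               "memory_used_pct": 95.0, "network_util_pct": 75.0},
    "cypress": {"cpu_busy_pct": 35.0, "run_queue_length": 1.6,
                "memory_used_pct": 50.0, "network_util_pct": 20.0},
}

overloaded = {}
for name, metrics in servers.items():
    flags = exceeded_indicators(metrics)
    if flags:
        overloaded[name] = flags

for name, flags in overloaded.items():
    print(f"{name}: {', '.join(flags)}")
```

A server that exceeds several thresholds at once, as nickel does in this made-up data, is a stronger candidate for being under-provisioned than one that exceeds a single threshold.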

The system analyst makes the following recommendations to resolve the problem:

• Add a CPU to server nickel or upgrade its existing CPU.

This will relieve the problem in the short term but does not address the underlying problem of nickel being overloaded.

• Increase the amount of physical memory in nickel to 128 MB.

This will solve the problem in the medium term but still does not resolve the fact that nickel is overloaded.

• Redistribute the workload across the other servers

This will offer a longer term solution but might be disruptive to the organization.