Chapter 12. Finding and Fixing Problems: the pidiag Utility
Queue Id shows the sequence number of the primary queue (always 0 under normal
conditions).
2.5 Monitoring the Archive
Each day, the System Manager should review the internal counters of the Archive Subsystem. Doing so makes it possible to predict the next archive shift and to monitor ongoing system behavior and performance. Use piartool -as and piartool -al for this purpose. Other piartool commands are discussed in the chapter, Managing Archives.
Note: Windows-based PI Server exposes the Archive data displayed by piartool -as as Windows Performance Counters. These counters may be viewed with the Windows Performance Monitor or recorded to PI Server with the OSI Performance Monitor Interfaces. This subject is covered in detail later in this chapter.
2.5.1 piartool -as
The piartool -as command lists the Archive Subsystem (piarchss) internal counters every 5 seconds until you type <CTRL-C>.
The column at the right margin gives the change in each count since the previous 5-second sample. The counters are reset to 0 when the Archive Subsystem is started.
$ piartool -as
Counters for 7-Aug-03 14:51:10
Archived Events:                 1050621    1485
Out of Order Events:                   0       0
Events Cascade Count:                  0       0
Events Read:                           5       0
Read Operations:                       0       0
Cache Record Count:                    0       0
Cache Records Created:                 6       0
Cache Record Memory Reads:             5       0
Cache Clean Count:                     0       0
Archive Record Disk Reads:        146342     219
Archive Record Disk Writes:       152737     226
Unflushed Events:                  12431    -203
Unflushed Points:                   3131     -48
Point Flush Count:                133491     211
Primary Archive Number:                5       0
Archive Shift Prediction (hr):         1       0
Archiving Flag:                        1       0
Archive Backup Flag:                   0       0
Archive Loaded Flag:                   1       0
Shift or System Backup Flag:           0       0
Failed Archive Shift Flag:             0       0
Overflow Index Record Count:           0       0
Overflow Data Record Count:         5082       4
The piartool utility can run remotely by specifying some additional parameters on the command line as described in Table 3–1. Options for Use with piartool on page 37.
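Captured output is easier to track over time once it is parsed. The following is a minimal sketch in Python, not part of PI Server, that turns captured piartool -as text into a dictionary. The function name parse_counters and the regular expression are assumptions based only on the sample output above; the real layout may differ between versions.

```python
import re

# One counter per match: a name, a running total, and the change since the
# previous sample. Names are assumed to contain only letters, spaces,
# hyphens, and parentheses, as in the sample output.
COUNTER_RE = re.compile(r"([A-Za-z][A-Za-z ()\-]*):\s+(-?\d+)\s+(-?\d+)")


def parse_counters(text):
    """Return {counter name: (total, delta)} from captured piartool -as text."""
    return {
        m.group(1).strip(): (int(m.group(2)), int(m.group(3)))
        for m in COUNTER_RE.finditer(text)
    }


# Excerpt of the sample output shown above.
sample = """Counters for 7-Aug-03 14:51:10
Archived Events: 1050621 1485
Out of Order Events: 0 0
Unflushed Events: 12431 -203
Archive Shift Prediction (hr): 1 0
"""

counters = parse_counters(sample)
print(counters["Archived Events"])   # (1050621, 1485)
print(counters["Unflushed Events"])  # (12431, -203)
```

The timestamp line is skipped because no counter name can contain digits under the stated assumption.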
Archived Events Counter
The Archived Events counter is incremented for every new event written to the archive (via the archive cache). This count includes delete and edit events.
Out-of-Order Events Counter
The Archive Subsystem receives events from the Snapshot Subsystem. If the timestamp of an event is older than that of the last event in the target record, the event is considered out of order and is added to this counter.
Excessive out-of-order events might lead to system problems such as excess CPU consumption, excessive disk I/O, and archives filling faster than expected.
Events Cascade Count
Out-of-order events are inserted into the target record, which requires moving other events within the record. If the record is full, one or more events are forced out of the record into the adjacent record. This counter is incremented each time an insertion forces an event out of a record, so it indicates the impact of out-of-order events on the archive.
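The relationship between the Out of Order Events and Events Cascade Count counters can be illustrated with a toy model. This sketch is not the Archive Subsystem's implementation: the class ToyArchive, its fixed record capacity, and the simplification that every event belongs to the newest record are all assumptions made for illustration.

```python
import bisect


class ToyArchive:
    """Toy model: a chain of fixed-capacity records of sorted timestamps."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = [[]]      # the last list is the target record
        self.out_of_order = 0    # models Out of Order Events
        self.cascades = 0        # models Events Cascade Count

    def add(self, ts):
        target = self.records[-1]
        if target and ts < target[-1]:
            # Older than the last event in the target record: out of order.
            self.out_of_order += 1
            bisect.insort(target, ts)  # later events move to make room
            if len(target) > self.capacity:
                # Record was full: its newest event is forced into the
                # adjacent record.
                self.cascades += 1
                self.records.append([target.pop()])
        else:
            if len(target) >= self.capacity:
                self.records.append([])
                target = self.records[-1]
            target.append(ts)


archive = ToyArchive(capacity=3)
for ts in [10, 20, 30, 25, 40, 35]:
    archive.add(ts)

print(archive.out_of_order)  # 2
print(archive.cascades)      # 1
print(archive.records)       # [[10, 20, 25], [30, 35, 40]]
```

In this model an out-of-order event costs an in-record move every time, and an extra record-to-record move only when the target record is already full, which is why the cascade count is the more direct measure of impact.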
Events Read Counter
Number of events read by all applications. For example, a trending application requests an array of events over a specified time period. This counter is incremented for each event returned.
Read Operations Counter
Number of archive read requests. Each archive read request increments this counter once, regardless of the number of events returned.
Archive Memory Cache Counters
The Archive Subsystem uses a memory cache when handling events sent to the archive disk file.
During routine operation, the cache is automatically flushed to disk at least every 15 minutes. An abrupt system shutdown, such as a power loss, should therefore lose no more than the last 15 minutes of data. The flush interval may be changed via a configurable timeout table parameter.
The data archive cache architecture provides large performance gains over reading and writing directly to disk. The cache even provides a significant performance gain over the operating system file cache. As with all file cache designs, the disk image will often be slightly inconsistent; therefore an archive backup cannot be performed without coordination with the Archive Subsystem. The command piartool -bs places the archive in a safe, consistent state for backups; piartool -be returns the archive to normal operation. This is discussed in detail in the system backup section in Chapter 3, Troubleshooting and Repair.
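Because a backup must be bracketed by piartool -bs and piartool -be, it is worth making sure the second command runs even if the copy step fails. The following Python sketch shows one way to do that. The function name, the copy_files callback, and the injectable run parameter are hypothetical, and the sketch assumes that piartool is on the PATH and reports failure through its exit status.

```python
import subprocess


def backup_with_archive_paused(copy_files, run=subprocess.run):
    """Call copy_files() while the archive is held in backup mode.

    copy_files is a caller-supplied function that copies the archive files.
    run defaults to subprocess.run and can be replaced for testing.
    """
    run(["piartool", "-bs"], check=True)      # enter backup mode
    try:
        copy_files()
    finally:
        # Always return the archive to normal operation, even on error.
        run(["piartool", "-be"], check=True)
```

A caller would pass its own copy routine, for example backup_with_archive_paused(my_copy_function). If piartool -bs itself fails, the copy is never attempted and piartool -be is not issued.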
The cache consists of archive records loaded into memory. Records are added as needed; they are deleted when unused for a certain length of time. Cache Record Count yields the current count. Cache Records Created is incremented when memory is allocated for a new record. When archive data is requested (for example, the user is trying to trend a point in PI ProcessBook), the Archive Subsystem goes to the cache to retrieve the event data. If the record is not there, the Archive Subsystem loads the record from disk to the cache; Archive Record Disk Reads is incremented.
Events written to the archive are stored first in memory. The Unflushed Events counter indicates the total number of events not yet flushed to disk. The Unflushed Points counter indicates the number of points that have at least one event not yet flushed.
Archive Record Disk Writes is incremented each time a record is written to disk. This occurs during the regular cache flush every 15 minutes. It also occurs when the number of unflushed events for a point exceeds the configured maximum.
Cache Record Memory Reads is incremented for each read access.
Cache Clean Count indicates the number of records that were removed from the cache. The archive cache contains a finite number of records. Old or low-use records are removed from memory to make room for more recently accessed records.
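The way the read-side cache counters move together can be sketched with a small model. This is a toy, not the Archive Subsystem's design: the class ToyRecordCache, the least-recently-used eviction policy, and the fixed max_records limit (assumed to be at least 1) are assumptions chosen to mirror the descriptions above.

```python
from collections import OrderedDict


class ToyRecordCache:
    """Toy read cache that tracks counters like those in piartool -as."""

    def __init__(self, max_records, disk):
        self.max_records = max_records  # assumed fixed limit, at least 1
        self.disk = disk                # record id -> record data
        self.cache = OrderedDict()      # least recently used entry first
        self.records_created = 0        # Cache Records Created
        self.memory_reads = 0           # Cache Record Memory Reads
        self.disk_reads = 0             # Archive Record Disk Reads
        self.clean_count = 0            # Cache Clean Count

    @property
    def record_count(self):             # Cache Record Count
        return len(self.cache)

    def read(self, record_id):
        self.memory_reads += 1          # every read access goes to the cache
        if record_id not in self.cache:
            # Cache miss: load the record from disk into a new cache entry.
            self.disk_reads += 1
            self.records_created += 1
            self.cache[record_id] = self.disk[record_id]
            if len(self.cache) > self.max_records:
                self.cache.popitem(last=False)  # drop least recently used
                self.clean_count += 1
        self.cache.move_to_end(record_id)       # mark as most recently used
        return self.cache[record_id]


cache = ToyRecordCache(max_records=2, disk={1: "a", 2: "b", 3: "c"})
for rid in [1, 2, 1, 3]:
    cache.read(rid)

print(cache.record_count, cache.records_created)  # 2 3
print(cache.memory_reads, cache.disk_reads)       # 4 3
print(cache.clean_count)                          # 1
```

In this model, a disk-read count that rises almost as fast as the memory-read count indicates that most requests miss the cache.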
Primary Archive Number
The Primary Archive Number is an internal identifier and should be ignored. It is not to be confused with the sequence number of the archive, as listed by piartool -al.
Archive Shift Prediction
Archive Shift Prediction (hr) estimates the time, in hours, until the next archive shift. Use piartool -al to list the target archive file for the shift. The target archive will be initialized on shift; if it contains data, make sure it is backed up. If this data is required to remain online, a new archive of adequate size should be created and registered.
When the current archive is less than 20% full, the estimate is 0. To determine whether a zero estimate means that the archive is nearly full or that it is not yet full enough for a prediction, run piartool -al. The following message indicates that there is not enough data for a prediction.
Shift Time: Not enough information for prediction
The shift prediction in piartool -as differs slightly from the one in piartool -al. The piartool -al figure is calculated at the time of the call, whereas piartool -as shows the average over the latest 10 minutes. The latter number is available as a Windows Performance Counter.
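The rule for reading a zero estimate can be written down as a small decision function. This Python sketch is illustrative only: the function name and the wording of the returned messages are hypothetical, and the only string it relies on from piartool -al is the message quoted above.

```python
NOT_ENOUGH = "Not enough information for prediction"


def interpret_shift_prediction(hours, al_output):
    """Explain an Archive Shift Prediction (hr) value.

    hours is the counter value from piartool -as; al_output is captured
    text from piartool -al, consulted only when the estimate is 0.
    """
    if hours > 0:
        return f"shift expected in about {hours} hour(s)"
    if NOT_ENOUGH in al_output:
        return "no prediction yet: archive holds too little data"
    return "shift may be imminent: check the target archive"


print(interpret_shift_prediction(1, ""))
print(interpret_shift_prediction(0, "Shift Time: " + NOT_ENOUGH))
print(interpret_shift_prediction(0, ""))
```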
Archiving Flag
Indicates whether events may be written to the archive: a value of 1 means events may be written, and a value of 0 means they may not. The Archiving Flag is set to 1 when there is a mounted Primary Archive. A Primary Archive may be registered but not mounted, for example during an archive shift. In this case, the Archiving Flag is set to 0. This flag is also set to 0 when in backup mode.
All registered archives may be viewed using piartool -al. The Archiving Flag is set to 0 if the Primary Archive becomes full and there is no other archive file available into which to shift. Note that the Primary Archive will never overwrite itself.
Archive Backup Flag
This flag is set to 1 when the archive is in backup mode. Backup mode indicates that the archive file is in a consistent, unlocked state and may be backed up. The value is 0 when the archive is available for normal access.
Backup mode is entered and exited by running piartool -bs and piartool -be, respectively.
Archive Loaded Flag
This flag is 1 when a valid primary archive is mounted and 0 when the primary archive is not mounted.
Shift or System Backup Flag
This flag is 1 when the archive is in shift mode or the Archive Subsystem has been placed in backup mode. Shifts occur automatically or can be forced via piartool -fs. System backup mode is entered with piartool -systembackup.
Failed Archive Shift Flag
Set to 1 when a shift should occur but no shiftable archive exists. Under normal conditions this flag is 0.
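The flag descriptions above imply a set of values to expect during routine operation. The following Python sketch compares observed flag values against those expectations. The function name and the dictionary layout are hypothetical, and the expected values are taken from the descriptions in this section rather than from any PI Server API.

```python
# Values expected when the archive is running normally, outside of any
# backup or shift, according to the flag descriptions above.
EXPECTED_FLAGS = {
    "Archiving Flag": 1,
    "Archive Loaded Flag": 1,
    "Archive Backup Flag": 0,
    "Shift or System Backup Flag": 0,
    "Failed Archive Shift Flag": 0,
}


def unexpected_flags(values):
    """Return a message for each flag whose value differs from routine."""
    return [
        f"{name} is {values[name]}, expected {want}"
        for name, want in EXPECTED_FLAGS.items()
        if name in values and values[name] != want
    ]


# Flag values from the sample output in section 2.5.1.
sample_flags = {
    "Archiving Flag": 1,
    "Archive Backup Flag": 0,
    "Archive Loaded Flag": 1,
    "Shift or System Backup Flag": 0,
    "Failed Archive Shift Flag": 0,
}
print(unexpected_flags(sample_flags))  # []
print(unexpected_flags({"Archiving Flag": 0, "Failed Archive Shift Flag": 1}))
```

A non-empty result is not necessarily a fault: during a backup or an archive shift, several of these flags legitimately change value for a time.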
Overflow Index Record Count
Number of index records. Index records speed up access to overflow records. An index record is created when two overflow records for a point are full and a third one is being created. This counter is a measurement of archive file consumption.
Overflow Data Record Count
Number of non-primary data records. Each archive has a primary record for each point. When this record is full, data is written to overflow records. This counter gives a measurement of archive consumption.