Service processor system monitoring - surveillance – IBM H SERIES RS/6000 User Manual


Service Processor System Monitoring - Surveillance

Surveillance is a function in which the Service Processor monitors the system, and
the system monitors the Service Processor. This monitoring is accomplished by
periodic samplings called heartbeats.

Surveillance is available during two phases:

1. System firmware bringup (automatic)

2. Operating system runtime (optional)

System Firmware Surveillance: Provides the Service Processor with a means to detect
boot failures while the system firmware is running.

System firmware surveillance is automatically enabled during system power-on. It
cannot be disabled via a user-selectable option.

If the Service Processor detects no heartbeats during system IPL (for 7 minutes), it
cycles the system power to attempt a reboot. The maximum number of retries is set
from the Service Processor menus. If the fail condition repeats, the Service
Processor leaves the machine powered on, logs an error, and offers menus to the
user. If Call-out is enabled, the Service Processor calls to report the failure and
displays the operating system surveillance failure code on the operator panel.
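The retry behavior described above can be sketched as a small state machine. This is only an illustrative model, not Service Processor firmware: the class name, method name, and action strings are hypothetical, and it assumes the retry limit counts power cycles, so that the timeout following the last permitted retry is the point at which the Service Processor stops retrying.

```python
from dataclasses import dataclass, field

# From the text: no heartbeat for 7 minutes during IPL counts as a failure.
IPL_HEARTBEAT_TIMEOUT_MIN = 7


@dataclass
class FirmwareSurveillance:
    """Hypothetical model of the firmware-surveillance retry policy."""
    max_retries: int                 # set from the Service Processor menus
    call_out_enabled: bool = False
    retries_used: int = 0
    actions: list = field(default_factory=list)

    def on_ipl_timeout(self) -> str:
        """Handle one IPL attempt that produced no heartbeat in time."""
        if self.retries_used < self.max_retries:
            # Retries remain: cycle system power to attempt a reboot.
            self.retries_used += 1
            self.actions.append("power_cycle")
            return "power_cycle"
        # Retries exhausted: stay powered on, log the error, offer menus.
        self.actions.append("log_error")
        self.actions.append("offer_menus")
        if self.call_out_enabled:
            # Assumed ordering: call out after logging and offering menus.
            self.actions.append("call_out")
        return "give_up"


sp = FirmwareSurveillance(max_retries=2, call_out_enabled=True)
results = [sp.on_ipl_timeout() for _ in range(3)]
```

With a limit of two retries, the first two timeouts each trigger a power cycle and the third ends in the logged-error state, with a call-out only because it was enabled.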

Operating System Surveillance: Provides the Service Processor with a means to detect
hang conditions and hardware or software failures while the operating system is
running. It also provides the operating system with a means to detect a Service
Processor failure by the lack of a return heartbeat.
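The two-way nature of this monitoring can be illustrated with a minimal sketch in which each side records when it last heard from the other. The class, the peer labels "os" and "sp", and the timeout in seconds are assumptions made for the example; the manual does not describe the actual heartbeat protocol.

```python
class HeartbeatLink:
    """Hypothetical tracker for the last heartbeat seen from each peer."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        # "os" = operating system, "sp" = Service Processor (labels assumed).
        self.last_seen = {"os": 0.0, "sp": 0.0}

    def beat(self, sender: str, now: float) -> None:
        """Record a heartbeat from `sender` at time `now` (seconds)."""
        self.last_seen[sender] = now

    def silent_peers(self, now: float) -> list:
        """Return the peers whose last heartbeat is older than the timeout."""
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


link = HeartbeatLink(timeout_s=60)
link.beat("os", now=100.0)
link.beat("sp", now=100.0)
both_alive = link.silent_peers(now=150.0)

link.beat("sp", now=200.0)           # only the Service Processor beats again
os_silent = link.silent_peers(now=200.0)
```

In this model a silent "os" entry corresponds to a hang or failure seen by the Service Processor, and a silent "sp" entry corresponds to the missing return heartbeat seen by the operating system.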

Operating system surveillance is disabled by default. This is to allow the user to run
operating systems that do not support this Service Processor option.

Operating system surveillance can be enabled or disabled via:

Service Processor menus

Service Processor Service Aids

Three parameters must be set for operating system surveillance:

1. Surveillance enable/disable

2. Surveillance interval

This is the maximum time in minutes the Service Processor should wait for a
heartbeat from the operating system before timeout.
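As a rough illustration of how the enable/disable setting and the surveillance interval interact, the sketch below models just those two parameters. The names are hypothetical, the third parameter is not shown on this page and is deliberately left out, and treating a wait of exactly the interval as still within limits is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OsSurveillanceSettings:
    """Hypothetical container for the two parameters listed above."""
    enabled: bool        # surveillance enable/disable
    interval_min: int    # maximum minutes to wait for an OS heartbeat

    def __post_init__(self):
        if self.interval_min <= 0:
            raise ValueError("interval_min must be a positive number of minutes")


def heartbeat_timed_out(settings: OsSurveillanceSettings,
                        minutes_since_last_beat: float) -> bool:
    """Return True when surveillance is enabled and the wait exceeds the interval."""
    if not settings.enabled:
        return False  # disabled surveillance never reports a timeout
    return minutes_since_last_beat > settings.interval_min


settings = OsSurveillanceSettings(enabled=True, interval_min=5)
late = heartbeat_timed_out(settings, 6)
on_time = heartbeat_timed_out(settings, 5)
```

Keeping the check in one function makes the disabled case explicit: with surveillance off, no amount of silence from the operating system counts as a timeout.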


RS/6000 Enterprise Server Model H Series User's Guide
