Troubleshooting hardware problems affecting pairs, Troubleshooting with RAID Manager – HP XP7 Storage User Manual


Troubleshooting hardware problems affecting pairs

Hardware failures affecting Continuous Access Journal are described in the following table. Note
also that, in addition to the problems described below, hardware failures that affect cache memory
and shared memory can cause the pairs to be suspended.

Classification: Local or remote system hardware
SIM: DC0x, DC1x, DC2x
Causes of suspension:
- Hardware redundancy has been lost due to some blockade condition. As a result, one of the following could not complete: primary-secondary system communication, journal creation, copy operation, resynchronize operation, staging process, or de-staging process.
- Journals cannot be retained because some portion of the cache memory or the shared memory has been blocked due to hardware failure.
- The primary system failed to create and transfer journals due to unrecoverable hardware failure.
- The secondary system failed to receive and restore journals due to unrecoverable hardware failure.
- The drive parity group was in correction-access status while the Cnt Ac-J pair was in COPY status.
Recovery procedure:
1. Depending on the SIM, remove the hardware blockade or failure.
2. Resynchronize the failed volume pairs (pairresync).
If a failure occurs during execution of the RAID Manager horctakeover command, S-VOLs in SSWS pair status may remain in the master journal. If these volumes remain, execute the pairresync -swaps command on the S-VOLs whose pair status is SSWS (pairresync is the RAID Manager command for resynchronizing pairs, and -swaps is the swap option). This operation changes all volumes in the master journal to primary volumes. After this operation, resynchronize the pairs.

Classification: Communication between the local and remote systems
SIM: DC0x, DC1x
Causes of suspension:
- Communication between the systems failed because the secondary system or network relay devices were not running.
- Journal volumes remained full even after the timeout period elapsed.
Recovery procedure:
1. Remove the failure from the primary and secondary systems or the network relay devices.
2. If necessary, increase resources (for example, the amount of cache, the number of paths between primary and secondary systems, the parity groups for journal volumes, etc.).
3. Resynchronize the failed pairs.

Classification: RIO overload or RIO failure
SIM: DC2x
Causes of suspension:
- An unrecoverable RIO (remote I/O) timeout occurred because the system or network relay devices were overloaded, or RIO could not be finished due to a failure in the system.
Recovery procedure:
1. Release failed pairs (pairsplit -S).
2. If necessary, increase resources (for example, the amount of cache, the number of paths between primary and secondary systems, the parity groups for journal volumes, etc.).
3. Recreate failed pairs.

Classification: Planned power outage to the primary system
SIM: DC8x
Causes of suspension:
- The Cnt Ac-J pairs were temporarily suspended due to a planned power outage to the primary system.
Recovery procedure:
No recovery procedure is required. The primary system automatically removes the suspension condition when the system is powered on.
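
The SIM-to-classification mapping in the table can be sketched as a small lookup. This is only an illustration: the function name `classify_sim` and the dictionary are hypothetical and not part of RAID Manager. It assumes that the trailing "x" in each SIM stands for a single variable character, so only the first three characters are matched.

```python
# Hypothetical helper; not part of RAID Manager or any HP tool.
# Maps the first three characters of a SIM reference code to the
# classifications listed in the table. A SIM that appears in more than one
# table row returns several candidates.
SIM_CLASSIFICATIONS = {
    "DC0": [
        "Local or remote system hardware",
        "Communication between the local and remote systems",
    ],
    "DC1": [
        "Local or remote system hardware",
        "Communication between the local and remote systems",
    ],
    "DC2": [
        "Local or remote system hardware",
        "RIO overload or RIO failure",
    ],
    "DC8": ["Planned power outage to the primary system"],
}


def classify_sim(code: str) -> list[str]:
    """Return candidate classifications for a SIM such as 'DC21'.

    Assumption: the trailing 'x' in the table is one variable character,
    so matching is done on the first three characters only.
    """
    prefix = code.strip().upper()[:3]
    # Return a copy so callers cannot modify the lookup table.
    return list(SIM_CLASSIFICATIONS.get(prefix, []))
```

Because DC0x, DC1x, and DC2x each appear in two rows, the lookup narrows the possibilities but does not replace checking the actual cause of suspension before choosing a recovery procedure.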

Troubleshooting with RAID Manager

If an error occurs in a Continuous Access Journal pair operation performed with RAID
Manager, you can identify the cause of the error by referring to the RAID Manager operation log
file. By default, the file is stored in the following directory:

/HORCM/log*/curlog/horcmlog_HOST/horcm.log

Where:

* is the instance number.

HOST is the host name.
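
As a minimal sketch of how the two placeholders are substituted, the following builds the default log path for a given instance number and host name. The function name `horcm_log_path` is hypothetical. It only fills in the default location shown above and does not account for a customized log directory.

```python
from pathlib import PurePosixPath


def horcm_log_path(instance: int, host: str) -> PurePosixPath:
    """Build the default RAID Manager operation log path.

    Hypothetical helper: substitutes '*' with the instance number and
    'HOST' with the host name in /HORCM/log*/curlog/horcmlog_HOST/horcm.log.
    """
    return (
        PurePosixPath("/HORCM")
        / f"log{instance}"
        / "curlog"
        / f"horcmlog_{host}"
        / "horcm.log"
    )
```

For example, `horcm_log_path(1, "hostA")` yields `/HORCM/log1/curlog/horcmlog_hostA/horcm.log`, where `hostA` is a placeholder host name.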
