Failback, Modifying the failover policy – Dell PowerVault 775N (Rackmount NAS Appliance) User Manual


The group's resources are taken offline.

The resources in the group are taken offline by MSCS in the order determined by the group's dependency hierarchy: dependent resources first, followed by the resources on which they depend.

For example, if an application depends on a Physical Disk resource, MSCS takes the application offline first, allowing the application to write changes to the disk before the disk is taken offline.
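The ordering rule above can be sketched as a topological sort over the dependency hierarchy. This is only an illustration, not how MSCS is implemented: the function name offline_order and the dictionary layout are assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def offline_order(depends_on):
    """Return resource names in the order they would be taken offline.

    depends_on is a hypothetical mapping: resource -> set of resources
    it depends on. Dependents are listed before their dependencies.
    """
    sorter = TopologicalSorter()
    for resource, dependencies in depends_on.items():
        sorter.add(resource)
        for dependency in dependencies:
            # Register the dependent as a predecessor of its dependency,
            # so the dependent is emitted (taken offline) first.
            sorter.add(dependency, resource)
    return list(sorter.static_order())


group = {
    "Application": {"Physical Disk"},
    "Physical Disk": set(),
}
print(offline_order(group))  # ['Application', 'Physical Disk']
```

Bringing the resources back online would use the reverse of this order.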

The resource is taken offline.

The Cluster Service takes a resource offline by invoking, through the Resource Monitor, the resource DLL that manages the resource. If the resource does not shut down within a specified time limit, MSCS forces the resource to shut down.

The group is transferred to the next preferred host node.

When all of the resources are offline, MSCS attempts to transfer the group to the node that is listed next on the group's list of preferred host nodes.

For example, if cluster node 1 fails, MSCS moves the group's resources to the next node on the list, which is cluster node 2.
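The selection of the next host can be sketched as a walk through the preferred host list. The function name next_preferred_node and its parameters are hypothetical, and wrapping around to the start of the list is an assumption of this sketch, not documented MSCS behavior.

```python
def next_preferred_node(preferred_nodes, current_node, available_nodes):
    """Pick the next available node after current_node in the preferred list.

    Returns None when no other node on the list is available.
    """
    if current_node in preferred_nodes:
        start = preferred_nodes.index(current_node) + 1
    else:
        start = 0
    # Assumption: after the end of the list, continue from the beginning.
    candidates = preferred_nodes[start:] + preferred_nodes[:start]
    for node in candidates:
        if node != current_node and node in available_nodes:
            return node
    return None


# Node 1 has failed, so only node 2 is available.
print(next_preferred_node(["node1", "node2"], "node1", {"node2"}))  # node2
```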

The group's resources are brought back online.

If MSCS successfully moves the group to another node, it tries to bring all of the group's resources online. Failover is complete when all of the group's resources are online on the new node.

MSCS continues trying to fail over a group until it succeeds or until the number of attempts within a predetermined time span exceeds the allowed maximum. A group's failover policy specifies the maximum number of failover attempts that can occur in an interval of time. MSCS discontinues the failover process when the number of attempts exceeds the maximum set in the group's failover policy.
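The attempt limit can be pictured as a counter over a sliding time window. The sketch below is a simplified model under stated assumptions: the class name FailoverPolicy, the use of seconds, and the exact way attempts age out of the window are illustrative choices, not a description of how MSCS counts attempts.

```python
from collections import deque


class FailoverPolicy:
    """Allow at most `threshold` failover attempts per `period_seconds`."""

    def __init__(self, threshold, period_seconds):
        self.threshold = threshold
        self.period_seconds = period_seconds
        self._attempts = deque()  # timestamps of recent attempts

    def allow_attempt(self, now):
        # Forget attempts that have aged out of the window.
        while self._attempts and now - self._attempts[0] >= self.period_seconds:
            self._attempts.popleft()
        if len(self._attempts) >= self.threshold:
            return False  # limit reached: stop trying to fail over
        self._attempts.append(now)
        return True


policy = FailoverPolicy(threshold=2, period_seconds=60)
print(policy.allow_attempt(0))   # True
print(policy.allow_attempt(10))  # True
print(policy.allow_attempt(20))  # False, two attempts already in the window
print(policy.allow_attempt(61))  # True, the attempt at t=0 has aged out
```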

Modifying the Failover Policy

Because a group's failover policy provides a framework for the failover process, ensure that your failover policy is appropriate for your particular needs. When you modify your failover policy, consider the following guidelines:

Define the method in which MSCS detects and responds to individual resource failures in a group.

Establish dependency relationships between the cluster resources to control the order in which MSCS takes resources offline.

Specify Time-out, failover Threshold, and failover Period for your cluster resources:

Time-out controls how long MSCS waits for the resource to shut down.

Threshold and Period control how many times MSCS attempts to fail over a resource in a particular period of time.

Specify a Possible owner list for your cluster resources. The Possible owner list for a resource controls which cluster nodes are allowed to host the resource.
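As a planning aid, the settings named in these guidelines can be written down per resource before you enter them in the cluster administration tool. The record below is a hypothetical sketch: the class, its field names, and the sample values are assumptions for illustration and are not MSCS defaults or an MSCS API.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePolicy:
    """Hypothetical record of one resource's failover settings."""

    name: str
    timeout_seconds: int   # how long to wait for the resource to shut down
    threshold: int         # maximum failover attempts...
    period_seconds: int    # ...within this period
    possible_owners: set = field(default_factory=set)

    def can_host(self, node):
        # Only nodes on the Possible owner list may host the resource.
        return node in self.possible_owners


# Sample values are illustrative only.
disk = ResourcePolicy(
    name="Physical Disk",
    timeout_seconds=180,
    threshold=3,
    period_seconds=900,
    possible_owners={"node1", "node2"},
)
print(disk.can_host("node1"))  # True
print(disk.can_host("node3"))  # False
```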

Failback

When the System Administrator repairs and restarts the failed cluster node, the opposite process occurs. After the original
