Failback, Modifying your failover policy – Dell PowerVault 770N (Deskside NAS Appliance) User Manual


For example, if an application depends on a Physical Disk resource, the Cluster Service takes the application offline first, allowing the application to write changes to the disk before the disk is taken offline.

The resource is taken offline.

The Cluster Service takes a resource offline by invoking, through the Resource Monitor, the resource DLL that manages the resource. If the resource does not shut down within a specified time limit, the Cluster Service forces the resource to shut down.

The group is transferred to the next preferred host node.

When all of the resources are offline, the Cluster Service attempts to transfer the group to the node that is listed next on the group's list of preferred host nodes.

For example, if cluster node 1 fails, the Cluster Service moves the resources to the next cluster node number, which is cluster node 2.

The group's resources are brought back online.

If the Cluster Service successfully moves the group to another node, it tries to bring all of the group's resources online.

Failover is complete when all of the group's resources are online on the new node.
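
The sequence above (take resources offline in dependency order, transfer the group to the next preferred host node, bring the resources back online) can be illustrated with a small simulation. This is only a sketch of the ordering logic, not the Cluster Service's implementation. The names `Resource`, `offline_order`, and `fail_over` are hypothetical, and returning `None` when no further preferred node exists is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    depends_on: tuple = ()   # names of the resources this one needs
    online: bool = True


def offline_order(resources):
    """Order resource names so each dependent precedes what it depends on."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        # Anything that depends on `name` has to go offline before it does.
        for res in resources:
            if name in res.depends_on:
                visit(res.name)
        order.append(name)

    for res in resources:
        visit(res.name)
    return order


def fail_over(resources, preferred_nodes, failed_node):
    """Return the new host node, or None if no preferred node is left."""
    by_name = {res.name: res for res in resources}
    order = offline_order(resources)

    for name in order:                       # take the resources offline
        by_name[name].online = False

    nxt = preferred_nodes.index(failed_node) + 1
    if nxt >= len(preferred_nodes):          # assumption: no wrap-around
        return None
    new_node = preferred_nodes[nxt]          # transfer the group

    for name in reversed(order):             # bring the resources back online
        by_name[name].online = True
    return new_node
```

With a Physical Disk resource and an application that depends on it, `offline_order` places the application before the disk, and the resources come back online in the reverse order on the new node.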

The Cluster Service continues trying to fail over a group until it succeeds or until it reaches the maximum number of attempts allowed within a predetermined time span. A group's failover policy specifies the maximum number of failover attempts that can occur in an interval of time. The Cluster Service discontinues the failover process when it exceeds the number of attempts in the group's failover policy.
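
The attempt limit can be pictured as a counter over a time window. The sketch below assumes a sliding window measured in seconds. The class name `FailoverLimiter` and its parameters are hypothetical, and the Cluster Service may count attempts differently.

```python
from collections import deque


class FailoverLimiter:
    """Allow at most `threshold` failover attempts per `period_seconds`."""

    def __init__(self, threshold, period_seconds):
        self.threshold = threshold
        self.period_seconds = period_seconds
        self._attempts = deque()   # timestamps of recent attempts

    def allow_attempt(self, now):
        # Forget attempts that fall outside the current period.
        while self._attempts and now - self._attempts[0] >= self.period_seconds:
            self._attempts.popleft()
        if len(self._attempts) >= self.threshold:
            return False           # limit reached: stop failing over
        self._attempts.append(now)
        return True
```

For example, with a threshold of 2 and a period of 60 seconds, attempts at 0 and 10 seconds are allowed, an attempt at 20 seconds is refused, and an attempt at 61 seconds is allowed again because the first attempt has aged out of the window.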

Modifying Your Failover Policy

Because a group's failover policy provides a framework for the failover process, make sure that your failover policy is appropriate for your particular needs. When you modify your failover policy, consider the following guidelines:

Define how the Cluster Service detects and responds to individual resource failures in a group.

Establish dependency relationships between the cluster resources to control the order in which the Cluster Service takes resources offline.

Specify Time-out, failover Threshold, and failover Period for your cluster resources.

Time-out controls how long the Cluster Service waits for the resource to shut down.

Threshold and Period control how many times the Cluster Service attempts to fail over a resource in a particular period of time.

Specify a Possible owner list for your cluster resources. The Possible owner list for a resource controls which cluster nodes are allowed to host the resource.
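
As a rough illustration of how these settings fit together, the sketch below collects them in one record per resource and filters a preferred host list by the Possible owner lists. The field names, the units, and the helper `eligible_hosts` are hypothetical. The rule that a node must be a possible owner of every resource in the group is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePolicy:
    name: str
    timeout_seconds: int        # how long to wait for a clean shutdown
    threshold: int              # maximum failover attempts ...
    period_seconds: int         # ... within this period
    possible_owners: frozenset  # nodes allowed to host the resource


def eligible_hosts(preferred_nodes, policies):
    """Keep preferred nodes that every resource in the group may run on."""
    return [
        node for node in preferred_nodes
        if all(node in policy.possible_owners for policy in policies)
    ]
```

A group whose disk can run on node 1 or node 2, but whose application is restricted to node 2, would then have node 2 as its only eligible host.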

Failback

When the system administrator repairs and restarts the failed cluster node, the opposite process may occur. After the original cluster node has been restarted and rejoins the cluster, the Cluster Service takes the running application and its resources offline, moves them from the failover cluster node to the original cluster node, and then restarts the application. This process of returning the resources to their original cluster node is called failback.
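
Failback follows the same three steps as failover, in the direction of the original node. The sketch below records the order of actions. The function `fail_back` and the dictionary keys it reads are hypothetical, chosen only for this example.

```python
def fail_back(group, rejoined_node):
    """Return the ordered actions taken, or [] when no failback applies."""
    if rejoined_node != group["original_node"]:
        return []   # a different node rejoined: nothing to return
    if group["current_node"] == rejoined_node:
        return []   # the group never left its original node

    actions = [
        f"take {group['name']} offline on {group['current_node']}",
        f"move {group['name']} to {rejoined_node}",
        f"restart {group['name']} on {rejoined_node}",
    ]
    group["current_node"] = rejoined_node
    return actions
```

Calling it for a group that failed over from node 1 to node 2 yields the offline, move, and restart actions once node 1 rejoins, and an empty list on any later call.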
