Restoring normal operation, Master node, Transit nodes with one port down – Allied Telesis AlliedWare OS User Manual


Page 7 | AlliedWare™ OS How To Note: EPSR

How EPSR Works

Restoring Normal Operation

Master Node

Once the fault has been fixed, the master node’s Health messages traverse the whole ring and
arrive at the master node’s secondary port. The master node then restores normal
conditions by:

1. declaring the ring to be in a state of Complete

2. blocking its secondary port for data VLAN traffic (but not for the control VLAN)

3. flushing its forwarding database for its two ring ports

4. sending a Ring-Up-Flush-FDB message from its primary port, to all transit nodes.
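
The four steps above can be sketched as a small state model. This is only an illustration under stated assumptions, not the AlliedWare implementation: the class, attribute and method names are hypothetical, the starting state name "Failed" is assumed, and the forwarding database is reduced to a set of MAC addresses per ring port.

```python
from dataclasses import dataclass, field


@dataclass
class MasterNode:
    """Toy model of an EPSR master node (all names are hypothetical)."""

    ring_state: str = "Failed"            # assumed name for the broken-ring state
    secondary_blocks_data: bool = False   # secondary port is open while the ring is broken
    fdb: dict = field(default_factory=lambda: {
        "primary": {"00:00:5e:00:53:01"},
        "secondary": {"00:00:5e:00:53:02"},
    })
    sent: list = field(default_factory=list)  # (port, message) pairs sent so far

    def on_health_received_on_secondary(self):
        # 1. Declare the ring to be Complete.
        self.ring_state = "Complete"
        # 2. Block the secondary port for data VLANs only (control VLAN stays open).
        self.secondary_blocks_data = True
        # 3. Flush the learned addresses for both ring ports.
        for port in ("primary", "secondary"):
            self.fdb[port].clear()
        # 4. Tell the transit nodes to flush and resume forwarding.
        self.sent.append(("primary", "Ring-Up-Flush-FDB"))


master = MasterNode()
master.on_health_received_on_secondary()
```

Note that the sketch blocks the secondary port before it sends the Ring-Up-Flush-FDB message, which matches the order of the steps listed above.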

Transit Nodes with One Port Down

As soon as the fault has been fixed, the transit nodes on each side of the (previously) faulty
link section detect that link connectivity has returned. They change their ring port state from
Links Down to Pre-Forwarding, and wait for the master node to send a Ring-Up-Flush-FDB
control message.

Once these transit nodes receive the Ring-Up-Flush-FDB message, they:

• flush the forwarding databases for both their ring ports

• change the state of their ports from blocking to forwarding for the data VLAN, which
allows data to flow through their previously-blocked ring ports
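
As a rough illustration of the transit node behaviour just described, the sketch below models a node that has one ring port down, sees the link return, and then receives the Ring-Up-Flush-FDB message. The class and method names are hypothetical, and the forwarding database is again simplified to a set of addresses per port.

```python
class TransitNode:
    """Toy transit node with one ring port down (all names are hypothetical)."""

    def __init__(self):
        self.state = "Links Down"
        self.link_up = {"port1": True, "port2": False}
        # Data VLAN forwarding per ring port; the down port carries nothing.
        self.forwarding = {"port1": True, "port2": False}
        self.fdb = {"port1": {"00:00:5e:00:53:01"}, "port2": set()}

    def on_link_restored(self, port):
        self.link_up[port] = True
        if all(self.link_up.values()):
            # The link is back, but the port stays blocked for the data VLAN
            # until the master node's message arrives.
            self.state = "Pre-Forwarding"

    def on_ring_up_flush_fdb(self):
        # Flush the forwarding database for both ring ports.
        for entries in self.fdb.values():
            entries.clear()
        # Move the ring ports from blocking to forwarding for the data VLAN.
        for port in self.forwarding:
            self.forwarding[port] = True
        self.state = "Links Up"


node = TransitNode()
node.on_link_restored("port2")   # state becomes Pre-Forwarding, port2 still blocked
node.on_ring_up_flush_fdb()      # state becomes Links Up, both ports forwarding
```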

The transit nodes do not start forwarding traffic on the previously-down ports until after
they receive the Ring-Up-Flush-FDB message. This makes sure the previously-down transit
node ports stay blocked until after the master node blocks its secondary port. Otherwise,
the ring could form a loop, because for a moment it would have no blocked ports.
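
The ordering argument can be checked with a very small model. The sketch below is a simplification and assumes only two blocking points matter: the master node's secondary port and the repaired link. The event names and the function are hypothetical. A loop is reported as soon as neither point blocks data VLAN traffic.

```python
def loop_appears(events):
    """Return True if, at any point in the event sequence, nothing in the ring is blocked."""
    # Just after the repair: the master's secondary port is still open,
    # and the repaired link is held blocked by the transit nodes.
    blocked = {"master_secondary": False, "repaired_link": True}
    for event in events:
        if event == "master_blocks_secondary":
            blocked["master_secondary"] = True
        elif event == "transit_unblocks":
            blocked["repaired_link"] = False
        if not any(blocked.values()):
            return True  # no blocked port anywhere: data can circulate
    return False


# Order used by EPSR: the master blocks first, then the transit nodes unblock.
safe = loop_appears(["master_blocks_secondary", "transit_unblocks"])

# Hypothetical wrong order: the transit nodes unblock as soon as the link returns.
unsafe = loop_appears(["transit_unblocks", "master_blocks_secondary"])
```

In this model `safe` is `False` and `unsafe` is `True`, which is the reason the transit nodes wait for the master node's message.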

Transit Nodes with Both Ports Down

The Allied Telesis implementation includes an extra feature to improve handling of double
link failures. If both ports on a transit node are down and one port comes up, the node:

1. puts the port immediately into the forwarding state and starts forwarding data out that
port. It does not need to wait, because the other ring port on the node is down, so the
node knows there is no loop in the ring

2. remains in the Links Down state

3. starts a DoubleFailRecovery timer with a timeout of four seconds

4. waits for the timer to expire. At that time, if one port is still up and one is still down, the
transit node sends a Ring-Up-Flush-FDB message out the port that is up. This message is
usually called a “Fake Ring Up message”.

Sending this message allows any ports on other transit nodes that are blocking or in the
Pre-Forwarding state to move to forwarding traffic in the Links Up state. The timer delay lets the
device at the other end of the link that came up configure its port appropriately, so that it is
ready to receive the transmitted message.

Note that the master node would not send a Ring-Up-Flush-FDB message in these
circumstances, because the ring is not in a state of Complete. The master node’s secondary
port remains unblocked.
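
The double-failure recovery steps can be sketched with an explicit clock instead of a real timer. This is a minimal model under stated assumptions: the class, method and constant names are hypothetical, time is passed in as a number of seconds, and the case where the second port comes up before the timer expires is not described in the text, so the sketch simply sends nothing in that case.

```python
DOUBLE_FAIL_RECOVERY_TIMEOUT = 4.0  # seconds, as stated in the text


class DoubleFailTransitNode:
    """Toy transit node that starts with both ring ports down (names are hypothetical)."""

    def __init__(self):
        self.state = "Links Down"
        self.link_up = {"port1": False, "port2": False}
        self.forwarding = {"port1": False, "port2": False}
        self.timer_deadline = None
        self.sent = []  # (port, message) pairs sent so far

    def on_link_up(self, port, now):
        both_were_down = not any(self.link_up.values())
        self.link_up[port] = True
        if both_were_down:
            # Step 1: forward immediately; the other port is down, so no loop is possible.
            self.forwarding[port] = True
            # Step 2: the node state is left as Links Down.
            # Step 3: start the DoubleFailRecovery timer.
            self.timer_deadline = now + DOUBLE_FAIL_RECOVERY_TIMEOUT

    def tick(self, now):
        # Step 4: act only once the timer has expired.
        if self.timer_deadline is None or now < self.timer_deadline:
            return
        self.timer_deadline = None
        up_ports = [p for p, is_up in self.link_up.items() if is_up]
        if len(up_ports) == 1:
            # Still one port up and one down: send the "Fake Ring Up message".
            self.sent.append((up_ports[0], "Ring-Up-Flush-FDB"))


node = DoubleFailTransitNode()
node.on_link_up("port1", now=0.0)
node.tick(now=3.9)   # timer still running, nothing is sent
node.tick(now=4.0)   # timer expired, the message goes out port1
```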