Wednesday, July 3, 2013

Optimizing and Protecting Spanning Tree

Optimizing STP


Left to defaults, 802.1D (plain old STP) can take a very long time to converge.  For example, when a root switch fails, a switch must wait Maxage (20 seconds) before convergence can even begin.  Then, the newly forwarding ports must wait 2 x Forward Delay (15 seconds each, 30 seconds in all) to move through the listening and learning states before they can actually start forwarding.  This is a total of 50 seconds, a noticeable network hit.
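The arithmetic is easy to check with a few lines of Python.  This is only a sketch of the timer math described above; the function and constant names are my own and do not come from any Cisco tool.

```python
# Default 802.1D timers in seconds, as given in the text above.
MAX_AGE = 20
FORWARD_DELAY = 15


def convergence_time(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Worst-case delay before a port forwards after the root is lost."""
    listening = forward_delay  # time spent in the listening state
    learning = forward_delay   # time spent in the learning state
    return max_age + listening + learning


print(convergence_time())  # 50
```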

Enhancements have been added over time to address this, such as PortFast, UplinkFast, and BackboneFast.

PortFast

This Cisco-proprietary feature allows a port to immediately transition to forwarding state once it is physically up (powered on and plugged in).  It does this by skipping the listening and learning states.  This should only be enabled on access ports.  If a switch is connected to a port with PortFast enabled, loops may occur.  For this reason, it is a good idea to enable Bridge Protocol Data Unit (BPDU) Guard and Root Guard when using PortFast.

UplinkFast


UplinkFast improves convergence by providing alternate root ports (RPs) for immediate transition in case of a failure of the current RP.  When you enable UplinkFast, the switch does three things:
  1. Raises its own bridge priority to 49,152
  2. Raises its port costs by 3000
  3. Tracks alternate RPs, which are ports that are receiving Hello messages from the root switch.
This lends itself well to good STP design with access switches - access switches should never become root or transit switches.  The higher bridge priority reduces the chance of the switch becoming root.  The increased port costs reduce the chance of the switch becoming a transit switch.  Lastly, when the RP fails, the switch can immediately fail over to an alternate uplink.
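To see why the higher priority keeps an access switch from becoming root, here is a rough Python sketch of a root election.  The switch names and MAC addresses are made up for illustration, and the default priority of 32,768 is the standard default rather than something stated above.  The lowest (priority, MAC) pair wins.

```python
DEFAULT_PRIORITY = 32768     # standard default bridge priority (assumption, not from the text)
UPLINKFAST_PRIORITY = 49152  # value set by UplinkFast, per the list above


def elect_root(bridges):
    """Return the name of the bridge with the lowest (priority, MAC) bridge ID."""
    return min(bridges, key=lambda name: bridges[name])


# Hypothetical switches.  The access switch has the lowest MAC address,
# so it wins the election while all priorities are equal.
bridges = {
    "dist1": (DEFAULT_PRIORITY, "00:00:00:00:00:0b"),
    "dist2": (DEFAULT_PRIORITY, "00:00:00:00:00:0c"),
    "access1": (DEFAULT_PRIORITY, "00:00:00:00:00:0a"),
}
root_before = elect_root(bridges)

# Enabling UplinkFast on the access switch raises its priority.
bridges["access1"] = (UPLINKFAST_PRIORITY, "00:00:00:00:00:0a")
root_after = elect_root(bridges)

print(root_before, "->", root_after)
```

With equal priorities the access switch would win on MAC address alone; once its priority is raised, a distribution switch takes over as root.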

When the RP fails on a switch with UplinkFast enabled, the switch immediately transitions to an alternate RP and begins forwarding.  It also sends out multicast frames, one for each MAC address in its local table with that address as the source, which causes other switches to update their Content Addressable Memory (CAM) tables.

BackboneFast


BackboneFast optimizes convergence when an indirect failure occurs.  When a direct failure occurs, such as the loss of an RP, a switch doesn't have to wait Maxage to transition (thanks to UplinkFast).  However, when an upstream link to the root fails, downstream switches stop receiving Hello messages.  This is where these switches would have to wait Maxage before converging.  BackboneFast addresses this by having the switch ask its neighboring switch whether it is still receiving Hellos from the root.

[Figure: BackboneFast indirect failure (BBFast_Indirect_Fail)]
 

When a Hello goes missing, a switch with BackboneFast enabled will send a Root Link Query (RLQ) BPDU out the port on which the Hello should have arrived.  If the switch that receives the RLQ has lost its own path to the root, it will send an RLQ message back to the requesting switch to inform it that the path to root has been lost.  This will trigger the requesting switch to skip the Maxage timer and begin converging.  This exchange of RLQs of course requires that BackboneFast be configured on all participating switches.
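As a rough illustration of what skipping Maxage buys you, the sketch below compares the two cases using the default timers.  It is a simplification under my own assumptions: the function name is invented, and the short time the RLQ exchange itself takes is ignored.

```python
MAX_AGE = 20        # seconds, default
FORWARD_DELAY = 15  # seconds, default


def indirect_failure_delay(backbonefast):
    """Approximate delay before forwarding after an indirect failure."""
    # Without BackboneFast the switch waits out Maxage first;
    # with it, the RLQ reply lets the switch skip that wait.
    wait = 0 if backbonefast else MAX_AGE
    return wait + 2 * FORWARD_DELAY  # listening + learning still apply


print(indirect_failure_delay(backbonefast=False))  # 50
print(indirect_failure_delay(backbonefast=True))   # 30
```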

As an interesting side note, the UplinkFast and BackboneFast features were incorporated into the 802.1w (RSTP) protocol.

Protecting STP


BPDU Guard and BPDU Filter


BPDU Guard is basically a feature to prevent a situation where good intentions can lead to network outages.  For example, just a few more ports may be needed in a meeting room, so someone goes and finds a switch (with no knowledge of how that switch operates or is configured), and attaches it to the network.  Now there is the risk of your access ports receiving superior BPDUs that cause topology changes or worse.  BPDU Guard is enabled per port and protects your access ports by disabling them upon receiving any BPDU (because we don't expect any BPDUs to be received on access ports).  When a port is shut down (err-disabled) by BPDU Guard, it does not recover on its own by default.  The port must be manually re-enabled, or a timeout can be configured after which the port recovers automatically.
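The behavior can be pictured as a tiny state machine.  The Python below is a toy model of my own, not how a switch implements it: the class, its method names, and the 300-second timeout in the usage lines are all assumptions for illustration.

```python
class AccessPort:
    """Toy model of an access port protected by BPDU Guard."""

    def __init__(self, bpdu_guard=True, recovery_timeout=None):
        self.bpdu_guard = bpdu_guard
        self.recovery_timeout = recovery_timeout  # seconds; None means manual recovery only
        self.state = "forwarding"
        self.disabled_at = None

    def receive_bpdu(self, now):
        # Any BPDU at all, superior or not, disables a guarded port.
        if self.bpdu_guard and self.state != "err-disabled":
            self.state = "err-disabled"
            self.disabled_at = now

    def tick(self, now):
        # Automatic recovery, only if a timeout was configured.
        if (self.state == "err-disabled"
                and self.recovery_timeout is not None
                and now - self.disabled_at >= self.recovery_timeout):
            self.reenable()

    def reenable(self):
        # Manual recovery by an administrator (or the timer above).
        self.state = "forwarding"
        self.disabled_at = None


port = AccessPort(recovery_timeout=300)
port.receive_bpdu(now=0)
print(port.state)  # err-disabled
port.tick(now=300)
print(port.state)  # forwarding
```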

BPDU Filter restricts the switch from sending BPDUs out access ports, as these would be unnecessary.  It can be enabled per-interface or globally.  Enabled on an interface, it also makes the port ignore any BPDUs it receives.  When enabling BPDU Filter globally, the following occurs:
  • Filtering takes effect on all operational PortFast ports that do not already have it explicitly configured.
  • Upon startup, the port will transmit ten BPDUs. If any BPDU is received during that time, the port will lose its PortFast status, BPDU Filter will be disabled on it, and the port will revert to sending and receiving BPDUs like any standard STP switch port.
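The global behavior boils down to a single decision at link-up.  Here is a small Python sketch of it; the function name and the dictionary layout are hypothetical and only meant to make the two outcomes explicit.

```python
STARTUP_BPDUS = 10  # BPDUs sent at startup, per the text above


def link_up_with_global_bpdu_filter(bpdu_received_during_startup):
    """Return the port's settings after link-up under global BPDU Filter."""
    if bpdu_received_during_startup:
        # Something that speaks STP is attached: act like a normal STP port.
        return {"bpdus_sent": STARTUP_BPDUS, "portfast": False, "bpdu_filter": False}
    # Nothing answered: keep PortFast and stop sending BPDUs.
    return {"bpdus_sent": STARTUP_BPDUS, "portfast": True, "bpdu_filter": True}


print(link_up_with_global_bpdu_filter(False))
print(link_up_with_global_bpdu_filter(True))
```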

Root Guard


Root Guard is also enabled per port and ignores superior BPDUs that would allow an attached switch to become root.  Upon receipt of a superior BPDU, the port is placed into a root-inconsistent state and stops receiving or forwarding frames until the superior BPDUs cease.  Current design practice is to place this on access ports.  Placing it on inter-switch links (trunks) could result in switch isolation when inter-switch link failures occur.
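What makes a BPDU "superior" is simply a lower bridge ID than the current root's.  The sketch below models that check in Python; the class and the example bridge IDs are invented for illustration, and recovery is reduced to clearing a flag (a real port still works its way back through the normal STP states).

```python
class RootGuardPort:
    """Toy model of a port with Root Guard enabled."""

    def __init__(self, current_root_id):
        self.current_root_id = current_root_id  # (priority, MAC) of the current root
        self.root_inconsistent = False

    def receive_bpdu(self, advertised_root_id):
        # A lower (priority, MAC) pair means the sender claims a better root.
        if advertised_root_id < self.current_root_id:
            self.root_inconsistent = True

    def superior_bpdus_ceased(self):
        # The port is released once the superior BPDUs stop arriving.
        self.root_inconsistent = False


port = RootGuardPort(current_root_id=(4096, "00:00:00:00:00:01"))
port.receive_bpdu((32768, "00:00:00:00:00:02"))  # inferior, ignored
print(port.root_inconsistent)  # False
port.receive_bpdu((0, "00:00:00:00:00:03"))      # superior, port is blocked
print(port.root_inconsistent)  # True
```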

Unidirectional Link Detection (UDLD)


UDLD protects a switch trunk port from causing loops.  It does this by detecting a unidirectional link condition, which can be caused by miscabling, a cut fiber strand, an unplugged fiber, GBIC problems, etc.  Although this is much more likely on fiber connections, it can also occur on copper, and UDLD handles that as well.  UDLD can run in regular or aggressive mode.  In regular mode, Layer 2 messages are used to detect when a switch can no longer receive frames from a neighbor.  The port on the switch whose transmit side didn't fail is placed into an err-disabled state.  In aggressive mode, eight attempts are made to reconnect to the neighbor.  If no reply is received, both sides become err-disabled.
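The aggressive-mode retry logic can be sketched in a couple of lines of Python.  This is a simplified model under my own assumptions: the function name is invented, and each element of the input list stands for one reconnect attempt (True means the neighbor answered).

```python
AGGRESSIVE_RETRIES = 8  # number of reconnect attempts, per the text above


def udld_aggressive(neighbor_replies):
    """Return the port state after the aggressive-mode reconnect attempts."""
    # Only the first eight attempts count; later replies come too late.
    attempts = list(neighbor_replies)[:AGGRESSIVE_RETRIES]
    return "up" if any(attempts) else "err-disabled"


print(udld_aggressive([False] * 8))           # err-disabled
print(udld_aggressive([False] * 7 + [True]))  # up
```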

Loop Guard


Loop Guard prevents a switch trunk port from transitioning from blocking to forwarding when BPDUs stop arriving.  The loss of BPDUs doesn't always mean a broken link - it could be degraded performance.  A port moving to forwarding could cause more damage than the absence of BPDUs itself.  Loop Guard addresses this by placing the port into a loop-inconsistent state rather than allowing it to transition to a forwarding state.
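Put as code, the difference Loop Guard makes is a single branch.  The Python below is a minimal sketch with a made-up function name; it skips the intermediate listening and learning states and only shows where a port ends up.

```python
def state_after_bpdu_loss(port_state, loop_guard):
    """Where a port ends up once BPDUs stop arriving on it."""
    if port_state != "blocking":
        return port_state  # only blocking ports are at risk here
    if loop_guard:
        return "loop-inconsistent"  # stays blocked, no loop can form
    return "forwarding"             # unguarded port eventually forwards


print(state_after_bpdu_loss("blocking", loop_guard=False))  # forwarding
print(state_after_bpdu_loss("blocking", loop_guard=True))   # loop-inconsistent
```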

Below is a picture showing where these features should be placed, in my opinion.

[Figure: Optimizing and Protecting Spanning Tree]

Notice that Root Guard is nowhere to be found.  After research and testing, it is my opinion that Root Guard should not be used unless there is a security requirement for it or a specific set of circumstances exists.  One example is a separate network you have no control over that connects to your network and needs to participate in your STP topology.  Root Guard in this scenario would prevent something in that network from accidentally hijacking your root bridge.
