STP

  1. Spanning tree protocol (STP)
    1. STP types
    2. Timers
    3. Bridge ID (BID)
    4. Bridge protocol data unit (BPDU)
      1. Configuration BPDU
      2. TCN BPDU
    5. BW and cost
    6. Port roles
      1. Root port election
      2. Designated port election
    7. States
  2. Enhancements
    1. PortFast
    2. UplinkFast
    3. BackboneFast
  3. Root Guard
  4. BPDU Guard
  5. BPDU filter
  6. Loop Guard
  7. Unidirectional link detection (UDLD)
  8. FlexLinks
  9. STP problems

Spanning tree protocol (STP)

  • IEEE 802.1D, RSTP – 802.1w (802.1D-2004), MST – 802.1s (802.1Q-2005)
  • MAC addresses
    • src MAC = interface MAC
    • dst MAC
      • STP: 0180.c200.0000 (LLC)
      • PVST+: 0100.0ccc.cccd (LLC+SNAP)
  • enabled for all VLANs by default
  • CoS = 0
  • cost = root path cost (BPDU) + ingress interface cost
(config)# spanning-tree vlan <VLAN>
(config-if)# spanning-tree vlan <VLAN>
; enabled features: *Fast, *Guard
# show spanning-tree summary
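
The cost rule above (root path cost from the BPDU plus the ingress interface cost) can be sketched as a running sum along the path from the root. This is only an illustration: the function names are hypothetical and the port costs 4 and 19 are picked as sample values for 1 Gbps and 100 Mbps links.

```python
def root_path_cost(advertised_cost, ingress_port_cost):
    """Cost a switch computes on one port: cost from the BPDU + ingress port cost."""
    return advertised_cost + ingress_port_cost


def cost_along_path(ingress_costs):
    """Walk away from the root; the root itself advertises cost 0."""
    cost = 0
    for port_cost in ingress_costs:
        cost = root_path_cost(cost, port_cost)
    return cost


# Root -> SW1 over 1 Gbps (cost 4), SW1 -> SW2 over 100 Mbps (cost 19)
print(cost_along_path([4, 19]))  # 23
```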

STP types

  1. common spanning tree (CST)
    • single STP for all VLANs
    • 802.1q
  2. per-VLAN spanning tree (PVST)
    • over ISL only
  3. PVST+
    • supports PVST, PVST+, CST 802.1q
    • tagged BPDU is sent in every VLAN
      • TLV with VLAN number is included – native VLAN mismatch detection
      • untagged BPDU
        • VLAN1: CST, without TLV with VLAN info
        • native VLAN: includes TLV with VLAN info
      • VLAN 1: 2 BPDUs are sent – PVST BPDU (STP VLAN mismatch, discarded) and IEEE BPDU
    • mcast

Timers

  • types
    1. hello
      • 2s default
      • BPDU transmission interval, set by root
      • TCN BPDU retransmit with local hello value
    2. max age
      • 20s default
      • how long to store non-refreshed BPDU from root (max_age – msg_age)
    3. forward delay
      • 15s default
      • duration of listening/learning
  • set and distributed by root via BPDUs
  • defaults are calculated for diameter = 7 – max number of switches from root to leaf
(config)# spanning-tree vlan <LIST> hello-time <sec>
(config)# spanning-tree vlan <LIST> forward-time <sec>
(config)# spanning-tree vlan <LIST> max-age <sec>
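
A rough sketch of how the defaults relate to diameter = 7. The two formulas are the ones commonly quoted in Cisco timer tuning guidance and are not part of these notes, so treat them as an assumption; rounding the forward delay up to a whole second is also an assumption. The function name is hypothetical.

```python
import math


def stp_timers(hello=2, diameter=7):
    """Return (max_age, forward_delay) in seconds for a given hello and diameter."""
    max_age = 4 * hello + 2 * diameter - 2
    forward_delay = math.ceil((4 * hello + 3 * diameter) / 2)
    return max_age, forward_delay


max_age, forward_delay = stp_timers()
print(max_age, forward_delay)          # 20 15
# Worst case before a blocked port forwards: max_age + listening + learning
print(max_age + 2 * forward_delay)     # 50
```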

Bridge ID (BID)

  • the lower the value, the higher the priority
  • traditional 802.1D: MAC is unique
    • priority = 0x8000 by default
    • MAC = internal BIA MAC, not interface
  • 802.1t:
    • 4 bit priority || VLAN ID || MAC
    • MAC does not have to be unique
    • MAC address reduction feature: single MAC can be used for all VLANs (before – MAC pool)
; use 802.1t, enabled by default if switch lacks 1024 MAC for internal use
(config)# spanning-tree extend system-id
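
A minimal sketch of the 802.1t layout (4-bit priority || 12-bit VLAN ID || MAC) packed into 8 bytes. The helper name and the sample MAC addresses are made up for illustration. Comparing the packed bytes gives the "lower is better" ordering.

```python
def make_bid(priority, vlan_id, mac):
    """Pack 4-bit priority || 12-bit VLAN ID || 48-bit MAC into an 8-byte BID."""
    if priority % 4096 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must fit in 12 bits")
    mac_bytes = bytes.fromhex(mac.replace(".", "").replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    # The priority occupies the top 4 bits, so adding the VLAN ID fills the low 12.
    return (priority + vlan_id).to_bytes(2, "big") + mac_bytes


bid_a = make_bid(32768, 10, "0011.2233.4455")
bid_b = make_bid(24576, 10, "00aa.bbcc.ddee")
print(bid_a.hex())                   # 800a001122334455
print(min(bid_a, bid_b) == bid_b)    # True: lower priority value wins despite higher MAC
```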

(config)# spanning-tree vlan <LIST> priority <N>

; macro command – applied once, no change tracking
; primary: 24576 if enough (current root has larger value or same value + higher MAC)
; if not enough – assign priority lower by 4096 than current root (0 cannot be assigned)
; secondary: priority = 28672
; diameter: 7 by default
(config)# spanning-tree vlan <LIST> root primary|secondary [diameter <N>] [hello <sec>]
; BID MAC in use
# show version
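
The "root primary" macro logic described above can be sketched roughly as follows. This only mirrors the rules in these notes and is not the actual IOS implementation; the function name is hypothetical, MACs are passed as integers, and the VLAN ID part of the priority is ignored for simplicity.

```python
def root_primary_priority(root_priority, root_mac, own_mac):
    """Pick the priority that "root primary" would set, or None if it cannot win."""
    # 24576 is enough if it beats the current root on (priority, MAC).
    if (24576, own_mac) < (root_priority, root_mac):
        return 24576
    # Otherwise go 4096 below the current root; 0 cannot be assigned.
    candidate = root_priority - 4096
    return candidate if candidate > 0 else None


print(root_primary_priority(32768, 0x001122334455, 0x00AABBCCDDEE))  # 24576
print(root_primary_priority(24576, 0x001122334455, 0x00AABBCCDDEE))  # 20480
print(root_primary_priority(4096, 0x001122334455, 0x00AABBCCDDEE))   # None
```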

Bridge protocol data unit (BPDU)

  • types
    1. configuration BPDU: sent by root, forwarded by others
    2. topology change notification (TCN): generated locally
  • lower BID is better ⇒ newer switches do not preempt root role
  • access ports: only IEEE BPDUs, otherwise – inconsistent state
  • trunk:
    • IEEE BPDUs are processed by VLAN1 instance
    • PVST BPDU:
      • compare port VLAN ID and tag, if mismatch – PVID_inconsistent
      • processed in VLAN n
      • discarded in VLAN 1

Configuration BPDU

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                                +-+-+-+-+-+-+-+-+
                                                | ProtoID High  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  ProtoID Low  |    Version    |    Msg type   |     Flags     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|_                          Root BID                           _|
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Root path cost                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|_                         Sender BID                          _|
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Port ID            |           Message age         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Max age            |           Hello time          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Forward delay         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Protocol ID: 0x0000
Version: 0x00
Message age: units – 1/256s, if exceeds Max age – do not forward BPDU
Max age: units – 1/256s
Hello time: units – 1/256s
Forward delay: units – 1/256s

Msg type:

  • 0x00 ≡ configuration BPDU
  • 0x80 ≡ TCN BPDU

Flags:

  • 0x01: Topology Change
  • 0x80: Topology Change Ack
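
A small sketch of packing and parsing the 35-byte configuration BPDU body with Python's struct module, following the field order of the diagram above and the 1/256 s timer units. The dictionary keys, function name and sample BIDs are illustrative choices, and the flags byte is returned raw without interpretation.

```python
import struct

# ProtoID(2) Version(1) Type(1) Flags(1) RootBID(8) RootCost(4)
# SenderBID(8) PortID(2) MsgAge(2) MaxAge(2) Hello(2) FwdDelay(2)
CONFIG_BPDU = struct.Struct(">HBBB8sI8sHHHHH")


def parse_config_bpdu(payload):
    """Decode a configuration BPDU; timers are converted from 1/256 s to seconds."""
    (proto, version, msg_type, flags, root_bid, root_cost, sender_bid,
     port_id, msg_age, max_age, hello, fwd_delay) = CONFIG_BPDU.unpack(
        payload[:CONFIG_BPDU.size])
    if proto != 0x0000 or msg_type != 0x00:
        raise ValueError("not a configuration BPDU")
    return {
        "version": version,
        "flags": flags,
        "root_bid": root_bid.hex(),
        "root_path_cost": root_cost,
        "sender_bid": sender_bid.hex(),
        "port_id": port_id,
        "message_age_s": msg_age / 256,
        "max_age_s": max_age / 256,
        "hello_s": hello / 256,
        "forward_delay_s": fwd_delay / 256,
    }


sample = CONFIG_BPDU.pack(
    0x0000, 0x00, 0x00, 0x00,
    bytes.fromhex("8001001122334455"), 4,
    bytes.fromhex("8001aabbccddeeff"), 0x8001,
    1 * 256, 20 * 256, 2 * 256, 15 * 256)
print(len(sample))                  # 35
print(parse_config_bpdu(sample))
```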

TCN BPDU

  • topology change:
    • direct: physical port is down; immediate TCN if root port is up
    • indirect: link is up/up, but no BPDUs are received (e.g., filtered); TCN after max_age
    • insignificant: change on non-trunk; immediate TCN
  • triggers
    • port becomes forwarding + at least 1 designated is available
    • forwarding/learning becomes blocking
    • interface cost change without change of topology – not a topology change event
    • not triggered by ports with PortFast
  • sent via root port at locally configured hello interval, until acknowledgement from designated is received ≡ TCA flag in configuration BPDU
  • TCN received:
    • acknowledge to downstream, forward to upstream
    • if root, start sending configuration BPDU with Topology Change flag (TC)
  • TC flag:
    • reduce aging CAM to forward delay: 300s → 15s by default
    • causes stale info to be cleared earlier: reduce trash load
    • transmitted for max_age + forward_delay; when TC is cleared – aging = 300s
  • if PCs connect and disconnect often, all switches are constantly erasing CAM
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Protocol ID         |    Version    |    Msg type   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Protocol ID: 0x0000
Version: 0x00

Msg type:

  • 0x00 ≡ configuration BPDU
  • 0x80 ≡ TCN BPDU
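
As an illustration, the 4-byte TCN BPDU and a message type check can be written with struct. The function name is hypothetical; only the first four bytes (Protocol ID, Version, Msg type) are inspected.

```python
import struct

# Protocol ID 0x0000, Version 0x00, Msg type 0x80
TCN_BPDU = struct.pack(">HBB", 0x0000, 0x00, 0x80)


def bpdu_kind(payload):
    """Classify a BPDU by its message type field."""
    proto, _version, msg_type = struct.unpack(">HBB", payload[:4])
    if proto != 0x0000:
        return "unknown"
    return {0x00: "configuration", 0x80: "tcn"}.get(msg_type, "unknown")


print(TCN_BPDU.hex())        # 00000080
print(bpdu_kind(TCN_BPDU))   # tcn
```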

BW and cost

  • old STP: 2 bytes (short method), linear scale
  • new STP: 2 bytes (short method), logarithmic scale
  • long STP: 4 bytes (long method), linear scale
Speed      Old STP cost   New STP cost   Long STP cost
10 Mbps    100            100            2000000
100 Mbps   10             19             200000
1 Gbps     1              4              20000
10 Gbps    0              2              2000
(config)# spanning-tree pathcost method long
(config-if)# spanning-tree [vlan <LIST>] cost <N>

; multiple of 64
(config-if)# spanning-tree [vlan <LIST>] port-priority <N>
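
A sketch of a default port cost lookup. The short values are copied from the "new STP cost" column of the table above; for the long method, dividing 20,000,000 by the speed in Mbps reproduces the long column, which is assumed here to be the general rule. Function and variable names are hypothetical.

```python
# "New STP cost" column, keyed by speed in Mbps
SHORT_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}


def long_cost(mbps):
    """Long (4-byte) cost: linear in 1/bandwidth."""
    return 20_000_000 // mbps


def port_cost(mbps, method="short"):
    """Default cost of a port for the chosen pathcost method."""
    return long_cost(mbps) if method == "long" else SHORT_COST[mbps]


for speed in (10, 100, 1000, 10000):
    print(speed, port_cost(speed), port_cost(speed, "long"))
```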

Port roles

  1. root port
  2. designated port
  3. blocking port
  4. alternate port (UplinkFast)
  5. forwarding (no STP activity)

Root port election

  1. min root path cost (received cost + ingress port cost)
  2. min sender BID
  3. min sender port ID
  4. min receiver port ID
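
The tie-breakers above map naturally onto a tuple comparison. A minimal sketch, with hypothetical field and port names and BIDs written as integers:

```python
from collections import namedtuple

# Fields are ordered exactly like the tie-breakers; local_port is just a label.
Candidate = namedtuple(
    "Candidate", "root_cost sender_bid sender_port_id local_port_id local_port")


def elect_root_port(candidates):
    """Return the local port whose (cost, BID, sender port, local port) is lowest."""
    return min(candidates, key=lambda c: c[:4]).local_port


uplinks = [
    Candidate(23, 0x8001001122334455, 0x8002, 0x8001, "Gi0/1"),
    Candidate(23, 0x8001001122334455, 0x8001, 0x8002, "Gi0/2"),
]
# Cost and sender BID tie, so the lower sender port ID decides.
print(elect_root_port(uplinks))  # Gi0/2
```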

Designated port election

  • min root cost
  • min BID
  • min receiver port ID
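
On a shared segment the same idea picks the designated port: every attached switch offers a (root cost, BID, port ID) tuple and the lowest one wins. A sketch with made-up switch names and values:

```python
def elect_designated(offers):
    """offers maps a port label to (root_cost, bid, port_id); the lowest tuple wins."""
    return min(offers, key=offers.get)


segment = {
    "SW1 Gi0/1": (19, 0x8001001122334455, 0x8001),
    "SW2 Gi0/1": (19, 0x800100AABBCCDDEE, 0x8001),
}
# Equal root cost, so the lower BID (SW1) becomes designated for the segment.
print(elect_designated(segment))  # SW1 Gi0/1
```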

States

  • when first enabled – listening
State        Data transmit   Learn MAC   BPDU action      Duration
Blocking     -               -           receive only     Infinite
Listening    -               -           receive & send   Forward delay
Learning     -               +           receive & send   Forward delay
Forwarding   +               +           receive & send   Infinite
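
A back-of-the-envelope sketch of how long a port needs to reach forwarding with the default timers: listening and learning each last one forward delay, and an indirect failure first waits for max_age. The function name is hypothetical.

```python
def time_to_forwarding(forward_delay=15, max_age=20, indirect_failure=False):
    """Seconds until a port forwards: optional max_age wait + listening + learning."""
    wait = max_age if indirect_failure else 0
    return wait + 2 * forward_delay


print(time_to_forwarding())                       # 30
print(time_to_forwarding(indirect_failure=True))  # 50
```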

Enhancements

PortFast

  • remove listening/learning state for access port
  • accepts and sends BPDU
    • if BPDU is received, PortFast status is lost
  • change of port state does not trigger TCN
(config)# spanning-tree portfast default
(config-if)# spanning-tree portfast [disable|trunk]

; macro, enables PortFast
(config-if)# switchport host

; NX-OS
(config-if)# spanning-tree port type edge
# show spanning-tree interface <INTF> portfast

UplinkFast

  • leaf switch only, must not be used on root
  • fast switchover to other upstream port into forwarding, when root port fails
  • when root port is restored
    • not switched back to FWD immediately, because upstream can still be blocked
    • waits for max_age + forward_delay: new root – BLK, old root – FWD; after expiry – immediate switch to FWD and BLK
  • tracks all paths to root (links, receiving BPDUs), selects link with lowest root path cost
  • does not send TCN
  • enabled for all VLANs
    • priority = 49152
    • cost for all ports is increased by 3000 (short) or 10M (long)
  • dummy mcast:
    • LLC, ARP, AppleTalk, IPX
    • destination: 0100.0ccd.cdcd
    • source: all CAM entries via edge or designated ports
    • purges CAM entries via root port
    • sent on switchovers
; dummy frames transmission rate
(config)# spanning-tree uplinkfast [max-update-rate <pps>]
# show spanning-tree uplinkfast

BackboneFast

  • protection from indirect link failure, accelerates convergence
  • trigger: inferior BPDU is received via root or blocked port from same sender BID
    • new switch in segment with new BID – not a trigger
    • replaces waiting for superior BPDU to expire while ignoring inferior BPDU
  • alternate to root:
    • inferior BPDU is received on BLK: root, blocked
    • inferior BPDU is received on root: blocked
    • inferior BPDU is received on root and no blocked are available: recompute STP ≡ consider itself root
    • excludes port on which inferior BPDU is received
  • Root Link Query (RLQ):
    • algorithm
      1. if alternate to root ports are available, send RLQ request through them
      2. if RLQ request is received
        • if root itself – reply on designated ports
        • if root is different – reply on designated port
        • otherwise distribute request further via root port
      3. RLQ reply is received with own root: STP listening on ports with different root
      4. if RLQ reply is received on all requesting ports, none have own root → STP listening on all ports
    • not sent via port that received inferior BPDU
    • RLQ reply is flooded from root via designated ports
      • not flooded by switch that initiated RLQ (its BID is in RLQ messages)
    • configuration BPDU format
      • different SNAP EtherType:
        • 0x0108 ≡ RLQ request
        • 0x0109 ≡ RLQ reply
    • BLK ports with positive RLQ response switch to LIS naturally when the root port receives a negative RLQ reply (thus invalidating its previous BPDU)
  • resets max_age timer for invalid BPDUs, forward_delay still active
  • disabled by default
    • must be enabled in the whole network because of RLQ responses
(config)# spanning-tree backbonefast
# show spanning-tree backbonefast
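
Step 2 of the RLQ algorithm above (what a switch does with a received RLQ request) can be sketched as a small decision function. This follows the notes only, not any vendor implementation; the names and the integer BIDs are hypothetical.

```python
def handle_rlq_request(own_bid, own_root_bid, queried_root_bid):
    """Decide how to treat an RLQ request that asks about queried_root_bid."""
    if own_bid == own_root_bid:
        return "reply"                    # this switch is the root itself
    if own_root_bid != queried_root_bid:
        return "reply"                    # this switch knows a different root
    return "forward_via_root_port"        # same root: pass the query upstream


print(handle_rlq_request(own_bid=1, own_root_bid=1, queried_root_bid=1))
print(handle_rlq_request(own_bid=2, own_root_bid=1, queried_root_bid=3))
print(handle_rlq_request(own_bid=2, own_root_bid=1, queried_root_bid=1))
```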

Root Guard

  • protection against superior BPDUs on designated
    • if received, interface – root inconsistent ≡ LIS (blocked for data, processes BPDUs)
    • if superior BPDUs cease, port goes through common states: LIS → LRN → FWD
  • forces port to designated, cannot be root
  • disabled by default
  • all VLANs at once
  • incompatible with LoopGuard
(config-if)# spanning-tree guard root
# show spanning-tree inconsistentports

BPDU Guard

  • if BPDU is received – err-disable
  • disabled by default
  • not enabled by default along with PortFast
  • lower priority than BPDU filter
; enable along with PortFast
(config)# spanning-tree portfast bpduguard default
; unconditional enable
(config-if)# spanning-tree bpduguard enable

BPDU filter

  • does not send or receive BPDU
  • disabled by default
  • priority over BPDU Guard
  • useful for connecting independent sites, when loop between them is impossible
  • isolates topology changes effect
  • PortFast
    • sends 11 BPDUs after startup, if receives BPDU → normal STP
    • does not send BPDUs later
; enable along with PortFast
(config)# spanning-tree portfast bpdufilter default
; unconditional, filters ingress and egress BPDUs ≡ disable STP
(config-if)# spanning-tree bpdufilter enable|disable
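
A simplified sketch of how the interface-level features interact when a BPDU arrives, based only on the rules in these notes (BPDU filter takes priority over BPDU Guard, BPDU Guard err-disables, PortFast status is lost on a received BPDU). The globally enabled "default" variants are deliberately not modelled; the function name and return strings are hypothetical.

```python
def on_bpdu_received(bpdufilter=False, bpduguard=False, portfast=False):
    """Return what happens to a port with interface-level settings on BPDU receipt."""
    if bpdufilter:
        return "drop"                      # filter wins: BPDU Guard never sees it
    if bpduguard:
        return "err-disable"
    if portfast:
        return "process, PortFast status lost"
    return "process"


print(on_bpdu_received(bpdufilter=True, bpduguard=True))  # drop
print(on_bpdu_received(bpduguard=True, portfast=True))    # err-disable
print(on_bpdu_received(portfast=True))                    # process, PortFast status lost
```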

Loop Guard

  • tracks BPDU on non-designated ports
  • does not allow becoming designated after max_age on BPDU loss
    • loop inconsistent ≡ blocking
    • if BPDUs appear again – listening
    • prevents both switches from becoming designated for same segment
    • protection from software failure
  • Etherchannel:
    • BPDU is sent over first link, if it becomes unidirectional – whole bundle is disabled
    • bundling loop-inconsistent into port-channel clears inconsistency ≡ can cause a loop
  • disabled by default
  • P2P only; otherwise hosts, connected via hub to switches, might lose connectivity
  • per VLAN
  • incompatible with Root Guard
; enable on all P2P links
(config)# spanning-tree loopguard default
(config-if)# spanning-tree guard loop

Unidirectional link detection (UDLD)

  • Cisco proprietary
  • unidirectional link (UDL): appears bidirectional, but does not pass info in one direction (fiber)
    • explicit
      • UDLD message without own switch/port ID is received: UDL or frames from other switch
      • UDLD with own switch/port ID originator: self-loop
      • single peer is discovered, but UDLD list contains more: NBMA
    • implicit
      • 3 UDLD messages are lost in a row
  • messages:
    • contain switch/port originator ID
    • contain list of peer switch/port, discovered in segment
  • modes
    • normal
      • err-disable on explicit UDL
      • syslog on implicit UDL
      • default
    • aggressive
      • err-disable on both explicit and implicit UDL
      • before disabling the link, sends 1 UDLD message per second for 8s (8 messages in total)
  • destination MAC = 0100.0ccc.cccc
  • 15s timeout per message
    • total timeout must be lower than max_age + 2×forward_delay
    • tracking starts after receiving first UDLD reply
    • before receiving first UDLD reply – infinite peer poll ≡ link up/up
  • 2 UDLD processes per link (one per switch)
  • disabled globally by default
  • protection against miswiring
    • usually fiber
    • not required for copper
  • per port
  • disables only UDL in port-channel, not the whole bundle
  • newer HW: UDL discovered → port down
    • FastEthernet: far end fault indication (FEFI)
    • GigabitEthernet: link negotiation
; on fiber ports only, enable ≡ normal
(config)# udld enable|aggressive

; 15s default
(config)# udld message time <sec>
; default depends on platform and linktype (copper/fiber)
(config-if)# udld enable|aggressive|disable
; clear err-disabled status on all ports 
# udld reset

# show udld neighbors
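
The mode table above (normal vs aggressive, explicit vs implicit detection) and the timing constraint can be sketched as follows. Treating the "total timeout" as message time × 3 lost messages is an assumption made for this example, and the names are hypothetical.

```python
def udld_action(mode, detection):
    """Map UDLD mode and detection type to the resulting action."""
    if mode not in ("normal", "aggressive"):
        raise ValueError("mode must be 'normal' or 'aggressive'")
    if detection == "explicit":
        return "err-disable"
    if detection == "implicit":
        return "err-disable" if mode == "aggressive" else "syslog"
    raise ValueError("detection must be 'explicit' or 'implicit'")


def detection_fits_stp(message_time=15, lost_messages=3, max_age=20, forward_delay=15):
    """UDLD should react before STP can move a blocked port to forwarding."""
    return message_time * lost_messages < max_age + 2 * forward_delay


print(udld_action("normal", "implicit"))      # syslog
print(udld_action("aggressive", "implicit"))  # err-disable
print(detection_fits_stp())                   # True: 45 s < 50 s
```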

FlexLinks

  • L2, alternative to STP (disables STP)
  • local significance
  • access layer
  • Cisco proprietary
  • pair of 2 links:
    • master and backup
    • may have different types
    • no preemption by default
    • switchover:
      • send dummy to 0100.0ccd.cdcd, src = MAC from CAM via other ports
      • CAM entries are transferred from master to backup
; forced – master always preempts, bandwidth – master ≡ link with max BW
(config-if)# switchport backup interface <INTF> [preemption mode forced|bandwidth|off]

STP problems

  • BPDU loss on active link may cause temporary loop
  • reasons
    • duplex mismatch
    • unidirectional link
    • frame corruption: bad cable, excessive length
    • resource error: CPU overload
    • PortFast config error