Port-channel

  1. Etherchannel
  2. Hash-based load-balancing
  3. Link Aggregation Control Protocol (LACP)
    1. LACP negotiation
    2. LACP frame
    3. TLV
      1. Actor/Partner TLV
      2. Collector TLV
  4. Port Aggregation Protocol (PAgP)
    1. PAgP silent
  5. Switch stack
  6. Design

Etherchannel

  • must match: speed, duplex
    • access: same VLAN
    • trunk: trunk mode, PortFast mode, STP port priority, native VLAN, permitted VLANs
      • STP cost: must match in NX-OS, can be different in IOS
  • not supported: SPAN dst port, port security
  • STP guard: err-disable if an STP BPDU is detected on a non-default port of a static port-channel
; enabled by default; err-disable if a port in mode on is connected to an LACP/PAgP port
(config)# spanning-tree etherchannel guard misconfig
; limit allowed aggregation protocols
(config-if)# channel-protocol pagp|lacp

; if fewer than N links are active, keep the port-channel down; entered in port-channel config mode
(config-if)# port-channel min-links <N>

; place standalone links in suspended state; enabled by default
(config-if)# port-channel standalone-disable
; IOS
# show etherchannel

; NX-OS
# show port-channel
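
The "must match" rules above can be sketched as a small pre-check. This is only an illustration: the `MemberPort` model and its field names are hypothetical, it covers a subset of the listed attributes, and real platforms check more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberPort:
    # Hypothetical model of a candidate member interface.
    speed_mbps: int
    duplex: str
    switchport_mode: str                  # "access" or "trunk"
    access_vlan: int = 1
    native_vlan: int = 1
    allowed_vlans: frozenset = frozenset()


def mismatches(a: MemberPort, b: MemberPort) -> list[str]:
    """Return the attribute names that differ and would block bundling."""
    always = ["speed_mbps", "duplex", "switchport_mode"]
    if a.switchport_mode == "access":
        per_mode = ["access_vlan"]
    else:
        per_mode = ["native_vlan", "allowed_vlans"]
    return [n for n in always + per_mode if getattr(a, n) != getattr(b, n)]


p1 = MemberPort(1000, "full", "trunk", native_vlan=1, allowed_vlans=frozenset({10, 20}))
p2 = MemberPort(1000, "full", "trunk", native_vlan=99, allowed_vlans=frozenset({10, 20}))
print(mismatches(p1, p2))
```

An empty list means the two ports agree on every attribute the sketch knows about.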

Hash-based load-balancing

  • methods
    • IP src, IP dst, IP src + dst
    • MAC src, MAC dst, MAC src + dst
    • UDP/TCP port
  • “+” ≡ XOR
  • if current method is not suitable → fallback to next lower method
  • if link fails, its hash is served by next link
  • unicast, mcast and bcast egress traffic is balanced
; common default – src-mac
(config)# port-channel load-balance <METHOD>
# show etherchannel load-balance

; interface load, useful for load-balancing tuning
# show etherchannel port-channel

; default port used for control-plane traffic
# show etherchannel summary

; reason for suspended status
# show etherchannel detail
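
A toy version of the src + dst (XOR) method, to show why both directions of a flow get the same hash and how a hash maps onto member links. The 8 buckets and the modulo remap onto surviving links are assumptions made for illustration; the actual hardware algorithm is platform-specific.

```python
import ipaddress


def hash_bucket(src: str, dst: str, buckets: int = 8) -> int:
    """XOR the two addresses and keep the low bits (assumed: 8 buckets)."""
    s = int(ipaddress.ip_address(src))
    d = int(ipaddress.ip_address(dst))
    return (s ^ d) % buckets


def egress_link(src: str, dst: str, links: list[str]) -> str:
    """Map the bucket onto whichever links are currently active."""
    return links[hash_bucket(src, dst) % len(links)]


links = ["Gi1/0/1", "Gi1/0/2"]
print(hash_bucket("10.0.0.1", "10.0.0.2"))
print(egress_link("10.0.0.1", "10.0.0.2", links))
```

Because XOR is symmetric, swapping source and destination gives the same bucket. With a single source and a single destination every packet lands on one link, whatever the bundle size.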

Link Aggregation Control Protocol (LACP)

  • IEEE 802.3ad (new – 802.1AX)
  • Ethertype = 0x8809 (Slow protocols), subtype = 0x01
  • destination MAC = 0180.c200.0002
  • LACPDUs are sent every 30s (long timeout) or every 1s (short timeout)
  • system priority
    • 2 byte priority + MAC
    • used for master election; the master selects the active links
  • port priority
    • 2 byte priority + 2 byte port number
    • lower value ≡ higher priority
    • always preempted
  • up to 16 links: 8 active + 8 standby
  • does not support half-duplex (→ suspended)
  • if bundle is not aggregated, links are active individually
; default: 0x8000
(config)# lacp system-priority <N>

; NX-OS
(config)# feature lacp
(config-if)# channel-group <N> mode active|passive

; default: 0x8000
(config-if)# lacp port-priority <N>

; initialize VLANs first, then send LACPDU; reduces convergence and down time
; ≈ PortFast ⇒ should not be enabled towards switch
(config-if)# lacp vpc-convergence

; send S:1 and C:1 immediately
; for non-compliant devices that send traffic after receiving only S:1 and C:0
(config-if)# no lacp graceful-convergence
; no hostname
# show lacp neighbors
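
A minimal sketch of how the priority values above are compared: priority and MAC (or priority and port number) are concatenated and the lower result wins. Function names are hypothetical and the 8-link limit is taken from the notes.

```python
def system_id(priority: int, mac: str) -> bytes:
    """2-byte priority followed by the 6-byte MAC, compared as one value."""
    raw_mac = bytes.fromhex(mac.replace(".", "").replace(":", ""))
    return priority.to_bytes(2, "big") + raw_mac


def port_id(priority: int, number: int) -> bytes:
    """2-byte port priority followed by the 2-byte port number."""
    return priority.to_bytes(2, "big") + number.to_bytes(2, "big")


def select_links(ports: list[tuple[int, int]], max_active: int = 8):
    """Split (priority, number) pairs into active and standby lists."""
    ordered = sorted(ports, key=lambda p: port_id(*p))
    return ordered[:max_active], ordered[max_active:]


# Lower system ID wins the election: priority is compared before the MAC.
a = system_id(0x8000, "0011.2233.4455")
b = system_id(100, "ffff.ffff.ffff")
print("master:", "B" if b < a else "A")

ports = [(0x8000, n) for n in range(1, 10)] + [(100, 10)]
active, standby = select_links(ports)
print(active[0], standby)
```

With default priorities on every port, the lowest port numbers become active; lowering one port's priority value moves it to the front.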

LACP negotiation

A                         B
        S:1 C:0 D:0
    ------------------>
        S:1 C:0 D:0
    <------------------
        S:1 C:1 D:0
    ------------------>
        S:1 C:1 D:0
    <------------------
        S:1 C:1 D:1
    ------------------>
        S:1 C:1 D:1
    <------------------

LACP frame

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Destination MAC                        |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                          Source MAC                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           EtherType           | SlowProto type|    Version    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                              TLV                              /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                       Reserved (50 bytes)                     /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              FCS                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

TLV

  • 1 byte type, 1 byte length
  • types
    1. Actor information (length = 20 bytes)
    2. Partner information (length = 20 bytes)
    3. Collector information (length = 16 bytes)
    4. Terminator (type = 0, length = 0 bytes)
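
The frame and TLV layout can be exercised with a short parser. This sketch builds a zero-filled LACPDU body (everything after the EtherType) and walks its TLVs; it assumes the length field counts the two header bytes, which is consistent with the 20/20/16 lengths listed above. Function names are made up for the example.

```python
import struct


def build_sample_lacpdu() -> bytes:
    """Assemble a zero-filled LACPDU body for testing the walker."""
    header = struct.pack("!BB", 0x01, 0x01)        # subtype = LACP, version = 1
    actor = struct.pack("!BB", 1, 20) + bytes(18)
    partner = struct.pack("!BB", 2, 20) + bytes(18)
    collector = struct.pack("!BB", 3, 16) + bytes(14)
    terminator = struct.pack("!BB", 0, 0)
    return header + actor + partner + collector + terminator + bytes(50)


def walk_tlvs(body: bytes) -> list[tuple[int, bytes]]:
    """Return (type, value) pairs up to, but not including, the terminator."""
    subtype, _version = struct.unpack_from("!BB", body, 0)
    if subtype != 0x01:
        raise ValueError("not an LACPDU")
    offset, tlvs = 2, []
    while offset + 2 <= len(body):
        tlv_type, length = struct.unpack_from("!BB", body, offset)
        if tlv_type == 0:                          # terminator TLV
            break
        if length < 2:
            raise ValueError("malformed TLV length")
        tlvs.append((tlv_type, body[offset + 2:offset + length]))
        offset += length
    return tlvs


body = build_sample_lacpdu()
print(len(body), [(t, len(v)) for t, v in walk_tlvs(body)])
```

The value slices are 18, 18 and 14 bytes, i.e. the listed lengths minus the type and length bytes.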

Actor/Partner TLV

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        System priority        |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                           System ID                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Key              |         Port priority         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Port             |     State     |    Reserved   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Reserved           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

State:

  • 0x01:
    • LACP activity
    • 1 ≡ active, 0 ≡ passive
  • 0x02:
    • timeout
    • 1 ≡ short, 0 ≡ long
  • 0x04:
    • aggregation
    • 0 ≡ individual, 1 ≡ aggregatable
  • 0x08:
    • synchronization
    • 0 ≡ out of sync (parameter mismatch or during convergence), 1 ≡ in sync
  • 0x10:
    • collecting
    • 0 ≡ not ready for traffic, 1 ≡ can accept traffic
  • 0x20:
    • distributing
    • 0 ≡ not ready to send traffic, 1 ≡ can send traffic
  • 0x40:
    • defaulted
    • not used in FSM, diagnostics only
    • 0 ≡ peer info from LACPDU is used, 1 ≡ peer info from own config is used
  • 0x80:
    • expired
    • not used in FSM, diagnostics only
    • 0 ≡ timer refreshed
    • 1 ≡ timer expired once, next state – defaulted, then tear down the connection
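
The bit list above translates directly into a decoder, which is handy when reading a state byte from a packet capture. A minimal sketch; the flag names are shortened versions of the bullets, not any official naming.

```python
STATE_BITS = {
    0x01: "activity",
    0x02: "timeout",
    0x04: "aggregation",
    0x08: "synchronization",
    0x10: "collecting",
    0x20: "distributing",
    0x40: "defaulted",
    0x80: "expired",
}


def decode_state(state: int) -> list[str]:
    """List the names of all bits set in the 1-byte State field."""
    return [name for bit, name in STATE_BITS.items() if state & bit]


# Assuming an active, aggregatable port: S:1 C:0 D:0 is 0x0D, S:1 C:1 D:1 is 0x3D.
print(decode_state(0x0D))
print(decode_state(0x3D))
```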

Key:

  • port-channel ID
  • can identify whether peer links are in the same group (port-channel)
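
A sketch of unpacking the 18 value bytes drawn in the Actor/Partner diagram, and of using system ID plus key to tell which local ports face the same remote port-channel. `PeerInfo`, `parse_info` and `group_by_peer` are hypothetical names; the sample values are invented.

```python
import struct
from collections import defaultdict
from typing import NamedTuple


class PeerInfo(NamedTuple):
    system_priority: int
    system_mac: bytes
    key: int
    port_priority: int
    port: int
    state: int


def parse_info(value: bytes) -> PeerInfo:
    """Unpack an Actor/Partner TLV value; the 3 reserved bytes are skipped."""
    return PeerInfo(*struct.unpack("!H6sHHHB3x", value))


def group_by_peer(infos: list[PeerInfo]) -> dict:
    """Group port numbers by (system priority, system MAC, key)."""
    groups = defaultdict(list)
    for info in infos:
        groups[(info.system_priority, info.system_mac, info.key)].append(info.port)
    return dict(groups)


mac = bytes.fromhex("001122334455")
raw = [struct.pack("!H6sHHHB3x", 0x8000, mac, key, 0x8000, port, 0x3D)
       for key, port in [(10, 1), (10, 2), (20, 3)]]
print(group_by_peer([parse_info(v) for v in raw]))
```

Ports 1 and 2 share a key and end up in one group; port 3 reports a different key and so belongs to another port-channel on the peer.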

Collector TLV

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Collector Max delay      |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|_                           Reserved                          _|
|                                                               |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Max delay: units – 10 µs, max delay on port-channel
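
A small conversion sketch for the Max delay field, assuming the 14-byte value layout drawn above (2-byte delay, 12 reserved bytes). The function name is hypothetical.

```python
import struct


def collector_max_delay_us(value: bytes) -> int:
    """Convert a Collector TLV value to microseconds (field unit: 10 µs)."""
    (raw,) = struct.unpack("!H12x", value)
    return raw * 10


print(collector_max_delay_us(struct.pack("!H12x", 5)))       # 5 units of 10 µs
print(collector_max_delay_us(struct.pack("!H12x", 0xFFFF)))  # largest encodable value
```

The 2-byte field therefore tops out at 655350 µs, roughly 0.66 s.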

Port Aggregation Protocol (PAgP)

  • timers:
    • normal: Hellos are sent every 30s
    • fast: Hellos are sent every 1s
  • up to 8 links
  • if bundle is not aggregated, links are active individually
  • can show peer (≈ CDP)
  • not supported by NX-OS
(config-if)# channel-group <N> mode auto|desirable
(config-if)# pagp timer normal|fast
; hostname is included
# show pagp neighbors

PAgP silent

  • used when peer device does not send traffic
    • if no traffic during 15s, add link to bundle
    • peer device is typically not PAgP-capable (e.g. file server, traffic analyzer)
  • prevents STP down status for the link
; if link transmits nothing, does not add it to bundle (UDL protection)
(config-if)# channel-group <N> mode auto|desirable non-silent
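
The difference between silent and non-silent can be written as a single decision. This is a simplified sketch: the 15s value comes from the notes above, the function name is made up, and "PAgP heard" stands in for a negotiation that would otherwise succeed.

```python
SILENT_TIMEOUT_S = 15  # taken from the notes above


def bundle_link(pagp_heard: bool, waited_s: float, silent: bool = True) -> bool:
    """Decide whether a link joins the bundle (simplified model)."""
    if pagp_heard:
        return True
    # Nothing heard from the peer: only silent mode bundles, after the timeout.
    return silent and waited_s >= SILENT_TIMEOUT_S


print(bundle_link(False, 15))                 # silent: bundled after 15s
print(bundle_link(False, 15, silent=False))   # non-silent: stays out of the bundle
```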

Switch stack

  • if slave fails, master rewrites corresponding ASIC entries to other slave
  • if master fails, it has to be reelected
    • master should not be part of port-channel, otherwise double delay: master election + ASIC update
    • frames coming to master are dropped, because ASICs cannot be rewritten before election

Design

  • L2:
    • use LACP/PAgP instead of static config
      • protection against miswiring/misconfig
      • STP BPDUs are sent over the default port; if they are lost in port-channel static mode, the result is an STP loop
    • preventive port-channel creation
      • even with single member
      • expansion is not disruptive (creation is disruptive)
  • L3:
    • use static mode: faster convergence, no STP
  • negotiation delay – 7s
  • L4 CEF to avoid polarization
  • IGP ECMP is easier to troubleshoot
    • OSPF: port-channel change (member is down) ≡ SPF
    • EIGRP: lowest BW is used ⇒ port-channel might be insignificant
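
Polarization is easiest to see with a toy two-tier example: if both tiers hash on the same fields, every flow that reaches a tier-2 device already has the same hash result, so that device uses only one of its links. Adding L4 ports at the second tier breaks the correlation. The hash functions and the flow set are contrived for illustration and are not the real CEF algorithm.

```python
from collections import Counter


def l3_hash(flow: tuple, links: int) -> int:
    """Toy hash on source and destination IP only."""
    src, dst, _sport, _dport = flow
    return (src ^ dst) % links


def l4_hash(flow: tuple, links: int) -> int:
    """Toy hash that also mixes in the L4 ports."""
    src, dst, sport, dport = flow
    return (src ^ dst ^ sport ^ dport) % links


# One client, 16 servers, slowly varying source ports (invented values).
flows = [(0x0A000001, 0x0A000100 + i, 40000 + i // 2, 443) for i in range(16)]

# Tier 1 has two uplinks; look at the flows it sends to the device behind link 0.
tier2_input = [f for f in flows if l3_hash(f, 2) == 0]

polarized = Counter(l3_hash(f, 2) for f in tier2_input)
spread = Counter(l4_hash(f, 2) for f in tier2_input)
print(dict(polarized), dict(spread))
```

With the same L3 hash at tier 2, all 8 flows pick link 0; with the L4 hash they split 4 and 4.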