PIM

  1. PIM
    1. IOS CLI
    2. NX-OS CLI
  2. Prune override
  3. Assert
  4. DR
    1. PIMv2 header
    2. PIMv2 Hello
    3. PIMv2 Encoded Group Address
    4. PIMv2 Encoded Source Address
    5. PIMv2 Encoded Unicast Address
    6. PIMv2 Join/Prune/Graft/Graft-Ack
    7. PIMv2 Assert
  5. Mroute flags
  6. PIM DM
  7. PIM SM
    1. PIMv2 Register
    2. PIMv2 Register-Stop
  8. PIM SSM
  9. PIM BD
    1. DF election
    2. Phantom RP
    3. PIM-BD header
    4. PIM-BD Backoff
    5. PIM-BD Pass
  10. Bootstrap
    1. PIMv2 BSM
    2. PIMv2 Candidate-RP-Advertisement
  11. Auto-RP
  12. HSRP-aware PIM
  13. Anycast-RP PIM
  14. IPv6
    1. PIMv6
    2. Embedded RP
  15. VIF

PIM

  • protocol independent multicast
  • uses uRIB for RPF
  • messages:
    1. hello:
      • establishes adjacency, DR election
      • IP 103, 224.0.0.13
      • holdtime value
    2. join ≡ prune:
      • prune override: if prune is received on RPF interface and there are active clients – send join upstream, resetting prune on upstream
      • 224.0.0.13
    3. assert:
      • when several routers send traffic to segment
      • includes AD and metric
    4. bootstrap: sent out through all PIM-enabled ports
  • timers:
    1. hello: 30s default, does not have to match for neighbourship
    2. holdtime: 3.5 * hello by default
    3. prune override: 3s default, delay before stopping sending mcast traffic after prune received
  • RPF interface is recalculated on RIB change
  • DR:
    1. highest priority → highest IP address
    2. v1: sends query, others do not; IGMPv1 has no querier election (= DR in PIM)
    3. preemptive; DR is considered down only when the neighbourship is lost, no separate timeout
    4. IGMP querier ≠ DR (except for IGMPv1)
    5. sends Join and Register, may not be in datapath
    6. no effect on PIM neighbour
  • v1 uses Query instead of Hello: IP2, 224.0.0.2
  • Prune override, Assert and DR apply to multi-access (LAN) segments only
  • Assert:
    1. RPT-bit = 0 > RPT-bit = 1 ((S,G) state beats (*,G))
    2. lower AD wins
    3. lower metric wins
    4. highest IP
    • winner sends traffic to VLAN on behalf of all PIM neighbors (also for IGMP clients as a result)
  • Assert – traffic downstream, on-demand election
  • DR – control plane traffic upstream (election based on Hello); election is always run

IOS CLI

(config)# ip pim neighbor-filter <ACL>
; query interval in v1, hello interval in v2
(config-if)# ip pim query-interval <sec>

; permits mcast, does not establish PIM adjacencies
(config-if)# ip pim passive
; info about mcast routers in a segment
# mrinfo

; uses IGMP Request/Response, unicast
# mtrace <RP> <group>

; builds topology per hop, collects statistics per hop (if clocks are synced); negative loss ≡ extra packets received ≡ loop
# mstat <src> <dst> <group>
# show ip pim interface <intf>
# show ip pim neighbor

NX-OS CLI

(config)# ip pim bfd

Prune override

Assert

Trigger for Assert – receiving an mcast (S,G) packet on an interface that is already in the router's own OIL for that (S,G)
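
The Assert tie-break order from the overview (RPT-bit, then AD, then metric, then highest IP) can be written as a sort key. This is only a sketch: the `AssertInfo` tuple and its field names are made up for illustration and do not come from any implementation.

```python
from ipaddress import IPv4Address
from typing import NamedTuple

class AssertInfo(NamedTuple):
    # Hypothetical container for the values compared during Assert
    rpt_bit: int      # 0 = (S,G) state, 1 = (*,G) state
    ad: int           # administrative distance of the route to the source
    metric: int       # unicast metric to the source
    ip: IPv4Address   # sender's address on the shared segment

def assert_key(a: AssertInfo):
    # Smaller tuple wins: RPT-bit 0 beats 1, then lower AD, then lower metric.
    # The IP is negated so that the highest address sorts first.
    return (a.rpt_bit, a.ad, a.metric, -int(a.ip))

def assert_winner(candidates):
    return min(candidates, key=assert_key)

r1 = AssertInfo(0, 110, 20, IPv4Address("10.0.0.1"))
r2 = AssertInfo(0, 110, 20, IPv4Address("10.0.0.2"))
r3 = AssertInfo(1, 90, 5, IPv4Address("10.0.0.3"))
print(assert_winner([r1, r2, r3]).ip)  # 10.0.0.2
```

r3 loses despite the better AD because its RPT-bit is set; r1 and r2 tie on everything except the address.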

DR

Serves only local clients, sends traffic to them, builds SPT
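
DR election (highest priority, then highest IP) as a small sketch. The input format, a list of `(dr_priority, ip)` pairs collected from Hellos, is an assumption for illustration. The fallback to address-only comparison when some neighbour does not advertise a DR priority follows the PIM-SM RFC rather than these notes.

```python
from ipaddress import IPv4Address

def elect_dr(routers):
    """routers: list of (dr_priority or None, ip string) for every router
    on the segment, the local one included."""
    routers = list(routers)
    # If any router omits the DR priority option, priorities are ignored
    # and only the addresses are compared.
    use_priority = all(prio is not None for prio, _ in routers)

    def key(router):
        prio, ip = router
        return (prio if use_priority else 0, int(IPv4Address(ip)))

    return max(routers, key=key)

print(elect_dr([(1, "10.0.0.1"), (1, "10.0.0.2")]))  # (1, '10.0.0.2')
print(elect_dr([(5, "10.0.0.1"), (1, "10.0.0.2")]))  # (5, '10.0.0.1')
```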

PIMv2 header

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Type  |   Reserved    |            Checksum           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type:

  • 0 = Hello
  • 1 = Register
  • 2 = Register-Stop
  • 3 = Join/Prune
  • 4 = Bootstrap
  • 5 = Assert
  • 6 = Graft
  • 7 = Graft-Ack
  • 8 = Candidate-RP-advertisement
  • 10 = DF election (PIM BD)
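
A minimal sketch of packing and parsing the 4-byte header above. The function names are illustrative. The checksum is the usual 16-bit one's-complement sum over the whole PIM message, which holds for IPv4 and for every type except Register (there only the header part is covered); IPv6 adds a pseudo-header that this sketch ignores.

```python
import struct

def inet_checksum(data: bytes) -> int:
    # 16-bit one's-complement sum; odd-length input is padded with a zero byte
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_pim(msg_type: int, body: bytes = b"", version: int = 2) -> bytes:
    first = (version << 4) | (msg_type & 0x0F)   # Version | Type
    csum = inet_checksum(struct.pack("!BBH", first, 0, 0) + body)
    return struct.pack("!BBH", first, 0, csum) + body

def parse_pim(packet: bytes):
    first, _reserved, csum = struct.unpack("!BBH", packet[:4])
    return first >> 4, first & 0x0F, csum, packet[4:]

# Hello (type 0) carrying a single holdtime option of 105 s
pkt = build_pim(0, b"\x00\x01\x00\x02\x00\x69")
version, msg_type, csum, body = parse_pim(pkt)
print(version, msg_type)        # 2 0
print(inet_checksum(pkt) == 0)  # True: a valid message sums to zero
```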

PIMv2 Hello

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+
|           Option type         |         Option length         |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   \
\                                                               \    > option
/                         Option value                          /   /   list
\                                                               \  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+

Option types:

  • 1 = holdtime
  • 2 = LAN prune delay
  • 19 = DR priority
  • 20 = generation ID
  • 22 = bidir capable
  • 24 = addr list
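
Hello options are plain type/length/value records, so encoding and decoding them takes a few lines. A sketch with illustrative helper names; the option lengths used (2 bytes for holdtime, 4 bytes for DR priority and generation ID) are the ones I know from the PIM-SM RFC, not from these notes.

```python
import struct

HOLDTIME, LAN_PRUNE_DELAY, DR_PRIORITY, GEN_ID = 1, 2, 19, 20

def encode_option(opt_type: int, value: bytes) -> bytes:
    # Option type (16 bits) | Option length (16 bits) | Option value
    return struct.pack("!HH", opt_type, len(value)) + value

def decode_options(body: bytes) -> dict:
    opts, offset = {}, 0
    while offset + 4 <= len(body):
        opt_type, length = struct.unpack_from("!HH", body, offset)
        offset += 4
        opts[opt_type] = body[offset:offset + length]
        offset += length
    return opts

hello_body = (
    encode_option(HOLDTIME, struct.pack("!H", 105))       # 3.5 * 30 s hello
    + encode_option(DR_PRIORITY, struct.pack("!I", 1))
    + encode_option(GEN_ID, struct.pack("!I", 0x1234ABCD))
)
opts = decode_options(hello_body)
print(struct.unpack("!H", opts[HOLDTIME])[0])  # 105
```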

PIMv2 Encoded Group Address

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Address family | Encoding type |B| Reserved  |Z|  Mask length  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                      Group mcast address                      /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

B: 1 = PIM-BD
Z: 1 = admin scope zone (meaningful only in BSM), 0 otherwise

PIMv2 Encoded Source Address

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Address family | Encoding type |   Rsv   |S|W|R|  Mask length  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                        Source address                         /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

S:

  • sparse bit
  • 1 = PIM-SM
  • PIMv1 compatibility

W:

  • wildcard bit
  • 1 = for (*,G) in Join/Prune
  • 0 = for (S,G) in Join/Prune
  • 0 for other messages
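
A sketch of packing the Encoded Source Address for IPv4. The function and constant names are illustrative. Address family 1 is the IANA number for IPv4 and encoding type 0 is the native encoding. The R bit is not described in the list above; it is the RPT bit, set together with W in a (*,G) entry.

```python
import socket
import struct

AF_IPV4 = 1                              # IANA address family number
S_BIT, W_BIT, R_BIT = 0x04, 0x02, 0x01   # low three bits of the flags byte

def encode_source(addr: str, mask_len: int = 32, sparse: bool = True,
                  wildcard: bool = False, rpt: bool = False) -> bytes:
    flags = ((S_BIT if sparse else 0)
             | (W_BIT if wildcard else 0)
             | (R_BIT if rpt else 0))
    return struct.pack("!BBBB", AF_IPV4, 0, flags, mask_len) + socket.inet_aton(addr)

# (S,G) entry: only S is set
print(encode_source("10.1.1.1").hex())                             # 010004200a010101
# (*,G) entry: the address is the RP, W and R are set as well
print(encode_source("192.0.2.1", wildcard=True, rpt=True).hex())   # 01000720c0000201
```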

PIMv2 Encoded Unicast Address

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Address family | Encoding type |                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+        Unicast address        /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

PIMv2 Join/Prune/Graft/Graft-Ack

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/           Encoded unicast upstream neighbour address          /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Reserved    |  N of groups  |            Holdtime           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+
\                                                               \  |
/                    Encoded multicast group                    /  |
\                                                               \  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
|   Number of joined sources    |   Number of pruned sources    |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
\                                                               \  |
/                Encoded joined source address 1                /  |
\                                                               \  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
|                              ...                              |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   \
\                                                               \    > group
/                Encoded joined source address n                /   /  info
\                                                               \  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
\                                                               \  |
/                Encoded pruned source address 1                /  |
\                                                               \  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
|                              ...                              |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
\                                                               \  |
/                Encoded pruned source address n                /  |
\                                                               \  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+
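
To make the layout above concrete, here is a sketch that assembles the body of a Join/Prune message (everything after the 4-byte PIM header) for IPv4. All helper names are illustrative. The holdtime of 210 s in the example is an assumption (3.5 × a 60 s Join period).

```python
import socket
import struct

AF_IPV4 = 1  # IANA address family number

def enc_unicast(addr: str) -> bytes:
    return struct.pack("!BB", AF_IPV4, 0) + socket.inet_aton(addr)

def enc_group(addr: str, mask_len: int = 32) -> bytes:
    # B and Z bits left at 0 (not bidir, not admin scoped)
    return struct.pack("!BBBB", AF_IPV4, 0, 0, mask_len) + socket.inet_aton(addr)

def enc_source(addr: str, flags: int = 0x04, mask_len: int = 32) -> bytes:
    # flags: S=0x04, W=0x02, R=0x01; default is a plain (S,G) entry
    return struct.pack("!BBBB", AF_IPV4, 0, flags, mask_len) + socket.inet_aton(addr)

def join_prune_body(upstream: str, holdtime: int, groups: dict) -> bytes:
    """groups maps group address -> (joined sources, pruned sources)."""
    out = enc_unicast(upstream) + struct.pack("!BBH", 0, len(groups), holdtime)
    for group, (joined, pruned) in groups.items():
        out += enc_group(group) + struct.pack("!HH", len(joined), len(pruned))
        out += b"".join(enc_source(s) for s in joined)
        out += b"".join(enc_source(s) for s in pruned)
    return out

# Join for (192.0.2.10, 239.1.1.1) addressed to upstream neighbour 10.0.0.1
body = join_prune_body("10.0.0.1", 210, {"239.1.1.1": (["192.0.2.10"], [])})
print(len(body))  # 30 = 6 (upstream) + 4 + 8 (group) + 4 + 8 (one source)
```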

PIMv2 Assert

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                      Encoded group address                    /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                 Encoded unicast source address                /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|                  Administrative distance                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Metric                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

R:

  • RPT-bit
  • 1 = for (*,G)
  • 0 = for (S,G)

Mroute flags

  1. D: dense mode
  2. S: sparse mode
  3. C: connected, group member is directly attached (src or dst)
  4. L: local, router is the group member
  5. P: pruned, OIL = NULL
  6. R: RP-bit:
    • points to RP
    • entry is built for RPT, not SPT
    • prune if SPT is used
    • if set – RPF for RP, not source
  7. F: register, during registration process
  8. T: SPT flag, SPT tree is used for (S,G) entry
  9. J: join SPT, sparse mode:
    • for (*,G) = RPT → SPT because threshold is exceeded
    • for (S,G) = created as a result of SPT switchover, not SSM
    • time window for monitoring for SPT → RPT – 1 min per (S,G)
  10. I: created by IGMP SSM
  11. Z:
    • mcast tunnel
    • P-packet must be decapsulated, inside – C-packet
  12. Y: data is received via data MDT
  13. y: data is sent via data MDT

PIM DM

  • when receives prune, starts prune timer, after its timeout → forward
  • SPT is built when src starts streaming
  • all links are in SPT except RPF
  • messages:
    1. prune:
      • sent on RPF failure or absence of clients
      • removes link from SPT
      • (S,G)
    2. graft:
      • after prune, subscribe to group (e.g. after topology reconvergence)
      • sent to RPF interface, unicast
    3. state refresh
      • PIMv2 feature
      • sent before prune timer expiration
      • refreshes prune timer on upstream neighbour
    4. graft ack: unicast
  • stub mrouting to avoid periodic flood (in SM – for restricting control traffic)
  • timers:
    1. prune:
      • default 210s
      • on expiry → forward
    2. state refresh:
      • 60s default
      • refreshes prune timer on neighbour
      • prevents mcast flood
    3. graft ack:
      • 3s default
      • waiting for graft ack; if not received – graft retransmit
; sends Register for groups in ACL for traffic via this interface (DM has no notion of subscription to stream)
(config-if)# ip pim dense-mode [proxy-register list <ACL>]

Multicast traffic is forwarded only via those interfaces where a PIM neighbour or IGMP client is present

No PIM neighbourship → no PIM traffic

R4 forwards IGMP messages from host, answers only group-specific query, all messages are mcast (not like DHCP relay)

PIM SM

  • Join is sent to receive traffic for mcast group
  • Join is processed by a single upstream (listed in Join)
  • Joins are periodically sent upstream
  • FHR sends Register when src starts streaming; if there are no clients, RP responds with Register-Stop
  • if Join is received and group is already listened to, Join is not forwarded further (no need)
  • messages:
    1. Join:
      • 224.0.0.13
      • builds RPT and SPT for a group
      • on receiving Join or IGMP membership report
    2. Register:
      • unicast, dst = RP IP, src = DR IP of the egress interface towards RP
      • encapsulates mcast messages (RP forwards further if clients are present)
    3. Register-Stop:
      • unicast
      • rejects Register
      • resets Register suppression timer on FHR
    4. Prune (≡ Join):
      • sent on leaving (*,G) or (S,G) when OIL = NULL
    5. RP-Reachability:
      • Cisco only
      • RP sends its keepalives
  • timers:
    1. Register suppression:
      • 60s default
      • 5s before expiry a Register with the Null-Register bit set (no mcast payload) is sent
    2. Join:
      • 60s default
      • refresh Join upstream
    3. Prune:
      • 3 mins default
      • intf → pruned
  • before RPT is built, FHR sends mcast within unicast Register to RP; RPT is finished = FHR receives Register-Stop
  • building (S,G) from RP allows sending mcast without extra encapsulation into unicast ⇒ no suboptimal routing if transit routers have clients as well
  • (S,G) Joins, (S,G) Prune and Register are sent periodically as long as clients are active
  • SPT switchover:
    • if threshold is exceeded (from all sources, aggregated), switch RPT → SPT
    • cannot use SPT at first without RPT because src IP is not known
    • recalculation each 1s
    • (S,G) entries are monitored separately, no overlap with (*,G)
    • no per-source monitoring (otherwise ~ DM)
  • if RP has no route for src, RP sends Register-Stop immediately after receiving Register (RPF-check)
; on all routers, including RP
(config)# ip pim rp-address <RP IP>

; filter Join/Prune towards RP IP for groups in ACL, sparse-mode
(config)# ip pim accept-rp <IP> <ACL>

; on DR
(config)# ip pim register-source <intf>

; kbps, = 0 by default (immediate switchover)
(config)# ip pim spt-threshold {<rate>|infinity} [group-list <ACL>]

; limits number of Register per second; infinity by default
(config)# ip pim register-rate-limit <pps>

; filter Register by source
(config)# ip pim accept-register <ACL>
; 1 default; SM only
(config-if)# ip pim dr-priority <n>

(config-if)# ip pim sparse-mode

; SM only; records neighbour addresses from PIM Join per group
(config-if)# ip pim nbma-mode

(config-if)# ip pim sparse-dense-mode
# show ip pim rp [mapping]
# show ip pim rp-hash

RP stores info about (S,G), when clients appear – Join to source can be sent

Register is sent periodically (≈ source keepalive report)

Join: (*,G) towards RP according to uRIB, receiving Join adds interface to OIL for the group

While Joins from RP have not reached FHR yet, mcast within unicast Register allows traffic to flow to clients. After receiving Join from RP mcast is sent along the RPT without extra encapsulation.

Trigger for Register-Stop – receiving mcast (S,G) within Register and from RPT

On receiving mcast LHR finds out the source, builds SPT via Join (S,G), removes RPT

Trigger for Prune (S,G) – receiving mcast from different interfaces; cannot be done immediately because traffic would not reach receiver for some time; all groups are not pruned in case new sources become available.

PIMv2 Register

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|B|N|                        Reserved                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                      Multicast data packet                    /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

B:

  • border bit
  • 0 = originator – DR for the source
  • 1 = PMBR

N:

  • null-register bit
  • = 1 if payload = NULL
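
The two flag bits sit at the top of the 32-bit word that precedes the encapsulated packet. A sketch with an illustrative function name; the caller decides whether the message is a Null-Register, and the inner bytes are passed through untouched.

```python
import struct

B_BIT = 1 << 31  # border bit
N_BIT = 1 << 30  # null-register bit

def register_body(inner: bytes = b"", border: bool = False,
                  null: bool = False) -> bytes:
    flags = (B_BIT if border else 0) | (N_BIT if null else 0)
    return struct.pack("!I", flags) + inner

print(register_body(null=True).hex())               # 40000000
print(register_body(b"\x45\x00", border=True).hex())  # 800000004500
```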

PIMv2 Register-Stop

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                      Encoded group address                    /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                 Encoded unicast source address                /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

PIM SSM

  • no need for RP lookup, registering source, SPT switchover
  • immediately SPT (src is known), MSDP not required
  • no problems separating traffic from different sources (as in ASM)
  • everybody could use the same mcast group
  • mitigates DoS where a rogue source streams to the group address
  • src address has to be known in advance
  • one-to-many: static, DNS
; enables SSM for the range
(config)# ip pim ssm default|<range ACL>

PIM BD

  • many-to-many (multiplayer, DC using vMotion)
  • no SPT, only RPT
  • RPF only towards RP
  • no difference between source and receiver for the tree
  • no source registration at RP, traffic goes unconditionally towards RP = tree root
  • designated forwarder (DF):
    • elected on every segment, has DR role automatically
    • instead of RPF as traffic may come not only from upstream RP
    • DF per RP
    • only if all routers in segment run BD, otherwise – pruned
  • OIL: only DF interfaces ⇒ no Assert needed
  • routing:
    1. PIM Join/Leave received on DF – pass to RP; received on non-DF – drop
    2. mcast traffic received on DF – sent to OIL and RP; received on non-DF – send to OIL only if IIF=RPF interface
    • guarantee of single copy towards RP
  • DF election – same as Assert (AD, metric to RP); per RP
  • bidir groups may exist along with common groups (bidir bit set in BSM, C-RP)
(config)# ip pim bidir-enable

; bidir enabled only for groups from ACL
(config)# ip pim rp-address <RP IP> [<ACL>] bidir
; which neighbours must be BD for DF election to succeed; all by default
(config-if)# ip pim bidir-neighbor-filter <ACL>

DF election

  1. route to RP has changed (any router):
    • if worse than DF metric: no action
    • if DF metric got better: winner (update info on neighbours)
    • if DF metric got worse: winner (the router with better metric can respond)
  2. new PIM neighbour
  3. DF failure (RPF change on downstream, reelection needed = send Offer)
  • win election if no better offer is received within 3 * offer interval
  • if RP is reachable via interface with election, then metric in Offer = infinity
  • if DF interface becomes RPF interface, DF sends Offer with metric = infinity
  • messages:
    1. offer: carries the sender's unicast metric to the RP
    2. winner:
      • sent by the router elected
      • periodically sent for reassert
    3. backoff:
      • sent by DF on receiving better offer
      • does not pass DF role (just yet)
      • once received, should wait for Pass
    4. pass:
      • sent by DF on passing DF role
      • new DF is elected by old DF from better offers
  • timers:
    1. offer interval: 100ms
    2. suppress offer:
      • 3 * offer
      • on receiving a better offer
      • after expiry – offer if no winner elected
    3. backoff period:
      • 1s
      • time for collecting Offers before Pass

Phantom RP

  • every router has a route towards RP (static or Auto-RP; BSR not supported)
  • RP address is not assigned to any device (can be done since there are no Registers)
  • “RP” election and redundancy through announcing different masks for routes towards RP

PIM-BD header

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Type  |Subtype|  Rsv  |            Checksum           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                  Encoded unicast RP address                   /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Administrative distance                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Metric                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Subtype:

  • 1 = Offer (no payload)
  • 2 = Winner (no payload)
  • 3 = Backoff
  • 4 = Pass

PIM-BD Backoff

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/               Encoded unicast offering address                /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Offering administrative distance                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Offering metric                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Interval (ms)        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

PIM-BD Pass

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/              Encoded unicast new winner address               /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              New winner administrative distance               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       New winner metric                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Bootstrap

  • RP election, has more priority over statically configured
  • C-RP – RP candidate
  • C-BSR – BSR candidate
  • BSR – bootstrap router, informs other routers about RP
  • BSR election:
    1. bootstrap message is sent to 224.0.0.13 on all PIM interfaces, TTL=1
    2. RPF check on BSM source
    3. elected based on priority + IP (larger = better)
    4. on receiving better BSM – stop sending out own BSM ≡ preemption
    5. as a result every C-RP learns the BSR address
  • RP election:
    1. C-RP send Candidate-RP-Advertisement to BSR (unicast): list of groups and holdtime
    2. BSR creates RP-set: RPs-to-group mapping
    3. BSR sends out BSM with RP-set, RPF towards BSR
    4. each mcast router selects RP per group on its own:
      • mask length (239.1.1.1/32 > 239.0.0.0/8)
      • priority (lower = better); 0 (Cisco) or 192 (RFC) default
      • highest hash
      • highest IP
  • BSM contains BSR address
  • timers:
    1. BS period:
      • 60s default
      • BSM sending period
    2. BS timeout:
      • 130s default
      • how long BSR is valid in absence of BSM
      • reset by receiving BSM
    3. C-RP adv:
      • 60s default
      • Candidate-RP-Advertisement sending period
    4. C-RP timeout: 150s default
  • BSMs are sent periodically or after mapping change
  • BSR election starts with Candidate-BSR-State: listen only
  • after BS timeout expires, state → elected BSR: listen + send
  • supports NBMA
; enables C-RP, n=0 (Cisco), 192 (RFC) by default
(config)# ip pim rp-candidate <intf> [group-list <ACL>] [priority <n>]

; enables C-BSR
; bits = 0 by default, how many bits from G are included in RP hash, by default not included so all group mapped to single RP
; bits = 1 → 2 groups per RP
; RP election is pseudorandom ≈ uniform
(config)# ip pim bsr-candidate <intf> hash-mask-length <bits>

; if there is BSR/Auto-RP – use configured RP; by default configured RP has less priority
(config)# ip pim rp-address <RP IP> override
; ingress BSMs are ignored, no BSMs are sent
(config-if)# ip pim bsr-border
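
The per-group RP selection (longest mask, lowest priority, highest hash, highest IP) and the effect of the hash mask length can be sketched as below. The hash formula and its constants are quoted from memory of the PIM-SM RFC (RFC 7761, section 4.7.2) and should be checked against it; the RP-set format, a list of `(prefix, rp, priority)` tuples, is an assumption for illustration.

```python
from ipaddress import IPv4Address, IPv4Network

def rp_hash(group: str, rp: str, hash_mask_len: int) -> int:
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF
    g = int(IPv4Address(group)) & mask   # only the masked group bits count
    c = int(IPv4Address(rp))
    return (1103515245 * ((1103515245 * g + 12345) ^ c) + 12345) % (2 ** 31)

def select_rp(group: str, rp_set, hash_mask_len: int = 0) -> str:
    """rp_set: list of (group prefix, RP address, priority) tuples."""
    candidates = [(IPv4Network(prefix), rp, prio)
                  for prefix, rp, prio in rp_set
                  if IPv4Address(group) in IPv4Network(prefix)]

    def key(entry):
        net, rp, prio = entry
        # longest mask, lowest priority, highest hash, highest IP
        return (net.prefixlen, -prio,
                rp_hash(group, rp, hash_mask_len), int(IPv4Address(rp)))

    return max(candidates, key=key)[1]

rp_set = [("239.0.0.0/8", "10.0.0.1", 0), ("239.1.1.1/32", "10.0.0.2", 0)]
print(select_rp("239.1.1.1", rp_set))  # 10.0.0.2 (longest mask)
print(select_rp("239.2.2.2", rp_set))  # 10.0.0.1 (only matching range)
```

With `hash_mask_len = 0` the group contributes nothing to the hash, so every group in a range lands on the same RP, which matches the default behaviour described above.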

PIMv2 BSM

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Fragment tag         | Hash mask len |  BSR Priority |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
\                                                               \
/                  Encoded unicast BSR address                  /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-----------+
\                                                               \           |
/                     Encoded group address                     /           |
\                                                               \           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           |
|    RP Count   | Frag RP count |           Reserved            |            \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+          > group
\                                                               \  |         /  info
/                  Encoded unicast RP address                   /   \       |
\                                                               \    >  RP  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   /  info |
|          RP holdtime          |  RP Priority  |   Reserved    |  |        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+--------+

Fragment Tag: randomly chosen, equal for fragments of the same BSM
Fragment RP Count: number of C-RPs for the group range carried in this fragment

PIMv2 Candidate-RP-Advertisement

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Prefix count  |    Priority   |           Holdtime            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                  Encoded unicast RP address                   /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                    Encoded group address 1                    /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                    Encoded group address n                    /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Prefix Count = 0 ≡ all groups

Auto-RP

  • Cisco proprietary
  • supports administrative scope (BSR does not)
  • no NBMA support (split-horizon)
  • process:
    1. RP-announce are sent periodically
    2. mapping-agent election, RP-discovery are sent
  • have to Join RP for 224.0.1.39/40 but no RP is known:
    • sparse-dense-mode (SD)
    • Auto-RP listener
  • messages:
    1. RP-announce:
      • 224.0.1.39, TTL=32, UDP 496
      • RP, groups, holdtime
    2. RP-discovery:
      • 224.0.1.40, TTL=32, UDP 496
      • mapping agent
      • groups and RP for them
      • sent periodically or on mapping change
  • if mapping agents have different information – RP flap
  • timers:
    1. RP announce: 60s default
    2. RP holdtime: 3*RP announce
    3. RP discover: 60s default
  • mapping agent:
    1. usually it’s also RP but not necessarily
    2. listens to 224.0.1.39, sends RP-discovery, ignores other RP-discovery (learns from RP-announce)
    3. creates group-to-RP mapping, election by IP (higher = better)
    4. selects from redundant RPs, decision point (can be duplicated as well)
; for SM, DM rules only for Auto-RP groups
(config)# ip pim autorp listener

; RP candidate
(config)# ip pim send-rp-announce [<intf>] [scope <TTL>] [interval <sec>]

; on mapping agent
(config)# ip pim send-rp-discovery [<intf>] [scope <TTL>]

(config)# ip pim rp-announce-filter rp-list <ACL> group-list <ACL>
R2(config-if)# ip multicast boundary <ACL> filter-autorp

; R2 accepts announce, can send announce
R2(config-std-nacl)# permit 224.0.1.39

; R2 does not send discovery
R2(config-std-nacl)# deny 224.0.1.40

; permits traffic for 224.1.1.1 + Auto-RP mapping ⇒ does not send announce itself because there is corresponding mapping
R2(config-std-nacl)# permit 224.1.1.1

HSRP-aware PIM

  • DR status is synced with Active HSRP/VRRP
  • no support for IPv6, SSO
  • 16 tracked groups per interface max
  • Active sends PIM Hello from vIP with DR priority = 0
  • on switchover new DR sends Hello with a new gen ID ⇒ downstream create state by sending Join/Prune
(config-if)# standby <n> name <NAME>
(config-if)# ip pim redundancy <NAME> [hsrp|vrrp] dr-priority <m>

Anycast-RP PIM

  • RFC 4610
  • replaces MSDP for PIM
  • NX-OS
  • full-mesh of connections between anycast RP
  • on receiving Register, RP sends it the first time to other RPs to create state on them
  • received Register from RP is not forwarded to other RP (~ iBGP)
  • RP sends Register periodically to refresh state (3 mins by default)
; lists anycast RP members per RP, including itself
(config)# ip pim anycast-rp <RP_anycast_addr> <RP_physical_addr>

IPv6

PIMv6

  • link-local addresses for neighbourship
  • SM only, no DM
  • no support for Auto-RP
  • static RP, BSR
  • does not use MSDP, PIM sessions instead
  • ff3x::/96 – SSM
(config)# ipv6 pim rp-address <IPv6>

; Register is sent to peers as well, they do not forward it further (full-mesh required)
(config)# ipv6 pim anycast-rp <ANYCAST_IP> <peer>

; C-BSR
(config)# ipv6 pim bsr <IPv6>

; C-RP
(config)# ipv6 pim rp <IPv6>
(config-if)# no ipv6 pim
(config-if)# ipv6 pim dr-priority <PRIORITY>
(config-if)# ipv6 pim hello-interval <sec>
# show ipv6 pim neighbor

; mapping between group and RP
# show ipv6 pim range-list
# show ipv6 pim interface
# show ipv6 pim traffic
# show ipv6 pim tunnel

; address of selected RP
# show ipv6 pim group-map <GROUP>
# show ipv6 pim bsr rp-cache
# show ipv6 pim bsr candidate-rp

Embedded RP

  • RP address is embedded into group address
  • RP setting only
  • ff70::/12:
    1. RP flag = 1
    2. dynamically assigned
    3. contains network prefix
  • does not provide resiliency
  • does not support PIM-BD
  • format
    • ff7<scope>:0<4bit RP interface ID><8bit hex prefix length>:<64bit RP prefix>:<32bit group ID>
    • example
      • RP = 2001:2:2:2::2/64
      • G_IP = FF7E:0240:2001:2:2:2:0:1
    • group 0 not used – snooping would not be triggered
; disable embedded RP processing
(config)# no ipv6 pim rp embedded
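
The RP address can be recovered from the group address with a few shifts. A sketch using only the standard `ipaddress` module; the function name is illustrative, and the validity checks (flags 0111, prefix length 1..64) are the ones I recall from the Embedded RP RFC.

```python
import ipaddress

def embedded_rp(group: str) -> ipaddress.IPv6Address:
    g = int(ipaddress.IPv6Address(group))
    flags = (g >> 116) & 0xF              # 4 flag bits after the leading ff
    riid = (g >> 104) & 0xF               # RP interface ID
    plen = (g >> 96) & 0xFF               # prefix length
    prefix = (g >> 32) & 0xFFFFFFFFFFFFFFFF   # 64-bit RP prefix field
    if flags & 0x7 != 0x7 or not 1 <= plen <= 64:
        raise ValueError("not an embedded-RP group address")
    # keep the first plen bits of the prefix, put the RIID in the low 4 bits
    rp = ((prefix >> (64 - plen)) << (128 - plen)) | riid
    return ipaddress.IPv6Address(rp)

print(embedded_rp("FF7E:0240:2001:2:2:2:0:1"))  # 2001:2:2:2::2
```

The output matches the example above: `0240` decodes to RIID 2 and prefix length 0x40 = 64.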

VIF

  • service reflection for matching packets
  • logical interface: up/up, has own subnet
  • private-to-public group mapping + source translation
  • NAT changes only source, mcast map – only destination
(config)# interface vif 1
(config-if)# ip address <IP>

; SGRP → DGRP, SRC_IP ∈ IP
(config-if)# ip service reflect <IN_INTF> dest <SGRP> to <DGRP> mask <LENGTH> source <SRC_IP>
(config-if)# ip pim sparse-mode
(config-if)# ip igmp static-join <SGRP>