DMVPN

  1. DMVPN
  2. Phase 1
  3. Phase 2
  4. Phase 3
  5. Design
  6. Per-tunnel QoS

DMVPN

  • mGRE + NHRP + IPsec
  • IP protocol 47 (GRE), UDP 500 (ISAKMP), UDP 4500 (NAT-T)
  • 24 bytes of overhead (20-byte outer IP + 4-byte GRE header)
  • timing out:
    • expiry daemon runs every 60s
    • if the entry's remaining timeout is
      • CEF
        • > 120s → do nothing
        • ≤ 120s → set stale flag
      • process switching
        • > 120s, used flag set → clear used
        • ≤ 120s
          • used set → refresh entry
          • used not set → do nothing
  • flags
    • local
      • connected prefix
      • if the prefix disappears, the entry must be purged from other routers as well
    • unique
      • remapping denied
      • for NHC
    • no-socket
      • does not trigger IPsec
    • used
      • with process switching, every packet matching the entry sets the flag
    • stale
      • CEF
      • triggers NHRP request
    • implicit
      • entry created from a received NHRP request (not an explicit registration)
    • router
      • entry for the router or subnet behind it
  • interface status = reset: hub is configured with ip nhrp nhs
  • multicast
    • traffic must go through the hub: SPT switchover does not work
    • why SPT switchover fails
      1. spoke sends an SPT-Join toward the source via the hub
      2. hub drops the Join because it is not addressed to it (the upstream neighbor's address is listed in the Join)
      3. mcast is not forwarded to the correct spoke because there is no multicast mapping for it
  • NAT-T
    • use IPsec NAT-T if NAT device does not support GRE
    • tunnel mode only (inner IP ≡ internal tunnel ID)
  • PMTUD
    • copies inner DF to outer DF
    • adjusts tunnel MTU on receiving an ICMP error → eventually the host adjusts its MTU
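
The flags above appear in the NHRP cache; an illustrative (addresses and timers assumed) entry on a spoke:

# show ip nhrp
10.0.0.3/32 via 10.0.0.3
   Tunnel0 created 00:02:10, expire 01:57:49
   Type: dynamic, Flags: router used
   NBMA address: 198.51.100.3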

Phase 1

  • hub-n-spoke topology
  • mGRE hub, P2P GRE spokes

Hub config:

(config-if)# ip nhrp network-id <ID>
(config-if)# tunnel mode gre multipoint

; allows changing public IP, otherwise "unique address registered already"
(config-if)# ip nhrp registration no-unique
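
The matching Phase 1 spoke is a plain P2P GRE tunnel pointed at the hub; a minimal sketch with placeholder hub addresses:

(config-if)# ip nhrp network-id <ID>
(config-if)# ip nhrp nhs <hub-tunnel-IP>
(config-if)# ip nhrp map <hub-tunnel-IP> <hub-NBMA>
(config-if)# ip nhrp map multicast <hub-NBMA>
(config-if)# tunnel destination <hub-NBMA>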

Phase 2

  • hub-n-spoke, spoke-to-spoke tunnels
  • CEF
    • resolve next-hop
    • requires full RIB with unchanged next-hop
  • process switching
    1. R2 requests the destination prefix (B) as a whole from the NHS, not just the next-hop
    2. R2 sends the first packet of the flow to the NHS along with the NHRP request
    3. NHS does not find it in its cache, searches the RIB, finds the next-hop (R3), forwards the request to R3
    4. R3 responds to R2 via R1 about the prefix as a whole instead of a single address
    5. R2 notes R3's source address from the reply – no need to request it from the NHS
    6. if R3 later requests a prefix behind R2, R2 responds directly because it already has the necessary mapping

Spoke config:

(config-if)# ip nhrp network-id <ID>
; hub-side command: accept dynamic multicast mappings from registering spokes
(config-if)# ip nhrp map multicast dynamic
(config-if)# tunnel mode gre multipoint

; NHS address
(config-if)# ip nhrp nhs <IP>
(config-if)# ip nhrp map <IP> <NBMA>
(config-if)# ip nhrp map multicast <NBMA>

; 1/3 holdtime by default
(config-if)# ip nhrp registration timeout <sec>

; 7200 s (2 hours) by default
(config-if)# ip nhrp holdtime <sec>
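
For completeness, a minimal Phase 2 hub (same as Phase 1 plus dynamic multicast mappings):

(config-if)# ip nhrp network-id <ID>
(config-if)# ip nhrp map multicast dynamic
(config-if)# tunnel mode gre multipoint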

Phase 3

  • spokes also reply to requests
  • hub can aggregate routes
  • no invalid/glean adjacency, all point to hub
  • “H” RIB code, AD = 250
  • ‘%’ in RIB ≡ next-hop override
  • message flow
    1. R1 receives a packet and has to send it back out the same mGRE interface ≡ redirect trigger
      • there is a more optimal route
      • redirect does not contain next-hop, only address that triggered redirect
      • redirected only if network ID matches
    2. R2 sends request for prefix through NHS to R3
      • R3 processes request, R1 only forwards it
      • dst IP in NHRP header matches trigger dst IP
      • request goes hop-by-hop, changing IP header
    3. R3 finds prefix in RIB and sends reply about whole prefix directly to R2
      • R2 address is determined from NBMA src IP of the request
    4. R2 modifies RIB and FIB on receiving reply from R3
  • NHRP overwrites next-hop in FIB
    • OSPF P2M can be used
    • EIGRP with next-hop-self can be used (no need to disable it)

Spoke:

(config-if)# ip nhrp shortcut

Hub:

(config-if)# ip nhrp redirect
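
Since spokes resolve specifics via NHRP shortcuts, the hub can advertise just a summary; an illustrative EIGRP example (ASN and prefix assumed):

(config-if)# ip summary-address eigrp <ASN> 10.0.0.0 255.0.0.0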

Design

  • no fragmentation
    • IP MTU ≈ 1400
    • TCP MSS ≈ 1360
  • QoS per spoke
  • EIGRP
    • set real bandwidth for EIGRP
      • BW is divided between spokes
      • by default equal to 1.544Mbps – might not be enough
  • OSPF
    • do not use single area with several hubs: hub-spoke traffic might go through other spokes
  • IPv6
    • manual link-local addresses because DAD does not work
  • PIM:
    • NBMA mode
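
The recommended MSS follows directly from TCP/IP header overhead; with an IP MTU of 1400:

  MSS = IP MTU − 20 (IP header) − 20 (TCP header) = 1400 − 40 = 1360

The same arithmetic gives MSS 1376 for an IP MTU of 1416.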

Hub announces a summary of prefixes from area X. Metric – minimum (default) or manual.
Solutions:

  • multi-access link between hubs
  • 802.1q between hubs, subinterface per area
  • spoke in its own area (not scalable)
  • disable summarization

With IS-IS, the link between hubs must be L1/L2.

(config-if)# ip mtu 1416
(config-if)# ip tcp adjust-mss 1376

; off by default
(config-if)# tunnel path-mtu-discovery

(config-if)# no ip next-hop-self eigrp <ASN>

(config-if)# ip pim nbma-mode

; P2P default
(config-if)# ip ospf network point-to-multipoint
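
Per the EIGRP bandwidth note above, an illustrative setting (value assumed, in kbps):

; 50 Mbps, matching the real WAN speed
(config-if)# bandwidth 50000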

Per-tunnel QoS

  • requires CEF
  • IPv4 and IPv6 only
  • egress only
  • spoke has only 1 group per interface
  • group info exchange via NHRP with vendor private extensions

; spoke
(config-if)# ip nhrp group <NAME>

; hub
(config-if)# ip nhrp map group <NAME> service-policy output <PMAP>

; adaptive QoS based on WAN BW, proprietary algorithm
(config-pmap-c)# shape adaptive upper-bound <bps> [lower-bound <bps>]
# show policy-map multipoint
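
Putting the pieces together – an illustrative hub-side per-group shaper (policy/group names and rate assumed):

; parent shaper applied to every spoke that registers with group SPOKE-10M
(config)# policy-map PM-SPOKE-10M
(config-pmap)# class class-default
(config-pmap-c)# shape average 10000000

(config-if)# ip nhrp map group SPOKE-10M service-policy output PM-SPOKE-10M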