L4-L7 integration

  1. Limits
  2. Device package
  3. Service graph
  4. Go-to mode
  5. Go-through mode
  6. ASA
  7. Copy service
  8. Device selection policy
  9. PBR node tracking
  10. L4-L7 virtual IP
  11. PBR
  12. Backup PBR policy
  13. Symmetric PBR
  14. L1/L2 PBR
  15. L3Out PBR
  16. Active-standby with Multipod

Limits

  • 30 managed, 50 unmanaged, 1200 virtual HA pairs
  • prefixes “N-” and “C-” cannot be used in names
  • requires contract enforcement (enforced mode) in the VRF, otherwise PBR does not work
  • the service node is placed in a separate BD, which must have IP dataplane learning disabled
  • only unicast IPv4/IPv6
  • Multisite does not support active/active clusters
  • by default, traffic from the service node is permitted in TCAM by a “default” (permit-any) filter; to apply the contract’s own filters instead, enable filters-from-contract in the graph template (sketched below)
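
A minimal sketch of the last point (conceptual only; the function and field names are invented, this is not the APIC object model): with the default setting the rule for traffic coming from the service node gets a permit-any filter, while filters-from-contract reuses the contract's own filter entries.

  # Conceptual model of the "default" vs filters-from-contract behaviour.
  # Names and structures are hypothetical, not APIC object names.
  def zoning_rules_for_service_node(contract_filters, filters_from_contract=False):
      """Return the filter list programmed in TCAM for traffic from the service node."""
      if filters_from_contract:
          # graph template option enabled: apply the contract's own filters
          return list(contract_filters)
      # default: a single permit-any ("default") filter
      return [{"proto": "any", "action": "permit"}]

  print(zoning_rules_for_service_node([{"proto": "tcp", "dport": 443}]))
  print(zoning_rules_for_service_node([{"proto": "tcp", "dport": 443}],
                                      filters_from_contract=True))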

Device package

  1. XML ≡ capabilities, model, vendor, version
  2. python scripts
  3. function profile: default values
  4. configuration parameters: required parameters
  • a device cluster is controlled by a single package at a time
  • major version ≡ new package, disruptive upgrade, manual switchover (see the sketch below)
  • minor version ≡ in-place upgrade/replacement, not disruptive
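
A tiny, hypothetical illustration of the versioning rule above (version strings and messages are made up): a major-version change means a new package with a disruptive, manual switchover of the device cluster, while a minor-version change is an in-place, non-disruptive upgrade.

  def upgrade_kind(installed: str, candidate: str) -> str:
      # compare only the major part of "major.minor" version strings
      installed_major = installed.partition(".")[0]
      candidate_major = candidate.partition(".")[0]
      if installed_major != candidate_major:
          return "new package: disruptive upgrade, switch the device cluster manually"
      return "in-place upgrade: non-disruptive"

  print(upgrade_kind("1.2", "1.3"))   # minor change
  print(upgrade_kind("1.3", "2.0"))   # major change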

Service graph

  • features:
    1. potentially vendor-agnostic config
    2. VLAN autoconfiguration
    3. PBR (service redirect) by creating shadow EPG
    4. dynamic endpoint attach
  • redirect limits:
    1. only go-to devices
    2. L2 connection to ACI
    3. EP learning has to be disabled on the BD of the service node interface; a shadow EPG is created
    4. one vMAC for active and standby nodes
  • modes:
    1. network policy:
      • unmanaged, ACI manages only network part
      • does not support per-port VLAN ⇒ requires 2 leafs for L1/L2 graph
    2. service policy: managed
    3. service manager: network part configured via ACI, device policy managed by its own manager
  • parameter order (?!)
    1. function profile
    2. EPG
    3. tenant
  • parameter attributes (see the sketch after this list):
    1. mandatory: cannot be null
    2. locked: use the parameter from the function profile if defined
    3. shared: use the parameter from the EPG or tenant if defined (?!); a parameter change in the function profile refreshes the device config
  • elements:
    1. function node (e.g. FW)
    2. terminal node
    3. connector
    4. connection
  • anycast endpoint allows devices with the same MAC or IP without causing EP flapping
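
A minimal sketch of the parameter resolution described above, assuming the listed order (function profile → EPG → tenant) and the mandatory/locked/shared semantics from these notes; the data model is invented, not the APIC one.

  def resolve(param, function_profile, epg, tenant):
      """Resolve one service-graph parameter according to its attributes."""
      fp_val = function_profile.get(param["name"])
      epg_val = epg.get(param["name"])
      tenant_val = tenant.get(param["name"])

      if param.get("locked") and fp_val is not None:
          value = fp_val                                # locked: function profile value wins
      elif param.get("shared") and (epg_val is not None or tenant_val is not None):
          value = epg_val if epg_val is not None else tenant_val   # shared: EPG, then tenant
      else:
          # otherwise follow the documented order: function profile, EPG, tenant
          value = next((v for v in (fp_val, epg_val, tenant_val) if v is not None), None)

      if param.get("mandatory") and value is None:
          raise ValueError(f"{param['name']} is mandatory and cannot be null")
      return value

  fp, epg, tenant = {"port": 443}, {"port": 8443}, {"vlan": 100}
  print(resolve({"name": "port", "locked": True}, fp, epg, tenant))   # 443
  print(resolve({"name": "port", "shared": True}, fp, epg, tenant))   # 8443
  print(resolve({"name": "vlan", "shared": True}, fp, epg, tenant))   # 100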

Go-to mode

  • routed
  • PBR based on MAC: the service BD is configured on the ingress and egress leafs, which switch traffic within that BD towards the service MAC
  • service leaf sets DL flag because IP dataplane learning is disabled ⇒ compute leaf does not learn EP via service leaf
  • separate BDs for interfaces, one BD cannot refer to 2 devices simultaneously
  • usually 2-armed
  • direct connect: allows the consumer/provider to ping their connectors; by default this is prohibited since there is no contract between the EPG and the shadow EPG in the case of unidirectional PBR

Go-through mode

  • transparent
  • BDs involved: BUM flooding enabled + no IP routing
  • limitation: 1024 IPs per MAC ⇒ one of the following is required:
    • disable EP learning or
    • disable IP routing in BD or
    • enable RPF
  • usually 1-armed for filtering within BD/EPG

ASA

  • does not support clustering through ACI
  • separate etherchannel for control plane and vPC for data plane
  • to configure failover, the admin context has to be registered in ACI as an L4-L7 device
  • vASA requires separate EPG for failover traffic

Copy service

  • SPAN defined by a contract
  • does not copy BUM if it does not fall under the contract
  • does not copy CoS, DSCP

Device selection policy

  • translates service graph template into config: graph + contract + device + EPG
  • creates internal EPG and internal contracts for enforcing the policy
  • L3 destination (VIP) ≡ disables PBR for connector

PBR node tracking

  • tracks liveness of the PBR target via ICMP, TCP, L2Ping, HTTP
  • if a PBR node fails, the event is announced throughout the fabric so that no leaf sends traffic to it
  • health group:
    • groups links with the same affinity (e.g. the internal + external interfaces of one node)
    • if a member of the group fails, the other group members are not used either (e.g. do not keep using a 2-armed PBR node with one failed arm; ~ LACP min-links)
  • threshold:
    1. adjusts PBR if the remaining nodes in the group cannot handle the extra load from previously failed nodes
    2. action:
      • permit: default; on failure, forward traffic directly through ACI (no redirect)
      • deny
      • bypass:
        • pass traffic to next node in multinode graph
        • if last node failed ≡ permit
        • incompatible with remote leaf, one-arm nodes
  • resilient hashing (see the sketch after this list):
    1. instead of rebalancing hash across nodes – pass traffic from single failed node to another single node
    2. if multiple failures – pass traffic to different nodes
    3. can lead to overload
  • the bypass backup action is defined per PBR policy; different graphs must not use PBR policies with the same name: the 2nd node can differ between graphs, so it is unclear how to render the backup; fix by giving each PBR policy a different name
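
A toy sketch of the resilient-hashing behaviour (the hash function and node names are made up; this is not the leaf's real algorithm): only the flows that hashed to the failed node are moved, and they all move to one other node, which is why that node can end up overloaded.

  import hashlib

  def flow_hash(src_ip, dst_ip, proto, buckets):
      key = f"{src_ip}|{dst_ip}|{proto}".encode()
      return int(hashlib.md5(key).hexdigest(), 16) % buckets

  def pick_node(flow, nodes, failed=frozenset()):
      idx = flow_hash(*flow, buckets=len(nodes))
      if nodes[idx] not in failed:
          return nodes[idx]            # flows of healthy nodes keep their node
      # resilient hashing: all flows of the failed node go to a single other
      # node (the next alive one) instead of being rebalanced across all nodes
      for offset in range(1, len(nodes)):
          candidate = nodes[(idx + offset) % len(nodes)]
          if candidate not in failed:
              return candidate
      return None                      # every node failed

  nodes = ["fw1", "fw2", "fw3"]
  flow = ("10.0.0.1", "10.0.1.1", "tcp")
  print(pick_node(flow, nodes))                  # normal hashing
  print(pick_node(flow, nodes, failed={"fw2"}))  # only fw2's flows move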

L4-L7 virtual IP

  • configured on the EPG
  • disables IP dataplane learning for a specific IP (otherwise the IP flaps with DSR)
  • control plane IP learning is allowed only in the EPG where this setting is configured
  • direct server return (DSR), sketched after this list:
    1. L2 LB: the VIP is configured on the LB and the servers, but only the LB answers ARP for the VIP
    2. the server → client response bypasses the LB and goes directly, with src IP = VIP
    3. the LB rewrites the dst MAC to the server MAC and does not change the IP
  • scope:
    1. EPG with L4-L7 VIP
    2. EPG with contracts to EPG from 1)
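
A minimal sketch of the DSR forwarding step described above (addresses and field names are invented): the LB rewrites only the destination MAC to the chosen server's MAC and leaves the IP header (dst = VIP) untouched, so the server can answer the client directly with src IP = VIP.

  VIP = "192.0.2.10"
  SERVERS = {"srv1": "00:00:5e:00:53:01", "srv2": "00:00:5e:00:53:02"}

  def lb_forward(pkt, server):
      """LB step: rewrite only the dst MAC; the IP header stays addressed to the VIP."""
      assert pkt["ip"]["dst"] == VIP
      fwd = {"eth": dict(pkt["eth"]), "ip": dict(pkt["ip"])}
      fwd["eth"]["dst"] = SERVERS[server]      # only the dst MAC changes
      return fwd

  def server_reply(pkt):
      # the server owns the VIP (but does not answer ARP for it), so it replies
      # straight to the client with src IP = VIP, bypassing the LB
      return {"ip": {"src": VIP, "dst": pkt["ip"]["src"]}}

  client_pkt = {"eth": {"src": "00:00:5e:00:53:aa", "dst": "00:00:5e:00:53:f0"},
                "ip": {"src": "198.51.100.1", "dst": VIP}}
  print(lb_forward(client_pkt, "srv1"))
  print(server_reply(client_pkt))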

PBR

  • requires unicast routing on BD with EPG interfaces
  • always routed, even for bridged traffic ⇒ decrements TTL and does not preserve the src MAC up to the dst EPG (the last leaf rewrites it to its own MAC)
  • PBR operation: rewrite dst MAC = PBR node MAC and VNID = PBR BD VNID; other fields are unchanged (see the sketch at the end of this section)
  • hash for choosing PBR node: src IP, dst IP, protocol
  • 1st generation leafs cannot direct PBR to their own ports
  • if 2 contracts use the same first PBR node but the actions after it differ (e.g. permit and redirect), filters-from-contract is required in the graph template instead of the default filter on the provider connector
  • by default PBR does not rewrite the src MAC ⇒ the PBR node can learn that MAC and send return traffic to it ⇒ return traffic is dropped on the leaf because there is no such MAC EPG in the service BD; “rewrite source MAC” changes the src MAC to the SVI MAC 0022.BDF8.19FF
# show service redir info
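
A hand-wavy model of the PBR rewrite on the leaf (not real switch code; the hash, MACs and VNIDs are illustrative): the packet is left intact except that the destination MAC becomes the selected PBR node's MAC and the VNID becomes the service BD's VNID, with the node picked by hashing src IP, dst IP and protocol.

  import hashlib

  PBR_NODES = [                                # from the PBR policy
      {"mac": "00:00:5e:00:53:11", "bd_vnid": 16318374},
      {"mac": "00:00:5e:00:53:12", "bd_vnid": 16318375},
  ]

  def select_node(src_ip, dst_ip, proto):
      h = int(hashlib.md5(f"{src_ip}{dst_ip}{proto}".encode()).hexdigest(), 16)
      return PBR_NODES[h % len(PBR_NODES)]

  def pbr_rewrite(pkt):
      node = select_node(pkt["src_ip"], pkt["dst_ip"], pkt["proto"])
      redirected = dict(pkt)
      redirected["dst_mac"] = node["mac"]      # dst MAC = PBR node MAC
      redirected["vnid"] = node["bd_vnid"]     # VNID = service BD VNID
      return redirected                        # IPs and L4 ports are unchanged

  pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.1", "proto": "tcp",
         "dst_mac": "aa:bb:cc:dd:ee:ff", "vnid": 15990734}
  print(pbr_rewrite(pkt))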

Backup PBR policy

  • defines standby nodes; traffic is switched over to them if a primary node fails
  • requires resilient hash
  • a backup policy can be used by only one primary policy
  • switchover:
    1. failed primary count ≤ backup count: switch over to the backups
    2. failed primary count > backup count: backups are treated as the primary nodes they replaced; the rest are selected from the remaining nodes
    • selection: round-robin, in increasing IP order (loosely sketched below)
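
A loose sketch of the backup selection order only (increasing IP, round-robin); the exact behaviour when failures exceed the backup count is simplified here, and the function and addresses are invented.

  import ipaddress

  def assign_backups(failed_primaries, backups):
      """Map each failed primary node to a backup, round-robin in increasing IP order."""
      pool = sorted(backups, key=ipaddress.ip_address)
      mapping = {}
      for i, primary in enumerate(sorted(failed_primaries, key=ipaddress.ip_address)):
          if not pool:
              break
          mapping[primary] = pool[i % len(pool)]   # reuse backups round-robin if needed
      return mapping

  # failed primary count <= backup count: one-to-one switchover
  print(assign_backups(["10.1.1.1"], ["10.2.2.1", "10.2.2.2"]))
  # failed primary count > backup count: simplified round-robin reuse
  print(assign_backups(["10.1.1.1", "10.1.1.2", "10.1.1.3"], ["10.2.2.1", "10.2.2.2"]))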

Symmetric PBR

  • uses the hash to send return traffic through the same PBR node the forward traffic went through, without SNAT
  • the hash has to be mirrored for both directions (see the sketch below)
  • PBR nodes are sorted by IP address ⇒ the addresses on the consumer/provider connectors have to be in the same order
  • destination name sorting: use a string in lieu of the IP address
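
A minimal sketch of why the hash must be mirrored: hashing the (src, dst) pair in a fixed (sorted) order makes the forward and return flows of one conversation land on the same node without SNAT. The hash itself is illustrative, not the leaf's real function.

  import hashlib

  def symmetric_hash(src_ip, dst_ip, proto, node_count):
      a, b = sorted([src_ip, dst_ip])          # order-independent key
      key = f"{a}|{b}|{proto}".encode()
      return int(hashlib.md5(key).hexdigest(), 16) % node_count

  fwd = symmetric_hash("10.0.0.1", "10.0.1.1", "tcp", 4)
  ret = symmetric_hash("10.0.1.1", "10.0.0.1", "tcp", 4)
  assert fwd == ret                            # both directions pick the same node
  print(fwd, ret)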

L1/L2 PBR

  • L1 = VLAN is the same (wire tap)
  • L2 = VLANs on two legs of PBR node are different (ASA transparent mode)
  • only destination name sorting because there are no IPs
  • requires IP routing but BD subnet is not required; BD must be dedicated to PBR node
  • PBR policy programs static MAC EP ⇒ L1/L2 must have entry for forwarding to this EP (e.g. static MAC in CAM)
  • the leaf rewrites the dst MAC to the static MAC and sends the frame to the PBR node (consumer connector); the leaf receives this MAC back on the provider connector and routes based on the inner IP ⇒ the last leaf inserts the correct MAC; TTL is decreased by 1
  • L2Ping uses the static MACs as src/dst MAC (Ethertype 0x0721); see the sketch after this list
  • active/active:
    1. L2:
      • each arm has its own encap VLAN
      • flood in encapsulation restricts looping traffic back to PBR node, shadow EPG config
    2. L1:
      • each PBR node is allocated its own encap VLAN
      • flood in encapsulation restricts looping traffic to other PBR nodes (shadow EPG)
      • provider and consumer connectors must be in different physical domains: a different FD_VLAN is required (to prevent traffic from looping back through the fabric via the provider connector → consumer connector path) ⇒ different VLAN pools with the same VLAN values are required ⇒ different domains are required + port-local VLAN scope
      • fabric has loop prevention ⇒ disable MCP, CDP, LLDP
  • provider and consumer connectors have to be on different leafs for L1
  • the MAC can be configured (it is used in the dataplane and in L2Ping); the IP cannot (it is autogenerated and used only as a key in the DB/messages)
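
A rough sketch of what an L2Ping probe header would look like, based only on the facts above (configured static src/dst MACs and Ethertype 0x0721); the payload format is not covered here, and the MACs are placeholders.

  import struct

  def mac_bytes(mac: str) -> bytes:
      return bytes(int(x, 16) for x in mac.split(":"))

  def l2ping_header(dst_static_mac: str, src_static_mac: str) -> bytes:
      ETHERTYPE_L2PING = 0x0721
      return (mac_bytes(dst_static_mac)
              + mac_bytes(src_static_mac)
              + struct.pack("!H", ETHERTYPE_L2PING))

  frame = l2ping_header("00:00:5e:00:53:21", "00:00:5e:00:53:22")
  print(frame.hex())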

L3Out PBR

  • PBR policy uses L3Out device MAC (1st hop) and IP of PBR node ⇒ 1st hop also has to perform PBR
  • MAC – dst MAC rewrite, IP – for IP-SLA tracking (mandatory)
  • each PBR node is allocated its own BD and VRF:
    1. BD is used to select specific PBR node (MAC is the same, = L3Out device)
    2. VRF: 0.0.0.0/0 via L3Out
    3. shadow EPG is allocated global pcTag because of implicit inter-VRF leaking
  • the leaf rewrites the inner dst MAC to 0C0C.0C0C.0C0C and the VNID to the internal BD VNID (the VNID selects the PBR node instead of the MAC); see the sketch after this list
  • L3Out EPG classification:
    1. src IP in consumer/provider BD/L3Out ⇒ shadow EPG
    2. src IP in L3Out EPG ⇒ L3Out EPG
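
A conceptual sketch of the node-selection idea above (the data structures and addresses are invented): every L3Out PBR node is reached via the same next-hop MAC, so the leaf encodes the chosen node in the rewritten VNID (the node's dedicated internal BD) while the inner dst MAC is always 0C0C.0C0C.0C0C.

  L3OUT_PBR_NODES = [
      {"ip": "172.16.1.1", "bd_vnid": 16777211},   # node A: its own BD + VRF
      {"ip": "172.16.2.1", "bd_vnid": 16777212},   # node B: its own BD + VRF
  ]
  FIXED_INNER_DST_MAC = "0C:0C:0C:0C:0C:0C"

  def redirect_via_l3out(pkt, node_index):
      node = L3OUT_PBR_NODES[node_index]
      out = dict(pkt)
      out["inner_dst_mac"] = FIXED_INNER_DST_MAC   # the MAC no longer identifies the node
      out["vnid"] = node["bd_vnid"]                # the BD VNID does
      return out

  print(redirect_via_l3out({"src_ip": "10.0.0.1", "dst_ip": "10.0.1.1"}, 1))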

Active-standby with Multipod

L12 learns MAC EP1 (bridging) – no issue

L12 learns IP EP1 (routing)

L12 sends L3Out → EP1 traffic according to its endpoint table and hits the bounce entry on L11; return traffic uses VPNv4 routes towards L21 (routing); L21 learns IP EP1 and sends traffic to L12 because L12 announces the active MAC; thus L12 learns MAC EP1 (bridging) without refreshing the IP entry, only renewing its aging

Once the bounce entry expires, the entry for IP EP1 is still active on L12 ⇒ traffic is blackholed on L11

Solution:

  • disable remote EP learn
  • EP announce