EIGRP summary & BGP default route

This lab intends to highlight one of the potential pitfalls of EIGRP summarization.

Configuration

  1. Set up the addressing: each router has at least one loopback for connectivity verification.
  2. Enable eBGP between ISP and Hub.
  3. Enable EIGRP between Hub and Spoke.

Task

  1. ISP should announce only default route via eBGP.
  2. Hub should announce both loopbacks into BGP: Hub loopback and Spoke loopback.
  3. Spoke should not announce any routes, except for directly connected prefixes.
  4. Make sure Spoke receives only a single route from Hub.

Observations

Test connectivity between ISP and Spoke. What is wrong with the path? Find the problem and fix it.

Solution

Note that pings fail on Hub. The reason – the EIGRP summary route has AD = 5 by default, which beats the eBGP AD of 20, so the BGP default route never gets installed and Hub keeps the summary pointing to Null0.

Hub#show ip route 0.0.0.0  
Routing entry for 0.0.0.0/0, supernet
  Known via "eigrp 1", distance 5, metric 28160, candidate default path, type internal
  Redistributing via eigrp 1
  Routing Descriptor Blocks:
  * directly connected, via Null0
      Route metric is 28160, traffic share count is 1
      Total delay is 100 microseconds, minimum bandwidth is 100000 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 0

The solution is simple – raise the AD assigned to the summary route above that of eBGP, so the summary loses to the BGP default route:

Hub(config)#router eigrp 1
Hub(config-router)#summary-metric 0.0.0.0/0 distance 250
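
To make the reasoning explicit, here is a minimal Python sketch of the choice Hub makes for 0.0.0.0/0. It only models "lowest administrative distance wins"; the dictionary keys and the function name are made up for illustration, while 5 and 20 are the IOS defaults for an EIGRP summary and for eBGP, and 250 is the value from the solution above.

```python
def best_source(candidates):
    """Return the route source with the lowest administrative distance."""
    return min(candidates, key=candidates.get)

# Before the fix: the EIGRP summary (AD 5) beats the eBGP default route (AD 20).
before = {"eigrp-summary via Null0": 5, "ebgp default": 20}
# After the fix: the summary is pushed to AD 250, so eBGP wins.
after = {"eigrp-summary via Null0": 250, "ebgp default": 20}

print(best_source(before))  # eigrp-summary via Null0
print(best_source(after))   # ebgp default
```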

Images

IOS image: c7200-adventerprisek9-mz.152-4.M11.image

Follow on Telegram, LinkedIn, Twitter

EIGRP named mode: migration pitfall

Let’s imagine that you’ve got an unstoppable urge to upgrade your network software to the latest available version as well as to adopt all the best practices available (you’re not looking for a new job just yet). Your first guinea pig is EIGRP in classic mode – you can’t wait to bump it to named mode because of all the shiny new features. Even better, you can do it with just a single eigrp upgrade-cli command – couldn’t be easier, what could possibly go wrong? As you might have guessed from my previous posts, such an upgrade could wreck your network in certain circumstances.

What could be simpler than four routers? Exactly, three routers! Each of them is running EIGRP, R1 & R3 – classic mode, while R2 has just finished upgrading to named mode.

R1#show run | section router eigrp|interface
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
interface FastEthernet0/0
 ip address 192.168.12.1 255.255.255.0
router eigrp 1
 network 0.0.0.0
R3#show run | section router eigrp|interface
interface Loopback0
 ip address 3.3.3.3 255.255.255.255
interface FastEthernet0/1
 ip address 192.168.23.3 255.255.255.0
router eigrp 1
 network 0.0.0.0
R2#show run | section router eigrp|interface
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
interface FastEthernet0/0
 ip address 192.168.12.2 255.255.255.0
interface FastEthernet0/1
 ip address 192.168.23.2 255.255.255.0
router eigrp NAMED
 address-family ipv4 unicast autonomous-system 1
  network 0.0.0.0

As you probably expect, nothing criminal has happened just yet and R3 is still able to reach R1 without hiccups:

R3#show ip route eigrp
<output omitted>
      1.0.0.0/32 is subnetted, 1 subnets
D        1.1.1.1 [90/158720] via 192.168.23.2, 00:03:32, FastEthernet0/1
      2.0.0.0/32 is subnetted, 1 subnets
D        2.2.2.2 [90/28160] via 192.168.23.2, 00:03:37, FastEthernet0/1
D     192.168.12.0/24 [90/30720] via 192.168.23.2, 00:03:37, FastEthernet0/1
R3#  
R3#ping 1.1.1.1 source lo 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 3.3.3.3 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/28/36 ms

So far so good, isn’t it? However, just as you are preparing to hit upgrade-cli on yet another router, there is a request coming in to deprioritize 1.1.1.1/32 for some kind of traffic engineering. You want it out of your way ASAP, so you adjust the bandwidth on the loopback:

R1(config)# interface lo0
R1(config-if)# bandwidth ?
  <1-10000000>   Bandwidth in kilobits
  inherit        Specify how bandwidth is inherited
  qos-reference  Reference bandwidth for QOS test
  receive        Specify receive-side bandwidth

R1(config-if)# bandwidth 1

KABOOM! R3 has just lost its connectivity to R1:

R3#ping 1.1.1.1 so lo 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 3.3.3.3 
UUUUU
Success rate is 0 percent (0/5)
R3#
R3#show ip route eigrp 
<output omitted>
      1.0.0.0/32 is subnetted, 1 subnets
D        1.1.1.1 [90/2560133120] via 192.168.23.2, 00:00:56, FastEthernet0/1
      2.0.0.0/32 is subnetted, 1 subnets
D        2.2.2.2 [90/28160] via 192.168.23.2, 00:09:42, FastEthernet0/1
D     192.168.12.0/24 [90/30720] via 192.168.23.2, 00:09:42, FastEthernet0/1

EIGRP must be the culprit; however, the route is still in the RIB on R3, just with a worse metric, as expected.
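
The new metric on R3 is exactly what the classic formula gives. A minimal sketch, assuming default K values (K1 = K3 = 1) and the usual classic-mode scaling; the function name is mine, the bandwidth and delay figures come from the outputs in this post (100000 Kbit is FastEthernet, the slowest link on the path before the change, and I assume the total delay of 5200 microseconds is the same before and after):

```python
def classic_metric(min_bw_kbit, total_delay_usec):
    """Classic (32-bit) EIGRP composite metric with default K values."""
    # Bandwidth term is 10^7 / min bandwidth, delay is counted in tens of usec.
    return 256 * (10**7 // min_bw_kbit + total_delay_usec // 10)

before = classic_metric(100_000, 5200)  # bottleneck is FastEthernet
after = classic_metric(1, 5200)         # bottleneck is Loopback0 at 1 Kbit

print(before)  # 158720
print(after)   # 2560133120
```

Both values match what R3 shows for 1.1.1.1/32, so from the classic-mode point of view nothing unusual has happened.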

R3#traceroute 1.1.1.1 source lo0 numeric 
Type escape sequence to abort.
Tracing the route to 1.1.1.1
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.23.2 12 msec 16 msec 16 msec
  2 192.168.23.2 !H  !H  !H

R2, on the other hand, ignores your efforts to squeeze the traffic through it, because…

R2#show ip route eigrp
<output omitted>
      3.0.0.0/32 is subnetted, 1 subnets
D        3.3.3.3 [90/2662400] via 192.168.23.3, 00:14:07, FastEthernet0/1

It has lost the route!

However, the loss is not as complete as it may look. The prefix is still in the EIGRP topology table with perfectly valid metrics:

R2#show ip eigrp topology 1.1.1.1/32
EIGRP-IPv4 VR(NAMED) Topology Entry for AS(1)/ID(2.2.2.2) for 1.1.1.1/32
  State is Passive, Query origin flag is 1, 0 Successor(s), FD is Infinity, RIB is 4294967295
  Descriptor Blocks:
  192.168.12.1 (FastEthernet0/0), from 192.168.12.1, Send flag is 0x0
      Composite metric is (655694233600/655687680000), route is Internal
      Vector metric:
        Minimum bandwidth is 1 Kbit
        Total delay is 5100000000 picoseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 1
        Originating router is 1.1.1.1

The data seems to be in order. So far we’ve got two mysteries on our hands:

  1. Why has R2 lost its route?
  2. Why has R3 NOT lost its route?

The first question directly affects availability, so we tackle this one first. Notice anything unusual about EIGRP metrics? The composite metric is way bigger than “RIB is 4294967295”, which is the upper bound of 32-bit RIB metrics. EIGRP cannot squeeze its 64-bit wide metric into the 32-bit RIB metric, so the route is not installed. Solution? Scale down the EIGRP metric before putting it into RIB by using metric rib-scale, which is equal to 128 by default:

R2#show ip protocols 
Routing Protocol is "eigrp 1"
  Outgoing update filter list for all interfaces is not set
  Incoming update filter list for all interfaces is not set
  Default networks flagged in outgoing updates
  Default networks accepted from incoming updates
  EIGRP-IPv4 VR(NAMED) Address-Family Protocol for AS(1)
    Metric weight K1=1, K2=0, K3=1, K4=0, K5=0 K6=0
    Metric rib-scale 128
    Metric version 64bit
    NSF-aware route hold timer is 240
    Router-ID: 2.2.2.2
    Topology : 0 (base) 
      Active Timer: 3 min
      Distance: internal 90 external 170
      Maximum path: 4
      Maximum hopcount 100
      Maximum metric variance 1
      Total Prefix Count: 5
      Total Redist Count: 0

  Automatic Summarization: disabled
  Maximum path: 4
  Routing for Networks:
    0.0.0.0
  Routing Information Sources:
    Gateway         Distance      Last Update
    192.168.12.1          90      00:17:36
    192.168.23.3          90      00:17:36
  Distance: internal 90 external 170

Guess what? 128 is still not enough to bring 655694233600 down to a 32-bit number (655694233600 / 128 = 5122611200), while 160 seems to do the trick (655694233600 / 160 = 4098088960):

R2(config)#router eigrp NAMED  
R2(config-router)#address-family ipv4 autonomous-system 1
R2(config-router-af)#metric rib-scale 160
R2#show ip route eigrp 
<output omitted>
      1.0.0.0/32 is subnetted, 1 subnets
D        1.1.1.1 [90/4098088960] via 192.168.12.1, 00:00:49, FastEthernet0/0
      3.0.0.0/32 is subnetted, 1 subnets
D        3.3.3.3 [90/2129920] via 192.168.23.3, 00:00:49, FastEthernet0/1
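
The numbers above can be reproduced with a short sketch. It assumes the wide-metric formula with default K values (throughput plus latency, both scaled by 65536) and assumes that the RIB metric is simply the wide metric divided by rib-scale; the function names and the 1-255 range for rib-scale are my assumptions, while the inputs are taken from the topology entry on R2.

```python
RIB_MAX = 2**32 - 1  # 4294967295, shown as "RIB is 4294967295" on R2

def wide_metric(min_bw_kbit, total_delay_ps):
    """64-bit EIGRP composite metric with default K values (K1 = K3 = 1)."""
    throughput = (10**7 * 65536) // min_bw_kbit
    latency = (total_delay_ps * 65536) // 10**6
    return throughput + latency

def rib_metric(wide, rib_scale):
    """Metric offered to the RIB after scaling down by rib-scale."""
    return wide // rib_scale

wide = wide_metric(1, 5_100_000_000)
print(wide)  # 655694233600

for scale in (128, 160):
    scaled = rib_metric(wide, scale)
    print(scale, scaled, scaled < RIB_MAX)
# 128 5122611200 False
# 160 4098088960 True

# Smallest scale that fits, assuming rib-scale accepts 1-255.
min_scale = next(s for s in range(1, 256) if rib_metric(wide, s) < RIB_MAX)
print(min_scale)  # 153
```

So 160 is not a magic number: under these assumptions anything from 153 upwards would squeeze this particular route into RIB.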

R3 is able to reach 1.1.1.1/32 again as well:

R3#ping 1.1.1.1 so lo 0                  
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 3.3.3.3 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/32/52 ms

So, the first mystery is declassified now. What about the second one: why on earth did R3 retain the route after R2 stopped using it? It’s not an idle question: such behaviour is bound to confuse a troubleshooting engineer, who is led to believe that routing is still intact, since the proper route is installed in RIB.

When an EIGRP router loses its successor for a prefix and has no feasible successor to fall back on, the route goes Active and DUAL starts a diffusing computation, querying the neighbours. Our case is not an exception, so let’s walk the process between R2 and R3:

  1. R2 loses the successor for 1.1.1.1/32, because it receives a Query from R1, so R2 sends a Query of its own towards R3.

Notice the metric: delay corresponds to the actual value on R2 instead of Infinity constant.

  2. R3 updates its topology with the received metric components:
R3#show ip eigrp topology 1.1.1.1/32
EIGRP-IPv4 Topology Entry for AS(1)/ID(3.3.3.3) for 1.1.1.1/32
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2560133120
  Descriptor Blocks:
  192.168.23.2 (FastEthernet0/1), from 192.168.23.2, Send flag is 0x0
      Composite metric is (2560133120/2560130560), route is Internal
      Vector metric:
        Minimum bandwidth is 1 Kbit
        Total delay is 5200 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 1.1.1.1

Since R3 has no alternatives to R2 and thus no possible EIGRP neighbours to query further, it responds back with the Infinity metric due to split horizon rule:

  3. R2 receives Replies to all outstanding Queries, so it is able to select a loop-free route. The only available one cannot squeeze into RIB, so R2 is left with no route.

Fun fact: if you flap RIB scale config so that R2 loses the existing route, Query from R2 indicates route loss properly:

The reason for the difference seems to be simple: the initial Query is triggered by the Query from successor R1 before a RIB update is attempted (so there is no reason to specify the Infinity metric); the second Query is sent after a proper route loss from the RIB perspective. The initial Query cannot trigger a RIB update because routing information has to be updated via DUAL first. I reckon there could be two solutions to that:

  1. either send Update with Infinity metric after the route fails to be installed in RIB
    or
  2. always send Query with Infinity metric (which is the approach in EIGRP RFC).

Is it a likely failure scenario? Not really, modern networks make it difficult to end up with a metric high enough to get an out-of-bounds value. However, it’s still a valid scenario, especially in case of lousy metric engineering. The prevention is well-known – pilot testing and maintenance windows with automated predefined checks.


EIGRP SIA – why?

It’s very likely that you already know what the EIGRP stuck-in-active (SIA) feature means. Just a quick recap: if a router does not get a Reply message for a previously sent Query within the Active timer (3 minutes by default), it tears down the adjacency with the “stuck” neighbour; in the meantime the router probes its neighbours with SIA-Query, resetting the Active timer if there is an SIA-Reply from the neighbour. Sounds simple, right? Just another failsafe to protect the network from a router that might go haywire. Let me ask you a long multi-question though:

Why is SIA required, to the point that there is no way to disable it? Isn’t it enough to expire the Holddown timer on the stuck neighbour and consider its Reply unnecessary?

Well, the reply really depends on the viewpoint (Cisco’s “it depends”, uh-huh). Let’s look at an example:

In such a setup there is absolutely no way SIA would be needed. Let’s imagine that R3 stops sending EIGRP packets for some reason and 1.1.1.1/32 on R1 goes down:

  1. R1 would send a Query for 1.1.1.1/32 to R2;
  2. R2 would send a Query for 1.1.1.1/32 to R3, however, it will never get a Reply;
  3. There would be a few unsuccessful EIGRP retransmits from R2 towards R3;
  4. Either Holddown timer expires (15s by default) or number of retransmits reaches 16 (only Cisco knows how long);
  5. R2 tears down neighbourship with R3 and sends Reply back to R1;
  6. Active timer on R1 never comes even close to expiration (3 minutes) so the 1.1.1.1/32 in Active state is removed.

Remember, however, that EIGRP was designed a really long time ago – when serial links were ubiquitous. The most important feature of these links for this discussion is their relatively long distance and, as a result, high delay. Although serial links are actively being upgraded, there is still a similar kind of connection – radio links. Consider the following setup:

The only non-default thing is the serial link using Frame-Relay for encapsulation.

R1#sho run | s interface|router
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
interface FastEthernet0/0
 ip address 192.168.12.1 255.255.255.0
interface FastEthernet0/1
 ip address 192.168.14.1 255.255.255.0
router eigrp 1
 network 0.0.0.0
R2#show run | section interface|router
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
interface FastEthernet0/0
 ip address 192.168.12.2 255.255.255.0
interface Serial4/0
 ip address 192.168.23.2 255.255.255.0
 encapsulation frame-relay
 no keepalive
 frame-relay interface-dlci 100
router eigrp 1
 network 0.0.0.0
R3#show run | section interface|router
interface Loopback0
 ip address 3.3.3.3 255.255.255.255
interface Serial4/0
 ip address 192.168.23.3 255.255.255.0
 encapsulation frame-relay
 no keepalive
 frame-relay interface-dlci 100
router eigrp 1
 network 0.0.0.0

Let’s try to run the scenario without SIA involved. The feature was introduced in the 12.1(5) release, so any 12.0 software should do. Although we cannot drop Queries specifically, we can discard all unicast packets to achieve the following: drop Queries and accept Hellos. As a result, R2 would consider R3 to have failed based on the Active timer (180 seconds by default) and not on the Holddown timer (also 180 seconds by default on this kind of low-speed Frame Relay link). Although it seems like a contrived setup at first glance, I suggest holding on to it for some time.

R3#show ip access-lists
Extended IP access list NOUNICAST
    10 permit ip any 224.0.0.0 15.255.255.255
    20 deny ip any any
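
If the wildcard mask in line 10 looks cryptic, here is a minimal sketch of how it is evaluated, assuming standard IOS wildcard semantics (a set bit in the wildcard means "don't care"). The function names are mine; 224.0.0.10 is the EIGRP multicast group and 192.168.23.3 is R3's serial address.

```python
import ipaddress

def wildcard_match(addr, base, wildcard):
    """True if addr equals base in every bit the wildcard mask cares about."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)

def permitted_by_nounicast(dst):
    """Entry 10 permits 224.0.0.0/4; everything else hits the deny in entry 20."""
    return wildcard_match(dst, "224.0.0.0", "15.255.255.255")

print(permitted_by_nounicast("224.0.0.10"))    # True: multicast Hellos pass
print(permitted_by_nounicast("192.168.23.3"))  # False: unicast to R3 is dropped
```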

Now, let’s bring down 1.1.1.1/32 and activate the ACL on R3:

R3(config)#interface s4/0
R3(config-if)#ip access-group NOUNICAST in
R1(config)#interface lo0
R1(config-if)#sh

Now R1 considers the route to be in Active state.

R1# show ip eigrp topology active
IP-EIGRP Topology Table for AS(1)/ID(1.1.1.1)

Codes: P - Passive, A - Active, U - Update, Q - Query, R - Reply,
       r - Reply status

A 1.1.1.1/32, 1 successors, FD is Infinity
    1 replies, active 00:00:07, query-origin: Local origin
      Remaining replies:
         via 192.168.12.2, r, FastEthernet0/0

After 3 minutes R1 should flush the route because by that moment it has received no Reply from R2 as there was no response from R3. However, this is not the case:

R1#show ip eigrp topology active 
IP-EIGRP Topology Table for AS(1)/ID(1.1.1.1)

Codes: P - Passive, A - Active, U - Update, Q - Query, R - Reply,
       r - Reply status

A 1.1.1.1/32, 1 successors, FD is Inaccessible
    1 replies, active 00:03:05, query-origin: Local origin
         via Connected (Infinity/Infinity), Loopback0
    Remaining replies:
         via 192.168.12.2, r, FastEthernet0/0
R1#show ip eigrp topology active 
IP-EIGRP Topology Table for AS(1)/ID(1.1.1.1)

Is there anything wrong with the configuration? I don’t think so. However, let’s get back to the failure condition based on the Active timer instead of the Holddown timer. Imagine that there are a bunch of other routers between R1 and R2, all using serial links and thus contributing to the overall delay. Could the gap between 1.1.1.1/32 going down (which starts the Active timer) and the last Hello from R3 arriving (which refreshes the Holddown timer) be small enough to be covered completely by that delay? Definitely so:

  1. Although R2 might terminate neighbourship with R3 after 180 seconds, there is still a propagation delay for that event to reach R1.
  2. With a bit of “luck”, the last Hello and the disappearance of 1.1.1.1/32 would line up.

As soon as R2 prepares the Reply to be sent back to R1, Active timer on R1 expires and R1 resets the neighbourship with R2, at least according to the description of DUAL. As you could imagine, such a behaviour causes chain flapping of EIGRP neighbourships all around the network, just because there are high-delay links and a rogue malfunctioning router.
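
The race is easy to put into numbers. Below is a rough sketch of the thought experiment, not of real IOS behaviour: all times are seconds counted from the moment R1 puts the route into Active state, the function and parameter names are mine, and the delay values are hypothetical.

```python
def outcome_on_r1(active_timer, r2_gives_up_at, one_way_delay):
    """Decide what R1 sees first: the Reply from R2 or its own timer expiry."""
    query_at_r2 = one_way_delay
    # R2 can only reply once it has both the Query and a verdict on R3.
    reply_sent = max(query_at_r2, r2_gives_up_at)
    reply_at_r1 = reply_sent + one_way_delay
    return "reply in time" if reply_at_r1 < active_timer else "R1 resets R2"

# LAN case: R2 drops R3 after a 15 s hold timer, the delay is negligible.
print(outcome_on_r1(180, 15, 0.001))  # reply in time
# Slow-link case: R2 needs the full 180 s and the path adds half a second.
print(outcome_on_r1(180, 180, 0.5))   # R1 resets R2
```

Once both timers are 180 seconds, any non-zero delay between R1 and R2 is enough to lose the race.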

So why did we filter only unicast packets instead of dropping all EIGRP datagrams? Well, it would have required me to initiate the events at the same time, right after the last Hello from R3 was received. Although it’s possible with some automation, using the Active timer instead removed the delay between my brain and the keyboard completely from the equation while still providing us with the same result.

However, that’s not what we observed during the test. I’ll have to speculate a little bit here as I don’t have a strict explanation for it, only a suggestion.

  1. It’s possible to alleviate the problem by increasing the gap between default values of Active and Holddown timers. However, feasibility of such a method really depends on the total delay between the routers so I’d consider it to be a workaround. It seems that IOS 12.0 implements exactly this behaviour; version 11 could have provided different results but I could not find the image.
  2. The proper solution to the problem at hand is SIA. The idea is simple: separate prefix availability check (Query) from neighbour availability check (SIA-Query). Such an approach incurs no tangible dependency on total delay compared to timer tuning. Besides, it is generally a good idea to separate functions and not to overload them extensively.

Does it really matter in the modern world, especially since SIA cannot be disabled? Most likely not, to be honest, unless you run a very outdated IOS version (SIA would be the least of your concerns in such a case though). Understanding the reason for a feature to be implemented makes me feel good – so maybe such knowledge would make someone else feel good as well.

Kudos for review: Anastasiia Kuraleva


BGP best path selection in L3VPN: hidden pitfall

If you have ever configured MPLS L3VPN, it should raise no doubt that BGP is the tool the whole setup revolves around. As a protocol with a strong sense of dignity (after all, the Internet is built on it), it has a fairly long list of decision-making points called the best path selection algorithm. Despite the horrifying length of the list, most of the items are mere tie-breakers rather than knobs used for traffic engineering. Sometimes, however, the most ubiquitous attributes are not the right ones for the job. If you see EIGRP in an L3VPN environment – beware, it might be the case we are going to discuss in this article.

Let’s start with a sample topology:

R1-R3 are PE-routers that run LDP in the core along with OSPF. They also provide connectivity between sites using various routing protocols: R1↔R4 ≡ OSPF, R2↔R5 ≡ EIGRP, R3↔R6 ≡ eBGP. Client devices R4-R6 reside in a single VRF. R4 and R5 provide R6 with access to some service, reachable via 8.8.8.8/32.

Sample config from R1:

R1(config)#vrf definition A
R1(config-vrf)# rd 1:1
R1(config-vrf)# route-target export 1:1
R1(config-vrf)# route-target import 1:1
R1(config-vrf)# address-family ipv4
R1(config)#interface Loopback0
R1(config-if)# ip address 1.1.1.1 255.255.255.255
R1(config)#interface FastEthernet0/0
R1(config-if)# vrf forwarding A
R1(config-if)# ip address 192.168.14.1 255.255.255.0        
R1(config)#interface FastEthernet1/0
R1(config-if)# ip address 192.168.13.1 255.255.255.0
R1(config)#interface FastEthernet1/1
R1(config-if)# ip address 192.168.12.1 255.255.255.0
R1(config)#router ospf 2 vrf A
R1(config-router)# redistribute bgp 123 subnets
R1(config-router)# network 0.0.0.0 255.255.255.255 area 0
R1(config)#router ospf 1
R1(config-router)# mpls ldp autoconfig
R1(config-router)# router-id 1.1.1.1
R1(config-router)# network 0.0.0.0 255.255.255.255 area 0
R1(config)#router bgp 123
R1(config-router)# bgp router-id 1.1.1.1
R1(config-router)# no bgp default ipv4-unicast
R1(config-router)# neighbor L3VPN peer-group
R1(config-router)# neighbor L3VPN remote-as 123
R1(config-router)# neighbor L3VPN update-source Loopback0
R1(config-router)# neighbor 2.2.2.2 peer-group L3VPN
R1(config-router)# neighbor 3.3.3.3 peer-group L3VPN
R1(config-router)# address-family vpnv4
R1(config-router-af)#  neighbor L3VPN send-community both
R1(config-router-af)#  neighbor 2.2.2.2 activate
R1(config-router-af)#  neighbor 3.3.3.3 activate
R1(config-router-af)# exit-address-family
R1(config-router)# address-family ipv4 vrf A
R1(config-router-af)#  redistribute ospf 2

As for CE routers, their config is even more ascetic:

R6(config)#interface Loopback0
R6(config-if)# ip address 6.6.6.6 255.255.255.255
R6(config)#interface FastEthernet0/0
R6(config-if)# ip address 192.168.36.6 255.255.255.0
R6(config)#router bgp 6
R6(config-router)# bgp router-id 6.6.6.6
R6(config-router)# no bgp default ipv4-unicast
R6(config-router)# neighbor 192.168.36.3 remote-as 123
R6(config-router)# address-family ipv4
R6(config-router-af)#  network 6.6.6.6 mask 255.255.255.255
R6(config-router-af)#  neighbor 192.168.36.3 activate

This lab uses loopbacks to emulate the service:

R4(config)#interface Loopback1
R4(config-if)# ip address 8.8.8.8 255.255.255.255
R5(config)#interface Loopback1
R5(config-if)# ip address 8.8.8.8 255.255.255.255

Let’s see first whether R6 is able to reach 8.8.8.8 at all:

R6#ping 8.8.8.8 source loopback 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
Packet sent with a source address of 6.6.6.6 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 52/60/80 ms

The key point of this setup is to have redundant paths from R6 to 8.8.8.8/32 via R4 and R5, that’s why there are direct BGP sessions between R1-R3 and R2-R3. Both R1 and R2 should consider IGP prefixes to be the best in BGP RIB (they are locally originated routes) thus R3 should have 2 prefixes in BGP RIB, one for each peer.

R3#sho bgp vpnv4 unicast all 
BGP table version is 13, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, 
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter, 
              x best-external, a additional-path, c RIB-compressed, 
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf A)
 *>i 4.4.4.4/32       1.1.1.1                  2    100      0 ?
 *>i 5.5.5.5/32       2.2.2.2             103040    100      0 ?
 *>  6.6.6.6/32       192.168.36.6             0             0 6 i
 *>i 8.8.8.8/32       2.2.2.2             103040    100      0 ?
 *>i 192.168.14.0     1.1.1.1                  0    100      0 ?
 *>i 192.168.25.0     2.2.2.2                  0    100      0 ?

Somehow R3 receives only a single prefix instead of two. It might be possible that R1 does not send any update regarding 8.8.8.8/32:

R1#sho bgp vpnv4 unicast all
BGP table version is 13, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, 
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter, 
              x best-external, a additional-path, c RIB-compressed, 
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf A)
 *>  4.4.4.4/32       192.168.14.4             2         32768 ?
 *>i 5.5.5.5/32       2.2.2.2             103040    100      0 ?
 *>i 6.6.6.6/32       3.3.3.3                  0    100      0 6 i
 r>i 8.8.8.8/32       2.2.2.2             103040    100      0 ?
 r                    192.168.14.4             2         32768 ?
 *>  192.168.14.0     0.0.0.0                  0         32768 ?
 *>i 192.168.25.0     2.2.2.2                  0    100      0 ?

R1 considers 8.8.8.8/32 from R2 to be the best route (symbol ‘>’). Since the aforementioned prefix is received via iBGP, it’s not forwarded to any normal iBGP peer. Why doesn’t R1 pick the local route though? According to the best path selection, only weight and local preference take precedence over locally originated routes. Besides, local routes also have weight set to 32768 by default, so 8.8.8.8/32 from OSPF should be selected as the best (32768 vs 0), not the one from R2!

R1#sho bgp vpnv4 unicast all 8.8.8.8/32
BGP routing table entry for 1:1:8.8.8.8/32, version 13
Paths: (2 available, best #1, table A, RIB-failure(17) - next-hop mismatch)
  Not advertised to any peer
  Refresh Epoch 1
  Local
    2.2.2.2 (metric 2) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 103040, localpref 100, valid, internal, best
      Extended Community: RT:1:1 Cost:pre-bestpath:128:103040 0x8800:32768:0 
        0x8801:1:2560 0x8802:65281:25600 0x8803:65281:1500 0x8806:0:134744072
      mpls labels in/out nolabel/22
  Refresh Epoch 1
  Local
    192.168.14.4 from 0.0.0.0 (1.1.1.1)
      Origin incomplete, metric 2, localpref 100, weight 32768, valid, sourced
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:192.168.14.1:0

Since it’s quite tedious to remember numbers by heart, it’s nice to have some cheat sheet with community format for OSPF and EIGRP at hand. However, the culprit we are looking for has a self-explanatory name – Cost:pre-bestpath:128:103040.

The purpose of cost community with EIGRP in L3VPN is relatively simple. For example, consider two sites, A and B, that are connected with fast L3VPN as a primary path and a slow leased line as a backup path. At some point Prefix-A emerges within Site-A and propagates across these paths. In such a case there is a race condition present:

  1. Update about Prefix-A reaches Site-B via MPLS backbone first. Site-B PE installs the prefix into BGP RIB, redistributes it into EIGRP (reconstructing metric with communities) and announces the prefix within Site-B. Since leased line has an unfavourable metric, the path through MPLS backbone would be selected by site-B as expected.
  2. Update about Prefix-A reaches Site-B via leased line first. Site-B PE imports the prefix into BGP RIB. When BGP update from Site-A finally reaches PE on Site-B, this new prefix would not be considered the best because there would be a better, locally originated, one. In such a case, Site-B would use leased line as a primary path towards Prefix-A.

Cost community remedies such a problem. It allows BGP to consider a certain metric at a specific point in best path selection algorithm (in our case, before the whole process). EIGRP cost is used as a value for such a metric thus the shortest path is always selected.

Although the aforementioned approach works well in a pure EIGRP L3VPN environment, it has an obvious side effect – it breaks the usual path selection process. As you could imagine, OSPF does not use cost community at all, so its routes get the default value of 0x7FFFFFFF (2147483647), which is much higher than a typical EIGRP metric.

At first glance, the problem could be fixed by changing EIGRP metric for 8.8.8.8/32 to be higher than 0x7FFFFFFF; however, it would only reverse the issue because at this point R2 would compare OSPF default cost community metric with an awful cost of the prefix imported from EIGRP.

A possible solution could be to give up cost community altogether; however, there is no direct command to stop generating it yet. The available knob, though, makes the router completely ignore cost community during best path selection:

R1(config)#router bgp 123
R1(config-router)#bgp bestpath cost-community ignore 
R3#sho bgp vpnv4 unicast all 
BGP table version is 26, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, 
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter, 
              x best-external, a additional-path, c RIB-compressed, 
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf A)
 *>i 4.4.4.4/32       1.1.1.1                  2    100      0 ?
 *>i 5.5.5.5/32       2.2.2.2             103040    100      0 ?
 *>  6.6.6.6/32       192.168.36.6             0             0 6 i
 * i 8.8.8.8/32       1.1.1.1                  2    100      0 ?
 *>i                  2.2.2.2             103040    100      0 ?
 *>i 192.168.14.0     1.1.1.1                  0    100      0 ?
 *>i 192.168.25.0     2.2.2.2                  0    100      0 ?

Although we managed to make R1 ignore cost community, R3 requires the same config; otherwise it would always select EIGRP-originated route despite other BGP attributes.
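
To see why both routers need the knob, here is a deliberately simplified sketch of the decision. It models only the steps that matter in this lab (pre-bestpath cost, then weight, local preference and MED) and skips everything in between, since those attributes tie here. The class, function and flag names are mine; the attribute values are copied from the show outputs in this post.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_COST = 0x7FFFFFFF  # assumed for paths without a cost community

@dataclass
class Path:
    next_hop: str
    weight: int
    local_pref: int
    med: int
    cost: Optional[int] = None  # Cost:pre-bestpath value, if present

def best_path(paths, ignore_cost_community=False):
    """Pick the best path: lowest cost first, then the usual attributes."""
    def key(p):
        if ignore_cost_community:
            cost = 0
        else:
            cost = DEFAULT_COST if p.cost is None else p.cost
        # Lower cost, higher weight, higher local preference, lower MED.
        return (cost, -p.weight, -p.local_pref, p.med)
    return min(paths, key=key)

from_r2 = Path("2.2.2.2", weight=0, local_pref=100, med=103040, cost=103040)
r1_local = Path("192.168.14.4", weight=32768, local_pref=100, med=2)
from_r1 = Path("1.1.1.1", weight=0, local_pref=100, med=2)

print(best_path([from_r2, r1_local]).next_hop)        # 2.2.2.2
print(best_path([from_r2, r1_local], True).next_hop)  # 192.168.14.4
print(best_path([from_r2, from_r1]).next_hop)         # 2.2.2.2
print(best_path([from_r2, from_r1], True).next_hop)   # 1.1.1.1
```

The first two lines are R1's view, the last two are R3's: without the knob the EIGRP-originated path wins on cost before weight or MED is even looked at.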

R3(config)#router bgp 123
R3(config-router)#bgp bestpath cost-community ignore 
R3#sho bgp vpnv4 unicast all 
BGP table version is 29, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, 
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter, 
              x best-external, a additional-path, c RIB-compressed, 
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf A)
 *>i 4.4.4.4/32       1.1.1.1                  2    100      0 ?
 *>i 5.5.5.5/32       2.2.2.2             103040    100      0 ?
 *>  6.6.6.6/32       192.168.36.6             0             0 6 i
 *>i 8.8.8.8/32       1.1.1.1                  2    100      0 ?
 * i                  2.2.2.2             103040    100      0 ?
 *>i 192.168.14.0     1.1.1.1                  0    100      0 ?
 *>i 192.168.25.0     2.2.2.2                  0    100      0 ?

At last BGP RIB selection adheres to the common path selection rules: lower MED wins as expected. An alternative approach would be to remove cost community manually from announced updates:

R2(config)#ip extcommunity-list 100 permit pre-bestpath
R2(config)#route-map COSTCOMM
R2(config-route-map)#set extcomm-list 100 delete
R2(config)#router bgp 123
R2(config-router)#address-family vpnv4
R2(config-router-af)#neighbor L3VPN route-map COSTCOMM out 
R2#sho bgp vpnv4 unicast all extcommunity-list 100
BGP table version is 23, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, 
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter, 
              x best-external, a additional-path, c RIB-compressed, 
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf A)
 *>  5.5.5.5/32       192.168.25.5        103040         32768 ?
 *>  8.8.8.8/32       192.168.25.5        103040         32768 ?
 *>  192.168.25.0     0.0.0.0                  0         32768 ?

R3#sho bgp vpnv4 unicast all 8.8.8.8/32
BGP routing table entry for 1:1:8.8.8.8/32, version 46
Paths: (2 available, best #1, table A)
  Advertised to update-groups:
     1         
  Refresh Epoch 5
  Local
    1.1.1.1 (metric 2) from 1.1.1.1 (1.1.1.1)
      Origin incomplete, metric 2, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:192.168.14.1:0
      mpls labels in/out nolabel/23
  Refresh Epoch 1
  Local
    2.2.2.2 (metric 2) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 103040, localpref 100, valid, internal
      Extended Community: RT:1:1 0x8800:32768:0 0x8801:1:2560 
        0x8802:65281:25600 0x8803:65281:1500 0x8806:0:134744072
      mpls labels in/out nolabel/23
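
The effect of stripping the cost community can be sketched in a few lines of Python. This is a deliberately reduced model, not the full BGP decision process: it assumes that a pre-bestpath cost community is compared before anything else (lower cost wins), that a path without the community gets a default cost of 2147483647, and that only local preference and MED follow. The class and function names are invented for illustration, and the cost value attached to the R2 path is illustrative too.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: cost applied to paths that carry no cost community.
DEFAULT_COST = 2147483647


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    med: int
    pre_bestpath_cost: Optional[int] = None  # None = community absent or stripped


def best_path(paths: List[Path]) -> Path:
    """Pick the best path: cost community first, then local preference, then MED."""
    def rank(path: Path):
        cost = DEFAULT_COST if path.pre_bestpath_cost is None else path.pre_bestpath_cost
        # Lower cost, higher local preference, lower MED are preferred.
        return (cost, -path.local_pref, path.med)
    return min(paths, key=rank)


via_r1 = Path("1.1.1.1", local_pref=100, med=2)
via_r2_tagged = Path("2.2.2.2", local_pref=100, med=103040, pre_bestpath_cost=103040)
via_r2_stripped = Path("2.2.2.2", local_pref=100, med=103040)

print(best_path([via_r1, via_r2_tagged]).next_hop)    # community present
print(best_path([via_r1, via_r2_stripped]).next_hop)  # community removed
```

With the community in place the path via 2.2.2.2 wins regardless of its MED; once it is removed, the comparison falls through to the MED and the path via 1.1.1.1 is selected, which matches the output on R3.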

The purpose of this article is twofold: to introduce the cost community and to practice hands-on troubleshooting. Remember, there is no magic in technology, only defaults.

Kudos for review: Anastasiia Kuraleva


EIGRP RID

This lab shows the network discrepancies that duplicate router IDs (RIDs) cause in an EIGRP domain.

topology

Configuration (Part 1)

1) Redistribute loopbacks on R1 and R2 into EIGRP.
2) Shut down any loopback.

Task (Part 1)

1) Identify the discrepancy in the network.
2) Find out how the RID is elected.
3) Suggest a solution to the problem.

Observations (Part 1)

On redistribution, EIGRP adds the router ID to external prefixes. A router ignores updates that carry its own RID in order to avoid routing loops. Although no log message is issued, there is an entry in the EIGRP events: “Ignored route, dup routerid ext: 8.8.8.8”

Configuration (Part 2)

1) Re-enable the loopback interface that was shut down.
2) Instead of redistributing the loopbacks, add them to the EIGRP domain as internal routes.
3) Shut down any loopback once again.

Task (Part 2)

1) Identify the discrepancy.
2) Fix the problem.

Observations (Part 2)

Starting with the IOS 15 train, EIGRP adds the RID to internal prefixes too, so the behaviour is similar to that of external routes. Older IOS versions, however, accept such updates: no RID is present, so nothing is checked. The EIGRP event message is only slightly different: “Ignored route, dup routerid int: 8.8.8.8”
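
The check described in both parts boils down to a single comparison. Below is a minimal Python sketch of it, assuming that an update either carries an originating RID or does not (the latter models internal routes on older IOS versions). The names RouteUpdate and accept_update, as well as the prefix 10.0.0.2/32, are hypothetical and serve illustration only; this is not how IOS is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RouteUpdate:
    prefix: str
    external: bool
    originating_rid: Optional[str]  # None = no RID attached to the update


def accept_update(local_rid: str, update: RouteUpdate) -> bool:
    """Return False when the update carries the receiver's own router ID."""
    if update.originating_rid is None:
        # Nothing to compare against, so the update is accepted.
        return True
    return update.originating_rid != local_rid


local_rid = "8.8.8.8"

# Part 1: external route redistributed by a router with the same RID.
print(accept_update(local_rid, RouteUpdate("10.0.0.2/32", True, "8.8.8.8")))
# Part 2, older IOS: internal route without a RID.
print(accept_update(local_rid, RouteUpdate("10.0.0.2/32", False, None)))
# Part 2, IOS 15: internal route that carries the duplicate RID.
print(accept_update(local_rid, RouteUpdate("10.0.0.2/32", False, "8.8.8.8")))
```

Only the second update is accepted, which is why the same misconfiguration goes unnoticed for internal routes on older software.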

IOS image used: c7200-adventerprisek9-mz.152-4.M11.image

Resources

https://www.cisco.com/c/en/us/support/docs/ip/enhanced-interior-gateway-routing-protocol-eigrp/18685-eigrp-dup-id.html

https://www.cisco.com/c/en/us/support/docs/ip/enhanced-interior-gateway-routing-protocol-eigrp/118974-technote-eigrp-00.html#anc28


EIGRP Feasible Distance definition explained

EIGRP is a distance-vector protocol originally developed by Cisco. One of the major differences from its predecessor, IGRP, is DUAL, an algorithm that ensures no permanent loops exist in the EIGRP topology. However, a precise definition of one of the key parameters, feasible distance (FD), is hard to come by. Here is the definition from the official website:

Feasible distance is the best metric along a path to a destination network, including the metric to the neighbor advertising that path.

Enhanced Interior Gateway Routing Protocol, https://www.cisco.com

Such a definition covers most cases but not all of them. The correct one is out there in the wild; however, it’s nice to figure something out by ourselves. As usual, the topology:

topology

Each of the routers has a loopback (e.g. R1 – 1.1.1.1/32) and EIGRP is used as an IGP with no special configuration:

R3#sho run | section router eigrp
router eigrp 1
 network 0.0.0.0

In this article we will look at 3.3.3.3/32 from R1’s perspective:

R1#deb eigrp fsm
EIGRP Finite State Machine debugging is on

R1#sho ip eigrp topology 3.3.3.3/32
EIGRP-IPv4 Topology Entry for AS(1)/ID(1.1.1.1) for 3.3.3.3/32
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 158720
  Descriptor Blocks:
  192.168.12.2 (FastEthernet0/0), from 192.168.12.2, Send flag is 0x0
      Composite metric is (158720/156160), route is Internal
      Vector metric:
        Minimum bandwidth is 100000 Kbit
        Total delay is 5200 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 3.3.3.3
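
The composite metric in the output above can be reproduced by hand. Here is a minimal Python sketch, assuming classic (32-bit) EIGRP metrics with default K values (K1 = K3 = 1, the rest 0). The function name is ours, and the second call assumes that the R1-R2 link contributes the default FastEthernet delay of 100 microseconds.

```python
def eigrp_classic_metric(min_bandwidth_kbit: int, total_delay_usec: int) -> int:
    """Return 256 * (10^7 / minimum bandwidth in Kbit + delay in tens of usec)."""
    scaled_bandwidth = 10_000_000 // min_bandwidth_kbit  # integer division
    scaled_delay = total_delay_usec // 10                # tens of microseconds
    return 256 * (scaled_bandwidth + scaled_delay)


# Values from "show ip eigrp topology 3.3.3.3/32" on R1.
print(eigrp_classic_metric(100_000, 5200))  # metric as seen by R1
# Same path without the assumed 100 usec of the R1-R2 link (distance reported by R2).
print(eigrp_classic_metric(100_000, 5100))
```

The two results, 158720 and 156160, match the pair shown in the "Composite metric" line. Every extra 10 microseconds of delay adds 256 to the metric.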

With default K values there are two ways to change the EIGRP metric: change the bandwidth or change the delay. In our case (as in most cases) it’s more predictable to change the delay. Let’s increase the cost of the R2-R3 link:

R2(config-if)#delay 100

As expected, R1 loses the only valid route towards 3.3.3.3/32 and puts the prefix into Active state:

R1#
*Mar  2 20:17:07.655: EIGRP-IPv4(1): rcvupdate: 3.3.3.3/32 via 192.168.12.2 metric 181760/179200 on tid 0
*Mar  2 20:17:07.659: EIGRP-IPv4(1): Find FS for dest 3.3.3.3/32. FD is 158720, RD is 158720 on tid 0
*Mar  2 20:17:07.659: EIGRP-IPv4(1): 	192.168.12.2 metric 181760/179200 not found Dmin is 181760
*Mar  2 20:17:07.659: DUAL: AS(1) Peer total 1 stub 0 template 1 for tid 0
*Mar  2 20:17:07.659: DUAL: AS(1) Dest 3.3.3.3/32 entering active state for tid 0.
*Mar  2 20:17:07.659: EIGRP-IPv4(1): Set reply-status table. Count is 1.
*Mar  2 20:17:07.659: EIGRP-IPv4(1): Not doing split horizon
*Mar  2 20:17:07.759: EIGRP-IPv4(1): rcvreply: 3.3.3.3/32 via 192.168.12.2 metric 181760/179200 for tid 0
*Mar  2 20:17:07.759: EIGRP-IPv4(1): reply count is 1
*Mar  2 20:17:07.759: DUAL: AS(1) Clearing handle 0, count now 0
*Mar  2 20:17:07.759: DUAL: AS(1) Freeing reply status table
*Mar  2 20:17:07.759: EIGRP-IPv4(1): Find FS for dest 3.3.3.3/32. FD is 72057594037927935, RD is 181760 on tid 0found
*Mar  2 20:17:07.759: DUAL: AS(1) RT installed 3.3.3.3/32 via 192.168.12.2

The FD has changed as well and now matches the current metric towards 3.3.3.3/32:

R1#sho ip eigrp topology 3.3.3.3/32
EIGRP-IPv4 Topology Entry for AS(1)/ID(1.1.1.1) for 3.3.3.3/32
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 181760
  Descriptor Blocks:
  192.168.12.2 (FastEthernet0/0), from 192.168.12.2, Send flag is 0x0
      Composite metric is (181760/179200), route is Internal
      Vector metric:
        Minimum bandwidth is 100000 Kbit
        Total delay is 6100 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 3.3.3.3

Let’s try changing delay on R2 to a different value, resetting our previous changes beforehand. Current delay for f0/1 on R2:

R2#sho int f0/1
FastEthernet0/1 is up, line protocol is up 
  Hardware is i82543 (Livengood), address is ca02.0ebd.0006 (bia ca02.0ebd.0006)
  Internet address is 192.168.23.2/24
  MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
<output omitted>

We’ve changed the delay to 1000 us (remember, the configured value is in tens of microseconds), which is far more than the initial delay. Now let’s try a minimal change, to 110 us:

R2(config-if)#delay ?     
  <1-16777215>  Throughput delay (tens of microseconds)

R2(config-if)#delay 11

Whenever you enter a number, it’s a good idea to spend a second of precious time and hit the question mark to see the actual units of measurement; such a habit might save quite some troubleshooting time once in a while. Let’s check on R1:

R1#
*Mar  2 20:25:40.227: EIGRP-IPv4(1): rcvupdate: 3.3.3.3/32 via 192.168.12.2 metric 158976/156416 on tid 0
*Mar  2 20:25:40.231: EIGRP-IPv4(1): Find FS for dest 3.3.3.3/32. FD is 158720, RD is 158720 on tid 0
*Mar  2 20:25:40.231: EIGRP-IPv4(1): 	192.168.12.2 metric 158976/156416 found Dmin is 158976
*Mar  2 20:25:40.239: DUAL: AS(1) RT installed 3.3.3.3/32 via 192.168.12.2

The debug output is way smaller this time. What about FD?

R1#sho ip eigrp topology 3.3.3.3/32
EIGRP-IPv4 Topology Entry for AS(1)/ID(1.1.1.1) for 3.3.3.3/32
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 158720
  Descriptor Blocks:
  192.168.12.2 (FastEthernet0/0), from 192.168.12.2, Send flag is 0x0
      Composite metric is (158976/156416), route is Internal
      Vector metric:
        Minimum bandwidth is 100000 Kbit
        Total delay is 5210 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 3.3.3.3

FD does NOT match the metric! What’s the catch? An attentive reader might have noticed a difference between these 2 cases besides distinct delay values. If not, let’s dissect them step by step.

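          Delay 1000 us (delay 100)                                     Delay 110 us (delay 11)
Step 1    R1 receives an update about 3.3.3.3/32                        R1 receives an update about 3.3.3.3/32
Step 2    R1 searches for a feasible successor (FS) for 3.3.3.3/32      R1 searches for a feasible successor (FS) for 3.3.3.3/32
Step 3    R1 does not find an FS and triggers DUAL                      R1 finds an FS and installs the route
Step 4    After completing DUAL, R1 selects and installs the best route -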

As you can see, the key difference is whether DUAL is triggered. The large delay increase on R2 (delay 100, i.e. 1000 us) breaks the feasibility condition on R1: the new advertised distance (AD, shown as RD in the debug output) of 179200 is not lower than the FD of 158720. Having no feasible successor, R1 is forced to run DUAL to find a loop-free route. The small increase (delay 11, i.e. 10 us more), however, keeps the AD (156416) low enough to satisfy the feasibility condition on R1, because the added delay is less than the delay of the R1-R2 link. Since there is an FS, EIGRP can install the corresponding route without running the laborious DUAL. The debug output also shows the exact FD value that is used for finding an FS; it is infinity once DUAL is complete. Basically, FD is the historically lowest value of the metric, and it is reset to infinity as soon as the route is put into Active state.
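
The two cases can be replayed with a toy model. This is only a sketch under simplifying assumptions: a single neighbor, a reply that carries the same metric as the update, and class and method names invented for illustration. It is not how IOS implements DUAL.

```python
INFINITY = 72057594037927935  # FD value printed by "debug eigrp fsm" after DUAL


class Route:
    """Tracks the metric and FD of one prefix learned from one neighbor."""

    def __init__(self, metric: int) -> None:
        self.metric = metric
        self.fd = metric
        self.went_active = False

    def receive_update(self, metric: int, reported_distance: int) -> None:
        if reported_distance < self.fd:
            # Feasibility condition holds: stay Passive, FD may only decrease.
            self.went_active = False
            self.fd = min(self.fd, metric)
        else:
            # No feasible successor: go Active, restart FD from infinity.
            self.went_active = True
            self.fd = min(INFINITY, metric)
        self.metric = metric


large = Route(158720)
large.receive_update(metric=181760, reported_distance=179200)  # delay 100
small = Route(158720)
small.receive_update(metric=158976, reported_distance=156416)  # delay 11

print(large.went_active, large.metric, large.fd)
print(small.went_active, small.metric, small.fd)
```

In the first case the route goes Active and the FD ends up equal to the new metric (181760). In the second case the route stays Passive and the FD remains 158720 even though the metric is now 158976, exactly as in the outputs on R1.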

Here is a nice and precise definition of feasible distance borrowed from Cisco Community:

Feasible Distance is the lowest distance to the destination experienced since the last time the route went from Active to Passive state

Peter Paluch

Although the practical value of such knowledge might seem negligible, it always warms one’s heart to unravel something new. I would like to encourage everyone to spread such insights further within the community to foster awareness of the details of the technology.

Kudos for review to: Anastasiia Kuraleva, Maxim Klimanov
