A network technician often has to troubleshoot the network. Errors can have many causes, and it is sometimes hard to resolve a problem quickly. In this article we’ll focus on one specific area of troubleshooting: problems related to IPv4 routing. We’ll briefly summarize the most common causes, the symptoms of the errors, the methods for finding the problem, and how to fix it. Basic troubleshooting skills are important at the CCNA level, especially if you want to go on to CCNP, which devotes a complete part to this topic.
From a small network to the global Internet, we need some method of finding the best path for data packets: this process is called routing. Routers are the devices that decide on the proper path. A router can learn a route to a network from various sources:
The network can be directly connected; the router learns it on its own when the interface attached to that network comes up.
The administrator can configure the route statically; this method is simple and sometimes the best solution, especially on a small network.
And finally, the router can learn a path from another router via a routing protocol; this is the usual case on a bigger network.
Let’s continue with a lab topology. We can use it to demonstrate how these methods work, to introduce some errors intentionally, and to troubleshoot them. We don’t need a complicated topology (I like to create a difficult task with very few devices, anyway):
As we can see, there are three routers (connected by Ethernet and serial interfaces), two PCs, and an Internet connection. The latter is simulated by a loopback interface on R2, with the IP address 126.96.36.199/32. The other IP addresses are assigned by the following logic: the second octet identifies the two interconnected routers, the last octet is the router ID (e.g., on R1 the serial interface toward R3 has the address 10.13.0.1), and on the LANs the routers have the first usable address.
The first exercise is very simple, but it introduces an important troubleshooting tool: the router’s debugging feature. If we want a closer look at what happens inside the router during specific tasks, we can enable debugging. The router then prints far more information to the console than during normal operation. Debugging also consumes more of the router’s resources, so it’s best practice to leave it on only until we find the problem, and then switch it off. Now use the debug ip routing command to observe what happens when we bring an interface up:
Look at the rows highlighted in yellow. We get two important pieces of information. First, the network has an administrative distance of 0 and a metric of 0, which marks it as a connected route. Second, the output interface is Fa0/0. The same information can be seen with the show ip route command, extended by the connected parameter. Connected routes leave little room for error: if we configure the proper IP address and mask, and we don’t forget to bring up the interface, they just work. If we don’t see them in the routing table, we should check the cabling and the interface status.
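As a rough sketch of this first exercise, the sequence below enables the debug, brings up an interface, and checks the result. It assumes that R1’s Fa0/0 faces R2 and uses 10.12.0.1, which is derived from the addressing scheme described above; the /24 mask is an assumption, since the article does not state the masks of the interconnecting links.

```
! Privileged EXEC mode on R1: report routing table changes on the console
debug ip routing
!
configure terminal
 interface FastEthernet0/0
  ! Address follows the lab scheme; the /24 mask is assumed
  ip address 10.12.0.1 255.255.255.0
  no shutdown
 end
!
! Confirm the connected route, then switch debugging off again
show ip route connected
undebug all
```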
Static routing requires the administrator to manually configure the information needed to reach a particular network. This is done by specifying the destination network and netmask, plus the output interface and/or the next-hop IP address. On a broadcast network, like Ethernet, we can set both of them, which avoids the recursive lookup in the routing table.
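The three forms of the static route command might look like this on R1. They are alternatives for illustration, not lines to enter together. The interface names (Serial0/0/0 toward R3, FastEthernet0/0 toward R2) and the /24 mask on the R2-R3 link are assumptions, because the topology figure does not spell them out here.

```
! Next hop only: the router resolves the outgoing interface recursively
ip route 192.168.1.128 255.255.255.128 10.13.0.3
!
! Outgoing interface only: reasonable on a point-to-point serial link
ip route 192.168.1.128 255.255.255.128 Serial0/0/0
!
! Interface and next hop together: no recursive lookup needed on Ethernet
ip route 10.23.0.0 255.255.255.0 FastEthernet0/0 10.12.0.2
```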
With static routing, the typical problems are a logically wrong configuration and a reachability issue with the next-hop router. Let’s look at the first. In the topology we want a static route from R1 to the 192.168.1.128/25 LAN. The correct setup, when we want to use the shortest path, is across R3, with the next-hop address 10.13.0.3. Now we simulate an error: we set the next-hop address to 10.23.0.3. The difference is just one digit, and moreover this is an existing address on R3. First, turn off all debugging with the undebug all command, then turn debug ip routing back on.
As we can see, the debug doesn’t show anything, and the route doesn’t appear in the routing table. This is evidence that something is wrong: a static route is installed only if its next hop can be resolved, and here R1 apparently has no route to the network that contains 10.23.0.3. If you encounter this kind of problem, double-check the configuration with the show running-config command. It also doesn’t hurt to check the reachability of the next hop by pinging that address.
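The faulty entry and its correction could be sketched as follows. This assumes that R1 has no other route covering 10.23.0.3 at this point, which is what the missing routing table entry suggests.

```
! Global configuration mode on R1
! Faulty entry: 10.23.0.3 exists on R3, but R1 cannot resolve it as a next hop
ip route 192.168.1.128 255.255.255.128 10.23.0.3
!
! Fix: remove it and point to R3's address on the link shared with R1
no ip route 192.168.1.128 255.255.255.128 10.23.0.3
ip route 192.168.1.128 255.255.255.128 10.13.0.3
end
!
! Privileged EXEC mode: verify the route and the next hop
show ip route static
ping 10.13.0.3
```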
So far we have used the simplest methods to teach our routers, but life is difficult and so is a router’s life. On a bigger network with a lot of changes, it’s impractical to rely on static routing alone. We need a routing protocol to handle the changes dynamically. At the CCNA level we learn about the IGP family of protocols, namely RIP, OSPF, and EIGRP. Some problems and configuration errors are common to all of these protocols.
For example, we can forget to advertise a network in the configuration. The other routers then have no idea about our internal LANs. We also need to include the interconnecting networks, over which the routing advertisements travel between the routers. The usual symptom is that some networks are missing from the routing tables.
Problems of this kind are relatively simple to troubleshoot. We can look at the output of the show ip protocols command and check the Routing for Networks section, where all of the advertised networks should be listed. Another method is to look at the output of the show running-config command and check the protocol’s section.
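A minimal sketch of a complete set of network statements, assuming RIPv2 on R1. RIP network statements are classful, so a single 10.0.0.0 entry covers both interconnecting links.

```
configure terminal
 router rip
  version 2
  ! Interconnecting links (10.12.0.0 and 10.13.0.0 both fall under 10.0.0.0)
  network 10.0.0.0
  ! The LAN behind R1; without this line no other router learns about it
  network 192.168.1.0
 end
!
! Both networks should be listed under "Routing for Networks"
show ip protocols
```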
Every routing protocol supports the passive interface feature. If an interface is passive, the router doesn’t send any routing information out of it. This is advisable for security reasons and to reduce traffic. But if we make the wrong interface passive, we can get some annoying surprises. For example, if we configure OSPF between R1 and R2, and on R1 make the Fa0/0 interface passive (instead of Fa0/1), the routers won’t become neighbors. Why? Because R1 doesn’t send any hellos on this interface, the neighborship cannot be built. The same is true for EIGRP. RIP has no adjacencies, so the effect is different: R1 still receives updates from R2, but R2 doesn’t get any information from R1 directly, only by way of R3. Therefore (if we configure RIPv2 correctly) R2 sees the LAN behind R1 two hops away through R3, which is not the optimal path. This problem can be diagnosed with the show ip protocols command again: check the Passive Interface(s) section, or use show running-config and look at the routing protocol’s section (router rip in this example).
ACLs can cause similar problems. Suppose we need to deny all traffic from the 192.168.1.0/25 LAN except, for example, ICMP and HTTP. We create the following simple ACL:
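For the OSPF case, the correction on R1 might look like the sketch below. The process ID 1 is an assumption; the interface roles (Fa0/0 toward R2, Fa0/1 toward the LAN) follow the example above.

```
configure terminal
 router ospf 1
  ! Undo the mistake: Fa0/0 faces R2 and must send hellos
  no passive-interface FastEthernet0/0
  ! Intended setting: Fa0/1 faces the LAN, where no router listens
  passive-interface FastEthernet0/1
 end
!
! Check the Passive Interface(s) section and the neighbor list
show ip protocols
show ip ospf neighbor
```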
access-list 100 permit icmp any any
access-list 100 permit tcp any any eq 80
Now we apply it in the incoming direction, but by accident on the Fa0/0 interface instead of Fa0/1. In this case, the RIP advertisements from R2 can no longer come in directly, because RIPv2 uses UDP, which is blocked by the implicit deny at the end of the ACL.
Always plan and configure ACLs carefully, and do not forget to permit the routing protocols where they need to pass! RIP travels over UDP, while OSPF and EIGRP have their own protocol keywords in the ACL command line.
One more common problem is the routing protocol’s authentication mechanism. Every protocol has some method to authenticate its peer, to be sure that the information comes from a trusted party. The authentication can be a simple shared password or a more sophisticated method, such as MD5 authentication. If the peer cannot authenticate itself properly, protocol information cannot be exchanged between the routers. We’ll see more details later.
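Two possible corrections are sketched below, using the interface roles from the example above. The first moves the ACL to the interface where it was meant to be. The second is only an illustration of how routing protocol entries could be appended if a router-facing interface really had to be filtered; which of the three entries is needed depends on the protocol in use.

```
! Global configuration mode on R1
! Option 1: move the ACL from the router-facing interface to the LAN interface
interface FastEthernet0/0
 no ip access-group 100 in
interface FastEthernet0/1
 ip access-group 100 in
exit
!
! Option 2: keep the ACL on a router-facing interface and permit routing traffic
! RIP uses UDP port 520 (keyword rip)
access-list 100 permit udp any any eq rip
! OSPF and EIGRP are IP protocols of their own, with dedicated keywords
access-list 100 permit ospf any any
access-list 100 permit eigrp any any
```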
Continuing with protocol-specific problems, let’s first take RIP, specifically version 2. This version is classless, which is a must in any modern network, but classless behavior is not the default: automatic summarization at classful boundaries stays enabled until we turn it off. Let’s see what happens when we configure it with the following settings (the picture shows just R1, but the configuration is similar on the other routers):
Because of the default behavior of RIP, on R2 we can observe the following:
Based on this information, R2 thinks that the 192.168.1.0 LAN is reachable via two paths, but we know that this isn’t the case. Because R1 and R3 automatically summarize their networks, each of them advertises its /25 LAN with the classful /24 mask. This is wrong. If we try to ping a host, for example 192.168.1.2, from R2, half of the pings will be lost, because R2 balances the traffic over both paths and half of the ICMP packets reach R3, which doesn’t have this host’s network. The solution is simple: turn off the default behavior with the no auto-summary command, and everything will be fine. By the way, this setting is also relevant for EIGRP, which is likewise a distance vector protocol with an automatic summarization feature.
RIP configuration is a rather simple task at the CCNA level, and the upcoming curriculum (v5.0) has less information about it. If we do need to troubleshoot RIP, we shouldn’t forget one useful command: debug ip rip. It shows the updates sent and received by the router.
The other distance vector protocol we need to mention is EIGRP. It’s an advanced DV protocol, and because Cisco has released a draft to the IETF describing some of its internals, more vendors may support it on their devices in the future. So we need some knowledge of its troubleshooting process as well.
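The fix can be sketched in a few lines. It is applied on R1 and R3, the routers that own the two /25 LANs; applying it on R2 as well keeps the configurations consistent.

```
configure terminal
 router rip
  version 2
  ! Advertise subnets with their real masks instead of the classful summary
  no auto-summary
 end
!
! On R2: the two /25 LANs should now appear as separate routes
show ip route rip
```

The old summarized route may linger in R2’s routing table for a few minutes until it times out.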
First of all, some settings need to be the same on the routers that want to share information via EIGRP. EIGRP uses the concept of neighbors, which are discovered and maintained with hello packets. The neighborship cannot be established if the routers use different autonomous system (AS) numbers. This can easily be checked with the show commands mentioned before. So when we configure EIGRP with the router eigrp ASnumber command, we need to use the same value on all routers in an AS.
EIGRP uses a composite metric, and its parameters (e.g., bandwidth and delay) are weighted by five values. These values (the so-called K-values) must be the same on two adjacent routers. The hello packets carry the K-values, so a peer can recognize a mismatch before the adjacency is established. The default values can be seen with the show ip protocols command, and we can change them with the metric weights command (its first parameter must be 0):
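A minimal sketch of matching EIGRP configurations on two neighbors. The AS number 100 is an arbitrary assumption, and the network statements are derived from the lab addressing scheme.

```
! Global configuration mode on R1
router eigrp 100
 network 10.0.0.0
 network 192.168.1.0
!
! Global configuration mode on R2: the AS number must match R1's,
! otherwise the two routers never become neighbors
router eigrp 100
 network 10.0.0.0
!
! Privileged EXEC mode on either router: the peer should be listed
show ip eigrp neighbors
```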
If we forget to set exactly the same values on the other routers, we’ll get a console error message reporting a K-value mismatch. If console logging is switched off, we can debug this event with the debug eigrp neighbors command. By the way, Cisco doesn’t recommend using K-values other than K1 and K3, which relate to bandwidth and delay respectively, in the composite metric.
There are two EIGRP timers that are important to know: the hello timer and the hold timer. The hello timer sets how often hello packets are sent (e.g., every 5 seconds on Ethernet). The hold time is the interval within which we need to receive at least one hello from a neighbor before we drop the neighborship. EIGRP doesn’t require these values to match between neighbors. But if we don’t set them carefully (if we need to set them at all), we can cause trouble; for example, we can create a flapping route.
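For reference, the metric weights command with the default K-values looks like the sketch below; the AS number 100 is an assumption. Whatever values are chosen, the same line has to be present on every router in the AS.

```
configure terminal
 router eigrp 100
  ! Syntax: metric weights tos k1 k2 k3 k4 k5 (tos must be 0)
  ! Defaults: only K1 (bandwidth) and K3 (delay) take part in the metric
  metric weights 0 1 0 1 0 0
 end
!
! The active K-values are shown in the EIGRP section of the output
show ip protocols
```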
Let’s take an example. On R1, set the hello timer to 20 seconds with the ip hello-interval eigrp ASnumber 20 interface command. R1 now sends hellos every 20 seconds, but the hold time it advertises to R2 remains the default 15 seconds. What happens? After 15 seconds, R2 drops the neighborship, because it has not received a hello within the hold-time interval. After another 5 seconds it gets the next hello packet from R1, and happily establishes the adjacency again. But after another 15 seconds the process repeats. We see routes appearing in and disappearing from the routing table (we call them flapping routes), as well as console messages about the adjacency changes and the expired holding time. Of course, we can avoid this error by increasing the hold time with the ip hold-time eigrp ASnumber newvalue command; because each router advertises its own hold time in its hellos, the command belongs on R1’s interface. We can check the timers in the running-config, or by issuing the show ip eigrp neighbors and show ip eigrp interfaces detail commands.
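A consistent pair of timers might be configured as sketched below. EIGRP carries the hold time inside the hello packet, so the value a neighbor applies is the one configured on the sending router; this is why both commands are placed on R1. The interface (Fa0/0 toward R2), the AS number 100, and the value of 60 seconds are assumptions chosen for illustration.

```
configure terminal
 interface FastEthernet0/0
  ! Send hellos every 20 seconds instead of the default 5
  ip hello-interval eigrp 100 20
  ! Advertised to neighbors in the hellos; 60 keeps the default 3:1 ratio
  ip hold-time eigrp 100 60
 end
!
! Check the configured timers and the neighbor's remaining hold time
show ip eigrp interfaces detail
show ip eigrp neighbors
```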
By now it’s clear that troubleshooting routing protocols is not trivial, but if we use the correct tools (and know some of the protocols’ internals, of course), we can resolve basic problems quickly. Because of the size of the topic, we’ll have another article, in which we’ll talk about some more possible errors and how to resolve them. Have a nice, error-free time till then!