In all of the cases mentioned, it's easy to bypass DNS so your troubleshooting results are accurate. All of the commands we discussed earlier accept an -n option, which disables any attempt to resolve IP addresses into hostnames. I've just become accustomed to adding -n to all of the commands I introduced you to in the first part of this chapter unless I really do want IP addresses resolved.
Note: DNS resolution can also affect your web server's performance in an unexpected way. Some web servers are configured to resolve every IP address that accesses them into a hostname for logging. Although that can make the logs more readable, it can also dramatically slow down your web server at the worst times -- when you have a lot of visitors. Instead of serving traffic, your web server can get busy trying to resolve all of those IPs.
Find the network slowdown with traceroute
When the connection between your server and a host on a different network seems slow, it can be difficult to track down where the real slowdown is. traceroute was made for this situation, especially when the slowdown is in latency (the time it takes to get a response) and not overall bandwidth. traceroute was mentioned earlier as a way to test overall connectivity between you and a server on a remote network, but it is also useful when you need to diagnose where a network slowdown might be. Because traceroute outputs the reply times for every hop between you and another machine, you can track down servers that might be on a different continent or gateways that might be overloaded and causing network slowdowns. For instance, here's part of a traceroute between a server in the United States and a Chinese Yahoo server:
traceroute to yahoo.cn (184.108.40.206), 30 hops max, 60 byte packets
1 64-142-56-169.static.sonic.net (220.127.116.11) 1.666 ms 2.351 ms 3.038 ms
2 2.ge-1-1-0.gw.sr.sonic.net (18.104.22.168) 1.241 ms 1.243 ms 1.229 ms
3 265.ge-7-1-0.gw.pao1.sonic.net (22.214.171.124) 3.388 ms 3.612 ms 3.592 ms
4 xe-1-0-6.ar1.pao1.us.nlayer.net (126.96.36.199) 6.464 ms 6.607 ms 6.642 ms
5 ae0-80g.cr1.pao1.us.nlayer.net (188.8.131.52) 3.320 ms 3.404 ms 3.496 ms
6 ae1-50g.cr1.sjc1.us.nlayer.net (184.108.40.206) 4.335 ms 3.955 ms 3.957 ms
7 ae1-40g.ar2.sjc1.us.nlayer.net (220.127.116.11) 8.748 ms 5.500 ms 7.657 ms
8 as4837.xe-4-0-2.ar2.sjc1.us.nlayer.net (18.104.22.168) 3.864 ms 3.863 ms 3.865 ms
9 22.214.171.124 (126.96.36.199) 275.648 ms 275.702 ms 275.687 ms
10 188.8.131.52 (184.108.40.206) 284.506 ms 284.552 ms 262.416 ms
11 220.127.116.11 (18.104.22.168) 263.538 ms 270.178 ms 270.121 ms
12 22.214.171.124 (126.96.36.199) 303.441 ms * 303.465 ms
13 188.8.131.52 (184.108.40.206) 306.968 ms 306.971 ms 307.052 ms
14 220.127.116.11 (18.104.22.168) 295.916 ms 295.780 ms 295.860 ms
Without knowing much about the network, you can assume just by looking at the round-trip times that once you get to hop 9 (at the 22.214.171.124 IP), you have left the continent, as the round-trip time jumps from 3 milliseconds to 275 milliseconds.
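If you save traceroute output to a file, the same reading of the round-trip times can be automated. The following is only a minimal Python sketch, not part of traceroute itself: the function names hop_latencies and biggest_jump are hypothetical, and using the lowest of the three reply times as each hop's figure is an assumption made for illustration. The sample lines are hops 7 through 10 copied from the trace above.

```python
import re

# Hops 7-10 copied from the trace above, hostnames and IPs exactly as printed.
TRACE_SAMPLE = """\
 7 ae1-40g.ar2.sjc1.us.nlayer.net (220.127.116.11) 8.748 ms 5.500 ms 7.657 ms
 8 as4837.xe-4-0-2.ar2.sjc1.us.nlayer.net (18.104.22.168) 3.864 ms 3.863 ms 3.865 ms
 9 22.214.171.124 (126.96.36.199) 275.648 ms 275.702 ms 275.687 ms
10 188.8.131.52 (184.108.40.206) 284.506 ms 284.552 ms 262.416 ms
"""

# A reply time is a number immediately followed by " ms".
RTT = re.compile(r"(\d+(?:\.\d+)?) ms")


def hop_latencies(text):
    """Return [(hop_number, lowest_rtt_ms)] for every hop line with a reply."""
    hops = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip the "traceroute to ..." header and blank lines
        rtts = [float(value) for value in RTT.findall(line)]
        if rtts:  # a hop that only printed "*" has no usable reply times
            hops.append((int(fields[0]), min(rtts)))
    return hops


def biggest_jump(hops):
    """Return (hop_number, increase_ms) for the largest rise over the previous hop."""
    best = None
    for (_, previous), (hop, current) in zip(hops, hops[1:]):
        delta = current - previous
        if best is None or delta > best[1]:
            best = (hop, delta)
    return best


if __name__ == "__main__":
    print(biggest_jump(hop_latencies(TRACE_SAMPLE)))
```

On the sample lines, the largest increase is reported at hop 9, which matches the jump described above. A large jump only tells you where latency was added; it does not by itself say whether the cause is distance or an overloaded gateway.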
Find what is using your bandwidth with iftop
Sometimes your network is slow not because of some problem on a remote server or router, but just because something on the system is using up all the available bandwidth. It can be tricky to identify what process is using up all the bandwidth, but there are some tools you can use to help identify the culprit.
top is such a great troubleshooting tool that it has inspired a number of similar tools like iotop to identify what processes are consuming the most disk I/O. It turns out there is a tool called iftop that does something similar with network connections. Unlike top, iftop doesn't concern itself with processes but instead lists the connections between your server and a remote IP that are consuming the most bandwidth. For instance, with iftop you can quickly see if your backup job is using up all your bandwidth by seeing the backup server IP address at the top of the output.
iftop is available in a package of the same name on both Red Hat- and Debian-based distributions, but in the case of Red Hat-based distributions, you might have to get it from a third-party repository. Once you have it installed, just run the iftop command on the command line (it will require root permissions). As with the top command, you can press q to quit.
At the very top of the iftop screen is a bar that shows the overall traffic for the interface. Just below that is a column with source IPs followed by a column with destination IPs and arrows between them so you can see whether the bandwidth is being used to transmit packets from your host or receive them from the remote host. After those columns are three more columns that represent the data rate between the two hosts over 2, 10, and 40 seconds, respectively. Much like with load averages, you can see whether the bandwidth is spiking now, or has spiked some time in the past. At the very bottom of the screen, you can see statistics for transmitted data (TX) and received data (RX) along with totals. Like with top, the interface updates periodically.
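To see why the three columns make a current spike look different from a past one, consider averaging the same stream of per-second traffic samples over 2, 10, and 40 seconds. This Python sketch is an illustration only and is not how iftop is implemented; the function name window_averages and the assumption of exactly one sample per second are hypothetical.

```python
from collections import deque

WINDOWS = (2, 10, 40)  # seconds, matching the three columns described above


def window_averages(samples_per_second):
    """Average the most recent 2, 10, and 40 one-second samples (bytes per second)."""
    # Keep only as many samples as the longest window needs.
    recent = list(deque(samples_per_second, maxlen=max(WINDOWS)))
    if not recent:
        return (0.0, 0.0, 0.0)
    # If fewer samples exist than a window asks for, average what is available.
    return tuple(sum(recent[-w:]) / min(w, len(recent)) for w in WINDOWS)


if __name__ == "__main__":
    spike_now = [0] * 38 + [1000] * 2    # quiet, then busy for the last 2 seconds
    spike_past = [1000] * 2 + [0] * 38   # busy 40 seconds ago, quiet since
    print(window_averages(spike_now))    # high in the 2-second column
    print(window_averages(spike_past))   # only the 40-second column is nonzero
```

A burst that is happening right now dominates the 2-second figure, while a burst that has already ended shows up only in the 40-second figure, which is the same reading you apply to load averages.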
The iftop command run with no arguments at all is often all you need for your troubleshooting, but every now and then, you may want to take advantage of some of its options. The iftop command will show statistics for the first interface it can find by default, but on some servers you may have multiple interfaces, so if you wanted to run iftop against your second Ethernet interface (eth1), type iftop -i eth1.
By default iftop attempts to resolve all IP addresses into hostnames. One downside to this is that it can slow down your reporting if a remote DNS server is slow. Another downside is that all that DNS resolution adds extra network traffic that might show up in iftop! To disable hostname resolution, just run iftop with the -n option.
Normally iftop displays overall bandwidth used between hosts, but to help you narrow things down, you might want to see what ports each host is using to communicate. After all, if you knew a host was consuming most of your bandwidth over your web port, you would perform different troubleshooting than if it was connecting to an FTP port. Once iftop is launched, press p (lowercase) to toggle between displaying all ports and hiding them. One thing you'll notice, though, is that sometimes displaying all the ports can cause hosts you are interested in to fall off the screen. If that happens, you can also press either S or D (uppercase) to toggle between displaying ports only from the source or only from the destination host, respectively. Showing only source ports can be useful when you run iftop on a server, since for many services, the destination host uses random high ports that don't necessarily identify what service is being used, but the ports on your server are more likely to correspond to a service on your machine. You can then follow up with the netstat -lnp command referenced earlier to find out what service is listening on that port.
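That follow-up step can also be scripted once you have captured the netstat -lnp output. The Python sketch below is a hedged illustration: the function name listeners_on_port is hypothetical, the sample rows (including the PIDs and program names) are made up to show the usual column layout, and only TCP rows are handled.

```python
# Illustrative `netstat -lnp` output; the PIDs and program names are invented.
NETSTAT_SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      812/sshd
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      1234/apache2
tcp6       0      0 :::80                   :::*                    LISTEN      1234/apache2
"""


def listeners_on_port(netstat_text, port):
    """Return the sorted program names that have a TCP socket listening on port."""
    programs = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        # The port is whatever follows the last colon ("0.0.0.0:80", ":::80").
        local_port = fields[3].rsplit(":", 1)[1]
        if local_port == str(port):
            # "1234/apache2" -> "apache2"; a bare "-" means the owner was hidden
            programs.add(fields[6].split("/", 1)[-1])
    return sorted(programs)


if __name__ == "__main__":
    print(listeners_on_port(NETSTAT_SAMPLE, 80))
```

If the program column shows only a dash, netstat was most likely run without root permissions and could not see which process owns the socket.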
As with most Linux commands, iftop has an extensive range of options. What we covered should be enough to help with most troubleshooting efforts, but in case you want to dig further into iftop's capabilities, just type man iftop to read the manual included with the package.
This article is excerpted from the book DevOps Troubleshooting: Linux Server Best Practices by Kyle Rankin, published by Pearson/Addison-Wesley Professional. It is reprinted by permission. Copyright 2013 Pearson Education Inc., all rights reserved.