
Server's down: How do I find out what's wrong?

By Kyle Rankin
January 28, 2013 05:55 AM ET

Can I route to the remote host?

After you have ruled out DNS issues and confirmed that web1 resolves to its IP address, 10.1.2.5, test whether you can route to the remote host. Assuming ICMP is enabled on your network, one quick test is to ping web1. If you can ping the host, you know your packets are being routed there, and you can move to the next section, "Is the remote port open?" If you can't ping web1, try to identify another host on that network and see if you can ping it. If you can, then it's possible web1 is down or blocking your requests, so move to the next section.

If you can't ping any hosts on the remote network, packets aren't being routed correctly. One of the best tools for testing routing issues is traceroute. Once you provide traceroute with a host, it will test each hop between you and the host. For example, a successful traceroute between dev1 and web1 would look like this:

$ traceroute 10.1.2.5
traceroute to 10.1.2.5 (10.1.2.5), 30 hops max, 40 byte packets
1 10.1.1.1 (10.1.1.1) 5.432 ms 5.206 ms 5.472 ms
2 web1 (10.1.2.5) 8.039 ms 8.348 ms 8.643 ms

Here you can see that packets go from dev1 to its gateway (10.1.1.1), and then the next hop is web1. This means it's likely that 10.1.1.1 is the gateway for both subnets. On your network you might see a slightly different output if there are more routers between you and your host. If you can't ping web1, your output would look more like the following:

$ traceroute 10.1.2.5
traceroute to 10.1.2.5 (10.1.2.5), 30 hops max, 40 byte packets
1 10.1.1.1 (10.1.1.1) 5.432 ms 5.206 ms 5.472 ms
2 * * *
3 * * *

Once you start seeing asterisks in your output, you know that the problem is most likely on your gateway, the last hop that responded. You will need to go to that router and investigate why it can't route packets between the two networks. Alternatively, you might see something more like this:

$ traceroute 10.1.2.5
traceroute to 10.1.2.5 (10.1.2.5), 30 hops max, 40 byte packets
1 10.1.1.1 (10.1.1.1) 5.432 ms 5.206 ms 5.472 ms
2 10.1.1.1 (10.1.1.1) 3006.477 ms !H 3006.779 ms !H 3007.072 ms !H

In this case, the !H flag shows that the gateway reported the host as unreachable after its own attempts to reach it timed out, so the host is likely down or inaccessible even from the same subnet. At this point, if you haven't yet tried to access web1 from a machine on the same subnet as web1, try pings and other tests from there now.
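The three traceroute outcomes described above can be told apart mechanically by looking at the last hop line. The following is a minimal sketch, not part of traceroute itself: classify_traceroute is a hypothetical helper name, and it assumes the default output format in which each hop shows the address in parentheses.

```shell
# classify_traceroute TARGET_IP
# Reads traceroute output on stdin and prints one of:
#   reached, host unreachable, timed out, unknown
# Hypothetical helper; assumes default traceroute output (address in parentheses).
classify_traceroute() (
    target="$1"
    last=""
    # Remember the last non-empty line that is not the header.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ""|traceroute*) continue ;;
        esac
        last="$line"
    done
    case "$last" in
        *'!H'*)        echo "host unreachable" ;;  # a router reported the host unreachable
        *"($target)"*) echo "reached" ;;           # final hop is the target itself
        *'* * *'*)     echo "timed out" ;;         # no replies past the last responding hop
        *)             echo "unknown" ;;
    esac
)
```

You might use it as `traceroute 10.1.2.5 | classify_traceroute 10.1.2.5`. The result is only a hint: a "timed out" verdict can also mean that a device along the path simply doesn't answer traceroute probes.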

Note: If you have one of those annoying networks that block ICMP, don't worry; you can still troubleshoot routing issues. You just need to install the tcptraceroute package (sudo apt-get install tcptraceroute), then run the same commands as for traceroute, substituting tcptraceroute for traceroute.
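As a small illustration of that substitution, the sketch below prints the trace command to run depending on whether ICMP is blocked. The function name trace_command and its arguments are hypothetical. It relies on tcptraceroute accepting an optional destination port after the host, with port 80 as the default chosen here.

```shell
# trace_command HOST [ICMP_BLOCKED] [PORT]
# Prints the command line to run; does not execute anything.
# Hypothetical helper: ICMP_BLOCKED is "yes" or "no" (default "no"), PORT defaults to 80.
trace_command() (
    host="$1"
    icmp_blocked="${2:-no}"
    port="${3:-80}"
    if [ "$icmp_blocked" = "yes" ]; then
        # tcptraceroute probes with TCP packets to the given port
        echo "tcptraceroute $host $port"
    else
        echo "traceroute $host"
    fi
)
```

For example, `trace_command 10.1.2.5 yes` prints `tcptraceroute 10.1.2.5 80`, which you can then run by hand (tcptraceroute typically needs root privileges).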

Is the remote port open?

So you can route to the machine but you still can't access the web server on port 80. The next test is to see whether the port is even open. There are a number of different ways to do this. For one, you could try telnet:

$ telnet 10.1.2.5 80
Trying 10.1.2.5...
telnet: Unable to connect to remote host: Connection refused

If you see Connection refused, then either the port is down (likely Apache isn't running on the remote host or isn't listening on that port) or a firewall is blocking your access. If telnet can connect, then, well, you don't have a networking problem at all. If the web service isn't working the way you expected, you need to investigate your Apache configuration on web1. (Troubleshooting web server issues is covered elsewhere in the book.)
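If telnet isn't installed, the same check can be approximated with bash alone. This is a rough sketch under a few assumptions: probe_port is a hypothetical name, bash must be built with /dev/tcp redirection support (the default on most distributions), and the coreutils timeout command must be available. The 3-second limit is an arbitrary choice.

```shell
# probe_port HOST PORT
# Attempts a TCP connection and prints a one-line verdict.
# Hypothetical helper; relies on bash's /dev/tcp redirection and coreutils timeout.
probe_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
    case $? in
        0)   echo "open" ;;                             # connection succeeded
        124) echo "no response (possibly filtered)" ;;  # timeout killed the attempt
        *)   echo "refused or unreachable" ;;           # e.g. Connection refused
    esac
}
```

Running `probe_port 10.1.2.5 80` against the web1 example would print "refused or unreachable" in the Connection refused case shown above. A probe that hangs until the timeout suggests that something is silently dropping the packets.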

Instead of telnet, I prefer to use nmap to test ports because it can often detect firewalls. If nmap isn't installed, use your package manager to install the nmap package. To test web1, type the following:

$ nmap -p 80 10.1.2.5
Starting Nmap 4.62 ( http://nmap.org ) at 2009-02-05 18:49 PST
Interesting ports on web1 (10.1.2.5):
PORT   STATE    SERVICE
80/tcp filtered http

Aha! nmap is smart enough that it can often tell the difference between a port that is truly closed and one that is hidden behind a firewall. Normally when a port is actually down, nmap will report it as closed. Here it reported it as filtered. What this tells us is that some firewall is in the way and is silently dropping the packets. This means you need to investigate any firewall rules on the gateway (10.1.1.1) and on web1 itself to see if port 80 is being blocked.
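If you run this check often, you can pull the state column out of nmap's output and turn it into a hint. The sketch below assumes nmap's normal human-readable output, as shown above. The names nmap_port_state and explain_state are hypothetical helpers, not nmap features.

```shell
# nmap_port_state PORT
# Reads nmap output on stdin and prints the STATE column for PORT/tcp.
nmap_port_state() {
    awk -v p="$1/tcp" '$1 == p { print $2; exit }'
}

# explain_state STATE
# Maps an nmap port state to a short troubleshooting hint.
explain_state() {
    case "$1" in
        open)     echo "service is listening; check the application itself" ;;
        closed)   echo "host reachable but nothing is listening on the port" ;;
        filtered) echo "no reply; a firewall is probably dropping the packets" ;;
        *)        echo "unrecognized or ambiguous state: $1" ;;
    esac
}
```

A typical use would be `explain_state "$(nmap -p 80 10.1.2.5 | nmap_port_state 80)"`. For the output above, that prints the firewall hint.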
