Saturday, 13 October 2012

Internet connection troubleshooting

So your internet connection is not working, or it is really slow. This is pretty common, and Linux has a large number of tools to help pinpoint the problem.

The first easy steps, which really help you understand what is wrong with the connection, do not actually require any hack or software. I will call them Safety Checks. They include parts of the routine checks you are asked to go through by annoying helplines and, however annoying they may seem, they solve the problem most of the time. You could go straight to the command line and try to find a solution without any pre-check, but why complicate life so much, especially when you are already frustrated by a non-working internet connection?

This guide is for Ethernet connections (I will add wireless later, so switch to a cable if you are on wireless, as wireless often has channel, signal or authentication problems of its own).

1. Safety Checks (these apply to any OS)


a. Check again that the cables are connected correctly, that your router is on and that it is not reporting problems (e.g. error messages through flashing LEDs). If the hardware is faulty, do not think you will manage to magically fix your internet with a string of commands.

b. If everything is OK, switch off your modem and wait a few minutes before turning it back on. Meanwhile, reboot your computer as well (I know it is very annoying!). You will thank me later if that solves the problem.

c. If you have tried that already or it is not enough, it really helps to try the internet on other computers or operating systems. That will help localize the problem. If the internet does work on other devices, you can be sure that the problem is within your device. This is good, as it is usually easier to solve (the problem depends on you, where you can get your hands on it, and not on them), and you should be able to fix it. Otherwise the problem may be out of your reach (power failures, a remote server shut down, bad cabling, network congestion).

d. Check the DHCP settings. DHCP is a service on your router that automatically assigns an IP address to each device, so the address is dynamic rather than static. You need to check that DHCP is enabled both on the router and on the local device. You usually access the control panel of your router by typing the default route (the Gateway, i.e. the IP of your router) into your web browser's address bar. If you do not know what it is, it is often 192.168.0.1 or 192.168.1.1 (you can try them), or it can be found more easily under the name Gateway in the connection information (depends on OS).

e. Ping! Bad connectivity is often caused by bad DNS servers. DNS is used to translate website names into IP addresses, so if the problem is in the DNS, pinging an IP address directly will still work as usual. To ping, run (from a terminal or otherwise):

$ ping 74.125.132.147

That IP is actually a Google server. If the problem is in the DNS, you will get replies with this command, but not with:

$ ping www.google.com

In that case you will need to change your DNS servers (depends on OS, but they are usually under your connection settings, together with the IP, Gateway and Subnet mask settings). Try setting OpenDNS: 208.67.222.222 as primary DNS and 208.67.220.220 as secondary.

If both pings fail, that could be due to an invalid ping target (rare), the server being down (rare), a firewall blocking packets (quite rare) or, most likely, incorrect routing or missing connectivity.
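The reasoning above can be written down as a small decision helper. This is only a sketch: classify_ping is a made-up name, and it assumes you pass it the exit status of each ping (0 meaning that replies came back).

```shell
#!/bin/sh
# classify_ping IP_STATUS NAME_STATUS  (hypothetical helper)
# Takes the exit status of "ping <IP>" and "ping <name>" (0 means replies
# were received) and prints which of the cases above applies.
classify_ping() {
    ip_status=$1
    name_status=$2
    if [ "$ip_status" -eq 0 ] && [ "$name_status" -eq 0 ]; then
        echo "OK: connectivity and DNS both work"
    elif [ "$ip_status" -eq 0 ]; then
        echo "DNS problem: the IP answers but the name does not"
    elif [ "$name_status" -eq 0 ]; then
        echo "Target IP not answering: try another IP address"
    else
        echo "No connectivity: check routing and cabling"
    fi
}

# Example use (not run here, it needs a network):
#   ping -c 3 74.125.132.147 >/dev/null 2>&1; ip_status=$?
#   ping -c 3 www.google.com >/dev/null 2>&1; name_status=$?
#   classify_ping "$ip_status" "$name_status"
```

The -c 3 option simply stops ping after three packets, so the script does not hang forever.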

2. Eliminating interface problems


Check that the local interface is working. It might have been shut down for some reason, or it might be broken. From a terminal, logged in as root, type:

# ifconfig

The output should include eth0 and lo (and wlan0 if you have a wireless connection). We only care about eth0 here, as it is the cable connection. If the interface is on and has an address, you will see a line like

inet addr:192.168.0.4  Bcast:192.168.0.255  Mask:255.255.255.0

and the flags line should say UP BROADCAST MULTICAST instead of just BROADCAST MULTICAST. This is what the output should look like:

eth0      Link encap:Ethernet  HWaddr 00:1b:13:84:3g:4d 
          inet addr:192.168.0.7  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:2924 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2287 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:180948 (176.7 Kb)  TX bytes:166377 (162.4 Kb)
          Interrupt:16

You should see no (or just a few) errors, dropped packets, overruns or collisions. Each of them can give a hint about the problem, but if you do see errors (say, on more than 1% of packets), be warned that this usually indicates a hardware problem (either local or possibly outside your control). I will deal with these in Section 4.
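To put a number on that 1% rule of thumb, here is a rough sketch that computes the error percentage from the packet counters. error_rate is a hypothetical helper name, and it assumes the old-style ifconfig format shown above ("RX packets:N errors:M ..."); newer versions of ifconfig print these counters in a different layout and would need a different pattern.

```shell
#!/bin/sh
# error_rate (hypothetical helper): reads old-style ifconfig text on stdin
# and prints, for each "RX packets:N errors:M ..." or "TX packets:N
# errors:M ..." line, the errors as a percentage of packets.
error_rate() {
    awk '
        $2 ~ /^packets:/ && $3 ~ /^errors:/ {
            split($2, p, ":")            # p[2] = packet count
            split($3, e, ":")            # e[2] = error count
            pct = (p[2] > 0) ? 100 * e[2] / p[2] : 0
            printf "%s %.2f%%\n", $1, pct
        }'
}

# Example use:
#   ifconfig eth0 | error_rate
```

Anything well above 1.00% on either line is worth following up in Section 4.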

If your interface is down, instead, run:

# ifconfig eth0 up

Then check with ifconfig again. If the internet works now, problem solved! Otherwise, continue. (I will assume the interface is now active and working; if you cannot manage to turn it on, it is likely a distro-related problem.)
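If you do not feel like reading the whole ifconfig output every time, the two things to look for (the UP flag and the inet addr line) can be picked out automatically. A minimal sketch, assuming the old-style ifconfig format shown above; iface_state is a made-up name:

```shell
#!/bin/sh
# iface_state (hypothetical helper): reads the old-style ifconfig output
# for one interface on stdin and prints "up <address>", "up no-address"
# or "down".
iface_state() {
    awk '
        /inet addr:/           { split($2, a, ":"); addr = a[2] }
        /(^|[ \t])UP([ \t]|$)/ { up = 1 }
        END {
            if (!up)             print "down"
            else if (addr == "") print "up no-address"
            else                 print "up " addr
        }'
}

# Example use:
#   ifconfig eth0 | iface_state
```

An answer of "up no-address" suggests the interface is on but never received an address, which points back to the DHCP check in Section 1.d.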

3.  Eliminating Routing Problems


Sometimes, regardless of what your OS might be showing (connection-established icons or messages), the machine is connected but badly routed (missing IP, wrong Gateway, DHCP conflict). That can result in bad connectivity.

a. Try this command:

# route

and you will see an output of this kind:

Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.0.0 * 255.255.255.0 U 0 0 0 eth0
link-local * 255.255.0.0 U 1000 0 0 eth0
default 192.168.0.1 0.0.0.0 UG 100 0 0 eth0

where the last line is the one that matters. You should see your Gateway (check that it is correct) and the flags UG, together with the interface you are using, eth0. If you do not, find out your gateway address and run:

# route add default gw 192.168.0.1 eth0

and you should see your gateway appear when you run the route command again.
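The check on the last line of the routing table can also be scripted. The sketch below (default_gateway is a hypothetical name) looks for the default route, which shows up as "default" in plain route output or as 0.0.0.0 with route -n, and prints its gateway and interface. It assumes the eight-column layout shown above.

```shell
#!/bin/sh
# default_gateway (hypothetical helper): reads "route" or "route -n"
# output on stdin and prints "<gateway> <interface>" for the default
# route. Prints nothing if there is no default route with the G flag.
default_gateway() {
    awk '($1 == "default" || $1 == "0.0.0.0") && $4 ~ /G/ { print $2, $8 }'
}

# Example use:
#   route -n | default_gateway
```

An empty answer means there is no default gateway at all, which is exactly the case the route add command above is meant to fix.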

b. Try the command:

# arp -a

and you will see an output of this kind:

? (192.168.0.1) at 00:1b:13:84:3g:4d [ether] on eth0

where the address in brackets is your Gateway. If you see this instead:

? (192.168.0.1) at <incomplete> on eth0

there is a problem, most likely a wrong gateway setting or a DHCP conflict. Check your DHCP settings on the router again, as described in Section 1.d. If that does not help, continue.
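If the ARP table is long, the unresolved entries can be filtered out. A small sketch, assuming the arp -a line format shown above; unresolved_arp is a made-up name:

```shell
#!/bin/sh
# unresolved_arp (hypothetical helper): reads "arp -a" output on stdin
# and prints the IP address of every entry marked <incomplete>.
unresolved_arp() {
    awk '/<incomplete>/ { gsub(/[()]/, "", $2); print $2 }'
}

# Example use:
#   arp -a | unresolved_arp
```

If your gateway's address shows up in the result, your machine could not get a hardware address for it.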

c. Type:

# cat /etc/network/interfaces

to show your /etc/network/interfaces file (this file is used by Debian-based distros such as Ubuntu). It should look like:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp

If, instead of the last line, you have:

iface eth0 inet static
       address 192.168.0.2
       netmask 255.255.255.0
       gateway 192.168.0.1

it means that your computer is using a manual (static) configuration. Either replace it with "iface eth0 inet dhcp" or configure the IP settings correctly, making sure that the addresses match on both router and machine, and that DHCP is switched off on the router (or at least does not hand out the same address to another device).
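To see at a glance which of the two configurations a given interface uses, you can pull the method out of the file. This is a sketch only; iface_method is a hypothetical name and it only understands simple "iface NAME inet METHOD" lines like the ones above.

```shell
#!/bin/sh
# iface_method NAME (hypothetical helper): reads an interfaces(5)-style
# file on stdin and prints the method ("dhcp", "static", "loopback", ...)
# configured for the interface NAME.
iface_method() {
    awk -v ifc="$1" '$1 == "iface" && $2 == ifc && $3 == "inet" { print $4 }'
}

# Example use:
#   iface_method eth0 < /etc/network/interfaces
```

No output means the file has no inet stanza for that interface, in which case something else (for example a graphical network manager) is probably configuring it.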

This should solve the routing problems, and your PC should now be properly connected to your router. If the internet connection is still not working, go on to the next section.

4. Heavy duty tests


By this point you should have solved your problem already. If you are still here, it means that something bigger is going on.

Run these commands:

# mii-tool -v

# ethtool eth0

# ethtool -S eth0

# netstat -i

Their outputs differ, but each will make errors visible, if there are any.
Here is a summary of what those errors mean:

Collisions: on a normally working connection they affect fewer than 0.1% of packets. Poorly terminated cables or a badly working network card can increase this number.

CRC errors: frames arrived, but were corrupted in transit. CRC errors usually hint at electrical noise, caused for example by damaged or improperly connected cables.

FIFO and Overrun errors: this is usually a sign of excessive traffic.

Length errors: this is most frequently due to incompatible duplex settings.

Carrier errors: faulty interfaces or networking equipment.
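To spot interfaces with non-zero error counters quickly, the netstat -i table can be filtered. The sketch below (netstat_errors is a made-up name) finds the RX-ERR and TX-ERR columns by their header names, since the number of columns differs between versions; it assumes a header line starting with "Iface".

```shell
#!/bin/sh
# netstat_errors (hypothetical helper): reads "netstat -i" output on stdin
# and prints one line for every interface whose RX-ERR or TX-ERR counter
# is not zero.
netstat_errors() {
    awk '
        # Header line: remember the position of every column name.
        $1 == "Iface" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data lines (only once the header has been seen).
        ("RX-ERR" in col) && ("TX-ERR" in col) {
            rx = $(col["RX-ERR"]); tx = $(col["TX-ERR"])
            if (rx + tx > 0) print $1 ": RX-ERR=" rx " TX-ERR=" tx
        }'
}

# Example use:
#   netstat -i | netstat_errors
```

No output at all is the good case: it means none of the interfaces reported errors.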

If you see such errors and/or still cannot get your connection to work, it might be a problem that is not in your power to solve. Try calling your ISP.

5. Speed test


Your internet connection might be working now, but you may want to test its speed. There are loads of online in-browser services to do that, but what I hate about them is their fancier and fancier Flash interfaces, which literally slow down my not-so-powerful computer.

For this reason, I prefer to just watch the speed of a file transfer and judge for myself (in comfortable KB/MB [kilo/mega-byte] notation, not the annoying Mb [mega-bit], which is mainly used by big companies to confuse people about broadband speed). The way I do it is by using wget:

$ wget --output-document=/dev/null http://speedtest.wdc01.softlayer.com/downloads/test500.zip

This command will download a test file (no worries, it is written to /dev/null, so basically nowhere) and you will see the download speed on screen. You can stop it at any time with Ctrl+C.
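Since wget reports bytes per second while broadband offers are usually advertised in megabits per second, it helps to convert the advertised figure before comparing. There are 8 bits in a byte, so the conversion is a division by 8. A tiny sketch (mbit_to_mbyte is a hypothetical name):

```shell
#!/bin/sh
# mbit_to_mbyte N (hypothetical helper): converts a speed in megabits per
# second into megabytes per second (8 bits = 1 byte).
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit / 8 }'
}

# Example use: what a "20 Mb" line should give at best, in MB/s
#   mbit_to_mbyte 20
```

So on a line sold as 20 Mb, a wget speed of around 2.5 MB/s is already the maximum you can expect.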


I hope this helped you solve your broadband problems, or at least gave you a better insight into them.