A few weeks back I was seeing 30%, 60% and sometimes even 95% packet loss every 3-4 days, so I wanted to set up some simple network monitoring to keep track of my broadband connection. I started off using ad-hoc tools whenever the internet misbehaved, and then I would call up my ISP and wait for them to fix it. This resulted in quite a few refunds! Next, it was time to gather some data on just how bad the situation was. Who knows, maybe the information will help my ISP troubleshoot the problem (I can tell them how long the problem has been there, even if I’ve been asleep or at work), or perhaps help me get past the first-line support people.
Ad-hoc diagnostic tools
I expect most people to be familiar with ping and traceroute (tracert on Windows), and websites like speedtest.net and testmy.net. These are a good start, but I find MTR preferable to ping and traceroute: it runs continuously, collecting stats on ping times and packet loss, all while showing you traceroute information. It also gathers the traceroute information within seconds, even when some of the routers between you and your target don’t reply.
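If MTR isn't installed, a crude stand-in is to run plain ping and pull the loss figure out of its summary. The sketch below is only an illustration, not a replacement for MTR: it assumes the summary format printed by Linux ping (a line containing "N% packet loss" and an "rtt min/avg/max/mdev" line), and the function names are my own.

```python
import re
import subprocess

# Patterns for the summary that Linux ping prints at the end of a run.
# Other platforms word these lines differently (assumption: iputils format).
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")
RTT_RE = re.compile(r"= ([\d.]+)/([\d.]+)/([\d.]+)")


def parse_ping_summary(output):
    """Return (loss_percent, avg_rtt_ms); avg_rtt_ms is None if no replies came back."""
    loss = LOSS_RE.search(output)
    if loss is None:
        raise ValueError("no packet loss summary found in ping output")
    rtt = RTT_RE.search(output)
    avg = float(rtt.group(2)) if rtt else None
    return float(loss.group(1)), avg


def ping_host(host, count=10):
    """Send `count` pings to `host` and return the parsed summary."""
    proc = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return parse_ping_summary(proc.stdout)
```

Calling `ping_host("8.8.8.8")` (any host you trust to answer pings will do) returns a tuple of loss percentage and average round-trip time, which is enough to jot down while you are on the phone to your ISP.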
I also recently discovered the unofficial speedtest CLI tool written in Go. There’s another version, speedtest-cli, which appears to be more popular (judging by the number of GitHub stars). My initial testing suggests that the Go version is able to reach higher speeds.
Ping monitors (internet -> home)
These are straightforward to set up. You just create an account, enter your IP address into their form, and you’re done. If your IP address changes regularly, I suggest you use dyndns and a ping monitor that does DNS lookups. Here are some suggestions for the UK:
Ping monitors (home -> internet)
While all of the above can be done in no time at all, I found this bit to be more time-consuming. The internet offers loads of network monitoring tools (including for ping monitoring), but it was surprisingly difficult to find something that satisfied my requirements, listed below. There’s a lot of old Perl-based stuff out there, and not many obvious solutions.
- Free (ideally Open Source)
- Run on the old Raspberry Pi (no Windows tools, sorry)
- Support packet loss monitoring
- Simple setup
- Produce pretty graphs (easy-to-interpret data)
I thought my 256MB Raspberry Pi (Model B) might struggle a bit with running Observium, Nagios and other feature-rich monitoring solutions. In the end, I settled on smokeping. The graphs looked decent and it measured packet loss; the main downside was the somewhat annoying setup procedure for the web frontend. I do like the way smokeping visualises the latency distribution.
It should be possible to use the aforementioned speedtest CLI tool, cron, and some graphing utilities together with a server running speedtest-mini to graph the available bandwidth over time as well. Daniel Wenzel has documented one way of achieving this.
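As a rough idea of what the cron side could look like, here is a minimal sketch that runs the Python-based speedtest-cli and appends one row per run to a CSV file for later graphing. It is not Daniel Wenzel's method. It assumes a speedtest-cli version that supports the `--json` option and reports `download` and `upload` in bits per second; the function names and the `speedtest.csv` path are made up for the example.

```python
import csv
import json
import subprocess


def to_row(result_json):
    """Convert one JSON result into (timestamp, ping_ms, down_mbit, up_mbit)."""
    result = json.loads(result_json)
    return (
        result["timestamp"],
        round(result["ping"], 1),
        round(result["download"] / 1e6, 2),  # bits/s -> Mbit/s
        round(result["upload"] / 1e6, 2),
    )


def log_speedtest(csv_path="speedtest.csv"):
    """Run one speed test and append the result to csv_path (hypothetical file name)."""
    proc = subprocess.run(
        ["speedtest-cli", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the test itself failed
    )
    with open(csv_path, "a", newline="") as handle:
        csv.writer(handle).writerow(to_row(proc.stdout))
```

A cron entry that calls `log_speedtest()` every half hour or so would build up a file that any graphing utility can plot. Keep in mind that each test uses real bandwidth, so running it too often will skew your other measurements.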
My Raspberry Pi (256MB model B) does not seem to be able to run the Speedtest CLI utility written in Go. [UPDATE 30/4/2015] The 512MB model B, however, is able to run it. The Python-based speedtest-cli reports lower-than-expected speeds, so I don’t currently have a cost-effective way of monitoring my available bandwidth. [UPDATE 30/4/2015] Stay tuned.
A note on interpreting the data
Beware of interpreting the data from any of the above sources in isolation. If you’re streaming some HD movies while torrenting something and your metrics deviate from what you’d get on an idle connection, that is to be expected. If you then call up your ISP, complain, and they find that the fault is at your end, it may well end up costing you. This is especially the case if they had to send out an engineer or incur other expenses.
Conclusion and next steps
The above has been sufficient for my purposes so far, and my broadband hasn’t been causing me noticeable grief for a little while.
Where do we go from here? The next most useful metrics (for me) would be my router’s upload and download rates over time, to make it easier to understand whether any measured packet loss has internal or external causes (i.e. is it my fault, or my ISP’s?). I could set up MRTG to extract those from my router, but perhaps I should set up Observium instead, to accommodate a few more metrics. Maybe on a Raspberry Pi 2. But then I’d also like to add (custom) metrics collected from my cable modem (the power levels, for example). We’ll see. That’ll be material for another blog post.
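To make the "my fault or my ISP's" idea a bit more concrete, here is a small sketch of the reasoning. It assumes the router exposes interface byte counters (for example the 32-bit octet counters that tools like MRTG poll over SNMP) and that you know roughly what your line rate is. The function names, the labels and the thresholds (80% utilisation, 1% loss) are arbitrary choices for illustration, not recommendations.

```python
def rate_bps(prev_octets, cur_octets, interval_s, counter_bits=32):
    """Turn two readings of a byte counter into bits per second.

    The modulo handles a single wrap of the counter between readings.
    """
    delta = (cur_octets - prev_octets) % (2 ** counter_bits)
    return delta * 8 / interval_s


def classify_loss(loss_pct, rate, line_rate_bps,
                  busy_fraction=0.8, loss_threshold=1.0):
    """Rough guess at where packet loss comes from (thresholds are assumptions)."""
    if loss_pct < loss_threshold:
        return "ok"
    if rate >= busy_fraction * line_rate_bps:
        # The link was close to full, so our own traffic is the likely cause.
        return "probably local (link saturated)"
    return "possibly external (link mostly idle)"
```

For example, 30% loss while the link carries 1 Mbit/s of a 50 Mbit/s line comes back as "possibly external (link mostly idle)", which is the kind of sample worth showing to the ISP.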
I am currently a Cisco employee, and the views expressed on this blog are mine and do not necessarily reflect the views of Cisco.