How Linux can help you protect your privacy online

Beat the CIA
Use Wireshark to see what information is being sent in and out of your PC

You're not paranoid - they really are watching you. Criminals, web companies and governments all have a reason to spy on your online life, and the methods that they use are becoming increasingly sophisticated.

2011 was the most dangerous year to be an online citizen, particularly if you happened not to agree with everything your government said. 199 people around the world were arrested or detained because of content they posted online. Many are still languishing in jail.

The offending information ranged from exposés of environmental damage to religious instruction and criticism of unelected autocrats.

In addition, there has been a recent increase in the use of netizens' information by web companies. Privacy policies have been extended, and Twitter now sells the rights to users' data.

Some of the self-protection methods shown here will have an impact on how you can use a computer. For most people, implementing all of them would be over the top. What we're aiming to do here is show you who can find out what about you, and how to stop them.

What you do with that information is, of course, up to you. Whether you are concerned about the scale of information gathered by web companies, or you are hiding from a corrupt government, read on to find out how to keep your data yours.

You can find out just how much information you're revealing to the world using Wireshark. This tool captures all information passing through your network interfaces and allows you to search and filter for particular patterns. It takes information from your network interface, so any information displayed in it is visible to other (potentially malicious) people on the network.

Wireshark should be available through your package manager, or from wireshark.org. Once installed, you can start it with:

sudo wireshark

You will get a message telling you that you've started it with superuser privileges and that this isn't a good way of doing it. If you plan on using the tool a lot, you should follow the Wireshark project's guide to a better set-up, but for a one-off, you can ignore this.


Click on your network device in the interface list (probably eth0 for a wired network and wlan0 for a wireless one) to start a capture. As soon as you start using the network, the top part of the screen will fill with variously coloured packets. The tool has a filter to help you make some sense of this multi-coloured mess.

For example, you can keep a prying eye on duckduckgo.com searches using the filter:

http.request.full_uri contains "duckduckgo.com/?q"

If you now do a search using http://duckduckgo.com, the request will appear in the list, and the search term will be in the Info column.
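To see how little work an eavesdropper has to do once a request has been captured, here is a minimal Python sketch of the same idea. The sample URIs are invented for illustration and the function name find_searches is our own; it simply mimics the 'contains' match and pulls out the q parameter.

```python
from urllib.parse import urlsplit, parse_qs

def find_searches(uris, host="duckduckgo.com"):
    """Return the search terms found in plain-HTTP request URIs for host."""
    terms = []
    for uri in uris:
        parts = urlsplit(uri)
        # Only unencrypted requests to the chosen host are of interest.
        if parts.scheme != "http" or not parts.netloc.endswith(host):
            continue
        # parse_qs decodes '+' and %xx escapes for us.
        terms.extend(parse_qs(parts.query).get("q", []))
    return terms

# Hypothetical captured URIs, invented for this example.
captured = [
    "http://duckduckgo.com/?q=linux+privacy",
    "http://www.example.org/index.html",
]
print(find_searches(captured))
```

Anything sent over HTTPS is skipped here because, on the wire, its URI would never be visible in the first place.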

A similar technique could be used on any of the popular search engines. You may not be concerned about people being able to read your search terms, but exactly the same technique can be used to pull usernames and passwords that are sent in plain text.

For example, most forums send passwords in plain text (because a forum account isn't usually a serious security risk, and secure certificates can be expensive). The www.linuxformat.com forums are set up in this way.

To sniff LinuxFormat.com passwords, fire up Wireshark and start a packet capture using the filter:

http.request.uri contains "login.php"

When you log in to www.linuxformat.com/forums/index.php (you will need to create an account if you don't already have one), the filter will capture the packet. The line-based text data will contain:

Username=XXX&password=YYYY&login=Log+in
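Once that line of form data is in someone's hands, turning it into a username and password takes only a couple of lines. The sketch below uses Python's standard library on the placeholder string from the capture above; the variable names are our own.

```python
from urllib.parse import parse_qs

# The form body as it appears in the captured packet (placeholders as in the text).
body = "Username=XXX&password=YYYY&login=Log+in"

# parse_qs returns a list of values per field; keep the first of each.
fields = {name: values[0] for name, values in parse_qs(body).items()}
print(fields["Username"], fields["password"])
```

No decryption is involved at any point: the data was never protected to begin with.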

How many computers are you sharing this information with? Depending on your network set-up, probably every other computer on the LAN or wireless network.

On top of these, there is every computer that sits on the route between you and the server you're communicating with. To discover what these are, use traceroute to map the path the packets take.

For example:

traceroute www.google.com

If your computer's behind a firewall, you may find that this just outputs a series of asterisks. In this case, you can use a web-based traceroute such as the ones indexed at www.traceroute.org. This list is a little out of date, and not all of the servers are still hosting traceroute, but you should be able to find one that works in your area.
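If you want to count the machines on the route rather than read them off the screen, a short script can do it. This is a rough sketch that assumes the usual Linux traceroute text layout (a hop number followed by the host details). The sample output is invented, using reserved documentation addresses, and the function names are our own.

```python
import re
import subprocess

# A hop line starts with optional spaces, the hop number, then the details.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")

def parse_hops(output):
    """Return (hop_number, description) pairs from traceroute's text output."""
    hops = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2).strip()))
    return hops

def trace(host):
    """Run the system traceroute (must be installed) and parse its output."""
    result = subprocess.run(["traceroute", host], capture_output=True, text=True)
    return parse_hops(result.stdout)

# Invented sample output; real routes will look different.
sample = """traceroute to www.example.com (192.0.2.80), 30 hops max, 60 byte packets
 1  192.0.2.1 (192.0.2.1)  1.1 ms  1.0 ms  0.9 ms
 2  * * *
 3  gw.example.net (198.51.100.7)  9.8 ms  9.9 ms  10.1 ms
"""
for number, description in parse_hops(sample):
    print(number, description)
```

A hop shown as asterisks is still a machine your packets pass through; it just declined to identify itself.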

Do you know who's running these computers? Or who has remote access to them? Do you want these people to be able to see everything you do online? If you use services with unsecured passwords (and there's no reason you shouldn't, as long as you understand the implications), then it's important not to use the same password for a secure service.

The most basic piece of the web privacy puzzle is the Secure Sockets Layer (SSL). This rather obscure-sounding protocol is a way of creating an encrypted channel between an application running on your computer and an application running on another computer.

For each insecure network protocol, there's a secure one that does the same basic task, but through an SSL channel: HTTP has HTTPS, IMAP has IMAPS and FTP has FTPS, for example.

Any time you use an insecure protocol, an eavesdropper can read what you send, but if you use a secured one, only the intended recipient can see the data.

For web browsing, it's HTTPS that's important. As we saw before, many computers can read what we send over HTTP. If we perform the same test again using duckduckgo's secure web page - https://www.duckduckgo.com (note the s) - you will find that the search term does not appear in Wireshark. All an eavesdropper sees is encrypted traffic.
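The same encrypted channel can be opened from a script. The sketch below uses Python's standard ssl module, which these days negotiates TLS, the successor to SSL. The function names are our own, and the call that actually connects is left commented out because it needs network access.

```python
import socket
import ssl

def make_context():
    """Build a client context that verifies certificates and hostnames."""
    return ssl.create_default_context()

def describe_connection(host, port=443):
    """Connect to host over an encrypted channel; return protocol version and cipher name."""
    context = make_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as secure:
            return secure.version(), secure.cipher()[0]

context = make_context()
# The default context insists on a valid certificate for the right hostname.
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)

# Needs network access, so left commented out:
# print(describe_connection("duckduckgo.com"))
```

Everything written to the wrapped socket is encrypted before it leaves your machine, which is why Wireshark can no longer read it.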


Some web browsers show a padlock when connected to a secure website, but this can be spoofed easily using favicons. If you're unsure, click on the icon. A legitimate padlock will open a pop-up telling you about the security on the page.

Of course, this ensures only that the information can't be read as it's being transmitted between your computer and the server. Once there, the organisation running the server could pass it on to third parties, or transmit it insecurely between their data centres. Once you send information, you lose control of it.

Before hitting Submit, always ask yourself, do you trust the organisation receiving the data? If not, don't send it.

HTTPS is a great way to keep your web browsing private. However, because of the way it has been bolted on top of HTTP, it isn't always easy to make sure you use it. For example, if you use https://www.google.com to search for 'wikipedia', it will direct you to the HTTP version of the encyclopaedia, not the HTTPS version.

The Electronic Frontier Foundation (EFF), a non-profit dedicated to defending digital rights, has developed an extension for Firefox that forces the browser to use HTTPS wherever it's available. A Chrome version is currently in beta. Get this from https://www.eff.org/https-everywhere to keep your web usage away from eavesdroppers.
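The basic idea of rewriting a link before it is followed can be shown in a few lines. This is a simplified illustration only, not how the extension itself is implemented: the host list is hypothetical and the function name is our own.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of sites assumed to support HTTPS, for illustration only.
HTTPS_CAPABLE = {"en.wikipedia.org", "duckduckgo.com"}

def upgrade_to_https(url, capable_hosts=HTTPS_CAPABLE):
    """Rewrite an http:// URL to https:// when the host is known to support it."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in capable_hosts:
        return urlunsplit(parts._replace(scheme="https"))
    # Unknown hosts are left alone, since forcing HTTPS might break them.
    return url

print(upgrade_to_https("http://en.wikipedia.org/wiki/Linux"))
print(upgrade_to_https("http://www.example.org/"))
```

The first link comes back as an https:// address, while the second is returned unchanged because its host isn't on the list.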

Like all forms of encryption, SSL has a weakness, and that's the keys, which are stored in certificates. Just as a hacker can easily get into your accounts if they know your password, they can easily eavesdrop on SSL encrypted data - or spoof it - if they can trick your computer into using their certificates.

The main point here is that certificates are stored on the computer, not in your memory like passwords. If an attacker can put files on your system, they can break SSL encryption. You are at particular risk when using a computer you haven't personally installed the operating system on, such as a work machine or one at an internet café.

You should be able to view the current certificates and authorities in your browser's security settings, but it isn't always easy to identify things that shouldn't be there. Here, live distros come to the rescue, since you can carry a trusted operating system with you and use that whenever you are at a computer of dubious provenance.
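For a quick look outside the browser, the sketch below lists the certificate authorities that Python's ssl module loads by default. Treat it as a rough check under some assumptions. It shows the system store as Python sees it, which is not necessarily the one your browser uses, and the list may be empty or partial on systems that keep certificates in a directory rather than a single bundle. The function names are our own.

```python
import ssl

def subject_name(cert):
    """Pick a readable name out of a certificate dict as returned by the ssl module."""
    fields = {}
    # The subject is a tuple of tuples of (key, value) pairs.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            fields[key] = value
    return fields.get("commonName") or fields.get("organizationName") or "(unnamed)"

def trusted_authorities():
    """Return the names of the CA certificates loaded into the default context."""
    context = ssl.create_default_context()
    return sorted(subject_name(cert) for cert in context.get_ca_certs())

names = trusted_authorities()
print(len(names), "certificate authorities loaded")
```

Comparing this list between a machine you trust and one you don't is one way to spot an entry that shouldn't be there.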