The challenge of attribution by IP address


You might have been led to believe that an IP address is the missing link between something that happened on the Internet and the person responsible, especially if you watch a lot of CSI or hacker movies. However:

  • There are at least four common technologies that obscure who is tied to an IP address
  • A system has many other signatures that are less transient than an IP address
  • Identifying a computer does not always establish who was using it

An Internet Protocol (IP) address is assigned to a system for a period of time so that it can exchange data with other systems on a network. Only a few network devices need to keep track of a system's hardware address (known as a MAC address), because everything else uses the IP address to communicate. There are two major versions of IP in use today: IPv4, which has about 4 billion addresses, and IPv6, which has so many addresses that the number is compared to grains of sand on Earth. The IPv4 address space is largely exhausted, which has led to a slow migration to IPv6, a version that most major networks and devices support today. These two versions are significant because each has its own ways of making it harder to identify a person by an IP address.
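To put the two address spaces side by side, here is a minimal sketch using Python's standard `ipaddress` module. The two sample addresses come from ranges reserved for documentation and are only illustrative.

```python
import ipaddress

# Size of the whole address space for each IP version
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(f"IPv4 addresses: {ipv4_total:,}")     # about 4 billion
print(f"IPv6 addresses: {ipv6_total:.3e}")   # vastly more

# The same module can tell which version a given address belongs to
for text in ("192.0.2.10", "2001:db8::10"):
    addr = ipaddress.ip_address(text)
    print(f"{text} is IPv{addr.version}")
```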

What’s the issue?

Some of the ways IP addresses can be obscured were created specifically for privacy; others were needed to cope with the limited supply of network addresses before IPv6.

1.       Virtual Private Networks (VPNs) encrypt traffic between a machine and the VPN, so that any untrusted networks in between cannot easily snoop on the data. Multiple people can use the same VPN at the same time, and any activity will only show the IP address of the VPN, not those of the systems connected to it. Most corporations use them, individuals can purchase them, and you can create your own. Only the VPN operator can identify the connected systems, and only if it keeps logs, which makes it possible to carry out malicious activity and remain unknown.

2.       Proxies are just what the name implies: they usually route traffic for a specific protocol, such as website traffic, for purposes such as filtering unwanted websites at schools, public places, and companies. The destination will only see the IP address of the proxy, not the IP address of the system behind it.

3.       Network Address Translation (NAT) is a technology that creates an internal network that cannot be seen from an external network. It is used when there are a lot of internal devices and only a few public IP addresses available. Once again, the outside world only sees the IP address of the NAT device, although the record of which internal system was involved is often held within the NAT device itself.

4.       Dynamic Host Configuration Protocol (DHCP) is a technology that time-shares IP addresses, so that a pool of addresses is put to the best use among the devices that need them. The same address can therefore belong to different devices at different times.
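As a rough illustration of the NAT point in the list above, the toy sketch below shows three internal devices appearing to the outside world as a single public address. The `NatDevice` class, the starting port, and all addresses are hypothetical and only model the idea; this is not how a real router is implemented.

```python
import itertools

class NatDevice:
    """Toy port-address-translation table; not a real network stack."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._ports = itertools.count(40000)  # arbitrary starting port (assumption)
        self.table = {}  # public port -> (internal ip, internal port)

    def translate(self, internal_ip, internal_port):
        """Map an internal connection to the shared public address."""
        public_port = next(self._ports)
        self.table[public_port] = (internal_ip, internal_port)
        return self.public_ip, public_port

nat = NatDevice("203.0.113.5")
seen_by_server = [
    nat.translate("192.168.1.10", 51000),
    nat.translate("192.168.1.11", 51000),
    nat.translate("192.168.1.12", 62000),
]

# What a remote server could log: one public IP for three internal devices
distinct_public_ips = {ip for ip, _port in seen_by_server}
print(distinct_public_ips)

# Only the NAT device itself holds the mapping back to the internal systems
print(nat.table)
```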

The above technologies are often used together, and in combination they make an IP address much less reliable as a personal identifier. Advertisers will only use one to determine an approximate region; for anything more specific they use other means. In security, IP addresses are used to identify systems and are kept within that context.
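The DHCP case shows why an IP address on its own says little without a timestamp. The sketch below assumes a hypothetical lease log (the `LEASES` records, device names, and `device_for` helper are invented for illustration) and looks up which device held an address at a given moment.

```python
from datetime import datetime

# Hypothetical lease log: (ip, device identifier, lease start, lease end)
LEASES = [
    ("198.51.100.20", "laptop-a", datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 12, 0)),
    ("198.51.100.20", "phone-b", datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 18, 0)),
]

def device_for(ip, when):
    """Return the device holding `ip` at time `when`, or None if no lease matches."""
    for lease_ip, device, start, end in LEASES:
        if lease_ip == ip and start <= when < end:
            return device
    return None

# Same IP address, different answers depending on the time of the activity
print(device_for("198.51.100.20", datetime(2024, 1, 1, 9, 30)))
print(device_for("198.51.100.20", datetime(2024, 1, 1, 13, 0)))
print(device_for("198.51.100.20", datetime(2024, 1, 2, 9, 0)))
```

Without the lease records, or once they have been discarded, the lookup has nothing to go on, and even a successful match only names a device, not the person using it.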

Can they be used to identify people? 

The list of practical system and people signatures changes constantly. There are privacy features created to remove them and new research and technologies that create new ones all the time. But essentially all of our interactions can create signatures that can identify people behind a system. For a comprehensive list of web browser signatures go to https://panopticlick.eff.org/ and run their test. It shows the list of browser plugins, cookies, settings, and technologies used to track you.
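To give a feel for how browser attributes can act as a signature, here is a simplified sketch that hashes a handful of settings into one identifier. The attribute names and values are hypothetical, and real tracking techniques draw on far more signals than this; the point is only that the result does not depend on the IP address.

```python
import hashlib
import json

def fingerprint(attributes):
    """Hash a dictionary of browser attributes into a short identifier."""
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical attributes a website might be able to observe
visit_home = {
    "user_agent": "ExampleBrowser/1.0",
    "timezone": "UTC+1",
    "screen": "1920x1080",
    "plugins": ["pdf", "video"],
}
visit_vpn = dict(visit_home)                        # same browser, reached via a different IP
other_browser = dict(visit_home, plugins=["pdf"])   # one setting differs

print(fingerprint(visit_home) == fingerprint(visit_vpn))      # same signature
print(fingerprint(visit_home) == fingerprint(other_browser))  # different signature
```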

Additionally, many of the things we do on a system can be used to create a signature: 

1.       The unique way we type or use a mouse can very easily be recorded from a remote system, and none of the technologies mentioned will mask this. Storing information at this level simply isn't practical, though.

2.       What's more commonly used is the correlation of your personal accounts. Anything that requires authentication is generally assumed to be you, such as work, email and social media accounts. With access to the granular logs of one of those systems and of the system on which someone is being identified, there could be enough correlated information to make a reasonable connection.

3.       Uploaded information can also be used, as files have a lot of information embedded in them that can be tied to you and a system. For example, geographic coordinates are embedded in pictures automatically by many cameras.
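The account correlation described above can be sketched as a join between two logs on IP address and time. Everything here is an assumption for illustration: the log layouts, the account names, the `candidates` helper, and the ten-minute window are invented, and a match is a lead to investigate rather than proof of identity.

```python
from datetime import datetime, timedelta

# Hypothetical log from a service that requires authentication: (account, ip, time)
AUTH_LOG = [
    ("alice@example.com", "203.0.113.5", datetime(2024, 1, 1, 9, 0)),
    ("bob@example.com", "203.0.113.5", datetime(2024, 1, 1, 15, 0)),
]

# Hypothetical log from the system under investigation: (ip, time, action)
TARGET_LOG = [
    ("203.0.113.5", datetime(2024, 1, 1, 9, 4), "upload"),
]

def candidates(event, auth_log, window=timedelta(minutes=10)):
    """Return accounts seen on the same IP within `window` of the event."""
    ip, when, _action = event
    return [
        account
        for account, auth_ip, auth_time in auth_log
        if auth_ip == ip and abs(auth_time - when) <= window
    ]

matches = candidates(TARGET_LOG[0], AUTH_LOG)
print(matches)
```

Both accounts share the same IP address, but only one was active close to the time of the event, which is why the timestamps matter as much as the address.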

How to prevent being tracked online

There are a lot of reasons that people want to have some level of privacy online and there are a few steps you can implement:

1.       Use a privacy VPN that doesn’t keep logs. 

2.       Use an operating system (OS) with a browser built with privacy in mind. Consider the TAILS OS for online activity. 

3.       Don’t use the same browser/OS/system (listed in increasing order of inconvenience) for things that identify you personally and things you do not want to be easily associated with.

IP addresses and attribution are therefore problematic, but useful intelligence can be derived when they are assessed with context in mind and with the right tools to help. A platform that detects, identifies, and correlates tens of millions of threat indicators against network activity and logs in real time gives organisations just that. It provides both internal and external context in the form of relevant and timely threat intelligence, which enables them to respond to cyber threats and helps identify who is behind them.

Anthony Aragues, Vice President, Product Marketing, Anomali

Anthony Aragues is the Vice President in charge of Product Marketing for Anomali and is an industry veteran with 20 years of experience in threat intelligence and data visualisation.