The rise of smartphones and web apps has fuelled the silent collection of our data.
Once gathered, information on location, calls, web searches and preferences is used to drive services and target advertising, and is often released publicly as 'anonymised' or aggregated data sets.
Privacy is a persistent issue. In Europe, a comprehensive reform of the EU's data protection rules is currently underway, with the aim of strengthening online privacy rights. The reform covers the way data is collected, accessed and used.
Location data can be useful: among other things, it gives sat-nav companies the ability to improve traffic reporting, and cab firms a way to enhance services. However, so-called anonymous data can often be linked back to users.
With this in mind, researchers at the Massachusetts Institute of Technology (MIT) Media Lab have created OpenPDS, open source gatekeeper software that allows users to 'own' their data in terms of its possession, use and disposal.
OpenPDS looks into the mathematical risk of people being re-identifiable, examining how they can be found in databases and reconciled data sets.
PDS stands for Personal Data Store: a centralised location where data is contained, allowing the user to view and control the information. OpenPDS' software allows users to control the flow of data and manage fine-grained authorisations for accessing it.
The software works on the premise that it is difficult to anonymise high-dimensional data such as geolocation while retaining the information's value. The team has therefore created SafeAnswers, which allows applications to ask questions that are answered using a user's personal data. In practice, applications send code to be run against the data, and only the answer is sent back to them.
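The question-and-answer flow can be illustrated with a minimal Python sketch. This is not the OpenPDS API: the `PersonalDataStore` class, the record layout, the bounding box and the question function are all hypothetical, chosen only to show code travelling to the data while a single low-dimensional answer travels back.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical record layout: (hour_of_day, latitude, longitude)
Record = Tuple[int, float, float]


@dataclass
class PersonalDataStore:
    """Toy stand-in for a personal data store (not the OpenPDS API)."""
    records: List[Record]

    def answer(self, question: Callable[[List[Record]], object]) -> object:
        # Run the application's code next to the data, on a copy of the records.
        result = question(list(self.records))
        # Assumption: only a single bool or number may leave the store,
        # so raw location traces are never returned to the application.
        if not isinstance(result, (bool, int, float)):
            raise TypeError("answer must be a single bool or number")
        return result


def mostly_in_area_at_night(records: List[Record]) -> bool:
    """Hypothetical app question: is the user usually inside a given box at night?"""
    night = [r for r in records if r[0] >= 20 or r[0] < 6]
    if not night:
        return False
    inside = sum(
        1 for _, lat, lon in night
        if 42.35 <= lat <= 42.37 and -71.10 <= lon <= -71.08
    )
    return inside / len(night) > 0.5


pds = PersonalDataStore(records=[
    (22, 42.36, -71.09),
    (23, 42.36, -71.09),
    (9, 42.40, -71.12),
])
answer = pds.answer(mostly_in_area_at_night)  # the app only ever sees this value
```

In this sketch the application learns a yes-or-no answer and nothing else; a question that tries to return the full list of records is rejected.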
Aggregating personal data
SafeAnswers uses two separate layers for aggregating personal data, with sensitive information processing taking place within the user's PDS. The second data layer can be anonymously aggregated across users - without the need to share sensitive information with an intermediate entity - through a privacy-preserving group computation method.
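The article does not say which group computation method SafeAnswers uses. As one possible illustration, the sketch below uses additive secret sharing, a standard technique that lets a group compute a total without any participant or intermediary seeing an individual value. The function names and the modulus are assumptions made for this example.

```python
import secrets

# Assumption: any modulus comfortably larger than the true total will do.
MODULUS = 2**61 - 1


def make_shares(value: int, n_parties: int) -> list:
    """Split a private value into n random-looking shares that sum to it."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values: list) -> int:
    """Sum the users' values while only ever combining shares."""
    n = len(private_values)
    # Each user splits their value into one share per participant.
    all_shares = [make_shares(v, n) for v in private_values]
    # Participant j adds up the j-th share received from every user.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS
        for j in range(n)
    ]
    # Only the partial sums are published; together they give the total.
    return sum(partial_sums) % MODULUS


# Four users each hold a private count, for example visits to a venue.
total = secure_sum([3, 0, 5, 2])
```

Each share on its own is a uniformly random number, so it reveals nothing about the value it came from; only the combined total is meaningful.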
"What's unique about PDS is that it is privacy by design; it's about keeping data under the control of the user," OpenPDS founder and researcher Yves-Alexandre de Montjoye tells us. "From a privacy perspective, it is entirely different: when it's small data, such as medical research or postcode information, you can make this reasonably anonymous. It is very hard to do so with the big data from mobile phones."
In fact, OpenPDS believes the current model - where every mobile app is collecting its own data and sending it on to its servers - "is not efficient or good from a privacy perspective".
This is partly because it is so easy to identify individuals using their data. In a paper published recently in the Nature journal Scientific Reports, de Montjoye's team showed it is possible to identify a user from only four data points, because human movement patterns are very limited in scope and therefore act like a fingerprint. "The way you move around is as unique as your fingerprint," says de Montjoye.
As such, OpenPDS emphasises the idea of 'unicity': the unique way that human behaviour makes you identifiable. "'Unicity' is so important because the more unique your behaviour is, the easier it is to link the data back to you and identify you," de Montjoye explains. "However, it is also 'unicity' which makes this data so useful and valuable."
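The idea of unicity can be made concrete with a toy estimate. The sketch below is a simplified illustration, not the method used in the paper: it assumes each user's trace is a set of (place, hour) points, picks a few points at random from one trace, and checks whether any other trace also contains all of them. The data and the function name are invented for the example.

```python
import random


def unicity(traces: dict, n_points: int, trials: int = 200, seed: int = 0) -> float:
    """Estimate the share of cases in which n_points known points single out one user."""
    rng = random.Random(seed)
    users = list(traces)
    unique = 0
    for _ in range(trials):
        user = rng.choice(users)
        # Points an outside observer is assumed to know about this user.
        known = set(rng.sample(sorted(traces[user]), n_points))
        # Every user whose trace contains all of the known points.
        matches = [u for u in users if known <= traces[u]]
        if matches == [user]:
            unique += 1
    return unique / trials


# Invented toy traces: sets of (antenna, hour) points for three users.
traces = {
    "a": {("north", 8), ("centre", 9), ("east", 18), ("north", 22)},
    "b": {("north", 8), ("centre", 9), ("west", 13), ("north", 22)},
    "c": {("north", 8), ("south", 9), ("east", 18), ("north", 22)},
}

one_point = unicity(traces, n_points=1)
four_points = unicity(traces, n_points=4)  # every toy trace is distinct, so this is 1.0
```

With a single known point many users overlap, so few are singled out; as more points are known, the estimate climbs towards 1, which is the fingerprint effect de Montjoye describes.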
De Montjoye started working with metadata around eight years ago, when he realised its potential in the wider world. "Working with this metadata in various countries, I became aware of how it worked and what you could do with it," he says.
OpenPDS is now working on extending its reach: it has made its first prototype and is deploying with partners including Telefonica, Telecom Italia and the Technical University of Denmark - where 1,000 undergraduates are collecting their mobile phone data.
It's not on the market yet, but OpenPDS could soon be used more widely by businesses. "It could be useful for businesses in general to allow third parties to safely access and use their data," says de Montjoye. "A lot of information is being collected so it can allow companies to leverage that data."
SafeAnswers can also be used on company databases instead of releasing anonymous data sets. OpenPDS could be a good thing for both "a small start-up and big players such as Samsung which have access to the customer", de Montjoye says. "It is an opportunity for companies to do something different. We collect this data and give it to you to use - we are doing something differently to Google, which has a different business model where it is hosting and using the data."
It cannot be ignored: as privacy concerns grow, the value of data is increasing. But with gains to be made by companies, scientists and users, de Montjoye does not think firms should stop collecting or using information.
"We just need to think of new ways of using the data while protecting the individual's privacy," he says. "It is a matter of education and making sure we understand what's going on with our data."