What if data had a half-life?

Published in UX Collective · 6 min read · Jun 6, 2018

Image by MontyLov on Unsplash: https://unsplash.com/photos/_Q96YBb998E

If you have spent any time on planet Earth during the past few months, whether in Europe or not, you will probably have heard about GDPR, the EU’s General Data Protection Regulation. Despite the apparent negativity about GDPR as the flurry of activity and commentary grew ahead of the compliance date of 25th May 2018, it’s ultimately a very good thing, intended to give EU citizens better control of their personal data. A familiar comment as the date approached was, “My inbox is full of emails from companies that I’ve not engaged with for {insert subjective length of time here depending what the company does}”. This was usually followed by expressions of how tough digital life is (Hint: firstly, don’t sign up to so many emails and secondly, have a word with yourself) or the revelation that you are the first person to have seen a glitch in the matrix and can bring the whole system crumbling down by simply not clicking the link in the email. Anyway, a little over a week and one stress ball later, I started pondering why there isn’t simply a requirement for data to expire after a set time.

The reality of course is that data doesn’t expire (nor is it required to) because it’s essentially the currency of the digital ecosystem. I’m not in the habit of throwing away money that I’ve not used for a while and digital services’ attitude to data is pretty much the same (even the CIA’s). Despite increased mainstream awareness of data usage, such as the recent Facebook / Cambridge Analytica revelations about how much data is stored about us and how it can be used to manipulate society, this is unlikely to change any time soon. Data collection and subsequent user profiling is hugely profitable for both digital services and advertisers, which is why the biggest social media services work so hard to ensure that the apparent value generated by convenience and new features deters too many people from questioning what these ‘free’ services really cost. That said, social media services also recognise that intelligent cyclical data collection and modelling can create better user experiences and, in turn, generate more and better quality data, so in practice the end result for users of today’s social media services is often a quite amazing product.

All of which evolved my pondering about data expiry: given the capacity for such intelligent data modelling, perhaps a better model for everyone is one whereby data diminishes over time rather than expiring? Rather than data just disappearing at a specific point in time, if data had a half-life (the time required for a quantity to reduce to half its initial value), there would potentially be benefits for the whole digital ecosystem:

1. Users could feel safer and more trusting knowing that irrelevant, historical data disappears (or is significantly diminished).
2. A user’s data profile is enriched most by what they engaged in and with whom, weighted by how recently and how much.
3. Digital services focus on creating better experiences driven by the user’s enriched profile.
4. Advertisers get more granular, current and accurate user profiles, creating better ad relevance and increased conversions.
5. Data collection, modelling and storage is more efficient and focussed on the data which most appropriately represents a user.
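To make the half-life idea a little more concrete, here is a minimal Python sketch of how the weight of a single piece of data might diminish with age. The function name `decayed_weight` and the 90-day half-life are purely illustrative assumptions, not something any service is known to use.

```python
def decayed_weight(age_days: float, half_life_days: float) -> float:
    """Return the fraction of the original weight left after age_days.

    After one half-life the weight is 0.5, after two it is 0.25, and so on.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (age_days / half_life_days)


# Assumed half-life of 90 days, chosen only for illustration.
for age in (0, 90, 180, 365):
    print(age, round(decayed_weight(age, 90), 3))
```

The point of the sketch is that nothing vanishes abruptly: older data simply counts for less and less, which is the difference between diminishing and expiring.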

If you’re particularly eagle-eyed, you’ll have recognised that the spoiler is included above. Steps 2–5 are already happening. Social media services and advertisers have invested heavily in collecting the right data and making it more valuable through correlation and understanding relevance. Which just leaves step 1. Going back to GDPR, the greatest user concern seems to be about data which is either misrepresentative, irrelevant or unseen, and hence, rather than try to regulate for this, perhaps a more understandable and achievable model would be one where data diminishes over time. Rather than require users and services to agree and adhere to GDPR in 2018, and then have the entire ecosystem constantly push it to or beyond its limits until the inevitable GDPR2 in 20XX, perhaps a better model would be a more organic one which addresses current issues of trust, transparency, accountability and power imbalance, yet recognises the need to evolve based on where the motivation, expertise and investment lie.

As ever, the complexity and effectiveness of such an approach lie in the detail and implementation, but if there are appropriate benefits across the digital ecosystem then there are some amazing minds and systems which could make this happen, and the result could be more trusting users engaging in better services which generate increased revenues. I’m probably not one of those amazing minds that can make this model a reality, but what I can do is bring it to life a bit by considering some of the potential (albeit simplistically described) outcomes:

Data created through my explicit actions could make life simpler

My actions generate data which creates a profile of me that can be used in many ways by anyone with the data. For explicit actions, some of the data I generate could come with implicit acceptance of its usage. For example, when I buy something, I typically get a 1-year warranty included. For a simple retail exchange, this seems a sensible agreed period of engagement. I don’t want to be hammered with marketing non-stop for that year — and I should have a means of avoiding that if an inevitably doomed, naïve retailer goes that route — but I am presumably engaged enough that appropriate marketing activity in that period is relevant to me.

Diminishing data creates metadata

As historical data diminishes, it could be commodified into non-identifiable metadata. For instance, if the outliers of my friend network are removed due to diminished relevance, it would seem acceptable to maintain a record of how many outliers were removed. This would allow for intelligence around my historical behaviour without the concerns of irrelevant data lasting forever.
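The friend-network example could be sketched roughly as follows. Everything here is hypothetical: the `Connection` type, the 180-day half-life and the 0.1 relevance threshold are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    name: str
    days_since_contact: float


def prune_connections(connections, half_life_days=180.0, threshold=0.1):
    """Drop connections whose decayed relevance is below threshold.

    Returns the kept connections plus non-identifying metadata: only a
    count of what was removed, never the names.
    """
    kept = []
    removed_count = 0
    for connection in connections:
        relevance = 0.5 ** (connection.days_since_contact / half_life_days)
        if relevance >= threshold:
            kept.append(connection)
        else:
            removed_count += 1
    return kept, {"removed_connections": removed_count}


network = [
    Connection("Alice", 10),
    Connection("Bob", 400),
    Connection("Carol", 1000),
]
kept, metadata = prune_connections(network)
print([c.name for c in kept], metadata)
```

The identifying detail of the removed outliers is gone, but the service still knows that the network used to be larger.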

Diminishing data provides valuable insights

Diminishing data has equal or potentially greater value than fixed data and no data. Or to put it another way, we know what we know, we can’t know what we don’t know… but perhaps the real intelligence is in identifying and analysing how and why our data evolves from one to the other.

Increased transparency in real-world profiles

What does having a network of 1,000+ people really mean? In some cases it’s meaningful, in others it’s vastly misleading. For instance, on a dating service it might be more valuable to know that someone has engaged with 10 people in the past week than that they have a network of 100 people. Depending on what you’re looking for, 10 people in a week may be a good thing or a bad thing, but either way it’s probably more insightful.

Future data is more predictable

A great thing about half-life is that it’s predictable. Therefore, not only do I have a historical picture of what my data has been but potentially a clearer projection of what my data will be. This could create greater awareness of how my current behaviour will affect my future profile and generate new opportunities for impacting it — either by me or by the services I engage with.
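Because the decay follows a known curve, a projection is simple arithmetic. The sketch below assumes a plain exponential half-life model; the function names and the example numbers are hypothetical.

```python
import math


def projected_weight(current_weight: float, days_ahead: float,
                     half_life_days: float) -> float:
    """Weight the data will have days_ahead from now, absent new activity."""
    return current_weight * 0.5 ** (days_ahead / half_life_days)


def days_until_below(current_weight: float, threshold: float,
                     half_life_days: float) -> float:
    """Days until the weight drops to threshold (0.0 if already there)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if current_weight <= threshold:
        return 0.0
    return half_life_days * math.log2(current_weight / threshold)


# Assumed 90-day half-life: a full-weight item reaches a quarter of its
# weight after two half-lives.
print(projected_weight(1.0, 180, 90))
print(days_until_below(1.0, 0.25, 90))
```

A service could surface this kind of projection to a user as "if you do nothing, this data will have largely faded by such-and-such a date".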

The rate of diminishment could be variable

The ‘half’ in half-life is a fixed thing but it needn’t be fixed in the data diminishment concept. Potentially I want some of my data to diminish rapidly and some barely at all. This could be a manual scale that I can manipulate or a dynamic scale applied by a digital service. This sounds complicated but perhaps in practice it’s much simpler than trying to understand GDPR.

Obviously this would be most effective if a standardised measure could be achieved — albeit one with variable descriptions relevant to the service in use. Even the scale labels could be variable (Min > Max, Short > Long, You hate us > You love us) but even just a consistent title and 1–5 scale could become a useful representation of how long a service holds on to and enriches your data.
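One way such a 1–5 scale might translate into behaviour is a simple lookup from setting to half-life. The specific durations below are invented for illustration; a real standard would need to agree them.

```python
# Hypothetical mapping from a 1-5 retention setting to a half-life in days.
HALF_LIFE_DAYS_BY_SETTING = {1: 7, 2: 30, 3: 90, 4: 365, 5: 1825}


def half_life_for(setting: int) -> int:
    """Look up the half-life (in days) for a 1-5 retention setting."""
    if setting not in HALF_LIFE_DAYS_BY_SETTING:
        raise ValueError("setting must be an integer from 1 to 5")
    return HALF_LIFE_DAYS_BY_SETTING[setting]


def remaining_fraction(setting: int, age_days: float) -> float:
    """Fraction of a data item's weight left at age_days for a setting."""
    return 0.5 ** (age_days / half_life_for(setting))


# The same 30-day-old item fades very differently at each setting.
for setting in range(1, 6):
    print(setting, round(remaining_fraction(setting, 30), 3))
```

The labels shown to the user could vary by service while the underlying numbers stay consistent, which is what would make the scale comparable across services.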

Summary

So, guess what? It’s not a perfect concept. It’s hard to design for all use cases. It’s difficult to conceive what a diminished email address is compared to a diminished network diagram. It’s complicated to consider the context of a small hairdresser collecting email addresses alongside a social media network storing network relationships, photos, comments and intimate moments. It requires a more transparent approach from services. It requires greater trust from users. But data isn’t going away. It’s growing in size, relevance and value, so to my mind we need to consider new models for how the data of the past and the data of today relate to the data of the future.