Following my Networkshop talk on logfiles, I was asked at what point logfiles can be treated as “anonymous” under data protection law. Since the GDPR covers all kinds of re-identification, as well as data that can “single out” an individual even without knowing their name, that’s a good CompSci/law question: the work of Paul Ohm and others suggests it may take a very long time. But when designing processes, I wonder whether we should approach the question from a different angle.
When we were looking at GDPR for Jisc’s (then) 130+ services, we concluded that by far the best place to start was identifying the purpose of the processing and the lawful basis applicable to it. Once we understood those, most of the requirements – including those on transparency, safeguards and user rights – could “simply” be looked up in the law.
I’m now wondering whether we can get a similarly helpful collection of guidance by treating “processing that can use anonymous data” as a sort-of seventh lawful basis. Like the guidance we derived from the other six, that should deliver:
- Clear definition (and limitation) of purpose, without which we’re unlikely to get the right kind and level of anonymisation. We might even conclude that the purpose is better achieved using personal data within the GDPR;
- Transparency both about how we can produce the anonymised data we need and – since the anonymisation step involves processing of personal data – how we explain that to data subjects;
- Safeguards, in particular how we ensure that our data are, and remain, anonymised. These will depend on whether we are actually destroying the input personal data (for example aggregating log records to get usage statistics over time), or just removing sufficient information to make re-identification or singling out unlikely. As the UK Anonymisation Network explain, at least the latter approach requires an ongoing risk-management process not dissimilar to the legitimate interests balancing test when using personal data;
- Individual rights, at least those applicable to whatever lawful basis we are using for the anonymisation process, and perhaps also something similar to a “right to object” if the data subject can point out a particular reason why that process doesn’t effectively anonymise them. We should at least learn from the GDPR right to object and use any data subject concerns as prompts to review what we are doing and how we are explaining it, to understand why that isn’t sufficiently reassuring.
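The distinction in the safeguards point above can be made concrete in code: one process destroys the input personal data entirely (aggregating log records into usage statistics), while the other merely coarsens identifying fields, leaving a residual re-identification risk that needs ongoing management. This is a minimal sketch with made-up records and field layouts, not any particular log format or a recommended anonymisation scheme:

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw log records: (timestamp, client IP, requested path).
# The fields and values are illustrative only.
records = [
    ("2024-05-01T09:14:02", "192.0.2.17", "/library/search"),
    ("2024-05-01T09:20:45", "192.0.2.17", "/library/item/42"),
    ("2024-05-01T10:02:11", "198.51.100.8", "/library/search"),
]

def aggregate_daily_counts(records):
    """Approach 1: keep only daily usage totals, destroying the
    personal data in the input once the counts are produced."""
    counts = Counter()
    for timestamp, _ip, _path in records:
        day = datetime.fromisoformat(timestamp).date()
        counts[day] += 1
    return dict(counts)

def generalise(records):
    """Approach 2: coarsen identifying fields (hour-level time,
    IPv4 truncated to its /24 network). Re-identification or
    singling out may still be possible, so this output needs the
    ongoing risk-management process described above."""
    out = []
    for timestamp, ip, path in records:
        hour = timestamp[:13]  # keep only date and hour
        network = ".".join(ip.split(".")[:3]) + ".0/24"
        out.append((hour, network, path))
    return out
```

The first function is the easy case: once the source records are deleted, only the statistics remain. The second is where the UKAN-style balancing work lives, because whether “/24 plus hour” is coarse enough depends on how many users share that network and what else an observer might know.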
The term “anonymous” is used with sufficient variety of meaning that I find it worrying more often than reassuring. Has the speaker actually implemented a process to produce “personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable” (GDPR Rec. 26), or just blanked the “Personally Identifying Information” (the dreaded US term)? Something like the above approach would give me a lot more confidence.