So many “AI ethics frameworks” are crossing my browser nowadays that I’m only really keeping an eye out for things that I’ve not seen before. The Government’s new “Ethics, Transparency and Accountability Framework for Automated Decision-Making” has one of those: actively seeking out ways that an AI decision-making system can go wrong.
The terminology makes it pretty clear that this is based on how we have been finding security vulnerabilities in software systems for many years: using “red teams” and “bounty schemes”. But here the aim isn’t to find bugs in software that give access to what should be private data or systems; it’s to find situations where an AI decision-maker or support system will make wrong, biased, discriminatory or harmful choices.
“Red-teaming” – in the sense of an internal activity to test the limits of systems – isn’t really new for AI. Practices such as ensuring test data sets are comprehensive and include examples that have caused problems in the past should be routine, especially given the availability of tools such as Facebook’s Casual Conversations. And there’s an active field of research exploring how adversarial modifications can make AI vision processors, in particular, mis-classify or mis-read what they see.
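The adversarial-modification idea can be sketched in a few lines. This is a toy illustration only – a two-class linear classifier with made-up weights, not a real vision model – showing an FGSM-style attack: nudge the input a small amount in the direction that most raises the wrong class’s score.

```python
import numpy as np

# Toy linear classifier: scores = W @ x (weights are illustrative, not from any real model)
W = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

def predict(x):
    """Return the index of the highest-scoring class."""
    return int(np.argmax(W @ x))

x = np.array([0.6, 0.4])          # original input; scores favour class 0

# FGSM-style perturbation: for a linear model, the gradient of
# (score_1 - score_0) with respect to x is simply W[1] - W[0].
eps = 0.3
grad = W[1] - W[0]
x_adv = x + eps * np.sign(grad)   # small, targeted modification

print(predict(x), predict(x_adv))  # the perturbed input is now mis-classified
```

The same principle – follow the loss gradient to find a small input change that flips the output – is what drives the attacks on real image classifiers, where the perturbation can be imperceptible to a human.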
But the idea of a bounty scheme does seem relatively new (it turns out it was proposed in a paper in 2020). For software bugs, payments for making helpful reports were introduced into what was already a thriving security researcher scene. Since many security bugs could be exploited to make money, either by the researcher or by someone who paid them for their knowledge, the idea was to create a counter-incentive. If you discover a new bug, report it to the software vendor and help them fix it, then you may receive a financial reward without the risk of being involved in criminality. That model has grown to the point where a software producer can subscribe to a bounty-as-a-service platform such as HackerOne; a few researchers earn enough for bounties to be their main source of income, but for most a bounty seems to be a token of appreciation, a way to fund the next round of research, or simply a way to justify the time taken to make a quality report.
The context for a bias and safety bounty is a bit different. It’s less obvious how a criminal would make money from secret knowledge of an AI bias risk, and – as far as I know – there isn’t the same community of hobbyists searching for such problems as there was in the heyday of Bugtraq in the late 1990s. So perhaps the main purpose of a bounty scheme is to create that community, with the signal it sends being as important as the payment: we want to know about bias problems, help us find them and we’ll show our gratitude in an appropriate way (which, the history of bug bounties suggests, could include T-shirts or public thanks, as well as money).
One thing that is common to both types of bounty is that the benefits depend heavily on how organisations respond to the reports they receive. It would be nice to think that – twenty years on – we won’t see a return to vendors threatening to sue researchers, and researchers threatening to “go public” with their findings. At the very least, vendor and reporter should work together to fix the problem that has been identified; ideally, they should take a step back, work out how the problem got into the system, and fix the development process as well.