The EDPB’s new Guidance on Data Protection issues around Virtual Voice Assistants (Siri, Alexa and friends) makes interesting reading, though – as I predicted a while ago for cookies – they get themselves into legal tangles by assuming “if I need consent for X, I might as well get it for Y”.
We’ve been focusing more on text-interface chatbots than voice interfaces, so I did a quick compare and contrast. My conclusion is that voice interfaces do raise novel data protection challenges; text interfaces probably raise only familiar ones. Parts of the EDPB guidance would become relevant if a chatbot were used to continually monitor what you typed into other applications, to combine its text input with other data such as location or device type, or to provide access to emergency services.
Using the EDPB’s headlines and paragraph numbers…
Transparency. Both kinds of bot must provide accurate notices of processing (62); where the bot is part of a wider set of functions, the bot’s operations should be made clear and not buried in that notice (61); both types must support rights of information, subject access, etc. in accordance with the GDPR (65). In what becomes a common theme of the guidance, these rights are likely to be easier to provide through a (layered) text interface than through audio: the gabbled style used to speak the rules on lottery adverts almost certainly wouldn’t pass Article 12’s intelligibility requirement! The Accountability section returns to this idea, requiring that any transparency messages communicated through voice interfaces should also be provided (to users and regulators) in written form on websites (143).
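To make the layering concrete, here’s a minimal sketch (all names and the URL are invented; this is my illustration, not an EDPB-endorsed design) of a notice kept in two layers: a summary short enough to be spoken intelligibly, pointing to the full written text published on a website as paragraph 143 expects.

```python
# Hypothetical sketch: one transparency notice, two layers. The voice
# interface speaks a short, intelligible summary (Article 12) while the
# full written text stays available on a website (paragraph 143).
from dataclasses import dataclass

@dataclass
class LayeredNotice:
    spoken_summary: str   # short enough to remain intelligible read aloud
    full_text_url: str    # written version published for users and regulators

    def for_voice(self) -> str:
        # Speak the summary and point to the written notice rather than
        # gabbling the full policy at the user.
        return f"{self.spoken_summary} The full notice is at {self.full_text_url}."

notice = LayeredNotice(
    spoken_summary="I record what you say to answer your questions.",
    full_text_url="https://example.com/privacy",  # placeholder URL
)
print(notice.for_voice())
```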
The other theme is the loose connection between a voice-bot and the humans around it: unlike a screen/keyboard interface, you don’t need to physically touch, or even look at, a voice-bot to use it. Reminding humans of the bot’s presence and state is a new challenge (63), as is the likelihood that more than one human will interact through the same interface while needing to be provided with personalised transparency information and rights (64). There are useful reminders (66) that both kinds may have access to incidental information about the surroundings – sounds for an audio interface; location, device type, etc. for text – and that technical, process and legal controls need to be designed to handle this data (or exclude it) in accordance with the GDPR (Data Minimisation: 139).
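As a rough illustration of what such a technical control might look like, here’s a sketch (the field names are invented) of an allow-list filter that drops incidental surroundings data by default before anything is stored:

```python
# Hypothetical sketch of a control for incidental data (paragraph 66):
# strip everything except an explicit allow-list of fields, so location,
# device type and other surroundings data are excluded by default
# (data minimisation, paragraph 139).
ALLOWED_FIELDS = {"user_id", "utterance_text", "timestamp"}

def minimise(event: dict) -> dict:
    """Keep only the fields the declared purpose actually needs."""
    return {k: v for k, v in event.items() if k in ALLOWED_FIELDS}

raw_event = {
    "user_id": "u123",
    "utterance_text": "what's the weather?",
    "timestamp": "2021-07-01T10:00:00Z",
    "location": (51.5, -0.1),        # incidental: dropped
    "device_type": "smart-speaker",  # incidental: dropped
}
assert minimise(raw_event) == {
    "user_id": "u123",
    "utterance_text": "what's the weather?",
    "timestamp": "2021-07-01T10:00:00Z",
}
```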
Purpose limitation. Both kinds must meet the usual GDPR requirement to only provide functions that users expect and only perform processing that is necessary for those functions (89). The EDPB also recall (90) the need to provide separate opt-ins for each purpose that is based on consent. Their heavy emphasis on consent may make this more challenging than it needs to be.
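What separate opt-ins might look like in practice: a minimal sketch (the purpose names are invented) of consent recorded and checked per purpose, so processing for an un-consented purpose is simply refused:

```python
# Hypothetical sketch of paragraph 90's separate opt-ins: consent is
# recorded per purpose, and each purpose is checked individually before
# any processing happens. Purpose names are invented for illustration.
class ConsentStore:
    def __init__(self):
        self._granted: set[str] = set()

    def opt_in(self, purpose: str) -> None:
        self._granted.add(purpose)

    def opt_out(self, purpose: str) -> None:
        self._granted.discard(purpose)

    def check(self, purpose: str) -> bool:
        return purpose in self._granted

consents = ConsentStore()
consents.opt_in("voice_commands")  # the core function the user expects
# No opt-in was given for "ad_personalisation", so it stays blocked:
assert consents.check("voice_commands")
assert not consents.check("ad_personalisation")
```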
Data Retention. Both kinds must minimise their storage of personal data, in both quantity and duration (108, 106). The GDPR normally offers anonymisation as an alternative to deletion, but effectively anonymising a voice recording may be impossible (107). The EDPB’s framing makes misidentified wake-words a particular problem (109). This is unlikely to arise for text-based bots unless they are continually monitoring typing and popping up when they think help is required.
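Here’s a sketch of how such a retention rule might be coded (the 30-day figure is an invented example, not anything the EDPB specifies): misfired wake-word captures are discarded immediately, and everything else is deleted once it ages out.

```python
# Hypothetical sketch of a retention rule for the misidentified wake-word
# problem (paragraph 109): recordings flagged as accidental activations
# are discarded at once; the rest are deleted after a short period.
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=30)  # illustrative period only

def should_delete(recorded_at: datetime, false_activation: bool,
                  now: Optional[datetime] = None) -> bool:
    now = now or datetime.now(timezone.utc)
    if false_activation:
        return True  # misidentified wake-word: discard immediately
    return now - recorded_at > RETENTION  # otherwise enforce storage limits

old = datetime.now(timezone.utc) - timedelta(days=60)
assert should_delete(old, false_activation=False)
assert should_delete(datetime.now(timezone.utc), false_activation=True)
```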
Security. Both kinds of bot need to meet the usual GDPR requirements on security of data (123); if either is used as a way to access emergency services then availability should be a major design focus (124). Bots that provide transactional facilities, or that implement rights such as subject access, must ensure they appropriately authenticate the user before making changes or providing personal information (122). Text chatbots can do this using a wide range of existing keyboard/screen mechanisms. For voice-bots, many of these are unavailable (reading out a password is not secure!) and …
Processing Special Category Data. … voiceprints used for identification – a unique feature of the voice interface – are in the GDPR’s most sensitive category of data (biometric data used to uniquely identify a person). Nonetheless the EDPB do seem attracted to them as a way to address the “multiple-users” issue as well as individual authentication. They point out legal and technical challenges: that voiceprints should be stored and processed on the device, not a central server (133); that additional standards on protecting data must be followed (134); and that individual identification must be sufficiently accurate for all individuals and demographic groups (135). If either type of bot is used to access special category data then GDPR’s Article 9 provisions will, of course, apply.
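Pulling the Security and voiceprint points together, here’s a sketch (the matcher is a placeholder stand-in, not a real biometric library) of a gate that only serves transactional or subject-access requests after an on-device voiceprint match, with the enrolled template never leaving the device:

```python
# Hypothetical sketch combining paragraph 122's authentication gate with
# paragraph 133's on-device constraint: sensitive actions require a local
# voiceprint match, and the template stays on the device. match_voiceprint
# is a placeholder for whatever local matcher the device actually ships.
def match_voiceprint(sample: bytes, local_template: bytes) -> bool:
    # Placeholder: real matching would compare acoustic features on-device.
    return sample == local_template

def handle_request(action: str, sample: bytes, local_template: bytes) -> str:
    if action in {"make_purchase", "subject_access"}:
        if not match_voiceprint(sample, local_template):
            return "refused: speaker not authenticated"
    return f"ok: {action} performed"

template = b"enrolled-voiceprint"  # stays on the device (paragraph 133)
print(handle_request("subject_access", b"enrolled-voiceprint", template))
print(handle_request("make_purchase", b"someone-else", template))
```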