A data protection model for production learning analytics

[UPDATE: the full paper describing this approach has now been published in the Journal of Learning Analytics]

[based on Doug Clow’s liveblog of the talk I did at the LAEP workshop in Amsterdam]

I was a law student when I first came across learning analytics; the idea of being asked “do you consent to learning analytics?” worried me. From the student point of view it didn’t seem to offer me either useful information or meaningful control of what (unknown) consequences might arise if I answered “yes” or “no”. From the perspective of a responsible service provider it doesn’t seem to help much either: it doesn’t give me any clues about what I ought to be doing with students’ data, beyond a temptation to do “whatever they’ll let me get away with”, which almost certainly isn’t how an ethical provider should be thinking. Since learning analytics done well ought to benefit both students and providers, there ought to be a better way.

Looking at data protection law as a whole, a lot more guidance becomes available if we think of learning analytics as a two-stage process: first finding patterns in data, and then using those patterns either to improve our educational provision as a whole, or to personalise the education offered to individual students.

Considering the “pattern-finding” stage as a legitimate interest of the organisation immediately provides the sort of ethical and practical guidance we should be looking for. The interest must be defined, stated and legitimate (“improving educational provision” should satisfy that), our processing of personal data must be necessary for that purpose, and the impact on individuals must be minimised. Furthermore, the interest must be sufficiently strong to justify any remaining risk of impact – if that balancing test isn’t met then that line of enquiry must stop. Practices that reduce the risk to individuals – such as anonymisation/pseudonymisation and rules against re-identification and misuse – make it more likely that the test will be passed.

Many of the patterns that emerge from this stage can be used without any further data processing: if correlations suggest that 9am Saturday lectures don’t lead to good results, or that one section of a course is particularly hard for students to follow, then we can simply make the necessary changes to course materials or schedules.

Other patterns may suggest that some types of student will benefit from more challenging materials, or from greater support, or from a particular choice of textbook or study path. For these the aim is to maximise the (positive) impact on each student, so a different approach is needed. Now we are “pattern-matching”: querying the data to discover which individuals should be offered a particular personalised treatment. Once a pattern and the appropriate response to it have been identified, the organisation is much better able to describe to the student what is involved and what the risks and benefits might be. Now consent can give the student meaningful information and, if the offer is presented as a choice between personalised and standard provision, control as well. Seeking consent at this stage also feels ethically meaningful because it signifies an agreement between the student and the organisation to the switch from impact-minimising to impact-maximising behaviour.

A paper discussing the legal background to the model, and exploring some of the practical guidance that emerges from it, will be published in the Journal of Learning Analytics next month.

By Andrew Cormack
