What the A-level debacle teaches us about algorithms and government

There are four key lessons about decision-making and redress

August 26, 2020
Students protest the government's handling of their exam results. Photo: Isabel Infantes/EMPICS Entertainment

In the face of overwhelming public pressure and looming legal action, the government last week scrapped its algorithm for calculating A-level grades. This resolved pressing concerns about the algorithm’s accuracy and fairness, even if it also created fresh problems for students and universities. But algorithms will continue to play a growing role in public sector decision-making in the UK, across almost all areas and levels of government. It is therefore essential to learn the lessons of this debacle so that history is not repeated.

Much of the A-level controversy focused on substantive issues around how the algorithm operated. For example, was the government right to calculate grades based principally on a school’s historical results rather than a student’s academic performance, and to apply the algorithm only to cohorts of a certain size? These are critically important questions. However, it is equally important to look at what can be learnt from the failures in the government’s decision-making processes. What were those failures? Were there adequate avenues for members of the public to review the government’s decisions? And what does this experience tell us about our systems for public decision-making and redress in a world of algorithms? These questions have received less coverage. From this perspective, there are four key lessons to be taken from the A-level debacle.

The first is that the government must be transparent in developing and using algorithms. Transparency promotes better decision-making, by enabling people to test, explain and supplement the information before the decision-maker. It also respects those affected by a decision, by treating them as capable of participating in and understanding it.

In this case, the government’s decision-making was, on its face, open. Ofqual, the exams watchdog, began consulting the public about the algorithm’s design and use in April, and announced its decisions in May. But the reality was quite different. It was only after results were released on 13th August that the public had enough information about the algorithm to understand and assess its implications. This lack of transparency was flagged as problematic and legally dubious well ahead of results day. The government must promptly disclose the details of its decision-making algorithms, particularly where they have serious consequences for people’s lives.

Ofqual’s approach also needlessly created a “sink or swim” moment for the system on results day. By that point, it was too late for the government to refine its approach, and the class of 2020 was already puzzled and angry at having been judged based on the performance of students past. Ultimately, the government’s only option was a sweeping U-turn.

The second lesson is that there must be rigorous, expert and independent scrutiny of government algorithms before they are deployed. Transparency alone will not ensure that algorithms are accurate, fair, and lawful. This is particularly so where, as here and in many cases, an algorithm is technically complex and difficult for laypeople to examine, and where, unlike here, an algorithm affects only a discrete minority with limited institutional support. There must also be robust legal and policy structures to support this scrutiny. The government might otherwise simply decline to engage, as Ofqual appears to have done at times in this process.

The A-level debacle highlights the limits of our current systems and institutions in this respect. Data protection law required Ofqual to assess and address any risks to the rights and freedoms of students, and to consult with the Information Commissioner (ICO), the independent data protection authority, if such risks were high. Under equality law, Ofqual had a positive and continuing duty to give due regard to the need to eliminate discrimination and advance equality of opportunity.

But Ofqual’s published version of its data protection impact assessment was manifestly inadequate. For example, it failed to genuinely engage with the legal requirements for fully automated decisions. It is unclear whether the ICO independently scrutinised the government’s decision-making at any stage. And Ofqual apparently failed to consider how several aspects of its approach, such as the selective application of the algorithm and the last-minute provision for appeals based on mock exams, might discriminate against particular groups.

The government obviously faced a very difficult task: attempting to reconcile conflicting objectives, such as avoiding grade inflation and treating individuals fairly, in exceptional circumstances. But further research and policy thinking is required so that, in the future, the UK has the systems and institutions necessary to address these kinds of problems before they arise.

The third lesson is that the availability of judicial review can help regulate government decisions about when and how to use algorithms. Several judicial reviews were pending when the government scrapped the algorithm, raising some of the problems discussed here and elsewhere: Ofqual’s apparent failure to comply with its statutory objectives; the algorithm’s perverse results in certain cases; the lack of transparency around how it actually worked; and so on. The A-level experience, together with several other recent cases, shows that judicial review can help ensure algorithms comply with basic standards of good governance laid down by parliament.

The fourth lesson, however, is that there must be adequate redress for poor algorithmic decisions beyond judicial review. While vital, judicial review has several limitations. It is expensive; the judge can review only the legality, and not the merits, of a government decision; and even if a person is successful, the judge can generally only order the government to make its decision again. When algorithmic decision-making goes wrong, an individual must be able to get effective redress for their grievances.

The redress scheme in this case was clearly defective. The initial appeals process was very limited. Only schools, and not students, could challenge results, and only on exceptional grounds, which did not include the merits of the algorithm’s assessment of a student’s academic performance. Appeals were left until after students had received their final grades and universities had made their offers, emptying them of much of their practical significance for many students, at least in the short term. The process was also going to cost individuals up to hundreds of pounds each, until the government made it free in response to public criticism.

The eventual U-turn has fortunately meant that this redress scheme will never be tested. But in the future, the government must carefully identify the grievances that might arise from algorithmic decision-making, and design accessible systems that enable those problems to be resolved efficiently and effectively.