
COVID-19: How SAGE scientists really calculate the all-important R number

There are few more important numbers these days than R. The reproduction rate has for much of this pandemic been cast as the most crucial of all pieces of data.

Perhaps this is unsurprising; after all, the main question everyone is focused on is whether the spread of the disease is accelerating or not - and the official reproduction rate, the measure published by the government's scientific advisory group SAGE each week, is perhaps the simplest way of depicting that.

Anything above one and the epidemic is growing - each infected person is, on average, passing the virus to more than one other. Anything below one and it is shrinking.

Today, for instance, we learnt that the R number has moved from the wrong side of one to the right side of it.

Having hovered above one for most of the past few months, it is now in a range between 0.8 and 1.

The midpoint of that range, in other words, is 0.9. That implies that having grown rapidly in recent months, especially around the Christmas period, the spread of COVID-19 is decelerating.

Now the provisos. The first is that even though, on the basis of this latest estimate of R, the disease doesn't seem to be spreading as rapidly as in previous weeks, it is nonetheless spreading - only less quickly than before.

Or to put it another way, it's not as if people aren't catching COVID-19. They are, but fewer new cases are materialising each week than the previous week.
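To make that concrete, here is a toy projection of how case numbers evolve under different values of R. The starting figure of 10,000 cases and the idea that R applies neatly once per "generation" of infection are illustrative assumptions for this sketch, not SPI-M's method:

    # Toy illustration: how case numbers evolve under different R values.
    # Assumptions (purely illustrative): R applies once per "generation"
    # of infection, starting from a notional 10,000 cases.

    def project_cases(r, initial_cases, generations):
        """Multiply case numbers by R once per generation."""
        cases = [initial_cases]
        for _ in range(generations):
            cases.append(cases[-1] * r)
        return cases

    for r in (1.2, 0.9):
        trajectory = project_cases(r, 10_000, 5)
        print(f"R = {r}: " + " -> ".join(f"{c:,.0f}" for c in trajectory))

    # R = 1.2: 10,000 -> 12,000 -> 14,400 -> 17,280 -> 20,736 -> 24,883
    # R = 0.9: 10,000 -> 9,000 -> 8,100 -> 7,290 -> 6,561 -> 5,905

Either way people are still catching the virus; the difference is whether each generation of infections is bigger or smaller than the last.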

The second proviso is perhaps even more important. While the reproduction rate has the trappings of a totemic, monolithic figure, it is really a judgement call, or rather a combination of judgement calls. The process whereby R is manufactured is not widely discussed, yet it matters, since it underscores the uncertainty in the number.

Scientists from 11 institutions take part in a conference call, under the aegis of the Scientific Pandemic Influenza Group on Modelling (SPI-M).

Each of them proposes a range for R based on their own modelling. Some may put more emphasis on case growth, others on death numbers, others still on hospitalisations.

Some may have their own anecdotal theories about why each of these datapoints is reliable or suspect - for instance one group recently posited that some of the data on young adults were suspect because they had picked up that some students were advising each other not to get tested.

Anyway, as you might expect, these individual theories mean that often there is quite a big difference in the estimates.
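Part of the reason is that even the simplest methods involve assumptions. For instance, one textbook shortcut for backing R out of case growth - a sketch assuming steady exponential growth and a hypothetical five-day generation interval, not any SPI-M group's actual method - looks like this:

    # Textbook shortcut: under steady exponential growth, R is roughly
    # exp(daily growth rate x generation time). All numbers below are
    # hypothetical, including the five-day generation interval.
    import math

    generation_time_days = 5.0
    cases_last_week, cases_this_week = 7_000, 6_300   # hypothetical counts

    daily_growth_rate = math.log(cases_this_week / cases_last_week) / 7.0
    r_estimate = math.exp(daily_growth_rate * generation_time_days)
    print(f"Rough R estimate: {r_estimate:.2f}")   # about 0.93

Swap in a different generation interval, or deaths instead of cases, and the same data yields a different R - hence the spread of views.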

Over the course of the call, the members compare their estimates and challenge each other on them. Some groups might realise that theirs is too high or low, or change their minds for another reason, and withdraw their estimates.

At the end of the meeting, the surviving estimates are combined into a consensus range. That range is the R number released to the public.
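SPI-M has not published the precise mechanics of that combination, but a minimal sketch - assuming, purely for illustration, that the surviving ranges are simply pooled and their endpoints averaged - might look like this:

    # Minimal sketch of the consensus step. The real SPI-M combination is
    # more sophisticated; the per-group (low, high) ranges below are
    # hypothetical, as is the simple averaging of endpoints.
    surviving_estimates = [(0.8, 1.0), (0.9, 1.1), (0.7, 1.0), (0.8, 1.2)]

    lows, highs = zip(*surviving_estimates)
    consensus = (sum(lows) / len(lows), sum(highs) / len(highs))
    print(f"Published R range: {consensus[0]:.1f}-{consensus[1]:.1f}")
    # Published R range: 0.8-1.1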

That simple R estimate, in other words, can disguise a lot of behind-the-scenes disagreement.

Let me give you an example: back on 6 January, SPI-M set the R range at 1.0-1.4.

That, at least, was the number released publicly. But we now know from documents released today that at that meeting, one SPI-M member thought R was 0.8. There was another member whose range went as high as 1.6.

Now, that wide range of views is partly why the range of R that week was especially wide (normally it's about 0.2 or 0.3 between the two numbers). But there is only so much a single datapoint can really tell you - and that's kind of my point.

The reproduction rate is an estimate couched by scientists in quite a high degree of uncertainty. But the minute such things are converted into a single datapoint and used as tools of policy, they tend to lose all nuance, to be stripped of any impression of uncertainty and to be used instead to justify decision-making.

Science is all about uncertainty - but you probably wouldn't have guessed that from the way the R number has been depicted at the Downing Street press conferences.

Now, I don't want to overstate the degree of disagreement we're talking about here. We're not talking about some members of SPI-M claiming that COVID is a hoax and others declaring it to be the end of the world as we know it.

This is not Twitter. But the point is that even among this group of relatively mainstream epidemiologists there is nonetheless quite a difference of opinion. And this difference of opinion is not always evident from the way R is being presented and reported.

There is only so much one can do about this kind of thing. Not everyone is a scientist or a data obsessive. Not everyone will scour the minutes for granular detail.

Politicians and, for that matter, most of the public prefer their information in relatively small morsels, so it's somewhat inevitable that judgement calls with lots of provisos get converted into big, seemingly inviolable numbers.

It happens in economics all the time - consider GDP estimates or forecasts, all of which have a lot of probabilistic uncertainty beneath the surface.

But one way the process could be improved is to have more transparency over how these numbers are put together. For instance, I would love to be able to tell you the range of views from the 11 SPI-M members over today's R number.

I'm willing to bet there are some groups which are convinced that R is still well over 1, and some who think it has fallen even faster than the headline numbers suggest - but we simply don't know, because the "consensus statement" about this meeting will not be published until next month.

Even when those statements are published, they are still quite impenetrable. We don't get any detail about the rationale for why one group's estimate of R is up here and another's is down there.

Indeed, the minutes don't even specify which institution is responsible for what number - we get an anonymised chart instead.

There is far more left unsaid than said. For instance, in that 6 January meeting, the chart of different members' R estimates only includes seven contributions.

What about the other four members? Did they withdraw their estimates? Were they embarrassingly high or low? Were they simply on holiday? We don't really know.

There are other problems with the number: is it supposed to be a "nowcast" - in other words, reflecting the world as of today? Or is it reflecting the world as of a week or two ago?

The statements alongside the number's release seem to suggest the latter - unsurprisingly, since inputs such as hospitalisations and deaths lag infections by weeks - but it is often presented more as the former. Either way, a bit more clarity on this would be helpful, as there is quite a big difference.

For a number which, like it or not, has become extremely important, this lack of transparency is unsettling. In any other realm of policy it would be considered unacceptable.

Consider the meetings of the Monetary Policy Committee, where we get relatively detailed minutes explaining why each member voted the way they did, on the day of the decision. That's the kind of standard we should be expecting here.

Now perhaps you believe that producing extra detail and publicising the uncertainty that exists here is irresponsible.

Perhaps you believe this is one of those moments where the gravity of the situation means that expressions of doubt should be suppressed and any sense of uncertainty elided. There are many who believe that, but I respectfully disagree. Transparency breeds trust. Censorship does not.

As it happens, these days the government no longer seems to be quite as fixated on R. It no longer says the decision to lift lockdown will be predicated on getting it below one, pointing instead to other metrics like hospital admissions and vaccinations.

Even so, R remains one of the most important numbers in British politics. So it is great news that today's number is heading in the right direction.

But let's also recall that the number is not just a number but a judgement call.