The Past, Present & Future of Coronavirus…
COVID-19 forecasts try to predict the future. Accurate predictions would help hospitals and emergency rooms plan staffing and order supplies. Even better would be a model that could guide public policy, telling officials how proposed rules or closures would affect economic activity and infections. But despite many smart people involved in modeling, the COVID crystal ball is still cloudy.
Today’s blog explains the different ways that scientists are trying to make sense of coronavirus data, and peeks behind the curtain to suggest why such a simple-sounding task is so very difficult. This serves as an update to my earlier blog on Understanding Forecasts, which discussed only the earliest version of the University of Washington model. (For a more detailed dive into coronavirus data modeling see the IEEE Spectrum article by Matthew Hutson.)
COVID data presentations seem to fit into three categories:
- Past: Data that describes hospitalizations and deaths after they occur.
- Present: Models that try to predict how people and the virus react in response to healthcare advice and rules.
- Future: Forecasts of what lies ahead according to various scenarios.
– What Is Life, Without a Touch of Art In It?
These topics reminded me of a famous painting created by Paul Gauguin in Tahiti in 1897. Its title is D’où Venons Nous / Que Sommes Nous / Où Allons Nous, generally translated as Where Do We Come From? What Are We? Where Are We Going? Gauguin stated that the painting should be read from right to left, communicating the artist’s messages about birth, adulthood and the approach of death.
We can do no better than borrow from Gauguin to describe the types of COVID-19 forecasts and data models that are occupying the attention of many smart people. His three questions name the sections of this blog.
COVID-19 Data Models – Where Do We Come From?
– Excess Deaths, from Centers for Disease Control
By October 27, there had been 311,882 more US deaths in 2020 than expected from previous years. At least 231,952 of those deaths can be attributed with high confidence to COVID-19.
Another CDC chart tracks excess deaths by week. Peak deaths occurred in the week ending April 18, when there were 78,989 deaths, substantially more than the expected death count of 55,640. Recent weeks, for which records are still incomplete, show 3,000 to 5,000 excess deaths per week.
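For readers who like to check the arithmetic, here is a minimal sketch of how an excess-death figure falls out of the two numbers quoted above. The function and variable names are my own, not anything from CDC.

```python
def excess_deaths(observed: int, expected: int) -> int:
    """Deaths above the baseline expected from previous years."""
    return observed - expected


# Figures quoted above for the week ending April 18, 2020
peak_observed = 78_989
peak_expected = 55_640

peak_excess = excess_deaths(peak_observed, peak_expected)
peak_percent = 100 * peak_excess / peak_expected

print(f"Excess deaths that week: {peak_excess:,}")
print(f"Percent above expected:  {peak_percent:.0f}%")
```

In other words, the peak week ran about 23,000 deaths, or roughly 42%, above what previous years would have predicted.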
CDC shows this data, and more, not only for the US as a whole, but also state by state. Although deaths lag infections and hospitalizations by several weeks, deaths are the most reliable measure of whether we are making progress against the virus.
– State Cases Video, from Benjamin Renton / Flourish
The Flourish Design Studio displays a video by Benjamin Renton on its website. The video shows the cumulative COVID-19 cases per million population by state, from March 1 until now. Because the case count is scaled by population, smaller states are more prominent than in other displays. Here's the video, embedded from the Flourish website:
Above: Coronavirus Cases per Million by Benjamin Renton via Flourish
The states are coded red and blue, according to how they voted for President in 2016. The chart draws on data from Johns Hopkins University. Certainly, it is amazing to see how the dominance in cases shifted from blue to red states as the pandemic developed.
COVID-19 Analysis Models – What Are We?
COVID-19 models that address "What Are We?" attempt to predict how we and society will behave, given various health recommendations and rules. Such models allow researchers to see how changes in health rules or public behavior might affect virus spread for good or ill. And they tend to require a lot of computer time.
– Neural Network Models
Neural network models crunch a great deal of data and create rules that describe the apparent relationships between parameters. For example, the networks might construct relationships “between input data (such as mobility, testing, and social media) and pandemic outcomes (such as hospitalizations and deaths).” Prof. Prakash’s group at Georgia Tech terms their model “DeepCOVID.”
Neural network models allow tinkering with some factors to see how other factors change. However, the rules developed by the neural networks are very complex, which makes them hard to understand and non-intuitive. In addition, we can’t be sure of the range of parameters over which the trained network will give valid results.
– Agent-Based Models
While neural networks juggle data without considering individuals, agent-based models do just the opposite. Amazingly, a group at the University of Sydney, Australia built a model that digitally represents 24 million people, Australia's 2016 census count. These simulated people were assigned demographically accurate ages, family sizes and jobs, then allowed to mix in daytime and nighttime venues.
The researchers adjusted the parameters to best match real life coronavirus data. They then varied factors like air travel, isolation of victims, home quarantine, social distancing and school closures to see how they affected the spread of infections. They found some non-obvious results, among them that school closures themselves were less important than the level of compliance with social distancing.
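To give a feel for the agent-based idea, here is a toy sketch. It is emphatically not the Sydney model: there are only 500 simulated people, contacts are purely random, and every parameter (contacts per day, transmission probability, infectious period, and the rule that a distancing person makes a quarter of the usual contacts) is an assumption I made up for illustration.

```python
import random


def simulate(n_agents=500, days=60, contacts_per_day=8,
             p_transmit=0.05, infectious_days=7,
             compliance=0.0, seed=1):
    """Toy agent-based epidemic; returns how many agents were ever infected.

    compliance is the fraction of agents who practice social distancing.
    A distancing agent makes only a quarter of the usual daily contacts.
    All defaults are hypothetical, chosen only to make the effect visible.
    """
    rng = random.Random(seed)
    days_left = [0] * n_agents            # days of infectiousness remaining
    ever_infected = [False] * n_agents
    distancing = [rng.random() < compliance for _ in range(n_agents)]

    # Start the outbreak with five infectious agents
    for i in rng.sample(range(n_agents), 5):
        days_left[i] = infectious_days
        ever_infected[i] = True

    for _day in range(days):
        newly_infected = []
        for i in range(n_agents):
            if days_left[i] == 0:
                continue                  # not infectious today
            n_contacts = contacts_per_day // 4 if distancing[i] else contacts_per_day
            for _contact in range(n_contacts):
                j = rng.randrange(n_agents)
                if not ever_infected[j] and rng.random() < p_transmit:
                    ever_infected[j] = True
                    newly_infected.append(j)
            days_left[i] -= 1
        # People infected today become infectious starting tomorrow
        for j in newly_infected:
            days_left[j] = infectious_days

    return sum(ever_infected)


for c in (0.0, 0.5, 1.0):
    print(f"compliance {c:.0%}: {simulate(compliance=c)} of 500 ever infected")
```

Changing `compliance` and rerunning is the toy version of what the researchers did at scale: adjust one factor, hold the rest fixed, and watch how the spread responds.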
COVID-19 Forecast Models – Where Are We Going?
A COVID-19 forecast tries to predict how the virus will advance or retreat in the coming weeks or months. These models mainly deal with the current set of health rules and public behavior. And they don’t generally try to predict what will happen if we change one rule or another.
– COVID-19 Forecast of Daily Deaths, from IHME
IHME, the Institute for Health Metrics and Evaluation at the University of Washington, has been modeling and forecasting since early in 2020. My earlier blog described their original model, which assumed that hospital use would follow a rise-and-fall curve similar to that seen in China, Italy and Spain. Because the US and its people responded differently to the challenge of the virus, we did not follow the same curves. For that reason, the IHME model initially under-forecast the impact of the disease.
Since then, additional US data has allowed IHME to evolve to a “compartmental” model. Such a model separately describes people who are Susceptible, Exposed, Infected, or Removed (via recovery or death), and how quickly people move from one category to another. IHME continually adjusts the model to match real-world data.
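A compartmental model is easier to picture with a few lines of code. The sketch below is a textbook-style SEIR model stepped forward in small time increments. It is not IHME's model, and the rates I plug in (`beta`, `sigma`, `gamma`) and the population figures are hypothetical values chosen only to make the curves move.

```python
def seir(population, infected0, beta, sigma, gamma, days, dt=0.1):
    """Step a basic SEIR model forward with simple Euler increments.

    beta  : transmission rate per day
    sigma : rate of moving from Exposed to Infected (1 / incubation days)
    gamma : rate of moving from Infected to Removed (1 / infectious days)
    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - infected0)
    e, i, r = 0.0, float(infected0), 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1 / dt)

    for _day in range(days):
        for _step in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infected = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


# Hypothetical inputs: 1 million people, 10 initial infections
curve = seir(population=1_000_000, infected0=10,
             beta=0.5, sigma=1 / 5, gamma=1 / 7, days=200)

peak_day = max(range(len(curve)), key=lambda d: curve[d][2])
print(f"Infections peak on day {peak_day}")
print(f"Removed by day 200: {curve[-1][3]:,.0f}")
```

Fitting a model like this to real data means adjusting the rates until the simulated curves track the reported ones, which is the "continually adjusts" step mentioned above.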
As of October 22, IHME’s model anticipated a total of 385,611 COVID deaths in the US by February 1. Like all projections, the number is uncertain: IHME suggested it might be as low as 322,836 with universal masking, or as high as 485,607 with wide easing of mandates. The daily death rate by then might be anywhere from 1,299 to 5,562 persons per day.
A later article (October 23) extends the forecast through the end of February 2021. Its authors find that an expected 511,000 US deaths could be reduced by 96,000 to 130,000 with 85% to 95% mask wearing.
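It helps to turn those mask-wearing numbers into totals and percentages. This is just my own arithmetic on the figures quoted above, with variable names of my choosing.

```python
# Figures quoted above from the October 23 article
expected_deaths = 511_000
averted_low, averted_high = 96_000, 130_000

remaining_high = expected_deaths - averted_low    # if fewer deaths are averted
remaining_low = expected_deaths - averted_high    # if more deaths are averted

share_low = 100 * averted_low / expected_deaths
share_high = 100 * averted_high / expected_deaths

print(f"Deaths with high mask use: {remaining_low:,} to {remaining_high:,}")
print(f"Share of deaths averted:   {share_low:.0f}% to {share_high:.0f}%")
```

So widespread masking would leave roughly 381,000 to 415,000 deaths, averting about a fifth to a quarter of the expected toll.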
– Ensemble COVID-19 Forecast at CDC
An early success in coronavirus modeling was the work by data scientist Youyang Gu. He built an artificial intelligence model driven by daily deaths plus parameters such as reproduction number, infection mortality rate and lockdown fatigue. And it was very successful in predicting future COVID deaths.
Now that there are dozens of pretty good models, Gu is retiring from updating his coronavirus model. In its place, he recommends a collection of models known as the Ensemble Forecast which CDC collects and presents.
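Why combine models at all? One simple approach, sketched below, is to take the median of the individual forecasts, so that a single outlier cannot drag the answer around. This is only an illustration of the idea, not a description of CDC's actual procedure, and the model names and weekly death numbers are made up.

```python
from statistics import mean, median


def ensemble_forecast(forecasts):
    """Combine point forecasts from several models by taking their median."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    return median(forecasts.values())


# Hypothetical one-week-ahead death forecasts (NOT real model output)
hypothetical = {
    "model_a": 5200,
    "model_b": 6100,
    "model_c": 4800,
    "model_d": 9500,   # an outlier
    "model_e": 5600,
}

combined = ensemble_forecast(hypothetical)
simple_average = mean(hypothetical.values())

print(f"Ensemble (median): {combined}")
print(f"Plain average:     {simple_average}")
```

With these made-up numbers the median lands at 5,600, while the plain average is pulled up to 6,240 by the one outlier.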
This discussion has just scratched the surface of the huge number of active modeling efforts going on. I count 44 different models on one of CDC’s summary pages!
How does a thinking person, a non-expert in epidemiology, deal with this glut of material? It depends on your personal goal:
- If you want to know whether the US (or an individual state) is making overall progress, track its Excess Deaths count.
- To see how red versus blue states have fared through the year, check the Flourish video.
- If a long-term forecast interests you, the IHME model tries to scratch that itch.
- To see a range of near-term forecasts from dozens of research groups, the CDC Ensemble is a good source.
- Or, you may want to keep your eyes on the computation-intensive Analysis Models as their creators try to explore what, if anything, public health agencies can do to quench this persistent social emergency.
The scientific effort on COVID-19 forecasts and models rivals the huge efforts to develop vaccines, to find effective treatment protocols, and to give patient care during a state of continuous emergency. And all of these folks deserve our thanks and kudos for doing their part to return us to a better, more normal world.
– Paul Gauguin’s Where do we come from? Who are we? Where are we going? is in the public domain under US law
– The other figures are referenced in the text near each image
– Extra thanks to Jim Zucchetto, who alerted Linos Jacovides who alerted me to the hypnotically fascinating Flourish “racing bar chart” video above