What individual-based models give us
There are two main reasons to model sexually transmitted infection (STI) epidemics using individual-based models: to include more heterogeneity than is practically possible with standard compartmental differential-equation-based models, and to account for networks of sexual relationships.
But are we actually increasing our knowledge of these epidemics by using this more sophisticated technique? Sexual partnership formation and dissolution in the natural world, as well as the risk of infection per partnership, are extremely complicated processes that are impossible to capture precisely in models. The quality of the available data on the distribution of sexual relationships, the frequency with which people have sex, the frequency and number of concurrent relationships, the frequency of casual encounters, and the risk of transmission of STIs is contested, and often quite poor. When we use these data in our models, sometimes extrapolating or guessing the missing bits very imaginatively, are we generating outputs, such as estimates of prevalence and mortality, that have any validity?
Consider three examples:
Example one: Universal test-and-treat
Granich et al. published their widely cited model that found that a universal test-and-treat approach would virtually eliminate the South African HIV epidemic (1). A spate of models followed looking at the same question, many of which were reviewed by Eaton et al. (2).
This was followed by Hontelez et al. publishing the results of an extremely sophisticated set of individual-based models of increasing complexity (3). They confirmed “previous predictions that the HIV epidemic in South Africa can be eliminated through universal testing and immediate treatment at 90% coverage.” But they also stated that their “more realistic models show that elimination is likely to occur at a much later point in time than the initial model suggested.” They further found that the policy at that time, initiating treatment at a CD4 threshold of 350 cells/microlitre, would also eliminate the epidemic, albeit at a later date.
One of the authors of the Granich et al. model, SACEMA’s Brian Williams, wrote a sceptical response: “I disagree with both claims and believe that their more complex models rely on unwarranted and unsubstantiated assumptions.” (4). (Without endorsing or disagreeing with Williams, it’s worth reading his short paper in full, because it makes interesting points that modellers need to at least be aware of.)
It is hard to resolve this disagreement. On the one hand, the Hontelez model has numerous parameters, adding complexity based on assumptions and data that are at best contested, yet it reaches very different qualitative findings from the Granich model. On the other hand, the Granich model is extremely simple: it has no age or sex structure. These are polar opposite approaches to modelling, and it is currently impossible to say which captures reality more accurately.
Example two: Comparing models against survey data
Eaton et al. assessed ten models of the South African HIV epidemic against survey data (5). Three were individual-based models and the remainder equation-based models. All the models estimated lower prevalence for 2012 than a survey estimate, with eight estimating below the survey’s 95% confidence interval. Eight models estimated that prevalence would stay the same or decline between 2008 and 2012, whereas it increased across two surveys. The models’ estimates also differed substantially from each other on some outputs, though they did match survey data in some respects, including predicting approximately the same number of people on antiretroviral treatment. It is also conceivable that the survey data were less accurate than some of the models.
In essence, model outcomes differed to varying extents from each other and from the empirical data. These differences were not merely quantitative; they were sometimes qualitative: predicting a stabilisation or decrease in prevalence versus the increase found by the surveys. Here the uncertainty does not only arise from the inclusion of individual-based models; we see it with the differential-equation-based ones too.
What are we to make of these conflicting results? Do they reduce our confidence in modelling generally, or are there valid insights about the HIV epidemic that we can learn from all or most of these models?
Example three: Matching algorithms give different results
My area of research is the development of algorithms that match agents in individual-based models. There are algorithms that can replicate the distribution of relationships in the population we’re studying perfectly (within the confines of our flawed data, of course, and the word “perfectly” needs clarification beyond the scope of this article), but they are very slow. To run thousands of simulations on large populations we have to use algorithms that merely approximate the distribution of relationships. The challenge is to find algorithms that do this well and quickly.
We recently published a few such algorithms (6). We subsequently tested them in an individual-based model of fictitious sexually transmitted infections and found two things. (1) Different algorithms produce different results, though often these differences are not big enough to worry about. (2) More worryingly, if the risk of infection per partnership is high (e.g. an infection with a transmission rate similar to that of HPV), then as the population increases, the incidence of the STI drops (our paper is still under peer review). The finding is not apparent with small population increments of a few thousand. We varied our population from 5,000 individuals all the way up to 40 million, running thousands of simulations so that we could estimate average final prevalence over many runs (like most individual-based models, ours is stochastic, so no two simulations give identical results).
When we randomly pair agents, population size makes no difference, but if we use an algorithm that attempts to match the underlying distribution, incidence decreases with population size.
We were able to make this latter finding because our model is written in C++, is highly optimised, and runs on state-of-the-art consumer hardware, allowing us to execute many simulations at many population sizes. (It’s free software, albeit still rough around the edges, available at https://github.com/nathangeffen/faststi.)
The finding is interesting but disturbing. It suggests that for epidemics with high infection rates, models may need to have approximately the same number of individuals as the population being modelled. We have some hypotheses as to why this phenomenon happens, but I’m presently unconvinced by them. The most obvious explanation is that with bigger populations, the quality of matches is better, because individuals are more likely to be matched in partnerships they “desire”. This then results in distinct networks where the disease remains confined for longer. However, while this seems plausible, it does not appear to be what is happening in our simulations.
Is this finding a real-world effect? We suspect so, but can’t prove it. Is it merely an artefact of our methodology? Maybe. Is it a programming bug? We do not think so but we cannot be sure.
With all this uncertainty, what do we get from models?
Do these contradictory results mean that modelling is presently at best an immature field that fails to increase our knowledge of epidemics? The question may be asked of non-STI disease models too. Butler discussed the fact that Ebola models overestimated the number of cases in Liberia during the 2014 epidemic (7). The article was subtitled “Rate of infection in Liberia seems to plateau, raising questions over the usefulness of models in an outbreak”.
This sparked a strongly worded response by Rivers who wrote: “Your assertion that models of the Ebola epidemic have failed to project its course misrepresents their aims … They helped to inspire and inform the strong international response that may at last be slowing the epidemic”. The letter concluded “Epidemics are affected by countless variables, so uncertainty is a given. Models synthesize available information. Without them, there is little to guide decision-makers during an outbreak. Their importance goes beyond providing forecasts.” (8)
Perhaps this is the best we can hope for with STI models too. After all, even for the most sophisticated individual-based models, the assumptions of how sexual behaviour occurs are highly idealised.
A more optimistic view is that models have played a crucial role in HIV-related policy-making. Models cannot be expected to be precise, but they can be expected to give broadly accurate answers to the questions they try to answer. Thus the first models of HIV in South Africa in the early 1990s accurately showed that a large epidemic was underway and that it would become massive, but they could not predict the size of the epidemic with precision (9). In the early 2000s, the ASSA models and subsequent work were accurate in that they showed that ART would save many lives and that implementing it would be affordable. But they could not precisely predict how many lives would be saved, how much life expectancy would increase, or exactly how much it would cost. The Thembisa model is currently used to assess the short- to medium-term future of the epidemic (10). We expect it to be reasonably accurate, but no one can expect its projections to be precise. The Granich model showed — hopefully accurately; it’s too early to tell — that a policy of universal treatment would prevent many new infections in South Africa, but its projections will almost certainly be imprecise (1). It’s also too early to tell if the individual-based model by Hontelez et al. will turn out to be more precise than the Granich model.
Individual-based models provide valuable insights and it is worth investing research effort into improving them. One study that compared individual-based models against differential-equation-based ones found that the latter are prone to overestimating incidence, so it would be unfortunate to dismiss individual-based models as too complex to be useful (11).
One way to increase our confidence in individual-based models is to do sensitivity testing, i.e. running many simulations while varying the parameters within reasonable ranges. This applies to differential-equation-based models too, but it has been hard to do with individual-based models because they are often too slow for this to be practical. However, by using programming languages like C/C++ that compile to highly optimised machine code to write our simulations, as well as matching algorithms that approximate the distribution of relationships, and with advances in hardware, this is becoming increasingly feasible.
The examples above show the substantial uncertainty that surrounds model-based estimates, especially, but not only, those from individual-based models. If you read the SACEMA Quarterly, you, like me, probably enjoy modelling: the intellectual challenge of fiddling with maths and software algorithms to reach new predictions or insights about the world of STIs. Individual-based models are particularly exciting; the outcomes of these complex models are extremely difficult to predict in advance. Programming heterogeneous behaviour into the individuals in the model world, executing the simulation, and then waiting for its results creates wonderful anticipation, with the consequent temptation to believe we’ve discovered something new about the world. But our enthusiasm needs to be tempered by the understanding that the assumptions and data informing our models are often dubious at best, and that the likelihood of our models containing bugs is not insignificant. Therefore the outputs we present to the world must be treated with great caution and healthy scepticism.
Note: This article is adapted from sections in the author’s PhD dissertation. Some paragraphs are copied entirely.