In today’s world, infectious diseases can spread quickly, so we should be able to react swiftly and correctly when they emerge. We want to be able to predict both how an epidemic is likely to spread and the effect that certain interventions, such as vaccination campaigns, will have.

However, conducting large-scale clinical trials is often infeasible, due to budget, ethical reasons, or time constraints. This is why modelling and simulation play such an important role in modern infectious disease epidemiology. Building mathematical models allows us to compare ‘what-if’ scenarios that would be impossible to evaluate in a controlled way in real life.

Over the last few years, individual-based models, where each individual in a population is represented by a unique entity, have become increasingly popular. These kinds of models are able to take into account the multiple levels of heterogeneity that exist within a population in a more direct and intuitive manner than compartmental, population-level models can.

One way in which this heterogeneity manifests is social behaviour. The average number of contacts a person has per day, and with whom those contacts are, varies from person to person. Modelling this is not trivial: social mixing patterns are very complex, and are influenced by many factors. However, they can have an important effect on the spread of an infectious disease, and thus should be included in a model for the transmission of such a disease (1, 2).

Several individual-based models for the transmission of infectious diseases that take into account heterogeneous social mixing have already been proposed. However, many of these models focus on very specific diseases or situations (3, 4), and only a few, such as FluTE (5) and FRED (6), are publicly available as open-source projects.

Stride (a Simulator for the TRansmission of Infectious DisEases) is an open-source simulator for the transmission of infectious diseases. It has been designed to be applicable to multiple situations, populations and diseases: characteristics of the environment, the population and the disease can be specified through input files. Special care has been taken to ensure optimal runtimes, even when working with populations of up to several million individuals.

In Stride, the influence of age, context and type of day on social mixing patterns is explicitly modelled (7, 8): a fifteen-year-old individual will have different social behaviour than someone who is in their forties, and that same fifteen-year-old will make different contacts at school than at home.

After briefly introducing our model, we illustrate it by simulating the spread of Influenza in a synthetic population for Miami-Dade (Florida, USA). Besides using age- and context-dependent social contact frequencies, we also compare three social contact hypotheses to examine the influence of the type of day (weekday, weekend day or holiday) on mixing patterns and disease spread.

The Stride simulator

Since Stride is based on an individual-based model, the central entity of the simulator is the Person. Each person is characterized by their age, gender, and health status, which can be susceptible, exposed, infectious, symptomatic, recovered or vaccinated/immunized.

Each person is also a member of four different Clusters. A cluster represents a group of individuals that may contact each other during a simulation day. Every individual belongs to a Household cluster. Individuals younger than 18 are also assigned a School cluster, while those older than 18 are assigned to a Work cluster – with a certain probability, reflecting the unemployment rate in the simulated population. Finally, every person is also a member of two more general Community clusters. These represent other contacts an individual may have during the day, such as visiting public spaces or meeting with friends.

For each of these different cluster types, a Contact Profile is read from an input file at the beginning of the simulation. This contains the rates at which members of these clusters contact each other, depending on their ages. From these rates, a contact probability can be derived, which is then compared to a randomly drawn number to determine if contact actually occurs between two given members of a cluster.
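The step from a contact rate to a contact decision can be sketched as follows. This is a minimal illustration, not Stride's actual implementation: the conversion used here (rate divided by the number of other cluster members, capped at 1), the two age bands, the rate values and all function names are assumptions made for the example.

```python
import random

# Hypothetical contact profile: mean daily contacts by (age band, age band).
# The values are invented for illustration only.
CONTACT_RATES = {
    ("child", "child"): 6.0,
    ("child", "adult"): 2.0,
    ("adult", "child"): 2.0,
    ("adult", "adult"): 4.0,
}


def age_band(age):
    """Map an age in years to one of two coarse, assumed age bands."""
    return "child" if age < 18 else "adult"


def contact_probability(age_a, age_b, cluster_size):
    """Derive a per-pair contact probability from the age-specific rate.

    Assumption: the daily rate is spread evenly over the other members of
    the cluster, and the result is capped at 1.
    """
    if cluster_size < 2:
        return 0.0
    rate = CONTACT_RATES[(age_band(age_a), age_band(age_b))]
    return min(1.0, rate / (cluster_size - 1))


def contact_occurs(probability, rng):
    """Compare the probability to a uniform random draw in [0, 1)."""
    return rng.random() < probability


rng = random.Random(42)
p = contact_probability(10, 12, cluster_size=31)
print(p, contact_occurs(p, rng))
```

In small clusters such as households the cap at 1 means every pair is in contact, while in large clusters only a fraction of pairs meet on a given day.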

The simulator moves forward in discrete time steps of one day. Each simulation day, we first update the health status of all individuals in the population. This means that for exposed persons, we check whether they should become infectious and/or symptomatic, and for infectious and/or symptomatic persons, we check whether they should recover. The durations of the incubation period and of the infectious and symptomatic periods are unique for each individual, and are drawn from a distribution at the beginning of the simulation.
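The daily health status update can be pictured as a small state machine. The sketch below is an assumption-laden simplification: it folds the infectious and symptomatic periods into a single infectious state, uses fixed example durations instead of drawing them from a distribution, and the `Person` fields and `update_health` function are hypothetical names rather than Stride's own.

```python
from dataclasses import dataclass


@dataclass
class Person:
    status: str = "susceptible"   # susceptible, exposed, infectious or recovered
    days_in_status: int = 0       # days spent in the current disease state
    incubation_days: int = 2      # per-person duration (fixed here for brevity)
    infectious_days: int = 4      # per-person duration (fixed here for brevity)


def update_health(person):
    """Advance one simulated day for a single person."""
    if person.status in ("exposed", "infectious"):
        person.days_in_status += 1
    if person.status == "exposed" and person.days_in_status >= person.incubation_days:
        person.status = "infectious"
        person.days_in_status = 0
    elif person.status == "infectious" and person.days_in_status >= person.infectious_days:
        person.status = "recovered"
        person.days_in_status = 0


p = Person(status="exposed")
for day in range(1, 7):
    update_health(p)
    print(day, p.status)
```

Susceptible and recovered persons are left untouched by this step; they only change state through transmission, which is handled separately.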

Secondly, we decide for each cluster which individuals are present on the current simulation day. This depends on the type of the cluster (household, school, workplace, or community cluster) and on whether the simulated day is a weekday, weekend day or a holiday. As we will see in the example below, different kinds of regimes can be implemented here.

Next, for each cluster, we simulate the actual contacts and transmissions that take place. To ensure optimal runtimes, we implemented two different algorithms to simulate contacts in a cluster. Which algorithm is used depends on which contacts we are interested in: do we want to study all possible contacts that take place within a cluster, or are we only interested in those contacts that can lead to a transmission of the disease? In the latter case, we will use an optimised algorithm, which only simulates the contacts between infectious and susceptible members of a cluster (9).
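The difference between the two approaches can be illustrated with a short sketch. It is not the algorithm of reference (9), whose details are not given here. It assumes a single contact probability and a single transmission probability per cluster, and the function names are hypothetical.

```python
import random
from itertools import combinations


def all_contacts(members, p_contact, rng):
    """Sample every possible pair in the cluster and return the contacting pairs."""
    return [(a, b) for a, b in combinations(members, 2) if rng.random() < p_contact]


def transmissions_only(members, status, p_contact, p_transmit, rng):
    """Only consider infectious-susceptible pairs and return newly infected members.

    Contact and transmission are merged into one random draw with probability
    p_contact * p_transmit, which has the same distribution as two independent
    draws.
    """
    infectious = [m for m in members if status[m] == "infectious"]
    susceptible = [m for m in members if status[m] == "susceptible"]
    newly_infected = set()
    for _ in infectious:
        for s in susceptible:
            if s not in newly_infected and rng.random() < p_contact * p_transmit:
                newly_infected.add(s)
    return newly_infected


rng = random.Random(1)
members = ["a", "b", "c", "d"]
status = {"a": "infectious", "b": "susceptible", "c": "susceptible", "d": "recovered"}
print(len(all_contacts(members, 0.5, rng)))
print(transmissions_only(members, status, 0.5, 0.3, rng))
```

The first function touches every pair, so its cost grows with the square of the cluster size. The second only does work proportional to the number of infectious members times the number of susceptible members, which is often far smaller.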

In the first case, when all possible contacts between members of a cluster are simulated, it is possible to mark a certain fraction of the population as ‘survey participants’. As such, only the contacts made by these ‘survey participants’ are logged, to limit the output of the simulator. For example, the resulting contact patterns presented in this article are based on a sample of 3000 survey participants from the total simulated population.

As mentioned before, whether contact occurs between two members of a cluster depends on the type of cluster and the age of the individuals: this probability can be looked up through a contact profile specific to the cluster type. When contact does occur between an infectious and a susceptible individual, the transmission probability determines whether a transmission of the disease also occurs.

More information on the structure and implementation of Stride can be found in a Git repository on Bitbucket: http://www.bitbucket.org/stride_ua/stride.

Influenza in Miami-Dade, USA

To illustrate the Stride model and elaborate on the importance of social contact modelling, we performed a case study on influenza transmission in Miami-Dade (Florida, USA). Besides using age- and context-related contact probabilities, we also compared three social contact hypotheses. First, we ran the simulator with only weekday contact patterns (“only weekdays” scenario). Then, we ran the simulator with contact rates that were the weighted average of weekdays and weekend days (“7-day average” scenario). Finally, we used separate contact profiles for weekdays and weekend days (“week/weekend” scenario).
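The three hypotheses differ only in which contact rate is applied on a given day. The sketch below makes that explicit under two assumptions that are not stated in the text: the weighted average uses weights of 5/7 and 2/7, and day index 0 is a Monday. The function names and the example rates are hypothetical.

```python
def seven_day_average(weekday_rate, weekend_rate):
    """Weighted average over a week with five weekdays and two weekend days."""
    return (5 * weekday_rate + 2 * weekend_rate) / 7


def rate_for_day(scenario, day_index, weekday_rate, weekend_rate):
    """Return the contact rate applied on a given simulated day.

    Assumption: day_index 0 is a Monday, so indices 5 and 6 (modulo 7)
    are weekend days.
    """
    is_weekend = day_index % 7 in (5, 6)
    if scenario == "only weekdays":
        return weekday_rate
    if scenario == "7-day average":
        return seven_day_average(weekday_rate, weekend_rate)
    if scenario == "week/weekend":
        return weekend_rate if is_weekend else weekday_rate
    raise ValueError(f"unknown scenario: {scenario}")


for scenario in ("only weekdays", "7-day average", "week/weekend"):
    print(scenario, [rate_for_day(scenario, d, 14, 7) for d in range(7)])
```

Summed over a full week, the “7-day average” and “week/weekend” scenarios yield the same total number of contacts. Only the timing of those contacts differs.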

We used a synthetic population of 2.45 million people in 867 251 households, 441 schools and 180 772 workplaces, extracted from the 2010 U.S. Synthetic Population Database (Version 1) of RTI International (10). To construct the aforementioned Community clusters, we aggregated households (which were sorted on ID: in our model proximity in household IDs reflects spatial proximity as well) until a threshold of 2000 people was reached (5). The contact rates we used were partly derived from a survey in Flanders (2), since these data were not available at a population level for the USA.
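The aggregation of households into Community clusters can be sketched as a single pass over the households sorted by ID. How the threshold is applied (here: a cluster is closed as soon as it holds at least the threshold number of people) and how leftover households are handled (here: they form a final, smaller cluster) are assumptions of this example, as is the function name.

```python
def build_community_clusters(household_sizes, threshold=2000):
    """Group households, in order of ID, into clusters of about `threshold` people.

    household_sizes maps a household ID to its number of members.
    Returns a list of clusters, each a list of household IDs.
    """
    clusters = []
    current, current_size = [], 0
    for household_id in sorted(household_sizes):
        current.append(household_id)
        current_size += household_sizes[household_id]
        if current_size >= threshold:
            clusters.append(current)
            current, current_size = [], 0
    if current:
        # Leftover households form a last, smaller cluster (assumption).
        clusters.append(current)
    return clusters


example = {101: 3, 102: 4, 103: 2, 104: 5, 105: 1}
print(build_community_clusters(example, threshold=6))
```

Because neighbouring household IDs are treated as spatially close, walking through the sorted IDs keeps each cluster geographically compact.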

After someone has been exposed, they will, on average, become infectious after two days and recover after six days. After a person recovers in this model, they become immune to future influenza infections. The basic reproduction number R0 (the number of secondary cases an infected individual causes on average) was estimated to be 1.56 for the Miami-Dade population. We ran 20 simulations of 150 simulated days each, every simulation starting with 245 infected persons in the population. Next, we examined both the social contact patterns emerging from these simulations and the number of influenza cases that manifested over time.

In figure 1, age-related mixing patterns observed in the “7-day average” scenario are presented. Yellow represents a high contact frequency, while red represents a lower frequency. For the other two scenarios, similar patterns, which closely mimic those observed in the POLYMOD social contact study (1), were observed (results not shown here). We can see that people mainly contact people of the same age, while there are also contacts between children and their (grand)parents, although these are less frequent. Individuals between 18 and 65 can be seen to mix with a broader range of ages, which can be attributed to their contacts in the workplace.

FIGURE 1. AGE-AGE MIXING PATTERNS IN MIAMI-DADE POPULATION FOR THE THREE SCENARIOS. (A) “ONLY WEEKDAYS”, (B) “7-DAY AVERAGE”, (C) “WEEK/WEEKEND”. YELLOW INDICATES HIGH CONTACT FREQUENCIES, WHILE RED INDICATES LOWER CONTACT FREQUENCIES.

In figure 2, the number of new influenza cases over time is plotted for the three different scenarios that we tested. We see that, although contact patterns between the three scenarios are very similar, the transmission dynamics are heavily influenced by the timing of those contacts. The attack rate (the fraction of the population that acquired the infection by the end of the outbreak) is higher in the “only weekdays” scenario, which may be partially attributed to the higher number of contacts typically occurring on weekdays. The peak of the epidemic also occurs earlier in this scenario than in the other two. Furthermore, the number of daily cases in the “week/weekend” scenario showed an irregular shape, indicating decreased transmission during weekend days.

FIGURE 2. COMPARISON BETWEEN THREE SOCIAL CONTACT HYPOTHESES OF INFLUENZA CASES OVER TIME FOR A SIMULATED EPIDEMIC IN THE MIAMI-DADE POPULATION, WITH R0 = 1.56.

Future work

Stride explicitly models the influence of age, context and type of day on social contact patterns. However, there are many other factors that influence mixing patterns, which might be included in our model in the future. Examples of these are general policies such as school closures and quarantines, but also fluctuations in people’s social behaviour during an epidemic. We expect future research to focus on including these and other types of adaptive behaviours.

This article was based on the following paper: Kuylen E, Stijven S, Broeckhove J, Willem L. Social Contact Patterns in an Individual-based Simulator for the Transmission of Infectious Diseases (Stride). Procedia Computer Science. 2017; 108: 2438-2442.

Decades of prevention and control programmes for infectious diseases are increasingly challenged by the combination of the growing global reach of infectious diseases and vaccine controversies. Historical examples exist where misperceptions about vaccine-related side effects have lowered vaccine coverage substantially, opening a window for outbreaks, re-emergence and sustained prevalence. For instance, a vaccine scare in 1998 linking MMR vaccination and autism significantly decreased the coverage in England and Wales from around 92% in 1995 to around 80% in 2003 (1). In addition, low incidence of vaccine-preventable diseases often leads to the public perception of reduced severity and susceptibility, which increasingly leads to people delaying or refusing vaccinations (2). These events threaten the high historical immunization coverage in many countries. So-called herd immunity that results from high immunization coverage is extremely important as it indirectly protects risk groups who cannot be vaccinated due to age or medical reasons (e.g. very young children or immunocompromised individuals). The options to adjust current programmes to prevent the (re)emergence of pathogens are endless and require continuous evaluation using different methods.

Mathematical models provide a powerful set of tools in this process, because alternatives that are feasible in terms of time, budget or ethics are often lacking. For example, modelling stochastic transmission events of vaccine-preventable childhood diseases in highly immunized populations with (clustered) heterogeneity in susceptibility can benefit from an individual-level approach using individual-based models. These models work bottom-up, with population-level behaviour emerging from the interactions between autonomous individuals and their environment. Individual-based models allow a high degree of heterogeneity for the creation, disappearance and movement of a finite collection of discrete interacting individuals.

Different terminology has been used in the literature for individual-level or individual-based models (IBM), including agent-based models, cellular automata and micro-simulation, as well as more generic terms such as computer simulations and complex adaptive systems (3). A distinction in nomenclature can be drawn according to whether the simulation is based on nodes of a grid (as in cellular automata), or on agents that are self-contained programs that collect information from their surroundings and have the autonomy and capacity to learn and adapt (as in agent-based models). These terms have been used interchangeably in the literature, and this inconsistency curtails efficient knowledge transfer. The standard incorporation of the over-arching term “individual-based model” in the abstract or keywords would greatly improve current and future systematic searches in large electronic databases. Henceforth, we will use the overall term “IBM” to refer to the individual-level approach.

Current and future IBM applications

A systematic review of a decade (2006-2015) of disease transmission modelling (3) showed that most papers elaborate on unspecified close-contact infections or influenza, though IBMs for other air-, saliva-, vector-borne and sexually transmitted infections are emerging. Methods for vector-borne diseases have been described for malaria and dengue and could guide future research. IBM applications on chikungunya and Zika are expected over the next decade given the growing geographical expansion of their common vectors. Screening and (non-)pharmaceutical intervention strategies have also not yet been fully explored with IBMs.

The combination of targeted screening and vaccination strategies with economic evaluations is promising for the near future. Relatively few papers have used an IBM to model stochastic outbreak analysis under high vaccination coverage for vaccine-preventable childhood diseases. However, for measles it has been shown that stochastic fluctuations around the endemic equilibrium in populations with high vaccination coverage could cause recurrent epidemics (4). The three main reasons cited in the reviewed papers for choosing an IBM are: [a] to model heterogeneous between-host interactions regarding social mixing behaviour, age, compliance to mitigation strategies and spatial distribution; [b] to model heterogeneous within-host processes in combination with between-host interactions; [c] to obtain stochastic individual-level information on the disease burden to inform economic analysis or other post-processing.

The lowest-level entity in each model was a “person” and the minimum characteristic was the health state. Depending on the research questions, heterogeneity in age, gender, spatial location, social mixing behaviour, compliance to reactive strategies, serotype carriage and cellular mediated immunity was also incorporated. Social mixing behaviour and transmission events were modelled in one unified population and/or within specific social contact clusters such as households, schools, workplaces and communities, sometimes in combination with occasional long-distance trips.

How to execute IBMs

To implement an IBM, there are different simulation platforms. Firstly, software environments for statistical computing (e.g. R) offer many embedded features and are user-friendly but currently lack specific modules for IBMs. Secondly, there are integrated platforms such as NetLogo (5), which can be practical and straightforward but might not fulfil all requirements of the inherent heterogeneity and computational burden of IBMs. A third option is a low-level programming language such as C++, which enables high-performance code but requires advanced programming skills.

Computational performance is an important aspect of a simulator’s usefulness. The evaluation and update of each unique individual in an IBM requires more processing and data access compared to population-aggregate models. Although runtimes and memory requirements are inherent to model implementation and computer hardware, the endless options with an IBM come at a price. Memory access and data movement in particular slow down simulation runs. Given the high programming burden, transparent reuse of models increases confidence in their approach and generated results. Making IBM code open-source (e.g. FluTE, FRED, STRIDE) is useful to validate model outcomes, to inspire future modelling projects and to expand model exploration. Consistent “branding” of the IBM, with a proper acronym, is practical to link studies and consolidate intellectual ownership of freely accessible source code.

In conclusion, IBMs are suited to combine heterogeneous within- and between-host interactions and offer many opportunities, especially to analyse targeted interventions for endemic infections and to model host behaviour. The latter has a major impact on disease transmission and policy interventions. There is an increasing interest in incorporating behaviour change in response to disease-related information (6). To facilitate this expansion, we advocate the exchange of (open-source) platforms and stress the need for consistent “branding”. IBMs come at a computational cost but offer a very powerful and flexible framework to analyse disease transmission in depth and ultimately to inform policy making in decades to come.

Foot-and-mouth disease (FMD) is one of the most feared transboundary diseases (1). FMD is a rapidly spreading viral infection capable of infecting a range of animals, including livestock. Infection is typically mild; it causes fever and blistering on the animal’s hooves and mouth. FMD also has a near-global distribution (2). Figure 1 displays a global map of FMD prevalence in cattle. Fear of FMD, therefore, is primarily driven by the risk of its introduction and spread into non-endemic areas. It is a fear constructed by both economics and production loss. In addition to the direct consequences of infection on production, the economic effects of FMD are also attributable to trade restrictions. Because the World Organization for Animal Health (OIE) classifies countries based on whether FMD is present, only countries free of FMD can export livestock products internationally.

FMD is endemic in many sub-Saharan African countries (3). In South Africa, successful control has eliminated the infection from most of the country. South Africa is recognized by the OIE as FMD free without vaccination. Infection, however, remains in the areas surrounding the Kruger National Park (Figure 1). In this area, African buffalo are the primary wildlife reservoir for FMD. Transmission between buffalo and cattle is believed to occur based on the following evidence. First, transmission from buffalo to cattle has been demonstrated under experimental conditions (4). Second, genetic information has shown that strains identified in buffalo are similar to those found in cattle (5). Third, new FMD infections are more common in cattle that frequently contact African buffalo (6). Understanding how buffalo maintain the infection, when transmission from buffalo is most likely, and why transmission occurs are important to understanding FMD in South Africa.

This article discusses a model-guided research programme in Kruger National Park. We focus on the development and use of an individual-based model that is guiding our field and experimental data collection.

FIGURE 1. (ABOVE) GLOBAL DISTRIBUTION OF FMD (REPRODUCED FROM: 7, 8). PREVALENCE INDICES WERE DERIVED FROM REPORTED INCIDENCE RATES AS WELL AS OTHER INDICATORS REFLECTING FMD RISK, INCLUDING EXPERT OPINION. (BELOW) MAP OF SOUTH AFRICA SHOWING THE FMD MANAGEMENT ZONES AND OUTBREAK LOCATIONS (REPRODUCED FROM: 5). THE YELLOW AREA INDICATES THE INFECTED ZONE OF KRUGER NATIONAL PARK, THE GREEN AREA REPRESENTS THE BUFFER ZONE, AND THE ORANGE AREA REPRESENTS THE SURVEILLANCE ZONE, WHICH EXTENDS ALONG THE NORTHERN AND EASTERN BORDER OF SOUTH AFRICA. IN THESE AREAS, LIVESTOCK ARE INSPECTED EVERY 7 OR 14 DAYS, RESPECTIVELY, AND THE MOVEMENT OF ANIMALS BETWEEN CONTROL AND DISEASE-FREE AREAS IS PROHIBITED.

Using individual-based models to study infection persistence

FMD epidemics in livestock are characterized by explosive spread: after the introduction of infection on a farm, the entire farm is rapidly infected. To capture this pattern, models of FMD generally divide individuals into categories based on their disease status (9). Individuals are represented as susceptible to FMD, infected but not yet infectious, infectious, or recovered. Models with this structure, called SEIR (Susceptible, Exposed, Infectious, Recovered) models, have been successfully used to model a range of rapidly transmitting infections (e.g. measles, pertussis, polio). One property of this type of model is that persistence in small populations is rare (10, 11). Transmission in the early stages of an outbreak is so rapid that most individuals quickly become recovered and few susceptible individuals remain to transmit the infection. As a result, highly contagious pathogens tend to exhibit violent fluctuations, exposing them to greater risk of extinction between outbreaks. Our research programme investigates how FMD, one of the most contagious animal pathogens, overcomes this challenge and persists in populations of its reservoir host, the African buffalo.

To address this question, we built a series of stochastic (i.e. governed by random probability distributions) individual-based models, representing alternative mechanisms of persistence. We used individual-based models because we were interested in the stochastic persistence of FMD. Using this framework, we were able to incorporate randomness in the timing and order of events and uncertainty in the timing of infection (e.g. how long it takes between a buffalo being exposed and becoming infectious). We were also interested in quantifying variation between individuals in the timing of infection and the timing of births. Increased variation in either process is known to increase the chance that the pathogen persists (11).
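To give a feel for what a stochastic individual-based SEIR model looks like, here is a deliberately simplified sketch. It is not the research team's model: it uses daily time steps, a closed herd without births or deaths, and geometric waiting times, and every parameter value and name in it is an assumption chosen for illustration.

```python
import math
import random


def count_states(states):
    """Tally how many individuals are in each disease state."""
    return {label: states.count(label) for label in "SEIR"}


def simulate_herd(n, n_infectious, beta, incubation_days, infectious_days, max_days, rng):
    """Simulate one stochastic outbreak and return the daily state counts.

    Each individual changes state through its own random draw, so repeated
    runs with different seeds give different outbreak sizes and durations.
    incubation_days and infectious_days are mean durations and must be >= 1.
    """
    states = ["I"] * n_infectious + ["S"] * (n - n_infectious)
    history = [count_states(states)]
    for _ in range(max_days):
        n_inf = states.count("I")
        if n_inf == 0 and states.count("E") == 0:
            break  # the infection has gone extinct in this herd
        p_infect = 1.0 - math.exp(-beta * n_inf / n)
        p_progress = 1.0 / incubation_days
        p_recover = 1.0 / infectious_days
        next_states = []
        for state in states:
            draw = rng.random()
            if state == "S" and draw < p_infect:
                next_states.append("E")
            elif state == "E" and draw < p_progress:
                next_states.append("I")
            elif state == "I" and draw < p_recover:
                next_states.append("R")
            else:
                next_states.append(state)
        states = next_states
        history.append(count_states(states))
    return history


history = simulate_herd(200, 5, 1.5, 2, 5, 365, random.Random(7))
print(len(history) - 1, history[-1])
```

Running this for many seeds and recording when the loop stops gives a distribution of extinction times, which is the kind of quantity a persistence analysis looks at. Adding births would replenish the susceptible pool and change that distribution.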

A model-guided fieldwork approach

Our first step was to build a model representing the simplest hypothesis describing FMD, the SEIR model. This model represents rapid transmission and rapid recovery from infection. Over 98% of buffalo have been exposed to FMD, suggesting that the high rate of transmission represented in the model is appropriate. Birthing in African buffalo is seasonal, with most calves being born from December to April. This wide birth pulse may allow FMD to persist because it allows susceptible animals into the population over time. FMD could circulate through each year’s population of new susceptible calves, with the latest born calves of one year sparking the new epidemic in the earliest born calves of the following year’s cohort. To evaluate this null hypothesis, we analysed the SEIR model and evaluated the conditions allowing persistence.

To parameterize the model, we conducted experimental transmission studies. In these experiments, we housed experimentally infected and naive buffalo together in the same enclosure, and we monitored the timing of transmission and infection. There are three types of FMD, called serotypes, known to circulate within South Africa: South African Territories (SAT) serotypes 1, 2, and 3. We, therefore, repeated this experiment for each serotype. These experiments allowed us to estimate serotype-specific model parameters: how long do buffalo remain infectious and how rapidly is FMD transmitted? They also allowed us to quantify uncertainty in epidemiological parameters to ensure our model predictions robustly account for the challenges inherent to data collection.

The experimental results showed that FMD was transmitted rapidly from acutely infected buffalo and recovery occurred within 4-6 days. Based on this information, model results indicated that FMD would invariably go extinct from buffalo populations within a year or two. In model simulations, FMD only persisted when buffalo were assumed to be infectious for longer than 20 days or in populations larger than 1000 buffalo (Figure 2). This is clearly not the case—which raises the question, what piece of essential biology is represented incorrectly in this model?

Our second step was to build models representing additional mechanisms of persistence and collect data to quantify and test these mechanisms empirically. One potential mechanism involves buffalo that maintain the infection over longer time periods. Some studies have reported recovery of FMD virus from buffalo over five years post infection. Therefore, we are quantifying how often and under what conditions these longer-infected buffalo, called carriers, transmit FMD. Another potential mechanism is that buffalo may lose immunity over time. FMD is a rapidly evolving infection, so the viral population may change to allow re-infection. We can represent loss of immunity in our model, and we are currently quantifying changes in the virus population in wild buffalo. Models representing these alternative hypotheses provide a clear set of assumptions to inform field and experimental data collection. Experiments to quantify these processes are underway and can be incorporated into our individual-based modelling framework.

In conclusion, our model results suggest that transmission between calf cohorts, as represented in the SEIR model, is highly unlikely to support the persistence of FMD in African buffalo populations. Because the assumptions in this model were used to inform data collection, all epidemiological parameters were estimated from experimental infections. The individual-based modelling approach gave us the flexibility to incorporate both uncertainty in parameter estimates and individual variation into model predictions. It also provided a clear set of assumptions to guide ongoing data collection for our alternative hypotheses.

This work is, therefore, one step in a larger model-guided study on FMD in African buffalo. As we finalize data collection this year, we look forward to further refining our assumptions and clarifying how FMD persists in its wildlife reservoir.

About the FMD-buffalo research team: The FMD-buffalo research team is an interdisciplinary and multinational collaboration. Dr. Jan Medlock and Dr. Anna Jolles lead the modeling work on this project (Oregon State University, OSU). Field-based data collection and disease testing are organized through Kruger National Park’s Veterinary Wildlife Services department, Dr. Lin-Mari de Klerk-Lorist, Dr. Louis van Schalkwyk (Department of Agriculture, Forestry and Fisheries, Directorate of Animal Health, State Veterinary Office in Skukuza), Dr. Brianna Beechler (OSU), Dr. Francois Maree, Dr. Katherine Scott (ARC-Onderstepoort Veterinary Institute), and Dr. Eva Perez (The Pirbright Institute).

How?
“Classroom models” and hands-on mini-projects with NetLogo and RStudio.

Instructors?
Wim Delva (SACEMA, Ghent University, Hasselt University, KU Leuven) and Lander Willem (University of Antwerp).

Where?
SACEMA, Stellenbosch University, South Africa.

What?
Focus on the principles of conceptualising a model world, coding the model, analysing its behaviour, fitting it to data and communicating its results, with practical model examples in the epidemiology of HIV, Influenza, Malaria and Diabetes.

For whom?
Postgrad students, postdocs and health science professionals whose work potentially involves the design and/or use of individual-based models in epidemiology. Prior experience with R is a plus. Prior experience with NetLogo is not required.

Fees & application form?
To be announced on www.sacema.org in early 2018.