The mathematical model presented in this project is a compartmental model. A compartmental model uses simple variables, including:

  1. Expected number of susceptible individuals having significant contact with another individual in the time interval t < T ≤ t + Δt
  2. Expected number of susceptible individuals having significant contact with infected individuals in the time interval t < T ≤ t + Δt
  3. Expected number of susceptible individuals who will become infected in the time interval t < T ≤ t + Δt
  4. Total population (P)
  5. Number of infected at time 0 (I0)
  6. Number of infected at time t (It)
  7. Number of susceptible at time t (St)
  8. Probability that any given individual has significant contact with any other individual, in a unit time interval
  9. Probability of infection (i.e. infectiousness), given a single significant contact
  10. Time step (Δt)

We can now express the number infected at any time as the sum of the number infected at the end of the previous time step and the new infections during that step. Since we are using an SI model, where every member of the population is assumed to be either infected or susceptible, we can estimate the number susceptible at any given time as the difference between the total population and the estimated number infected at that time. Alternatively, we can estimate the number susceptible in the same fashion as the number infected, because new infections flow out of the S compartment at the same time that they flow into the I compartment. We can also say that the difference between the number infected at time t + Δt and the number infected at time t is the flow into the I compartment, which is itself a function of the time difference (Δt). This general form, where one difference is expressed as a function of another, gives us the term “difference equation”. We now have formulas for estimating the number infected and the number susceptible at the end of each time step, in terms of the same quantities at the end of the previous time step.
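
As a sketch of what these difference equations might look like, the block below writes them out under stated assumptions. The symbols c (contact probability per unit time), p (probability of infection per contact), and N_t (expected new infections during the step) are labels introduced here for illustration. Taking I_t / P as the chance that a given contact is with an infected individual is a homogeneous-mixing assumption, not something the text states explicitly.

```latex
% Assumed symbols: c = contact probability per unit time,
% p = infection probability per contact, N_t = expected new infections.
\begin{align*}
N_t &= S_t \, c \, \frac{I_t}{P} \, p \, \Delta t
  && \text{(expected new infections for } t < T \le t + \Delta t\text{)} \\
I_{t+\Delta t} &= I_t + N_t \\
S_{t+\Delta t} &= S_t - N_t = P - I_{t+\Delta t}
\end{align*}
```

The last equality holds because S_t = P - I_t in an SI model, so both ways of estimating the number susceptible agree.
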
  • We assume that we know the initial number infected (I0), the total population (P), and thus the initial number susceptible (S0 = P – I0).
  • We can now estimate the number infected and the number susceptible at any given time (It and St), by starting at time 0, and moving forward repeatedly by the time step, calculating the number infected and number susceptible at each step, until we reach the moment of interest. As mentioned previously, the selection of a step size is a non-trivial issue. For example, even though the chart on page 8 is pretty smooth, we can see (if we look closely) that it is not actually a straight line, nor is it the smooth curve that page 9 might lead us to believe. Making the step size smaller (computing a correspondingly larger number of steps) would make it even smoother; what might not be as obvious is that the resulting curve could change in more substantive ways with different step sizes.
  • In fact, when the flow rate between two compartments is not constant, but depends on the level in one or the other compartment, different step sizes will give different results – sometimes subtly so, but sometimes dramatically so.
  • Above all, we need to select a step size which is appropriate for the real-world problem we are modeling. For example, if we are modeling galactic formation, a step size of a second, an hour, or even a day would be too small: we would spend all our time calculating almost insignificant changes. On the other hand, if we are modeling quantum particle collisions, a step size of a second is many times too large: we are modeling changes that take place in tiny fractions of a second, and our step size must be set accordingly. For an epidemic model, a step size of a day (or some significant fraction of a day) is generally appropriate.
  • When we choose a step size which is equal to the unit time – as in our example, where we are using a unit time of 1 day, and our step size is 1 day – it is fairly common to drop the Δt symbol from the equations of flow, and from the resulting difference equations.
  • In estimating the number infected, we quickly start getting numbers which aren't integers. In epidemic modeling, we can argue that it doesn't make sense to consider a fraction of an individual to be infected: either an individual is infected, or it isn't. So we often round the results of each step's calculations to the nearest integers, and use these rounded values to compute the next step. (Note that this is not the same as doing the computations for all of the steps – out to the time horizon of interest – without rounding, and only rounding the results at the end.)
  • On the other hand, there are some cases in epidemic modeling where we definitely wouldn't round the results. For example, if we were mostly interested in the proportions of the population in the S and I compartments at each step, rather than the actual numbers, then we could represent the total population as 1.0, and the initial infected as some fractional value (in our example, it would be 0.001), and not perform any rounding. Also, if we were to apply calculus to our model, to analyze what happens as we make the step size ever smaller (approaching a limit of 0), we would need to avoid the stair-step effect produced by rounding.
  • Even when we decide to round our results at each step, the appropriate direction for the rounding is not always clear. Should we round to the nearest integer? Always round down? Always round up? The answer is not the same in every case, and it depends greatly on the type of problem being modeled.
  • Finally, we must be aware that in many cases, the results produced with rounding are qualitatively different from those produced without rounding. In particular, our flow equations and step size might be such that rounding will result in no change at all in compartment levels from step to step; this might be appropriate, or it might be an indicator that our step size is too small (or maybe even that we shouldn't be rounding).
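
The points above about stepping forward from time 0, choosing a step size, and rounding can be tried out with a minimal Python sketch. It is an illustration under assumptions, not the project's own implementation: the function name `simulate_si`, the parameter values, and the flow formula S * c * (I / P) * p * dt (homogeneous mixing) are all hypothetical choices made for this example.

```python
import math


def simulate_si(population, initial_infected, contact_prob, infection_prob,
                dt, horizon, rounding=None):
    """Step the SI difference equations from time 0 to `horizon`.

    Returns a list of (time, susceptible, infected) tuples, one per step.
    `rounding` is None (no rounding) or a function such as round,
    math.floor, or math.ceil, applied to the infected count after each step.
    """
    infected = float(initial_infected)
    susceptible = population - infected
    steps = round(horizon / dt)
    history = [(0.0, susceptible, infected)]
    for k in range(1, steps + 1):
        # Assumed flow out of S and into I during one step.
        new_infections = (susceptible * contact_prob
                          * (infected / population) * infection_prob * dt)
        # Never infect more individuals than are still susceptible.
        new_infections = min(new_infections, susceptible)
        infected += new_infections
        if rounding is not None:
            # Round at every step and carry the rounded value forward.
            infected = float(rounding(infected))
        # SI model: everyone who is not infected is susceptible.
        susceptible = population - infected
        history.append((k * dt, susceptible, infected))
    return history


# Hypothetical values, chosen only for illustration.
P, I0, c, p = 1000, 1, 0.5, 0.5

coarse = simulate_si(P, I0, c, p, dt=1.0, horizon=30)
fine = simulate_si(P, I0, c, p, dt=0.25, horizon=30)
# Note: Python's built-in round() rounds halves to the nearest even integer.
rounded = simulate_si(P, I0, c, p, dt=1.0, horizon=30, rounding=round)
rounded_up = simulate_si(P, I0, c, p, dt=1.0, horizon=30, rounding=math.ceil)

for label, run in [("dt = 1", coarse), ("dt = 0.25", fine),
                   ("round", rounded), ("round up", rounded_up)]:
    t, s, i = run[-1]
    print(f"{label:>10}: I = {i:.1f}, S = {s:.1f} at t = {t:g}")
```

With these particular values, the first step adds fewer than 0.5 new infections, so rounding to the nearest integer leaves the infected count at 1 on every step, while the unrounded and round-up runs both grow. The two unrounded runs also end at different values, which shows the dependence on step size.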