Navigation Bar (see also links at bottom of page) Home PageContactsNewsSiteIndexTrainingSupportPricesResearchSoftware

Avoiding Mistakes in Population Modeling:
Variability (Stochasticity)

List of common mistakes in population modeling

Ignoring variability

Although deterministic measures such as predicted population size and population growth rate may be useful under certain circumstances, variability is such a pervasive feature of natural populations that ignoring it almost always leads to an incomplete picture of the state of the population and an incorrect prediction of its future.

Deterministic models make point (single-value) estimates of variables such as future population size, based on point estimates (e.g., average values) of parameters such as survival rates, fecundities, etc. In natural populations, these parameters show variation in time and space, and it is not the average values of these parameters that put the populations at risk; it is the lower values they take (or the tails of their distributions). Therefore, ignoring variability often gives a misleading, optimistic picture of a population's viability.

In RAMAS Metapop, natural variability in parameter values (environmental stochasticity) is modeled by random fluctuations in age or stage-specific fecundities and survivor rates, carrying capacities, and dispersal rates, as well as two types of catastrophes. In addition, demographic stochasticity and observation error can be simulated (see the Stochasticity dialog under the Model menu).

Not using demographic stochasticity

Demographic stochasticity is the sampling variation in the number of survivors and the number of offspring that occurs (even if survival rates and fecundities were constant) because a population is made up of a finite, integer number of individuals. In RAMAS Metapop, demographic stochasticity is modeled by sampling the number of survivors from a binomial distribution and the number of offspring from a Poisson distribution (Akçakaya 1991). Relative to other factors, demographic stochasticity becomes more important at small population sizes.

Demographic stochasticity option should be used for all models, unless you are modeling densities (such as number of animals per km2) instead of absolute numbers of individuals (however, in this case, note the fact that the program always models abundance as an integer). It is especially important to model demographic stochasticity when modeling impacts of habitat fragmentation. A model that ignores demographic stochasticity will underestimate the increased extinction risk when a population is fragmented into several smaller populations.

If the standard deviation estimates you are using incorporate the effects of demographic variability, it is more appropriate to estimate standard deviation without the contribution by demographic stochasticity (see below), than to exclude demographic stochasticity from your model. This is because if the population size decreases in the future, the component of observed variance due to demographic stochasticity in your data will underestimate the variance due to demographic stochasticity in these lower population sizes, thus your model will underestimate the risk of decline or extinction of the population.

Environmental stochasticity distribution

If the standard deviations are high, the results may be biased because of truncation. In this case, selecting a lognormal distribution instead of a normal distribution may be helpful. Lognormal distribution is recommended if (i) any survival rate or fecundity has a small average value with a large standard deviation, (ii) any survival rate has a high average value with a large standard deviation. For a more detailed discussion, see the help file and the manual.

Ignoring correlations

Correlations among vital rates (elements of the stage matrix) increases the variability of the population size, and hence increases the risks of decline or increase. Thus, when correlations are not known, assuming full correlation rather than independence gives results that are more conservative (precautionary). Note that the correlation discussed here is the correlation (across years) among the underlying vital rates, for example among the survival rates (or fecundities) of different age classes; it is not the correlation in the observed number or proportion of survivors (or offspring, in the case of fecundities). The distinction relates to demographic stochasticity: Even if the underlying survival rates have high correlation across years (e.g., between one-year old and two-year old individuals), the observed proportion of survivors may have a low correlation, because the observed proportion will include sampling variation (and demographic stochasticity), which by definition is independent between age classes. Thus, the observed correlation may underestimate the actual correlation if the sample size (or abundance in the observed age classes) is low. (RAMAS Metapop assumes full correlation between survival rates of different stages, and between fecundities of different stages. By default, the program sets full correlation between survival rates, fecundities and carrying capacities, which can be changed by the user).

A related issue is the correlation among vital rates of different populations in a metapopulation model, which may have a large effect on simulation results.

Too few replications

Models with environmental stochasticity should have a sufficient number of replications to adequately simulate the tails of the distributions of random variables, and to estimate risks with sufficient precision. Use the maximum number of replications (10000) unless you are running a test simulation or making a demonstration. With 1000 replications, the risk curves have 95% confidence interval of about ±0.03 (see the figure and discussion in Chapter 11 of the manual on confidence intervals for risk curves).

However, one should also be careful not to be too confident because there are a large number of replications. The confidence limits are purely statistical; they do not tell anything about the reliability of your model. If the model has large uncertainties, the risk results will not be reliable, regardless of the number replications.

Random fluctuations vs. regular oscillations

In some cases, the observed variability in a population is part of a regular oscillation, or other fluctuation that has periods longer than the time step of the model. Examples include multi-annual predator-prey cycles, and long-term oscillations in demographic parameters caused by global cycles such as ENSO. In many cases, annual fluctuations would be superimposed on such cycles in demographic rates (see the figure on the right for an example). When observed variability in a demographic rate includes a long-term (multi-annual) cycle, using the total observed variance as the variance of a short-term (annual) fluctuations in a model could give biased results. Note that the cycles and fluctuations we are interested in here are in vital rates (survival and fecundity); they are not cycles or fluctuations in population size or density. This is an important distinction (see below).

The correct way of incorporating an observed pattern of random fluctuations superimposed on longer-term cycles is to model the long-term cycle as a temporal trend in vital rate (using .SCH and .FCH files in the Populations dialog), and to use the remaining (residual) variance to model short-term (annual) fluctuations (environmental stochasticity as modeled in the "Standard deviations" dialog). For an example of this approach used to model predator-induced cyclicity, see Akçakaya et al. (2004b). See the program manual and help files for details about temporal trends files.

This issue is related to the temporal autocorrelation in survival and fecundity, i.e., whether deviations in a vital rate are correlated in time (e.g., "bad" years more likely to be followed by "bad" years). Population sizes are often temporally autocorrelated, because population size at any time step is (at least in part) a function of the population size in the previous time step(s). However, this does not necessarily mean that vital rates are temporally autocorrelated. In an exponential (density-independent) model, temporally uncorrelated vital rates ("white noise") will result in a "random walk" pattern of temporally autocorrelated population sizes, associated with spectral reddening. Most natural populations show dynamics "halfway" between white noise and random walk (in population size, not vital rates), a pattern that can be explained by white-noise (uncorrelated) environmental fluctuations and a combination of weak to no density dependence, age structure, and/or measurement (observation) error (see Akçakaya et al. 2003a). For these reasons, in RAMAS Metapop, temporal autocorrelation is not explicitly modeled. However, autocorrelated environmental fluctuations can be added as discussed above, by using temporal trend files in addition to environmental stochasticity in the model.

Using catastrophes instead of environmental stochasticity

Catastrophes are infrequent events that cause large reductions (or increases) in population parameters. (Events that increase vital rates are called "bonanzas", but both catastrophes and bonanzas can be modeled as Catastrophes in RAMAS Metapop, using different "multipliers".) Some models include catastrophes that would be better modeled as part of normal environmental variability. Many factors that are modeled as catastrophes (such as El Nino, fire, drought) form a continuum, and can be modeled either as catastrophes or as part of normal environmental variability (perhaps overlaid on top of a temporal trend), depending on temporal and spatial scales. The best (perhaps the only) way to determine whether to model such a factor as a catastrophe or as part of annual fluctuations is to check the distribution of vital rates (or other population parameter affected by the factor). If the distribution is bimodal (with some years having distinctly different values), then adding the factor as a catastrophe is justified (for example, see Figure 3 in Akçakaya et al. 2003b).

Overestimating catastrophe impact

If a model includes a catastrophe (e.g., a hurricane that lowers annual survival rate), then the estimates of the mean and standard deviation of the affected demographic rates (e.g., survival rate) should exclude the rates observed during catastrophe years (e.g., see Akçakaya & Atwood 1997). Otherwise, the model would include the effects of the catastrophe twice, overestimating its impact.

Not incorporating delayed effects of catastrophes

Some catastrophes have delayed effects. For example, the effects of a fire that reduces the carrying capacity by destroying part of the available habitat will last for several years until the habitat is fully restored. During this time the carrying capacity may be gradually increasing to its pre-fire level. Similarly, the effects of a toxic spill may last for several years, during which average survival rates may be gradually increasing to their normal level, even as they fluctuate from year to year. In RAMAS Metapop, such effects can be modeled by a combination of catastrophes and temporal trends in average population parameters (such as carrying capacities or survival rate). In modeling such effects, it is important to correctly specify "Time since last catastrophe" and "Reset vital rates" parameters. See the help file or the manual for more details.

Overestimating variation

The standard deviation parameters to model environmental stochasticity should be based on observed temporal variance. Variance components due to sampling and demographic stochasticity must be subtracted from total observed variance. Otherwise, standard deviations may be overestimated, which may cause overestimated risks, as well as truncation (and, consequently, bias) in vital rates.

Using the wrong standard deviation

The standard deviation parameters to model environmental stochasticity should be based on the temporal variation in these parameters. When such data are lacking, some modelers use estimates of spatial variation (or even measurement error or sampling variation) from a single time step. There is no reason to expect that spatial variance (or sampling variation) should be similar to temporal variation.

Another mistake is to base the estimates on the standard error of mean, rather than on standard deviation of the series of estimates of the vital rate.

When "Pool variance for survivals" option is selected in RAMAS Metapop (in the "Advanced stochasticity settings"), the standard deviations must be estimated in a specific way (see the help file).

If you are estimating fecundities as a product (e.g., maternity multiplied by zero-year-old survival rate), remember that variance of the product of two random numbers is a function of their means, variances and covariance (see Akçakaya & Raphael 1998 for an example).

Not correcting truncations

Truncation of sampled values may introduce a bias, so that the realized means may be different from the average value used as the model parameter. (Note any error messages that the program displays while running a simulation.) The most common causes of truncation (and suggestions for correcting them) are listed below.

  1. The standard deviations and means you entered are not based on data. If you guessed the values of stage matrix and standard deviation matrix, you may have overestimated them. The means and standard deviations should be based on observed survival rates.
  2. The standard deviations are large because the distribution of the survival rates is bimodal. This may happen, for example, if males and females in the same age class or stage have different average survival rates. In this case, it may be better to include separate stages for males and females, or to model only females.
  3. Bimodality may also result from catastrophes, if the survival rates in years with catastrophes are very different than those without catastrophes, and they are combined in a single distribution. In this case, it may be better to add catastrophes to the model explicitly, and separate their effects from normal year-to-year variability.
  4. The standard deviations are large because they include variability due to sampling and measurement error and demographic stochasticity, and spatial variability. In this case, the standard deviations estimates should exclude these components (see above). Also, the "Pooled variance for survivals" option (in "Advanced Stochasticity Settings") will reduce truncations.
  5. The distribution for environmental variation is Normal when there are survival rates close to 0 or 1. Use lognormal distribution instead.
  6. The population you are modeling does not fit the assumption of the program that all survival rates within a population are perfectly correlated. The "Negative correlation for largest survival" option (in "Advanced Stochasticity Settings") provides an alternative assumption.

Phantom individuals

As discussed elsewhere, constraints must be imposed when sampling vital rates, especially survival rates. (RAMAS Metapop does this automatically, as long as the "Constraints Matrix" is properly specified.)

Even when constraints are imposed, demographic stochasticity may make the number of individuals surviving from a given stage larger than the number in the stage in the previous time step, thus creating "phantom" individuals. This happens if there are two or more survival rates (to different stages) from a given stage, the total survival rate is close to 1.0, and the number of individuals is small. This is automatically corrected by RAMAS Metapop, but if you are using another program, make sure a similar correction is made.

Duration (simulation time horizon) too long or too short

The effect of uncertainties in the model structure and parameters on model results get compounded in time. In other words, the range of predicted outcomes expands with time, so that for long time horizons (simulation durations), the results may become too uncertain to be of any use. Long time horizons also make it difficult to justify model assumptions (e.g., that the vital rates will fluctuate around the same average values with the same variability as was observed).

In contrast a simulation time horizon that is too short may yield results that are not relevant or interesting. For example, when simulation time horizon is less than the generation time of the species, the risk of extinction or substantial decline may be very low, but this low risk may not be relevant to the management question at hand.

Also, as discussed elsewhere, impacts may be assessed as very small when simulation time horizon is too long or too short (see also Akçakaya & Sjögren-Gulve 2000)

The appropriate duration depends on the biology of the species (especially its generation time), the amount of data (especially the length of the time period from which data are available), and the question addressed (especially whether an absolute or relative prediction is required).

Confusing uncertainty and variability

Confusion about the terminology related to uncertainty and variability contributes to some modeling mistakes. As modeled in RAMAS Metapop, natural variability results from temporal and spatial environmental fluctuations, catastrophes, and demographic stochasticity. Natural variability can be modeled and translated into risk (probability) estimates using a stochastic model.

Model or parameter uncertainty results from measurement error, lack of knowledge, and model misspecification and biases. It determines model accuracy, and its effects on the uncertainty of model results increases with time. In principle, this kind of uncertainty can be reduced by collecting additional data. This type of uncertainty should be incorporated in the form of ranges (lower and upper bounds) of each model parameter, anmd results used in a sensitivity analysis (see below). For more information, see the "Measurement error and Uncertainty" topic in the RAMAS Metapop help file or manual.

Note that the sensitivity analysis module of RAMAS GIS (see Chapter 13 of the manual) allows analyzing sensitivity to one parameter at a time. A more comprehensive method, called global sensitivity analysis (GSA), considers all parameters simultaneously. A new GSA method, implemented as an R package, allows global sensitivity analyses using RAMAS (Aiello-Lammens and Akçakaya 2016). The R package, including sample data and a tutorial, is freely available at

List of common mistakes in population modeling

Software · Prices · Training · What's New · Forum

Research · Support · Index · Contact Us · Home