Choosing the right simulation technique
The bad news is that there is no general-purpose method for simulation and no universal rule for choosing one. The right simulation technique depends on the underlying problem, the data set, and the aim of the study.
We have already mentioned the areas in which simulation plays a role. Depending on the area of interest, different techniques are considered. The methods used for a Bayesian analysis differ from those used for general inferential statistics with resampling techniques, and they differ again when optimization is involved. The methods change completely when the interaction of populations or individuals is in focus, or when predictions about the future at the individual level (micro-simulation) are required.
However, a few general questions can give some guidance for choosing the right technique. They are listed in Table 1.1. Of course, a clear-cut decision is often not possible. For example, optimization techniques might be used when working with a sample, but the aim of optimization is more general: it can be applied to sample data and to population data alike. Likewise, models can of course be compared within an agent-based micro-simulation, but that is not its main aim. The table should therefore be read only as a very rough categorization of methods for basic questions. It should be clear, though, that optimization methods, methods for system dynamics, and agent-based micro-simulation techniques differ from the methods developed to express the uncertainty of estimators, such as resampling methods.
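To make "expressing the uncertainty of an estimator" with a resampling method a little more concrete, here is a minimal bootstrap sketch in base R. The exponential data, the sample size of 100, and the 1,000 replications are assumptions chosen purely for illustration; they are not taken from the text.

```r
# Minimal bootstrap sketch (base R only). Data and settings are illustrative.
set.seed(123)                     # make the run reproducible
x <- rexp(100, rate = 1)          # hypothetical sample of size 100

# Resample with replacement and recompute the estimator each time
boot_medians <- replicate(1000, median(sample(x, replace = TRUE)))

se_median <- sd(boot_medians)                         # bootstrap standard error
ci_median <- quantile(boot_medians, c(0.025, 0.975))  # percentile interval

se_median
ci_median
```

The spread of `boot_medians` approximates the sampling variability of the median without requiring a closed-form variance formula, which is the kind of question the resampling methods in the table are meant for.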
The following table, on choosing the right simulation technique, uses these abbreviations: Monte Carlo techniques and resampling techniques (MC); Markov chain Monte Carlo methods (MCMC); Monte Carlo techniques applied to hypothesis testing (MC test); random number generation (RN); optimization (Opt); system dynamics (SD); agent-based modeling (ABM); design-based simulation (DBS); and model-based simulation (MBS):
| Question | Yes | No |
|---|---|---|
| Do you work with a sample? | MC, MC test, MBS, DBS | ABM, SD |
| Is variability/randomness important? | MC, MC test, MBS, DBS | ABM, SD |
| Is the number of observations large? | ABM | SD, MCMC |
| Do you apply a hypothesis test? | MC test | ABM, DBS, SD, Opt |
| Is the sample drawn from a finite population? | DBS | MBS |
| Do you work with a population? | ABM, Opt, SD | MC, MC test |
| Do you want to compare models? | MC | ABM, SD, Opt |
| Do you apply Bayesian statistics? | MCMC | SD, Opt |
| Do you need to simulate certain distributions? | RN, MCMC | SD, Opt |
| Is probability theory a main issue? | ABM, MCMC | MC, MBS, DBS |
| Has something to be optimized? | Opt | MC, MC test, MBS |
| Dynamic rules of behavior within individuals. | SD, ABM | All others |
| Do changes to the system happen over time? | SD | All others |
| Can the time-frame of interest be long? | ABM, SD | All others |
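The finite-population row separates design-based from model-based simulation. The following is a rough sketch of that distinction in base R, not a definitive implementation: the log-normal model, the population size of 10,000, the sample size of 50, and the 500 runs are all assumptions made for the example.

```r
# Sketch: design-based vs. model-based simulation (base R only).
# All distributions and sizes below are illustrative assumptions.
set.seed(2016)
R <- 500   # number of simulation runs
n <- 50    # sample size per run

# Design-based: the finite population is generated once and then held fixed;
# only the sampling step is repeated.
pop <- rlnorm(10000, meanlog = 0, sdlog = 1)      # hypothetical finite population
dbs_means <- replicate(R, mean(sample(pop, n)))   # simple random sampling without replacement

# Model-based: every run draws a fresh sample directly from the assumed model.
mbs_means <- replicate(R, mean(rlnorm(n, meanlog = 0, sdlog = 1)))

# Compare each set of estimates with its own target:
# the population mean (design-based) and the model mean exp(0.5) (model-based).
c(dbs_bias = mean(dbs_means) - mean(pop),
  mbs_bias = mean(mbs_means) - exp(0.5))
```

In the design-based part the only source of randomness is the sampling design, whereas in the model-based part it is the data-generating model. This is why the two columns of that row point to different techniques.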
Please enjoy all the chapters mentioned, simulate a cozy burning fire with R's animation package (Yihui 2013), as shown in Figure 1.1, and start exploring the different topics in Simulation for Data Science with R.
To run the burning fire simulation, have a look at the code on this website: http://yihui.name/en/2009/06/simulation-of-burning-fire-in-r/.