There are two approaches to statistical inference: model-based inference and design-based inference. Both approaches rely on some statistical model to represent the data-generating process. In the model-based approach, the model is taken to be initially unknown, and one of the goals is to
select an appropriate model for inference. In the design-based approach, the model is taken to be known, and one of the goals is to ensure that the sample data are selected randomly enough for inference.

Statistical assumptions can be put into two classes, depending upon which approach to inference is used.

• Model-based assumptions. These include the following three types:
    • Distributional assumptions. Where a statistical model involves terms relating to random errors, assumptions may be made about the probability distribution of these errors. In some cases, the distributional assumption relates to the observations themselves.
    • Structural assumptions. Statistical relationships between variables are often modelled by equating one variable to a function of another (or several others), plus a random error. Models often involve making a structural assumption about the form of the functional relationship, e.g. as in linear regression. This can be generalised to models involving relationships between underlying unobserved latent variables.
    • Cross-variation assumptions. These assumptions involve the joint probability distributions of either the observations themselves or the random errors in a model. Simple models may include the assumption that observations or errors are statistically independent.
• Design-based assumptions. These relate to the way observations have been gathered, and often involve an assumption of randomization during sampling.

The model-based approach is the more commonly used of the two in statistical inference; the design-based approach is used mainly with survey sampling. With the model-based approach, all the assumptions are effectively encoded in the model.

==Checking assumptions==
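As a rough illustration of where each kind of assumption enters, the following Python sketch simulates data from a simple linear model and fits it by least squares. The comments mark the distributional, structural, and cross-variation assumptions described above, and a separate helper shows a design-based assumption of randomized sampling. All function names, parameter values, and the choice of normally distributed errors are assumptions made for this example only; they are not drawn from any particular source or library.

```python
import random
import statistics


def simulate_linear_model(n, intercept, slope, noise_sd, seed=0):
    """Draw n (x, y) pairs from y = intercept + slope * x + error."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Distributional assumption: each error is normal with mean 0.
    # Cross-variation assumption: the errors are drawn independently.
    errors = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    # Structural assumption: y is a linear function of x, plus the error.
    ys = [intercept + slope * x + e for x, e in zip(xs, errors)]
    return xs, ys


def fit_least_squares(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def simple_random_sample(population, k, seed=0):
    """Design-based assumption: units are selected at random,
    without replacement, each with the same chance of inclusion."""
    rng = random.Random(seed)
    return rng.sample(population, k)


xs, ys = simulate_linear_model(n=500, intercept=1.0, slope=2.0, noise_sd=0.5)
intercept_hat, slope_hat = fit_least_squares(xs, ys)

# Residuals are the usual starting point for examining the assumptions
# placed on the errors.
residuals = [y - (intercept_hat + slope_hat * x) for x, y in zip(xs, ys)]

sample = simple_random_sample(list(range(1000)), k=50)

print("estimated intercept:", intercept_hat)
print("estimated slope:", slope_hat)
print("mean residual:", statistics.fmean(residuals))
print("sample size:", len(sample))
```

Because the data here are generated under the stated assumptions, the fitted coefficients should land close to the values used in the simulation. With real data the assumptions are not known to hold, which is why the residuals are kept for inspection.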