In a probability sample (also called a "scientific" or "random" sample), each member of the target population has a known, non-zero probability of inclusion in the sample. A survey based on a probability sample can in theory produce statistical measurements of the target population that are
unbiased, because the expected value of the sample mean equals the population mean, E(ȳ) = μ, and that have a measurable sampling error, which can be expressed as a
confidence interval or
margin of error. A probability-based survey sample is created by constructing a list of the target population, called the
sampling frame; defining a randomized process for selecting units from the sampling frame, called a selection procedure; and choosing a method of contacting selected units to enable them to complete the survey, called a data collection method or mode. For some target populations this process may be easy, for example when sampling the employees of a company from payroll lists. However, in large, disorganized populations, simply constructing a suitable sampling frame is often a complex and expensive task. Common methods of drawing a probability sample of the household population in the United States are Area Probability Sampling, Random Digit Dial telephone sampling, and, more recently, Address-Based Sampling. Within probability sampling, there are specialized techniques such as
stratified sampling and
cluster sampling that improve the precision or efficiency of the sampling process without altering the fundamental principles of probability sampling. Stratification is the process of dividing members of the population into homogeneous subgroups before sampling, based on auxiliary information about each sampling unit. The strata should be mutually exclusive: every element in the population must be assigned to only one stratum. The strata should also be collectively exhaustive: no population element can be excluded. Then methods such as
simple random sampling or
systematic sampling can be applied within each stratum. Stratification often improves the representativeness of the sample by reducing sampling error.

==Non-sampling error in probability sampling==