Broadly speaking, IRT models can be divided into two families: unidimensional and multidimensional. Unidimensional models require a single trait (ability) dimension θ. Multidimensional IRT models describe response data hypothesized to arise from multiple traits. However, because of the greatly increased complexity, the majority of IRT research and applications utilize a unidimensional model. IRT models can also be categorized by the number of scored responses. The typical
multiple choice item is
dichotomous; even though there may be four or five options, it is still scored only as correct/incorrect (right/wrong). Another class of models applies to
polytomous outcomes, where each response has a different score value. A common example of this is
Likert-type items, e.g., "Rate on a scale of 1 to 5." Another example is partial-credit scoring, to which models like the
Polytomous Rasch model may be applied.
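As an illustration of polytomous scoring, the category probabilities of the polytomous (partial credit) Rasch model can be sketched as follows. The step difficulties used here are hypothetical values chosen for illustration, not estimates from any real instrument.

```python
import math

def pcm_probabilities(theta, thresholds):
    """Category probabilities under the partial credit (polytomous Rasch) model.

    thresholds: step difficulties delta_1..delta_m; an item with m steps
    has m + 1 score categories (0, 1, ..., m).
    """
    # Cumulative sums sum_{k<=j} (theta - delta_k); the empty sum (0.0)
    # corresponds to score category 0.
    exponents = [0.0]
    for delta in thresholds:
        exponents.append(exponents[-1] + (theta - delta))
    numerators = [math.exp(e) for e in exponents]
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical 5-category Likert-type item with step difficulties -1.0, 0.0, 0.5, 1.5
probs = pcm_probabilities(theta=0.8, thresholds=[-1.0, 0.0, 0.5, 1.5])
```

By construction the category probabilities are positive and sum to one; higher θ shifts probability mass toward the higher score categories.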
==Number of IRT parameters==
Dichotomous IRT models are described by the number of parameters they make use of. The 3PL is so named because it employs three item parameters. The two-parameter model (2PL) assumes that there is no guessing in the data, but that items can vary in location (b_i) and discrimination (a_i). The one-parameter model (1PL) assumes that guessing is a part of the ability and that all items that fit the model have equivalent discriminations, so that items are described by only a single parameter (b_i). This gives one-parameter models the property of specific objectivity, meaning that the rank of the item difficulty is the same for all respondents independent of ability, and that the rank of the person ability is the same for all items independent of difficulty. Thus, one-parameter models are sample independent, a property that does not hold for two-parameter and three-parameter models. Additionally, there is theoretically a four-parameter model (4PL), with an upper
asymptote, denoted by d_i, where 1-c_i in the 3PL is replaced by d_i-c_i. However, this is rarely used. Note that the alphabetical order of the item parameters does not match their practical or psychometric importance; the location/difficulty (b_i) parameter is clearly most important because it is included in all three models. The 1PL uses only b_i, the 2PL uses b_i and a_i, the 3PL adds c_i, and the 4PL adds d_i. The 2PL is equivalent to the 3PL model with c_i = 0, and is appropriate for testing items where guessing the correct answer is highly unlikely, such as fill-in-the-blank items ("What is the square root of 121?"), or where the concept of guessing does not apply, such as personality, attitude, or interest items (e.g., "I like Broadway musicals. Agree/Disagree"). The 1PL assumes not only that guessing is not present (or irrelevant), but that all items are equivalent in terms of discrimination, analogous to a common
factor analysis with identical loadings for all items. Individual items or individuals might have secondary factors, but these are assumed to be mutually independent and collectively
orthogonal.
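The nesting of these models can be sketched in a few lines. The following is an illustrative implementation of the four-parameter logistic IRF, which reduces to the 3PL, 2PL, and 1PL under the default parameter values noted in the comments.

```python
import math

def irf_4pl(theta, b, a=1.0, c=0.0, d=1.0):
    """Four-parameter logistic item response function.

    The defaults recover the nested models:
      d = 1                 -> 3PL (lower asymptote c for guessing)
      d = 1, c = 0          -> 2PL (location b and discrimination a)
      d = 1, c = 0, a = 1   -> 1PL (location b only)
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
```

As a check on the parameter roles: at θ = b the 2PL gives a probability of 0.5, while as θ decreases the probability approaches the lower asymptote c, and as θ increases it approaches the upper asymptote d.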
==Logistic and normal IRT models==
An alternative formulation constructs IRFs based on the normal
probability distribution; these are sometimes called
normal ogive models. For example, the formula for a two-parameter normal-ogive IRF is:

p_i(\theta) = \Phi\left(\frac{\theta - b_i}{\sigma_i}\right)

where
Φ is the
cumulative distribution function (CDF) of the standard normal distribution. The normal-ogive model derives from the assumption of normally distributed measurement error and is theoretically appealing on that basis. Here b_i is, again, the difficulty parameter. The discrimination parameter is σ_i, the standard deviation of the measurement error for item i, comparable to 1/a_i. One can estimate a normal-ogive latent trait model by factor-analyzing a matrix of tetrachoric correlations between items. This means it is technically possible to estimate a simple IRT model using general-purpose statistical software. With rescaling of the ability parameter, it is possible to make the 2PL logistic model closely approximate the
cumulative normal ogive. Typically, the 2PL logistic and normal-ogive IRFs differ in probability by no more than 0.01 across the range of the function. The difference is greatest in the distribution tails, however, which tend to have more influence on results. The latent trait/IRT model was originally developed using normal ogives, but this was considered too computationally demanding for the computers at the time (1960s). The logistic model was proposed as a simpler alternative, and has enjoyed wide use since. More recently, however, it was demonstrated that, using standard polynomial approximations to the normal CDF
, the normal-ogive model is no more computationally demanding than logistic models.
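The closeness of the two forms can be checked numerically. With the usual scaling constant D ≈ 1.702 in the logistic exponent, the sketch below compares a 2PL logistic IRF against the corresponding normal ogive (expressed through the error function) over a grid of θ values; the item parameters a = 1, b = 0 are illustrative.

```python
import math

D = 1.702  # scaling constant that best aligns the logistic with the normal ogive

def logistic_2pl(theta, a, b):
    """2PL logistic IRF with the normal-metric scaling constant D."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def normal_ogive_2pl(theta, a, b):
    """2PL normal-ogive IRF; standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(a * (theta - b) / math.sqrt(2.0)))

# Maximum absolute difference over a grid, for an illustrative item (a=1, b=0)
grid = [i / 100.0 for i in range(-400, 401)]
max_diff = max(abs(logistic_2pl(t, 1.0, 0.0) - normal_ogive_2pl(t, 1.0, 0.0))
               for t in grid)
```

Running this confirms the claim in the text: the maximum discrepancy between the two curves stays below 0.01.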
==The Rasch model==
The
Rasch model is often considered to be the 1PL IRT model. However, proponents of Rasch modeling prefer to view it as a completely different approach to conceptualizing the relationship between data and theory. Like other statistical modeling approaches, IRT emphasizes the primacy of the fit of a model to observed data, while the Rasch model emphasizes the primacy of the requirements for fundamental measurement, with adequate data-model fit being an important but secondary requirement to be met before a test or research instrument can be claimed to measure a trait. Operationally, this means that the IRT approaches include additional model parameters to reflect the patterns observed in the data (e.g., allowing items to vary in their correlation with the latent trait), whereas in the Rasch approach, claims regarding the presence of a latent trait can only be considered valid when both (a) the data fit the Rasch model, and (b) test items and examinees conform to the model. Therefore, under Rasch models, misfitting responses require diagnosis of the reason for the misfit, and may be excluded from the
data set if one can explain substantively why they do not address the latent trait. Thus, the Rasch approach can be seen to be a confirmatory approach, as opposed to exploratory approaches that attempt to model the observed data. The presence or absence of a guessing or pseudo-chance parameter is a major and sometimes controversial distinction. The IRT approach includes a left asymptote parameter to account for guessing in
multiple choice examinations, while the Rasch model does not because it is assumed that guessing adds randomly distributed noise to the data. As the noise is randomly distributed, it is assumed that, provided sufficient items are tested, the rank-ordering of persons along the latent trait by raw score will not change, but will simply undergo a linear rescaling. By contrast, three-parameter IRT achieves data-model fit by selecting a model that fits the data, at the expense of sacrificing
specific objectivity. In practice, the Rasch model has at least two principal advantages in comparison to the IRT approach. The first advantage is the primacy of Rasch's specific requirements, which (when met) provide
fundamental person-free measurement (where persons and items can be mapped onto the same invariant scale). Another advantage of the Rasch approach is that estimation of parameters is more straightforward in Rasch models due to the presence of sufficient statistics, which in this application means a one-to-one mapping of raw number-correct scores to Rasch θ estimates.

==Analysis of model fit==