In clinical practice, post-test probabilities are often just estimated or even guessed. This is usually acceptable when a pathognomonic sign or symptom is found, in which case it is almost certain that the target condition is present, or when a sine qua non sign or symptom is absent, in which case it is almost certain that the target condition is absent. In reality, however, the subjective probability of the presence of a condition is never exactly 0 or 100%, and there are several systematic methods to estimate that probability. Such methods are usually based on having previously performed the test on a reference group in which the presence or absence of the condition is known (or at least estimated by another test that is considered highly accurate, such as a "gold standard"), in order to establish data on test performance. These data are subsequently used to interpret the test result of any individual tested by the method. An alternative or complement to reference group-based methods is comparing a test result to a previous test on the same individual, which is more common in tests for monitoring.

The most important systematic reference group-based methods to estimate post-test probability include the ones summarized and compared in the following table, and further described in individual sections below.
===By predictive values===
Predictive values can be used to estimate the post-test probability of an individual if the pre-test probability of the individual can be assumed roughly equal to the prevalence in a reference group for which both test results and knowledge of the presence or absence of the condition (for example a disease, as may be determined by a "gold standard") are available. If the test result is a binary classification into either positive or negative tests, then the following table can be made:

Pre-test probability can be calculated from the diagram as follows:

Pretest probability = (True positive + False negative) / Total sample

Also, in this case, the positive post-test probability (the probability of having the target condition if the test is positive) is numerically equal to the positive predictive value, and the negative post-test probability (the probability of having the target condition if the test is negative) is numerically complementary to the negative predictive value ([negative post-test probability] = 1 - [negative predictive value]), again assuming that the individual being tested does not have any other risk factors that give that individual a different pre-test probability than the reference group used to establish the positive and negative predictive values of the test.

In the diagram above, this positive post-test probability, that is, the post-test probability of a target condition given a positive test result, is calculated as:

Positive posttest probability = True positives / (True positives + False positives)

Similarly, the post-test probability of disease given a negative result is calculated as:

Negative posttest probability = 1 - (True negatives / (False negatives + True negatives))

The validity of the equations above also depends on the sample from the population not having substantial sampling bias that makes the groups of those who have the condition and those who do not substantially disproportionate to the corresponding prevalence and "non-prevalence" in the population. In effect, the equations above are not valid with merely a case-control study that separately collects one group with the condition and one group without it.
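The three equations of this section can be sketched in code as follows. This is only an illustration: the class name `TwoByTwo`, its method names and the counts in the example are hypothetical and do not come from any reference group discussed here. The sketch also assumes, as the text requires, that the counts come from a sample without substantial sampling bias.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Reference-group counts, cross-classified by test result and true status."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def pretest_probability(self) -> float:
        # (True positive + False negative) / Total sample
        return (self.true_pos + self.false_neg) / self.total

    def positive_posttest_probability(self) -> float:
        # Numerically equal to the positive predictive value.
        return self.true_pos / (self.true_pos + self.false_pos)

    def negative_predictive_value(self) -> float:
        return self.true_neg / (self.false_neg + self.true_neg)

    def negative_posttest_probability(self) -> float:
        # Complement of the negative predictive value.
        return 1 - self.negative_predictive_value()


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
table = TwoByTwo(true_pos=20, false_pos=80, false_neg=10, true_neg=890)
print(f"pre-test probability:           {table.pretest_probability():.3f}")
print(f"positive post-test probability: {table.positive_posttest_probability():.3f}")
print(f"negative post-test probability: {table.negative_posttest_probability():.3f}")
```

With these made-up counts, a positive result raises the probability from the pre-test value to the positive predictive value, while a negative result lowers it to the complement of the negative predictive value.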
===By likelihood ratio===
The above methods are inappropriate if the pre-test probability differs from the prevalence in the reference group used to establish, among others, the positive predictive value of the test. Such a difference can occur if another test preceded, or if the person involved in the diagnostics considers that another pre-test probability must be used because of knowledge of, for example, specific complaints, other elements of a medical history, or signs in a physical examination. The adjusted pre-test probability can be obtained either by treating each finding as a test in itself with its own sensitivity and specificity, or at least by making a rough estimation of the individual pre-test probability. In these cases, the prevalence in the reference group is not completely accurate in representing the pre-test probability of the individual, and, consequently, the predictive value (whether positive or negative) is not completely accurate in representing the post-test probability of the individual having the target condition. In these cases, a post-test probability can be estimated more accurately by using a likelihood ratio for the test.
The likelihood ratio is calculated from the sensitivity and specificity of the test, and therefore does not depend on the prevalence in the reference group. Estimation of the post-test probability from the pre-test probability and the likelihood ratio proceeds as follows:
• Pretest odds = Pretest probability / (1 - Pretest probability)
• Posttest odds = Pretest odds * Likelihood ratio
In the equation above, the positive post-test probability is calculated using the positive likelihood ratio, and the negative post-test probability is calculated using the negative likelihood ratio.
• Posttest probability = Posttest odds / (Posttest odds + 1)

The relation can also be estimated with a so-called Fagan nomogram (shown at right) by drawing a straight line from the given pre-test probability through the given likelihood ratio on their respective scales; the post-test probability is then read off at the point where that line crosses its own scale. The post-test probability can, in turn, be used as the pre-test probability for additional tests if it continues to be calculated in the same manner.
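The odds-based calculation can be sketched as below. The function names are hypothetical, and the sensitivity, specificity and pre-test probability in the example are made up for illustration. The text does not spell out how the likelihood ratios are obtained, so the sketch assumes the standard definitions: sensitivity / (1 - specificity) for the positive ratio, and (1 - sensitivity) / specificity for the negative ratio.

```python
def probability_to_odds(probability: float) -> float:
    """Pretest odds = Pretest probability / (1 - Pretest probability)."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in [0, 1)")
    return probability / (1 - probability)


def odds_to_probability(odds: float) -> float:
    """Posttest probability = Posttest odds / (Posttest odds + 1)."""
    return odds / (odds + 1)


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive, negative) likelihood ratios (standard definitions, assumed here)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Posttest odds = Pretest odds * Likelihood ratio, converted back to a probability."""
    posttest_odds = probability_to_odds(pretest_probability) * likelihood_ratio
    return odds_to_probability(posttest_odds)


# Hypothetical test characteristics and pre-test probability.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.9, specificity=0.8)
after_positive = posttest_probability(0.10, lr_pos)
after_negative = posttest_probability(0.10, lr_neg)
print(f"after a positive result: {after_positive:.3f}")
print(f"after a negative result: {after_negative:.3f}")

# Using the post-test probability as the pre-test probability of a further test.
after_two_positives = posttest_probability(after_positive, lr_pos)
print(f"after two positive results: {after_two_positives:.3f}")
```

Chaining two tests this way is equivalent to multiplying the pre-test odds by both likelihood ratios, which is only meaningful to the extent that the tests are independent of each other.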
====Example====
An individual was screened with a fecal occult blood (FOB) test to estimate the probability of that person having the target condition of bowel cancer, and the test was positive (blood was detected in the stool). Before the test, that individual had a pre-test probability of having bowel cancer of, for example, 3% (0.03), as could have been estimated from, for example, the medical history, examination and previous tests of that individual. The sensitivity, specificity etc. of the FOB test were established with a population sample of 203 people (without such heredity), and were as follows: From this, the
likelihood ratios of the test can be established: in effect decreasing the sensitivities and specificities of such subsequent tests. On the other hand, the effect of interference can potentially improve the efficacy of subsequent tests as compared to usage in the reference group, such as some abdominal examinations being easier when performed on underweight people.
====Overlap of tests====
Furthermore, the validity of calculations based on any pre-test probability that is itself derived from a previous test depends on the two tests not overlapping significantly in regard to the target parameter being tested, as would be the case for blood tests of substances belonging to one and the same deranged metabolic pathway. An extreme example of such an overlap is where the sensitivity and specificity have been established for a blood test detecting "substance X", and likewise for one detecting "substance Y". If, in fact, "substance X" and "substance Y" are one and the same substance, then performing two consecutive tests of one and the same substance may not have any diagnostic value at all, although the calculation appears to show a difference. In contrast to interference as described above, increasing overlap of tests only decreases their efficacy. In the medical setting, diagnostic validity is increased by combining tests of different modalities to avoid substantial overlap, for example by combining a blood test, a biopsy and a radiograph.
====Methods to overcome inaccuracy====
To avoid such sources of inaccuracy when using likelihood ratios, the optimal method would be to gather a large reference group of equivalent individuals, in order to establish separate predictive values for use of the test in such individuals. However, with more knowledge of an individual's medical history, physical examination, previous tests, etc., that individual becomes more differentiated, and it becomes increasingly difficult to find a reference group from which to establish tailored predictive values, making an estimation of post-test probability by predictive values invalid. Another method to overcome such inaccuracies is to evaluate the test result in the context of diagnostic criteria, as described in the next section.
===By relative risk===
Post-test probability can sometimes be estimated by multiplying the pre-test probability by a relative risk given by the test. In clinical practice, this is usually applied in the evaluation of the medical history of an individual, where the "test" is usually a question (or even an assumption) regarding various risk factors, for example sex, tobacco smoking or weight, but it can potentially be a substantial test such as putting the individual on a weighing scale. When using relative risks, the resultant probability usually relates to the individual developing the condition over a period of time (similarly to the incidence in a population), rather than to the probability of the individual having the condition at present, but it can indirectly be an estimation of the latter. A hazard ratio can be used somewhat similarly to a relative risk.
====One risk factor====
To establish a relative risk, the risk in an exposed group is divided by the risk in an unexposed group. If only one risk factor of an individual is taken into account, the post-test probability can be estimated by multiplying the relative risk by the risk in the control group. The control group usually represents the unexposed population, but if a very low fraction of the population is exposed, then the prevalence in the general population can often be assumed equal to the prevalence in the control group. In such cases, the post-test probability can be estimated by multiplying the relative risk by the risk in the general population.

For example, the incidence of breast cancer in a woman in the United Kingdom at age 55 to 59 is estimated at 280 cases per 100,000 per year, and the risk factor of having been exposed to high-dose ionizing radiation to the chest (for example, as treatment for other cancers) confers a relative risk of breast cancer between 2.1 and 4.0, compared to unexposed. Because a low fraction of the population is exposed, the prevalence in the unexposed population can be assumed equal to the prevalence in the general population. Consequently, it can be estimated that a woman in the United Kingdom who is aged between 55 and 59 and who has been exposed to high-dose ionizing radiation should have a risk of developing breast cancer over a period of one year of between 588 and 1,120 in 100,000 (that is, between 0.6% and 1.1%).
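The arithmetic of this example can be reproduced with a short sketch. The function name `risk_with_factor` is hypothetical; the numbers are those given in the text, and the sketch relies on the same assumption that the risk in the general population approximates the risk in the unexposed group.

```python
def risk_with_factor(baseline_risk: float, relative_risk: float) -> float:
    """Estimate the risk for an exposed individual from a single relative risk."""
    return baseline_risk * relative_risk


# Figures from the text: 280 cases per 100,000 per year, relative risk 2.1 to 4.0.
baseline = 280 / 100_000
low = risk_with_factor(baseline, 2.1)
high = risk_with_factor(baseline, 4.0)

print(f"low estimate:  {low * 100_000:.0f} per 100,000 ({low:.1%})")
print(f"high estimate: {high * 100_000:.0f} per 100,000 ({high:.1%})")
```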
====Multiple risk factors====
Theoretically, the total risk in the presence of multiple risk factors can be estimated by multiplying by each relative risk, but this is generally much less accurate than using likelihood ratios, and is usually done only because it is much easier to perform when only relative risks are given, compared to, for example, converting the source data to sensitivities and specificities and calculating by likelihood ratios. Likewise, relative risks are often given instead of likelihood ratios in the literature because the former are more intuitive. Sources of inaccuracy of multiplying relative risks include:
• Relative risks are affected by the prevalence of the condition in the reference group (in contrast to likelihood ratios, which are not), so post-test probabilities become less valid with increasing difference between the prevalence in the reference group and the pre-test probability for any individual. Any known risk factor or previous test of an individual almost always confers such a difference, decreasing the validity of using relative risks in estimating the total effect of multiple risk factors or tests. Most physicians do not appropriately take such differences in prevalence into account when interpreting test results, which may cause unnecessary testing and diagnostic errors.
• A separate source of inaccuracy of multiplying several relative risks, considering only positive tests, is that it tends to overestimate the total risk as compared to using likelihood ratios. This overestimation can be explained by the inability of the method to compensate for the fact that the total risk cannot be more than 100%. The overestimation is rather small for small risks, but becomes larger for higher values. For example, the risk of developing breast cancer at an age younger than 40 years in women in the United Kingdom can be estimated at 2%. Also, studies on Ashkenazi Jews have indicated that a mutation in BRCA1 confers a relative risk of 21.6 of developing breast cancer in women under 40 years of age, and a mutation in BRCA2 confers a relative risk of 3.3 of developing breast cancer in women under 40 years of age. From these data, it may be estimated that a woman with a BRCA1 mutation would have a risk of approximately 40% of developing breast cancer at an age younger than 40 years, and a woman with a BRCA2 mutation would have a risk of approximately 6%. However, in the rather improbable situation of having both a BRCA1 and a BRCA2 mutation, simply multiplying by both relative risks would result in a risk of over 140% of developing breast cancer before 40 years of age, which cannot possibly be accurate in reality.

The latter effect, overestimation, can be compensated for by converting risks to odds, and relative risks to odds ratios. However, this does not compensate for the former effect, that of any difference between the pre-test probability of an individual and the prevalence in the reference group.

A method to compensate for both sources of inaccuracy above is to establish the relative risks by multivariate regression analysis. However, to retain their validity, relative risks established in this way must be multiplied with those of all the other risk factors in the same regression analysis, without adding any factors from outside that analysis.

In addition, multiplying multiple relative risks carries the same risk of missing important overlaps between the included risk factors as when using likelihood ratios. Also, different risk factors can act in synergy, with the result that, for example, two factors that each individually have a relative risk of 2 have a total relative risk of 6 when both are present, or they can inhibit each other, somewhat similarly to the interference described for using likelihood ratios.
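The overestimation in the BRCA example, and the effect of working in odds instead, can be illustrated with the sketch below. The function names are hypothetical. The odds-based variant is only one possible reading of "converting risks to odds, and relative risks to odds ratios": it derives each odds ratio from the stated baseline risk and relative risk, and it assumes the factors act independently. It does not address the separate problem of a pre-test probability that differs from the prevalence in the reference group.

```python
def naive_combined_risk(baseline_risk: float, relative_risks: list[float]) -> float:
    """Multiply the baseline risk by every relative risk (can exceed 1)."""
    risk = baseline_risk
    for relative_risk in relative_risks:
        risk *= relative_risk
    return risk


def relative_risk_to_odds_ratio(baseline_risk: float, relative_risk: float) -> float:
    """Convert a relative risk to an odds ratio, given the risk in the unexposed group."""
    exposed_risk = baseline_risk * relative_risk
    if not 0 < exposed_risk < 1:
        raise ValueError("exposed risk must lie strictly between 0 and 1")
    exposed_odds = exposed_risk / (1 - exposed_risk)
    baseline_odds = baseline_risk / (1 - baseline_risk)
    return exposed_odds / baseline_odds


def odds_combined_risk(baseline_risk: float, relative_risks: list[float]) -> float:
    """Combine the factors as odds ratios, then convert back to a probability."""
    odds = baseline_risk / (1 - baseline_risk)
    for relative_risk in relative_risks:
        odds *= relative_risk_to_odds_ratio(baseline_risk, relative_risk)
    return odds / (odds + 1)


# Figures from the text: 2% baseline risk, relative risks 21.6 (BRCA1) and 3.3 (BRCA2).
baseline = 0.02
print(f"BRCA1 only:         {naive_combined_risk(baseline, [21.6]):.1%}")
print(f"BRCA2 only:         {naive_combined_risk(baseline, [3.3]):.1%}")
print(f"both, naive:        {naive_combined_risk(baseline, [21.6, 3.3]):.1%}")
print(f"both, via odds:     {odds_combined_risk(baseline, [21.6, 3.3]):.1%}")
```

The naive product exceeds 100% for the two mutations together, whereas the odds-based estimate necessarily stays below 100% because any finite odds value maps back to a probability under 1.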
===By diagnostic criteria and clinical prediction rules===
Most major diseases have established diagnostic criteria and/or clinical prediction rules. The establishment of diagnostic criteria or clinical prediction rules consists of a comprehensive evaluation of many tests that are considered important in estimating the probability of a condition of interest, sometimes also including how to divide it into subgroups, and when and how to treat the condition. Such establishment can include the use of predictive values and likelihood ratios as well as relative risks.

For example, the ACR criteria for systemic lupus erythematosus define the diagnosis as the presence of at least 4 out of 11 findings, each of which can be regarded as a target value of a test with its own sensitivity and specificity. In this case, the tests for these target parameters have been evaluated when used in combination in regard to, for example, interference between them and overlap of target parameters, thereby striving to avoid inaccuracies that could otherwise arise if attempting to calculate the probability of the disease using likelihood ratios of the individual tests. Therefore, if diagnostic criteria have been established for a condition, it is generally most appropriate to interpret any post-test probability for that condition in the context of these criteria.

Also, there are risk assessment tools for estimating the combined risk of several risk factors, such as the online tool from the Framingham Heart Study for estimating the risk of coronary heart disease outcomes using multiple risk factors, including age, gender, blood lipids, blood pressure and smoking, which is much more accurate than multiplying the individual relative risks of each risk factor.

Still, an experienced physician may estimate the post-test probability (and the actions it motivates) by a broad consideration that includes criteria and rules in addition to the other methods described previously, including both individual risk factors and the performance of tests that have been carried out.

==Clinical use of pre- and post-test probabilities==