To many people, screening instinctively seems like an appropriate thing to do, because catching something earlier seems better. However, no screening test is perfect. There will always be problems with incorrect results and the other issues listed above. It is an ethical requirement that balanced and accurate information be given to participants at the point when screening is offered, so that they can make a fully informed choice about whether or not to accept. Before a screening program is implemented, it should be evaluated to ensure that putting it in place would do more good than harm. The best studies for assessing whether a screening test will increase a population's health are rigorous
randomized controlled trials. When a screening program is studied using case-control or, more usually, cohort studies, various factors can cause the screening test to appear more successful than it really is. A number of different biases, inherent in the study method, will skew the results.
==Overdiagnosis==
Screening may identify abnormalities that would never cause a problem in a person's lifetime. An example of this is
prostate cancer screening; it has been said that "more men die with prostate cancer than of it". Autopsy studies have shown that between 14 and 77% of elderly men who have died of other causes are found to have had
prostate cancer. Aside from issues with unnecessary treatment (prostate cancer treatment is by no means without risk), overdiagnosis makes a study look good at picking up abnormalities, even though they are sometimes harmless. Overdiagnosis occurs when all of these people with harmless abnormalities are counted as "lives saved" by the screening, rather than as "healthy people needlessly harmed by
overdiagnosis". This can lead to a vicious cycle: the greater the overdiagnosis, the more effective screening appears, which encourages more screening and, in turn, even more overdiagnosis. Raffle, Mackie and Gray call this the popularity paradox of screening: "The greater the harm through overdiagnosis and overtreatment from screening, the more people there are who believe they owe their health, or even their life, to the programme" (p. 56, Box 3.4).

The screening for neuroblastoma, the most common malignant solid tumor in children, in Japan illustrates why a screening program must be evaluated rigorously before it is implemented. In 1981, Japan started a program of screening for neuroblastoma by measuring homovanillic acid and vanillylmandelic acid in urine samples of six-month-old infants. In 2003, a special committee was organized to evaluate the neuroblastoma screening program. The committee concluded that there was sufficient evidence that the screening method used at the time led to overdiagnosis, but not enough evidence that the program reduced deaths from neuroblastoma. The committee therefore recommended against screening, and the Ministry of Health, Labour and Welfare decided to stop the screening program.

Another example of overdiagnosis involves thyroid cancer: its incidence tripled in the United States between 1975 and 2009, while mortality remained constant. In South Korea, the situation was even more extreme, with a 15-fold increase in incidence from 1993 to 2011 (the world's greatest increase in thyroid cancer incidence), while mortality remained stable. The increase in incidence was associated with the introduction of ultrasonography screening. The problem of overdiagnosis in cancer screening is that at the time of diagnosis it is not possible to distinguish a harmless lesion from a lethal one, unless the patient is left untreated and dies from other causes.
So almost all patients tend to be treated, leading to what is called
overtreatment. As researchers Welch and Black put it, "Overdiagnosis—along with the subsequent unneeded treatment with its attendant risks—is arguably the most important harm associated with early cancer detection." In addition, the cases that screening detects tend to have a better prognosis than symptomatic cases. The consequence is that slowly progressive cases are now classified as cancers, which increases the incidence, and because of their better prognosis the survival rates of screened people will appear better than those of non-screened people even if screening makes no difference.
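The survival-rate inflation described above can be shown with simple arithmetic. The numbers below are hypothetical, chosen only for illustration: screening prevents no deaths in this example, yet the reported survival rate improves because overdiagnosed cases enlarge the denominator.

```python
# Hypothetical numbers, for illustration only: how overdiagnosis inflates
# survival statistics even when screening prevents no deaths.
lethal_cases = 1000   # progressive cancers that will cause symptoms
deaths = 800          # deaths among them within five years

# Without screening, only the progressive cancers are diagnosed.
survival_unscreened = (lethal_cases - deaths) / lethal_cases

# Screening additionally finds 1000 harmless lesions that would never have
# caused symptoms; none of those people die, and no death is prevented.
overdiagnosed = 1000
total_screened = lethal_cases + overdiagnosed
survival_screened = (total_screened - deaths) / total_screened

print(f"5-year survival without screening: {survival_unscreened:.0%}")  # 20%
print(f"5-year survival with screening:    {survival_screened:.0%}")    # 60%
# Deaths are identical (800) in both scenarios, yet the survival rate
# triples purely because the denominator grew.
```

This is why a rising survival rate after the introduction of screening is, by itself, not evidence that the screening saves lives.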
==Selection bias==
Not everyone will partake in a screening program. There are factors that differ between those willing to get tested and those who are not. If people with a higher risk of a disease are more likely to be screened, for instance women with a family history of breast cancer are more likely than other women to join a
mammography program, then a screening test will look worse than it really is: negative outcomes among the screened population will be higher than for a random sample. Selection bias may also make a test look better than it really is. If a test is more available to young and healthy people (for instance if people have to travel a long distance to get checked), then fewer people in the screening population will have negative outcomes than in a random sample, and the test will seem to make a positive difference. Studies have shown that people who attend screening tend to be healthier than those who do not. This has been called the healthy screenee effect.
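The healthy screenee effect can be demonstrated with a small simulation. All probabilities below are invented for illustration; screening itself does nothing in this model, yet attendees show markedly lower mortality simply because healthier people are more likely to attend.

```python
import random

random.seed(42)

# Invented probabilities, for illustration only. Screening has no effect
# in this model: the two groups differ solely because healthier people
# are more likely to attend.
N = 100_000
deaths = {"attended": 0, "declined": 0}
counts = {"attended": 0, "declined": 0}

for _ in range(N):
    healthy = random.random() < 0.5
    attends = random.random() < (0.8 if healthy else 0.3)
    dies = random.random() < (0.02 if healthy else 0.10)
    group = "attended" if attends else "declined"
    counts[group] += 1
    deaths[group] += dies

for group in ("attended", "declined"):
    rate = deaths[group] / counts[group]
    print(f"{group}: mortality {rate:.1%}")
# Attendees show roughly half the mortality of decliners (about 4% vs 8%),
# although screening changed nobody's risk.
```

An uncontrolled comparison of attendees against non-attendees would therefore credit the screening program with a benefit it never produced, which is exactly why randomized allocation is needed.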
==Study design for the research of screening programs==
The best way to minimize selection bias is to use a
randomized controlled trial, though
observational, naturalistic, or
retrospective studies can be of some value and are typically easier to conduct. Any study must be sufficiently large (include many patients) and sufficiently long (follow patients for many years) to have the
statistical power to assess the true value of a screening program. For rare diseases, hundreds of thousands of patients may be needed to realize the value of screening (find enough treatable disease), and to assess the effect of the screening program on mortality a study may have to follow the cohort for decades. Such studies take a long time and are expensive, but can provide the most useful data with which to evaluate the screening program and practice
evidence-based medicine.
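The point about trial size can be made concrete with the standard normal-approximation formula for comparing two proportions. The function name and the disease parameters below are hypothetical, chosen only to illustrate the scale involved for a rare outcome.

```python
from math import ceil, sqrt

def sample_size_per_arm(p_control: float, p_screened: float) -> int:
    """Participants needed per arm to detect the difference between two
    event proportions, using the standard normal-approximation formula
    (two-sided alpha = 0.05, power = 0.80)."""
    z_alpha, z_beta = 1.96, 0.8416
    p_bar = (p_control + p_screened) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_control * (1 - p_control)
                                 + p_screened * (1 - p_screened))) ** 2
    return ceil(numerator / (p_control - p_screened) ** 2)

# Hypothetical rare disease: 50 deaths per 100,000 people over the study
# period in the control arm, with screening assumed to cut that by 20%.
n = sample_size_per_arm(0.0005, 0.0004)
print(f"participants needed per arm: {n:,}")
# On the order of several hundred thousand per arm, illustrating why
# screening trials for rare outcomes must be so large and long.
```

Halving the assumed benefit, or making the disease rarer still, drives the required sample size into the millions, which is why mortality results for such programs can take decades to accumulate.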
==All-cause mortality vs disease-specific mortality==
The main outcome of cancer screening studies is usually the number of deaths caused by the disease being screened for; this is called disease-specific mortality. To give an example: in trials of mammography screening for breast cancer, the main outcome reported is often breast cancer mortality. However, disease-specific mortality might be biased in favor of screening. In the example of breast cancer screening, women overdiagnosed with breast cancer might receive radiotherapy, which increases mortality due to lung cancer and heart disease. The problem is that those deaths are often classified as other causes, and their number might even be larger than the number of breast cancer deaths avoided by screening. The non-biased outcome is therefore all-cause mortality. The problem is that much larger trials are needed to detect a significant reduction in all-cause mortality. In 2016, researcher Vinay Prasad and colleagues published an article in
BMJ titled "Why cancer screening has never been shown to save lives", arguing that cancer screening trials had not demonstrated a reduction in all-cause mortality.

==See also==