In real data, it sometimes happens that an observation X_i in the sample equals zero or, in the paired sample case, that a pair (X_i, Y_i) has X_i = Y_i. It can also happen that there are tied observations, meaning that for some i \neq j we have X_i = X_j (in the one-sample case) or X_i - Y_i = X_j - Y_j (in the paired sample case). Zeros and ties are particularly common for discrete data. When they occur, the test procedure defined above is usually undefined because there is no way to rank the data uniquely. (The sole exception is a single observation X_i equal to zero with no other zeros or ties.) Because of this, the test statistic needs to be modified.
===Zeros===
Wilcoxon's original paper did not address the question of observations (or, in the paired sample case, differences) that equal zero. However, in later surveys, he recommended removing zeros from the sample. Then the standard signed-rank test could be applied to the resulting data, as long as there were no ties. This is now called the
reduced sample procedure. Pratt observed that the reduced sample procedure can lead to paradoxical behavior. He gives the following example. Suppose that we are in the one-sample situation and have the following thirteen observations:
:0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, −18.
The reduced sample procedure removes the zero. To the remaining data, it assigns the signed ranks:
:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, −12.
This has a one-sided
p-value of 55/2^{12}, and therefore the sample is not significantly positive at any significance level \alpha < 55/2^{12} \approx 0.0134. Pratt argues that one would expect that decreasing the observations should certainly not make the data appear more positive. However, if the zero observation is decreased by an amount less than 2, or if all observations are decreased by an amount less than 1, then the signed ranks become:
:−1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, −13.
This has a one-sided
p-value of 109/2^{13}. Therefore the sample would be judged significantly positive at any significance level \alpha > 109/2^{13} \approx 0.0133. The paradox is that, if \alpha is between 109/2^{13} and 55/2^{12}, then
decreasing an insignificant sample causes it to appear significantly positive.
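The ranking step in Pratt's example is easy to check mechanically. The sketch below (the helper name signed_ranks is ours, and the zero is decreased to −1 purely for illustration) reproduces the two lists of signed ranks given above.

```python
import numpy as np
from scipy.stats import rankdata

def signed_ranks(x):
    """Rank the observations by absolute value and attach each observation's sign."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * rankdata(np.abs(x))  # no ties here, so the ranks are 1..n

original = [0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, -18]

# Reduced sample procedure: discard the zero, then rank what is left.
reduced = [v for v in original if v != 0]
print(signed_ranks(reduced))    # [  1.   2. ...  11. -12.]

# Decrease the zero by an amount less than 2 (here it becomes -1) and rank again.
decreased = [-1] + reduced
print(signed_ranks(decreased))  # [ -1.   2.   3. ...  12. -13.]
```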
Pratt therefore proposed the signed-rank zero procedure. This procedure includes the zeros when ranking the observations in the sample. However, it excludes them from the test statistic, or equivalently it defines \sgn(0) = 0. Pratt proved that the signed-rank zero procedure has several desirable behaviors not shared by the reduced sample procedure:
• Increasing the observed values does not make a significantly positive sample insignificant, and it does not make an insignificant sample significantly negative.
• If the distribution of the observations is symmetric, then the values of \mu which the test does not reject form an interval.
• A sample is significantly positive, not significant, or significantly negative, if and only if it is so when the zeros are assigned arbitrary non-zero signs, if and only if it is so when the zeros are replaced with non-zero values which are smaller in absolute value than any non-zero observation.
• For a fixed significance threshold \alpha, and for a test which is randomized to have level exactly \alpha, the probability of calling a set of observations significantly positive (respectively, significantly negative) is a non-decreasing (respectively, non-increasing) function of the observations.
Pratt remarks that, when the signed-rank zero procedure is combined with the average rank procedure for resolving ties, the resulting test is a consistent test against the alternative hypothesis that, for all i \neq j, \Pr(X_i + X_j > 0) and \Pr(X_i + X_j < 0) differ by at least a fixed constant that is independent of i and j. The signed-rank zero procedure has the disadvantage that, when zeros occur, the null distribution of the test statistic changes, so tables of p-values can no longer be used.
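As a minimal sketch of the difference between the two conventions, the function below (the name signed_rank_sum and the sample values are ours, for illustration only) computes the signed-rank sum T = \sum_i \sgn(X_i) R_i under either rule. scipy.stats.wilcoxon exposes the same choice through its zero_method argument ('wilcox' for the reduced sample procedure, 'pratt' for the signed-rank zero procedure), although it reports a rank-sum form of the statistic rather than T.

```python
import numpy as np
from scipy.stats import rankdata

def signed_rank_sum(x, zeros="pratt"):
    """Signed-rank sum T = sum_i sgn(X_i) * R_i under two zero-handling rules.

    zeros="wilcox": reduced sample procedure -- drop the zeros, rank the rest.
    zeros="pratt":  signed-rank zero procedure -- rank with the zeros included,
                    but sgn(0) = 0 removes them from the sum.
    """
    x = np.asarray(x, dtype=float)
    if zeros == "wilcox":
        x = x[x != 0]
    ranks = rankdata(np.abs(x))  # average ranks; also handles ties if any occur
    return float(np.sum(np.sign(x) * ranks))

# Hypothetical sample containing one zero, for illustration only.
sample = [0, 1.2, -0.4, 2.3, 3.1, -1.7, 0.8]
print(signed_rank_sum(sample, zeros="wilcox"))  # ranks computed on the 6 nonzero values
print(signed_rank_sum(sample, zeros="pratt"))   # ranks computed on all 7 values
```

Note that under the signed-rank zero procedure the zeros still shift the ranks of all other observations, which is why the null distribution of the statistic changes when zeros are present, as described above.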
When the data is on a Likert scale with equally spaced categories, the signed-rank zero procedure is more likely to maintain the Type I error rate than the reduced sample procedure. From the viewpoint of statistical efficiency, there is no perfect rule for handling zeros. Conover found examples of null and alternative hypotheses showing that neither Wilcoxon's nor Pratt's method is uniformly better than the other. When comparing a discrete uniform distribution to a distribution whose probabilities increase linearly from left to right, Pratt's method outperforms Wilcoxon's. When testing a binomial distribution centered at zero to see whether the parameter of each Bernoulli trial is \tfrac12, Wilcoxon's method outperforms Pratt's.
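The Type I error comparison can be probed with a small Monte Carlo sketch. Everything below is illustrative: the five equally spaced categories, their probabilities, the sample size, and the number of simulations are hypothetical choices, and scipy.stats.wilcoxon may fall back to a normal approximation when zeros and ties are present.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
categories = np.array([-2, -1, 0, 1, 2])   # symmetric, equally spaced categories
probs = np.full(5, 0.2)                    # null hypothesis: symmetric about zero
alpha, n, n_sims = 0.05, 30, 2000

rejections = {"wilcox": 0, "pratt": 0}
for _ in range(n_sims):
    x = rng.choice(categories, size=n, p=probs)
    if np.all(x == 0):
        continue  # degenerate sample; the test cannot be applied
    for method in rejections:
        # scipy may switch to a normal approximation because of the zeros and ties
        p = wilcoxon(x, zero_method=method, alternative="greater").pvalue
        if p <= alpha:
            rejections[method] += 1

for method, count in rejections.items():
    print(method, count / n_sims)  # empirical Type I error rate at nominal alpha
```

Under the symmetric null distribution simulated here, both rejection rates should ideally stay near the nominal level; the claim above is that the reduced sample procedure ('wilcox') is more likely to drift from it.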
===Ties===
When the data does not have ties, the ranks R_i are used to calculate the test statistic. In the presence of ties, the ranks are not uniquely defined. There are two main approaches to resolving this. The most common procedure for handling ties, and the one originally recommended by Wilcoxon, is called the
average rank or
midrank procedure. This procedure assigns numbers between 1 and
n to the observations, with two observations getting the same number if and only if they have the same absolute value. These numbers are conventionally called ranks even though the set of these numbers is not equal to \{1, \dots, n\} (except when there are no ties). The rank assigned to an observation is the average of the possible ranks it would have if the ties were broken in all possible ways. Once the ranks are assigned, the test statistic is computed in the same way as usual.

For example, suppose that the observations satisfy
:|X_3| < |X_2| = |X_5| < |X_6| < |X_1| = |X_4| = |X_7|.
In this case, X_3 is assigned rank 1, X_2 and X_5 are assigned rank (2 + 3) / 2 = 2.5, X_6 is assigned rank 4, and X_1, X_4, and X_7 are assigned rank (5 + 6 + 7) / 3 = 6. Formally, suppose that there is a set of observations all having the same absolute value v, that k - 1 observations have absolute value less than v, and that \ell observations have absolute value less than or equal to v. If the ties among the observations with absolute value v were broken, then these observations would occupy ranks k through \ell. The average rank procedure therefore assigns them the rank (k + \ell) / 2.
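These midranks can be reproduced with scipy.stats.rankdata using method="average". The numeric values below are hypothetical, chosen only so that their absolute values follow the ordering in the example.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical observations whose absolute values satisfy
# |X_3| < |X_2| = |X_5| < |X_6| < |X_1| = |X_4| = |X_7|.
x = np.array([5.0, 2.0, 1.0, -5.0, -2.0, 3.0, 5.0])

ranks = rankdata(np.abs(x), method="average")
print(ranks)                   # [6.  2.5 1.  6.  2.5 4.  6. ]

# The test statistic is then computed as usual from the signed midranks.
T = np.sum(np.sign(x) * ranks)
print(T)                       # 6 + 2.5 + 1 - 6 - 2.5 + 4 + 6 = 11.0
```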
Under the average rank procedure, the null distribution is different in the presence of ties. The average rank procedure also has some disadvantages that are similar to those of the reduced sample procedure for zeros. It is possible that a sample is judged significantly positive by the average rank procedure, but that increasing some of the values so as to break the ties, or breaking the ties in any way whatsoever, results in a sample that the test judges to be not significant. However, increasing all the observed values by the same amount cannot turn a significantly positive result into an insignificant one, nor an insignificant one into a significantly negative one. Furthermore, if the observations are distributed symmetrically, then the values of \mu which the test does not reject form an interval.

The other common option for handling ties is a tiebreaking procedure. In a tiebreaking procedure, the observations are assigned distinct ranks in the set \{1, \dots, n\}. The rank assigned to an observation depends on its absolute value and the tiebreaking rule. Observations with smaller absolute values are always given smaller ranks, just as in the standard signed-rank test. The tiebreaking rule is used to assign ranks to observations with the same absolute value. One advantage of tiebreaking rules is that they allow the use of standard tables for computing p-values.
Random tiebreaking breaks the ties at random. Under random tiebreaking, the null distribution is the same as when there are no ties, but the result of the test depends not only on the data but on additional random choices. Averaging the ranks over the possible random choices results in the average rank procedure. Random tiebreaking has the advantage that the probability that a sample is judged significantly positive does not decrease when some observations are increased.
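As a sketch of how random tiebreaking relates to the average rank procedure, the function below (random_tiebreak_ranks is our name; the data are the same hypothetical values as above) breaks each tie with an independent random ordering; averaging the resulting ranks over many draws approaches the midranks.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([5.0, 2.0, 1.0, -5.0, -2.0, 3.0, 5.0])  # same hypothetical data as above

def random_tiebreak_ranks(x, rng):
    """Distinct ranks 1..n by absolute value, with ties broken uniformly at random."""
    key = rng.random(len(x))              # random secondary key
    order = np.lexsort((key, np.abs(x)))  # primary sort: |x|; ties resolved by the key
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    return ranks

# Averaging over many random tiebreaks approaches the midranks 6, 2.5, 1, 6, 2.5, 4, 6.
avg = np.mean([random_tiebreak_ranks(x, rng) for _ in range(10000)], axis=0)
print(np.round(avg, 2))
```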
Conservative tiebreaking breaks the ties in favor of the null hypothesis. When performing a one-sided test in which negative values of T tend to be more significant, ties are broken by assigning lower ranks to negative observations and higher ranks to positive ones. When the test makes positive values of T significant, ties are broken the other way, and when large absolute values of T are significant, ties are broken so as to make |T| as small as possible. Pratt observes that when ties are likely, the conservative tiebreaking procedure "presumably has low power, since it amounts to breaking all ties in favor of the null hypothesis." The average rank procedure can disagree with tiebreaking procedures; Pratt gives an example in which they do.

==Computing the null distribution==