
Somers' D

In statistics, Somers’ D, sometimes incorrectly referred to as Somer’s D, is a measure of ordinal association between two possibly dependent random variables X and Y. Somers’ D takes values between -1, when all pairs of the variables disagree, and 1, when all pairs of the variables agree. Somers’ D is named after Robert H. Somers, who proposed it in 1962.

Somers’ D for sample
Two pairs (x_i,y_i) and (x_j,y_j) are concordant if the ranks of both elements agree, that is, if x_i>x_j and y_i>y_j, or if x_i<x_j and y_i<y_j. They are discordant if the ranks of both elements disagree, that is, if x_i>x_j and y_i<y_j, or if x_i<x_j and y_i>y_j. If x_i=x_j or y_i=y_j, the pair is neither concordant nor discordant.

Let (x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n) be a set of observations of two possibly dependent random variables X and Y. Define the Kendall tau rank correlation coefficient \tau as
: \tau=\frac{N_C-N_D}{n(n-1)/2},
where N_C is the number of concordant pairs and N_D is the number of discordant pairs. Somers’ D of Y with respect to X is defined as D_{YX}=\tau(X,Y)/\tau(X,X). Note that Kendall's tau is symmetric in X and Y, whereas Somers’ D is asymmetric in X and Y.

Because \tau(X,X) is the fraction of pairs with unequal X values, Somers’ D is the difference between the number of concordant and discordant pairs, divided by the number of pairs whose X values are unequal.
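The sample definition can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `somers_d` is hypothetical, it compares every pair directly (quadratic time), and it assumes numeric observations with at least two distinct values of the first variable.

```python
from itertools import combinations


def somers_d(x, y):
    """Return (N_C, N_D, tau, D_YX) for paired numeric observations.

    Hypothetical helper for illustration; compares all n(n-1)/2 pairs.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    n_c = n_d = n_x_unequal = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi != xj:
            n_x_unequal += 1  # pair counts toward tau(X, X)
        s = (xi - xj) * (yi - yj)
        if s > 0:
            n_c += 1  # ranks agree: concordant
        elif s < 0:
            n_d += 1  # ranks disagree: discordant
        # s == 0: tied on x or y, neither concordant nor discordant
    if n_x_unequal == 0:
        raise ValueError("all x values are equal; D_YX is undefined")
    n_pairs = n * (n - 1) // 2
    tau = (n_c - n_d) / n_pairs        # Kendall's tau as defined above
    d_yx = (n_c - n_d) / n_x_unequal   # equals tau(X, Y) / tau(X, X)
    return n_c, n_d, tau, d_yx
```

For instance, `somers_d([1, 1, 2, 3], [1, 2, 2, 3])` finds 4 concordant and 0 discordant pairs among 6 pairs, of which 5 have unequal x values, so tau is 4/6 and D_YX is 4/5.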
Somers’ D for distribution
Let two independent bivariate random variables (X_1, Y_1) and (X_2, Y_2) have the same probability distribution \operatorname{P}_{XY}. Again, Somers’ D, which measures the ordinal association of the random variables X and Y in \operatorname{P}_{XY}, can be defined through Kendall's tau
: \begin{align} \tau(X,Y) &= \operatorname{E}\Bigl(\operatorname{sgn}(X_1-X_2)\operatorname{sgn}(Y_1-Y_2)\Bigr) \\ &= \operatorname{P}\Bigl(\operatorname{sgn}(X_1-X_2)\operatorname{sgn}(Y_1-Y_2)=1\Bigr) - \operatorname{P}\Bigl(\operatorname{sgn}(X_1-X_2)\operatorname{sgn}(Y_1-Y_2)=-1\Bigr), \end{align}
or the difference between the probabilities of concordance and discordance. Somers’ D of Y with respect to X is defined as D_{YX}=\tau(X,Y)/\tau(X,X). Thus, D_{YX} is the difference between the two corresponding probabilities, conditional on the X values not being equal. If X has a continuous probability distribution, then \tau(X,X)=1 and Kendall's tau and Somers’ D coincide. Somers’ D normalizes Kendall's tau for possible mass points of the variable X.

If X and Y are both binary with values 0 and 1, then Somers’ D is the difference between two probabilities:
: D_{YX}=\operatorname{P}(Y=1 \mid X=1)-\operatorname{P}(Y=1\mid X=0).
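The binary identity can be checked numerically for a discrete distribution. The sketch below is illustrative only: the helper names and the joint probabilities are made up for the example, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(pmf, first, second):
    """E[sgn(A1 - A2) * sgn(B1 - B2)] for two independent draws from pmf.

    pmf maps support points (x, y) to probabilities; `first` and `second`
    pick the coordinate playing the role of A and B (hypothetical helper).
    """
    total = Fraction(0)
    for (p1, w1), (p2, w2) in product(pmf.items(), repeat=2):
        total += (w1 * w2
                  * sign(first(p1) - first(p2))
                  * sign(second(p1) - second(p2)))
    return total


def get_x(point):
    return point[0]


def get_y(point):
    return point[1]


# Made-up joint distribution of binary X and Y, keyed by (x, y).
pmf = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}

# Somers' D of Y with respect to X: tau(X, Y) / tau(X, X).
d_yx = kendall_tau(pmf, get_x, get_y) / kendall_tau(pmf, get_x, get_x)

# Difference of conditional probabilities P(Y=1 | X=1) - P(Y=1 | X=0).
p_y1_given_x1 = pmf[(1, 1)] / (pmf[(1, 0)] + pmf[(1, 1)])
p_y1_given_x0 = pmf[(0, 1)] / (pmf[(0, 0)] + pmf[(0, 1)])
difference = p_y1_given_x1 - p_y1_given_x0
```

With these probabilities, both `d_yx` and `difference` equal 5/12.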
Somers' D for binary dependent variables
In practice, Somers' D is most often used when the dependent variable Y is a binary variable. Identical to the Gini coefficient, Somers’ D is related to the area under the receiver operating characteristic curve (AUC) by
: \mathrm{AUC}=\frac{D_{XY}+1}{2}.
In the case where the independent (predictor) variable X is discrete and the dependent (outcome) variable Y is binary, Somers’ D equals
: D_{XY}=\frac{N_C-N_D}{N_C+N_D+N_T},
where N_T is the number of pairs that are neither concordant nor discordant because they are tied on variable X but not on variable Y.

Example

Suppose that the independent (predictor) variable X takes three values, x_1<x_2<x_3, and the dependent (outcome) variable Y takes two values, 0 or 1. The table below contains the observed frequencies of the combinations of X and Y:

Y \ X    x_1    x_2    x_3
0          3      5      2
1          1      7      6

The number of concordant pairs equals
: N_C = 3 \times 7 + 3 \times 6 + 5 \times 6 = 69.
The number of discordant pairs equals
: N_D = 1 \times 5 + 1 \times 2 + 7 \times 2 = 21.
The number of tied pairs equals the number of pairs with different Y values minus the concordant and discordant pairs:
: N_T = (3+5+2) \times (1+7+6) - 69 - 21 = 50.
Thus, Somers’ D equals
: D_{XY} = \frac{69-21}{69+21+50} \approx 0.34.
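The arithmetic of the example can be reproduced with a short Python sketch. The function name `binary_somers_d` is hypothetical, and the two count lists are the frequencies implied by the products in the example: one list per value of Y, ordered by increasing X.

```python
def binary_somers_d(counts_y0, counts_y1):
    """Return (N_C, N_D, N_T, D_XY, AUC) from frequencies of a binary Y.

    counts_y0[k] and counts_y1[k] are the numbers of observations with
    Y = 0 and Y = 1 at the k-th smallest value of X (illustrative helper).
    """
    n_c = n_d = 0
    for i, a in enumerate(counts_y0):
        for j, b in enumerate(counts_y1):
            if i < j:
                n_c += a * b  # the Y = 1 observation has the larger X
            elif i > j:
                n_d += a * b  # the Y = 1 observation has the smaller X
    # Pairs with different Y that are tied on X.
    n_t = sum(counts_y0) * sum(counts_y1) - n_c - n_d
    d_xy = (n_c - n_d) / (n_c + n_d + n_t)
    auc = (d_xy + 1) / 2
    return n_c, n_d, n_t, d_xy, auc


n_c, n_d, n_t, d_xy, auc = binary_somers_d([3, 5, 2], [1, 7, 6])
print(n_c, n_d, n_t, round(d_xy, 2), round(auc, 2))
```

This yields N_C = 69, N_D = 21 and N_T = 50, so D_XY = 48/140, about 0.34, and the corresponding AUC is about 0.67.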