The following statistical tests are used to determine the type of trend:
• significance of the breakpoint (BP), by expressing BP as a function of the regression coefficients A1 and A2, the means Y1 and Y2 of the y data, and the means X1 and X2 of the x data (left and right of BP), using the laws of propagation of errors in additions and multiplications to compute the standard error (SE) of BP, and applying Student's t-test
• significance of A1 and A2, applying Student's t-distribution and the standard error (SE) of A1 and A2
• significance of the difference of A1 and A2, applying Student's t-distribution using the SE of their difference
• significance of the difference of Y1 and Y2, applying Student's t-distribution using the SE of their difference
• a more formal statistical approach to test for the existence of a breakpoint is the pseudo score test, which does not require estimation of the segmented line

In addition, use is made of the
correlation coefficient of all data (Ra), the coefficient of determination (or coefficient of explanation), confidence intervals of the regression functions, and analysis of variance (ANOVA).

The coefficient of determination for all data (Cd), which is to be maximized under the conditions set by the significance tests, is found from:

C_d = 1 - \frac{\sum (y - Y_r)^2}{\sum (y - Y_a)^2}

where Yr is the expected (predicted) value of y according to the former regression equations and Ya is the average of all y values.

The Cd coefficient ranges from 0 (no explanation at all) to 1 (full explanation, perfect match). In a pure, unsegmented linear regression, the values of Cd and Ra² are equal. In a segmented regression, Cd needs to be significantly larger than Ra² to justify the segmentation. The
optimal value of the breakpoint may be found such that the Cd coefficient is maximum.

==No-effect range==
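The breakpoint search by maximizing Cd, described in the preceding section, can be sketched as follows. This is a minimal illustration in Python with NumPy, not a definitive implementation. The function names (`coefficient_of_determination`, `fit_two_segments`, `best_breakpoint`), the `min_points` parameter and the synthetic data are hypothetical. The sketch assumes that each segment is fitted separately by ordinary least squares and that candidate breakpoints are restricted to observed x values. It omits the significance tests and only compares Cd with Ra² numerically.

```python
import numpy as np


def coefficient_of_determination(y, y_pred):
    """Cd = 1 - sum((y - Yr)^2) / sum((y - Ya)^2), with Ya the mean of y."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_two_segments(x, y, bp):
    """Fit one least-squares line to x <= bp and another to x > bp.

    Returns the predicted values Yr for all data points.
    Assumes each side contains at least two distinct x values.
    """
    y_pred = np.empty_like(y, dtype=float)
    for mask in (x <= bp, x > bp):
        slope, intercept = np.polyfit(x[mask], y[mask], 1)
        y_pred[mask] = slope * x[mask] + intercept
    return y_pred


def best_breakpoint(x, y, min_points=3):
    """Try every observed x as breakpoint and keep the one with the largest Cd."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_bp, best_cd = None, -np.inf
    for bp in np.unique(x):
        n_left = np.count_nonzero(x <= bp)
        n_right = x.size - n_left
        if n_left < min_points or n_right < min_points:
            continue  # too few points on one side for a meaningful fit
        cd = coefficient_of_determination(y, fit_two_segments(x, y, bp))
        if cd > best_cd:
            best_bp, best_cd = bp, cd
    return best_bp, best_cd


# Synthetic, noise-free data (an assumption for illustration only):
# slope 2 up to x = 9, then a drop followed by slope -0.5.
x = np.arange(0.0, 20.0)
y = np.where(x <= 9, 2.0 * x, 10.0 - 0.5 * (x - 9))

bp, cd = best_breakpoint(x, y)
ra2 = np.corrcoef(x, y)[0, 1] ** 2  # Ra squared for a single unsegmented line

print("breakpoint:", bp)
print("Cd (segmented):", cd)
print("Ra^2 (unsegmented):", ra2)
```

With real, noisy data a larger Cd alone would not justify the segmentation; the significance tests listed earlier would still have to be applied to the fitted coefficients and the breakpoint.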