If we set E:=1 independently of the data, we get a
trivial e-value: it is an e-variable by definition, but it will never allow us to reject the null hypothesis. This example shows that some e-variables may be better than others, in a sense to be defined below. Intuitively, a good e-variable is one that tends to be large (much larger than 1) if the alternative is true. This is analogous to the situation with p-values: both e-values and p-values can be defined without referring to an alternative, but
if an alternative is available, we would like them to be small (p-values) or large (e-values)
with high probability. In standard hypothesis tests, the quality of a valid test is formalized by the notion of
statistical power, but this notion has to be suitably modified in the context of e-values. The standard notion of quality of an e-variable relative to a given alternative H_1 , used by most authors in the field, is a generalization of the
Kelly criterion in economics and (since it exhibits close relations to classical power) is sometimes called
e-power; the optimal e-variable in this sense is known as
log-optimal or
growth-rate optimal (often abbreviated to GRO).

== Simple alternative, composite null ==
Now let the alternative H_1 = \{ Q \} be simple, with density q, and let the null H_0 = \{ P_{\theta} : \theta \in \Theta_0 \} be composite, such that all elements of H_0 \cup H_1 have densities relative to the same underlying measure. It has been shown that, under no regularity conditions at all, E= \frac{q(Y)}{\sup_{P \in H_0} p(Y)} \left( = \frac{q(Y)}{{p}_{\hat{\theta} \mid Y } (Y) } \right) is an e-variable (with the second equality holding if the MLE (
maximum likelihood estimator) \hat\theta \mid Y based on data Y is always well-defined). This way of constructing e-variables has been called the
universal inference (UI) method, "universal" referring to the fact that no regularity conditions are required.
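As a concrete illustration, the UI e-variable can be computed directly whenever the alternative density and the null-restricted MLE are available in closed form. The sketch below is an assumption-laden example, not part of the construction above: it takes the null to be H_0 : X_i \sim N(\theta, 1) with \theta \leq 0 and the simple alternative to be N(1, 1); the names ui_evalue and mu_alt are illustrative.

```python
import numpy as np

def gauss_logpdf(x, mu):
    """Log-density of N(mu, 1) evaluated at the points x."""
    return -0.5 * (x - mu) ** 2 - 0.5 * np.log(2 * np.pi)

def ui_evalue(y, mu_alt=1.0):
    """Universal-inference e-value E = q(Y) / sup_{P in H0} p(Y) for
    H0: X_i ~ N(theta, 1) with theta <= 0 (composite null) against the
    simple alternative X_i ~ N(mu_alt, 1).  Illustrative sketch: the
    Gaussian model and mu_alt are assumptions, not part of the article."""
    y = np.asarray(y, dtype=float)
    theta_hat = min(y.mean(), 0.0)            # MLE restricted to the null {theta <= 0}
    log_q = gauss_logpdf(y, mu_alt).sum()     # numerator:   q(Y)
    log_p = gauss_logpdf(y, theta_hat).sum()  # denominator: p_{theta_hat | Y}(Y)
    return float(np.exp(log_q - log_p))
```

By the UI argument, the expectation of this quantity is at most 1 under every distribution in the null, while it tends to grow when the data favor the alternative (for instance, ui_evalue([1.0, 1.0]) exceeds 1, whereas ui_evalue([-1.0, -1.0]) falls well below 1).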
== Composite alternative, simple null ==
Now let H_0 = \{ P \} be simple and H_1 = \{Q_{\theta}: \theta \in \Theta_1 \} be composite, such that all elements of H_0 \cup H_1 have densities relative to the same underlying measure. There are now two generic, closely related ways of obtaining e-variables that are close to growth-optimal (appropriately redefined for composite H_1 ): the method of mixtures and the plug-in method. The latter was, in essence, re-discovered by
Philip Dawid as "prequential plug-in" and
Jorma Rissanen as "predictive
MDL". The method of mixtures essentially amounts to "being Bayesian about the numerator" (the reason it is not called "Bayesian method" is that, when both null and alternative are composite, the numerator may often not be a Bayes marginal): we posit any prior distribution W on \Theta_1 , set \bar{q}_W(Y) := \int_{\Theta_1} q_{\theta} (Y) dW(\theta) , and use the e-variable \bar{q}_W(Y)/p(Y) .

To explicate the plug-in method, suppose that Y= (X_1, \ldots, X_n) where X_1, X_2, \ldots constitute a stochastic process, and let \breve\theta \mid X^{i} be an estimator of \theta \in \Theta_1 based on data X^i=(X_1, \ldots, X_i) for i \geq 0 . In practice one usually takes a "smoothed"
maximum likelihood estimator (such as, for example, the regression coefficients in
ridge regression), initially set to some "default value" \breve\theta \mid X^{0}:= \theta_0 . One now recursively constructs a density \bar{q}_{\breve\theta} for X^n by setting \bar{q}_{\breve\theta}(X^n) = \prod_{i=1}^n q_{\breve\theta \mid X^{i-1}}(X_i \mid X^{i-1}) . Effectively, both the method of mixtures and the plug-in method can be thought of as learning a specific instantiation of the alternative that explains the data well.

When null and alternative are both composite, near-growth-optimal e-variables can in principle be obtained via the reverse information projection (RIPr); however, in many statistical testing problems it is currently (2023) unknown whether fast implementations of the RIPr exist, and they may very well not exist (e.g. generalized linear models without the model-X assumption). In
nonparametric settings (such as testing a mean as in the example above, or nonparametric 2-sample testing), it is often more natural to consider e-variables of the 1 + \lambda U type. However, while these superficially look very different from likelihood ratios, they can often still be interpreted as such, and sometimes can even be re-interpreted as implementing a version of the RIPr-construction.

== Calibrators ==
E-variables can also be obtained by applying a suitable transformation to a p-value; functions that achieve this are called p-to-e calibrators. Formally, a calibrator is a nonnegative decreasing function f : [0, 1] \rightarrow [0, \infty] which, when applied to a p-variable (a random variable whose value is a
p-value), yields an e-variable. A calibrator f is said to dominate another calibrator g if f \geq g, and this domination is strict if the inequality is strict. An admissible calibrator is one that is not strictly dominated by any other calibrator. One can show that for a function to be a calibrator, it must have an integral of at most 1 over the uniform
probability measure. One family of admissible calibrators is given by the set of functions \{f_{\kappa} : 0 < \kappa < 1 \} with f_\kappa(p) := \kappa p^{\kappa -1} . Another calibrator is obtained by integrating out \kappa :

: \int_0^1 \kappa p^{\kappa -1} d\kappa = \frac{1-p+p \log p}{p(-\log p)^2} .

Conversely, an e-to-p calibrator transforms e-values back into p-variables. Interestingly, the following e-to-p calibrator dominates all other e-to-p calibrators:

: f(t) := \min(1, 1/t).

While of theoretical importance, calibration is not much used in the practical design of e-variables, since the resulting e-variables are often far from growth-optimal for any given H_1 .

== E-processes ==