It is a trivial matter to show that a Gibbs random field satisfies every Markov property. As an example of this fact:

In the image to the right, a Gibbs random field over the provided graph has the form

\Pr(A,B,C,D,E,F) \propto f_1(A,B,D)f_2(A,C,D)f_3(C,D,F)f_4(C,E,F)

If variables C and D are fixed, then the global Markov property requires that A, B \perp E, F | C, D (see conditional independence), since C, D forms a barrier between A, B and E, F.

With C and D held constant,

\Pr(A,B,E,F|C=c,D=d) \propto [f_1(A,B,d)f_2(A,c,d)] \cdot [f_3(c,d,F)f_4(c,E,F)] = g_1(A,B)g_2(E,F)

where g_1(A,B) = f_1(A,B,d)f_2(A,c,d) and g_2(E,F) = f_3(c,d,F)f_4(c,E,F). This implies that A, B \perp E, F | C, D.

To establish that every positive probability distribution that satisfies the local Markov property is also a Gibbs random field, the following lemma, which provides a means for combining different factorizations, needs to be proved:
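The conditional independence in the example above can be checked by brute force. The sketch below (an illustration, not part of the proof) builds the four factors f_1, \dots, f_4 over binary variables with arbitrary random positive values, and verifies that \Pr(A,B,E,F|C,D) equals the product of its (A,B) and (E,F) conditional marginals for every assignment of C and D:

```python
import itertools
import random

random.seed(0)
B = (0, 1)  # binary domain for every variable

# Arbitrary positive factors matching the example:
# Pr(A,B,C,D,E,F) ∝ f1(A,B,D) f2(A,C,D) f3(C,D,F) f4(C,E,F)
f1, f2, f3, f4 = ({k: random.uniform(0.1, 1.0)
                   for k in itertools.product(B, repeat=3)} for _ in range(4))

def weight(a, b, c, d, e, f):
    """Unnormalized Pr(A,B,C,D,E,F)."""
    return f1[a, b, d] * f2[a, c, d] * f3[c, d, f] * f4[c, e, f]

def cond(a, b, e, f, c, d):
    """Pr(A=a, B=b, E=e, F=f | C=c, D=d)."""
    z = sum(weight(a2, b2, c, d, e2, f2)
            for a2, b2, e2, f2 in itertools.product(B, repeat=4))
    return weight(a, b, c, d, e, f) / z

# A,B ⊥ E,F | C,D holds iff the conditional joint equals the product
# of its (A,B) and (E,F) conditional marginals.
for c, d in itertools.product(B, repeat=2):
    for a, b, e, f in itertools.product(B, repeat=4):
        m_ab = sum(cond(a, b, e2, f2, c, d) for e2, f2 in itertools.product(B, repeat=2))
        m_ef = sum(cond(a2, b2, e, f, c, d) for a2, b2 in itertools.product(B, repeat=2))
        assert abs(cond(a, b, e, f, c, d) - m_ab * m_ef) < 1e-12
print("verified: A,B ⊥ E,F | C,D for all assignments of C,D")
```

Because the check succeeds for any positive choice of the factors, it mirrors the argument in the text: conditioning on C and D splits the product into a g_1(A,B) part and a g_2(E,F) part.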
Lemma 1. Let U denote the set of all random variables under consideration, and let \Theta, \Phi_1, \Phi_2, \dots, \Phi_n \subseteq U and \Psi_1, \Psi_2, \dots, \Psi_m \subseteq U denote arbitrary sets of variables. (Here, given an arbitrary set of variables X, X will also denote an arbitrary assignment to the variables from X.) If

\Pr(U) = f(\Theta)\prod_{i=1}^n g_i(\Phi_i) = \prod_{j=1}^m h_j(\Psi_j)

for functions f, g_1, g_2, \dots, g_n and h_1, h_2, \dots, h_m, then there exist functions h'_1, h'_2, \dots, h'_m and g'_1, g'_2, \dots, g'_n such that

\Pr(U) = \bigg(\prod_{j=1}^m h'_j(\Theta \cap \Psi_j)\bigg)\bigg(\prod_{i=1}^n g'_i(\Phi_i)\bigg)

In other words, \prod_{j=1}^m h_j(\Psi_j) provides a template for further factorization of f(\Theta).

In order to use \prod_{j=1}^m h_j(\Psi_j) as a template to further factorize f(\Theta), all variables outside of \Theta need to be fixed. To this end, let \bar{\theta} be an arbitrary fixed assignment to the variables from U \setminus \Theta (the variables not in \Theta). For an arbitrary set of variables X, let \bar{\theta}[X] denote the assignment \bar{\theta} restricted to the variables from X \setminus \Theta (the variables from X, excluding the variables from \Theta).

Moreover, to factorize only f(\Theta), the other factors g_1(\Phi_1), g_2(\Phi_2), \dots, g_n(\Phi_n) need to be rendered moot for the variables from \Theta. To do this, the factorization

\Pr(U) = f(\Theta)\prod_{i=1}^n g_i(\Phi_i)

will be re-expressed as

\Pr(U) = \bigg(f(\Theta)\prod_{i=1}^n g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])\bigg)\bigg(\prod_{i=1}^n \frac{g_i(\Phi_i)}{g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])}\bigg)

For each i = 1, 2, \dots, n, the factor g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) is g_i(\Phi_i) with all variables outside of \Theta fixed to the values prescribed by \bar{\theta}.
Let f'(\Theta) = f(\Theta)\prod_{i=1}^n g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) and g'_i(\Phi_i) = \frac{g_i(\Phi_i)}{g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])} for each i = 1, 2, \dots, n, so that

\Pr(U) = f'(\Theta)\prod_{i=1}^n g'_i(\Phi_i) = \prod_{j=1}^m h_j(\Psi_j)

What is most important is that g'_i(\Phi_i) = \frac{g_i(\Phi_i)}{g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])} = 1 when the values assigned to \Phi_i do not conflict with the values prescribed by \bar{\theta}, making g'_i(\Phi_i) "disappear" when all variables not in \Theta are fixed to the values from \bar{\theta}. Fixing all variables not in \Theta to the values from \bar{\theta} gives

\Pr(\Theta, \bar{\theta}) = f'(\Theta) \prod_{i=1}^n g'_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) = \prod_{j=1}^m h_j(\Psi_j \cap \Theta, \bar{\theta}[\Psi_j])

Since each g'_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) = 1,

f'(\Theta) = \prod_{j=1}^m h_j(\Psi_j \cap \Theta, \bar{\theta}[\Psi_j])

Letting h'_j(\Theta \cap \Psi_j) = h_j(\Psi_j \cap \Theta, \bar{\theta}[\Psi_j]) gives

f'(\Theta) = \prod_{j=1}^m h'_j(\Theta \cap \Psi_j)

which finally gives

\Pr(U) = \bigg(\prod_{j=1}^m h'_j(\Theta \cap \Psi_j)\bigg)\bigg(\prod_{i=1}^n g'_i(\Phi_i)\bigg)

Lemma 1 provides a means of combining two different factorizations of \Pr(U). The local Markov property implies that for any random variable x \in U, there exist factors f_x and f_{-x} such that

\Pr(U) = f_x(x, \partial x)f_{-x}(U \setminus \{x\})

where \partial x denotes the neighbors of node x. Applying Lemma 1 repeatedly eventually factors \Pr(U) into a product of clique potentials (see the image on the right).
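The construction in the lemma can be traced concretely. The sketch below (an illustration with assumed factor tables, not part of the proof) takes the smallest interesting case, n = 1 and m = 2, over binary variables U = \{X, Y, Z\} with \Theta = \{X, Y\}, \Phi_1 = \{Y, Z\}, \Psi_1 = \{X, Y\}, \Psi_2 = \{Y, Z\}. It builds f', g'_1, h'_1, h'_2 exactly as in the proof and checks that g'_1 equals 1 at the fixed assignment \bar{\theta} and that the combined factorization reproduces \Pr(U):

```python
import itertools

B = (0, 1)

# Assumed positive factor tables (arbitrary values for illustration):
h1 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 3.0}   # h1(X, Y)
h2 = {(0, 0): 1.5, (0, 1): 0.7, (1, 0): 2.0, (1, 1): 1.0}   # h2(Y, Z)

def pr(x, y, z):                      # unnormalized Pr(U) = h1(X,Y) h2(Y,Z)
    return h1[x, y] * h2[y, z]

# A second factorization Pr(U) = f(Θ) g1(Φ1), with Θ = {X,Y}, Φ1 = {Y,Z}:
f  = lambda x, y: h1[x, y]
g1 = lambda y, z: h2[y, z]

z_bar = 0  # θ̄: an arbitrary fixed assignment to U \ Θ = {Z}

# The proof's construction:
f_prime  = lambda x, y: f(x, y) * g1(y, z_bar)      # f'(Θ)
g1_prime = lambda y, z: g1(y, z) / g1(y, z_bar)     # g'_1(Φ1)
h1_prime = lambda x, y: h1[x, y]                    # h'_1(Θ ∩ Ψ1) = h1(X, Y)
h2_prime = lambda y: h2[y, z_bar]                   # h'_2(Θ ∩ Ψ2) = h2(Y, z̄)

for x, y, z in itertools.product(B, repeat=3):
    # g'_1 "disappears" when Z is fixed to z̄ ...
    assert g1_prime(y, z_bar) == 1.0
    # ... and the lemma's combined factorization reproduces Pr(U):
    assert abs(pr(x, y, z) - h1_prime(x, y) * h2_prime(y) * g1_prime(y, z)) < 1e-12
print("Lemma 1 construction verified on a 3-variable example")
```

The template \prod_j h_j(\Psi_j) splits f(\Theta) into h'_1(X, Y) h'_2(Y), while g'_1 absorbs the remaining Z-dependence, exactly as in the general argument.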
End of Proof