Suppose that the parameter \theta = (\theta_1, \theta_2, \dots, \theta_k) characterizes the distribution f_W(w; \theta) of the random variable W. Suppose the first k moments of the true distribution (the "population moments") can be expressed as functions of the \theta_j:

\begin{align}
\mu_1 & \equiv \operatorname E[W] = g_1(\theta_1, \theta_2, \ldots, \theta_k), \\[4pt]
\mu_2 & \equiv \operatorname E[W^2] = g_2(\theta_1, \theta_2, \ldots, \theta_k), \\
& \,\,\, \vdots \\
\mu_k & \equiv \operatorname E[W^k] = g_k(\theta_1, \theta_2, \ldots, \theta_k).
\end{align}

Suppose a sample of size n is drawn, resulting in the values w_1, \dots, w_n. For j = 1, \dots, k, let \hat\mu_j = \frac{1}{n} \sum_{i=1}^n w_i^j be the j-th sample moment, an estimate of \mu_j.
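For example, if W follows a normal distribution with mean \mu and variance \sigma^2, then \theta = (\mu, \sigma^2), k = 2, and the population moments are

\begin{align}
\mu_1 & \equiv \operatorname E[W] = \mu = g_1(\mu, \sigma^2), \\[4pt]
\mu_2 & \equiv \operatorname E[W^2] = \mu^2 + \sigma^2 = g_2(\mu, \sigma^2).
\end{align}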
The method of moments estimators of \theta_1, \theta_2, \ldots, \theta_k, denoted by \hat\theta_1, \hat\theta_2, \dots, \hat\theta_k, are defined as the solution (if one exists) to the equations

\begin{align}
\hat\mu_1 & = g_1(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k), \\[4pt]
\hat\mu_2 & = g_2(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k), \\
& \,\,\, \vdots \\
\hat\mu_k & = g_k(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k).
\end{align}

The method described here for a single random variable generalizes in an obvious manner to multiple random variables, leading to multiple possible choices of moments to be used. Different choices generally lead to different solutions.
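When the moment equations admit no convenient closed-form solution, they can be solved numerically. Below is a minimal sketch assuming, for illustration, a two-parameter gamma model with shape a and scale b, for which \operatorname E[W] = ab and \operatorname E[W^2] = a(a+1)b^2; the helper moment_equations and all numerical settings are hypothetical choices made for this example, not part of the method itself.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import fsolve

# Illustrative sketch: fit a two-parameter gamma distribution
# (shape a, scale b) by the method of moments.
rng = np.random.default_rng(0)
w = rng.gamma(shape=3.0, scale=2.0, size=10_000)  # simulated sample w_1, ..., w_n

m1 = w.mean()        # first sample moment:  (1/n) * sum of w_i
m2 = (w**2).mean()   # second sample moment: (1/n) * sum of w_i^2

def moment_equations(theta):
    """Residuals g_j(theta) - mu_hat_j, whose root is the estimator."""
    a, b = theta
    return [a * b - m1,                # g_1(a, b) = E[W]   = a*b
            a * (a + 1) * b**2 - m2]   # g_2(a, b) = E[W^2] = a*(a+1)*b^2

a_hat, b_hat = fsolve(moment_equations, x0=[1.0, 1.0])
print(a_hat, b_hat)  # close to the true values a = 3, b = 2
</syntaxhighlight>

For this particular model the equations can also be inverted analytically, giving \hat a = \hat\mu_1^2 / (\hat\mu_2 - \hat\mu_1^2) and \hat b = (\hat\mu_2 - \hat\mu_1^2) / \hat\mu_1, so a numerical solver is only needed when the g_j cannot be inverted in closed form.

==Advantages and disadvantages==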