Much research has been done to solve uncertainty quantification problems, though the majority of it deals with uncertainty propagation. During the past one to two decades, a number of approaches to inverse uncertainty quantification problems have also been developed and have proved useful for most small- to medium-scale problems.
==Forward propagation==
Existing uncertainty propagation approaches include probabilistic and non-probabilistic approaches. There are broadly six categories of probabilistic approaches for uncertainty propagation:
• Simulation-based methods: Monte Carlo simulations, importance sampling, adaptive sampling, etc.
• General surrogate-based methods: In a non-intrusive approach, a surrogate model is learnt in order to replace the experiment or the simulation with a cheap and fast approximation. Surrogate-based methods can also be employed in a fully Bayesian fashion. This approach has proven particularly powerful when the cost of sampling, e.g. computationally expensive simulations, is prohibitively high.
• Local expansion-based methods: Taylor series, perturbation method, etc. These methods have advantages when dealing with relatively small input variability and outputs that do not express high nonlinearity. These linear or linearized methods are detailed in the article Uncertainty propagation.
• Functional expansion-based methods: Neumann expansion, orthogonal or Karhunen–Loève expansions (KLE), with polynomial chaos expansion (PCE) and wavelet expansions as special cases.
• Most probable point (MPP)-based methods: first-order reliability method (FORM) and second-order reliability method (SORM).
• Numerical integration-based methods: full factorial numerical integration (FFNI) and dimension reduction (DR).
Among non-probabilistic approaches, interval analysis, fuzzy theory, possibility theory and evidence theory are the most widely used.
The probabilistic approach is considered the most rigorous approach to uncertainty analysis in engineering design due to its consistency with the theory of decision analysis. Its cornerstone is the calculation of probability density functions for sampling statistics. This can be performed rigorously for random variables that are obtainable as transformations of Gaussian variables, leading to exact confidence intervals.
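As a concrete comparison of three of the probabilistic categories above, the sketch below propagates a Gaussian input through a simple nonlinear model by plain Monte Carlo sampling, by a first-order Taylor expansion, and by Gauss–Hermite quadrature (the one-dimensional case of FFNI). The model f and all parameter values are illustrative assumptions, not taken from any particular reference.

```python
import numpy as np

# Toy model y = f(x) = exp(0.3*x) with uncertain input X ~ N(mu, sigma^2).
# Y is then lognormal, so the exact mean is available for comparison.
f = lambda x: np.exp(0.3 * x)
mu, sigma = 1.0, 0.5
exact_mean = np.exp(0.3 * mu + 0.5 * (0.3 * sigma) ** 2)

# 1) Simulation-based: plain Monte Carlo sampling of the input distribution.
rng = np.random.default_rng(0)
mc_mean = f(rng.normal(mu, sigma, 100_000)).mean()

# 2) Local expansion: first-order Taylor series about the mean input,
#    E[Y] ~ f(mu) and Var[Y] ~ (f'(mu)*sigma)^2.
dfdx = 0.3 * np.exp(0.3 * mu)              # analytic derivative f'(mu)
taylor_mean, taylor_std = f(mu), abs(dfdx) * sigma

# 3) Numerical integration: 10-point Gauss-Hermite quadrature,
#    E[f(X)] = (1/sqrt(pi)) * sum_i w_i * f(mu + sqrt(2)*sigma*t_i).
t, w = np.polynomial.hermite.hermgauss(10)
ghq_mean = np.sum(w * f(mu + np.sqrt(2) * sigma * t)) / np.sqrt(np.pi)

print(exact_mean, mc_mean, taylor_mean, ghq_mean)
```

The quadrature estimate matches the exact mean essentially to machine precision, while the first-order Taylor estimate is biased low here because f is convex, illustrating the "small variability, low nonlinearity" caveat for local expansion methods.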
==Inverse uncertainty quantification==
===Frequentist===
In regression analysis and least squares problems, the standard error of parameter estimates is readily available and can be expanded into a confidence interval. If the goal is uncertainty quantification for future observations, one may use conformal prediction, which circumvents parameter estimation.
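A minimal illustration of the frequentist route, assuming a hypothetical straight-line model fitted by ordinary least squares: the standard error of the slope estimate expands directly into an approximate confidence interval.

```python
import numpy as np

# Synthetic data from y = 2 + 0.7*x + Gaussian noise (all values illustrative).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.7 * x + rng.normal(0.0, 0.5, x.size)

X = np.column_stack([np.ones_like(x), x])       # design matrix [1, x]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares estimates

# Residual variance and the covariance matrix of the parameter estimates.
resid = y - X @ beta
s2 = resid @ resid / (x.size - 2)               # n - 2 fitted parameters
cov = s2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))                      # standard errors

# Approximate 95% confidence interval for the slope (normal quantile 1.96).
lo, hi = beta[1] - 1.96 * se[1], beta[1] + 1.96 * se[1]
print(f"slope = {beta[1]:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

With more care one would use the Student-t quantile rather than 1.96, but for 50 points the difference is negligible.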
===Bayesian===
Several methodologies for inverse uncertainty quantification exist under the Bayesian framework. The most demanding direction aims at solving problems with both bias correction and parameter calibration. The challenges of such problems include not only the influence of model inadequacy and parameter uncertainty, but also the lack of data from both computer simulations and experiments. A common situation is that the input settings differ between experiments and simulations. Another common situation is that parameters derived from experiments are used as inputs to simulations. For computationally expensive simulations, a surrogate model, e.g. a Gaussian process or a polynomial chaos expansion, is often necessary, defining an inverse problem of finding the surrogate model that best approximates the simulations.
The modular Bayesian approach derives its name from its four-module procedure. Apart from the currently available data, a prior distribution for the unknown parameters should be assigned.
; Module 1: Gaussian process modeling for the computer model
To address the lack of simulation results, the computer model is replaced with a Gaussian process (GP) model:
: y^m(\mathbf{x},\boldsymbol{\theta})\sim\mathcal{GP}\big(\mathbf{h}^m(\cdot)^T\boldsymbol{\beta}^m,\sigma_m^2 R^m(\cdot,\cdot)\big)
where
: R^m\big((\mathbf{x},\boldsymbol{\theta}),(\mathbf{x}',\boldsymbol{\theta}')\big)=\exp\left\{-\sum_{k=1}^d \omega_k^m(x_k-x_k')^2\right\}\exp\left\{-\sum_{k=1}^r \omega_{d+k}^m(\theta_k-\theta_k')^2\right\}.
Here d is the dimension of the input variables and r is the dimension of the unknown parameters. While \mathbf{h}^m(\cdot) is pre-defined, \left\{\boldsymbol{\beta}^m, \sigma_m, \omega_k^m, k=1,\ldots,d+r\right\}, known as the hyperparameters of the GP model, need to be estimated via maximum likelihood estimation (MLE). This module can be considered a generalized kriging method.
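The conditioning step behind such a GP model can be sketched in a few lines. The training data, the fixed roughness parameter omega, and the zero prior mean below are illustrative simplifications; in the modular approach the regression term h^m(·)^T β^m and all hyperparameters would instead come from MLE.

```python
import numpy as np

# Squared-exponential correlation R(x, x') = exp(-omega * (x - x')^2),
# the one-dimensional analogue of the R^m defined above.
def corr(a, b, omega=1.0):
    return np.exp(-omega * (a[:, None] - b[None, :]) ** 2)

x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # "simulation" inputs
y_train = np.sin(x_train)                        # "simulation" outputs
sigma2 = 1.0                                     # process variance (fixed here)
K = sigma2 * corr(x_train, x_train) + 1e-10 * np.eye(x_train.size)

# Zero-mean GP conditional at new inputs: posterior mean and covariance.
x_new = np.array([1.0, 1.5, 2.5])
k_star = sigma2 * corr(x_new, x_train)
mean = k_star @ np.linalg.solve(K, y_train)
cov = sigma2 * corr(x_new, x_new) - k_star @ np.linalg.solve(K, k_star.T)
print(mean, np.diag(cov))
```

At the training input x = 1.0 the predictor interpolates the data and the posterior variance collapses to (numerically) zero; between training points the variance reflects the remaining uncertainty, which is exactly why GP surrogates are attractive when simulations are scarce.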
; Module 2: Gaussian process modeling for the discrepancy function
Similarly to the first module, the discrepancy function is replaced with a GP model:
: \delta(\mathbf{x})\sim\mathcal{GP}\big(\mathbf{h}^\delta(\cdot)^T\boldsymbol{\beta}^\delta,\sigma_\delta^2 R^\delta(\cdot,\cdot)\big)
where
: R^\delta(\mathbf{x},\mathbf{x}')=\exp\left\{-\sum_{k=1}^d \omega_k^\delta(x_k-x_k')^2\right\}.
Together with the prior distribution of the unknown parameters, and data from both computer models and experiments, one can derive the maximum likelihood estimates for \left\{\boldsymbol{\beta}^\delta, \sigma_\delta, \omega_k^\delta, k=1,\ldots,d\right\}. At the same time, \boldsymbol{\beta}^m from Module 1 is updated as well.
; Module 3: Posterior distribution of unknown parameters
Bayes' theorem is applied to calculate the posterior distribution of the unknown parameters:
: p(\boldsymbol{\theta}\mid\text{data},\boldsymbol{\varphi})\propto p(\text{data}\mid\boldsymbol{\theta},\boldsymbol{\varphi})\,p(\boldsymbol{\theta})
where \boldsymbol{\varphi} includes all the fixed hyperparameters from the previous modules.
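For a single scalar parameter, this Bayes update can be evaluated directly on a grid. The prior, the Gaussian likelihood, and the observations below are toy placeholders, not part of the framework itself:

```python
import numpy as np

# p(theta | data) ∝ p(data | theta) * p(theta), normalized numerically on a grid.
theta = np.linspace(-3.0, 3.0, 601)
dtheta = theta[1] - theta[0]
prior = np.exp(-0.5 * theta ** 2)         # standard normal prior (unnormalized)

data = np.array([0.9, 1.1, 1.3])          # toy observations, noise sd = 0.5
loglik = -0.5 * (((data[:, None] - theta[None, :]) / 0.5) ** 2).sum(axis=0)

unnorm = prior * np.exp(loglik)
post = unnorm / (unnorm.sum() * dtheta)   # Riemann-sum normalization
post_mean = (theta * post).sum() * dtheta
print(post_mean)                          # conjugate result: 12*1.1/13 ≈ 1.015
```

Grid evaluation only works in very low dimension; for realistic numbers of parameters and hyperparameters, sampling methods take its place.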
; Module 4: Prediction of the experimental response and discrepancy function
; Full approach
The fully Bayesian approach requires that priors be assigned not only for the unknown parameters \boldsymbol{\theta} but also for the other hyperparameters \boldsymbol{\varphi}. It consists of the following steps:
• Derive the posterior distribution p(\boldsymbol{\theta},\boldsymbol{\varphi}\mid\text{data});
• Integrate \boldsymbol{\varphi} out to obtain p(\boldsymbol{\theta}\mid\text{data}). This single step accomplishes the calibration;
• Predict the experimental response and discrepancy function.
However, the approach has significant drawbacks:
• For most cases, p(\boldsymbol{\theta},\boldsymbol{\varphi}\mid\text{data}) is a highly intractable function of \boldsymbol{\varphi}, so the integration becomes very troublesome. Moreover, if priors for the other hyperparameters \boldsymbol{\varphi} are not carefully chosen, the complexity of the numerical integration increases even further.
• In the prediction stage, the prediction (which should at least include the expected value of the system responses) also requires numerical integration.
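To make the sampling-based alternative to direct integration concrete, below is a minimal random-walk Metropolis sampler over a hypothetical one-dimensional log-posterior (a standard normal centered at 2); the target, the proposal scale, and the chain length are all illustrative:

```python
import numpy as np

def log_post(t):
    # Toy stand-in for log p(theta, phi | data): a N(2, 1) log-density.
    return -0.5 * (t - 2.0) ** 2

rng = np.random.default_rng(0)
chain, t = [], 0.0
for _ in range(20_000):
    prop = t + rng.normal(0.0, 1.0)              # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(t):
        t = prop                                 # accept; otherwise keep t
    chain.append(t)
chain = np.array(chain[2_000:])                  # discard burn-in
print(chain.mean(), chain.std())                 # close to the target's 2.0 and 1.0
```

Even this toy chain needs thousands of posterior evaluations to estimate two moments; when each evaluation involves an expensive simulation, the cost becomes the dominant concern.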
Markov chain Monte Carlo (MCMC) is often used for this integration; however, it is computationally expensive. The fully Bayesian approach requires a huge amount of computation and may not yet be practical for the most complicated modelling situations.
==Known issues==