07 - Sampling Methods

This chapter of the NDA* Handbook will explain the non deterministic approaches based on sampling. Those methods can be used for both possibilistic and probabilistic approaches with some adaptations.

Here a general overview and the classical definition of these methods is proposed, leaving to specific sections how to customize these techniques for different purposes. Some examples are given to better explain the advantages and the limitations of using sampling methods.

Introduction to Monte Carlo methods

Numerical methods that are known as Monte Carlo methods can be loosely described as statistical simulation methods. A statistical simulation can be defined, in general terms, to be any method that utilizes sequences of random numbers to perform the simulation.

Monte Carlo methods have been used for centuries, but only in the past several decades this technique has gained the status of a full-fledged numerical method capable of addressing the most complex applications. The name "Monte Carlo" was coined by Metropolis (inspired by Ulam's interest in poker) during the Manhattan Project of World War II, because of the similarity of statistical simulation to games of chance, and because the capital of Monaco was a center for gambling and similar pursuits.

Monte Carlo is now used routinely in many diverse fields, from the simulation of complex physical phenomena such as radiation transport in the earth's atmosphere to more common cases, such as the simulation of a Bingo game. The analogy of Monte Carlo methods to games of chance is a good one but, in engineering applications, the "game" is a physical system, a Finite Element model or a black-box function, and the outcome of the game is a solution to some problem. The scientist judges the value of his results on their intrinsic worth, rather than the extrinsic worth of his holdings.

Statistical simulation methods may be contrasted to conventional numerical discretization methods, which typically are applied to ordinary or partial differential equations that describe some underlying physical or mathematical system.

In many applications of Monte Carlo, the physical process is simulated directly, and there is no need to even write down the differential equations that describe the behavior of the system. The only requirement is that the physical (or mathematical) system be described by probability density functions (PDFs).

From now on, we will assume that the behavior of a system can be described by PDFs. Once the PDFs are known, the Monte Carlo simulation can proceed by random sampling from the PDFs. Many simulations are then performed (multiple "trials" or "histories") and the desired result is taken as an average over the number of observations (which may be a single observation or perhaps millions of observations). In many practical applications, one can predict the statistical error (the "variance") in this average result, and hence an estimate of the number of Monte Carlo trials that are needed to achieve a given error.

Assuming that the evolution of the physical system can be described by probability density functions (PDFs), then the Monte Carlo simulation can proceed by sampling from these PDFs, which necessitates a fast and effective way to generate random numbers uniformly distributed on the interval [0,1]. The outcomes of these random samplings, or trials, must be accumulated or tallied in an appropriate manner to produce the desired result, but the essential characteristic of Monte Carlo is the use of random sampling techniques (and perhaps other algebra to manipulate the outcomes) to arrive at a solution of the physical problem. In contrast, a conventional numerical solution approach would start with the mathematical model of the physical system, discretizing the differential equations and then solving a set of algebraic equations for the unknown state of the system.

It should be kept in mind though that this general description of Monte Carlo methods may not directly apply to some applications. It is natural to think that Monte Carlo methods are used to simulate random, or stochastic, processes, since these can be described by PDFs. However, this coupling is actually too restrictive because many Monte Carlo applications have no apparent stochastic content, such as the evaluation of a definite integral or the inversion of a system of linear equations. However, in these cases and others, one can pose the desired solution in terms of PDFs, and while this transformation may seem artificial, this step allows the system to be treated as a stochastic process for the purpose of simulation and hence Monte Carlo methods can be applied to simulate the system.

Therefore, a broad view of the definition of Monte Carlo methods can be taken and can include in the Monte Carlo rubric all methods that involve statistical simulation of some underlying system, whether or not the system represents a real physical process.

 

Major Components of a Monte Carlo Algorithm

Given the definition of Monte Carlo, let's now describe briefly the major components of a Monte Carlo method. These components comprise the foundation of most Monte Carlo applications, and the following sections will explore them in more detail. An understanding of these major components will provide a good foundation to construct your own Monte Carlo method although, of course, the physics and mathematics of the specific application are well beyond the scope of this page. The primary components of a Monte Carlo simulation method include the following:

  • Probability Distribution Functions (PDFs) - the physical (or mathematical) system must be described by a set of PDFs.
  • Random number generator - a source of random numbers uniformly distributed on the unit interval [0,1] must be available.
  • Sampling rule - a prescription must be given for sampling from the specified PDFs, assuming the availability of random numbers on the unit interval.
  • Scoring (or tallying) - the outcomes can be accumulated into overall tallies or scores for the quantities of interest.
  • Error estimation - an estimate of the statistical error (variance) as a function of the number of trials and other quantities must be determined.
  • Variance reduction techniques - methods for reducing the variance in the estimated solution to reduce the computational time for Monte Carlo simulation
  • Parallelization and vectorization - algorithms to allow Monte Carlo methods can be implemented efficiently on advanced computer architectures and can exploit the full potential of parallel computations.

The remainder of this book section will treat these topics in some more detail.

 

Sampling from Probability Distribution Functions

As described earlier, a Monte Carlo simulation consists of some physical or mathematical system that can be described in terms of probability distribution functions, or PDFs. These PDFs, supplemented perhaps by additional computations, describe the evolution of the overall system, whether in space, or energy, or time, or even some higher dimensional phase space. The goal of the Monte Carlo method is to simulate the physical system by random sampling from these PDFs and by performing the necessary supplementary computations needed to describe the system evolution. In essence, the physics and mathematics are replaced by random sampling of possible states from PDFs that describe the system.

We will now discuss how to obtain a random sample $x$ from either a continuous PDF* $f(x)$ or a discrete PDF* $\{p_{i}\}$.

Equivalent Continuous PDFs

It will be convenient to express a discrete PDF* as a continuous PDF* using "delta functions". This will make the ensuing discussion easier to follow and simplifies many of the manipulations for discrete PDFs. Given a discrete pdf $\{p_{i}\}$, let us associate event $i$ with the discrete random variable (r.v.) $x_{i}$, and then define an equivalent "continuous" PDF* as follows:

$f(x)= sum_{i=1}^{N}p_{i}delta(x-x_{i})$

(1)

Here $delta(x-x_{i})$ is the "delta" function and it satisfies the following properties:

$int_{-oo}^{oo}delta(x-x_{i})dx=1$

(2)

$int_{-oo}^{oo}f(x)delta(x-x_{i})dx=f(x_{i})$

(3)

 

 

Using these properties, it is straightforward to show that the mean and variance of the equivalent continuous pdf, as defined in Eq. (1), are identical to the mean and variance of the original discrete PDF*. Begin with the definition of the mean of the equivalent continuous PDF*:

$bar x =int_{-oo}^{oo}xf(x)dx=int_{-oo}^{oo}x[sum_{i=1}^{N}p_{i}delta(x-x_{i})]dx$

(4)

Now take the summation outside the integral and use Eq. (3),

$bar x= sum_{i=1}^{N} int_{-oo}^{oo}xp_[i}delta(x-x_{i})dx = sum_{i=1}^{N}x_{i}p_{i}$

(5)

which is the true mean for the discrete PDF*. This also holds for the variance, and in general for any moment of the distribution. Much of the material that follows holds for both discrete and continuous PDFs, and this equivalence will be useful in this discussion.

 

Transformation of PDFs

In order to have a complete discussion of sampling, we need to explain transformation rules for PDFs. That is, given a PDF* $f(x)$, one defines a new variable $y=y(x)$, and the goal is to find the pdf $g(y)$ that describes the probability that the r.v. $y$ occurs. For example, given the pdf $f(E)$ for the energy of the scattered neutron in an elastic scattering reaction from a nucleus of mass $A$, what is the pdf $g(v)$ for the speed $v$, where $E=frac{1}{2}mv^{2}$?

First of all, we need to restrict the transformation $y=y(x)$ to be a unique transformation, because there must be a 1-to-1 relationship between $x$ and $y$ in order to be able to state that a given value of $x$ corresponds unambiguously to a value of $y$. Given that $y(x)$ is 1-to-1, then it must either be monotone increasing or monotone decreasing, since any other behavior would result in a multiple-valued function $y(x)$.

Let us first assume that the transformation $y(x)$ is monotone increasing, which results in $frac{dx}{dy}>0$ for all $x$. Physically, the mathematical transformation must conserve probability, i.e., the probability of the r.v. $x'$ occurring in $dx$ about $x$ must be the same as the probability of the r.v. $y'$ occurring in $dy$ about $y$, since if $x$ occurs, the 1-to-1 relationship between $x$ and $y$ necessitates that $y$ appears. But by definition of the pdf's $f(x)$ and $g(y)$,

${:( f(x)dx="prob"(x lt= x' lt=x+dx) ),( g(y)dy="prob"(y lt= y' lt=y+dy) ):}$

 

The physical transformation implies that these probabilities must be equal.

Equality of these differential probabilities yields

$f(x)dx=g(y)dy$

(6)

and one can then solve for $g(y)$:

$g(y)=f(x)/[dy/dx]$

(7)

 

This holds for the monotone increasing function $y(x)$. It is easy to show that for a monotone decreasing function $y(x)$, where $dy/dx<0$ for all $x$, the fact that $g(y)$ must be positive (by definition of probability) leads to the following expression for $g(y)$:

$g(y)=f(x)/[-dy/dx]$

(8)

Combining the two cases leads to the following simple rule for transforming pdf's:

$g(y)=f(x)/|dy/dx|$

(9)

 

For multidimensional pdf's, the derivative $|dy/dx|$ is replaced by the Jacobian of the transformation, which will be described later when we discuss sampling from the Gaussian pdf.

Example 1: An illustration of the cumulative distribution function, or cdf.

Perhaps the most important transformation occurs when $y(x)$ is the cumulative distribution function, or cdf:

$y(x)=F(x)-=int_{-oo}^{oo}f(x')dx'$

(10)

In this case, we have $dy/dx=f(x)$, and one finds the important result that the pdf for the transformation is given by:

$g(y)=1, quad 0 lt= y lt= 1$

(11)

In other words, the cdf is always uniformly distributed on [0,1], independently of the pdf $f(x)$. Any value for the cdf is equally likely on the interval [0,1]. As will be seen next, this result has important ramifications for sampling from an arbitrary pdf.