General aspects of non-determinism in numerical analysis


Sources of numerical non-determinism in general FE modelling

Definitions

In the literature, the terms error, uncertainty and variability are not used unambiguously. Different researchers apply the same terms but attach rather inconsistent meanings to them. Every publication that treats uncertainties therefore has to clarify its terminology thoroughly. This work does not propose a new terminology, but applies the terminology proposed by Oberkampf [15]. Some additional nuances are, however, necessary in order to enable a clear distinction between probabilistic and non-probabilistic quantities in the remainder of this chapter.

The term variability covers the variation which is inherent to the modelled physical system or the environment under consideration. Generally, this is described by a distributed quantity defined over a range of possible values. The exact value is known to be within this range, but it will vary from unit to unit or from time to time. Ideally, objective information on both the range and the likelihood of the quantity within this range is available. Some literature refers to this variability as aleatory uncertainty or irreducible uncertainty, referring to the fact that even when all information on the particular property is available, the quantity cannot be deterministically determined.

An uncertainty is a potential deficiency in any phase or activity of the modelling process that is due to lack of knowledge. The word potential stresses that the deficiency may or may not occur. This definition basically states that uncertainty is caused by incomplete information resulting from either vagueness, nonspecificity or dissonance [16]. Vagueness characterises information which is imprecisely defined, unclear or indistinct. It is typically the result of human opinion on unknown quantities ("the density of this material is around $x$ "). Nonspecificity refers to the availability of a number of different models that describe the same phenomenon. The larger the number of alternatives, the larger the nonspecificity. Dissonance refers to the existence of conflicting evidence of the described phenomenon, for instance when there is evidence that a quantity belongs to disjoint sets. Possibly, limited objective information is available, for instance when a range of possible values is known. In most cases, however, information on uncertainties is subjective and based on some expert opinion. Others in literature refer to this uncertainty as reducible, epistemic or subjective uncertainty.

An error is defined as a recognisable deficiency in any phase of modelling or simulation that is not due to lack of knowledge. Recognisable means that the error should be identifiable through examination, and as such is not caused by lack of knowledge. This means that the error could be avoided by an alternative approach which is known to be more accurate, but which is possibly limited in practical applicability by computational cost or other practical considerations. A further distinction between acknowledged and unacknowledged errors is possible. Errors will not be considered further in this paper.

Figure 1 summarises the definitions in this section with their main characteristics in the context of the FE methodology.

Figure 1: Occurrence of variabilities, uncertainties and errors in the FE procedure

 

Discussion and extension of the definitions

The above definitions of uncertainty and variability are fairly straightforward and comprehensible. However, they are not mutually exclusive, since a variability could be subject to lack of knowledge when information on its range or likelihood within the range is missing. This is for instance the case for every design dimension subject to tolerances, but without further specification of manufacturing process or supplier. The tolerances represent the bounds on the feasible domain, but there is no information on the likelihood of the possible values within these bounds. Consequently, because there is a lack of knowledge, such a variability is also an uncertainty. It is referred to here as an uncertain variability. Some vague knowledge may be available ("the mean value is approximately $x$ ") but also nonspecificity may play an important role in the uncertainty, for instance in choosing an appropriate model to describe a random quantity. Opposed to the uncertain variability, a certain variability refers to a variability the range and likelihood of which are exactly known.

On the other hand, it appears logical to state that every property in a numerical model corresponding to a physical quantity is a variability, since it will eventually have a range of possible values and a likelihood inside this range in the physical model. This argumentation implies that all uncertainties are also variabilities. In practice however, the majority of model properties are implemented as constant deterministic values in the numerical model. Though they are subject to variation, the influence of their variability on the analysis result is considered to be negligible. Often, uncertainties refer to a possible lack of knowledge in these deterministic properties. This type of uncertainty is referred to as invariable uncertainty. Note that invariable in this case does not mean that the property cannot change over different analyses. According to the definition of uncertainty, it will change when additional information is gathered that decreases the amount of uncertainty. The invariable uncertainties typically occur in model properties for model parts that are difficult to describe numerically, but considered constant in the final physical product (connections, damping, ...). Other examples are design properties which have negligible variability but which are not defined exactly in an early design stage. Figure 2 gives a graphical illustration of the proposed subdivision of the definitions for uncertainty and variability.

The group of variabilities may be further subdivided into two categories. Inter-sample variability is the property of a population of nominally identical realisations of a particular product, with each individual element of the population possibly exhibiting scatter. Intra-sample variability is a property of one particular realisation (of which other realisations possibly exist) that exhibits one or more properties that may change over time, for instance due to temperature differences or ageing.

Figure 2: Classification of variabilities and uncertainties in numerical modelling

Numerical concepts for non-deterministic numerical modelling

Historically, the introduction of the non-probabilistic approaches for non-deterministic analysis has initiated a profound discussion in literature. On one side, some claim that the probabilistic approach is only a subcategory of a more universal non-probabilistic approach. Therefore, the latter would represent a more unified approach for non-deterministic analysis. On the other side, some argue that probabilistic methods are able to model anything the non-probabilistic approach can. The goal of this section is not to choose either side in this discussion, but merely to review the applicability of the non-probabilistic concepts from an objective point of view. Therefore, for each concept, its compatibility with the definitions of uncertainty and variability in the previous section is discussed. This discussion focusses on the ability to objectively represent the available information. In order to enable a critical review of the capabilities of the non-probabilistic concepts, this section first starts with a brief discussion on the main features of the probabilistic concept in the framework of uncertainty and variability modelling.

Evidence Theory [31] can be regarded as a generalisation that covers both the probabilistic and the non-probabilistic approaches for non-deterministic analysis, although the operations and inference rules in these theories are completely different. Evidence theory is regarded as a universal approach that can handle combinations of both probabilistic and non-probabilistic information in a single analysis. Its practical application is based on the availability of information in the form of basic belief assignments. The practical value of this approach therefore depends largely on the availability of the information in this form. While this theory is gaining interest in recent literature, its practical application in real-life engineering has yet to be proven. Therefore, it is left out of the current discussion. The following paragraphs briefly discuss the probabilistic concept, the interval concept and the fuzzy concept as stand-alone tools for the representation of non-deterministic parameters in engineering analysis. Finally, some hybrid non-deterministic numerical modelling concepts are briefly discussed.

 

Basic properties of the probabilistic concept

In the classical frequentist application of the probabilistic concept, the goal of a numerical property description is to define a domain of possible values this property can adopt, and to give information on the frequency of occurrence of the numerical values in this domain. This is typically done by defining a probability density function (PDF) over the domain of possible values.

Extensive literature exists on the subject of probability theory, treating a vast variety of PDFs and their applicability for description of random quantities. An overview of these can be found in [17] and [18].

In most available non-deterministic FEM software codes, the probabilistic concept is applied to describe both variabilities and uncertainties in a model. This is mainly due to the fact that there exists a large number of numerical analysis procedures based exclusively on probabilistic input quantities. Therefore, every non-deterministic quantity in a model is readily replaced by a probabilistic quantity by introducing an appropriate PDF. However, the probabilistic model does not necessarily represent the available objective information. For the study of the applicability of the probabilistic model, distinction between certain variabilities, uncertain variabilities and invariable uncertainties is necessary.

It is clear that the probabilistic concept is most appropriate to represent certain variabilities, since in the frequentist interpretation, the probabilistic description using a PDF is completely consistent with the definition of a variability as in section "Sources of numerical non-determinism in general FE modelling". The information on the range and the likelihood of a certain variability can be unambiguously incorporated in the PDF. Furthermore, the probabilistic outcome of the analysis will give an indication of the actual expected frequency of occurrence of the analysed phenomenon. It is, however, important that all information is available in order for the model to realistically represent the variability. For instance, if more than one variable property is present in the model, the correlation between the different variabilities might play an important role in the probabilistic analysis. Ideally, the joint PDF describing the likelihood and interdependence of all non-deterministic model properties is available. Since this is almost never the case, the probabilistic description of variability interdependence is generally limited to some moments of low order. Often, when cross correlations are unknown, the variabilities are assumed to be independent of one another.

For uncertain variabilities, a representation by a single random quantity is generally not sufficient. Freudenthal [19], one of the pioneers of probabilistic methods in engineering, states that "... ignorance of the cause of variation does not make such variation random." By this, he means that when crucial information on a variability is missing, it is not good practice to model it as a probabilistic quantity represented by a single PDF. On the contrary, in this case it is mandatory to apply a number of different probabilistic models to examine the effect of the chosen PDF on the result. For instance, when the range of the variability is known but the information on the likelihood is missing, all possible PDFs over the range should in principle be taken into consideration in the analysis. In practice, the analyst will generally select only a few probabilistic models that are considered consistent with the limited available information or most appropriate to obtain as much knowledge as possible on the result.
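
As a minimal sketch of this practice, the following Python snippet propagates three candidate distributions defined over the same range through the same model and compares the resulting response statistics. The response function (displacement of a unit-loaded spring), the range and the three candidate distributions are illustrative assumptions and do not stem from the cited literature.

```python
import random
import statistics

def response(stiffness):
    """Hypothetical model: static displacement of a spring under a unit load."""
    return 1.0 / stiffness

def propagate(sampler, n=20000, seed=0):
    """Monte Carlo estimate of the mean and standard deviation of the response."""
    rng = random.Random(seed)
    values = [response(sampler(rng)) for _ in range(n)]
    return statistics.fmean(values), statistics.pstdev(values)

A, B = 0.8, 1.2  # assumed known range of the stiffness

# Three candidate PDFs with identical support [A, B] (all assumptions).
candidates = {
    "uniform": lambda rng: rng.uniform(A, B),
    "triangular": lambda rng: rng.triangular(A, B, 0.5 * (A + B)),
    # Beta(2, 5) rescaled to [A, B]: skewed towards the lower bound.
    "skewed beta": lambda rng: A + (B - A) * rng.betavariate(2.0, 5.0),
}

results = {name: propagate(sampler) for name, sampler in candidates.items()}
for name, (mean, std) in results.items():
    print(f"{name:12s} mean={mean:.4f} std={std:.4f}")
```

Although all three inputs share the same range, the response statistics differ, which is why a single arbitrarily chosen PDF can be misleading for an uncertain variability.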

Most often, invariable uncertainties are represented by random quantities in probabilistic analysis. As such, the analyst tries to express the lack of knowledge of the property. This means that some PDF is chosen which to the knowledge of the analyst represents best the uncertain nature of the quantity, but which is not based on available objective information. It is clear that in this case, the information contained in the random quantity does not represent the actual variation of the quantity in the final product, since by definition, the invariable uncertainties are considered to be constant. The random quantity in this case merely represents the presumed likelihood that a model parameter will adopt a value. As such, the lack of knowledge is filled by subjective information provided by the analyst, expressed in the form of a PDF. This is sometimes referred to as a subjective PDF. In this context, Bayesian methods are becoming increasingly popular for the modelling of subjective uncertainty. The main advantage of using the probabilistic approach for subjective uncertainty modelling is that the available probabilistic procedures can be readily applied for the analysis. It should be kept in mind, however, that the main strength of the Bayesian approach is its capability of incorporating objective information that becomes gradually available. When this is not the case, the Bayesian approach remains a fully subjective representation of reality.

At this point, it is very important to emphasise the consequences of the difference in the use of the probabilistic concept for variabilities on the one hand, and invariable uncertainties on the other hand. The former represents inter-sample or intra-sample variability for the final product, while the latter clearly may not be interpreted in this sense. Consequently, when interpreting the results of a probabilistic analysis based on both uncertainties and variabilities, it is imperative to distinguish between the different meanings attached to both. Though this may seem straightforward, neglecting this distinction is a very common mistake in probabilistic uncertainty analysis.

 

Basic properties of the interval concept

Recent developments in interval arithmetic are mainly based on the work of Moore [20], who introduced interval vectors and matrices and the first non-trivial applications. By definition, an interval scalar consists of a single continuous domain. The range is bounded by a lower and an upper bound. Combining interval numbers into vectors or matrices is generally done by simply treating all component intervals independently. This means that all entries are implicitly assumed to be mutually independent quantities. This has very important consequences for the use of the interval concept in FE analysis, since there is generally a strong dependency between FE matrix coefficients and right-hand side coefficients. Neglecting this dependency results in the implicit introduction of conservatism into the analysis.
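
The conservatism caused by neglected dependency can be illustrated with a minimal interval arithmetic sketch in Python. The `Interval` class and the test function $g(k) = k^2 - 2k$ are hypothetical and only serve as illustration; they are not taken from [20].

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __sub__(self, other):
        # Worst-case combination: operands are treated as independent.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

k = Interval(2.0, 3.0)

# k - k is exactly zero for any real k, but interval arithmetic treats the
# two occurrences of k as independent quantities.
naive_difference = k - k

# g(k) = k*k - 2*k evaluated with interval arithmetic ...
naive_g = k * k - Interval(2.0, 2.0) * k

# ... versus the actual range, found here by sampling k over [2, 3].
samples = [2.0 + i / 1000.0 for i in range(1001)]
values = [x * x - 2.0 * x for x in samples]
exact_g = Interval(min(values), max(values))

print(naive_difference)  # wider than the exact result [0, 0]
print(naive_g, exact_g)  # the naive range encloses the exact range
```

The interval result always encloses the exact range, but it is wider whenever the same quantity occurs more than once in the expression.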

The information represented by an interval object depends on the type of modelled non-deterministic quantity. Also here, distinction between certain variabilities, uncertain variabilities and invariable uncertainties is necessary.

For certain variabilities, the input interval objects are derived from the support of the corresponding input PDFs. Consequently, the result of an interval analysis only represents the actual range of the variable outcome of the analysis. The available information on the likelihood inside the range is lost, which is an important disadvantage. Especially for a variability with a justifiable PDF support that is very large, using the support as input for the interval analysis will generally result in an extremely wide output interval. While it is theoretically correct to state that the final result will range over this output interval, disregarding the low probability of the PDF tails in this case strongly devalues the interval analysis.

When the upper and lower bounds of a non-deterministic property are well-defined but information on the type of the distribution is missing, it belongs to the class of uncertain variabilities. In this case, the interval model represents perfectly the available information. However, especially for variabilities with a very large PDF support, the determination of the corresponding interval bounds is not always unambiguous, since the probability of the values that are located in the tails of the commonly applied PDFs with large support is typically very low. If these tails cannot be justified adequately with experimental data, there is no reason to unconditionally use the PDF support for the interval analysis. In this case, the analyst should implement the bounds which are considered realistic with respect to the available experimental data. Often, the $3\sigma$-bounds are assumed to be realistic interval bounds. This conversion does not necessarily reduce the truthfulness of the uncertainty representation when there is little information on the actual tails of the PDF. Still, if the tails of the PDF are expected to have little probability, the impact of the subjective interval bounds on the interval analysis result is much larger than the impact of subjective PDF support limits on the probabilistic analysis result. Therefore, variabilities with unknown PDF support but a well-known normal-like behaviour near the centre of the PDF are best modelled probabilistically.
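
As a small worked example of the $3\sigma$ convention, the snippet below derives the interval bounds for a normally distributed quantity and evaluates the probability mass that the truncation discards. The numerical values (a mean of 210 and a standard deviation of 7) are purely hypothetical.

```python
import math

def normal_cdf(x, m, sigma):
    """CDF of a normal distribution with mean m and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - m) / (sigma * math.sqrt(2.0))))

def sigma_bounds(m, sigma, n_sigma=3.0):
    """Interval [m - n*sigma, m + n*sigma] used as subjective interval bounds."""
    return m - n_sigma * sigma, m + n_sigma * sigma

m, sigma = 210.0, 7.0  # hypothetical mean and standard deviation
lo, hi = sigma_bounds(m, sigma)

# Probability mass of the normal PDF that falls outside the interval.
outside = 1.0 - (normal_cdf(hi, m, sigma) - normal_cdf(lo, m, sigma))
print(f"interval = [{lo}, {hi}], probability outside = {outside:.5f}")
```

For a normal distribution, roughly 0.27 % of the probability mass lies outside the $3\sigma$-bounds, which indicates how much of the unbounded support is disregarded by this choice.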

For invariable uncertainties, generally a subjective interval is required. In this case, care should be taken not to interpret the interval quantity as the actual range in the physical product. It merely represents the values the analyst considers possible at the time the analysis is performed. Therefore, similar to the application of the probabilistic concept for invariable uncertainties, it is important to acknowledge the subjectivity in the result of the analysis. However, since the interval concept requires less subjective information to be added to the problem description, there is less room for misinterpretation of the results.

To conclude, we can state that the probabilistic concept remains the most valuable for the representation of certain variabilities and uncertain variabilities with unknown support but known normal-like behaviour. The omission of a known PDF through the interval concept can only be justified when probabilistic information is not required, or the computational cost of the interval analysis is significantly lower. The interval concept is most valuable when dealing with uncertain variabilities with known support but unknown distribution, or invariable uncertainties.

 

Basic properties of the fuzzy concept

The theory of fuzzy logic was introduced by Zadeh [21] in 1965, and has gained an increasing popularity during the last two decades. Its most important property is that it is capable of describing linguistic and, therefore, incomplete information in a non-probabilistic manner.

A fuzzy set can be interpreted as an extension of a classical set. Where a classical set clearly distinguishes between members and non-members of the set, the fuzzy set introduces a degree of membership, represented by the membership function. This membership function describes the grade of membership to the fuzzy set for each element in the domain. Figure 3 shows the membership functions of some typical normal fuzzy numbers.

Figure 3: Some typical membership functions that describe linguistic variables
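
A membership function of the kind described above can be sketched in a few lines of Python. The triangular shape and the numerical values are assumptions chosen for illustration. The snippet also evaluates an $\alpha$-cut, a standard construction (not discussed in the text above) that returns the interval of all values with a membership grade of at least $\alpha$.

```python
def triangular_membership(x, a, c, b):
    """Membership grade of x in a triangular fuzzy number.

    The support is [a, b] and the core (grade 1) is the single value c,
    with a < c < b assumed.
    """
    if x <= a or x >= b:
        return 0.0
    if x <= c:
        return (x - a) / (c - a)  # increasing branch
    return (b - x) / (b - c)      # decreasing branch

def alpha_cut(alpha, a, c, b):
    """Interval of all values whose membership grade is at least alpha."""
    return a + alpha * (c - a), b - alpha * (b - c)

a, c, b = 1.0, 2.0, 4.0  # hypothetical fuzzy number "approximately 2"

grades = [triangular_membership(x, a, c, b) for x in (0.5, 1.5, 2.0, 3.0, 4.5)]
half_cut = alpha_cut(0.5, a, c, b)
print(grades)    # grades rise from 0 to 1 at the core and fall back to 0
print(half_cut)  # interval of values with grade >= 0.5
```

At $\alpha = 0$ the cut coincides with the support of the fuzzy number, and at $\alpha = 1$ it collapses onto the core, so a classical interval is recovered as a special case.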

While the concept of fuzzy logic was introduced in 1965, practical applications emerged mainly during the last two decades. The works of Dubois and Prade [22, 23] contributed to a large extent to this evolution. The concept has been most successful in the application to controller design, known as fuzzy control [24].

Zadeh [25] extended the theory of fuzzy sets to a basis for reasoning with possibility. In this interpretation, the membership function is considered as a possibility distribution function, providing information on the values that the described quantity can adopt. More generally, the possibility is defined as a subjective measure that expresses the degree to which the analyst considers that an event can occur. It provides a system for defining intermediate possibilities between strictly impossible and strictly possible events. Through this interpretation, the fuzzy concept has become a tool to model subjective knowledge numerically in a non-probabilistic concept. This has drawn the attention of the numerical community, since knowledge of uncertainties in a numerical model is commonly based on expert opinion. This has led to the first attempts to use the fuzzy concept in a non-deterministic framework, resulting in some applications in structural optimisation under uncertainty [26,27]. It has also initiated the development of the Fuzzy Finite Element Method (FFEM) for numerical analysis of non-deterministic models [28].

However, the application of the fuzzy concept for non-deterministic numerical modelling is not straightforward. The main problem of representing a model property through a fuzzy set is that the membership function does not relate to an objectively measurable quantity. The level of membership that is assigned to different members of a fuzzy set is completely based on the subjective beliefs of the analyst. Therefore, the fuzzy results obtained from the analysis will also be biased by the subjective input. Hence, these results may only be interpreted in reference to the assumed fuzzy input. This poses an important restriction on the use of the fuzzy approach for numerical design validation purposes.

For a fuzzy representation of certain variabilities, the known PDF has to be converted to a compatible membership function. A number of methods have been developed for this purpose [23,29]. The basic law for the conversion follows from the consistency principle, which states that the degree of possibility of an event is greater than or equal to its degree of probability. These conversion techniques always rely on some sort of subjective judgement. It is the authors' opinion that forcing the application of fuzzy sets into the domain of certain variabilities through a conversion of PDFs as described above is rather irrational. Available objective probabilistic data is replaced by a subjective description, resulting in the loss of very valuable information. This loss is generally unjustifiable. Therefore, the conversion of a PDF to a membership function should not be done.

For uncertain variabilities, the fuzzy concept can be used for a hybrid uncertainty model. It stems from an alternative interpretation of a possibility distribution introduced by Dubois and Prade [30] based on the Evidence Theory. In this approach, a fuzzy number is used to represent a class of random quantities that have a cumulative distribution function (CDF) in between boundaries derived directly from the possibility distribution. The left boundary on the compatible CDFs coincides with the increasing branch of the fuzzy number. The right boundary coincides with the complement of the decreasing branch of the fuzzy number. Figure 4 clarifies this approach. In this concept, the possibilistic approach becomes a tool to simultaneously examine the effect of a set of different PDFs in a single analysis. While the ability of this concept to model classes of probabilistic data seems extremely powerful, it has only been applied very rarely in uncertainty analysis.

Figure 4: Possibility distribution of a fuzzy number and corresponding lower and upper boundaries for CDFs compatible with the fuzzy number
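
The construction of the two CDF boundaries can be sketched as follows for a triangular fuzzy number. The triangular shape, the numerical values and the uniform distribution used as a check are assumptions for illustration only; the check is limited to the CDF boundaries described above.

```python
def membership(x, a, c, b):
    """Triangular possibility distribution with support [a, b] and core c."""
    if x <= a or x >= b:
        return 0.0
    if x <= c:
        return (x - a) / (c - a)
    return (b - x) / (b - c)

def cdf_bounds(x, a, c, b):
    """Lower and upper boundary on F(x) = P(X <= x) for compatible CDFs."""
    mu = membership(x, a, c, b)
    if x <= c:
        # Left boundary: the increasing branch of the fuzzy number.
        return 0.0, mu
    # Right boundary: the complement of the decreasing branch.
    return 1.0 - mu, 1.0

def uniform_cdf(x, a, b):
    """CDF of a uniform distribution on [a, b], used as a candidate."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

a, c, b = 1.0, 2.0, 4.0  # hypothetical fuzzy number
points = [a + (b - a) * i / 60.0 for i in range(61)]
bounds = [cdf_bounds(x, a, c, b) for x in points]

# The uniform CDF on the support stays between both boundaries everywhere.
inside = all(lo <= uniform_cdf(x, a, b) <= hi
             for x, (lo, hi) in zip(points, bounds))
print(inside)
```

Every CDF that stays between the two boundaries belongs to the class represented by the fuzzy number, so a single possibilistic analysis covers a whole family of probabilistic descriptions.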

Finally, an invariable uncertainty requires a fuzzy set that represents the subjective expectation of the analyst. When the invariable uncertainty represents an open design decision subject to optimisation, the analyst can express a preference for certain values of the quantity through the possibility distribution. Still, when interpreting the results, reference to the chosen input membership functions is imperative.

Considering the explicit subjective nature of a fuzzy set, it is concluded that it is most useful to describe uncertainties. The more objective information becomes available on a non-deterministic model property, the less the fuzzy concept is appropriate to describe it.

 

Hybrid non-deterministic modelling concepts

If for an uncertain variability there is information about the range but no information on the probability of occurrence, every PDF over this range becomes equally plausible and should be taken into consideration. However, the information on the likelihood inside the range is not needed for the interval object, which makes it perfectly suited to model this kind of uncertainty. An interval property can consequently be interpreted as a collective description of all possible probability density functions over the considered interval. For uncertain variabilities without objective information on the actual range, a subjective interval has to be chosen in order to apply the interval concept. For special cases, however, it might be possible to introduce a parametrical representation of the PDF of an uncertain variability, using the unknown PDF properties as parameters. For instance, it might be known that the quantity is normally distributed, but the mean value and standard deviation of the distribution are not exactly known. A parametrical PDF is introduced:

$f_X(x,m,\sigma) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left( - \frac{(x-m)^2}{2\sigma^2} \right)$ (1)

with $m$ and $\sigma$ two parameters representing the mean value and standard deviation of the distribution. The interval or convex concept can then be applied to represent the uncertainty on the parameters of the probabilistic description. The resulting numerical description of the quantity takes into account all possible PDFs that can be generated using the limited probabilistic knowledge and added non-probabilistic uncertainty description. Therefore, it is called a hybrid model of uncertainty. Elishakoff et al. [51] introduced this hybrid concept for the uncertainty description by applying the concept of convex modelling to parameters in a probabilistic description of seismic excitation. They obtained the worst and best case response taking into consideration a large set of plausible probabilistic excitation functions through a single analysis.
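
A minimal sketch of such a hybrid description, assuming the normal PDF of equation (1) with interval-valued $m$ and $\sigma$, is given below. It bounds the probability that the quantity exceeds a threshold by a simple grid search over both intervals. The grid search, the threshold and all numerical values are illustrative assumptions and do not reproduce the convex modelling procedure of [51].

```python
import math

def normal_cdf(x, m, sigma):
    """CDF of the normal PDF of equation (1)."""
    return 0.5 * (1.0 + math.erf((x - m) / (sigma * math.sqrt(2.0))))

def exceedance_bounds(threshold, m_interval, sigma_interval, n_grid=51):
    """Best and worst case P(X > threshold) over interval-valued m and sigma.

    A plain grid search is used, which only approximates the true bounds
    when the extremes do not lie on a grid point.
    """
    m_lo, m_hi = m_interval
    s_lo, s_hi = sigma_interval
    probabilities = []
    for i in range(n_grid):
        m = m_lo + (m_hi - m_lo) * i / (n_grid - 1)
        for j in range(n_grid):
            sigma = s_lo + (s_hi - s_lo) * j / (n_grid - 1)
            probabilities.append(1.0 - normal_cdf(threshold, m, sigma))
    return min(probabilities), max(probabilities)

# Hypothetical data: mean in [9.5, 10.5], standard deviation in [0.8, 1.2].
p_min, p_max = exceedance_bounds(13.0, (9.5, 10.5), (0.8, 1.2))
print(f"P(X > 13) lies between {p_min:.2e} and {p_max:.2e}")
```

In this example the threshold lies above every admissible mean value, so the exceedance probability increases with both parameters and the extremes occur at the corners of the parameter box. This does not hold in general, which is why the whole grid is searched.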

The key idea of an alternative hybrid approach is to search for an upper and lower bound on the probabilistic reliability using the theory of Interval Probabilities. The method is based on the work of Ditlevsen [52] and has been successfully applied to describe bounds on the reliability of a system for which the interdependencies between the random quantities are unknown [53].
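
The idea of bounding a reliability when interdependencies are unknown can be illustrated with the elementary first-order bounds on the failure probability of a series system, which fails as soon as one component fails. This is only a sketch of the general principle with hypothetical component failure probabilities; it is not necessarily the bounding technique used in [52] or [53].

```python
def series_failure_bounds(p_fail):
    """Bounds on the system failure probability for any dependence structure.

    The lower bound corresponds to fully dependent failure events, the upper
    bound to mutually exclusive failure events.
    """
    lower = max(p_fail)
    upper = min(1.0, sum(p_fail))
    return lower, upper

def independent_series_failure(p_fail):
    """System failure probability if all failure events are independent."""
    survival = 1.0
    for p in p_fail:
        survival *= 1.0 - p
    return 1.0 - survival

p_components = [0.01, 0.02, 0.005]  # hypothetical component failure probabilities
lower, upper = series_failure_bounds(p_components)
p_independent = independent_series_failure(p_components)
print(f"{lower:.4f} <= P_f <= {upper:.4f}, independent case: {p_independent:.4f}")
```

The commonly assumed independent case is only one point inside the interval; the width of the interval reflects the missing knowledge on the interdependencies.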

Finally, the concept of Fuzzy Randomness is also clearly a hybrid approach. It adopts principles from both probabilistic and fuzzy theory. This concept allows imprecision in the values that are assumed by a random variable. The fuzzy theory is used to represent this imprecision. The translation of the classical properties of random numbers (e.g. mean value, variance) to this hybrid approach has been established in an early paper by Kwakernaak [49]. Recently, the application of this theory in civil engineering has been gaining interest in the literature [50].