4.3 Analytic Development of Availability and Maintainability in Engineering Design

[Fig. 4.16 Flowchart of the Monte Carlo simulation procedure (Law et al. 1991). The flowchart nests loops executed H, N and M times over the following steps: a) generate samples of $x_1$ and $x_2$; b) generate samples of $\varepsilon_1(x_1)$; c) calculate the output of simulation model I, $y = F_1(x_1) + \varepsilon_1(x_1)$; d) generate samples of $\varepsilon_2(x_2, y)$; e) calculate the output of simulation model II, $z = F_2(x_2, y) + \varepsilon_2(x_2, y)$; f) obtain the c.d.f. or p.d.f. of $z$.]

The Monte Carlo simulation approach generates statistical estimates of the system output based on the given distributions of the inputs and error models. This gives more information than the extreme condition approach, by which only the best and worst performance are estimated. Because the statistical approach is based on Monte Carlo simulation, it often requires a large number of simulations. More effective sampling techniques, such as the Latin hypercube and fractional factorial design, can be used to reduce the number of simulations (Hicks 1993).

The Monte Carlo simulation procedure is as follows (Law et al. 1991):

i) Generate H samples of $x_1$ and $x_2$ as simulation inputs based on their distribution functions.
ii) For each given $x_1$, calculate the distribution parameters of the internal uncertainty $\varepsilon_1(x_1)$ for simulation model I, and generate N samples of the internal uncertainty $\varepsilon_1$ based on the distribution function.
iii) Evaluate the corresponding output $y = F_1(x_1) + \varepsilon_1(x_1)$ for simulation model I.
iv) For each $y$, calculate the distribution parameters of the internal uncertainty $\varepsilon_2(x_2, y)$ of simulation model II, and generate M samples of the internal uncertainty $\varepsilon_2$ based on the distribution function.
v) Evaluate the corresponding output $z = F_2(x_2, y) + \varepsilon_2(x_2, y)$ for simulation model II.
vi) Calculate the mean value $\mu_z$ and the standard deviation $\sigma_z$, or the c.d.f. and p.d.f. of $z$, based on the $H \times M \times N$ samples of $z$.
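The six steps of the procedure can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure of Law et al. as implemented in any particular tool: the function name `propagate`, its parameters, and the choice of zero-mean normal error models whose standard deviations are supplied by the caller are all assumptions made for the sketch.

```python
import random
import statistics


def propagate(f1, f2, sample_x, eps1_sigma, eps2_sigma, H=50, N=20, M=20, seed=0):
    """Nested Monte Carlo propagation through two chained simulation models.

    Assumption: both internal uncertainties are zero-mean normal, with
    standard deviations eps1_sigma(x1) and eps2_sigma(x2, y).
    """
    rng = random.Random(seed)
    z_samples = []
    for _ in range(H):                                  # step i: external inputs
        x1, x2 = sample_x(rng)
        for _ in range(N):                              # step ii: error of model I
            e1 = rng.gauss(0.0, eps1_sigma(x1))
            y = f1(x1) + e1                             # step iii: output of model I
            for _ in range(M):                          # step iv: error of model II
                e2 = rng.gauss(0.0, eps2_sigma(x2, y))
                z_samples.append(f2(x2, y) + e2)        # step v: output of model II
    # step vi: summary statistics over the H x N x M samples
    return statistics.mean(z_samples), statistics.stdev(z_samples), z_samples


def empirical_cdf(samples, z):
    """Fraction of samples less than or equal to z (an estimate of the c.d.f.)."""
    return sum(1 for s in samples if s <= z) / len(samples)
```

The triple loop makes the cost of the statistical approach explicit: every additional sample of the external inputs triggers N × M evaluations of model II, which is why more economical sampling schemes are attractive.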
d) Mitigating the Effect of Uncertainty

To assist designers in making reliable design decisions under uncertainty, the proposed technique for propagating the effect of uncertainties is integrated with the multidisciplinary optimisation approach based on the principles of robust design, i.e. by extending the quality engineering concept to the mitigation of the effects of both external and internal uncertainties. From the viewpoint of robust design, the goal is to make the system (or product) least sensitive to potential variations without eliminating the sources of uncertainty (Taguchi 1993).

The same concept is used here to reduce the impact of both external and internal uncertainties associated with the simulation programs. The robust optimisation objectives are achieved by simultaneously optimising mean performance and reducing performance variation, subject to the constraints brought about by their deviations. Taguchi's robust design has been used in the past for mitigating the effect of parameter uncertainty, which is similar to the external uncertainty considered here. This concept is extended to mitigate the effect of model structure uncertainty (internal uncertainty).

For the extreme condition approach, the robust design model can be formulated as:

    Given:       Parameter and model uncertainties (ranges)                 (4.155)
    Find:        Robust design decisions $x$
    Subject to:  System constraints: $g_{\text{worst}}(x) \le 0$
    Objectives:  i) Optimise the mean of system attributes: $a(x)$
                 ii) Minimise the deviation of system attributes: $\Delta a(x)$

In the above model, $g_{\text{worst}}(x)$ is the maximum constraint function estimated by the worst case of the constraint function $g(x)$, and $a(x)$ is the objective vector. Both $g(x)$ and $a(x)$ are subsets of the system output vector $z$. The mean and deviation of the system outputs can be obtained by the extreme condition approach as introduced earlier.
This constitutes the necessary multiple objectives in robust design (i.e. both the mean and the deviation of the system are expected to be minimised, with the assumption that optimising the mean of a system attribute can always be transformed into a minimisation problem). The general form of the objective can be expressed as:

$$\min\,[\,a(x),\ \Delta a(x)\,] \tag{4.156}$$

Many existing approaches can be used to solve this multi-objective robust optimisation problem. In the above model, worst-case analysis is used to formulate the constraints. Worst-case analysis assumes that all fluctuations may occur simultaneously in the worst possible combination (Parkinson et al. 1993). The effect of variations on a function is estimated using a first-order Taylor series as follows:

$$\Delta g(x) = \sum_{i} \left| \frac{\partial g(x)}{\partial x_i} \right| \Delta x_i \tag{4.157}$$

where $\Delta g(x)$ represents the variation transmitted to the constraint $g(x)$ for a worst-case analysis, and $\Delta x_i$ is the variation of the design variable $x_i$. The design feasibility expressed in Eq. (4.155) can be formulated by increasing the value of the mean $g(x)$ by the functional variation $\Delta g(x)$:

$$g_{\text{worst}}(x) = g(x) + \sum_{i} \left| \frac{\partial g(x)}{\partial x_i} \right| \Delta x_i \tag{4.158}$$

For the statistical approach to estimating the performance distribution, the robust model can be formulated as:

    Given:       Parameter and model uncertainties (distributions)          (4.159)
    Find:        Robust design decisions $x$
    Subject to:  System constraints: $P[g(x) \le 0] \ge P_{\text{limit}}$
    Objectives:  i) Optimise the mean of system attributes $a(x)$: $\mu_a(x)$
                 ii) Minimise the standard deviation of system attributes $a(x)$: $\sigma_a(x)$

$\mu_a(x)$ and $\sigma_a(x)$ are the estimates of the mean and standard deviation of the system outputs respectively. The constraints in the above model are expressed in a probabilistic formulation: $P[g(x) \le 0]$ is the probability of constraint satisfaction, and it should be greater than or equal to the defined probability limit $P_{\text{limit}}$.
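The worst-case estimate of Eqs. (4.157) and (4.158) can be sketched numerically as follows. The sketch assumes that the sensitivities are taken in absolute value, so that all variations add in the worst direction, and approximates the partial derivatives by central finite differences; the function name `worst_case_constraint` and the step size `h` are illustrative choices, not part of the source.

```python
def worst_case_constraint(g, x, dx, h=1e-6):
    """First-order worst-case estimate of a constraint g at design point x.

    x  : list of design variable values
    dx : list of variations (delta x_i) of the design variables
    Returns (g(x), delta_g, g_worst), where g_worst = g(x) + delta_g.
    Assumption: partial derivatives are approximated by central differences.
    """
    g0 = g(x)
    delta_g = 0.0
    for i, (xi, dxi) in enumerate(zip(x, dx)):
        step = h * max(1.0, abs(xi))          # scale the step to the variable
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] = xi + step
        x_minus[i] = xi - step
        dg_dxi = (g(x_plus) - g(x_minus)) / (2.0 * step)
        delta_g += abs(dg_dxi) * abs(dxi)     # worst-case contribution of x_i
    return g0, delta_g, g0 + delta_g
```

A design is then accepted as feasible under the extreme condition approach when the third returned value is non-positive, as required by the constraint in Eq. (4.155).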
Because it is computationally expensive to evaluate the probability of constraint satisfaction, alternative formulations, for example the moment matching method, are used in practice to evaluate the constraints. With the moment matching method, $g(x)$ is assumed to follow a normal distribution (Parkinson et al. 1993). The constraint in Eq. (4.159) then becomes (Parkinson et al. 1993):

$$\mu_g(x) + k\,\sigma_g(x) \le 0 \tag{4.160}$$

where $\mu_g(x)$ and $\sigma_g(x)$ are the mean and standard deviation of the constraint function, and $k$ is the constant that sets the probability of constraint satisfaction. For example, $k = 1$ corresponds to a probability of approximately 0.8413, and $k = 2$ to a probability of approximately 0.9772.

Based on the previous considerations, the strategy that integrates the propagation and mitigation of the effect of uncertainties is summarised in Fig. 4.17. Module A is the uncertainty quantification module that represents the first stage in the integrated methodology. Module B is the propagation module. In this module, either the extreme condition approach or the statistical approach is used to identify the range, or to estimate the population parameters, of system performance under the influence of both internal and external uncertainties. The performance ranges or estimated population parameters are then used to mitigate the effect of uncertainties. The basis for controlling the effect of uncertainties is the robust design approach formulated in Eqs. (4.155) and (4.159).

[Fig. 4.17 Propagation and mitigation strategy of the effect of uncertainties (Parkinson et al. 1993). Module A (uncertainty quantification) supplies the external uncertainties, either as ranges $\Delta x_1$ and $\Delta x_2$, i.e. $[x_{1\min}, x_{1\max}]$ and $[x_{2\min}, x_{2\max}]$, or as the c.d.f. of $x_1$ and $x_2$, together with the internal uncertainties $\varepsilon_1(x_1)$ and $\varepsilon_2(x_2, y)$. Module B (propagation) uses optimisation or simulation to return the range $[z_{\min}, z_{\max}]$ or the parameters $(\mu_z, \sigma_z)$.]
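The relation between the constant k in Eq. (4.160) and the probability of constraint satisfaction follows from the standard normal c.d.f. The short sketch below reproduces the two values quoted in the text and evaluates the moment matching constraint for a given mean and standard deviation of g(x); the function names are hypothetical and the normality of g(x) is the assumption stated above.

```python
import math


def normal_cdf(k):
    """Standard normal c.d.f., i.e. the probability level implied by k."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))


def moment_matching_feasible(mu_g, sigma_g, k):
    """Check the moment matching constraint mu_g + k * sigma_g <= 0."""
    return mu_g + k * sigma_g <= 0.0


if __name__ == "__main__":
    for k in (1, 2):
        print(k, round(normal_cdf(k), 4))
```

Choosing a larger k tightens the constraint: a design with a mean constraint value of -1.0 satisfies the check for k = 2 only if the standard deviation does not exceed 0.5.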
The process to manage the effect of uncertainty is iterative and involves repeated uncertainty analysis until a robust optimal solution is obtained.

4.3.2 Analytic Development of Availability and Maintainability Assessment in Preliminary Design

Techniques selected for further development of availability and maintainability assessment to determine the integrity of engineering design in the preliminary or schematic design phase of the engineering design process include the application of Petri nets (PN). The techniques selected are considered under the following topics:

i. Maximising design availability using Petri net models
ii. Designing for availability using Petri net modelling.

Designing for availability with preventive maintenance

Analytic assessment of large complex process systems has increasingly become an integral part of the engineering design process, particularly in designing for availability and maintainability, and even more so with the inclusion of complex interactions such as the effect of preventive maintenance on system availability. Preventive maintenance is considered one of the key factors in increasing system reliability, availability and productivity, and in reducing production costs. The importance of including maintenance in engineering design has led to increased sophistication in the mathematical models required to analyse its impact on complex system behaviour (Lam et al. 1994).

A quantitative example of designing for availability with the inclusion of preventive maintenance is developed. The designed system starts in a working state, but ages with time and eventually fails if no preventive maintenance action is carried out. Preventive maintenance is performed at fixed intervals from the start-up of the system in the operational state.
The preventive maintenance activity takes an exponentially distributed amount of time and takes the form of component renewal, which is assumed to restore full system performance. The preventive maintenance interval is thus a critical design parameter. If the interval approaches zero, the system is always under maintenance and availability drops to zero. Conversely, if the interval becomes too large, the beneficial effect of the preventive maintenance action becomes negligible. The goal of the example is to develop an analytic expression for the steady-state behaviour of a complex system using Petri net (PN) methodology, and to determine the optimal design maintenance interval that will maximise system availability.

4.3.2.1 Maximising Design Availability Using Petri Net Models

Petri net models have only recently gained widespread acceptance. They provide a graphical language ideally suited to modern CAD environments that can be concise in its specification; they provide a natural way to present complex logical interactions among integrated systems, or process activities within a system; and they are closer to a designer's intuition about what a complex systems model should look like (Peterson 1981; Murata 1989). Many structural and stochastic extensions have been proposed in the application of Petri nets to increase their modelling power and their capability to represent large, complex integrated systems. The most up-to-date and valuable source of references for the theoretical development and application of Petri net models is the series of international workshops known as Petri Nets and Performance Models (PNPM), initiated in Italy in 1985, and which moved to the USA, Japan, Australia and France in the following decades.
a) Petri Net Theory

Petri nets have been used as mathematical and graphical tools for modelling and analysing systems showing dynamic behaviours characterised by synchronous and distributed operation, as well as non-determinism (Peterson 1981). A basic Petri net structure consists of places and transitions interconnected by directed arcs. Places are denoted by circles and represent conditions, while transitions are denoted by bars and represent events. The directed arcs in a Petri net represent flow of control, where the occurrence of events is controlled by a set of conditions.

In addition to its graphical structure, a Petri net is effectively used to simulate the dynamic behaviour of a modelled system in terms of states, or markings, and their changes during model execution. A marking is an assignment of tokens to the places, where a token denotes that the corresponding condition is true. Thus, the marking of places describes the current state of the Petri net in terms of the conditions that are true and those that are false. The translation of a flowchart to a Petri net is illustrated in Fig. 4.18, where the nodes of the flowchart are replaced by transitions in the Petri net, and the arcs are replaced by places.

The Petri net execution changes the number and location of tokens according to a rule of transition enabling and firing (Murata 1989), where:

• a transition t is enabled if each input place p is marked with at least w(p,t) tokens, where w(p,t) is the weight of the arc from p to t;
• an enabled transition may or may not fire, depending on whether or not the event actually takes place;
• an enabled transition t is fired by removing w(p,t) tokens from each input place p and adding w(t,p) tokens to each output place p.
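The enabling and firing rule can be made concrete with a small sketch. The representation used here is an assumption chosen for brevity: a marking is a dictionary from place names to token counts, and a transition is described by two dictionaries giving the arc weights w(p,t) of its input places and w(t,p) of its output places.

```python
def is_enabled(marking, inputs):
    """A transition is enabled if every input place p holds at least w(p,t) tokens."""
    return all(marking.get(p, 0) >= w for p, w in inputs.items())


def fire(marking, inputs, outputs):
    """Fire an enabled transition and return the new marking.

    Removes w(p,t) tokens from each input place and adds w(t,p) tokens to
    each output place; the original marking is left unchanged.
    """
    if not is_enabled(marking, inputs):
        raise ValueError("transition is not enabled in this marking")
    new_marking = dict(marking)
    for p, w in inputs.items():
        new_marking[p] = new_marking.get(p, 0) - w
    for p, w in outputs.items():
        new_marking[p] = new_marking.get(p, 0) + w
    return new_marking
```

For example, with the marking `{"p1": 2, "p2": 0}` and a transition that consumes two tokens from `p1` and produces one token in `p2`, firing yields `{"p1": 0, "p2": 1}`, after which the transition is no longer enabled.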
[Fig. 4.18 Translation of a flowchart to a Petri net (Peterson 1981). Flowchart elements: read a, b, c; test a > 0 (branches T and F); a = a − 1; c = b·a; write c. Petri net elements: transitions A, B.T, B.F, C and D; places $p_1$ to $p_5$.]

Petri nets represent a powerful paradigm, useful for modelling complex systems in the context of systems performance, in designing for availability subject to preventive maintenance strategies that include complex interactions such as component renewal. Such interactions are time-related and dependent upon component age and estimated residual life. However, original Petri nets did not carry any notion of time. Thus, in order to make the technique useful for quantitative systems analysis in engineering design, a variety of time extensions have been proposed in the literature. The distinguishing features of these time extensions are whether the duration of the events should be modelled by deterministic or random variables, and whether the time is associated with process functions or with transitional events.

Petri nets in which the timing is stochastic are referred to as stochastic Petri nets (SPN), and the most common assumption is that time is assigned to the duration of transitional events. The time evolution of an SPN is expressed as a stochastic process, and referred to as its marking process. SPN can be used to automatically generate the underlying marking process, which can then be analysed to yield results in terms of the original Petri net model. This is a case where a user-level representation of complex systems, typically in the form of simulation models (such as the process equipment models (PEMs) developed in Sect. 4.4), is translated into an analytic representation that is processed and the results referred back to the user-level representation.

The most common assumption in the literature is to assign to the PN transitions an exponentially distributed firing time (i.e.
start to completion time of an activity), so that the resulting marking process is a continuous-time Markov chain (CTMC; Molloy 1982). Almost all the PN-based tools are based on this assumption. In principle, simple equations can be derived for both transient and steady-state analysis of CTMCs. However, practical limitations arise from the fact that the state space (i.e. the composition of different states of a system and its transition interaction of moving from state i to state j, including the probability of such a transition) grows much faster than the number of components in the system being modelled.

The use of an exponentially distributed firing time has been regarded as a restriction in the application of PN models, as there are many engineering processes with times to occurrence that are not exponentially distributed. The hypothesis of exponential distributions in these cases results in the construction of models that give only a qualitative, rather than quantitative, analysis of real systems. The existence of deterministic or other non-exponentially distributed events in engineering processes, such as start-up delays and pre-planned downtimes in real-time systems, gives rise to stochastic models that are non-Markovian in nature.

In recent years, considerable effort has been devoted to improving the PN methodology in order to deal with generally distributed events in real-time systems. However, the inclusion of non-exponential distributions affects the associated marking process (in that some memory of past events would then need to be retained), and further specification is needed at the PN level in order to uniquely define how the marking process is conditioned on past history (Ciardo et al. 1994).
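As an illustration of the steady-state analysis of a CTMC mentioned above, the following sketch solves the balance equations πQ = 0 together with the normalisation Σπ = 1 for a small generator matrix Q. It is a minimal example under stated assumptions: the chain is taken to be irreducible, the solver is a plain Gaussian elimination written for this purpose, and the two-state up/down model with failure rate 0.01 and repair rate 0.5 is a hypothetical example, not one taken from the source.

```python
def steady_state(Q):
    """Steady-state probabilities of an irreducible CTMC with generator Q.

    Solves pi * Q = 0 with sum(pi) = 1 by replacing the last balance
    equation with the normalisation condition.
    """
    n = len(Q)
    # Rows of A are the balance equations (columns of Q); last row normalises.
    A = [[float(Q[j][i]) for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(A[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - tail) / A[r][r]
    return pi


if __name__ == "__main__":
    failure_rate, repair_rate = 0.01, 0.5      # hypothetical rates
    Q = [[-failure_rate, failure_rate],
         [repair_rate, -repair_rate]]
    print(steady_state(Q)[0])                  # steady-state availability
```

For the two-state model the result agrees with the closed form μ/(λ + μ). The dense solver also makes the state-space limitation tangible: its cost grows with the cube of the number of markings, which itself grows rapidly with the number of components.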
b) Definition of the Basic Petri Net Model

A marked PN is a tuple PN = (P, T, I, O, M) (Peterson 1981), where:

• $P = \{p_1, p_2, \dots, p_n\}$ is the set of places (drawn as circles);
• $T = \{t_1, t_2, \dots, t_{nt}\}$ is the set of transitions (drawn as bars);
• I and O are the input and the output functions respectively;
• $M = (m_1, m_2, \dots, m_n)$ is the marking of the PN.

The generic entry $m_i$ is the number of tokens (drawn as black dots) in place $p_i$ in marking M. The initial marking is $M_0$. The input function I provides the multiplicities of the input arcs from places to transitions; the output function O provides the multiplicities of the output arcs from transitions to places. Input and output arcs have an arrowhead on their destination.

A transition is enabled in a marking if each of its input places contains at least as many tokens as the multiplicity of the input function I. An enabled transition fires by removing as many tokens as the multiplicity of the input function I from each input place, and adding as many tokens as the multiplicity of the output function O to each output place. A marking $M'$ is said to be directly reachable from M when it is generated from M by firing a single enabled transition $t_k$. The reachability set $R(M_0)$ is the set of all the markings that can be generated from an initial marking $M_0$ by repeated application of the above rule.

[Fig. 4.19 Typical graphical representation of a Petri net (Lindemann et al. 1999)]

The enabling of a transition corresponds to the starting of an activity, while the firing corresponds to the completion of an activity. Thus, the firing of a transition can cause a previously enabled transition to become disabled. PNs can be used to capture the behaviour of many real-world situations, such as the typical PN given in Fig. 4.19 (Lindemann et al.
1999):

Structural extensions

Various structural extensions have been proposed in the literature to increase either the class of problems that can be represented, or the ability and the ease with which real systems can be modelled. The modelling power of a PN is the ability of the PN formalism to represent classes of problems. Modelling convenience is defined as the practical ability to represent a given behaviour in a simpler, more compact or more natural way. Decision power is defined to be the set of properties that can be analysed. Increasing the modelling convenience decreases the decision power. Thus, each possible extension to the basic formalism requires an in-depth evaluation of its effect upon modelling convenience and decision power (Peterson 1981).

Some extensions have proven so effective that they are now considered part of the standard PN definition. They are:

• Inhibitor arcs
• Transition priorities
• Marking-dependent arc multiplicity.

Inhibitor arcs connect a place to a transition and are drawn with a small circle on their destination. An inhibitor arc from a place $p_i$ to a transition $t_k$ disables $t_k$ when $p_i$ is not empty. It is possible to use an arc multiplicity extension together with inhibitor arcs. In this case, a transition $t_k$ is disabled whenever place $p_i$ contains at least as many tokens as the multiplicity of the inhibitor arc. The number of tokens in an inhibitor input place is not affected by a firing operation.

Transition priorities are integer numbers assigned to the transitions. A transition that satisfies the normal enabling rule is enabled in a marking if and only if no higher-priority transition is enabled. If this extension is introduced, some markings of the original PN may no longer be reachable.
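The effect of these extensions on the reachability set can be explored with a short breadth-first search. The sketch below is illustrative only and rests on several assumptions: each transition is a dictionary with `inputs`, `outputs`, optional `inhibitors` (place to inhibitor-arc multiplicity) and optional integer `priority`; a larger integer is taken to mean a higher priority; every place appears in the initial marking; and a `limit` guards against unbounded nets.

```python
from collections import deque


def enabled_transitions(marking, transitions, use_priorities=True):
    """Names of the transitions enabled in a marking (dict: place -> tokens)."""
    candidates = []
    for name, t in transitions.items():
        has_tokens = all(marking[p] >= w for p, w in t["inputs"].items())
        inhibited = any(marking[p] >= w
                        for p, w in t.get("inhibitors", {}).items())
        if has_tokens and not inhibited:
            candidates.append(name)
    if use_priorities and candidates:
        # Assumption: a larger integer denotes a higher priority.
        top = max(transitions[n].get("priority", 0) for n in candidates)
        candidates = [n for n in candidates
                      if transitions[n].get("priority", 0) == top]
    return candidates


def reachability_set(m0, transitions, use_priorities=True, limit=10000):
    """Set of markings reachable from m0, each as a tuple in sorted place order."""
    places = sorted(m0)
    start = tuple(m0[p] for p in places)
    seen = {start}
    queue = deque([start])
    while queue:
        marking = dict(zip(places, queue.popleft()))
        for name in enabled_transitions(marking, transitions, use_priorities):
            t = transitions[name]
            nxt = dict(marking)
            for p, w in t["inputs"].items():
                nxt[p] -= w
            for p, w in t["outputs"].items():
                nxt[p] += w
            key = tuple(nxt[p] for p in places)
            if key not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("limit exceeded; the net may be unbounded")
                seen.add(key)
                queue.append(key)
    return seen
```

With one token in place `a` and two competing transitions that move it to `b` or to `c`, three markings are reachable when priorities are ignored; giving the first transition the higher priority removes the marking with the token in `c`, which is the loss of reachable markings described in the text.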
Marking-dependent arc multiplicity was introduced with the intent to model situations in which the number of tokens to be transferred along the arcs (or to enable a transition) depends upon the system state. Arcs with marking-dependent multiplicity are indicated by a Z on the arc, and allow simpler and more compact PNs than would otherwise be possible. In many practical problems, their use can reduce the complexity of the PN model (Ciardo 1994).

c) Definition of Stochastic Petri Nets

The most common way to include time in a PN is to associate the time duration with the activities that induce state changes (i.e. transitions). The duration of each activity is represented by a non-negative random variable with a known cumulative distribution function (c.d.f.). Let $\Gamma = (\gamma_1, \gamma_2, \dots, \gamma_{nt})$ be the set of the nt random variables associated with the nt transitions; then the set of their c.d.f. is:

$$G = [\,G_1(t),\ G_2(t),\ \dots,\ G_{nt}(t)\,] \tag{4.161}$$

When a waiting time $\gamma_k$ is associated with a transition $t_k$, the transition becomes enabled according to the rules of the untimed PN, but it can fire only after a time equal to $\gamma_k$ has elapsed. This time between the enabling and the firing is referred to as the firing time. Let $\{M(t), t \ge 0\}$ be the marking process, with $M(t)$ representing the marking reached by the PN at time t. The following analysis is restricted to SPNs in which the random firing times have continuous c.d.f. with infinite support, i.e. $(0, \infty)$. With this assumption, the marking process $M(t)$ is a continuous-time, discrete-state stochastic process with a state space that is isomorphic to the reachability graph of the untimed PN (i.e. the one looks exactly the same as the other).

Given a marking in which more than one transition with the same priority level (if priority is used) is enabled, the firing policy determines the transition that will fire next.
There are thus two possible alternatives (Ajmone Marsan et al. 1995):

i) Race policy: the transition whose firing time elapses first is assumed to be the one that will fire next.
ii) Pre-selection policy: the next transition to fire is chosen according to an externally specified probability mass function, independent of the firing times.

By far the most common firing policy for timed transitions is the race policy i). The pre-selection policy ii) is commonly used for immediate transitions, which were first introduced into Markovian SPN (Ajmone Marsan et al. 1995).

Once the firing policy is defined, the execution policy must be specified. The execution policy consists of a set of specifications for uniquely defining the stochastic process $\{M(t)\}$ underlying an SPN. There are two elements that characterise the execution policy: a criterion to keep memory of the past history of the process (the memory policy), and an indicator of the re-sampling status of the firing time. The memory policy defines how the process is conditioned upon the past. An age variable $a_g$ associated with the timed transition $t_g$ keeps track of the time for which the transition has been enabled. A timed transition fires as soon as the memory variable $a_g$ reaches the value of the firing time $\gamma_g$. In the activity period of a transition, the age variable is not 0.

The random firing time $\gamma_g$ of a transition $t_g$ can be sampled at a time instant prior to the beginning of an activity period. To keep track of the re-sampling condition of the random firing time associated with a timed transition, a binary indicator variable $r_g$ is assigned to each timed transition $t_g$; it is equal to 1 when the currently sampled firing time is to be retained (not re-sampled), and equal to 0 when the firing time is to be re-sampled. Reference is made to $r_g$ as the re-sampling indicator variable. Hence, in general, the (continuous) memory of a transition $t_g$ is captured by the tuple $(a_g, r_g)$.
At any time t, transition $t_g$ has memory (its firing process depends on the past) if either $a_g$ or $r_g$ is different from zero. At the entrance to a marking, the remaining firing time (rft) is computed for each enabled transition from its currently sampled firing time $\gamma_g$ and its age variable $a_g$ as $\mathrm{rft} = \gamma_g - a_g$. According to the race policy, the next marking is determined by the minimum of the rfts. The following execution policies can now be defined.

Execution policies

A timed transition $t_g$ can be:

• Pre-emptive repeat different (prd): if both the age variable $a_g$ and the re-sampling indicator $r_g$ are reset each time $t_g$ is disabled or fires.
• Pre-emptive resume (prs): if both the age variable $a_g$ and the re-sampling indicator $r_g$ are reset only when $t_g$ fires.
• Pre-emptive repeat identical (pri): if the age variable $a_g$ is reset each time $t_g$ is disabled or fires, but the re-sampling indicator $r_g$ is reset only when $t_g$ fires.
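The race policy and the three execution policies can be summarised in a short sketch. It assumes that "reset" means setting the variable to zero, consistent with the statement that a transition has memory when either variable is non-zero; the function names and the representation of an enabled transition as a pair (sampled firing time, age) are hypothetical.

```python
def race_winner(enabled):
    """Select the next transition to fire under the race policy.

    enabled: dict mapping a transition name to (gamma, age), i.e. its
    currently sampled firing time and its age variable.
    Returns (name, rft) for the smallest remaining firing time gamma - age.
    """
    rft = {name: gamma - age for name, (gamma, age) in enabled.items()}
    winner = min(rft, key=rft.get)
    return winner, rft[winner]


def update_memory(policy, age, resample_indicator, event):
    """Return the memory tuple (a_g, r_g) after an event.

    policy: 'prd', 'prs' or 'pri'; event: 'fired' or 'disabled'.
    Assumption: resetting a variable means setting it to zero.
    """
    if event not in ("fired", "disabled"):
        raise ValueError("event must be 'fired' or 'disabled'")
    if policy == "prd":                       # both reset on firing or disabling
        return 0.0, 0
    if policy == "prs":                       # both reset only on firing
        return (0.0, 0) if event == "fired" else (age, resample_indicator)
    if policy == "pri":                       # age always reset, r_g only on firing
        return (0.0, 0) if event == "fired" else (0.0, resample_indicator)
    raise ValueError("unknown execution policy")
```

Under this reading, a prs transition that is pre-empted after 1.2 time units keeps both its age and its sampled firing time and resumes where it left off, whereas a pri transition restarts from age zero with the same firing time, and a prd transition restarts with a newly sampled one.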