4.3 Analytic Development of Availability and Maintainability in Engineering Design

Transition $t_g$ is prd: each time a prd transition is disabled or fires, its age (memory) variable $a_g$ is reset and its re-sampling indicator variable $r_g$ is set to 0 (the firing time must be re-sampled from the same distribution when $t_g$ becomes re-enabled).

Transition $t_g$ is prs: when $t_g$ is disabled, its associated age variable $a_g$ is not reset but keeps its value until $t_g$ is re-enabled, whereby $r_g = 1$. At each successive enabling point, $a_g$ restarts from the previously retained value. When $t_g$ fires, both $a_g$ and $r_g$ are reset, so that the firing time must be re-sampled at the successive enabling point ($\gamma_2$). The memory of $t_g$ is reset only when the transition fires.

Transition $t_g$ is pri: under this policy, each time $t_g$ is disabled, its age variable $a_g$ is reset but its re-sampling indicator variable $r_g$ remains equal to 1, and the firing time value $\gamma_1$ remains active, so that in the next enabling period an identical firing time will result. The same value is maintained over different enabling periods up to the firing of $t_g$. Only when $t_g$ fires are both $a_g$ and $r_g$ reset, and the firing time is re-sampled ($\gamma_2$). Hence, also in this case, the memory is lost only upon firing of $t_g$.

If the firing time is exponentially distributed, the prd and prs policies behave in the same way. However, the pri policy does not have the memoryless property. Thus, the marking process of an SPN with only exponentially distributed firing times is not a continuous-time Markov chain (CTMC) if at least one non-exclusively enabled transition exists with an assigned pri policy. If the firing time is deterministic, the prd and pri policies behave in the same way (that is, re-sampling a deterministic variable always provides an identical value). The memory of the global marking process is considered as the superposition of the individual memories of the transitions.
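The bookkeeping of the age variable $a_g$ and the re-sampling indicator $r_g$ under the three policies can be sketched in a few lines of Python. This is only an illustrative sketch: the class `GeneralTransition`, its method names and the helper `remaining_after_interrupt` are hypothetical and are not taken from the text or from any PN tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GeneralTransition:
    """Hypothetical bookkeeping for one generally distributed transition t_g."""
    sample: Callable[[], float]          # draws a firing time from G_g
    policy: str                          # "prd", "prs" or "pri"
    age: float = 0.0                     # a_g: enabled time accumulated so far
    resample: int = 0                    # r_g: 0 means a new sample is needed
    firing_time: Optional[float] = None  # current sample (gamma)

    def enable(self) -> None:
        # A new firing time is drawn only when the indicator r_g is 0.
        if self.resample == 0:
            self.firing_time = self.sample()
            self.resample = 1

    def advance(self, dt: float) -> bool:
        """Let dt time units pass while enabled; return True if t_g fires."""
        self.age += dt
        if self.age >= self.firing_time:
            self.fire()
            return True
        return False

    def fire(self) -> None:
        # Under all three policies the memory is lost upon firing.
        self.age = 0.0
        self.resample = 0

    def disable(self) -> None:
        if self.policy == "prd":
            self.age = 0.0
            self.resample = 0            # new sample at the next enabling
        elif self.policy == "prs":
            pass                         # a_g and the sample are both retained
        elif self.policy == "pri":
            self.age = 0.0               # restart with the identical sample
        else:
            raise ValueError(f"unknown policy: {self.policy}")


def remaining_after_interrupt(policy: str) -> float:
    """Enable, work 2 time units, get disabled, re-enable; return time left."""
    samples = iter([5.0, 4.0])           # assumed firing times gamma_1, gamma_2
    t = GeneralTransition(sample=lambda: next(samples), policy=policy)
    t.enable()
    t.advance(2.0)
    t.disable()
    t.enable()
    return t.firing_time - t.age
```

With the assumed samples $\gamma_1 = 5$ and $\gamma_2 = 4$, the time still to run after the interruption is 4 under prd (new sample, age reset), 3 under prs (same sample, age kept) and 5 under pri (same sample, age reset).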
In general, the marking process $\{M(t)\}$ underlying an SPN is not analytically tractable (i.e. easily manageable) unless some restrictions are imposed (Ciardo et al. 1994). Note that a simulation approach for the prd and the prs cases, based on very similar assumptions, has been adopted in the application simulation modelling of Sect. 4.4.

d) Definition of Markovian Stochastic Petri Nets (MSPN)

When all the random variables $\gamma_k$ associated with the PN transitions are exponentially distributed, and the execution policy is not pri, the dynamic behaviour of the PN is mapped into a continuous-time Markov chain (CTMC) with state space isomorphic to the reachability graph of the untimed PN. This restriction is the most popular, and is usually referred to simply as MSPN or GSPN (Molloy 1982). In order to specify the model completely, the set $\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_{nt})$ of the $nt$ firing rates assigned to the $nt$ transitions is included. A usual convention in the graphical representation is to indicate transitions with exponentially distributed firing times by means of empty rectangles, and transitions with non-exponentially distributed firing times by means of filled rectangles, as illustrated in Fig. 4.20.

Fig. 4.20 Illustrative example of an MSPN for a fault-tolerant process system (Ajmone Marsan et al. 1995)

Modelling real systems often involves the presence of activities or actions (such as preventive maintenance activities) of which the duration is short or even negligible with respect to the timescale of the process (especially continuous engineering processes). Hence, it is desirable to associate an exponentially distributed firing time only with those transitions that are believed to have the largest impact on the system operation.
The starting assumption in the MSPN model is that transitions are partitioned into two different classes, namely immediate transitions and timed transitions (Ajmone Marsan et al. 1995).

Immediate transitions fire in zero time once they are enabled, and have priority over timed transitions. Timed transitions fire after an exponentially distributed firing time (these are called EXP transitions). In the graphical representation of MSPN, immediate transitions are drawn as thin bars. Markings enabling immediate transitions are passed through in zero time and are called vanishing states. Markings enabling no immediate transitions are called tangible states. Since the process spends zero time in the vanishing states, they do not contribute to the dynamic behaviour of the system, and a procedure can be developed to eliminate them from the final Markov chain. With the partition of PN transitions into a timed and an immediate class, a greater flexibility of modelling is achieved without increasing the dimensions of the final tangible state space from which the process measures are computed. An illustrative example of an MSPN is given in Fig. 4.20.

Dealing with large complex systems

MSPNs can provide a compact representation of very large systems. This is reflected in an exponential growth of the reachable markings as a function of the primitive elements in the MSPN (places and transitions), and as a function of the number of tokens in the initial marking.

This exponential growth of the state space has often been recognised as a severe limitation in the use of the PN methodology to deal with real-life applications, and a significant effort has been devoted to overcoming or alleviating this problem (Molloy 1982).
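The distinction between vanishing and tangible markings, and the way the set of reachable markings is enumerated from an initial marking, can be sketched as a breadth-first search. The three-transition net below is a hypothetical toy example (it is not the net of Fig. 4.20), and the names `TRANSITIONS`, `enabled`, `fire` and `explore` are assumptions made for this illustration only.

```python
from collections import deque

# Hypothetical toy net: name -> (input places, output places, is_immediate)
TRANSITIONS = {
    "t_e1": ({"queue": 1}, {"choice": 1}, False),   # timed (EXP)
    "t_i1": ({"choice": 1}, {"served": 1}, True),   # immediate
    "t_e2": ({"served": 1}, {"queue": 1}, False),   # timed (EXP)
}


def key(marking):
    """Hashable form of a marking (place -> token count)."""
    return tuple(sorted(marking.items()))


def enabled(marking, name):
    inputs, _, _ = TRANSITIONS[name]
    return all(marking.get(p, 0) >= n for p, n in inputs.items())


def fire(marking, name):
    inputs, outputs, _ = TRANSITIONS[name]
    new = dict(marking)
    for p, n in inputs.items():
        new[p] -= n
    for p, n in outputs.items():
        new[p] = new.get(p, 0) + n
    return {p: n for p, n in new.items() if n > 0}   # drop empty places


def explore(initial):
    """Breadth-first generation of the reachable markings and their arcs."""
    seen = {key(initial): initial}
    vanishing, arcs = set(), []
    todo = deque([initial])
    while todo:
        m = todo.popleft()
        candidates = [t for t in TRANSITIONS if enabled(m, t)]
        immediate = [t for t in candidates if TRANSITIONS[t][2]]
        if immediate:
            vanishing.add(key(m))        # left in zero time
            candidates = immediate       # immediate transitions have priority
        for t in candidates:
            nxt = fire(m, t)
            arcs.append((key(m), t, key(nxt)))
            if key(nxt) not in seen:
                seen[key(nxt)] = nxt
                todo.append(nxt)
    tangible = set(seen) - vanishing
    return seen, tangible, vanishing, arcs


seen, tangible, vanishing, arcs = explore({"queue": 2})
```

For this toy net with two tokens in the initial marking, the search finds five markings, of which two are vanishing and three are tangible. Increasing the number of initial tokens or adding places and transitions enlarges the set that `explore` has to enumerate, which is the state space growth referred to above.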
Since Markovian SPNs are based on the solution of a CTMC, all the techniques that have been explored to handle very large Markov chains can profitably be utilised in connection with MSPNs. When dealing with large models, not only does the solution of the system become difficult but the model description and the computer representation also become complex, which has resulted in an increasing application of reachability graphs.

e) Generating Reachability Graphs

The generation of a PN reachability graph (both extended and reduced) is best explained with the aid of an example. Consider a process system based on a queuing client-server paradigm (typically in discrete event, single item and batch processing systems), the PN model being shown in Fig. 4.21. Transitions labelled $t_{ek}$ or $s_{tk}$ are timed transitions that fire after an exponentially distributed firing time EXP (represented by empty rectangles), and transitions labelled $t_{ik}$ are immediate transitions that fire in zero time once they are enabled (represented by thin single-line bars).

The system is made up of process units (clients) waiting in a controlled queue and requiring processing (transition $t_{e1}$). Processing can be supplied with probability $(1-c)$ (transition $t_{i3}$) by two servers (processing assemblies) working in parallel, and with probability $c$ (transition $t_{i1}$) by accessing a resource (place $p_{12}$) shared by the two servers (in this case, the resource can be envisaged as a utility controlling the client queue and the servers, such as a distributed control system, DCS). In the case of firing of $t_{i3}$, a message forwarded by the client is split into two sub-messages, each addressed to a different server (places $p_5$ and $p_6$). The two servers are characterised by an exponentially distributed service time modelled by transitions $s_{t1}$ and $s_{t2}$ respectively.
It is assumed, in the definition of the process model, that a processing transaction is concluded when all the servers have served the sub-messages they have been assigned. When a server has processed its sub-message, it accesses the shared resource (DCS) to record its processing results (transitions $t_{e2}$ and $t_{e3}$). After both servers have accessed the shared resource, a join operation is performed and the processed result is returned to control the client queue (i.e. transition $t_{i6}$). Conversely, with probability $c$, the message of a client in the queue is already available in the shared resource, so that the service requirement is met by the server accessing the resource, retrieving the message and returning it to control the client queue (transitions $t_{i2}$ and $t_{e4}$).

The reachability graph illustrated in Fig. 4.22 can now be generated from the initial token distribution depicted in the PN model shown in Fig. 4.21 and the markings of Table 4.3.

Fig. 4.21 MSPN for a process system based on a queuing client-server paradigm (Ajmone Marsan et al. 1995)

Fig. 4.22 Extended reachability graph generated from the MSPN model (Ajmone Marsan et al. 1995); nodes are the markings $m_1$ to $m_{18}$, arcs are labelled by the firing transitions

The extended reachability graph of an MSPN comprises both tangible and vanishing states. Elimination of the vanishing states results in a reduced reachability graph that is isomorphic to the CTMC. Given a vanishing marking denoted by $m_b$ (which is directly reachable from a tangible marking $m_a$), and the set of tangible markings $S$ reached from $m_b$ by passing through a sequence of vanishing markings only, it is possible to evaluate the probability of the next tangible marking after $m_b$ over $S$. Furthermore, $m_a$ may belong to $S$.
The vanishing marking $m_b$ and the ones reachable from $m_b$ by the firing of immediate transitions can be eliminated only by introducing arcs directly connecting $m_a$ to $m_c \in S$, $m_c \neq m_a$, and by modifying the firing rate associated with the generic transition $t_k$ enabled in $m_a$ (Ajmone Marsan et al. 1995).

Table 4.3 gives the distribution of the tokens in the reachable markings. It is quite evident that the markings $m_2$, $m_3$, $m_6$, $m_7$, $m_{11}$, $m_{13}$ and $m_{16}$ are vanishing (shadowed markings in Fig. 4.22) and can be eliminated. The reduced reachability graph, defined over the tangible markings only, can then be generated as illustrated in Fig. 4.23.

Once the reduced reachability graph is obtained, the matrix for the underlying continuous-time Markov chain (CTMC) can be constructed. Let $R_0$ be the reduced reachability graph of a Markovian SPN, and $N$ its cardinality. The infinitesimal generator of the underlying CTMC is then an $N \times N$ matrix $Q$, where $Q = [Q_{ij}]$.

Let $\Pi(t)$ be the $N$-dimensional state probability vector, of which the generic entry $\pi_i(t)$ is the probability of being in state $i$ ($i = 1, 2, \ldots, N$) at time $t$ in the associated CTMC. Then, $\Pi(t)$ is the solution of the standard linear differential equation:

\[ \frac{d}{dt}\Pi(t) = \Pi(t) \cdot Q \tag{4.162} \]

with initial condition: $\Pi(0) = [1, 0, 0, \ldots, 0]$.

Table 4.3 Distribution of the tokens in the reachable markings (columns: places $p_1$ to $p_{12}$; rows: markings $m_1$ to $m_{18}$). Token marks per row, with the column positions not preserved: m1 ••, m2 ••, m3 ••, m4 •, m5 •••, m6 •••, m7 •••, m8 ••, m9 •••, m10 ••, m11 •••, m12 ••, m13 ••, m14 ••, m15 •••, m16 •••, m17 ••, m18 ••

Fig. 4.23 Reduced reachability graph generated from the MSPN model (tangible markings $m_1$, $m_4$, $m_5$, $m_8$, $m_9$, $m_{10}$, $m_{12}$, $m_{14}$, $m_{15}$, $m_{17}$, $m_{18}$; arcs labelled $t_{e1}$ to $t_{e4}$, $s_{t1}$ and $s_{t2}$)

If the steady-state probability vector $\Pi = \lim_{t \to \infty} \Pi(t)$ of the CTMC exists, it can be calculated from:

\[ \Pi \cdot Q = 0 \quad \text{with} \quad \sum_{i=1}^{N} \pi_i = 1 \]

Since some of the output measures depend on the integrals of the probabilities, rather than on the probabilities per se, it is necessary to provide the appropriate computation of the integrals of the state probabilities:

\[ L_i(t) = \int_0^t \pi_i(z)\,dz \tag{4.163} \]

where $L_i(t)$ is the expected time that the CTMC stays in state $i$ during the interval $(0,t)$. Let $L(t)$ denote the $N$-dimensional row vector consisting of the elements $L_i(t)$. Integrating both sides of Eq. (4.162), the following relation is obtained:

\[ \frac{d}{dt}L(t) = L(t) \cdot Q + \Pi(0) \tag{4.164} \]

where:
$L(t)$ = $N$-dimensional row vector
$Q$ = $N \times N$ matrix of the CTMC
$\Pi(0)$ = initial condition of the $N$-dimensional state probability vector.

f) Measures of Markovian Stochastic Petri Nets (MSPN)

A fundamental property of the time-dependent representation of system behaviour through SPNs is that they enable the user to define, in a simple and natural way, a large number of different measures related to the performance and reliability of the system. The stochastic behaviour of a Markovian SPN is determined by calculating the $\Pi(t)$, $\Pi(0)$ and $L(t)$ vectors over the reduced reachability set $R_0$. However, the final output measures should be defined at the Petri net level as a function of its primitive elements (i.e. places and transitions). The following mathematical models provide a practical outline of how to relate the probabilities at the CTMC level to useful measures at the PN level.

The probability of a given condition on the SPN

By means of logical or algebraic functions of the number of tokens in the PN places, a particular condition $C$ (e.g.
no tokens in a given place) can be specified, and the subset of states $S \subseteq R_0$ can be identified for which the condition is true. The output measure $C_s(t) = \mathrm{Prob}\{\text{condition } C \text{ is true at time } t\}$ is given by:

\[ C_s(t) = \sum_{s \in S} \pi_s(t) \tag{4.165} \]

where $\pi_s(t)$ is the probability of being in state $s$ at time $t$. Note: if $S$ is the set of operational states, $C_s(t)$ is the usual definition of system availability.

A very useful case arises when the measure is the transient probability that the condition is satisfied for the first time. By using such an approach in the analysis of stochastic processes, the states $s \in S$ can be made absorbing (i.e. once entered, they are never left), and the quantity evaluated from Eq. (4.165) is then the value of the process when entering $S$. In this way, the above equation can be used to calculate system reliability as well as system availability:

System availability: $C_s(t) = \sum_{s \in S} \pi_s(t)$, where $S$ = set of operational states.

System reliability: $C_s(t) = \sum_{s \in S} \pi_s(t)$, where the states $s \in S$ are absorbing and the process is entering $S$.

The time spent in a marking

Let $S \subseteq R_0$ be the subset of markings in which a particular condition is fulfilled. The expected time, $\psi_s(t)$, spent in the markings $s \in S$ during the interval $(0,t)$ is given by:

\[ \psi_s(t) = \sum_{s \in S} \int_0^t \pi_s(z)\,dz = \sum_{s \in S} L_s(t) \tag{4.166} \]

Moreover, from the theory of irreducible Markov chains, as $t$ approaches infinity, the proportion of the time spent in states $s \in S$ equals the asymptotic probability (Choi et al. 1994):

\[ \psi_s = \sum_{s \in S} \pi_s = \lim_{t \to \infty} \frac{\psi_s(t)}{t} \tag{4.167} \]

Here $\psi_s(t)/t$ represents the utilisation factor in the interval $(0,t)$, and $\psi_s$ the expected steady-state utilisation factor. For example, if $S$ is the set of states in which a process is idle, $\psi_s(t)/t$ is the fraction of idle time in $(0,t)$ and $\psi_s$ is the expected steady-state fraction of idle time.

The mean first passage time

Given that $C_s(t)$, as calculated in Eq.
(4.165), is the probability of having entered subset $S$ before $t$ for the first time, the mean first passage time $\mu_s$ can be calculated as:

\[ \mu_s = \int_0^{\infty} [1 - C_s(z)]\,dz \tag{4.168} \]

This formula requires the transient analysis to be extended over long intervals. There are other direct techniques for calculating mean first passage times in a CTMC but these are not relevant to this research (Ciardo et al. 1994).

The distribution of tokens in a place

The cumulative distribution function (c.d.f.) of the number of tokens in place $p_i$ of the SPN at time $t$ is a step function in which the amplitude of the $k$th step is obtained by summing up the probabilities of all the states in the set $R_0$ containing $k$ tokens ($k = 0, 1, 2, \ldots, K$) in $p_i$ at time $t$. The probability function $f_i(k,t)$ is the amplitude of the $k$th step. The expected value of the number of tokens in place $p_i$ at time $t$ is:

\[ \mathrm{ET}[m_i(t)] = \sum_{k=0}^{\infty} k\, f_i(k,t) \tag{4.169} \]

As an example, if place $p_i$ represents identical units in a queue for a common resource, the above quantity gives the expected value of the number of units in the queue at time $t$. In reliability analysis, the tokens in place $p_i$ may represent the number of failed components.

The expected number of firings of a PN transition

Given an interval $(0,t)$, the expected number of firings indicates how many times, on average, an event modelled by a PN transition has occurred in that interval. Let $t_k$ be a generic PN transition, and let $S$ be the subset of $R_0$ that includes all the markings $s \in S$ enabling $t_k$. The expected number of firings of $t_k$ in $(0,t)$ is given by:

\[ \eta_k(t) = \sum_{s \in S} \lambda_k(s) \int_0^t \pi_s(z)\,dz = \sum_{s \in S} \lambda_k(s) \cdot L_s(t) \tag{4.170} \]

where $\lambda_k(s)$ is the firing rate of $t_k$ in marking $s$. In steady state, the expected number of firings per unit of time becomes:

\[ \eta_k = \sum_{s \in S} \pi_s\, \lambda_k(s) \tag{4.171} \]

This quantity represents the throughput associated with the given transition.
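The measures of Eqs. (4.165), (4.166) and (4.170) can be illustrated on a small CTMC. The sketch below assumes a hypothetical two-state model (state 0 operational, state 1 failed) with an assumed failure rate of 0.1 and repair rate of 1.0 per unit time, and it integrates Eq. (4.162) with a plain forward-Euler scheme. That scheme is chosen here only for brevity and is not a method prescribed by the text; all function names are illustrative.

```python
def transient(Q, pi0, t_end, steps=20000):
    """Forward-Euler integration of dPi/dt = Pi.Q (Eq. 4.162).

    Also accumulates L(t), the integral of Pi over (0, t) (Eq. 4.163).
    States are indexed from 0 here, whereas the text counts from 1.
    """
    n = len(pi0)
    h = t_end / steps
    pi = list(pi0)
    L = [0.0] * n
    for _ in range(steps):
        dpi = [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
        for j in range(n):
            L[j] += h * pi[j]        # left-endpoint rule for the integral
            pi[j] += h * dpi[j]
    return pi, L


def probability_of(pi, S):
    """Eq. (4.165): probability that the condition defining S holds."""
    return sum(pi[s] for s in S)


def time_spent(L, S):
    """Eq. (4.166): expected time spent in the markings of S."""
    return sum(L[s] for s in S)


def expected_firings(L, rate_by_state):
    """Eq. (4.170): rate_by_state maps each enabling state s to lambda_k(s)."""
    return sum(rate * L[s] for s, rate in rate_by_state.items())


# Hypothetical two-state availability model (assumed rates).
LAM, MU = 0.1, 1.0                    # failure rate, repair rate
Q = [[-LAM, LAM],
     [MU, -MU]]
UP = {0}                              # set S of operational states

pi_t, L_t = transient(Q, [1.0, 0.0], t_end=50.0)
availability = probability_of(pi_t, UP)       # C_s(t) at t = 50
up_time = time_spent(L_t, UP)                 # psi_s(t) at t = 50
failures = expected_firings(L_t, {0: LAM})    # eta_k(t) for the failure event
```

For these assumed rates the availability at $t = 50$ is practically at its steady-state value $\mu/(\lambda + \mu) = 10/11$, and the ratio `up_time / 50` is the utilisation factor $\psi_s(t)/t$ of Eq. (4.167) evaluated over a finite interval.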
If transition $t_k$ represents the completion of a service in a queuing system, $\eta_k(t)$ is the expected number of services completed in time $(0,t)$ and $\eta_k$ is the expected steady-state throughput.

g) Definition of Stochastic Reward Nets

Stochastic reward nets (SRN) introduce a new extension into Markovian SPNs, allowing for the possibility of associating reward rates with the markings. The reward rates are specified at the PN level as a function of its primitives (i.e. the number of tokens in a place, or the rate of a transition). The underlying CTMC is then transformed into a Markov reward model, thus permitting the evaluation of performance measures. Implementation of this extension allows the reward structure superimposed on the reachability graph to be generated automatically, and easily provides performance measures (Ciardo et al. 1991).

The reward definition is called rate-based, to indicate that the system produces reward at rate $r(i)$ for all the time it remains in state $i \in R_0$. Furthermore, impulse-based reward models can be implemented where a reward function $r_{ij}$ is associated with each transition from the state $i \in R_0$ to $j \in R_0$. Each time a transition from $i$ to $j$ occurs, the cumulative reward of the system instantaneously increases by $r_{ij}$. In general, several combinations of the different reward functions can be specified in the same model.

h) Definition of Non-Markovian Stochastic Petri Nets

As indicated previously, in order to define a PN with generally distributed transitions, the following entities must be specified for each transition $t_g \in T$: the c.d.f. $G_g(t)$ of the random firing times $\gamma_g$, and the execution policy for determining $(a_g, r_g)$.

Several classes of SPN models have been developed that incorporate some non-exponential characteristics in their definition, and that adhere to the individual memory requirements indicated previously.
With the aim of specifying non-Markovian SPN models that are analytically tractable, three approaches can be considered, specifically (Bobbio et al. 1997):

• An approach based on Markovian regenerative theory
• An approach based on the use of supplementary variables
• An approach based on state space expansion.

The first approach originates from a particular definition of a non-Markovian SPN where, in each marking, a single transition is allowed to have associated with it a deterministic firing time with prd execution policy (i.e. a deterministic SPN, or DSPN). The marking process underlying a DSPN is a Markov regenerative process (MRGP) for which equations can be derived for the transition probability matrix in transient and in steady-state conditions (Choi et al. 1994).

Generalisation of the previous formulation is proposed by including the possibility of modelling prs transitions and also by including pri transitions. The most general framework under which the Markov regenerative theory has been applied is where any regeneration time period is dominated by a single transition (non-overlapping dominant transitions).

The second approach resorts to the use of supplementary variables. This method has been applied to prd execution policies only, and with mutually exclusive general transitions. A steady-state solution has been proposed, while the possibility of applying the methodology to transient analysis has also been explored (German et al. 1994).

The third approach is based on the expansion of the reachability graph of the basic PN. In this approach, the original non-Markovian marking process is approximated by means of a continuous-time Markov chain (CTMC), defined over an augmented state space. According to the definitions given previously, the reachability graph expansion technique can be realised by assigning a continuously distributed random variable to each transition (Neuts 1981).
Basically, the merit of this approach is the flexibility in modelling any combination of prd and prs memory policies, and any number of concurrent or conflicting transitions with generally distributed firing times. Furthermore, the reachability graph expansion technique can be implemented using a computer program. Starting from the basic specification at the PN level, all the solution steps can be hidden from the modeller in an object-oriented programming (OOP) environment. The drawback of this approach is, of course, the explosion of the state space.
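The trade-off between flexibility and state space size in the expansion approach can be made concrete with a small sketch. It assumes, purely for illustration, that a general firing time is approximated by an Erlang distribution with $k$ exponential stages; the text does not state which family of distributions is used, and the function names below are hypothetical.

```python
from math import prod


def erlang_generator(mean, k):
    """Generator of a chain of k exponential stages plus an absorbing state.

    Each stage has rate k / mean, so the time to absorption approximates a
    firing time with the given mean (assumed Erlang-k approximation).
    """
    rate = k / mean
    n = k + 1                          # k stages + absorbing "fired" state
    Q = [[0.0] * n for _ in range(n)]
    for i in range(k):
        Q[i][i] = -rate
        Q[i][i + 1] = rate
    return Q


def erlang_mean_variance(mean, k):
    """Mean and variance of the sum of k exponential stages of rate k / mean."""
    stage_mean = mean / k
    return k * stage_mean, k * stage_mean ** 2


def expanded_size_bound(n_markings, stages_per_transition):
    """Upper bound on the augmented state space.

    Assumes every marking may be paired with every combination of stages of
    the expanded transitions; the reachable part can be smaller.
    """
    return n_markings * prod(stages_per_transition)


Q4 = erlang_generator(mean=2.0, k=4)
moments = erlang_mean_variance(mean=2.0, k=4)
bound = expanded_size_bound(11, [4, 4])
```

Under this assumption, more stages give a variance of $\text{mean}^2/k$ and hence a firing time closer to a deterministic one, but each expanded transition multiplies the bound on the number of states. With 11 markings and two transitions of four stages each, the bound is already 176 states.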