



USING QUALITATIVE DISCRETE EVENT SIMULATION TO CALCULATE NEAR EXACT OUTPUT STATISTICS

By YEN PING LEOW

Bachelor of Science in Mathematical and Computer Science, University of Adelaide, Adelaide, Australia, 2000
Master of Science in Industrial Engineering and Management, Oklahoma State University, Stillwater, Oklahoma, 2002

Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY, July, 2008

Dissertation Approved:
Dr. Ricki G. Ingalls, Dissertation Adviser
Dr. David B. Pratt
Dr. Camille DeYong
Dr. Dursun Delen
Dr. A. Gordon Emslie, Dean of the Graduate College

DEDICATION

This dissertation is dedicated to my loving husband, Dr. Loay Sehwail, and my parents, Swee Meen Leow and See Moy Soo, whose love and devotion have helped me accomplish my educational goals.

ACKNOWLEDGEMENT

First, I would like to express my gratitude to my advisor, Dr. Ricki Ingalls, for the guidance, encouragement and knowledge he has provided me throughout the course of writing this dissertation. I would like to extend my sincere appreciation to my dissertation committee: Drs. David Pratt, Camille DeYong and Dursun Delen, for all your time, input and ideas. I am most grateful to my family for their support of my graduate studies. This work would not have been possible without the tireless support of my husband, Loay, who has endured many financial, personal and career sacrifices to see me reach this goal. To my son, Munir, who has helped me to overcome the stress of dissertation writing. To my father, Swee Meen, who has supported and encouraged me to complete my graduate studies. To my mother, See Moy, who inspired me to enjoy learning. To my brother, Kai Siong, and my sisters, Fong Ping, Li Ping and Siew Ping, I am grateful for their support of my efforts.
TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION 1
1.1 Overview 1
1.2 Research Objective and Motivation 5
CHAPTER 2: LITERATURE REVIEW 7
2.1 Discrete Event Simulation 7
2.2 Qualitative Simulation 11
2.3 Qualitative Discrete Event Simulation (QDES) 14
2.4 Event Graphs and Simulation Graph Models 18
2.5 Interval Mathematics and Its Use in Modeling 19
2.6 Temporal Intervals and Qualitative Time Calendar Construct 20
2.7 Qualitative Discrete Event Simulation (QDES) Algorithm 21
2.7.1 Depth First Algorithm 22
2.7.2 Breadth First Algorithm 23
2.8 Thread Scoring Techniques 24
CHAPTER 3: DELAY CONSTRAINT VALIDITY 26
3.1 Introduction 26
3.2 Simple Queuing Model 28
3.3 Discovery of Invalid Threads 29
3.4 Delay Constraint Algorithm 38
CHAPTER 4: PROBABILITY OF THREAD OCCURRENCE 43
4.1 Introduction 43
4.2 Probability of an Event Executing First 43
4.3 Conditional Probability 46
4.4 Probability Distribution for a New Event 51
4.4.1 Generating New Probability Distribution 51
4.4.2 Generating Conditions for the Conditional Probability 55
4.5 Probability Distribution for the Timing of an Event 62
4.6 Probability of Thread Occurrence 62
4.7 Comparison of the Manually Computed Probability with Simulated Probability Using Excel 65
4.8 Development of Probability Algorithm Using QDES Framework 71
4.8.1 QDES Framework 71
4.8.2 Execution of the QDES Algorithm 72
4.8.3 Execution of the Probability Algorithm 74
CHAPTER 5: QDES OUTPUT STATISTICS 90
5.1 Introduction 90
5.2 Regular Discrete Event Simulation (RDES) Output Data Analysis 90
5.3 Qualitative Discrete Event Simulation (QDES) Output Data Analysis 92
5.3.1 Time-Persistent Statistics 93
5.3.2 Delay Time of Consecutive Events in a Thread 95
5.3.3 Server Utilization and Number Waiting of the Simple Queuing System QDES Model 106
5.3.4 Tally Statistics 111
5.3.5 Total Time in System and Waiting Time of the Simple Queuing System QDES Model 114
5.4 Probability and Statistics Analysis Algorithm 119
CHAPTER 6: CONCLUSION AND FUTURE RESEARCH 137
6.1 Future Research 138
6.1.1 Scaling QDES Methodology 138
6.1.2 Delay Bounds 139
6.1.3 Application in Simulation Optimization Problems 140
6.1.4 Sensitivity Analysis 141
APPENDICES 144

LIST OF TABLES

Table 1. Probability distributions for events ENTER[4,8] and Uniform[3,8] 52
Table 2. Probability distribution of ENTER[7,16] 53
Table 3. Probability distribution for ENTER[7,16] with interval width of 1 time unit 54
Table 4. Conditional probability for ENTER[7,16] with all possible combinations of conditions 59
Table 5. Conditional probability for LEAVE[8,14] with all possible combinations of conditions 59
Table 6. Probability of the conditions that are generated 60
Table 7. Probability of ENTER[7,14] executing first 61
Table 8. Probability of LEAVE[8,14] executing first 62
Table 9. Thread Probability 63
Table 10. Comparison of the calculated thread probability with simulated thread probability 66
Table 11. Calculated and Simulated Probability for the timing of LEAVE[4,6] (Node 22) 67
Table 12. Calculated and Simulated Probability for the timing of ENTER[4,8] (Node 23) 68
Table 13. Calculated and Simulated Probability for the timing of ENTER[7,14] (Node 25) 69
Table 14. Calculated and Simulated Probability for the timing of LEAVE[8,14] (Node 26) 69
Table 15. Calculated and Simulated Probability for the timing of LEAVE[9,14] (Node 10) 70
Table 16. Conditional scheduled time probability of Leave[8,12] and Enter[9,12] 76
Table 17. Conditional Probability of Executing First for Enter[9,12] 76
Table 18. The values of c(f,w) at node 5 78
Table 19. The progression of the expected value for the execution time of X 100
Table 20. Combinations of B, C and D events executing in the ith interval 104
Table 21. Conditional probability for node 13 (Thread 4) 109
Table 22. Conditional probability for node 31 (Thread 8) 109
Table 23. Average number waiting for thread 4 110
Table 24. Average server utilization for thread 8 110
Table 25. The average number waiting and server utilization for all the threads 111
Table 26. Comparison of Arena simulation statistics and calculated statistics from QDES 111
Table 27. Conditional probability for node 13 114
Table 28. The total time in system for all nonzero execution time conditional probability in node 13 116
Table 29. The waiting time for all nonzero execution time conditional probability in node 13 117
Table 30. Waiting time and total time in system for all threads 117
Table 31. Average waiting time and total time in system for the Simple Queuing System 118

LIST OF FIGURES

Figure 1. The basic construct for an event graph 19
Figure 2. Simple Queuing Model 28
Figure 3. Event sequences of the simple queuing model 30
Figure 4. Event sequences for the simple queuing model with threads 3, 5, 6 and 7 eliminated 31
Figure 5. Checking delay constraint validity for Enter[3,6] 32
Figure 6. The Remaining Valid Threads from QSGM Output 37
Figure 7. Probability distribution for events LEAVE[4,6] and ENTER[3,8] 44
Figure 8. The probabilities of events LEAVE[4,6] and ENTER[3,8] executing first, respectively 46
Figure 9. Redistribution of the original probability of occurrence for ENTER[4,8] 47
Figure 10. Probability of occurrence for ENTER[4,8] given that LEAVE[4,6] was executed 49
Figure 11. Probability of occurrence for ENTER[4,8] 51
Figure 12. Events ENTER[7,14] and LEAVE[8,14] 52
Figure 13. Probability distribution of LEAVE[8,14] 55
Figure 14. Conditional probability for ENTER[7,16] 58
Figure 15. Probability tree diagram 64
Figure 16. The generated timing of events for the Simple Queuing System 66
Figure 17. The generated events for the Simple Queuing System 66
Figure 18. Event Timing Probability Distribution for LEAVE[4,6] (Node 22) 68
Figure 19. Event Timing Probability Distribution for ENTER[4,8] (Node 23) 68
Figure 20. Event Timing Probability Distribution for ENTER[7,14] (Node 25) 69
Figure 21. Event Timing Probability Distribution for LEAVE[8,14] (Node 26) 70
Figure 22. Event Timing Probability Distribution for LEAVE[9,12] (Node 10) 70
Figure 23. The values of c(f,w) and d(w) for w = 1, …, 5 78
Figure 24. The values of S and Q 108
Figure 25. Delay between the first ENTER and first LEAVE events 115
Figure 26. Delay between the second ENTER and second LEAVE events 115
Figure 27. Delay between the first and second pairs of ENTER and START events 116
Figure 28. Delay Bound Example 139

CHAPTER 1: INTRODUCTION

1.1 Overview

A simulation is the imitation of the operation of a process or system over time. It attempts to build an experimental device that will act like a real system in the aspects that are important to the users. Simulation has been and will continue to be a widely used technique for solving problems in engineering, business, physical science, artificial intelligence, economics and many other fields. Simulation is a powerful and important tool that provides a method for evaluating existing or alternative decisions, plans and policies without having to conduct experiments on the real system.

Discrete event simulation is intuitively appealing to users. Not only does it mimic what happens in a real system; it also handles discontinuous functions, randomness and dynamics [Ingalls, et al. 2000], and it allows users to analyze both long-term behavior and transient behavior. One can say that it is limited only by the amount of time one is willing to spend gathering data and programming the model; otherwise, the model of the world can include anything the modeler wants. However, this comes at the cost of collecting the necessary data and spending enough time and effort building a complex model. Thus, modeling and analysis for a discrete event simulation can be time consuming and expensive.

A real-world system is full of uncertainties, and the processes or systems under study are sometimes highly random and poorly understood.
In order to study a real-world system, we often have to make a set of assumptions about how it works using statistical, mathematical or logical relationships. The statistical relationships used in discrete event simulation are often based on the incomplete information that can be gathered about the real system, and users often have to make intuitive choices of a specific statistical distribution to use as input to the simulation model. Discrete event simulation outputs are random variables that are usually based on random inputs [Banks, et al. 2005]. The results of a discrete event simulation can be difficult to interpret because it can be hard to distinguish whether an observation is a result of system interrelationships or of randomness [Banks, et al. 2005]. Even when the results are interpreted correctly, a discrete event simulation model can only tell a user how a system works under given conditions; it does not tell the user how to arrange the system so that it works best under those conditions. The issues discussed above are among the pitfalls of using discrete event simulation.

Qualitative discrete event simulation (QDES) is the qualitative counterpart of discrete event simulation. The qualitative approach to discrete event simulation was developed by Ingalls, et al. (2000) in an attempt to improve the techniques used in discrete event simulation and make it an even more robust decision tool. It is useful in situations where quality data is not available; in fact, it is designed to represent whatever level of knowledge is available about the real system [Ingalls, et al. 2000]. Discrete event simulation can be qualitatively defined by permitting imprecise specification of elements that are typically quantitatively specified either deterministically or in the form of a probability distribution [Ingalls 2000]. This approach assumes imprecise specification of event occurrences.
More specifically, event execution time uncertainty is represented in the form of closed intervals in R. These intervals are called temporal intervals [Ingalls, et al. 2000]. The simulation time clock is also represented in the form of an interval. As a result of this new representation, the simulation executive, a key part of a simulation system that is responsible for controlling the time advance, has evolved to allow the implementation of temporal intervals and the construction of a temporal interval clock. When temporal intervals are used in QDES, the simulation executive runs the execution processes and orders the events according to their execution times. The ordering of interval execution times is made possible with Allen's interval algebra, which formally expresses the temporal relations between intervals, the operations on them and the reasoning about them [Allen 1983]. Even in the simplest QDES model, ties in the ordering of event execution times are likely. Instead of assuming a tie-breaking strategy as in regular discrete event simulation, the QDES simulation executive creates a process for every possible ordering and produces all possible outcomes of a simulation model. This characteristic distinguishes QDES from regular discrete event simulation. QDES characterizes all possible outcomes, including the outcome from a regular discrete event simulation, which gives it far-reaching potential benefits. One could check state variable values to see if they are in valid ranges in every process [Ingalls, et al. 2000]. For planning and scheduling problems, schedules would not have to be rerun every time something did not happen according to plan [Ingalls 2004]. Thus, as long as the input intervals are respected, at least one process in the run would characterize the events that had taken place and would give information on the next scheduling decision [Ingalls 2004].
The execution sequences of low or high probability events (the tail probability events) could be executed in the QDES model if users are interested in these events. Ingalls (2000) developed scoring methodologies to assist in the analysis of data generated by the QDES model; however, more work needs to be done in this area. These scoring methodologies create simple approximations that provide users with information to rank the processes. They are made in an attempt to approximate the event execution probabilities. With these approximations in hand, algorithms could be implemented to separate the more likely processes from the less likely processes. This would greatly expand the benefits of using qualitative discrete event simulation as a decision tool, as it would help to answer questions like "What is the likelihood of disastrous events? What leads to these disastrous events? How can we prevent them?" These are the intriguing questions that users want answered by a decision tool.

1.2 Research Objective and Motivation

Previous work on Qualitative Discrete Event Simulation (QDES) by Ingalls et al. (2000) and Ingalls (2003) uses time interval representations in discrete event simulation (without any assumptions on the distribution over these intervals) to generate execution threads that characterize all possible outputs of the simulation. Motivated by the intention to advance qualitative discrete event simulation as a decision tool and to expand its capability to provide meaningful output information, the author intends to conduct research in the qualitative discrete event simulation area. In particular, the author is interested in creating approximations to rank the processes generated from the QDES output. The current methodologies on scoring techniques fall short of ranking the processes according to the probabilities of the event sequences.
The current QDES framework also does not have statistical analysis features to collect and report statistics on certain states, the values of global variables, or other performance statistics based on the attributes of entities in the simulation model. The primary objective of this dissertation research is to calculate the probability of occurrence and the event time distribution of each event that is generated using QDES, by imposing an assumption on the distribution of the delay intervals in the simulation. We are particularly interested in imposing a uniform distribution on the delay intervals. The resulting calculations can be used to calculate the simulation time-persistent output statistics and tally statistics for variables that are of interest to the users. The second objective of this dissertation is to use these calculations (the probability distribution of the event sequences and the thread probabilities) to produce near exact output statistics. The advancement in the area of reasoning with probability to produce near exact output statistics from QDES will allow modelers to make wise decisions based on a single run, instead of sampling from multiple runs as in regular discrete event simulation models. The near exact output statistics will be able to describe the distribution of different variables, such as the queue length, customer delay, server utilization, etc., together with information on the probability of these average values occurring for each variable and the event sequences that lead to them. By imposing a statistical distribution to describe the threads, modelers can also identify low probability threads that are not significant, which could contribute to the area of thread scoring.

CHAPTER 2: LITERATURE REVIEW

2.1 Discrete Event Simulation

Discrete event simulation is the modeling of a system where the state variables of the system change at discrete points in time, at the instant an event occurs.
In every simulation model, there is a global variable that is used to keep track of the current value of the simulated time. This variable is known as the simulation clock. Since discrete event simulation involves the modeling of a system over time, it needs a mechanism to advance the simulated time from its current value to another. The most widely used mechanism follows the next-event time advance approach. We reserve the term Regular Discrete Event Simulation (RDES) for simulation models that follow this approach. The next-event time advance approach uses a future event calendar to advance the simulated time, and thus guarantees that all events generated during the simulation are executed in the correct chronological order according to the execution time of each event. Every event is represented by an event notice that contains the event execution time, event type and other data related to that particular event. The future event calendar keeps a list of all event notices for events that are scheduled to occur at a future time. Every event may schedule and/or cancel other events. At the start of an RDES model, the simulation clock is initialized to zero. The occurrence times of future events are computed and inserted into the future event calendar. The future event calendar is ordered by event times and arranged chronologically, with the most imminent future event placed at the top. The first event notice is then removed from the top of the future event calendar. The simulation clock is advanced to the scheduled time of the next event, which is retrieved from the event notice. At the instant this event occurs, the state of the system is updated to account for its occurrence. Then, the completion time of this event is computed, an "end" event is scheduled and its associated event notice is placed on the future event calendar to signify the completion of the event.
At this time, cumulative statistics can be collected so that summary statistics for the simulation run can be computed at the end of the simulation. Then, the next "new" event at the top of the future event calendar is removed, and the whole process of advancing the simulation clock, updating the states of the system and scheduling or canceling events continues until all the events have been executed or the stopping condition of the simulation model is satisfied.

The future event calendar is a strongly ordered list according to the event times. The event times must satisfy the following condition [Banks, et al. 2005]. Let t1 be the time of the most imminent event, and let t2, t3, …, tn be the times of the 2nd, 3rd, …, nth most imminent events. Then at any given time t:

t < t1 ≤ t2 ≤ t3 ≤ … ≤ tn

Sometimes, several events may have the same execution time; such a situation is known as a tie. Events that have the same execution time are referred to as simultaneous events [Banks, et al. 2005]. The result of a simulation model may vary depending on the order in which simultaneous events are executed. There are several tie-breaking mechanisms to determine the order of simultaneous events, sometimes ad hoc and specific to the simulation protocols being used. One type of arbitrary tie-breaking mechanism allows a nondeterministic ordering of simultaneous events. Another type allows the modelers to explicitly specify the order of simultaneous events or use a default ordering. The advantage of such a mechanism is that it guarantees that the results of a simulation are reproducible.

The creation of a discrete event simulation model can be highly dependent on the availability of system data. If data is available, modelers normally begin with the development of a frequency distribution or histogram of the data in an attempt to fit the data to a family of distributions.
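The next-event time advance mechanism described above can be sketched in a few lines of code. The following is a minimal illustration only, not the dissertation's implementation: the event types, the constant interarrival time of 5 and service time of 3, and the single-server state variables are all hypothetical choices made for the sketch. A sequence counter provides a deterministic FIFO tie-breaking rule for simultaneous events, matching the reproducibility point above.

```python
import heapq

def run(stop_time):
    """Minimal next-event time-advance loop (illustrative sketch)."""
    clock = 0.0
    calendar = []            # future event calendar: (time, seq, event_type)
    seq = 0                  # tie-breaker: FIFO order among simultaneous events

    def schedule(delay, event_type):
        nonlocal seq
        heapq.heappush(calendar, (clock + delay, seq, event_type))
        seq += 1

    state = {"busy": False, "queue": 0, "served": 0}
    schedule(0.0, "ENTER")                          # prime the calendar

    while calendar:
        clock, _, event = heapq.heappop(calendar)   # most imminent event notice
        if clock > stop_time:                       # stopping condition
            break
        if event == "ENTER":                        # arrival: may schedule an "end" event
            schedule(5.0, "ENTER")                  # next arrival (constant delay here)
            if state["busy"]:
                state["queue"] += 1
            else:
                state["busy"] = True
                schedule(3.0, "LEAVE")              # service completion ("end" event)
        elif event == "LEAVE":                      # departure: update state, pull from queue
            state["served"] += 1
            if state["queue"] > 0:
                state["queue"] -= 1
                schedule(3.0, "LEAVE")
            else:
                state["busy"] = False
    return state

print(run(100.0))
```

Because the heap is keyed on `(time, seq, ...)`, popping always yields the most imminent event notice, so the chronological-ordering condition on event times holds by construction.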
Sometimes, it is impractical or impossible to get these data, and even when the data is available, there are still other issues that need to be addressed, such as data overload and data abstraction [Ingalls 1999]. Collecting data from the real system is time-consuming and requires a substantial resource allocation. Beyond that, a great deal of effort is required to distill quality data and encode the available information into a discrete event simulation model. In a discrete event simulation, the inputs to a simulation model are usually represented using probability distributions. These models rely on the specification of the probability distributions and the associated parameters that serve as inputs to the model. Each of these probability distributions carries statistical assumptions, such as the independent and identically distributed (i.i.d.) assumption, the normality assumption and so on. These assumptions are often invalid, but modelers are forced to make them. The choice of input model can significantly impact the prediction or decision to be made based on the simulation results. Depending on the type of distribution used in the model, the modelers are required to make the corresponding assumptions. Every simulation analysis is based on a variety of assumptions about the nature of the data, and erroneous conclusions are likely if one does not carefully evaluate the validity of the assumptions behind the analysis. It has become standard procedure to make these assumptions when using discrete event simulation as an analysis tool. Even if the assumptions could be considered fairly valid (for example, in a strictly controlled environment), the task of identifying a probability distribution to represent the input data is not easy.
Moreover, the uncertainty in a discrete event simulation model relates only to the choice of a family of distributions and the selection of the parameters, as the parameters can either be entered as constant values or chosen from one of the probability distributions. Discrete event simulation is a computer-based statistical sampling experiment [Law and Kelton 2000]; therefore, the result of any analysis based on input represented by probability distributions is itself a probability distribution. Each simulation run denotes the trajectory behavior of the simulation model in response to a particular experiment. Therefore, discrete event simulation does not provide the desired information in regard to the viability of the system under study. For example, it does not tell you under what conditions the system will fail or succeed, because these types of questions deal with the transient behavior of the system under study [Ingalls et al. 2000]. As a result, modelers are left with no choice but to conduct a wide range of simulation runs with different inputs in order to grasp a deeper understanding of the system's viability. Another important characteristic of discrete event simulation is that it uses a time-point representation to model the occurrence times of future events. The time-point representation does not allow any uncertainty of information; often the exact relationship between two time points is not known, but some constraints on how they could be related are known [Allen 1993]. Thus, the notion of a time point is not decomposable and is not useful in a reasoning system. According to Allen (1993), an event may occur instantaneously at a particular time point, but it appears that such an event could be decomposed into subevents if it is examined closely. For example, an event of "a customer arriving at a bank" at a specific time point could be decomposed into "open the front door" and "look for a free bank teller or queue and walk towards it".
This poses the question of whether another representation is possible for the modeling of discrete event systems.

2.2 Qualitative Simulation

Qualitative simulation is a reasoning technique that derives useful inferences from the modeling of a system when there is a lack of good quantitative information about the system under consideration. It is motivated by the desire to reason about objects, processes or events in order to uncover all possible behaviors that may exist in the system. One major distinction between qualitative simulation and quantitative simulation is that a quantitative model is the result of a particular experiment, while the output from a qualitative model is the response to all possible experiments [Cellier 1991]. When a qualitative simulation determines the next possible state, it can easily determine that there are several next possible states because of the imprecise nature of the data. A qualitative simulation will then execute each of these possible next states. Thus, the results of a qualitative simulation contain all possible event sequences. Qualitative simulation is particularly useful when the level of knowledge about the system under consideration is imprecise.

Another major distinction is in the representation of state variables and of time. A model is quantitative if the state variables are real-valued; otherwise, the model is qualitative [Cellier 1991]. This is an important distinction which can identify different types of simulation models. Simulation models can be largely classified into two major types, namely continuous simulation and discrete event simulation.
Continuous simulation often deals with the modeling of a physical system over time by a representation in which the state variables change continuously with respect to time; a traditional continuous simulation model often involves differential equations that describe the rate of change or the interaction of state variables with time. The abstraction is based on ordinary differential equations, which are numerical in character. Kuipers (2001) described a continuous-time qualitative simulation model using a qualitative differential equation model as an abstraction of an ordinary differential equation, consisting of a set of real-valued variables with functional, algebraic and differential constraints among them. A qualitative differential equation consists of variables which are described in terms of their ordinal relations with a finite set of symbolic landmark values. The relationships could be first-order relationships as simple as: as a increases, b increases. The values a and b could be described as increasing or decreasing over particular ranges.

The concept of combining discrete event simulation with qualitative simulation was explored by Ingalls (1999), who developed a qualitative discrete event simulation methodology using the temporal interval as the simulation time specification. A temporal interval allows the user to define time as an interval in the simulation, which means that the current state of the simulation can occur at any time during the interval defined by t = [t−, t+]. Temporal intervals are represented by modeling their endpoints, assuming for any interval t that the lesser endpoint is denoted by t− and the greater by t+. The level of uncertainty about the exact timing of the event occurrence is reflected in the width of the interval [Ingalls et al. 2000]. The more uncertainty there is in the event timing, the wider the corresponding temporal interval.
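The endpoint representation of a temporal interval can be made concrete with a small sketch. This is an assumption-laden illustration, not code from the dissertation: the class name `TemporalInterval` and the `shift` operation (endpoint-wise interval addition) are hypothetical, chosen to show how an uncertain delay widens the resulting event time.

```python
class TemporalInterval:
    """A closed interval [lo, hi] in R representing uncertain event timing."""
    def __init__(self, lo, hi):
        if lo > hi:
            raise ValueError("lesser endpoint must not exceed greater endpoint")
        self.lo, self.hi = lo, hi

    @property
    def width(self):
        # The width reflects the level of uncertainty: 0 means an exact time point.
        return self.hi - self.lo

    def shift(self, delay):
        # Scheduling after an uncertain delay adds intervals endpoint-wise:
        # [t-, t+] + [d-, d+] = [t- + d-, t+ + d+].
        return TemporalInterval(self.lo + delay.lo, self.hi + delay.hi)

clock = TemporalInterval(0, 0)          # simulation starts at an exact time point
delay = TemporalInterval(3, 8)          # a delay known only to lie in [3, 8]
event = clock.shift(delay)
print(event.lo, event.hi, event.width)  # 3 8 5
```

Note how an exact clock (width 0) combined with an uncertain delay yields an event whose timing is itself uncertain, which is why the QDES clock must also be an interval.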
This simulation methodology is developed within the modeling framework of event graphs and simulation graph models [Schruben 1983; Som and Sargent 1989; Yücesan and Schruben 1992; Schruben and Yücesan 1993; Ingalls et al. 1996]. Another qualitative approach to discrete-event simulation was taken by Zeigler and Chi (1992) using a linear polynomial representation, known as the Symbolic Discrete Event System. This approach allows modelers to express times symbolically, both when the timing of events is unknown and when the timing is known but can vary. The time advance values can be expressed as linear polynomials such as 1, 2, s, t, 2s, t+s, 2t−s+9, etc., which must evaluate to nonnegative real numbers [Zeigler and Chi 1992]. The temporal interval specification extends the regular time base from real numbers to an interval representation. This specification also serves the purpose of representing uncertainty in event execution times. The linear polynomial representation is used to allow the manipulation of expressions for time with symbols representing unspecified event times. For example, let p = waiting time at register A and q = process time at register A; then "p+q" is a valid expression according to the symbolic discrete event system specification. This expression could be used to represent the time a customer spends at register A, where p and q are symbols that represent unspecified event times. When p and q are assigned numerical values, the expression evaluates to a real number.

2.3 Qualitative Discrete Event Simulation (QDES)

Qualitative discrete event simulation (QDES) is an event-scheduling approach to simulation modeling developed by Ingalls (1999) and Ingalls et al. (2000). It extends the concept of qualitative simulation to apply particularly to discrete event systems. QDES uses the next-event time advance approach, as in a regular discrete-event simulation (RDES).
At the start of a simulation model, the simulation clock is initialized to zero and the times of occurrence of the most imminent future events are determined and inserted into an events calendar. The simulation clock is advanced from the occurrence of one event to the next, at which point the state of the system and the times of occurrence of future events are updated. This process advances the simulation clock until all the events in the events calendar have been executed or canceled, or a certain stopping criterion is met.

One distinct characteristic of QDES is that it allows modelers to specify elements qualitatively. These elements include the times of occurrence of future events, the simulation time clock and the state variables. The times of occurrence of future events are represented as time points in RDES. Unlike RDES, QDES assumes imprecise specification of event occurrences. The uncertainty of the event times is represented as a closed interval in R, also known as a temporal interval. There are two types of temporal intervals, namely the constant interval and the uncertain interval. A constant interval is an interval whose value remains the same throughout the entire simulation, while an uncertain interval is an interval that changes value every time the interval is evaluated. In using constant intervals, it is assumed that the actual value of the variable is a constant that lies somewhere within an interval. An uncertain interval is equivalent to modeling sampling from an unknown probability distribution that is bounded by an interval. As a result of the time interval representation in QDES, the future events calendar is also represented in temporal intervals. The future events calendar in QDES is also sorted according to event times, but it is not a strongly ordered list. Events are sorted according to the interval mathematics outlined in Allen (1983).
It is likely that there will be ties on the future events calendar because of the uncertain order of events. If there is a tie, QDES does not assume a tie-breaking strategy as in the case of RDES. Instead, QDES creates threads that make up all of the possible orderings of the ties, which differentiates QDES from RDES. The differences are that RDES's future events calendar is a strongly ordered list according to the event times, and RDES uses several tie-breaking mechanisms to determine the order of simultaneous events, sometimes ad hoc and specific to the simulation protocol being used. The future events calendar in QDES collects all the event notices whose execution order is uncertain and groups them in a set, called the nondeterministically ordered set (NOS). The capability of generating all possible scenarios is achieved with a thread generation algorithm. Two algorithms have been developed so far, namely the depth-first algorithm and the breadth-first algorithm. These two algorithms are discussed in more detail in the next section.

The execution of the QDES algorithms resembles RDES to some extent. The algorithms contain the basic steps of RDES, such as initializing the simulation clock, advancing the simulation clock from its current value to the next, inserting events into the future events calendar, determining the next event to be executed, and so on. When QDES is executed, the next possible state is determined. Due to the lack of precise data about the real world, there could be several next possible states, or a tie. The QDES algorithm has some major additional steps to ensure that each of these states is executed in turn, resulting in a set of threads that includes all of the possible orderings of event sequences. An example of a new thread being generated is when the execution times (expressed in temporal intervals) for two or more events overlap.
The distinctive characteristic of QDES of generating all possible orderings of event sequences is known as coverage [Ingalls et al. 2000]. The coverage property ensures that all outcomes of QDES are characterized and no outcome is missed. In RDES, by contrast, the simulation outcome is based on a sampling approach. The coverage property of QDES is very useful for planning and scheduling problems. Schedules do not have to be rerun every time something does not happen according to plan; the output of QDES is able to characterize the changes of events and give information on the next scheduling position, as long as the input intervals are respected. Another advantage of the coverage property is in debugging simulation models. QDES characterizes all possible scenarios, including anything that is characterized in an RDES model as well as sequences that have a low probability of execution. This gives the modeler absolute confidence in the validity of the simulation model [Ingalls et al. 2000].

In RDES, input parameters to the system are usually represented using probability distributions, and thus some modeling assumptions are made due to the lack of quality information about the real system. DES is a computer-based statistical sampling experiment [Law and Kelton 2000]. The result of any analysis based on input represented by probability distributions is itself a probability distribution. On the other hand, input parameters to QDES are expressed in temporal intervals, and no assumption is required to determine which probability distribution fits these input data.

2.4 Event Graphs and Simulation Graph Models

QDES is designed and developed using event graphs (EG) and the simulation graph model (SGM). The event graph and simulation graph framework was first introduced by Yücesan and Schruben (1992) and Schruben and Yücesan (1993) and further extended by Ingalls (1999) and Ingalls et al. (2000). Event graphs were introduced in order to have a discrete event simulation methodology that is based on system events. There are other types of discrete event models, such as block diagrams, process networks, and the activity wheel or activity life cycle [Ingalls 1994]. These models are structured from the activity standpoint.
The event orientation allows the discrete event simulation concept to work without the traditional entity definition. Let the simulation graph G = (V(G), ES(G), EC(G), ΨG) be a directed graph, where V(G) is the set of vertices of G, ES(G) is the set of scheduling edges, EC(G) is the set of canceling edges and ΨG is an incidence function that associates each edge of G with an ordered pair of vertices. The Simulation Graph Model (SGM) is then defined as S = (F, C, X, T, Γ, G), where

F = {fv: v є V(G)}, the set of state transition functions associated with vertex v,
C = {Ce: e є E(G)}, the set of scheduling edge conditions,
X = {Xe: e є E(G)}, the set of execution edge conditions,
T = {te: e є ES(G)}, the set of edge delay times as time intervals, and
Γ = {γe: e є ES(G)}, the set of event execution priorities.

The simulation graph G specifies the relationships that exist between the elements of the set of entities in a simulation model. The basic construct of the event graph with an edge execution condition is given in Figure 1. The nodes labeled A and B represent events, and the edge specifies that there is a relationship between the two events. The construct is interpreted as follows: if condition (i) is true at the instant event A occurs, then event B will be scheduled to occur t time units later. Event B will be executed t time units later, with the state variables in array n set equal to the values in array k, if condition (j) is true t time units later, where t may assume the value of zero. Note that it is possible to specify an edge with no condition.

Figure 1. The basic construct for an event graph

2.5 Interval Mathematics and Its Use in Modeling

Allen (1983) noted that the time-point representation does not allow any uncertainty of information, and often the exact relationship between two time points is not known. Thus, the notion of a time point is not decomposable and is not useful in a reasoning system. A temporal interval representation is sometimes more useful in certain situations.
An example illustrating the use of temporal intervals is the modeling of the start time of a crisis. Say there is an alarm system that triggers to inform the appropriate authority when there is a crisis. In this case, the start time of the crisis is not known, even if the start time of the triggering alarm can be determined accurately. An upper bound can be placed on the start time of the crisis if we assume that the crisis happened at a time earlier than the alarm time. In this case, it is only possible to specify the start time of the associated crisis as a time interval.

2.6 Temporal Intervals and Qualitative Time Calendar Construct

The uncertainty of an event time is represented as a closed interval in R, also known as a temporal interval. Temporal intervals are used as the timing mechanism in the QDES framework. There are two types of temporal intervals used in QDES, namely the constant interval and the uncertain interval. A constant interval is an interval whose values remain the same throughout the entire simulation, while an uncertain interval is an interval that changes values every time the interval is evaluated. In using constant intervals, it is assumed that the actual value of the variable is a constant that lies somewhere within an interval. An uncertain interval is equivalent to modeling sampling from an unknown probability distribution that is bounded by an interval. A temporal interval allows users to define time in terms of either a constant interval or an uncertain interval in a QDES model. For example, the current simulation time can be defined as t = [t−, t+]. This means that the current state of the simulation can occur at any time during the interval t. As a result of the use of temporal intervals in QDES, events in the future events calendar, and the future events calendar itself, are also defined in the same manner.
The future events calendar in the QDES framework is also sorted according to event times, but it is not a strongly ordered list. Events are sorted according to the interval mathematics outlined in Allen (1983). Say there are two events, A and B. Event A is scheduled to execute any time during the interval tA = [tA−, tA+] and event B is scheduled to execute any time during the interval tB = [tB−, tB+]. Event A precedes event B in the future events calendar in a QDES model if:

1. tA+ < tB− (e.g., [2,3] < [4,5])
2. tA− < tB− (e.g., [2,6] and [3,6])
3. tA− = tB− and tA+ < tB+

It is likely that there will be ties in the order of events in a QDES model. Such a situation arises when the execution times of the events intersect. For example, the order of events A and B is uncertain if tA ∩ tB ≠ ∅. When this condition arises, QDES creates different threads to execute the possible orderings of this set of events. This set of events is known as the nondeterministically ordered set (NOS). In a QDES model, the future events calendar is made up of a union of NOSs. Given the current state of the model, the future events calendar determines the NOS and then creates a thread for each event in this set. If there are a total of N events in the NOS, N threads are created. In the ith thread, the ith event is the first event to be executed. The remaining events in the set remain in the future events calendar for that thread. Each thread maintains its own calendar and calendar time. The implementation of the future events calendar for NOSs in QDES is known as the Temporal Interval Calendar (TIC).

2.7 Qualitative Discrete Event Simulation (QDES) Algorithm

The Simulation Graph Model (SGM) that was defined earlier as S = (F, C, X, T, Γ, G) is used as the modeling framework in the QDES algorithm in Ingalls's work. Additional mechanisms were introduced along with the definition of the SGM in order to facilitate the execution of the model.
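The three ordering rules and the grouping of overlapping events into a NOS can be sketched as follows. The encoding of an event's execution time as an endpoint pair is an assumption for illustration, and the grouping routine only chains consecutive overlapping entries of a sorted calendar, which is a simplification:

```python
def precedes(a, b):
    """Calendar sorting test: does event a precede event b?
    a and b are endpoint pairs (lo, hi) for the intervals tA and tB."""
    if a[1] < b[0]:                       # rule 1: tA+ < tB-
        return True
    if a[0] < b[0]:                       # rule 2: tA- < tB-
        return True
    return a[0] == b[0] and a[1] < b[1]   # rule 3: equal lower, smaller upper

def overlaps(a, b):
    """Intersecting intervals mean the execution order is uncertain (a tie)."""
    return max(a[0], b[0]) <= min(a[1], b[1])

def nos_partition(calendar):
    """Split a sorted calendar into groups of events whose order is uncertain,
    i.e. nondeterministically ordered sets."""
    groups = []
    for ev in sorted(calendar, key=lambda e: (e[0], e[1])):
        if groups and overlaps(groups[-1][-1], ev):
            groups[-1].append(ev)
        else:
            groups.append([ev])
    return groups

print(precedes((2, 3), (4, 5)), precedes((2, 6), (3, 6)))  # True True
print(nos_partition([(4, 5), (2, 3)]))   # [[(2, 3)], [(4, 5)]] -- no tie
print(nos_partition([(3, 6), (2, 6)]))   # [[(2, 6), (3, 6)]]  -- one NOS
```

A NOS with N members would then spawn N threads, one per candidate first event.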
These additional mechanisms include the global simulation clock τ, which is represented as a time interval, the events calendar L and the terminating conditions for the simulation ω. According to Ingalls (2003), L is defined as an ordered set such that L = {(x1, γ1, v1, e1, a1, b1), (x2, γ2, v2, e2, a2, b2), …}, where xi represents the execution time, γi represents the execution priority, vi is the vertex to be executed, ei is the index of the edge that is being scheduled, ai are the values of the edge attributes, and bi are the times at which events are scheduled for the ith event notice. Two new sets are introduced to handle temporal intervals, and they are defined as follows:

H = the set of saved states. This set is used to iterate through all the possible states in the simulation.
Nh = the NOS.

Two new variables are also introduced. Variable h is used as a counter to count the number of saved states and iterate through the set H. Variable nh is used to iterate through the set Nh. The execution of the QDES model with the SGM and the other definitions mentioned above is explained in the next section. There are currently two QDES algorithms defined, the depth-first algorithm [Ingalls et al. 2000] and the breadth-first algorithm [Agrawal 2002]. Each of these algorithms is explained in the next sections.

2.7.1 Depth-First Algorithm

A depth-first algorithm was created and implemented by Ingalls et al. (2000). The algorithm is designed to completely finish generating one whole thread before moving on to another thread. Any additional threads are stored on a stack, waiting to be executed at a later time. When there is a tie, two or more threads are generated. The state of the simulation after each thread explosion is saved on a stack called the save-state stack, H. Each save-state consists of a global events calendar and state variable information so that all possible states in the simulation can be executed recursively.
The algorithm assumes the first thread, puts the rest of the threads on the stack and continues. When the generation of this thread is complete, the algorithm restores the system from the save-state stack and continues the simulation for the second thread. A save-state counter is set up to count the number of saved states and to iterate through the save-state stack. Recall that all event notices whose execution order is uncertain are grouped in a set, called the nondeterministically ordered set (NOS). In the depth-first algorithm, a NOS counter is used to iterate through the NOS. The depth-first algorithm is very useful if a complete analysis of all possible scenarios is needed in decision-making. Since all possible schedules are characterized, as long as the input intervals are not violated, users can make fast strategic decisions based on the previously run output, thus saving time and effort.

2.7.2 Breadth-First Algorithm

A breadth-first algorithm was developed and implemented by Agrawal (2003). In a breadth-first algorithm, all the active threads are evaluated simultaneously. The explosion of threads in this algorithm can be viewed as a tree diagram. The QDES simulation starts either with one parent node or a set of parent nodes. The simulation execution continues and spawns new threads from each of these parents, one by one, until all possible child nodes from each of these parents are explored. Before advancing to the next level down the tree structure, the breadth-first algorithm ensures that all sibling nodes are executed. Agrawal (2003) proposed a queue structure in his implementation to store the breadth-first nodes, a set consisting of the event notices that are to be executed next, together with information such as the state variables and the global events calendar. This queue is denoted the breadth-first node queue. The breadth-first algorithm proceeds and gives system snapshots of all event sequences through time.
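The two exploration strategies can be illustrated with a single routine whose only difference is whether saved states are popped newest-first (the depth-first save-state stack) or oldest-first (the breadth-first node queue). This is a minimal sketch of thread explosion over one NOS, not the full QDES algorithm with its calendars and state variables:

```python
from collections import deque

def explore_threads(nos_events, breadth_first=False):
    """Enumerate every ordering of one NOS. Each saved state pairs the thread
    generated so far with the events still pending. Depth-first pops the
    newest saved state (stack H); breadth-first pops the oldest (node queue)."""
    threads = []
    states = deque([([], tuple(nos_events))])
    while states:
        thread, pending = states.popleft() if breadth_first else states.pop()
        if not pending:
            threads.append(thread)        # one complete event ordering
            continue
        # Each pending event could execute next: save one child state per choice.
        for i, ev in enumerate(pending):
            states.append((thread + [ev], pending[:i] + pending[i + 1:]))
    return threads

print(len(explore_threads(["A", "B", "C"])))  # 6 -- all 3! orderings
```

Both modes generate the same set of orderings; only the order in which complete threads appear differs, which is why the depth-first variant can finish and report one thread early while the breadth-first variant keeps sibling states synchronized level by level.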
It keeps track of each possible system state and of the event sequence that leads to that state. It would also be useful for eliminating threads that are considered unimportant or unlikely to give good information. For example, thread elimination could be added to the breadth-first algorithm if the number of thread explosions exceeds a certain limit. However, the depth-first algorithm has a speed advantage over the breadth-first algorithm, due to the way the breadth-first algorithm is structured.

2.8 Thread Scoring Techniques

Some of the threads generated by either of the above QDES algorithms may be less likely to occur than other threads. As the complexity of the system increases, the number of threads generated using QDES also increases. The thread generation can increase exponentially, causing the problem of extracting meaningful information from the output of the model. Thus, some scoring techniques were introduced to approximately rank the threads according to the likelihood of their event execution sequences. Thread scores can be used in breadth-first algorithms to eliminate threads that have lower scores relative to other threads, thereby reducing the run time of the algorithm. Ingalls (1999) described three scoring techniques. The midpoint ranking calculates the midpoint of each interval and ranks them accordingly. The second method is the multiple midpoint method, which also uses the midpoint of each interval; however, the resulting score takes into account the relative magnitudes of all the midpoints, so the event that is likely to execute first has a higher rank. Let En, n = 1, 2, …, N (N ≥ 2) denote events with execution intervals [Ln, Un] that overlap. Let Mn be the midpoint of interval [Ln, Un] for n = 1, 2, …, N. Let Rank(Mn) denote the rank of Mn.
The multiple midpoint ranking is calculated as in the following equation:

Rank(Mn) = (Mn − min_{1≤k≤N} Lk) / (min_{1≤k≤N} Mk − min_{1≤k≤N} Lk)

The uniform method assumes that the overlapping interval sections have equal chances of being executed next and that each interval follows a uniform distribution. A score is given to each interval, determined according to the probability that the given event would be executed first. The uniform distribution is chosen because it can easily be programmed into the simulation and it has a closed-form density function.

CHAPTER 3: DELAY CONSTRAINT VALIDITY

3.1 Introduction

In an RDES model, entities migrate from state to state while they work their way through the model. There are five entity states, as described by Schriber and Brunner (2007): the Active state, Ready state, Time-Delayed state, Condition-Delayed state and Dormant state. The Active state is the state of the currently moving entity, and only one entity moves at any instant. An entity in the Ready state is ready to move. There can be more than one entity in the Ready state; however, entities move one at a time. An entity in the Time-Delayed state is waiting for a known future simulated time to be reached so that it can then (re)enter the Ready state. The Condition-Delayed state is the state in which the entity is delayed until some specified condition comes about. If an entity is in the Dormant state, then it is in a state from which no escape will be triggered automatically by changes in model conditions; only modeler-supplied logic can transform the entity from the Dormant state to the Ready state.
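A minimal sketch of two of these techniques follows. The midpoint ranking is direct; for the uniform method, a closed form exists per the text, but the sketch below instead approximates the first-execution probabilities by Monte Carlo sampling, which is simpler to show (a substitution, not the original closed-form calculation):

```python
import random

def midpoint_rank(intervals):
    """Midpoint ranking: order event indices by the midpoints of their
    execution intervals [Ln, Un]."""
    mids = [(lo + hi) / 2 for lo, hi in intervals]
    return sorted(range(len(intervals)), key=lambda i: mids[i])

def uniform_scores(intervals, trials=50_000, seed=1):
    """Uniform method, sampled: score each event by the estimated probability
    that it executes first, with each execution time ~ Uniform(Ln, Un)."""
    rng = random.Random(seed)
    wins = [0] * len(intervals)
    for _ in range(trials):
        draws = [rng.uniform(lo, hi) for lo, hi in intervals]
        wins[draws.index(min(draws))] += 1
    return [w / trials for w in wins]

print(midpoint_rank([(4, 6), (3, 8), (0, 1)]))   # [2, 0, 1]
```

For non-overlapping intervals the uniform scores collapse to 0 and 1, and for heavily overlapping intervals they spread out, which is the behavior a thread-elimination rule in the breadth-first algorithm could exploit.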
Generic RDES software uses five lists to organize entities in the five entity states:

• Active Entity List – a list consisting of only the active entity
• Current Event List – a list consisting of entities in the Ready state
• Future Events List – a list of entities in the Time-Delayed state, ranked chronologically by the events' execution times
• Delay List – a list consisting of entities in the Condition-Delayed state
• User-Managed List – a list of entities in the Dormant state

The interdependencies and the delay constraints that exist between events are managed using these lists. For example, at the time of an entity's insertion into the Future Events list, the entity's delay time (also known as move time) is calculated by adding the exact known (sampled) duration of the time-based delay to the value of the simulation clock. Since the Future Events list is a strongly ordered list and events are executed based on exact (sampled) execution times, there is no need to check the validity of the order of event execution. Based on the QSGM algorithm proposed by Ingalls (1999) and Ingalls et al. (2000), when the execution of a node triggers a new event to be scheduled at a later time interval, the new event's delay time is calculated by adding the time-based delay, which is also represented as a time interval, to the time interval of the simulation clock. Since the time interval representation does not provide a means of strongly ordering the Future Events calendar in QDES, all possible orders of event execution are simulated. Due to the uncertainties in the timing of the events, it is necessary to check the validity of the delay constraint that exists between an event and the event that scheduled it. The algorithm proposed by Ingalls (1999) and Ingalls et al. (2000) did not consider checking the validity of delay constraints and thus may generate invalid threads.
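The two scheduling rules contrasted above can be written side by side; this is only a sketch of the arithmetic involved, with tuples standing in for temporal intervals:

```python
def move_time(clock, delay):
    """RDES move time: exact clock value plus an exact sampled delay."""
    return clock + delay

def move_interval(clock, delay):
    """QDES delay time: interval addition of the clock interval and the
    time-based delay interval, endpoint by endpoint."""
    return (clock[0] + delay[0], clock[1] + delay[1])

print(move_time(5, 3))                # 8  -- a single point, strongly orderable
print(move_interval((0, 0), (3, 8)))  # (3, 8)
print(move_interval((3, 6), (4, 6)))  # (7, 12)
```

The RDES result is a point, so the Future Events list stays strongly ordered; the QDES result is itself an interval, which is why overlapping execution windows, and hence validity checks, arise at all.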
Ingalls (1999) presents an example to demonstrate that QDES has the same functionality as RDES and also to show the ability of QDES to model at a more strategic level rather than at a lower, operational level. The same example will be used to illustrate the discovery of invalid threads and the approach to eliminating the generation of invalid threads in the QDES algorithm. This chapter starts with an illustration of the Simple Queuing Model (the same as the simple inbound-outbound logistic pipeline model in Ingalls (1999)).

3.2 Simple Queuing Model

Figure 2 shows a simple queuing model with four events: "BEGIN", "ENTER", "START" and "LEAVE". This model can be viewed as a bank teller system where the system will "BEGIN" at time [0,0], customers "ENTER" the bank and wait to be served by the bank teller, the next customer in line will "START" the service process, and, when the service is finished, the customer will "LEAVE" the bank. "BEGIN" is the first event to be executed, and it is used to initialize the state variables. The variables tracked in this model are the queue length Q, the status of the bank teller S, and the number of customers that have exited E. The status of the bank teller is busy if S=0; if S=1, the bank teller is idle. Changes in the state variables occur when an event occurs. The state variable changes of an event are stated below the event itself in Figure 2. For example, when the "START" event occurs, the queue length is decremented by 1 (Q=Q−1) and the status of the bank teller is changed to busy (S=0).

Figure 2. Simple Queuing Model (Begin: Q=0, E=0, S=1; Enter, delay [3,8]: Q=Q+1; Start: S=S−1, Q=Q−1; Leave, delay [4,6]: S=S+1, E=E+1)

The conditions on the edges in Figure 2 are based on the state of the system. The condition S>0 checks that the bank teller is available to service the next customer in line. If a customer arrives at the bank and the bank teller is not busy, the customer is serviced immediately.
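For reference, the event graph of Figure 2 can be executed as a regular DES once exact delays are fixed. The sketch below substitutes single representative values for the intervals [3,8] and [4,6] and assumes the usual Q>0 condition on the Leave-to-Start edge; it illustrates only the event logic, not the QDES interval mechanics:

```python
import heapq

def simple_queue_rdes(enter_delay=5.5, service=5.0, max_exits=2):
    """Run the Figure 2 event graph as an RDES with fixed delays standing in
    for the intervals [3,8] (Enter) and [4,6] (Start -> Leave)."""
    Q, S, E = 0, 1, 0                     # queue length, teller status, exits
    clock, seq = 0.0, 0                   # seq breaks heap ties (FIFO)
    cal = [(0.0, 0, "Enter")]             # Begin schedules the first Enter at t=0
    trace = []
    while cal and E < max_exits:
        clock, _, ev = heapq.heappop(cal)
        if ev == "Enter":
            Q += 1
            seq += 1; heapq.heappush(cal, (clock + enter_delay, seq, "Enter"))
            if S > 0:                     # edge condition: teller available
                seq += 1; heapq.heappush(cal, (clock, seq, "Start"))
        elif ev == "Start":
            S -= 1; Q -= 1
            seq += 1; heapq.heappush(cal, (clock + service, seq, "Leave"))
        elif ev == "Leave":
            S += 1; E += 1
            if Q > 0:                     # next customer in line starts service
                seq += 1; heapq.heappush(cal, (clock, seq, "Start"))
        trace.append((clock, ev, Q, S, E))
    return trace
```

A single run like this produces exactly one event sequence; QDES replaces the sampled delays with the intervals themselves and enumerates every sequence those intervals allow.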
Using the example model above, the author wants to demonstrate how the probability for each event sequence is approximated. In order to keep the example model simple enough to illustrate the approximation, the QDES model for this example is set to terminate after two customers have exited; that is, the terminating condition is reached when the number of exits equals two (E=2). There are a total of 11 threads generated by Ingalls's (1999) QDES algorithm with the above terminating condition. Figure 3 shows the event sequences of each thread [Ingalls 1999]. Threads 3, 5, 6 and 7 are not considered in this research. This is because at node 14, event Enter[6,6] has an exact execution time, and at node 11, event Enter[12,12] also has an exact execution time. In probability theory, the probability that event Enter will execute at any exact time x is zero, for all real numbers x. Since there is no information gained from considering these threads, they are eliminated from this example. Figure 4 shows the event sequences for the simple queuing model with threads 3, 5, 6 and 7 eliminated.

3.3 Discovery of Invalid Threads

In this section, each of the remaining threads from Figure 4 will be reviewed and checked for delay constraint validity. Starting with thread 1, events Begin[0,0], Enter[0,0] and Start[0,0] have an execution time of [0,0]. There is no need to check delay constraint validity for events that have a [0,0] execution time, as the delay time from the scheduling events to these events is equal to [0,0].

Figure 3. Event sequences of the simple queuing model

Figure 4.
Event sequences for the simple queuing model with threads 3, 5, 6 and 7 eliminated

Another exception to checking delay constraint validity is when the scheduling event has a zero time-based delay. Recall that a newly scheduled event's delay time is calculated by adding the time-based delay, which is also represented as a time interval, to the time interval of the simulation clock. If the time-based delay is equal to [0,0], then the scheduled event is to be executed immediately, so there is no need to check delay constraint validity. The next event on thread 1 is Enter[3,6] (node 4); it is scheduled by Enter[0,0] (node 2) with the delay constraint of [3,8] (refer to Figure 5). In order for the thread to be valid (at least up to node 4), the delay time from node 2 to node 3, added to the delay time from node 3 to node 4, should be within [3,8]. In other words, the total elapsed time from node 2 to node 4 has to fall within [3,8]. Since [0,0]+[3,6] = [3,6] and [3,6] is within the delay constraint of [3,8], the execution of thread 1 up to node 4 is considered valid, as [3,6]∩[3,8] = [3,6] ≠ ∅. Recall from interval arithmetic that [a,b] − [c,d] = [(a−d), (b−c)]. The delay constraint validity check is performed for all the subsequent nodes of thread 1, and for all the other threads as well, as shown below.

Figure 5.
Checking delay constraint validity for Enter[3,6]: [3,6]−[0,0] = [3,6]

Thread 1
Note: node 6 is omitted because there is a zero time delay from node 5 to node 6.
Leave[4,6] (node 5) is scheduled by Start[0,0] (node 3) with delay constraint [4,6]: ([3,6]−[0,0])+([4,6]−[3,6]) = [3,6]+[0,3] = [3,9]. Since [3,9]∩[4,6] = [4,6] ≠ ∅, thread 1 is valid up to node 5. Valid!
Enter[6,12] (node 7) is scheduled by Enter[3,6] (node 4) with delay constraint [3,8]: ([4,6]−[3,6])+([6,12]−[4,6]) = [0,3]+[0,8] = [0,11]. Since [0,11]∩[3,8] = [3,8] ≠ ∅, thread 1 is valid up to node 7. Valid!
Leave[8,12] (node 8) is scheduled by Start[4,6] (node 6) with delay constraint [4,6]: ([6,12]−[4,6])+([8,12]−[6,12]) = [0,8]+[0,6] = [0,14]. Since [0,14]∩[4,6] = [4,6] ≠ ∅, thread 1 is valid up to node 8. Valid!

Thread 2
Enter[9,12] (node 9) is scheduled by Enter[6,12] (node 7) with delay constraint [3,8]: [9,12]−[6,12] = [0,6]. Since [0,6]∩[3,8] = [3,6] ≠ ∅, thread 2 is valid up to node 9. Valid!
Leave[9,12] (node 10) is scheduled by Start[4,6] (node 6) with delay constraint [4,6]: ([9,12]−[9,12]) + [3,6] (node 9's delay bound) + ([6,12]−[4,6]) = [0,3]+[3,6]+[0,8] = [3,17]. Since [3,17]∩[4,6] = [4,6] ≠ ∅, thread 2 is valid up to node 10. Valid!

Thread 4
Leave[8,12] (node 13) is scheduled by Start[4,6] (node 6) with delay constraint [4,6]: [8,12]−[4,6] = [2,8]. Since [2,8]∩[4,6] = [4,6] ≠ ∅, thread 4 is valid up to node 13. Valid!

Thread 8
Note: node 24 is omitted because there is a zero time delay from node 23 to node 24.
Leave[4,6] (node 22) is scheduled by Start[0,0] (node 3) with delay constraint [4,6]: [4,6]−[0,0] = [4,6]. Since [4,6]∩[4,6] = [4,6] ≠ ∅, thread 8 is valid up to node 22. Valid!
Enter[4,8] (node 23) is scheduled by Enter[0,0] (node 2) with delay constraint [3,8]: ([4,8]−[4,6])+[4,6] = [0,4]+[4,6] = [4,10]. Since [4,10]∩[3,8] = [4,8] ≠ ∅, thread 8 is valid up to node 23. Valid!
Enter[7,14] (node 25) is scheduled by Enter[4,8] (node 23) with delay constraint [3,8]: [7,14]−[4,8] = [0,10]. Since [0,10]∩[3,8] = [3,8] ≠ ∅, thread 8 is valid up to node 25. Valid!
Leave[8,14] (node 26) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]: ([8,14]−[7,14])+[3,8] = [0,7]+[3,8] = [3,15]. Here [3,8] is the delay bound for node 25; since there is a zero time delay from node 23 to node 24, the delay bound from node 23 to node 25 can be extended to run from node 24 to node 25. Since [3,15]∩[4,6] = [4,6] ≠ ∅, thread 8 is valid up to node 26. Valid!

Thread 11
Leave[8,14] (node 31) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]: [8,14]−[4,8] = [0,10]. Since [0,10]∩[4,6] = [4,6] ≠ ∅, thread 11 is valid up to node 31. Valid!

Thread 9
Enter[10,14] (node 27) is scheduled by Enter[7,14] (node 25) with delay constraint [3,8]: [10,14]−[7,14] = [0,7]. Since [0,7]∩[3,8] = [3,7] ≠ ∅, thread 9 is valid up to node 27. Valid!
Leave[10,14] (node 28) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]: [3,8] (delay bound from node 25) + [3,7] (delay bound from node 27) + ([10,14]−[10,14]) = [3,8]+[3,7]+[0,4] = [6,19]. Since [6,19]∩[4,6] = [6,6] and P([6,6]) = 0, thread 9 is not valid. Invalid!

Thread 10
Enter[13,14] (node 29) is scheduled by Enter[10,14] (node 27) with delay constraint [3,8]: [13,14]−[10,14] = [0,4]. Since [0,4]∩[3,8] = [3,4] ≠ ∅, thread 10 is valid up to node 29. Valid!
Leave[13,14] (node 30) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]: [3,8] (delay bound from node 25) + [3,7] (delay bound from node 27) + [3,4] (delay bound from node 29) + ([13,14]−[13,14]) = [3,8]+[3,7]+[3,4]+[0,1] = [9,20]. Since [9,20]∩[4,6] = ∅, thread 10 is not valid. Invalid!

A validity check needs to be performed only once for each node; once a node has been checked and verified valid, it does not need to be checked again.
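The arithmetic used in these checks can be sketched directly. Two conventions are made explicit in the code: the lower endpoint of an elapsed-time interval is clamped at zero, which matches the worked values above (e.g. [4,6]−[3,6] = [0,3]), and a validity test requires the intersection to be wider than a single point, which encodes the P([6,6]) = 0 rule applied to thread 9:

```python
def elapsed(later, earlier):
    """Interval difference [a,b] - [c,d] = [a-d, b-c], with the lower endpoint
    clamped at zero since elapsed time cannot be negative."""
    (a, b), (c, d) = later, earlier
    return (max(0, a - d), b - c)

def add(x, y):
    """Interval addition, used to accumulate elapsed times along a thread."""
    return (x[0] + y[0], x[1] + y[1])

def is_valid(total_elapsed, constraint):
    """Valid only if the intersection is a nondegenerate interval: an empty
    or single-point overlap has probability zero of being realized."""
    lo = max(total_elapsed[0], constraint[0])
    hi = min(total_elapsed[1], constraint[1])
    return lo < hi

# Thread 1, node 5: ([3,6]-[0,0]) + ([4,6]-[3,6]) = [3,6] + [0,3] = [3,9]
total = add(elapsed((3, 6), (0, 0)), elapsed((4, 6), (3, 6)))
print(total, is_valid(total, (4, 6)))    # (3, 9) True

# Thread 9, node 28: [3,8] + [3,7] + ([10,14]-[10,14]) = [6,19]; overlap [6,6]
total9 = add(add((3, 8), (3, 7)), elapsed((10, 14), (10, 14)))
print(total9, is_valid(total9, (4, 6)))  # (6, 19) False
```

Thread 10's final check fails for the stronger reason that the intersection is empty; the same `is_valid` test catches both cases.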
Once a node has been verified as valid, the intersection of the delay constraint with the total elapsed time interval (from the scheduling event to its scheduled event) becomes a delay bound that can be used in the calculation of the total elapsed time for subsequent nodes in the same thread. For example, at node 10, instead of using the delay time for node 9, which equals [9,12]−[6,12]=[0,6], the delay bound of [3,6] is used in the calculation of the total elapsed time for node 10. This is because it was established when checking node 9 that the minimum total elapsed time from node 7 to node 9 is 3 time units and the maximum is 6. Thus, the information gained from checking the delay constraint at preceding nodes needs to be used in the validity checks for subsequent nodes. The delay bound for a node can be extended to other nodes under certain circumstances, especially for nodes with zero delay time. For example, node 23 has a zero delay time, so the delay bound for a (scheduled) node that has node 23 as its scheduling node can be extended to the immediate successor of node 23; in this case, the delay bound is extended from node 23 to node 24. Based on the verification performed on each node, the QSGM output from the Simple Queuing Model has two invalid threads, threads 9 and 10. These threads will be deleted from the QSGM output and will not be considered further. There are a total of five valid threads remaining in the Simple Queuing Model, as shown in Figure 6.

Figure 6.
The Remaining Valid Threads from QSGM Output

[Figure: tree diagram of the five remaining valid threads (threads 1, 2, 4, 8 and 11), showing nodes 1 through 10, 13 and 22 through 31 with their events: Begin[0,0], Enter[0,0], Start[0,0], Enter[3,6], Leave[4,6], Enter[6,12], Leave[8,12], Enter[9,12], Leave[9,12], Enter[4,8], Start[4,8], Enter[7,14], Leave[8,14].]

3.4 Delay Constraint Algorithm

In this section, the algorithm created to check and verify delay constraints for a QSGM output, using the methodology described in Section 3.3, is presented. The algorithm can be run to check the validity of each thread one by one, visiting each node only once.

L is an ordered set, L = {(x1, γ1, v1, e1, a1, b1), (x2, γ2, v2, e2, a2, b2), ...}, where xi represents the execution time, γi represents the execution priority, vi is the vertex to be executed, ei is the index of the edge that is being scheduled, ai are the values of the edge attributes, and bi are the times at which events are scheduled for the ith event notice. If γi = 1, event i has a high priority and will be executed immediately.

The sets and variables used in the algorithm are defined as follows:

Let T = {threads from QSGM output}. For the simple queuing model example, T = {1, 2, 4, 8, 9, 10, 11}.
Let L_t = {the events from thread t} = {1, 2, 3, ...}. L_t is an ordered set following the event sequence of thread t in the QSGM output; for example, l1 = (b1, x1, d1, z1, f1, od1) and l2 = (b2, x2, d2, z2, f2, od2).
Let pos_t(l) = the position of event l in thread t.
Let md(l_i, l_j) = the delay (a temporal interval) from event l_i to event l_j.
Let d′ = the accumulated elapsed time.

The algorithm loops through all the threads one by one, checking the delay constraint validity for each event in a thread, starting from the first event in each thread.
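The event notice and per-thread bookkeeping defined above might be represented as follows. This is a hypothetical sketch of the data structures, not the QDES framework's actual types; the field names simply mirror the definitions in this section:

```python
from dataclasses import dataclass, field

@dataclass
class EventNotice:
    x: tuple      # execution time interval, e.g. (3, 6)
    gamma: int    # execution priority; 1 = execute immediately
    v: str        # vertex (event) to be executed
    e: int        # index of the edge being scheduled
    a: dict       # values of the edge attributes
    b: tuple      # delay constraint interval for the scheduled event

@dataclass
class Thread:
    events: list[EventNotice] = field(default_factory=list)
    # md[(i, j)] = delay bound (temporal interval) from event i to event j
    md: dict[tuple[int, int], tuple] = field(default_factory=dict)

    def pos(self, notice: EventNotice) -> int:
        """1-based position of an event in this thread, i.e. pos_t(l)."""
        return self.events.index(notice) + 1
```

A thread is then just an ordered list of notices, and `pos` recovers the position index that the delay constraint algorithm iterates over.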
The first task in the algorithm is to loop through all the events in the thread and determine the total elapsed time between each event and its scheduling event. If the total elapsed time falls within the delay constraint for event l, then the execution of event l is considered valid. If the current event l has zero execution time and zero delay time, then the delay between l and its scheduling event is equal to [0,0]. If the current event l has nonzero execution time or nonzero delay time, the algorithm proceeds to check delay constraint validity. In order to determine the total elapsed time between the current event l and its scheduling event, the algorithm loops through each event (node) between the current event l and its scheduling event in reverse (starting from the last event) to check whether there is a delay bound. If there is no delay bound for the current node, the delay time between the current node and its predecessor node is calculated: if the current node has an execution time interval of [a,b] and its predecessor node has an execution time interval of [c,d], then the delay time between the two is equal to [(a−d),(b−c)] (with the lower bound truncated at zero, since a delay cannot be negative, as in the worked examples of Section 3.3). The algorithm then proceeds to the next event. If there is a delay bound for the current node, we want to determine whether the position of the delay bound falls between the current event l and its scheduling event's position. For a delay bound that meets this criterion, the delay time between the current node and its scheduling node is retrieved from the variable md(l_i, l_j). The algorithm then proceeds to the event before that scheduling node and continues checking delay constraint validity. Because we skip the nodes between the current node and the current node's scheduling event, we do not check whether those nodes have delay bounds. The search for delay bounds starts from the node right before the scheduled event, namely the scheduled event's predecessor.
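The reverse scan just described can be sketched as follows, under the simplifying assumption that delay bounds are stored in a dictionary keyed by (scheduling position, scheduled position). The function and variable names are illustrative, not from the dissertation:

```python
def total_elapsed(times, sched_pos, cur_pos, bounds):
    """Accumulate the elapsed-time interval d' between the event at
    cur_pos and its scheduling event at sched_pos, walking backwards.
    times[p] is the execution interval of the event at position p;
    bounds[(i, j)] is a previously stored delay bound from position i
    to position j."""
    d = (0, 0)
    p = cur_pos
    while p > sched_pos:
        # a stored bound lets us jump over the nodes it already covers
        bound = next(((i, b) for (i, j), b in bounds.items() if j == p), None)
        if bound is not None and bound[0] >= sched_pos:
            i, b = bound
            d = (d[0] + b[0], d[1] + b[1])
            p = i
        else:
            # delay between node p and its predecessor, truncated at zero
            a, c = times[p], times[p - 1]
            d = (d[0] + max(0, a[0] - c[1]), d[1] + (a[1] - c[0]))
            p -= 1
    return d
```

With positions 0..2 holding Start[4,6], Enter[6,12] and Leave[8,12], this reproduces the [0,14] total from thread 1; adding a stored bound [3,6] reproduces the [3,17] total from thread 2's node 10 check.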
There may be more than one delay bound between the scheduled event and the scheduling event. On top of that, there may be intersections between the delay bounds. In the algorithm proposed here, not all delay bounds are considered, as the search for delay bounds follows the order of the event sequence in reverse, starting from the scheduled event. Thus, the delay constraint checking algorithm proposed in this dissertation does not check all delay constraints for all combinations of bounds that exist in a thread. The effect of this approach on calculating the total elapsed time is that a thread determined to be valid under this approach could in fact be invalid under an exhaustive check. This is mainly because not all combinations of delay bounds are considered when calculating the total elapsed time between the current event l and its scheduling event. However, this situation did not arise in the Simple Queuing System model, as there is no intersection between delay bounds, and all delay bounds are considered in the model when calculating the total elapsed time for an event in a thread. The second task of the delay constraint algorithm is to determine the intersection of the total elapsed time with the delay constraint (d′ ∩ b_l). If the result of the intersection is an empty set or a zero-probability time interval, the delay constraint for that particular thread is not valid, and the thread will be deleted from the QSGM output. The remainder of the algorithm deals with storing the delay bounds in the variable md(l_i, l_j). The algorithm is shown on the following page. Comments that apply to a statement in the algorithm appear in italics under the statement.
Delay Constraint Algorithm

For each t ∈ T
    (This loop assigns the position of each event in thread t to the variable pos_t(l))
    counter = 0
    For each l ∈ L_t
        counter = counter + 1
        pos_t(l) = counter
    Next l
Next t

For each t ∈ T
    Initialize M = ∅, V = ∅, and md(i, j) = 0 for all defined (i, j)'s
    For l ∈ L_t, from pos_t(l) = 1 upward
        If x_l ≠ [0,0] or b_l ≠ [0,0] then
            (If the current event l has zero execution time and zero delay time, the delay between l and its scheduling event is [0,0] as well)
            d′ = [0,0]
            For p = pos_t(l) down to pos_t(z_l) + 1, step −1
                (Loop in reverse through all the nodes between event l and the successor of its scheduling event z_l; let l′ be the event with pos_t(l′) = p)
                If a delay bound md(z_l′, l′) exists with pos_t(z_l′) ≥ pos_t(z_l) then
                    (There is a delay bound for the event at position p that lies between l and its scheduling event)
                    d′ = d′ + md(z_l′, l′)
                    Go to p = pos_t(z_l′) + 1
                Else if b_l′ = [0,0] then
                    d′ = d′ + [0,0]
                Else
                    x1 = x_l′
                    x2 = x_l″, where pos_t(l″) = p − 1
                    d′ = d′ + (x1 − x2)
                    (x1 − x2 is the delay between the events at positions p and p − 1, i.e. between l′ and its predecessor)
                End if
            Next p
            If d′ ∩ b_l = ∅ or d′ ∩ b_l = [i, i], i ∈ ℝ then
                (The intersection of the total elapsed time with the delay constraint is an empty set or a zero-probability interval)
                T = T \ {t}
                Go to next t
            End if
            If γ_l′ = 1, where pos_t(l′) = pos_t(l) + 1, then
                (The event scheduled immediately after l has a high-priority flag, meaning its execution time is the same as event l's execution time)
                md(z_l, l) = d′ ∩ b_l, md(z_l, l + 1) = d′ ∩ b_l
                (Assign new delay bounds to cover the zero-delay complication)
            Else
                md(z_l, l) = d′ ∩ b_l
            End if
        End if
    Next l
Next t

CHAPTER 4: PROBABILITY OF THREAD OCCURRENCE

4.1 Introduction

The probabilistic approach to creating a statistical analysis for a QDES model depends on the possibility of finding a way to calculate the probability of a thread occurring in the model, the probability of an event
occurring in a thread, and the probability of an event executing first whenever there is more than one event in the NOS. In this chapter, we present the method for calculating these probabilities and the associated algorithm that can be programmed and implemented in the QDES framework.

4.2 Probability of an Event Executing First

In order to estimate the probability of each individual thread, a minimal assumption is required: all event delay times are assumed to be uniformly distributed. The simulation clock is initialized to [0,0]. There are three events scheduled to execute at time [0,0]: BEGIN[0,0], ENTER[0,0] and START[0,0], in this order (refer to Figure 6). The instant event ENTER[0,0] is executed, it schedules another ENTER event to be executed [3,8] time units later. Now event START[0,0] is at the top of the future event calendar, so it is executed next. The instant event START[0,0] is executed, it schedules a LEAVE event to be executed [4,6] time units later. At this time, there are two events in the future event calendar: ENTER[3,8] and LEAVE[4,6]. From the uniform distribution assumption on all the events, the event time intervals can be sectioned into predetermined time intervals, and the probability of the event executing in each of these sections can be calculated. Without loss of generality, we will assume that the intervals are sectioned into one-unit time intervals. For example, if the time interval for event LEAVE[4,6] is sectioned into one-unit intervals, then LEAVE[4,6] has a 0.5 probability of executing in either [4,5] or [5,6]. On the other hand, ENTER[3,8] has a 0.2 probability of executing in each of [3,4], [4,5], [5,6], [6,7] and [7,8] (as shown in Figure 7).

            [3,4]  [4,5]  [5,6]  [6,7]  [7,8]
LEAVE[4,6]         0.5    0.5
ENTER[3,8]  0.2    0.2    0.2    0.2    0.2

Figure 7.
Probability distribution for events LEAVE[4,6] and ENTER[3,8]

The exact timing of these two events is uncertain, and because their execution times overlap, the order of event execution is also uncertain. The QDES algorithm will therefore spawn two threads: one assumes that event ENTER[3,8] will be executed first, and the second assumes that event LEAVE[4,6] will be executed first. If event ENTER[3,8] executes first, then it must execute in [3,6], so that the next imminent event LEAVE can execute next in [4,6]. On the other hand, if event LEAVE[4,6] executes first in [4,6], then event ENTER will execute next in [4,8]. Notice how the execution time interval for the next event changes as a result of the uncertain event execution times and the uncertain order of event execution. With the uniform probability information on the two events, the probability of each event executing first can be calculated, as shown below. Note that the probability of an event executing in an interval [i,j] is denoted P(event∈[i,j]).

Question: What is the probability of event LEAVE executing first?

P(LEAVE executes first in [4,5]) = P(LEAVE∈[4,5]) × (½·P(ENTER∈[4,5]) + P(ENTER∈[5,8])) = 0.5 × (0.5×0.2 + 0.6) = 0.35
P(LEAVE executes first in [5,6]) = P(LEAVE∈[5,6]) × (½·P(ENTER∈[5,6]) + P(ENTER∈[6,8])) = 0.5 × (0.5×0.2 + 0.4) = 0.25

Question: What is the probability of event ENTER executing first?
P(ENTER executes first in [3,4]) = P(ENTER∈[3,4]) = 0.2
P(ENTER executes first in [4,5]) = P(ENTER∈[4,5]) × (½·P(LEAVE∈[4,5]) + P(LEAVE∈[5,6])) = 0.2 × (0.5×0.5 + 0.5) = 0.15
P(ENTER executes first in [5,6]) = P(ENTER∈[5,6]) × (½·P(LEAVE∈[5,6])) = 0.2 × (0.5×0.5) = 0.05

As a result of the above calculations,

P(LEAVE executes first in [4,6]) = P(LEAVE executes first in [4,5]) + P(LEAVE executes first in [5,6]) = 0.35 + 0.25 = 0.6
P(ENTER executes first in [3,6]) = P(ENTER executes first in [3,4]) + P(ENTER executes first in [4,5]) + P(ENTER executes first in [5,6]) = 0.2 + 0.15 + 0.05 = 0.4

Figure 8 shows the results so far.

P{LEAVE[4,6] executes first in interval (i,j)}
(i,j)   (4,5)  (5,6)  Total
        0.35   0.25   0.6

P{ENTER[3,6] executes first in interval (i,j)}
(i,j)   (3,4)  (4,5)  (5,6)  Total
        0.2    0.15   0.05   0.4

Figure 8. The probabilities of events LEAVE[4,6] and ENTER[3,8] executing first, respectively

4.3 Conditional Probability

If we assume that LEAVE[4,6] is executed first, the probability of event ENTER executing next in [4,5], [5,6], [6,7] and [7,8] can be calculated. Based on when LEAVE[4,6] is executed first (here there are two conditions, namely (1) LEAVE is executed first in [4,5], and (2) LEAVE is executed first in [5,6]), the probabilities of event ENTER executing next are calculated as shown below.

P(ENTER in [4,8] | LEAVE 1st in [4,5]) = P(ENTER 2nd in [4,5] | LEAVE 1st in [4,5]) + P(ENTER 2nd in [5,6] | LEAVE 1st in [4,5]) + P(ENTER 2nd in [6,7] | LEAVE 1st in [4,5]) + P(ENTER 2nd in [7,8] | LEAVE 1st in [4,5])

P(ENTER in [5,8] | LEAVE 1st in [5,6]) = P(ENTER 2nd in [5,6] | LEAVE 1st in [5,6]) + P(ENTER 2nd in [6,7] | LEAVE 1st in [5,6]) + P(ENTER 2nd in [7,8] | LEAVE 1st in [5,6])

If LEAVE executes in [4,5], then ENTER can execute next in [4,8]. Without considering the fact that LEAVE was executed in [4,5], we can redistribute the probability of ENTER executing in [4,8] by looking at the original probability distribution of ENTER[3,8].
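The executing-first probabilities of Figure 8 can be reproduced programmatically. This is a minimal sketch, assuming each event's discretized distribution is a map from the left endpoint of each unit interval to its probability; `prob_executes_first` is an illustrative name, and ties within the same unit interval are split evenly, matching the ½ factor in the calculations above:

```python
from fractions import Fraction as F

def prob_executes_first(a, b):
    """P(event A executes before event B), for independent firing times
    discretized over unit intervals. a and b map the left endpoint of a
    unit interval to its probability; ties in the same interval split 1/2."""
    p = F(0)
    for i, pa in a.items():
        same = b.get(i, F(0))                            # B in the same interval
        later = sum((pb for j, pb in b.items() if j > i), F(0))
        p += pa * (same / 2 + later)
    return p

leave = {4: F(1, 2), 5: F(1, 2)}            # LEAVE[4,6]
enter = {t: F(1, 5) for t in range(3, 8)}   # ENTER[3,8]
```

With these inputs the function returns 0.6 for LEAVE first and 0.4 for ENTER first, and the two results sum to one since exactly one event executes first.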
Since it is equally likely for ENTER[3,8] to execute in any one-unit interval within [3,8] with probability 0.2, ENTER will still be equally likely to execute anywhere within [4,8], now with probability 0.25. The probability distribution for ENTER[4,8] is shown in the first row of Figure 9. The same reasoning applies to the case where LEAVE executes in [5,6]; the corresponding distribution for ENTER[5,8] is shown in the second row of Figure 9.

                             [4,5]  [5,6]     [6,7]     [7,8]
If LEAVE executes in [4,5]   0.25   0.25      0.25      0.25
If LEAVE executes in [5,6]          0.333333  0.333333  0.333333

Figure 9. Redistribution of the original probability of occurrence for ENTER[4,8]

If LEAVE[4,6] executes in [4,5], then the weight for ENTER to execute second in [4,5] is 0.125 (half of 0.25): since ENTER has a 0.25 probability of being in [4,5], and ties within the same interval are split evenly, the weight is 0.5 × 0.25 = 0.125. Thus,

P{ENTER executes second in [4,5] | LEAVE executed in [4,5]} = 0.125/(0.125 + 0.25×3) ≅ 0.142857

For ease of calculation, and to avoid the hassle of redistributing the original probability of occurrence for ENTER[4,8], the following two calculations are equivalent to the one described above. The simplified version will be used in developing the probability algorithm in Section 4.8.3, Execution of the Probability Algorithm.

P(ENTER executes in [4,5] | LEAVE executed first in [4,5]) = (½×0.2)/(½×0.2 + 3×0.2) ≅ 0.14286
P(ENTER executes in [4,5] | LEAVE executed first in [4,5]) = ½/(½ + 3) ≅ 0.14286

The rest of the conditional probabilities follow similar calculations, except that we have to factor in the probability of ENTER not executing in the preceding time intervals.
P(ENTER executes in [5,6] | LEAVE executed first in [4,5])
= P(ENTER did not execute in [4,5] out of [4,8]) × P(ENTER in [5,6] out of [5,8])
= (1 − 0.14286) × 0.2/(3×0.2) ≅ 0.28571

P(ENTER executes in [6,7] | LEAVE executed first in [4,5])
= P(ENTER did not execute in [4,5] out of [4,8] or in [5,6] out of [5,8]) × P(ENTER in [6,7] out of [6,8])
= (1 − 0.14286) × (1 − 0.2/(3×0.2)) × 0.2/(2×0.2) ≅ 0.28571

P(ENTER executes in [7,8] | LEAVE executed first in [4,5])
= P(ENTER did not execute in [4,5] out of [4,8], in [5,6] out of [5,8], or in [6,7] out of [6,8]) × P(ENTER in [7,8] out of [7,8])
= (1 − 0.14286) × (1 − 0.2/(3×0.2)) × (1 − 0.2/(2×0.2)) × 1 ≅ 0.28571

P(ENTER executes in [5,6] | LEAVE executed first in [5,6])
= P(ENTER in [5,6] out of [5,8])
= (½×0.2)/(½×0.2 + 2×0.2) = 0.2

P(ENTER executes in [6,7] | LEAVE executed first in [5,6])
= P(ENTER did not execute in [5,6] out of [5,8]) × P(ENTER in [6,7] out of [6,8])
= (1 − 0.2) × 0.2/(2×0.2) = 0.4

P(ENTER executes in [7,8] | LEAVE executed first in [5,6])
= P(ENTER did not execute in [5,6] out of [5,8] or in [6,7] out of [6,8]) × P(ENTER in [7,8] out of [7,8])
= (1 − 0.2) × (1 − 0.2/(2×0.2)) × 1 = 0.4

The result so far is shown in Figure 10.

P{ENTER[4,8] executes second in interval (i,j) | LEAVE[4,6] executed first in (p,q)}
(p,q)   (4,5)    (5,6)    (6,7)    (7,8)    TOTAL
(4,5)   0.14286  0.28571  0.28571  0.28571  1
(5,6)   0        0.2      0.4      0.4      1

Figure 10. Probability of occurrence for ENTER[4,8] given that LEAVE[4,6] was executed first

With all the information we have so far, we can now calculate the probability of ENTER executing next in [4,5], [5,6], [6,7] and [7,8]. The probability distribution of ENTER[4,8] is given in Figure 11.
P(ENTER executes in [4,5])
= P(ENTER in [4,5] | LEAVE 1st in [4,5]) × P(LEAVE 1st in [4,5] | LEAVE 1st)
= 0.142857 × 0.35/(0.35 + 0.25) ≅ 0.08333

P(ENTER executes in [5,6])
= {P(ENTER in [5,6] | LEAVE 1st in [4,5]) × P(LEAVE 1st in [4,5] | LEAVE 1st)} + {P(ENTER in [5,6] | LEAVE 1st in [5,6]) × P(LEAVE 1st in [5,6] | LEAVE 1st)}
= 0.28571 × 0.35/(0.35 + 0.25) + 0.2 × 0.25/(0.35 + 0.25) ≅ 0.25

P(ENTER executes in [6,7])
= {P(ENTER in [6,7] | LEAVE 1st in [4,5]) × P(LEAVE 1st in [4,5] | LEAVE 1st)} + {P(ENTER in [6,7] | LEAVE 1st in [5,6]) × P(LEAVE 1st in [5,6] | LEAVE 1st)}
= 0.28571 × 0.35/(0.35 + 0.25) + 0.4 × 0.25/(0.35 + 0.25) ≅ 0.3333

P(ENTER executes in [7,8])
= {P(ENTER in [7,8] | LEAVE 1st in [4,5]) × P(LEAVE 1st in [4,5] | LEAVE 1st)} + {P(ENTER in [7,8] | LEAVE 1st in [5,6]) × P(LEAVE 1st in [5,6] | LEAVE 1st)}
= 0.285714 × 0.35/(0.35 + 0.25) + 0.4 × 0.25/(0.35 + 0.25) ≅ 0.3333

P{ENTER[4,8] executes in interval (i,j)}
(i,j)   (4,5)     (5,6)  (6,7)     (7,8)     TOTAL
        0.083333  0.25   0.333333  0.333333  1

Figure 11. Probability of occurrence for ENTER[4,8]

4.4 Probability Distribution for a New Event

4.4.1 Generating a New Probability Distribution

Let us assume that the simulation proceeds with the current simulation clock at [4,8]; ENTER[4,8] has just executed. As soon as ENTER[4,8] has executed, it schedules a new event START to happen immediately, since the server is available and there is no other customer waiting for the server. START[4,8] can execute immediately in [4,8] with the same probability distribution as ENTER[4,8] (refer to node 24 in Figure 6). A new ENTER event is also scheduled by ENTER[4,8] to happen [3,8] time units later, which leads to ENTER[7,16]. START[4,8] schedules a LEAVE event to execute [4,6] time units after time interval [4,8], which then leads to LEAVE[8,14].
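The procedure this section uses to obtain the distributions of the newly scheduled events (add the scheduling event's discretized distribution to the uniform delay distribution, then split each resulting two-unit interval evenly back into unit intervals) can be sketched as follows. The function names are illustrative; distributions are represented as maps from the left endpoint of each unit interval to its probability, and exact fractions are used so that the results match Table 3 and Figure 13:

```python
from fractions import Fraction as F

def convolve(dist_a, dist_b):
    """Add two discretized distributions. Each maps the left endpoint of
    a one-unit interval [t, t+1] to its probability; the raw sum then
    lives on two-unit intervals [s, s+2]."""
    out = {}
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] = out.get(a + b, F(0)) + pa * pb
    return out

def reduce_to_unit(dist2):
    """Split each two-unit interval [s, s+2] evenly into [s, s+1] and [s+1, s+2]."""
    out = {}
    for s, p in dist2.items():
        out[s] = out.get(s, F(0)) + p / 2
        out[s + 1] = out.get(s + 1, F(0)) + p / 2
    return out

# ENTER[4,8] distribution from Figure 11; START[4,8] has the same one.
enter_4_8 = {4: F(1, 12), 5: F(1, 4), 6: F(1, 3), 7: F(1, 3)}
uniform_3_8 = {t: F(1, 5) for t in range(3, 8)}
uniform_4_6 = {4: F(1, 2), 5: F(1, 2)}

enter_7_16 = reduce_to_unit(convolve(enter_4_8, uniform_3_8))   # Table 3
leave_8_14 = reduce_to_unit(convolve(enter_4_8, uniform_4_6))   # Figure 13
```

For example, `enter_7_16[11]` is exactly 1/5, the 0.2 entry for [11,12] in Table 3, and `leave_8_14[11]` is 5/16, the 0.3125 entry for [11,12] in Figure 13.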
Events ENTER[7,16] (refer to node 25 in Figure 12) and LEAVE[8,14] (refer to node 31 in Figure 12) are currently on the future event calendar. Because their execution times overlap, they create a nondeterministically ordered set (NOS).

Figure 12. Events ENTER[7,14] and LEAVE[8,14]

ENTER[7,16] is scheduled to occur [3,8] time units after [4,8], which leads to time interval [7,16]. The delay of [3,8] is assumed to follow a uniform distribution, but the time interval [4,8] from ENTER[4,8] is not uniform. The probability distribution for ENTER[7,16] can be obtained by adding the distribution of ENTER[4,8] to the distribution of Uniform[3,8].

ENTER[4,8]            UNIFORM[3,8]
[4,5]  0.083333333    [3,4]  0.2
[5,6]  0.25           [4,5]  0.2
[6,7]  0.333333333    [5,6]  0.2
[7,8]  0.333333333    [6,7]  0.2
                      [7,8]  0.2

Table 1. Probability distributions for events ENTER[4,8] and Uniform[3,8]

Table 1 shows the probability distribution for ENTER[4,8] and the uniform distribution [3,8]. Adding these two distributions is equivalent to drawing one sample from each distribution and adding the sampled values together to get a final value, x; the addition of the ENTER[4,8] and Uniform[3,8] probability distributions creates the probability distribution for this final value, x. To get the probability distribution of ENTER[4,8]+Uniform[3,8], we simply multiply each entry of the ENTER[4,8] column (refer to Table 1) with each entry of the Uniform column. For example, P(E[4,5] + U[3,4]) = P(ENTER[7,9]) = 0.083333333 × 0.2 ≅ 0.016666. Table 2 shows the resulting probability distribution. The E[4,5]+U column holds the probability values for ENTER[7,9], [8,10], [9,11], [10,12] and [11,13], which result from adding ENTER[4,5] with the Uniform probabilities of [3,4], [4,5], [5,6], [6,7] and [7,8], respectively. In order to get the probability of ENTER executing in a certain interval, the sum of the probabilities attributed to that interval is computed.
For example,

P(ENTER[8,10]) = P(E[4,5]+U[4,5] = [8,10]) + P(E[5,6]+U[3,4] = [8,10]) ≅ 0.016666667 + 0.05 ≅ 0.066666667

E[4,5]+U  P{E[4,5]+U}  E[5,6]+U  P{E[5,6]+U}  E[6,7]+U  P{E[6,7]+U}  E[7,8]+U  P{E[7,8]+U}
[7,9]     0.016666667  [8,10]    0.05         [9,11]    0.066666667  [10,12]   0.066666667
[8,10]    0.016666667  [9,11]    0.05         [10,12]   0.066666667  [11,13]   0.066666667
[9,11]    0.016666667  [10,12]   0.05         [11,13]   0.066666667  [12,14]   0.066666667
[10,12]   0.016666667  [11,13]   0.05         [12,14]   0.066666667  [13,15]   0.066666667
[11,13]   0.016666667  [12,14]   0.05         [13,15]   0.066666667  [14,16]   0.066666667

Table 2. Probability distribution of ENTER[7,16]

Each resulting interval is of length 2; in the E[4,5]+U column these are [7,9], [8,10], [9,11], [10,12] and [11,13], each with probability 0.016666. The researcher chooses to keep a common interval length by reducing the interval length to one time unit: if the interval length were left as it is, it would keep increasing every time probability distributions are added as the simulation continues. The reduction assumes that each of the resulting intervals [7,9], [8,10], [9,11], [10,12] and [11,13] follows a uniform distribution. If the probability of ENTER[8,10], which equals 0.066666, is evenly distributed throughout [8,10], then ENTER could execute in [8,9] or [9,10], each with probability 0.066666667/2 ≅ 0.033333. Applying the same assumption to all other intervals, including ENTER[7,9] (which has probability 0.016666), the probability of ENTER[8,9] is found by adding half of the probability from [8,10] and half of the probability from [7,9]:

P(ENTER[8,9]) ≅ 0.066666667/2 + 0.016666667/2 ≅ 0.033333 + 0.008333 ≅ 0.041667

Table 3 shows the probability distribution of ENTER[7,16] after the reduction of the interval length to one time unit.
ENTER[7,16]  Probability
[7,8]        0.008333333
[8,9]        0.041666667
[9,10]       0.1
[10,11]      0.166666667
[11,12]      0.2
[12,13]      0.191666667
[13,14]      0.158333333
[14,15]      0.1
[15,16]      0.033333333

Table 3. Probability distribution for ENTER[7,16] with interval width of 1 time unit

The probability distribution of LEAVE[8,14] can be obtained by multiplying the START[4,8] column with the Uniform distribution of [4,6] (i.e. the delay distribution). Figure 13 shows the multiplication of these two columns, the addition of the probabilities with the same time interval, and finally the probability distribution of LEAVE[8,14] after the reduction of the time interval to length one.

START[4,8]          Uniform[4,6]
[4,5]  0.083333     [4,5]  0.5
[5,6]  0.25         [5,6]  0.5
[6,7]  0.333333
[7,8]  0.333333

S[4,5]+U  P{S[4,5]+U}  S[5,6]+U  P{S[5,6]+U}  S[6,7]+U  P{S[6,7]+U}  S[7,8]+U  P{S[7,8]+U}
[8,10]    0.041666     [9,11]    0.125        [10,12]   0.166666     [11,13]   0.166666
[9,11]    0.041666     [10,12]   0.125        [11,13]   0.166666     [12,14]   0.166666

Interval  Probability      Interval  Probability
[8,10]    0.041666         [8,9]     0.02083
[9,11]    0.166666         [9,10]    0.10416
[10,12]   0.291666         [10,11]   0.22916
[11,13]   0.333333         [11,12]   0.3125
[12,14]   0.166666         [12,13]   0.25
SUM       1                [13,14]   0.08333
                           SUM       1

Figure 13. Probability distribution of LEAVE[8,14]

4.4.2 Generating Conditions for the Conditional Probability

The probability distributions generated in Section 4.4.1 carry important information about the probability of occurrence for the newly scheduled events ENTER[7,16] and LEAVE[8,14], but they assume that ENTER[7,16] and LEAVE[8,14] are independent of all previous events. This assumption is not valid when examining the probability of occurrence for sequential events. ENTER[7,16] and LEAVE[8,14] executing first are mutually exclusive (i.e. disjoint) events, which captures the intuition of incompatible outcomes: incompatible events cannot happen at the same time. This is not the same as independence. If A and B are disjoint and we know that A occurred, then we do know a lot about B: namely, that B cannot occur. Thus there is an interaction between events A and B. Knowing whether A occurred influences the
Namely we know that B cannot occur. Thus there is an interaction between events A and B. Knowing whether A occurred influences the 56 probability of B, which is not possible under independence. Independence captures the intuition of noninteraction, and lack of information. In modeling it is often assumed rather than verified. The probability of ENTER[7,16] executing in a specific time interval (i.e. [7,8],…, [15,16]) depends on the condition of two things: 1. The time interval for which ENTER[7,16]’s scheduling event is executed. If given that its scheduling event ENTER[4,8] is executed in [4,5], then ENTER[7,16] can execute in [7,9], [8,10], [9,11], [10,12] and [11,13] with the probability of 0.2, respectively. This distribution with overlapping time interval represents the conditional probability distribution of ENTER[7,11], given the time interval of ENTER[4,5]. Note that it has the same shape as the delay distribution of [3,8]. Recall that the delay distribution of [3,8] has a uniform probability distribution of 0.2 in each of its oneunit time interval. We know from 4.4.1 that P(ENTER[7,16] is in [7,9]) = P(E[4,5] + U[3,4]) = P(ENTER[7,9]) ≅ 0.083333333* 0.2 = 0.016666… And, P(ENTER[4,8] is in [4,5]) ≅ 0.083333333 Since P(ENTER[7,16] is in [7,9]) = P([ENTER[7,16] is in [7,9]/ENTER[4,8] is in [4,5])* P(ENTER[4,8] is in [4,5]) Thus, P([ENTER[7,16] is in [7,9]/ENTER[4,8] is in [4,5]) = P(ENTER[7,16] is in [7,9])/ P(ENTER[4,8] is in [4,5]) ≅ (0.083333333* 0.2)/ 0.083333333 57 = 0.2 If we assume that the probability of all the twounit time intervals are evenly distributed within their time intervals, using the same interval probability reduction method that was discussed in 4.1.1, the conditional probability distribution of ENTER[7,16] executing in [7,8], [8,9], [9,10], [10,11], [11,12] and [12,13] given that its scheduling event ENTER[4,8] was executed in [4,5] is equal to 0.1, 0.2, 0.2, 0.2, 0.2 and 0.1 respectively. 2. 
Whether ENTER[7,16] intersects with its immediate preceding event. The conditional probability distribution obtained in part 1 only takes into account the effect of ENTER[4,8]'s execution time on the probability of occurrence for ENTER[7,16]. If the immediate preceding event for ENTER[7,16] were ENTER[4,8], there would not be any intersection of time intervals when calculating the conditional probability. Clearly, the conditional probability for ENTER[7,16] will be smaller in the time intervals that intersect with the immediate preceding event; this is likely to be the case when ENTER[7,16] is one of the events in the NOS set. Even though the immediate preceding event for ENTER[7,16] is not its scheduling event ENTER[4,8], START[4,8] has exactly the same probability distribution as ENTER[4,8]. This means that there will not be any intersection of time intervals when calculating the conditional probability for ENTER[7,16]. Thus, the conditional probability for ENTER[7,16] calculated in part 1 is not affected and can be used for further analysis. Figure 14 shows the conditional probability for ENTER[7,16].

P{ENTER[7,16] executes in interval (i,j) | START[4,8] executed in (p,q)}
(p,q)   (7,8)  (8,9)  (9,10)  (10,11)  (11,12)  (12,13)  (13,14)  (14,15)  (15,16)
[4,5]   0.1    0.2    0.2     0.2      0.2      0.1
[5,6]          0.1    0.2     0.2      0.2      0.2      0.1
[6,7]                 0.1     0.2      0.2      0.2      0.2      0.1
[7,8]                         0.1      0.2      0.2      0.2      0.2      0.1

Figure 14. Conditional probability for ENTER[7,16]

If we were just interested in finding the probability of a thread occurring from the QSGM output, it would be sufficient to use the conditional probability for ENTER[7,16] in Figure 14. However, it is not sufficient if we want to determine output statistics for the QSGM output, because the conditional probability for ENTER[7,16] in Figure 14 only carries information back to its scheduling event. Important information about any preceding events before ENTER[7,16]'s scheduling event is not included.
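The first row of Figure 14 can be reproduced directly: conditioning on the scheduling event firing in a single unit interval reduces the new event's distribution to a shifted copy of the delay distribution, split back into unit intervals. A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction as F

# Conditional distribution of ENTER[7,16] given that its scheduling event
# ENTER[4,8] fired in the unit interval [4,5]: shift the Uniform[3,8] delay
# by 4 and split each resulting two-unit interval evenly into unit halves.
cond = {}
for d in range(3, 8):                 # delay unit intervals [3,4]..[7,8], prob 1/5 each
    for left in (4 + d, 4 + d + 1):   # [4+d, 4+d+2] splits into two unit intervals
        cond[left] = cond.get(left, F(0)) + F(1, 5) / 2
# cond matches the first row of Figure 14: 0.1, 0.2, 0.2, 0.2, 0.2, 0.1
```

Each later row of Figure 14 is the same vector shifted by one unit interval, which is why every row keeps the shape of the delay distribution.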
If we were to stop the simulation at this point, we could calculate the probability of the thread up to ENTER[7,16], but we could not accurately calculate statistics such as the server utilization. From Figure 14, we know that from [4,5] to [7,8] the server utilization is 1, as the server is busy during that time, and the conditional probability of this event is 0.1. But we could not determine the server utilization for the events before that, even though we know that the server is busy from [0,0] (at the start of event START[0,0]) to [4,6] (at the start of event LEAVE[4,6]). An average value for the server utilization could be estimated, but as the simulation continues, the estimated average value will not be accurate. Another way of getting the information needed to calculate statistics is to generate all possible conditions for the conditional probability. Any event that is to be executed after the first nonzero execution time event will carry information about the timing of all preceding events. Each combination of the timings of these preceding events becomes a condition for the conditional probability. The conditions generated for the conditional probability of ENTER[7,16] are shown in the left column of Table 4. The number 5 in condition "5,4" represents the time interval [5,6] of the immediate preceding event, START[4,8] (equivalently ENTER[4,8], since both have exactly the same distribution: if START is executed in a given interval, ENTER is also executed in that interval). The number 4 in condition "5,4" represents the time interval [4,5] of the second-to-last event, LEAVE[4,6]. The same conditions are generated for LEAVE[8,14]. The conditional probability for LEAVE[8,14] is shown in Table 5.
Conditional Probability for ENTER[7,16]
Condition  [7,8]  [8,9]  [9,10]  [10,11]  [11,12]  [12,13]  [13,14]  [14,15]  [15,16]
4,4        0.1    0.2    0.2     0.2      0.2      0.1
5,4               0.1    0.2     0.2      0.2      0.2      0.1
6,4                      0.1     0.2      0.2      0.2      0.2      0.1
7,4                              0.1      0.2      0.2      0.2      0.2      0.1
5,5               0.1    0.2     0.2      0.2      0.2      0.1
6,5                      0.1     0.2      0.2      0.2      0.2      0.1
7,5                              0.1      0.2      0.2      0.2      0.2      0.1

Table 4. Conditional probability for ENTER[7,16] with all possible combinations of conditions

Conditional Probability for LEAVE[8,14]
Condition  [8,9]  [9,10]  [10,11]  [11,12]  [12,13]  [13,14]
4,4        0.25   0.5     0.25
5,4               0.25    0.5      0.25
6,4                       0.25     0.5      0.25
7,4                                0.25     0.5      0.25
5,5               0.25    0.5      0.25
6,5                       0.25     0.5      0.25
7,5                                0.25     0.5      0.25

Table 5. Conditional probability for LEAVE[8,14] with all possible combinations of conditions

As can be seen from Table 4 and Table 5, the conditional probabilities for ENTER[7,16] and LEAVE[8,14] are not affected by what happens before their scheduling event, ENTER[4,8]. The delay distribution and the scheduling event's distribution determine the timing of the newly scheduled event and its probability distribution. The probability of occurrence obtained for the new event will be smaller in the time intervals where it intersects with its immediate preceding event. If the immediate preceding event for the new event happens to be its scheduling event, there will not be any intersection of time intervals; in this case, the probability of occurrence obtained from the delay distribution and the scheduling event's distribution remains the same. The probability of each condition occurring can be obtained by normalizing the conditional probability in Figure 10, as shown in Table 6. The probability of each condition occurring is needed when calculating the probability of a NOS event executing first.

Condition  Probability
4,4        0.083333
5,4        0.166667
6,4        0.166667
7,4        0.166667
5,5        0.083333
6,5        0.166667
7,5        0.166667

Table 6.
Probability of the conditions that are generated

Basically, the method for calculating the probability of each NOS event executing first is the same. Instead of using the probability distribution computed from the uniform delay distribution and the scheduling event's probability distribution, for each condition that is generated, the conditional probability of each NOS event occurring in a given time interval is used to calculate the probability of each event executing first. The algorithm that determines the probability of an event executing first in a given time interval is therefore constructed so that it can be repeated for each condition that is generated and for each defined time interval.

The probabilities of ENTER[7,14] executing first and LEAVE[8,14] executing first are computed and given in Table 7 and Table 8, respectively. ENTER[7,14] has a probability of 0.00833 of executing first in the time interval [7,8], 0.03958 in [8,9], 0.08542 in [9,10], and so on. These probabilities are obtained by summing, over all conditions, the product of the condition probability in Table 6 with the conditional probability of ENTER[7,14] executing first in each time interval. Combining the probabilities from all these time intervals, ENTER[7,14] has a probability of 0.4 of executing first in [7,14]. On the other hand, LEAVE[8,14] has a probability of 0.6 of executing first in [8,14]. Notice that the probabilities for ENTER[7,14] and LEAVE[8,14] sum to one, as they are mutually exclusive events.

ENTER[7,14] (Node 25)
Condition        7        8        9        10       11       12       13
4,4              0.1      0.175    0.1      0.025
5,4                       0.1      0.175    0.1      0.025
6,4                                0.1      0.175    0.1      0.025
7,4                                         0.1      0.175    0.1      0.025
5,5                       0.1      0.175    0.1      0.025
6,5                                0.1      0.175    0.1      0.025
7,5                                         0.1      0.175    0.1      0.025
P(Execute 1st)   0.00833  0.03958  0.08542  0.11875  0.09792  0.04167  0.00833   0.4
Normalized p     0.02083  0.09896  0.21354  0.29688  0.24479  0.10417  0.02083   1
Table 7.
Probability of ENTER[7,14] executing first

LEAVE[8,14] (Node 31)
Condition        8         9        10       11       12       13
4,4              0.2       0.3      0.1
5,4                        0.2      0.3      0.1
6,4                                 0.2      0.3      0.1
7,4                                          0.2      0.3      0.1
5,5                        0.2      0.3      0.1
6,5                                 0.2      0.3      0.1
7,5                                          0.2      0.3      0.1
P(Execute 1st)   0.016667  0.075    0.15     0.19167  0.13333  0.03333   0.6
Normalized p     0.027778  0.125    0.25     0.31944  0.22222  0.05556   1
Table 8. Probability of LEAVE[8,14] executing first

4.5 Probability Distribution for the Timing of an Event

In the process of calculating the probability of event occurrence, when there is more than one event in the NOS set, the probability distribution for the timing of an event can be obtained by normalizing the probability of that NOS event executing first. When there is only one event in the NOS set, the probability distribution for the timing of the event is already in normalized form. The "Normalized p" rows in Table 7 and Table 8 are the probability distributions for the timing of ENTER[7,14] and LEAVE[8,14], respectively, obtained by normalizing the probabilities in the "P(Execute 1st)" row.

4.6 Probability of Thread Occurrence

When attempting to determine a sample space (the complete set of possible outcomes from an experiment), it is often helpful to draw a diagram that illustrates the paths to all the possible outcomes. One such diagram is a tree diagram. In addition to determining the number of outcomes in a sample space, the tree diagram can be used to determine the probability of individual outcomes within the sample space. The probability of any outcome in the sample space is the product of all the probabilities along the path that represents that outcome on the tree diagram. By following the branches of the tree diagram, we can find all the possible outcomes. The QSGM output (refer to Figure 6) is a tree diagram, since it illustrates all the possible outcomes of the model.
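Looking back at Tables 6 and 7, the "P(Execute 1st)" and "Normalized p" rows can be reproduced with a short sketch. This is illustrative Python, not the dissertation's code: the condition probabilities are those of Table 6 (written as exact fractions 1/12 and 1/6), and each condition's executing-first row [0.1, 0.175, 0.1, 0.025] is assumed to start at the interval whose lesser endpoint is the condition's first element plus 3.

```python
# Condition probabilities from Table 6.
cond_prob = {(4, 4): 1/12, (5, 4): 1/6, (6, 4): 1/6, (7, 4): 1/6,
             (5, 5): 1/12, (6, 5): 1/6, (7, 5): 1/6}
exec_first = [0.1, 0.175, 0.1, 0.025]   # one condition's row of Table 7

# Weight each condition's row by its probability and sum per interval.
p_first = {}
for (start, _), p in cond_prob.items():
    for offset, q in enumerate(exec_first):
        interval = start + 3 + offset   # row assumed to begin at [start+3, start+4]
        p_first[interval] = p_first.get(interval, 0.0) + p * q

total = sum(p_first.values())            # probability ENTER[7,14] executes first
normalized = {i: v / total for i, v in p_first.items()}
print(round(total, 4))       # 0.4
print(round(p_first[7], 5))  # 0.00833
```

Summing the weighted rows per interval gives 0.00833 for [7,8] and 0.4 overall, matching Table 7; dividing by the total yields the "Normalized p" row described in Section 4.5.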
Each branch of the tree carries its calculated probability of occurrence and the probability distribution of the timing of the event. If there is only one branch at any level, the probability of occurrence for that branch is equal to 1. The probabilities of all the branches at any level must sum to 1. Figure 15 shows the probability tree diagram for the QSGM output. The bolded number 0.5969 in Figure 15 on the arc emanating from node 6 to node 7 is the probability of occurrence for ENTER[6,12]. The bolded number 0.4031 on the arc emanating from node 6 to node 13 is the probability of occurrence for LEAVE[8,12].

The thread probability for the Simple Queuing Model can be calculated by multiplying the probabilities of occurrence for the events along the path that represents the outcome in the tree diagram. For example, thread 1 has a probability of 0.2336 (correct to 4 decimal places), which is the product of 0.4 (taken from node 4's probability of occurrence), 0.5969 (taken from node 7's probability of occurrence) and 0.9786 (taken from node 8's probability of occurrence). The thread probabilities for all the possible outcomes of the Simple Queuing Model are shown in Table 9.

Thread   Thread Probability
1        0.2336
2        0.0051
4        0.1613
8        0.2400
11       0.3600
Table 9. Thread Probability

Figure 15. Probability tree diagram (branch-probability tree of the Simple Queuing Model threads, nodes 1 through 31; diagram not reproduced)

4.7 Comparison of the Manually Computed Probability with Simulated Probability Using Excel

In order to compare the thread probabilities calculated using the method discussed in the previous sections with simulated values, a simulation model of the Simple Queuing System was created using Microsoft Excel.
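Before examining the spreadsheet comparison, the path-product rule just described can be sketched directly. This is an illustration (not the dissertation's code), using the branch probabilities reported for thread 1.

```python
import math

# The probability of a thread is the product of the branch
# probabilities of occurrence along its path in the tree diagram.
def path_probability(branch_probs):
    return math.prod(branch_probs)

# Thread 1 of the Simple Queuing Model: branches at nodes 4, 7 and 8.
thread_1 = path_probability([0.4, 0.5969, 0.9786])
print(thread_1)
```

With these rounded branch values the product is about 0.23365; the unrounded values give 0.23363689, the thread-1 probability reported in Table 10.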
The timing of all the scheduled events is randomly generated (see Figure 16). The event name that corresponds to each of these timings is shown in Figure 17. All the iterations are sorted according to the order of their event sequences and are then assigned the appropriate thread number. Figure 16 shows a table from the Excel worksheet that contains the 10,000 generated event timings for the Simple Queuing System. Each row contains all the event timings for one iteration (column B), sorted by thread number (column A). The numbers in columns C to J of Figure 16 are the generated timings for the first event, second event, and so on. The matching event name for each of these event timings from each iteration is shown in columns C to J of Figure 17.

The simulated thread probabilities based on the 10,000 output data points are computed and tabulated in Table 10. The calculated probability for thread 1 is 0.2336 (correct to four decimal places), compared with a simulated probability for thread 1 of 0.2418 (correct to four decimal places). The difference between the two probabilities is 0.0081, a percentage difference of less than 3.4%.

Figure 16. The generated timing of events for the Simple Queuing System
Figure 17. The generated events for the Simple Queuing System

Thread   Calculated Thread Probability   Simulated Thread Probability   Difference
1        0.23363689                      0.24176667                     0.00812978
2        0.00511311                      0.00418333                     0.00092978
4        0.16125000                      0.15380000                     0.00745000
8        0.24000000                      0.23865000                     0.00135000
11       0.36000000                      0.36160000                     0.00160000
Table 10. Comparison of the calculated thread probability with the simulated thread probability

The Excel simulation model for the Simple Queuing System contains all the information required to plot the probability distribution for the timing of an event execution. The simulated and calculated probabilities for the execution of LEAVE[4,6] (Node 22) in the intervals [4,5] and [5,6] are shown in Table 11.
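The "Difference" column of Table 10 above is a direct subtraction; a minimal sketch (illustrative Python, with the values copied from Table 10):

```python
# Calculated vs. simulated thread probabilities from Table 10.
calculated = {1: 0.23363689, 2: 0.00511311, 4: 0.16125, 8: 0.24, 11: 0.36}
simulated = {1: 0.24176667, 2: 0.00418333, 4: 0.1538, 8: 0.23865, 11: 0.3616}

for thread in calculated:
    diff = abs(calculated[thread] - simulated[thread])
    pct = 100 * diff / simulated[thread]
    print(f"thread {thread:>2}: difference = {diff:.8f} ({pct:.2f}%)")
```

For thread 1 this gives 0.00812978, about 3.4% of the simulated estimate; the rarest thread (thread 2) shows the largest relative deviation.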
The associated event timing probability distributions are plotted in Figure 18. LEAVE[4,6] is the first nonzero execution time event in threads 8 and 11. Table 11 shows that the calculated probability is very close to the simulated probability for an event that is executed early in the system. Figure 19, Figure 20, Figure 21 and Figure 22 show the progression of the calculated probability distribution and the simulated probability distribution for the timing of events from threads 8 and 11 as the system approaches the terminating condition of 2 customers exiting the system. Table 12, Table 13, Table 14 and Table 15 are the associated tables containing the calculated and simulated probability distributions for the events from threads 8 and 11. The tables show that the calculated probabilities remain very close to the simulated probabilities as the simulation progresses. The graphs also exhibit the ability of the probability algorithm to capture the shape of the event timing probability distribution as the simulation progresses, even for rare event sequences with low thread probability, such as LEAVE[9,12] (Node 10), the last event to execute in thread 2, which has a thread probability of 0.0051 (correct to 4 decimal places).

LEAVE[4,6] (Node 22)
Interval   Simulated Probability   Calculated Probability   Difference
[4,5]      0.586982331             0.583333333              0.003649
[5,6]      0.413017669             0.416666667              0.003649
Table 11. Calculated and Simulated Probability for the timing of LEAVE[4,6] (Node 22)

Figure 18. Event Timing Probability Distribution for LEAVE[4,6] (Node 22) (chart not reproduced)

ENTER[4,8] (Node 23)
Interval   Simulated Probability   Calculated Probability   Difference
4          0.078636776             0.083333333              0.00469656
5          0.260213702             0.25                     0.0102137
6          0.330609679             0.333333333              0.00272365
7          0.330539842             0.333333333              0.00279349
Table 12.
Calculated and Simulated Probability for the timing of ENTER[4,8] (Node 23)

Figure 19. Event Timing Probability Distribution for ENTER[4,8] (Node 23) (chart not reproduced)

ENTER[7,14] (Node 25)
Interval   Simulated Probability   Calculated Probability   Difference
7          0.014316642             0.020833333              0.00651669
8          0.095537398             0.098958333              0.00342094
9          0.222431734             0.213541667              0.0088901
10         0.309448984             0.296875                 0.012574
11         0.247922341             0.244791667              0.0031307
12         0.097073818             0.104166667              0.00709285
13         0.013269083             0.020833333              0.00756425
Table 13. Calculated and Simulated Probability for the timing of ENTER[7,14] (Node 25)

Figure 20. Event Timing Probability Distribution for ENTER[7,14] (Node 25) (chart not reproduced)

LEAVE[8,14] (Node 26)
Interval   Simulated Probability   Calculated Probability   Difference
8          0.009009009             0.010416667              0.00140766
9          0.073538655             0.072916667              0.000622
10         0.21509882              0.197916667              0.0171822
11         0.316781898             0.302083333              0.0146986
12         0.277184161             0.291666667              0.01448251
13         0.108387457             0.125                    0.01661254
Table 14. Calculated and Simulated Probability for the timing of LEAVE[8,14] (Node 26)

Figure 21. Event Timing Probability Distribution for LEAVE[8,14] (Node 26) (chart not reproduced)

LEAVE[9,12] (Node 10)
Interval   Simulated Probability   Calculated Probability   Difference
9          0.015936255             0.0671685                0.05123225
10         0.466135458             0.435147655              0.0309878
11         0.517928287             0.497683845              0.0202444
Table 15.
Calculated and Simulated Probability for the timing of LEAVE[9,12] (Node 10)

Figure 22. Event Timing Probability Distribution for LEAVE[9,12] (Node 10) (chart not reproduced)

4.8 Development of Probability Algorithm Using QDES Framework

4.8.1 QDES Framework

The simulation graph model (SGM) modified by Ingalls (1996, 2001) provides a general framework for defining and developing QDES. The general framework and the algorithm were briefly introduced in Sections 2.4 and 2.7. In this section, further details of the QDES framework are discussed.

Let the simulation graph G = (V(G), E(G), Ψ_G) be a directed graph, where V(G) is the set of vertices of G, E(G) is the set of edges, and Ψ_G is an incidence function that associates each edge of G with an ordered pair of vertices. The Simulation Graph Model (SGM) is then defined as S = (F, C, X, T, Γ, G), where

F = {f_v : v ∈ V(G)}, the set of state transition functions associated with vertex v;
C = {C_e : e ∈ E(G)}, the set of scheduling edge conditions;
X = {X_e : e ∈ E(G)}, the set of execution edge conditions;
T = {t_e : e ∈ E_S(G)}, the set of edge delay times as time intervals; and
Γ = {γ_e : e ∈ E_S(G)}, the set of event execution priorities.

The global simulation clock and the event calendar are stored as time intervals. The symbol τ is used to represent the global simulation clock, while the event calendar is represented by the capital letter L. L is defined as the ordered set

L = {(x_1, γ_1, v_1, e_1, a_1, b_1), (x_2, γ_2, v_2, e_2, a_2, b_2), ...}

where x_i represents the execution time, γ_i represents the execution priority, v_i is the vertex to be executed, e_i is the index of the edge that is being scheduled, a_i are the values of the edge attributes, and b_i are the times at which events are scheduled for the ith event notice. According to Ingalls, et al.
(1996), the following sets are defined:

S_v : the set of state variables that can be modified at vertex v (since S is the set of all state variables, S_v ⊂ S);
P_v : the set of parameters of vertex v;
Φ_e : the set of state variables involved in the scheduling conditions on edge e;
ϑ_e : the set of state variables involved in the execution conditions on edge e;
A_e : the set of state variables used to determine the values of a_e on edge e;
H : the set of saved states, which is used to iterate through all possible states in the simulation;
N_h : the NOS, or the set of possible next events.

In addition to these sets, ω is used to represent the termination conditions for the simulation. Two variables, h and n_h, are also defined. The variable h counts the number of saved states while iterating through the set H. The variable n_h is used to iterate through the set N_h.

4.8.2 Execution of the QDES Algorithm

With the definitions of Section 4.8.1, the execution of the QDES algorithm is carried out as follows.

Run Initialization
Initialize the saved state set and counter: H = ∅, h ← 1.

New Process Initialization
Step 1. Initialize the global simulation clock: τ ← [0,0].
Step 2. Insert the initial event notices into the event calendar. For simplicity, we assume that the event notices can be executed at time [0,0]:
L ← L ∪ {([0,0], γ_1, v_1, e_1, a_1, b_1), ([0,0], γ_2, v_2, e_2, a_2, b_2), ...}.

Execute (execution of the model implementation)
Step 1. Determine the NOS. The set Q is the set of all event notices that could be executed next without regard to priority:
Q = {l ∈ L : x_l⁻ ≤ min(x_l'⁺ ∀ l' ∈ L)},
N_h = {q ∈ Q : γ_q = min(γ_q' ∀ q' ∈ Q)}.
Step 2. If |N_h| = 1, then go to Step 6 of Execute.
Step 3. Initialize the variable used to loop through the NOS: n_h ← 1.
Step 4. Save the state of the simulation by pushing the state information onto the savestate stack and incrementing the savestate counter:
H_h = {S, L}, h ← h + 1.
Step 5. Remove the (n_{h−1})th event notice of N_{h−1} from L: L ← L \ {l : l = (N_{h−1})_{n_{h−1}}}. Event notice l is removed from the calendar. Go to Step 7 of Execute.
Step 6. Remove the first event notice from L: L ← L \ {l : l = (x_1, γ_1, v_1, e_1, a_1, b_1)}. Event notice l is removed from the calendar.
Step 7. Evaluate the execution edge condition X_{e_l}(ϑ_{e_l}, a_l). If X_{e_l}(ϑ_{e_l}, a_l) = FALSE, then go to Step 15 of Execute; else go to Step 8 of Execute.
Step 8. Determine the possible new simulation clock time: τ′ ← [max(x_l⁻, τ⁻), min(x_l⁺ ∀ l ∈ L)].
Step 9. If t_{e_l} is a constant delay interval, determine whether the constant delay time is still valid. It is still valid if t_{e_l} = τ′ − b_l. If t_{e_l} is a constant interval and still valid, or if t_{e_l} is an uncertain interval, go to Step 11 of Execute; else go to Step 10 of Execute.
Step 10. Set t_{e_l} ← τ′ − b_l. If t_{e_l}⁻ ≤ t_{e_l}⁺, then the new t_{e_l} is valid, but the process must be restarted from the beginning so that t_{e_l} can be consistent throughout the process; go to Step 1 of New Process Initialization. Otherwise, there is no valid constant interval for this process, and the process is declared "invalid" and terminates; in that case, go to Step 16 of Execute.
Step 11. Update the simulation clock: τ ← τ′.
Step 12. Assign the attributes to the parameters of the vertex: P_{v_l} ← a_l. If Y is the ith state variable in the vertex parameter list, i.e. P_{v_l}(i) = Y, then Y ← (a_l)_i.
Step 13. Evaluate the state change: S ← f_{v_l}(S).
Step 14. Schedule further events. For each edge e_{l_j} emanating from v_l, if C_{e_{l_j}}(Φ_{e_{l_j}}) = TRUE, then evaluate A_{e_{l_j}} and assign the attribute values of the new event notice k, a_k ← A_{e_{l_j}}. Generate the interevent time b_k = f(t_{e_{l_j}}) and schedule the event notice: L ← L ∪ {(τ + b_k, γ_{e_{l_j}}, v_{l_j}, e_{l_j}, a_k, b_k)}.
Step 15.
If any of the following conditions is satisfied: (i) τ > Tstop; (ii) the simulation stopping condition ω evaluates to TRUE; (iii) L is empty; then the simulation has reached the end of the process. Go to Step 17 of Execute.
Step 16. Go to Step 1 of Execute.
Step 17. If h = 1, then terminate the simulation.
Step 18. Restore the last saved system state from the savedstate stack: h ← h − 1, L ← (L : L ∈ H_h), S ← (S : S ∈ H_h).
Step 19. Increment n_{h−1} ← n_{h−1} + 1.
Step 20. If n_{h−1} ≤ |N_{h−1}|, then go to Step 5 of Execute.
Step 21. Go to Step 17 of Execute.

4.8.3 Execution of the Probability Algorithm

The QDES framework and algorithm of Sections 4.8.1 and 4.8.2 will be used as the general framework for developing the probability algorithm, which computes the probability of an event execution and the probability distribution of the event's execution time. In order to determine the new scheduled time given the conditions of the preceding events that have executed, each event notice in the event calendar will carry a new piece of information: the vertex of the event that scheduled the current event notice. Thus, the new definition of L is the ordered set

L = {(x_1, γ_1, v_1, e_1, a_1, b_1, z_1), (x_2, γ_2, v_2, e_2, a_2, b_2, z_2), ...}

where x_i represents the execution time, γ_i represents the execution priority, v_i is the vertex to be executed, e_i is the index of the edge that is being scheduled, a_i are the values of the edge attributes, b_i are the times at which events are scheduled for the ith event notice, and z_i represents the vertex of the event that scheduled the ith event notice. All executed events will be assigned a vertex number z and an edge number y that represent the vertex and the edge that have been executed for an event. The scheduled execution time for an event is denoted s_l. The user-specified time unit for the simulation is denoted θ.
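For a unit time step, the scheduled-time rows used in Tables 4, 5 and 16 follow from convolving the scheduling event's unit-width timing interval with the uniform delay and discretizing the result into unit-width bins. The closed form below is an illustrative sketch (not the dissertation's code), assuming integer delay bounds and a delay width of at least one time unit:

```python
# Bin probabilities of U[0,1] + U[delay_lo, delay_hi] over unit bins.
# The density is a trapezoid, so the first and last bins each get
# 1/(2*width) and every interior bin gets 1/width.
def unit_bin_probs(delay_lo, delay_hi):
    width = delay_hi - delay_lo          # must be >= 1
    n_bins = width + 1
    return [1 / (2 * width)] + [1 / width] * (n_bins - 2) + [1 / (2 * width)]

print(unit_bin_probs(3, 8))   # [0.1, 0.2, 0.2, 0.2, 0.2, 0.1]
print(unit_bin_probs(4, 6))   # [0.25, 0.5, 0.25]
```

A delay of U[3,8] reproduces the 0.1/0.2/0.2/0.2/0.2/0.1 rows of Table 4, and a delay of U[4,6] reproduces the 0.25/0.5/0.25 rows of Table 5.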
The value of θ for the Simple Queuing Model is equal to 1. Recall from Section 4.4.2 that, in order to calculate statistics, we propose to generate all possible conditions for calculating the conditional probability of an event executing in a given interval. Any event that is to be executed after the first nonzero execution time event will carry information about the timing of all preceding events. Each combination of the timings of these preceding events becomes a condition for the conditional probability.

A new variable, denoted f, is defined as the condition number that is generated at each node. At each node, as new conditions are created, the numbering of f is reordered, because the number of conditions may increase or decrease from node to node. When determining whether a new condition (for the execution of the next event in the future calendar) should be generated, the current event's conditional execution probability is calculated. A new condition is created only if the current event's conditional execution probability is not equal to zero.

As an example, consider Table 16, which shows the conditional scheduled time probability distributions for the events LEAVE[8,12] (node 8 in the probability tree diagram of Figure 15) and ENTER[9,12] (node 9). These two events are in the NOS because they have overlapping execution times. The rows "6,4,3,0", "7,4,3,0" and "8,4,3,0" are the first three conditions generated from the previous node, node 7. Comparing the probabilities of occurrence for both events in the time intervals [8,9], ..., [15,16] under all conditions, LEAVE[8,12] can execute first in the intervals [8,9], [9,10] and [10,11].
On the other hand, ENTER[9,12] has a zero probability of executing first under the condition "8,4,3,0", because ENTER[9,12]'s earliest possible execution time is [11,12] while LEAVE[8,12]'s latest possible execution time is [10,11]. However, ENTER[9,12] has a nonzero probability of executing first under the conditions "6,4,3,0" and "7,4,3,0".

Leave[8,12]
Condition   [8,9]     [9,10]    [10,11]   [11,12]
6,4,3,0     0.25      0.5       0.25
7,4,3,0     0.25      0.5       0.25
8,4,3,0     0.142857  0.571429  0.285714

Enter[9,12]
Condition   [9,10]  [10,11]  [11,12]  [12,13]  [13,14]  [14,15]  [15,16]
6,4,3,0     0.1     0.2      0.2      0.2      0.2      0.1
7,4,3,0             0.1      0.2      0.2      0.2      0.2      0.1
8,4,3,0                      0.1      0.2      0.2      0.2      0.2
Table 16. Conditional scheduled time probability of Leave[8,12] and Enter[9,12]

Let us now consider creating new conditions for ENTER[9,12]. Since there is a zero probability of ENTER[9,12] executing first under the condition "8,4,3,0", no new condition is generated from it. Table 17 shows the calculated conditional probability of ENTER[9,12] executing first in the intervals [9,10] and [10,11] under the conditions "6,4,3,0" and "7,4,3,0". For condition "6,4,3,0", two new conditions are generated: "9,6,4,3,0" and "10,6,4,3,0". For condition "7,4,3,0", only one new condition is generated, namely "10,7,4,3,0".

Enter[9,12]
Condition   9      10
6,4,3,0     0.05   0.025
7,4,3,0            0.0125
Table 17. Conditional Probability of Executing First for Enter[9,12]

Four variables are defined to handle the generation of new conditions for the next event(s), the linkage of these conditions to the vertex that executed the current event, and the ordering of event sequences. They are defined as follows.

f : the condition number that is generated at each node.
w : the element number for the fth condition; it represents the tier or level of the nodes in the probability tree diagram from the QSGM output.
c(f, w) : the lesser endpoint of the timing of the event that is executed under the fth condition, where the event is the wth element in the order of the event sequence. The series of c(f, w) for a given condition f gives the timing of all preceding events, which allows us to calculate the average execution time of all preceding events.

d(w) : the vertex number of the event that is executed as the wth element in the order of the event sequence.

Pr(f, w) : the probability of occurrence for the fth condition and wth element. It is obtained by normalizing the conditional probability of event l executing first in the interval with the lesser endpoint pp. If c(f, w) = pp, then

Pr(f, w+1) = Cond_Exe_First_l(pp, (f, w)) × Pr(f, w) / Sum_Cond_Exe_First(f, w).

Figure 23 shows the values of c(f, w) and d(w) for w = 1, 2, 3, 4, 5, assuming that the conditional probability computed based on the fth condition and wth element is not equal to zero. For example, if the conditional probability of ENTER[3,6] executing first in [4,5], given that BEGIN[0,0] executed in [0,0], ENTER[0,0] executed in [0,0] and START[0,0] executed in [0,0], is not equal to zero, then a new condition with c(2,4) = 4 and d(4) = 4 is generated. The values of c(2,1), c(2,2) and c(2,3) are copied from c(1,1), c(1,2) and c(1,3), respectively, so that the values of c(2,1), c(2,2), c(2,3) and c(2,4) become the time series of an event sequence for the 2nd condition.

Figure 23. The values of c(f,w) and d(w) for w = 1, ..., 5

If the simulation were to stop at node 5, the following table shows the values of c(f,w) for all five conditions generated at node 5. For example, c(4,5) = 5 is the lesser endpoint of the execution time of LEAVE[4,6], and d(5) = 5 means that node 5 is the vertex that executed the event associated with c(4,5). The value f = 4 is the condition number, and w = 5 means that the event is the 5th element in the thread.
f\w   1   2   3   4   5
1     0   0   0   3   4
2     0   0   0   3   5
3     0   0   0   4   4
4     0   0   0   4   5
5     0   0   0   5   5
Table 18. The values of c(f,w) at node 5

The following is a list of the new variables added to the probability algorithm to determine the expected execution time for an event and the expected delay time between two events, together with their definitions.

Timestamp(f, w) : the expected execution time for the event from the fth condition and wth element.

Delay(f, w) : the expected delay time between the event from the fth condition and wth element and the event from the same fth condition and (w−1)th element.

cursor(f, w) : is the pointer to the condition which is the
Title  Using Qualitative Discrete Event Simulation to Calculate Near Exact Output Statistics 
Date  20080701 
Author  Leow, Yen Ping 
Keywords  Qualitative discrete event simulation, simulation statistics, event probability, simulation modeling 
Department  Industrial Engineering & Management 
Full Text Type  Open Access 
Abstract  The scope of this dissertation is to develop a statistical analysis method for the Qualitative Discrete Event Simulation (QDES) methodology. The method presented shows the calculation of the probability of occurrence and the event time distribution for each event that is generated using QDES, by imposing an assumption on the distribution of the delay intervals in the simulation, in particular the uniform distribution. The resulting calculations are used to compute the simulation time-persistent output statistics and tally statistics for variables of interest. The Qualitative Discrete Event Simulation (QDES) methodology has been advanced as a decision tool, and its capability has been enhanced to provide meaningful output information. A probabilistic approach to the statistical analysis of a QDES model has been developed. The advancement in the area of reasoning with probability to produce near exact output statistics from QDES will allow modelers to make wise decisions based on a single run, instead of sampling from multiple runs as in a regular discrete event simulation model. The near exact output statistics are able to describe the distribution of different variables, such as the queue length, customer delay and server utilization, together with information on the probability of these average values occurring for each variable and the event sequences that lead to these values. The probabilistic nature of our approach to calculating statistics from a QDES output opens a new chapter on creating a closed-form discrete event simulation output. The key behind the approach is that, by imposing statistical distributions on the temporal intervals and using a complex set of conditional probabilities for the thread, the statistics in the QDES model are exactly represented.
Note  Dissertation 
Rights  © Oklahoma Agricultural and Mechanical Board of Regents 
The probabilities of events LEAVE[4,6] and ENTER[3,8] executing first, respectively ............................................................................................................... 46 Figure 9. Redistribution of the original probability of occurrence for ENTER[4,8] ........ 47 Figure 10. Probability of occurrence for ENTER[4,8] given that LEAVE[4,6] was executed .................................................................................................................... 49 Figure 11. Probability of occurrence for ENTER[4,8] ..................................................... 51 Figure 12. Events ENTER[7,14] and LEAVE[8,14] ........................................................ 52 Figure 13. Probability distribution of LEAVE[8,14]........................................................ 55 Figure 14. Conditional probability for ENTER[7,16] ...................................................... 58 xiii Figure 15. Probability tree diagram.................................................................................. 64 Figure 16. The generated timing of events for the Simple Queuing System.................... 66 Figure 17. The generated events for the Simple Queuing System.................................... 66 Figure 18. Event Timing Probability Distribution for LEAVE[4,6] (Node 22) .............. 68 Figure 19. Event Timing Probability Distribution for ENTER[4,8] (Node 23) ............... 68 Figure 20. . Event Timing Probability Distribution for ENTER[7,14] (Node 25) ........... 69 Figure 21. Event Timing Probability Distribution for LEAVE[8,14] (Node 26) ............. 70 Figure 22. Event Timing Probability Distribution for LEAVE[9,12] (Node 10) ............. 70 Figure 23. The values of c(f,w) and d(w) for w=1,..,5...................................................... 78 Figure 24. The values of S and Q ................................................................................... 108 Figure 25. 
Delay between the first ENTER and first LEAVE events
Figure 26. Delay between the second ENTER and second LEAVE events
Figure 27. Delay between the first and second pairs of ENTER and START events
Figure 28. Delay Bound Example

CHAPTER 1: INTRODUCTION

1.1 Overview

A simulation is the imitation of the operation of a process or system over time. It attempts to build an experimental device that will act like a real system in the aspects that are important to the users. Simulation has been, and will continue to be, a widely used technique for solving problems in engineering, business, physical science, artificial intelligence, economics and many other fields. Simulation is a powerful and important tool that provides a method for evaluating existing or alternative decisions, plans and policies without having to conduct experiments on the real system. Discrete event simulation is intuitively appealing to users. Not only does it mimic what happens in a real system, it also handles discontinuous functions, randomness and dynamics [Ingalls, et al. 2000]. It also allows users to analyze both long-term behavior and transient behavior. One can say that a model is limited only by the amount of time the user wants to spend gathering the data and programming it into the model; otherwise, the model of the world can include anything the user wants. However, this flexibility comes at the cost of collecting the necessary data and spending enough time and effort building a complex model. Thus, modeling and analysis for a discrete event simulation can be time consuming and expensive. A real world system is full of uncertainties, and the processes or systems under study are sometimes highly random and poorly understood.
In order to study a real world system, we often have to make a set of assumptions about how it works using statistical, mathematical or logical relationships. The statistical relationships used in discrete event simulation are often based on the incomplete information that could be gathered about the real system. Users often have to make intuitive choices of a specific statistical distribution to be used as input to the simulation model. Discrete event simulation outputs are random variables that are usually based on random inputs [Banks, et al. 2005]. The results of a discrete event simulation can be difficult to interpret because it can be hard to distinguish whether an observation is a result of system interrelationships or of randomness [Banks, et al. 2005]. Even if the results are interpreted correctly, a discrete event simulation model can only tell a user how a system works under given conditions; it does not tell the user how to arrange the system so that it works best under those conditions. The issues discussed above are among the pitfalls of using discrete event simulation. Qualitative discrete event simulation (QDES) is the qualitative counterpart of discrete event simulation. The qualitative approach to discrete event simulation was developed by Ingalls, et al. (2000) in an attempt to improve the techniques used in discrete event simulation and make it an even more robust decision tool. It is useful in situations where quality data is not available. In fact, it is designed to represent whatever level of knowledge is available about the real system [Ingalls, et al. 2000]. Discrete event simulation can be qualitatively defined by permitting imprecise specification of elements that are typically quantitatively specified either deterministically or in the form of a probability distribution [Ingalls 2000]. This approach assumes imprecise specification of event occurrences.
More specifically, event execution time uncertainty is represented in the form of closed intervals in R. These intervals are called temporal intervals [Ingalls, et al. 2000]. The simulation time clock is also represented in the form of an interval. As a result of this new representation, the simulation executive, a key part of a simulation system that is responsible for controlling the time advance, has evolved to allow the implementation of temporal intervals and the construction of a temporal interval clock. When temporal intervals are used in QDES, the simulation executive runs the execution processes and orders the events according to their execution times. The ordering of interval execution times is made possible with Allen's interval algebra [Allen 1983], which formally expresses the temporal relations between intervals, the operations on them and the reasoning about them [Allen 1983]. Even in the simplest QDES model, ties in the ordering of event execution times are likely. Instead of assuming a tie-breaking strategy as in regular discrete event simulation, the QDES simulation executive creates a process for every possible ordering and produces all possible outcomes of a simulation model. This characteristic distinguishes QDES from regular discrete event simulation. QDES characterizes all possible outcomes, including the outcome from a regular discrete event simulation, which gives it far-reaching potential benefits. One could check state variable values in every process to see if they are in valid ranges [Ingalls, et al. 2000]. For planning and scheduling problems, schedules would not have to be rerun every time something did not happen according to plan [Ingalls 2004]. Thus, as long as the input intervals are respected, at least one process in the run would characterize the events that had taken place and would give information on the next scheduling decision [Ingalls 2004].
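The branching decision described above can be illustrated with a short sketch. The function below is hypothetical, not the QDES implementation: it applies the two relevant Allen relations ("before" in each direction) to a pair of interval execution times and returns the orderings the simulation executive would have to execute.

```python
# Illustrative sketch (invented names): deciding whether the execution
# order of two events with temporal-interval times is certain or a tie.

def possible_orderings(a, b):
    """a and b are temporal intervals (lo, hi). Return the list of
    event orderings the simulation executive must branch into."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    if a_hi < b_lo:          # Allen: a 'before' b -- order is certain
        return [("a", "b")]
    if b_hi < a_lo:          # Allen: b 'before' a -- order is certain
        return [("b", "a")]
    # Intervals overlap (or touch): either event could execute first,
    # so both orderings must be explored.
    return [("a", "b"), ("b", "a")]

print(possible_orderings((4, 6), (7, 9)))   # disjoint: one ordering
print(possible_orderings((4, 8), (6, 10)))  # overlap: both orderings
```

Disjoint intervals yield a single certain ordering, while overlapping intervals force the executive to create one process per ordering.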
The execution sequences of low or high probability events (the tail probability events) could be executed in the QDES model if users are interested in these events. Ingalls (2000) developed scoring methodologies to assist in the analysis of data generated by the QDES model; however, more work needs to be done in this area. These scoring methodologies create simple approximations that provide users with information to rank the processes. They are made in an attempt to approximate the event execution probabilities. With these approximations in hand, algorithms could be implemented to separate the more likely processes from the less likely processes. This would greatly expand the benefits of using qualitative discrete event simulation as a decision tool, as it would help to answer questions like "What is the likelihood of disastrous events? What leads to these disastrous events? How can we prevent them?" These are the intriguing questions that users want a decision tool to answer.

1.2 Research Objective and Motivation

Previous work on Qualitative Discrete Event Simulation (QDES) by Ingalls et al. (2000) and Ingalls (2003) uses time interval representations in discrete event simulation (without any assumptions on the distribution over these intervals) to generate execution threads that characterize all possible outputs of the simulation. Motivated by the intention to advance qualitative discrete event simulation as a decision tool and to expand its capability to provide meaningful output information, the author intends to conduct research in the qualitative discrete event simulation area. In particular, the author is interested in creating approximations to rank the processes generated from the QDES output. The current scoring techniques fall short of ranking the processes according to the probabilities of the event sequences.
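As a toy illustration of what ranking by probability involves, suppose each event's execution time were uniformly distributed over its interval and the two events were independent; the chance that one event executes before the other can then be estimated by simple sampling. The event names and the uniform assumption here are illustrative only, not a result from the dissertation.

```python
import random

# Hypothetical events ENTER[4,8] and LEAVE[4,6]; assume each execution
# time is uniform over its interval and the two draws are independent.
random.seed(1)
trials = 100_000
enter_first = sum(
    random.uniform(4, 8) < random.uniform(4, 6) for _ in range(trials)
)
print(enter_first / trials)  # close to the exact value, 1/4
```

For these two intervals the exact probability works out to 1/4, so a ranking scheme would weight the "ENTER first" ordering far below "LEAVE first".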
The current QDES framework also does not have statistical analysis features to collect and report statistics on certain states, or on the values of global variables and other performance statistics based on the attributes of entities in the simulation model. The primary objective of this dissertation research is to calculate the probability of occurrence and the event time distribution of each event that is generated using QDES, by imposing an assumption on the distribution of the delay intervals in the simulation. We are particularly interested in imposing a uniform distribution on the delay intervals. The resulting calculations can be used to calculate the simulation time-persistent output statistics and tally statistics for variables that are of interest to the users. The second objective of this dissertation is to use these calculations (the probability distribution of the event sequences and the thread probabilities) to produce near exact output statistics. The advancement in the area of reasoning with probability to produce near exact output statistics from QDES will allow modelers to make wise decisions based on a single run, instead of sampling from multiple runs as in regular discrete event simulation models. The near exact output statistics will be able to describe the distribution of different variables, such as the queue length, customer delay, server utilization, etc., together with information on the probability of these average values occurring for each variable and the event sequences that lead to them. By imposing a statistical distribution to describe the threads, modelers can also identify low probability threads that are not significant, which could contribute to the area of thread scoring.

CHAPTER 2: LITERATURE REVIEW

2.1 Discrete Event Simulation

Discrete Event Simulation is the modeling of a system where the state variables of the system change at discrete points in time, at the instant an event occurs.
In every simulation model, there is a global variable that is used to keep track of the current value of the simulated time. This variable is known as the simulation clock. Since discrete event simulation involves the modeling of a system over time, it needs a mechanism to advance the simulated time from its current value to another. The most widely used mechanism follows the next-event time advance approach. We reserve the term Regular Discrete Event Simulation (RDES) for simulation models that follow this approach. The next-event time advance approach uses a future event calendar to advance the simulated time and thus guarantees that all events generated during the simulation are executed in the correct chronological order according to the execution time of each event. Every event is represented by an event notice that contains the event execution time, event type and other data related to that particular event. The future event calendar keeps a list of all event notices for events that are scheduled to occur at a future time. Every event may schedule and/or cancel other events. At the start of an RDES model, the simulation clock is initialized to zero. The occurrence times of future events are computed and inserted into the future event calendar. The future event calendar is ordered by event times and arranged chronologically, with the most imminent future event placed at the top of the future event calendar. The first event notice is then removed from the top of the future event calendar. The simulation clock is advanced to the scheduled time of the next event, which is retrieved from the event notice. At the instant this event occurs, the state of the system is updated to account for the occurrence of this event. Then, the completion time of this event is computed, an "end" event is scheduled, and its associated event notice is placed on the future event calendar to signify the completion of the event.
At this time, cumulative statistics can be collected so that summary statistics for the simulation run can be computed at the end of the simulation. Then, the next "new" event at the top of the future event calendar is removed, and the whole process of advancing the simulation clock, updating the states of the system and scheduling or canceling events continues until all the events have been executed or the stopping condition of the simulation model is satisfied. The future event calendar is a strongly ordered list according to the event times. The event times must satisfy the following condition [Banks, et al. 2005]:

Let t1 be the time of the most imminent event, and let t2, t3, ..., tn be the times of the 2nd, 3rd, ..., nth most imminent events. At any given time t:

t < t1 ≤ t2 ≤ t3 ≤ ... ≤ tn

Sometimes several events may have the same execution time; such a situation is known as a tie. The events that have the same execution times are referred to as simultaneous events [Banks, et al. 2005]. The result of a simulation model may vary depending on the order in which simultaneous events are executed. There are several tie-breaking mechanisms to determine the order of simultaneous events, sometimes ad hoc and specific to the simulation protocols being used. One type of arbitrary tie-breaking mechanism allows a nondeterministic ordering of simultaneous events. Another type of tie-breaking mechanism allows the modelers to explicitly specify the order of simultaneous events or use a default ordering. The advantage of such a mechanism is that it guarantees that the results of a simulation are reproducible. The creation of a discrete event simulation model can be highly dependent on the availability of system data. If data is available, modelers normally begin with the development of a frequency distribution or histogram of the data in an attempt to fit the data to a family of distributions.
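The next-event time advance loop and a deterministic tie-breaking rule described above can be sketched in a few lines. This is a minimal illustration, not any particular simulation package: the future event calendar is a heap ordered by event time, and an insertion counter fixes the order of simultaneous events so that runs are reproducible.

```python
import heapq
import itertools

calendar = []                 # future event calendar, ordered by event time
tiebreak = itertools.count()  # insertion order breaks ties deterministically
clock = 0.0                   # simulation clock, initialized to zero

def schedule(time, name):
    """Insert an event notice (time, tie-break key, event type)."""
    heapq.heappush(calendar, (time, next(tiebreak), name))

schedule(2.0, "ENTER")
schedule(5.0, "LEAVE")
schedule(5.0, "ENTER")        # simultaneous with LEAVE: a tie

trace = []
while calendar:
    # Remove the most imminent event notice and advance the clock to it;
    # state updates and statistics collection would happen here.
    clock, _, name = heapq.heappop(calendar)
    trace.append((clock, name))

print(trace)   # events execute in chronological order; ties in FIFO order
```

Because the counter is part of the sort key, the two events at time 5.0 always execute in the order they were scheduled, which is one example of a default tie-breaking rule that makes results reproducible.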
Sometimes it is impractical or impossible to get these data, and even if the data is available, there are still other issues that need to be addressed, such as data overload and data abstraction [Ingalls 1999]. Collecting data from the real system is time-consuming and requires a substantial resource allocation. Beyond that, a great deal of effort is required to distill quality data and encode the information that is available into a discrete event simulation model. In a discrete event simulation, the inputs to a simulation model are usually represented using probability distributions. These models rely on the specification of the probability distributions and the associated parameters that serve as inputs to the model. Each of these probability distributions carries statistical assumptions, such as the independent and identically distributed (i.i.d.) assumption, the normality assumption and so on. These assumptions are rarely valid, but modelers are forced to make them. The choice of input model can significantly impact the prediction or decision to be made based on the simulation results. Depending on the type of distribution that is used in the model, the modelers are required to make the necessary assumptions. Every simulation analysis is based on a variety of assumptions that are made about the nature of the data. One is likely to reach erroneous conclusions if one does not carefully evaluate the validity of the assumptions behind the analysis. It has become standard procedure to make these assumptions when using discrete event simulation as an analysis tool. Even if the assumptions could be considered fairly valid (for example, in a strictly controlled environment), the task of identifying a probability distribution to represent the input data is not easy.
Moreover, the uncertainty in a discrete event simulation model only relates to the choice of a family of distributions and the selection of the parameters, as the parameters can either be entered as constant values or chosen from one of the probability distributions. Discrete event simulation is a computer-based statistical sampling experiment [Law and Kelton 2000]; therefore, the result of any analysis based on input represented by probability distributions is itself a probability distribution. Each simulation run denotes the trajectory behavior of the simulation model in response to a particular experiment. Therefore, discrete event simulation does not provide the desired information regarding the viability of the system under study. For example, it does not tell you under what conditions the system will fail or succeed, because these types of questions deal with the transient behavior of the system under study [Ingalls et al. 2000]. As a result, the modelers are left with no choice but to perform a wide range of simulation runs with different inputs in order to grasp a deeper understanding of the system viability. Another important characteristic of discrete event simulation is that it uses a time-point representation to model the occurrence times of future events. The time-point representation does not allow any uncertainty of information; often the exact relationship between two time-points is not known, but some constraints on how they could be related are known [Allen 1993]. Thus, the notion of a time point is not decomposable and is not useful in a reasoning system. According to Allen (1993), an event may occur instantaneously at a particular time point, but it appears that such an event could be decomposed into subevents if it is examined closely. For example, an event of "a customer arriving at a bank" at a specific time point could be decomposed into "open the front door" and "look for a free bank teller or queue and walk towards it".
This motivates the search for another possible representation for the modeling of discrete event systems.

2.2 Qualitative Simulation

Qualitative simulation is a reasoning technique that derives useful inferences from the modeling of a system when there is a lack of good quantitative information about the system under consideration. It is motivated by the desire to reason about the objects, processes or events in order to uncover all possible behaviors that may exist in the system. One major distinction between qualitative simulation and quantitative simulation is that the output of a quantitative model is the result of a particular experiment, while the output from a qualitative model is in response to all possible experiments [Cellier 1991]. When a qualitative simulation determines the next possible state, it can easily determine that there are several next possible states because of the imprecise nature of the data. A qualitative simulation will then execute each of these possible next states. Thus, the results of a qualitative simulation contain all possible event sequences. Qualitative simulation is particularly useful when the level of knowledge about the system under consideration is imprecise. Another major distinction is in the representation of state variables and of time. A model is quantitative if the state variables are real-valued; otherwise, the model is qualitative [Cellier 1991]. This is an important distinction which can identify different types of simulation models. Simulation models can be largely classified into two major types, namely continuous simulation and discrete event simulation.
Continuous simulation often deals with the modeling of a physical system over time by a representation in which the state variables change continuously with respect to time; a traditional continuous simulation model often involves differential equations that describe the rate of change or the interaction of state variables with time. The abstraction is based on ordinary differential equations, which are numerical in character. Kuipers (2001) described a continuous time qualitative simulation model using a qualitative differential equation model as an abstraction of an ordinary differential equation, consisting of a set of real-valued variables with functional, algebraic and differential constraints among them. A qualitative differential equation consists of variables which are described in terms of their ordinal relations with a finite set of symbolic landmark values. A relationship could be first-order and as simple as: as a increases, b increases. The values a and b could be described as increasing or decreasing over particular ranges. The concept of combining discrete event simulation with qualitative simulation was explored by Ingalls (1999), who developed a qualitative discrete event simulation methodology using the temporal interval as the simulation time specification. A temporal interval allows the user to define time as an interval in the simulation, which means that the current state of the simulation can occur at any time during the interval defined by t = [t-, t+]. Temporal intervals are represented by modeling their endpoints, assuming for any interval t that the lesser endpoint is denoted by t- and the greater by t+. The level of uncertainty about the exact timing of the event occurrence is reflected in the width of the interval [Ingalls et al. 2000]. The more uncertainty there is in the event timing, the wider the corresponding temporal interval.
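The temporal interval representation can be sketched as follows. The class below is a hypothetical illustration, not the QDES implementation; its `shift` method assumes standard interval addition, so that an event scheduled from clock [0, 0] with a delay interval of [3, 8] executes somewhere in [3, 8].

```python
# Sketch of the temporal-interval time representation (invented class):
# an event time is a closed interval [t-, t+], and the interval width
# expresses how uncertain the event's timing is.

class TemporalInterval:
    def __init__(self, lo, hi):
        assert lo <= hi, "lesser endpoint t- must not exceed greater t+"
        self.lo, self.hi = lo, hi       # t- and t+

    def width(self):
        """Wider interval = more uncertainty about the event timing."""
        return self.hi - self.lo

    def shift(self, delay):
        """Schedule a future event: current time plus a delay interval
        (standard interval addition, assumed here for illustration)."""
        return TemporalInterval(self.lo + delay.lo, self.hi + delay.hi)

clock = TemporalInterval(0, 0)                 # simulation starts at [0, 0]
arrival = clock.shift(TemporalInterval(3, 8))  # delay known only as [3, 8]
print((arrival.lo, arrival.hi), arrival.width())
```

A zero-width interval such as [5, 5] recovers the ordinary time-point case, so the interval representation strictly generalizes the RDES clock.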
This simulation methodology is developed within the modeling framework of event graphs and simulation graph models [Schruben 1983; Som and Sargent 1989; Yücesan and Schruben 1992; Schruben and Yücesan 1993; Ingalls et al. 1996]. Another qualitative approach to discrete-event simulation was taken by Zeigler and Chi (1992) using a linear polynomial representation, known as the Symbolic Discrete Event System. This approach allows the modelers to express times symbolically, both when the timing of events is unknown and when the timing is known but can vary. The time advance values can be expressed as linear polynomials such as 1, 2, s, t, 2s, t+s, 2ts+9, etc., which must evaluate to nonnegative real numbers [Zeigler and Chi 1992]. The temporal interval specification extends the regular time base from real numbers to an interval representation. This specification also serves the purpose of representing uncertainty in event execution times. The linear polynomial representation is used to allow manipulation of expressions for time with symbols representing unspecified event times. For example, let p = waiting time at register A and q = process time at register A; then "p+q" is a valid expression according to the symbolic discrete event system specification. This expression could be used to represent the time a customer spends at register A, where p and q are symbols that represent unspecified event times. When p and q are assigned numerical values, the expression evaluates to a real number.

2.3 Qualitative Discrete Event Simulation (QDES)

Qualitative discrete event simulation (QDES) is an event-scheduling approach to simulation modeling developed by Ingalls (1999) and Ingalls et al. (2000). It extends the concept of qualitative simulation to apply particularly to discrete event systems. QDES uses the next-event time advance approach, as in a regular discrete-event simulation (RDES).
At the start of a simulation model, the simulation clock is initialized to zero, and the occurrence times of the most imminent future events are determined and inserted into an events calendar. The simulation clock is advanced from the occurrence of one event to another, at which point the state of the system and the times of occurrence of future events are updated. This process advances the simulation clock until all the events in the events calendar have been executed or canceled, or a certain stopping criterion is met. One distinct characteristic of QDES is that it allows the modelers to specify elements qualitatively. These elements include the times of occurrence of the future events, the simulation time clock and the state variables. The times of occurrence of the future events are represented as time points in RDES. Unlike RDES, QDES assumes imprecise specification of event occurrences. The uncertainty of the event times is represented as a closed time interval in R, also known as a temporal interval. There are two types of temporal intervals, namely the constant interval and the uncertain interval. A constant interval is an interval whose value remains the same throughout the entire simulation, while an uncertain interval is an interval that changes value every time the interval is evaluated. In using constant intervals, it is assumed that the actual value of the variable is a constant that lies somewhere within an interval. An uncertain interval is equivalent to modeling sampling from an unknown probability distribution that is bounded by an interval. As a result of the time interval representation in QDES, the future event calendar is also represented in temporal intervals. The future event calendar in QDES is also sorted according to event times, but it is not a strongly ordered list. Events are sorted according to the interval mathematics outlined in Allen (1983).
It is likely that there will be ties on the future event calendar because of the uncertain order of events. If there is a tie, QDES does not assume a tie-breaking strategy as RDES does. Instead, QDES creates threads that make up all of the possible orderings of the ties, which differentiates QDES from RDES. The differences are that RDES's future event calendar is a strongly ordered list according to the event times, and RDES uses several tie-breaking mechanisms to determine the order of simultaneous events, sometimes ad hoc and specific to the simulation protocols being used. The future event calendar in QDES collects all the event notices whose execution order is uncertain and groups them in a set, called the nondeterministically ordered set (NOS). The capability of generating all possible scenarios is achieved with the thread generation algorithm. There are currently two such algorithms, namely the Depth-First algorithm and the Breadth-First algorithm; these are discussed in more detail in the next section. The execution of the QDES algorithms resembles RDES to some extent. The algorithms contain the basic steps of RDES, such as initializing the simulation clock, advancing the simulation clock from its current value to another, inserting events into the future event calendar, determining the next event to be executed and so on. When QDES is executed, the next possible state is determined. Due to the lack of precise data about the real world, there could be several next possible states, or a tie. The QDES algorithm has some major additional steps to ensure that each of these states is executed in turn, resulting in a set of threads that includes all of the possible orderings of event sequences. An example of a new thread being generated is when the execution times (expressed as temporal intervals) for two or more events overlap.
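The branching step for a tie can be sketched as a permutation over the nondeterministically ordered set: one thread per possible execution order. The function and event names below are illustrative, not the dissertation's API.

```python
from itertools import permutations

# Sketch of branching on a non-deterministically ordered set (NOS):
# when several event notices have overlapping execution intervals, the
# executive creates one thread for each possible ordering of the tie.

def threads_for_tie(nos):
    """Return every possible execution order of the tied events."""
    return [list(order) for order in permutations(nos)]

nos = ["ENTER[4,8]", "LEAVE[4,6]"]   # overlapping execution intervals
for thread in threads_for_tie(nos):
    print(thread)
```

Note that an n-event tie produces n! orderings, which hints at why scaling the QDES methodology (Section 6.1.1) is a research question in its own right.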
The distinctive characteristic of QDES of generating all possible orderings of event sequences is known as coverage [Ingalls et al. 2000]. The coverage property ensures that all outcomes of QDES are characterized and no outcome is missed. In RDES, by contrast, the simulation outcome is based on a sampling approach. The coverage property of QDES is very useful for planning and scheduling problems. Schedules do not have to be rerun every time something does not happen according to plan; the output of QDES can characterize the changes of events and give information on the next scheduling position, as long as the input intervals are respected. Another advantage of the coverage property is in debugging simulation models. QDES characterizes all possible scenarios, including anything that is characterized in an RDES model as well as sequences that have a low probability of execution. This gives the modeler absolute confidence in the validity of the simulation model [Ingalls et al. 2000]. In RDES, input parameters to the system are usually represented using probability distributions, and thus some modeling assumptions are made due to the lack of quality information about the real system. DES is a computer-based statistical sampling experiment [Law and Kelton 2000]. The result of any analysis based on input represented by probability distributions is itself a probability distribution. On the other hand, input parameters to QDES are expressed as temporal intervals, and no assumption is required to determine which probability distribution fits the input data. The capability of generating all possible scenarios is achieved using the thread generation algorithm. When a qualitative simulation is executed, the next possible state is determined. Due to the lack of precise data about the real-world system, there could be several next possible states, i.e. a tie.
Each of these states is executed in turn, resulting in a set of threads that includes all of the event sequences. An example of a new thread being generated is when the execution times (expressed as time intervals) of two events overlap; in the case of such a tie in the future events calendar, QDES creates a thread for every possible ordering of the tied events. As discussed above, this distinctive characteristic of qualitative simulation is known as coverage [Ingalls et al. 2000], and it is valuable both for planning and scheduling problems and for debugging simulation models.

2.4 Event Graphs and Simulation Graph Models

QDES is designed and developed using event graphs (EG) and the simulation graph model (SGM). The event graph and simulation graph framework was first introduced by Yücesan and Schruben (1992) and Schruben and Yücesan (1993), and further extended by Ingalls (1999) and Ingalls et al. (2000). Event graphs were introduced in order to have a discrete event simulation methodology that is based on system events. There are other types of discrete event models, such as block diagrams, process networks, and the activity wheel or activity life cycle [Ingalls 1994]. These models are structured from the activity standpoint.
The event orientation allows discrete event simulation concepts to work without the traditional entity definition. Let the simulation graph G = (V(G), ES(G), EC(G), ΨG) be a directed graph, where V(G) is the set of vertices of G, ES(G) is the set of scheduling edges, EC(G) is the set of canceling edges, and ΨG is an incidence function that associates an ordered pair of vertices with each edge of G. The Simulation Graph Model (SGM) is then defined as S = (F, C, X, T, Γ, G) where

F = {fv : v ∈ V(G)}, the set of state transition functions associated with vertex v,
C = {Ce : e ∈ E(G)}, the set of scheduling edge conditions,
X = {Xe : e ∈ E(G)}, the set of execution edge conditions,
T = {te : e ∈ ES(G)}, the set of edge delay times as time intervals, and
Γ = {γe : e ∈ ES(G)}, the set of event execution priorities.

The simulation graph G specifies the relationships that exist between the elements of the set of entities in a simulation model. The basic construct of the event graph with an edge execution condition is given in Figure 1. The nodes labeled A and B represent events, and the edge specifies that there is a relationship between the two events. The construct is interpreted as follows: if condition (i) is true at the instant event A occurs, then event B will be scheduled to occur t time units later. Event B will be executed t time units later, with the state variables in array n set equal to the values in array k, if condition (j) is true at that time; t may assume the value of zero. Note that it is possible to specify an edge with no condition.

Figure 1. The basic construct for an event graph

2.5 Interval Mathematics and Its Use in Modeling

Allen (1983) noted that a time-point representation does not allow any uncertainty of information, and that often the exact relationship between two time points is not known. Thus the notion of a time point is not decomposable and is not useful in a reasoning system. A temporal interval representation is sometimes more useful in certain situations.
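The Figure 1 construct (schedule event B t time units after event A when condition i holds) can be illustrated with a tiny real-valued executor. This is an RDES-style sketch for intuition only: QDES replaces the scalar delays with temporal intervals, execution conditions and priorities are omitted, and all names here are hypothetical.

```python
import heapq

def run_event_graph(vertices, edges, initial, stop_time=float("inf")):
    """Minimal event-graph executor.

    vertices: {name: state_transition}        -- the f_v of the SGM
    edges:    [(src, dst, condition, delay)]  -- scheduling edges (C_e, t_e)
    """
    state = {}
    calendar = [(0.0, initial)]               # (time, vertex) event notices
    while calendar:
        clock, v = heapq.heappop(calendar)    # most imminent event first
        if clock > stop_time:
            break
        vertices[v](state)                    # execute state transition f_v
        for src, dst, cond, delay in edges:
            if src == v and cond(state):      # "if condition (i) is true..."
                heapq.heappush(calendar, (clock + delay, dst))
    return state

# A self-scheduling TICK event, stopped by its own edge condition.
vertices = {"BEGIN": lambda s: s.update(n=0),
            "TICK": lambda s: s.update(n=s["n"] + 1)}
edges = [("BEGIN", "TICK", lambda s: True, 1.0),
         ("TICK", "TICK", lambda s: s["n"] < 3, 1.0)]
print(run_event_graph(vertices, edges, "BEGIN"))
```

Here the edge condition is tested at the instant the source event occurs, which is the scheduling-edge semantics of the construct; the separate execution condition (j) of the full SGM is not modeled.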
An example illustrating the use of a temporal interval is the modeling of the start time of a crisis. Suppose there is an alarm system that triggers to inform the appropriate authority when there is a crisis. In this case, the start time of the crisis is not known even if the start time of the triggering alarm can be determined accurately. An upper bound can be placed on the start time of the crisis because the crisis must have happened at a time earlier than the alarm time. Thus it is only possible to specify the start time of the associated crisis as a time interval.

2.6 Temporal Intervals and Qualitative Time Calendar Construct

The uncertainty of an event time is represented as a closed time interval in R, which is also known as a temporal interval. Temporal intervals are used as the timing mechanism in the QDES framework. There are two types of temporal intervals used in QDES, namely the constant interval and the uncertain interval. A constant interval is an interval whose values remain the same throughout the entire simulation, while an uncertain interval is an interval that changes values every time the interval is evaluated. In using constant intervals, it is assumed that the actual value of the variable is a constant that lies somewhere within an interval. An uncertain interval is equivalent to modeling sampling from an unknown probability distribution that is bounded by an interval. A temporal interval allows users to define time in terms of either a constant interval or an uncertain interval in a QDES model. For example, the current simulation time can be defined as t = [t−, t+]. This means that the current state of the simulation can occur at any time during the interval t. As a result of the use of temporal intervals in QDES, events in the future events calendar, and the future events calendar itself, are also defined in the same manner.
The future events calendar in the QDES framework is also sorted according to event times, but it is not a strongly ordered list. Events are sorted according to the interval mathematics outlined in Allen (1983). Suppose there are two events, A and B. Event A is scheduled to execute at any time during the interval tA = [tA−, tA+] and event B is scheduled to execute at any time during the interval tB = [tB−, tB+]. Event A precedes event B in the future events calendar in a QDES model if:

1. tA+ < tB− (e.g. [2,3] < [4,5])
2. tA− < tB− (e.g. [2,6] and [3,6])
3. tA− = tB− and tA+ < tB+

It is likely that there will be ties in the order of events in a QDES model. Such a situation happens when the execution intervals of the events intersect. For example, the order of events A and B is uncertain if tA ∩ tB ≠ ∅. When this condition arises, QDES creates different threads to execute the possible orderings of this set of events. This set of events is known as the nondeterministically ordered set (NOS). In a QDES model, the future events calendar is made up of a union of NOSs. Given the current state of the model, the future events calendar determines the NOS and then creates a thread for each event in this set. If there are a total of N events in the NOS, N threads are created. In the ith thread, the ith event is the first event to be executed; the remaining events in the set stay in the future events calendar for that thread. Each thread maintains its own calendar and calendar time. The implementation of the future events calendar for NOSs in QDES is known as the Temporal Interval Calendar (TIC).

2.7 Qualitative Discrete Event Simulation (QDES) Algorithm

The Simulation Graph Model (SGM) defined earlier as S = (F, C, X, T, Γ, G) is used as the modeling framework in the QDES algorithm in Ingalls's work. Additional mechanisms were introduced along with the definition of the SGM in order to facilitate the execution of the model.
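The ordering rules and the tie test can be written directly against the interval endpoints; a small sketch with hypothetical function names:

```python
def precedes(tA, tB):
    """Event A precedes event B on the calendar, per the ordering rules.
    Intervals are (lo, hi) pairs, i.e. (t-, t+)."""
    if tA[1] < tB[0]:           # rule 1: A ends before B can start
        return True
    if tA[0] < tB[0]:           # rule 2: A can start strictly earlier
        return True
    return tA[0] == tB[0] and tA[1] < tB[1]   # rule 3: same start, A ends earlier

def order_uncertain(tA, tB):
    """A and B form a tie (belong to one NOS) when their intervals intersect."""
    return max(tA[0], tB[0]) <= min(tA[1], tB[1])
```

Note that two events can be calendar-sorted and still be tied: [2,6] precedes [3,6] by rule 2, yet the two intervals intersect, so their actual execution order is uncertain and both orderings must be explored as threads.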
These additional mechanisms include the global simulation clock τ, which is represented as a time interval, the events calendar L and the terminating condition for the simulation ω. According to Ingalls (2003), L is an ordered set such that L = {(x1,γ1,v1,e1,a1,b1), (x2,γ2,v2,e2,a2,b2), …} where xi represents the execution time, γi represents the execution priority, vi is the vertex to be executed, ei is the index of the edge that is being scheduled, ai are the values of the edge attributes, and bi are the times at which events are scheduled for the ith event notice. Two new sets are introduced to handle temporal intervals, defined as follows:

H = the set of saved states, used to iterate through all the possible states in the simulation
Nh = the NOS

Two new variables are also introduced. Variable h is used as a counter to count the number of saved states and iterate through the set H. Variable nh is used to iterate through the set Nh. The execution of the QDES model with the SGM and the other definitions mentioned above is explained in the next section. There are currently two QDES algorithms defined, the depth-first algorithm [Ingalls et al. 2000] and the breadth-first algorithm [Agrawal 2002]. Each of these algorithms is explained in the next sections.

2.7.1 Depth First Algorithm

A depth-first algorithm was created and implemented by Ingalls et al. (2000). The algorithm is designed to completely finish generating one whole thread before moving on to another thread. Any additional threads are stored on a stack, waiting to be executed at a later time. When there is a tie, two or more threads are generated. The state of the simulation after each thread explosion is saved in a stack called the save-state stack, H. Each save-state consists of a global events calendar and state variable information, so that all possible states in the simulation can be executed recursively.
The algorithm executes the first thread, puts the rest of the threads on the stack, and continues. When the generation of this thread is complete, the algorithm restores the system from the save-state stack and continues the simulation with the second thread. A save-state counter is set up to count the number of saved states and to iterate through the save-state stack. Recall that all event notices whose execution order is uncertain are grouped in a set, called the nondeterministically ordered set (NOS). In the depth-first algorithm, a NOS counter is used to iterate through the NOS. The depth-first algorithm is very useful if a complete analysis of all possible scenarios is needed in decision making. Since all possible schedules are characterized, as long as the input intervals are not violated, users can make fast strategic decisions based on previously run output, saving time and effort.

2.7.2 Breadth First Algorithm

A breadth-first algorithm was developed and implemented by Agrawal (2003). In a breadth-first algorithm, all the active threads are evaluated simultaneously. The explosion of threads in this algorithm can be viewed as a tree diagram. The QDES simulation starts either with one parent node or a set of parent nodes. The simulation execution continues and spawns new threads from each of these parents, one by one, until all possible child nodes of each parent are explored. Before advancing to the next level down the tree structure, the breadth-first algorithm ensures that all sibling nodes are executed. Agrawal (2003) proposed a queue structure in his implementation to store the breadth-first nodes, each a set consisting of the event notices that are to be executed next, together with information such as the state variables and the global events calendar. This queue is denoted the breadth-first node queue. The breadth-first algorithm proceeds and gives system snapshots of all event sequences through time.
It keeps track of each possible system state and how the simulation leads to that state. It is also useful for eliminating threads that are considered unimportant or unlikely to give good information. For example, thread elimination could be added to the breadth-first algorithm if the number of thread explosions exceeds a certain limit. However, the depth-first algorithm has a speed advantage over the breadth-first algorithm, due to the way the breadth-first algorithm is structured.

2.8 Thread Scoring Techniques

Some of the threads generated by either of the above QDES algorithms may be less likely to happen than other threads. As the complexity of the system increases, the number of threads generated using QDES also increases. The thread generation can grow exponentially, causing the problem of extracting meaningful information from the output of the model. Thus, scoring techniques were introduced to approximately rank the threads according to the likelihood of their event execution sequences. Thread scores can be used in breadth-first algorithms to eliminate threads that have lower scores relative to other threads, thereby reducing the run time of the algorithm. Ingalls (1999) described three scoring techniques. The midpoint ranking calculates the midpoint of each interval and ranks them accordingly. The second method is the multiple midpoint method, which also uses the midpoint of each interval; however, the resulting score takes into account the relative magnitudes of all the midpoints, so the event that is likely to execute first has a higher rank. Let En, n = 1, 2, …, N (N ≥ 2) denote events with execution intervals [Ln, Un] that overlap. Let Mn be the midpoint of interval [Ln, Un] for n = 1, 2, …, N. Let Rank(Mn) denote the rank of Mn.
The multiple midpoint ranking is calculated as in the following equation:

Rank(Mn) = (min_k Mk − min_k Lk) / (Mn − min_k Lk)

where each minimum is taken over k = 1, 2, …, N. The uniform method assumes that each overlapping interval section has an equal chance of being executed next and that each interval follows a uniform distribution. A score is given to each interval, determined according to the probability that the given event would be executed first. The uniform distribution is chosen because it can easily be programmed into the simulation and it has a closed-form density function.

CHAPTER 3: DELAY CONSTRAINT VALIDITY

3.1 Introduction

In an RDES model, entities migrate from state to state as they work their way through the model. There are five entity states as described by Schriber and Brunner (2007): the Active state, Ready state, Time-Delayed state, Condition-Delayed state and Dormant state. The Active state is the state of the currently moving entity; only one entity moves at any instant. An entity in the Ready state is ready to move. More than one entity can be in the Ready state; however, entities move one at a time. An entity in the Time-Delayed state is waiting for a known future simulated time to be reached so that it can then (re)enter the Ready state. The Condition-Delayed state is the state in which an entity is delayed until some specified condition comes about. If an entity is in the Dormant state, no escape will be triggered automatically by changes in model conditions; only modeler-supplied logic can transform the entity from the Dormant state to the Ready state.
Generic RDES software uses five lists to organize entities in the five entity states:

• Active Entity List - a list consisting of only the active entity
• Current Event List - a list of entities in the Ready state
• Future Events List - a list of entities in the Time-Delayed state, ranked chronologically by event execution time
• Delay List - a list of entities in the Condition-Delayed state
• User-Managed List - a list of entities in the Dormant state

The interdependencies and the delay constraints that exist between events are managed using these lists. For example, at the time of an entity's insertion into the Future Events list, the entity's delay time (also known as its move time) is calculated by adding the exact known (sampled) duration of the time-based delay to the value of the simulation clock. Since the Future Events list is a strongly ordered list and events are executed based on exact (sampled) execution times, there is no need to check the validity of the order of event execution. Based on the QSGM algorithm proposed by Ingalls (1999) and Ingalls et al. (2000), when the execution of a node triggers a new event to be scheduled at a later time interval, the new event's delay time is calculated by adding the time-based delay, which is also represented as a time interval, to the time interval of the simulation clock. Since the time interval representation does not provide a means of strongly ordering the Future Events calendar in QDES, all possible orders of event execution are simulated. Due to the uncertainty in the timing of the events, it is necessary to check the validity of the delay constraint that exists between an event and the event that scheduled it. The algorithm proposed by Ingalls (1999) and Ingalls et al. (2000) did not consider checking the validity of delay constraints and thus may generate invalid threads.
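The difference described above comes down to scalar versus interval addition when computing a scheduled event's move time; a sketch with a hypothetical helper name:

```python
def interval_add(clock, delay):
    """QSGM scheduling: the new event's time is the interval sum
    [c-, c+] + [d-, d+] = [c- + d-, c+ + d+]."""
    return (clock[0] + delay[0], clock[1] + delay[1])

# RDES: move time = clock + sampled delay, a single number.
rdes_move_time = 0.0 + 5.2

# QDES: both the clock and the time-based delay are temporal intervals,
# so the scheduled time is itself an interval and does not strongly
# order the calendar.
print(interval_add((0, 0), (3, 8)))    # an event scheduled at clock [0,0]
print(interval_add((3, 6), (3, 8)))    # an event scheduled at clock [3,6]
```

Because the resulting intervals can intersect, the execution order is not determined at scheduling time, which is why the delay constraint between an event and its scheduling event must later be checked for validity.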
Ingalls (1999) presents an example to demonstrate that QDES has the same functionality as RDES and also to show the ability of QDES to model at a more strategic level rather than at a lower operations level. The same example is used here to illustrate the discovery of invalid threads and the approach to eliminating the generation of invalid threads in the QDES algorithm. This chapter starts with an illustration of the Simple Queuing Model (the same as the simple inbound-outbound logistic pipeline model in Ingalls (1999)).

3.2 Simple Queuing Model

Figure 2 shows a simple queuing model with four events: "BEGIN", "ENTER", "START" and "LEAVE". This model can be viewed as a bank teller system where the system "BEGINs" at time [0,0], customers "ENTER" the bank and wait to be served by the bank teller. The next customer in line will "START" the service process. When the service is finished, the customer will "LEAVE" the bank. "BEGIN" is the first event to be executed and it is used to initialize the state variables. The variables tracked in this model are the queue length Q, the status of the bank teller S, and the number of customers that have exited E. The status of the bank teller is busy if S = 0 and idle if S = 1. Changes in the state variables occur when an event occurs; the state changes of an event are stated below the event itself in Figure 2. For example, when the "START" event occurs, the queue length is decremented by 1 (Q = Q − 1) and the status of the bank teller is changed to busy (S = 0). In the figure, "BEGIN" sets Q = 0, E = 0 and S = 1; "ENTER" (scheduled with delay [3,8]) sets Q = Q + 1; "START" sets S = S − 1 and Q = Q − 1; and "LEAVE" (scheduled with delay [4,6]) sets S = S + 1 and E = E + 1.

Figure 2. Simple Queuing Model

The conditions on the edges in Figure 2 are based on the state of the system. The condition S > 0 checks that the bank teller is available to serve the next customer in line. If a customer arrives at the bank and the bank teller is not busy, the customer is served immediately.
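The state transitions of Figure 2 can be written down directly; a sketch of the four event routines over the state (Q, S, E), where the event and variable names are the figure's and the dictionary encoding is an assumption:

```python
def begin(s):            # initialize the state variables
    s.update(Q=0, S=1, E=0)

def enter(s):            # a customer joins the queue (scheduled every [3,8])
    s["Q"] += 1

def start(s):            # service begins; the edge condition S > 0 must hold
    s["Q"] -= 1
    s["S"] -= 1

def leave(s):            # service ends after a [4,6] delay
    s["S"] += 1
    s["E"] += 1

s = {}
for event in (begin, enter, start, leave):   # one customer straight through
    event(s)
print(s)   # → {'Q': 0, 'S': 1, 'E': 1}
```

After one customer passes through, the queue is empty, the teller is idle again, and the exit count is one; the terminating condition used below (E = 2) would require a second pass.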
Using the example model above, the author wants to demonstrate how the approximation of the probability of each event sequence is done. To keep the example simple, the QDES model is set to terminate after two customers have exited, so the terminating condition is reached when the number of exits equals two (E = 2). A total of 11 threads are generated by the Ingalls (1999) QDES algorithm with this terminating condition. Figure 3 shows the event sequences of each thread [Ingalls 1999]. Threads 3, 5, 6 and 7 will not be considered in this research. This is because at node 14, the event Enter[6,6] has an exact execution time, and at node 11, Enter[12,12] also has an exact execution time. In probability theory, the probability that event Enter will execute at any exact time x is zero, for all real numbers x. Since there is no information gain from considering these threads, they are eliminated from this example. Figure 4 shows the event sequences for the simple queuing model with threads 3, 5, 6 and 7 eliminated.

3.3 Discovery of Invalid Threads

In this section, each of the remaining threads from Figure 4 is reviewed and checked for delay constraint validity. Starting with thread 1, events Begin[0,0], Enter[0,0] and Start[0,0] have an execution time of [0,0]. There is no need to check delay constraint validity for events that have a zero execution time, as the delay time from the scheduling events to these events is equal to [0,0].

Figure 3. Event sequences of the simple queuing model

Figure 4.
Event sequences for the simple queuing model with threads 3, 5, 6 and 7 eliminated

Another exception to checking delay constraint validity is when the scheduling event has a zero time-based delay. Recall that a newly scheduled event's delay time is calculated by adding the time-based delay, which is also represented as a time interval, to the time interval of the simulation clock. If the time-based delay is equal to [0,0], then the scheduled event is to be executed immediately, so there is no need to check delay constraint validity. The next event on thread 1 is Enter[3,6] (node 4); it is scheduled by Enter[0,0] (node 2) with the delay constraint [3,8] (refer to Figure 5). In order for the thread to be valid (at least up to node 4), the delay time from node 2 to node 3, added to the delay time from node 3 to node 4, should be within [3,8]. In other words, the total elapsed time from node 2 to node 4 has to fall within [3,8]. Since [0,0] + [3,6] = [3,6] and [3,6] is within the delay constraint [3,8], the execution of thread 1 up to node 4 is considered valid, as [3,6] ∩ [3,8] = [3,6] ≠ ∅. Recall from interval arithmetic that [a,b] − [c,d] = [(a − d), (b − c)]; since an elapsed time cannot be negative, a negative lower bound is truncated to zero. The delay constraint validity check is performed for all the subsequent nodes of thread 1, and for all other threads as well, as shown below.

Figure 5.
Checking delay constraint validity for Enter[3,6]

Thread 1
Note: Node 6 is omitted because there is a zero time delay from node 5 to node 6.
Leave[4,6] (node 5) is scheduled by Start[0,0] (node 3) with delay constraint [4,6]:
([3,6] − [0,0]) + ([4,6] − [3,6]) = [3,6] + [0,3] = [3,9]
Since [3,9] ∩ [4,6] = [4,6] ≠ ∅, thread 1 is valid up to node 5. Valid!
Enter[6,12] (node 7) is scheduled by Enter[3,6] (node 4) with delay constraint [3,8]:
([4,6] − [3,6]) + ([6,12] − [4,6]) = [0,3] + [0,8] = [0,11]
Since [0,11] ∩ [3,8] = [3,8] ≠ ∅, thread 1 is valid up to node 7. Valid!
Leave[8,12] (node 8) is scheduled by Start[4,6] (node 6) with delay constraint [4,6]:
([6,12] − [4,6]) + ([8,12] − [6,12]) = [0,8] + [0,6] = [0,14]
Since [0,14] ∩ [4,6] = [4,6] ≠ ∅, thread 1 is valid up to node 8. Valid!

Thread 2
Enter[9,12] (node 9) is scheduled by Enter[6,12] (node 7) with delay constraint [3,8]:
[9,12] − [6,12] = [0,6]
Since [0,6] ∩ [3,8] = [3,6] ≠ ∅, thread 2 is valid up to node 9. Valid!
Leave[9,12] (node 10) is scheduled by Start[4,6] (node 6) with delay constraint [4,6]:
([9,12] − [9,12]) + [3,6] (node 9's delay bound) + ([6,12] − [4,6]) = [0,3] + [3,6] + [0,8] = [3,17]
Since [3,17] ∩ [4,6] = [4,6] ≠ ∅, thread 2 is valid up to node 10. Valid!

Thread 4
Leave[8,12] (node 13) is scheduled by Start[4,6] (node 6) with delay constraint [4,6]:
[8,12] − [4,6] = [2,8]
Since [2,8] ∩ [4,6] = [4,6] ≠ ∅, thread 4 is valid up to node 13. Valid!

Thread 8
Note: Node 24 is omitted because there is a zero time delay from node 23 to node 24.
Leave[4,6] (node 22) is scheduled by Start[0,0] (node 3) with delay constraint [4,6]:
[4,6] − [0,0] = [4,6]
Since [4,6] ∩ [4,6] = [4,6] ≠ ∅, thread 8 is valid up to node 22. Valid!
Enter[4,8] (node 23) is scheduled by Enter[0,0] (node 2) with delay constraint [3,8]:
([4,8] − [4,6]) + [4,6] = [0,4] + [4,6] = [4,10]
Since [4,10] ∩ [3,8] = [4,8] ≠ ∅, thread 8 is valid up to node 23. Valid!
Enter[7,14] (node 25) is scheduled by Enter[4,8] (node 23) with delay constraint [3,8]:
[7,14] − [4,8] = [0,10]
Since [0,10] ∩ [3,8] = [3,8] ≠ ∅, thread 8 is valid up to node 25. Valid!
Leave[8,14] (node 26) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]:
([8,14] − [7,14]) + [3,8] = [0,7] + [3,8] = [3,15]
[3,8] is the delay bound for node 25. Since there is a zero time delay from node 23 to node 24, the delay bound from node 23 to node 25 can also be extended from node 24 to node 25.
Since [3,15] ∩ [4,6] = [4,6] ≠ ∅, thread 8 is valid up to node 26. Valid!

Thread 11
Leave[8,14] (node 31) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]:
[8,14] − [4,8] = [0,10]
Since [0,10] ∩ [4,6] = [4,6] ≠ ∅, thread 11 is valid up to node 31. Valid!

Thread 9
Enter[10,14] (node 27) is scheduled by Enter[7,14] (node 25) with delay constraint [3,8]:
[10,14] − [7,14] = [0,7]
Since [0,7] ∩ [3,8] = [3,7] ≠ ∅, thread 9 is valid up to node 27. Valid!
Leave[10,14] (node 28) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]:
[3,8] (delay bound from node 25) + [3,7] (delay bound from node 27) + ([10,14] − [10,14]) = [3,8] + [3,7] + [0,4] = [6,19]
Since [6,19] ∩ [4,6] = [6,6] and P([6,6]) = 0, thread 9 is not valid. Invalid!

Thread 10
Enter[13,14] (node 29) is scheduled by Enter[10,14] (node 27) with delay constraint [3,8]:
[13,14] − [10,14] = [0,4]
Since [0,4] ∩ [3,8] = [3,4] ≠ ∅, thread 10 is valid up to node 29. Valid!
Leave[13,14] (node 30) is scheduled by Start[4,8] (node 24) with delay constraint [4,6]:
[3,8] (delay bound from node 25) + [3,7] (delay bound from node 27) + [3,4] (delay bound from node 29) + ([13,14] − [13,14]) = [3,8] + [3,7] + [3,4] + [0,1] = [9,20]
Since [9,20] ∩ [4,6] = ∅, thread 10 is not valid. Invalid!

A validity check needs to be performed only once for each node; once a node has been checked and verified as valid, it does not need to be checked again.
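All of the checks above follow one recipe: accumulate elapsed-time intervals back to the scheduling event, intersect with the delay constraint, and reject empty or zero-width (zero-probability) intersections. A sketch, where the clamping of negative lower bounds to zero is inferred from the worked computations and the helper names are hypothetical:

```python
def elapsed(a, b):
    """Elapsed time from event interval b to event interval a:
    [a-, a+] - [b-, b+] = [a- - b+, a+ - b-], with the lower bound
    clamped at zero since elapsed time cannot be negative."""
    return (max(0, a[0] - b[1]), a[1] - b[0])

def iadd(a, b):
    """Interval addition of elapsed-time intervals."""
    return (a[0] + b[0], a[1] + b[1])

def constraint_valid(total, bound):
    """Valid only if the intersection is an interval of positive width;
    an empty or single-point intersection has probability zero."""
    lo, hi = max(total[0], bound[0]), min(total[1], bound[1])
    return lo < hi

# Thread 9, node 28: Leave[10,14] scheduled by Start[4,8], constraint [4,6]
total = iadd(iadd((3, 8), (3, 7)), elapsed((10, 14), (10, 14)))
print(total)                              # → (6, 19)
print(constraint_valid(total, (4, 6)))    # → False: intersection is [6,6]
```

Running the same recipe over the node-28 numbers reproduces the Invalid! verdict for thread 9 above.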
Once a node has been verified as valid, the intersection of the delay constraint with the total elapsed time interval (from the scheduling event to its scheduled event) becomes a delay bound that can be used in the calculation of the total elapsed time for subsequent nodes in the same thread. For example, at node 10, instead of using the delay time for node 9, which equals [9,12] − [6,12] = [0,6], the delay bound of [3,6] is used in the calculation of the total elapsed time for node 10. This is because it was calculated at node 9 that the minimum total elapsed time from node 7 to node 9 is 3 time units and the maximum total elapsed time is 6. Thus, the information gained from checking delay constraints at preceding nodes needs to be used in the validity checks for subsequent nodes. The delay bound for a node can be extended to other nodes under certain circumstances. This is especially true for nodes with zero delay time. For example, node 23 has a zero delay time, so the delay bound for a (scheduled) node that has node 23 as its scheduling node can be extended to the immediate successor node following node 23. In this case, the delay bound is extended from node 23 to node 24. Based on the verification performed on each node, the QSGM output of the Simple Queuing Model has two invalid threads, threads 9 and 10. These threads will be deleted from the QSGM output and will not be considered further in this research. There are a total of five valid threads remaining in the Simple Queuing Model, as shown in Figure 6.

Figure 6.
The Remaining Valid Threads from QSGM Output

3.4 Delay Constraint Algorithm

In this section, the algorithm created to check and verify delay constraints on a QSGM output, using the methodology described in Section 3.3, is presented. The algorithm checks the validity of the threads one by one, visiting each node only once. Recall that L is an ordered set, L = {(x1,γ1,v1,e1,a1,b1), (x2,γ2,v2,e2,a2,b2), …}, where xi represents the execution time, γi represents the execution priority, vi is the vertex to be executed, ei is the index of the edge that is being scheduled, ai are the values of the edge attributes, and bi are the times at which events are scheduled for the ith event notice. If γi = 1, the event has a high priority and will be executed immediately. The sets and variables used in the algorithm are defined as follows:

Let T = {threads from the QSGM output}. For the simple queuing model example, T = {1, 2, 8, 9, 10, 11}.
Let Lt = {the events from thread t} = {1, 2, 3, …}; Lt is an ordered set according to the event sequence of thread t in the QSGM output, so that, for example, l1 = (x1,γ1,v1,e1,a1,b1) and l2 = (x2,γ2,v2,e2,a2,b2).
Let pos_t(l) = the position of event l in thread t.
Let md(li, lj) = the delay (as a temporal interval) from event li to event lj.
Let d′ = the accumulated elapsed time.

The algorithm loops through all the threads one by one, checking the delay constraint validity of each event in a thread, starting from the first event in the thread.
The first task of the algorithm is to loop through all the events in the thread and determine the total elapsed time between each event and its scheduling event. If the total elapsed time falls within the delay constraint for event l, then the execution of event l is considered valid. If the current event l has a zero execution time or a zero delay time, then the delay between l and its scheduling event is equal to [0,0]. If the current event l has a nonzero execution time or a nonzero delay time, the algorithm proceeds to check delay constraint validity. In order to determine the total elapsed time between the current event l and its scheduling event, the algorithm loops through each event (node) between the current event l and its scheduling event in reverse (starting from the last event) to check whether there is a delay bound. If there is no delay bound for the current node, the delay time between the current node and its predecessor node is calculated: if the current node has an execution time interval [a,b] and its predecessor node has an execution time interval [c,d], then the delay time between them is [(a − d), (b − c)]. The algorithm then proceeds to the next event. If there is a delay bound for the current node, we determine whether the position of the delay bound falls between the positions of the current event l and its scheduling event. For a delay bound that meets this criterion, the delay time between the current node and its scheduling node is retrieved from the variable md(li, lj). The algorithm then proceeds to the event before that scheduling node and continues checking delay constraint validity. Because we skip the nodes between the current node and the current node's scheduling event, we do not check whether those nodes have delay bounds. The search for delay bounds starts from the node right before the scheduled event, namely the scheduled event's predecessor.
More than one delay bound may exist between the scheduled event and the scheduling event, and the delay bounds may intersect one another. In the algorithm proposed here, not all delay bounds are considered, because the search for delay bounds follows the event sequence in reverse, starting from the scheduled event. Thus, the delay constraint checking algorithm proposed in this dissertation does not check the delay constraints for every combination of bounds that exists in a thread. The consequence of this approach for calculating the total elapsed time is that a thread determined to be valid by this approach may in fact be invalid under an exhaustive check, precisely because not all combinations of delay bounds are considered when calculating the total elapsed time between the current event l and its scheduling event. This situation did not arise in the Simple Queuing System model, however, since its delay bounds do not intersect and all of them are considered when calculating the total elapsed time for an event in a thread. The second task of the delay constraint algorithm is to determine the intersection of the total elapsed time with the delay constraint (d' ∩ b_l). If the intersection is an empty set or a zero-probability time interval, the delay constraint for that particular thread is not valid and the thread is deleted from the QSGM output. The remainder of the algorithm deals with storing the delay bounds in the variable md(l_i, l_j). The algorithm is shown below; comments on a statement in the algorithm appear in italics under the statement.
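Before the full listing, the core of this check, accumulating the elapsed time d' by interval subtraction and intersecting it with the delay constraint b_l, can be sketched in simplified form (delay bounds md and priority flags are omitted; the event record layout is illustrative, not the dissertation's):

```python
def thread_valid(events):
    """Simplified delay-constraint check for one thread.

    events: list of dicts, in execution order, with
      'x'     -- execution time interval (lo, hi)
      'b'     -- delay constraint interval (lo, hi)
      'sched' -- index of the event's scheduling event
    """
    for i, ev in enumerate(events):
        if ev['x'] == (0, 0) or ev['b'] == (0, 0):
            continue  # zero execution/delay time: delay is [0,0], trivially valid
        d = (0, 0)
        # accumulate d' backwards from event i to its scheduling event
        for p in range(i, ev['sched'], -1):
            a, b = events[p]['x']
            c, dd = events[p - 1]['x']
            d = (d[0] + (a - dd), d[1] + (b - c))
        # intersect d' with the delay constraint b_l
        lo = max(d[0], ev['b'][0])
        hi = min(d[1], ev['b'][1])
        if lo >= hi:  # empty or zero-probability interval
            return False
    return True
```

An event whose accumulated elapsed time misses its delay constraint invalidates the whole thread, mirroring T = T \ {t} in the listing below.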
Delay Constraint Algorithm

For each t ∈ T
    (This loop assigns the position of each event in thread t to pos_t(l))
    counter = 0
    For each l ∈ L_t
        counter = counter + 1
        pos_t(l) = counter
    Next l
Next t

For each t ∈ T
    Initialize M = ∅, V = ∅, md(i,j) = 0 for all defined (i,j)'s
    For each l ∈ L_t, in order pos_t(l) = 1, 2, ...
        If x_l ≠ [0,0] or b_l ≠ [0,0] then
            (If the current event l has zero execution time and zero delay time, the delay between l and its scheduling event is [0,0] as well)
            d' = [0,0]
            For p = pos_t(l) down to pos_t(z_l) + 1
                (Loop in reverse through all the nodes between event l and its scheduling event's successor)
                If a delay bound md(z_l', l') is defined for the event l' with pos_t(l') = p then
                    (A delay bound exists at position p between l and its scheduling event)
                    d' = d' + md(z_l', l')
                    Go to p = pos_t(z_l') + 1
                Else if b_l' = [0,0], where pos_t(l') = p, then
                    d' = d' + [0,0]
                Else
                    x1 = x_l', where pos_t(l') = p
                    x2 = x_l'', where pos_t(l'') = p − 1
                    d' = d' + x1 − x2
                    (x1 − x2 is the delay between the events at positions p and p − 1)
                End if
            Next p
            If d' ∩ b_l = ∅ or d' ∩ b_l = [i,i], i ∈ ℝ then
                (The intersection of the total elapsed time with the delay constraint is an empty set or a zero-probability interval)
                T = T \ {t}
                Go to next t
            End if
            If γ_l' = 1, where pos_t(l') = pos_t(l) + 1, then
                (The event following l carries a high priority flag, meaning its execution time is the same as event l's execution time)
                md(z_l, l) = d' ∩ b_l
                md(z_l', l') = d' ∩ b_l, where pos_t(l') = pos_t(l) + 1
                (Assign new delay bounds with zero added delay)
            Else
                md(z_l, l) = d' ∩ b_l
            End if
        End if
    Next l
Next t

CHAPTER 4: PROBABILITY OF THREAD OCCURRENCE

4.1 Introduction

The probabilistic approach to creating a statistical analysis for a QDES model depends on finding a way to calculate the probability of a thread occurring in the model, the probability of an event
occurring in a thread, and the probability of an event executing first whenever there is more than one event in the NOS. In this chapter, we present the method for calculating these probabilities and the associated algorithm that can be programmed and implemented in the QDES framework.

4.2 Probability of an Event Executing First

To estimate the probability of each individual thread, a minimal assumption is required: all event delay times are assumed to be uniformly distributed. The simulation clock is initialized to [0,0]. Three events are scheduled to execute at time [0,0]: BEGIN[0,0], ENTER[0,0] and START[0,0], in that order (refer to Figure 6). The instant event ENTER[0,0] is executed, it schedules another ENTER event to execute [3,8] time units later. Event START[0,0] is then at the top of the future event calendar, so it is executed next. The instant START[0,0] is executed, it schedules a LEAVE event to execute [4,6] time units later. At this point there are two events in the future event calendar: ENTER[3,8] and LEAVE[4,6]. Given the uniform distribution assumption on all the events, the event time intervals can be sectioned into predetermined time intervals, and the probability of the event executing in each section can be calculated. Without loss of generality, we assume the intervals are sectioned into one-unit time intervals. For example, if the time interval for event LEAVE[4,6] is sectioned into one-unit intervals, then LEAVE[4,6] has a 0.5 probability of executing in either [4,5] or [5,6]. ENTER[3,8], on the other hand, has a 0.2 probability of executing in each of [3,4], [4,5], [5,6], [6,7] and [7,8] (as shown in Figure 7).

             [3,4]  [4,5]  [5,6]  [6,7]  [7,8]
LEAVE[4,6]           0.5    0.5
ENTER[3,8]    0.2    0.2    0.2    0.2    0.2

Figure 7.
Probability distribution for events LEAVE[4,6] and ENTER[3,8]

The exact timing of these two events is uncertain, and the order of event execution is uncertain because the execution times overlap. The QDES algorithm therefore spawns two threads: one assumes that event ENTER[3,8] executes first, and the second assumes that event LEAVE[4,6] executes first. If event ENTER[3,8] executes first, it must execute in [3,6], so that the next imminent event LEAVE can execute next in [4,6]. On the other hand, if event LEAVE[4,6] executes first in [4,6], event ENTER[3,8] will execute next in [4,8]. Notice how the execution time interval for the next event changes as a result of the uncertain event execution times and the uncertain order of execution. With the uniform probability information on the two events, the probability of each event executing first can be calculated, as shown below. The probability of an event executing in an interval [i,j] is denoted P(event ∈ [i,j]).

Question: What is the probability of event LEAVE executing first?

P(LEAVE executes first in [4,5]) = P(LEAVE ∈ [4,5]) * ((1/2)*P(ENTER ∈ [4,5]) + P(ENTER ∈ [5,8]))
                                 = 0.5 * (0.5*0.2 + 0.6) = 0.35

P(LEAVE executes first in [5,6]) = P(LEAVE ∈ [5,6]) * ((1/2)*P(ENTER ∈ [5,6]) + P(ENTER ∈ [6,8]))
                                 = 0.5 * (0.5*0.2 + 0.4) = 0.25

Question: What is the probability of event ENTER executing first?
P(ENTER executes first in [3,4]) = P(ENTER ∈ [3,4]) = 0.2

P(ENTER executes first in [4,5]) = P(ENTER ∈ [4,5]) * ((1/2)*P(LEAVE ∈ [4,5]) + P(LEAVE ∈ [5,6]))
                                 = 0.2 * (0.5*0.5 + 0.5) = 0.15

P(ENTER executes first in [5,6]) = P(ENTER ∈ [5,6]) * ((1/2)*P(LEAVE ∈ [5,6]))
                                 = 0.2 * (0.5*0.5) = 0.05

As a result of the above calculations,

P(LEAVE executes first in [4,6]) = P(LEAVE executes first in [4,5]) + P(LEAVE executes first in [5,6])
                                 = 0.35 + 0.25 = 0.6

P(ENTER executes first in [3,6]) = P(ENTER executes first in [3,4]) + P(ENTER executes first in [4,5]) + P(ENTER executes first in [5,6])
                                 = 0.2 + 0.15 + 0.05 = 0.4

Figure 8 shows the results so far.

P{LEAVE[4,6] executes first in interval (i,j)}
(i,j)    (4,5)  (5,6)  Total
         0.35   0.25   0.6

P{ENTER[3,6] executes first in interval (i,j)}
(i,j)    (3,4)  (4,5)  (5,6)  Total
         0.2    0.15   0.05   0.4

Figure 8. The probabilities of events LEAVE[4,6] and ENTER[3,8] executing first, respectively

4.3 Conditional Probability

If we assume that LEAVE[4,6] is executed first, the probability of event ENTER executing next in [4,5], [5,6], [6,7] and [7,8] can be calculated. Conditioning on when LEAVE[4,6] is executed first (in this case there are two conditions, namely (1) LEAVE executes first in [4,5] and (2) LEAVE executes first in [5,6]), the probabilities of event ENTER executing next are calculated as shown below.

P(ENTER in [4,8] | LEAVE 1st in [4,5]) = P(ENTER 2nd in [4,5] | LEAVE 1st in [4,5])
                                       + P(ENTER 2nd in [5,6] | LEAVE 1st in [4,5])
                                       + P(ENTER 2nd in [6,7] | LEAVE 1st in [4,5])
                                       + P(ENTER 2nd in [7,8] | LEAVE 1st in [4,5])

P(ENTER in [5,8] | LEAVE 1st in [5,6]) = P(ENTER 2nd in [5,6] | LEAVE 1st in [5,6])
                                       + P(ENTER 2nd in [6,7] | LEAVE 1st in [5,6])
                                       + P(ENTER 2nd in [7,8] | LEAVE 1st in [5,6])

If LEAVE executes in [4,5], then ENTER can execute next in [4,8]. Setting aside for a moment the fact that LEAVE was executed in [4,5], we can redistribute the probability of ENTER executing in [4,8] by looking at the original probability distribution of ENTER[3,8].
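The execute-first calculation of Section 4.2 can be reproduced mechanically (a sketch; unit-interval distributions are dicts from interval lower endpoints to probabilities, with a 50/50 split when both events fall in the same unit interval):

```python
def prob_first(a, b):
    # P(the event with distribution a executes before the event with
    # distribution b): for each interval of a, the other event must fall
    # in a later interval, or in the same interval with weight 1/2.
    total = 0.0
    for i, pa in a.items():
        same = b.get(i, 0.0)
        later = sum(pb for j, pb in b.items() if j > i)
        total += pa * (0.5 * same + later)
    return total

leave = {4: 0.5, 5: 0.5}                          # LEAVE[4,6]
enter = {3: 0.2, 4: 0.2, 5: 0.2, 6: 0.2, 7: 0.2}  # ENTER[3,8]
```

With these inputs, prob_first(leave, enter) comes out near 0.6 and prob_first(enter, leave) near 0.4, matching Figure 8.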
Since ENTER[3,8] is equally likely to execute in any interval within [3,8] with probability 0.2, ENTER remains equally likely to execute anywhere within [4,8], now with probability 0.25. The probability distribution for ENTER[4,8] is shown in the first row of Figure 9. The same reasoning applies to the case where LEAVE executes in [5,6]; the resulting probability distribution for ENTER[4,8] is shown in the second row of Figure 9.

                              [4,5]   [5,6]     [6,7]     [7,8]
If LEAVE executes in [4,5]    0.25    0.25      0.25      0.25
If LEAVE executes in [5,6]            0.333333  0.333333  0.333333

Figure 9. Redistribution of the original probability of occurrence for ENTER[4,8]

If LEAVE[4,6] executes in [4,5], the weight for ENTER executing second in [4,5] is half of 0.25: since ENTER has a 0.25 probability of being in [4,5], the weight is 0.5*0.25 = 0.125. Thus,

P{ENTER executes second in [4,5] | LEAVE executed in [4,5]} = 0.125 / (0.125 + 0.25*3) ≈ 0.142857

For ease of calculation, and to avoid redistributing the original probability of occurrence for ENTER[4,8], the following two calculations are equivalent to the one described above. The last of them is the simplified version that will be used in developing the probability algorithm in Section 4.8.3, Execution of the Probability Algorithm.

P(ENTER executes second in [4,5] | LEAVE executed first in [4,5]) = ((1/2)*0.2) / ((1/2)*0.2 + 3*0.2) ≈ 0.14286

P(ENTER executes second in [4,5] | LEAVE executed first in [4,5]) = (1/2) / ((1/2) + 3) ≈ 0.14286

The rest of the conditional probabilities follow similar calculations, except that we have to factor in the probability of ENTER not executing in the preceding time intervals.
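All of these conditional rows can be generated in one step with the simplified weighting just described (a sketch; the function name and data layout are illustrative):

```python
enter = {3: 0.2, 4: 0.2, 5: 0.2, 6: 0.2, 7: 0.2}  # ENTER[3,8]

def cond_dist(dist, p):
    # Conditional distribution of the remaining event executing second,
    # given the other event executed first in the unit interval starting
    # at p: restrict to intervals at or after p, give the shared interval
    # half weight, then renormalize.
    weights = {i: (0.5 * q if i == p else q)
               for i, q in dist.items() if i >= p}
    z = sum(weights.values())
    return {i: w / z for i, w in weights.items()}
```

cond_dist(enter, 4) gives {4: 0.142857..., 5: 0.285714..., 6: 0.285714..., 7: 0.285714...} and cond_dist(enter, 5) gives {5: 0.2, 6: 0.4, 7: 0.4}, matching the two rows of Figure 10.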
P(ENTER executes in [5,6] | LEAVE executed first in [4,5])
  = P(ENTER did not execute in [4,5] out of [4,8]) * P(ENTER in [5,6] out of [5,8])
  = (1 − 0.14286) * (0.2 / (3*0.2)) ≈ 0.28571

P(ENTER executes in [6,7] | LEAVE executed first in [4,5])
  = P(ENTER did not execute in [4,5] out of [4,8] or in [5,6] out of [5,8]) * P(ENTER in [6,7] out of [6,8])
  = (1 − 0.14286) * (1 − 0.2/(3*0.2)) * (0.2 / (2*0.2)) ≈ 0.28571

P(ENTER executes in [7,8] | LEAVE executed first in [4,5])
  = P(ENTER did not execute in [4,5] out of [4,8], [5,6] out of [5,8] or [6,7] out of [6,8]) * P(ENTER in [7,8] out of [7,8])
  = (1 − 0.14286) * (1 − 0.2/(3*0.2)) * (1 − 0.2/(2*0.2)) * 1 ≈ 0.28571

P(ENTER executes in [5,6] | LEAVE executed first in [5,6])
  = P(ENTER in [5,6] out of [5,8])
  = ((1/2)*0.2) / ((1/2)*0.2 + 2*0.2) = 0.2

P(ENTER executes in [6,7] | LEAVE executed first in [5,6])
  = P(ENTER did not execute in [5,6] out of [5,8]) * P(ENTER in [6,7] out of [6,8])
  = (1 − 0.2) * (0.2 / (2*0.2)) = 0.4

P(ENTER executes in [7,8] | LEAVE executed first in [5,6])
  = P(ENTER did not execute in [5,6] out of [5,8] or [6,7] out of [6,8]) * P(ENTER in [7,8] out of [7,8])
  = (1 − 0.2) * (1 − 0.2/(2*0.2)) * 1 = 0.4

The results so far are shown in Figure 10.

P{ENTER[4,8] executes second in interval (i,j) | LEAVE[4,6] executes first in (p,q)}
(p,q)    (4,5)    (5,6)    (6,7)    (7,8)    Total
(4,5)    0.14286  0.28571  0.28571  0.28571  1
(5,6)    0        0.2      0.4      0.4      1

Figure 10. Probability of occurrence for ENTER[4,8] given that LEAVE[4,6] was executed first

With all the information so far, we can now calculate the probability of ENTER executing next in [4,5], [5,6], [6,7] and [7,8]. The probability distribution of ENTER[4,8] is given in Figure 11.
P(ENTER executes in [4,5]) = P(ENTER in [4,5] | LEAVE executes first in [4,5]) * P(LEAVE executes first in [4,5])
  = 0.142857 * 0.35/(0.35 + 0.25) ≈ 0.08333

P(ENTER executes in [5,6]) = {P(ENTER in [5,6] | LEAVE executes first in [4,5]) * P(LEAVE executes first in [4,5])}
  + {P(ENTER in [5,6] | LEAVE executes first in [5,6]) * P(LEAVE executes first in [5,6])}
  = 0.28571 * 0.35/(0.35 + 0.25) + 0.2 * 0.25/(0.35 + 0.25) ≈ 0.25

P(ENTER executes in [6,7]) = {P(ENTER in [6,7] | LEAVE executes first in [4,5]) * P(LEAVE executes first in [4,5])}
  + {P(ENTER in [6,7] | LEAVE executes first in [5,6]) * P(LEAVE executes first in [5,6])}
  = 0.28571 * 0.35/(0.35 + 0.25) + 0.4 * 0.25/(0.35 + 0.25) ≈ 0.3333

P(ENTER executes in [7,8]) = {P(ENTER in [7,8] | LEAVE executes first in [4,5]) * P(LEAVE executes first in [4,5])}
  + {P(ENTER in [7,8] | LEAVE executes first in [5,6]) * P(LEAVE executes first in [5,6])}
  = 0.285714 * 0.35/(0.35 + 0.25) + 0.4 * 0.25/(0.35 + 0.25) ≈ 0.3333

P{ENTER[4,8] executes in interval (i,j)}
(i,j)    (4,5)     (5,6)  (6,7)     (7,8)     Total
         0.083333  0.25   0.333333  0.333333  1

Figure 11. Probability of occurrence for ENTER[4,8]

4.4 Probability Distribution for a New Event

4.4.1 Generating a New Probability Distribution

Assume that the simulation proceeds with the current simulation clock at [4,8]; ENTER[4,8] has just executed. As soon as ENTER[4,8] executes, it schedules a new START event to happen immediately, since the server is available and no other customer is waiting for the server. START[4,8] can execute immediately in [4,8] with the same probability distribution as ENTER[4,8] (refer to node 24 in Figure 6). A new ENTER event is also scheduled by ENTER[4,8] to happen [3,8] time units later, which leads to ENTER[7,16]. START[4,8] schedules a LEAVE event to execute [4,6] time units after time interval [4,8], which then leads to LEAVE[8,14].
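The law-of-total-probability step that produced Figure 11 can be checked numerically, using the conditional rows of Figure 10 and the execute-first weights of Figure 8 (a sketch):

```python
# Conditional rows from Figure 10 and execute-first weights from Figure 8
cond = {
    4: {4: 1 / 7, 5: 2 / 7, 6: 2 / 7, 7: 2 / 7},  # LEAVE first in [4,5]
    5: {5: 0.2, 6: 0.4, 7: 0.4},                  # LEAVE first in [5,6]
}
w = {4: 0.35, 5: 0.25}
z = sum(w.values())

# marginal[i] = sum over conditions p of P(ENTER in i | p) * P(p | LEAVE first)
marginal = {}
for p, row in cond.items():
    for i, q in row.items():
        marginal[i] = marginal.get(i, 0.0) + q * w[p] / z
```

The result is approximately {4: 0.0833, 5: 0.25, 6: 0.3333, 7: 0.3333}, matching Figure 11.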
Events ENTER[7,16] (refer to node 25 in Figure 12) and LEAVE[8,14] (refer to node 31 in Figure 12) are currently on the future event calendar. Because their execution times overlap, they create a nondeterministically ordered set (NOS).

Figure 12. Events ENTER[7,14] and LEAVE[8,14]

ENTER[7,16] is scheduled to occur [3,8] time units after [4,8], which leads to the time interval [7,16]. The delay of [3,8] is assumed to follow a uniform distribution, but the time interval [4,8] from ENTER[4,8] is not uniform. The probability distribution for ENTER[7,16] can be obtained by adding the distribution of ENTER[4,8] to the distribution of Uniform[3,8].

ENTER[4,8]            UNIFORM[3,8]
[4,5]  0.083333       [3,4]  0.2
[5,6]  0.25           [4,5]  0.2
[6,7]  0.333333       [5,6]  0.2
[7,8]  0.333333       [6,7]  0.2
                      [7,8]  0.2

Table 1. Probability distributions for events ENTER[4,8] and Uniform[3,8]

Table 1 shows the probability distribution for ENTER[4,8] and the uniform distribution [3,8]. Adding these two distributions is equivalent to drawing one sample from each distribution and adding the sample values together to get a final value, x; the addition of the ENTER[4,8] and Uniform[3,8] probability distributions creates the probability distribution of this final value, x. To get the probability distribution of ENTER[4,8] + Uniform[3,8], we simply multiply the ENTER[4,8] column (refer to Table 1) with the Uniform column. For example, P(E[4,5] + U[3,4]) = P(ENTER[7,9]) = 0.083333 * 0.2 ≈ 0.016667. Table 2 shows the resulting probability distribution. The E[4,5]+U column holds the probability values for ENTER [7,9], [8,10], [9,11], [10,12] and [11,13], which result from multiplying the probability of ENTER[4,5] with the Uniform probabilities of [3,4], [4,5], [5,6], [6,7] and [7,8], respectively. To get the probability of ENTER executing in a certain interval, the probabilities attributed to that interval are summed.
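This add-and-reduce construction can be sketched as two small helpers (a sketch; dicts map interval lower endpoints to probabilities, and the uniformity-within-interval assumption of the text is built into the reduction step):

```python
def add_distributions(a, b):
    # Multiply out the two unit-interval distributions; each product
    # lands on a width-2 interval [i + j, i + j + 2].
    out = {}
    for i, p in a.items():
        for j, q in b.items():
            out[i + j] = out.get(i + j, 0.0) + p * q
    return out

def reduce_to_unit(dist2):
    # Split each width-2 interval [k, k+2] evenly into [k, k+1] and
    # [k+1, k+2], assuming uniformity within the interval.
    out = {}
    for k, p in dist2.items():
        out[k] = out.get(k, 0.0) + p / 2
        out[k + 1] = out.get(k + 1, 0.0) + p / 2
    return out

enter48 = {4: 1 / 12, 5: 0.25, 6: 1 / 3, 7: 1 / 3}  # ENTER[4,8], Figure 11
u38 = {3: 0.2, 4: 0.2, 5: 0.2, 6: 0.2, 7: 0.2}      # Uniform[3,8] delay
enter716 = reduce_to_unit(add_distributions(enter48, u38))
```

Applied to ENTER[4,8] and Uniform[3,8], the reduced distribution reproduces Table 3 (for example, 0.008333 on [7,8] and 0.041667 on [8,9]).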
For example,

P(ENTER[8,10]) = P(E[4,5]+U[4,5]) + P(E[5,6]+U[3,4]) ≈ 0.016667 + 0.05 ≈ 0.066667

E[4,5]+U  P{E[4,5]+U}  E[5,6]+U  P{E[5,6]+U}  E[6,7]+U  P{E[6,7]+U}  E[7,8]+U  P{E[7,8]+U}
[7,9]     0.016667     [8,10]    0.05         [9,11]    0.066667     [10,12]   0.066667
[8,10]    0.016667     [9,11]    0.05         [10,12]   0.066667     [11,13]   0.066667
[9,11]    0.016667     [10,12]   0.05         [11,13]   0.066667     [12,14]   0.066667
[10,12]   0.016667     [11,13]   0.05         [12,14]   0.066667     [13,15]   0.066667
[11,13]   0.016667     [12,14]   0.05         [13,15]   0.066667     [14,16]   0.066667

Table 2. Probability distribution of ENTER[7,16]

The resulting intervals are of length 2: in the E[4,5]+U column they are [7,9], [8,10], [9,11], [10,12] and [11,13], each with the same probability of 0.016667. The researcher chooses to keep a fixed interval length by reducing the interval length to one time unit. If the interval length were left as is, it would keep increasing every time probability distributions are added as the simulation continues. The interval is therefore reduced to one time unit by assuming that each of the resulting intervals [7,9], [8,10], [9,11], [10,12] and [11,13] follows a uniform distribution. If the probability of ENTER[8,10], which equals 0.066667, is evenly distributed throughout [8,10], then ENTER can execute in [8,9] or [9,10], each with probability 0.066667/2 ≈ 0.033333. Applying the same assumption to all other intervals, including ENTER[7,9] (which has probability 0.016667), the probability of ENTER[8,9] is the sum of half the probability from [8,10] and half the probability from [7,9]:

P(ENTER[8,9]) ≈ 0.066667/2 + 0.016667/2 ≈ 0.033333 + 0.008333 ≈ 0.041667

Table 3 shows the probability distribution of ENTER[7,16] after the reduction of the interval length to one time unit.
ENTER[7,16]   Probability
[7,8]         0.008333
[8,9]         0.041667
[9,10]        0.1
[10,11]       0.166667
[11,12]       0.2
[12,13]       0.191667
[13,14]       0.158333
[14,15]       0.1
[15,16]       0.033333

Table 3. Probability distribution for ENTER[7,16] with an interval width of 1 time unit

The probability distribution of LEAVE[8,14] can be obtained by multiplying the START[4,8] column with the Uniform[4,6] column (i.e. the delay distribution). Figure 13 shows the multiplication of these two columns, the addition of the probabilities with the same time interval, and finally the probability distribution of LEAVE[8,14] after the reduction of the time interval to length one.

START[4,8]            UNIFORM[4,6]
[4,5]  0.083333       [4,5]  0.5
[5,6]  0.25           [5,6]  0.5
[6,7]  0.333333
[7,8]  0.333333

S[4,5]+U  P{S[4,5]+U}  S[5,6]+U  P{S[5,6]+U}  S[6,7]+U  P{S[6,7]+U}  S[7,8]+U  P{S[7,8]+U}
[8,10]    0.041666     [9,11]    0.125        [10,12]   0.166666     [11,13]   0.166666
[9,11]    0.041666     [10,12]   0.125        [11,13]   0.166666     [12,14]   0.166666

Interval  Probability      Interval  Probability
[8,10]    0.041666         [8,9]     0.020833
[9,11]    0.166666         [9,10]    0.104166
[10,12]   0.291666         [10,11]   0.229166
[11,13]   0.333333         [11,12]   0.3125
[12,14]   0.166666         [12,13]   0.25
Sum       1                [13,14]   0.083333
                           Sum       1

Figure 13. Probability distribution of LEAVE[8,14]

4.4.2 Generating Conditions for the Conditional Probability

The probability distributions generated in Section 4.4.1 carry important information about the probability of occurrence for the newly scheduled events ENTER[7,16] and LEAVE[8,14]. They assume that ENTER[7,16] and LEAVE[8,14] are independent of all previous events. This assumption is not valid when examining the probability of occurrence for sequential events. ENTER[7,16] and LEAVE[8,14] are exclusive (i.e. disjoint) events, which capture the intuition of non-compatible outcomes: non-compatible events cannot happen at the same time. This is not the same as independence. If A and B are disjoint and we know that A occurred, then we do know a lot about B.
Namely, we know that B cannot occur. Thus there is an interaction between events A and B: knowing whether A occurred influences the probability of B, which is not possible under independence. Independence captures the intuition of non-interaction and lack of information; in modeling it is often assumed rather than verified. The probability of ENTER[7,16] executing in a specific time interval (i.e. [7,8], ..., [15,16]) depends on two conditions:

1. The time interval in which ENTER[7,16]'s scheduling event is executed. Given that its scheduling event ENTER[4,8] executed in [4,5], ENTER[7,16] can execute in [7,9], [8,10], [9,11], [10,12] and [11,13], each with probability 0.2. This distribution with overlapping time intervals represents the conditional probability distribution of ENTER[7,16], given the time interval [4,5] for ENTER[4,8]. Note that it has the same shape as the delay distribution of [3,8]; recall that this delay distribution has a uniform probability of 0.2 in each of its one-unit time intervals. We know from Section 4.4.1 that

P(ENTER[7,16] is in [7,9]) = P(E[4,5] + U[3,4]) = P(ENTER[7,9]) ≈ 0.083333 * 0.2 ≈ 0.016667

and

P(ENTER[4,8] is in [4,5]) ≈ 0.083333.

Since

P(ENTER[7,16] is in [7,9]) = P(ENTER[7,16] is in [7,9] | ENTER[4,8] is in [4,5]) * P(ENTER[4,8] is in [4,5]),

it follows that

P(ENTER[7,16] is in [7,9] | ENTER[4,8] is in [4,5]) = P(ENTER[7,16] is in [7,9]) / P(ENTER[4,8] is in [4,5])
  ≈ (0.083333 * 0.2) / 0.083333 = 0.2.

If we assume that the probability of each two-unit time interval is evenly distributed within the interval, then, using the same interval probability reduction method discussed in Section 4.4.1, the conditional probability distribution of ENTER[7,16] executing in [7,8], [8,9], [9,10], [10,11], [11,12] and [12,13], given that its scheduling event ENTER[4,8] was executed in [4,5], equals 0.1, 0.2, 0.2, 0.2, 0.2 and 0.1, respectively.

2.
Whether ENTER[7,16] intersects with its immediately preceding event. The conditional probability distribution obtained in part 1 only takes into account the effect of ENTER[4,8]'s execution time on the probability of occurrence for ENTER[7,16]. If the immediately preceding event for ENTER[7,16] were its scheduling event ENTER[4,8], there would be no intersection of time intervals when calculating the conditional probability. In general, the conditional probability for ENTER[7,16] will be lower in the time intervals that intersect with the immediately preceding event; this is likely to be the case when ENTER[7,16] is one of the events in the NOS set. Here, the immediately preceding event for ENTER[7,16] is not its scheduling event ENTER[4,8] but START[4,8], which has exactly the same probability distribution as ENTER[4,8]. This means there is no intersection of time intervals when calculating the conditional probability for ENTER[7,16]. Thus, the conditional probability for ENTER[7,16] calculated in part 1 is unaffected and can be used for further analysis. Figure 14 shows the conditional probability for ENTER[7,16].

P{ENTER[7,16] executes in interval (i,j) | START[4,8] executes in (p,q)}
(p,q)    (7,8)  (8,9)  (9,10)  (10,11)  (11,12)  (12,13)  (13,14)  (14,15)  (15,16)
[4,5]    0.1    0.2    0.2     0.2      0.2      0.1
[5,6]           0.1    0.2     0.2      0.2      0.2      0.1
[6,7]                  0.1     0.2      0.2      0.2      0.2      0.1
[7,8]                          0.1      0.2      0.2      0.2      0.2      0.1

Figure 14. Conditional probability for ENTER[7,16]

If we were interested only in finding the probability of a thread occurring from the QSGM output, the conditional probability for ENTER[7,16] in Figure 14 would be sufficient. It is not sufficient, however, if we want to determine output statistics for the QSGM output, because the conditional probability for ENTER[7,16] in Figure 14 only carries information back to its scheduling event. Important information about any events preceding ENTER[7,16]'s scheduling event is not included.
If we were to stop the simulation at this point, we could calculate the probability of the thread up to ENTER[7,16], but we could not accurately calculate statistics such as server utilization. From Figure 14 we know that from [4,5] to [7,8] the server utilization is 1, as the server is busy during that time, and the conditional probability of this event is 0.1. But we could not determine the server utilization for the events before that, even though we know that the server is busy from [0,0] (at the start of event START[0,0]) to [4,6] (at the start of event LEAVE[4,6]). An average value for the server utilization could be estimated, but as the simulation continues, the estimated average would become inaccurate. Another way to get the information needed to calculate statistics is to generate all possible conditions for the conditional probability. Any event that executes after the first nonzero execution time event carries information about the timing of all preceding events. Each generated combination of the timings of these preceding events becomes a condition for the conditional probability. The conditions generated for the conditional probability of ENTER[7,16] are shown in the left column of Table 4. The number 5 of condition "5,4" represents the time interval [5,6] of the immediately preceding events, START[4,8] and ENTER[4,8], since they both have exactly the same distribution (if START is executed in [4,5], ENTER is also executed in [4,5]). The number 4 of condition "5,4" represents the time interval [4,5] of the second-to-last event, LEAVE[4,6]. The same conditions are generated for LEAVE[8,14]. The conditional probability for LEAVE[8,14] is shown in Table 5.
Conditional Probability for ENTER[7,16]
Condition  [7,8]  [8,9]  [9,10]  [10,11]  [11,12]  [12,13]  [13,14]  [14,15]  [15,16]
4,4        0.1    0.2    0.2     0.2      0.2      0.1
5,4               0.1    0.2     0.2      0.2      0.2      0.1
6,4                      0.1     0.2      0.2      0.2      0.2      0.1
7,4                              0.1      0.2      0.2      0.2      0.2      0.1
5,5               0.1    0.2     0.2      0.2      0.2      0.1
6,5                      0.1     0.2      0.2      0.2      0.2      0.1
7,5                              0.1      0.2      0.2      0.2      0.2      0.1

Table 4. Conditional probability for ENTER[7,16] with all possible combinations of conditions

Conditional Probability for LEAVE[8,14]
Condition  [8,9]  [9,10]  [10,11]  [11,12]  [12,13]  [13,14]
4,4        0.25   0.5     0.25
5,4               0.25    0.5      0.25
6,4                       0.25     0.5      0.25
7,4                                0.25     0.5      0.25
5,5               0.25    0.5      0.25
6,5                       0.25     0.5      0.25
7,5                                0.25     0.5      0.25

Table 5. Conditional probability for LEAVE[8,14] with all possible combinations of conditions

As can be seen from Table 4 and Table 5, the conditional probabilities for ENTER[7,16] and LEAVE[8,14] are not affected by what happens before their scheduling events. The delay distribution and the scheduling event's distribution determine the timing of the newly scheduled event and its probability distribution. The probability of occurrence obtained for the new event will be lower in the time intervals where it intersects with its immediately preceding event. If the immediately preceding event for the new event happens to be its scheduling event, there is no intersection of time intervals; in that case, the probability of occurrence obtained from the delay distribution and the scheduling event's distribution remains unchanged. The probability of each condition occurring can be obtained by normalizing the conditional probability in Figure 10, as shown in Table 6. The probability of each condition occurring is needed when calculating the probability of a NOS event executing first.

Condition  Probability
4,4        0.083333
5,4        0.166667
6,4        0.166667
7,4        0.166667
5,5        0.083333
6,5        0.166667
7,5        0.166667

Table 6.
Probability of the conditions that are generated

The method for calculating the probability of each NOS event executing first is essentially the same as before. Instead of using the probability distribution computed from the uniform delay distribution and the scheduling event's probability distribution, the conditional probability of each NOS event occurring in a given time interval is used, for each condition that is generated, in calculating the probability of each event executing first. The algorithm that determines the probability of an event executing first in a given time interval is therefore constructed so that it is repeatable for each generated condition and for each defined time interval. The probabilities of ENTER[7,14] executing first and of LEAVE[8,14] executing first are computed and given in Table 7 and Table 8, respectively. ENTER[7,14] has a probability of 0.00833 of executing first in the time interval [7,8], 0.03958 in [8,9], 0.08542 in [9,10], and so on. These probabilities are obtained by summing, over the conditions, the product of the condition probability in Table 6 with the conditional probability of ENTER[7,14] executing first in each time interval. Combining the probabilities from all these time intervals, ENTER[7,14] has a probability of 0.4 of executing first in [7,14]; LEAVE[8,14], on the other hand, has a probability of 0.6 of executing first in [8,14]. Notice that the probabilities for ENTER[7,14] and LEAVE[8,14] sum to one, as they are exclusive events.

ENTER[7,14] (Node 25)
Condition        7        8        9        10       11       12       13
4,4              0.1      0.175    0.1      0.025
5,4                       0.1      0.175    0.1      0.025
6,4                                0.1      0.175    0.1      0.025
7,4                                         0.1      0.175    0.1      0.025
5,5                       0.1      0.175    0.1      0.025
6,5                                0.1      0.175    0.1      0.025
7,5                                         0.1      0.175    0.1      0.025
P(Execute 1st)   0.00833  0.03958  0.08542  0.11875  0.09792  0.04167  0.00833   (total 0.4)
Normalized p     0.02083  0.09896  0.21354  0.29688  0.24479  0.10417  0.02083   (total 1)

Table 7.
Probability of ENTER[7,14] executing first

LEAVE[8,14] (Node 31)
Condition        8         9        10       11        12        13
4,4              0.2       0.3      0.1
5,4                        0.2      0.3      0.1
6,4                                 0.2      0.3       0.1
7,4                                          0.2       0.3       0.1
5,5                        0.2      0.3      0.1
6,5                                 0.2      0.3       0.1
7,5                                          0.2       0.3       0.1
P(Execute 1st)   0.016667  0.075    0.15     0.191667  0.133333  0.033333   (total 0.6)
Normalized p     0.027778  0.125    0.25     0.319444  0.222222  0.055556   (total 1)

Table 8. Probability of LEAVE[8,14] executing first

4.5 Probability Distribution for the Timing of an Event

In the process of calculating the probability of event occurrence, when there is more than one event in the NOS set, the probability distribution for the timing of an event can be obtained by normalizing the probability of the NOS event executing first. When there is only one event in the NOS set, the probability distribution for the timing of the event is already in normalized form. The "Normalized p" rows in Table 7 and Table 8 are the probability distributions for the timing of ENTER[7,14] and LEAVE[8,14], respectively, obtained by normalizing the probabilities in the "P(Execute 1st)" row.

4.6 Probability of Thread Occurrence

When attempting to determine a sample space (the complete set of possible outcomes from an experiment), it is often helpful to draw a diagram that illustrates the paths to all the possible outcomes. One such diagram is a tree diagram. In addition to determining the number of outcomes in a sample space, the tree diagram can be used to determine the probability of individual outcomes within the sample space. The probability of any
Each branch of the tree has its probability of occurrence that has been calculated and the probability distribution of the timing of the event. If there is only one branch at any level, the probability of occurrence for this branch is equal to 1. The sum of all the branches at any level must sum to 1. Figure 15 shows the probability tree diagram for the QSGM output. The bolded number 0.5969 in Figure 15 on the arc emanating from node 6 to node 7 is the probability of occurrence for ENTER[6,12]. The bolded number 0.4031 on the arc emanating from node 6 to node 13 is the probability of occurrence for LEAVE[8,12]. The thread probability for the Simple Queuing Model could be calculated by multiplying the probabilities of occurrence for the events along the path that represents the outcome from the tree diagram. For example, thread 1 has a probability of 0.2336 (correct to 4 decimal places), which is the product of 0.4 (taken from node 4's probability of occurrence), 0.5969 (taken from node 7's probability of occurrence) and 0.9786(taken from node 8's probability of occurrence). The thread probability for all the possible outcomes of the Simple Queuing model is shown in Table 9. Thread Thread Probability 1 0.2336 2 0.0051 4 0.1613 8 0.2400 11 0.3600 Table 9. Thread Probability 64 Figure 15. Probability tree diagram 1 Begin [0,0] 2 3 4 5 6 7 8 13 9 10 Leave [9,12] 22 23 24 25 26 31 1 2 4 8 11 Enter [0,0] Start [0,0] Enter [3,6] Leave [4,6] Leave [4,6] Enter [4,8] Start [4,8] Enter [7,14] Leave [8,14] Start [4,6] Leave [8,12] Enter [6,12] Leave [8,12] Leave [8,14] Enter [9,12] 0.6 0.4 0.6 0.4 0.4031 0.0214 0.5969 0.9786 65 4.7 Comparison of the Manually Computed Probability with Simulated Probability Using Excel In order to compare the thread probabilities that are calculated using the method discussed in the previous sections, a simulation model of the Simple Queuing System was created using Microsoft Excel. 
The timings of all the scheduled events are randomly generated (see Figure 16). The event name corresponding to each of these timings is shown in Figure 17. All the iterations are sorted according to the order of their event sequences and are then assigned the appropriate thread number. Figure 16 shows a table from the Excel worksheet that contains the 10,000 generated event timings for the Simple Queuing System. Each row contains all the event timings for one iteration (column B), which is then sorted into a thread number (column A). The numbers in columns C to J of Figure 16 are the generated timings for the first event, second event, and so on. The matching event name for each of these timings in each iteration is shown in Figure 17, columns C to J. The simulated thread probabilities based on the 10,000 output data points are computed and tabulated in Table 10. The calculated probability for thread 1 is equal to 0.2336 (correct to four decimal places), compared to a simulated probability for thread 1 of 0.2418 (correct to four decimal places). The difference between the two probabilities is 0.0082, a percentage difference of less than 3.4%.

Figure 16. The generated timing of events for the Simple Queuing System

Figure 17. The generated events for the Simple Queuing System

Thread   Calculated Thread Probability   Simulated Thread Probability   Difference
1        0.23363689                      0.24176667                     0.00812978
2        0.00511311                      0.00418333                     0.00092978
4        0.16125000                      0.15380000                     0.00745000
8        0.24000000                      0.23865000                     0.00135000
11       0.36000000                      0.36160000                     0.00160000

Table 10. Comparison of the calculated thread probability with simulated thread probability

The Excel simulation model for the Simple Queuing System contains all the information required to plot the probability distribution for the timing of an event execution. The simulated and calculated probabilities for the execution of LEAVE[4,6] (Node 22) in the intervals with lesser endpoints 4 and 5 are shown in Table 11.
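The comparison in Table 10 amounts to an absolute difference and a percentage difference per thread, which can be sketched directly from the tabulated values:

```python
# Sketch of the Table 10 comparison: absolute and percentage difference
# between the calculated and simulated thread probabilities.
calculated = {1: 0.23363689, 2: 0.00511311, 4: 0.16125000, 8: 0.24000000, 11: 0.36000000}
simulated  = {1: 0.24176667, 2: 0.00418333, 4: 0.15380000, 8: 0.23865000, 11: 0.36160000}

for thread in calculated:
    diff = abs(calculated[thread] - simulated[thread])
    pct = 100 * diff / simulated[thread]   # percentage relative to the simulated value
    print(f"thread {thread}: diff={diff:.8f} ({pct:.2f}%)")

# For thread 1 the difference is 0.00812978, under 3.4% of the simulated value.
assert abs(abs(calculated[1] - simulated[1]) - 0.00812978) < 1e-8
```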
The associated event-timing probability distributions are plotted in Figure 18. LEAVE[4,6] is the first nonzero-execution-time event in threads 8 and 11. Table 11 shows that the calculated probability is very close to the simulated probability for an event that was executed early in the system. Figure 19, Figure 20, Figure 21 and Figure 22 show the progression of the calculated and simulated probability distributions for the timing of events from threads 8 and 11 as the system approaches the terminating condition of 2 customers exiting the system. Table 12, Table 13, Table 14 and Table 15 are the associated tables containing the calculated and simulated probability distributions for the events from threads 8 and 11. The tables show that the calculated probabilities remain very close to the simulated probabilities as the simulation progresses. The graphs also exhibit the ability of the probability algorithm to capture the shape of the event-timing probability distribution as the simulation progresses, even for rare event sequences with low thread probability such as LEAVE[9,12] (Node 10), the last event to execute in thread 2, whose thread probability is 0.0051 (correct to 4 decimal places).

LEAVE[4,6] (Node 22)
Interval   Simulated Probability   Calculated Probability   Difference
[4,5]      0.586982331             0.583333333              0.003649
[5,6]      0.413017669             0.416666667              0.003649

Table 11. Calculated and Simulated Probability for the timing of LEAVE[4,6] (Node 22)

Figure 18. Event Timing Probability Distribution for LEAVE[4,6] (Node 22)

ENTER[4,8] (Node 23)
Interval   Simulated Probability   Calculated Probability   Difference
4          0.078636776             0.083333333              0.00469656
5          0.260213702             0.25                     0.0102137
6          0.330609679             0.333333333              0.00272365
7          0.330539842             0.333333333              0.00279349

Table 12.
Calculated and Simulated Probability for the timing of ENTER[4,8] (Node 23)

Figure 19. Event Timing Probability Distribution for ENTER[4,8] (Node 23)

ENTER[7,14] (Node 25)
Interval   Simulated Probability   Calculated Probability   Difference
7          0.014316642             0.020833333              0.00651669
8          0.095537398             0.098958333              0.00342094
9          0.222431734             0.213541667              0.0088901
10         0.309448984             0.296875                 0.012574
11         0.247922341             0.244791667              0.0031307
12         0.097073818             0.104166667              0.00709285
13         0.013269083             0.020833333              0.00756425

Table 13. Calculated and Simulated Probability for the timing of ENTER[7,14] (Node 25)

Figure 20. Event Timing Probability Distribution for ENTER[7,14] (Node 25)

LEAVE[8,14] (Node 26)
Interval   Simulated Probability   Calculated Probability   Difference
8          0.009009009             0.010416667              0.00140766
9          0.073538655             0.072916667              0.000622
10         0.21509882              0.197916667              0.0171822
11         0.316781898             0.302083333              0.0146986
12         0.277184161             0.291666667              0.01448251
13         0.108387457             0.125                    0.01661254

Table 14. Calculated and Simulated Probability for the timing of LEAVE[8,14] (Node 26)

Figure 21. Event Timing Probability Distribution for LEAVE[8,14] (Node 26)

LEAVE[9,12] (Node 10)
Interval   Simulated Probability   Calculated Probability   Difference
9          0.015936255             0.0671685                0.05123225
10         0.466135458             0.435147655              0.0309878
11         0.517928287             0.497683845              0.0202444

Table 15.
Calculated and Simulated Probability for the timing of LEAVE[9,12] (Node 10)

Figure 22. Event Timing Probability Distribution for LEAVE[9,12] (Node 10)

4.8 Development of Probability Algorithm Using QDES Framework

4.8.1 QDES Framework

The simulation graph model (SGM) modified by Ingalls (1996, 2001) provides a general framework for defining and developing QDES. The general framework and the algorithm were briefly introduced in Sections 2.4 and 2.7. In this section, further details of the QDES framework are discussed. Let the simulation graph G = (V(G), E(G), Ψ_G) be a directed graph, where V(G) is the set of vertices of G, E(G) is the set of edges, and Ψ_G is an incidence function that associates an ordered pair of vertices with each edge of G. The Simulation Graph Model (SGM) is then defined as S = (F, C, X, T, Γ, G) where

F = {f_v : v ∈ V(G)}, the set of state transition functions associated with vertex v;
C = {C_e : e ∈ E(G)}, the set of scheduling edge conditions;
X = {X_e : e ∈ E(G)}, the set of execution edge conditions;
T = {t_e : e ∈ E_S(G)}, the set of edge delay times as time intervals; and
Γ = {γ_e : e ∈ E_S(G)}, the set of event execution priorities.

The global simulation clock and the event calendar are stored as time intervals. The symbol τ represents the global simulation clock, while the event calendar is represented by the capital letter L. L is defined as an ordered set,

L = {(x₁, γ₁, v₁, e₁, a₁, b₁), (x₂, γ₂, v₂, e₂, a₂, b₂), ...}

where x_i represents the execution time, γ_i the execution priority, v_i the vertex to be executed, e_i the index of the edge that is being scheduled, a_i the values of the edge attributes, and b_i the time at which the ith event notice was scheduled. According to Ingalls, et al.
(1996), the following sets are defined:

S_v: the set of state variables that can be modified at vertex v (since S is the set of all state variables, S_v ⊂ S);
P_v: the set of parameters associated with vertex v;
Φ_e: the set of state variables involved in the scheduling conditions on edge e;
ϑ_e: the set of state variables involved in the execution conditions on edge e;
A_e: the set of state variables used to determine the values of a_e on edge e;
H: the set of saved states, which is used to iterate through all possible states in the simulation;
N_h: the NOS, or the set of possible next events.

In addition to these sets, ω represents the termination conditions for the simulation. Two variables, h and n_h, are also defined. The variable h counts the number of saved states while iterating through the set H. The variable n_h is used to iterate through the set N_h.

4.8.2 Execution of the QDES Algorithm

With the definitions of Section 4.8.1, the execution of the QDES algorithm is carried out as follows.

Run Initialization
Initialize the saved-state set and counter: H = ∅, h ← 1.

New Process Initialization
Step 1. Initialize the global simulation clock: τ ← [0,0].
Step 2. Insert one event notice into the event calendar. For simplicity, we assume that the event notice can be executed at time [0,0]: L ← L ∪ {([0,0], γ₁, v₁, e₁, a₁, b₁)}.

Execute (execution of the model implementation)
Step 1. Determine the NOS. The set Q is the set of all event notices that could be executed next without regard to priority:
Q = { l : x_l⁻ ≤ min(x_{l'}⁺) ∀ l' ∈ L }
N_h = { q : γ_q = min(γ_{q'}) ∀ q' ∈ Q }
Step 2. If |N_h| = 1, then go to Step 6 of Execute.
Step 3. Initialize the variable used to loop through the NOS: n_h ← 1.
Step 4. Save the state of the simulation by saving the state information in the savestate stack and incrementing the savestate counter.
H_h = {S, L}; h ← h + 1.
Step 5. Remove the N_{h-1}(n_{h-1}) event notice from L: L ← L \ {l : l = N_{h-1}(n_{h-1})}. Event notice l is removed from the calendar. Go to Step 7 of Execute.
Step 6. Remove the first event notice from L: L ← L \ {l : l = (x₁, γ₁, v₁, e₁, a₁, b₁)}. Event notice l is removed from the calendar.
Step 7. Evaluate the execution edge condition X_{e_l}(ϑ_{e_l}, a_l). If X_{e_l}(ϑ_{e_l}, a_l) = FALSE, then go to Step 15 of Execute; else go to Step 8 of Execute.
Step 8. Determine the possible new simulation clock time: τ′ ← [max(x_l⁻, τ⁻), min(x_{l'}⁺ ∀ l' ∈ L)].
Step 9. If t_{e_l} is a constant delay interval, determine whether the constant delay time is still valid. It is still valid if t_{e_l} = τ′ − b_l. If t_{e_l} is a constant interval and still valid, or if t_{e_l} is an uncertain interval, go to Step 11 of Execute; else go to Step 10 of Execute.
Step 10. Set t_{e_l} ← τ′ − b_l. If t_{e_l}⁻ ≤ t_{e_l}⁺, then the new t_{e_l} is valid, but the process must be started from the beginning so that t_{e_l} can be consistent throughout the process; go to Step 1 of New Process Initialization. Otherwise there is no valid constant interval for this process, and the process is declared "invalid" and terminates; in that case, go to Step 16 of Execute.
Step 11. Update the simulation clock: τ ← τ′.
Step 12. Assign the attributes to the parameters of the vertex: P_{v_l} ← a_l. If Y is the ith state variable in the vertex parameter list, i.e. P_{v_l}(i) = Y, then Y ← (a_l)_i.
Step 13. Evaluate the state change: S ← f_{v_l}(S).
Step 14. Schedule further events. For each edge e_{l_j} emanating from v_l, if C_{e_{l_j}}(Φ_{e_{l_j}}) = TRUE, then evaluate A_{e_{l_j}} and assign the attribute values of the new event notice k: a_k ← A_{e_{l_j}}. Generate the interevent time t_k = f(t_{e_{l_j}}) and schedule the event notice: L ← L ∪ {(τ + t_k, γ_{e_{l_j}}, v_j, e_{l_j}, a_k, b_k)}.
Step 15.
If any of the following conditions is satisfied: (i) τ > Tstop; (ii) the simulation stopping condition ω evaluates to TRUE; (iii) L is empty; then the simulation has reached the end of the process. Go to Step 17 of Execute.
Step 16. Go to Step 1 of Execute.
Step 17. If h = 1, then terminate the simulation.
Step 18. Restore the last saved system state off the savedstate stack: h ← h − 1, L ← (L : L ∈ H_h), S ← (S : S ∈ H_h).
Step 19. Increment n_{h-1} ← n_{h-1} + 1.
Step 20. If n_{h-1} ≤ |N_{h-1}|, then go to Step 5 of Execute.
Step 21. Go to Step 17 of Execute.

4.8.3 Execution of the Probability Algorithm

The QDES framework and algorithm of Sections 4.8.1 and 4.8.2 will be used as the general framework for developing the probability algorithm that computes the probability of an event execution and the probability distribution of the event's execution time. In order to determine the new scheduled time given the conditions of the preceding events that have executed, each event notice in the event calendar carries a new piece of information identifying the vertex of the event that scheduled the current event notice. Thus, the new definition of L is an ordered set,

L = {(x₁, γ₁, v₁, e₁, a₁, b₁, z₁), (x₂, γ₂, v₂, e₂, a₂, b₂, z₂), ...}

where x_i represents the execution time, γ_i the execution priority, v_i the vertex to be executed, e_i the index of the edge that is being scheduled, a_i the values of the edge attributes, b_i the time at which the ith event notice was scheduled, and z_i the vertex of the event that scheduled the ith event notice. All executed events are assigned a vertex number z and an edge number y, representing the vertex and the edge that were executed for an event. The scheduled execution time for an event is denoted s_l. The user-specified time unit for the simulation is denoted θ.
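The extended event notice just defined can be sketched as a small data structure; the field names are illustrative, but each field corresponds one-to-one to the 7-tuple (x, γ, v, e, a, b, z):

```python
# Sketch of the extended event notice from the new definition of L:
# (x, γ, v, e, a, b, z), with the execution time x kept as an interval and
# z recording the vertex whose event scheduled this notice.
from dataclasses import dataclass

@dataclass
class EventNotice:
    x: tuple             # execution time as an interval, e.g. (8.0, 12.0)
    priority: int        # γ, the execution priority
    vertex: int          # v, the vertex to be executed
    edge: int            # e, the index of the scheduling edge
    attributes: tuple    # a, the edge-attribute values
    scheduled_at: tuple  # b, the time at which the notice was scheduled
    scheduled_by: int    # z, the vertex of the event that scheduled it

# Illustrative notice: LEAVE scheduled in [8,12] by the event at vertex 6.
notice = EventNotice((8.0, 12.0), 1, 8, 3, (), (4.0, 6.0), 6)
assert notice.scheduled_by == 6
```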
The value of θ for the Simple Queuing Model is equal to 1. Recall from Section 4.4.2 that, in order to calculate statistics, we propose to generate all possible conditions for calculating the conditional probability of an event executing in a given interval. Any event that is to be executed after the first nonzero-execution-time event carries information about the timing of all preceding events. Each combination of the timings of these preceding events becomes a condition for the conditional probability. A new variable, denoted f, is defined as the condition number that is generated at each node. At each node, as new conditions are created, the numbering of f is reordered, because the number of conditions may increase or decrease from node to node. When determining whether a new condition (for the execution of the next event in the future calendar) should be generated, the current event's conditional execution probability is calculated; a new condition is created only if that probability is not equal to zero. As an example, consider Table 16, which shows the conditional scheduled-time probability distributions for the events LEAVE[8,12] (node 8 in the probability tree diagram in Figure 15) and ENTER[9,12] (node 9). These two events are in the NOS because they have overlapping execution times. The rows labeled "6,4,3,0", "7,4,3,0" and "8,4,3,0" are the first three conditions generated from the previous node, node 7. Comparing the probabilities of occurrence for both events in the time intervals [8,9], ..., [15,16], under all conditions, LEAVE[8,12] can execute first in the intervals [8,9], [9,10] and [10,11].
On the other hand, ENTER[9,12] has a zero probability of executing first under the condition "8,4,3,0", because ENTER[9,12]'s earliest possible execution time is [11,12] while LEAVE[8,12]'s latest possible execution time is [10,11]. However, ENTER[9,12] has a nonzero probability of executing first under the conditions "6,4,3,0" and "7,4,3,0".

Leave[8,12]
Condition   [8,9]      [9,10]     [10,11]    [11,12]
6,4,3,0     0.25       0.5        0.25
7,4,3,0     0.25       0.5        0.25
8,4,3,0     0.142857   0.571429   0.285714

Enter[9,12]
Condition   [9,10]   [10,11]   [11,12]   [12,13]   [13,14]   [14,15]   [15,16]
6,4,3,0     0.1      0.2       0.2       0.2       0.2       0.1
7,4,3,0              0.1       0.2       0.2       0.2       0.2       0.1
8,4,3,0                        0.1       0.2       0.2       0.2       0.2

Table 16. Conditional scheduled time probability of Leave[8,12] and Enter[9,12]

Let us now consider creating new conditions for ENTER[9,12]. Since there is a zero probability of ENTER[9,12] executing first under the condition "8,4,3,0", no new conditions are generated from it. Table 17 shows the calculated conditional probabilities of ENTER[9,12] executing first in the intervals [9,10] and [10,11] under the conditions "6,4,3,0" and "7,4,3,0". For condition "6,4,3,0", two new conditions are generated: "9,6,4,3,0" and "10,6,4,3,0". For condition "7,4,3,0", only one new condition is generated, namely "10,7,4,3,0".

Enter[9,12]
Condition   9       10
6,4,3,0     0.05    0.025
7,4,3,0             0.0125

Table 17. Conditional Probability of Executing First for Enter[9,12]

Four variables are defined to handle the generation of new conditions for the next event(s), the linkage of these conditions to the vertex that executed the current event, and the ordering of event sequences. They are defined as follows.

f: the condition number that is generated at each node.
w: the element number for the fth condition. It represents the tier or level of a node in the probability tree diagram of the QSGM output.
c(f,w): the lesser endpoint of the timing of the event that is executed under the fth condition as the wth element in the order of the event sequence. The series of c(f,w) for a given condition f gives the timing of all preceding events, which allows us to calculate the average execution time of all preceding events.
d(w): the vertex number of the event that is executed as the wth element in the order of the event sequence.
Pr(f,w): the probability of occurrence for the fth condition and wth element. It is obtained by normalizing the conditional probability of event l executing first in the interval with lesser endpoint pp. If c(f,w) = pp, then

Pr(f, w+1) = Pr(f, w) × Cond_Exe_First_l(pp, (f, w)) / Sum_Cond_Exe_First(f, w)

Figure 23 shows the values of c(f,w) and d(w) for w = 1, 2, 3, 4, 5, assuming that the conditional probability computed from the fth condition and wth element is not equal to zero. For example, if the conditional probability of ENTER[3,6] executing first in [4,5], given that Begin[0,0] executed in [0,0], ENTER[0,0] executed in [0,0] and START[0,0] executed in [0,0], is not equal to zero, then a new condition with c(2,4) = 4 and d(4) = 4 is generated. The values of c(2,1), c(2,2) and c(2,3) are copied from c(1,1), c(1,2) and c(1,3) respectively, so that c(2,1), c(2,2), c(2,3) and c(2,4) become the time series of the event sequence for the 2nd condition.

Figure 23. The values of c(f,w) and d(w) for w = 1, ..., 5

If the simulation were to stop at node 5, Table 18 shows the values of c(f,w) for all five conditions generated at node 5. For example, c(4,5) = 5 is the lesser endpoint of the execution time for LEAVE[4,6], and d(5) = 5 means that node 5 is the vertex that executed the event associated with c(4,5). The value f = 4 is the condition number, and w = 5 means that the event is the 5th element in the thread.
f\w   1   2   3   4   5
1     0   0   0   3   4
2     0   0   0   3   5
3     0   0   0   4   4
4     0   0   0   4   5
5     0   0   0   5   5

Table 18. The values of c(f,w) at node 5

The following new variables are added in the probability algorithm to determine the expected execution time of an event and the expected delay time between two events. Their definitions are given below.

Timestamp(f,w): the expected execution time of the event from the fth condition and wth element.
Delay(f,w): the expected delay time between the event from the fth condition and wth element and the event from the same fth condition and (w−1)th element.
cursor(f,w): the pointer to the condition which is the