Queueing theory

Contents 
Introduction
Queueing theory (sometimes spelled queuing theory) is the mathematical study of waiting lines (or queues). There are several related processes, arriving at the back of the queue, waiting in the queue (essentially a storage process), and being served by the server at the front of the queue. It is applicable in transport and telecommunication. Occasionally linked to ride theory.
Agner Krarup Erlang, a Danish engineer who worked for the Copenhagen Telephone Exchange, published the first paper on queueing theory in 1909.
Kendall's notation for describing Queues and their characteristics can be found in [4]. Kendall introduced a A/B/C queueing notation in 1953. It has since been extended to 1/2/3/(4/5/6) where the numbers are replaced with:
 A code describing the arrival process. The codes used are:
 M stands for "Markovian", implying exponential distribution for service times or interarrival times.
 D stands for "degenerate" distribution, or "deterministic" service times.
 Ek stands for an Erlang distribution with k as the shape parameter.
 G stands for a "General distribution".
 A similar code representing the service process. The same symbols are used.
 The Number of service channels.
 The Priority order that jobs in the line are served:
 The maximum size of the system. The maximum number of customers allowed in the system including those in service. When the number is at this maximum, further arrivals are turned away.
 The size of calling source. The size of the population from which the customers come. This limits the arrival rate. As more jobs queue up there are fewer available to arrive into the system.
The word queue comes from the Latin cauda, meaning tail.
Queueing theory is directly applicable to intelligent transportation systems, call centers, PABXs, networks, telecommunications, server queueing, mainframe computer queueing of telecommunications terminals, advanced telecommunications systems, and traffic flow.
Application of queueing theory to telephony
The Public Switched Telephone Networks (PSTNs) are designed to accommodate the offered traffic intensity with only a small loss. The performance of loss systems is quantified by their Grade of Service (GoS), driven by the assumption that if insufficient capacity is available, the call is refused and lost [1]. Alternatively, overflow systems make use of alternative routes to divert calls via different paths  even these systems have a finite or maximum traffic carrying capacity [1].
However, the use of queuing in PSTNs allows the systems to queue their customer's requests until free resources become available. This means that if traffic intensity levels exceed available capacity, customer’s calls are here no longer lost; they instead wait until they can be served [2]. This method is used in queueing customers for the next available operator.
A queuing discipline determines the manner in which the exchange handles calls from customers [2]. It defines the way they will be served, the order in which they are served, and the way in which resources are divided between the customers [2, 3]. Here are details of three queuing disciplines:
 First In First Out – This principle states that customers are served one at a time and that the customer that has been waiting the longest is served first [3].
 Last In First Out – This principle also serves customers one at a time, however the customer with the shortest waiting time will be served first [3].
 Processor Sharing – Customers are served equally. Network capacity is shared between customers and they all effectively experience the same delay [3].
Queuing is handled by control processes within exchanges, which can be modelled using state equations [2, 3]. Queuing systems use a particular form of state equations known as Markov chains which model the system in each state [2]. Incoming traffic to these systems is modelled via a Poisson distribution and is subject to Erlang’s queuing theory assumptions viz. [1]:
 PureChance Traffic – Call arrivals and departures are random and independent events [1].
 Statistical Equilibrium – Probabilities within the system do not change [1].
 Full Availability – All incoming traffic can be routed to any other customer within the network [1].
 Congestion is cleared as soon as servers are free [1].
Classic queuing theory involves complex calculations to determine call waiting time, service time, server utilisation and many other metrics which are used to measure queuing performance [2, 3].
Classic queuing theory does suffer from some disadvantages. The first and most obvious is that it is too mathematically restrictive for reallife modelling [5]. The second reason is that simplifying assumptions are often made in the light of uncertainty. These disadvantages require that a different approach be used in a number of queuing applications. The technique described in [5] examines the use of simulation as an alternative to mathematical analysis of queues. It was found that the advantages of simulation help overcome the disadvantages of classic queuing theory and that a combination of the two provides the best analytical method to achieve the greatest accuracy.
References
[1] Flood, J.E. Telecommunications Switching, Traffic and Networks, Chapter 4: Telecommunications Traffic, New York: PrenticeHall, 1998.
[2] Bose S.J., Chapter 1  An Introduction to Queuing Systems, Kluwer/Plenum Publishers, 2002.
[3] Penttinen A., Chapter 8 – Queuing Systems, Lecture Notes: S38.145  Introduction to Teletraffic Theory, Helsinki University of Technology, Fall 1999.
[4] Penttinen A., Kendall’s Notation for Queuing Models, Lecture Notes: S38.145  Introduction to Teletraffic Theory, Helsinki University of Technology, Fall 1999.
[5] van Dijk N.M., To pool or not to pool? “the benefits of combining queuing and simulation”, Proceedings of the Winter Simulation Conference, 2002.