INTRODUCTION
Statistical selection procedures are used to identify the best of a finite set of simulated alternatives, where the best design is defined in terms of the minimum (or maximum) expected performance of each alternative. In this study, we consider the problem of selecting the best design from a finite but large set of alternatives, where the expected performance of each alternative can only be inferred by simulation. The three-stage procedure proposed by Almomani and Alrefaei (2010) identifies the best design in three stages. In the first stage, the Ordinal Optimization (OO) approach is used to select a subset that, with high probability, overlaps the set of the best m% designs. In the second stage, the Subset Selection (SS) approach is used to obtain, from the subset selected in the first stage, a smaller subset that contains the best design. Finally, the Indifference-Zone (IZ) approach is used to select the best design from the subset produced in the second stage. However, since simulation output is random, the best design cannot be selected with certainty; instead, selection procedures provide some measure of the quality of the selection. There are two such measures: the Probability of Correct Selection, P(CS), and the Expected Opportunity Cost, E(OC), of a potentially incorrect selection. Almomani and Alrefaei (2010) analyzed the three-stage procedure in the context of P(CS); this study analyzes it in the context of E(OC), the second measure of quality. Traditional selection procedures identify the best design with a high probability of correct selection by maximizing P(CS). Because E(OC) has become important in many business and engineering applications, new selection procedures have recently been developed to reduce the opportunity cost of a potentially incorrect selection (Gupta and Miescke, 1994, 1996; Chick and Inoue, 2001a, b). Almomani and Alrefaei (2010) showed that the three-stage procedure does indeed select one of the best simulated designs with high P(CS). In this study, we use the three-stage procedure to select the best simulated design and evaluate it using E(OC) as the measure of selection quality.
PROBLEM STATEMENT
We consider the following optimization problem:

min_{θ∈Θ} f(θ) = E[L(θ, Y)]

Where:
f = The expected performance measure of some complex stochastic system
Θ = An arbitrary feasible solution set that is finite and has no structure
θ = A vector representing the system design parameters
Y = Represents all the random effects of the design
L = A deterministic function that depends on θ and Y

The goal of the selection procedure is to identify the best of n simulated designs, where the best in this study is defined as the design with the smallest mean, which is unknown and must be inferred from simulation. Suppose that there are n designs. Let Y_{ij} represent the jth output from design i and let Y_{i} = {Y_{ij}, j = 1, 2, ...} denote the output sequence from design i. We assume that the Y_{ij} are independent and identically distributed (i.i.d.) normal with unknown means μ_{i} = E(Y_{ij}) and variances σ^{2}_{i} = Var(Y_{ij}). In addition, we assume that Y_{1}, Y_{2}, ..., Y_{n} are mutually independent. The normality assumption is not a problem here because simulation outputs are typically averages or batch means, so by the Central Limit Theorem the normality assumption holds approximately. In practice the σ^{2}_{i} are unknown, so we estimate them using the sample variances S^{2}_{i}. Throughout this study we assume that a smaller mean is better, so if the ordered μ_{i} values are denoted by μ_{[1]}≤μ_{[2]}≤.....≤μ_{[n]}, then the design having mean μ_{[1]} is referred to as the best design. Correct selection (CS) occurs when the design selected by the procedure is the true best design.
In the simulation selection problem, the opportunity cost is the difference between the unknown means of the selected design and the actual best design:

OC = μ_{b} − μ_{i*}

Where:
b = The design with the smallest sample mean
i^{*} = The true best design

If the best design is correctly selected, then the OC is zero. The Expected Opportunity Cost, E(OC), is accordingly defined as E(OC) = E(μ_{b} − μ_{i*}) (Chick and Wu, 2005; He et al., 2007).
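As a small illustration (the design means, variance and sample sizes below are hypothetical, not taken from this study), the E(OC) of a plain select-the-smallest-sample-mean rule can be estimated by Monte Carlo:

```python
import random
import statistics

def estimate_eoc(true_means, sigma, t, reps, seed=0):
    """Monte Carlo estimate of E(OC) when the design with the smallest
    sample mean is selected; OC = mu_b - mu_{i*} is zero on correct selection."""
    rng = random.Random(seed)
    i_star = min(range(len(true_means)), key=lambda i: true_means[i])
    total = 0.0
    for _ in range(reps):
        sample_means = [
            statistics.fmean(rng.gauss(mu, sigma) for _ in range(t))
            for mu in true_means
        ]
        b = min(range(len(true_means)), key=lambda i: sample_means[i])
        total += true_means[b] - true_means[i_star]  # opportunity cost, always >= 0
    return total / reps

# four hypothetical designs; design 0 (mean 1.0) is the true best
eoc = estimate_eoc([1.0, 1.2, 1.5, 2.0], sigma=1.0, t=30, reps=200)
```

Note that E(OC) is driven entirely by near-ties: with widely separated means or a large sample size t, the estimate collapses to zero.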
BACKGROUND
Ordinal Optimization (OO): OO concentrates on isolating a subset of good designs with high probability and dramatically reduces the simulation time required for discrete-event simulation. The OO approach was proposed by Ho et al. (1992) and has since emerged as an efficient technique for simulation and optimization. In this approach, the aim is to find a good enough solution rather than to estimate the performance of each design accurately. If we simulate a design to estimate its expected performance measure, the confidence interval of this estimate cannot shrink faster than O(1/√n), where n is the number of replications used. This rate may be acceptable when there is a small number of alternatives, but it is not good enough for a complex simulation with a large number of alternatives: since each sample requires one simulation run, a huge number of alternative designs in the solution set demands an impractically large number of samples. In this case, one can compromise the objective and settle for a good enough solution rather than performing extensive simulation, which is infeasible in many cases. Moreover, in many real-world applications, using a simpler model to obtain an accurate estimate is effectively impossible. Suppose that the Correct Selection (CS) is to select a subset G of g designs from the feasible solution set that contains at least one of the top m% best designs. Since we assume that Θ is very large, the probability of CS is given by:

P(CS) = 1 − (1 − m%)^{g}

Now, suppose that the CS is to select a subset G of g designs that contains at least r of the best s designs. Let S be the subset that contains the actual best s designs; then the probability of CS can be obtained from the hypergeometric distribution as:

P(CS) = Σ_{i=r}^{min(g,s)} [C(s, i) C(n − s, g − i)] / C(n, g)

However, if the number of alternatives is very large, then P(CS) can be approximated using a binomial random variable:

P(CS) ≈ Σ_{i=r}^{g} C(g, i) (m%)^{i} (1 − m%)^{g−i}

where it is assumed that s/n×100% = m%. It is also clear that P(CS) increases as the subset size g increases.
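Both expressions are straightforward to evaluate directly. The sketch below uses the illustrative values n = 1000, s = 20, g = 50 and r = 1 (i.e., m% = 2%) to compare the exact hypergeometric form with its binomial approximation:

```python
from math import comb

def pcs_hypergeometric(n, s, g, r):
    """P(at least r of the best s designs fall in a random subset of size g)."""
    return sum(comb(s, i) * comb(n - s, g - i)
               for i in range(r, min(g, s) + 1)) / comb(n, g)

def pcs_binomial(g, m, r):
    """Binomial approximation with per-draw success probability m = s/n."""
    return sum(comb(g, i) * m**i * (1 - m)**(g - i) for i in range(r, g + 1))

exact = pcs_hypergeometric(1000, 20, 50, 1)
approx = pcs_binomial(50, 0.02, 1)  # for r = 1 this equals 1 - 0.98**50
```

For r = 1 the binomial form collapses to 1 − (1 − m%)^g, the first formula above, and the two values agree closely for large n.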
Subset Selection (SS): The SS approach screens the search space, eliminates non-competitive designs, and constructs a subset that contains the best design with high probability. This approach can be used when the number of alternatives is relatively large to select a (random-size) subset that contains the actual best design. We require that P(CS)≥1−α, where CS means selecting a subset that contains the actual best design and 1−α is a predetermined probability. The SS approach dates back to Gupta (1965), who presented a single-stage procedure for producing a subset containing the best design with a specified probability. Extensions of this research relevant to the simulation setting include Sullivan and Wilson (1989), who derived a two-stage SS approach that, with a specified probability, determines a subset of maximum size m containing only designs within a prespecified amount of the optimum.
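The screening idea can be illustrated with a deliberately simplified rule (not Gupta's exact procedure): keep every design whose first-stage sample mean lies within a tolerance w of the smallest sample mean. In a real SS procedure, w would be derived from the sample variances and a t-quantile rather than chosen by hand:

```python
import statistics

def screen(samples, w):
    """Return indices of designs whose sample mean is within w of the best mean."""
    means = [statistics.fmean(s) for s in samples]
    best = min(means)
    return [i for i, m in enumerate(means) if m <= best + w]

# three hypothetical designs; the third is clearly non-competitive
subset = screen([[0.9, 1.1, 1.0], [1.4, 1.6, 1.5], [4.8, 5.2, 5.0]], w=1.0)
```

Here the first two designs (sample means 1.0 and 1.5) survive the screen while the third (sample mean 5.0) is eliminated.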
Indifference Zone (IZ): The main aim of the IZ approach is to select the best design among n designs when n≤20. Suppose we have n alternative designs that are normally distributed with unknown means μ_{1}, μ_{2},.....,μ_{n} and suppose that these means are ordered as μ_{[1]}≤μ_{[2]}≤.....≤μ_{[n]}. We seek to locate the design that has the minimum mean μ_{[1]}. The IZ is defined to be the interval [μ_{[1]}, μ_{[1]}+δ], where δ is a predetermined small positive real number. We are interested in selecting an alternative k such that μ_{k} ∈ [μ_{[1]}, μ_{[1]}+δ]. Here, CS means selecting an alternative whose mean belongs to the indifference zone. We would like the correct selection to take place with high probability, say with a probability not smaller than 1−α, where 1/n≤1−α≤1. Care must be taken in specifying the value of δ, because it implicitly determines the common single-stage sample size required for the n competing designs. If δ is too small, then the number of observations is expected to be large; likewise, when 1−α is large, the number of observations is expected to be large. The indifference-zone approach consists of two stages. In the first stage, all designs are sampled using t_{0} simulation runs to obtain initial estimates of the expected performance measures and their variances. Next, based on the information obtained in the first stage, we compute how many more samples are needed in the second stage for each design to guarantee that P(CS)≥1−α. Rinott (1978) presented a two-stage procedure that is applicable when the data are normally distributed, all designs are simulated independently of each other and the variances are completely unknown. Tamhane and Bechhofer (1977) presented a simple procedure that remains valid when the variances are unequal. Nelson et al. (2001) proposed a two-stage subset selection procedure: the first stage reduces the number of competitive designs, and the surviving designs are carried into a second stage that applies the indifference-zone approach using the information gathered in the first stage. Alrefaei and Almomani (2007) proposed two sequential algorithms for selecting a subset of k designs that is contained in the set of the top s designs. Comprehensive reviews of ranking and selection procedures can be found in Goldsman and Nelson (1994), Bechhofer et al. (1995) and Kim and Nelson (2006, 2007).
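The second-stage sample-size computation in Rinott-style procedures can be sketched as follows. The Rinott constant h is taken as a caller-supplied input here (in practice it comes from tables such as Wilcox (1984)), and the observations and parameter values below are illustrative:

```python
import math
import statistics

def rinott_total_sample(first_stage, h, delta):
    """Total sample size T_i = max(t0, ceil((h * S_i / delta)^2)) for a design,
    given its t0 first-stage observations."""
    t0 = len(first_stage)
    s = statistics.stdev(first_stage)  # first-stage sample standard deviation S_i
    return max(t0, math.ceil((h * s / delta) ** 2))

# illustrative: t0 = 4 observations, h = 4, indifference zone delta = 0.5
t_i = rinott_total_sample([1.0, 2.0, 1.0, 2.0], h=4.0, delta=0.5)
```

The formula captures the trade-off described above: shrinking δ or raising the required confidence (through h) inflates the second-stage sample size quadratically.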
THREE-STAGE PROCEDURE
This procedure consists of three stages. In the first stage, a subset G is selected randomly from the feasible solution set Θ. Let g denote the size of the subset G, where g is a relatively small number but the probability that G contains one of the best m% alternatives is high. In the second stage, the SS approach is applied to the subset G to select a subset I that contains the best design of G with high probability, where |I|≤20. Note that the size of the subset G is small, so the SS approach is applicable. Finally, the IZ approach is applied to the set I to select the best design; since |I|≤20, the IZ approach is applicable. The algorithm of the three-stage procedure is described as follows:
Algorithm:
•  Determine the number of alternatives g in G, i.e., |G| = g, where G is the subset selected in the first stage, chosen so that P(G contains at least one of the best m% designs) ≥ 1−α_{1}. Determine the first-stage sample size t_{0}≥2 and the indifference-zone parameter δ

•  Select a subset G of size g randomly from the feasible solution set Θ

•  Take a random sample of t_{0} observations Y_{ij} (1≤j≤t_{0}) from each design i, i = 1,......., g

•  Calculate the first-stage sample means Ȳ_{i}^{(1)} and variances S_{i}^{2} as:

Ȳ_{i}^{(1)} = (1/t_{0}) Σ_{j=1}^{t_{0}} Y_{ij},  S_{i}^{2} = (1/(t_{0}−1)) Σ_{j=1}^{t_{0}} (Y_{ij} − Ȳ_{i}^{(1)})^{2}

where i = 1,..., g and ν = t_{0}−1 degrees of freedom. Let:

I = {i : Ȳ_{i}^{(1)} ≤ Ȳ_{j}^{(1)} + W_{ij} for all j ≠ i}

Where:

W_{ij} = t_{ν, (1−α_{2})^{1/(g−1)}} ((S_{i}^{2} + S_{j}^{2})/t_{0})^{1/2}

•  If I contains a single index, then that design is the best design. Otherwise, for all i∈I compute the second-stage sample size T_{i}, where:

T_{i} = max{t_{0}, ⌈(h S_{i}/δ)^{2}⌉}

Here h = h(1−α_{3}, t_{0}, |I|) is Rinott's constant, which can be obtained from the tables of Wilcox (1984).

•  Take a random sample of T_{i}−t_{0} additional observations from each design i∈I and compute the overall sample means:

Ȳ_{i} = (1/T_{i}) Σ_{j=1}^{T_{i}} Y_{ij} for all i∈I

•  Select the design i∈I with the smallest overall sample mean Ȳ_{i} as the best design
Nelson et al. (2001) have shown that, with probability of at least 1−(α_{2}+α_{3}), this approach selects the best design in the subset G. Therefore, if G contains at least one of the top m% designs, then the procedure selects a good design with probability 1−(α_{2}+α_{3}). From the OO stage, the selected set G contains at least one of the best m% designs with probability at least 1−α_{1}. Therefore, P(the design selected by the three-stage procedure is in the top m% designs) ≥ (1−α_{1})(1−(α_{2}+α_{3})).
NUMERICAL EXAMPLE
Consider queuing designs in which the inter-arrival times and the service times are exponentially distributed and each design has one server; this is known as the M/M/1 queuing design. In this design, the arrival process is Poisson with rate λ (the mean arrival rate) and the service times are exponentially distributed with parameter μ (the mean service rate). Suppose we have n M/M/1 queuing designs and we want to select one of the best m% queuing models, those with the minimum average waiting time per customer. We assume that the arrival rate λ is fixed and the service rate μ ∈ [a, b]. Of course, this problem has an analytical solution in the steady-state case; we implement the algorithm of the three-stage procedure for solving it under several parameter settings.

In the first experiment, we take n = 1000, λ = 1 and μ ∈ [4, 5]. We discretize the problem by taking Θ = {4.001, 4.002,...., 5.000}. The best design is, of course, the 1000th design with μ_{1000} = 5. Suppose we want to select one of the best 2% of designs; then we are shooting for designs 981-1000. Let g = 50, α_{2} = α_{3} = 0.005 and δ = 0.05. We count it as a correct selection if the selected design belongs to {μ_{981}, μ_{982},..., μ_{1000}}. Note that we can evaluate the analytical probability of correct selection as:

P(CS) = 1 − (1 − 0.02)^{50}
Table 1: The numerical illustration for n = 1000, g = 50, t_{0} = 30, m% = 2%

Table 2: The P(CS) and average E(OC) for n = 1000, g = 50, t_{0} = 30, m% = 2% over 100 replications
Table 1 contains the results of this illustration over 10 replications for selecting one of the best 2% of designs. In Table 1, Best refers to the index of the design selected by the three-stage procedure as the best design, W_{i*} is the unknown average waiting time for the actual best design i^{*} (here i^{*} = 1000 and W_{i*} = 0.25), W_{b} is the unknown average waiting time for the selected design b (b = Best), Ȳ_{i*} is the sample mean for the actual best design i^{*}, Ȳ_{b} is the sample mean for the selected design b, and the remaining columns give the three-stage E(OC) and the analytical E(OC). Note that we can find W_{i*} and W_{b} from the formula:

W_{i} = 1/(μ_{i} − λ_{i})

Where:
μ_{i} = The service rate for design i
λ_{i} = The arrival rate for design i
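The formula above is the standard M/M/1 steady-state mean time in system, and it can be checked directly. The sketch below reproduces the discretized design set from this example and confirms that μ = 5 is the best design, with W = 1/(5 − 1) = 0.25:

```python
def mm1_wait(mu, lam):
    """Steady-state mean time in system for a stable M/M/1 queue (requires mu > lam)."""
    if mu <= lam:
        raise ValueError("queue is unstable: require mu > lam")
    return 1.0 / (mu - lam)

lam = 1.0
thetas = [round(4.0 + 0.001 * k, 3) for k in range(1, 1001)]  # {4.001, ..., 5.000}
waits = [mm1_wait(mu, lam) for mu in thetas]
best_index = min(range(len(thetas)), key=lambda i: waits[i])  # design 1000
```

Since W is strictly decreasing in μ for fixed λ, the largest service rate always gives the smallest waiting time, which is why design 1000 is the analytical best.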

Moreover, after the simulation is performed, Ȳ_{b} can be calculated from the design output. Note that in some replications there are negative values for the three-stage OC, whereas E(OC) should be non-negative. The reason is that in the first stage of the three-stage procedure we use OO to select the set G randomly, so in some replications the actual best design i^{*} is not selected in the first stage, i.e., i^{*} does not belong to the set G. Consequently, no additional third-stage observations are taken for i^{*}, unlike for the selected design b, so the sample mean of the selected design b sometimes falls below the sample mean of the actual best design, making the observed three-stage OC negative. The numerical example above was extended to 100 replications and the results are shown in Table 2. It is clear that the P(CS) and average E(OC) values for the three-stage procedure are close to the analytical values. Figure 1 shows the relation between the E(OC) for the three-stage procedure and the analytical E(OC) when this experiment was repeated for 100 replications.
Fig. 1: Relation between three-stage E(OC) and analytical E(OC) for n = 1000, g = 50, t_{0} = 30, m% = 2% over 100 replications

Fig. 2: Relation between three-stage E(OC) and analytical E(OC) for n = 1000, g = 50, t_{0} = 30, m% = 2% over 100 replications
If we take the difference between Ȳ_{b} and Ȳ_{i*} as an absolute value, we can see the relation between the three-stage E(OC) and the analytical E(OC) in Fig. 2. Note that the three-stage procedure produces very small E(OC) values that are mostly close to the analytical E(OC). For the next experiment, assume n = 1000, λ = 1 and μ ∈ [4, 5], and suppose we want to select one of the best 5% of designs. Let g = 50, α_{2} = α_{3} = 0.005 and δ = 0.05. We count it as a correct selection if the selected design belongs to {μ_{951}, μ_{952},..., μ_{1000}}. Note that we can evaluate the analytical probability of correct selection as:

P(CS) = 1 − (1 − 0.05)^{50}

Table 3 contains the results of this illustration over 10 replications for selecting one of the best 5% of designs. The experiment was then extended to 100 replications and the results are shown in Table 4. The results again show that the P(CS) and average E(OC) values for the three-stage procedure are close to the analytical values.
Fig. 3: Relation between three-stage E(OC) and analytical E(OC) for n = 1000, g = 50, t_{0} = 20, m% = 5% over 100 replications

Fig. 4: Relation between three-stage E(OC) and analytical E(OC) for n = 1000, g = 50, t_{0} = 20, m% = 5% over 100 replications

Table 3: The numerical illustration for n = 1000, g = 50, t_{0} = 20, m% = 5%

Table 4: The P(CS) and average E(OC) for n = 1000, g = 50, t_{0} = 20, m% = 5% over 100 replications

Figure 3 shows the relation between the E(OC) for the three-stage procedure and the analytical E(OC) when this experiment was repeated for 100 replications. If we take the difference between Ȳ_{b} and Ȳ_{i*} as an absolute value, we can see the relation between the three-stage E(OC) and the analytical E(OC) in Fig. 4.
CONCLUSION
In this study, we have presented a three-stage procedure that selects a good enough simulated design in the context of the E(OC) of a potentially incorrect selection. In the first stage, we use the ordinal optimization approach to select a relatively small subset G whose intersection with the set of the actual best m% alternatives occurs with high probability. Then the subset selection approach is used to select a smaller subset I that contains the best alternative of G, where |I|≤20. This is followed by the indifference-zone approach to select the best alternative of I, which with high probability yields an alternative that belongs to the top m% of alternatives. We have also shown how to implement the algorithm for a given problem involving M/M/1 queuing models.

The numerical results indicate that the three-stage procedure is sound and works well in practice. The P(CS) values for this procedure are very close to the analytical P(CS) values, and the procedure also yields small values of E(OC). In future research we would like to improve the three-stage procedure to further reduce the E(OC) and to apply it to real-life problems. Moreover, we will study the efficiency of the three-stage procedure as the initial sample size t_{0} varies, since the numerical examples suggest that its efficiency is sensitive to t_{0}.
ACKNOWLEDGEMENT
The researchers would like to thank the School of Mathematical Sciences, Universiti Sains Malaysia, for supporting this research.