OPTIMIZING WORKLOAD ALLOCATION IN A NETWORK OF HETEROGENEOUS COMPUTERS

The allocation of workload to a network of computers is investigated. A new workload allocation model based on the Generalized Exponential (GE) distribution is proposed for user-level performance measures. The criterion used for effective workload allocation is the one that minimizes the expected response time in the systems to which jobs are routed. Closed-loop expressions for the workload arrival rates that minimize the system's mean queue length and response time are derived using an optimization technique. Results are presented with numerical examples and a sensitivity analysis with respect to changes in the total workload. The results are verified using simulation.


INTRODUCTION
Recently, process improvement has been given a lot of attention. Since then, many modelling techniques and tools have been used to support the effort. However, most of the currently available tools use static models, such as diagrams, to model such processes. Some are more dynamic, in that the functional aspects of the process have been modelled using simulation. Furthermore, feedback to the designer concerning process functionality and alternative design options should be given early in the process design. At this stage, analytical modelling provides quantitative properties that give a global indication of the expected performance. In the final stage, more accurate predictions may be required to fine-tune the designs; therefore, the analytical modelling proposed here is at the highest abstraction level of the process design, i.e. to get an initial idea of the process performance.
In this paper, we relate the quantitative measures of the processes in a network of computers to those of a concurrent discrete-event system. Based on this, we show that, by using quantitative modelling, arrivals to computers can be reallocated to obtain optimal performance measures. We focus on the issue of job allocation in a network of computers where different computers have different job processing times. The optimization criterion studied here is to minimize the expected job-response time in the systems to which jobs are allocated. Jobs arrive at a scheduler that allocates them to the computers according to an arrival rate pre-calculated using the optimization method.

RELATED WORK
The problem of workload allocation is common to a variety of communication systems, especially when it involves a network of computers. Workload allocation seeks to allocate job arrivals among computers as evenly as possible. In a parallel setting, where jobs may have many possible paths at the job scheduler, the job allocation problem is of interest. For each job entering the scheduler, a path is assigned to optimize the allocation of workload. Many studies considered developing a closed-loop expression for the service rate (Hsiao & Lazar, 1990; Harrison & Patel, 2000; Gunther, 2000), and studies concerned with optimizing the allocation based on the total arrival rate at the service centers are quite recent (Rahim & Ku-Mahamud, 2006; 2008). For networks of computers, the workload-allocation problem is of particular interest, since there are several ways to affect the distribution of workload among computers. In general, network traffic is assumed to be exponentially distributed (Gelenbe & Mitrani, 1980; Bennani & Menascé, 2005). The generalized exponential (GE) distributions for traffic arrival and service time have been considered (Rahim & Ku-Mahamud, 2002; 2006; 2008; 2010), as these types of distribution possess flexible parameters.
Queueing network models have been recognized as powerful tools for evaluating the performance of computer systems (Allen, 1990; Smith & Williams, 2001) and communication networks (Lazar, 1982; Koavatsos & Othman, 1989a; 1989b; Koole, 1999; Boxma, 1995). These analytical models have become very important tools for predicting the behaviour of new designs or proposed changes to existing systems (Koavatsos, 1985; Menasce & Almeida, 2000; Urgaonkar, Pacifici, Shenoy, Spreitzer & Tantawi, 2005). Most queueing network models are used either by making assumptions that assure an exact numerical solution or by employing approximate methods (Kobayashi, 1974; Lazar, 1983; Koavatsos & Othman, 1989b). The control of arrivals to a network of queues with the objective of maximizing throughput subject to a response-time constraint has been considered (Kleinrock, 1975; Ross & Yao, 1991; Combe & Boxma, 1991; Hsiao & Lazar, 1991). A throughput time-delay function based on an optimality criterion has been developed (Kleinrock, 1975; Hsiao & Lazar, 1991), in which the arrival rate maximizes the throughput under the constraint that the average response time does not exceed a preassigned value. Ku-Mahamud (1993) then continued with the problem of random routing. All of this literature has been devoted to the probabilistic analysis of the queueing system; its optimization is somewhat lagging behind. Only recently have optimization problems related to networks of queues been studied, for instance by Lazar (1981; 1984), Tantawi and Towsley (1985), Harrison and Patel (1992), Koole (1999), Liu (1999), Jongh (2002), Srikant (2004), Felegyhazi and Hubaux (2006), and Rahim, Ibrahim, Syed Yahaya and Khalid (2010). Most of the studies focus on reducing the amount of waiting time in a system with several servers, either parallel or serial. However, none of the studies consider the impact of the variation in job inter-arrival and service times (the CVs) in modelling the system's performance. Measuring system performance without considering the effect of this variation may lead to inaccurate results. This motivated the present study, that is, to develop a predictive analytical model which can optimize the system's performance while taking data variation into account.

MULTISERVER QUEUEING SYSTEM MODEL
When several users compete for the use of a common resource, the limited capacity of the resource can give rise to congestion; hence queueing is a common phenomenon. Queueing normally occurs when the demand exceeds the service capacity of the resource, but it can occur even when it does not. This is because the inter-arrival times of the users, and their required service times, are generally not fixed; therefore, a mathematical model of congestion phenomena represents the inter-arrival and service times of the users by random variables. Queueing theory is devoted to the description, analysis and optimization of such queueing systems (Lazar, 1981). It focuses on a few key performance measures, such as queue lengths and waiting times. Due to the stochastic nature of the arrival and service processes, and of the routing of jobs through a network of queues, the main performance measures are also random variables. With this in mind, we use the multiple-queue multiple-server model, shown in Figure 1, to represent a central job routing system.
In using this model, hardware resources are represented by service centers at which jobs queue and compete for service. The workload is modelled as a single stream of jobs (file requests), with total arrival rate φ. Each newly arrived job is assigned to computer i according to a new arrival rate λ_i, which is a fraction of the total arrivals. We consider the set of computers to be heterogeneous, as this is common in real systems and can also be generalized to homogeneous servers. In the context of general queueing network models, the generalized exponential (GE) distributional model is of the form (1), where μ is the mean service rate, C is the coefficient of variation and u_0(t) is the unit impulse function; this form has been used to represent the inter-arrival and service-time distributions. The model is robust and versatile due to its memoryless properties and has been shown to maximize the entropy function subject to mean value constraints. Furthermore, the exact mean number of jobs in the GE/GE/1 queue is given by Liu (1999) in terms of C_a^2 and C_s^2, the squared coefficients of variation (CVs) of the inter-arrival and service-time distributions respectively. This queue length function will be used as the objective in the optimization model.
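To make the model concrete, the following is a minimal sketch of the GE distribution function and the GE/GE/1 mean queue length. It assumes the forms commonly quoted in the GE literature, F(t) = 1 - τ exp(-τμt) with τ = 2/(C^2 + 1), and L = (ρ/2)(1 + (C_a^2 + ρ C_s^2)/(1 - ρ)) with ρ = λ/μ; whether these coincide exactly with equation (1) and with the expression of Liu (1999) is an assumption. The function names are hypothetical.

```python
import math


def ge_cdf(t, mu, c2):
    """CDF of a GE distribution with mean 1/mu and squared CV c2 (assumed form).

    For c2 >= 1 there is a probability mass of 1 - tau at t = 0, which is the
    unit-impulse term of the density.
    """
    if t < 0:
        return 0.0
    tau = 2.0 / (c2 + 1.0)
    return 1.0 - tau * math.exp(-tau * mu * t)


def ge_mean_queue_length(lam, mu, ca2, cs2):
    """Mean number of jobs in a GE/GE/1 queue (assumed form).

    lam, mu  : arrival and service rates
    ca2, cs2 : squared CVs of the inter-arrival and service times
    """
    rho = lam / mu
    if not 0.0 <= rho < 1.0:
        raise ValueError("need 0 <= lam < mu for a stable queue")
    return 0.5 * rho * (1.0 + (ca2 + rho * cs2) / (1.0 - rho))


def mean_response_time(lam, mu, ca2, cs2):
    """Mean response time from Little's law, W = L / lam."""
    return ge_mean_queue_length(lam, mu, ca2, cs2) / lam
```

With both squared CVs set to 1, the queue length expression collapses to the familiar M/M/1 value ρ/(1 - ρ), which gives a quick sanity check on the assumed form.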

OPTIMIZATION MODEL USING GENERALIZED EXPONENTIAL (GE) DISTRIBUTION
In this section, a workload allocation model for the GE-type distribution system is proposed. In this case, the optimization problem of the queueing system can be generalized to a number of arrival and service distributions by configuring the values of the coefficients of variation for the inter-arrival and service times.

We formulated an optimization problem, P1, for the system of N GE/GE/1 queues. Problem P1 allows an analytical solution. Using the Lagrange multiplier technique, with δ the Lagrange multiplier, we obtain the first-order Kuhn-Tucker conditions (3.6) and (3.7). From (3.6) we find the unique optimal values (3.8), and the Lagrange multiplier is derived by solving the constraint equation (3.9). When C_ai = C_si = 1, the GE workload expression reduces to the N-M/M/1 model. D_i is the cost associated with having one job in queue i and, for simplicity, we assign it the value of 1.
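The Lagrange-multiplier procedure described above can be sketched numerically. The code below is an illustration under stated assumptions, not a reproduction of equations (3.6) to (3.9): it assumes the per-queue cost D_i L_i with L_i = (ρ_i/2)(1 + (C_ai^2 + ρ_i C_si^2)/(1 - ρ_i)), the constraint that the λ_i sum to φ, and λ_i >= 0. Setting the marginal cost of each active queue equal to δ gives λ_i = μ_i (1 - sqrt((C_ai^2 + C_si^2) / (2δμ_i/D_i - 1 + C_si^2))), and δ is then found by bisection on the constraint. All function and parameter names are hypothetical.

```python
import math


def rate_for_multiplier(delta, mu, ca2, cs2, cost=1.0):
    """Arrival rate at which cost * dL/d(lam) equals delta, clipped at zero.

    Assumes ca2 + cs2 > 0 (some randomness in arrivals or service).
    """
    denom = 2.0 * delta * mu / cost - 1.0 + cs2
    if denom <= ca2 + cs2:
        # The marginal cost at lam = 0 already reaches delta: queue gets no jobs.
        return 0.0
    return mu * (1.0 - math.sqrt((ca2 + cs2) / denom))


def allocate(phi, mus, ca2s, cs2s, costs=None, iters=200):
    """Split total arrival rate phi over queues to minimize total expected cost."""
    if costs is None:
        costs = [1.0] * len(mus)  # D_i = 1, as in the text
    if not 0.0 < phi < sum(mus):
        raise ValueError("total arrival rate must be below total capacity")

    def total(delta):
        return sum(rate_for_multiplier(delta, m, a, s, d)
                   for m, a, s, d in zip(mus, ca2s, cs2s, costs))

    # total(delta) is non-decreasing in delta, so bracket and bisect.
    lo, hi = 0.0, 1.0
    while total(hi) < phi:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) < phi:
            lo = mid
        else:
            hi = mid
    return [rate_for_multiplier(hi, m, a, s, d)
            for m, a, s, d in zip(mus, ca2s, cs2s, costs)]
```

Bisection is used here only because the total allocated rate is monotone in δ, which makes the search simple and robust; a closed-form solution for δ, where one exists, would replace the loop.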

COMPUTATIONAL RESULTS
In this section, numerical results are presented to assess the credibility of the GE distribution used. For result validation, simulation models were developed to simulate the proposed arrival and service rates. The mean queue length and the mean response time obtained using simulation were compared with the results obtained using the proposed analytical model. Two configurations are shown, each with its own assumed service rates for the tasks.

A sample of parameters for three queueing systems.
A sample of parameters for four queueing systems.
A sample of parameters for five queueing systems.
A sample of parameters for six queueing systems.
Journal of ICT, 10, pp: 1-13
The analysis shows that a larger range for the service rates and CVs results in a greater percentage improvement in our aggregate objectives. The result of the analysis for 2, 3, 4, 5 and 6 computers is summarized in Figure 10. The results clearly show that the mean queue length and the mean response time have improved for a network of more than 3 computers. However, this study requires a larger number of computers before the results can be generalized. The improvement in the system's performance can be seen in Figures 2, 5, 6, and 8. From the results, we can conclude that the optimal arrival rate improved the queue's performance by reducing the mean number of jobs and the mean response time in the system. One factor to note here is that the performance improvement is achieved by increasing the rate of arrival to tasks with a higher service rate and reducing the rate of arrival to tasks with a lower service rate.
Simulation models were developed to validate the proposed analytical results. The same generic data used in the proposed analytical model were used in the simulation for model validation. The simulation results were obtained from simulation models run for 500 replications using ARENA. The results of the proposed model were compared with the results from the simulation, as depicted in Figures 3, 4, 7 and 9.
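The shift of arrivals towards faster servers can be illustrated with a small worked comparison. The sketch below is restricted to the exponential special case (all CVs equal to 1, i.e. N M/M/1 queues), for which the optimal split follows the well-known square-root rule; the classical split gives every server the same utilization. The service rates and total arrival rate are hypothetical values chosen for illustration and are not taken from the paper's configurations.

```python
import math


def mm1_queue_length(lam, mu):
    """Mean number of jobs in an M/M/1 queue."""
    return lam / (mu - lam)


def classical_split(phi, mus):
    """Equal-utilization split: each server receives arrivals in proportion to mu."""
    capacity = sum(mus)
    return [phi * m / capacity for m in mus]


def optimized_split_mm1(phi, mus):
    """Square-root rule for M/M/1 queues; assumes every server stays active."""
    k = (sum(mus) - phi) / sum(math.sqrt(m) for m in mus)
    lams = [m - k * math.sqrt(m) for m in mus]
    if min(lams) < 0.0:
        raise ValueError("a server would receive a negative rate; drop it and retry")
    return lams


def summarize(phi, lams, mus):
    """Total mean queue length and system mean response time (Little's law)."""
    total_jobs = sum(mm1_queue_length(l, m) for l, m in zip(lams, mus))
    return total_jobs, total_jobs / phi


if __name__ == "__main__":
    mus = [4.0, 2.0]   # hypothetical service rates
    phi = 3.0          # hypothetical total arrival rate
    for name, split in (("classical", classical_split),
                        ("optimized", optimized_split_mm1)):
        lams = split(phi, mus)
        jobs, resp = summarize(phi, lams, mus)
        print(name, [round(x, 4) for x in lams], round(jobs, 4), round(resp, 4))
```

For these illustrative values the classical split sends rates 2 and 1 to the two servers, while the optimized split sends roughly 2.24 and 0.76, and the total mean queue length falls from 2.0 to about 1.89.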

CONCLUSION
In this paper, a new optimization model for allocating arrivals to a network of computers, based on Generalized Exponential arrival and service-time distributions, has been proposed. A closed-loop expression to obtain the routing rate was constructed. Analytical modelling and simulation were used to show that the classical queueing allocation of total arrivals among parallel systems with the same utilization rate does not provide an efficient performance result. A sample of results for up to six computers is shown to illustrate the improvement.
The GE distribution has been used because it can represent exponential and other general distributions. There are several directions in which the applicability of this allocation model could be extended, such as different performance objective functions, other arrival and service distributions, and arrivals with different types of jobs. These extensions would involve interesting mathematical problems and could be the subject of future research.

Figure 1. Multiple-Queue Multiple-Server Model

Figure 4. Analytical Versus Simulation Result

Figure 5. Performance Improvement of Mean Response Time for a Dual GE/GE/1 Queueing System

Figure 6. Performance Improvement of Mean Queue Length

Figure 8. Performance Improvement of Mean Response Time

Figure 9. Analytical Versus Simulation Result for a Dual GE/GE/1 Queueing System

Table 1
Results of the Proposed and Classical Approaches of Queuing System (Mean Queue Length; W: Mean Response Time)

Table 2
Results of the Classical and Proposed Approaches of 2-GE/GE/1 Queuing System