Profit optimization algorithms for clouds supporting resource reservation
The work in  proposed a mechanism to allocate computing resources to workloads with various mixes of best-effort and advance reservation requests. Incoming advance reservation leases are always scheduled right away, whereas best-effort leases are put in a queue. The scheduling function periodically evaluates the queue and uses an aggressive backfilling algorithm [21, 23] to decide whether any best-effort leases can be scheduled. When an advance reservation lease arrives, the scheduler tries to choose nodes that do not require preempting another lease.
In , the authors presented a scenario of cloud platforms that support resident applications as well as resource reservation. When a resource reservation request arrives, VMs running resident applications can be moved elsewhere to free enough resources for the reservation. However, because the total amount of resources is limited, doing so may negatively affect the originally resident applications. The authors defined a revenue function and an adaptive QoS-aware resource reservation management algorithm. For each physical node, the algorithm checks the capacity, forms a new configuration according to the reservation and forecasts the revenue. It then selects the node yielding the largest revenue.
From the description above, we can see that both algorithms in  and  support reservation as well as non-reservation workloads, and both give higher priority to reservation requests. When a new reservation request arrives, the algorithms in  and  may suspend or migrate non-reservation VMs to free enough cloud resources for the request. The difference between the two algorithms lies in their treatment of the affected VMs. While the algorithm in  pays little attention to the affected non-reservation workloads, the algorithm in  must weigh the penalty before deciding to migrate the non-reservation VMs.
Profit optimization algorithms for clouds supporting a spot market
Motivated by low resource utilization, Amazon EC2 introduced the spot instance mechanism to allow customers to bid for unused Amazon EC2 capacity . Each Amazon EC2 availability zone has a fixed number of virtual machine types and a fixed number of instances of each VM type, and Amazon EC2 runs one spot market per VM type in each availability zone. A customer submits a request specifying the VM type, the number of instances, the desired region and the bidding price per instance-hour. The provider assigns resources to bidders in decreasing order of their bids until all available resources have been allocated or all resource requests have been satisfied. The selling price (i.e. the spot price) is equal to the lowest winning bid. The actual VM-to-physical-host assignment is done with the Round Robin algorithm , which is described in detail in Section 4.1.4.
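The uniform-price rule described above can be sketched as follows. This is an illustrative simplification, not the actual EC2 implementation: the `(price, count)` bid format, the possibility of partial fills and the tie-breaking order are all assumptions.

```python
def run_spot_market(bids, capacity):
    """Uniform-price auction sketch.

    bids: list of (bid_price, num_instances) requests for one VM type;
    capacity: total spare instances in the availability zone.
    Winners are served in decreasing order of bid price; the spot price
    is the lowest winning bid. The last winner may be partially filled.
    """
    winners, spot_price = [], None
    remaining = capacity
    for price, count in sorted(bids, key=lambda b: -b[0]):
        if remaining == 0:
            break
        granted = min(count, remaining)   # partial fill for the last winner
        winners.append((price, granted))
        remaining -= granted
        spot_price = price                # lowest bid that has won so far
    return winners, spot_price
```

Every winner pays the same spot price per instance-hour, regardless of their own (higher) bid.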
Unlike Amazon EC2, the authors in  developed a model that allows users to bid for bundles of items, called a combinatorial auction. Determining the set of winning users so as to maximize the income in this model is an NP-hard problem . To solve it, the authors proposed the CA-GREEDY algorithm. It collects the users' VM bundles and bid prices, computes the weighted sum of VMs in each bid, calculates the bid density, sorts the bids by density and then allocates from the highest to the lowest until resources are exhausted.
The above works all assume a fixed number of virtual machine types and a fixed number of instances of each VM type. In , the authors proposed a model called Dynamic VM Provisioning and Allocation (DVMPA). It allows users to bid for a bundle of VMs of different types and can configure the available computing resources into different numbers and types of VM instances. The authors proposed the CA-PROVISION algorithm to solve the problem. It first collects the users' bundles of VMs of different types and their bid prices, computes the weighted sum of VMs in each bid, calculates the bid density, sorts the bids by density and then allocates from the highest to the lowest until resources are exhausted. Once the winners are determined, the mechanism determines the VM configuration that fulfills the winning users' requests.
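Both CA-GREEDY and CA-PROVISION rank bids by density and allocate greedily. The sketch below illustrates this density-based winner determination under simplifying assumptions (a single aggregate capacity, all-or-nothing bundles, and an illustrative per-type weight vector); it is not the exact algorithm from either paper.

```python
def greedy_by_density(bids, weights, capacity):
    """Density-based greedy winner determination sketch.

    bids: list of (price, bundle) where bundle[i] is the number of VMs of
    type i requested; weights[i] is the relative size of VM type i;
    capacity: total weighted resource available.
    """
    def size(bundle):
        # weighted sum of VMs in the bid
        return sum(w * n for w, n in zip(weights, bundle))

    # bid density = price offered per unit of weighted resource requested
    ranked = sorted(bids, key=lambda b: b[0] / size(b[1]), reverse=True)
    winners, used = [], 0.0
    for price, bundle in ranked:
        s = size(bundle)
        if used + s <= capacity:   # bundles are granted all-or-nothing
            winners.append((price, bundle))
            used += s
    return winners
```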
From the description above, we can see the increasing complexity of the workload and resource assumptions along the line of description. The algorithms in  and  assume single-VM and bundled-VM bids, respectively, with a fixed amount of available resources. The algorithm in  assumes bundled-VM bids and a configurable pool of resources. As the input description grows more complex, the algorithm becomes more complex and more efficient.
Social welfare maximization algorithms for clouds based on game theory
In the work of , the authors defined a new class of games called Cloud Resource Allocation Games (CRAGs). CRAGs solve the resource allocation problem in clouds using a game-theoretic mechanism. At the start of each round, all the clients submit their jobs for that round to the cloud. The provider first calculates its resource allocation vector for jobs already running on the cloud, then advertises the amount of resources left at each machine to the clients. Each client chooses the machines that satisfy its resource needs and minimize its total cost. Next, the clients actively change their resource allocations as long as they can decrease their costs. They can follow any of the standard update mechanisms in the literature  to determine how they change their strategies, but all clients must follow the same update mechanism. When no client can decrease its cost by changing its strategy alone, the system has reached a stable equilibrium.
Also applying game theory to resource allocation, but instead of requiring users to take part in many allocation rounds, the work in  proposed a practical approximate solution with the following two steps. First, each participant solves its optimization problem independently; a Binary Integer Programming method is proposed for this independent optimization. Second, an evolutionary mechanism is designed to adjust the multiplexed strategies of all initial optimal solutions with the goal of minimizing their efficiency losses. The algorithms in the evolutionary mechanism take both optimization and fairness into account.
From the description above, we can see that the algorithm in  is not as convenient as the algorithm in . The requirement that users participate throughout the scheduling process makes the algorithm in  impractical; it should be used only as a starting point for further refinement. The algorithm in  fixes this drawback, but its execution time could be long because of the evolutionary mechanism.
Load balancing algorithms for clouds supporting resource on demand
Several commercial and open source cloud systems use simple load balancing algorithms such as round robin [2,4,23,24], random , least connections  and weighted selection [23,46], as described in .
Round robin is among the most widely used allocation algorithms for initial VM placement. The servers are kept in a circular list, with a pointer to the server that received the previous request. The system sends a new resource allocation request to the next node with enough physical resources to handle it.
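A minimal sketch of this behavior, assuming a single scalar capacity value per server, could look like:

```python
class RoundRobinPlacer:
    """Round-robin VM placement with a capacity check (sketch)."""

    def __init__(self, capacities):
        self.free = list(capacities)   # free capacity of each server
        self.last = -1                 # index of the last server used

    def place(self, demand):
        """Return the index of the next server that can host `demand`,
        continuing from where the previous request left off."""
        n = len(self.free)
        for step in range(1, n + 1):
            idx = (self.last + step) % n   # walk the circular list
            if self.free[idx] >= demand:
                self.free[idx] -= demand
                self.last = idx
                return idx
        return None  # no server can accommodate the request
```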
In random load balancing, requests are routed to servers at random. Random load balancing is recommended only for homogeneous cluster deployments, where each server instance runs on a similarly configured machine. It distributes requests evenly across server instances as the cumulative number of requests increases; over a small number of requests, the load may not be balanced exactly evenly.
The Least Connections method is relatively simple: the system passes a new connection to the node that currently has the fewest connections. Least Connections works best in environments where the servers and other equipment have similar capabilities. It is a dynamic load balancing method, distributing connections based on aspects of server performance such as the current number of connections per node or the fastest node response time.
In many enterprise environments, server nodes with unequal processing power and performance characteristics host services. It is often necessary to distribute the load based on individual server capabilities so that some servers are not unfairly burdened with requests.
Weighted Round Robin is an advanced version of round robin that eliminates the deficiencies of the plain algorithm. In weighted round robin, one can assign a weight to each server in the group: if one server can handle twice as much load as another, the more powerful server gets a weight of 2, and the scheduler assigns two requests to it for each request assigned to the weaker one.
Similar to Weighted Round Robin, the weighted random load balancing policy lets the system specify a processing load distribution ratio for each server with respect to the others. In addition to the weight, endpoint selection is further refined using a random distribution based on weight: a server with a higher weight has a higher probability of receiving the request.
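Weighted random selection can be sketched as follows; the `(name, weight)` server format and the injectable random source are assumptions made for illustration and testability.

```python
import random

def weighted_random_pick(servers, rng=random.random):
    """Pick a server with probability proportional to its weight.

    servers: list of (name, weight) pairs; rng: callable returning a
    float in [0, 1), injectable so the choice can be made deterministic.
    """
    total = sum(w for _, w in servers)
    point = rng() * total           # a point on the cumulative weight line
    for name, weight in servers:
        point -= weight
        if point < 0:               # the point fell inside this server's slice
            return name
    return servers[-1][0]           # guard against floating-point edge cases
```

A server with weight 2 owns a slice of the cumulative line twice as long as a server with weight 1, so it receives roughly twice as many requests.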
The work in  presented a scheduling strategy for load balancing VM resources based on a genetic algorithm. According to historical data and the current state of the system, the genetic algorithm computes in advance the influence that deploying the needed VM resources will have on the system. The strategy then chooses the least disruptive solution, thereby achieving the best load balance and reducing or avoiding dynamic migration. This solves the problem of load imbalance and high migration cost that traditional algorithms cause after scheduling.
The work in  discussed resource allocation methods for large-scale cloud systems. The first method is Biased Random Sampling. In a large-scale cloud system, each computing node has a set of neighbor nodes. When a request arrives, the node randomly chooses a neighbor, biased towards lightly loaded nodes. The walk continues in this way for k steps, and the most lightly loaded node encountered is selected for allocation.
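A deliberately simplified sketch of such a load-biased walk is shown below. For clarity, the random bias of the original scheme is replaced with a deterministic move to the least-loaded neighbor; the graph and load representations are assumptions.

```python
def biased_walk(start, neighbors, load, k):
    """Walk k steps through the neighbor graph, moving to the least-loaded
    neighbor at each step, and return the lightest node visited.

    neighbors: dict node -> list of neighbor nodes;
    load: dict node -> current load value.
    """
    current, best = start, start
    for _ in range(k):
        nbrs = neighbors.get(current, [])
        if not nbrs:
            break                                  # dead end: stop the walk
        current = min(nbrs, key=lambda n: load[n])  # deterministic "bias"
        if load[current] < load[best]:
            best = current                          # remember lightest node seen
    return best
```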
The other method discussed in  is Active Clustering, which works on the principle of grouping similar nodes together and operating on these groups. Each computing node has a set of neighbor nodes. When a request arrives, the initial node selects from its neighbors a so-called matchmaker node, which must be of a different type than the initial node. The matchmaker node then forms connections between those of its neighbors that are of the same type as the initial node, and finally detaches the connection between itself and the initial node. The walk continues in this way for k steps, and the most lightly loaded node is selected for allocation.
In , the authors presented a load balancing method for a three-level cloud computing network: the service nodes, the service manager and the request manager. The proposed two-phase scheduling algorithm integrates OLB (Opportunistic Load Balancing) and LBMM (Load Balance Min-Min) to assist in selecting effective service nodes. First, it finds all nodes whose available resources exceed a threshold and randomly distributes sub-tasks to those nodes (the OLB phase). Then, given the set of sub-tasks and the set of available nodes belonging to a service manager, it applies LBMM: it sorts the available nodes by minimum execution time and the sub-tasks by minimum completion time, assigns the minimum-completion-time sub-task to the minimum-execution-time node, removes the assigned sub-task from the list, moves the assigned node to the end of the node list and repeats the assignment process until all sub-tasks are assigned.
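The LBMM assignment loop described above can be sketched as follows, assuming precomputed per-task completion-time and per-node execution-time estimates (the dict-based inputs are an assumption for illustration):

```python
from collections import deque

def lbmm_assign(subtasks, nodes):
    """LBMM-style assignment sketch.

    subtasks: dict task -> estimated completion time;
    nodes: dict node -> estimated execution time.
    Tasks are taken shortest-completion-time first; each is assigned to the
    node at the head of the ring (fastest first), which is then rotated to
    the back of the node list.
    """
    tasks = sorted(subtasks, key=subtasks.get)    # min completion time first
    ring = deque(sorted(nodes, key=nodes.get))    # min execution time first
    assignment = {}
    for task in tasks:
        node = ring.popleft()
        assignment[task] = node
        ring.append(node)     # move the assigned node to the end of the list
    return assignment
```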
The work in  proposed a power-aware load balancing algorithm, PALB. The algorithm first gathers the utilization percentage of each active compute node. If all active compute nodes are above 75% utilization, PALB instantiates the new virtual machine on the compute node with the lowest utilization; otherwise, the new virtual machine (VM) is booted on the compute node with the highest utilization that can accommodate its size. If all currently active compute nodes have utilization over 75%, PALB also sends a power-on command to boot additional compute nodes (as long as more compute nodes are available). Conversely, if a compute node is using less than 25% of its resources, PALB sends a shutdown command to that node.
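The balancing decision of PALB might be sketched like this. The thresholds, the utilization bookkeeping and the generated node name are simplifying assumptions, and the shutdown rule for under-25% nodes is omitted:

```python
def palb_place(nodes, vm_size, node_capacity=100, high=0.75, idle_pool=0):
    """Sketch of the PALB placement decision.

    nodes: dict node -> utilization fraction (0..1), mutated in place;
    vm_size: VM size in the same units as node_capacity;
    idle_pool: number of powered-off nodes that could be booted.
    """
    if all(u > high for u in nodes.values()):
        if idle_pool > 0:
            name = "node%d" % (len(nodes) + 1)  # hypothetical generated name
            nodes[name] = 0.0                   # power on an additional node
        target = min(nodes, key=nodes.get)      # least-utilized node
    else:
        # most-utilized node that can still accommodate the VM
        fitting = {n: u for n, u in nodes.items()
                   if (1.0 - u) * node_capacity >= vm_size}
        if not fitting:
            return None
        target = max(fitting, key=fitting.get)
    nodes[target] += vm_size / node_capacity
    return target
```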
From the description above, pure load balancing algorithms such as round robin, random and least connections are suitable only for homogeneous cloud systems, as they treat every computing node equally. In a heterogeneous system, those algorithms may create an unbalanced state, and the weighted round robin or weighted random algorithms are more suitable. The algorithms in [27,28] are mainly applied to large-scale cloud systems. Unlike the other algorithms, the algorithm in  considers not only load balance but also energy saving by turning off servers.
Energy-efficient algorithms for clouds supporting resource on demand
In , a simple energy-aware policy incorporating an allocation scheme for virtual servers is proposed to achieve the aim of green computing. The allocation schemes are popular strategies such as round robin, first fit, etc. The policy automatically puts a physical host into a low-energy state when no virtual servers are allocated on it, and returns it to full functionality when virtual servers are assigned.
The open source IaaS cloud package Eucalyptus has integrated a Power Save policy as a scheduling option [31,32]. The core of the Power Save policy is the First-Fit heuristic: it attempts to deploy a virtual machine on the first machine in the physical machine list that can accommodate it. If no such physical machine is found, a new physical machine is booted to host the virtual machine.
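A sketch of this First-Fit placement with machine boot-up, assuming a single scalar capacity per machine, could be (this is an illustration of the heuristic, not Eucalyptus code):

```python
NEW_MACHINE_CAPACITY = 16  # assumed capacity of a freshly booted machine

def first_fit_power_save(active, vm_size, spare_pool):
    """First-Fit placement with power save (sketch).

    active: list of free-capacity values for powered-on machines (mutated);
    spare_pool: number of powered-off machines available to boot.
    Returns the index of the machine hosting the VM, or None.
    """
    for i, free in enumerate(active):
        if free >= vm_size:                 # first machine that fits wins
            active[i] -= vm_size
            return i
    if spare_pool > 0:                      # boot an additional machine
        active.append(NEW_MACHINE_CAPACITY - vm_size)
        return len(active) - 1
    return None
```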
In the work of , the authors tried to save energy by optimizing computing resource usage. The main idea is to turn on just enough servers to meet the allocation demand and turn the others off. The common techniques for realizing this idea are determining the demand and determining a suitable set of available servers to meet that demand for the next period. This strategy is very similar to the idea of [34,35] applied to generic data centres. The works in [64,65,66,67] share the same idea but do not predict the demand; instead, the authors build cloud-managing models to determine the optimal number of servers to be turned on. This optimal value must balance energy consumption against other criteria such as high performance , customer impatience  and blocking probability [64,67].
Both works in [36,37] used the same energy-aware Best Fit strategy to allocate resources to a newly arriving VM. First, a power consumption model for the data centre is defined. The algorithm then walks through all servers to determine which have enough resources to host the VM. For each feasible server, it estimates the future power consumption if the VM were deployed on that machine, and the feasible server with the smallest future power consumption is selected.
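This energy-aware Best Fit can be sketched as follows. The host representation and the linear power model used in the example are assumptions; the papers define their own data-centre power models:

```python
def energy_aware_best_fit(hosts, vm_size, power_model):
    """Energy-aware Best Fit sketch.

    hosts: dict name -> (free_capacity, current_load);
    power_model: callable mapping a host's load to estimated power draw.
    Returns the feasible host with the smallest estimated future power.
    """
    best, best_power = None, float("inf")
    for name, (free, load) in hosts.items():
        if free < vm_size:
            continue                           # infeasible: skip this host
        future = power_model(load + vm_size)   # power if the VM lands here
        if future < best_power:
            best, best_power = name, future
    return best
```

With a linear model such as `lambda load: 100 + 2 * load`, the heuristic simply prefers the least-loaded feasible host; a non-linear model can change that preference.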
In , the authors proposed an algorithm for energy-aware VM consolidation. Through experiments, the authors found that there exists an optimal combination of CPU and disk utilization at which the energy per transaction is minimal. As each request arrives, it is allocated to a server so as to approach the desired workload distribution across servers. The heuristic used maximizes the sum of the Euclidean distances of the current allocations to the optimal point at each server. If a request cannot be allocated, a new server is turned on and all requests are re-allocated using the same heuristic, in an arbitrary order.
From the description above, we can see the differences in the energy-saving mechanisms of the described algorithms. The work in  saves energy by setting a server to a lower power consumption state when it hosts no VMs. The works in [31,32,33,65,66,67,68] try to use just enough servers to host the load and turn the remaining free machines off. The works in [36,37] go further by allocating load to the servers with the least power consumption. Unlike the other algorithms, which deal with VMs, the work in  goes deeper, distributing transactions among VMs.
Number of active hosts minimization algorithms for clouds supporting resource on demand
The placement algorithm for VMs in a data centre allocates various resources such as memory, bandwidth and processing power from physical machines (PMs) to VMs with the goal of minimizing the number of PMs used. This problem can be viewed as a multi-dimensional packing problem, and in  the authors discussed several heuristics to solve it. Each host is represented by its vector of capacities H = (h1, h2, ..., hd) and each VM by its vector of demands V = (v1, v2, ..., vd).
A popular heuristic for the bin packing problem is First Fit Decreasing (FFD). This heuristic orders the bins and the objects in decreasing order of size. Starting with the first bin, it iterates over the objects, placing objects into the first bin until no more fit; it then considers the first bin filled and proceeds to the second bin with the same procedure.
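The scalar FFD procedure can be sketched as follows; the vector variants that follow differ only in how the size of a bin or object is computed:

```python
def first_fit_decreasing(items, bin_capacity):
    """Scalar First Fit Decreasing sketch.

    items: list of object sizes; bin_capacity: capacity of each bin.
    Returns the number of bins opened.
    """
    bins = []  # residual capacity of each open bin
    for item in sorted(items, reverse=True):   # largest objects first
        for i, free in enumerate(bins):
            if free >= item:                   # first bin that fits
                bins[i] -= item
                break
        else:
            bins.append(bin_capacity - item)   # no bin fits: open a new one
    return len(bins)
```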
The important task for applying the FFD heuristic is determining the size of the bin and the size of the object. For cloud computing, the authors presented several ways to handle this task and thus, created several variations of the FFD heuristic.
The FFDProd heuristic sorts the servers and VMs according to the product of their dimensions:

Volume(V) = v1 · v2 · ... · vd (3)

Volume(H) = h1 · h2 · ... · hd (4)
The FFDSum heuristic sorts the servers and VMs according to a weighted sum of their dimensions, where wi denotes the weight assigned to dimension i:

Volume(V) = w1v1 + w2v2 + ... + wdvd (5)

Volume(H) = w1h1 + w2h2 + ... + wdhd (6)
Dot-Product heuristic - at time t, let H(t) denote the vector of remaining (residual) capacities of the currently open host, i.e. the host's capacity minus the total demand of all VMs currently assigned to it. The heuristic places the VM that maximizes the dot product with the vector of remaining capacities without violating the capacity constraint.
Norm-based Greedy heuristic - using the l2 norm as the distance metric, from all unassigned VMs it places the VM V that minimizes the quantity ||H(t) - V||2, provided the assignment does not violate the capacity constraints.
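Both vector heuristics can be sketched as follows, where `residual` plays the role of H(t) and each VM is a tuple of per-dimension demands:

```python
def _fits(vm, residual):
    """Capacity constraint: the VM fits in every dimension."""
    return all(v <= h for v, h in zip(vm, residual))

def dot_product_pick(residual, vms):
    """Dot-Product heuristic sketch: among feasible VMs, pick the one
    maximizing the dot product of demand and residual capacity."""
    feasible = [vm for vm in vms if _fits(vm, residual)]
    if not feasible:
        return None
    return max(feasible, key=lambda vm: sum(v * h for v, h in zip(vm, residual)))

def l2_pick(residual, vms):
    """Norm-based Greedy sketch (l2 norm): among feasible VMs, pick the one
    minimizing the distance between demand and residual capacity."""
    feasible = [vm for vm in vms if _fits(vm, residual)]
    if not feasible:
        return None
    return min(feasible, key=lambda vm: sum((h - v) ** 2 for v, h in zip(vm, residual)))
```

Note that the two heuristics can disagree: the dot product favors large demands aligned with the residual vector, while the l2 distance favors demands that leave the least slack.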
Rather than using heuristics, the work in  modeled the problem as a quadratic programming problem solved with  and as an integer linear programming problem solved with a standard LP solver .
In , the author described the Global Decision Module (GDM), which is responsible for two main tasks: determining the VM allocation vector Ni for each application ai (VM provisioning), and placing these VMs on PMs so as to minimize the number of active PMs (VM packing). These two phases are expressed as two Constraint Satisfaction Problems (CSPs) handled by a constraint solver.
From the above description, we can see that there are two classes of algorithms. The algorithms described in  use heuristics and are quite fast. The algorithms in [40,43] use global optimization techniques such as quadratic programming or linear programming, so they may have long execution times.
Miscellaneous-objective algorithms for clouds supporting resource on demand
The work in  proposed a mechanism called CLUSTER-AND-CUT to improve the scalability of data centre networks. It first partitions the VMs into VM-clusters and the slots into slot-clusters. VM-clusters are obtained via a classical min-cut graph algorithm , while slot-clusters can be obtained manually by the data centre operators. The algorithm then maps each VM-cluster to a slot-cluster and, for each VM-cluster and its associated slot-cluster, calls cluster-and-cut on the smaller problem.
In , the authors recognized the serious risk of host server failures, which cause unexpected outages of all hosted virtual machines and applications. To protect high-availability applications from unpredictable host server failures, redundant configurations using virtual machines can be an effective countermeasure. The proposed method estimates the requisite minimum number of VMs according to the performance requirements of the application services. It then decides an optimal VM placement so that the minimum configuration survives any k host server failures.
In , the authors presented a custom optimization mechanism that allows the operator to select one of the following goals: reserve a single PM for a specific VM; minimize traffic; spread VMs across separate PMs; reduce the number of PMs used; or offload a specific PM. The mechanism is a two-phase optimization process. In the first phase, it selects a subset of PMs, called a cohort, with properties that best serve the VM placement; resources are divided into levels according to migration ability, and the first-phase algorithm gradually explores all cohort levels in search of a promising "neighborhood". In the second phase, the mechanism solves a constraint satisfaction problem that yields a near-optimal VM-to-PM mapping.
From the description above, we can see the differences in the goals of those algorithms. While the work in  focuses on the scalability of the cloud system, the algorithm in  deals with failures by allocating redundant resources to the load, and the work in  allows custom optimization.