Title: On the cluster admission problem for cloud computing
Abstract: Cloud computing providers must handle heterogeneous customer workloads for resources such as (virtual) CPU or GPU cores. This is particularly challenging if customers, who are already running a job on a cluster, scale their resource usage up and down over time. The provider therefore has to continuously decide whether she can add additional workloads to a given cluster or if doing so would impact existing workloads’ ability to scale. Currently, this is often done using simple threshold policies to reserve large parts of each cluster, which leads to low average utilization of the cluster. In this paper, we propose more sophisticated policies for controlling admission to a cluster and demonstrate that they significantly increase cluster utilization. We first introduce the cluster admission problem and formalize it as a constrained Partially Observable Markov Decision Problem (POMDP). As it is infeasible to solve the POMDP optimally, we then systematically design heuristic admission policies that estimate moments of each workload’s distribution of future resource usage. Via simulations we show that our admission policies lead to a substantial improvement over the simple threshold policy. We then evaluate how much further this can be improved with learned or elicited prior information and how to incentivize users to provide this information.