Author
Abstract
One dominant approach for reducing response times in large-scale systems is Join-the-Shortest-Queue-d: whenever a job arrives, the dispatcher queries d servers at random and then assigns the job to the queried server with the shortest queue. While $$\textsf{JSQ}$$ JSQ -d is known to perform quite well in systems where all servers run at the same speed, this is not the case in systems that exhibit heterogeneity with respect to server speeds. Unfortunately, there is no straightforward way to extend $$\textsf{JSQ}$$ JSQ -d (or other so-called power-of-d policies) to heterogeneous systems. Should a job be assigned to the queried server with the shortest queue even if much faster servers were among those queried? Should a job be assigned to the queried server where it is expected to complete the soonest even if there is an idle, albeit slower, server available among those queried? And for that matter, should we query faster servers more often than their slower counterparts? Recent work has introduced a framework for developing strong dispatching policies by pairing suitably chosen querying and assignment rules. Within this framework, prior work has focused on finding strong-performing dispatching policies that use only the idle/busy statuses of the queried servers, rather than detailed queue length information. In this paper, we overcome the challenge of evaluating the performance of—and finding strong-performing—general scalable dispatching policies that make use of both server speed and detailed queue length information, through a combination of mean field analysis and a sequence of modified optimization problems. We find that well-designed length-aware dispatching policies can significantly outperform their idleness-based counterparts in large-scale heterogeneous systems. While the best policies of this kind are often complicated to describe, we find that in the vast majority of cases the relatively simple Shortest Expected Wait policy performs nearly as well, without the need for an especially sophisticated and time-consuming optimization procedure.
Suggested Citation
Jazeem Abdul Jaleel & Sherwin Doroudi & Kristen Gardner, 2024.
"Queue-length-aware dispatching in large-scale heterogeneous systems,"
Queueing Systems: Theory and Applications, Springer, vol. 108(1), pages 125-184, October.
Handle:
RePEc:spr:queues:v:108:y:2024:i:1:d:10.1007_s11134-024-09918-x
DOI: 10.1007/s11134-024-09918-x
Download full text from publisher
As the access to this document is restricted, you may want to search for a different version of it.
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:queues:v:108:y:2024:i:1:d:10.1007_s11134-024-09918-x. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.