
GoCJ: Google Cloud Jobs Dataset for Distributed and Cloud Computing Infrastructures

Author

Listed:
  • Altaf Hussain

    (Department of Computer Science, Faculty of Computing, Capital University of Science and Technology, Islamabad 44000, Pakistan)

  • Muhammad Aleem

    (Department of Computer Science, Faculty of Computing, Capital University of Science and Technology, Islamabad 44000, Pakistan)

Abstract

Developers of resource-allocation and scheduling algorithms share test datasets (i.e., benchmarks) so that others can compare the performance of newly developed algorithms. However, real cloud datasets are usually hard to acquire because of users' data-confidentiality concerns and the policies maintained by Cloud Service Providers (CSPs). As a result, access to large-scale test datasets that depict the realistic high-performance computing requirements of cloud users is very limited. A publicly available real cloud dataset would therefore encourage other researchers to compare and benchmark their applications using an open-source benchmark. To meet these objectives, the contemporary state of the art has been examined to explore real workload behavior in Google cluster traces. For small- to moderate-size cloud computing infrastructures, the dataset generation process is demonstrated using the Monte Carlo simulation method to produce a Google Cloud Jobs (GoCJ) dataset based on the analysis of Google cluster traces. With this article, the dataset is made publicly available so that other researchers in the field can investigate and benchmark their scheduling and resource-allocation schemes for the cloud. The GoCJ dataset is archived and available on the Mendeley Data repository.

Suggested Citation

  • Altaf Hussain & Muhammad Aleem, 2018. "GoCJ: Google Cloud Jobs Dataset for Distributed and Cloud Computing Infrastructures," Data, MDPI, vol. 3(4), pages 1-12, September.
  • Handle: RePEc:gam:jdataj:v:3:y:2018:i:4:p:38-:d:172608

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/3/4/38/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/3/4/38/
    Download Restriction: no

    References listed on IDEAS

    1. Stephen Makonin & Z. Jane Wang & Chris Tumpach, 2018. "RAE: The Rainforest Automation Energy Dataset for Smart Grid Meter Data Analysis," Data, MDPI, vol. 3(1), pages 1-9, February.

    Citations


    Cited by:

    1. Arshad, Umer & Aleem, Muhammad & Srivastava, Gautam & Lin, Jerry Chun-Wei, 2022. "Utilizing power consumption and SLA violations using dynamic VM consolidation in cloud data centers," Renewable and Sustainable Energy Reviews, Elsevier, vol. 167(C).
    2. Meennapa Rukhiran & Arpaporn Phokajang & Paniti Netinant, 2022. "Development of Mobile Learning English Web Application: Adoption of Technology in the Digital Teaching and Learning Framework," International Journal of Information Technology and Web Engineering (IJITWE), IGI Global, vol. 17(1), pages 1-25, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lucas Pereira, 2019. "NILMPEds: A Performance Evaluation Dataset for Event Detection Algorithms in Non-Intrusive Load Monitoring," Data, MDPI, vol. 4(3), pages 1-9, August.
