
Script of Scripts: A pragmatic workflow system for daily computational research

Author

Listed:
  • Gao Wang
  • Bo Peng

Abstract

Computationally intensive disciplines such as computational biology often require use of a variety of tools implemented in different scripting languages and analysis of large data sets using high-performance computing systems. Although scientific workflow systems can powerfully organize and execute large-scale data-analysis processes, creating and maintaining such workflows usually comes with nontrivial learning curves and engineering overhead, making them cumbersome to use for everyday data exploration and prototyping. To bridge the gap between interactive analysis and workflow systems, we developed Script of Scripts (SoS), an interactive data-analysis platform and workflow system with a strong emphasis on readability, practicality, and reproducibility in daily computational research. For exploratory analysis, SoS has a multilanguage scripting format that centralizes otherwise-scattered scripts and creates dynamic reports for publication and sharing. As a workflow engine, SoS provides an intuitive syntax for creating workflows in process-oriented, outcome-oriented, and mixed styles, as well as a unified interface for executing and managing tasks on a variety of computing platforms with automatic synchronization of files among isolated file systems. As illustrated herein by real-world examples, SoS is both an interactive analysis tool and pipeline platform suitable for different stages of method development and data-analysis projects. In particular, SoS can be easily adopted in existing data analysis routines to substantially improve organization, readability, and cross-platform computation management of research projects.

Suggested Citation

  • Gao Wang & Bo Peng, 2019. "Script of Scripts: A pragmatic workflow system for daily computational research," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-14, February.
  • Handle: RePEc:plo:pcbi00:1006843
    DOI: 10.1371/journal.pcbi.1006843

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006843
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006843&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cameron Mura & Mike Chalupa & Abigail M Newbury & Jack Chalupa & Philip E Bourne, 2020. "Ten simple rules for starting research in your late teens," PLOS Computational Biology, Public Library of Science, vol. 16(11), pages 1-11, November.
    2. Richard A Erickson & Michael N Fienen & S Grace McCalla & Emily L Weiser & Melvin L Bower & Jonathan M Knudson & Greg Thain, 2018. "Wrangling distributed computing for high-throughput environmental science: An introduction to HTCondor," PLOS Computational Biology, Public Library of Science, vol. 14(10), pages 1-8, October.
    3. Anthony C Fletcher & Cameron Mura, 2019. "Ten quick tips for using a Raspberry Pi," PLOS Computational Biology, Public Library of Science, vol. 15(5), pages 1-11, May.
