
Optimizing software rejuvenation policy for tasks with periodic inspections and time limitation

Authors

  • Levitin, Gregory
  • Xing, Liudong
  • Xiang, Yanping

Abstract

Software aging has been observed in diverse types of software systems. It causes gradual performance degradation with time and/or load, and eventually system failure. To mitigate aging effects and prevent the serious losses caused by system failure, software rejuvenation can be performed proactively to restore system performance. This paper models and optimizes a state-based rejuvenation policy for software systems that perform real-time computing tasks and undergo periodic inspections. At each scheduled inspection, the system state is evaluated, and the decision whether to rejuvenate is made from the evaluated state and a rejuvenation decision function. The duration of each rejuvenation procedure (which corresponds to system downtime) depends on the system state and on the amount of task operations accomplished before the decision to rejuvenate. Because the rejuvenation policy determines the timing and number of rejuvenations performed during task processing, it can significantly affect the probability that the system accomplishes the real-time task by a given deadline. In this work, we optimize the state-based rejuvenation policy to maximize the probability of task completion (PTC) of periodically inspected software systems. The methodology comprises an event transition-based iterative method for quantifying the PTC and a genetic algorithm for deriving the optimal rejuvenation policy. Examples demonstrate the proposed methodology and the influence of several parameters (e.g., inspection interval, rejuvenation time) on the optimization results.
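
The setting described in the abstract can be illustrated with a toy Monte Carlo sketch. This is not the paper's event transition-based iterative method, and the grid search below only stands in for the genetic algorithm. Every modeling choice is an assumption made for illustration: a scalar age in [0, 1] as the system state, linear speed degradation, uniform aging increments, a threshold rule as the decision function, the downtime formula, and all numeric constants and function names.

```python
import random

def run_once(threshold, rng, work=100.0, deadline=150.0, tau=10.0):
    """Simulate one task run; return True if it finishes by the deadline."""
    # Toy state (assumption): age 0.0 = fresh system, 1.0 = failed system.
    clock, done, age = 0.0, 0.0, 0.0
    while True:
        speed = 1.0 - age                  # assumed linear performance degradation
        if speed <= 0.0:
            return False                   # system aged to failure
        time_to_finish = (work - done) / speed
        if time_to_finish <= tau:          # task ends before the next inspection
            return clock + time_to_finish <= deadline
        clock += tau                       # process until the next inspection
        done += speed * tau
        if clock >= deadline:
            return False
        # Aging observed at the inspection (assumed uniform increment).
        age = min(1.0, age + rng.uniform(0.0, 0.2))
        if age >= threshold:               # assumed threshold decision function
            # Downtime grows with the state and with the work already done.
            clock += 2.0 + 5.0 * age + 0.03 * done
            age = 0.0                      # rejuvenation restores the fresh state

def estimate_ptc(threshold, runs=2000, seed=0, **kwargs):
    """Estimate the probability of task completion for one threshold."""
    rng = random.Random(seed)              # same seed gives comparable estimates
    return sum(run_once(threshold, rng, **kwargs) for _ in range(runs)) / runs

def best_threshold(candidates, **kwargs):
    """Grid search over thresholds (a simple stand-in for a genetic algorithm)."""
    return max(candidates, key=lambda th: estimate_ptc(th, **kwargs))
```

Calling `best_threshold([0.2, 0.4, 0.6, 0.8])` returns the candidate with the highest estimated PTC under these toy assumptions. Rejuvenating too eagerly spends the time budget on downtime, while rejuvenating too rarely lets degradation slow the task, which is the trade-off the abstract describes.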

Suggested Citation

  • Levitin, Gregory & Xing, Liudong & Xiang, Yanping, 2020. "Optimizing software rejuvenation policy for tasks with periodic inspections and time limitation," Reliability Engineering and System Safety, Elsevier, vol. 197(C).
  • Handle: RePEc:eee:reensy:v:197:y:2020:i:c:s0951832019309457
    DOI: 10.1016/j.ress.2019.106776

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832019309457
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Dohi, Tadashi & Zheng, Junjun & Okamura, Hiroyuki & Trivedi, Kishor S., 2018. "Optimal periodic software rejuvenation policies based on interval reliability criteria," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 463-475.
    2. Levitin, Gregory & Xing, Liudong & Huang, Hong-Zhong, 2019. "Optimization of partial software rejuvenation policy," Reliability Engineering and System Safety, Elsevier, vol. 188(C), pages 289-296.
    3. Levitin, Gregory & Xing, Liudong & Xiang, Yanping, 2020. "Cost minimization of real-time mission for software systems with rejuvenation," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    4. Machida, Fumio & Miyoshi, Naoto, 2017. "Analysis of an optimal stopping problem for software rejuvenation in a deteriorating job processing system," Reliability Engineering and System Safety, Elsevier, vol. 168(C), pages 128-135.
    5. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2018. "Heterogeneous 1-out-of-N warm standby systems with online checkpointing," Reliability Engineering and System Safety, Elsevier, vol. 169(C), pages 127-136.
    6. Levitin, Gregory & Xing, Liudong & Ben-Haim, Hanoch, 2018. "Optimizing software rejuvenation policy for real time tasks," Reliability Engineering and System Safety, Elsevier, vol. 176(C), pages 202-208.
    7. Levitin, Gregory & Xing, Liudong & Luo, Liang, 2019. "Joint optimal checkpointing and rejuvenation policy for real-time computing tasks," Reliability Engineering and System Safety, Elsevier, vol. 182(C), pages 63-72.
    8. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2014. "Cold vs. hot standby mission operation cost minimization for 1-out-of-N systems," European Journal of Operational Research, Elsevier, vol. 234(1), pages 155-162.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Levitin, Gregory & Xing, Liudong & Xiang, Yanping, 2020. "Cost minimization of real-time mission for software systems with rejuvenation," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    2. Levitin, Gregory & Xing, Liudong & Huang, Hong-Zhong, 2019. "Optimization of partial software rejuvenation policy," Reliability Engineering and System Safety, Elsevier, vol. 188(C), pages 289-296.
    3. Levitin, Gregory & Xing, Liudong & Luo, Liang, 2019. "Joint optimal checkpointing and rejuvenation policy for real-time computing tasks," Reliability Engineering and System Safety, Elsevier, vol. 182(C), pages 63-72.
    4. Junjun Zheng & Hiroyuki Okamura & Tadashi Dohi, 2021. "Availability Analysis of Software Systems with Rejuvenation and Checkpointing," Mathematics, MDPI, vol. 9(8), pages 1-15, April.
    5. Levitin, Gregory & Xing, Liudong & Ben-Haim, Hanoch, 2018. "Optimizing software rejuvenation policy for real time tasks," Reliability Engineering and System Safety, Elsevier, vol. 176(C), pages 202-208.
    6. Wen, Tao & Deng, Yong, 2020. "The vulnerability of communities in complex networks: An entropy approach," Reliability Engineering and System Safety, Elsevier, vol. 196(C).
    7. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2018. "Co-residence based data vulnerability vs. security in cloud computing system with random server assignment," European Journal of Operational Research, Elsevier, vol. 267(2), pages 676-686.
    8. Nan Zhang & Sen Tian & Le Li & Zhongbin Wang & Jun Zhang, 2023. "Maintenance analysis of a partial observable K-out-of-N system with load sharing units," Journal of Risk and Reliability, vol. 237(4), pages 703-713, August.
    9. Amirhossain Chambari & Javad Sadeghi & Fakhri Bakhtiari & Reza Jahangard, 2016. "A note on a reliability redundancy allocation problem using a tuned parameter genetic algorithm," OPSEARCH, Springer;Operational Research Society of India, vol. 53(2), pages 426-442, June.
    10. Khastgir, Siddartha & Brewerton, Simon & Thomas, John & Jennings, Paul, 2021. "Systems Approach to Creating Test Scenarios for Automated Driving Systems," Reliability Engineering and System Safety, Elsevier, vol. 215(C).
    11. Chatwattanasiri, Nida & Coit, David W. & Wattanapongsakorn, Naruemon, 2016. "System redundancy optimization with uncertain stress-based component reliability: Minimization of regret," Reliability Engineering and System Safety, Elsevier, vol. 154(C), pages 73-83.
    12. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2022. "Optimal sequencing of elements activation in 1-out-of-n warm standby system with storage," Reliability Engineering and System Safety, Elsevier, vol. 221(C).
    13. Levitin, Gregory & Finkelstein, Maxim & Dai, Yuanshun, 2018. "Optimizing availability of heterogeneous standby systems exposed to shocks," Reliability Engineering and System Safety, Elsevier, vol. 170(C), pages 137-145.
    14. Wu, Shaomin & Do, Phuc, 2017. "Editorial," Reliability Engineering and System Safety, Elsevier, vol. 168(C), pages 1-3.
    15. Kim, Heungseob, 2018. "Maximization of system reliability with the consideration of component sequencing," Reliability Engineering and System Safety, Elsevier, vol. 170(C), pages 64-72.
    16. Wu, Hui & Li, Yan-Fu & Bérenguer, Christophe, 2020. "Optimal inspection and maintenance for a repairable k-out-of-n: G warm standby system," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    17. Levitin, Gregory & Xing, Liudong & Haim, Hanoch Ben & Dai, Yuanshun, 2019. "Optimal structure of series system with 1-out-of-n warm standby subsystems performing operation and rescue functions," Reliability Engineering and System Safety, Elsevier, vol. 188(C), pages 523-531.
    18. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2022. "Heterogeneous 1-out-of-n standby systems with limited unit operation time," Reliability Engineering and System Safety, Elsevier, vol. 224(C).
    19. Kim, Heungseob & Kim, Pansoo, 2017. "Reliability models for a nonrepairable system with heterogeneous components having a phase-type time-to-failure distribution," Reliability Engineering and System Safety, Elsevier, vol. 159(C), pages 37-46.
    20. Chen, Ying & Wang, Ze & Li, YingYi & Kang, Rui & Mosleh, Ali, 2018. "Reliability analysis of a cold-standby system considering the development stages and accumulations of failure mechanisms," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 1-12.
