
Cost minimization of real-time mission for software systems with rejuvenation

Authors
  • Levitin, Gregory
  • Xing, Liudong
  • Xiang, Yanping

Abstract

This paper considers the optimal rejuvenation policy problem for software systems performing real-time tasks. Because of software aging, system performance deteriorates over time and eventually leads to a system crash, which can be catastrophic for critical applications. Software rejuvenation is widely adopted to counteract aging and to prevent a crash or reduce its probability, but it incurs extra system overhead and downtime. We derive the optimal state-based rejuvenation policy that minimizes the total expected mission cost for aging software systems with multiple performance degradation states. The solution uses an event transition-based numerical method to assess the total expected mission cost of a real-time task, which comprises the penalty cost of mission failure, the operation cost of running the mission software, and the expected rejuvenation cost. The proposed cost evaluation model accommodates arbitrary state transition time distributions. It also allows simultaneous evaluation of other system performance metrics, including the probability of successful task completion (i.e., mission reliability) and the conditional expected mission completion time given a successful mission. Examples demonstrate the proposed methodology and optimization, and the effects of different parameters on the optimal rejuvenation solution are investigated.
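The cost structure described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's event transition-based numerical method: it swaps in plain Monte Carlo sampling, and every name and modelling choice in it is an assumption made for illustration (the `Mission` fields, completed work surviving a rejuvenation, a crash occurring when the last degradation state ends, and a single penalty for both a crash and a missed deadline). It estimates the expected mission cost, the mission reliability and the conditional expected completion time for a policy that rejuvenates on entering a given state, then picks the cheapest trigger state by enumeration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Mission:  # hypothetical parameter set, not taken from the paper
    work: float            # total work the task must complete
    deadline: float        # real-time limit on mission duration
    rates: List[float]     # processing rate in each degradation state
    sojourn: List[Callable[[random.Random], float]]  # sampler of time spent in each state
    c_op: float            # operation cost per unit of running time
    c_fail: float          # penalty for a crash or a missed deadline
    c_rej: float           # cost of one rejuvenation
    t_rej: float           # downtime of one rejuvenation

def simulate(m: Mission, trigger: int, rng: random.Random) -> Tuple[float, bool, float]:
    """Run one mission, rejuvenating on entering state `trigger` (1..K).
    trigger == K (the number of states) means never rejuvenate.
    Returns (cost, success flag, elapsed time)."""
    t = done = cost = 0.0
    state = 0
    while True:
        stay = m.sojourn[state](rng)
        to_finish = (m.work - done) / m.rates[state]
        to_deadline = m.deadline - t
        if to_finish <= min(stay, to_deadline):      # task completes in this state
            return cost + m.c_op * to_finish, True, t + to_finish
        if to_deadline <= stay:                      # deadline expires first
            return cost + m.c_op * to_deadline + m.c_fail, False, m.deadline
        t += stay
        done += m.rates[state] * stay
        cost += m.c_op * stay
        state += 1
        if state == len(m.rates):                    # assumed: crash after the last state
            return cost + m.c_fail, False, t
        if state >= trigger:                         # state-based rejuvenation
            t += m.t_rej
            cost += m.c_rej
            state = 0                                # assumed: completed work is preserved
            if t >= m.deadline:
                return cost + m.c_fail, False, m.deadline

def estimate(m: Mission, trigger: int, n: int = 10000,
             seed: int = 0) -> Tuple[float, float, Optional[float]]:
    """Monte Carlo estimates of (expected cost, mission reliability,
    expected completion time given success, or None if no run succeeded)."""
    rng = random.Random(seed)
    runs = [simulate(m, trigger, rng) for _ in range(n)]
    ok_times = [t for _, ok, t in runs if ok]
    mean_cost = sum(c for c, _, _ in runs) / n
    mean_time = sum(ok_times) / len(ok_times) if ok_times else None
    return mean_cost, len(ok_times) / n, mean_time

def best_trigger(m: Mission, n: int = 10000, seed: int = 0) -> int:
    """Enumerate trigger states and return the one with the lowest estimated cost."""
    return min(range(1, len(m.rates) + 1), key=lambda k: estimate(m, k, n, seed)[0])

if __name__ == "__main__":
    # Illustrative numbers only: three states with Weibull sojourn times.
    demo = Mission(
        work=100.0, deadline=80.0, rates=[2.0, 1.5, 1.0],
        sojourn=[lambda r: r.weibullvariate(40.0, 1.5)] * 3,
        c_op=1.0, c_fail=500.0, c_rej=10.0, t_rej=2.0,
    )
    k = best_trigger(demo)
    print(k, estimate(demo, k))
```

Because the sojourn samplers are arbitrary callables, any transition time distribution can be plugged in, which mirrors the generality claimed for the cost model. Sampling error makes the result approximate, so the trigger state chosen this way is only as reliable as the number of runs allows.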

Suggested Citation

  • Levitin, Gregory & Xing, Liudong & Xiang, Yanping, 2020. "Cost minimization of real-time mission for software systems with rejuvenation," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
  • Handle: RePEc:eee:reensy:v:193:y:2020:i:c:s0951832019303503
    DOI: 10.1016/j.ress.2019.106593

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832019303503
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Dohi, Tadashi & Zheng, Junjun & Okamura, Hiroyuki & Trivedi, Kishor S., 2018. "Optimal periodic software rejuvenation policies based on interval reliability criteria," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 463-475.
    2. Levitin, Gregory & Xing, Liudong & Huang, Hong-Zhong, 2019. "Optimization of partial software rejuvenation policy," Reliability Engineering and System Safety, Elsevier, vol. 188(C), pages 289-296.
    3. Machida, Fumio & Miyoshi, Naoto, 2017. "Analysis of an optimal stopping problem for software rejuvenation in a deteriorating job processing system," Reliability Engineering and System Safety, Elsevier, vol. 168(C), pages 128-135.
    4. Levitin, Gregory & Xing, Liudong & Ben-Haim, Hanoch, 2018. "Optimizing software rejuvenation policy for real time tasks," Reliability Engineering and System Safety, Elsevier, vol. 176(C), pages 202-208.
    5. Levitin, Gregory & Xing, Liudong & Luo, Liang, 2019. "Joint optimal checkpointing and rejuvenation policy for real-time computing tasks," Reliability Engineering and System Safety, Elsevier, vol. 182(C), pages 63-72.
    6. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2014. "Cold vs. hot standby mission operation cost minimization for 1-out-of-N systems," European Journal of Operational Research, Elsevier, vol. 234(1), pages 155-162.

    Citations


    Cited by:

    1. Khastgir, Siddartha & Brewerton, Simon & Thomas, John & Jennings, Paul, 2021. "Systems Approach to Creating Test Scenarios for Automated Driving Systems," Reliability Engineering and System Safety, Elsevier, vol. 215(C).
    2. Levitin, Gregory & Xing, Liudong & Xiang, Yanping, 2020. "Optimizing software rejuvenation policy for tasks with periodic inspections and time limitation," Reliability Engineering and System Safety, Elsevier, vol. 197(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Levitin, Gregory & Xing, Liudong & Xiang, Yanping, 2020. "Optimizing software rejuvenation policy for tasks with periodic inspections and time limitation," Reliability Engineering and System Safety, Elsevier, vol. 197(C).
    2. Levitin, Gregory & Xing, Liudong & Huang, Hong-Zhong, 2019. "Optimization of partial software rejuvenation policy," Reliability Engineering and System Safety, Elsevier, vol. 188(C), pages 289-296.
    3. Junjun Zheng & Hiroyuki Okamura & Tadashi Dohi, 2021. "Availability Analysis of Software Systems with Rejuvenation and Checkpointing," Mathematics, MDPI, vol. 9(8), pages 1-15, April.
    4. Wen, Tao & Deng, Yong, 2020. "The vulnerability of communities in complex networks: An entropy approach," Reliability Engineering and System Safety, Elsevier, vol. 196(C).
    5. Amirhossain Chambari & Javad Sadeghi & Fakhri Bakhtiari & Reza Jahangard, 2016. "A note on a reliability redundancy allocation problem using a tuned parameter genetic algorithm," OPSEARCH, Springer;Operational Research Society of India, vol. 53(2), pages 426-442, June.
    6. Levitin, Gregory & Finkelstein, Maxim & Dai, Yuanshun, 2018. "Optimizing availability of heterogeneous standby systems exposed to shocks," Reliability Engineering and System Safety, Elsevier, vol. 170(C), pages 137-145.
    7. Kim, Heungseob & Kim, Pansoo, 2017. "Reliability models for a nonrepairable system with heterogeneous components having a phase-type time-to-failure distribution," Reliability Engineering and System Safety, Elsevier, vol. 159(C), pages 37-46.
    8. Caserta, Marco & Voß, Stefan, 2015. "An exact algorithm for the reliability redundancy allocation problem," European Journal of Operational Research, Elsevier, vol. 244(1), pages 110-116.
    9. Ruiz-Castro, Juan Eloy & Dawabsha, Mohammed & Alonso, Francisco Javier, 2018. "Discrete-time Markovian arrival processes to model multi-state complex systems with loss of units and an indeterminate variable number of repairpersons," Reliability Engineering and System Safety, Elsevier, vol. 174(C), pages 114-127.
    10. Heping Jia & Rui Peng & Yi Ding & Yonghua Song, 2019. "Reliability of demand-based warm standby system with common bus performance sharing," Journal of Risk and Reliability, vol. 233(4), pages 580-592, August.
    11. Coit, David W. & Zio, Enrico, 2019. "The evolution of system reliability optimization," Reliability Engineering and System Safety, Elsevier, vol. 192(C).
    12. Tadashi Dohi & Hiroyuki Okamura & Cun-Hua Qian, 2022. "Computation algorithms for workload-dependent optimal checkpoint placement," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(2), pages 788-796, June.
    13. Levitin, Gregory & Xing, Liudong & Ben-Haim, Hanoch, 2018. "Optimizing software rejuvenation policy for real time tasks," Reliability Engineering and System Safety, Elsevier, vol. 176(C), pages 202-208.
    14. Zhenya Liu & Yuhao Mu, 2022. "Optimal Stopping Methods for Investment Decisions: A Literature Review," IJFS, MDPI, vol. 10(4), pages 1-23, October.
    15. Levitin, Gregory & Xing, Liudong & Dai, Yuanshun, 2018. "Co-residence based data vulnerability vs. security in cloud computing system with random server assignment," European Journal of Operational Research, Elsevier, vol. 267(2), pages 676-686.
    16. Levitin, Gregory & Xing, Liudong & Peng, Sun & Dai, Yuanshun, 2015. "Optimal choice of standby modes in 1-out-of-N system with respect to mission reliability and cost," Applied Mathematics and Computation, Elsevier, vol. 258(C), pages 587-596.
    17. Gao, Shan & Wang, Jinting, 2021. "Reliability and availability analysis of a retrial system with mixed standbys and an unreliable repair facility," Reliability Engineering and System Safety, Elsevier, vol. 205(C).
    18. Fernández, Arturo J., 2015. "Optimum attributes component test plans for k-out-of-n:F Weibull systems using prior information," European Journal of Operational Research, Elsevier, vol. 240(3), pages 688-696.
    19. Navarro, Jorge & Pellerey, Franco & Di Crescenzo, Antonio, 2015. "Orderings of coherent systems with randomized dependent components," European Journal of Operational Research, Elsevier, vol. 240(1), pages 127-139.
    20. Nizar Mannai & Soufiane Gasmi, 2020. "Optimal design of k-out-of-n system under first and last replacement in reliability theory," Operational Research, Springer, vol. 20(3), pages 1353-1368, September.
