
A decomposition-based reliability and makespan optimization technique for hardware task graphs

Author

Listed:
  • Ramezani, Reza
  • Sedaghat, Yasser
  • Naghibzadeh, Mahmoud
  • Clemente, Juan Antonio

Abstract

This paper presents an approach to optimize the reliability and makespan of hardware task graphs, running on FPGA-based reconfigurable computers, in space-mission computing applications with dynamic soft error rates (SERs). Thus, with rises and falls of the SER, the presented approach dynamically generates a set of solutions that apply redundancy-based fault tolerance (FT) techniques to the running tasks. The set of solutions is generated by decomposing the task graph into multiple subgraphs, applying a multi-objective optimization algorithm to the subgraphs separately, and finally combining and filtering out the obtained solutions of the subgraphs. In this regard, a heuristic has been proposed to decompose task graphs in such a way that a high coverage of the true Pareto set is attained. The experiments show that the presented approach covers 97.37% of the true Pareto set and improves the average computation time of generating the Pareto set from 6.29 h to 81.86 ms. In addition, it outperforms the NSGA-II algorithm in terms of the Pareto set coverage and computation time. Additional experiments demonstrate the advantages of the presented approach over the state-of-the-art adaptive FT techniques in dynamic environments.
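
The combine-and-filter step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how subgraph objectives compose or how the decomposition heuristic works, so the sketch assumes (hypothetically) that subgraphs execute one after another, making makespans additive, and that they fail independently, making reliabilities multiplicative. The names `dominates`, `pareto_filter`, and `combine`, and all numeric values, are illustrative only.

```python
from itertools import product
from math import prod

# A solution is a (reliability, makespan) pair:
# reliability is maximized, makespan is minimized.

def dominates(a, b):
    """True if a is no worse than b in both objectives and differs from b."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_filter(solutions):
    """Keep only the non-dominated pairs, sorted by reliability."""
    unique = set(solutions)
    return sorted(s for s in unique
                  if not any(dominates(t, s) for t in unique))

def combine(subgraph_fronts):
    """Pick one solution per subgraph, merge objectives, filter the result.

    Assumption (not from the paper): makespans add and reliabilities
    multiply across subgraphs.
    """
    combined = [
        (prod(r for r, _ in choice), sum(m for _, m in choice))
        for choice in product(*subgraph_fronts)
    ]
    return pareto_filter(combined)

# Made-up per-subgraph candidates, e.g. different redundancy levels.
front_a = pareto_filter([(0.90, 10), (0.99, 14), (0.95, 20)])
front_b = pareto_filter([(0.80, 5), (0.96, 9)])

result = combine([front_a, front_b])
for reliability, makespan in result:
    print(f"reliability={reliability:.4f}  makespan={makespan}")
```

Filtering each subgraph's front before combining keeps the Cartesian product small, which is the intuition behind optimizing subgraphs separately rather than the whole graph at once.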

Suggested Citation

  • Ramezani, Reza & Sedaghat, Yasser & Naghibzadeh, Mahmoud & Clemente, Juan Antonio, 2018. "A decomposition-based reliability and makespan optimization technique for hardware task graphs," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 13-24.
  • Handle: RePEc:eee:reensy:v:180:y:2018:i:c:p:13-24
    DOI: 10.1016/j.ress.2018.07.007

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832018301182
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations

    Cited by:

    1. Ramezani, Reza & Ghavidel, Abolfazl & Sedaghat, Yasser, 2021. "Exact and efficient reliability and performance optimization of synchronous task graphs," Reliability Engineering and System Safety, Elsevier, vol. 205(C).
    2. Ramezani, Reza & Clemente, Juan Antonio & Franco, Francisco J., 2020. "Analytical reliability estimation of SRAM-based FPGA designs against single-bit and multiple-cell upsets," Reliability Engineering and System Safety, Elsevier, vol. 202(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jung, Seunghwa & Choi, Jihwan P., 2019. "Predicting system failure rates of SRAM-based FPGA on-board processors in space radiation environments," Reliability Engineering and System Safety, Elsevier, vol. 183(C), pages 374-386.
    2. Enrico Zio & Hadi Gholinezhad, 2023. "Redundancy Allocation of Components with Time-Dependent Failure Rates," Mathematics, MDPI, vol. 11(16), pages 1-27, August.
    3. Peiravi, Abdossaber & Nourelfath, Mustapha & Zanjani, Masoumeh Kazemi, 2022. "Redundancy strategies assessment and optimization of k-out-of-n systems based on Markov chains and genetic algorithms," Reliability Engineering and System Safety, Elsevier, vol. 221(C).
    4. Caserta, Marco & Voß, Stefan, 2015. "An exact algorithm for the reliability redundancy allocation problem," European Journal of Operational Research, Elsevier, vol. 244(1), pages 110-116.
    5. Farhadi, Mohammad & Shahrokhi, Mahmoud & Rahmati, Seyed Habib A, 2022. "Developing a supplier selection model based on Markov chain and probability tree for a k-out-of-N system with different quality of spare parts," Reliability Engineering and System Safety, Elsevier, vol. 222(C).
    6. Gregory Levitin & Liudong Xing & Yuanshun Dai, 2020. "Mission Abort Policy for Systems with Observable States of Standby Components," Risk Analysis, John Wiley & Sons, vol. 40(10), pages 1900-1912, October.
    7. Zhang, Enze & Wu, Yifei & Chen, Qingwei, 2014. "A practical approach for solving multi-objective reliability redundancy allocation problems using extended bare-bones particle swarm optimization," Reliability Engineering and System Safety, Elsevier, vol. 127(C), pages 65-76.
    8. Coit, David W. & Zio, Enrico, 2019. "The evolution of system reliability optimization," Reliability Engineering and System Safety, Elsevier, vol. 192(C).
    9. Kong, Xiangyong & Gao, Liqun & Ouyang, Haibin & Li, Steven, 2015. "Solving the redundancy allocation problem with multiple strategy choices using a new simplified particle swarm optimization," Reliability Engineering and System Safety, Elsevier, vol. 144(C), pages 147-158.
    10. Hoque, Khaza Anuarul & Ait Mohamed, Otmane & Savaria, Yvon, 2019. "Dependability modeling and optimization of triple modular redundancy partitioning for SRAM-based FPGAs," Reliability Engineering and System Safety, Elsevier, vol. 182(C), pages 107-119.
    11. Meisam Sadeghi & Emad Roghanian & Hamid Shahriari & Hassan Sadeghi, 2021. "Reliability optimization for non-repairable series-parallel systems with a choice of redundancy strategies and heterogeneous components: Erlang time-to-failure distribution," Journal of Risk and Reliability, vol. 235(3), pages 509-528, June.
    12. Soheil Azizi & Milad Mohammadi, 2023. "Strategy selection for multi-objective redundancy allocation problem in a k-out-of-n system considering the mean time to failure," OPSEARCH, Springer;Operational Research Society of India, vol. 60(2), pages 1021-1044, June.
    13. Alikar, Najmeh & Mousavi, Seyed Mohsen & Raja Ghazilla, Raja Ariffin & Tavana, Madjid & Olugu, Ezutah Udoncy, 2017. "Application of the NSGA-II algorithm to a multi-period inventory-redundancy allocation problem in a series-parallel system," Reliability Engineering and System Safety, Elsevier, vol. 160(C), pages 1-10.
    14. Ardakan, Mostafa Abouei & Amini, Hanieh & Juybari, Mohammad N., 2022. "Prescheduled switching time: A new strategy for systems with standby components," Reliability Engineering and System Safety, Elsevier, vol. 218(PB).
    15. Karimi, Behzad & Niaki, S.T.A. & Haleh, Hassan & Naderi, Bahman, 2018. "Bi-objective optimization of a job shop with two types of failures for the operating machines that use automated guided vehicles," Reliability Engineering and System Safety, Elsevier, vol. 175(C), pages 92-104.
    16. Endharta, Alfonsus Julanto & Yun, Won Young & Ko, Young Myoung, 2018. "Reliability evaluation of circular k-out-of-n: G balanced systems through minimal path sets," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 226-236.
    17. Torrado, Nuria & Arriaza, Antonio & Navarro, Jorge, 2021. "A study on multi-level redundancy allocation in coherent systems formed by modules," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    18. Ramezani, Reza & Clemente, Juan Antonio & Franco, Francisco J., 2020. "Analytical reliability estimation of SRAM-based FPGA designs against single-bit and multiple-cell upsets," Reliability Engineering and System Safety, Elsevier, vol. 202(C).
    19. Abouei Ardakan, Mostafa & Rezvan, Mohammad Taghi, 2018. "Multi-objective optimization of reliability–redundancy allocation problem with cold-standby strategy using NSGA-II," Reliability Engineering and System Safety, Elsevier, vol. 172(C), pages 225-238.
    20. Eryilmaz, Serkan, 2018. "The number of failed components in a k-out-of-n system consisting of multiple types of components," Reliability Engineering and System Safety, Elsevier, vol. 175(C), pages 246-250.
