Printed from https://ideas.repec.org/a/eee/reensy/v180y2018icp13-24.html

A decomposition-based reliability and makespan optimization technique for hardware task graphs

Author

Listed:
  • Ramezani, Reza
  • Sedaghat, Yasser
  • Naghibzadeh, Mahmoud
  • Clemente, Juan Antonio

Abstract

This paper presents an approach to optimize the reliability and makespan of hardware task graphs, running on FPGA-based reconfigurable computers, in space-mission computing applications with dynamic soft error rates (SERs). Thus, with rises and falls of the SER, the presented approach dynamically generates a set of solutions that apply redundancy-based fault tolerance (FT) techniques to the running tasks. The set of solutions is generated by decomposing the task graph into multiple subgraphs, applying a multi-objective optimization algorithm to the subgraphs separately, and finally combining and filtering out the obtained solutions of the subgraphs. In this regard, a heuristic has been proposed to decompose task graphs in such a way that a high coverage of the true Pareto set is attained. The experiments show that the presented approach covers 97.37% of the true Pareto set and improves the average computation time of generating the Pareto set from 6.29 h to 81.86 ms. In addition, it outperforms the NSGA-II algorithm in terms of the Pareto set coverage and computation time. Additional experiments demonstrate the advantages of the presented approach over the state-of-the-art adaptive FT techniques in dynamic environments.
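
The combine-and-filter step and the coverage figure mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how subgraph solutions are merged, so the sketch assumes, purely for illustration, that two subgraphs fail independently and run back to back, so that reliabilities multiply and makespans add. The function names (`pareto_filter`, `combine`, `coverage`) and the sample values are hypothetical.

```python
from itertools import product
from typing import List, Tuple

# A solution is (reliability, makespan): reliability is maximized,
# makespan is minimized.
Solution = Tuple[float, float]


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is at least as good as b in both objectives and a != b."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_filter(solutions: List[Solution]) -> List[Solution]:
    """Return the non-dominated solutions, deduplicated and sorted."""
    unique = sorted(set(solutions))
    return [s for s in unique if not any(dominates(o, s) for o in unique)]


def combine(front_a: List[Solution], front_b: List[Solution]) -> List[Solution]:
    """Merge two subgraph Pareto sets and filter the result.

    ASSUMPTION (not from the paper): the subgraphs are independent and
    executed sequentially, so reliabilities multiply and makespans add.
    """
    merged = [(ra * rb, ma + mb)
              for (ra, ma), (rb, mb) in product(front_a, front_b)]
    return pareto_filter(merged)


def coverage(found: List[Solution], true_front: List[Solution]) -> float:
    """Fraction of the true Pareto set that appears in the found set."""
    hits = sum(1 for s in true_front if s in found)
    return hits / len(true_front)


# Hypothetical per-subgraph candidate solutions.
front_a = pareto_filter([(0.5, 2.0), (0.75, 4.0), (0.5, 3.0)])
front_b = pareto_filter([(0.5, 1.0), (0.75, 2.0)])
combined = combine(front_a, front_b)
```

Filtering each subgraph first keeps the cross product small, which is one plausible reason a decomposition can be much faster than optimizing the whole graph at once. Whether every globally optimal solution survives depends on how the graph is split, which is the role the abstract assigns to the decomposition heuristic.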

Suggested Citation

  • Ramezani, Reza & Sedaghat, Yasser & Naghibzadeh, Mahmoud & Clemente, Juan Antonio, 2018. "A decomposition-based reliability and makespan optimization technique for hardware task graphs," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 13-24.
  • Handle: RePEc:eee:reensy:v:180:y:2018:i:c:p:13-24
    DOI: 10.1016/j.ress.2018.07.007

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832018301182
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ress.2018.07.007?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    Citations

    Cited by:

    1. Ramezani, Reza & Ghavidel, Abolfazl & Sedaghat, Yasser, 2021. "Exact and efficient reliability and performance optimization of synchronous task graphs," Reliability Engineering and System Safety, Elsevier, vol. 205(C).
    2. Ramezani, Reza & Clemente, Juan Antonio & Franco, Francisco J., 2020. "Analytical reliability estimation of SRAM-based FPGA designs against single-bit and multiple-cell upsets," Reliability Engineering and System Safety, Elsevier, vol. 202(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jung, Seunghwa & Choi, Jihwan P., 2019. "Predicting system failure rates of SRAM-based FPGA on-board processors in space radiation environments," Reliability Engineering and System Safety, Elsevier, vol. 183(C), pages 374-386.
    2. Enrico Zio & Hadi Gholinezhad, 2023. "Redundancy Allocation of Components with Time-Dependent Failure Rates," Mathematics, MDPI, vol. 11(16), pages 1-27, August.
    3. Ardakan, Mostafa Abouei & Talkhabi, Sajjad & Juybari, Mohammad N., 2022. "Optimal activation order vs. redundancy strategies in reliability optimization problems," Reliability Engineering and System Safety, Elsevier, vol. 217(C).
    4. Zhou, Yifan & Lin, Tian Ran & Sun, Yong & Bian, Yangqing & Ma, Lin, 2015. "An effective approach to reducing strategy space for maintenance optimisation of multistate series–parallel systems," Reliability Engineering and System Safety, Elsevier, vol. 138(C), pages 40-53.
    5. Zhou, Jianxiong & Wei, Shanbi & Chai, Yi, 2021. "Using improved dynamic Bayesian networks in reliability evaluation for flexible test system of aerospace pyromechanical device products," Reliability Engineering and System Safety, Elsevier, vol. 210(C).
    6. Chen, Ying & Wang, Ze & Li, YingYi & Kang, Rui & Mosleh, Ali, 2018. "Reliability analysis of a cold-standby system considering the development stages and accumulations of failure mechanisms," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 1-12.
    7. Zhang, Enze & Chen, Qingwei, 2016. "Multi-objective reliability redundancy allocation in an interval environment using particle swarm optimization," Reliability Engineering and System Safety, Elsevier, vol. 145(C), pages 83-92.
    8. Zhao, Jiangbin & Si, Shubin & Cai, Zhiqiang, 2019. "A multi-objective reliability optimization for reconfigurable systems considering components degradation," Reliability Engineering and System Safety, Elsevier, vol. 183(C), pages 104-115.
    9. Peiravi, Abdossaber & Nourelfath, Mustapha & Zanjani, Masoumeh Kazemi, 2022. "Redundancy strategies assessment and optimization of k-out-of-n systems based on Markov chains and genetic algorithms," Reliability Engineering and System Safety, Elsevier, vol. 221(C).
    10. Caserta, Marco & Voß, Stefan, 2015. "An exact algorithm for the reliability redundancy allocation problem," European Journal of Operational Research, Elsevier, vol. 244(1), pages 110-116.
    11. Farhadi, Mohammad & Shahrokhi, Mahmoud & Rahmati, Seyed Habib A, 2022. "Developing a supplier selection model based on Markov chain and probability tree for a k-out-of-N system with different quality of spare parts," Reliability Engineering and System Safety, Elsevier, vol. 222(C).
    12. Pradip Kundu, 2021. "A multi-objective reliability-redundancy allocation problem with active redundancy and interval type-2 fuzzy parameters," Operational Research, Springer, vol. 21(4), pages 2433-2458, December.
    13. Gregory Levitin & Liudong Xing & Yuanshun Dai, 2020. "Mission Abort Policy for Systems with Observable States of Standby Components," Risk Analysis, John Wiley & Sons, vol. 40(10), pages 1900-1912, October.
    14. Vu, Hai Canh & Do, Phuc & Fouladirad, Mitra & Grall, Antoine, 2020. "Dynamic opportunistic maintenance planning for multi-component redundant systems with various types of opportunities," Reliability Engineering and System Safety, Elsevier, vol. 198(C).
    15. Cao, Ran & Coit, David W. & Hou, Wei & Yang, Yushu, 2020. "Game theory based solution selection for multi-objective redundancy allocation in interval-valued problem parameters," Reliability Engineering and System Safety, Elsevier, vol. 199(C).
    16. Zhang, Enze & Wu, Yifei & Chen, Qingwei, 2014. "A practical approach for solving multi-objective reliability redundancy allocation problems using extended bare-bones particle swarm optimization," Reliability Engineering and System Safety, Elsevier, vol. 127(C), pages 65-76.
    17. Coit, David W. & Zio, Enrico, 2019. "The evolution of system reliability optimization," Reliability Engineering and System Safety, Elsevier, vol. 192(C).
    18. Kong, Xiangyong & Gao, Liqun & Ouyang, Haibin & Li, Steven, 2015. "Solving the redundancy allocation problem with multiple strategy choices using a new simplified particle swarm optimization," Reliability Engineering and System Safety, Elsevier, vol. 144(C), pages 147-158.
    19. Hoque, Khaza Anuarul & Ait Mohamed, Otmane & Savaria, Yvon, 2019. "Dependability modeling and optimization of triple modular redundancy partitioning for SRAM-based FPGAs," Reliability Engineering and System Safety, Elsevier, vol. 182(C), pages 107-119.
    20. Meisam Sadeghi & Emad Roghanian & Hamid Shahriari & Hassan Sadeghi, 2021. "Reliability optimization for non-repairable series-parallel systems with a choice of redundancy strategies and heterogeneous components: Erlang time-to-failure distribution," Journal of Risk and Reliability, vol. 235(3), pages 509-528, June.
