
Random forests for global sensitivity analysis: A selective review

Author

Listed:
  • Antoniadis, Anestis
  • Lambert-Lacroix, Sophie
  • Poggi, Jean-Michel

Abstract

The understanding of many physical and engineering problems involves running complex computational models. Such models take as input a large number of numerical and physical explanatory variables. The information on these underlying input parameters is often limited or uncertain. It is therefore important, based on the relationships between the input variables and the output, to identify and prioritize the most influential inputs. One may use global sensitivity analysis (GSA) methods, which aim to rank input random variables according to their importance in the output uncertainty, or even to quantify the global influence of a particular input on the output. Using sensitivity metrics to ignore less important parameters is a form of dimension reduction in the model's input parameter space. This suggests the use of meta-modeling as a quantitative approach for nonparametric GSA, where the original input/output relation is first approximated using various statistical regression techniques. Accordingly, the main goal of our work is to provide a comprehensive review in the domain of sensitivity analysis, focusing on some interesting connections between random forests and GSA. The idea is to use a random forests methodology as an efficient non-parametric approach for building meta-models that allow an efficient sensitivity analysis. Apart from its easy applicability to regression problems, the random forests approach presents further strong advantages: its ability to deal implicitly with correlation and high-dimensional data, to handle interactions between variables, and to identify informative inputs using a permutation-based RF variable importance index which is easy and fast to compute. We further review an adequate set of tools for quantifying variable importance, which are then exploited to reduce the model's dimension, enabling otherwise infeasible sensitivity analysis studies. Numerical results from several simulations and a data exploration on a real dataset are presented to illustrate the effectiveness of such an approach.
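
To make the permutation-based RF variable importance index mentioned in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' code. It assumes a simplified forest (bootstrap samples, one random candidate input per split by default, a minimum leaf size) and computes the out-of-bag increase in squared error after permuting each input. The names `build_tree`, `predict`, `oob_permutation_importance`, `toy_model` and `make_toy_data` are hypothetical, and the toy model is only a stand-in for an expensive simulator. In practice one would more likely rely on a library such as scikit-learn (`RandomForestRegressor` with `sklearn.inspection.permutation_importance`).

```python
import random
from statistics import fmean


def build_tree(X, y, rows, mtry, min_leaf, rng):
    """Grow one regression tree on the (bootstrap) row indices in `rows`."""
    if len(rows) < 2 * min_leaf:
        return fmean(y[i] for i in rows)           # leaf: mean response
    total, n = sum(y[i] for i in rows), len(rows)
    best = None                                     # (score, feature, threshold, k, order)
    for j in rng.sample(range(len(X[0])), mtry):    # random subset of candidate inputs
        order = sorted(rows, key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += y[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if k < min_leaf or n - k < min_leaf or lo == hi:
                continue
            # Maximising this score is equivalent to minimising the split's squared error.
            score = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
            if best is None or score > best[0]:
                best = (score, j, (lo + hi) / 2, k, order)
    if best is None:
        return fmean(y[i] for i in rows)
    _, j, thr, k, order = best
    return (j, thr,
            build_tree(X, y, order[:k], mtry, min_leaf, rng),
            build_tree(X, y, order[k:], mtry, min_leaf, rng))


def predict(tree, x):
    """Route x down the tree; internal nodes are tuples, leaves are floats."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if x[j] <= thr else right
    return tree


def oob_permutation_importance(X, y, n_trees=50, mtry=None, min_leaf=5, seed=0):
    """Mean out-of-bag increase in MSE when each input is permuted (rows of X are lists)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = mtry or max(1, p // 3)
    importance = [0.0] * p
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        tree = build_tree(X, y, boot, mtry, min_leaf, rng)
        base = fmean((y[i] - predict(tree, X[i])) ** 2 for i in oob)
        for j in range(p):
            shuffled = [X[i][j] for i in oob]
            rng.shuffle(shuffled)                    # break the link between input j and y
            err = fmean((y[i] - predict(tree, X[i][:j] + [v] + X[i][j + 1:])) ** 2
                        for i, v in zip(oob, shuffled))
            importance[j] += (err - base) / n_trees
    return importance


def toy_model(x):
    """Hypothetical stand-in for a costly simulator: only x[0] and x[1] matter."""
    return 5.0 * x[0] + 2.0 * x[1]


def make_toy_data(n=300, p=4, seed=1):
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(p)] for _ in range(n)]
    y = [toy_model(x) + rng.gauss(0.0, 0.1) for x in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for j, value in enumerate(oob_permutation_importance(X, y)):
        print(f"input {j}: importance {value:.3f}")
```

Inputs whose importance stays close to zero are candidates for screening out before a more expensive sensitivity study, which is the dimension-reduction use described in the abstract.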

Suggested Citation

  • Antoniadis, Anestis & Lambert-Lacroix, Sophie & Poggi, Jean-Michel, 2021. "Random forests for global sensitivity analysis: A selective review," Reliability Engineering and System Safety, Elsevier, vol. 206(C).
  • Handle: RePEc:eee:reensy:v:206:y:2021:i:c:s0951832020308073
    DOI: 10.1016/j.ress.2020.107312

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832020308073
    Download Restriction: Full text for ScienceDirect subscribers only


    Citations

    Cited by:

    1. Jung, WoongHee & Taflanidis, Alexandros A., 2023. "Efficient global sensitivity analysis for high-dimensional outputs combining data-driven probability models and dimensionality reduction," Reliability Engineering and System Safety, Elsevier, vol. 231(C).
    2. Ballester-Ripoll, Rafael & Leonelli, Manuele, 2022. "Computing Sobol indices in probabilistic graphical models," Reliability Engineering and System Safety, Elsevier, vol. 225(C).
    3. Vuillod, Bruno & Montemurro, Marco & Panettieri, Enrico & Hallo, Ludovic, 2023. "A comparison between Sobol’s indices and Shapley’s effect for global sensitivity analysis of systems with independent input variables," Reliability Engineering and System Safety, Elsevier, vol. 234(C).
    4. Gao, Zhikun & Yu, Junqi & Zhao, Anjun & Hu, Qun & Yang, Siyuan, 2022. "A hybrid method of cooling load forecasting for large commercial building based on extreme learning machine," Energy, Elsevier, vol. 238(PC).
    5. Herbert Amezquita & Cindy P. Guzman & Hugo Morais, 2024. "Forecasting Electric Vehicles’ Charging Behavior at Charging Stations: A Data Science-Based Approach," Energies, MDPI, vol. 17(14), pages 1-27, July.
    6. Chen, Xuyong & Xu, Zhifeng & Wu, Yushun & Wu, Qiaoyun, 2023. "Heuristic algorithms for reliability estimation based on breadth-first search of a grid tree," Reliability Engineering and System Safety, Elsevier, vol. 232(C).
    7. Kim, Jun Young & Kim, Dongjae & Li, Zezhong John & Dariva, Claudio & Cao, Yankai & Ellis, Naoko, 2023. "Predicting and optimizing syngas production from fluidized bed biomass gasifiers: A machine learning approach," Energy, Elsevier, vol. 263(PC).
    8. Pilowsky, Julia A. & Manica, Andrea & Brown, Stuart & Rahbek, Carsten & Fordham, Damien A., 2022. "Simulations of human migration into North America are more sensitive to demography than choice of palaeoclimate model," Ecological Modelling, Elsevier, vol. 473(C).
    9. Djandja, Oraléou Sangué & Salami, Adekunlé Akim & Wang, Zhi-Cong & Duo, Jia & Yin, Lin-Xin & Duan, Pei-Gao, 2022. "Random forest-based modeling for insights on phosphorus content in hydrochar produced from hydrothermal carbonization of sewage sludge," Energy, Elsevier, vol. 245(C).
    10. Xiong, Qingwen & Du, Peng & Deng, Jian & Huang, Daishun & Song, Gongle & Qian, Libo & Wu, Zenghui & Luo, Yuejian, 2022. "Global sensitivity analysis for nuclear reactor LBLOCA with time-dependent outputs," Reliability Engineering and System Safety, Elsevier, vol. 221(C).
    11. Chien-Chih Wang & Yu-Hsun Li, 2022. "Machine-Learning-Based System for the Detection of Entanglement in Dyeing and Finishing Processes," Sustainability, MDPI, vol. 14(14), pages 1-12, July.
    12. Mehdi Dasineh & Amir Ghaderi & Mohammad Bagherzadeh & Mohammad Ahmadi & Alban Kuriqi, 2021. "Prediction of Hydraulic Jumps on a Triangular Bed Roughness Using Numerical Modeling and Soft Computing Methods," Mathematics, MDPI, vol. 9(23), pages 1-24, December.
    13. Torii, André Jacomel & Novotny, Antonio André, 2021. "A priori error estimates for local reliability-based sensitivity analysis with Monte Carlo Simulation," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    14. Ma, Yuan-Zhuo & Jin, Xiang-Xiang & Zhao, Xiang & Li, Hong-Shuang & Zhao, Zhen-Zhou & Xu, Chang, 2024. "Reliability-oriented global sensitivity analysis using subset simulation and space partition," Reliability Engineering and System Safety, Elsevier, vol. 242(C).
    15. Ling Tao & Yuanlai Xie & Chundong Hu, 2022. "Efficient Sensitivity Analysis for Enhanced Heat Transfer Performance of Heat Sink with Swirl Flow Structure under One-Side Heating," Energies, MDPI, vol. 15(19), pages 1-19, October.
    16. Xiang Peng & Xiaoqing Xu & Jiquan Li & Shaofei Jiang, 2021. "A Sampling-Based Sensitivity Analysis Method Considering the Uncertainties of Input Variables and Their Distribution Parameters," Mathematics, MDPI, vol. 9(10), pages 1-18, May.
    17. Goda, Takashi, 2021. "A simple algorithm for global sensitivity analysis with Shapley effects," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    18. Simsekler, Mecit Can Emre & Rodrigues, Clarence & Qazi, Abroon & Ellahham, Samer & Ozonoff, Al, 2021. "A comparative study of patient and staff safety evaluation using tree-based machine learning algorithms," Reliability Engineering and System Safety, Elsevier, vol. 208(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Borup, Daniel & Christensen, Bent Jesper & Mühlbach, Nicolaj Søndergaard & Nielsen, Mikkel Slot, 2023. "Targeting predictors in random forest regression," International Journal of Forecasting, Elsevier, vol. 39(2), pages 841-868.
    2. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    3. Escribano, Álvaro & Wang, Dandan, 2021. "Mixed random forest, cointegration, and forecasting gasoline prices," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1442-1462.
    4. Yigit Aydede & Jan Ditzen, 2022. "Identifying the regional drivers of influenza-like illness in Nova Scotia with dominance analysis," Papers 2212.06684, arXiv.org.
    5. Boller, Daniel & Lechner, Michael & Okasa, Gabriel, 2021. "The Effect of Sport in Online Dating: Evidence from Causal Machine Learning," Economics Working Paper Series 2104, University of St. Gallen, School of Economics and Political Science.
    6. Susan Athey & Julie Tibshirani & Stefan Wager, 2016. "Generalized Random Forests," Papers 1610.01271, arXiv.org, revised Apr 2018.
    7. Marrel, Amandine & Chabridon, Vincent, 2021. "Statistical developments for target and conditional sensitivity analysis: Application on safety studies for nuclear reactor," Reliability Engineering and System Safety, Elsevier, vol. 214(C).
    8. Kayo Murakami & Hideki Shimada & Yoshiaki Ushifusa & Takanori Ida, 2022. "Heterogeneous Treatment Effects Of Nudge And Rebate: Causal Machine Learning In A Field Experiment On Electricity Conservation," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 63(4), pages 1779-1803, November.
    9. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    10. Schnaubelt, Matthias & Fischer, Thomas G. & Krauss, Christopher, 2020. "Separating the signal from the noise – Financial machine learning for Twitter," Journal of Economic Dynamics and Control, Elsevier, vol. 114(C).
    11. Daniel Goller & Michael C. Knaus & Michael Lechner & Gabriel Okasa, 2021. "Predicting match outcomes in football by an Ordered Forest estimator," Chapters, in: Ruud H. Koning & Stefan Kesenne (ed.), A Modern Guide to Sports Economics, chapter 22, pages 335-355, Edward Elgar Publishing.
    12. Tatsuya Sakurahara & Seyed Reihani & Ernie Kee & Zahra Mohaghegh, 2020. "Global importance measure methodology for integrated probabilistic risk assessment," Journal of Risk and Reliability, vol. 234(2), pages 377-396, April.
    13. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692, arXiv.org.
    14. Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
    15. Max Biggs & Rim Hariss & Georgia Perakis, 2023. "Constrained optimization of objective functions determined from random forests," Production and Operations Management, Production and Operations Management Society, vol. 32(2), pages 397-415, February.
    16. Valente, Marica, 2023. "Policy evaluation of waste pricing programs using heterogeneous causal effect estimation," Journal of Environmental Economics and Management, Elsevier, vol. 117(C).
    17. Pedro Forquesato, 2022. "Who Benefits from Political Connections in Brazilian Municipalities," Papers 2204.09450, arXiv.org.
    18. Ruixin Liang & Joanne Yip & Yunli Fan & Jason P. Y. Cheung & Kai-Tsun Michael To, 2022. "Electromyographic Analysis of Paraspinal Muscles of Scoliosis Patients Using Machine Learning Approaches," IJERPH, MDPI, vol. 19(3), pages 1-12, January.
    19. Nathan Kallus & Xiaojie Mao, 2023. "Stochastic Optimization Forests," Management Science, INFORMS, vol. 69(4), pages 1975-1994, April.
    20. Zhexiao Lin & Fang Han, 2022. "On regression-adjusted imputation estimators of the average treatment effect," Papers 2212.05424, arXiv.org, revised Jan 2023.
