Application of structural topic modeling to aviation safety data

Author

Listed:
  • Rose, Rodrigo L.
  • Puranik, Tejas G.
  • Mavris, Dimitri N.
  • Rao, Arjun H.

Abstract

Data-driven frameworks for analyzing aviation safety data have recently gained traction. Text-based machine learning techniques often rely purely on word frequency analysis to eliminate the innate subjectivity of human language, but more refined techniques like structural topic modeling (STM) attempt to simulate text generation to identify the thematic undertones of text corpora. This paper presents an application of STM to two text-based sets of aviation safety data, the Aviation Safety Reporting System (ASRS) and accident and incident reports published by the National Transportation Safety Board (NTSB). A framework for cleaning and pre-processing the datasets is discussed, including a brief discussion of bag-of-words and TF–IDF representations of narratives. The methodology behind STM is described, including techniques for selecting the optimal number of topics. The results of the STM analysis on the ASRS and NTSB datasets are presented, with a focus on the clarity and specificity based on most common words associated with topics. A brief exploration of the correlation between pairs of topic labels is also undertaken, including a visualization of narratives in 2-dimensional space. STM is found to show promise in identifying themes within technical datasets, with model performance increasing for more specific corpora that use precise and unique language.
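The bag-of-words and TF–IDF representations mentioned in the abstract can be illustrated with a minimal sketch. The narratives below are invented for illustration and are not ASRS or NTSB text. The weighting (term frequency divided by document length, times the natural log of inverse document frequency) is one common variant and an assumption here, since the abstract does not state which formulation the paper uses. The names `bag_of_words` and `tf_idf` are hypothetical helpers.

```python
import math
from collections import Counter

# Hypothetical toy narratives, invented for illustration only.
narratives = [
    "engine lost power during climb",
    "pilot reported engine fire during climb",
    "runway incursion during taxi",
]

def bag_of_words(text):
    """Count how often each whitespace-separated token occurs."""
    return Counter(text.lower().split())

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    counts = [bag_of_words(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {t: (n / total) * math.log(n_docs / doc_freq[t]) for t, n in c.items()}
        )
    return weights

bow = [bag_of_words(d) for d in narratives]
weights = tf_idf(narratives)

# Print the three highest-weighted terms of each narrative.
for text, w in zip(narratives, weights):
    top = sorted(w, key=w.get, reverse=True)[:3]
    print(text, "->", top)
```

Under this weighting, a word that appears in every narrative (here "during") receives a weight of zero, while words confined to a single narrative are weighted highest. This matches the abstract's observation that corpora with precise and unique language are easier to separate into themes.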

Suggested Citation

  • Rose, Rodrigo L. & Puranik, Tejas G. & Mavris, Dimitri N. & Rao, Arjun H., 2022. "Application of structural topic modeling to aviation safety data," Reliability Engineering and System Safety, Elsevier, vol. 224(C).
  • Handle: RePEc:eee:reensy:v:224:y:2022:i:c:s0951832022001776
    DOI: 10.1016/j.ress.2022.108522

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832022001776
    Download Restriction: Full text for ScienceDirect subscribers only


    Citations

    Cited by:

    1. Li, Jiawen & Meng, Lu & Zhang, Zelin & Yang, Kejia, 2023. "Low-frequency, high-impact: Discovering important rare events from UGC," Journal of Retailing and Consumer Services, Elsevier, vol. 70(C).
    2. Zhou, Di & Zhuang, Xiao & Zuo, Hongfu & Cai, Jing & Zhao, Xufeng & Xiang, Jiawei, 2022. "A model fusion strategy for identifying aircraft risk using CNN and Att-BiLSTM," Reliability Engineering and System Safety, Elsevier, vol. 228(C).
    3. Wang, Hui & Zheng, Junkang & Xiang, Jiawei, 2023. "Online bearing fault diagnosis using numerical simulation models and machine learning classifications," Reliability Engineering and System Safety, Elsevier, vol. 234(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou, Di & Zhuang, Xiao & Zuo, Hongfu & Cai, Jing & Zhao, Xufeng & Xiang, Jiawei, 2022. "A model fusion strategy for identifying aircraft risk using CNN and Att-BiLSTM," Reliability Engineering and System Safety, Elsevier, vol. 228(C).
    2. Li Tang & Jennifer Kuzma & Xi Zhang & Xinyu Song & Yin Li & Hongxu Liu & Guangyuan Hu, 2023. "Synthetic biology and governance research in China: a 40-year evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5293-5310, September.
    3. Valérie Mignon & Celso Brunetti & Marc Joëts, 2023. "Reasons Behind Words: OPEC Narratives and the Oil Market," EconomiX Working Papers 2023-24, University of Paris Nanterre, EconomiX.
    4. Braga, Joaquim A.P. & Andrade, António R., 2021. "Multivariate statistical aggregation and dimensionality reduction techniques to improve monitoring and maintenance in railways: The wheelset component," Reliability Engineering and System Safety, Elsevier, vol. 216(C).
    5. Beatrice Ferrario & Stefanie Stantcheva, 2022. "Eliciting People's First-Order Concerns: Text Analysis of Open-Ended Survey Questions," AEA Papers and Proceedings, American Economic Association, vol. 112, pages 163-169, May.
    6. Tian-yuan, Ye & Lin-lin, Liu & He-wei, Pang & Yuan-zi, Zhou, 2023. "Bayesian Networks based approach to enhance GO methodology for reliability modeling of multi-state consecutive-k-out-of-n: F system," Reliability Engineering and System Safety, Elsevier, vol. 229(C).
    7. Keith Carlson & Michael A. Livermore & Daniel N. Rockmore, 2020. "The Problem of Data Bias in the Pool of Published U.S. Appellate Court Opinions," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 17(2), pages 224-261, June.
    8. Goodell, John W. & Kumar, Satish & Li, Xiao & Pattnaik, Debidutta & Sharma, Anuj, 2022. "Foundations and research clusters in investor attention: Evidence from bibliometric and topic modelling analysis," International Review of Economics & Finance, Elsevier, vol. 82(C), pages 511-529.
    9. Sergio Davalos & Ehsan H. Feroz, 2022. "A textual analysis of the US Securities and Exchange Commission's accounting and auditing enforcement releases relating to the Sarbanes–Oxley Act," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 29(1), pages 19-40, January.
    10. David Ardia & Keven Bluteau & Mohammad‐Abbas Meghani, 2024. "Thirty years of academic finance," Journal of Economic Surveys, Wiley Blackwell, vol. 38(3), pages 1008-1042, July.
    11. Nuccio Ludovico & Marc Esteve Del Valle & Franco Ruzzenenti, 2020. "Mapping the Dutch Energy Transition Hyperlink Network," Sustainability, MDPI, vol. 12(18), pages 1-24, September.
    12. Baker, H. Kent & Kumar, Satish & Goyal, Kirti & Sharma, Anuj, 2021. "International review of financial analysis: A retrospective evaluation between 1992 and 2020," International Review of Financial Analysis, Elsevier, vol. 78(C).
    13. Pan, Yongjun & Sun, Yu & Li, Zhixiong & Gardoni, Paolo, 2023. "Machine learning approaches to estimate suspension parameters for performance degradation assessment using accurate dynamic simulations," Reliability Engineering and System Safety, Elsevier, vol. 230(C).
    14. Ziegler Haselein, Bruno & da Silva, Jonny Carlos & Hooey, Becky L., 2024. "Multiple machine learning modeling on near mid-air collisions: An approach towards probabilistic reasoning," Reliability Engineering and System Safety, Elsevier, vol. 244(C).
    15. Nuccio Ludovico & Federica Dessi & Marino Bonaiuto, 2020. "Stakeholders Mapping for Sustainable Biofuels: An Innovative Procedure Based on Computational Text Analysis and Social Network Analysis," Sustainability, MDPI, vol. 12(24), pages 1-22, December.
    16. Podofillini, Luca & Reer, Bernhard & Dang, Vinh N., 2023. "A traceable process to develop Bayesian networks from scarce data and expert judgment: A human reliability analysis application," Reliability Engineering and System Safety, Elsevier, vol. 230(C).
    17. Miriam Andrejiova & Anna Grincova & Daniela Marasova & Peter Koščák, 2021. "Civil Aviation Occurrences in Slovakia and Their Evaluation Using Statistical Methods," Sustainability, MDPI, vol. 13(10), pages 1-17, May.
    18. Wang, Fang & Bai, Jie & Liu, Linlin & Ye, Tianyuan, 2024. "Temporal noisy-adder of bayesian network for scalable consecutive-k-out-of-n:F system reliability analysis," Reliability Engineering and System Safety, Elsevier, vol. 242(C).
    19. Everett, Jeff & Shiraz Rahaman, Abu & Neu, Dean & Saxton, Gregory, 2024. "Letters to the editor, institutional experimentation, and the public accounting professional," CRITICAL PERSPECTIVES ON ACCOUNTING, Elsevier, vol. 99(C).
    20. Minchul Lee & Min Song, 2020. "Incorporating citation impact into analysis of research trends," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1191-1224, August.
