Printed from https://ideas.repec.org/a/bla/jinfst/v73y2022i3p419-437.html

Are mortgage loan closing delay risks predictable? A predictive analysis using text mining on discussion threads

Author

Listed:
  • David M. Goldberg
  • Nohel Zaman
  • Arin Brahma
  • Mariano Aloiso

Abstract

Loan processors and underwriters at mortgage firms seek to gather substantial supporting documentation to properly understand and model loan risks. In doing so, loan originations become prone to closing delays, risking client dissatisfaction and consequent revenue losses. We collaborate with a large national mortgage firm to examine the extent to which these delays are predictable, using internal discussion threads to prioritize interventions for loans most at risk. Substantial work experience is required to predict delays, and we find that even highly trained employees have difficulty predicting delays by reviewing discussion threads. We develop an array of methods to predict loan delays. We apply four modern out‐of‐the‐box sentiment analysis techniques, two dictionary‐based and two rule‐based, to predict delays. We contrast these approaches with domain‐specific approaches, including firm‐provided keyword searches and “smoke terms” derived using machine learning. Performance varies widely across sentiment approaches; while some sentiment approaches prioritize the top‐ranking records well, performance quickly declines thereafter. The firm‐provided keyword searches perform at the rate of random chance. We observe that the domain‐specific smoke term approaches consistently outperform other approaches and offer better prediction than loan and borrower characteristics. We conclude that text mining solutions would greatly assist mortgage firms in delay prevention.
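The abstract does not say how the "smoke terms" are derived or how threads are scored, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a term's weight is the share of delayed threads containing it minus the share of on-time threads containing it, and that a thread's risk score is the sum of the weights of the smoke terms it contains. All function names, the example threads, and the labels are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase word tokens in a thread."""
    return set(re.findall(r"[a-z']+", text.lower()))


def derive_smoke_terms(threads, labels, top_n=2):
    """Assumed weighting: share of delayed threads containing a term
    minus the share of on-time threads containing it."""
    n_delayed = sum(labels)
    n_on_time = len(labels) - n_delayed
    if n_delayed == 0 or n_on_time == 0:
        raise ValueError("need at least one delayed and one on-time thread")
    delayed, on_time = Counter(), Counter()
    for text, is_delayed in zip(threads, labels):
        (delayed if is_delayed else on_time).update(tokenize(text))
    scores = {t: delayed[t] / n_delayed - on_time[t] / n_on_time for t in delayed}
    # Highest weight first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: w for term, w in ranked[:top_n] if w > 0}


def rank_threads(threads, smoke_terms):
    """Return thread indices ordered from highest to lowest risk score."""
    scores = [sum(smoke_terms.get(t, 0.0) for t in tokenize(text)) for text in threads]
    return sorted(range(len(threads)), key=lambda i: (-scores[i], i))


def precision_at_k(order, labels, k):
    """Fraction of the k top-ranked threads that were actually delayed."""
    return sum(labels[i] for i in order[:k]) / k


# Hypothetical training threads (1 = loan closed late, 0 = closed on time).
train = [
    "appraisal still missing, waiting on borrower documents",
    "title issue unresolved, appraisal missing",
    "all documents received, cleared to close",
    "borrower signed disclosures, cleared to close",
]
train_labels = [1, 1, 0, 0]
terms = derive_smoke_terms(train, train_labels)

# Hypothetical new threads to prioritize for intervention.
new = ["appraisal missing again", "cleared to close today", "missing signature page"]
order = rank_threads(new, terms)
print(terms, order)
```

Ranking threads and then measuring how many of the top-ranked ones were truly delayed mirrors the abstract's emphasis on how well each approach prioritizes the top-ranking records.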

Suggested Citation

  • David M. Goldberg & Nohel Zaman & Arin Brahma & Mariano Aloiso, 2022. "Are mortgage loan closing delay risks predictable? A predictive analysis using text mining on discussion threads," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(3), pages 419-437, March.
  • Handle: RePEc:bla:jinfst:v:73:y:2022:i:3:p:419-437
    DOI: 10.1002/asi.24559


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gabriele Ranco & Ilaria Bordino & Giacomo Bormetti & Guido Caldarelli & Fabrizio Lillo & Michele Treccani, 2014. "Coupling news sentiment with web browsing data improves prediction of intra-day price dynamics," Papers 1412.3948, arXiv.org, revised Dec 2015.
    2. Gabriele Ranco & Ilaria Bordino & Giacomo Bormetti & Guido Caldarelli & Fabrizio Lillo & Michele Treccani, 2016. "Coupling News Sentiment with Web Browsing Data Improves Prediction of Intra-Day Price Dynamics," PLOS ONE, Public Library of Science, vol. 11(1), pages 1-14, January.
    3. Kirtac, Kemal & Germano, Guido, 2024. "Sentiment trading with large language models," Finance Research Letters, Elsevier, vol. 62(PB).
    4. Chen, Cathy Yi-Hsuan & Fengler, Matthias R. & Härdle, Wolfgang Karl & Liu, Yanchu, 2022. "Media-expressed tone, option characteristics, and stock return predictability," Journal of Economic Dynamics and Control, Elsevier, vol. 134(C).
    5. Martin Haselmayer & Marcelo Jenny, 2017. "Sentiment analysis of political communication: combining a dictionary approach with crowdcoding," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(6), pages 2623-2646, November.
    6. José María Liberti & Mitchell A. Petersen, 2018. "Information: Hard and Soft," NBER Working Papers 25075, National Bureau of Economic Research, Inc.
    7. Asier Gutiérrez-Fandiño & Miquel Noguer i Alonso & Petter Kolm & Jordi Armengol-Estapé, 2021. "FinEAS: Financial Embedding Analysis of Sentiment," Papers 2111.00526, arXiv.org, revised Nov 2021.
    8. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    9. Kirtac, Kemal & Germano, Guido, 2024. "Sentiment trading with large language models," LSE Research Online Documents on Economics 122592, London School of Economics and Political Science, LSE Library.
    10. Zhang, Xuetong & Zhang, Weiguo, 2023. "Information asymmetry, sentiment interactions, and asset price," The North American Journal of Economics and Finance, Elsevier, vol. 67(C).
    11. Ankur Sinha & Satishwar Kedas & Rishu Kumar & Pekka Malo, 2022. "SEntFiN 1.0: Entity‐aware sentiment analysis for financial news," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(9), pages 1314-1335, September.
    12. Simon Albrecht & Bernhard Lutz & Dirk Neumann, 2020. "The behavior of blockchain ventures on Twitter as a determinant for funding success," Electronic Markets, Springer;IIM University of St. Gallen, vol. 30(2), pages 241-257, June.
    13. Abdollahi, Hooman & Fjesme, Sturla L. & Sirnes, Espen, 2024. "Measuring market volatility connectedness to media sentiment," The North American Journal of Economics and Finance, Elsevier, vol. 71(C).
    14. Yi-Hsuan Chen, Cathy & Fengler, Matthias & Härdle, Wolfgang Karl & Liu, Yanchu, 2018. "Textual Sentiment, Option Characteristics, and Stock Return Predictability," Economics Working Paper Series 1808, University of St. Gallen, School of Economics and Political Science.
    15. Jozef Barunik & Cathy Yi-Hsuan Chen & Jan Vecer, 2019. "Sentiment-Driven Stochastic Volatility Model: A High-Frequency Textual Tool for Economists," Papers 1906.00059, arXiv.org.
    16. Nohel Zaman & David M. Goldberg & Richard J. Gruss & Alan S. Abrahams & Siriporn Srisawas & Peter Ractham & Michelle M.H. Şeref, 2022. "Cross-Category Defect Discovery from Online Reviews: Supplementing Sentiment with Category-Specific Semantics," Information Systems Frontiers, Springer, vol. 24(4), pages 1265-1285, August.
    17. Wehrheim, Lino, 2021. "The sound of silence: On the (in)visibility of economists in the media," Working Papers 30, German Research Foundation's Priority Programme 1859 "Experience and Expectation. Historical Foundations of Economic Behaviour", Humboldt University Berlin.
    18. Müller, Karsten, 2020. "German forecasters' narratives: How informative are German business cycle forecast reports?," Working Papers 23, German Research Foundation's Priority Programme 1859 "Experience and Expectation. Historical Foundations of Economic Behaviour", Humboldt University Berlin.
    19. Yan Luo & Linying Zhou, 2020. "Textual tone in corporate financial disclosures: a survey of the literature," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 17(2), pages 101-110, September.
    20. Jiao Ji & Oleksandr Talavera & Shuxing Yin, 2018. "The Hidden Information Content: Evidence from the Tone of Independent Director Reports," Working Papers 2018-28, Swansea University, School of Management.
