
Guided Diverse Concept Miner (GDCM): Uncovering Relevant Constructs for Managerial Insights from Text

Author

Listed:
  • Dokyun “DK” Lee

    (Questrom School of Business, Boston University, Boston, Massachusetts 02215)

  • Zhaoqi “ZQ” Cheng

    (Questrom School of Business, Boston University, Boston, Massachusetts 02215)

  • Chengfeng Mao

    (Marketing, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142)

  • Emaad Manzoor

    (Marketing, Samuel Curtis Johnson Graduate School of Management, Cornell University, Ithaca, New York 14853)

Abstract

Guided Diverse Concept Miner (GDCM) is an interpretable deep learning algorithm to (1) automatically extract corpus-level concepts from text data, (2) focus the discovery of concepts to filter through only the concepts highly correlated to the user-specified managerial outcome, and (3) quantify the concept’s correlational importance to the outcome. GDCM is used to explore and potentially extract previously unknown concepts and insights from the text that may explain the managerial outcome, without the need to provide any human-predefined guidance or labeled data on concepts. GDCM embeds words, documents, and concepts all in the same vector space, enabling easy interpretation of discovered concepts by associating words local to the concept vector. GDCM is explicitly configured to increase recovered-concept diversity, coherence, and relevance to managerial outcomes. We demonstrate GDCM as a “guided exploratory” tool for a hypothetical managerial case involving online purchase journey data connected to consumed reviews. GDCM scalably extracts concepts hidden in customer reviews highly correlated to conversion and provides concept importance in comparison with product ratings. Concepts produced turn out to be product qualities previously theorized to impact conversion in the literature, and correlational importance gauged by GDCM closely matches estimates from a previous causal study run on a similar data set, serving as external validations of GDCM as a “guided exploratory” tool. Additional experiments with other data show that extracted insights are sensitive to guiding managerial variables and sensibly so, further demonstrating the flexibility of GDCM as a managerial tool.
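The shared vector space described in the abstract suggests a simple way to read a concept: list the words whose vectors lie closest to the concept vector. The following is a minimal sketch of that lookup only, not the authors' implementation. It assumes cosine similarity as the closeness measure, and the function name `nearest_words`, the vocabulary, and the 2-D embeddings are all hypothetical values made up for illustration.

```python
import numpy as np


def nearest_words(concept_vec, word_vecs, vocab, k=3):
    """Return the k words most cosine-similar to concept_vec (hypothetical helper)."""
    # Normalize rows so that a dot product equals cosine similarity.
    unit_words = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    unit_concept = concept_vec / np.linalg.norm(concept_vec)
    sims = unit_words @ unit_concept
    # Sort by descending similarity and keep the top k indices.
    top = np.argsort(-sims)[:k]
    return [(vocab[i], float(sims[i])) for i in top]


# Toy embeddings, invented for this example (not taken from the paper).
vocab = ["durable", "sturdy", "cheap", "pricey"]
word_vecs = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])
concept_vec = np.array([1.0, 0.0])  # assumed "build quality" concept direction

top_words = nearest_words(concept_vec, word_vecs, vocab, k=2)
print(top_words)
```

With these toy vectors the two closest words are "durable" and "sturdy", which is the kind of word list a reader would use to label the concept. How GDCM learns the vectors, and how it enforces diversity and relevance to the outcome, is described in the paper and is not reproduced here.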

Suggested Citation

  • Dokyun “DK” Lee & Zhaoqi “ZQ” Cheng & Chengfeng Mao & Emaad Manzoor, 2025. "Guided Diverse Concept Miner (GDCM): Uncovering Relevant Constructs for Managerial Insights from Text," Information Systems Research, INFORMS, vol. 36(1), pages 370-393, March.
  • Handle: RePEc:inm:orisre:v:36:y:2025:i:1:p:370-393
    DOI: 10.1287/isre.2020.0494

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/isre.2020.0494
    Download Restriction: no

