Printed from https://ideas.repec.org/a/eee/tefoso/v198y2024ics0040162523006212.html

The impact of forum content on data science open innovation performance: A system dynamics-based causal machine learning approach

Author

Listed:
  • Li, Libo
  • Yu, Huan
  • Kunc, Martin

Abstract

Open innovation in data science generally takes the form of public competitions where teams exchange messages and solutions by competing and collaborating simultaneously. Team behaviours are widely heterogeneous in terms of the performance of their solutions and their participation in knowledge creation. We present a novel research framework for open innovation that integrates system dynamics and structural topic modelling to extract open factors and adopts a machine learning-based difference-in-differences estimator to understand the impact of team behaviour on team performance, using data from Kaggle's competition. Our results identify four team behaviour categories (active, learner, lurker, and passive) in data science open innovation competitions, which depend on the performance of the teams' solutions and their actions related to posting and reading messages in the forum. Furthermore, the activities of model evaluation, community support, and business understanding are the three most positive and significant factors affecting team performance. Our research contributes to the literature by highlighting the value of forum feedback and exploring the data science activities in the forum discussion, in relation to innovation performance, to enrich the empirical understanding of open innovation. Research implications for researchers and practitioners participating in, organising, and supporting data science open innovation activities are provided.
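
The abstract names a machine learning-based difference-in-differences estimator without spelling it out. As a rough illustration of the underlying comparison only, the sketch below computes the classic two-group, two-period difference-in-differences from cell means. It is not the authors' estimator: it omits the machine learning step entirely, and the function name `did_estimate`, the notion of "treated" teams, and the toy scores are all hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Classic 2x2 difference-in-differences from cell means.

    rows: iterable of (treated, post, outcome) tuples, where `treated`
    and `post` are booleans and `outcome` is a numeric score.
    """
    # Collect outcomes into the four (group, period) cells.
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    # Change in the treated group minus change in the control group.
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


# Hypothetical toy data (invented for illustration, not from the paper):
# treated teams = teams that engaged with the forum between two periods.
rows = [
    (True, False, 0.70), (True, False, 0.72),
    (True, True, 0.80), (True, True, 0.82),
    (False, False, 0.60), (False, False, 0.62),
    (False, True, 0.65), (False, True, 0.67),
]

effect = did_estimate(rows)
print(round(effect, 4))
```

Subtracting the control group's change removes any shift common to both groups, so the remainder is attributed to the treatment only under the usual parallel-trends assumption.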

Suggested Citation

  • Li, Libo & Yu, Huan & Kunc, Martin, 2024. "The impact of forum content on data science open innovation performance: A system dynamics-based causal machine learning approach," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
  • Handle: RePEc:eee:tefoso:v:198:y:2024:i:c:s0040162523006212
    DOI: 10.1016/j.techfore.2023.122936

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0040162523006212
    Download Restriction: Full text for ScienceDirect subscribers only

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    2. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Jun 2024.
    3. Ghaffari, Mohsen & Aliahmadi, Alireza & Khalkhali, Abolfazl & Zakery, Amir & Daim, Tugrul U. & Yalcin, Haydar, 2023. "Topic-based technology mapping using patent data analysis: A case study of vehicle tires," Technological Forecasting and Social Change, Elsevier, vol. 193(C).
    4. Salgado, Stéphane & Hemonnet-Goujot, Aurelie & Henard, David H. & de Barnier, Virginie, 2020. "The dynamics of innovation contest experience: An integrated framework from the customer’s perspective," Journal of Business Research, Elsevier, vol. 117(C), pages 29-43.
    5. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    6. Martin Huber & Eva-Maria Oeß, 2024. "A joint test of unconfoundedness and common trends," Papers 2404.16961, arXiv.org, revised Jun 2024.
    7. Athey, Susan & Imbens, Guido W., 2022. "Design-based analysis in Difference-In-Differences settings with staggered adoption," Journal of Econometrics, Elsevier, vol. 226(1), pages 62-79.
    8. Chad D. Meyerhoefer & Muzhe Yang, 2011. "The Relationship between Food Assistance and Health: A Review of the Literature and Empirical Strategies for Identifying Program Effects," Applied Economic Perspectives and Policy, Agricultural and Applied Economics Association, vol. 33(3), pages 304-344.
    9. Leandro D’Aurizio & Domenico Depalo, 2016. "An Evaluation of the Policies on Repayment of Government’s Trade Debt in Italy," Italian Economic Journal: A Continuation of Rivista Italiana degli Economisti and Giornale degli Economisti, Springer;Società Italiana degli Economisti (Italian Economic Association), vol. 2(2), pages 167-196, July.
    10. Roth, Jonathan & Sant’Anna, Pedro H.C. & Bilinski, Alyssa & Poe, John, 2023. "What’s trending in difference-in-differences? A synthesis of the recent econometrics literature," Journal of Econometrics, Elsevier, vol. 235(2), pages 2218-2244.
    11. Morton, Rebecca B. & Muller, Daniel & Page, Lionel & Torgler, Benno, 2015. "Exit polls, turnout, and bandwagon voting: Evidence from a natural experiment," European Economic Review, Elsevier, vol. 77(C), pages 65-81.
    12. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    13. Juan M. Villa, 2014. "Social Transfers and Growth: The Missing Evidence from Luminosity Data," WIDER Working Paper Series wp-2014-090, World Institute for Development Economic Research (UNU-WIDER).
    14. Goodman-Bacon, Andrew, 2021. "Difference-in-differences with variation in treatment timing," Journal of Econometrics, Elsevier, vol. 225(2), pages 254-277.
    15. Zhang, Yingheng & Li, Haojie & Ren, Gang, 2022. "Quantifying the social impacts of the London Night Tube with a double/debiased machine learning based difference-in-differences approach," Transportation Research Part A: Policy and Practice, Elsevier, vol. 163(C), pages 288-303.
    16. Dmitry Arkhangelsky & Guido Imbens, 2018. "Fixed Effects and the Generalized Mundlak Estimator," Papers 1807.02099, arXiv.org, revised Aug 2023.
    17. Lucas Zhang, 2024. "Continuous difference-in-differences with double/debiased machine learning," Papers 2408.10509, arXiv.org.
    18. Tang, Shengfang & Huang, Zhilin, 2022. "Empirical likelihood confidence interval for difference-in-differences estimator with panel data," Economics Letters, Elsevier, vol. 216(C).
    19. Villa, Juan M., 2014. "Social transfers and growth: The missing evidence from luminosity data," WIDER Working Paper Series 090, World Institute for Development Economic Research (UNU-WIDER).
    20. Jonathan Fuhr & Philipp Berens & Dominik Papies, 2024. "Estimating Causal Effects with Double Machine Learning -- A Method Evaluation," Papers 2403.14385, arXiv.org, revised Apr 2024.
