Printed from https://ideas.repec.org/a/inm/orijoc/v32y4i2020p1143-1156.html

Predictive Analytics with Strategically Missing Data

Authors

  • Juheng Zhang

    (Department of Operations and Information Systems, University of Massachusetts, Lowell, Massachusetts 01854)

  • Xiaoping Liu

    (D’Amore-McKim School of Business, Northeastern University, Boston, Massachusetts 02115)

  • Xiao-Bai Li

    (Department of Operations and Information Systems, University of Massachusetts, Lowell, Massachusetts 01854)

Abstract

We study strategically missing data problems in predictive analytics with regression. In many real-world situations, such as financial reporting, college admissions, job applications, and marketing advertisements, data providers often conceal certain information on purpose to gain a favorable outcome. The decision-maker therefore needs a mechanism to deal with such strategic behavior. We propose a novel approach to handling strategically missing data in regression prediction. The proposed method derives imputation values for strategically missing data from Support Vector Regression models and gives data providers an incentive to disclose their true information. We show that, under some reasonable conditions, the proposed method minimizes imputation errors for the missing values. An experimental study on real-world data demonstrates the effectiveness of the proposed approach.
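The disclosure incentive mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' method: the abstract does not specify how imputation values are derived, so the sketch assumes a plain linear scoring model in place of a fitted Support Vector Regression model, assumes that a higher prediction is the favorable outcome, and assumes a simple "least favorable bound" imputation rule. The names `predict`, `impute_least_favorable`, `weights`, `bias`, and `bounds` are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Linear prediction; a stand-in for any fitted regression model (assumption)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def impute_least_favorable(
    weights: Sequence[float],
    features: Sequence[Optional[float]],
    bounds: Sequence[Tuple[float, float]],
) -> List[float]:
    """Fill each concealed value (None) with the bound that lowers the prediction most.

    Assumed rule for illustration only: with a positive weight the lower bound
    is least favorable, with a negative weight the upper bound is.
    """
    filled = []
    for w, x, (low, high) in zip(weights, features, bounds):
        if x is None:
            x = low if w >= 0 else high
        filled.append(x)
    return filled


# Hypothetical model and feature ranges.
weights = [2.0, -1.0]
bias = 1.0
bounds = [(0.0, 10.0), (0.0, 5.0)]

true_record = [4.0, 3.0]
full_disclosure = predict(weights, bias, true_record)

# Score the same record with each feature concealed in turn.
for concealed in ([None, 3.0], [4.0, None]):
    imputed = impute_least_favorable(weights, concealed, bounds)
    print(concealed, "->", predict(weights, bias, imputed), "vs disclosed", full_disclosure)
```

Under these assumptions, concealing a value can never raise the provider's prediction above what full disclosure yields, which is the kind of incentive the abstract describes. How the paper balances this incentive against imputation error is not stated in the abstract and is not modeled here.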

Suggested Citation

  • Juheng Zhang & Xiaoping Liu & Xiao-Bai Li, 2020. "Predictive Analytics with Strategically Missing Data," INFORMS Journal on Computing, INFORMS, vol. 32(4), pages 1143-1156, October.
  • Handle: RePEc:inm:orijoc:v:32:y:4:i:2020:p:1143-1156
    DOI: 10.1287/ijoc.2019.0947


    References listed on IDEAS

    1. Braun, Sebastian & Dwenger, Nadja & Kübler, Dorothea, 2010. "Telling the Truth May Not Pay Off: An Empirical Study of Centralized University Admissions in Germany," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 10(1), pages 1-38, March.
    2. Conlin, Michael & Dickert-Conlin, Stacy & Chapman, Gabrielle, 2013. "Voluntary disclosure and the strategic behavior of colleges," Journal of Economic Behavior & Organization, Elsevier, vol. 96(C), pages 48-64.
    3. Young U. Ryu & R. Chandrasekaran & Varghese Jacob, 2004. "Prognosis Using an Isotonic Prediction Technique," Management Science, INFORMS, vol. 50(6), pages 777-785, June.
    4. Yinghui (Catherine) Yang & Hongyan Liu & Yuanjue Cai, 2013. "Discovery of Online Shopping Patterns Across Websites," INFORMS Journal on Computing, INFORMS, vol. 25(1), pages 161-176, February.
    5. Juheng Zhang & Haldun Aytug & Gary J. Koehler, 2014. "Research Note —Discriminant Analysis with Strategically Manipulated Data," Information Systems Research, INFORMS, vol. 25(3), pages 654-662, September.
    6. Harrison, David Jr. & Rubinfeld, Daniel L., 1978. "Hedonic housing prices and the demand for clean air," Journal of Environmental Economics and Management, Elsevier, vol. 5(1), pages 81-102, March.
    7. Fidan Boylu & Haldun Aytug & Gary J. Koehler, 2010. "Induction over Strategic Agents," Information Systems Research, INFORMS, vol. 21(1), pages 170-189, March.
    8. Kuntara Pukthuanthong‐Le & Nikhil Varaiya, 2007. "IPO Pricing, Block Sales, and Long‐Term Performance," The Financial Review, Eastern Finance Association, vol. 42(3), pages 319-348, August.

    Citations



    Cited by:

    1. Thirupathi Samala & Vijaya Kumar Manupati & Maria Leonilde R. Varela & Goran Putnik, 2021. "Investigation of Degradation and Upgradation Models for Flexible Unit Systems: A Systematic Literature Review," Future Internet, MDPI, vol. 13(3), pages 1-18, February.
    2. Feng, Qianqian & Sun, Xiaolei & Hao, Jun & Li, Jianping, 2021. "Predictability dynamics of multifactor-influenced installed capacity: A perspective of country clustering," Energy, Elsevier, vol. 214(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Juheng Zhang & Haldun Aytug & Gary J. Koehler, 2014. "Research Note —Discriminant Analysis with Strategically Manipulated Data," Information Systems Research, INFORMS, vol. 25(3), pages 654-662, September.
    2. Mehmet Eren Ahsen & Mehmet Ulvi Saygi Ayvaci & Srinivasan Raghunathan, 2019. "When Algorithmic Predictions Use Human-Generated Data: A Bias-Aware Classification Algorithm for Breast Cancer Diagnosis," Information Systems Research, INFORMS, vol. 30(1), pages 97-116, March.
    3. Zhang, Juheng & Aytug, Haldun, 2016. "Comparison of imputation methods for discriminant analysis with strategically hidden data," European Journal of Operational Research, Elsevier, vol. 255(2), pages 522-530.
    4. Klijn, Flip & Pais, Joana & Vorsatz, Marc, 2019. "Static versus dynamic deferred acceptance in school choice: Theory and experiment," Games and Economic Behavior, Elsevier, vol. 113(C), pages 147-163.
    5. Jianhong Shi & Qian Yang & Xiongya Li & Weixing Song, 2017. "Effects of measurement error on a class of single-index varying coefficient regression models," Computational Statistics, Springer, vol. 32(3), pages 977-1001, September.
    6. Grenet, Julien & He, YingHua & Kübler, Dorothea, 2022. "Preference Discovery in University Admissions: The Case for Dynamic Multioffer Mechanisms," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 130(6), pages 1-1.
    7. Villalonga, Belen, 2004. "Intangible resources, Tobin's q, and sustainability of performance differences," Journal of Economic Behavior & Organization, Elsevier, vol. 54(2), pages 205-230, June.
    8. Mira Fischer & Patrick Kampkötter, 2017. "Effects of German Universities' Excellence Initiative on Ability Sorting of Students and Perceptions of Educational Quality," Journal of Institutional and Theoretical Economics (JITE), Mohr Siebeck, Tübingen, vol. 173(4), pages 662-687, December.
    9. Brockmeier, M., 1991. "Entwicklung und Aufhebung von Reinheitsgeboten im Nahrungsmittelbereich – Analyse und Bewertung," Proceedings “Schriften der Gesellschaft für Wirtschafts- und Sozialwissenschaften des Landbaues e.V.”, German Association of Agricultural Economists (GEWISOLA), vol. 27.
    10. Miller, Steve & Startz, Richard, 2019. "Feasible generalized least squares using support vector regression," Economics Letters, Elsevier, vol. 175(C), pages 28-31.
    11. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    12. Prendergast, Luke A. & Li Wai Suen, Connie, 2011. "A new and practical influence measure for subsets of covariance matrix sample principal components with applications to high dimensional datasets," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 752-764, January.
    13. Tizheng Li & Xiaojuan Kang, 2022. "Variable selection of higher-order partially linear spatial autoregressive model with a diverging number of parameters," Statistical Papers, Springer, vol. 63(1), pages 243-285, February.
    14. Deac, Dan Stelian & Schebesch, Klaus Bruno, 2018. "Market Forecasts and Client Behavioral Data: Towards Finding Adequate Model Complexity," Studia Universitatis „Vasile Goldis” Arad – Economics Series, Sciendo, vol. 28(3), pages 50-75, September.
    15. Dimakopoulos, Philipp D. & Heller, C.-Philipp, 2015. "Matching with Waiting Times: The German Entry-Level Labour Market for Lawyers," VfS Annual Conference 2015 (Muenster): Economic Development - Theory and Policy 113153, Verein für Socialpolitik / German Economic Association.
    16. Tobias Reischmann & Thilo Klein & Sven Giegerich, 2021. "A deferred acceptance mechanism for decentralized, fast, and fair childcare assignment," The Journal of Mechanism and Institution Design, Society for the Promotion of Mechanism and Institution Design, University of York, vol. 6(1), pages 59-100, December.
    17. Juan Ignacio Zoloa, 2020. "Noise pollution and housing markets: A spatial hedonic analysis for La Plata City," Ensayos de Política Económica, Departamento de Investigación Francisco Valsecchi, Facultad de Ciencias Económicas, Pontificia Universidad Católica Argentina, vol. 3(2), pages 129-152, October.
    18. Cheng, Tsung-Chi, 2012. "On simultaneously identifying outliers and heteroscedasticity without specific form," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2258-2272.
    19. Bodhisattva Sen & Mary Meyer, 2017. "Testing against a linear regression model using ideas from shape-restricted estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(2), pages 423-448, March.
    20. Benítez-Peña, Sandra & Blanquero, Rafael & Carrizosa, Emilio & Ramírez-Cobo, Pepa, 2024. "Cost-sensitive probabilistic predictions for support vector machines," European Journal of Operational Research, Elsevier, vol. 314(1), pages 268-279.
