Printed from https://ideas.repec.org/a/inm/ormnsc/v68y2022i3p1595-1615.html

Data Pooling in Stochastic Optimization

Author

Listed:
  • Vishal Gupta

    (Data Science and Operations, USC Marshall School of Business, Los Angeles, California 90089)

  • Nathan Kallus

    (School of Operations Research and Information Engineering and Cornell Tech, Cornell University, New York, New York 10044)

Abstract

Managing large-scale systems often involves simultaneously solving thousands of unrelated stochastic optimization problems, each with limited data. Intuition suggests that one can decouple these unrelated problems and solve them separately without loss of generality. We propose a novel data-pooling algorithm called Shrunken-SAA that disproves this intuition. In particular, we prove that combining data across problems can outperform decoupling, even when there is no a priori structure linking the problems and data are drawn independently. Our approach does not require strong distributional assumptions and applies to constrained, possibly nonconvex, nonsmooth optimization problems such as vehicle-routing, economic lot-sizing, or facility location. We compare and contrast our results to a similar phenomenon in statistics (Stein’s phenomenon), highlighting unique features that arise in the optimization setting that are not present in estimation. We further prove that, as the number of problems grows large, Shrunken-SAA learns whether pooling can improve upon decoupling and the optimal amount to pool, even if the average amount of data per problem is fixed and bounded. Importantly, we highlight a simple intuition based on stability that explains when and why data pooling offers a benefit, elucidating this perhaps surprising phenomenon. This intuition further suggests that data pooling offers the most benefits when there are many problems, each of which has a small amount of relevant data. Finally, we demonstrate the practical benefits of data pooling using real data from a chain of retail drug stores in the context of inventory management.
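
The pooling idea in the abstract can be illustrated with a small sketch. It is not the authors' Shrunken-SAA implementation: it assumes a newsvendor-style inventory problem with a shared discrete demand support, mixes each problem's empirical demand counts with an anchor distribution using a pooling weight `alpha`, and then solves each problem on the mixed distribution. The anchor (the average of the per-problem empirical distributions), the function names, and the way `alpha` enters are all illustrative assumptions. The step in which Shrunken-SAA learns the amount to pool from data is not reproduced here; `alpha` is simply an input.

```python
from typing import List, Sequence


def newsvendor_quantile(weights: Sequence[float],
                        support: Sequence[float],
                        service_level: float) -> float:
    """Return the smallest support point whose cumulative weight
    reaches service_level times the total weight."""
    target = service_level * sum(weights)
    cumulative = 0.0
    for value, weight in zip(support, weights):
        cumulative += weight
        if cumulative >= target - 1e-12:
            return value
    return support[-1]


def pooled_decisions(counts: Sequence[Sequence[int]],
                     support: Sequence[float],
                     alpha: float,
                     service_level: float) -> List[float]:
    """Order quantities for several newsvendor problems after pooling.

    counts[k][i] is how often demand value support[i] was observed for
    problem k (each problem needs at least one observation).
    alpha = 0 solves every problem on its own data only (decoupling);
    a larger alpha pulls every problem toward the shared anchor.
    """
    n_problems = len(counts)
    # Assumed anchor: average of the per-problem empirical distributions.
    anchor = [0.0] * len(support)
    for row in counts:
        n_obs = sum(row)
        for i, count in enumerate(row):
            anchor[i] += count / n_obs / n_problems

    decisions = []
    for row in counts:
        # Mix the problem's own counts with alpha "pseudo-observations"
        # drawn from the anchor, then solve on the mixed weights.
        weights = [count + alpha * a for count, a in zip(row, anchor)]
        decisions.append(newsvendor_quantile(weights, support, service_level))
    return decisions


if __name__ == "__main__":
    support = [0, 1, 2, 3, 4]
    counts = [[2, 0, 0, 0, 0], [0, 0, 0, 0, 2], [0, 1, 1, 0, 0]]
    for alpha in (0.0, 6.0, 1e6):
        print(alpha, pooled_decisions(counts, support, alpha, 0.6))
```

With only two observations per problem, the decoupled decisions (`alpha = 0`) sit at the extremes of each problem's own data, while a very large `alpha` makes every problem use the anchor's quantile. Intermediate values trade off between the two, which is the regime the abstract describes as most useful when there are many problems with little data each.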

Suggested Citation

  • Vishal Gupta & Nathan Kallus, 2022. "Data Pooling in Stochastic Optimization," Management Science, INFORMS, vol. 68(3), pages 1595-1615, March.
  • Handle: RePEc:inm:ormnsc:v:68:y:2022:i:3:p:1595-1615
    DOI: 10.1287/mnsc.2020.3933

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/mnsc.2020.3933
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Platanakis, Emmanouil & Sutcliffe, Charles & Ye, Xiaoxia, 2021. "Horses for courses: Mean-variance for asset allocation and 1/N for stock selection," European Journal of Operational Research, Elsevier, vol. 288(1), pages 302-317.
    2. Marco Neffelli, 2018. "Target Matrix Estimators in Risk-Based Portfolios," Risks, MDPI, vol. 6(4), pages 1-20, November.
    3. Hwang, Inchang & Xu, Simon & In, Francis, 2018. "Naive versus optimal diversification: Tail risk and performance," European Journal of Operational Research, Elsevier, vol. 265(1), pages 372-388.
    4. Kircher, Felix & Rösch, Daniel, 2021. "A shrinkage approach for Sharpe ratio optimal portfolios with estimation risks," Journal of Banking & Finance, Elsevier, vol. 133(C).
    5. Ortiz, Roberto & Contreras, Mauricio & Mellado, Cristhian, 2023. "Regression, multicollinearity and Markowitz," Finance Research Letters, Elsevier, vol. 58(PC).
    6. Chakrabarti, Deepayan, 2021. "Parameter-free robust optimization for the maximum-Sharpe portfolio problem," European Journal of Operational Research, Elsevier, vol. 293(1), pages 388-399.
    7. Sangwon Suh, 2018. "Portfolio Selection using New Factors based on Firm Characteristics," Journal of Economic Development, Chung-Ang University, Department of Economics, vol. 43(1), pages 77-99, March.
    8. Abadir, Karim M. & Distaso, Walter & Žikeš, Filip, 2014. "Design-free estimation of variance matrices," Journal of Econometrics, Elsevier, vol. 181(2), pages 165-180.
    9. Laniado Rodas, Henry, 2017. "Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators," DES - Working Papers. Statistics and Econometrics. WS 24613, Universidad Carlos III de Madrid. Departamento de Estadística.
    10. Candelon, B. & Hurlin, C. & Tokpavi, S., 2012. "Sampling error and double shrinkage estimation of minimum variance portfolios," Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
    11. Mishra, Anil V., 2016. "Foreign bias in Australian-domiciled mutual fund holdings," Pacific-Basin Finance Journal, Elsevier, vol. 39(C), pages 101-123.
    12. Andrew F. Siegel & Artemiza Woodgate, 2007. "Performance of Portfolios Optimized with Estimation Error," Management Science, INFORMS, vol. 53(6), pages 1005-1015, June.
    13. Afonso, António & Gomes, Pedro & Taamouti, Abderrahim, 2014. "Sovereign credit ratings, market volatility, and financial gains," Computational Statistics & Data Analysis, Elsevier, vol. 76(C), pages 20-33.
    14. Vaughn Gambeta & Roy Kwon, 2020. "Risk Return Trade-Off in Relaxed Risk Parity Portfolio Optimization," JRFM, MDPI, vol. 13(10), pages 1-28, October.
    15. Serrano, Breno & Minner, Stefan & Schiffer, Maximilian & Vidal, Thibaut, 2024. "Bilevel optimization for feature selection in the data-driven newsvendor problem," European Journal of Operational Research, Elsevier, vol. 315(2), pages 703-714.
    16. Jacobs, Heiko & Müller, Sebastian & Weber, Martin, 2014. "How should individual investors diversify? An empirical evaluation of alternative asset allocation policies," Journal of Financial Markets, Elsevier, vol. 19(C), pages 62-85.
    17. Hsiao-Fen Hsiao & Jiang-Chuan Huang & Zheng-Wei Lin, 2020. "Portfolio construction using bootstrapping neural networks: evidence from global stock market," Review of Derivatives Research, Springer, vol. 23(3), pages 227-247, October.
    18. Francesco Lautizi, 2015. "Large Scale Covariance Estimates for Portfolio Selection," CEIS Research Paper 353, Tor Vergata University, CEIS, revised 07 Aug 2015.
    19. Olivier Ledoit & Michael Wolf, 2003. "Honey, I shrunk the sample covariance matrix," Economics Working Papers 691, Department of Economics and Business, Universitat Pompeu Fabra.
    20. Michael W. Brandt & Pedro Santa-Clara & Rossen Valkanov, 2009. "Parametric Portfolio Policies: Exploiting Characteristics in the Cross-Section of Equity Returns," The Review of Financial Studies, Society for Financial Studies, vol. 22(9), pages 3411-3447, September.
