
Stochastic Learning of Semiparametric Monotone Index Models with Large Sample Size

Author

  • Qingsong Yao

Abstract

I study the estimation of semiparametric monotone index models in the scenario where the number of observations $n$ is extremely large and conventional approaches fail because of their heavy computational burden. Motivated by the mini-batch gradient descent (MBGD) algorithm, which is widely used as a stochastic optimization tool in machine learning, I propose a novel subsample- and iteration-based estimation procedure. In particular, starting from any initial guess of the true parameter, I progressively update the parameter using a sequence of subsamples randomly drawn from the data set, each with a sample size much smaller than $n$. The update is based on the gradient of a well-chosen loss function, in which the nonparametric component is replaced by its Nadaraya-Watson kernel estimator computed from the subsamples. The proposed algorithm essentially generalizes the MBGD algorithm to the semiparametric setup. Compared with the full-sample-based method, the new method reduces computational time by a factor of roughly $n$ if the subsample size and the kernel function are chosen properly, so it can be easily applied when the sample size $n$ is large. Moreover, I show that if I further average the estimators produced during the iterations, the difference between the average estimator and the full-sample-based estimator is $1/\sqrt{n}$-trivial. Consequently, the average estimator is $1/\sqrt{n}$-consistent and asymptotically normally distributed. In other words, the new estimator substantially improves computational speed while maintaining estimation accuracy.
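
The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the loss function, kernel, bandwidth, or step size, so everything below is an assumption made for illustration. In particular, the sketch assumes a binary-choice model $E[y \mid x] = G(x_0 + X'\beta)$ with the coefficient on $x_0$ normalized to one, a squared-error-type update direction $X'(\hat{G} - y)$, a Gaussian kernel with a fixed bandwidth, two independently drawn subsamples per iteration (one for the kernel estimate, one for the gradient), and a simple post-burn-in average of the iterates. The function names `nw_estimate` and `averaged_semiparametric_mbgd` are hypothetical.

```python
import numpy as np


def nw_estimate(z_eval, z_fit, y_fit, bandwidth):
    """Nadaraya-Watson estimate of E[y | z] at z_eval (Gaussian kernel, assumed)."""
    u = (z_eval[:, None] - z_fit[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)  # kernel weights, shape (len(z_eval), len(z_fit))
    denom = np.maximum(w.sum(axis=1), np.finfo(float).tiny)  # avoid division by zero
    return (w @ y_fit) / denom


def averaged_semiparametric_mbgd(x0, X, y, beta_init, n_iter=600, burn_in=300,
                                 batch=200, step=1.0, bandwidth=0.2, seed=0):
    """Mini-batch updates with a kernel plug-in, followed by iterate averaging.

    All tuning constants are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    beta = np.array(beta_init, dtype=float)
    beta_sum = np.zeros_like(beta)
    for k in range(n_iter):
        # Two independently drawn subsamples, each much smaller than n.
        fit = rng.choice(n, size=batch, replace=False)  # used for the kernel estimate
        upd = rng.choice(n, size=batch, replace=False)  # used for the gradient
        z_fit = x0[fit] + X[fit] @ beta
        z_upd = x0[upd] + X[upd] @ beta
        # Replace the unknown link function by its subsample kernel estimate.
        g_hat = nw_estimate(z_upd, z_fit, y[fit], bandwidth)
        # Assumed update direction: average of (g_hat - y) * X over the subsample.
        grad = X[upd].T @ (g_hat - y[upd]) / batch
        beta = beta - step * grad
        if k >= burn_in:
            beta_sum += beta  # accumulate iterates for the average estimator
    return beta_sum / (n_iter - burn_in)


# Simulated data (hypothetical design): logistic errors, two free coefficients.
rng = np.random.default_rng(1)
n = 20_000
beta_true = np.array([1.0, -0.5])
x0 = rng.normal(size=n)
X = rng.normal(size=(n, 2))
y = (x0 + X @ beta_true + rng.logistic(size=n) > 0).astype(float)

beta_bar = averaged_semiparametric_mbgd(x0, X, y, beta_init=np.zeros(2))
print("averaged estimate:", beta_bar, "true value:", beta_true)
```

In this sketch each iteration evaluates the kernel only on a `batch`-by-`batch` grid, so its cost does not grow with $n$, which is the source of the computational saving the abstract describes. The fixed bandwidth introduces some smoothing bias; the paper's conditions on the kernel and subsample size are not reproduced here.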

Suggested Citation

  • Qingsong Yao, 2023. "Stochastic Learning of Semiparametric Monotone Index Models with Large Sample Size," Papers 2309.06693, arXiv.org, revised Oct 2023.
  • Handle: RePEc:arx:papers:2309.06693

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2309.06693
    File Function: Latest version
    Download Restriction: no


