Printed from https://ideas.repec.org/p/arx/papers/2502.06446.html

Grouped fixed effects regularization for binary choice models

Author

Listed:
  • Claudia Pigini
  • Alessandro Pionati
  • Francesco Valentini

Abstract

We study the application of the Grouped Fixed Effects (GFE) estimator (Bonhomme et al., Econometrica 90(2):625-643, 2022) to binary choice models for network and panel data. The approach discretizes unobserved heterogeneity via k-means clustering and then performs maximum likelihood estimation, which reduces the number of fixed effects to be estimated in finite samples. This regularization helps in analyzing small or sparse networks and rare events because it mitigates complete separation, which can otherwise lead to data loss. We focus on dynamic models with few state transitions and on network formation models for sparse networks. The effectiveness of the method is demonstrated through simulations and real data applications.
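The two-step idea in the abstract (cluster units with k-means, then run maximum likelihood with group effects in place of unit effects) can be illustrated with a minimal sketch. This is not the authors' implementation. The simulated static panel, the choice of unit-level moments (the means of the outcome and the regressor), the number of groups K, and the helper names kmeans and logit_mle are all assumptions made for illustration. The sketch also counts the units whose outcome never changes. Those units are completely separated by their own fixed effect and would be dropped by a standard fixed-effects logit, but they remain in the grouped estimation sample.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm (hypothetical helper); returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for g in range(k):
            if np.any(labels == g):
                centers[g] = points[labels == g].mean(axis=0)
    return labels

def logit_mle(X, y, n_iter=50, tol=1e-10):
    """Logit maximum likelihood by Newton-Raphson (hypothetical helper)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Simulated panel (assumed design): N units, T periods, K groups.
rng = np.random.default_rng(42)
N, T, K, beta_true = 200, 6, 3, 1.0
alpha = rng.normal(-1.0, 1.0, size=N)                  # unit heterogeneity
x = 0.5 * alpha[:, None] + rng.normal(size=(N, T))     # regressor correlated with alpha
prob = 1.0 / (1.0 + np.exp(-(alpha[:, None] + beta_true * x)))
y = (rng.uniform(size=(N, T)) < prob).astype(float)

# Units with no outcome variation are lost under a standard fixed-effects logit.
row_sums = y.sum(axis=1)
n_dropped = int(((row_sums == 0) | (row_sums == T)).sum())

# Step 1: k-means on unit-level moments (assumed here: means of y and x).
moments = np.column_stack([y.mean(axis=1), x.mean(axis=1)])
labels = kmeans(moments, K)
_, labels = np.unique(labels, return_inverse=True)     # drop any empty group
n_groups = int(labels.max()) + 1

# Step 2: pooled logit with group dummies replacing the N unit effects.
D = np.zeros((N, n_groups))
D[np.arange(N), labels] = 1.0
design = np.column_stack([x.reshape(-1), np.repeat(D, T, axis=0)])
theta_hat = logit_mle(design, y.reshape(-1))
beta_hat = theta_hat[0]

print(f"units without outcome variation: {n_dropped} of {N}")
print(f"group effects estimated: {n_groups}; slope estimate: {beta_hat:.3f}")
```

Under these assumptions, the second step estimates at most K group effects rather than N unit effects, and every unit enters the likelihood. How to choose K and which moments to cluster on are left open in this sketch.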

Suggested Citation

  • Claudia Pigini & Alessandro Pionati & Francesco Valentini, 2025. "Grouped fixed effects regularization for binary choice models," Papers 2502.06446, arXiv.org.
  • Handle: RePEc:arx:papers:2502.06446

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2502.06446
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Rob Alessie & Stefan Hochguertel & Arthur van Soest, 2004. "Ownership of Stocks and Mutual Funds: A Panel Data Analysis," The Review of Economics and Statistics, MIT Press, vol. 86(3), pages 783-796, August.
    2. Bettin, Giulia & Lucchetti, Riccardo & Zazzaro, Alberto, 2012. "Endogeneity and sample selection in a model for remittances," Journal of Development Economics, Elsevier, vol. 99(2), pages 370-384.
    3. Johannes S. Kunz & Kevin E. Staub & Rainer Winkelmann, 2021. "Predicting individual effects in fixed effects panel probit models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(3), pages 1109-1145, July.
    4. Liangjun Su & Zhentao Shi & Peter C. B. Phillips, 2016. "Identifying Latent Structures in Panel Data," Econometrica, Econometric Society, vol. 84, pages 2215-2264, November.
    5. Ioannis Kosmidis & David Firth, 2009. "Bias reduction in exponential family nonlinear models," Biometrika, Biometrika Trust, vol. 96(4), pages 793-804.
    6. Fernández-Val, Iván & Weidner, Martin, 2016. "Individual and time effects in nonlinear panel models with large N, T," Journal of Econometrics, Elsevier, vol. 192(1), pages 291-312.
    7. Docquier, Frédéric & Rapoport, Hillel & Salomone, Sara, 2012. "Remittances, migrants' education and immigration policy: Theory and evidence from bilateral data," Regional Science and Urban Economics, Elsevier, vol. 42(5), pages 817-828.
    8. Koen Jochmans, 2018. "Semiparametric Analysis of Network Formation," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 36(4), pages 705-713, October.
    9. Áureo de Paula, 2020. "Econometric Models of Network Formation," Annual Review of Economics, Annual Reviews, vol. 12(1), pages 775-799, August.
    10. Lorenzo Cappellari & Stephen P. Jenkins, 2004. "Modelling low income transitions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 19(5), pages 593-610.
    11. Wang, Yiren & Phillips, Peter C.B. & Su, Liangjun, 2024. "Panel data models with time-varying latent group structures," Journal of Econometrics, Elsevier, vol. 240(1).
    12. Daron Acemoglu & Asuman Ozdaglar & Alireza Tahbaz-Salehi, 2015. "Systemic Risk and Stability in Financial Networks," American Economic Review, American Economic Association, vol. 105(2), pages 564-608, February.
    13. Haihong Li & Bruce G. Lindsay & Richard P. Waterman, 2003. "Efficiency of projected score methods in rectangular array asymptotics," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 65(1), pages 191-208, February.
    14. Jeffrey M. Wooldridge, 2005. "Simple solutions to the initial conditions problem in dynamic, nonlinear panel data models with unobserved heterogeneity," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 20(1), pages 39-54, January.
    15. Lueth, Erik & Ruiz-Arranz, Marta, 2008. "Determinants of Bilateral Remittance Flows," The B.E. Journal of Macroeconomics, De Gruyter, vol. 8(1), pages 1-23, October.
    16. Bryan S. Graham, 2017. "An econometric model of network formation with degree heterogeneity," CeMMAP working papers 08/17, Institute for Fiscal Studies.
    17. Tomohiro Ando & Jushan Bai, 2016. "Panel Data Models with Grouped Factor Structure Under Unknown Group Membership," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 31(1), pages 163-191, January.
    18. Andreas Dzemski, 2019. "An Empirical Model of Dyadic Link Formation in a Network with Unobserved Heterogeneity," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 763-776, December.
    19. Bettin, Giulia & Lucchetti, Riccardo & Pigini, Claudia, 2018. "A dynamic double hurdle model for remittances: evidence from Germany," Economic Modelling, Elsevier, vol. 73(C), pages 365-377.
    20. Carro, Jesus M., 2007. "Estimating dynamic panel data discrete choice models with fixed effects," Journal of Econometrics, Elsevier, vol. 140(2), pages 503-528, October.
    21. Geert Dhaene & Koen Jochmans, 2015. "Split-panel Jackknife Estimation of Fixed-effect Models," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 82(3), pages 991-1030.
    22. Hahn, Jinyong & Moon, Hyungsik Roger, 2010. "Panel Data Models With Finite Number Of Multiple Equilibria," Econometric Theory, Cambridge University Press, vol. 26(3), pages 863-881, June.
    23. Stéphane Bonhomme & Thibaut Lamadon & Elena Manresa, 2022. "Discretizing Unobserved Heterogeneity," Econometrica, Econometric Society, vol. 90(2), pages 625-643, March.
    24. Drescher, Katharina & Janzen, Benedikt, 2021. "Determinants, persistence, and dynamics of energy poverty: An empirical assessment using German household survey data," Energy Economics, Elsevier, vol. 102(C).
    25. David W. Hughes, 2021. "Estimating Nonlinear Network Data Models with Fixed Effects," Boston College Working Papers in Economics 1058, Boston College Department of Economics.
    26. Gary Chamberlain, 1980. "Analysis of Covariance with Qualitative Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 47(1), pages 225-238.
    27. Jinyong Hahn & Whitney Newey, 2004. "Jackknife and Analytical Bias Reduction for Nonlinear Panel Models," Econometrica, Econometric Society, vol. 72(4), pages 1295-1319, July.
    28. Paul Contoyannis & Andrew M. Jones & Nigel Rice, 2004. "The dynamics of health in the British Household Panel Survey," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 19(4), pages 473-503.
    29. Elhanan Helpman & Marc Melitz & Yona Rubinstein, 2008. "Estimating Trade Flows: Trading Partners and Trading Volumes," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 123(2), pages 441-487.
    30. Orazio Attanasio & Abigail Barr & Juan Camilo Cardenas & Garance Genicot & Costas Meghir, 2012. "Risk Pooling, Risk Preferences, and Social Networks," American Economic Journal: Applied Economics, American Economic Association, vol. 4(2), pages 134-167, April.
    31. Freeman, Hugo & Weidner, Martin, 2023. "Linear panel regressions with two-way unobserved heterogeneity," Journal of Econometrics, Elsevier, vol. 237(1).
    32. Bester, C. Alan & Hansen, Christian B., 2016. "Grouped effects estimators in fixed effects models," Journal of Econometrics, Elsevier, vol. 190(1), pages 197-208.
    33. Pigini, Claudia & Presbitero, Andrea F. & Zazzaro, Alberto, 2016. "State dependence in access to credit," Journal of Financial Stability, Elsevier, vol. 27(C), pages 17-34.
    34. Dean R. Hyslop, 1999. "State Dependence, Serial Correlation and Heterogeneity in Intertemporal Labor Force Participation of Married Women," Econometrica, Econometric Society, vol. 67(6), pages 1255-1294, November.
    35. Marta F. Arroyabe & Martin Schumann, 2022. "On the Estimation of True State Dependence in the Persistence of Innovation," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 84(4), pages 850-893, August.
    36. Bartolucci, Francesco & Nigro, Valentina, 2012. "Pseudo conditional maximum likelihood estimation of the dynamic logit model for binary panel data," Journal of Econometrics, Elsevier, vol. 170(1), pages 102-116.
    37. Bryan S. Graham, 2017. "An Econometric Model of Network Formation With Degree Heterogeneity," Econometrica, Econometric Society, vol. 85, pages 1033-1063, July.
    38. Caggiano, Giovanni & Calice, Pietro & Leonida, Leone & Kapetanios, George, 2016. "Comparing logit-based early warning systems: Does the duration of systemic banking crises matter?," Journal of Empirical Finance, Elsevier, vol. 37(C), pages 104-116.
    39. Hahn, Jinyong & Kuersteiner, Guido, 2011. "Bias Reduction For Dynamic Nonlinear Panel Models With Fixed Effects," Econometric Theory, Cambridge University Press, vol. 27(6), pages 1152-1191, December.
    40. Bester, C. Alan & Hansen, Christian, 2009. "A Penalty Function Approach to Bias Reduction in Nonlinear Panel Models with Fixed Effects," Journal of Business & Economic Statistics, American Statistical Association, vol. 27(2), pages 131-148.
    41. Geert Dhaene & Koen Jochmans, 2015. "Split-panel Jackknife Estimation of Fixed-effect Models," Review of Economic Studies, Oxford University Press, vol. 82(3), pages 991-1030.
    42. Bryan S. Graham, 2020. "Sparse network asymptotics for logistic regression," Papers 2010.04703, arXiv.org.
    43. Bo E. Honoré & Ekaterini Kyriazidou, 2000. "Panel Data Discrete Choice Models with Lagged Dependent Variables," Econometrica, Econometric Society, vol. 68(4), pages 839-874, July.
    44. Karyne B. Charbonneau, 2017. "Multiple fixed effects in binary response panel data models," Econometrics Journal, Royal Economic Society, vol. 20(3), pages 1-13, October.
    45. Pigini, Claudia, 2021. "Penalized maximum likelihood estimation of logit-based early warning systems," International Journal of Forecasting, Elsevier, vol. 37(3), pages 1156-1172.
    46. Bryan S. Graham, 2020. "Sparse Network Asymptotics for Logistic Regression under Possible Misspecification," NBER Working Papers 27962, National Bureau of Economic Research, Inc.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francesco Bartolucci & Claudia Pigini & Francesco Valentini, 2023. "Conditional inference and bias reduction for partial effects estimation of fixed-effects logit models," Empirical Economics, Springer, vol. 64(5), pages 2257-2290, May.
    2. Francesco Bartolucci & Francesco Valentini & Claudia Pigini, 2023. "Recursive Computation of the Conditional Probability Function of the Quadratic Exponential Model for Binary Panel Data," Computational Economics, Springer;Society for Computational Economics, vol. 61(2), pages 529-557, February.
    3. Francesco Bartolucci & Claudia Pigini & Francesco Valentini, 2024. "MCMC conditional maximum likelihood for the two-way fixed-effects logit," Econometric Reviews, Taylor & Francis Journals, vol. 43(6), pages 379-404, July.
    4. Claudia Pigini & Alessandro Pionati & Francesco Valentini, 2023. "Specification testing with grouped fixed effects," Papers 2310.01950, arXiv.org.
    5. Chernozhukov, Victor & Fernández-Val, Iván & Weidner, Martin, 2024. "Network and panel quantile effects via distribution regression," Journal of Econometrics, Elsevier, vol. 240(2).
    6. Geert Dhaene & Koen Jochmans, 2015. "Split-panel Jackknife Estimation of Fixed-effect Models," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 82(3), pages 991-1030.
    7. Lucchetti, Riccardo & Pigini, Claudia, 2017. "DPB: Dynamic Panel Binary Data Models in gretl," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 79(i08).
    8. David W. Hughes, 2021. "Estimating Nonlinear Network Data Models with Fixed Effects," Boston College Working Papers in Economics 1058, Boston College Department of Economics.
    13. Jochmans, Koen & Higgins, Ayden, 2022. "Bootstrap inference for fixed-effect models," TSE Working Papers 22-1328, Toulouse School of Economics (TSE), revised Dec 2023.
    14. Liu, Ruiqi & Shang, Zuofeng & Zhang, Yonghui & Zhou, Qiankun, 2020. "Identification and estimation in panel models with overspecified number of groups," Journal of Econometrics, Elsevier, vol. 215(2), pages 574-590.
    15. Schumann, Martin & Severini, Thomas A. & Tripathi, Gautam, 2021. "Integrated likelihood based inference for nonlinear panel data models with unobserved effects," Journal of Econometrics, Elsevier, vol. 223(1), pages 73-95.
    16. Weidner, Martin & Zylkin, Thomas, 2021. "Bias and consistency in three-way gravity models," Journal of International Economics, Elsevier, vol. 132(C).
    17. Francesco Bartolucci & Claudia Pigini, 2017. "Granger causality in dynamic binary short panel data models," Working Papers 421, Università Politecnica delle Marche (I), Dipartimento di Scienze Economiche e Sociali.
    18. Francesco Bartolucci & Valentina Nigro & Claudia Pigini, 2018. "Testing for state dependence in binary panel data with individual covariates by a modified quadratic exponential model," Econometric Reviews, Taylor & Francis Journals, vol. 37(1), pages 61-88, January.
    19. Andreas Dzemski, 2019. "An Empirical Model of Dyadic Link Formation in a Network with Unobserved Heterogeneity," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 763-776, December.
    20. Freeman, Hugo & Weidner, Martin, 2023. "Linear panel regressions with two-way unobserved heterogeneity," Journal of Econometrics, Elsevier, vol. 237(1).
    21. Mugnier, Martin & Wang, Ao, 2022. "Identification and (Fast) Estimation of Large Nonlinear Panel Models with Two-Way Fixed Effects," The Warwick Economics Research Paper Series (TWERPS) 1422, University of Warwick, Department of Economics.
    22. Stéphane Bonhomme & Koen Jochmans & Martin Weidner, 2025. "A Neyman-orthogonalization approach to the incidental parameter problem," CeMMAP working papers 05/25, Institute for Fiscal Studies.
    23. Xuan Leng & Jiaming Mao & Yutao Sun, 2023. "Debiased Inference for Dynamic Nonlinear Panels with Multi-dimensional Heterogeneities," Papers 2305.03134, arXiv.org, revised Nov 2024.
    24. Chen, Mingli & Fernández-Val, Iván & Weidner, Martin, 2021. "Nonlinear factor models for network and panel data," Journal of Econometrics, Elsevier, vol. 220(2), pages 296-324.
