
Low-rank matrix denoising for count data using unbiased Kullback-Leibler risk estimation

Authors

  • Bigot, Jérémie
  • Deledalle, Charles

Abstract

Many statistical studies are concerned with the analysis of observations organized in matrix form whose elements are count data. When these observations are assumed to follow a Poisson or a multinomial distribution, it is of interest to estimate either the intensity matrix (Poisson case) or the compositional matrix (multinomial case) under the assumption that it has a low-rank structure. In this setting, it is proposed to construct an estimator that minimizes the negative log-likelihood regularized by a nuclear norm penalty. Such an approach easily yields a low-rank matrix-valued estimator with positive entries, which belongs to the set of row-stochastic matrices in the multinomial case. Then, as a main contribution, a data-driven procedure is constructed to select the regularization parameter of such estimators by minimizing (approximately) unbiased estimates of the Kullback-Leibler (KL) risk in these models, which generalize Stein's unbiased risk estimate originally proposed for Gaussian data. The evaluation of these quantities is a delicate problem, and novel methods are introduced to obtain accurate numerical approximations of such unbiased estimates. Simulated data are used to validate this way of selecting the regularization parameter for low-rank matrix estimation from count data. For data following a multinomial distribution, the performance of this approach is also compared to K-fold cross-validation. Examples from a survey study and metagenomics also illustrate the benefits of this methodology for real data analysis.
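
To make the Poisson case of the estimator more concrete, the following is a minimal sketch of nuclear-norm-penalized likelihood estimation, not the authors' implementation. It assumes a log-link parameterization X = exp(Z), which the abstract does not specify but which is one simple way to guarantee positive entries, and it minimizes the objective by proximal gradient descent with a basic backtracking step. The function names (`svt`, `objective`, `fit_poisson_low_rank`), the initialization, and the default step size are all hypothetical choices made for illustration. The data-driven selection of the regularization parameter `lam` by unbiased KL risk estimation, which is the paper's main contribution, is not shown.

```python
import numpy as np


def svt(Z, tau):
    """Singular value soft-thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def objective(Z, Y, lam):
    """Poisson negative log-likelihood (dropping the log(Y!) constant)
    under the assumed log-link X = exp(Z), plus the nuclear norm penalty."""
    return np.sum(np.exp(Z) - Y * Z) + lam * np.linalg.norm(Z, "nuc")


def fit_poisson_low_rank(Y, lam, n_iter=200, step=1.0):
    """Proximal gradient descent on the penalized objective (illustrative sketch).

    Returns the estimated intensity matrix exp(Z), which is entrywise positive.
    """
    Y = np.asarray(Y, dtype=float)
    # Hypothetical initialization: constant matrix at the log of the mean count.
    Z = np.full(Y.shape, np.log(Y.mean() + 1e-8))
    f_cur = objective(Z, Y, lam)
    for _ in range(n_iter):
        grad = np.exp(Z) - Y  # gradient of the smooth likelihood term
        # Backtracking: halve the step until the objective does not increase.
        while step > 1e-12:
            Z_new = svt(Z - step * grad, step * lam)
            f_new = objective(Z_new, Y, lam)
            if f_new <= f_cur:
                Z, f_cur = Z_new, f_new
                break
            step *= 0.5
    return np.exp(Z)
```

For a count matrix `Y` stored as a NumPy array, `fit_poisson_low_rank(Y, lam=1.0)` returns a positive matrix of the same shape. Larger values of `lam` shrink more singular values of `Z` to zero, which is why the choice of this parameter matters and why a principled selection rule is needed.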

Suggested Citation

  • Bigot, Jérémie & Deledalle, Charles, 2022. "Low-rank matrix denoising for count data using unbiased Kullback-Leibler risk estimation," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
  • Handle: RePEc:eee:csdana:v:169:y:2022:i:c:s0167947322000032
    DOI: 10.1016/j.csda.2022.107423

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chao Kan & Wen Song, 2015. "Second-order conditions for existence of augmented Lagrange multipliers for eigenvalue composite optimization problems," Journal of Global Optimization, Springer, vol. 63(1), pages 77-97, September.
    2. Pilar Lopez-Llompart & G. Mathias Kondolf, 2016. "Encroachments in floodways of the Mississippi River and Tributaries Project," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 81(1), pages 513-542, March.
    3. Cheng, Jianquan & Bertolini, Luca, 2013. "Measuring urban job accessibility with distance decay, competition and diversity," Journal of Transport Geography, Elsevier, vol. 30(C), pages 100-109.
    4. M. De Donno & M. Pratelli, 2006. "A theory of stochastic integration for bond markets," Papers math/0602532, arXiv.org.
    5. Prilly Oktoviany & Robert Knobloch & Ralf Korn, 2021. "A machine learning-based price state prediction model for agricultural commodities using external factors," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 44(2), pages 1063-1085, December.
    6. Michelle Sheran Sylvester, 2007. "The Career and Family Choices of Women: A Dynamic Analysis of Labor Force Participation, Schooling, Marriage and Fertility Decisions," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 10(3), pages 367-399, July.
    7. Henrekson, Magnus & Johansson, Dan, 2010. "Firm Growth, Institutions and Structural Transformation," Ratio Working Papers 150, The Ratio Institute.
    8. Karen K. Lewis, 2011. "Global Asset Pricing," Annual Review of Financial Economics, Annual Reviews, vol. 3(1), pages 435-466, December.
    9. DAVID M. BLAU & WILBERT van der KLAAUW, 2013. "What Determines Family Structure?," Economic Inquiry, Western Economic Association International, vol. 51(1), pages 579-604, January.
    10. Panagiota DIONYSOPOULOU & Georgios SVARNIAS & Theodore PAPAILIAS, 2021. "Total Quality Management In Public Sector, Case Study: Customs Service," Regional Science Inquiry, Hellenic Association of Regional Scientists, vol. 0(1), pages 153-168, June.
    11. Afanasyev, Dmitriy O. & Fedorova, Elena A. & Popov, Viktor U., 2015. "Fine structure of the price–demand relationship in the electricity market: Multi-scale correlation analysis," Energy Economics, Elsevier, vol. 51(C), pages 215-226.
    12. Peter Viggo Jakobsen, 2009. "Small States, Big Influence: The Overlooked Nordic Influence on the Civilian ESDP," Journal of Common Market Studies, Wiley Blackwell, vol. 47(1), pages 81-102, January.
    13. Defeng Sun & Jie Sun, 2008. "Löwner's Operator and Spectral Functions in Euclidean Jordan Algebras," Mathematics of Operations Research, INFORMS, vol. 33(2), pages 421-445, May.
    14. Julie Holland Mortimer, 2007. "Price Discrimination, Copyright Law, and Technological Innovation: Evidence from the Introduction of DVDs," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 122(3), pages 1307-1350.
    15. Suwan Shen & Xi Feng & Zhong Ren Peng, 2016. "A framework to analyze vulnerability of critical infrastructure to climate change: the case of a coastal community in Florida," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 84(1), pages 589-609, October.
    16. Jean-Bernard Chatelain & Kirsten Ralf, 2017. "Can We Identify the Fed's Preferences?," Working Papers halshs-01549908, HAL.
    17. Billio, Monica & Casarin, Roberto & Osuntuyi, Anthony, 2016. "Efficient Gibbs sampling for Markov switching GARCH models," Computational Statistics & Data Analysis, Elsevier, vol. 100(C), pages 37-57.
    18. Jan Babecký & Fabrizio Coricelli & Roman Horváth, 2009. "Assessing Inflation Persistence: Micro Evidence on an Inflation Targeting Economy," Czech Journal of Economics and Finance (Finance a uver), Charles University Prague, Faculty of Social Sciences, vol. 59(2), pages 102-127, June.
    19. Lloyd, S. P., 2017. "Unconventional Monetary Policy and the Interest Rate Channel: Signalling and Portfolio Rebalancing," Cambridge Working Papers in Economics 1735, Faculty of Economics, University of Cambridge.
    20. Fischer, Andreas M. & Ranaldo, Angelo, 2011. "Does FOMC news increase global FX trading?," Journal of Banking & Finance, Elsevier, vol. 35(11), pages 2965-2973, November.
