
Block tensor train decomposition for missing data estimation

Author

Listed:
  • Namgil Lee

    (Kangwon National University)

  • Jong-Min Kim

    (University of Minnesota-Morris)

Abstract

We propose a method for imputing missing values in large-scale matrix data based on a low-rank tensor approximation technique called the block tensor train (BTT) decomposition. Given sparsely observed data points, the proposed method iteratively computes the singular value decomposition (SVD) of the underlying data matrix with missing values. The SVD is computed via a low-rank BTT decomposition, which dramatically reduces storage and time complexity for large-scale data matrices that admit a low-rank tensor structure. An iterative soft-thresholding algorithm for missing data estimation is implemented using an alternating least squares method for the BTT decomposition. Experimental results on simulated data and real benchmark data demonstrate that the proposed method can accurately estimate a large number of missing values compared to a standard matrix-based method. The R source code of the BTT-based imputation method is available at https://github.com/namgillee/BTTSoftImpute.
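
To make the iteration described in the abstract concrete, here is a minimal sketch of plain matrix-based iterative soft-thresholding of singular values, which is the kind of standard matrix method the abstract compares against. It is not the authors' BTT algorithm and is not taken from their R code: it uses a dense SVD from NumPy instead of a BTT decomposition, and the function name `soft_impute`, its parameters, and the stopping rule are assumptions chosen for illustration.

```python
import numpy as np

def soft_impute(X, mask, lam, n_iter=200, tol=1e-6):
    """Illustrative matrix-based soft-thresholded SVD imputation.

    X    : 2-D array; values at unobserved positions are ignored (may be NaN).
    mask : boolean array of the same shape, True where X is observed.
    lam  : threshold subtracted from every singular value (assumed tuning knob).
    """
    Z = np.zeros_like(X, dtype=float)  # current low-rank estimate
    for _ in range(n_iter):
        # Keep observed entries, fill unobserved ones with the current estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values; small ones shrink to zero.
        s_thr = np.maximum(s - lam, 0.0)
        Z_new = (U * s_thr) @ Vt
        # Relative change between successive estimates (hypothetical stopping rule).
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z

# Small synthetic example: a rank-2 matrix with roughly 30% of entries hidden.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
observed = rng.random(A.shape) < 0.7
A_hat = soft_impute(A, observed, lam=0.01)
rel_err = np.linalg.norm((A_hat - A)[~observed]) / np.linalg.norm(A[~observed])
print("relative error on hidden entries:", rel_err)
```

Each pass of this sketch costs a full dense SVD, which is the step the abstract says the BTT decomposition replaces in order to reduce storage and time for large matrices with low-rank tensor structure.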

Suggested Citation

  • Namgil Lee & Jong-Min Kim, 2018. "Block tensor train decomposition for missing data estimation," Statistical Papers, Springer, vol. 59(4), pages 1283-1305, December.
  • Handle: RePEc:spr:stpapr:v:59:y:2018:i:4:d:10.1007_s00362-018-1043-8
    DOI: 10.1007/s00362-018-1043-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-018-1043-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. J. Carroll & Jih-Jie Chang, 1970. "Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart-Young” decomposition," Psychometrika, Springer;The Psychometric Society, vol. 35(3), pages 283-319, September.
    2. GILLIS, Nicolas & GLINEUR, François, 2010. "Low-rank matrix approximation with weights or missing data is NP-hard," LIDAM Discussion Papers CORE 2010075, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    3. Henk Kiers, 1997. "Weighted least squares fitting using ordinary least squares algorithms," Psychometrika, Springer;The Psychometric Society, vol. 62(2), pages 251-266, June.
    4. Ledyard Tucker, 1966. "Some mathematical notes on three-mode factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 31(3), pages 279-311, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mariela González-Narváez & María José Fernández-Gómez & Susana Mendes & José-Luis Molina & Omar Ruiz-Barzola & Purificación Galindo-Villardón, 2021. "Study of Temporal Variations in Species–Environment Association through an Innovative Multivariate Method: MixSTATICO," Sustainability, MDPI, vol. 13(11), pages 1-25, May.
    2. Elisa Frutos-Bernal & Ángel Martín del Rey & Irene Mariñas-Collado & María Teresa Santos-Martín, 2022. "An Analysis of Travel Patterns in Barcelona Metro Using Tucker3 Decomposition," Mathematics, MDPI, vol. 10(7), pages 1-17, March.
    3. Yoshio Takane & Forrest Young & Jan Leeuw, 1977. "Nonmetric individual differences multidimensional scaling: An alternating least squares method with optimal scaling features," Psychometrika, Springer;The Psychometric Society, vol. 42(1), pages 7-67, March.
    4. Giuseppe Brandi & Ruggero Gramatica & Tiziana Di Matteo, 2019. "Unveil stock correlation via a new tensor-based decomposition method," Papers 1911.06126, arXiv.org, revised Apr 2020.
    5. Zhang, Shuang & Han, Le, 2023. "Robust tensor recovery with nonconvex and nonsmooth regularization," Applied Mathematics and Computation, Elsevier, vol. 438(C).
    6. Michel Velden & Tammo Bijmolt, 2006. "Generalized canonical correlation analysis of matrices with missing rows: a simulation study," Psychometrika, Springer;The Psychometric Society, vol. 71(2), pages 323-331, June.
    7. Paolo Giordani & Roberto Rocci & Giuseppe Bove, 2020. "Factor Uniqueness of the Structural Parafac Model," Psychometrika, Springer;The Psychometric Society, vol. 85(3), pages 555-574, September.
    8. Alwin Stegeman & Tam Lam, 2014. "Three-Mode Factor Analysis by Means of Candecomp/Parafac," Psychometrika, Springer;The Psychometric Society, vol. 79(3), pages 426-443, July.
    9. Chen Ling & Gaohang Yu & Liqun Qi & Yanwei Xu, 2021. "T-product factorization method for internet traffic data completion with spatio-temporal regularization," Computational Optimization and Applications, Springer, vol. 80(3), pages 883-913, December.
    10. Zhang, Tonglin, 2020. "CP decomposition and weighted clique problem," Statistics & Probability Letters, Elsevier, vol. 161(C).
    11. Timmerman, Marieke E. & Kiers, Henk A. L., 2002. "Three-way component analysis with smoothness constraints," Computational Statistics & Data Analysis, Elsevier, vol. 40(3), pages 447-470, September.
    12. Richard Harshman & Margaret Lundy, 1996. "Uniqueness proof for a family of models sharing features of Tucker's three-mode factor analysis and PARAFAC/candecomp," Psychometrika, Springer;The Psychometric Society, vol. 61(1), pages 133-154, March.
    13. Andersson, Claus A. & Henrion, Rene, 1999. "A general algorithm for obtaining simple structure of core arrays in N-way PCA with application to fluorometric data," Computational Statistics & Data Analysis, Elsevier, vol. 31(3), pages 255-278, September.
    14. Rubinstein, Alexander & Slutskin, Lev, 2018. "«Multiway data analysis» and the general problem of journals’ ranking," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 50, pages 90-113.
    15. Carlos Martin-Barreiro & John A. Ramirez-Figueroa & Ana B. Nieto-Librero & Víctor Leiva & Ana Martin-Casado & M. Purificación Galindo-Villardón, 2021. "A New Algorithm for Computing Disjoint Orthogonal Components in the Three-Way Tucker Model," Mathematics, MDPI, vol. 9(3), pages 1-22, January.
    16. Giudici, Paolo & Huang, Bihong & Spelta, Alessandro, 2019. "Trade networks and economic fluctuations in Asian countries," Economic Systems, Elsevier, vol. 43(2), pages 1-1.
    17. Jacques Bénasséni & Mohammed Bennani Dosse, 2012. "Analyzing multiset data by the Power STATIS-ACT method," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 6(1), pages 49-65, April.
    18. Ji Yeh Choi & Heungsun Hwang & Marieke E. Timmerman, 2018. "Functional Parallel Factor Analysis for Functions of One- and Two-dimensional Arguments," Psychometrika, Springer;The Psychometric Society, vol. 83(1), pages 1-20, March.
    19. Violetta Simonacci & Michele Gallo, 2019. "Detecting Public Social Spending Patterns in Italy Using a Three-Way Relative Variation Approach," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 146(1), pages 205-219, November.
    20. Laura Anderlucci & Alessandro Lubisco & Stefania Mignani, 2021. "Investigating the Judges Performance in a National Competition of Sport Dance," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 156(2), pages 783-799, August.
