
Variational algorithms for biclustering models

Authors

  • Vu, Duy
  • Aitkin, Murray

Abstract

Biclustering is an important tool in exploratory statistical analysis that can be used to detect latent row and column groups with different response patterns. However, few studies incorporate covariate data directly into their biclustering models to explain these variations. A novel biclustering framework that considers both stochastic block structures and covariate effects is proposed to address this modeling problem. Fast approximate estimation algorithms are also developed to deal with a large number of latent variables and covariate coefficients. These algorithms are derived from the variational generalized expectation–maximization (EM) framework, where the goal is to increase, rather than maximize, the likelihood lower bound in both the E and M steps. The utility of the proposed biclustering framework is demonstrated through two block modeling applications, in model-based collaborative filtering and in microarray analysis.
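
To make the "increase, rather than maximize" idea concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a plain Bernoulli latent block model with no covariates, a fully factorized variational posterior, and closed-form M-step updates. All names (`fit_lbm`, `elbo`, `m_step`, `tau`, `eta`, `theta`) are hypothetical. The E step makes a single coordinate pass over the row and column memberships instead of iterating to convergence, so each sweep raises the lower bound without fully maximizing it.

```python
import numpy as np

def safe_log(p):
    # Guard against log(0); only matters in degenerate cases.
    return np.log(np.clip(p, 1e-300, None))

def softmax_rows(a):
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def elbo(y, tau, eta, pi, rho, theta):
    # Lower bound = expected complete-data log-likelihood + entropy of q.
    ones = tau.T @ y @ eta          # K x L expected count of ones per block
    zeros = tau.T @ (1 - y) @ eta   # K x L expected count of zeros per block
    fit = (ones * np.log(theta) + zeros * np.log1p(-theta)).sum()
    prior = (tau @ safe_log(pi)).sum() + (eta @ safe_log(rho)).sum()
    entropy = -(tau * safe_log(tau)).sum() - (eta * safe_log(eta)).sum()
    return fit + prior + entropy

def m_step(y, tau, eta):
    # Closed-form updates for mixing weights and block probabilities.
    pi = tau.mean(axis=0)
    rho = eta.mean(axis=0)
    den = np.outer(tau.sum(axis=0), eta.sum(axis=0))
    theta = (tau.T @ y @ eta) / np.maximum(den, 1e-12)
    # Keep theta away from 0 and 1 so the logs stay finite.
    return pi, rho, np.clip(theta, 1e-6, 1 - 1e-6)

def fit_lbm(y, K, L, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, m = y.shape
    tau = rng.dirichlet(np.ones(K), size=n)   # soft row memberships, n x K
    eta = rng.dirichlet(np.ones(L), size=m)   # soft column memberships, m x L
    pi, rho, theta = m_step(y, tau, eta)
    history = [elbo(y, tau, eta, pi, rho, theta)]
    for _ in range(n_iter):
        lt, l1t = np.log(theta), np.log1p(-theta)
        # Partial E step: one update of rows given columns ...
        tau = softmax_rows(safe_log(pi) + y @ eta @ lt.T + (1 - y) @ eta @ l1t.T)
        # ... then one update of columns given the new rows.
        eta = softmax_rows(safe_log(rho) + y.T @ tau @ lt + (1 - y).T @ tau @ l1t)
        pi, rho, theta = m_step(y, tau, eta)
        history.append(elbo(y, tau, eta, pi, rho, theta))
    return tau, eta, theta, np.array(history)

# Small synthetic example: 2 row groups x 2 column groups.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=30)
w = rng.integers(0, 2, size=20)
p = np.array([[0.9, 0.1], [0.2, 0.8]])
y = (rng.random((30, 20)) < p[z][:, w]).astype(float)
tau, eta, theta, history = fit_lbm(y, K=2, L=2)
```

Each of the three updates maximizes the bound over one block of quantities with the others held fixed, so `history` should be non-decreasing. In the setting the abstract describes, covariate coefficients would generally lack closed-form updates, and the M step would likewise settle for an update that only increases the bound.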

Suggested Citation

  • Vu, Duy & Aitkin, Murray, 2015. "Variational algorithms for biclustering models," Computational Statistics & Data Analysis, Elsevier, vol. 89(C), pages 12-24.
  • Handle: RePEc:eee:csdana:v:89:y:2015:i:c:p:12-24
    DOI: 10.1016/j.csda.2015.02.015

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947315000560
    Download Restriction: Full text for ScienceDirect subscribers only.



    Citations


    Cited by:

    1. Murray Aitkin & Duy Vu & Brian Francis, 2017. "Statistical modelling of a terrorist network," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(3), pages 751-768, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Inoue, Masaaki & Pham, Thong & Shimodaira, Hidetoshi, 2020. "Joint estimation of non-parametric transitivity and preferential attachment functions in scientific co-authorship networks," Journal of Informetrics, Elsevier, vol. 14(3).
    2. Babkin, Sergii & Stewart, Jonathan R. & Long, Xiaochen & Schweinberger, Michael, 2020. "Large-scale estimation of random graph models with local dependence," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
    3. Blazquez-Soriano, Amparo & Ramos-Sandoval, Rosmery, 2022. "Information transfer as a tool to improve the resilience of farmers against the effects of climate change: The case of the Peruvian National Agrarian Innovation System," Agricultural Systems, Elsevier, vol. 200(C).
    4. Cornelia Lawson, 2013. "Academic Inventions Outside the University: Investigating Patent Ownership in the UK," Industry and Innovation, Taylor & Francis Journals, vol. 20(5), pages 385-398, July.
    5. Rui Baptista & Joana Mendonça, 2010. "Proximity to knowledge sources and the location of knowledge-based start-ups," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 45(1), pages 5-29, August.
    6. Domenico Piccolo & Rosaria Simone, 2019. "The class of cub models: statistical foundations, inferential issues and empirical evidence," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(3), pages 389-435, September.
    7. Greene, William, 2007. "Functional Form and Heterogeneity in Models for Count Data," Foundations and Trends(R) in Econometrics, now publishers, vol. 1(2), pages 113-218, August.
    8. Christopher J. W. Zorn, 1998. "An Analytic and Empirical Examination of Zero-Inflated and Hurdle Poisson Specifications," Sociological Methods & Research, vol. 26(3), pages 368-400, February.
    9. Rasmus Lentz & Jean Marc Robin & Suphanit Piyapromdee, 2018. "On Worker and Firm Heterogeneity in Wages and Employment Mobility: Evidence from Danish Register Data," 2018 Meeting Papers 469, Society for Economic Dynamics.
    10. Agrawal, Ajay & Cockburn, Iain, 2003. "The anchor tenant hypothesis: exploring the role of large, local, R&D-intensive firms in regional innovation systems," International Journal of Industrial Organization, Elsevier, vol. 21(9), pages 1227-1253, November.
    11. Timothy C. Haab, "undated". "A Utility Based Repeated Discrete Choice Model of Consumer Demand," Working Papers 9611, East Carolina University, Department of Economics.
    12. Sándor Juhász, 2021. "Spinoffs and tie formation in cluster knowledge networks," Small Business Economics, Springer, vol. 56(4), pages 1385-1404, April.
    13. Niklas Elert, 2014. "What determines entry? Evidence from Sweden," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 53(1), pages 55-92, August.
    14. Boubaker, Sabri & Labégorre, Florence, 2008. "Ownership structure, corporate governance and analyst following: A study of French listed firms," Journal of Banking & Finance, Elsevier, vol. 32(6), pages 961-976, June.
    15. Simen G. Enger & Fulvio Castellacci, 2016. "Who gets Horizon 2020 research grants? Propensity to apply and probability to succeed in a two-step analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 1611-1638, December.
    16. Payandeh Najafabadi Amir T. & MohammadPour Saeed, 2018. "A k-Inflated Negative Binomial Mixture Regression Model: Application to Rate–Making Systems," Asia-Pacific Journal of Risk and Insurance, De Gruyter, vol. 12(2), pages 1-31, July.
    17. Abbas Moghimbeigi & Mohammed Reza Eshraghian & Kazem Mohammad & Brian Mcardle, 2008. "Multilevel zero-inflated negative binomial regression modeling for over-dispersed count data with extra zeros," Journal of Applied Statistics, Taylor & Francis Journals, vol. 35(10), pages 1193-1202.
    18. Kayvan Sadeghi & Alessandro Rinaldo, 2020. "Hierarchical models for independence structures of networks," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 74(3), pages 439-457, August.
    19. Nicolas Carayol, 2006. "La production de brevets par les chercheurs et enseignants-chercheurs. Le cas de l'université Louis Pasteur," Economie & Prévision, La Documentation Française, vol. 0(4), pages 117-134.
    20. Levan Elbakidze & Rodolfo M. Nayga Jr. & Hao Li & Chris McIntosh, 2014. "Value elicitation for multiple quantities of a quasi-public good using open ended choice experiments and uniform price auctions," Agricultural Economics, International Association of Agricultural Economists, vol. 45(2), pages 253-265, March.
