Printed from https://ideas.repec.org/a/eee/ejores/v271y2018i2p594-605.html

Model-based capacitated clustering with posterior regularization

Authors

  • Mai, Feng
  • Fry, Michael J.
  • Ohlmann, Jeffrey W.

Abstract

We propose a heuristic approach to address the general class of optimization problems involving the capacitated clustering of observations consisting of variable values that are realizations from respective probability distributions. Based on the expectation-maximization algorithm, our approach unifies Gaussian mixture modeling for clustering analysis and cluster capacity constraints using a posterior regularization framework. To test our algorithm, we consider the capacitated p-median problem in which the observations consist of geographic locations of customers and the corresponding demand of these customers. Our heuristic has superior performance compared to classic geometrical clustering heuristics, with robust performance over a collection of instance types.
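
To make the idea of capacity-aware, model-based clustering concrete, here is a minimal sketch in plain Python. It is not the authors' posterior regularization procedure: it substitutes a simple greedy repair step for the regularized E-step, and it assumes an equal-weight spherical Gaussian mixture, a single shared capacity, and hypothetical function names (`responsibilities`, `capacitated_assign`, `update_means`, `capacitated_em`) chosen only for illustration.

```python
import math

def responsibilities(points, means, sigma=1.0):
    """E-step for an equal-weight spherical Gaussian mixture (assumed model)."""
    resp = []
    for x, y in points:
        logp = [-((x - mx) ** 2 + (y - my) ** 2) / (2 * sigma ** 2)
                for mx, my in means]
        top = max(logp)                      # subtract the max for numerical stability
        w = [math.exp(v - top) for v in logp]
        total = sum(w)
        resp.append([v / total for v in w])
    return resp

def capacitated_assign(resp, demands, capacity):
    """Greedy repair (an assumption, not posterior regularization):
    largest demand first, most probable cluster that still has room."""
    k = len(resp[0])
    load = [0.0] * k
    labels = [None] * len(resp)
    for i in sorted(range(len(resp)), key=lambda i: -demands[i]):
        for c in sorted(range(k), key=lambda c: -resp[i][c]):
            if load[c] + demands[i] <= capacity:
                labels[i] = c
                load[c] += demands[i]
                break
        if labels[i] is None:
            raise ValueError("no cluster has room for point %d" % i)
    return labels, load

def update_means(points, labels, means):
    """M-step on the hard assignment; an empty cluster keeps its old mean."""
    new_means = []
    for c, old in enumerate(means):
        members = [points[i] for i, lab in enumerate(labels) if lab == c]
        if members:
            new_means.append((sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members)))
        else:
            new_means.append(old)
    return new_means

def capacitated_em(points, demands, means, capacity, iters=10):
    """Alternate the three steps above for a fixed number of iterations."""
    for _ in range(iters):
        resp = responsibilities(points, means)
        labels, load = capacitated_assign(resp, demands, capacity)
        means = update_means(points, labels, means)
    return labels, load, means

# Toy instance: three customers near the origin, one far away, unit demands.
customers = [(0, 0), (0, 1), (1, 0), (10, 10)]
unit_demands = [1, 1, 1, 1]
labels, load, means = capacitated_em(customers, unit_demands,
                                     [(0, 0), (10, 10)], capacity=2)
print(labels, load)
```

With a capacity of 2 per cluster, the third customer cannot join its nearest cluster and is pushed to the other one, which is the kind of trade-off between geometric fit and capacity that the abstract describes. A greedy repair like this can fail on tight instances even when a feasible assignment exists, so it should be read only as an illustration of the problem structure.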

Suggested Citation

  • Mai, Feng & Fry, Michael J. & Ohlmann, Jeffrey W., 2018. "Model-based capacitated clustering with posterior regularization," European Journal of Operational Research, Elsevier, vol. 271(2), pages 594-605.
  • Handle: RePEc:eee:ejores:v:271:y:2018:i:2:p:594-605
    DOI: 10.1016/j.ejor.2018.04.048

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221718303758
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Gendreau, Michel & Laporte, Gilbert & Seguin, Rene, 1996. "Stochastic vehicle routing," European Journal of Operational Research, Elsevier, vol. 88(1), pages 3-12, January.
    2. I H Osman & S Ahmadi, 2007. "Guided construction search metaheuristics for the capacitated p-median problem with single source constraint," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 58(1), pages 100-114, January.
    3. Samad Ahmadi & Ibrahim Osman, 2004. "Density Based Problem Space Search for the Capacitated Clustering p-Median Problem," Annals of Operations Research, Springer, vol. 131(1), pages 21-43, October.
    4. Barry N. Boots & Arthur Getis, 1985. "Point Pattern Analysis," Wholbk, Regional Research Institute, West Virginia University, number 13 edited by Grant I. Thrall, Fall.
    5. Barreto, Sergio & Ferreira, Carlos & Paixao, Jose & Santos, Beatriz Sousa, 2007. "Using clustering analysis in a capacitated location-routing problem," European Journal of Operational Research, Elsevier, vol. 179(3), pages 968-977, June.
    6. Salema, Maria Isabel Gomes & Barbosa-Povoa, Ana Paula & Novais, Augusto Q., 2007. "An optimization model for the design of a capacitated multi-product reverse logistics network with uncertainty," European Journal of Operational Research, Elsevier, vol. 179(3), pages 1063-1077, June.
    7. Bozkaya, Burcin & Erkut, Erhan & Laporte, Gilbert, 2003. "A tabu search heuristic and adaptive memory procedure for political districting," European Journal of Operational Research, Elsevier, vol. 144(1), pages 12-26, January.
    8. Barry N. Boots & Arthur Getis, 1985. "Point Pattern Analysis," Book Chapters, in: Grant I. Thrall (ed.),Scientific Geography Series, pages 50, Regional Research Institute, West Virginia University.
    9. Fleszar, K. & Hindi, K.S., 2008. "An effective VNS for the capacitated p-median problem," European Journal of Operational Research, Elsevier, vol. 191(3), pages 612-622, December.
    10. Tu, Yufeng & Ball, Michael O. & Jank, Wolfgang S., 2008. "Estimating Flight Departure Delay Distributions: A Statistical Approach With Long-Term Trend and Short-Term Pattern," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 112-125, March.
    11. Mark S. Handcock & Adrian E. Raftery & Jeremy M. Tantrum, 2007. "Model‐based clustering for social networks," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(2), pages 301-354, March.
    12. Fraley, Chris & Raftery, Adrian, 2007. "Model-based Methods of Classification: Using the mclust Software in Chemometrics," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 18(i06).
    13. A. Charnes & W. W. Cooper, 1959. "Chance-Constrained Programming," Management Science, INFORMS, vol. 6(1), pages 73-79, October.
    14. Biernacki, Christophe & Celeux, Gilles & Govaert, Gerard, 2003. "Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 561-575, January.
    15. Diaz, Juan A. & Fernandez, Elena, 2006. "Hybrid scatter search and path relinking for the capacitated p-median problem," European Journal of Operational Research, Elsevier, vol. 169(2), pages 570-585, March.
    16. Scheuerer, Stephan & Wendolsky, Rolf, 2006. "A scatter search heuristic for the capacitated clustering problem," European Journal of Operational Research, Elsevier, vol. 169(2), pages 533-547, March.
    17. Karlis, Dimitris & Xekalaki, Evdokia, 2003. "Choosing initial values for the EM algorithm for finite mixtures," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 577-590, January.
    18. Osman Alp & Erhan Erkut & Zvi Drezner, 2003. "An Efficient Genetic Algorithm for the p-Median Problem," Annals of Operations Research, Springer, vol. 122(1), pages 21-42, September.
    19. Mulvey, John M. & Beck, Michael P., 1984. "Solving capacitated clustering problems," European Journal of Operational Research, Elsevier, vol. 18(3), pages 339-348, December.
    20. Raftery, Adrian E. & Dean, Nema, 2006. "Variable Selection for Model-Based Clustering," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 168-178, March.
    21. L’udmila Jánošíková & Miloš Herda & Michal Haviar, 2017. "Hybrid genetic algorithms with selective crossover for the capacitated p-median problem," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 25(3), pages 651-664, September.
    22. Lin, C.K.Y., 2009. "Stochastic single-source capacitated facility location model with service level requirements," International Journal of Production Economics, Elsevier, vol. 117(2), pages 439-451, February.

    Citations

    Cited by:

    1. Ouyang, Zhiyuan & Leung, Eric K.H. & Huang, George Q., 2023. "Community logistics and dynamic community partitioning: A new approach for solving e-commerce last mile delivery," European Journal of Operational Research, Elsevier, vol. 307(1), pages 140-156.
    2. Gambella, Claudio & Ghaddar, Bissan & Naoum-Sawaya, Joe, 2021. "Optimization problems for machine learning: A survey," European Journal of Operational Research, Elsevier, vol. 290(3), pages 807-828.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fleszar, K. & Hindi, K.S., 2008. "An effective VNS for the capacitated p-median problem," European Journal of Operational Research, Elsevier, vol. 191(3), pages 612-622, December.
    2. Juan A. Díaz & Dolores E. Luna, 2017. "Primal and dual bounds for the vertex p-median problem with balance constraints," Annals of Operations Research, Springer, vol. 258(2), pages 613-638, November.
    3. Alcaraz, Javier & Landete, Mercedes & Monge, Juan F., 2012. "Design and analysis of hybrid metaheuristics for the Reliability p-Median Problem," European Journal of Operational Research, Elsevier, vol. 222(1), pages 54-64.
    4. Adrian O’Hagan & Arthur White, 2019. "Improved model-based clustering performance using Bayesian initialization averaging," Computational Statistics, Springer, vol. 34(1), pages 201-231, March.
    5. Hung Tong & Cristina Tortora, 2022. "Model-based clustering and outlier detection with missing data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(1), pages 5-30, March.
    6. Salvatore Ingrassia & Antonio Punzo & Giorgio Vittadini & Simona Minotti, 2015. "Erratum to: The Generalized Linear Mixed Cluster-Weighted Model," Journal of Classification, Springer;The Classification Society, vol. 32(2), pages 327-355, July.
    7. repec:jss:jstsof:28:i04 is not listed on IDEAS
    8. Haugland, Dag & Ho, Sin C. & Laporte, Gilbert, 2007. "Designing delivery districts for the vehicle routing problem with stochastic demands," European Journal of Operational Research, Elsevier, vol. 180(3), pages 997-1010, August.
    9. Saif Eddin Jabari & Nikolaos M. Freris & Deepthi Mary Dilip, 2020. "Sparse Travel Time Estimation from Streaming Data," Transportation Science, INFORMS, vol. 54(1), pages 1-20, January.
    10. Derek S. Young & Xi Chen & Dilrukshi C. Hewage & Ricardo Nilo-Poyanco, 2019. "Finite mixture-of-gamma distributions: estimation, inference, and model-based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 1053-1082, December.
    11. Kerekes, Monika, 2012. "Growth miracles and failures in a Markov switching classification model of growth," Journal of Development Economics, Elsevier, vol. 98(2), pages 167-177.
    12. Salvatore Ingrassia & Antonio Punzo, 2020. "Cluster Validation for Mixtures of Regressions via the Total Sum of Squares Decomposition," Journal of Classification, Springer;The Classification Society, vol. 37(2), pages 526-547, July.
    13. Kınay, Ömer Burak & Yetis Kara, Bahar & Saldanha-da-Gama, Francisco & Correia, Isabel, 2018. "Modeling the shelter site location problem using chance constraints: A case study for Istanbul," European Journal of Operational Research, Elsevier, vol. 270(1), pages 132-145.
    14. Snežana Tadić & Mladen Krstić & Željko Stević & Miloš Veljović, 2023. "Locating Collection and Delivery Points Using the p-Median Location Problem," Logistics, MDPI, vol. 7(1), pages 1-17, February.
    15. I H Osman & S Ahmadi, 2007. "Guided construction search metaheuristics for the capacitated p-median problem with single source constraint," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 58(1), pages 100-114, January.
    16. Gabriele Perrone & Gabriele Soffritti, 2023. "Seemingly unrelated clusterwise linear regression for contaminated data," Statistical Papers, Springer, vol. 64(3), pages 883-921, June.
    17. Sahin, Özge & Czado, Claudia, 2022. "Vine copula mixture models and clustering for non-Gaussian data," Econometrics and Statistics, Elsevier, vol. 22(C), pages 136-158.
    18. Antonello Maruotti & Antonio Punzo, 2021. "Initialization of Hidden Markov and Semi‐Markov Models: A Critical Evaluation of Several Strategies," International Statistical Review, International Statistical Institute, vol. 89(3), pages 447-480, December.
    19. O’Hagan, Adrian & Murphy, Thomas Brendan & Gormley, Isobel Claire, 2012. "Computational aspects of fitting mixture models via the expectation–maximization algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 56(12), pages 3843-3864.
    20. Melnykov, Volodymyr & Melnykov, Igor, 2012. "Initializing the EM algorithm in Gaussian mixture models with an unknown number of components," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1381-1395.
    21. Melnykov, Volodymyr, 2016. "ClickClust: An R Package for Model-Based Clustering of Categorical Sequences," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 74(i09).
