
Multi-pattern generation framework for logical analysis of data

Authors

  • Chun-An Chou (SUNY Binghamton)
  • Tibérius O. Bonates (Federal University of Ceara)
  • Chungmok Lee (Hankuk University of Foreign Studies)
  • Wanpracha Art Chaovalitwongse (University of Washington)

Abstract

Logical analysis of data (LAD) is a rule-based data mining method that uses combinatorial optimization and Boolean logic for binary classification. The goal is to construct a classification model consisting of logical patterns (rules) that capture structured information from observations. Of the four steps of the LAD framework (binarization, feature selection, pattern generation, and model construction), pattern generation has been considered the most important. Most of the literature has studied combinatorial enumeration approaches that generate all possible patterns; however, the computational cost of such enumeration grows exponentially with the data (feature) size. To overcome this problem, recent studies proposed column generation-based approaches that improve the efficacy of building a LAD model with a maximum-margin objective, but solving the subproblems efficiently to generate patterns remained difficult. In this study, a new column generation framework is proposed, in which a new mixed-integer linear programming approach generates multiple patterns with maximum coverage in the subproblems at each iteration. In addition to the maximum-margin objective, we propose an alternative objective (minimum-pattern) that solves the LAD problem as a minimum set covering problem. The proposed approaches are evaluated on datasets from the University of California Irvine Machine Learning Repository. In the computational experiments, they perform comparably to previous LAD approaches and other well-known classification algorithms.
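
To make the terms in the abstract concrete, the following is a minimal Python sketch, not the authors' method. It assumes a tiny hypothetical binarized dataset (`POS`, `NEG`), represents a pattern as a conjunction of literals, enumerates pure positive patterns by brute force (the enumeration whose cost grows exponentially with the number of features), and then selects a small covering set of patterns. The selection step uses a simple greedy set-cover heuristic as a stand-in for the minimum-pattern objective; the paper itself uses column generation with a mixed-integer linear programming subproblem, which is not reproduced here. All function and variable names are illustrative.

```python
from itertools import combinations, product

# Hypothetical binarized observations over three binary features.
POS = [(1, 1, 0), (1, 0, 1), (1, 1, 1)]   # positive class
NEG = [(0, 1, 0), (0, 0, 1), (1, 0, 0)]   # negative class


def covers(pattern, obs):
    """A pattern is a dict {feature_index: required_value}.
    It covers an observation when every literal is satisfied."""
    return all(obs[j] == v for j, v in pattern.items())


def pure_positive_patterns(pos, neg, max_degree):
    """Brute-force enumeration of conjunctions with at most max_degree
    literals that cover at least one positive and no negative observation."""
    n = len(pos[0])
    found = []
    for d in range(1, max_degree + 1):
        for feats in combinations(range(n), d):
            for vals in product((0, 1), repeat=d):
                p = dict(zip(feats, vals))
                hits_pos = any(covers(p, o) for o in pos)
                hits_neg = any(covers(p, o) for o in neg)
                if hits_pos and not hits_neg:
                    found.append(p)
    return found


def greedy_cover(patterns, pos):
    """Greedy set-cover heuristic (an assumption, not the paper's MILP):
    repeatedly pick the pattern covering the most uncovered positives."""
    uncovered = set(range(len(pos)))
    chosen = []
    while uncovered:
        best = max(patterns,
                   key=lambda p: sum(covers(p, pos[i]) for i in uncovered))
        gained = {i for i in uncovered if covers(best, pos[i])}
        if not gained:          # no pattern covers the remaining observations
            break
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


patterns = pure_positive_patterns(POS, NEG, max_degree=2)
model, left_over = greedy_cover(patterns, POS)
print("pure positive patterns:", patterns)
print("selected patterns:", model, "uncovered positives:", left_over)
```

On this toy data no single literal separates the classes, so every pure positive pattern has degree two, and two patterns suffice to cover all positive observations. A greedy choice is not guaranteed to be minimal in general, which is one reason to use an exact formulation such as set covering.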

Suggested Citation

  • Chun-An Chou & Tibérius O. Bonates & Chungmok Lee & Wanpracha Art Chaovalitwongse, 2017. "Multi-pattern generation framework for logical analysis of data," Annals of Operations Research, Springer, vol. 249(1), pages 329-349, February.
  • Handle: RePEc:spr:annopr:v:249:y:2017:i:1:d:10.1007_s10479-015-1867-8
    DOI: 10.1007/s10479-015-1867-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-015-1867-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations

    Cited by:

    1. Lejeune, Miguel & Lozin, Vadim & Lozina, Irina & Ragab, Ahmed & Yacout, Soumaya, 2019. "Recent advances in the theory and practice of Logical Analysis of Data," European Journal of Operational Research, Elsevier, vol. 275(1), pages 1-15.
    2. Maurizio Boccia & Antonio Sforza & Claudio Sterle, 2020. "Simple Pattern Minimality Problems: Integer Linear Programming Formulations and Covering-Based Heuristic Solving Approaches," INFORMS Journal on Computing, INFORMS, vol. 32(4), pages 1049-1060, October.

