IDEAS home Printed from https://ideas.repec.org/a/spr/orspec/v46y2024i4d10.1007_s00291-024-00751-5.html

Convex and nonconvex nonparametric frontier-based classification methods for anomaly detection

Author

Listed:
  • Qianying Jin

    (Nanjing University of Aeronautics and Astronautics)

  • Kristiaan Kerstens

    (UMR 9221 - LEM - Lille Économie Management)

  • Ignace Van de Woestyne

    (Brussels Campus)

Abstract

Effective methods for determining the boundary of the normal class are very useful for detecting anomalies in commercial or security applications, a problem known as anomaly detection. This contribution proposes a nonparametric frontier-based classification (NPFC) method for anomaly detection. By relaxing the convexity assumption commonly used in the literature, a nonconvex-NPFC method is constructed, and the nonconvex nonparametric frontier turns out to provide a more conservative boundary enveloping the normal class. By reflecting the monotonic relation between the characteristic variables and class membership, the proposed NPFC method takes a more general form, since both input-like and output-like characteristic variables are incorporated. In addition, by allowing some of the training observations to be misclassified, the convex- and nonconvex-NPFC methods are extended from a hard nonparametric frontier to a soft one, which also provides a more conservative boundary enclosing the normal class. Both simulation studies and a real-life data set are used to evaluate the proposed NPFC methods and to compare them with some well-established methods in the literature. The results show that the proposed NPFC methods, especially the nonconvex ones, have competitive classification performance and consistent advantages in detecting abnormal samples.
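The difference between a convex and a nonconvex hard frontier can be illustrated with a small sketch. The abstract does not state the paper's models, so everything below is an assumption made for illustration: the nonconvex boundary is taken to be a free disposal hull, the convex boundary is its convex counterpart, the direction of monotonicity (smaller input-like and larger output-like values are enveloped) is assumed, and the data and the names `NORMAL`, `in_fdh`, `in_convex_hull`, and `classify` are hypothetical. The convex test as written is only valid for one input-like and one output-like variable.

```python
from itertools import combinations

# Hypothetical normal-class training data: (input_like, output_like) pairs.
NORMAL = [(1.0, 1.0), (3.0, 5.0)]


def in_fdh(point, train):
    """Nonconvex test (assumed free disposal hull): a single training
    observation has no larger input-like value and no smaller output-like
    value than the point."""
    x, y = point
    return any(tx <= x and ty >= y for tx, ty in train)


def in_convex_hull(point, train, tol=1e-9):
    """Convex test for one input-like and one output-like variable: some
    convex combination of two training observations dominates the point."""
    if in_fdh(point, train):
        return True
    x, y = point
    for (ax, ay), (bx, by) in combinations(train, 2):
        # Seek lam in [0, 1] with
        #   lam*ax + (1-lam)*bx <= x   and   lam*ay + (1-lam)*by >= y,
        # each rewritten in the form coef * lam <= rhs.
        lo, hi = 0.0, 1.0
        feasible = True
        for coef, rhs in ((ax - bx, x - bx), (by - ay, by - y)):
            if coef > tol:
                hi = min(hi, rhs / coef)
            elif coef < -tol:
                lo = max(lo, rhs / coef)
            elif rhs < -tol:
                feasible = False  # constraint 0 * lam <= rhs cannot hold
        if feasible and lo <= hi + tol:
            return True
    return False


def classify(point, train, convex):
    """Label a point 'normal' if it lies inside the chosen hull."""
    inside = in_convex_hull(point, train) if convex else in_fdh(point, train)
    return "normal" if inside else "anomaly"


for p in [(2.0, 2.5), (3.5, 4.0), (0.5, 2.0)]:
    print(p, classify(p, NORMAL, convex=True), classify(p, NORMAL, convex=False))
```

In this toy data the point (2.0, 2.5) lies between the two training observations: the convex hull accepts it as normal, while the nonconvex hull flags it as an anomaly. This is the sense in which a nonconvex boundary is the more conservative one, since every point inside it is also inside the convex boundary but not the other way round. A soft frontier, which tolerates some misclassified training observations, is not covered by this sketch.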

Suggested Citation

  • Qianying Jin & Kristiaan Kerstens & Ignace Van de Woestyne, 2024. "Convex and nonconvex nonparametric frontier-based classification methods for anomaly detection," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 46(4), pages 1213-1239, December.
  • Handle: RePEc:spr:orspec:v:46:y:2024:i:4:d:10.1007_s00291-024-00751-5
    DOI: 10.1007/s00291-024-00751-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00291-024-00751-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Esteve, Miriam & Aparicio, Juan & Rodriguez-Sala, Jesus J. & Zhu, Joe, 2023. "Random Forests and the measurement of super-efficiency in the context of Free Disposal Hull," European Journal of Operational Research, Elsevier, vol. 304(2), pages 729-744.
    2. Ravelojaona, Paola, 2019. "On constant elasticity of substitution – Constant elasticity of transformation Directional Distance Functions," European Journal of Operational Research, Elsevier, vol. 272(2), pages 780-791.
    3. Aparicio, Juan & Pastor, Jesus T. & Vidal, Fernando, 2016. "The directional distance function and the translation invariance property," Omega, Elsevier, vol. 58(C), pages 1-3.
    4. Guerrero, Nadia M. & Moragues, Raul & Aparicio, Juan & Valero-Carreras, Daniel, 2024. "Support Vector Frontiers with kernel splines," Omega, Elsevier, vol. 128(C).
    5. Cova-Alonso, David José & Díaz-Hernández, Juan José & Martínez-Budría, Eduardo, 2021. "A strong efficiency measure for CCR/BCC models," European Journal of Operational Research, Elsevier, vol. 291(1), pages 284-295.
    6. Mahmood Mehdiloo & Jafar Sadeghi & Kristiaan Kerstens, 2024. "Top Down Axiomatic Modeling of Metatechnologies and Evaluating Directional Economic Efficiency," Working Papers 2024-EQM-03, IESEG School of Management.
    7. Halická, Margaréta & Trnovská, Mária & Černý, Aleš, 2024. "A unified approach to radial, hyperbolic, and directional efficiency measurement in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 312(1), pages 298-314.
    8. Sahoo, Biresh K. & Singh, Ramadhar & Mishra, Bineet & Sankaran, Krithiga, 2017. "Research productivity in management schools of India during 1968-2015: A directional benefit-of-doubt model analysis," Omega, Elsevier, vol. 66(PA), pages 118-139.
    9. Andreas Dellnitz & Andreas Kleine & Madjid Tavana, 2024. "An integrated data envelopment analysis and regression tree method for new product price estimation," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 46(4), pages 1189-1211, December.
    10. Bogetoft, Peter & Leth Hougaard, Jens, 2004. "Super efficiency evaluations based on potential slack," European Journal of Operational Research, Elsevier, vol. 152(1), pages 14-21, January.
    11. Valadkhani, Abbas & Roshdi, Israfil & Smyth, Russell, 2016. "A multiplicative environmental DEA approach to measure efficiency changes in the world's major polluters," Energy Economics, Elsevier, vol. 54(C), pages 363-375.
    12. Cherchye, L. & Post, G.T., 2001. "Methodological Advances in Dea," ERIM Report Series Research in Management ERS-2001-53-F&A, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
    13. Pastor, Jesus T. & Lovell, C.A. Knox & Aparicio, Juan, 2020. "Defining a new graph inefficiency measure for the proportional directional distance function and introducing a new Malmquist productivity index," European Journal of Operational Research, Elsevier, vol. 281(1), pages 222-230.
    14. Bolós, V.J. & Benítez, R. & Coll-Serrano, V., 2024. "Chance constrained directional models in stochastic data envelopment analysis," Operations Research Perspectives, Elsevier, vol. 12(C).
    15. Maria Silva Portela & Pedro Borges & Emmanuel Thanassoulis, 2003. "Finding Closest Targets in Non-Oriented DEA Models: The Case of Convex and Non-Convex Technologies," Journal of Productivity Analysis, Springer, vol. 19(2), pages 251-269, April.
    16. Papaioannou, Grammatoula & Podinovski, Victor V., 2023. "Multicomponent production technologies with restricted allocations of shared inputs and outputs," European Journal of Operational Research, Elsevier, vol. 308(1), pages 274-289.
    17. Kao, Chiang, 2020. "Measuring efficiency in a general production possibility set allowing for negative data," European Journal of Operational Research, Elsevier, vol. 282(3), pages 980-988.
    18. Mercedes Beltrán-Esteve & José Gómez-Limón & Andrés Picazo-Tadeo & Ernest Reig-Martínez, 2014. "A metafrontier directional distance function approach to assessing eco-efficiency," Journal of Productivity Analysis, Springer, vol. 41(1), pages 69-83, February.
    19. K Kerstens & I Van de Woestyne, 2011. "Negative data in DEA: a simple proportional distance function approach," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 62(7), pages 1413-1419, July.
    20. Timo Kuosmanen, 2007. "Performance measurement and best-practice benchmarking of mutual funds: combining stochastic dominance criteria with data envelopment analysis," Journal of Productivity Analysis, Springer, vol. 28(1), pages 71-86, October.
