Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques

Author

Listed:
  • Sebastian Büsch

    (Ilmenau University of Technology)

  • Volker Nissen

    (Ilmenau University of Technology)

  • Arndt Wünscher

    (Ilmenau University of Technology)

Abstract

The aim of Information Lifecycle Management (ILM) is to govern data throughout its lifecycle as efficiently and effectively as possible from a technical point of view. A core question is where the data should be stored, since storage options differ in cost and access time. To answer it, data have to be classified. At present this is done either manually, with considerable effort, or using only a few data attributes, in particular access frequency. In the context of data warehouse systems, this article introduces an automated, and therefore fast and cost-effective, data classification for ILM. Three machine learning techniques, an artificial neural network (multilayer perceptron), a support vector machine, and a decision tree approach, are compared on an SAP-based real-world data set from the automotive industry. The classification considers a large number of data attributes and thus attains results similar to those of human experts. The comparison covers not only classification accuracy but also the types of misclassification that occur, since these are important in ILM.
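
To illustrate why the type of misclassification can matter alongside accuracy, the following minimal sketch builds a confusion matrix and a cost-weighted error for a storage-tier classification. The class names (`online`, `nearline`, `archive`), the example labels, and the cost values are assumptions made purely for this illustration; they are not taken from the article or its data set.

```python
from collections import Counter

# Hypothetical storage classes, from fastest/most expensive to slowest/cheapest.
CLASSES = ["online", "nearline", "archive"]

# Hypothetical cost of each error, keyed by (actual class, predicted class).
# Assumption: moving frequently needed data to slow storage is penalised more
# than keeping rarely needed data on fast storage.
COST = {
    ("online", "nearline"): 2.0,
    ("online", "archive"): 5.0,
    ("nearline", "online"): 1.0,
    ("nearline", "archive"): 2.0,
    ("archive", "online"): 2.0,
    ("archive", "nearline"): 1.0,
}


def confusion_matrix(actual, predicted):
    """Return nested dict: matrix[actual_class][predicted_class] -> count."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in CLASSES} for a in CLASSES}


def accuracy(actual, predicted):
    """Share of objects assigned to the correct class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def mean_cost(actual, predicted):
    """Average misclassification cost per object (correct predictions cost 0)."""
    total = sum(COST.get((a, p), 0.0) for a, p in zip(actual, predicted))
    return total / len(actual)


# Made-up labels: both hypothetical models make exactly one error,
# but the errors differ in severity.
actual = ["online", "online", "nearline", "archive", "archive", "nearline"]
model_a = ["online", "archive", "nearline", "archive", "archive", "nearline"]
model_b = ["online", "nearline", "nearline", "archive", "archive", "nearline"]

for name, predicted in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "accuracy:", accuracy(actual, predicted))
    print(name, "mean cost:", mean_cost(actual, predicted))
    print(name, "confusion:", confusion_matrix(actual, predicted))
```

In this made-up example both models reach the same accuracy, yet the one that sends `online` data straight to `archive` incurs the higher average cost, which is the kind of difference an accuracy figure alone would hide.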

Suggested Citation

  • Sebastian Büsch & Volker Nissen & Arndt Wünscher, 2017. "Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques," Information Systems Frontiers, Springer, vol. 19(5), pages 1085-1099, October.
  • Handle: RePEc:spr:infosf:v:19:y:2017:i:5:d:10.1007_s10796-016-9680-8
    DOI: 10.1007/s10796-016-9680-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10796-016-9680-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Citations

    Cited by:

    1. Vijayan Sugumaran & T. V. Geetha & D. Manjula & Hema Gopal, 2017. "Guest Editorial: Computational Intelligence and Applications," Information Systems Frontiers, Springer, vol. 19(5), pages 969-974, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sebastian Büsch & Volker Nissen & Arndt Wünscher, 0. "Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques," Information Systems Frontiers, Springer, vol. 0, pages 1-15.
    2. Tobias Knabke & Sebastian Olbrich, 2018. "Building novel capabilities to enable business intelligence agility: results from a quantitative study," Information Systems and e-Business Management, Springer, vol. 16(3), pages 493-546, August.
    3. Vangelis Marinakis & Themistoklis Koutsellis & Alexandros Nikas & Haris Doukas, 2021. "AI and Data Democratisation for Intelligent Energy Management," Energies, MDPI, vol. 14(14), pages 1-14, July.
    4. Mark Gilchrist & Deana Lehmann Mooers & Glenn Skrubbeltrang & Francine Vachon, 2012. "Knowledge Discovery in Databases for Competitive Advantage," Journal of Management and Strategy, Journal of Management and Strategy, Sciedu Press, vol. 3(2), pages 2-15, April.
    5. Marina Johnson & Abdullah Albizri & Serhat Simsek, 2022. "Artificial intelligence in healthcare operations to enhance treatment outcomes: a framework to predict lung cancer prognosis," Annals of Operations Research, Springer, vol. 308(1), pages 275-305, January.
    6. Mehri, Ali & Darooneh, Amir H. & Shariati, Ashrafalsadat, 2012. "The complex networks approach for authorship attribution of books," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 391(7), pages 2429-2437.
    7. Michał Jasiński & Tomasz Sikorski & Zbigniew Leonowicz & Klaudiusz Borkowski & Elżbieta Jasińska, 2020. "The Application of Hierarchical Clustering to Power Quality Measurements in an Electrical Power Network with Distributed Generation," Energies, MDPI, vol. 13(9), pages 1-19, May.
    8. Beni Rohrbach & Sharolyn Anderson & Patrick Laube, 2016. "The effects of sample size on data quality in participatory mapping of past land use," Environment and Planning B, vol. 43(4), pages 681-697, July.
    9. Simsek, Serhat & Dag, Ali & Tiahrt, Thomas & Oztekin, Asil, 2021. "A Bayesian Belief Network-based probabilistic mechanism to determine patient no-show risk categories," Omega, Elsevier, vol. 100(C).
    10. Yucel, Ahmet & Dag, Ali & Oztekin, Asil & Carpenter, Mark, 2022. "A novel text analytic methodology for classification of product and service reviews," Journal of Business Research, Elsevier, vol. 151(C), pages 287-297.
    11. Oztekin, Asil & Kizilaslan, Recep & Freund, Steven & Iseri, Ali, 2016. "A data analytic approach to forecasting daily stock returns in an emerging market," European Journal of Operational Research, Elsevier, vol. 253(3), pages 697-710.
    12. Saljooghi, Saeed & Safisamghabadib, Azamdokht, 2016. "Analyzing Semiconductor component's market sales data to create an Expert Fuzzy inference system," MPRA Paper 79846, University Library of Munich, Germany.
    13. Asil Oztekin, 0. "Information fusion-based meta-classification predictive modeling for ETF performance," Information Systems Frontiers, Springer, vol. 0, pages 1-16.
    14. Ramin Vakili & Mojdeh Khorsand, 2022. "A Machine Learning-Based Method for Identifying Critical Distance Relays for Transient Stability Studies," Energies, MDPI, vol. 15(23), pages 1-28, November.
    15. Delen, Dursun & Cogdell, Douglas & Kasap, Nihat, 2012. "A comparative analysis of data mining methods in predicting NCAA bowl outcomes," International Journal of Forecasting, Elsevier, vol. 28(2), pages 543-552.
    16. ShakorShahabi, Reza & Qarahasanlou, Ali Nouri & Azimi, Seyed Reza & Mottahedi, Adel, 2021. "Application of data mining in Iran's Artisanal and Small-Scale mines challenges analysis," Resources Policy, Elsevier, vol. 74(C).
    17. Chen, Kunlong & Zheng, Fangdan & Jiang, Jiuchun & Zhang, Weige & Jiang, Yan & Chen, Kunjin, 2017. "Practical failure recognition model of lithium-ion batteries based on partial charging process," Energy, Elsevier, vol. 138(C), pages 1199-1208.
    18. Andreas Fink & Natalia Kliewer & Dirk Mattfeld & Lars Mönch & Franz Rothlauf & Guido Schryen & Leena Suhl & Stefan Voß, 2014. "Model-Based Decision Support in Manufacturing and Service Networks," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 6(1), pages 17-24, February.
    19. Gitae Kim & Bongsug Chae & David Olson, 2013. "A support vector machine (SVM) approach to imbalanced datasets of customer responses: comparison with other customer response models," Service Business, Springer;Pan-Pacific Business Association, vol. 7(1), pages 167-182, March.
    20. Dinesh, Chinthaka & Welikala, Shirantha & Liyanage, Yasitha & Ekanayake, Mervyn Parakrama B. & Godaliyadda, Roshan Indika & Ekanayake, Janaka, 2017. "Non-intrusive load monitoring under residential solar power influx," Applied Energy, Elsevier, vol. 205(C), pages 1068-1080.
