When Is the Right Time to Refresh Knowledge Discovered from Data?

Author

Listed:
  • Xiao Fang

    (Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, Utah 84112)

  • Olivia R. Liu Sheng

    (Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, Utah 84112)

  • Paulo Goes

    (Department of Management Information Systems, Eller College of Management, University of Arizona, Tucson, Arizona 85721)

Abstract

Knowledge discovery in databases (KDD) techniques have been extensively employed to extract knowledge from massive data stores to support decision making in a wide range of critical applications. Maintaining the currency of discovered knowledge over evolving data sources is a fundamental challenge faced by all KDD applications. This paper addresses the challenge from the perspective of deciding the right times to refresh knowledge. We define the knowledge-refreshing problem and model it as a Markov decision process. Based on the identified properties of the Markov decision process model, we establish that the optimal knowledge-refreshing policy is monotonically increasing in the system state within every appropriate partition of the state space. We further show that the problem of searching for the optimal knowledge-refreshing policy can be reduced to the problem of finding the optimal thresholds and propose a method for computing the optimal knowledge-refreshing policy. The effectiveness and the robustness of the computed optimal knowledge-refreshing policy are examined through extensive empirical studies addressing a real-world knowledge-refreshing problem. Our method can be applied to refresh knowledge for KDD applications that employ major data-mining models.
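The threshold structure described in the abstract can be illustrated with a small sketch. The toy model below is an assumption made purely for illustration and is not the authors' formulation: the state is a hypothetical count of data batches accumulated since the last refresh, waiting incurs a staleness cost proportional to that count, and refreshing incurs a fixed cost and resets the state to zero. All names and parameter values (`solve_refresh_mdp`, `refresh_cost`, `stale_cost`, `arrival_prob`) are invented for this example. Plain value iteration on this toy problem yields a policy that waits in low states and refreshes in high states, so it is fully described by a single threshold.

```python
def solve_refresh_mdp(n_states=20, refresh_cost=5.0, stale_cost=1.0,
                      arrival_prob=0.6, discount=0.95, tol=1e-9):
    """Value iteration for a toy refresh-or-wait MDP.

    Hypothetical illustration only, not the model from the paper.
    State s = batches of new data accumulated since the last refresh.
    Returns (values, policy), where policy[s] is 1 for refresh, 0 for wait.
    """
    values = [0.0] * (n_states + 1)
    while True:
        new_values, policy = [], []
        for s in range(n_states + 1):
            nxt = min(s + 1, n_states)  # the state is capped at n_states
            # Wait: pay the staleness cost; a new batch arrives with
            # probability arrival_prob.
            wait = stale_cost * s + discount * (
                arrival_prob * values[nxt] + (1 - arrival_prob) * values[s])
            # Refresh: pay the fixed cost and restart from state 0.
            refresh = refresh_cost + discount * values[0]
            new_values.append(min(wait, refresh))
            policy.append(1 if refresh < wait else 0)
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values, policy


def refresh_threshold(policy):
    """Return the smallest state in which the policy refreshes, or None."""
    for s, action in enumerate(policy):
        if action == 1:
            return s
    return None


if __name__ == "__main__":
    values, policy = solve_refresh_mdp()
    print("policy:", policy)
    print("refresh once the state reaches:", refresh_threshold(policy))
```

In this sketch the search over all policies collapses to a search over one integer, which mirrors, in a much simpler setting, the reduction to optimal thresholds that the abstract reports.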

Suggested Citation

  • Xiao Fang & Olivia R. Liu Sheng & Paulo Goes, 2013. "When Is the Right Time to Refresh Knowledge Discovered from Data?," Operations Research, INFORMS, vol. 61(1), pages 32-44, February.
  • Handle: RePEc:inm:oropre:v:61:y:2013:i:1:p:32-44
    DOI: 10.1287/opre.1120.1148

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.1120.1148
    Download Restriction: no

    References listed on IDEAS

    1. Sumit Sarkar & Ram S. Sriram, 2001. "Bayesian Models for Early Warning of Bank Failures," Management Science, INFORMS, vol. 47(11), pages 1457-1475, November.
    2. Alain Bensoussan & Radha Mookerjee & Vijay Mookerjee & Wei T. Yue, 2009. "Maintaining Diagnostic Knowledge-Based Systems: A Control-Theoretic Approach," Management Science, INFORMS, vol. 55(2), pages 294-310, February.
    3. Debabrata Dey & Zhongju Zhang & Prabuddha De, 2006. "Optimal Synchronization Policies for Data Warehouses," INFORMS Journal on Computing, INFORMS, vol. 18(2), pages 229-242, May.
    4. Lee G. Cooper & Giovanni Giuffrida, 2000. "Turning Datamining into a Management Science Tool: New Algorithms and Empirical Results," Management Science, INFORMS, vol. 46(2), pages 249-264, February.
    5. Fang, Xiao & Rachamadugu, Ram, 2009. "Policies for knowledge refreshing in databases," Omega, Elsevier, vol. 37(1), pages 16-28, February.
    6. June S. Park & Robert Bartoszynski & Prabuddha De & Hasan Pirkul, 1990. "Optimal Reorganization Policies for Stationary and Evolutionary Databases," Management Science, INFORMS, vol. 36(5), pages 613-631, May.
    7. Bart Baesens & Rudy Setiono & Christophe Mues & Jan Vanthienen, 2003. "Using Neural Network Rule Extraction and Decision Tables for Credit-Risk Evaluation," Management Science, INFORMS, vol. 49(3), pages 312-329, March.
    8. Arie Segev & Weiping Fang, 1991. "Optimal Update Policies for Distributed Materialized Views," Management Science, INFORMS, vol. 37(7), pages 851-870, July.

    Citations

    Cited by:

    1. Zhiling Guo & Jin Li & Ram Ramesh, 2023. "Green Data Analytics of Supercomputing from Massive Sensor Networks: Does Workload Distribution Matter?," Information Systems Research, INFORMS, vol. 34(4), pages 1664-1685, December.
    2. Kexin Yin & Xiao Fang & Bintong Chen & Olivia R. Liu Sheng, 2023. "Diversity Preference-Aware Link Recommendation for Online Social Networks," Information Systems Research, INFORMS, vol. 34(4), pages 1398-1414, December.
    3. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fang, Xiao & Rachamadugu, Ram, 2009. "Policies for knowledge refreshing in databases," Omega, Elsevier, vol. 37(1), pages 16-28, February.
    2. Casado Yusta, Silvia & Núñez Letamendía, Laura & Pacheco Bonrostro, Joaquín Antonio, 2018. "Predicting Corporate Failure: The GRASP-LOGIT Model || Predicción de la quiebra empresarial: el modelo GRASP-LOGIT," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 294-314, Diciembre.
    3. R Setiono & S-L Pan & M-H Hsieh & A Azcarraga, 2006. "Knowledge acquisition and revision using neural networks: an application to a cross-national study of brand image perception," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 57(3), pages 231-240, March.
    4. Hoffmann, F. & Baesens, B. & Mues, C. & Van Gestel, T. & Vanthienen, J., 2007. "Inferring descriptive and approximate fuzzy rules for credit scoring using evolutionary algorithms," European Journal of Operational Research, Elsevier, vol. 177(1), pages 540-555, February.
    5. Kunpeng Zhang & Wendy Moe, 2021. "Measuring Brand Favorability Using Large-Scale Social Media Data," Information Systems Research, INFORMS, vol. 32(4), pages 1128-1139, December.
    6. Huseyin Cavusoglu & Srinivasan Raghunathan, 2004. "Configuration of Detection Software: A Comparison of Decision and Game Theory Approaches," Decision Analysis, INFORMS, vol. 1(3), pages 131-148, September.
    7. Kwon, He-Boong & Lee, Jooh, 2019. "Exploring the differential impact of environmental sustainability, operational efficiency, and corporate reputation on market valuation in high-tech-oriented firms," International Journal of Production Economics, Elsevier, vol. 211(C), pages 1-14.
    8. Martens, David & Baesens, Bart & Van Gestel, Tony & Vanthienen, Jan, 2007. "Comprehensible credit scoring models using rule extraction from support vector machines," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1466-1476, December.
    9. Zhepeng Li & Xiao Fang & Xue Bai & Olivia R. Liu Sheng, 2017. "Utility-Based Link Recommendation for Online Social Networks," Management Science, INFORMS, vol. 63(6), pages 1938-1952, June.
    10. Guo, Mengzhuo & Zhang, Qingpeng & Liao, Xiuwu & Chen, Frank Youhua & Zeng, Daniel Dajun, 2021. "A hybrid machine learning framework for analyzing human decision-making through learning preferences," Omega, Elsevier, vol. 101(C).
    11. M. Naresh Kumar & V. Sree Hari Rao, 2015. "A New Methodology for Estimating Internal Credit Risk and Bankruptcy Prediction under Basel II Regime," Computational Economics, Springer;Society for Computational Economics, vol. 46(1), pages 83-102, June.
    12. Debabrata Dey & Atanu Lahiri & Guoying Zhang, 2015. "Optimal Policies for Security Patch Management," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 462-477, August.
    13. Polyzos, Stathis & Samitas, Aristeidis & Katsaiti, Marina-Selini, 2020. "Who is unhappy for Brexit? A machine-learning, agent-based study on financial instability," International Review of Financial Analysis, Elsevier, vol. 72(C).
    14. Hué Sullivan & Hurlin Christophe & Pérignon Christophe & Saurin Sébastien, 2022. "Measuring the Driving Forces of Predictive Performance: Application to Credit Scoring," Papers 2212.05866, arXiv.org, revised Jun 2023.
    15. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W., 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," European Journal of Operational Research, Elsevier, vol. 269(2), pages 760-772.
    16. Chiuling Lu & Ann Yang & Jui-Feng Huang, 2015. "Bankruptcy predictions for U.S. air carrier operations: a study of financial data," Journal of Economics and Finance, Springer;Academy of Economics and Finance, vol. 39(3), pages 574-589, July.
    17. Jason R. W. Merrick & Claire A. Dorsey & Bo Wang & Martha Grabowski & John R. Harrald, 2022. "Measuring Prediction Accuracy in a Maritime Accident Warning System," Production and Operations Management, Production and Operations Management Society, vol. 31(2), pages 819-827, February.
    18. Marchioni, Andrea & Magni, Carlo Alberto, 2018. "Investment decisions and sensitivity analysis: NPV-consistency of rates of return," European Journal of Operational Research, Elsevier, vol. 268(1), pages 361-372.
    19. S. Balcaen & H. Ooghe, 2004. "Alternative methodologies in studies on business failure: do they produce better results than the classical statistical methods?," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 04/249, Ghent University, Faculty of Economics and Business Administration.
    20. Tang, Lingxiao & Cai, Fei & Ouyang, Yao, 2019. "Applying a nonparametric random forest algorithm to assess the credit risk of the energy industry in China," Technological Forecasting and Social Change, Elsevier, vol. 144(C), pages 563-572.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.