Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i10p1455-d1390651.html

Novel Feature-Based Difficulty Prediction Method for Mathematics Items Using XGBoost-Based SHAP Model

Author

Listed:
  • Xifan Yi

    (School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China)

  • Jianing Sun

    (School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China)

  • Xiaopeng Wu

    (Faculty of Education, Northeast Normal University, Changchun 130024, China)

Abstract

The difficulty of mathematics test items is critical for evaluating test quality and educational outcomes, so accurately predicting item difficulty during test creation is important for producing effective test papers. This study used more than ten years of content and score data from China’s Henan Provincial College Entrance Examination in Mathematics as the evaluation criterion for test difficulty; all data were obtained from the Henan Provincial Department of Education. Building on the framework for test item assessment methodology established by the National Center for Education Statistics (NCES), this paper proposes a new framework of eight features that accounts for the distinctive characteristics of mathematics. It then proposes an XGBoost-based SHAP model for analyzing the difficulty of mathematics tests. By coupling XGBoost with SHAP, the model not only predicts item difficulty but also quantifies the contribution of each feature to that difficulty, thereby increasing transparency and mitigating the “black box” nature of machine learning models. The model achieves a prediction accuracy of 0.99 on the training set and 0.806 on the test set. Using the model, we found that parameter-level and reasoning-level features are significant factors influencing the difficulty of subjective items in the exam. In addition, we divided senior secondary mathematics knowledge into nine units based on Chinese curriculum standards and found significant differences in the distribution of the eight features across these knowledge units, which can help teachers place different emphasis on different units during teaching. In summary, our proposed approach significantly improves the accuracy of item difficulty prediction, which is crucial for intelligent educational applications such as knowledge tracking, automatic test item generation, and intelligent paper generation. These results provide tools that are better aligned with and responsive to students’ learning needs, thus effectively informing educational practice.
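
The feature-attribution idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper trains an XGBoost model and explains it with SHAP, whereas this sketch uses only the Python standard library and computes exact Shapley values for a hand-written stand-in model. The function `toy_difficulty`, its coefficients, the feature name `computation_level`, the all-zero baseline, and the example item are all assumptions made for illustration; only "parameter-level" and "reasoning-level" are named in the abstract.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names; the abstract names only the first two.
FEATURES = ["parameter_level", "reasoning_level", "computation_level"]


def toy_difficulty(x):
    """Hypothetical stand-in for a trained model (NOT the paper's XGBoost model)."""
    p = x["parameter_level"]
    r = x["reasoning_level"]
    c = x["computation_level"]
    # Additive terms plus one interaction, so attribution is not trivial.
    return 0.1 + 0.05 * p + 0.10 * r + 0.02 * c + 0.03 * p * r


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for one item, relative to a baseline."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` take the item's values; the rest take baseline values.
        mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Assumed example item and an all-zero reference item.
item = {"parameter_level": 2, "reasoning_level": 3, "computation_level": 1}
baseline = {f: 0 for f in FEATURES}

phi = shapley_values(toy_difficulty, item, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the item's predicted difficulty and the baseline prediction, which is the property that makes this kind of attribution readable as a per-feature breakdown. Exact enumeration grows exponentially with the number of features, so for a real tree ensemble one would normally rely on a dedicated tree explainer rather than this brute-force loop.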

Suggested Citation

  • Xifan Yi & Jianing Sun & Xiaopeng Wu, 2024. "Novel Feature-Based Difficulty Prediction Method for Mathematics Items Using XGBoost-Based SHAP Model," Mathematics, MDPI, vol. 12(10), pages 1-21, May.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:10:p:1455-:d:1390651

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/10/1455/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/10/1455/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Björn E. Hommel & Franz-Josef M. Wollang & Veronika Kotova & Hannes Zacher & Stefan C. Schmukle, 2022. "Transformer-Based Deep Neural Language Modeling for Construct-Specific Automatic Item Generation," Psychometrika, Springer;The Psychometric Society, vol. 87(2), pages 749-772, June.
