Predictive modeling and benchmarking for diamond price estimation: integrating classification, regression, hyperparameter tuning and execution time analysis

Author

Listed:
  • Md Shaik Amzad Basha

    (Gandhi Institute of Technology and Management (Deemed to Be University))

  • Peerzadah Mohammad Oveis

    (Gandhi Institute of Technology and Management (Deemed to Be University))

Abstract

This research provides a comprehensive analysis of diamond price prediction by evaluating 23 machine learning (ML) models, covering both regression and classification techniques. The study aims to fill a gap in the existing literature by applying hyperparameter tuning across these models to improve prediction accuracy, estimated values, and execution-time efficiency, setting a new benchmark in the field. The approach systematically assesses each model’s base and tuned performance in terms of accuracy, execution time, and predictive value alignment (under, accurate, over). Advanced hyperparameter tuning techniques are used to optimize each model’s performance, yielding a comparative analysis that highlights how effectively different models predict diamond prices. The paper's distinct contribution is its extensive benchmarking of numerous ML models for diamond price prediction, which is unprecedented in the literature; its originality derives from applying hyperparameter tuning comprehensively to improve model performance. In doing so, the paper advances the state of the art by documenting the considerable scope for accuracy improvements in tailored ML applications and by demonstrating the importance of model selection and optimization. The findings show that the CatBoost Regressor and XGBoost Regressor retained high accuracy scores after tuning, and that the Random Forest Regressor sped up considerably after tuning.
Lastly, the CatBoost Classifier and LightGBM Classifier stood out for their accuracy and efficiency on diamond price classification tasks. Despite its breadth, this study acknowledges the risk of overfitting in highly tuned models and their reliance on the specific dataset used for training. Future research might explore the generalisability of these techniques to other datasets and further investigate the trade-offs between model complexity and interpretability. The practical implications of this research are significant for stakeholders in the diamond industry such as retailers, appraisers, and investors. By identifying the most effective models for price prediction, we offer actionable insights that can improve decision-making processes, optimize inventory management, and enhance pricing policies.
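
The evaluation protocol described in the abstract (base versus tuned performance, execution time, and under/accurate/over alignment) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic (carat, price) pairs, a one-feature k-nearest-neighbours regressor stands in for the 23 models, tuning is a plain grid search over `k`, and a prediction counts as "accurate" when it falls within a hypothetical 5% band around the true price.

```python
import random
import time
from collections import Counter
from statistics import mean

TOLERANCE = 0.05  # assumed band: within +/-5% of the true price counts as "accurate"


def alignment(actual, predicted, tolerance=TOLERANCE):
    """Label a prediction as 'under', 'accurate', or 'over' relative to the true price."""
    if predicted < actual * (1 - tolerance):
        return "under"
    if predicted > actual * (1 + tolerance):
        return "over"
    return "accurate"


def knn_predict(train, carat, k):
    """Stand-in model: average the prices of the k training rows nearest in carat."""
    nearest = sorted(train, key=lambda row: abs(row[0] - carat))[:k]
    return mean(price for _, price in nearest)


def evaluate(train, rows, k):
    """Time the predictions and summarise error and alignment for one value of k."""
    start = time.perf_counter()
    preds = [knn_predict(train, carat, k) for carat, _ in rows]
    seconds = time.perf_counter() - start
    actuals = [price for _, price in rows]
    return {
        "k": k,
        "mae": mean(abs(a - p) for a, p in zip(actuals, preds)),
        "seconds": seconds,
        "alignment": Counter(alignment(a, p) for a, p in zip(actuals, preds)),
    }


def tune(train, valid, grid):
    """Grid search: return the k with the lowest mean absolute error on validation data."""
    return min(grid, key=lambda k: evaluate(train, valid, k)["mae"])


# Synthetic (carat, price) pairs with +/-10% noise; NOT the dataset used in the paper.
rng = random.Random(42)
data = [(c, 4000 * c ** 1.5 * rng.uniform(0.9, 1.1))
        for c in (0.3 + 0.03 * i for i in range(80))]
train, valid, test = data[0::2], data[1::4], data[3::4]  # disjoint splits

BASE_K = 1           # hypothetical untuned default
GRID = [1, 3, 5, 7]  # hypothetical search space
best_k = tune(train, valid, GRID)

# Report the base and tuned settings side by side on the held-out test split.
for label, k in (("base", BASE_K), ("tuned", best_k)):
    result = evaluate(train, test, k)
    print(label, "k =", result["k"], "MAE =", round(result["mae"], 1),
          f"time = {result['seconds']:.6f}s", dict(result["alignment"]))
```

Selecting `k` on a validation split and reporting on a separate test split keeps the tuned score from being flattered by the search itself, which relates to the overfitting concern the abstract raises for highly tuned models.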

Suggested Citation

  • Md Shaik Amzad Basha & Peerzadah Mohammad Oveis, 2024. "Predictive modeling and benchmarking for diamond price estimation: integrating classification, regression, hyperparameter tuning and execution time analysis," International Journal of System Assurance Engineering and Management, Springer; The Society for Reliability, Engineering Quality and Operations Management (SREQOM), India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 15(11), pages 5279-5313, November.
  • Handle: RePEc:spr:ijsaem:v:15:y:2024:i:11:d:10.1007_s13198-024-02535-0
    DOI: 10.1007/s13198-024-02535-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13198-024-02535-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
