Exploration of Biodegradable Substances Using Machine Learning Techniques

Author

Listed:
  • Alaa M. Elsayad

    (Department of Electrical Engineering, College of Engineering in Wadi Alddawasir, Prince Sattam Bin Abdulaziz University, Wadi Alddawasir 11991, Saudi Arabia)

  • Medien Zeghid

    (Department of Electrical Engineering, College of Engineering in Wadi Alddawasir, Prince Sattam Bin Abdulaziz University, Wadi Alddawasir 11991, Saudi Arabia;
    Electronics and Micro-Electronics Laboratory, Faculty of Sciences, University of Monastir, Monastir 5000, Tunisia)

  • Hassan Yousif Ahmed

    (Department of Electrical Engineering, College of Engineering in Wadi Alddawasir, Prince Sattam Bin Abdulaziz University, Wadi Alddawasir 11991, Saudi Arabia)

  • Khaled A. Elsayad

    (Pharmacy Department, Cairo University Hospitals, Cairo University, Cairo 11662, Egypt)

Abstract

Ready biodegradability is a key criterion in evaluating the potential effects of chemical substances on ecosystems and in conducting environmental risk assessments. Substances that readily biodegrade are generally associated with lower environmental persistence and reduced environmental risk compared to those that do not degrade easily. Developing accurate quantitative structure–activity relationship (QSAR) models for biodegradability prediction plays a critical role in advancing the design of sustainable chemicals. In this paper, we report the results of our investigation into the use of classification and regression trees (CARTs) for classifying biodegradable substances and selecting features based on 2D molecular descriptors. CARTs are a well-known machine learning approach valued for their simplicity, scalability, and built-in feature selection, which makes them well suited to the analysis of large datasets. Curvature and interaction tests were employed to construct efficient and unbiased trees, while Bayesian optimization (BO) and repeated cross-validation were used to improve the generalization and stability of the trees. The main objective was to classify substances as either readily biodegradable (RB) or non-readily biodegradable (NRB). We compared the performance of the proposed CARTs with support vector machine (SVM), k-nearest neighbor (kNN), and regularized logistic regression (RLR) models in terms of overall accuracy, sensitivity, specificity, and the receiver operating characteristic (ROC) curve. The experimental findings demonstrated that the proposed CART model, which integrated curvature–interaction tests, outperformed the other models in classifying the test subset. It achieved an accuracy of 85.63%, a sensitivity of 87.12%, a specificity of 84.94%, and a comparable area under the ROC curve of 0.87. In the prediction process, the model identified the ten most important descriptors, with SpMaxB(m) and SpMin1_Bh(v) standing out as notably more important than the remaining descriptors.
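
The four evaluation measures named in the abstract (overall accuracy, sensitivity, specificity, and area under the ROC curve) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the eight toy labels and scores are hypothetical, and it assumes RB is encoded as the positive class (1) and NRB as the negative class (0), which the abstract does not state.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN), treating label 1 (assumed: RB) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Overall accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = RB, 0 = NRB; scores are made-up class-1 probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(classification_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'sensitivity': 0.75, 'specificity': 0.75}
print(roc_auc(y_true, scores))
# 0.9375
```

Sensitivity and specificity depend on the decision threshold (0.5 here), whereas the AUC summarizes ranking quality across all thresholds, which is why the two kinds of measure are reported side by side.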

Suggested Citation

  • Alaa M. Elsayad & Medien Zeghid & Hassan Yousif Ahmed & Khaled A. Elsayad, 2023. "Exploration of Biodegradable Substances Using Machine Learning Techniques," Sustainability, MDPI, vol. 15(17), pages 1-22, August.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:17:p:12764-:d:1223629

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/17/12764/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/17/12764/
    Download Restriction: no

    References listed on IDEAS

    1. Alaa M. Elsayad & Ahmed M. Nassef & Mujahed Al-Dhaifallah & Khaled A. Elsayad, 2020. "Classification of Biodegradable Substances Using Balanced Random Trees and Boosted C5.0 Decision Trees," IJERPH, MDPI, vol. 17(24), pages 1-20, December.