Digital Mapping of Soil pH and Driving Factor Analysis Based on Environmental Variable Screening

Author

Listed:
  • He Huang

    (School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China)

  • Yaolin Liu

    (School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China)

  • Yanfang Liu

    (School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China)

  • Zhaomin Tong

    (School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China)

  • Zhouqiao Ren

    (Institute of Digital Agriculture, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China)

  • Yifan Xie

    (School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China)

Abstract

This study considers soil-forming factors in Lanxi City, China, including land use type, soil type, depth, and geographical conditions. Using multi-source public data, three environmental variable screening methods, namely the Boruta algorithm, Recursive Feature Elimination (RFE), and Particle Swarm Optimization (PSO), were used to optimize and combine 47 environmental variables for modeling soil pH, based on data collected from farmland in the study area in 2022, and their effects were evaluated. A Random Forest (RF) model was used to predict soil pH in the study area. Pearson correlation analysis, an RF-based environmental variable importance assessment, and the SHAP explanatory model were used to explore the main controlling factors of soil pH and to reveal its spatial differentiation mechanism. The results showed that, when a large number of environmental variables is available, the RF model with covariates selected by PSO achieved higher prediction accuracy (MAE = 0.496, RMSE = 0.641, R² = 0.413, LCCC = 0.508) than the Boruta–RF and RFE–RF models and the RF model using all variables. This indicates that PSO, as a covariate selection method, effectively optimized the input variables for the RF model and improved its performance. In addition, the Pearson correlation analysis, the RF-based environmental variable importance assessment, and the SHAP explanatory model consistently indicated that Channel Network Base Level (CNBL), Elevation (DEM), Temperature mean (T_m), Evaporation (E_m), Land surface temperature mean (LST_m), and Humidity mean (H_m) are key factors affecting the spatial differentiation of soil pH. In summary, using PSO for covariate selection before applying the RF model gives higher prediction accuracy than the alternatives tested and can serve as an effective method for predicting the spatial distribution of soil pH, providing an important reference for the spatial mapping of soil attributes in hilly and basin areas.
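
The four accuracy measures reported in the abstract (MAE, RMSE, R², LCCC) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that R² is the coefficient of determination and that LCCC is Lin's concordance correlation coefficient computed with population (1/n) variances; the abstract does not state either definition. The function name `regression_metrics` and the sample pH values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, R2 and LCCC for paired observed/predicted values.

    Assumptions (not confirmed by the abstract): R2 is the coefficient of
    determination, and LCCC is Lin's concordance correlation coefficient
    using population (1/n) variances and covariance.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean absolute error and root mean squared error.
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 - residual SS / total SS.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 and LCCC are undefined")
    r2 = 1.0 - ss_res / ss_tot

    # Lin's concordance correlation coefficient.
    var_obs = ss_tot / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / n
    lccc = 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)

    return {"MAE": mae, "RMSE": rmse, "R2": r2, "LCCC": lccc}


if __name__ == "__main__":
    # Hypothetical soil pH values, for illustration only (not data from the study).
    observed_ph = [5.0, 6.0, 7.0]
    predicted_ph = [5.5, 6.0, 6.5]
    print(regression_metrics(observed_ph, predicted_ph))
```

Unlike R², LCCC penalizes both scatter and systematic bias (a shift in mean or scale between predictions and observations), which may be why it is reported alongside the other three measures.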

Suggested Citation

  • He Huang & Yaolin Liu & Yanfang Liu & Zhaomin Tong & Zhouqiao Ren & Yifan Xie, 2025. "Digital Mapping of Soil pH and Driving Factor Analysis Based on Environmental Variable Screening," Sustainability, MDPI, vol. 17(7), pages 1-21, April.
  • Handle: RePEc:gam:jsusta:v:17:y:2025:i:7:p:3173-:d:1627255

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/17/7/3173/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/17/7/3173/
    Download Restriction: no
