The dataset
You can download the dataset and find the description at https://www.kaggle.com/c/afsis-soil-properties/data.
The columns of the dataset are described in the following list, as given at the preceding web link:
PIDN: This is the unique soil sample identifier.
SOC: This refers to soil organic carbon.
pH: These are the pH values.
Ca: This is the Mehlich-3 extractable calcium.
P: This is the Mehlich-3 extractable phosphorus.
Sand: This is the sand content.
m7497.96 - m599.76: These are 3,578 mid-infrared absorbance measurements. For example, the "m7497.96" column is the absorbance at wavenumber 7497.96 cm⁻¹. The data description suggests removing the spectral CO2 bands, which are in the region m2379.76 to m2352.76, but this is optional.
Depth: This is the depth of the soil sample, which has two categories: "Topsoil" and "Subsoil".
The dataset providers have also included some potential spatial predictors from remote sensing data sources. Short variable descriptions of different terms are provided below and additional descriptions...
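Returning to the mid-infrared columns, the optional removal of the CO2 bands can be sketched as follows. This is a minimal illustration, not part of the dataset description: the helper names `is_co2_band` and `drop_co2_bands` are hypothetical, and the sketch assumes that every spectral column is named "m" followed by its wavenumber, as in the examples quoted above.

```python
# Bounds of the CO2 region, taken from the column names quoted in the text.
CO2_HIGH = 2379.76
CO2_LOW = 2352.76


def is_co2_band(column):
    """Return True if `column` is a spectral column inside the CO2 region."""
    if not column.startswith("m"):
        return False
    try:
        wavenumber = float(column[1:])
    except ValueError:
        # A name that starts with "m" but is not a spectral column.
        return False
    return CO2_LOW <= wavenumber <= CO2_HIGH


def drop_co2_bands(columns):
    """Return the column names without the CO2 band columns, order preserved."""
    return [c for c in columns if not is_co2_band(c)]


# Tiny illustrative header; the real file has 3,578 spectral columns.
header = ["PIDN", "m7497.96", "m2379.76", "m2352.76", "m599.76", "Depth"]
kept = drop_co2_bands(header)
print(kept)  # ['PIDN', 'm7497.96', 'm599.76', 'Depth']
```

If the data has been loaded into a pandas DataFrame (here assumed to be called `df`), the same predicate can be used to select the columns to pass to `df.drop(columns=...)`.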