The Kriging interpolation technique is being increasingly used in geostatistics these days. But how does Kriging work to create a prediction, after all?
To start with, Kriging is a method where the distance and direction between the sample data points indicate a spatial correlation. This correlation is then used to explain the different variations in the surface.
In cases where the distance and direction give appropriate spatial correlation, Kriging will be able to predict surface variations in the most effective way. As such, we often see Kriging being used in Geology and Soil Sciences.
Kriging generates an optimal output surface for prediction which it estimates based on a scattered set with z-values. The procedure involves investigating the z-values’ spatial behavior in an ‘interactive’ manner where advanced statistical relationships are measured (autocorrelation).
Mathematically speaking, Kriging is somewhat similar to regression analysis and its whole idea is to predict the unknown value of a function at a given point by calculating the weighted average of all known functional values in the neighborhood. To get the output value for a location, we take the weighted sum of already measured values in the surrounding (all the points that we intend to consider around a specific radius), using a formula such as the following:
In a regression equation, λi would represent the weights of how far the points are from the prediction location. However, in Kriging, λi represent not just the weights of how far the measured points are from prediction location, but also how the measured points are arranged spatially around the prediction location. First, the variograms and covariance functions are generated to create the spatial autocorrelation of data. Then, that data is used to make predictions.
Thus, unlike the deterministic interpolation techniques like Inverse Distance Weighted (IDW) and Spline interpolation tools, Kriging goes beyond just estimating a prediction surface. Here, it brings an element of certainty in that prediction surface.
That is why experts rate kriging so highly for a strong prediction. Instead of a weather report forecasting a 2 mm rain on a certain Saturday, Kriging also tells you what is the "probability" of a 2 mm rain on that Saturday.
We hope you enjoy this simple R tutorial on Kriging by Berry Boessenkool.
We will be covering following sections in our tutorial with supported illustrations:
install.packages("rgeos")
install.packages("sf")
install.packages("geoR")
library(sf) # for st_read (read shapefiles),
# st_centroid, st_area, st_union
library(geoR) # as.geodata, variog, variofit,
# krige.control, krige.conv, legend.krige
## Warning: package ’sf’ was built under R version 3.4.1
x <- c(1,1,2,2,3,3,3,4,4,5,6,6,6)
y <- c(4,7,3,6,2,4,6,2,6,5,1,5,7)
z <- c(5,9,2,6,3,5,9,4,8,8,3,6,7)
plot(x,y, pch="+", cex=z/4)
GEODATA <- as.geodata(cbind(x,y,z))
plot(GEODATA)
EMP_VARIOGRAM <- variog(GEODATA)
## variog: computing omnidirectional variogram
FIT_VARIOGRAM <- variofit(EMP_VARIOGRAM)
## variofit: covariance model used is matern
## variofit: weights used: npairs
## variofit: minimisation function used: optim
## Warning in variofit(EMP_VARIOGRAM): initial values not provided - running the default search
## variofit: searching for best initial value ... selected values:
## sigmasq phi tausq kappa
## initial.value "9.19" "3.65" "0" "0.5"
## status "est" "est" "est" "fix"
## loss value: 401.578968904954
plot(EMP_VARIOGRAM)
lines(FIT_VARIOGRAM)
res <- 0.1
grid <- expand.grid(seq(min(x),max(x),res),
seq(min(y),max(y),res))
krico <- krige.control(type.krige="OK",
obj.model=FIT_VARIOGRAM)
krobj <- krige.conv(GEODATA,
locations=grid, krige=krico)
## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood
# KRigingObjekt
image(krobj, col=rainbow2(100))
legend.krige(col=rainbow2(100),
x.leg=c(6.2,6.7), y.leg=c(2,6),
vert=T, off=-0.5,
values=krobj$predict)
contour(krobj, add=T)
colPoints(x,y,z, col=rainbow2(100), legend=F)
points(x,y)
library("berryFunctions") # scatterpoints by color
colPoints(x,y,z, add=F, cex=2, legargs=list(y1=0.8,y2=1))
colPoints(grid[ ,1], grid[ ,2], krobj$predict, add=F,
cex=2, col2=NA, legargs=list(y1=0.8,y2=1))
Precipitation from ca 250 gauges in Brandenburg as Thiessen Polygons with steep gradients at edges:
# Shapefile:
p <- sf::st_read("data/PrecBrandenburg/niederschlag.shp",
quiet=TRUE)
# Plot prep
pcol <- colorRampPalette(c("red","yellow","blue"))(50)
clss <- berryFunctions::classify(p$P1, breaks=50)$index
# Plot
par(mar = c(0,0,1.2,0))
plot(p, col=pcol[clss], max.plot=1) # P1: Precipitation
# kriging coordinates
cent <- sf::st_centroid(p)
berryFunctions::colPoints(cent$x, cent$y, p$P1, add=T, cex=0.7,
legargs=list(y1=0.8,y2=1), col=pcol)
points(cent$x, cent$y, cex=0.7)
library(geoR)
# Semivariance:
geoprec <- as.geodata(cbind(cent$x,cent$y,p$P1))
vario <- variog(geoprec, max.dist=130000)
## variog: computing omnidirectional variogram
fit <-variofit(vario)
## Warning in variofit(vario): initial values not provided - running the default search
## variofit: searching for best initial value ... selected values:
## sigmasq phi tausq kappa
## initial.value "1326.72" "19999.93" "0" "0.5"
## status "est" "est" "est" "fix"
## loss value: 107266266.76371
plot(vario) ; lines(fit)
# distance to closest other point:
d <- sapply(1:nrow(cent), function(i) min(berryFunctions::distance(
cent$x[i], cent$y[i], cent$x[-i], cent$y[-i])))
hist(d/1000, breaks=20, main="distance to closest gauge [km]")
mean(d) # 8 km
## [1] 8165.633
# Kriging:
res <- 1000 # 1 km, since stations are 8 km apart on average
grid <- expand.grid(seq(min(cent$x),max(cent$x),res),
seq(min(cent$y),max(cent$y),res))
krico <- krige.control(type.krige="OK", obj.model=fit)
krobj <- krige.conv(geoprec, locations=grid, krige=krico)
## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood
# Set values outside of Brandenburg to NA:
grid_sf <- sf::st_as_sf(grid, coords=1:2, crs=sf::st_crs(p))
isinp <- sapply(sf::st_within(grid_sf, p), length) > 0
krobj2 <- krobj
krobj2$predict[!isinp] <- NA
geoR:::image.kriging(krobj2, col=pcol)
colPoints(cent$x, cent$y, p$P1, col=pcol, zlab="Prec", cex=0.7,
legargs=list(y1=0.1,y2=0.8, x1=0.78, x2=0.87, horiz=F))
plot(p, add=T, col=NA, border=8)#; points(cent$x,cent$y, cex=0.7)
[author title="About the author"]Berry started working with R in 2010 during his studies of Geoecology at Potsdam University, Germany. He has since then given a number of R programming workshops and tutorials, including full-week workshops in Kyrgyzstan and Kazachstan. He has left the department for environmental science in summer 2017 to focus more on software development and teaching in the data science industry. Please follow the Github link for detailed explanations on Berry’s R courses. [/author]