unsupervised classification with R

January 29, 2016

Here we look at three simple ways to perform an unsupervised classification of a raster dataset in R. Before going through the approaches, we need to load the relevant packages and the actual data. You could use the Landsat data from the “Remote Sensing and GIS for Ecologists” book, which can be downloaded here.

library("raster")  
library("cluster")
library("randomForest")

# loading the layerstack  
# here we use a subset of the Landsat dataset from "Remote Sensing and GIS for Ecologists" 
image <- stack("path/to/raster")
plotRGB(image, r=3,g=2,b=1,stretch="hist")

Figure: RGB composite of the Landsat subset (histogram stretch).

Now we prepare the data for the classifications. First we convert the raster data into a matrix, then we remove the NA values.

## extract the values of the raster dataset into a matrix
v <- getValues(image)
## indices of the cells that are complete, i.e. have no NA in any band
i <- which(complete.cases(v))
v <- na.omit(v)

The first classification method is the well-known k-means method. It partitions n observations into k clusters, with each observation belonging to the cluster with the nearest mean.

## kmeans classification 
E <- kmeans(v, 12, iter.max = 100, nstart = 10)
kmeans_raster <- raster(image)
kmeans_raster[i] <- E$cluster
plot(kmeans_raster)

Figure: k-means classification result.
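The number of clusters (12 in the call above) is a free parameter that has to be chosen by the analyst. One common heuristic is the "elbow" method: fit kmeans for a range of k and look for the bend in the total within-cluster sum of squares. A minimal sketch, using a random stand-in matrix instead of the actual Landsat values so it runs on its own:

```r
## sketch of the elbow heuristic for choosing k
## `v` would be the NA-free value matrix from above; a random
## stand-in matrix is used here so the snippet is self-contained
set.seed(42)
v <- matrix(runif(600), ncol = 6)

wss <- sapply(2:15, function(k) {
  kmeans(v, centers = k, iter.max = 100, nstart = 5)$tot.withinss
})
plot(2:15, wss, type = "b",
     xlab = "number of clusters k",
     ylab = "total within-cluster sum of squares")
```

The "elbow", where adding further clusters stops reducing the sum of squares noticeably, is a reasonable candidate for k.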

The second classification method is called clara (Clustering Large Applications). It clusters only a sample of the dataset and then assigns all objects in the dataset to those clusters.

## clara classification 
clus <- clara(v, 12, samples = 500, metric = "manhattan", pamLike = TRUE)
clara_raster <- raster(image)
clara_raster[i] <- clus$clustering
plot(clara_raster)

Figure: clara classification result.

The third method uses a randomForest model to calculate proximity values. These proximities are clustered using k-means, and the resulting clusters are used to train a second randomForest model for the final classification.

## unsupervised randomForest classification using kmeans
## draw a random sample of 500 pixels to keep the proximity matrix small
vx <- v[sample(nrow(v), 500), ]
## unsupervised randomForest, returning the proximity matrix
rf_prox <- randomForest(vx, ntree = 1000, proximity = TRUE)$proximity

## cluster the proximities and use the clusters as training labels
E_rf <- kmeans(rf_prox, 12, iter.max = 100, nstart = 10)
rf <- randomForest(vx, as.factor(E_rf$cluster), ntree = 500)
rf_raster <- predict(image, rf)
plot(rf_raster)
plot(rf_raster)

Figure: randomForest classification result.

The three classifications are stacked into one layerstack and plotted for comparison.

class_stack <- stack(kmeans_raster,clara_raster,rf_raster)
names(class_stack) <- c("kmeans","clara","randomForest")

plot(class_stack)

Comparing the three classifications:

Looking at the different classifications, we notice that the kmeans and clara results differ only slightly, while the randomForest classification shows a noticeably different image.
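Beyond eyeballing the maps, the agreement between two partitions can be checked with a contingency table of the cluster labels (the label numbers themselves are arbitrary; only the co-assignment pattern matters). A small self-contained sketch with stand-in data in place of the `E$cluster` and `clus$clustering` vectors from above:

```r
library(cluster)

## stand-in data; in the post, E$cluster and clus$clustering from the
## kmeans and clara calls would be cross-tabulated instead
set.seed(1)
v <- matrix(runif(600), ncol = 6)
km <- kmeans(v, 5, nstart = 5)
cl <- clara(v, 5, samples = 50, pamLike = TRUE)

## rows = kmeans clusters, columns = clara clusters; counts
## concentrated in few cells mean the two partitions agree
print(table(kmeans = km$cluster, clara = cl$clustering))
```

If most rows have their counts concentrated in a single column, the two methods found essentially the same grouping under different labels.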
