unsupervised classification with R

January 29, 2016

This post shows three simple ways to perform an unsupervised classification of a raster dataset in R. Before going through them, we need to load the relevant packages and the data. You could use the Landsat data from the book “Remote Sensing and GIS for Ecologists”, which can be downloaded here.

library("raster")
library("cluster")
library("randomForest")

# loading the layerstack
# here we use a subset of the Landsat dataset from "Remote Sensing and GIS for Ecologists"
image <- stack("path/to/raster")
# band order r = 3, g = 2, b = 1 assumes that bands 1 to 3 are blue, green and red
plotRGB(image, r = 3, g = 2, b = 1, stretch = "hist")

Figure: RGB composite of the image

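Now we will prepare the data for the classifications. First we convert the raster data into a matrix with one row per cell and one column per band, then we remove all rows that contain NA values and remember which cells were kept.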

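## read all cell values into a matrix (rows = cells, columns = bands)
v <- getValues(image)
## cell indices of the rows without NA in any band;
## these are exactly the rows that na.omit() keeps
i <- which(complete.cases(v))
v <- na.omit(v)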
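
To see why the cell index has to be computed per row, here is a minimal sketch on a toy matrix. The values are made up for illustration; rows play the role of raster cells and columns the role of bands.

```r
# Toy "raster values": 5 cells (rows) x 2 bands (columns), made-up numbers
v_toy <- matrix(c(1, 2, NA, 4, 5,
                  6, 7, 8, NA, 10), ncol = 2)

# Row-wise mask: a cell is kept only if it has a value in every band
i_toy <- which(complete.cases(v_toy))
v_clean <- na.omit(v_toy)

i_toy                            # indices of the cells that were kept
length(i_toy) == nrow(v_clean)   # one index per retained row
```

Because there is exactly one index per retained row, labels computed on the cleaned matrix can later be written back to the matching cells.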

The first classification method is the well-known k-means method. It partitions n observations into k clusters so that each observation belongs to the cluster with the nearest mean (the cluster center).

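## kmeans classification with 12 clusters;
## nstart = 10 tries 10 random initialisations and keeps the best one
E <- kmeans(v, 12, iter.max = 100, nstart = 10)
## empty raster with the same geometry as the input image
kmeans_raster <- raster(image)
## write the cluster labels back to the cells that had values
kmeans_raster[i] <- E$cluster
plot(kmeans_raster)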

Figure: k-means classification
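
The "nearest mean" idea can be checked on a small example. The following is only a sketch with made-up two-dimensional points (not the Landsat data); it recomputes the distance of every point to every cluster center and compares the closest center with the label returned by kmeans.

```r
set.seed(42)
# Two well-separated, made-up groups of 2-D points (50 points each)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 10), ncol = 2))
km <- kmeans(x, centers = 2, nstart = 5)

# Squared distance from every point to every cluster center
d2 <- sapply(seq_len(nrow(km$centers)), function(k) {
  rowSums(sweep(x, 2, km$centers[k, ])^2)
})

# Index of the closest center for each point
nearest <- apply(d2, 1, which.min)

# Every point carries the label of its nearest center
all(nearest == km$cluster)
```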

The second classification method is called clara (Clustering Large Applications). It works by clustering only samples of the dataset and then assigning all objects in the dataset to the resulting clusters.

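## clara classification with 12 clusters, based on 500 random samples
clus <- clara(v, 12, samples = 500, metric = "manhattan", pamLike = TRUE)
clara_raster <- raster(image)
## write the cluster labels back to the cells that had values
clara_raster[i] <- clus$clustering
plot(clara_raster)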

Figure: clara classification
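
As a rough illustration of the sampling idea, the sketch below runs clara on made-up points instead of the raster values. The number of samples and the data are arbitrary choices for the example; rngR = TRUE is used only so that set.seed makes the run repeatable.

```r
library(cluster)

set.seed(1)
# 1000 made-up 2-D points in two groups (500 points each)
x <- rbind(matrix(rnorm(1000, mean = 0), ncol = 2),
           matrix(rnorm(1000, mean = 8), ncol = 2))

# Each of the 20 samples is clustered around medoids;
# the best set of medoids is then used for the whole dataset
cl <- clara(x, k = 2, samples = 20, metric = "manhattan",
            pamLike = TRUE, rngR = TRUE)

cl$medoids              # the representative observations (medoids)
length(cl$clustering)   # one label per observation, not only per sampled point
table(cl$clustering)    # cluster sizes
```

Only a small subset is clustered in each sample, which is what keeps the method feasible for the large number of cells in a raster.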

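The third method uses a random forest model to calculate proximity values between the observations. These values are then clustered using k-means, and the resulting clusters are used to train a second random forest model for the classification. Because the proximity matrix has one row and one column per observation, it is calculated on a random sample of 500 pixels only.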

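## unsupervised randomForest classification using kmeans
## draw a random sample of 500 pixels
vx <- v[sample(nrow(v), 500), ]
## without a response, randomForest runs in unsupervised mode
rf_prox <- randomForest(vx, ntree = 1000, proximity = TRUE)$proximity

## cluster the proximity values and train a classifier on the cluster labels
E_rf <- kmeans(rf_prox, 12, iter.max = 100, nstart = 10)
rf <- randomForest(vx, as.factor(E_rf$cluster), ntree = 500)
## apply the trained model to all cells of the image
rf_raster <- predict(image, rf)
plot(rf_raster)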

Figure: randomForest classification
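
The same three steps can be tried on a small made-up dataset to inspect the intermediate objects. This is only a sketch: the data, the column names b1 to b3, the number of trees and the number of clusters are assumptions chosen for the example.

```r
library(randomForest)

set.seed(7)
# 60 made-up observations with 3 variables, in two groups
x <- rbind(matrix(rnorm(90, mean = 0), ncol = 3),
           matrix(rnorm(90, mean = 5), ncol = 3))
colnames(x) <- c("b1", "b2", "b3")   # hypothetical band names

# Step 1: no response is given, so randomForest runs in unsupervised mode
prox <- randomForest(x, ntree = 200, proximity = TRUE)$proximity
dim(prox)   # one row and one column per observation

# Step 2: cluster the rows of the proximity matrix
cl <- kmeans(prox, centers = 2, nstart = 5)$cluster

# Step 3: learn the cluster labels from the original variables
rf <- randomForest(x, as.factor(cl), ntree = 200)
pred <- predict(rf, x)
table(pred)
```

The second model is what makes it possible to label cells that were not part of the sample, since it predicts from the band values rather than from proximities.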

The three classifications are stacked into one layerstack and plotted for comparison.

class_stack <- stack(kmeans_raster,clara_raster,rf_raster)
names(class_stack) <- c("kmeans","clara","randomForest")

plot(class_stack)

Comparing the three classifications:

Looking at the different classifications, we notice that the kmeans and clara classifications show only minor differences, while the randomForest classification yields a noticeably different map.
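
Cluster numbers are arbitrary, so the same number does not have to denote the same class in two maps. One way to go beyond a visual comparison is a cross-tabulation of the labels. The sketch below uses two short made-up label vectors; the agreement measure at the end is a simple illustrative choice, not an established index.

```r
# Made-up cluster labels for the same 8 cells from two classifications
a <- c(1, 1, 2, 2, 3, 3, 3, 1)
b <- c(2, 2, 3, 3, 1, 1, 2, 2)

# Rows: classes of a, columns: classes of b
ct <- table(a, b)
ct

# Share of cells that fall into the dominant pairing of each class of a
agreement <- sum(apply(ct, 1, max)) / sum(ct)
agreement
```

For the rasters in this post, the label vectors could be taken from getValues(kmeans_raster) and getValues(clara_raster).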

 
