Unsupervised classification with R

m

January 29, 2016

This post shows three simple ways to perform an unsupervised classification of a raster dataset in R. Before looking at the individual approaches, we need to load the relevant packages and the data. You could use the Landsat data from the book “Remote Sensing and GIS for Ecologists”, which can be downloaded here.

library("raster")
library("cluster")
library("randomForest")

# load the layer stack
# here we use a subset of the Landsat dataset from "Remote Sensing and GIS for Ecologists"
image <- stack("path/to/raster")
# colour composite; assumes that bands 3, 2 and 1 hold red, green and blue
plotRGB(image, r = 3, g = 2, b = 1, stretch = "hist")

[Figure: RGB image]

Now we prepare the data for the classifications. First we convert the raster data into a matrix with one row per cell and one column per band, then we remove every cell that contains NA values.

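## read the values of the raster dataset into a matrix (one row per cell, one column per band)
v <- getValues(image)
## indices of the cells that have a value in every band
## (which(!is.na(v)) would index single matrix entries, not cells)
i <- which(complete.cases(v))
## drop the cells with missing values
v <- na.omit(v)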
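
The difference between indexing cells and indexing single matrix entries is easy to miss, so here is a minimal sketch on a made-up matrix (`v_toy` is a hypothetical stand-in for the output of `getValues(image)`, not part of the original workflow). It only illustrates that the rows kept by `na.omit()` are the rows flagged by `complete.cases()`.

```r
# Toy stand-in for getValues(image): 4 cells (rows) x 2 bands (columns).
v_toy <- matrix(c(10, NA, 30, 40,
                  11, 21, NA, 41),
                ncol = 2, dimnames = list(NULL, c("band1", "band2")))

# Element-wise indices run over all 8 matrix entries, not over the 4 cells.
elementwise <- which(!is.na(v_toy))

# Row-wise indices identify the cells for which every band has a value.
cell_idx <- which(complete.cases(v_toy))

# na.omit() drops exactly those rows that contain at least one NA.
v_clean <- na.omit(v_toy)

print(elementwise)
print(cell_idx)
print(nrow(v_clean) == length(cell_idx))
```

Only the row-wise index has the same length as the cleaned matrix, which is what is needed to write one cluster label per remaining cell back into a raster.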

The first classification method is the well-known k-means method. It separates n observations into k clusters so that each observation belongs to the cluster with the nearest mean. Here we ask for 12 clusters.

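## kmeans classification
E <- kmeans(v, 12, iter.max = 100, nstart = 10)
## empty raster with the same extent and resolution as the image
kmeans_raster <- raster(image)
## write the cluster number of every non-NA cell into the raster
kmeans_raster[i] <- E$cluster
plot(kmeans_raster)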

[Figure: k-means classification]
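
To make the "nearest mean" idea concrete, here is a small sketch on made-up data (the matrix `toy`, the seed and the choice of two clusters are assumptions for illustration only). It recomputes the distance of every point to every cluster centre and checks that the label returned by `kmeans()` is the index of the closest centre.

```r
set.seed(42)  # arbitrary seed, only to make the toy data repeatable

# Two made-up, well-separated groups of points in two "bands".
toy <- rbind(matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
             matrix(rnorm(100, mean = 5, sd = 0.5), ncol = 2))

km <- kmeans(toy, centers = 2, iter.max = 100, nstart = 10)

# Squared Euclidean distance from every point to every cluster centre.
d2 <- sapply(seq_len(nrow(km$centers)), function(k) {
  rowSums(sweep(toy, 2, km$centers[k, ])^2)
})

# Index of the nearest centre for each point.
nearest <- max.col(-d2, ties.method = "first")

# Each point's cluster label should be the index of its nearest centre.
print(all(nearest == km$cluster))
```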

The second classification method is called clara (Clustering Large Applications). It works by drawing samples from the dataset, clustering each sample around medoids, and then assigning all objects in the dataset to the best set of clusters found.

## clara classification
clus <- clara(v, 12, samples = 500, metric = "manhattan", pamLike = TRUE)
clara_raster <- raster(image)
clara_raster[i] <- clus$clustering
plot(clara_raster)

[Figure: clara classification]
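
The sampling idea behind clara can be seen on a small made-up dataset (the matrix `toy` and the values for `k`, `samples` and `sampsize` below are assumptions chosen for illustration, not settings from this post). The sketch shows that the medoids are actual observations picked from a subsample, while every row still receives a cluster label.

```r
library(cluster)

set.seed(1)  # only affects the random toy data below

# 2000 made-up observations in two "bands", forming two groups.
toy <- rbind(matrix(rnorm(2000, mean = 0), ncol = 2),
             matrix(rnorm(2000, mean = 6), ncol = 2))

# Cluster 20 subsamples of 100 rows each and keep the best result.
cl <- clara(toy, k = 2, samples = 20, sampsize = 100,
            metric = "manhattan", pamLike = TRUE)

# The medoids are rows of the data, identified by their row numbers.
print(cl$i.med)
print(cl$medoids)

# Every observation gets a label, not only the sampled ones.
print(length(cl$clustering) == nrow(toy))

# Size of the subsample that produced the final medoids.
print(length(cl$sample))
```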

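The third method uses a random forest model to calculate proximity values. To keep the proximity matrix small, only a random sample of 500 pixels is used. The proximity values are clustered using k-means, and the resulting clusters are used to train a second random forest model, which then classifies the whole image.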

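## unsupervised randomForest classification using kmeans
## draw a random sample of 500 pixels
vx <- v[sample(nrow(v), 500), ]
## unsupervised random forest (no response given); only the proximity matrix is kept
rf_prox <- randomForest(vx, ntree = 1000, proximity = TRUE)$proximity

## cluster the proximity matrix with kmeans
E_rf <- kmeans(rf_prox, 12, iter.max = 100, nstart = 10)
## train a random forest on the cluster labels and predict the whole image
rf <- randomForest(vx, as.factor(E_rf$cluster), ntree = 500)
rf_raster <- predict(image, rf)
plot(rf_raster)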

[Figure: randomForest classification]
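
If the proximity matrix feels abstract, the following sketch shows what it looks like on a small built-in dataset (using `iris` as a stand-in for the sampled pixel values, three clusters and the seed are assumptions made only for this illustration). The matrix has one row and one column per observation, and each entry is the share of trees in which two observations end up in the same terminal node.

```r
library(randomForest)

set.seed(7)  # arbitrary seed for a repeatable toy run

# Built-in data as a stand-in for the sampled pixel values (150 rows, 4 columns).
x <- iris[, 1:4]

# Without a response, randomForest() runs in unsupervised mode.
urf <- randomForest(x, ntree = 500, proximity = TRUE)
prox <- urf$proximity

# One row and one column per observation, values between 0 and 1.
print(dim(prox))
print(range(prox))

# Cluster the rows of the proximity matrix, as in the workflow above.
km <- kmeans(prox, centers = 3, iter.max = 100, nstart = 10)
print(table(km$cluster))
```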

The three classifications are combined into one layer stack and plotted side by side for comparison.

class_stack <- stack(kmeans_raster,clara_raster,rf_raster)
names(class_stack) <- c("kmeans","clara","randomForest")

plot(class_stack)

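Comparing the three classifications:

Looking at the different classifications, we notice that the kmeans and clara classifications show only minor differences.
The randomForest classification shows a clearly different pattern. Keep in mind that the class numbers are arbitrary, so the same group of pixels can get a different number (and colour) in each map.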
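
A visual comparison can be backed up with a cross-tabulation of the class labels. The sketch below uses two made-up label vectors (`a` and `b` are hypothetical stand-ins for the cell values of two classified rasters) and a rough agreement measure that matches every class in `a` to its most frequent partner in `b`. It is a simple illustration under these assumptions, not a formal accuracy assessment.

```r
# Made-up labels standing in for two classifications of the same 8 cells.
# `b` groups the cells like `a` except for one cell, but numbers the classes differently.
a <- c(1, 1, 2, 2, 3, 3, 3, 1)
b <- c(3, 3, 1, 1, 2, 2, 1, 3)

# Rows are classes of `a`, columns are classes of `b`.
ct <- table(a, b)
print(ct)

# Share of cells that agree once each class in `a` is matched
# to the class in `b` it overlaps with most often.
agreement <- sum(apply(ct, 1, max)) / sum(ct)
print(agreement)
```

With the rasters from this post, the two vectors could presumably be taken from `getValues(kmeans_raster)` and `getValues(clara_raster)`.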

 
