During my internship at the Bhaskaracharya Institute of Space Application and Geoinformatics (BISAG) in Gandhinagar, I had the chance to work on an interesting application of machine learning and image processing in remote sensing. The aim of our project was object-based classification of multiband satellite images.
Let me first give you a high-level idea of what object-based image classification means. Land cover classification has long been an active research area in GIS. Traditionally, it was tackled by an approach called 'pixel-based classification', which looks at each pixel independently, both while training and while classifying, using only its spectral attributes. This works reasonably well, but it falls short on high-resolution data because it completely ignores spatial context: a blue pixel from a water body and a blue pixel from a car look exactly the same to such a classifier. A newer approach is 'object-based image classification', which I will walk you through in this post.
The first and key step after some pre-processing was image segmentation. This step groups similar pixels, in an unsupervised manner, into clumps called superpixels or segments. We first tried k-means but moved on to the more sophisticated segmentation algorithms built into Python's skimage library, such as quickshift. This is how an image looks after segmentation:
With k-means: [segmentation result image]
With Felzenszwalb's and quickshift: [segmentation result images]
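To give a feel for this step, here is a minimal sketch of how the segmentation could be run with skimage. The input file name and the tuning parameters (scale, kernel_size, max_dist, etc.) are illustrative assumptions, not the exact values we used.

```python
import numpy as np
from skimage import io
from skimage.segmentation import felzenszwalb, quickshift

# Read a 3-band composite of the scene; the file name is a placeholder.
image = io.imread('scene_composite.tif')
image = image.astype(np.float64)
image /= image.max()  # scale to [0, 1] for the segmenters

# Felzenszwalb's efficient graph-based segmentation.
segments_fz = felzenszwalb(image, scale=100, sigma=0.5, min_size=50)

# Quickshift expects 3-channel (RGB-like) input and converts it to Lab internally.
segments_qs = quickshift(image, kernel_size=7, max_dist=10, ratio=0.5)

print(f"Felzenszwalb: {len(np.unique(segments_fz))} segments")
print(f"Quickshift:   {len(np.unique(segments_qs))} segments")
```

Both functions return an integer label map of the same height and width as the image, where every pixel carries the id of the segment it belongs to.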
These segments become our objects and serve as a heuristic grouping of pixels that belong to a single real-world object.

The labelled training data we had came in the form of shapefiles. A shapefile is essentially vector data (points, lines, polygons, etc.), so our shapefiles marked certain areas of the image and assigned a label to each pixel in those regions. We mapped each labelled pixel to its corresponding segment, which defined the training regions in the image.

Another challenge was the data representation of segments, because each segment consists of a variable number of pixels. We got around this by using the scipy package to compute six descriptive statistics for each segment and using those six values as the segment's attributes.

Once the training regions were isolated and we had a concise representation for each segment, we could finally deploy an ML classifier to learn from them. We used SVMs with linear and RBF kernels as well as random forests. The best accuracy, 97%, was achieved by the random forest classifier.
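To make the last part concrete, here is a rough sketch of how the segment features and classifiers could fit together. The six statistics are read here as the output of scipy.stats.describe (min, max, mean, variance, skewness, kurtosis), computed per band; the variable names (image, segments, segment_labels) and the split and model parameters are hypothetical stand-ins for our actual pipeline, and the numbers it prints are not the 97% quoted above.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def segment_features(image, segments, segment_id):
    """Describe one segment with six statistics per band:
    min, max, mean, variance, skewness, kurtosis."""
    mask = segments == segment_id
    features = []
    for band in range(image.shape[2]):
        pixels = image[:, :, band][mask]
        described = stats.describe(pixels)
        band_min, band_max = described.minmax
        features += [band_min, band_max, described.mean,
                     described.variance, described.skewness, described.kurtosis]
    return features

# `segment_labels` maps the ids of segments that fall inside the shapefile
# polygons to their land-cover class (a hypothetical structure for this sketch).
X = [segment_features(image, segments, sid) for sid in segment_labels]
y = [segment_labels[sid] for sid in segment_labels]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# Random forest was the best performer in our experiments; an RBF-kernel SVM
# is shown for comparison.
rf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
svm = SVC(kernel='rbf').fit(X_train, y_train)

print("Random forest accuracy:", accuracy_score(y_test, rf.predict(X_test)))
print("SVM (RBF) accuracy:    ", accuracy_score(y_test, svm.predict(X_test)))
```

Classifying the full scene then amounts to running segment_features over every segment id in the label map and feeding the result to the trained model.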