Sunday, August 28, 2016

A High Level Guide to Artificial Neural Networks



Today let’s take a high-level look at Artificial Neural Networks (ANNs), one of the most powerful machine learning methods around and the rabbit hole that leads to the wonderland of deep learning, arguably the hottest area in computer science today.

So let’s get straight to it. A typical architecture of an ANN looks like this:



As you can see, it consists of stacked layers of perceptrons, each with a non-linear activation function. A perceptron, or neuron, is a computational unit that takes in inputs $x_{1}, x_{2}, x_{3}, \ldots$ along with a constant 1 (the bias input) and outputs $f(W^{T}X)$, where $W$ is its weight vector and $f$ is the chosen activation function.
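To make this concrete, here is a minimal sketch of a single neuron in Python with NumPy. The choice of tanh for $f$ and all the numbers are made up for the illustration:

```python
import numpy as np

def neuron_output(x, w, b, f=np.tanh):
    """Compute f(W^T x + b) for a single neuron."""
    return f(np.dot(w, x) + b)

# Made-up inputs and learnable parameters
x = np.array([0.5, -1.2, 3.0])   # inputs x1, x2, x3
w = np.array([0.1, 0.4, -0.2])   # weight vector W
b = 0.7                          # weight on the constant bias input 1

print(neuron_output(x, w, b))    # a single scalar activation
```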




This is a crude simulation of a biological neuron, which receives inputs from various regions through its dendrites and fires when the activation value exceeds some threshold. The inputs are weighted at the synaptic junctions, and these dynamically adjusting weights are the learnable parameters of the neuron.

The leftmost layer is the input layer and the rightmost layer is the output layer. The layers in between are called ‘hidden layers’, and their number is a hyperparameter that we choose beforehand depending on the complexity of the model we wish to build. Neural networks can form very complex decision boundaries because these hidden layers stack several levels of non-linearity.

All the neurons in one layer are connected to those in the next by weights that we wish to learn. The learning procedure starts by initializing all the weights with small random numbers. The rest of the procedure consists of successive forward passes: the input layer takes in the training data, each neuron computes its value and feeds it forward through the hidden layers, and the process continues all the way up to the output layer. Once the output layer has its predictions, the loss is computed with a suitable loss function and backpropagated towards the input side. Backpropagation is essentially the application of the chain rule across each neuron: the gradient of the objective function with respect to each weight vector is obtained by multiplying the local gradient with the incoming gradient. With these gradients, the weight vectors are updated by the gradient descent method that we discussed in the earlier posts. With the new weight vectors, forward propagation happens again and the same procedure reiterates. This is a very high-level and informal description of what goes on in neural nets; for a concrete mathematical formulation I recommend going through this link and also this.
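To ground the description above, here is a minimal NumPy sketch of this loop on a toy two-layer network. The data, architecture, sigmoid activations, squared-error loss, and learning rate are all invented for illustration, not a definitive recipe:

```python
import numpy as np

np.random.seed(0)

# Toy data: 4 samples with 3 features each, and binary targets (made up)
X = np.random.randn(4, 3)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Initialize all weights with small random numbers (3 -> 4 -> 1 network)
W1 = 0.01 * np.random.randn(3, 4)
W2 = 0.01 * np.random.randn(4, 1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for step in range(1000):
    # Forward pass: each layer feeds its activations to the next
    h = sigmoid(X @ W1)                 # hidden layer
    p = sigmoid(h @ W2)                 # output layer prediction

    loss = np.mean((p - y) ** 2)        # squared-error loss

    # Backpropagation: chain rule, local gradient times incoming gradient
    dp  = 2.0 * (p - y) / len(X)        # dLoss/dp
    dz2 = dp * p * (1 - p)              # through the output sigmoid
    dW2 = h.T @ dz2                     # gradient w.r.t. W2
    dh  = dz2 @ W2.T                    # gradient flowing back into h
    dz1 = dh * h * (1 - h)              # through the hidden sigmoid
    dW1 = X.T @ dz1                     # gradient w.r.t. W1

    # Gradient descent update, then the next forward pass reiterates
    W1 -= lr * dW1
    W2 -= lr * dW2
```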

Deep learning, which has shown mind-blowing results in the past few years, is a field built upon ANNs. It basically deals with modelling neural networks in a way that scales to a large number of hidden layers. I will try to cover some of these cool models in future posts. :-)




Wednesday, August 17, 2016

Object Based Image Classification and Change Detection


In the course of my internship at Bhaskaracharya Institute of Space Application and Geoinformatics (BISAG) at Gandhinagar, I had the chance to work on an interesting application of machine learning and image processing in remote sensing. The aim of our project was object-based classification of multiband satellite images.
Let me first give you a high-level understanding of what object-based image classification means. The problem of land cover classification has been a very busy research area in GIS. Traditionally it was tackled by an approach called ‘pixel-based classification’. While training, this method looks at individual pixels independently and classifies them purely on the basis of their spectral attributes. Though it works modestly well, it is not as good for high-resolution data because it completely ignores spatial attributes: for example, a blue pixel from a water body and a blue pixel from a car are treated in exactly the same way. A newer approach is ‘object-based image classification’, which I will walk you through in this post.
The first and key step after some pre-processing was image segmentation. This step groups similar pixels, in an unsupervised manner, into clumps called superpixels or segments. We first tried K-means but moved on to the more sophisticated segmentation algorithms built into Python's skimage library, such as quickshift (see the sketch after the figures below). This is how an image looks after segmentation:

With K-means:




With Felzenszwalb’s and Quickshift:




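For reference, here is a minimal sketch of how such segmentations can be produced with skimage. The file name is a placeholder, and the algorithm parameters are illustrative rather than the ones we actually used; they need tuning per image:

```python
import matplotlib.pyplot as plt
from skimage import io
from skimage.segmentation import felzenszwalb, quickshift, mark_boundaries

img = io.imread('scene.png')  # placeholder path for an RGB image tile

# Parameter values here are illustrative and would need tuning
segments_fz = felzenszwalb(img, scale=100, sigma=0.5, min_size=50)
segments_qs = quickshift(img, kernel_size=3, max_dist=6, ratio=0.5)

# Overlay segment boundaries on the original image, side by side
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(mark_boundaries(img, segments_fz))
axes[0].set_title('Felzenszwalb')
axes[1].imshow(mark_boundaries(img, segments_qs))
axes[1].set_title('Quickshift')
plt.show()
```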
These segments will be our objects and serve as a heuristic for grouping the pixels of a single object. The labelled training data that we had came in the form of something called shape files. A shape file is basically vector data: lines, polygons, and so on. Our shape files marked certain areas in the image and labelled each pixel in those regions. We mapped each labelled pixel to its corresponding segment and thus defined the training regions in the image. Another challenge was the data representation of segments, because each segment consists of a variable number of pixels. We got around this by using the scipy package to compute six summary statistics of each segment and using these six as the attributes representing it. Once the training regions are isolated and we have a concise representation for each segment, we can finally deploy an ML classifier to learn from it. We used SVMs with linear and RBF kernels and random forests; the maximum accuracy of 97% was obtained with the random forest classifier.
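To illustrate the pipeline end to end, here is a minimal sketch under stated assumptions: the image, segment ids, and rasterized shape-file labels are replaced by tiny synthetic arrays; the six statistics are one plausible reading (min, max, mean, variance, skewness, kurtosis via scipy.stats.describe); and the majority-vote label assignment is just one simple way to map labelled pixels onto segments:

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestClassifier

def segment_features(values):
    # Six summary statistics for a variable-sized clump of pixel values
    d = stats.describe(values)
    return [d.minmax[0], d.minmax[1], d.mean, d.variance, d.skewness, d.kurtosis]

# Tiny synthetic stand-ins so the sketch runs end to end
rng = np.random.default_rng(0)
img        = rng.random((20, 20, 4))               # 20x20 image, 4 bands
seg_labels = rng.integers(0, 8, size=(20, 20))     # segment id per pixel
gt         = rng.integers(0, 3, size=(20, 20))     # 0 = unlabelled pixel

X_train, y_train = [], []
for seg_id in np.unique(seg_labels):
    mask = seg_labels == seg_id
    classes = gt[mask]
    classes = classes[classes > 0]     # keep only labelled pixels
    if classes.size == 0:              # segment outside the training regions
        continue
    X_train.append(segment_features(img[mask].ravel()))
    y_train.append(np.bincount(classes).argmax())  # majority pixel label

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
```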
The coolest thing about this project was that we built a machine learning model out of just a single image. I learnt a lot, and I also got the opportunity to get a flavor of remote sensing and exposure to the GIS field, which I probably would not have gotten anywhere else.