Image segmentation is an image processing technique that partitions an image into meaningful regions; in this project, that means extracting the foreground of an image or, equivalently, removing its background.
With deep learning techniques like Mask R-CNN, this process has become more accurate, overshadowing traditional image segmentation techniques.
For simple images, however, the traditional techniques are often sufficient, and they are what this project explores.
Much as the eye perceives color through three types of cone receptors (roughly red, green, and blue), a color image is composed of three channels, or dimensions, corresponding to red, green, and blue (RGB).
By applying a simple threshold to one or more of these channels, selected pixels or regions of the image can be removed.
The sliders below set the range of values for the red, green, and blue channels that will be retained in the image. For example, when the red channel is limited to the range 50 to 155, every pixel whose red value falls outside that range is filtered out.
The mask must first be generated from the thresholds chosen on the sliders; only then can it be applied to the image.
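The two steps above, building a mask from per-channel ranges and then applying it, can be sketched with NumPy. This is a minimal illustration, not the project's actual code: the function names `rgb_range_mask` and `apply_mask` are hypothetical, the image is assumed to be an array of shape (height, width, 3) in RGB order, the ranges are treated as inclusive, and filtered-out pixels are assumed to be painted black.

```python
import numpy as np

def rgb_range_mask(image, lower, upper):
    """Boolean mask: True where every channel lies within [lower, upper]."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    # Compare per channel, then require all three channels to be in range.
    return np.all((image >= lower) & (image <= upper), axis=-1)

def apply_mask(image, mask):
    """Keep pixels where the mask is True; set all others to black."""
    return image * mask[..., np.newaxis].astype(image.dtype)

# A tiny 1x3 test image; only the red channel is restricted (50 to 155).
image = np.array([[[100, 200, 30], [10, 200, 30], [155, 0, 255]]], dtype=np.uint8)
mask = rgb_range_mask(image, lower=(50, 0, 0), upper=(155, 255, 255))
result = apply_mask(image, mask)
```

Here the middle pixel has a red value of 10, which is outside the range, so it is the only pixel removed.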
When the colors of the image are too mixed, i.e. the foreground is not easily separable from the background by thresholding alone, some operations can be applied to the mask to improve the result.
There are many such operations, but most of them can be traced back to two fundamental ones: erosion and dilation.
Erosion "enlarges" the background: every foreground pixel in the neighborhood of a background pixel is turned into a background pixel.
Dilation is the opposite: it enlarges the foreground by turning the background pixels in the neighborhood of a foreground pixel into foreground pixels.
Note that both operations have a concept of "surroundings" or a neighborhood.
The neighborhood is defined by a structuring element, which is itself a matrix of ones and zeros that can take any shape.
For this project, the structuring element is a circle (a filled disk) whose radius is set by the user.
Note that only one operation, either erosion or dilation, is applied to the mask image.
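Erosion and dilation with a circular structuring element can be sketched in plain NumPy as follows. This is an illustrative sketch under stated assumptions rather than the project's implementation: the names `disk`, `dilate`, and `erode` are hypothetical, the mask is a 2D boolean array, and pixels outside the image are assumed not to influence the result (background for dilation, foreground for erosion).

```python
import numpy as np

def disk(radius):
    """Circular structuring element: True within `radius` pixels of the center."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius

def dilate(mask, selem):
    """A pixel becomes foreground if ANY pixel under the element is foreground."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    r = selem.shape[0] // 2
    # Assumption: pixels outside the image count as background.
    padded = np.pad(mask, r, constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy, dx in zip(*np.nonzero(selem)):
        out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(mask, selem):
    """A pixel stays foreground only if ALL pixels under the element are foreground."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    r = selem.shape[0] // 2
    # Assumption: pixels outside the image count as foreground,
    # so the image border alone does not erode the mask.
    padded = np.pad(mask, r, constant_values=True)
    out = np.ones((h, w), dtype=bool)
    for dy, dx in zip(*np.nonzero(selem)):
        out &= padded[dy:dy + h, dx:dx + w]
    return out

# A single foreground pixel in the middle of a 5x5 mask.
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True

grown = dilate(mask, disk(1))   # the pixel grows into a plus-shaped blob
shrunk = erode(grown, disk(1))  # eroding the blob leaves only its center
```

With radius 1 the disk is a plus shape (the center and its four direct neighbors), so dilating a single pixel produces five foreground pixels, and eroding the lone pixel directly would remove it entirely.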
In practice, these operations are usually combined. The two most commonly used combinations are the closing and opening operators. Closing, which dilates the image and then erodes it, patches all holes in the image that are smaller than the structuring element. Opening, on the other hand, erodes the image and then dilates it, which removes all blobs smaller than the structuring element. More complicated operations such as the top-hat transform are built from these as well: the white top-hat is the image minus its opening, and the black top-hat is the closing minus the image. In short, getting the desired image usually means chaining several operations, one after another.
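Opening and closing are just the two orderings of erosion and dilation, which the sketch below makes explicit. It is a minimal NumPy illustration, not the project's code: the helper names are hypothetical, a 3x3 square structuring element is assumed for brevity instead of the user-defined circle, and `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _neighborhoods(mask, selem, pad_value):
    """For each pixel, gather the mask values covered by the structuring element."""
    r = selem.shape[0] // 2
    padded = np.pad(np.asarray(mask, dtype=bool), r, constant_values=pad_value)
    # Shape (h, w, k, k) -> (h, w, n), keeping only positions where selem is True.
    return sliding_window_view(padded, selem.shape)[..., selem]

def dilate(mask, selem):
    return _neighborhoods(mask, selem, pad_value=False).any(axis=-1)

def erode(mask, selem):
    return _neighborhoods(mask, selem, pad_value=True).all(axis=-1)

def opening(mask, selem):
    """Erosion followed by dilation: removes small blobs."""
    return dilate(erode(mask, selem), selem)

def closing(mask, selem):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(mask, selem), selem)

selem = np.ones((3, 3), dtype=bool)  # assumed 3x3 square element

# A solid 5x5 block inside a 9x9 mask.
block = np.zeros((9, 9), dtype=bool)
block[2:7, 2:7] = True

specked = block.copy()
specked[0, 8] = True   # stray one-pixel blob away from the block

holed = block.copy()
holed[4, 4] = False    # one-pixel hole inside the block

opened = opening(specked, selem)  # stray pixel removed, block kept
closed = closing(holed, selem)    # hole filled, block outline unchanged
```

In both cases the result is the clean block, because the stray pixel and the hole are each smaller than the structuring element. Note that opening the mask with the hole would not work the same way: erosion would first widen the hole, which is why the order of the two operations matters.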