Abstract
With the rapid growth and development in artificial intelligence, Semantic Segmentation has widely adapted in the area of computer vision and artificial intelligence/ Machine learning applications & algorithms are widely utilized in self-driving cars, drone technology, satellite images, treatment, medical research & diagnostic. Semantic segmentation (Image segmentation) is the mapping of each pixel in the image is assigned to a specific class. To illustrate this, imagine an image consisting of cars, bikes, pedestrians, animals, road signs, traffic signals, etc. Semantic segmentation applied image, every pixel in the photo mapped associating with road signs will be labelled to class road signs and similarly for that different class it is mapped to appropriate classes. The Cityscapes dataset has images of urban street scenes images of different classes such as sidewalks, rail tracks, fences, terrain, sky, bus, vegetation, etc. Cityscapes dataset comprises of 5000 images of 50 different European cities covering different climate conditions.
The proposed solution in this project uses a different version of the image segmentation method using the Convolution Neural Network algorithm for the classification task has been implemented. Our proposed solution uses Autoencoders to overcome the FCN up sampling issues. As a result, the spatial information present in the data is better utilized and in addition, the Autoencoders approach uses lesser computational power which is essential for self-driving car applications. The proposed architectures in this project are superior to many best-performing models by achieving better or comparable performance from the best reported FCN model in the mean intersection over union when tested on the Cityscapes dataset which is publicly available on the internet.
Keywords: CNN, Segmentation, FCN, Convolution, VGG, Deep Learning