bishwaspraveen/Vision_Transformers_Based_Image_Segmentation

Hyper-ViT: A Novel Lightweight Visual Transformer-Based Supervised Classification Framework for Hyperspectral Remote Sensing Applications

Introduction

Hyperspectral imagery (HSI) inherently stores fine-grained reflectance information across its contiguous spectral bands, which can be used to discriminate between the materials captured in the data. Deep learning (DL) has emerged over the last decade and has significantly influenced applications and research in the remote sensing domain. Convolutional neural networks (CNNs), residual networks (ResNets), recurrent neural networks (RNNs), and other deep learning constructs have been employed to build remote sensing-based computer vision applications and have repeatedly produced excellent results. However, these constructs lack an intrinsic mechanism to prioritize data features according to how important they are for mapping inputs to ground truths, generally called attention, and therefore do not exploit the underlying architecture's full potential. Hence, in this work, we explore a novel, computationally efficient visual transformer (ViT) based architecture for HSI classification tasks. The efficacy of the proposed architecture is evaluated through a series of experiments, and its performance is compared against other state-of-the-art attention-based HSI classification methodologies on two datasets, Salinas and Pavia University. Experimental results show that our proposed methodology outperforms the approaches discussed in terms of both classification efficacy and computational complexity under limited-training-sample scenarios.
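The attention mechanism referred to above can be sketched in a minimal NumPy form as scaled dot-product self-attention over a sequence of token embeddings (e.g. embeddings derived from an HSI patch). This is an illustrative sketch only, not the paper's actual Hyper-ViT implementation; the token count and embedding width below are arbitrary.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a token sequence.

    x: (n_tokens, d_model) array, e.g. embeddings of one HSI patch.
    Returns the attended output and the attention weight matrix.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])       # (n_tokens, n_tokens)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
d = 16                                    # hypothetical embedding width
x = rng.standard_normal((8, d))           # 8 tokens from one patch (made up)
wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out, attn = self_attention(x, wq, wk, wv)
print(out.shape, attn.shape)              # (8, 16) (8, 8)
```

Each row of `attn` sums to one, so the output for every token is a convex combination of all token values: this is the "prioritize features by importance" behaviour that CNNs and RNNs lack intrinsically.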


Datasets

Name : Pavia University

Size : 610 x 340 pixels, 103 spectral bands

Ground Truth (Labels)

Name : Salinas Valley

Size : 512 x 217 pixels, 224 spectral bands

Ground Truth (Labels)
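The experiments below use limited training fractions (1%-5%) of the labelled pixels per dataset. A common way to build such splits is a per-class (stratified) random selection, skipping the unlabelled background class 0. The sketch below is one plausible way to do this, demonstrated on a synthetic label map with Pavia University's spatial size; it is not the paper's actual sampling code.

```python
import numpy as np

def stratified_split(labels, frac=0.02, seed=0):
    """Per-class random split of labelled pixels.

    labels: 2-D integer ground-truth map; 0 means unlabelled background.
    Returns flat pixel indices for the train and test sets.
    """
    rng = np.random.default_rng(seed)
    flat = labels.ravel()
    train_idx, test_idx = [], []
    for c in np.unique(flat):
        if c == 0:                       # skip unlabelled background
            continue
        idx = np.flatnonzero(flat == c)
        rng.shuffle(idx)
        n_train = max(1, int(round(frac * idx.size)))  # at least 1 per class
        train_idx.append(idx[:n_train])
        test_idx.append(idx[n_train:])
    return np.concatenate(train_idx), np.concatenate(test_idx)

# Synthetic stand-in with Pavia University's spatial size (610 x 340),
# 9 labelled classes plus background; real runs would load the actual GT map.
labels = np.random.default_rng(1).integers(0, 10, size=(610, 340))
tr, te = stratified_split(labels, frac=0.02)
```

Sampling per class rather than globally keeps rare classes represented even at 1% training data, which matters for the small-sample regime evaluated here.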


Overall System Architecture


Results

Classification Map (Pavia University) (2% Training Data - Accuracy : 97.19%)

Classification Map (Salinas) (2% Training Data - Accuracy : 94.80%)


Overall Classification Accuracy - Pavia University (1% - 5% Training)

Overall Classification Accuracy - Salinas (1% - 5% Training)
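The accuracy figures above are overall classification accuracy (OA), the fraction of labelled test pixels predicted correctly. HSI papers often also report average accuracy (AA), the mean of per-class recalls; both are sketched below in NumPy (a generic formulation, assumed rather than taken from this repository's code).

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """OA: fraction of test pixels whose predicted label matches the GT."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float((y_true == y_pred).mean())

def average_accuracy(y_true, y_pred):
    """AA: mean per-class recall, so rare classes weigh equally."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    per_class = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(per_class))

# Tiny illustration: 3 of 4 pixels correct, but class 2 is entirely missed.
oa = overall_accuracy([1, 1, 1, 2], [1, 1, 1, 1])  # 0.75
aa = average_accuracy([1, 1, 1, 2], [1, 1, 1, 1])  # 0.5
```

The gap between OA and AA in the toy example shows why AA is worth reporting alongside OA: OA can stay high while a small class is misclassified entirely.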
