HYPER-VIT : A NOVEL LIGHT-WEIGHTED VISUAL TRANSFORMER-BASED SUPERVISED CLASSIFICATION FRAMEWORK FOR HYPERSPECTRAL REMOTE SENSING APPLICATIONS
Hyperspectral imagery (HSI) inherently stores fine-grained reflectance information across its contiguous spectral bands, which can be exploited to discriminate between the materials present in a scene. The emergence of deep learning (DL) over the last decade has had a significant influence on applications and research in the domain of remote sensing. Convolutional neural networks (CNNs), residual networks (ResNets), recurrent neural networks (RNNs), and other deep learning constructs have been employed to build remote sensing based computer vision applications and have repeatedly produced excellent results. However, these constructs lack an intrinsic mechanism, generally called attention, for prioritizing features according to how important they are for mapping inputs to ground truths, and therefore do not exploit the underlying architecture's potential to the fullest. Hence, in this work we explore the influence of a novel, computationally efficient visual transformer (ViT) based architecture on HSI classification tasks. The efficacy of the proposed architecture is evaluated through a series of experiments, and its performance is compared against other state-of-the-art attention-based HSI classification methods on two datasets, Salinas and Pavia University. Experimental results show that the proposed methodology outperforms the compared approaches in terms of classification accuracy and computational complexity under a limited-training-samples scenario.
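To make the general idea concrete, the sketch below shows how a transformer encoder can be applied to HSI pixel classification: every pixel spectrum inside a small spatial patch becomes a token, and a learnable class token attends over them before a linear classification head. This is only a minimal illustration under assumed hyperparameters (patch size, embedding width, depth), not the Hyper-ViT configuration proposed in the paper.

```python
# Minimal sketch of a ViT-style HSI patch classifier (illustrative only; the
# hyperparameters below are assumptions, not the paper's Hyper-ViT settings).
import torch
import torch.nn as nn

class HSIViTClassifier(nn.Module):
    """Toy transformer classifier over a spatial HSI patch.

    Each pixel's spectrum in the patch is treated as a token, projected to an
    embedding, and processed by a standard transformer encoder with a
    learnable class token used for classification.
    """

    def __init__(self, bands=103, patch=7, embed_dim=64, depth=2, heads=4, classes=9):
        super().__init__()
        num_tokens = patch * patch
        self.proj = nn.Linear(bands, embed_dim)                     # spectrum -> token embedding
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_tokens + 1, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=heads, dim_feedforward=embed_dim * 2,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, classes)                   # class token -> logits

    def forward(self, x):
        # x: (batch, patch, patch, bands) spatial patch centred on the labelled pixel
        b = x.shape[0]
        tokens = self.proj(x.flatten(1, 2))                         # (batch, patch*patch, embed_dim)
        cls = self.cls_token.expand(b, -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        return self.head(encoded[:, 0])                             # classify from the class token

# Example: a batch of 7x7 patches from a 103-band cube with 9 land-cover classes.
logits = HSIViTClassifier()(torch.randn(8, 7, 7, 103))
print(logits.shape)  # torch.Size([8, 9])
```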
Name : Pavia University
Size : 610 × 340 pixels × 103 spectral bands

Name : Salinas Valley
Size : 512 × 217 pixels × 224 spectral bands
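Both benchmarks are commonly distributed as MATLAB .mat files containing the hyperspectral cube and a ground-truth label map. The snippet below is a minimal loading sketch; the file names and dictionary keys shown are assumptions about the widely circulated copies, not files provided with this work.

```python
# Minimal sketch of loading a benchmark HSI cube and its ground truth
# (file names and .mat keys are assumed, not specified by the paper).
from scipy.io import loadmat
import numpy as np

def load_hsi(cube_path, cube_key, gt_path, gt_key):
    """Return the hyperspectral cube (H, W, bands) and its ground-truth map (H, W)."""
    cube = loadmat(cube_path)[cube_key].astype(np.float32)
    gt = loadmat(gt_path)[gt_key].astype(np.int64)
    return cube, gt

# Example with commonly used (assumed) filenames and keys for Pavia University:
cube, gt = load_hsi("PaviaU.mat", "paviaU", "PaviaU_gt.mat", "paviaU_gt")
print(cube.shape, gt.shape)  # expected: (610, 340, 103) (610, 340)
```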