This research project implements a Kernel Subspace Method (KSM) for malware classification. The full paper is available at https://ieeexplore.ieee.org/abstract/document/10244023 or https://scholar.google.com/citations?view_op=view_citation&hl=en&user=JQsEaPUAAAAJ&citation_for_view=JQsEaPUAAAAJ:IjCSPb-OGe4C
This repository contains the Python implementation of the algorithm, along with the necessary documentation and examples. The script should run under both Python 2 and Python 3.
- Scalable to handle large datasets and complex malware variants.
- Computationally efficient and requires only a few parameter adjustments, making it a practical and effective solution for malware classification.
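For orientation, the general idea behind a kernel subspace classifier can be sketched as follows. This is a generic, minimal illustration and not the code shipped in this repository: the Gaussian kernel, the function names (`rbf_kernel`, `fit_class_subspaces`, `predict`), and the parameters `gamma` and `n_dims` are assumptions made for the example. Each class gets a subspace spanned by the leading eigenvectors of its kernel (Gram) matrix, and a sample is assigned to the class whose subspace captures the largest squared projection.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A.dot(B.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def fit_class_subspaces(X, y, n_dims, gamma):
    """For each class, keep the top n_dims eigenvectors of its Gram matrix."""
    models = {}
    for label in np.unique(y):
        Xc = X[y == label]
        K = rbf_kernel(Xc, Xc, gamma)
        eigvals, eigvecs = np.linalg.eigh(K)          # eigenvalues in ascending order
        top = np.argsort(eigvals)[::-1][:n_dims]      # indices of the largest ones
        lam = np.maximum(eigvals[top], 1e-12)         # guard against division by ~0
        # Scaling by 1/sqrt(lambda) makes the basis orthonormal in feature space.
        models[label] = (Xc, eigvecs[:, top] / np.sqrt(lam))
    return models


def predict(models, X, gamma):
    """Assign each row of X to the class with the largest squared projection."""
    labels = sorted(models)
    sims = np.column_stack([
        (rbf_kernel(X, models[label][0], gamma).dot(models[label][1]) ** 2).sum(axis=1)
        for label in labels
    ])
    return np.asarray(labels)[sims.argmax(axis=1)]


# Toy data (synthetic, for illustration only): two clusters, L2-normalized.
rng = np.random.RandomState(0)
centers = np.array([[1.0, 0.0], [0.0, 1.0]])
X = np.vstack([c + 0.05 * rng.randn(20, 2) for c in centers])
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = np.repeat([0, 1], 20)

models = fit_class_subspaces(X, y, n_dims=2, gamma=1.0)
pred = predict(models, X, gamma=1.0)
print("training accuracy:", (pred == y).mean())
```

In this sketch the only tunable values are the kernel width `gamma` and the subspace dimension `n_dims`; the parameters of the actual implementation may differ.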
We used two publicly available malware datasets: Malimg and Dumpware. For ease of use and consistency, all samples from these datasets have been preprocessed, L2-normalized, and stored in MATLAB's '.mat' file format. The datasets in '.mat' format are included in this repository for direct use with the KSM algorithm.
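As a rough illustration of working with such files, the sketch below writes a small synthetic '.mat' file and reads it back with `scipy.io.loadmat`, then checks that each sample has unit L2 norm. The file name `example_dataset.mat`, the variable keys `X` and `y`, and the one-sample-per-row layout are assumptions for the example; inspect the keys of the actual files in this repository before relying on them.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Hypothetical file name and keys, used only for this demonstration.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example_dataset.mat")

# Build a tiny synthetic dataset and L2-normalize each row (one sample per row).
rng = np.random.RandomState(0)
raw = rng.rand(6, 16)
features = raw / np.linalg.norm(raw, axis=1, keepdims=True)
labels = np.array([0, 0, 1, 1, 2, 2])
savemat(path, {"X": features, "y": labels})

# Load it back. loadmat returns a dict; 1-D arrays come back as 2-D (1 x n).
data = loadmat(path)
print([k for k in data if not k.startswith("__")])  # list the stored variables
X = data["X"]
y = data["y"].ravel()

# Every row should have unit L2 norm if normalization was applied per sample.
row_norms = np.linalg.norm(X, axis=1)
print(np.allclose(row_norms, 1.0))
```

Listing the keys first is a cheap way to discover how a given '.mat' file names its feature and label arrays.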
KSM confusion matrices on the Malimg and Dumpware datasets.
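A confusion matrix of this kind can be computed from true and predicted labels with a few lines of NumPy. The helper name `confusion_matrix` and the toy labels below are illustrative assumptions, not results from the paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = np.trace(cm) / float(cm.sum())
print(cm)
print(accuracy)
```

The diagonal holds the correctly classified samples, so overall accuracy is the trace divided by the total count.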
If you find this work useful, please cite it as:
@article{djafer2023efficient,
title={Efficient Malware Analysis Using Subspace-Based Methods on Representative Image Patterns},
author={Djafer Yahia M, Benchadi and Batalo, Bojan and Fukui, Kazuhiro},
journal={IEEE Access},
volume={11},
pages={102492--102507},
year={2023}
}