- Extract useful and meaningful features (Windows API calls, byte n-grams, opcode instructions, etc.) from the given PE files
- Given a binary file, write a program to determine whether it is a PE file: check that it begins with the magic bytes “MZ”, followed by a DOS stub, and that it has the “PE” signature in the PE file header. Also check whether the binary is 32- or 64-bit.
- Given a PE file, write a program to extract its features
- Develop an ML algorithm to train a model on the selected feature set(s), then evaluate it on the test set
- Implement a GUI tool that lets users scan uploaded PE files. The program should run the predictions in the backend and display the detection results (benign vs. malicious)
- Working code with GUI
- Report covering the feature detection method, the training/detection algorithm, and the experimental results
- In-class demo showing detection on arbitrary executables in PE format
- Class Lectures 16-18
- sklearn decision trees tutorial
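
The PE check described in the task list (the “MZ” magic, the “PE” signature, and 32- vs. 64-bit) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a required design. The function names `inspect_pe` and `inspect_pe_file` are hypothetical, and the sketch assumes the standard PE layout: the 4-byte `e_lfanew` field at offset 0x3C of the DOS header points to the “PE\0\0” signature, which is followed by a 20-byte COFF header and then the optional header, whose magic value is 0x10B for PE32 and 0x20B for PE32+.

```python
import struct

PE32_MAGIC = 0x10B       # optional-header magic for 32-bit images (PE32)
PE32_PLUS_MAGIC = 0x20B  # optional-header magic for 64-bit images (PE32+)


def inspect_pe(data: bytes):
    """Return None if `data` is not a PE image, else a small info dict.

    Hypothetical helper for illustration; it checks only the headers
    named in the assignment and does not validate the rest of the file.
    """
    # The DOS header is 64 bytes long and must start with "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None

    # e_lfanew (little-endian uint32 at 0x3C) is the offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)

    # Need: signature (4) + COFF header (20) + optional-header magic (2).
    if e_lfanew + 26 > len(data):
        return None
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None

    # Machine is the first field of the COFF header, right after the signature.
    (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
    # The optional header starts right after the 20-byte COFF header.
    (magic,) = struct.unpack_from("<H", data, e_lfanew + 24)

    # bits is None if the magic is neither PE32 nor PE32+.
    bits = {PE32_MAGIC: 32, PE32_PLUS_MAGIC: 64}.get(magic)
    return {"machine": machine, "bits": bits}


def inspect_pe_file(path):
    """Read the start of a file and run the header check on it."""
    with open(path, "rb") as f:
        # Assumption: 4 KiB covers the headers of typical images;
        # a file with a larger e_lfanew would be reported as non-PE.
        return inspect_pe(f.read(4096))
```

Working on bytes rather than a path keeps the check easy to unit-test and to reuse from a GUI backend that receives uploaded files in memory. A parsing library could replace the manual `struct` calls once feature extraction goes beyond the headers.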