Adversarial attacks against Large Language Models

AlisaLC/llama-attack

Llama Attack

Introduction

This repository contains implementations of adversarial attacks against Large Language Models (LLMs) and Vision Language Models (VLMs), together with apps for running them.

Implementations

  • Greedy Coordinate Gradient (by nanoGCG)
  • Don't Say No (modified version of nanoGCG)
  • Visual Embedding Attack (different images with similar embeddings)
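
As a rough illustration of the idea behind Greedy Coordinate Gradient, here is a minimal sketch on a toy objective. It does not use nanoGCG or any code from this repo. The "model" is a hypothetical stand-in (a weighted sum of random token embeddings compared against a target vector), and every name in it (`make_toy_model`, `onehot_grad`, `gcg_step`, `run_toy_gcg`) is made up for this example. One simplification is assumed: a step keeps the current tokens when no sampled candidate lowers the loss.

```python
import random


def make_toy_model(vocab_size, dim, length, rng):
    """Hypothetical stand-in for an LLM: embeddings, position weights, target."""
    emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab_size)]
    pos_w = [rng.uniform(0.5, 1.5) for _ in range(length)]
    target = [rng.gauss(0, 1) for _ in range(dim)]
    return emb, pos_w, target


def residual(tokens, model):
    """r = sum_i w_i * emb[token_i] - target."""
    emb, pos_w, target = model
    r = [-t for t in target]
    for w, tok in zip(pos_w, tokens):
        for d in range(len(r)):
            r[d] += w * emb[tok][d]
    return r


def loss(tokens, model):
    """Squared distance between the weighted embedding sum and the target."""
    return sum(x * x for x in residual(tokens, model))


def onehot_grad(tokens, model):
    """Gradient of the loss w.r.t. one-hot token indicators.

    grad[i][v] = 2 * w_i * <r, emb[v]>
    """
    emb, pos_w, _ = model
    r = residual(tokens, model)
    dots = [sum(a * b for a, b in zip(r, e)) for e in emb]
    return [[2.0 * w * d for d in dots] for w in pos_w]


def gcg_step(tokens, model, rng, topk=8, num_candidates=32):
    """One step: gradient-guided shortlist, random swaps, exact re-scoring."""
    grad = onehot_grad(tokens, model)
    vocab = range(len(grad[0]))
    # Per position, shortlist the tokens with the most negative gradient.
    top = [sorted(vocab, key=lambda v: g[v])[:topk] for g in grad]
    best_tokens, best_loss = tokens, loss(tokens, model)
    for _ in range(num_candidates):
        pos = rng.randrange(len(tokens))
        cand = list(tokens)
        cand[pos] = rng.choice(top[pos])  # swap a single coordinate
        cand_loss = loss(cand, model)     # evaluate the true loss
        if cand_loss < best_loss:         # keep only improvements
            best_tokens, best_loss = cand, cand_loss
    return best_tokens, best_loss


def run_toy_gcg(steps=30, seed=0):
    """Run several steps and return the final tokens and the loss history."""
    rng = random.Random(seed)
    model = make_toy_model(vocab_size=50, dim=8, length=6, rng=rng)
    tokens = [rng.randrange(50) for _ in range(6)]
    history = [loss(tokens, model)]
    for _ in range(steps):
        tokens, current = gcg_step(tokens, model, rng)
        history.append(current)
    return tokens, history


if __name__ == "__main__":
    final_tokens, losses = run_toy_gcg()
    print("final tokens:", final_tokens)
    print("loss: %.4f -> %.4f" % (losses[0], losses[-1]))
```

In a real attack the toy loss would be replaced by the language model's loss on a target completion, and the gradient would come from backpropagation through the model rather than a closed-form expression.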

Apps

Each implementation has a companion app built with Streamlit.

  • GCG app
  • DSN app
  • VisEmb app

Models

  • Llama 3.1
  • Qwen2-VL
