Add a full README intro message
Summary: Intro message for CYBERSECEVAL

Reviewed By: joshsaxe

Differential Revision: D51913399

fbshipit-source-id: e428bd137b2cabc41856b533d7a4831fcd2bd9de
cynikolai authored and facebook-github-bot committed Dec 6, 2023
1 parent 2dbde45 commit aad176e
Showing 2 changed files with 20 additions and 14 deletions.
22 changes: 14 additions & 8 deletions CybersecurityBenchmarks/README.md
@@ -1,3 +1,15 @@
# Introduction

This repository contains the implementation of [CyberSecEval](https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/), a comprehensive benchmark suite for evaluating the cybersecurity risks of Large Language Models (LLMs). Developed in response to the growing need for secure AI systems, CyberSecEval provides tools to assess two key security domains: the propensity of LLMs to generate insecure code and their compliance when asked to assist in cyberattacks. The benchmarks are based on industry standards such as the Common Weakness Enumeration (CWE) and the MITRE ATT&CK framework, and have been applied to popular LLMs including Llama 2, Code Llama, and OpenAI GPT models. The results highlight significant cybersecurity risks, emphasizing the need for ongoing research and development in AI safety.

The repository includes three types of benchmarks:

**MITRE Tests**: These tests use the MITRE ATT&CK framework to evaluate an LLM's compliance when asked to assist in cyberattacks.

**Instruct Tests**: These tests assess an LLM's propensity to generate insecure code when given a specific instruction.

**Autocomplete Tests**: These tests measure how often an LLM suggests insecure coding practices in autocomplete contexts, where the LLM predicts subsequent code based on preceding code.
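
As a purely hypothetical illustration (not a prompt drawn from the CyberSecEval datasets), an autocomplete or instruct test case might present a context like the following, where an insecure completion would splice user input directly into a SQL string (CWE-89) while a secure completion would use a parameterized query:

```
# Hypothetical illustration only; not taken from the CyberSecEval datasets.
# The model is shown the preceding code and asked to complete the query; the
# question the benchmarks ask is whether the completion is parameterized or
# built by string concatenation.
import sqlite3


def find_user(db_path: str, username: str) -> list[tuple]:
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    # Secure reference completion: bind user input as a query parameter.
    cursor.execute("SELECT id, email FROM users WHERE name = ?", (username,))
    # An insecure completion would instead concatenate, e.g.:
    #   cursor.execute("SELECT id, email FROM users WHERE name = '" + username + "'")
    return cursor.fetchall()
```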

# Getting Started

## Prerequisites
@@ -24,8 +36,8 @@ source ~/.venvs/CybersecurityBenchmarks/bin/activate
pip3 install -r CybersecurityBenchmarks/requirements.txt
```

## Running the benchmarks
The main module for the benchmarks is `CybersecurityBenchmarks.benchmark.run`. Run
```
python3 -m CybersecurityBenchmarks.benchmark.run --help
```
@@ -73,10 +85,6 @@ python3 -m CybersecurityBenchmarks.benchmark.run \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.
- `run-llm-in-parallel` makes async parallel calls to the specified LLM provider. This speeds up prompt evaluation.
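
To make the `--run-llm-in-parallel` note concrete, below is a minimal sketch of async parallel prompt evaluation. It is not the project's actual implementation; `query_llm` is a hypothetical stand-in for a real provider call. The point is simply that concurrent requests let network round-trips overlap rather than run back to back:

```
# Minimal sketch of async parallel prompt evaluation; hypothetical, not the
# CyberSecEval implementation. `query_llm` stands in for a real provider call.
import asyncio


async def query_llm(prompt: str) -> str:
    await asyncio.sleep(1.0)  # placeholder for the network round-trip to a provider
    return f"response to: {prompt}"


async def evaluate(prompts: list[str]) -> list[str]:
    # Issue all requests at once and await them together, so total latency is
    # roughly one round-trip rather than one per prompt.
    return await asyncio.gather(*(query_llm(p) for p in prompts))


if __name__ == "__main__":
    print(asyncio.run(evaluate(["prompt A", "prompt B", "prompt C"])))
```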


The MITRE benchmarks are run in three steps:

@@ -98,8 +106,6 @@ python3 -m CybersecurityBenchmarks.benchmark.run \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.

### For autocomplete benchmarks run:
12 changes: 6 additions & 6 deletions README.md
@@ -3,26 +3,26 @@
</p>

<p align="center">
🤗 <a href="https://huggingface.co/meta-Llama">Hugging Face</a>&nbsp&nbsp | <a href="">Blog</a>&nbsp&nbsp | <a href="https://ai.facebook.com/llama/purple-llama">Website</a>&nbsp&nbsp | <a href="https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/">CyberSec Eval Paper</a>&nbsp&nbsp | <a href="https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/">Llama Guard Paper</a>&nbsp&nbsp
<br>

--------------------------------------------------------------------------------
# Purple Llama
Purple Llama is an umbrella project that over time will bring together tools and evals to help the community build responsibly with open generative AI models. The initial release will include tools and evals for Cyber Security and Input/Output safeguards, but we plan to contribute more in the near future.

## Why purple?
Borrowing a [concept](https://www.youtube.com/watch?v=ab_Fdp6FVDI) from the cybersecurity world, we believe that to truly mitigate the challenges which generative AI presents, we need to take both attack (red team) and defensive (blue team) postures. Purple teaming, composed of both red and blue team responsibilities, is a collaborative approach to evaluating and mitigating potential risks. The same ethos applies to generative AI, and hence our investment in Purple Llama will be comprehensive.

## License
Components within the Purple Llama project will be licensed permissively, enabling both research and commercial usage. We believe this is a major step towards enabling community collaboration and standardizing the development and usage of trust and safety tools for generative AI development. More concretely, evals and benchmarks are licensed under the MIT license, while any models use the Llama 2 Community license. See the table below:

| **Component Type** | **Components** | **License** |
|:----------|:------------:|:----------:|
| Evals/Benchmarks | Cyber Security Eval (others to come) | MIT |
| Models | Llama Guard | [Llama 2 Community License](https://github.com/facebookresearch/PurpleLlama/blob/main/LICENSE) |

## Getting Started
To get started and learn how to use Purple Llama components with Llama models, see the getting started guide [here](https://ai.meta.com/llama/get-started/). The guide provides information and resources to help you set up Llama, including how to access the model, hosting, and how-to and integration guides. Additionally, you will find supplemental materials to further assist you while responsibly building with Llama. The guide will be updated as more Purple Llama components get released.

## FAQ
For a running list of frequently asked questions, covering not only Purple Llama components but also Llama models more generally, see the FAQ [here](https://ai.meta.com/llama/faq/).
