Add a full README intro message
Summary: Intro message for CYBERSECEVAL

Reviewed By: joshsaxe

Differential Revision: D51913399

fbshipit-source-id: e428bd137b2cabc41856b533d7a4831fcd2bd9de
cynikolai authored and facebook-github-bot committed Dec 6, 2023
1 parent 2dbde45 commit aad176e
Showing 2 changed files with 20 additions and 14 deletions.
22 changes: 14 additions & 8 deletions CybersecurityBenchmarks/README.md
@@ -1,3 +1,15 @@
# Introduction

This repository contains the implementation of [CyberSecEval](https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/), a comprehensive benchmark suite for evaluating the cybersecurity risks of Large Language Models (LLMs). Developed in response to the growing need for secure AI systems, CyberSecEval provides tools to assess two key security domains: the propensity of LLMs to generate insecure code and their compliance when asked to assist in cyberattacks. The benchmarks are based on industry standards such as the Common Weakness Enumeration (CWE) and the MITRE ATT&CK framework, and have been applied to popular LLMs including Llama 2, Code Llama, and OpenAI GPT models. The results highlight significant cybersecurity risks, emphasizing the need for ongoing research and development in AI safety.

The repository includes three types of benchmarks:

**MITRE Tests**: These tests use the MITRE ATT&CK framework to evaluate an LLM's compliance when asked to assist in cyberattacks.

**Instruct Tests**: These tests assess an LLM's propensity to generate insecure code when given a specific instruction.

**Autocomplete Tests**: These tests measure how often an LLM suggests insecure coding practices in autocomplete contexts, where the LLM predicts subsequent code based on preceding code.
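
As a purely hypothetical illustration (not a prompt drawn from the CyberSecEval datasets), an autocomplete or instruct test case might present a context like the following, where an insecure completion would splice user input directly into a SQL string (CWE-89) while a secure completion would use a parameterized query:

```
# Hypothetical illustration only; not taken from the CyberSecEval datasets.
# The model is shown the preceding code and asked to complete the query; the
# question the benchmarks ask is whether the completion is parameterized or
# built by string concatenation.
import sqlite3


def find_user(db_path: str, username: str) -> list[tuple]:
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    # Secure reference completion: bind user input as a query parameter.
    cursor.execute("SELECT id, email FROM users WHERE name = ?", (username,))
    # An insecure completion would instead concatenate, e.g.:
    #   cursor.execute("SELECT id, email FROM users WHERE name = '" + username + "'")
    return cursor.fetchall()
```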

# Getting Started

## Prerequisites
@@ -24,8 +36,8 @@ source ~/.venvs/CybersecurityBenchmarks/bin/activate
pip3 install -r CybersecurityBenchmarks/requirements.txt
```

## Running the benchmarks
The main module for the benchmarks is `CybersecurityBenchmarks.benchmark.run`. Run
```
python3 -m CybersecurityBenchmarks.benchmark.run --help
```
@@ -73,10 +85,6 @@ python3 -m CybersecurityBenchmarks.benchmark.run \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.
- `run-llm-in-parallel` makes async parallel calls to the specified LLM provider. This speeds up prompt evaluation.
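
To make the `--run-llm-in-parallel` note concrete, below is a minimal sketch of async parallel prompt evaluation. It is not the project's actual implementation; `query_llm` is a hypothetical stand-in for a real provider call. The point is simply that concurrent requests let network round-trips overlap rather than run back to back:

```
# Minimal sketch of async parallel prompt evaluation; hypothetical, not the
# CyberSecEval implementation. `query_llm` stands in for a real provider call.
import asyncio


async def query_llm(prompt: str) -> str:
    await asyncio.sleep(1.0)  # placeholder for the network round-trip to a provider
    return f"response to: {prompt}"


async def evaluate(prompts: list[str]) -> list[str]:
    # Issue all requests at once and await them together, so total latency is
    # roughly one round-trip rather than one per prompt.
    return await asyncio.gather(*(query_llm(p) for p in prompts))


if __name__ == "__main__":
    print(asyncio.run(evaluate(["prompt A", "prompt B", "prompt C"])))
```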


The MITRE benchmarks are run in three steps:

@@ -98,8 +106,6 @@ python3 -m CybersecurityBenchmarks.benchmark.run \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.

### For autocomplete benchmarks run:
12 changes: 6 additions & 6 deletions README.md
@@ -3,26 +3,26 @@
</p>

<p align="center">
🤗 <a href="https://huggingface.co/meta-Llama">Hugging Face</a>&nbsp&nbsp | <a href="">Blog</a>&nbsp&nbsp | <a href="https://ai.facebook.com/llama/purple-llama">Website</a>&nbsp&nbsp | <a href="https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/">CyberSec Eval Paper</a>&nbsp&nbsp | <a href="https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/">Llama Guard Paper</a>&nbsp&nbsp
<br>

--------------------------------------------------------------------------------
# Purple Llama
Purple Llama is an umbrella project that over time will bring together tools and evals to help the community build responsibly with open generative AI models. The initial release will include tools and evals for Cyber Security and Input/Output safeguards, but we plan to contribute more in the near future.

## Why purple?
Borrowing a [concept](https://www.youtube.com/watch?v=ab_Fdp6FVDI) from the cybersecurity world, we believe that to truly mitigate the challenges which generative AI presents, we need to take both attack (red team) and defensive (blue team) postures. Purple teaming, composed of both red and blue team responsibilities, is a collaborative approach to evaluating and mitigating potential risks. The same ethos applies to generative AI, and hence our investment in Purple Llama will be comprehensive.

## License
Components within the Purple Llama project will be licensed permissively, enabling both research and commercial usage. We believe this is a major step towards enabling community collaboration and standardizing the development and usage of trust and safety tools for generative AI development. More concretely, evals and benchmarks are licensed under the MIT license, while any models use the Llama 2 Community license. See the table below:

| **Component Type** | **Components** | **License** |
|:----------|:------------:|:----------:|
| Evals/Benchmarks | Cyber Security Eval (others to come) | MIT |
| Models | Llama Guard | [Llama 2 Community License](https://github.com/facebookresearch/PurpleLlama/blob/main/LICENSE) |

## Getting Started
To get started and learn how to use Purple Llama components with Llama models, see the getting started guide [here](https://ai.meta.com/llama/get-started/). The guide provides information and resources to help you set up Llama, including how to access the model, hosting, and how-to and integration guides. Additionally, you will find supplemental materials to further assist you while responsibly building with Llama. The guide will be updated as more Purple Llama components get released.

## FAQ
For a running list of frequently asked questions, covering not only Purple Llama components but also Llama models more generally, see the FAQ [here](https://ai.meta.com/llama/faq/).
