add more docs
Signed-off-by: oilbeater <[email protected]>
oilbeater committed Oct 7, 2024
1 parent f622544 commit 9739823
Showing 8 changed files with 57 additions and 6 deletions.
9 changes: 7 additions & 2 deletions README.md
@@ -8,7 +8,7 @@ It is fully **CloudFlare Native**: allowing for global scale deployment without

It is written in **TypeScript**: ensuring adaptability to the rapidly evolving AI ecosystem and catering to the diverse needs of application developers.

> Malacca is still an early-stage project, containing many experiments by the author. Currently, it only supports AzureOpenAI. While Malacca provides numerous features, its primary purpose is to offer a framework and examples to help users better implement their own custom functionalities. We encourage you to read the code and adapt it to your specific needs. We welcome contributions and ideas from the community to help expand and improve this project.
> Malacca is still an early-stage project, containing many experiments by the author. Currently, it only supports AzureOpenAI. While Malacca provides numerous features, its primary purpose is to offer a framework and examples to help users better implement their own custom functionalities. In fact, we encourage you to read the code and adapt it to your specific needs. We welcome contributions and ideas from the community to help expand and improve this project.
## Features

@@ -21,9 +21,11 @@ It is written in **TypeScript**: ensuring adaptability to the rapidly evolving A
- 🛠️ **Comprehensive Feature Set**
- 🔑 **Virtual Key**: Manage access permissions using virtual keys, providing more granular control over API access.
- **Caching**: Reduce latency and costs by caching repeat requests.
- 🛡️ **Guard**: Deny the request if the request or response has inappropriate content.
- 📊 **Analytics**: Track the status, error, latency and usage of tokens, allowing you to understand and manage API costs.
- 📋 **Logging**: Record requests and responses for later fine-tuning or reinforcement learning.
- 🚦 **Rate Limiting**: Protect upstream API resources by controlling request rates.
- 🔄 **Fallback**: Fall back to CF Workers AI if the upstream API fails.

## Quick Start

@@ -68,11 +70,14 @@ It is written in **TypeScript**: ensuring adaptability to the rapidly evolving A

### How to use?

- [Virtual Key](./docs/virtual-key.md)
- [Azure OpenAI](./docs/azure_openai.md)
- [Virtual Key](./docs/virtual-key.md)
- [Caching](./docs/caching.md)
- [Guard](./docs/guards.md)
- [Logging](./docs/logging.md)
- [Metrics](./docs/metrics.md)
- [Rate Limiting](./docs/rate-limiting.md)
- [Fallback](./docs/fallback.md)

## Customization and Extension

4 changes: 3 additions & 1 deletion docs/caching.md
@@ -15,4 +15,6 @@ For cached response there will be a header in the response:

```bash
malacca-cache-status: hit
```

You can customize the cache logic by modifying the cache middleware in `src/middlewares/cache.ts`.
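
As a rough illustration of what such cache logic might look like, here is a minimal sketch. It is not the code in `src/middlewares/cache.ts`: the names `SketchResponse`, `UpstreamCall`, and `withCache` are hypothetical, the cache lives in memory, and the key is assumed to be the request path plus body. Only the `malacca-cache-status: hit` header comes from the docs above.

```typescript
// Hypothetical sketch; none of these names come from src/middlewares/cache.ts.
interface SketchResponse {
  body: string;
  headers: Record<string, string>;
}

type UpstreamCall = (path: string, body: string) => SketchResponse;

// Wrap an upstream call so identical (path, body) pairs are answered from memory.
function withCache(
  upstream: UpstreamCall,
  entries: Map<string, string> = new Map<string, string>(),
): UpstreamCall {
  return (path, body) => {
    const key = `${path}\n${body}`; // identical requests share one key (assumed key scheme)
    const cachedBody = entries.get(key);
    if (cachedBody !== undefined) {
      // Flag the hit with the header named in the docs.
      return { body: cachedBody, headers: { "malacca-cache-status": "hit" } };
    }
    const fresh = upstream(path, body);
    entries.set(key, fresh.body);
    return fresh;
  };
}

// Usage: the second identical request is served without calling the upstream.
let upstreamCalls = 0;
const cachedCall = withCache((_path, body) => {
  upstreamCalls += 1;
  return { body: `echo:${body}`, headers: {} };
});
const firstReply = cachedCall("/chat/completions", '{"prompt":"hi"}');
const secondReply = cachedCall("/chat/completions", '{"prompt":"hi"}');
```

A customized version could, for example, change how the key is derived or add an expiry, depending on how strictly "repeat request" should be interpreted.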
6 changes: 6 additions & 0 deletions docs/fallback.md
@@ -0,0 +1,6 @@
# 🔄 Fallback

Malacca's fallback is designed to protect your API from upstream API failures. When the upstream API fails, Malacca automatically falls back to CF Workers AI. By default, it falls back to the `@cf/meta/llama-3.1-8b-instruct` model.

You can customize the fallback logic to fit your needs or change to another model by modifying the `src/middlewares/fallback.ts` file.
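
The general pattern can be sketched as follows. This is an illustration under stated assumptions, not the contents of `src/middlewares/fallback.ts`: `ModelReply`, `ModelCall`, and `withFallback` are hypothetical names, a failure is assumed to mean a thrown error or a 5xx status, and the calls are kept synchronous for brevity where a real Worker would await network requests.

```typescript
// Hypothetical sketch; names and the failure rule are assumptions,
// not taken from src/middlewares/fallback.ts.
interface ModelReply {
  status: number;
  body: string;
  servedBy: string;
}

type ModelCall = (prompt: string) => ModelReply;

const FALLBACK_MODEL = "@cf/meta/llama-3.1-8b-instruct"; // default named in the docs

// Try the primary upstream first; use the fallback on a thrown error or a 5xx status.
function withFallback(primary: ModelCall, fallback: ModelCall): ModelCall {
  return (prompt) => {
    try {
      const reply = primary(prompt);
      if (reply.status < 500) {
        return reply; // upstream answered, including 4xx client errors
      }
    } catch {
      // Treat a thrown error (for example a network failure) like an upstream failure.
    }
    return fallback(prompt);
  };
}

// Usage with stand-in functions instead of real network calls.
const failingUpstream: ModelCall = () => {
  throw new Error("upstream unavailable");
};
const workersAiStandIn: ModelCall = (prompt) => ({
  status: 200,
  body: `fallback answer to: ${prompt}`,
  servedBy: FALLBACK_MODEL,
});
const guardedCall = withFallback(failingUpstream, workersAiStandIn);
const fallbackReply = guardedCall("hello");
```

Whether 4xx responses should also trigger the fallback is a design choice; this sketch passes them through because they usually indicate a problem with the request rather than with the upstream.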

8 changes: 8 additions & 0 deletions docs/guards.md
@@ -0,0 +1,8 @@
# 🛡️ Guard

Malacca's guard is designed to protect your API from inappropriate content in requests and responses. Its uses include, but are not limited to:

- **Content Filtering**: The guard can be used to filter out content that is inappropriate or offensive.
- **Abuse Detection**: The guard can be used to detect and prevent abuse of your API by bad actors.

You can customize the guards logic by modifying the guards middleware in `src/middlewares/guards.ts`.
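
To make the idea concrete, here is a minimal sketch of a guard that checks both the request and the response. It is an assumption-laden illustration, not the code in `src/middlewares/guards.ts`: `findViolation`, `guardExchange`, the deny-list approach, and the 403 status are all hypothetical choices.

```typescript
// Hypothetical sketch; the deny list and names are illustrative only,
// not taken from src/middlewares/guards.ts.
interface GuardedResult {
  status: number;
  body: string;
}

// Return the first deny-listed term found in the text, or undefined if it is clean.
function findViolation(text: string, denyList: readonly string[]): string | undefined {
  const lowered = text.toLowerCase();
  return denyList.find((term) => lowered.includes(term.toLowerCase()));
}

// Check the request before calling the model, then check the response before returning it.
function guardExchange(
  requestText: string,
  callModel: (text: string) => string,
  denyList: readonly string[],
): GuardedResult {
  if (findViolation(requestText, denyList) !== undefined) {
    return { status: 403, body: "request denied by guard" }; // status code is an assumption
  }
  const responseText = callModel(requestText);
  if (findViolation(responseText, denyList) !== undefined) {
    return { status: 403, body: "response denied by guard" };
  }
  return { status: 200, body: responseText };
}

// Usage with a placeholder deny list and an echoing stand-in model.
const sampleDenyList = ["blockedterm"];
const echoModel = (text: string): string => `you said: ${text}`;
const allowedResult = guardExchange("hello", echoModel, sampleDenyList);
const deniedResult = guardExchange("please say BlockedTerm", echoModel, sampleDenyList);
```

A plain substring match like this is easy to evade; a customized guard might instead call a moderation model or classifier at the same two checkpoints.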
4 changes: 3 additions & 1 deletion docs/logging.md
@@ -125,4 +125,6 @@ The log entry looks like this:
},
"id": 0
}
```
```

You can customize the logging logic by modifying the logging middleware in `src/middlewares/logging.ts`.
4 changes: 3 additions & 1 deletion docs/metrics.md
@@ -59,4 +59,6 @@ Response:
"rows": 1,
"rows_before_limit_at_least": 1
}
```
```

You can customize the metrics logic by modifying the metrics middleware in `src/middlewares/metrics.ts`.
24 changes: 24 additions & 0 deletions docs/rate-limiting.md
@@ -0,0 +1,24 @@
# 🚦 Rate Limiting

Rate limiting is a crucial feature that helps protect your application from potential abuse or unintended overuse. By implementing rate limits, you can:

- 🛡️ Prevent excessive requests from a single user or IP address
- 💰 Control costs by limiting the number of API calls
- 🚀 Ensure fair usage and maintain service quality for all users
- 🔒 Mitigate potential Denial of Service (DoS) attacks

Malacca's rate limiting feature relies on the [Cloudflare Workers Rate Limiting](https://developers.cloudflare.com/workers/runtime-apis/bindings/rate-limit/) feature. It allows you to set limits on the number of requests a user can make within a specified time frame.

By default, it limits requests per virtual key and allows 100 requests per minute.

You can modify the request limit in the `wrangler.toml` file, for example:

```toml
[[unsafe.bindings]]
name = "MY_RATE_LIMITER"
type = "ratelimit"
namespace_id = "1001"
simple = { limit = 100, period = 60 }
```

You can also change the rate limiting logic by modifying the rate limiting middleware in `src/middlewares/rateLimiter.ts`.
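
To illustrate what `limit = 100, period = 60` means when requests are keyed by virtual key, here is a small local stand-in. It is a sketch only: `FixedWindowLimiter` is a hypothetical class using a simple fixed-window count, and Cloudflare's binding does its own counting, which may behave differently. In a Worker the binding is called asynchronously, along the lines of `await env.MY_RATE_LIMITER.limit({ key })`, and returns an object with a `success` flag.

```typescript
// Local stand-in for illustration only; Cloudflare's binding does its own counting.
// In a Worker the real call is asynchronous: `await env.MY_RATE_LIMITER.limit({ key })`.
class FixedWindowLimiter {
  private readonly maxRequests: number;
  private readonly periodSeconds: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(maxRequests: number, periodSeconds: number) {
    this.maxRequests = maxRequests;
    this.periodSeconds = periodSeconds;
  }

  // Count one request for `key` at time `nowSeconds` and report whether it is allowed.
  limit(key: string, nowSeconds: number): { success: boolean } {
    const current = this.windows.get(key);
    if (current === undefined || nowSeconds - current.start >= this.periodSeconds) {
      this.windows.set(key, { start: nowSeconds, count: 1 }); // start a new window
      return { success: true };
    }
    if (current.count >= this.maxRequests) {
      return { success: false }; // over the limit for this window
    }
    current.count += 1;
    return { success: true };
  }
}

// Usage: a limit of 2 per 60 seconds, keyed by a hypothetical virtual key.
const sketchLimiter = new FixedWindowLimiter(2, 60);
const outcomes = [0, 1, 2].map((t) => sketchLimiter.limit("virtual-key-a", t).success);
```

Because each key has its own counter, one virtual key exhausting its quota does not affect requests made with a different key.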
4 changes: 3 additions & 1 deletion docs/virtual-key.md
@@ -27,4 +27,6 @@ Virtual Key is implemented by Cloudflare Worker KV, you can easily add a delete
npx wrangler kv key delete ${VIRTUAL_KEY} --binding MALACCA_USER
```

You can also manage the KV pairs directly from Cloudflare Worker KV web console.

You can customize the virtual key logic by modifying the virtual key middleware in `src/middlewares/virtualKey.ts`.
