From 9739823892f9d339870c719cecc2442482d54ee2 Mon Sep 17 00:00:00 2001
From: oilbeater
Date: Mon, 7 Oct 2024 23:22:19 +0800
Subject: [PATCH] add more docs

Signed-off-by: oilbeater
---
 README.md             |  9 +++++++--
 docs/caching.md       |  4 +++-
 docs/fallback.md      |  6 ++++++
 docs/guards.md        |  8 ++++++++
 docs/logging.md       |  4 +++-
 docs/metrics.md       |  4 +++-
 docs/rate-limiting.md | 24 ++++++++++++++++++++++++
 docs/virtual-key.md   |  4 +++-
 8 files changed, 57 insertions(+), 6 deletions(-)
 create mode 100644 docs/fallback.md
 create mode 100644 docs/guards.md
 create mode 100644 docs/rate-limiting.md

diff --git a/README.md b/README.md
index b653dd8..591d903 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ It is fully **CloudFlare Native**: allowing for global scale deployment without
 
 It is written in **TypeScript**: ensuring adaptability to the rapidly evolving AI ecosystem and catering to the diverse needs of application developers.
 
-> Malacca is still an early-stage project, containing many experiments by the author. Currently, it only supports AzureOpenAI. While Malacca provides numerous features, its primary purpose is to offer a framework and examples to help users better implement their own custom functionalities. We encourage you to read the code and adapt it to your specific needs. We welcome contributions and ideas from the community to help expand and improve this project.
+> Malacca is still an early-stage project, containing many experiments by the author. Currently, it only supports AzureOpenAI. While Malacca provides numerous features, its primary purpose is to offer a framework and examples to help users better implement their own custom functionalities. In fact, we encourage you to read the code and adapt it to your specific needs. We welcome contributions and ideas from the community to help expand and improve this project.
 
 ## Features
 
@@ -21,9 +21,11 @@ It is written in **TypeScript**: ensuring adaptability to the rapidly evolving A
 - 🛠️ **Comprehensive Feature Set**
   - 🔑 **Virtual Key**: Manage access permissions using virtual keys, providing more granular control over API access.
   - ⚡ **Caching**: Reduce latency and costs by caching repeat requests.
+  - 🛡️ **Guard**: Deny the request if the request or response has inappropriate content.
   - 📊 **Analytics**: Track the status, error, latency and usage of tokens, allowing you to understand and manage API costs.
   - 📋 **Logging**: Record requests and responses to further fine-tune or reinforcement learning.
   - 🚦 **Rate Limiting**: Protect upstream API resources by controlling request rates.
+  - 🔄 **Fallback**: Fall back to CF Workers AI if the upstream API fails.
 
 ## Quick Start
 
@@ -68,11 +70,14 @@ It is written in **TypeScript**: ensuring adaptability to the rapidly evolving A
 
 ### How to use?
 
-- [Virtual Key](./docs/virtual-key.md)
 - [Azure OpenAI](./docs/azure_openai.md)
+- [Virtual Key](./docs/virtual-key.md)
 - [Caching](./docs/caching.md)
+- [Guard](./docs/guards.md)
 - [Logging](./docs/logging.md)
 - [Metrics](./docs/metrics.md)
+- [Rate Limiting](./docs/rate-limiting.md)
+- [Fallback](./docs/fallback.md)
 
 ## Customization and Extension
 
diff --git a/docs/caching.md b/docs/caching.md
index e72f032..39a0f58 100644
--- a/docs/caching.md
+++ b/docs/caching.md
@@ -15,4 +15,6 @@ For cached response there will be a header in the response:
 
 ```bash
 malacca-cache-status: hit
-```
\ No newline at end of file
+```
+
+You can customize the cache logic by modifying the cache middleware in `src/middlewares/cache.ts`.
diff --git a/docs/fallback.md b/docs/fallback.md
new file mode 100644
index 0000000..d818289
--- /dev/null
+++ b/docs/fallback.md
@@ -0,0 +1,6 @@
+# 🔄 Fallback
+
+Malacca's fallback is designed to protect your API from upstream API failures. When the upstream API fails, Malacca will automatically fall back to CF Workers AI. By default, it falls back to the `@cf/meta/llama-3.1-8b-instruct` model.
+
+You can customize the fallback logic to fit your needs or change to another model by modifying the `src/middlewares/fallback.ts` file.
+
diff --git a/docs/guards.md b/docs/guards.md
new file mode 100644
index 0000000..2fb3851
--- /dev/null
+++ b/docs/guards.md
@@ -0,0 +1,8 @@
+# 🛡️ Guard
+
+Malacca's guard is designed to protect your API from inappropriate content. It is a powerful tool that can be used to protect your API from a variety of attacks, including but not limited to:
+
+- **Content Filtering**: The guard can be used to filter out content that is inappropriate or offensive.
+- **Abuse Detection**: The guard can be used to detect and prevent abuse of your API by bad actors.
+
+You can customize the guards logic by modifying the guards middleware in `src/middlewares/guards.ts`.
\ No newline at end of file
diff --git a/docs/logging.md b/docs/logging.md
index 0efda30..0d4c888 100644
--- a/docs/logging.md
+++ b/docs/logging.md
@@ -125,4 +125,6 @@ The log entry looks like this:
   },
   "id": 0
 }
-```
\ No newline at end of file
+```
+
+You can customize the logging logic by modifying the logging middleware in `src/middlewares/logging.ts`.
diff --git a/docs/metrics.md b/docs/metrics.md
index 6182fb3..d94184e 100644
--- a/docs/metrics.md
+++ b/docs/metrics.md
@@ -59,4 +59,6 @@ Response:
   "rows": 1,
   "rows_before_limit_at_least": 1
 }
-```
\ No newline at end of file
+```
+
+You can customize the metrics logic by modifying the metrics middleware in `src/middlewares/metrics.ts`.
\ No newline at end of file
diff --git a/docs/rate-limiting.md b/docs/rate-limiting.md
new file mode 100644
index 0000000..69d58ad
--- /dev/null
+++ b/docs/rate-limiting.md
@@ -0,0 +1,24 @@
+# 🚦 Rate Limiting
+
+Rate limiting is a crucial feature that helps protect your application from potential abuse or unintended overuse. By implementing rate limits, you can:
+
+- 🛡️ Prevent excessive requests from a single user or IP address
+- 💰 Control costs by limiting the number of API calls
+- 🚀 Ensure fair usage and maintain service quality for all users
+- 🔒 Mitigate potential Denial of Service (DoS) attacks
+
+Malacca's rate limiting feature relies on the [Cloudflare Workers Rate Limiting](https://developers.cloudflare.com/workers/runtime-apis/bindings/rate-limit/) feature. It allows you to set limits on the number of requests a user can make within a specified time frame.
+
+By default, it limits requests per virtual key and allows 100 requests per minute.
+
+You can change the request limit in the `wrangler.toml` file, for example:
+
+```toml
+[[unsafe.bindings]]
+name = "MY_RATE_LIMITER"
+type = "ratelimit"
+namespace_id = "1001"
+simple = { limit = 100, period = 60 }
+```
+
+You can also customize the rate limiting logic by modifying the rate limiting middleware in `src/middlewares/rateLimiter.ts`.
\ No newline at end of file
diff --git a/docs/virtual-key.md b/docs/virtual-key.md
index 1b1af97..a59504b 100644
--- a/docs/virtual-key.md
+++ b/docs/virtual-key.md
@@ -27,4 +27,6 @@ Virtual Key is implemented by Cloudflare Worker KV, you can easily add a delete
 npx wrangler kv key delete ${VIRTUAL_KEY} --binding MALACCA_USER
 ```
 
-You can also manage the KV pairs directly from Cloudflare Worker KV web console.
\ No newline at end of file
+You can also manage the KV pairs directly from Cloudflare Worker KV web console.
+
+You can customize the virtual key logic by modifying the virtual key middleware in `src/middlewares/virtualKey.ts`.
\ No newline at end of file
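
Reviewer note on `docs/rate-limiting.md`: the doc shows the `MY_RATE_LIMITER` binding config but not how a Worker consumes it. Below is a minimal sketch of that flow, not the code in `src/middlewares/rateLimiter.ts`. It assumes only the documented shape of the Cloudflare rate limiting binding (`limit({ key })` resolving to `{ success }`); the `RateLimit` interface name and the `checkRateLimit` helper are hypothetical and introduced here for illustration.

```typescript
// Assumed shape of the Cloudflare Workers rate limiting binding:
// limit() resolves to { success: false } once the key is over its quota.
interface RateLimit {
  limit(options: { key: string }): Promise<{ success: boolean }>;
}

// Hypothetical helper (not from the Malacca source): counts one request
// against the caller's virtual key. Returns the HTTP status to reject
// with (429) when limited, or null when the request may proceed.
async function checkRateLimit(
  limiter: RateLimit,
  virtualKey: string
): Promise<number | null> {
  const { success } = await limiter.limit({ key: virtualKey });
  return success ? null : 429;
}
```

In a Worker, `limiter` would be `env.MY_RATE_LIMITER`, and a non-null result would typically be turned into a response with that status before the upstream API is called.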