feat(vllm): Allow to set quantization (#1094)
This is particularly useful for setting AWQ.

**Description**

Follow-up to #1015.

**Notes for Reviewers**


**[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: Ettore Di Giacinto <[email protected]>
mudler authored Sep 22, 2023
1 parent 048b813 commit a28ab18
Showing 13 changed files with 357 additions and 332 deletions.
16 changes: 12 additions & 4 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -8,16 +8,24 @@ This PR fixes #
 **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
 - [ ] Yes, I signed my commits.


 <!--
 Thank you for contributing to LocalAI!
-Contributing Conventions:
+Contributing Conventions
+-------------------------
+The draft above helps to give a quick overview of your PR.
-1. Include descriptive PR titles with [<component-name>] prepended.
-2. Build and test your changes before submitting a PR.
+Remember to remove this comment and to at least:
+1. Include descriptive PR titles with [<component-name>] prepended. We use [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/).
+2. Build and test your changes before submitting a PR (`make build`).
 3. Sign your commits
+4. **Tag maintainer:** for a quicker response, tag the relevant maintainer (see below).
+5. **X/Twitter handle:** we announce bigger features on X/Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out!
 By following the community's contribution conventions upfront, the review process will
 be accelerated and your PR merged more quickly.
+If no one reviews your PR within a few days, please @-mention @mudler.
 -->
1 change: 1 addition & 0 deletions api/backend/options.go
@@ -44,6 +44,7 @@ func gRPCModelOpts(c config.Config) *pb.ModelOptions {
 		NoMulMatQ: c.NoMulMatQ,
 		DraftModel: c.DraftModel,
 		AudioPath: c.VallE.AudioPath,
+		Quantization: c.Quantization,
 		LoraAdapter: c.LoraAdapter,
 		LoraBase: c.LoraBase,
 		NGQA: c.NGQA,
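
Reviewer note: the hunk above only forwards the new field from the model config into the gRPC options. A minimal sketch of that pass-through follows. `Config` and `ModelOptions` are hypothetical, trimmed-down stand-ins for `config.Config` and the generated `pb.ModelOptions`, and `modelOpts` is an illustrative name rather than the function in the repository. The value `"awq"` is assumed from the commit description.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for LocalAI's config.Config,
// reduced to two of the fields visible in the hunk.
type Config struct {
	DraftModel   string
	Quantization string // assumed example value: "awq"; empty leaves the backend default
}

// ModelOptions is a hypothetical stand-in for the generated pb.ModelOptions.
type ModelOptions struct {
	DraftModel   string
	Quantization string
}

// modelOpts mirrors the shape of gRPCModelOpts: each config field is
// copied verbatim into the options sent to the backend.
func modelOpts(c Config) *ModelOptions {
	return &ModelOptions{
		DraftModel:   c.DraftModel,
		Quantization: c.Quantization,
	}
}

func main() {
	opts := modelOpts(Config{Quantization: "awq"})
	fmt.Println(opts.Quantization)
}
```

Because the string is copied unchanged, an unset `quantization` reaches the backend as an empty string, so the backend has to treat empty as "not configured".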
1 change: 1 addition & 0 deletions api/config/config.go
@@ -103,6 +103,7 @@ type LLMConfig struct {
 	NoMulMatQ bool `yaml:"no_mulmatq"`
 	DraftModel string `yaml:"draft_model"`
 	NDraft int32 `yaml:"n_draft"`
+	Quantization string `yaml:"quantization"`
 }

 type AutoGPTQ struct {
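
Reviewer note: the struct tag on the new field determines the key users write in a model YAML file. The sketch below reads that tag back with the standard `reflect` package. The `LLMConfig` here is a hypothetical excerpt containing only the fields visible in the hunk, and `yamlKey` is an illustrative helper, not part of LocalAI.

```go
package main

import (
	"fmt"
	"reflect"
)

// LLMConfig is a hypothetical excerpt holding only the fields shown in the hunk.
type LLMConfig struct {
	NoMulMatQ    bool   `yaml:"no_mulmatq"`
	DraftModel   string `yaml:"draft_model"`
	NDraft       int32  `yaml:"n_draft"`
	Quantization string `yaml:"quantization"`
}

// yamlKey returns the yaml struct tag of the named field,
// or an empty string if the field does not exist.
func yamlKey(v interface{}, field string) string {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("yaml")
}

func main() {
	fmt.Println(yamlKey(LLMConfig{}, "Quantization"))
}
```

Under these assumptions, a model definition would set the option with a top-level `quantization:` key alongside `draft_model:` and `n_draft:`.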
56 changes: 28 additions & 28 deletions extra/grpc/autogptq/backend_pb2.py

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
