bug: some of the engine parameters in the model load request are ignored #1824

Open
7 tasks
louis-jan opened this issue Dec 24, 2024 · 0 comments
Assignees
Labels
P1: important Important feature / fix type: bug Something isn't working
Milestone

Comments

@louis-jan
Contributor

louis-jan commented Dec 24, 2024

Cortex version

1.0.6

Describe the issue and expected behaviour

When starting a model, the engine parameters described at https://github.com/janhq/cortex.llamacpp can be configured. However, when these parameters are sent through the cortex.cpp server, most of them are filtered out because the new model.yaml configuration hardcodes the set of accepted parameters.

After reviewing the model.yaml implementation, I noticed that the following settings are not applied because their declarations are missing, so they all fall back to default settings:

  • cpu_threads
  • n_batch
  • caching_enabled
  • grp_attn_n
  • grp_attn_w
  • mlock
  • grammar_file
  • model_type
  • model_alias
  • flash_attn
  • cache_type
  • use_mmap
  • llama_model_path
  • embedding
  • cont_batching
  • user_prompt
  • ai_prompt
  • system_prompt
  • pre_prompt
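
To illustrate the failure mode described above, here is a minimal sketch of how an allow-list of declared parameters silently drops everything else and leaves the engine on its defaults. This is not the cortex.cpp implementation: `DECLARED_PARAMS`, `ENGINE_DEFAULTS`, the default values, and both helper functions are hypothetical names chosen for the example.

```python
# Hypothetical illustration only; not taken from the cortex.cpp source.
# DECLARED_PARAMS stands in for the parameters a model.yaml-style schema declares.
DECLARED_PARAMS = {"model", "ngl", "ctx_len"}

# Hypothetical engine defaults (values are made up for the example).
ENGINE_DEFAULTS = {"cpu_threads": -1, "n_batch": 2048}


def filter_load_request(request: dict, declared: set) -> tuple:
    """Split a load request into forwarded params and the names that get dropped."""
    forwarded = {k: v for k, v in request.items() if k in declared}
    dropped = sorted(k for k in request if k not in declared)
    return forwarded, dropped


def effective_settings(forwarded: dict, defaults: dict) -> dict:
    """Settings the engine ends up with: the forwarded value if present, else the default."""
    return {name: forwarded.get(name, default) for name, default in defaults.items()}


request = {"model": "example-model", "ctx_len": 4096, "cpu_threads": 8, "n_batch": 512}
forwarded, dropped = filter_load_request(request, DECLARED_PARAMS)
effective = effective_settings(forwarded, ENGINE_DEFAULTS)

print("dropped:", dropped)      # names the caller sent that never reach the engine
print("effective:", effective)  # the engine sees defaults, not the caller's values
```

Under these assumptions, `cpu_threads` and `n_batch` are dropped and the effective settings equal the defaults, which matches the behaviour reported here. Declaring the missing names in the schema would let them pass through.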

Steps to Reproduce

  1. Start cortex server
  2. Start a model by sending a request with cpu_threads or n_batch settings
  3. Observe cortex.log
  4. See the error
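
For step 2, a request along the following lines could be used. This is a sketch under assumptions: the server address, the `/v1/models/start` path, and the model id `example-model` are placeholders and should be replaced with the values of your own cortex server; only `cpu_threads` and `n_batch` come from this report.

```python
import json
import urllib.request

# Assumptions: adjust the address, path, and model id to match your setup.
BASE_URL = "http://127.0.0.1:39281"
START_PATH = "/v1/models/start"


def build_start_request(model: str, **engine_params) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a model with engine params."""
    payload = {"model": model, **engine_params}
    return urllib.request.Request(
        BASE_URL + START_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_start_request("example-model", cpu_threads=4, n_batch=256)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it, a cortex server must be running at BASE_URL:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

After sending the request, check cortex.log to see whether the values for `cpu_threads` and `n_batch` were applied or replaced by defaults.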

Screenshots / Logs

No response

What is your OS?

  • Windows
  • Mac Silicon
  • Mac Intel
  • Linux / Ubuntu

What engine are you running?

  • cortex.llamacpp (default)
  • cortex.tensorrt-llm (Nvidia GPUs)
  • cortex.onnx (NPUs, DirectML)

Hardware Specs eg OS version, GPU

No response

@louis-jan louis-jan added the type: bug Something isn't working label Dec 24, 2024
@github-project-automation github-project-automation bot moved this to Investigating in Menlo Dec 24, 2024
@louis-jan louis-jan added the P1: important Important feature / fix label Dec 24, 2024
@louis-jan louis-jan added this to the v1.0.7 milestone Dec 24, 2024
@louis-jan louis-jan moved this from Investigating to In Progress in Menlo Dec 24, 2024
@vansangpfiev vansangpfiev moved this from In Progress to QA in Menlo Dec 26, 2024
@TC117 TC117 moved this from QA to Completed in Menlo Jan 16, 2025