feat(api, backend): load and expose backend model info at runtime #890
base: main
Conversation
✅ Deploy Preview for leapfrogai-docs ready!
Force-pushed from 42717b4 to e3f755a
Looks great so far! I'll do a deeper dive once it's fully ready. Just some comments to propagate downwards and discuss. Thanks!
src/leapfrogai_api/backend/types.py
Outdated
```
    default=None,
    description="Embedding dimensions (for embeddings models)",
)
precision: str | None = Field(
```
We might want to constrain this further to only accept: float16, float32, bfloat16, int8, int4.
👍 Do you have a preference on using enums or anything? I am totally happy to make a new model referencing that (and refactor accordingly).
Also brings up a missing field: `format`. This includes None, GPTQ, GGUF, SqueezeLLM, and AWQ, depending on whether the model was quantized and whether it is compatible with vLLM and/or llama.cpp.
Enums work for me! I didn't want to specify the implementation, just the content.
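For concreteness, a minimal sketch of what the constrained enums discussed here might look like, matching the `Precision` and `Format` type names that appear in the implementation sketch later in the thread; the exact members are inferred from this discussion, not lifted from the diff:

```python
from enum import Enum


class Precision(str, Enum):
    """Assumed enum constraining precision to the values proposed above."""

    FLOAT16 = "float16"
    FLOAT32 = "float32"
    BFLOAT16 = "bfloat16"
    INT8 = "int8"
    INT4 = "int4"


class Format(str, Enum):
    """Assumed enum for quantization format; NONE covers unquantized models."""

    NONE = "None"
    GPTQ = "GPTQ"
    GGUF = "GGUF"
    SQUEEZELLM = "SqueezeLLM"
    AWQ = "AWQ"
```

Subclassing `str` keeps the values JSON-serializable and plays nicely with Pydantic field validation.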
src/leapfrogai_api/backend/types.py
Outdated
```
@@ -54,6 +53,25 @@ class Usage(BaseModel):
##########


class ModelMetadataResponse(BaseModel):
    type: Literal["embeddings", "llm"] | None = Field(
```
Type and capabilities have crossover based on the descriptions/examples here. We should instead be identifying modalities (image, text, speech) and capabilities (embeddings, chat, TTS, STT).
Hmm, should we maybe differentiate on this too? Would it make sense to alias `type` to `modalities` (to encapsulate what you said) and then add an enum for `capabilities`? I don't know how `type` is used outside of this context and want to make sure it conforms to the rest of the repo.
I guess I was thinking there are vision embeddings-only models versus vision comprehension-only (chat) models, for example. So the combination of modality and capability works; however, if you're saying `type` is basically modality, then I am okay with that. Modality is more of an industry-standard term (e.g., how HF organizes models using semantic search: https://huggingface.co/models).
@justinthelaw I think I am a bit unsure, to be honest! Referencing this config map, I "lifted" `type` out of what was already defined in the config, based off of this convo. Tentatively looking at an implementation like this, though (and will proceed with this for now unless you have any opinions otherwise). I am keeping defaults to None to ensure backward compatibility with existing implementations.
```python
class ModelMetadataResponse(BaseModel):
    """Metadata for the model, including type, dimensions (for embeddings), and precision."""

    model_config = ConfigDict(use_enum_values=True)

    capabilities: list[Capability] | None = Field(
        default=None,
        description="Model capabilities (e.g., 'embeddings', 'chat', 'tts', 'stt')",
    )
    dimensions: int | None = Field(
        default=None,
        description="Embedding dimensions (for embeddings models)",
    )
    format: Format | None = Field(
        default=None,
        description="Model format (e.g., None, 'GPTQ', 'GGUF', 'SqueezeLLM', 'AWQ')",
    )
    modalities: list[Modality] | None = Field(
        default=None,
        description="The modalities of the model (e.g., 'image', 'text', 'speech')",
    )
    precision: Precision | None = Field(
        default=None,
        description="Model precision (e.g., 'float16', 'float32', 'bfloat16', 'int8', 'int4')",
    )
    type: Literal["embeddings", "llm"] | None = Field(
        default=None,
        description="The type of the model, e.g. 'embeddings' or 'llm'",
    )
```
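The snippet above references `Capability` and `Modality` enums that aren't shown; a minimal sketch of what they might contain, with members inferred from the field descriptions (an assumption, not code from the PR):

```python
from enum import Enum


class Capability(str, Enum):
    """Assumed members, mirroring the capabilities named in the field description."""

    EMBEDDINGS = "embeddings"
    CHAT = "chat"
    TTS = "tts"
    STT = "stt"


class Modality(str, Enum):
    """Assumed members, mirroring the modalities named in the field description."""

    IMAGE = "image"
    TEXT = "text"
    SPEECH = "speech"
```

With `use_enum_values=True` in the `model_config`, Pydantic stores and serializes these as their plain string values in API responses.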
I do think it might be a good idea to update the existing configmaps in the repo to include these fields. I'm happy to make another issue for that if it's useful; just let me know.
Outside of the PR feedback, is there anything else missing? This worked when interacting with the API, but I wasn't sure if I needed to update anything else to make it "complete". Let me know!
This PR is perfectly constrained to the issue's main concerns, as long as the unit tests are there/updated properly for the new config and fields. Areas like the UI consuming config and backends producing config are meant for future tickets.
Force-pushed from a7c48c3 to af85681
… singleton, stop event stuff
Force-pushed from cce96e0 to f276aba
```python
# shared config object from the app
model_config: "Config" = session.app.state.config
for model in model_config.models:
    m = ModelResponseModel(id=model)
```
Seems like there might be an issue here: the `supabase_session` object doesn't have a property named `app`, so it's failing for me when I hit this endpoint: https://leapfrogai-api.uds.dev/openai/v1/models
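Not from the PR itself, but as a sketch of one possible fix: FastAPI route handlers can take the standard `Request` object, which always exposes `.app`, so the shared config can be read off the application state instead of off the Supabase session (the handler shape below is a hypothetical rework):

```python
from fastapi import Request


async def list_models(request: Request):
    # The Request object carries the FastAPI app, so the shared Config set
    # during startup is reachable here without going through the session.
    model_config: "Config" = request.app.state.config
    return [ModelResponseModel(id=model) for model in model_config.models]
```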
As we icebox the repo, we are leaving this draft PR as a good place to pick up if we resume work. The goal behind the work was to expose model information such as type (embeddings, chat, etc.) so that the API can infer the model backend without requiring additional state.
Summary
Fixes: #590
Fixes: #952
This PR introduces the ability to load and expose model metadata from configuration files at runtime within the API. This required somewhat significant refactoring of the `Config` class to allow easier testing. I took care to retain function signatures with defaults where needed, and do not believe this contains any breaking changes.

- Added a `ModelMetadataResponse` class in `types.py` to encapsulate model metadata such as type, dimensions, precision, and capabilities.
- Added a `ModelMetadata` dataclass to represent metadata.
- Updated the `Config` class to include methods for loading and processing configuration files, including validation of metadata fields.
- Updated the `models` endpoint in `models.py` to include metadata in the response.
- Tested `Config` functionalities and validated correct handling of configuration files.
- Refactored the `Config` class to improve testability while retaining the original behavior.
- The `lifespan` method in `main.py` was causing a race condition in the API tests; I altered this slightly to explicitly clean up on shutdown (a sketch of that shape follows below).
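A minimal sketch of the shape that explicit-cleanup `lifespan` fix might take; the `Config.create()` and `Config.cleanup()` helpers here are assumptions for illustration, not the PR's actual API:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the shared config once at startup and stash it on app state.
    config = await Config.create()  # hypothetical constructor
    app.state.config = config
    yield
    # Tear down explicitly on shutdown (e.g., stop the config watcher task)
    # rather than relying on garbage collection, which raced in the tests.
    await config.cleanup()  # hypothetical cleanup hook


app = FastAPI(lifespan=lifespan)
```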
Additional context from internal Slack conversation: covered `capabilities` within metadata and testing the `Config` class, leveraging `pytest` for unit testing.

Testing coverage from this PR:
TODO: the new tests rely on `pytest`/`pytest-asyncio`. Is it appropriate to add dev dependencies into `/src/leapfrogai_api/pyproject.toml`?
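If dev dependencies do belong there, a minimal sketch using standard PEP 621 optional dependency groups (the group name `dev` is an assumption):

```toml
# Hypothetical addition to /src/leapfrogai_api/pyproject.toml;
# installable with `pip install -e ".[dev]"`.
[project.optional-dependencies]
dev = [
    "pytest",
    "pytest-asyncio",
]
```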