[lora] Add load option to LoRA adapter API #2536
Conversation
Force-pushed 52d9886 to 7a3e234
if adapter_load:
    _service.add_lora(adapter_name, adapter_alias, adapter_path)
else:
    _service.remove_lora(adapter_name, adapter_alias)
if adapter_load is false, why are we removing the adapter? Should this just be a noop?
This is to have an unload option.
I see, so we support unloading only for unpinned adapters
Yes
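The "unload only unpinned adapters" rule agreed on above can be sketched as a small guard in the unload path. This is an illustrative sketch only; the class, method names, and pinned-set bookkeeping are assumptions, not this PR's actual code.

```python
class AdapterRegistry:
    """Toy registry illustrating the unpinned-only unload rule (hypothetical)."""

    def __init__(self):
        self.loaded = set()
        self.pinned = set()

    def remove_lora(self, name):
        # Pinned adapters are never evicted; unload is refused for them.
        if name in self.pinned:
            return False
        self.loaded.discard(name)
        return True
```

Under this rule, an unload request for a pinned adapter simply reports failure instead of silently removing weights that inference may still depend on.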
return self.engine.add_lora(lora_request) and self.engine.pin_lora(
    lora_request.lora_int_id)
do we need to check the result of add_lora before pinning?
Also, I would prefer if we kept these calls separate. It's more readable.
Same questions for vlm_rolling_batch.py
- This is to make sure it's successfully loaded before pinning.
- Made these calls separate.
Description
Add an additional `load` option to the register adapter API and the update adapter API. The reason for this change is to stay consistent with other model servers such as vLLM and LoRAX: vLLM's load_lora_adapter API does not load adapter weights; the weights are loaded only when inference runs with that particular adapter.
Discussed with the Hosting team; the default is `true`.
Example: