This folder contains the following examples for Llama 2 models:
File | Description | Model Used | GPU Minimum Requirement |
---|---|---|---|
01_load_inference | Environment setup and suggested configurations for running inference with Llama 2 models on Databricks. | Llama-2-13b-chat-hf | 2xA10-24GB |
02_mlflow_logging_inference | Save, register, and load Llama 2 models with MLflow, and create a Databricks model serving endpoint. | Llama-2-13b-chat-hf | 2xA10-24GB |
03_serve_driver_proxy | Serve Llama 2 models on the cluster driver node using Flask. | Llama-2-13b-chat-hf | 2xA10-24GB |
04_langchain | Integrate a serving endpoint or cluster driver proxy app with LangChain and query it. | N/A | N/A |
05_fine_tune_deepspeed | Fine-tune Llama 2 base models with DeepSpeed. | Llama-2-13b-hf | 8xA10 or 2xA100-80GB |
06_fine_tune_qlora | Fine-tune Llama 2 base models with QLoRA. | Llama-2-13b-hf | 1xA10 |
07_ai_gateway | Manage an MLflow AI Gateway Route that accesses a Databricks model serving endpoint. | N/A | N/A |
08_load_from_marketplace | Load Llama 2 models from Databricks Marketplace. | Llama-2-13b-chat-hf | 2xA10-24GB |
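The driver proxy pattern in `03_serve_driver_proxy` can be sketched as a minimal Flask app. This is a simplified illustration, not the notebook's actual code: the `generate` function below is a hypothetical placeholder where the real notebook would invoke a loaded Llama 2 text-generation pipeline, and the route path and port are assumptions.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)


def generate(prompt: str) -> str:
    # Hypothetical placeholder: the real notebook would call a
    # transformers pipeline loaded with Llama-2-13b-chat-hf here.
    return f"echo: {prompt}"


@app.route("/", methods=["POST"])
def serve():
    # Accept a JSON payload like {"prompt": "..."} and return generated text.
    prompt = request.json.get("prompt", "")
    return jsonify({"text": generate(prompt)})


if __name__ == "__main__":
    # On Databricks, the driver proxy would expose this port to notebooks
    # and jobs on the same cluster (port number is an assumption).
    app.run(host="0.0.0.0", port=7777)
```

A client on the cluster could then POST prompts to the driver's proxied URL, which is what the `04_langchain` example wraps behind a LangChain LLM interface.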