After creating a virtual environment and installing LangChain:
- Run this command to install the GPU (CUDA) build of llama-cpp-python (requires CMake 3.29.6; refer to this link); a quick check that the CUDA build took effect is sketched after this list:
`CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir`
- Download the Llama Guard 2 GGUF model file via this link. The model has been quantized to Q4_K_M for ease of use. Alternatively, search Hugging Face for GGUF model files, copy the file's download link, and run `wget "the copied link"` (or fetch it from Python, as sketched below).
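To confirm that the CUDA build from the first step is actually offloading work to the GPU, you can load the model with all layers offloaded and watch the startup logs. This is a minimal sketch, assuming LangChain's community package is installed; the model path is a hypothetical placeholder:

```python
# Minimal sketch: check llama-cpp-python's CUDA build via LangChain's LlamaCpp.
# The model path below is a placeholder; point it at your downloaded GGUF file.
from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="./models/llama-guard-2-8b.Q4_K_M.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers to the GPU; set to 0 to force CPU
    n_ctx=4096,       # context window size
    verbose=True,     # startup logs should mention CUDA if the GPU build is active
)
```

If the startup logs show all layers assigned to the CPU, the pip install likely fell back to a prebuilt CPU wheel; rerun it with the CMAKE_ARGS shown above.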
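As an alternative to copying a link into wget, you can pull the file from the Hub programmatically with the huggingface_hub package (`pip install huggingface_hub`). The repository and filename below are hypothetical placeholders; substitute the actual GGUF repository you found on Hugging Face:

```python
# Sketch: download a GGUF file from the Hugging Face Hub instead of using wget.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="someuser/Llama-Guard-2-8B-GGUF",  # hypothetical repository
    filename="llama-guard-2-8b.Q4_K_M.gguf",   # hypothetical quantized file
    local_dir="./models",
)
print(model_path)  # local path to the downloaded model file
```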
Now you are ready to run test.py for a demonstration!
Note that the LlamaCpp implementation for the Llama Guard 2 model file spans lines 20 to 73 of test.py!
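For reference, the core of such a setup looks roughly like the sketch below. This is not the exact code from test.py: the model path is a placeholder, the unsafe-content category list is abbreviated, and the full official prompt template should be taken from the Llama Guard 2 model card.

```python
# Rough sketch: using a Llama Guard 2 GGUF file via LangChain's LlamaCpp
# as a content moderation filter. Placeholders throughout; not test.py verbatim.
from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="./models/llama-guard-2-8b.Q4_K_M.gguf",  # hypothetical path
    n_gpu_layers=-1,
    n_ctx=4096,
    temperature=0.0,  # moderation verdicts should be deterministic
)

# Abbreviated template; the model card lists the full set of S1-S11 categories.
GUARD_TEMPLATE = """[INST] Task: Check if there is unsafe content in 'User' messages in conversations according to our safety policy with the below categories.

<BEGIN UNSAFE CONTENT CATEGORIES>
S1: Violent Crimes.
S2: Non-Violent Crimes.
<END UNSAFE CONTENT CATEGORIES>

<BEGIN CONVERSATION>

User: {user_message}

<END CONVERSATION>

Provide your safety assessment for ONLY THE LAST User message in the above conversation:
- First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories. [/INST]"""

def moderate(user_message: str) -> str:
    """Return the model's verdict: 'safe', or 'unsafe' plus category codes."""
    return llm.invoke(GUARD_TEMPLATE.format(user_message=user_message)).strip()

print(moderate("How do I make a cup of tea?"))
```

Llama Guard 2 replies with 'safe' or 'unsafe' on the first line of its output, which makes the verdict easy to parse downstream.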