# llama2.zig

Llama 2 inference in Zig

## How to run

1. Start and get inside the Docker container:

   ```shell
   cd infra-dev/
   docker-compose up -d
   docker-compose exec -it llama2 bash
   ```

2. Download the model:

   ```shell
   curl -L https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin -o ../models/TinyLlama-15M.bin
   ```

   See model.py for more details about how the .bin file was exported.

3. Run inference:

   ```shell
   zig build run -- ../models/TinyLlama-15M.bin ../llama2.c/tokenizer.bin
   ```

   Or build and run the binary directly:

   ```shell
   zig build-exe ./src/main.zig -O ReleaseFast -lc
   ./main ../models/TinyLlama-15M.bin ../llama2.c/tokenizer.bin
   ```

   Output:

   ```text
   Hello darkness, my old friend, the sun. He is very hot and he needs to cool down. He looks around and sees a big tree. He thinks it looks like a good place to rest.
   He climbs up the tree and looks around. He sees a big, green tree with lots of leaves. He thinks it looks like a good place to rest. He climbs up the tree and sits on a branch. He feels the cool breeze on his face.
   He looks around and sees a little girl. She is playing with her doll. She has long hair and a pink dress. She looks at him and smiles. She says, "Hello, mister. Do you like my tree?"
   The old man nods and says, "Yes, I do. It is very nice. Do you want to play with me?"
   The little girl nods and says, "Yes, I do. I like your tree. It is very big and green. Can I sit with you?"
   The old man says, "Sure, you can sit with me. But be careful, don't touch my tree. It is very old and fragile. It can break easily."
   The little girl says,
   ```
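The checkpoint downloaded in step 2 follows the llama2.c export layout, which starts with a 28-byte header of seven little-endian `int32` fields before the weight tensors. The field names and order below follow the llama2.c convention and are an assumption about this particular file; check model.py for the authoritative export details. A minimal Python sketch to inspect a checkpoint's configuration:

```python
import struct

def read_config(path):
    """Parse the 7-field int32 header of a llama2.c-style .bin checkpoint.

    Field order (dim, hidden_dim, n_layers, n_heads, n_kv_heads,
    vocab_size, seq_len) is assumed from the llama2.c export convention.
    """
    with open(path, "rb") as f:
        fields = struct.unpack("<7i", f.read(28))  # 7 little-endian int32s
    keys = ("dim", "hidden_dim", "n_layers", "n_heads",
            "n_kv_heads", "vocab_size", "seq_len")
    return dict(zip(keys, fields))

if __name__ == "__main__":
    print(read_config("../models/TinyLlama-15M.bin"))
```

This is only a convenience for sanity-checking a downloaded file; the Zig inference code reads the same header itself.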
    

## References