A collection of simple LLMs that I build progressively as I get better at it.

ZainKhalidOfficial/ZainLLM

This project is inspired by Andrej Karpathy's reproduction of the OpenAI GPT-2 architecture (https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf).

Aim:

The first aim of this project is to learn the basics of LLMs: how and why they work. Along the way, I'll add my own ideas and ML concepts to the architecture to try to improve the results.

Resources:

Theory

(1) The Attention Mechanism in Large Language Models (https://www.youtube.com/watch?v=OxCpWwDCDFQ)

(2) The math behind Attention: Keys, Queries, and Values matrices (https://www.youtube.com/watch?v=UPtG_38Oq8o)

(3) What are Transformer Models and how do they work? (https://www.youtube.com/watch?v=qaWMOYf4ri8)

(4) Course Series by Andrej Karpathy (https://karpathy.ai/zero-to-hero.html)
