Transformers

Minimal PyTorch implementation of the Transformer architecture from Vaswani et al.'s 2017 paper, Attention Is All You Need. This repository serves as a deep dive into the architecture and its nuances, including design decisions and training techniques.
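As a quick illustration of the core operation the modules listed below implement, here is a minimal sketch of scaled dot-product attention as defined in the paper. It is illustrative only, not the repository's exact code:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, per Vaswani et al. (2017).

    q, k, v: tensors of shape (batch, heads, seq_len, d_k).
    mask: optional boolean tensor broadcastable to (batch, heads, seq_len, seq_len),
          where False marks positions to hide (e.g. padding or future tokens).
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, heads, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention weights over key positions
    return weights @ v                   # (batch, heads, seq, d_k)
```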

Components

Project Structure

  • Tokenizer.py
  • Modules/
    • Attention.py
      • MultiHeadAttentionBase
      • MultiHeadSelfAttention
      • MultiHeadCrossAttention
    • AddNorm.py
    • MLP.py
    • Encoder.py
      • EncoderLayer
      • Encoder
    • Decoder.py
      • DecoderLayer
      • Decoder
    • Transformer.py
      • Transformer
  • Config/
    • Config.py
  • BERT/
    • BERT.py
      • Bert
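
A hypothetical usage sketch showing how the listed components might compose into a full model. The import path, constructor arguments, and hyperparameter names are assumptions based on the structure above, not the repository's actual API; the default values are the base hyperparameters from Vaswani et al. (2017):

```python
import torch

# Hypothetical import based on the project structure above; the real
# module path and constructor signature may differ.
from Modules.Transformer import Transformer

model = Transformer(
    vocab_size=32000,  # assumed shared source/target vocabulary size
    d_model=512,
    num_heads=8,
    num_layers=6,
    d_ff=2048,
    dropout=0.1,
)

src = torch.randint(0, 32000, (2, 16))  # (batch, src_len) token ids
tgt = torch.randint(0, 32000, (2, 12))  # (batch, tgt_len) token ids
logits = model(src, tgt)                # expected shape: (2, 12, 32000)
```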