This repository contains a from-scratch implementation of the Transformer architecture in PyTorch, as originally introduced by Vaswani et al. (2017) for translation tasks. Transformers have since been widely used across sequence processing tasks, from language modelling (Devlin et al., 2019) and image processing (Dosovitskiy et al., 2022) to time series forecasting (Wu et al., 2022).
This code aims to reproduce the core components of a Transformer model (e.g., positional embedding, attention mechanisms, residual connections, layer normalisation) in PyTorch, following the 2022 paper Formal Algorithms for Transformers. Each core component is implemented independently and tested by checking the shapes of the tensors it produces. The full architectures (encoder, decoder, encoder-decoder) are then assembled from these building blocks. A minimal sketch of one such block and its shape test is shown below.
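The snippet below is an illustrative sketch (not the repository's exact code) of a single-head scaled dot-product attention block together with the kind of shape check the components are tested with; the module and argument names here are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaledDotProductAttention(nn.Module):
    """Single-head scaled dot-product attention (illustrative sketch)."""

    def __init__(self, d_model: int, d_attn: int):
        super().__init__()
        self.query = nn.Linear(d_model, d_attn)
        self.key = nn.Linear(d_model, d_attn)
        self.value = nn.Linear(d_model, d_attn)
        self.scale = d_attn ** -0.5

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_x, d_model) provides queries; z: (batch, seq_z, d_model) provides keys/values
        q, k, v = self.query(x), self.key(z), self.value(z)
        scores = q @ k.transpose(-2, -1) * self.scale   # (batch, seq_x, seq_z)
        weights = F.softmax(scores, dim=-1)
        return weights @ v                              # (batch, seq_x, d_attn)

# Shape test in the spirit of the per-component tests.
if __name__ == "__main__":
    attn = ScaledDotProductAttention(d_model=16, d_attn=8)
    x = torch.randn(2, 5, 16)
    out = attn(x, x)                                    # self-attention
    assert out.shape == (2, 5, 8)
```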
We validate the implementation by successfully overfitting a small, artificially generated batch of data with a tiny Transformer encoder model, as shown in the experiment.ipynb notebook. A simplified version of this sanity check is sketched below.
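The following is a hedged sketch of the overfitting sanity check, using PyTorch's built-in `nn.TransformerEncoderLayer` as a stand-in for the repository's own encoder; all sizes and names are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy classification setup: a tiny encoder should drive the loss close to zero
# on a single fixed batch if the building blocks are wired correctly.
vocab_size, seq_len, batch_size, d_model, n_classes = 20, 10, 8, 32, 4

tokens = torch.randint(0, vocab_size, (batch_size, seq_len))
labels = torch.randint(0, n_classes, (batch_size,))

# Stand-in encoder built from stock PyTorch modules; the notebook uses the
# repository's own encoder implementation instead.
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.TransformerEncoderLayer(d_model, nhead=4, dim_feedforward=64,
                               dropout=0.0, batch_first=True),
)
head = nn.Linear(d_model, n_classes)
optimizer = torch.optim.Adam(list(model.parameters()) + list(head.parameters()), lr=1e-3)

for step in range(500):
    logits = head(model(tokens).mean(dim=1))   # mean-pool over the sequence
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final loss: {loss.item():.4f}")        # should be close to zero
```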