About

Single-file implementation of Attention Is All You Need

Why

  • It's a flexible, interpretable architecture that is more performant than RNNs and applicable to POMDPs.
  • A simple implementation to play with on my Titan V and see how much I can get out of mixed-precision training and matmul-friendly tensor cores with a matmul-heavy architecture (see the sketch after this list).
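
As a rough illustration of the kind of experiment this enables, here is a minimal mixed-precision training sketch using PyTorch's `torch.cuda.amp` autocast/GradScaler. The model, dummy data, and hyperparameters are placeholders for illustration, not this repo's actual code:

```python
# Minimal mixed-precision sketch (hypothetical: the model, data, and
# hyperparameters are placeholders, not this repo's implementation).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# A small Transformer encoder: attention and feed-forward layers are
# dominated by matmuls, which tensor cores accelerate in fp16.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=2,
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

for step in range(10):
    x = torch.randn(32, 64, 512, device=device)  # (batch, seq, d_model) dummy batch
    with torch.cuda.amp.autocast(enabled=(device == "cuda")):
        out = model(x)
        loss = out.float().pow(2).mean()  # placeholder loss
    optimizer.zero_grad()
    scaler.scale(loss).backward()   # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)          # unscale gradients, then take the optimizer step
    scaler.update()
```

On Volta-class GPUs like the Titan V, autocast runs the matmul-heavy attention and feed-forward layers in fp16 so they hit the tensor cores, while GradScaler keeps small gradients from underflowing.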