Skip to content

Latest commit

 

History

History
30 lines (21 loc) · 1.32 KB

README.md

File metadata and controls

30 lines (21 loc) · 1.32 KB

tiktoken.hs

This library is a binding to an extremely (as in, one function) subset of the tiktoken-rs library. It exposes a function countTokens :: Text -> Word64 which can be used to count tokens and return a result which should match the one returned by OpenAI itself (see for example their online tool).

Library design

This library uses the haskell-foreign-rust and haskell-rust-ffi to call into tiktoken-rs which is currently the industry-standard for tokenisation. Internally, this library is really composed by a Rust wrapper and a Haskell library, where the former is shipped alongside the latter, and we use a Custom setup script to seamlessly build the Rust wrapper before building the Haskell library.

For more information see the blog post Calling Purgatory from Heaven.

Building the project

This project requires a nighly version of the Rust toolchain as well as the cargo-c applet. You can install both with:

rustup toolchain install nightly
cargo install cargo-c

Then, you can build this project like any other Haskell library with cabal v2-build.