This repository contains Sorbet, a static typechecker for a subset of Ruby. It is still in early stages, but is mature enough to run on the majority of Ruby code at Stripe. You are welcome to try it, though, but your experience might still be rough.
This README contains documentation specifically for contributing to Sorbet. You might also want to:
- Read the public Sorbet docs
- Or even edit the docs
- Watch the talks we've given about Sorbet
- Try the Sorbet playground online
If you are at Stripe, you might also want to see http://go/types/internals for docs about Stripe-specific development workflows and historical Stripe context.
- Sorbet user-facing design principles
- Quickstart
- Learning how Sorbet works
- Building Sorbet
- Running Sorbet
- Running the tests
- Testing Sorbet against pay-server
- Writing tests
- C++ conventions
- Debugging and profiling
- Writing docs
- Editor and environment
Early in our project we've defined some guidelines for how working with sorbet should feel like.
-
Explicit
We're willing to write annotations, and in fact see them as beneficial; they make code more readable and predictable. We're here to help readers as much as writers.
-
Feel useful, not burdensome
While it is explicit, we are putting effort into making it concise. This shows in multiple ways:
- error messages should be clear
- verbosity should be compensated with more safety
-
As simple as possible, but powerful enough
Overall, we are not strong believers in super-complex type systems. They have their place, and we need a fair amount of expressive power to model (enough) real Ruby code, but all else being equal we want to be simpler. We believe that such a system scales better, and—most importantly—is easier for our users to learn & understand.
-
Compatible with Ruby
In particular, we don't want new syntax. Existing Ruby syntax means we can leverage most of our existing tooling (editors, etc). Also, the point of Sorbet is to gradually improve an existing Ruby codebase. No new syntax makes it easier to be compatible with existing tools.
-
Scales
On all axes: execution speed, number of collaborators, lines of code, codebase age. We work in large Ruby code bases, and they will only get larger.
-
Can be adopted gradually
In order to make adoption possible at scale, we cannot require every team or project to adopt Sorbet all at once. Sorbet needs to support teams adopting it at different paces.
-
Install the dependencies
brew install bazel autoconf coreutils parallel
-
Clone this repository
git clone https://github.com/sorbet/sorbet.git
cd sorbet
-
Build Sorbet
./bazel build //main:sorbet --config=dbg
-
Run Sorbet!
bazel-bin/main/sorbet -e "42 + 'hello'"
We've documented the internals of Sorbet in a separate doc. Cross-reference between that doc and here to learn how Sorbet works and how to change it!
There is also a talk online that describes Sorbet's high-level architecture and the reasons why it's fast:
→ Fast type checking for Ruby
There are multiple ways to build sorbet
. This one is the most common:
./bazel build //main:sorbet --config=dbg
This will build an executable in bazel-bin/main/sorbet
(see "Running Sorbet"
below). There are many options you can pass when building sorbet
:
--config=dbg
- Most common build config for development.
- Good stack traces, runs all ENFORCEs.
--config=sanitize
- Link in extra sanitizers, in particular: UBSan and ASan.
- Catches most memory and undefined-behavior errors.
- Substantially larger and slower binary.
--config=debugsymbols
- (Included by
--config=dbg
) debugging symbols, and nothing else.
- (Included by
--config=forcedebug
- Use more memory, but report even more sanity checks.
--config=static-libs
- Forcibly use static linking (Sorbet defaults to dynamic linking for faster build times).
- Sorbet already uses this option in release builds (see below).
--config=release-mac
and--config=release-linux
- Exact release configuration that we ship to our users.
Independently of providing or omitting any of the above flags, you can turn on optimizations for any build:
-c opt
- Enables
clang
optimizations (i.e.,-O2
)
- Enables
These args are not mutually exclusive. For example, a common pairing when debugging is
--config=dbg --config=sanitize
In tools/bazel.rc you can find out what all these options (and others) mean.
(Mac) Xcode version must be specified to use an Apple CROSSTOOL
This error typically occurs after an XCode upgrade.
Developer tools must be installed, the XCode license must be accepted, and your active XCode command line tools directory must point to an installed version of XCode.
The following commands should do the trick:
# Install command line tools
xcode-select --install
# Ensure that the system finds command line tools in an active XCode directory
sudo xcode-select -s /Applications/Xcode.app/Contents/Developer
# Accept the XCode license.
sudo xcodebuild -license
# Clear bazel's cache, which may contain files generated from a previous
# version of XCode command line tools.
bazel clean --expunge
(Mac) fatal error: 'stdio.h' file not found
(or some other system header)
This error can happen on Macs when the /usr/include
folder is missing. The
solution is to install macOS headers via the following package:
open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg
Run Sorbet on an expression:
bazel-bin/main/sorbet -e "1 + false"
Run Sorbet on a file:
bazel-bin/main/sorbet foo.rb
Running bazel-bin/main/sorbet --help
will show lots of options. These are
the common ones for contributors:
-p <IR>
- Asks sorbet to print out any given intermediate representation.
- See
--help
for available values of<IR>
.
--stop-after <phase>
- Useful when there's a bug in a later phase, and you want to quit early to debug.
-v
,-vv
,-vvv
- Show
logger
output (increasing verbosity)
- Show
--max-threads=1
- Useful for determining if you're dealing with a concurrency bug or not.
--wait-for-dbg
- Will freeze Sorbet on startup and wait for a debugger to attach
- This is useful when you don't have control over launching the process (LSP)
To run all the tests:
bazel test //... --config=dbg
(The //...
literally means "all targets".)
To run a subset of the tests curated for faster iteration and development speed, run:
bazel test test --config=dbg
Note that in bazel terms, the second test is an alias for //test:test
, so we're being a bit cute here.
By default, all test output goes into files. To also print it to the screen:
bazel test //... --config=dbg --test_output=errors
If any test failed, you will see two pieces of information printed:
1. //test:test_testdata/resolver/optional_constant
2. /private/var/tmp/.../test/test_testdata/resolver/optional_constant/test.log
- the test's target (in case you want to run just this test again with
bazel test <target>
) - a (runnable) file containing the test's output
To see the failing output, either:
- Re-run
bazel test
with the--test_output=errors
flag - Copy/paste the
*.log
file and run it (the output will open inless
)
This is specific to contributing to Sorbet at Stripe.
If you are at Stripe and want to test your branch against pay-server, see http://go/types/local-dev.
We write tests by adding files to subfolders of the test/
directory.
Individual subfolders are "magic"; each contains specific types of tests.
We aspire to have our tests be fully reproducible.
C++ note: In C++, hash functions are only required to produce the same result for the same input within a single execution of a program.
Thus, we expect all user-visible outputs to be explicitly sorted using a key stable from one run to the next.
There are many ways to test Sorbet, some "better" than others. We've ordered them below in order from most preferable to least preferable. And we always prefer some test to no tests!
The first kind of test can be called either test_corpus tests or testdata tests, based on the name of the test harness or the folder containing these tests, respectively.
To create a test_corpus test, add any file <name>.rb
to test/testdata
, in
any folder depth. The file must either:
- typecheck entirely, or
- throw errors only on lines marked with a comment (see below).
To mark that a line should have errors, append # error: <message>
(the
<message>
must match the raised error message). In case there are multiple
errors on this line, add an # error: <message>
on its own line just below.
Error checks can optionally point to a range of characters rather than a line:
1 + '' # error: `String` doesn't match `Integer`
rescue Foo, Bar => baz
# ^^^ error: Unable to resolve constant `Foo`
# ^^^ error: Unable to resolve constant `Bar`
You can run this test with:
bazel test //test:test_PosTests/testdata/path/to/<name>
Each test_corpus test can be turned into an expectation test by optionally
creating any number of <name>.rb.<phase>.exp
files (where <name>
matches the
name of the ruby file for this test). These files contain pretty printed
representations of internal data structures, according to what would be printed
by -p <phase>
. The snapshot must exactly match the output generated by running
sorbet -p <phase> <name>.rb
for the test to pass.
You can run this test with:
bazel test //test:test_PosTests/testdata/path/to/<name>
Files that begin with a prefix and __
will be run together. For example,
foo__1.rb
and foo__2.rb
will be run together as test foo
.
Any folder <name>
that is added to test/cli/
becomes a test.
This folder should have a file <name>.sh
that is executable.
When run, its output will be compared against <name>.out
in that folder.
Our bazel setup will produce two targets:
bazel run //test/cli:test_<name>
will execute the.sh
filebazel test //test/cli:test_<name>
will execute the.sh
and check it against what's in the.out
file.
The scripts are run inside Bazel, so they will be executed from the top of the
workspace and have access to sources files and built targets using their path
from the root. In particular, the compiled sorbet binary is available under
main/sorbet
.
Most LSP tests are expectation tests with additional LSP-specific annotations.
They are primarily contained in test/testdata/lsp
, but all files in test/testdata
are tested in LSP mode. You can run a test test/testdata/lsp/<name>.rb
like so:
bazel test //test:test_LSPTests/testdata/lsp/<name>
LSP tests have access to def
and usage
assertions that you can use to annotate definition
and usage sites for a variable:
a = 10
# ^ def: a
b = a + 10
# ^ usage: a
With these annotations, the test will check that "Find Definition" from the addition will lead to
a = 10
, and that "Find All References" from either location will return both the definition and usage.
If a variable is re-defined, it can be annotated with a version number:
a = 10
# ^ def: a 1
a = 20
# ^ def: a 2
b = a + 10
# ^ usage: a 2
If a location should not report any definition or usage, then use the magic label (nothing)
:
a = 10
# ^ def: (nothing)
This is somewhat similar to "Find Definition" above, but also slightly different because there's no analogue of "Find All Type Definitions."
class A; end
# ^ type-def: some-label
aaa = A.new
# ^ type: some-label
The type: some-label
assertion says "please simulate a Go to Type Definition
here, named some-label
" and the type-def: some-label
assertion says "assert
that the results for some-label
are exactly these locations."
That means if the type definition could return multiple locs, the assertions will have to cover all results:
class A; end
# ^ type-def: AorB
class B; end
# ^ type-def: AorB
aaa = T.let(A.new, T.any(A, B))
# ^ type: AorB
If a location should not report any definition or usage, then use the magic
label (nothing)
:
# typed: false
class A; end
aaa = A.new
# ^ def: (nothing)
LSP tests can also assert the contents of hover responses with hover
assertions:
a = 10
# ^ hover: Integer(10)
If a location should report the empty string, use the special label (nothing)
:
a = 10
# ^ hover: (nothing)
🚧 This section is under construction! 🚧
LSP tests can also assert the contents of completion responses with completion
assertions.
class A
def self.foo_1; end
def self.foo_2; end
foo
# ^ completion: foo_1, foo_2
end
The ^
corresponds to the position of the cursor. So in the above example, it's
as if the cursor is like this: foo│
. If the ^
had been directly under the
last o
, it would have been like this: fo|o
. Only the first ^
is used. If
you use ^^^
in the assertion, the test harness will use only the first caret.
You can also write a test for a partial prefix of the completion results:
class A
def self.foo_1; end
def self.foo_2; end
foo
# ^ completion: foo_1, ...
end
Add the , ...
suffix to the end of a partial list of completion results, and
the test harness will ensure that the listed identifiers match a prefix of the
completion items. This prefix must still be listed in order.
If a location should report zero completion items, use the special message
(nothing)
:
class A
def self.foo_1; end
def self.foo_2; end
zzz
# ^ completion: (nothing)
end
To write a test for the snippet that would be inserted into the document if a particular completion item was selected, you can make two files:
# -- test/testdata/lsp/completion/mytest.rb --
class A
def self.foo_1; end
end
A.foo_
# ^ apply-completion: [A] item: 0
The apply-completion
assertion says "make sure the file mytest.A.rbedited
contains the result of inserting the completion snippet for the 0th completion
item into the file."
# -- test/testdata/lsp/completion/mytest.A.rbedited --
class A
def self.foo_1; end
end
A.foo_1${0}
# ^ apply-completion: [A] item: 0
As you can see, the fancy ${...}
(tabstop placeholders) show up verbatim in
the output if they were sent in the completion response.
In LSP mode, Sorbet runs file updates on a fast path or a slow path. It checks the structure of the file before and after the update to determine if the change is covered under the fast path. If it is, it performs further processing to determine the set of files that need to be typechecked.
LSP tests can define file updates in <name>.<version>.rbupdate
files which contain the contents of <name>.rb
after the update occurs. For example, the file foo.1.rbupdate
contains the updated contents of foo.rb
.
If the test contains multiple files by using a __
suffixed prefix, then all rbupdates with the same version will
be applied in the same update. For example, foo__bar.1.rbupdate
and foo__baz.1.rbupdate
will be applied
simultaneously to update foo__bar.rb
and foo__baz.rb
.
Inside *.rbupdate
files, you can assert that the slow path ran by adding a line with # assert-slow-path: true
.
You can assert that the fast path ran on foo__bar.rb
and foo__baz.rb
with
#assert-fast-path: foo__bar.rb,foo__baz.rb
.
It is possible to record an LSP session and use it as a test. We are attempting to move away from this form of testing, as these tests are hard to update and understand. If at all possible, try to add your test case as a regular LSP test.
Any folder <name>
that is added to test/lsp/
will become a test.
This folder should contain a file named <folderName>.rec
that contains a
recorded LSP session.
- Lines that start with "Read:" will be sent to sorbet as input.
- Lines that start with "Write:" will be expected from sorbet as output.
Frequently when a test is failing, it's because something inconsequential changed in the captured output, rather than there being a bug in your code.
To recapture the traces, you can run
tools/scripts/update_exp_files.sh
You will probably want to look through the changes and git checkout
any files
with changes that you believe are actually bugs in your code and fix your code.
update_exp_files.sh
updates every snapshot file kind known to Sorbet. This can
be slow, depending on what needs to be recompiled and updated. Some faster
commands:
# Only update the `*.exp` files in `test/testdata`
tools/scripts/update_testdata_exp.sh
# Only update the `*.exp` files in `test/testdata/cfg`
tools/scripts/update_testdata_exp.sh test/testdata/cfg
# Only update a single exp file's test:
tools/scripts/update_testdata_exp.sh test/testdata/cfg/next.rb
# Only update the `*.out` files in `test/cli`
bazel test //test/cli:update
-
TODO(jez) Write and link to "Notes on C++ Development"
-
Use smart pointers for storage, references for arguments.
-
No C-style allocators; use vectors instead.
In general,
- to debug a normal build of sorbet?
lldb bazel-bin/main/sorbet -- <args> ...
- (Consider using
--config=static-libs
for better debug symbols) - If you see weird Python errors on macOS, try
PATH=/usr/bin lldb
.
- to debug an existing Sorbet process (i.e., LSP)
- launch Sorbet with the
--wait-for-dbg
flag lldb -p <pid>
- set breakpoints and then
continue
- launch Sorbet with the
Also, it’s good to get in the practice of fixing bugs by first adding an
ENFORCE
(assertion) that would have caught the bug before actually fixing the
bug. It’s far easier to fix bugs when there’s a nice error message stating what
invariant you’ve violated. ENFORCE
s are free in the release build.
- TODO(jez) Write about how to profile Sorbet
The sources for Sorbet's documentation website live in the
website/
folder. Specifically, the docs live in
website/docs/
, are all authored with Markdown, and are built
using Docusaurus.
^ See here for how to work with the documentation site locally.
Bazel supports having a persistent cache of previous build results so that
rebuilds for the same input files are fast. To enable this feature, run these
commands to create a ./.bazelrc.local
and cache folder:
# The .bazelrc.local will live in the sorbet repo so it doesn't interfere with
# other bazel-based repos you have.
echo "build --disk_cache=$HOME/.cache/sorbet/bazel-cache" >> ./.bazelrc.local
echo "test --disk_cache=$HOME/.cache/sorbet/bazel-cache" >> ./.bazelrc.local
mkdir -p "$HOME/.cache/sorbet/bazel-cache"
Sometimes it can be nice to have [multiple working trees] in Git. This allows
you to have multiple active checkouts Sorbet, sharing the same .git/
folder.
To set up a new worktree with Sorbet:
tools/scripts/make_worktree.sh <worktree_name>
Many of the build commands are very long. You might consider shortening the common ones with shell aliases of your choice:
# mnemonic: 's' for sorbet
alias sb="bazel build //main:sorbet --config=dbg"
alias st="bazel test //... --config=dbg --test_output=errors"
We ensure that C++ files are formatted with clang-format
and that Bazel BUILD
files are formatted with buildifier
. To avoid inconsistencies between
different versions of these tools, we have scripts which download and run these
tools through bazel
:
tools/scripts/format_cxx.sh
tools/scripts/format_build_files.sh
CI will fail if there are any unformatted files, so you might want to set up your files to be formatted automatically with one of these options:
- Set up a pre-commit / pre-push hook which runs these scripts.
- Set up your editor to run these scripts. See below.
The clang
suite of tools has a pretty great story around editor tooling: you
can build a compile_commands.json
using Clang's Compilation Database format.
Many clang-based tools consume this file to provide language-aware features in, for example, editor integrations.
To build a compile_commands.json
file for Sorbet with bazel:
tools/scripts/build_compilation_db.sh
You are encouraged to play around with various clang-based tools which use the
compile_commands.json
database. Some suggestions:
-
rtags -- Clang aware jump-to-definition / find references / etc.
brew install rtags # Have the rtags daemon be automatically launched by macOS on demand brew services start rtags # cd into sorbet # ensure that ./compile_commands.json exists # Tell rtags to index sorbet using our compile_commands.json file rc -J .
There are rtags editor plugins for most text editors.
-
clangd -- Clang-based language server implementation
brew install llvm@8 # => /usr/local/opt/llvm/bin/clangd # You might need to put this on your PATH to tell your editor about it.
clangd
supports more features thanrtags
(specifically, reporting Diagnostics), but can be somewhat slower at times because it does not pre-index all your code like rtags does. -
clang-format -- Clang-based source code formatter
We build
clang-format
in Bazel to ensure that everyone uses the same version. Here's how you can getclang-format
out of Bazel to use it in your editor:# Build clang-format with bazel ./bazel build //tools:clang-format # Once bazel runs again, this symlink to clang-format will go away. # We need to copy it out of bazel so our editor can use it: mkdir -p "$HOME/bin" cp bazel-bin/tools/clang-format $HOME/bin # (Be sure that $HOME/bin is on your PATH, or use a path that is)
With
clang-format
on your path, you should be able to find an editor plugin that uses it to format your code on save.Note: our format script passes some extra options to
clang-format
. Configure your editor to pass these options along toclang-format
:-style=file -assume-filename=<CURRENT_FILE>
-
CLion -- JetBrains C/C++ IDE
CLion can be made aware of the
compile_commands.json
database. Replaces your entire text editing workflow (full-fledged IDE). -
vscode-clangd -- Clangd extension for VS Code
This extension integrates clangd (see above) with VS Code. It will also run
clang-format
whenever you save. Note: Microsoft's C/C++ extension does not work properly with Sorbet'scompile_commands.json
.clangd will need to be on your path, or you will need to change the "clangd.path" setting.
clangd operates on
compile_commands.json
, so make sure you run the./tools/scripts/build_compilation_db.sh
script.
Here are some sample config setups: