Add How-to guides on parsing Solidity for CLI/Rust/NPM
Xanewok committed Dec 18, 2023
1 parent 8bd2839 commit e45f45c
Showing 4 changed files with 444 additions and 2 deletions.
1 change: 1 addition & 0 deletions .cspell.json
@@ -11,6 +11,7 @@
"doxygen",
"ebnf",
"inheritdoc",
"instanceof",
"ipfs",
"mkdocs",
"napi",
27 changes: 27 additions & 0 deletions crates/solidity/outputs/npm/tests/src/tests/cst-cursor.ts
@@ -106,3 +106,30 @@ test("use cursor", () => {
  expectToken(cursor.node(), TokenKind.Semicolon, ";");
  expect(cursor.goToNext()).toBe(false);
});

test("cursor navigation", () => {
  const data = "contract Foo {} contract Bar {} contract Baz {}";

  const language = new Language("0.8.0");
  const parseTree = language.parse(RuleKind.SourceUnit, data);

  let contractNames = [];
  let cursor = parseTree.createTreeCursor();

  while (cursor.goToNextRuleWithKinds([RuleKind.ContractDefinition])) {
    // You have to make sure you return the cursor to its original position
    cursor.goToFirstChild();
    cursor.goToNextTokenWithKinds([TokenKind.Identifier]);

    // The currently pointed-to node is the name of the contract
    let tokenNode = cursor.node();
    if (tokenNode.kind !== TokenKind.Identifier) {
      throw new Error("Expected identifier");
    }
    contractNames.push(tokenNode.text);

    cursor.goToParent();
  }

  expect(contractNames).toEqual(["Foo", "Bar", "Baz"]);
});
@@ -1,3 +1,116 @@
# How to parse a Solidity file

--8<-- "crates/solidity/inputs/language/snippets/under-construction.md"
In this guide, we'll walk you through the process of parsing a Solidity file using Slang. See [Installation](../#installation) on how to install Slang.

A file has to be parsed according to a specific Solidity [version](../../../solidity-specification/supported-versions/). The version has to be specified explicitly; it is not inferred from the source. To parse different parts of the source code using different versions, e.g. when contracts from multiple files have been flattened into one, you need to split the source and parse each part manually.
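Because the version is never inferred, a caller usually has to look at the source first. The sketch below is one possible way to do that: a hypothetical helper (`findVersionPragmas`, not part of Slang) that uses a plain regular expression to list the `pragma solidity` directives in a flattened file. It only reports the raw constraint text; choosing a concrete version that satisfies it is left to the caller, and the regex will also match directives that appear inside comments or strings.

```typescript
// Hypothetical helper, NOT part of the Slang API: list `pragma solidity`
// directives so the caller can decide which version to pass to `new Language(...)`.
interface PragmaMatch {
  constraint: string; // raw text between `pragma solidity` and `;`
  offset: number; // character offset of the directive in the source
}

function findVersionPragmas(source: string): PragmaMatch[] {
  const pattern = /pragma\s+solidity\s+([^;]+);/g;
  const matches: PragmaMatch[] = [];
  let m: RegExpExecArray | null;
  // With the `g` flag, each `exec` call resumes after the previous match.
  while ((m = pattern.exec(source)) !== null) {
    matches.push({ constraint: m[1].trim(), offset: m.index });
  }
  return matches;
}

// A flattened file that carries two different version constraints.
const flattened = [
  "pragma solidity ^0.8.0;",
  "contract Foo {}",
  "pragma solidity >=0.7.0 <0.9.0;",
  "contract Bar {}",
].join("\n");

const pragmas = findVersionPragmas(flattened);
```

The offsets can serve as split points, so each part can then be parsed with its own `Language` instance.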

## Using the NPM package

Start by adding the Slang package as a dependency to your project:

```bash
$ npm install "@nomicfoundation/slang"
```

Using the API directly gives us more fine-grained control over the parsing process; we can parse individual rules such as contracts, various definitions, or even expressions.

We start by creating a `Language` object for a given version. This is the entry point to the parser API.

```ts
import { Language } from "@nomicfoundation/slang/language";
import { RuleKind, TokenKind } from "@nomicfoundation/slang/kinds";
import { Cursor } from "@nomicfoundation/slang/cursor";

const source = "int256 constant z = 1 + 2;";
const language = new Language("0.8.11");

const parseOutput = language.parse(RuleKind.SourceUnit, source);
const cursor: Cursor = parseOutput.createTreeCursor();
```

The resulting `ParseOutput` class exposes these helpful functions:

- `errors()/isValid()`, which report structured parse errors, if any, and whether the parse succeeded,
- `tree()`, which gives us back a CST (partial if there were parse errors),
- `createTreeCursor()`, which creates a `Cursor` used to conveniently walk the parse tree.

### Example 1: Reconstruct the Solidity file

Let's try the same example, only now using the API directly.

We'll start with this file:

```solidity
// file: file.sol
pragma solidity ^0.8.0;
```

#### Step 1: Parse the Solidity file

Let's naively (ignoring any errors) read the file and parse it:

```ts
// `node:fs` has no named export called `fs`; import the module namespace instead
import * as fs from "node:fs";
const data = fs.readFileSync("file.sol", "utf8");

let parseTree = language.parse(RuleKind.SourceUnit, data);
```

#### Step 2: Reconstruct the source code

The `Cursor` visits the tree nodes in a depth-first search (DFS) fashion. Since our CST is complete (includes trivia such as whitespace), it's enough to visit the `Token` nodes and concatenate their text to reconstruct the original source code.

Let's do that:

```ts
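import { TokenNode } from "@nomicfoundation/slang/cst";

// Walk the tree that was parsed from file.sol in Step 1
const cursor = parseTree.createTreeCursor();

let output = "";
while (cursor.goToNext()) {
  let node = cursor.node();
  // Only tokens carry source text; rule nodes merely group them
  if (node instanceof TokenNode) {
    output += node.text;
  }
}

// Jest-style assertion for clarity
expect(output).toEqual("pragma solidity ^0.8.0;\n");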
```
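To see why concatenating token text is enough, here is a self-contained toy model of a complete CST. The `ToyNode` type and the kind names below are made up for illustration and are not Slang's node types or kinds; the point is only that when every character of the source, trivia included, lives in exactly one token, a depth-first walk that joins the token text yields the original source.

```typescript
// Toy CST model for illustration only; these are NOT Slang's node types.
type ToyNode =
  | { type: "rule"; kind: string; children: ToyNode[] }
  | { type: "token"; kind: string; text: string };

// Pre-order depth-first walk that concatenates the text of every token.
function reconstruct(node: ToyNode): string {
  if (node.type === "token") {
    return node.text;
  }
  return node.children.map((child) => reconstruct(child)).join("");
}

// Small constructor to keep the tree literal readable.
const tok = (kind: string, text: string): ToyNode => ({ type: "token", kind, text });

// Hand-built tree for `pragma solidity ^0.8.0;` (kind names are hypothetical).
const toyTree: ToyNode = {
  type: "rule",
  kind: "Pragma",
  children: [
    tok("Keyword", "pragma"),
    tok("Whitespace", " "),
    tok("Keyword", "solidity"),
    tok("Whitespace", " "),
    {
      type: "rule",
      kind: "VersionExpression",
      children: [tok("Caret", "^"), tok("Version", "0.8.0")],
    },
    tok("Semicolon", ";"),
    tok("EndOfLine", "\n"),
  ],
};

const rebuilt = reconstruct(toyTree);
```

Dropping the whitespace tokens from the tree would make the round trip lossy, which is why a CST, unlike an AST, keeps trivia.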

### Example 2: List the top-level contracts and their names

The `Cursor` type exposes more procedural-style functions that allow you to navigate the source in an imperative fashion. In addition to `goToNext`, we can go to the parent, first child, next sibling, etc., as well as nodes with a given kind.

To list the top-level contracts and their names, we need to visit the `ContractDefinition` rule nodes and then their `Identifier` children.

Let's do that:

```ts
import * as fs from "node:fs";
import { Language } from "@nomicfoundation/slang/language";
import { RuleKind, TokenKind } from "@nomicfoundation/slang/kinds";

// Assumes file.sol defines the top-level contracts Foo, Bar and Baz
const data = fs.readFileSync("file.sol", "utf8");

const language = new Language("0.8.0");
const parseTree = language.parse(RuleKind.SourceUnit, data);

let contractNames = [];
let cursor = parseTree.createTreeCursor();

while (cursor.goToNextRuleWithKinds([RuleKind.ContractDefinition])) {
  // You have to make sure you return the cursor to its original position
  cursor.goToFirstChild();
  cursor.goToNextTokenWithKinds([TokenKind.Identifier]);

  // The currently pointed-to node is the name of the contract
  let tokenNode = cursor.node();
  if (tokenNode.kind !== TokenKind.Identifier) {
    throw new Error("Expected identifier");
  }
  contractNames.push(tokenNode.text);

  // Step back up to the ContractDefinition node before searching for the next one
  cursor.goToParent();
}

expect(contractNames).toEqual(["Foo", "Bar", "Baz"]);
```