Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve acorn-walk types, add ".d.ts tests" using tsd #1344

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

reillylm
Copy link

This makes two changes to the TS types for acorn-walk.

Using AnyNode in more places

We currently have the discriminated union AnyNode, who's getting used in the functions who accept a visitor object. It enables us to receive the actual Node type as an argument to our visitor function:

walk.simple(ast, {
  Literal: (n) => {
    // works because our visitor function receives a node of type:
    // Extract<acorn.AnyNode, { type: "Literal" }>
    console.log(n.value)
  }
})

If we weren't using the discriminated union, n would be of type Node, and when we try to access n.value, we'd get the TS error TS2339: Property value does not exist on type Node.

With this change, we now use AnyNode in the other places where we receive a Node in a callback, like findNodeAt:

walk.findNodeAt(ast, null, null, (_, n) => {
  return n.type === "Literal" && n.raw?.includes("someSubstring")
})

With the discriminated union, once we confirm n.type === "Literal", TS can "know" that n is a Literal, and we can access its fields appropriately.

This doesn't change Node to AnyNode in the parameters to the walker functions themselves, since we don't use TS in the library itself, so we have no need to disambiguate between input Nodes. If we pass e.g. a BlockStatement as the first argument to simple, that's fine for TS because a BlockStatement is a Node. I don't think it would hurt anything to change those occurrences to AnyNode too, though.

Cleaning up some duplication

The visitor object types previously contained a little duplication because visitor objects can contain either Node type names (like Literal) or Aggregate type names (like Pattern) as keys. Previously, we used an intersection type ({} & {}) to account for the Node/Aggregate cases, but now there's a single conditional type who we can share between the visitor object types. Hopefully giving names to all the different parts makes things clearer too.

Also, with NodeOrAggregateTypeName representing both sets of type names, we can make the typing a little more accurate for the functions where we receive the type name in a callback.

Adding type definition tests

As part of making the change, I wrote some "type definition tests" using the tsd library, where we can assert things like "when we write a visitor function for the Node type Literal, the walker function will call it with a Literal as the first argument."

I found them helpful in giving me some confidence while making the changes (just as regular tests do), so I feel like they're worth having if we ever need to change the types. We can run the same test with both the walk.d.ts before and after the changes, to see which new scenarios are accounted for with the changes.

I excluded the test file from ESLint; we could add it, it would just require us adding the TS ESLint plugin, but I wasn't sure if we wanted that (I noticed the test directory is also currently excluded from ESLint).

We could also make the test run as part of CI. It runs with npx tsd in the acorn-walk directory and uses the test in the test-d directory to test dist/walk.d.ts by default.

This makes two changes to the TS types for acorn-walk.

**Using `AnyNode` in more places**

We currently have the discriminated union `AnyNode`, who's getting used in the functions who accept a visitor object. It enables us to receive the actual Node type as an argument to our visitor function:

```
walk.simple(ast, {
  Literal: (n) => {
    // works because our visitor function receives a node of type:
    // Extract<acorn.AnyNode, { type: "Literal" }>
    console.log(n.value)
  }
})
```

If we weren't using the discriminated union, `n` would be of type `Node`, and when we try to access `n.value`, we'd get the TS error `TS2339: Property value does not exist on type Node`.

With this change, we now use `AnyNode` in the other places where we receive a Node in a callback, like `findNodeAt`:

```
walk.findNodeAt(ast, null, null, (_, n) => {
  return n.type === "Literal" && n.raw?.includes("someSubstring")
})
```

With the discriminated union, once we confirm `n.type === "Literal"`, TS can "know" that `n` is a `Literal`, and we can access its fields appropriately.

This doesn't change `Node` to `AnyNode` in the parameters to the walker functions themselves, since we don't use TS in the library itself, so we have no need to disambiguate between input Nodes. If we pass e.g. a `BlockStatement` as the first argument to `simple`, that's fine for TS because a `BlockStatement` is a `Node`. I don't think it would hurt anything to change those occurrences to `AnyNode` too, though.

**Cleaning up some duplication**

The visitor object types previously contained a little duplication because visitor objects can contain either Node type names (like `Literal`) or Aggregate type names (like `Pattern`) as keys. Previously, we used an intersection type (`{} & {}`) to account for the Node/Aggregate cases, but now there's a single conditional type who we can share between the visitor object types. Hopefully giving names to all the different parts makes things clearer too.

Also, with `NodeOrAggregateTypeName` representing both sets of type names, we can make the typing a little more accurate for the functions where we receive the type name in a callback.

**Adding type definition tests**

As part of making the change, I wrote some "type definition tests" using the [`tsd` library](https://github.com/tsdjs/tsd), where we can assert things like "when we write a visitor function for the Node type `Literal`, the walker function will call it with a `Literal` as the first argument."

I found them helpful in giving me some confidence while making the changes (just as regular tests do), so I feel like they're worth having if we ever need to change the types. We can run the same test with both the `walk.d.ts` before and after the changes, to see which new scenarios are accounted for with the changes.

I excluded the test file from ESLint; we could add it, it would just require us adding the [TS ESLint plugin](https://typescript-eslint.io/getting-started/legacy-eslint-setup/), but I wasn't sure if we wanted that (I noticed the `test` directory is also currently excluded from ESLint).

We could also make the test run as part of CI. It runs with `npx tsd` in the acorn-walk directory and uses the test in the `test-d` directory to test `dist/walk.d.ts` by default.
@reillylm
Copy link
Author

For the tsd test, I tried to keep all the assertions terse so it doesn't look completely like an eye chart, but it still pretty much does. Places where you might think duplication can be reduced, I don't think they can. Here are some ones I tried:

Where we use the visitor object {Literal: skip, Pattern: skip}, we couldn't extract it to a variable with a shorter name, because TS only does "excess property checks" on object literals, not variables (in fact, the docs mention this as a feature of sorts), so we'd be losing some validation and tests would still pass, even if the visitor object contained invalid keys.

Where we call all the functions who accept a visitor object, we couldn't do something like:

const visitorObjectAcceptingWalkers = [ simple, ancestor, recursive, ... ]

for (const walker of visitorObjectAcceptingWalkers) {
  expectAssignable<Parameters<typeof walker>>([ast,
    {Super: v, Class: v},
    {Super: v, Class: v}
  ])
}

...for two reasons: the test logic isn't actually getting executed, just type-checked, so we wouldn't be able to have imperative logic like that, but even if we could, the visitor object isn't always in the same argument position for the different functions.

In our mock visitor functions who check the discriminated union behavior, we couldn't extract the functions like:

function checkSuperDiscriminatedUnion(node) {
  expectType<acorn.Super>(node)
}

function checkPatternDiscriminatedUnion(node) {
  expectType<acorn.Pattern>(node)
  if (node.type === "Identifier") { expectType<acorn.Identifier>(node) }
}

walk.simple(ast, {
  Super: checkSuperDiscriminatedUnion,
  Pattern: checkPatternDiscriminatedUnion
})

...because the type of the input parameter node is determined by the object key of the visitor object, so it needs to be "attached" to the visitor object.

@marijnh
Copy link
Member

marijnh commented Jan 29, 2025

For me, the walk.simple program you're showing already typechecks. SimpleVisitors and similar make sure the argument to the walker function is the actual node type that corresponds to the property name. As such, I don't really understand the motivation for the added complexity in the walker object types.

@reillylm
Copy link
Author

I agree about the walk.simple example-- I included it to demonstrate the current behavior of the visitor function types, where using the AnyNode discriminated union is the "correct" typing, and to provide motivation for the change to have the other callback types (e.g. FullWalkerCallback, FindPredicate) use AnyNode as well.

That's the primary "type behavioral" change (to invent a term 😃) in the PR: to make the other callback types pass AnyNode to the user, so they can have the same "correctness" we currently have on the visitor function types.

There's one smaller "type behavioral" change: in places where we pass a Node/Aggregate type name to the user, we'll use a type who reflects this constraint (NodeOrAggregateTypeName) rather than just string (e.g. we'll only ever receive type names like "Literal" or "Pattern" as the third argument to our FullWalkerCallback, not an arbitrary string like "hello world")

Here's a TS Playground link who demonstrates those two changes.

We can also visualize the changes using the tsd test file. The tests above line 142 demonstrate the current behavior of the visitor object/function types. The tests below line 142 show the scenarios who are "fixed" by this PR (using AnyNode in more places, using NodeOrAggregateTypeName instead of string).

As for the visitor object types (SimpleVisitors / AncestorVisitors / RecursiveVisitors), the changes to those are only for the purpose of reducing duplication and improving clarity, with no "type behavior" changes.

It's only so that we can open the d.ts file and read:

export type SimpleVisitors<TState> = {
  [typeName in NodeOrAggregateTypeName]?: SimpleVisitorFunction<typeName, TState>
}

type SimpleVisitorFunction<Name extends NodeOrAggregateTypeName, TState> =
  (node: ActualNodeForType<Name>, state: TState) => void

rather than:

export type SimpleVisitors<TState> = {
  [type in acorn.AnyNode["type"]]?: (node: Extract<acorn.AnyNode, { type: type }>, state: TState) => void
} & {
  [type in keyof AggregateType]?: (node: AggregateType[type], state: TState) => void
}

...where the two possibilities of keys (Node type names and Aggregate type names) are expressed through the "helper type" NodeOrAggregateTypeName, and the the relationship between the type name and the actual node type is expressed through the helper type ActualNodeForType, rather than the intersection type ({} & {}) getting repeated in the three visitor object types.

I'm not sure if breaking out the callback types into their own helper types (e.g. SimpleVisitorFunction) is making it look more complex? My idea there was just to use naming to make the distinction more clear between a "visitor function" (a funtion who accepts a node) and a "visitor object" (an object who maps Node/Aggregate names to visitor functions). But we could write it like this instead:

export type SimpleVisitors<TState> = {
  [typeName in NodeOrAggregateTypeName]?: (node: ActualNodeForType<typeName>, state: TState) => void
}

@marijnh
Copy link
Member

marijnh commented Jan 31, 2025

The changes to functions like findNodeAt seem like a good idea. The others, though they do make the definition of those types look simpler, also make them involve a lot more indirection. I don't think the total effect is that beneficial, and I'm worried that they might cause obscure type error regressions for people.

Could you reduce the scope of this PR to just those NodeAnyNode changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants