Create CODEGEN.md #45

Merged
merged 1 commit into from
Jan 24, 2024
41 changes: 41 additions & 0 deletions docs/CODEGEN.md
@@ -0,0 +1,41 @@
# Code Generation

I'd initially wanted to hide the code generation aspects of this project and just commit the generated `.c` and `.h`
files in the `src/` directory. However, this is disingenuous and probably hinders other people playing with the code,
so I've made it official.

There was always a `tmp/` directory, created by `make`, that holds the generated flex and bison output, so it was
simple enough to generate the additional code there instead of into `src/`, and to remove those generated files from git.

So what's the code generation for? It just removes the need to maintain a ton of boilerplate code around structures
used by the project. There are a number of `.yaml` files in the `src/` directory which basically declare
C structs and their typedefs. At the time of writing they are:

* [`anf.yaml`](../src/anf.yaml) A-Normal form structures input to the bytecode compiler, generated from the lambda structures.
* [`ast.yaml`](../src/ast.yaml) The abstract syntax tree generated by the parser.
* [`lambda.yaml`](../src/lambda.yaml) Lambda calculus-like structures generated from the AST.
* [`tc.yaml`](../src/tc.yaml) Type checking support for Algorithm W.
* [`tpmc.yaml`](../src/tpmc.yaml) Term Pattern Matching Compiler support structures, part of lambda conversion.
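
To make the idea concrete, here is a minimal sketch of turning one declaration into a C struct and its typedef. It does not reproduce the project's real YAML schema or the actual generator: the dict layout and the names `AstPair` and `AstExpr` are hypothetical stand-ins for a parsed YAML entry.

```python
# Hypothetical stand-in for one parsed YAML entry; the real schema used by
# the project's .yaml files is not shown here and may differ.
DECLARATION = {
    "name": "AstPair",
    "fields": [("car", "struct AstExpr *"), ("cdr", "struct AstExpr *")],
}


def render_struct(decl):
    """Return C source for a typedef'd struct built from a declaration dict."""
    name = decl["name"]
    lines = [f"typedef struct {name} {{"]
    for field_name, c_type in decl["fields"]:
        # Pointer types already end in '*', so no space is needed before the name.
        sep = "" if c_type.endswith("*") else " "
        lines.append(f"    {c_type}{sep}{field_name};")
    lines.append(f"}} {name};")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_struct(DECLARATION))
```

The point of the sketch is only that a few lines of declaration expand mechanically into C source, so the struct and its typedef never have to be written by hand.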

For example, `ast.yaml` contains the declarations for the abstract
syntax tree generated by the parser. A Python script, [makeAST.py](../tools/makeAST.py), is given each of those YAML files
and generates the same set of `.c` and `.h` files for each. Continuing with the `ast.yaml` example, the following files
are generated from it:
* `tmp/ast.c` a number of different functions for each structure:
  * `new<struct>()` functions that allocate memory and populate the allocated structs with argument values.
  * `copy<struct>()` functions that will make a deep copy of the struct.
  * `push<struct>()` functions that will push data onto any declared 1-dimensional arrays.
  * `mark<struct>()` functions that will recursively mark the structures as part of garbage collection.
  * a generic `mark` function that will switch on the type and call the correct `mark<struct>()` function.
  * `free<struct>()` functions that will release unused memory when requested by the garbage collection system.
  * a generic `free` function that dispatches to the correct `free<struct>()` function.
  * a `typename` function that will return the name of a struct for debugging etc.
* `tmp/ast_debug.c` debugging utilities, namely:
  * `print<struct>()` functions that will recursively display a representation of the struct for debugging.
  * `eq<struct>()` functions that perform deep comparisons for testing and debugging.
* `tmp/ast_debug.h` header for `ast_debug.c`.
* `tmp/ast.h` header for `ast.c`, which includes the structure declarations themselves.
* `tmp/ast_objtypes.h` macros collecting the enums and case statements that can then easily be incorporated into the memory management system.
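
As a rough illustration of the kind of boilerplate listed above, the sketch below emits a `new<struct>()` constructor and a generic mark dispatcher as C text. It is not the output of `makeAST.py`: the C names `ALLOCATE`, `Header`, `markObj` and the `OBJTYPE_*` constants, as well as the example struct names, are assumptions made purely for illustration.

```python
# All C names below (ALLOCATE, Header, markObj, OBJTYPE_*) are illustrative
# assumptions, not the project's real memory-management API.

def render_new(name, fields):
    """Emit a new<struct>() constructor that allocates and populates a struct."""
    params = ", ".join(f"{c_type} {fname}" for fname, c_type in fields)
    body = "\n".join(f"    x->{fname} = {fname};" for fname, _ in fields)
    return (
        f"{name} *new{name}({params}) {{\n"
        f"    {name} *x = ALLOCATE({name}, OBJTYPE_{name.upper()});\n"
        f"{body}\n"
        f"    return x;\n"
        f"}}"
    )


def render_mark_dispatch(names):
    """Emit a generic mark function that switches on the object type."""
    cases = "\n".join(
        f"        case OBJTYPE_{n.upper()}:\n"
        f"            mark{n}(({n} *) h);\n"
        f"            break;"
        for n in names
    )
    return (
        "void markObj(Header *h) {\n"
        "    switch (h->type) {\n"
        f"{cases}\n"
        "    }\n"
        "}"
    )


if __name__ == "__main__":
    # Hypothetical struct with two pointer fields.
    print(render_new("AstPair", [("car", "AstExpr *"), ("cdr", "AstExpr *")]))
    print(render_mark_dispatch(["AstPair", "AstExpr"]))
```

Adding a struct to the declarations then means one more entry in the input lists, and every per-struct function and dispatch case follows automatically.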

This all means that it's relatively easy to make fairly sweeping changes to the various trees without all the
tedious re-writing of the above.