Serializable VMs #3

jamespfennell · 2023-06-10T14:14:57Z

This is the first commit in the serializable VMs project, which is a biggish project to make the Texcraft VM serializable. The main reason this isn't trivial in the usual Rust way is that the VM contains function pointers (of primitives), so you can't simply derive serde's Serialize and Deserialize for it. This commit adds (or fixes) some of the initial infrastructure (e.g., the command key type) and adds support for serializing primitives only. Support for other commands will come later. One of the main additions here is a testing utility that makes it easy to write unit tests for serde code.
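To illustrate the function pointer problem, here is a minimal sketch of the general key-based approach, using only the standard library. It is not the Texcraft implementation: `Vm`, `Command`, `built_ins`, `to_serializable`, and `from_serializable` are hypothetical names, and the assumption is that each primitive carries a stable key that can be written out in place of its pointer and looked up again when loading.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the VM state that primitives mutate.
#[derive(Default)]
struct Vm {
    counter: i32,
}

/// Primitives are function pointers. The address has no meaning outside
/// the running process, so it cannot be written to disk directly.
type PrimitiveFn = fn(&mut Vm);

fn advance(vm: &mut Vm) {
    vm.counter += 1;
}

fn reset(vm: &mut Vm) {
    vm.counter = 0;
}

/// A command pairs the function pointer with a stable key (assumed design).
#[derive(Clone, Copy)]
struct Command {
    key: &'static str,
    func: PrimitiveFn,
}

/// Built-in primitives, compiled into every binary that loads the VM.
fn built_ins() -> Vec<Command> {
    vec![
        Command { key: "advance", func: advance },
        Command { key: "reset", func: reset },
    ]
}

/// "Serialize": keep (control sequence name, key) pairs and drop the pointers.
fn to_serializable(commands: &HashMap<String, Command>) -> Vec<(String, String)> {
    let mut out: Vec<(String, String)> = commands
        .iter()
        .map(|(name, cmd)| (name.clone(), cmd.key.to_string()))
        .collect();
    out.sort(); // deterministic output regardless of hash order
    out
}

/// "Deserialize": look each key up in the built-ins to recover the pointer.
/// Fails if the serialized data names a primitive this binary does not have.
fn from_serializable(pairs: &[(String, String)]) -> Result<HashMap<String, Command>, String> {
    let registry: HashMap<&str, Command> =
        built_ins().into_iter().map(|c| (c.key, c)).collect();
    pairs
        .iter()
        .map(|(name, key)| {
            registry
                .get(key.as_str())
                .map(|cmd| (name.clone(), *cmd))
                .ok_or_else(|| format!("unknown primitive key: {}", key))
        })
        .collect()
}
```

The pairs returned by `to_serializable` contain only strings, so any serde format can handle them. The cost is that the loading binary must register the same primitives under the same keys.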

One bug with this (de)serializer is that the registers array overflows the stack, I'm guessing because the call stack gets deeper than with serde_json. I put the array behind a Box, which is unfortunate because there's a small runtime cost to that. Maybe in the future there will be a better solution. I also put all VMs behind boxes. I thought this might solve the stack overflow, but it didn't. Nonetheless it's worth doing because the VM can be big. After this commit all serde unit tests run with both JSON and MessagePack.
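A minimal sketch of putting a large fixed-size array behind a Box without routing it through the stack. The `Registers` type and the `NUM_REGISTERS` value are assumptions for illustration and may not match Texlang's actual definitions.

```rust
use std::convert::TryInto;

/// Hypothetical register count; the real value in Texlang may differ.
const NUM_REGISTERS: usize = 32768;

/// Hypothetical container for integer registers.
struct Registers {
    values: Box<[i32; NUM_REGISTERS]>,
}

impl Registers {
    fn new() -> Self {
        // Allocate on the heap via Vec, then convert to a boxed array.
        // `Box::new([0; NUM_REGISTERS])` may build the array on the stack
        // first (notably in debug builds) before moving it to the heap.
        let boxed_slice: Box<[i32]> = vec![0; NUM_REGISTERS].into_boxed_slice();
        let values: Box<[i32; NUM_REGISTERS]> = boxed_slice
            .try_into()
            .expect("length matches NUM_REGISTERS by construction");
        Registers { values }
    }
}
```

The runtime cost mentioned above is the extra pointer indirection on each register access. In exchange, moving a `Registers` value copies a pointer rather than the whole array.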

jamespfennell · 2023-06-21T23:30:59Z

For benchmarking, it would be awesome to time (de)serializing a VM that has loaded the Plain TeX format. It may take a lot of work though before Texlang can read the Plain TeX format.

To do this we need to support (de)serializing the save stack. This is a bit of a pain because the save stack contains function pointers that need to be cross-referenced with what's in the commands map. In the end the code touches the commands map, the variables API, and the serde module, so there are more pub(crate)s than I would like. I tried to refactor things to minimize cross-module dependencies. But it works!

jamespfennell · 2023-06-22T03:15:14Z

Some perf improvement ideas I had:

- In general, support serializing from and deserializing into iterators. In a bunch of places I create a new data structure (like a new instance of a map) and then serialize that. We could probably skip this intermediate step, making (de)serialization faster and less memory intensive. Serialization should be trivial; deserialization may be tricky. Given serde's API, it will be impossible to actually deserialize to an iterator.
- When serializing the cat code map, don't serialize values that are the same as a baseline such as the INITEX defaults. When deserializing, initialize to the INITEX defaults and then apply the differences on top.
- Serialize cat codes as integers, irrespective of the format.
- For registers, serialize contiguous runs of 0s as something like `0<number of zeros>`. In many serde contexts it is expected that registers mostly have their default values, so this will be much faster and more space efficient. I had a fancier idea of dividing a vector into blocks of the form `<number of non-zero values><number of zeros><non-zero values>`, which I think is provably more space efficient in all cases. If a block starts with 0, it means the vector is over.
- In the CS name interner, the ends vector is increasing. We should serialize the diffs between adjacent elements instead of the elements themselves. For formats with varint encoding, this will be more space efficient.
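Two of these ideas can be sketched with the standard library alone: the simple `0<number of zeros>` run encoding for registers, and delta encoding for an increasing vector such as the interner's ends. The function names are hypothetical, the code is independent of serde, and the element types (`i32` registers, `u32` ends) are assumptions.

```rust
/// Encode each contiguous run of zeros as the pair `0, <run length>`.
/// Non-zero values pass through unchanged.
/// Assumes run lengths fit in an i32.
fn encode_zero_runs(values: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < values.len() {
        if values[i] == 0 {
            let start = i;
            while i < values.len() && values[i] == 0 {
                i += 1;
            }
            out.push(0);
            out.push((i - start) as i32);
        } else {
            out.push(values[i]);
            i += 1;
        }
    }
    out
}

/// Inverse of `encode_zero_runs`. Returns None on malformed input:
/// a 0 marker that is not followed by a positive run length.
fn decode_zero_runs(encoded: &[i32]) -> Option<Vec<i32>> {
    let mut out = Vec::new();
    let mut iter = encoded.iter().copied();
    while let Some(v) = iter.next() {
        if v == 0 {
            let run = iter.next()?;
            if run <= 0 {
                return None;
            }
            out.resize(out.len() + run as usize, 0);
        } else {
            out.push(v);
        }
    }
    Some(out)
}

/// Replace each element with its difference from the previous element.
/// Assumes `ends` is non-decreasing; otherwise the subtraction underflows.
fn delta_encode(ends: &[u32]) -> Vec<u32> {
    let mut prev = 0;
    ends.iter()
        .map(|&e| {
            let d = e - prev;
            prev = e;
            d
        })
        .collect()
}

/// Inverse of `delta_encode`: a running sum of the differences.
fn delta_decode(diffs: &[u32]) -> Vec<u32> {
    let mut acc = 0;
    diffs
        .iter()
        .map(|&d| {
            acc += d;
            acc
        })
        .collect()
}
```

With this simple scheme an isolated zero takes two slots instead of one, so it only pays off when zeros come in runs, which is the expected case for mostly-default registers. Delta encoding does not shrink the element count; it makes the numbers smaller, which helps only in formats that use variable-length integers.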

jamespfennell added a commit that referenced this issue Jun 12, 2023

Reimplement dynamic memory allocation (#3)

49fd0e9

This is to make the dynamic memory allocation system work with serializable VMs.

jamespfennell added a commit that referenced this issue Jun 13, 2023

Support (de)serializing registers (#3)

80c61c9

jamespfennell added a commit that referenced this issue Jun 17, 2023

Implement \dump and support loading format files (#3)

bd891a3

jamespfennell added a commit that referenced this issue Jun 17, 2023

Some save stack and math command improvements (#3)

03d4e2c

jamespfennell added a commit that referenced this issue Jun 21, 2023

Make the bincode dep optional (#3)

90a185e

This sneaked in in the previous commit.

jamespfennell mentioned this issue Jun 22, 2023

Support loading the Plain TeX format #5

Open

23 tasks
