From 2eb2d636650bb8ef4508253b01d388126d4cc0f3 Mon Sep 17 00:00:00 2001
From: ysthakur <45539777+ysthakur@users.noreply.github.com>
Date: Sun, 15 Dec 2024 16:30:55 -0500
Subject: [PATCH] Add proper README

---
 README.md | 55 +++++++++++++++++++------------------------------------
 1 file changed, 19 insertions(+), 36 deletions(-)

diff --git a/README.md b/README.md
index 5b57db9..576a528 100644
--- a/README.md
+++ b/README.md
@@ -1,36 +1,19 @@
-# fred
-
-CMSC499 project
-
-## Compiling to C
-
-A type like
-```
-data Foo
-  = Bar {
-    mut foo: Foo,
-    fred: Fred,
-    common: str,
-    notcommon: str
-  }
-  | Baz {
-    blech: str,
-    mut gah: int,
-    common: str,
-    notcommon: int
-  }
-```
-would be compiled to the following:
-```c
-enum Foo_kind { Bar_tag, Baz_tag };
-struct Foo {
-  enum Foo_kind kind;
-  char* common;
-  union {
-    struct { struct Foo* foo_Bar; struct Fred* fred_Bar; char* notcommon_Bar; };
-    struct { char* blech_Baz; int gah_Baz; int notcommon_Baz; };
-  };
-};
-```
-
-Every field has the name of its variant added to it, so that multiple variants can have fields with the same name (e.g. `notcommon`). But fields that have the same type in all variants are put outside the union, e.g., `common`, which has a type of `str` in all variants.
+# Fred
+
+Fred is a programming language made to explore a compiler optimization for lazy mark scan, which is an algorithm for collecting cycles when doing reference counting. See https://github.com/ysthakur/fred/blob/main/writeup/writeup.pdf for more information.
+
+With the lazy mark scan algorithm, whenever an object's reference count is decremented but doesn't hit 0, the object is added to a list of potential cyclic roots (PCRs). Every once in a while, you go through these PCRs as a group and perform trial deletion to get rid of cycles. The problem is that this scans every single object reachable from any of these PCRs, so I worked on a way to reduce that scanning using type information.
+
+To do this, you can first partition the graph of types (with an edge from type `A` to type `B` whenever `A` can hold a reference to a `B`) into its [strongly-connected components](https://en.wikipedia.org/wiki/Strongly_connected_component) (SCCs). If you have two objects `a` and `b` of types `A` and `B` respectively, and `A` and `B` are not in the same SCC, then `a` and `b` cannot possibly be part of the same cycle. This is the fact the whole optimization rests on: cycles can only form among objects whose types all belong to one SCC.
+
+Now that you have these SCCs, you don't need to maintain a flat list of PCRs. Instead, you can maintain a list of PCR buckets, with each bucket holding the PCRs from one SCC (bucket 0 holds all the PCRs from SCC 0, bucket 1 all the PCRs from SCC 1, and so on). When processing PCRs, you process one bucket at a time rather than the entire list, which makes the algorithm more incremental. That's not the main improvement, though.
+
+The main improvement is that when processing a PCR from, say, SCC 5, any references it has to objects outside of SCC 5 don't need to be scanned at all. This does require processing the buckets in the order of their SCCs: the SCCs are sorted topologically, so types from SCC 2 can only have references to types from SCC 2, 3, and so on, never to types from SCC 0 or 1.
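+
+To make this concrete, here is a rough sketch in C of what the per-SCC buckets could look like. All of the names (`Obj`, `record_pcr`, `mark_gray`, `process_buckets`, `NUM_SCCS`, `MAX_FIELDS`) are made up for illustration; this is not the code the Fred compiler actually generates.
+
+```c
+#include <stddef.h>
+
+/* Illustrative object header: every heap object carries a reference
+ * count and the index of the SCC its *type* belongs to. */
+#define MAX_FIELDS 4
+#define NUM_SCCS 8                      /* assumed known from the type graph */
+
+typedef struct Obj {
+    int rc;                             /* reference count */
+    int scc;                            /* SCC index of this object's type */
+    int added_pcr;                      /* already recorded as a PCR? */
+    struct Obj *next_pcr;               /* intrusive link for its PCR bucket */
+    struct Obj *fields[MAX_FIELDS];     /* outgoing references (may be NULL) */
+} Obj;
+
+/* One PCR bucket per SCC instead of a single flat PCR list.
+ * Bucket indices follow the topological order of the SCCs. */
+static Obj *pcr_buckets[NUM_SCCS];
+
+/* A decrement that doesn't reach zero files the object under the
+ * bucket for its type's SCC. */
+void record_pcr(Obj *o) {
+    if (!o->added_pcr) {
+        o->added_pcr = 1;
+        o->next_pcr = pcr_buckets[o->scc];
+        pcr_buckets[o->scc] = o;
+    }
+}
+
+/* Stand-in for the first phase of trial deletion: trially undo the
+ * contribution of internal references, but only follow references whose
+ * target is in the same SCC. A cycle can never cross SCC boundaries,
+ * so out-of-SCC objects are never scanned. */
+static void mark_gray(Obj *o, int scc) {
+    for (int i = 0; i < MAX_FIELDS; i++) {
+        Obj *c = o->fields[i];
+        if (c == NULL || c->scc != scc)
+            continue;                   /* different SCC: skip entirely */
+        c->rc--;                        /* trial decrement */
+        /* a full implementation would recurse here, guarded by a color bit */
+    }
+}
+
+/* Process one bucket at a time, in topological order of the SCCs. */
+void process_buckets(void) {
+    for (int scc = 0; scc < NUM_SCCS; scc++) {
+        for (Obj *o = pcr_buckets[scc]; o != NULL; o = o->next_pcr) {
+            o->added_pcr = 0;
+            mark_gray(o, scc);
+            /* ...the scan and collect phases would follow, also SCC-local... */
+        }
+        pcr_buckets[scc] = NULL;
+    }
+}
+```
+
+The parts that matter are the bucket index `o->scc` in `record_pcr` and the `c->scc != scc` check in `mark_gray`: PCRs get filed by their type's SCC, and scanning never follows a reference that leaves the SCC currently being processed.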
+
+This has probably been done already, so if you come across such a paper or project, please let me know; I'd be very interested.
+
+## Why name it Fred?
+
+When I was young, I had a cute little hamster called Freddie Krueger, so named because of the hamster-sized striped red sweater my grandmother had knitted for him, as well as his proclivity for murdering small children. In his spare time, Fred would exercise on his hamster wheel, or as he liked to call it, his Hamster Cycle.
+
+But one day, I came home to find Fred lying on the hamster cycle, unresponsive. The vet said that he'd done too much running and had had a heart attack. I was devastated. It was then that I decided that, to exact my revenge on the cycle that killed Fred, I would kill all cycles.