Commit
Add proper README
ysthakur committed Dec 15, 2024
1 parent ae4e107 commit 2eb2d63
Showing 1 changed file with 19 additions and 36 deletions.
55 changes: 19 additions & 36 deletions README.md
@@ -1,36 +1,19 @@
# fred

CMSC499 project

## Compiling to C

A type like
```
data Foo
= Bar {
mut foo: Foo,
fred: Fred,
common: str,
notcommon: str
}
| Baz {
blech: str,
mut gah: int,
common: str,
notcommon: int
}
```
would be compiled to the following:
```c
enum Foo_kind { Bar_tag, Baz_tag };
struct Foo {
enum Foo_kind kind;
char* common;
union {
struct { struct Foo* foo_Bar; struct Fred* fred_Bar; char* notcommon_Bar; };
struct { char* blech_Baz; int gah_Baz; int notcommon_Baz; };
};
};
```

Every field has the name of its variant appended to it, so that multiple variants can have fields with the same name (e.g. `notcommon`, which is a `str` in `Bar` but an `int` in `Baz`). Fields that have the same type in all variants, such as `common` (a `str` in both), are hoisted out of the union so they can be accessed without checking the tag.
# Fred

Fred is a programming language made to explore a compiler optimization for lazy mark scan, which is an algorithm for collecting cycles when doing reference counting. See [the writeup](https://github.com/ysthakur/fred/blob/main/writeup/writeup.pdf) for more information.

With the lazy mark scan algorithm, whenever an object's reference count is decremented but doesn't hit 0, the object is added to a list of potential cyclic roots (PCRs). Every once in a while, you go through these PCRs as a group and perform trial deletion to get rid of cycles. The problem is that you need to scan every single object reachable from any of these PCRs, so I worked on a way to reduce this scanning using type information.

To do this, you can first partition the graph of types into its [strongly-connected components](https://en.wikipedia.org/wiki/Strongly_connected_component) (SCCs). If you have two objects `a` and `b` of types `A` and `B` respectively, and `A` and `B` are not in the same SCC, then `a` and `b` cannot possibly form a cycle. This is the key fact the optimization exploits.

Now that you have these SCCs, you don't need to maintain a flat list of PCRs. Instead, you can maintain a list of PCR buckets, with each bucket containing PCRs from a different SCC (bucket 0 contains all the PCRs from SCC 0, bucket 1 contains all the PCRs from SCC 1, ...). When processing the PCRs, you process one bucket at a time, rather than the entire list of PCRs. This makes the algorithm more incremental. That's not the main improvement, though.

The main improvement is that when processing a PCR from, say, SCC 5, you don't need to scan any objects it references outside of SCC 5. This does require processing the buckets in order, according to the SCC they're for. SCCs are sorted topologically, so types from SCC 2 can only have references to types from SCC 2, 3, and so on; they can't have references to types from SCC 0 or 1.

This has probably been done already, so if you come across such a paper or project, please let me know; I'd be very interested.

## Why name it Fred?

When I was young, I had a cute little hamster called Freddie Krueger, so named because of the hamster-sized striped red sweater my grandmother had knitted for him, as well as his proclivity for murdering small children. In his spare time, Fred would exercise on his hamster wheel, or as he liked to call it, his Hamster Cycle.

But one day, I came home to find Fred lying on the hamster cycle, unresponsive. The vet said that he'd done too much running and had had a heart attack. I was devastated. It was then that I decided that, to exact my revenge on the cycle that killed Fred, I would kill all cycles.
