writeup: Added bib and mentioned quad scan problem
ysthakur committed Dec 11, 2024
1 parent 6196d5d commit b152a0d
Showing 4 changed files with 167 additions and 0 deletions.
Binary file added docs/img/quadratic_scanning_example.png
56 changes: 56 additions & 0 deletions docs/writeup-bib.bib
@@ -0,0 +1,56 @@

@article{local_mark_scan,
title = {Cyclic reference counting with local mark-scan},
volume = {34},
issn = {0020-0190},
url = {https://www.sciencedirect.com/science/article/pii/002001909090226N},
doi = {10.1016/0020-0190(90)90226-N},
number = {1},
urldate = {2024-09-15},
journal = {Information Processing Letters},
author = {Martínez, Alejandro D. and Wachenchauzer, Rosita and Lins, Rafael D.},
month = feb,
year = {1990},
keywords = {garbage collection, reference counting, analysis of algorithms, Compilers, functional programming},
pages = {31--35},
}


@article{lazy_mark_scan,
title = {Cyclic {Reference} {Counting} {With} {Lazy} {Mark}-{Scan}},
volume = {44},
issn = {0020-0190},
url = {http://dx.doi.org/10.1016/0020-0190(92)90088-D},
doi = {10.1016/0020-0190(92)90088-D},
abstract = {this paper is much higher than the original one for cyclic reference counting with local mark-scan. More shared cells will now be claimed directly, without any need for mark-scan. The deletion of the last pointer to a shared cell will recycle it immediately, regardless of whether there is a reference to it on the queue. The queue will be left basically with pointers to cycles and pointers to green cells in the free-list or recycled. In this case again, our algorithm performs far better than the original one. In the best case, only one local mark-scan will be performed per cycle, instead of as many as the number of external references to a cycle, as before.},
language = {en},
number = {4},
urldate = {2024-09-18},
journal = {Information Processing Letters},
author = {Lins, Rafael D.},
month = dec,
year = {1992},
pages = {215--220},
}

@article{morris_chang_cyclic_2012,
series = {{SMALLTALKS} 2010},
title = {Cyclic reference counting by typed reference fields},
volume = {38},
issn = {1477-8424},
url = {https://www.sciencedirect.com/science/article/pii/S1477842411000285},
doi = {10.1016/j.cl.2011.09.001},
abstract = {Reference counting strategy is a natural choice for real-time garbage collection, but the cycle collection phase which is required to ensure the correctness for reference counting algorithms can introduce heavy scanning overheads. This degrades the efficiency and inflates the pause time required for garbage collection. In this paper, we present two schemes to improve the efficiency of reference counting algorithms. First, in order to make better use of the semantics of a given program, we introduce a novel classification model to predict the behavior of objects precisely. Second, in order to reduce the scanning overheads, we propose an enhancement for cyclic reference counting algorithms by utilizing strongly-typed reference features of the Java language. We implement our proposed algorithm in Jikes RVM and measure the performance over various Java benchmarks. Our results show that the number of scanned objects can be reduced by an average of 37.9\% during cycle collection phase.},
number = {1},
urldate = {2024-09-13},
journal = {Computer Languages, Systems \& Structures},
author = {Morris Chang, J. and Chen, Wei-Mei and Griffin, Paul A. and Cheng, Ho-Yuan},
month = apr,
year = {2012},
keywords = {reference counting, Garbage collection, Java, Memory management},
pages = {98--107},
}
Binary file added docs/writeup.pdf
111 changes: 111 additions & 0 deletions docs/writeup.typ
@@ -0,0 +1,111 @@
= Introduction

One major problem with reference counting is the fact that it cannot free objects that are involved in reference cycles. Lazy mark scan is a cyclic reference counting algorithm that aims to fix this issue, but it requires traversing all objects reachable from any potential cyclic roots @lazy_mark_scan. This write-up describes a way to reduce the number of objects traversed by applying information about types known at compile-time. My project involved creating a language #smallcaps[fred] and implementing a runtime for it that takes advantage of this optimization.

= Background

Leo, you can probably skip or skim these first couple of sections; they mostly exist for future me.

== Lazy mark scan

Lazy mark scan is a lazy version of an algorithm called local mark scan.

With local mark scan, whenever an object's reference count is decremented but does not hit 0, all of the objects reachable from that object are scanned recursively; if it turns out that any of them were part of a cycle and are now unreachable, they are freed @local_mark_scan.

Since this process is expensive, lazy mark scan merely adds each such object to a list of *potential cyclic roots* (PCRs). Every once in a while, this list is traversed and mark scan is performed on all of the PCRs at once. Note that the PCRs are not simply scanned individually, in sequence, because that would be essentially the same as local mark scan. Rather, each phase of the mark scan algorithm is performed on all PCRs before moving on to the next phase.
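To make the phases concrete, here is a minimal sketch of lazy mark scan in Python. The names (`Obj`, `process_pcrs`, etc.) are my own illustrative choices, not #smallcaps[fred]'s actual runtime, and the real algorithm in the papers has more bookkeeping; this only shows the three-phase structure:

```python
# A minimal sketch of lazy mark scan. rc is the reference count;
# children are the outgoing references.

class Obj:
    def __init__(self, rc=0, children=None):
        self.rc = rc
        self.children = children if children is not None else []
        self.color = "black"

def mark_gray(obj):
    # Phase 1: tentatively "delete" obj's outgoing references.
    if obj.color != "gray":
        obj.color = "gray"
        for child in obj.children:
            child.rc -= 1
            mark_gray(child)

def scan(obj):
    # Phase 2: anything still externally referenced gets restored;
    # everything else is provisionally garbage (white).
    if obj.color == "gray":
        if obj.rc > 0:
            scan_black(obj)
        else:
            obj.color = "white"
            for child in obj.children:
                scan(child)

def scan_black(obj):
    # Undo the decrements for a subgraph that turned out to be live.
    obj.color = "black"
    for child in obj.children:
        child.rc += 1
        if child.color != "black":
            scan_black(child)

def collect_white(obj, freed):
    # Phase 3: white objects are unreachable cycle members.
    if obj.color == "white":
        obj.color = "black"
        freed.append(obj)
        for child in obj.children:
            collect_white(child, freed)

def process_pcrs(pcrs):
    # Each phase runs over ALL PCRs before the next phase starts.
    freed = []
    for obj in pcrs:
        mark_gray(obj)
    for obj in pcrs:
        scan(obj)
    for obj in pcrs:
        collect_white(obj, freed)
    return freed
```

Running the phases back-to-back on one PCR at a time would collapse this back into local mark scan, which is exactly what the batching avoids.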

This still requires scanning a lot of objects. There are many techniques for improving reference counting performance significantly, and I ran into a number of articles about them. However, I only found a couple that make optimizations based on compile-time information. This is possibly because TODO

== Avoiding scanning children based on type

In a statically typed language, some guarantees can be made about whether or not objects of one type can ever form cycles with objects of another type. At runtime, this lets us reduce the amount of scanning we do.

Fewer guarantees can be made if the type system in question includes subtyping or similar features, but this project only looks at a very simple language, with no subtyping, polymorphism, dependent types, or other bells and whistles.

#smallcaps[fred]'s user-defined types are algebraic data types: tagged unions of product types. And rather than assuming every field is mutable, fields need to be marked mutable explicitly. Immutability isn't central to my project, but it does give us some extra knowledge to avoid more scanning.

This makes it easy to represent all the types in a program as a directed graph where the nodes are types. The fields inside every type can be represented as edges going from that type to the type of the field.

Now that we have a graph of types, we can see that two objects `a` and `b` (not necessarily distinct) of types `A` and `B`, respectively, can only form a cycle if:
- `A` and `B` form a cycle,
- AND somewhere along the path from `A` to `B` or `B` to `A`, there's a mutable field.
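As a sketch of how the compiler-side check might look (the function names here are hypothetical, not #smallcaps[fred]'s actual compiler code), the condition above can be tested directly on the type graph, where each edge records the field's target type and whether the field is mutable:

```python
# Type graph: maps a type name to a list of (field_type, is_mutable) edges.

def reaches(graph, src, dst):
    # Is there a path of one or more field edges from src to dst?
    seen = set()
    stack = [child for child, _ in graph.get(src, [])]
    while stack:
        t = stack.pop()
        if t == dst:
            return True
        if t in seen:
            continue
        seen.add(t)
        stack.extend(child for child, _ in graph.get(t, []))
    return False

def has_mut_path(graph, src, dst):
    # Does some src -> dst path cross at least one mutable edge?
    for t, edges in graph.items():
        for child, is_mut in edges:
            if is_mut and (t == src or reaches(graph, src, t)) \
                      and (child == dst or reaches(graph, child, dst)):
                return True
    return False

def can_cycle(graph, a, b):
    # Objects of types a and b (not necessarily distinct) can only form a
    # cycle if the types lie on a common cycle in the graph that crosses
    # at least one mutable field.
    return (reaches(graph, a, b) and reaches(graph, b, a)
            and (has_mut_path(graph, a, b) or has_mut_path(graph, b, a)))
```

Since the type graph is small and fixed at compile time, even this naive quadratic reachability check is cheap; the results can be baked into tables the runtime consults.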

Although I may be lazy, #smallcaps[fred] is not, and so there is currently no way to create cycles using only immutable fields: because evaluation is strict, an object's immutable fields can only point to objects that already existed when it was constructed, so every immutable reference points "backwards in time" and can never close a cycle. I don't feel like proving this formally or Googling for existing proofs of it.

Now that we know that certain objects cannot form cycles with certain other objects, we can apply this knowledge at runtime. When recursively scanning the objects reachable from a PCR, every time we come across some object, we can avoid scanning those of its children that can never form a cycle with that object (based on their types).
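At runtime this can be as simple as a per-type table, precomputed by the compiler, saying which fields to skip. A hypothetical sketch (illustrative names, not #smallcaps[fred]'s actual runtime representation):

```python
# innocent maps (parent_type, field_index) -> True when objects reached
# through that field can never form a cycle with the parent.

class Obj:
    def __init__(self, type_name, children=()):
        self.type_name = type_name
        self.children = list(children)

def scannable_children(obj, innocent):
    # Yield only the children whose edges might participate in a cycle;
    # the mark-scan traversals iterate over these instead of obj.children.
    for i, child in enumerate(obj.children):
        if not innocent.get((obj.type_name, i), False):
            yield child
```

The reference counts themselves are still updated through every field as usual; only the cycle-detection traversal gets to skip edges.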

== Quadratic scanning problem

However, if done naively, this can result in garbage not being found in a single sweep of the list of PCRs @morris_chang_cyclic_2012. A quick fix for this would be to go over the list of PCRs multiple times until all garbage is gone, but this makes cycle collection quadratic in the number of objects.

Below, I will give some example code that triggers this problem. Suppose you are creating a compiler and you have the following types. You can have `Context -> FileList -> Context` cycles, as well as `File -> ExprList -> Expr -> File` cycles.

```haskell
data Context = Context {
  name: str,
  mut files: FileList
}
data FileList
  = FileNil {}
  | FileCons {
      ctx: Context,
      head: File,
      tail: FileList,
    }
data File = File {
  mut exprs: ExprList
}
data ExprList
  = ExprNil {}
  | ExprCons {
      head: Expr,
      tail: ExprList
    }
data Expr = Expr {
  file: File,
  // other stuff here
}
```

And for whatever reason, you have the following code that creates a context with a file that contains one expression:
```rust
let ctx = Context { name: "foo", files: FileNil {} } in
let file = File { exprs: ExprNil {} } in
let expr = Expr { file: file } in
ctx.files = FileCons { ctx: ctx, head: file, tail: ctx.files };
file.exprs = ExprCons { head: expr, tail: file.exprs }
```

After running that code, this is what the graph of objects looks like:

#align(center)[
#image("img/quadratic_scanning_example.png", height: 40%)
]

The green edges in the diagram above are references that are known not to introduce any cycles. Therefore, when doing mark-scan, we will not follow them (this is our modification from the previous section, not part of lazy mark scan). There are other references that don’t cause cycles in there, but we can't know this at compile-time. I'm going to call these green edges "innocent", because I don't know what sort of terms are actually used for them by real researchers.

At some point, the variables `ctx`, `file`, and `expr` will go out of scope, so the `Context`, `File`, and `Expr` objects will all have their refcounts decremented before being added to the list of PCRs. Every object in the diagram above has become garbage and is eagerly waiting to be freed, not knowing that rather than nirvana, all they will get is an endless cycle of rebirth and deallocation, at least until your computer stops working and you throw it away.

Let's trace what our naively modified lazy mark scan algorithm would do here:
- First, we go to every object reachable from the `Context`, `File`, and `Expr` objects (without traversing green edges) and mark it gray.
- The reference count of every object reachable from these gray objects is decremented, as if the gray objects have been deleted.
- After doing this, the `Context` and `FileList` objects have a refcount of 0. `File` has a refcount of 1, while the `Expr` and `ExprList` objects have a refcount of 0.
- TODO finish this

Now the stuff in the top cycle has been correctly marked as garbage, but not the stuff in the bottom cycle. `File` lives because it has a reference from the `FileCons` object, and it keeps the rest of the bottom cycle alive. All the scanning we did on the bottom cycle was in vain, because we'll have to go back and repeat it now.

Therefore, it is not enough to simply not traverse innocent edges. Fortunately, the solution to this is pretty simple.

== Fixing the quadratic scanning problem

You may notice above that because we don't traverse innocent edges, processing the `Context` object happens completely separately from processing the `File` and `Expr` objects. We could have processed the `Context` object first, found it to be garbage, decremented the `File` object's reference count, and then processed the `File` and `Expr` objects together. This way, all of the objects would have been marked as garbage without processing any PCR multiple times.

In general, how do we determine which objects should be processed before which others? We can decide this based on which *types* can reference which types (directly or indirectly). A `Context` object can refer to `File` and `Expr` objects, but not vice versa, so `Context` PCRs must be processed before them. `File` and `Expr` objects can refer to each other, so they must be processed together.
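"Must be processed together" is exactly the notion of a strongly connected component of the type graph, and "before" is topological order on those components. Here is a sketch of computing that order with Tarjan's algorithm (my framing; the writeup doesn't prescribe a specific SCC algorithm):

```python
# graph maps a type to the list of types its fields can point to.
# Returns the strongly connected components in topological order:
# a component appears before every component it can reference.

def sccs_in_topo_order(graph):
    index, low, on_stack = {}, {}, set()
    stack, out, counter = [], [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            out.append(comp)

    for v in graph:
        if v not in index:
            strongconnect(v)
    # Tarjan emits components in reverse topological order, so flip it.
    return list(reversed(out))
```

On the compiler example from earlier, this puts `Context`'s component ahead of the `File`/`Expr` component, which is exactly the processing order we wanted.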

#bibliography("writeup-bib.bib")

= Why #smallcaps[fred]?

I was going to name it Foo, but there's already an esolang by that name that's fairly well-known (by esolang standards). So I went to the Wikipedia page on metasyntactic variables and picked "fred." I figured that if I needed to, I could pretend it was something meaningful, like an acronym or the name of a beloved childhood pet.

