Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage collector #20

Open
CeleritasCelery opened this issue Jan 14, 2023 · 4 comments
Open

Garbage collector #20

CeleritasCelery opened this issue Jan 14, 2023 · 4 comments
Labels
design needed Items where more design help is needed

Comments

@CeleritasCelery
Copy link
Owner

The plan for the garbage collector is to be generational, copying collector. The old generation will use Immix style mark region collector, which Jon Harrop calls a breakthrough in gc design. I don't fully understand yet why it is so great, but the rust interpreter folks used immix for their project as well. Our current implementation already allows for copying object pointers via indirection, so we have the ability to implement this.

When to promote objects to the Major heap?

A typical generational GC will promote objects after the survive N minor heap collections, where N is usually 1. However we can take advantage of the execution patterns of Emacs. Commands in Emacs are usually run in short bursts with lots of waiting. Everytime you type a character or run a command, emacs will evaluate some code. The best time to run GC is after the command completes and the display has been updated. This is also the best time to promote objects, because anything that is still live will typically live for a long time. If the current command generates so much garbage that it fills the minor (nursery) heap, then we could use Chenny's Semi-space copying approach to remove dead objects, without promotion.

@CeleritasCelery CeleritasCelery added the design needed Items where more design help is needed label Jan 14, 2023
@gagbo
Copy link

gagbo commented Jan 23, 2023

Hello, I haven't followed everything closely in the project, but I thought that at some point you used to use manishearth garbage collector crate.

Why did you migrate? Did you make a post or a big commit explaining the reasons for migrations and what your current design is? I'm also wondering an extra thing: how are you able to trace elements across closure calls? In my lispy rust project, I had to go to great lengths to be able to trace values that are used in a closure, I wonder how you plan to go past the difficulty. At least for me and a few other people, the fact that Rust doesn't track the captured values used in a rust || {} closure means it's really hard to access the values owned by (moved into) the closure for tracing purposes

Happy to see you sticking with it though (I also hope to have a R7RS fully compliant interpreter/compiler by the next decade)

Also, thank you for that hosted-langs link, I started to feel alone when searching frantically around internet for my issues when implementing it. Looks like I will have to write my own allocator after all.

@CeleritasCelery
Copy link
Owner Author

but I thought that at some point you used to use manishearth garbage collector crate.

Why did you migrate? Did you make a post or a big commit explaining the reasons for migrations and what your current design is?

I never actually implemented anything with rust-gc, but I did look into every rust implementation I could find. I wrote a post about my final design. The reasons I did not go with rust-gc is firstly that it Gc values can only be immutable, meaning you need interior mutability for everything. But also it uses Rc under the hood.

Using Rc is a great choice for implementation simplicity and interoperability. If I was creating a “new” language then I would use Rc as well. But I want this implementation to be competitive with Emacs C core and tracing has a much higher performance ceiling compared to reference counting. I wouldn’t recommend my approach unless you need the performance, just use Rc. This is what’s done by many “big” languages hosted in rust like steel lisp and rust-python, so it is a good strategy.

I'm also wondering an extra thing: how are you able to trace elements across closure calls?

can you share some example code illustrating the problem? I am not sure I follow.

@gagbo
Copy link

gagbo commented Jan 24, 2023

can you share some example code illustrating the problem? I am not sure I follow.

Let's say you want to host this piece of code:

;; Lexical binding by default.
(setq captured '(atom1 atom2))

(defun scm-closure (foo)
  "Return non-nil if FOO is a nice value."
  (memq foo captured))

As long as scm-closure is reachable, captured is a reachable object too. So if your implementation has scm-closure rooted and you start tracing to mark all items, you need a way to trace back to captured. Maybe you don't have the issue at all in your design; when I tried to implement that for a scheme interpreter, I wanted to back scheme closures with rust closures. So my idea was:

  • captured is a SchemeValue you can root/trace
  • scm-closure is a SchemeValue that takes ownership of an extra reference of captured (it gets "moved" into the rust closure), which also happens to be a Rust closure so i can just do let result: SchemeValue = scm_closure(); in Rust

Therefore, if scm-closure is a root, it would trace the captured SchemeValue and save it from collection. The issue with this approach is that Rust doesn't give closures an API to access the list of values it owns, leading to this issue on rust-gc. You can see the kind of workaround I needed in the comments; as I didn't find where you implemented closures in rune yet, I was wondering if you had a solution in mind to be a little more ergonomic on Rust side of things

@CeleritasCelery
Copy link
Owner Author

Normally that code would not be a closure, because closures capture their lexical environment, and captured is global. However this would be a closure:

(let ((captured '(atom1 atom2)))
  (defun scm-closure (foo)
    "Return non-nil if FOO is a nice value."
    (memq foo captured)))

In that case I capture the whole lexical environment and store it as a list in the function object

rune/src/interpreter.rs

Lines 190 to 196 in a1e5983

let env = {
// TODO: remove temp vector
let env: Vec<_> = self.vars.iter().map(|x| x.bind(cx).into()).collect();
crate::fns::slice_into_list(env.as_slice(), Some(cons!(true; cx)), cx)
};
let end = cons!(env, cons.cdr(); cx);
Ok(cons!(sym::CLOSURE, end; cx))

Using cons cells is important here, because it means that multiple closures will share their closed-over lexical variables. When the GC traces the function object, it will trace the lexical environment as well.

My lisp closures don't map to Rust closures, so they are two different issues. As far as Rust goes, my roots are created with the root! macro, which pins a value on the stack. It does not expose an "owned" value, so you can't move a root into closures. You can move a reference, but not the actual root. This means that Rust closures are not a problem for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design needed Items where more design help is needed
Projects
None yet
Development

No branches or pull requests

2 participants