From 2d03e7bd016e28ae41eb96dd2828a0369b661549 Mon Sep 17 00:00:00 2001
From: Troy Hinckley
Date: Thu, 11 Jan 2024 17:14:42 -0600
Subject: [PATCH] Add more design doc about stack model

---
 design.org | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/design.org b/design.org
index b97c982c..c3b436cf 100644
--- a/design.org
+++ b/design.org
@@ -148,11 +148,15 @@ Reader macros are controversial. They enable some pretty amazing super powers (j
 
 * Details
 ** stack
-Traditional emacs uses recursion to implement calls, meaning that every function call will also push on the C stack. Meaning that having lisp eval depth go too far and you will crash emacs. That is why they limit it to 800 by default. It makes the implementation very simple because you can use the recursion to keep track of your stack frame. And you can just unwind your stack to unwind the lisp stack. However this also means you have to be careful to not stack overflow and it makes it hard to implement things like stackful coroutines. If you are using those (or elisp threads) you need to unwind the stack.
+GNU Emacs uses recursion to implement calls, meaning that every function call also pushes a frame on the C stack. If the Lisp eval depth goes too deep, Emacs will crash, which is why they limit it to 800 by default. This makes the implementation very simple because you can use the recursion to keep track of your stack frame, and you can just unwind the C stack to unwind the Lisp stack. However, this also means you have to be careful not to overflow the stack, and it makes it hard to implement things like stackful coroutines. If you are using those (or elisp threads) you need to unwind the stack.
 
 In Emacs when you enter the debugger in the middle of execution it will not unwind but keep the stack frames there so they can be resumed. Anything you run after that will be on top of the current stack. Emacs keeps information about the stack in a separate "specpdl" array so that it doesn't have to unwind to display backtraces.
 
-An alternate is to not use the C stack and explicitly store the frames and variables in an array. This makes it easier to enter and resume the debugger, but is complicated by builtin functions that call elisp, like mapcar. In mapcar, it always has to go through the C stack since it is defined in C. You would have to have some mechanism to save the state of these types of functions so that they can be resumed later. This is not a problem is you just use the C stack.
+An alternative is to not use the C stack and explicitly store the frames and variables in an array. This makes it easier to enter and resume the debugger, but is complicated by builtin functions that call elisp, like mapcar. mapcar always has to go through the C stack since it is defined in C. You would need some mechanism to save the state of these types of functions so that they can be resumed later. This is not a problem if you just use the C stack. You could use async to transform functions into state machines so that they could be suspended and resumed. But this is hard because you would need to box lots of futures, since most call stacks in this project are not statically known. [[https://github.com/kyren/piccolo][Piccolo]] is a Lua runtime that takes this "stackless" approach.
+
+We are going to try using the Rust stack approach because it is simpler. It should still allow us to do almost anything except implement stackful coroutines (and by extension async/await). We should still be able to do interactive debugging and reverse debugging. It really seems like a trade-off between interpreter simplicity and debugger simplicity. If you use the native Rust stack then the interpreter is easier to write and maintain, but the debugger is harder. This is because the debugger needs to open on any error (without unwinding the stack) and needs to be able to jump up stack frames while keeping its context.
+
+However, if you use a heap-allocated stack and make everything "stackless" from the Rust stack perspective, then your interpreter becomes harder because you can't let the stack implicitly hold state for you. Everything needs to be explicit. You also need to write everything in an ~async~ style (probably using ~async~ blocks directly). But the debugger becomes much easier because you can manipulate the call frames and stacks as array elements. Also you can no longer rely on stable memory addresses for stack elements, but that is less of a problem because you can just use indices. In some sense Emacs already has a "heap stack" in the form of the specpdl stack. Every call needs to push a new frame on there. And the only purpose of doing that is for displaying backtraces without unwinding.
 
 ** Storing data
 *** Buffer representation
@@ -197,7 +201,6 @@ Another thing to consider is that if codepoints are not meaningful boundaries, w
 
 If we didn't have to work with existing code, a better API would be to not expose "characters" as indexes, but instead provide a cursor API. This would let you seek forwards or backwards, but not jump to an arbitrary point.
 
-
 *** Pointer Tagging
 **** Tagged Arithmetic