Question about GC improvements and memory management #70
-
I remember reading these two Nim articles, and they really blew my mind: https://nim-lang.org/blog/2020/12/08/introducing-orc.html and https://nim-lang.org/docs/destructors.html. Do you think such an implementation, for example a destructor mechanism, could further simplify memory management for Nelua's records? As a programming language designer, @edubart, I would like to hear your thoughts on this topic.
-
First, as a Nim user for a good amount of time and a C++ user for more than a decade, I am very used to RAII, constructors/destructors, reference counting, and smart pointers. I was even addicted to abusing such mechanisms for a good part of my programming life, thus it was even among my original goals for Nelua to offer the following 3 memory management mechanisms:

1. Garbage collection
2. Manual memory management
3. Automatic reference counting
While developing Nelua I first made 1 and 2, and then I even worked on 3, which even shipped in Nelua master for some time (although experimental and undocumented). But at some point I decided to cut it out, because as you code and design such mechanisms you notice the amount of complexity they create, not just in the compiler and the language design, but also in the cognitive load they put on users and on the syntax. Plus the standard library would become complex and not that efficient in some places (try to read the C++ standard library, for example: do you find it readable?). In summary, the language would not be so simple anymore with such systems, thus not that pleasant to code in and not always efficient, and all of that diverges from Nelua's simplicity and efficiency goals.

Moreover, automatic reference counting is not magic, nor as efficient as some assume: it thrashes the CPU caches with reference count updates, so depending on your application a GC or manual memory management can be faster than reference counting. Which memory model to choose depends entirely on the application requirements; none of the 3 options is the best. The best always depends on your requirements: for some things you could use the GC, for others manual memory management would be best, and for some special cases reference counting makes sense, and that can still be done manually in Nelua (see the sketch below). It's just not in the language goals to provide means to do this automatically, because it would hurt some principles, as I found in my research.
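For instance, here is a minimal sketch of manual reference counting, assuming the `allocators.general` module used elsewhere in this thread (the `RcObject` record and its `ref`/`unref` helpers are illustrative, not a stdlib API):

```lua
require 'allocators.general'

local RcObject = @record{
  refcount: integer,
  x: integer,
}

-- acquire a new shared reference
function RcObject:ref(): *RcObject
  self.refcount = self.refcount + 1
  return self
end

-- release a reference, freeing the object when the last one is gone
function RcObject:unref()
  self.refcount = self.refcount - 1
  if self.refcount == 0 then
    general_allocator:delete(self)
  end
end

local function rcobject_create(x: integer): *RcObject
  local o: *RcObject = general_allocator:new(@RcObject)
  o.refcount = 1
  o.x = x
  return o
end

local a = rcobject_create(1337)
local b = a:ref()  -- share ownership, refcount is now 2
a:unref()          -- drop one reference, refcount back to 1
print(b.x)         -- the object is still alive through `b`
b:unref()          -- refcount hits 0, the object is deleted
```

Note that every one of those count updates is a memory write, which is exactly the kind of cache traffic an automatic version would generate behind your back.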
There is also a 4th way, which is what I currently aim for in the future of Nelua in terms of better memory management: never allocate in the first place. It's the most efficient and logical approach for me today, it can be easier and faster than manual memory management, and it's the most logical when you think about how your hardware works. Nelua already has some allocators for this in the standard library, but their design is not finished yet, so people should stick with the GC or manual memory management at this moment.

The best memory management mechanism is to never allocate in the first place: if you design your application with well thought out data structures, custom allocators, and everything preallocated, you never need to allocate or free. I plan to demo how to do this with Nelua in the future; I already have some in-house games in Nelua not doing any allocation, just using custom allocators, handles, and fixed buffers. In this design there is no cost from GC or reference counting, and leaks are impossible. The custom allocator can have data locality, which is even better for the CPU cache and efficiency. The code complexity is way lower than an ownership or reference counting system would be, in my opinion, and it's simpler than doing manual memory management, because you can't have leaks if you never allocate, and you can't have dangling pointers (use after free) if you use generational handles (see the sketch below). This is a nice way software can be designed, in my opinion, while maintaining simplicity, correctness, and efficiency.

Finally, I will share some articles on the topic that I feel reflect similar thoughts, for further reading: https://www.gingerbill.org/article/2019/02/01/memory-allocation-strategies-001/
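To make this concrete, here is a minimal sketch of a fixed pool with generational handles (the `Pool` and `Handle` records are illustrative, not a stdlib API): everything is preallocated, so there is nothing to leak, and a per-slot generation counter turns use-after-free into a detectable error instead of a dangling pointer:

```lua
local MAX_OBJECTS <comptime> = 64

local Object = @record{
  x: integer,
}

local Handle = @record{
  index: uint32,      -- slot in the pool
  generation: uint32, -- must match the slot's generation to be valid
}

local Pool = @record{
  objects: [MAX_OBJECTS]Object,
  generations: [MAX_OBJECTS]uint32,
  used: [MAX_OBJECTS]boolean,
}

-- find a free slot and hand out a handle to it
function Pool:acquire(): Handle
  for i=0,<MAX_OBJECTS do
    if not self.used[i] then
      self.used[i] = true
      return Handle{index=(@uint32)(i), generation=self.generations[i]}
    end
  end
  error('pool exhausted')
  return Handle{}
end

-- resolve a handle to a pointer, or nilptr if the handle is stale
function Pool:get(h: Handle): *Object
  if self.used[h.index] and self.generations[h.index] == h.generation then
    return &self.objects[h.index]
  end
  return nilptr
end

-- release a slot, bumping its generation to invalidate old handles
function Pool:release(h: Handle)
  if self.used[h.index] and self.generations[h.index] == h.generation then
    self.used[h.index] = false
    self.generations[h.index] = self.generations[h.index] + 1
  end
end

local pool: Pool
local h = pool:acquire()
local obj = pool:get(h)
obj.x = 1337
pool:release(h)
assert(pool:get(h) == nilptr) -- stale handle is detected, not undefined behavior
```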
-
I thought you would follow this route and order of preference, given the nature of game development and your accumulated experience in that field.
About reading the STL and C++ libraries in general... yeah, I feel you! I could be wrong, but I have the impression that elite engineers and developers compete with each other for the sake of showing off, without paying attention to usability, let alone readability and comprehension.
Basically what I had in mind is how RAII works, especially with modern C++ (smart pointers etc.); things got a lot "safer" compared to past times, with legacy code and the tricky low-level techniques you were forced to use to handle memory. It's nice to know you don't have to worry about releasing memory, that the language does it for you via RAII: whatever goes out of scope gets released.
...but it could be implemented as an extension, that is, a language extension implemented via metaprogramming, right?
I learned something new today about custom allocators; cheers for sharing this valuable info. I would love to see a demo around this concept to get a taste of how it works.
I heard a podcast with Ginger Bill, Andrew Kelley, and another guy whose name I cannot remember; they shared incredible feedback, ideas, and bottlenecks they all faced while trying to solve specific problems in the domains they were working on. Andre's blog is a valuable resource on various topics. I already read https://floooh.github.io/2019/09/27/modern-c-for-cpp-peeps.html and enjoyed it. I appreciate your thorough feedback @edubart; you are helping this "old" geek finally embrace language design and implementation.
-
Oh yeah, I've heard that podcast, and I also read the Nim articles you mentioned and other blog posts from Andre a long time ago. They are all good.
As an example, let's say you want to implement Lua 5.4 style "destructors", the to-be-closed variables. This is way simpler than the full destructor semantics found in C++, because only variables marked with the `<close>` annotation are closed:

```lua
##[[
local typedefs = require 'nelua.typedefs'
local tabler = require 'nelua.utils.tabler'
local visitors = require 'nelua.analyzer'.visitors
local aster = require 'nelua.aster' -- AST node builder (also available as a preprocessor global)
typedefs.variable_annots.close = true -- define the `close` annotation
-- hook original VarDecl node visitor in the analyzer
local orig_VarDecl = visitors.VarDecl
function visitors.VarDecl(context, node)
local idnodes = node[2] -- list of identifier declarations nodes
for _,idnode in ipairs(idnodes) do -- iterate over identifier declarations nodes
local symbol = idnode.attr -- get identifier symbol
if symbol.close and not symbol.closed then -- identifier symbol has `close` annotation
-- create a defer call to __close method
local callnode = aster.Defer{aster.Block{
aster.CallMethod{'__close', {}, aster.Id{idnode[1]}}
}}
-- inject defer call after variable declaration
local blocknode = context:get_parent_node() -- get parent block node
assert(blocknode.tag == 'Block')
local statindex = tabler.ifind(blocknode, node) -- find this node index
assert(statindex)
table.insert(blocknode, statindex+1, callnode) -- insert the new statement
blocknode.scope:delay_resolution()
symbol.closed = true
end
end
-- call original VarDecl
return orig_VarDecl(context, node)
end
]]
require 'allocators.general'
local Object = @record{
x: integer
}
function Object:__close()
print 'object destroyed'
general_allocator:delete(self)
end
do
local o: *Object <close> = general_allocator:new(@Object)
-- "defer o:__close() end" is injected here
print 'object created'
-- o:__close() will be called automatically here
end
```

If you run the above program, you should get this output:

```
object created
object destroyed
```
Note that the `__close` call is injected as a `defer` statement right after the variable declaration, so it runs automatically when the enclosing block exits. The same thing could be done for fully fledged destructors, but it would be quite complex to do.
-
This is a repost of an answer made in 494ea5a:

There is room to improve the GC, there is just no good motivation for doing it at this moment. The current implementation uses a simple mark-and-sweep, stop-the-world algorithm (similar to Lua 5.0). Garbage collection is a broad research topic; there are multiple ways to do it, all with different advantages and disadvantages. For Nelua I think having a simple and reliable garbage collector by default is enough for the moment. The garbage collector could be improved in the future by making it incremental (like in Lua 5.1), and later generational (like in Lua 5.4), if any good motivation to do so appears.

The current GC design stops the application to run a full collection cycle every time the memory usage doubles. For small applications the collection cycle is quite fast; for large applications with lots of allocations this may become a problem for real-time requirements. Note that despite Lua 5.1 having an incremental garbage collector, the application still suffers stalls from time to time due to the garbage collection atomic phase; it's just less noticeable. Having a truly real-time collector without any stop-the-world step is very difficult (I don't know if one even exists at this moment).

I don't plan to implement an incremental collector at this time, because it incurs more runtime overhead while the application is not collecting, due to the write barriers introduced every time a variable assignment happens. Thus an incremental collector increases overall CPU usage, which I think is a huge downside for my current plans; it's the price you pay for a less noticeable stop-the-world phase. Also the compiler and application complexity would grow, making it harder to maintain, all of which goes against the simple and minimal goals. I think if users don't want any stop-the-world pause to happen, then they should design a good data structure, or disable the GC and manage memory themselves, or maybe even mix GC / non-GC code (this is possible, although it should be done carefully; see the sketch below).

Despite the current GC having a very simple algorithm, it has good overall runtime performance when you can live with the stop-the-world phases. For example, if you make a script that is a single-shot run (like a batch script), it should perform better in terms of total runtime than most incremental and generational collectors. Although Nelua provides a default GC, you could completely replace it with another well researched GC, just by replacing the default GC allocator module.
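For reference, disabling the GC and managing memory manually looks roughly like this minimal sketch (assuming the `nogc` pragma and the `allocators.general` module from the Nelua standard library):

```lua
## pragmas.nogc = true -- fully disable the garbage collector for this program

require 'allocators.general'

local Data = @record{
  x: integer,
}

-- with the GC disabled, allocations must be freed manually
local d: *Data = general_allocator:new(@Data)
d.x = 1
print(d.x)
general_allocator:delete(d) -- nothing will free this for us otherwise
```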
The GC is already implemented in Nelua, but there is still room to make it scan less memory by using some metaprogramming on type information: it's possible to use type information to build the memory layout of records and scan just the segments of a record that contain pointers, because at this moment any record known to contain a pointer is fully scanned. This may be improved sometime in the future; I just did not do it yet because the GC is already quite fast for the current use cases. Until someone has a good motivation to do this, it should remain simple and ignore record memory layouts.
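To illustrate the idea, here is a hypothetical sketch of visiting only the pointer fields of a record using compile-time type information (`visit_pointers` and `mark` are illustrative, not part of the GC):

```lua
local ListNode = @record{
  value: integer,
  prev: *ListNode,
  next: *ListNode,
}

local function mark(p: pointer)
  -- a real collector would trace the pointed-to allocation here
  print(p)
end

local function visit_pointers(node: ListNode)
  ## for _, field in ipairs(node.type.fields) do -- iterate fields at compile time
    ## if field.type.is_pointer then -- emit code only for pointer fields
  mark((@pointer)(node.#|field.name|#))
    ## end
  ## end
end

local n: ListNode
visit_pointers(n) -- only `prev` and `next` are visited, `value` is skipped
```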
The GC will become slow to run a collection cycle when you have a lot of live allocations made through the GC allocator, because every collection cycle has to scan all of them.
-
I'm writing my thoughts here about the current Nelua GC state; in other words, arguing with myself out loud. Currently I can see we have … For future reasons, in my humble opinion it would make sense to have something like … This way, if I'm interested in testing a different type of collector mechanism, I could implement it and place it in …