You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
With the moving GC, we have an issue where the hash values of keys can change during garbage collection. The hash of an object is taken from its value, which is just a tagged pointer. So when the pointer moves, so does the hash. There are a couple possible solutions to this.
1. store the hash value in the object header
This is what other dynamic languages like C# and Java do. This is easy because we only need to calculate the hash once and it will never change. However this also takes more memory, at least 4 bytes if we want a good hash. This is especially problematic for Cons, which are very common and should only be 2 words wide. This would require them to have a header, which they currently don't need.
2. rehash the keys during GC
This is the path we are going to take for now until we can think of a better solution. Every GC the hashmap keys will have to be recalculated and inserted. This is costly. The way we do this now is via RefCell that let's us mutate a HashMap during collection. But this means we could also have runtime panics if there exists a root to one of the hashmap values. Not ideal. Here is paper about how to handle this issue and make it lest costly.
Ideally we should define a custom HashMap that let's us rehash all the keys without moving the values. I think this might be possible, but it might also require us forking either HashMap or IndexMap.
3. Is there a possible third option?
The text was updated successfully, but these errors were encountered:
With the moving GC, we have an issue where the hash values of keys can change during garbage collection. The hash of an object is taken from its value, which is just a tagged pointer. So when the pointer moves, so does the hash. There are a couple possible solutions to this.
1. store the hash value in the object header
This is what other dynamic languages like C# and Java do. This is easy because we only need to calculate the hash once and it will never change. However this also takes more memory, at least 4 bytes if we want a good hash. This is especially problematic for
Cons
, which are very common and should only be 2 words wide. This would require them to have a header, which they currently don't need.2. rehash the keys during GC
This is the path we are going to take for now until we can think of a better solution. Every GC the hashmap keys will have to be recalculated and inserted. This is costly. The way we do this now is via
RefCell
that let's us mutate aHashMap
during collection. But this means we could also have runtime panics if there exists a root to one of the hashmap values. Not ideal. Here is paper about how to handle this issue and make it lest costly.Ideally we should define a custom
HashMap
that let's us rehash all the keys without moving the values. I think this might be possible, but it might also require us forking eitherHashMap
or IndexMap.3. Is there a possible third option?
The text was updated successfully, but these errors were encountered: