Goal
We want to replace the megamorphic calls with v-table sends to see if it is faster.
We can identify the megamorphic selectors and compile all message sends of those selectors (or maybe just the methods with megamorphic calls?) to v-table sends.
Thus, all the classes in the system must have a v-table filled with methods at pre-fixed positions based on selectors (for example, all the classes will have the method `size` as first entry...). Then, instead of filling a PIC and finishing in a megamorphic call, the method can be found via the v-table (it takes 2 LOADs more than a PIC, one for the class and another for the v-table).
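The v-table idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `VTABLE_INDEX`, `Klass`, `Obj` and `vtable_send` are hypothetical names, not pharo-vm or Pharo code, and the two commented loads only mirror the "2 LOADs" mentioned above.

```python
# Hypothetical sketch: every name here is illustrative, not pharo-vm code.

# Global, pre-fixed mapping from selector to v-table index.
# Every class uses the same index for the same selector.
VTABLE_INDEX = {"size": 0, "isEmpty": 1}


class Klass:
    """A toy class object with a method dictionary and a v-table."""

    def __init__(self, name, methods, superclass=None):
        self.name = name
        self.methods = methods
        self.superclass = superclass
        # Fill the v-table once: each pre-fixed selector is resolved
        # through the usual superclass-chain lookup and stored at its index.
        self.vtable = [None] * len(VTABLE_INDEX)
        for selector, index in VTABLE_INDEX.items():
            self.vtable[index] = self.lookup(selector)

    def lookup(self, selector):
        """Classic lookup: walk the superclass chain."""
        klass = self
        while klass is not None:
            if selector in klass.methods:
                return klass.methods[selector]
            klass = klass.superclass
        return None


class Obj:
    """A toy object: a class pointer plus some payload."""

    def __init__(self, klass, payload):
        self.klass = klass
        self.payload = payload


def vtable_send(receiver, index):
    """Dispatch without a cache or a lookup: two loads, then a call."""
    klass = receiver.klass        # LOAD 1: the receiver's class
    method = klass.vtable[index]  # LOAD 2: the v-table entry
    return method(receiver)


collection = Klass("Collection", {
    "size": lambda r: len(r.payload),
    "isEmpty": lambda r: len(r.payload) == 0,
})
interval = Klass(
    "Interval",
    {"size": lambda r: r.payload[1] - r.payload[0] + 1},
    superclass=collection,
)
```

A send of `size` then becomes `vtable_send(receiver, VTABLE_INDEX["size"])`, whatever the receiver's class is. Inherited methods are copied into the subclass table when it is filled, so the send itself never walks the hierarchy.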
So, the plan is:
What we have
We added a v-table to all classes in the image: [Experiment] Add vTable to ClassDescription pharo-project/pharo#17088
We added a new bytecode:
pharo-vm/smalltalksrc/VMMaker/StackInterpreter.class.st
Line 743 in d409e4d
It is implemented in the Interpreter:
pharo-vm/smalltalksrc/VMMaker/StackInterpreter.class.st
Lines 13632 to 13653 in d409e4d
We extended Opal to compile methods using the new bytecode: pharo-project/pharo@281566a
Then we can define a method using the `<opalBytecodeMethod>` option:
(... `size`) in Interpreter-mode: We learnt that the global cache is faster than using the v-table (fewer memory accesses).
pharo-vm/smalltalksrc/VMMaker/StackToRegisterMappingCogit.class.st
Lines 2506 to 2543 in 58fda6f
We ran the same micro-benchmark as before, but with more types (I lost the code).
We learnt that, in terms of speed, a vTable send is equivalent to a polymorphic call:
mono > poly = vTable > mega
Next steps
We want to replace almost all megamorphic calls in the image with vTable sends.
For that we need:
Most of the work needed is in pharo-project/pharo#17088.
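Replacing megamorphic calls presupposes knowing which selectors are megamorphic in practice. A minimal Python sketch of that identification step, assuming a call site counts as megamorphic once it has seen more receiver classes than a PIC can hold. The threshold `MAX_PIC_CASES = 6` and the `(call_site, selector, receiver_class)` record shape are assumptions for illustration, not something taken from the VM.

```python
from collections import defaultdict

# Assumption for illustration: a call site is megamorphic once it has
# seen more receiver classes than a PIC can hold.
MAX_PIC_CASES = 6


def megamorphic_selectors(send_log, max_pic_cases=MAX_PIC_CASES):
    """Return the selectors that have at least one megamorphic call site.

    send_log is an iterable of (call_site, selector, receiver_class)
    tuples; this record shape is hypothetical.
    """
    classes_seen = defaultdict(set)
    for call_site, selector, receiver_class in send_log:
        classes_seen[(call_site, selector)].add(receiver_class)
    return {
        selector
        for (_site, selector), classes in classes_seen.items()
        if len(classes) > max_pic_cases
    }
```

The resulting set is what would decide which selectors get a pre-fixed v-table index and which sends are compiled with the new bytecode.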
I leave here a workspace that may be useful:
Then, we should have a stable image with fewer megamorphic calls than the normal one.
And we can measure them in Benchy 📈