-
Notifications
You must be signed in to change notification settings - Fork 677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for scripting engines as Valkey modules #1277
base: unstable
Are you sure you want to change the base?
Conversation
I'm opening this PR in draft mode because I still need to fix test failures, and add new tests as well. |
ea6e6b6
to
d3293dd
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #1277 +/- ##
============================================
+ Coverage 70.63% 70.84% +0.21%
============================================
Files 116 118 +2
Lines 63303 63691 +388
============================================
+ Hits 44711 45119 +408
+ Misses 18592 18572 -20
|
4a2e223
to
1a808b2
Compare
I just realized I still need to implement the |
Core team said we are directionally aligned with, we still want someone to take a deeper look to make sure the details makes sense. |
fe56090
to
9d4b296
Compare
The PR is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like great progress. Is it ready for a proper review or are you going to change much more?
Shall we do EVAL as a follow-up PR?
Each engine has only one context which is used for all calls, right? Would it be possible for an engine to use a context (or sub-context, etc.) for each client? This idea came up in the LuaJIT discussion by @secwall in #1229. A separate Lua context per client would isolate each client from each other, which compensates for the fact that Lua isn't completely sandboxed by design. Maybe the engine can organize this by itself, if the engine just has access to the current client at the time of a FCALL or EVAL?
It's ready. I don't think I'll add more changes apart from the changes asked by the reviewers.
Yes, that's the plan.
I'll take a look if it's possible, and will open a follow-up PR with the changes. |
This commit extends the module API to support the addition of different scripting engines to run user defined functions. The scripting engine can be implemented as a Valkey module, and can be dynamically loaded with the `loadmodule` config directive, or with the `MODULE LOAD` command. The current module API support, only allows to load scripting engines to run functions using `FCALL` command. In a follow up PR, we will move the Lua scripting engine implmentation into its own module. Signed-off-by: Ricardo Dias <[email protected]>
9d4b296
to
fc03df7
Compare
This commit adds a module with a very simple stack based scripting language implementation to test the new module API that allows to implement new scripting engines as modules. Signed-off-by: Ricardo Dias <[email protected]>
fc03df7
to
9c618a3
Compare
Can you avoid force-pushing? It's easier to review what's changed since last time if you just push small commits. You can even merge unstable to it, rather than rebase. We squash-merge PRs anyway in the end and we use the PR title and description as the commit message. |
Ah sorry. I wasn't aware that we were squashing the commits upon merge. I'll stop force-pushing. |
Signed-off-by: Ricardo Dias <[email protected]>
Today I added more test cases to test the error code paths. |
Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
…ORE|DELETE] Signed-off-by: Ricardo Dias <[email protected]>
…ions.c` Signed-off-by: Ricardo Dias <[email protected]>
@zuiderkwast I've pushed a small update that refactors a bit the code. I've already have branch with the implementation of the EVAL using the scripting engine module API almost done. It would be good to finish the review of this PR and merge it, so we can focus on the next PRs. |
… functionLibInfo in most places Signed-off-by: Ricardo Dias <[email protected]>
8ff2b88
to
d7ffcfe
Compare
- Removed the `ValkeyModuleScriptingEngineFunctionLibrary` structure. - Changed a few structure and parameters names. Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
Updated the PR description with the new API functions added by this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one looks close now. I think we can merge it soon and add EVAL in the other PR, to keep each PR small.
Regarding the sds
strings that can't be used in the module API: Did you settle with char *
for the strings or are you still considering ValkeyModuleString AKA robj *
for the strings? Just curious.
typedef struct ValkeyModuleScriptingEngineCompiledFunction { | ||
char *name; /* Function name */ | ||
void *function; /* Opaque object representing a function, usually it' | ||
the function compiled code. */ | ||
char *desc; /* Function description */ | ||
uint64_t f_flags; /* Function flags */ | ||
} ValkeyModuleScriptingEngineCompiledFunction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these structs be opaque and defined in module.c? If not, why?
For the other non-opaque structs, we've often included a version field, so we can update it in the future by adding more fields at the end. The valkeymodule.h
is copied in each module and initialized in a way that permits us to add more fields in the struct. Here is an example:
#define VALKEYMODULE_CLIENTINFO_VERSION 1
typedef struct ValkeyModuleClientInfo {
uint64_t version; /* Version of this structure for ABI compat. */
uint64_t flags; /* VALKEYMODULE_CLIENTINFO_FLAG_* */
uint64_t id; /* Client ID. */
char addr[46]; /* IPv4 or IPv6 address. */
uint16_t port; /* TCP port. */
uint16_t db; /* Selected DB. */
} ValkeyModuleClientInfoV1;
#define ValkeyModuleClientInfo ValkeyModuleClientInfoV1
#define VALKEYMODULE_CLIENTINFO_INITIALIZER_V1 {.version = 1}
Should we do this here too? If not, why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the ValkeyModuleScriptingEngineCompiledFunction
struct, this struct is must be populated by the implementation of the scripting engine, and then its fields are read individually by the core implementation. We can make it opaque by providing a new VM_* function that accepts the fields as parameters and returns a pointer to the structure, but I'm not sure we gain much by doing that, specially if in the future we want to add/remove fields from this struct.
Regarding the structure of non-opaque structs to support versioning, I think it's a good idea. I was not aware of this pattern. I'll do the same thing with the non-opaque structs created in this PR and in the following ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I agree we don't need a VM function just to populate a struct.
Let it be non-opaque and use version field. 👍
I'll do the same thing with the non-opaque structs created in this PR and in the following ones.
Good, at least whenever it makes sense. Maybe there are exceptions, where we don't need it to be extensible. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I introduced a "versioned struct" to store the engine callbacks, the other structs used by the callback functions don't need to be versioned because they will all be under the umbrella of the engine callback structs version value.
src/valkeymodule.h
Outdated
ValkeyModuleScriptingEngineGetUsedMemoryFunc get_used_memory_func, | ||
ValkeyModuleScriptingEngineGetFunctionMemoryOverheadFunc get_function_memory_overhead_func, | ||
ValkeyModuleScriptingEngineGetEngineMemoryOverheadFunc get_engine_memory_overhead_func, | ||
ValkeyModuleScriptingEngineFreeFunctionFunc free_function_func) VALKEYMODULE_ATTR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to add an optional callback to kill a script or function? It can be invoked by SCRIPT KILL or FUNCTION KILL.
Thinking about extenisility: If we want to add more callbacks in the future, we will have to add another API function. Is it likely that we will want to do that? It's easier if don't need to do that; if we manage to get it right from the beginning. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another style of the API could be to use a struct with all the callbacks and then just register the struct.
The struct could have a version field for future extensibility.
Pros and cons? What do you think?
int VM_RegisterScriptingEngine(ValkeyModuleCtx *ctx,
const char *engine_name,
void *engine_ctx,
ValkeyModuleScriptingEngineCallbacks *callbacks);
typedef struct {
int version; /* version field for ABI compatibility */
... callbacks ...
ValkeyModuleScriptingEngineGetUsedMemoryFunc get_used_memory_func;
... etc ...
} ValkeyModuleScriptingEngineCallbacksV1;
typedef ValkeyModuleScriptingEngineCallbacksV1 ValkeyModuleScriptingEngineCallbacks;
/* Modules use this to initialize version field. */
#define VALKEYMODULE_SCRIPTING_ENGINE_CALLBACKS_VERSION 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to use a struct to group the callbacks because of the extensibility argument. I will change it.
We will use |
@zuiderkwast I'm now looking at the structure of |
Good finding! I'm not that familiar with ValkeyModuleTypeMethods, but yes it looks like a similar thing. The module struct just adds versioning, right? Do this if you think it's reasonable. |
Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
/* ValkeyModule type aliases for scripting engine structs and types. */ | ||
typedef ValkeyModuleScriptingEngineCtx EngineCtx; | ||
typedef ValkeyModuleScriptingEngineFunctionCtx FunctionCtx; | ||
typedef ValkeyModuleScriptingEngineCompiledFunction CompiledFunction; | ||
typedef ValkeyModuleScriptingEngineMethods EngineMethods; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
Just thinking, apart from in the module API, we use a lowercase first letter for types and structs, at least in most cases (or always? I'm not sure). For consistency, it's probably best to follow the existing style.
This PR extends the module API to support the addition of different scripting engines to execute user defined functions.
The scripting engine can be implemented as a Valkey module, and can be dynamically loaded with the
loadmodule
config directive, or with theMODULE LOAD
command.This PR also adds an example of a dummy scripting engine module, to show how to use the new module API. The dummy module is implemented in
tests/modules/helloscripting.c
.The current module API support, only allows to load scripting engines to run functions using
FCALL
command.The additions to the module API are the following: