Design doc #1
@kklas, a quick question - do you need something specific from the move-model that cannot be obtained from IRs generated by the source-to-bytecode compiler?
@awelc I would think that this goal … necessitates usage of the `move-model`.
Thank you @sblackshear! Admittedly, I did not know that about the move-model...
I guess the main thing I'd like to get your thoughts on is the "Dealing with multiple packages and dependencies" section. Here I'm proposing that code for all packages is generated in the repo on the same level, instead of creating a separate npm package for each Move package, which would be the conventional way of doing things. But I think what I'm proposing is the right way here, for the reasons I outlined in the section. I'm also introducing `gen.toml` … As for the generated code itself, I'm fairly confident it will be OK since I've used a similar pattern in a production app with very good results. Also, this can easily be modified later if needed, so I don't have concerns here. As for the …
Thanks for sharing this design doc @kklas!

> Dealing with multiple packages and dependencies

A flat structure makes sense to me here, especially if you are allowing people to opt in to which packages to generate for. We already require that packages with overlapping addresses do not contain overlapping modules, so there is no risk of conflicts. In terms of mapping addresses to names, did you consider re-using the address mappings in `Move.toml`? One feature this might not work with today is on-chain or bytecode-only dependencies, but we can look into adding support for that.
@amnn thanks for the feedback!
Yes, this will be used, but it doesn't fully solve the problem because, as you mention, on-chain packages don't store that info. So for on-chain packages you'd still have to have the user specify the names, otherwise your imports end up referencing raw hex addresses. The other reason why … So while the address mapping in `Move.toml` helps for source packages, it doesn't cover everything.

Because of that, my intuition is that some kind of config file will need to be there.
This makes sense @kklas, I don't want to get rid of …
There is upstream support for building against a binary-only package. It works by setting up the package with a … We can create a binary-only package based on an on-chain address, and this can be used as a dependency for the package you are generating bindings for. From there you can supply a name for the on-chain address either in the manifest of the binary-only package, or in the manifest of the package that depends on it.
This is also a problem that …
There are two conflict scenarios I can think of (let me know if there are more), and for both of these, I think, there are existing solutions: …
So, modulo binary package support and retaining …
@amnn, yes, here we're already getting into implementation details. I agree it would be great if we can re-use this logic and thanks for pointing this out. I haven't thought about how complex this gets.
So if I'm understanding this correctly, what you're suggesting is -- suppose I want to pull an on-chain package directly (not as a dependency, and I don't have the source code locally) -- I would have to create a stub build directory with a corresponding … If so, I have to say I don't agree with this approach here. It seems very cumbersome for the users and also unnecessary, because we're not actually trying to build the package, just get its bytecode into the `move-model`. I agree that we should re-use as much of …
I agree with your points here. Another tricky part would be handling different package versions, but I think your points about this already being handled by `Move.toml` apply here too, and that logic can be re-used.
I don't think this needs to be so manual or cumbersome; we can and should add a tool that performs all these steps, given the ID of a single on-chain package. Would that help?
Of course, if there's a tool that I can throw a package ID (or multiple) and the required address definitions into, and it solves version and name conflicts, that would certainly help! Whether it creates stub build directories or not is an implementation detail, and I'm not familiar enough with the code base to comment on that, but I strongly feel the users shouldn't have to be aware of any external build directories or …
I came up with a more elegant solution for the package name problem. …

This will put them in the top-level directory. The transitive dependency packages won't be included in the top-level directory, but under …

Benefits: …
I think that could work, and it is a nice simplification! One thing you may have to consider is whether users of named packages might be exposed to addresses via types in their public signatures, e.g. …

On a new topic, … Other options: …

My intuition is that configuring the node as a top-level field in `gen.toml` …
The only problem with using addresses in paths is that import statements become very unergonomic (see the sketch below). Users will very likely be exposed to types defined in dependencies, but will probably not need to import them directly. If you need to use a type directly, you can specify the package to … I think this will be more than good enough for the MVP, and later we may explore some of the other solutions you mention here, or perhaps introduce support for address mappings as discussed earlier.
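To illustrate the ergonomics problem, an address-based import might look something like this (a hypothetical sketch; the address and names are made up):

```ts
// Hypothetical import from a dependency generated under an address-based folder.
import { Pool } from './0x1cf60ac2aa4f36b2d103df3cec19fc1898aff4ccbf12ac6b8cbc7b9d8e8cbb92/pool/structs'

// vs. the much more ergonomic named-package path for top-level packages:
import { Pool as NamedPool } from './amm/pool/structs'
```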
Good idea! I think using …
From a TypeScript SDK perspective, this looks great! Thanks so much for putting this together. A couple of thoughts after reading through this: …
@Jordan-Mysten thanks for taking a look and the feedback!
This is actually a point that I'm very well aware of, and it's something that is much more nuanced than it appears. My opinion is that being able to switch between different IDs on the same generated code would be useful to have, but being able to do that dynamically during runtime is unnecessary. First let me explain the distinction between the two.

In a previous SDK generator I built for Solana (called …), … So, this allows configuring the IDs of the generated code (you can modify them via an env var during build, roughly as in the sketch below), but doesn't allow them to be changed dynamically during runtime (e.g. via a select in the app UI). Some users then asked for the latter, but after some thought my conclusion was that it is unnecessary and very likely an anti-pattern. My reasoning is that in a production app you will never need to switch between environments (mainnet, testnet) or different smart contract IDs dynamically during runtime. You may need to do that only during development, but in that case you can set them through env vars (or through a config selected via an env var).

While I agree with you that having some ability to configure package IDs through env vars and such would be useful, and I would like to implement that, on Sui we're running into a different problem. If you think about switching between e.g. mainnet and devnet, we're not only changing the ID of the relevant package, but also the IDs of all of its dependencies. So when you say "you may want the consumer to be able to pass a known package ID", considering you also have to pass down the IDs of all of its dependencies, it's not clear to me how to do that in a practical manner. Also, the generated code for the dependencies has package IDs in the folder structure (see the previous discussion here #1 (comment)), so passing down some other IDs would be very confusing.

Another approach that could be tried is generating code for different environments in separate folders (e.g. one for mainnet, one for devnet, etc.) with some convenient way to switch between them -- perhaps a "current" directory that is a link to the relevant environment, or a stub directory that re-exports the types / methods from the directory of the relevant environment. But it's also not clear to me if this would be practical. LMK if you have any ideas here.
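As a rough sketch of the build-time approach (hypothetical; the env var name and default are made up):

```ts
// Hypothetical sketch: the ID is fixed at build time (e.g. substituted by the
// bundler from an env var), so switching environments means rebuilding,
// not flipping a runtime switch.
export const PACKAGE_ID = process.env.AMM_PACKAGE_ID ?? '0x<default id>'
```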
Agreed, and a custom instance is indeed being used: https://github.com/kunalabs-io/sui-client-gen/blob/master/ts/src/framework/bcs.ts
You mean like this?
Also, there already is a …
Totally agree. I skipped this for the POC because it's a lot to add manually. I'm making a note and will add this in the generator. There are probably a few other useful methods that could be added, like …
I agree with you here that …
@kklas We actually do allow switching between networks dynamically in the wallet, so I do think there's legitimacy there. All of our products also have network switches currently (I actually do think …). For things like invalidating object caches -- this is why we've taken the stance that the SDK doesn't have any caches, and the user is responsible for caching logic and for ensuring the network is used as a cache key, which would also solve the problem here. I'm not sure I understand the dependency problem -- why would we need to provide all the dependency IDs?
@Jordan-Mysten Fair enough, I see a case for having dynamic network switching. I was mainly weighing introducing the complexity of being able to pass down IDs to the SDK vs. letting the generator be "dumb" about it and requiring the user to generate separate code for each of the environments. I don't see an elegant way to design the SDK to support this.
Suppose you have a package … So when you change the env and pass down a different ID for package …

I'm not against it in general, I just don't see an elegant way to do it, and I'd like to keep things as simple as possible for the MVP.
Which apps are those? I think wallets and the explorer don't apply here, because you probably won't be generating static SDKs there aside from the framework, which is the same on all chains anyway? Suifrens AFAICT doesn't have a network switch.
I still don't really understand why those IDs are needed, unless you're interacting directly with those packages? The apps would be the wallet, explorer, Suifrens, etc. The wallet and explorer won't only talk to the framework packages, which is why it'd be nice to be able to use something like this. Suifrens doesn't currently have a network switch, but in dev we'd probably have one.
Concretely, they're used for three reasons: …

If you don't have this, you can end up with obscure and hard-to-trace bugs in your app, because it would be possible to deserialize an object with the wrong class without throwing an error, ending up with a corrupt object. In theory, it would be possible to use package names instead of addresses here, but we would have to have proper address mappings, which is a different problem to solve, not exactly straightforward, and one that has its drawbacks (as you can see from my discussion with @amnn in the previous comments). Using package IDs is most natural here. So while perhaps you may get away with e.g. fetching an object manually and then deserializing it using …

In essence, package IDs are used all throughout the SDK to differentiate between types (this is also fundamental to the Move programming model in general), so even if you don't "interact with those packages directly", they're still needed for the SDK to function. It's not an easy task to work around that.
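To make the role of package IDs concrete, here is a hypothetical sketch of how a generated type guard might depend on them (not the actual generated code):

```ts
// Hypothetical sketch of how a generated type guard depends on the package ID.
const PACKAGE_ID = '0x<amm package id>'

export function isPool(type: string): boolean {
  // Strip nothing here; just compare against the full, ID-qualified type name.
  return type.startsWith(`${PACKAGE_ID}::pool::Pool<`)
}

// Deserializing RPC data with the wrong class would otherwise go unnoticed,
// so fetch/decode helpers check the on-chain type tag first, e.g.:
// if (!isPool(obj.type)) throw new Error(`not a Pool: ${obj.type}`)
```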
Can you give an example? I'm trying to understand the use-case where you need both dynamic network switching and static SDKs in the same app.
I'm curious how your dev workflow is affected by not having a dynamic switch and having to do it with …
It seems weird to me to apply package checks to generated code from dependencies, but maybe that's just a personal preference thing. If this is important, I imagine the user could opt out when they set a custom package ID, or they could define all of their dependency IDs. I actually don't think runtime validation is that important here though.

For registering BCS names, that's a valid point, but it's more of a problem with how BCS uses a global namespace (which will go away at some point). If you create an isolated bcs instance, you can also just guarantee you register unique names (the names don't actually matter).

For the case of the wallet, you can imagine we might at some point have a native integration with a DeFi token swap package that we'd like to have integrated in all environments. The wallet needs to be built generically and be able to define these IDs dynamically for each supported network. Having multiple builds just flat out isn't supported. We'd ideally like to use tooling to generate our clients in these cases. You can imagine similar use cases for our other products as well.

The dynamic switch in-app was probably one of the single biggest improvements to local DX that we've shipped, because it lets us quickly switch networks and validate our product across the multiple networks that exist. And for products like wallets, it's also just a requirement.
Understood, and thanks for these valuable insights @Jordan-Mysten. You're right that runtime validation isn't strictly necessary for types defined in dependencies, because once the top-level type is validated it's guaranteed that the inner (wrapped) types are OK also. These checks are there simply because code for all packages (including dependencies) is generated the same way, and in the original design we didn't make a distinction between dependencies and top-level packages. While this can be worked around in multiple ways (one of them being allowing the user to opt out, as you say), I don't think this is the core of the issue. What we're dealing with here is having the same package published across different environments with different IDs, and the question is how to keep track of that.

Giving it a bit more thought, I have a solution in mind. In …

Then in the …

The generator can then query the on-chain package object to get the IDs of all dependencies for all of the environments. It can even verify the packages against the on-chain objects (à la …). On the SDK side, we would generate dependency packages for each env in a separate directory: …

(Yes, I do understand that the generated dependency code will be duplicated for each env. There are multiple ways around that, but let's keep it simple for the purposes of this discussion.) Then for each required method we would introduce an optional parameter, as sketched below: …

The package IDs for each environment would be stored in constants. Each env would also have its own BCS instance. This would allow the SDK to switch between different environments dynamically. @Jordan-Mysten, will this work for your use case?
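Here is a rough sketch of what I mean, assuming a hypothetical `Env` type and illustrative names (not a final design):

```ts
import { TransactionBlock } from '@mysten/sui.js'

export type Env = 'mainnet' | 'testnet' | 'devnet'

// Hypothetical: per-env package IDs baked into generated constants.
export const PACKAGE_IDS: Record<Env, string> = {
  mainnet: '0x<mainnet id>',
  testnet: '0x<testnet id>',
  devnet: '0x<devnet id>',
}

export interface CreateArgs {
  lpFeeBps: bigint // illustrative field
}

// Hypothetical: generated functions take an optional env parameter.
export function create(txb: TransactionBlock, args: CreateArgs, env: Env = 'mainnet') {
  return txb.moveCall({
    target: `${PACKAGE_IDS[env]}::pool::create`,
    // real generated code would specify the BCS type explicitly
    arguments: [txb.pure(args.lpFeeBps.toString())],
  })
}
```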
@amnn I would love to hear your thoughts on the above also. I think that the problem of keeping track of deployments of the same package across different envs is something that should be handled at the Sui Move package level (i.e. …). Some of these problems could also perhaps be handled by a package manager (including providing a global address <-> package name mapping). Is anyone working on that?
That still wouldn't really satisfy our requirements -- we'd want to be able to dynamically update the IDs at runtime to handle data wipes in networks, and to have the flexibility to dynamically update which IDs are pointed to (e.g. if we want to change contracts in already-deployed code).
Are you referring to the situation where, e.g., in the wallet extension there is a devnet wipe and you need to update the IDs, but you can't push an update to the Chrome store and instead have to rely on the old code fetching the new set of IDs from some backend service? Because if you can just push new code, then I don't see a problem.
Yeah, I'm specifically talking about wallets, where there's a very long lead time before code changes are fully rolled out. Agreed that on websites we can push code, although I'd still prefer not requiring code pushes for configuration changes.
OK, makes sense. I understand your point about opting out of ID checks now. I think that if we add support for …
Yes, this is something we have been thinking about. Generally we want to move away from developers having to manually manage addresses, because the process is fiddly and error-prone, especially with the addition of … MystenLabs/sui#8345 covers some initial work we want to do along this vein -- moving …
Great! In that case I will do the MVP as described here (#1 (comment)) so as not to duplicate any of this effort, and then add address mappings and environments after this lands. @Jordan-Mysten I think you will be OK with this for the time being, even with the wallet use case, since you're only using the framework packages. If you need it sooner, LMK and I'll add a hack to allow you to switch package IDs.
@amnn can you point me to where this is supported upstream? I took a closer look at the codebase and I think you're right that this is the way to go. A stub package directory should be created where all the top-level packages are listed as dependencies, including on-chain packages. This way … When you said "porting this over", did you also mean adding support for …?
Sure thing, it was added in these PRs: …
Note that I believe there is one limitation around …
Yes, most of the changes are in the compiler and package resolution logic. We will need to cherry-pick the change and fix things up to work with lock file support, because they both landed at the same time in the respective codebases. (There shouldn't be any fundamental reason why this wouldn't work, though.)
Thanks! There seems to be a bit more work to be done here, so for the MVP what I'll do is keep the on-chain and source packages separate by building two separate move models for them. I can guarantee consistency in both because for source it will be handled by the …
Out of curiosity, how is this handled in the prover then? Does it just not work when there's an address substitution in a dependency? |
Yes, I believe it doesn't support those packages (and I think, practically speaking, it's actually rare to get package name conflicts as well).
@amnn @Jordan-Mysten the MVP is ready: https://github.com/kunalabs-io/sui-client-gen
sui-client-gen design

`sui-client-gen` is a code generator that generates TS SDK code based on Move smart contracts. It brings type safety and significantly reduces the boilerplate for interacting with smart contracts and fetching objects from chain.

The purpose of this document is to describe its design and gather feedback before it's implemented. It explains the reasoning behind the design decisions and is also a useful reference for building generators for other languages.

Design goals for the generated TS code:

- composes with `sui.js` (no overlapping functionalities, composes with `TransactionBlock`, ...)
- …

Design goals for the generator:

- …
POC Overview
In order to evaluate the generated client code, I have manually implemented the target code for the Kuna Labs AMM smart contract. This target code is located under `ts/src/generated/amm`.

I have also included a "fixture" smart contract in this repo under `move/fixture`. It has two modules -- `fixture.move` and `example_coin.move` -- which I have used to evaluate some other functionalities of the generated client not covered by the AMM smart contract.

Additionally, I have included a small node CLI in `ts/src/main.ts` to showcase how the generated client is used (creating transactions, fetching objects, fetching dynamic fields, fetching events...). It works with the smart contracts mentioned above (AMM and fixture), which I have deployed on devnet.

There is also a POC for the generator in Rust, which fetches modules and their dependencies recursively from chain starting from a package address, and then prints structs and their fields to output. It's located in `generator/src/main.rs`.
Package
For each package, a folder is generated containing within it a folder for each of its modules. Within module folders, `functions.ts`, `structs.ts`, and `constants.ts` files are generated that contain the relevant code. If a module has no structs or constants defined, the respective files won't be generated.

In the package root folder there is also `index.ts`, which contains the hard-coded `PACKAGE_ID` const, and `init.ts`, whose purpose is related to decoding objects with generic fields and will be explained in the "Structs with generic fields and StructClassLoader" section.

In the case of the mentioned `amm` package, the generated file structure will look like this: …

The reason I have separated constants, functions, and structs into separate source files is to avoid potential name collisions with the extra code that is generated for each function / struct (e.g. each function binding has an "...Args" interface generated for it, used to provide type safety when specifying arguments to a function call). Another reason is that it makes the generated code more readable. Arguably, these could instead all be concatenated into a single `<module>.ts` file. I don't see a big advantage to that, but it could be discussed.

Constants
Move constants can only be of primitive types and `vector<u8>`, so generating `constants.ts` is very straightforward (example).

In future versions of the generator, doc comments above constant definitions in Move could be used to indicate to the generator that the constants represent error codes, and to generate error handling code as well, but this is not in scope for the initial version.
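For illustration, the generated `constants.ts` might look roughly like this hypothetical sketch (constant names and values are made up):

```ts
// Hypothetical output for a Move module defining:
//   const BPS_IN_100_PCT: u64 = 10_000;
//   const EInvalidFeeBps: u64 = 1;
export const BPS_IN_100_PCT = 10000n
export const E_INVALID_FEE_BPS = 1n
```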
Functions

Function bindings are generated in `functions.ts` files for each module (if any). Only public functions are considered. We generate two items for each Move function -- an args interface and a function that generates a move call with the provided arguments.

In the case of the `amm` module's `create` function (link) we will have this (full file here): …

In the args interface, the types are mapped as follows:
| Move type | TS type |
| --- | --- |
| `u8`, `u16`, `u32` | `number` |
| `u64`, `u128`, `u256` | `bigint` |
| `bool` | `boolean` |
| `address` | `string` |
| `vector<u8>` | `Array<number> \| Uint8Array \| string` |
| `vector<T>` | `Array<T>` |
| object | `ObjectArg` |
`ObjectArg` is defined as `ObjectId | ObjectCallArg | TransactionArgument`. This way it supports passing in raw object ID values (strings) and also composes with `TransactionBlock` (`tx.object(...)`, `tx.gas`, and values returned from `tx.split(...)`, `tx.moveCall(...)`, etc.).

The function itself is very straightforward. It just generates a `moveCall` on the provided transaction block with the provided arguments. Its signature will change based on how many type parameters and function parameters the respective Move function takes. E.g. if the Move function doesn't have any type parameters, then the `typeArgs` parameter will be omitted. In case it takes only one type parameter, it will have a `typeArg` parameter of type `Type` (and not a single-element tuple). Similarly, if the Move function has only one function parameter, the generated function will take just this one argument directly (not nested in a field). The doc comment from the Move file is also generated on top of the function.
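To make the shape concrete, a generated binding might look roughly like this hypothetical sketch (the module, field names, and the `ObjectArg` simplification are illustrative, not the actual generated code):

```ts
import { TransactionBlock, type TransactionArgument } from '@mysten/sui.js'

// As described above: accepts raw object IDs as well as TransactionBlock values.
// (ObjectCallArg is omitted here for brevity.)
export type ObjectArg = string | TransactionArgument

export const PACKAGE_ID = '0x<hard-coded at generation time>'

export interface CreateArgs {
  registry: ObjectArg // hypothetical field names
  lpFeeBps: bigint
}

export function create(txb: TransactionBlock, typeArgs: [string, string], args: CreateArgs) {
  return txb.moveCall({
    target: `${PACKAGE_ID}::pool::create`,
    typeArguments: typeArgs,
    arguments: [
      // raw IDs are wrapped with txb.object(); TransactionBlock values pass through
      typeof args.registry === 'string' ? txb.object(args.registry) : args.registry,
      // real generated code would specify the BCS type explicitly
      txb.pure(args.lpFeeBps.toString()),
    ],
  })
}
```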
Structs
Code corresponding to a module's structs is generated in `structs.ts`. The goal is mainly to generate a class for each struct that handles decoding of object data returned by the RPC into an object with proper type definitions for its fields.

Let's look at the generated code for the `amm` module's `Pool` struct (full code here): …

(1) The struct is registered in the framework-level `BCS` instance so that it can be decoded from a BCS encoding.

(2) A helper method is generated that checks whether an arbitrary type is a `Pool`.

(3) An interface defining the struct's fields. This is used in the struct class constructor (9) for defining the type of the `fields` parameter.

(4) The class representing the struct. It stores the struct instance's fields with proper type definitions and also provides multiple methods that deal with loading an instance from the various formats returned by the RPC.

(5), (6), and (7) are meta fields. They are prefixed with `$` to avoid name collisions with struct fields and as an indication to the user. `$typeName` (static) holds the struct's full type name (type parameters are omitted), `$numTypeParams` (static) stores the number of type parameters the struct is defined with, and `$typeArgs` holds the type arguments the struct instance (object) is instantiated with.

(8) Then the struct fields are defined. Nested structs are typed through their respective generated classes.

(9) The struct class constructor. It takes `typeArgs` (if any) and `fields` as args.

(10) `fromFields` loads the struct instance from an object containing its fields (example). This is used when loading the object from BCS, as `bcs.de(...)` returns an object in this format. In case the object has nested struct fields, they will be recursively loaded by calling `fromFields` on their class.

(11) `fromFieldsWithTypes` loads the struct instance from an object that contains its type and fields (example). The RPC often returns objects in this format (defined here), e.g. when calling `provider.getObject` with `showContent: true`, so this method can be used to load an object from that format. It also does recursive loading for nested structs.

(12) `fromBcs` decodes the object from its BCS encoding. Calls `bcs.de` and `fromFields` internally.

(13) `fromSuiParsedData` is a convenience method that loads the object from `SuiParsedData`, which is what is returned from `provider.getObject` calls (and others) when `showContent: true`. Calls `fromFieldsWithTypes` internally.

(14) `fetch` is a convenience function that calls `provider.getObject` with the provided object ID. It errors if the object at the address doesn't match the class. This method is only generated for structs that have the `key` ability.

There's also one issue I ran into in this part: when fetching objects or events from the RPC, they're returned in various different formats. There's also special casing in the RPC for certain types (like `Balance`) which are returned in a different format, and one needs to be mindful of this when decoding them from nested fields. For example: …
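To make the overall shape of a generated struct class concrete, here is a heavily simplified hypothetical sketch (not the actual generated code):

```ts
// Hypothetical, simplified sketch of a generated struct class (non-generic case).
export interface PoolFields {
  lpSupply: bigint
}

export class Pool {
  static readonly $typeName = '0x<package id>::pool::Pool'
  static readonly $numTypeParams = 2

  readonly $typeArgs: [string, string]

  // struct fields, with nested structs typed via their generated classes
  readonly lpSupply: bigint

  constructor(typeArgs: [string, string], fields: PoolFields) {
    this.$typeArgs = typeArgs
    this.lpSupply = fields.lpSupply
  }

  static fromFields(typeArgs: [string, string], fields: Record<string, any>): Pool {
    return new Pool(typeArgs, { lpSupply: BigInt(fields.lp_supply) })
  }

  // fromFieldsWithTypes, fromBcs, fromSuiParsedData, and fetch follow the same pattern
}
```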
Structs with generic fields and StructClassLoader
Code for structs with generics is similar to the ones without, with a few small differences. Let's look at `sui::dynamic_field::Field<Name, Value>` as an example (full target code here).

Firstly, generics are added to the fields interface and struct class definition: …

It's important to note that generics are added only for non-phantom type parameters. In the `Pool` example above, both type parameters were phantom, so generics were not added there. This is because adding generics for phantom types is technically not necessary for the generated code to function, and also because there's no native support for phantom parameters in TypeScript (perhaps there's a hacky way to do it, but I don't want to go there). The phantom arguments will be stored (along with the others) in the `$typeArgs` field of the struct class instance.

Next, since we have to load objects with generics dynamically during runtime (e.g. when calling `fetch`), when loading generic fields we somehow have to get their respective classes based on the object's type arguments in order to decode them. This is done through `structClassLoader`, which is a framework-level instance of the `StructClassLoader` class.

E.g., in the dynamic field example we have: …

So in (2) and (3), `structClassLoader` is being used to get the right struct class for the field. E.g., if the dynamic field had `0x2::coin::Coin` for its `Value` type parameter, then the `structClassLoader.fromFields(typeArgs[1], fields.value)` call would return a `Coin` instance loaded from the `field.value` object.

Before this can be done though, the struct classes we generated need to be registered with the loader. This is done in the `initLoaderIfNeded()` call (1). It will register struct classes for all generated modules in the package (example), but only the first time it's called.

The code for the `StructClassLoader` can be found here. Its workings are similar to the `BCS` class -- the decode callbacks for each type are stored in a map keyed by type name, and the decode calls (`fromFields` and `fromFieldsWithTypes`) fetch them from there and use them to load the proper instance based on its type.
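A minimal sketch of the loader idea, assuming a simplified registry keyed by type name (hypothetical; the parsing helper is only declared, not the actual implementation):

```ts
// Hypothetical, simplified sketch of the StructClassLoader idea: decode
// callbacks keyed by type name, resolved dynamically from type arguments.
type FromFieldsFn = (typeArgs: string[], fields: Record<string, any>) => any

export class StructClassLoader {
  private registry = new Map<string, FromFieldsFn>()

  register(typeName: string, fromFields: FromFieldsFn) {
    this.registry.set(typeName, fromFields)
  }

  fromFields(type: string, fields: Record<string, any>) {
    // splits e.g. '0x2::coin::Coin<0x2::sui::SUI>' into name and type args
    const [name, typeArgs] = parseTypeNameAndArgs(type)
    const decode = this.registry.get(name)
    if (!decode) throw new Error(`unknown type ${name}`)
    return decode(typeArgs, fields)
  }
}

// parsing helper omitted; splits the type tag into its name and arguments
declare function parseTypeNameAndArgs(type: string): [string, string[]]
```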
Dealing with multiple packages and dependencies

When generating code for a package, its code and the code for its dependencies will be generated at the same level in the `<root>` directory (with the exception of the `0x1`, `0x2`, and `0x3` packages, which will be included in the framework js package). Package folder names will be based on the package names found in their `Move.toml`. E.g.: …

The generator will also support generating code from the chain when the source code isn't available. Since on-chain package objects don't contain `Move.toml` info, package names will be provided manually by the user. For this I would like to introduce a config file (e.g. `gen.toml`), located in the code generation root directory, that would specify the address-to-name mapping for on-chain packages, similar to how it's done in `Move.toml`: …

This config file would also be used to specify which packages to generate the code for. It would support specifying object IDs, git URLs, and local directories: …
This would allow for a very elegant workflow:

1. Create a `gen.toml` file and place it in a directory in your repo
2. Run `sui-client-gen` inside it (no extra args or options need to be passed)

The advantages of this approach (generating code for all dependencies in the repo) are: …
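For concreteness, a hypothetical `gen.toml` might look something like this (all field names and URLs are illustrative, not final):

```toml
# Hypothetical sketch of a gen.toml -- all field names are illustrative.

# which packages to generate code for
[packages]
amm = { git = "https://github.com/example/amm.git", rev = "main" }
fixture = { local = "../move/fixture" }
some-onchain-pkg = { id = "0x1234..." }

# address -> name mapping for on-chain packages (similar to Move.toml)
[addresses]
"0xabcd..." = "other_pkg"
```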
Potential issue -- I expect that people will start building higher-level SDKs that wrap the generated code and introduce higher-level abstractions and helper functions based on the semantics of the package. In that case the issue is that these higher-level SDKs would have to bundle the generated code they depend on, so when importing them into a project that already has generated code you could run into compatibility issues and the diamond problem again. One way this could be solved is for the higher-level SDKs to also be built as code generators that you could plug in to generate them on top of the code generated by `sui-client-gen`.

Generator implementation
The code generator will be implemented in Rust and based on `move-model`. This is so that it can support code generation from both local packages and bytecode modules fetched from chain.

The general implementation outline is this:

1. Read `gen.toml` and start loading the specified packages
2. Build a `GlobalEnv` …
3. Use the `GlobalEnv` to build an IDL-like intermediate representation of all packages and their contents, more suitable for code generation work (see below)

Since move-model's `GlobalEnv` is a bit cumbersome to work with in the context of code generation, I will build an intermediate representation struct more suitable for that work (resembling an IDL). This foundation can then be used for building code generators for other languages.

This representation could look something like this (displayed as JSON): …
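Since the JSON itself is elided above, here is a rough sketch of what such an IDL-like representation could contain (the shape is purely illustrative):

```json
{
  "packages": [
    {
      "name": "amm",
      "address": "0x...",
      "modules": [
        {
          "name": "pool",
          "structs": [
            {
              "name": "Pool",
              "typeParams": [{ "name": "A", "isPhantom": true }],
              "fields": [{ "name": "lp_supply", "type": "u64" }]
            }
          ],
          "functions": [
            {
              "name": "create",
              "visibility": "public",
              "typeParams": ["A", "B"],
              "params": [{ "name": "registry", "type": "&mut Registry" }]
            }
          ]
        }
      ]
    }
  ]
}
```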