Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Enum and JSONString type #30

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

magnudae
Copy link
Contributor

We were working with some new nodes that were using Enums and JSONStrings and encountered some issues

With Enums we would always have to call the .value everywhere since there were no Enum interpreter in Nebula.
Since Enums are basically strings I thought we could use a type that would just save the value.
I also wanted to be able to interpret it on the way up from the DB, but that might be something needed in the main nebula package?

For JSON strings, the syntax of JSON dictates it uses double quotes. If we just use the String type in carina we get error because it just wraps the whole thing in double quotes and it becomes an invalid object. Solution was to for this type switch to wrapping in single quotes.

Both cases we could handle by making our own wrappers around, but I thought this might be valuable to have within Carina, to help the users along.

@wey-gu Please guide me here. Is this out of scope? Do you guys have something to deal with this already?
Or is there something I havent thought of here?

@magnudae
Copy link
Contributor Author

@wey-gu Lets put this a bit on hold.

So the first issue I encounter is that Enum and JSON arent represented as types in Nebula, for obvious reasons.

For this approach to work, we would need to have carina being able to "enable" more comples types with validation, but underneath translating them to primitive types handled by nebula-python.

I am thinking abit how to actually do that in a clean way. But the PR as it stands right now, does not.

I do still think it would be benefitical to "support" other types for easier and more seemless validation with pydantic, but it has to be done in a clean and readable way.

Im gonna think a bit and revise thise PR. Put it on hold for now, and Ill ping you again.

@magnudae
Copy link
Contributor Author

@wey-gu

@magnudae magnudae closed this Oct 25, 2024
@wey-gu
Copy link
Member

wey-gu commented Oct 26, 2024

Sorry was super busy these days.

My idea on implementing enum in NebulaGraph would be:

  • as you mentioned, put enum value(string enum as I assumed) to be underneath persisted.
  • maintain metadata of the extra typing info in NebulaGraph leveraging comment fields of property (in jsonstring structure?)
  • implement enum validation and enforcement with hooks here in ORM level.

What do you think, please?

Regarding JSON:

  • for nested structure, if edge-vertex is not accepted, we could do json string as string prop here, and NebulaGraph comes with json_extract buildin function, but please be noted it comes with limitations(depth-2).
  • if the jsonstring was introduced for adding capabilities on bulk fields, there is another approach feasible here, a edge with target to itself, by introducing such, we could put properties under such self-loop edges as depth-1 list prop representation and corresponding abstraction could be done here in orm, too.

What do you think?

@magnudae sorry again to let you down on being so late responses.

@magnudae
Copy link
Contributor Author

@wey-gu No worries.

For the enum, that sounds similar to what ive been thinking of.

For the JSON structure, im currently trying out in my fork a way to extend the current String DataType, to handle JSON.

Suggestion 1 might be the easiest to do, but with a depth of 2 that would be quite limiting. I think we could expect the user of carina to extract the JSON as long as carina returns the string.

Suggestion 2, I think I understand the concept, but wouldnt that be a bit complex since we would need EdgeTypes for each field? Or would we introduce some kind of "JSON" EdgeType for this. Depth would also be a problem here, since it sounds complex to calculate depth on these edges from a JSON string, unless that comes predefined by the user (maybe?)

Anyway, our use case is currently that we have JSON configs that trigger different actions, depending on the context of the user.
Maybe there is a way to model this better with Vertex-edges, which would remove the need to save the JSON string.

I will work a bit further and hopefully drop a PR here with some suggestions.

Also, sorry for pinging you so much, just been quite exciting to explore Nebula

@wey-gu
Copy link
Member

wey-gu commented Oct 29, 2024

  1. I agree that we don't rely on server-side json_extract first, but we could also improve the function itself to remove the limitation that I introduced (sorry, but either you or I could make the NebulaGraph main repo PR on this?)
  2. indeed, that's too heavy to have 1-1 field-EdgeType mapping. Maybe one EdgeType would be more practical, especially with an improved json_extract function.

Thanks and looking forward to your exploration :)

To enable notifications on not missing this, I'll reopen this pr first(don't know how to transfer a PR to an issue or discussion thread)

@wey-gu wey-gu reopened this Oct 29, 2024
@magnudae
Copy link
Contributor Author

Im not familiar enough with C++, so I dont think I will propose anything for Nebula yet

I am working with slight change of the DataType structure here in Carina, that moves the JSON handling up from the database layer, so that the DB only knows about a string.
This is probably less performant than with a JSON_EXTRACT, but not sure if JSON seralization/deserialization ever will be very performant. And for our use case, its not handing many at the same time.

Will probably take me some more time than before to get a proper PR ready, since I have it similar to you, a lot on my plate right now.

@wey-gu
Copy link
Member

wey-gu commented Oct 30, 2024

Yeah sure, the first version to be purely orm side should already meet your needs, we could then make the extraction server-side when some real perf issue encountered(not likely I guess, either).

Looking forward to your work🫡 and take your time, please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants