Document how to handle multiple data types simultaneously #5

ivarflakstad · 2023-01-04T09:54:23Z

Wall of text incoming!

TLDR

My conclusion is this:
We should inform users that if they have identical structs within an enum the de/serialization may not work as they expect.
If they absolutely must have identical structs in an enum they have to use #serde[tag] and remember to write their queries with this in mind (i.e. remember to add type: $type). Otherwise it will not deserialize the result of a select correctly, but then again serde will error if the tag is not included with Error("missing field type", line: 0, column: 0), so with a bit of documentation I think it's fair to give the responsibility to the user.

Problem

With a query (like MATCH p=()-->() return p) you will get all vertices and edges in the db.
For example let's say we get "A -[E]->[F]-> B".

We can't fit both the A and B types into V directly, so we have to use an enum or perhaps Box<dyn SomeTrait>. Same goes for edges with E and F.
I think enums are the correct choice here.

// We already have the structs defined above.

#[derive(Deserialize, Serialize, PartialEq)]
enum ExampleVertex {
    A(A),
    B(B)
}

#[derive(Deserialize, Serialize, PartialEq)]
enum ExampleEdge {
    E(E),
    F(F),
}

// `result` holds the result from querying "MATCH p=()-->() return p",
for row in result {
    let path: Path<ExampleVertex, ExampleEdge> = row.get(0);
}

So far so good.

However, if any of our structs are evaluated as identical by serde it will deserialize to the first struct that matches the data in the row.

#[derive(Deserialize, Serialize)]
struct A {
    id: usize
}

#[derive(Deserialize, Serialize)]
struct B {
    id: usize
}

#[derive(Deserialize, Serialize)]
enum ExampleEnum {
    A(A)
    B(B)
}

The contents of a B could be deserialized into a A here, resulting in something like
Vertex { id: 123456789, label: "B", properties: A(A { id: 1 }) }

Suggested solution

The absolutely simplest solution is obviously to recommend that users of the library do not use identical structs within an enum 😆

But if we want to support it the best way to deal with this (in my opinion) is to inform the users that they can use serde tagging.

#[derive(Deserialize, Serialize, PartialEq)]
#[serde(tag = "type")]
enum ExampleEnum {
    A(A)
    B(B)
}

If we now create an instance of B that includes the corresponding type field

CREATE(b:B {id: 1, type:'B'})  // type:'B' is the important part
MATCH (b: B) WHERE b.id = 1 RETURN b

We can now query using our enum and will deserialize to the correct enum variant.

let b_instance: Vertex<ExampleVertex> = row.get(0);

Resulting in

Vertex { id: 844424930131969, label: "B", properties: B(B { id: 1 }) }

I've tested this approach and I am pretty sure that this works in all cases, but we should definitely add some tests.

Also it's important to remember that if you're using AgType to save models you must remember to include the type in the query.

fn insert_example_vertex(v: ExampleVertex) {
  client.execute_cypher(
      'my_graph',
      "CREATE(b: B { id: $id, type: $type })",  // <- here
      Some(AgType::<ExampleVertex>(v)
  )
}

Forgetting $type will cause the enum to not be deserialized correctly, but then again serde will error if the tag is not included with Error("missing field type", line: 0, column: 0), so with a bit of documentation I think it's fair to give the responsibility to the user.

Alternatives

For completeness I'm including some alternative approaches that I personally think aren't as good.

Deserialize twice

You can deserialize to a Vertex first (or a serde json Value), then match on the label, and finally deserialize that into the correct model.
Note that I haven't properly tested this technique, but it should work.

fn from_label(vertex_string: str) -> ExampleVertex {
    // Using Vertex<String> here because we're only interested in the label
    match serde_json::from_str::<Vertex<String>>(vertex_string) {
        Ok(vertex) => {
            match vertex.label() {
                "A" => match serde_json::from_str::<Vertex<A>>(vertex_string) {
                    Ok(a) => Ok(ExampleVertex::A(a)),
                    Err(e) => Err(e),
                },
                "B" => match serde_json::from_value::<B>(value) {
                    Ok(b) => Ok(ExampleVertex::B(b)),
                    Err(e) => Err(e),
                },
            }
        }
    }
}

I do not think this is as good an approach as serde tagging because it is

Ugly.
Annoying to require the users of the lib to write all this boilerplate.

Custom de/serializer

We could create custom de/serializer that inspects the label and uses that de/serialize correctly.
Using something like strum AsRefStr would be useful, but we then we would have to differentiate between simple structs and enums within the method.
I suspect this will be quite a bit of extra work, so I don't recommend this approach.

The text was updated successfully, but these errors were encountered:

ivarflakstad mentioned this issue Jan 4, 2023

Initial path support #4

Merged

Dzordzu added this to the Good documentation milestone Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document how to handle multiple data types simultaneously #5

Document how to handle multiple data types simultaneously #5

ivarflakstad commented Jan 4, 2023 •

edited

Loading

Document how to handle multiple data types simultaneously #5

Document how to handle multiple data types simultaneously #5

Comments

ivarflakstad commented Jan 4, 2023 • edited Loading

TLDR

Problem

Suggested solution

Alternatives

Deserialize twice

Custom de/serializer

ivarflakstad commented Jan 4, 2023 •

edited

Loading