Parsed schema contains nested "type" keys #86

Open
devenjahnke opened this issue Oct 14, 2024 · 3 comments

Comments

@devenjahnke

When I call the encodeRecord method on the RecordSerializer, I run into a problem at the point where the schemaId method is called on a Registry class to check that the parsed schema is registered for the subject.

Specifically, the issue seems to stem from subtle differences between the parsed schema and the remote schema. For example, the following snippet is the schema as parsed by this package, after formatting by the prepareJsonSchemaForTransfer function:

{
    "name": "orders",
    "type": [
        {
            "type": "null"
        },
        {
            "type": "array",
            "items": {
                "type": "record",
                "name": "OrderRecord",
                "fields": [
                    {
                        "name": "id",
                        "type": {
                            "type": "string"
                        },
                        "doc": "Id(uuid v7) of the order"
                    },
                    {
                        "name": "isReady",
                        "type": {
                            "type": "boolean"
                        }
                    }
                ]
            }
        }
    ],
    "default": null,
    "doc": "Orders associated to customer"
}

Whereas, the following snippet is the schema returned from a GET request to the registry:

{
    "name": "orders",
    "type": [
        "null",
        {
            "type": "array",
            "items": {
                "type": "record",
                "name": "OrderRecord",
                "fields": [
                    {
                        "name": "id",
                        "type": "string",
                        "doc": "Id(uuid v7) of the order"
                    },
                    {
                        "name": "isReady",
                        "type": "boolean"
                    }
                ]
            }
        }
    ],
    "doc": "Orders associated to customer",
    "default": null
}
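To pin down exactly where the two documents diverge, a small structural diff helps. The following is a minimal sketch in Python, not part of the package: `diff_paths`, `MISSING`, and the JSON-path-like strings are hypothetical names chosen for illustration. Object key order is ignored, so the swapped "doc" and "default" keys are not reported.

```python
import json

MISSING = "<missing>"  # sentinel for a key present on one side only


def diff_paths(a, b, path="$"):
    """Yield (path, a_value, b_value) wherever two JSON values differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            if key not in a or key not in b:
                yield (f"{path}.{key}", a.get(key, MISSING), b.get(key, MISSING))
            else:
                yield from diff_paths(a[key], b[key], f"{path}.{key}")
    elif isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        for i, (x, y) in enumerate(zip(a, b)):
            yield from diff_paths(x, y, f"{path}[{i}]")
    elif a != b:
        yield (path, a, b)


# Schema as parsed by the package (copied from this issue).
parsed = json.loads("""
{"name": "orders",
 "type": [{"type": "null"},
          {"type": "array",
           "items": {"type": "record", "name": "OrderRecord",
                     "fields": [
                       {"name": "id", "type": {"type": "string"},
                        "doc": "Id(uuid v7) of the order"},
                       {"name": "isReady", "type": {"type": "boolean"}}]}}],
 "default": null,
 "doc": "Orders associated to customer"}
""")

# Schema as returned by the registry (copied from this issue).
remote = json.loads("""
{"name": "orders",
 "type": ["null",
          {"type": "array",
           "items": {"type": "record", "name": "OrderRecord",
                     "fields": [
                       {"name": "id", "type": "string",
                        "doc": "Id(uuid v7) of the order"},
                       {"name": "isReady", "type": "boolean"}]}}],
 "doc": "Orders associated to customer",
 "default": null}
""")

for where, left, right in diff_paths(parsed, remote):
    print(where, json.dumps(left), "vs", json.dumps(right))
```

On these two documents the only differences reported are the three places where a primitive type is wrapped in an object: the "null" branch of the union and the types of the "id" and "isReady" fields.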

When I make a POST request to the registry for the subject with the second schema (the one returned by the GET request), I receive a successful response. But when I make the same POST request with the first schema (the one parsed by the package), I get the following error:

{
    "error_code": 40403,
    "message": "Schema not found io.confluent.rest.exceptions.RestNotFoundException: Schema not found"
}

Any advice on the issue and how I can proceed is greatly appreciated. Cheers!

@devenjahnke changed the title from 'Parsed schema contains nests "type" keys' to 'Parsed schema contains nested "type" keys' on Oct 14, 2024
@devenjahnke

The issue seems to occur in the to_avro method of the AvroField class. When setting the AvroSchema::TYPE_ATTR value, $this->is_type_from_schemata evaluates to false, so $this->type->to_avro() is called, which returns the nested { "type": "..." } object.

@devenjahnke

I've found the origin of the issue. When parsing the fields of an AvroRecordSchema a new AvroField is created. For example:

AvroField^ {#686
  +type: AvroPrimitiveSchema^ {#672
    +type: "string"
    #extra_attributes: []
    #logical_type: null
    -serialize_type_attribute: true
  }
  #extra_attributes: null
  #logical_type: null
  -name: "id"
  -has_default: false
  -default: null
  -order: null
  -is_type_from_schemata: false
  -doc: "Id(uuid v7) of the underlying resource"
  -precision: null
  -scale: null
}

When this field is later JSON encoded using the json_encode function, the type property of the AvroField is encoded as an object containing a type key for the nested AvroPrimitiveSchema. For example, the type of the field dumped above encodes to:

{
    "type": {
        "type": "string"
    }
}

This is the origin of the nested type keys, resulting in the registry not recognizing the schema upon validation.
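The mechanism described above can be modelled in a few lines. This is a toy Python model, not the library's PHP code: the class names `PrimitiveSchema` and `Field` are hypothetical, and the behaviour of the `serialize_type_attribute` flag is an assumption inferred from the dump and the observations in this thread.

```python
from dataclasses import dataclass


@dataclass
class PrimitiveSchema:
    """Toy stand-in for a primitive schema such as "string" or "boolean"."""
    type: str
    # Assumed behaviour: when True, the primitive serializes as an object.
    serialize_type_attribute: bool = True

    def to_avro(self):
        if self.serialize_type_attribute:
            return {"type": self.type}   # wrapped form: {"type": "string"}
        return self.type                 # bare form: "string"


@dataclass
class Field:
    """Toy stand-in for a record field that delegates to its type."""
    name: str
    type: PrimitiveSchema

    def to_avro(self):
        # The field always emits its own "type" key; if the primitive also
        # emits one, the result is the nested {"type": {"type": ...}} shape.
        return {"name": self.name, "type": self.type.to_avro()}


nested = Field("id", PrimitiveSchema("string")).to_avro()
flat = Field("id", PrimitiveSchema("string", serialize_type_attribute=False)).to_avro()
print(nested)
print(flat)
```

In this model the first field serializes with the nested "type" object and the second with the bare string, which matches the difference between the parsed schema and the one stored in the registry.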

@devenjahnke

Ensuring that $serialize_type_attribute is set to false when an AvroPrimitiveSchema is created resolves the type nesting issue. However, I am unsure how to override where it is otherwise being set to true without modifying the vendor files directly, as I have done while debugging.
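One possible way to avoid touching vendor files is to post-process the schema JSON before it is sent to the registry, collapsing every object of the form { "type": "<primitive>" } into the bare primitive name. The Avro specification treats the two forms as equivalent for primitive types. The sketch below is in Python and only illustrates the transformation; `collapse_primitives` is a hypothetical helper, and where such a step could be hooked into the package is left open here. The same recursion could be written in PHP over the output of json_decode.

```python
# Primitive type names defined by the Avro specification.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def collapse_primitives(node):
    """Return a copy of a schema where {"type": "<primitive>"} becomes "<primitive>"."""
    if isinstance(node, list):
        return [collapse_primitives(item) for item in node]
    if isinstance(node, dict):
        only_type = set(node) == {"type"}
        if only_type and isinstance(node["type"], str) and node["type"] in PRIMITIVES:
            return node["type"]
        # "default" holds data, not schema, so it is copied through untouched.
        return {
            key: (value if key == "default" else collapse_primitives(value))
            for key, value in node.items()
        }
    return node


field = {"name": "id", "type": {"type": "string"}, "doc": "Id(uuid v7) of the order"}
print(collapse_primitives(field))
```

Objects that carry anything besides "type" (for example a logical type annotation) are left as they are, since collapsing them would drop information.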
