Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Codecs Upgrade to Lucene99 Codec #95

Conversation

sarthakaggarwal97
Copy link
Collaborator

Description

This change would upgrade the underlying Lucene95 codec of the custom codecs to Lucene99 codec.

Issues Resolved

#94

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sarthakaggarwal97
Copy link
Collaborator Author

sarthakaggarwal97 commented Jan 17, 2024

@reta @backslasht @mgodwan please take a look. Working simultaneously on the upgrade ITs to ensure correctness.

/**
* ZStandard mode with dictionary
*/
ZSTD("ZSTD99", Set.of("zstd")),
Copy link
Collaborator

@reta reta Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sarthakaggarwal97 I don't think we should approach codec backward compatibility this way - it will cause an explosion of compression codecs. Ideally, we should be able:

  • read / write indices that use older (9.5) codec
  • create new indices using new (9.9) codec
  • read / write new indices that use (9.9) codec

The codec name stays the same zstd / zstdnodict. @msfroh does it make sense (from Apache Lucene perspective)?

Copy link
Collaborator Author

@sarthakaggarwal97 sarthakaggarwal97 Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reta I was initially not planning to introduce new codecs. I was looking to extend the current ZSTD and ZSTD_NO_DICT codecs from Lucene99 instead of Lucene95, and finally read the stored fields using the same Lucene95CustomStoredFieldsFormat, but I started to run into problems.

This initial approach worked well to ensure that we are now indexing and reading the newly created segments with Lucene99 codec in the background, but upon reading the older segments (created with zstandard in <OSv2.12), things started to fail.

We would see the shards are getting unassigned upon recovery with the exception

{
    "index": "index-26",
    "shard": 0,
    "primary": true,
    "current_state": "unassigned",
    "unassigned_info": {
        "reason": "CLUSTER_RECOVERED",
        "at": "2024-01-17T11:11:00.285Z",
        "last_allocation_status": "no_valid_shard_copy"
    },
    "can_allocate": "no_valid_shard_copy",
    "allocate_explanation": "cannot allocate because all found copies of the shard are either stale or corrupt",
    "node_allocation_decisions": [
        {
            "node_id": "GK-KmiQoQEWlE44bvkt2Rg",
            "node_name": "bcd0743e920a.ant.amazon.com",
            "transport_address": "127.0.0.1:9300",
            "node_attributes": {
                "shard_indexing_pressure_enabled": "true"
            },
            "node_decision": "no",
            "store": {
                "in_sync": true,
                "allocation_id": "CnmW07YyQtqjAI0G4xSOMQ",
                "store_exception": {
                    "type": "corrupt_index_exception",
                    "reason": "checksum status indeterminate: remaining=0; please run checkindex for more details (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path=\"/Users/sarthagg/workplace/actual/OpenSearch/build/distribution/local/opensearch-3.0.0-SNAPSHOT/data/nodes/0/indices/DthI0nwRTqafGvWDltUf1g/0/index/_f.si\")))",
                    "suppressed": [
                        {
                            "type": "e_o_f_exception",
                            "reason": "read past EOF: NIOFSIndexInput(path=\"/Users/sarthagg/workplace/actual/OpenSearch/build/distribution/local/opensearch-3.0.0-SNAPSHOT/data/nodes/0/indices/DthI0nwRTqafGvWDltUf1g/0/index/_f.si\")"
                        },
                        {
                            "type": "corrupt_index_exception",
                            "reason": "checksum passed (94fdd3c1). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path=\"/Users/sarthagg/workplace/actual/OpenSearch/build/distribution/local/opensearch-3.0.0-SNAPSHOT/data/nodes/0/indices/DthI0nwRTqafGvWDltUf1g/0/index/segments_7\")))"
                        }
                    ]
                }
            }
        }
    ]
}

Even though the codecs name were same, we would still see the recovery to fail. The only difference was that instead of Lucene95 as the delegate codec, we were using Lucene99 as the delegate codec. It should have worked since even with Lucene99, lucene is still relying on the Lucene90StoredFieldsFormat.

So what changed?

There was a slight change in the way we parse the segments in Lucene95 and Lucene99.
This comparison can be viewed over here: https://editor.mergely.com/Mb1lKm8z

There is an additional call to readbyte in the Lucene99SegmentInfoFormat, post which we see EOF exception if we try to read the old segments with Lucene99SegmentInfoFormat.

It was in this commit where this change in format and how we parse the segments was introduced: apache/lucene@6677109

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, it definitely makes sense, I was confused a bit why we have to duplicate Mode for each codec, but looking into code it is clear - we use it as codec name, so what you've implemented makes perfect sense. Thanks @sarthakaggarwal97 !

@reta
Copy link
Collaborator

reta commented Jan 22, 2024

I think we would need to have a BWC tests here: the codecs created by previous versions of the plugin could be picked up by the new versions.

@sarthakaggarwal97
Copy link
Collaborator Author

@reta makes sense. Have moved the lucene95 codecs to a backward_codecs directory.
While I'm looking to introduce the bwc tests in this plugin, do you think we should take the tests in a separate PR or in the same one?

@reta
Copy link
Collaborator

reta commented Jan 23, 2024

While I'm looking to introduce the bwc tests in this plugin, do you think we should take the tests in a separate PR or in the same one?

I would feel more confident having the change and the tests together, the is a risk (sadly, always) to break on upgrades, thanks @sarthakaggarwal97

Comment on lines -1 to +5
org.opensearch.index.codec.customcodecs.ZstdCodec
org.opensearch.index.codec.customcodecs.ZstdNoDictCodec
org.opensearch.index.codec.customcodecs.ZstdDeprecatedCodec
org.opensearch.index.codec.customcodecs.backward_codecs.Zstd95Codec
org.opensearch.index.codec.customcodecs.backward_codecs.ZstdNoDict95Codec
org.opensearch.index.codec.customcodecs.backward_codecs.Zstd95DeprecatedCodec
org.opensearch.index.codec.customcodecs.Zstd99Codec
org.opensearch.index.codec.customcodecs.ZstdNoDict99Codec
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can safely change these.

Codecs are loaded via reflection. The segment infos holds the full class name.

I think you should be able to test this manually by writing a little tool in this package that writes an index. (I would just use IndexWriter directly with an IndexWriterConfig that uses your custom codec. Here's an example.)

  1. Write an index without this commit. That will simulate the "2.11 behavior".
  2. Apply this commit. Open an IndexReader over the 2.11 index.

I believe that you *must keep the old fully-qualified class name to maintain backward compatibility (which unfortunately doesn't mention the version). Going forward, you can+should include version numbers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codecs are loaded via reflection. The segment infos holds the full class name.

Hm ... shouldn't it be code name (not class name)? Lucene moves codec under backward_codecs package every time, how does it work there?

Copy link
Member

@mgodwan mgodwan Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding as well is that the provided name within the constructor for codec class implementation is being used (based on how NamedSPILoader is being used in Lucene).

https://github.com/apache/lucene/blob/3a205feb27de8450c3bac7242f3c976ce5d21436/lucene/core/src/java/org/apache/lucene/codecs/Codec.java#L66-L68

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this by restoring the data indexed with zstd codecs in OSv2.11 to OSv3.0. Was able to ingest, and search after moving the codecs to the backward_codecs folder.
Here is part of the output when I hit GET <index>/_segments locally, was able to see both old and new segments (before and after the lucene upgrade).

                        "segments": {
                            "_f": {
                                "generation": 15,
                                "num_docs": 1767,
                                "deleted_docs": 0,
                                "size_in_bytes": 53849,
                                "memory_in_bytes": 0,
                                "committed": false,
                                "search": true,
                                "version": "9.7.0",
                                "compound": false,
                                "attributes": {
                                    "Lucene95CustomStoredFieldsFormat.mode": "ZSTD_NO_DICT"
                                }
                            },
                            "_v": {
                                "generation": 31,
                                "num_docs": 432,
                                "deleted_docs": 0,
                                "size_in_bytes": 14885,
                                "memory_in_bytes": 0,
                                "committed": false,
                                "search": true,
                                "version": "9.9.1",
                                "compound": true,
                                "attributes": {
                                    "Lucene99CustomStoredFieldsFormat.mode": "ZSTD_NO_DICT"
                                }
                            },

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh... got it. Thanks @mgodwan for the link to the codec comment -- I had mistakenly assumed the FQCN got written to SegmentInfos.

In that case, we should be good to go once we have the BWC test. 👍

@sarthakaggarwal97
Copy link
Collaborator Author

sarthakaggarwal97 commented Jan 25, 2024

@reta @msfroh @mgodwan
I'm trying to introduce the support for bwc tests in custom-codecs plugin, based on the implementation of security plugin, but looks like I'm running into the same issue for some time now.

It would be great if I can get some inputs on this draft change, since we need to close on this before v2.12 release. Here is the link for the failed bwc task for reference.

Thank you!

@mgodwan
Copy link
Member

mgodwan commented Jan 25, 2024

I understand that currently 2.12 build is failing as we don't have the required custom codecs change in.
Due to this, the bwc tests are being run against 2.11 which is not a wire compatible version for main/3.0

This would cause any mixed mode to fail in your tests today.

This is a raw PR which runs against 2.12 as the current version and verify its bwc (at least from a cluster run perspective) with 2.11: #98

In order to have a path forward, I think we can try following approaches:

  1. Raise a PR directly against 2.x to fix the 2.12 branch. Raise the same PR with required bwc versions in main
  2. Raise a PR in main with failing bwc tests. Backport to 2.x and then fix the main branch code.

@reta @sarthakaggarwal97 Thoughts?

@sarthakaggarwal97
Copy link
Collaborator Author

@mgodwan thank you for your inputs. It looks like the minimum_wire_compatibility_version was the issue all along.
Option 2 looks good to me (with monitoring that, post merge to 2.x, things still look okay), not sure we follow the practice of merging in 2.x prior to main when it comes to major code changes.

@reta what would you suggest?

@reta
Copy link
Collaborator

reta commented Jan 25, 2024

@reta what would you suggest?

Sorry @sarthakaggarwal97 , just got to that, looking into the options

@reta
Copy link
Collaborator

reta commented Jan 25, 2024

@reta what would you suggest?

Sorry @sarthakaggarwal97 , just got to that, looking into the options

@sarthakaggarwal97 I think we may go 2 step processes, @mgodwan @msfroh please validate me here:

  1. Update 2.x (2.12.0-SNAPSHOT) to reflect Apache Lucene 9.9.1 update: at this moment we would use backward_codecs for Lucene 9.5, that would bring 2.x back on track.
  2. Move on with BWC tests as planned

Alternatively, we could do 2nd option (as @mgodwan suggested) and have 2 pull requests, I think it would be better and less cycles wasted:

  1. Raise a PR directly against 2.x to fix the 2.12 branch. Raise the same PR with required bwc versions in main

What do you think?

@msfroh
Copy link

msfroh commented Jan 25, 2024

What do you think?

I think @mgodwan's suggestion is probably smoother (less of a BWC dance). While we don't normally like to commit directly to 2.x, I think it's reasonable to unblock us here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants