Post: quarkus.search.io series - rolling over
1 parent 538bdd8, commit 1a30f0c
Showing 5 changed files with 277 additions and 1 deletion.
@@ -513,4 +513,11 @@ yrodiere:
  emailhash: "2a8bdd4ffd282b7185c74b52ab452617"
  job_title: "Principal Software Engineer"
  twitter: "yoannrodiere"
  bio: "Lead developer on Hibernate Search (http://hibernate.org/search/), and one of the main contributors to the Hibernate extensions (ORM, Search, Validator) of Quarkus."
markobekhta:
  name: "Marko Bekhta"
  email: "[email protected]"
  emailhash: "2934f00ba9190bc06cf03fde5b50c61d"
  job_title: "Engineer (Software)"
  twitter: "that_java_guy"
  bio: "Software Engineer at Red Hat and Hibernate team member."
@@ -0,0 +1,269 @@
---
layout: post
title: 'Indexing rollover with Quarkus and Hibernate Search'
date: 2024-02-25
tags: hibernate search howto
synopsis: 'This is the first post in the series that dives into the implementation details of the search.quarkus.io application. Are you interested in near zero-downtime reindexing? Then this one is for you!'
author: markobekhta
---

:imagesdir: /assets/images/posts/search-indexing-rollover
:hibernate-search-docs-url: https://docs.jboss.org/hibernate/search/{hibernate-search-version}/reference/en-US/html_single/
:quarkus-hibernate-search-docs-url: https://quarkus.io/guides/hibernate-search-orm-elasticsearch

This is the first post in the series diving into the implementation details of the
link:https://github.com/quarkusio/search.quarkus.io[application] backing the guide search of
link:https://quarkus.io/guides/[quarkus.io].

Does your application have full-text search capabilities and use Hibernate Search?
Do you need to reindex your data while keeping the application running and serving search results?
Look no further. In this post, we'll cover how you can approach this problem
and solve it in practice with a few low-level APIs provided by Hibernate Search.

The approach suggested in this post is based on the fact that Hibernate Search uses
link:{hibernate-search-docs-url}#backend-elasticsearch-indexlayout[aliased indexes]
and communicates with the actual index through a read or a write alias, depending on the operation it needs to perform.
For example, a search operation will be routed through the read index alias,
while an indexing operation will be sent through the write index alias.

image::initial-app.png[]
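
To make the alias layout concrete, here is a minimal sketch of reading the alias names that Hibernate Search assigned to an indexed entity
(`MyIndexedEntity` is a placeholder, and the example values assume the default index layout);
these are the `readAliasName`/`writeAliasName` values used by the alias-switching requests later in this post:

[source, java]
====
----
@Inject
SearchMapping searchMapping; // <1>
// ...
var descriptor = searchMapping.indexedEntity(MyIndexedEntity.class) // <2>
        .indexManager().unwrap(ElasticsearchIndexManager.class)
        .descriptor(); // <3>
String readAliasName = descriptor.readName(); // <4>
String writeAliasName = descriptor.writeName();
----
1. Inject `SearchMapping` somewhere in your app.
2. Look up the indexed entity we are interested in.
3. Unwrap the Elasticsearch-specific index manager and get its index descriptor.
4. With the default index layout, these are for example `myindexedentity-read` and `myindexedentity-write`.
====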

NOTE: This approach is implemented and successfully used in our Quarkus application that backs the guide search of quarkus.io/guides.
You can see the complete implementation here:
link:https://github.com/quarkusio/search.quarkus.io/blob/d956b6a1341d8693fa1d6b7881f3840f48bdaacd/src/main/java/io/quarkus/search/app/indexing/Rollover.java#L44-L331[rollover implementation]
and link:https://github.com/quarkusio/search.quarkus.io/blob/d956b6a1341d8693fa1d6b7881f3840f48bdaacd/src/main/java/io/quarkus/search/app/indexing/IndexingService.java#L226-L244[rollover usage].

Now, since we want to keep our application serving search results and accepting new or updated documents,
we cannot perform a simple reindexing operation (purge all documents from an index and mass-index them back)
using a link:{hibernate-search-docs-url}#search-batchindex-massindexer[mass indexer]
or the recently added link:{quarkus-hibernate-search-docs-url}#management[management Quarkus endpoint],
as this would drop all existing documents from the index, and search operations would not be able to match them anymore.
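
For contrast, here is a minimal sketch of that simple in-place approach with the mass indexer
(again with the placeholder `MyIndexedEntity`); it is exactly what we want to avoid here,
since the index stays mostly empty until reindexing finishes:

[source, java]
====
----
@Inject
SearchSession searchSession; // <1>
// ...
searchSession.massIndexer(MyIndexedEntity.class) // <2>
        .startAndWait(); // <3>
----
1. Inject a Hibernate Search `SearchSession`.
2. Create a mass indexer for the entity to reindex; by default it purges the index before reindexing.
3. Block until reindexing finishes; until then, searches only match the documents indexed so far.
====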

Instead, we can create a new index with the same schema and route any write operations to it.

image::write-app.png[]

Since, at the moment, Hibernate Search does not provide a rollover feature out of the box,
we need to resort to lower-level APIs to access the Elasticsearch client and perform the required operations ourselves.
To do so, we need to follow a few simple steps:

1. Get the mapping and settings information for the index we want to reindex, using the schema manager.
+
[source, java]
====
----
@Inject
SearchMapping searchMapping; // <1>
// ...
searchMapping.scope(MyIndexedEntity.class).schemaManager() // <2>
        .exportExpectedSchema((backendName, indexName, export) -> { // <3>
            var createIndexRequestBody = export.extension(ElasticsearchExtension.get()).bodyParts().get(0); // <4>
            var mappings = createIndexRequestBody.getAsJsonObject("mappings"); // <5>
            var settings = createIndexRequestBody.getAsJsonObject("settings"); // <6>
        });
----
1. Inject `SearchMapping` somewhere in your app so that we can use it to access a schema manager.
2. Get a schema manager for the indexed entity we are interested in (`MyIndexedEntity`).
If all entities should be targeted, then `Object.class` can be used to create the scope.
3. Use the schema export API to access the mapping information.
4. Use the extension to get access to the Elasticsearch-specific `.bodyParts()` method that returns
the JSON HTTP body needed to create the index.
5. Get the mappings for the particular index.
6. Get the settings for the particular index.
====
2. Get a reference to the Elasticsearch client, so we can perform API calls against the search backend cluster:
+
[source, java]
====
----
@Inject
SearchMapping searchMapping; // <1>
// ...
RestClient client = searchMapping.backend() // <2>
        .unwrap(ElasticsearchBackend.class) // <3>
        .client(RestClient.class); // <4>
----
1. Inject `SearchMapping` somewhere in your app so that we can use it to access the backend.
2. Access the backend from the search mapping instance.
3. Unwrap the backend to `ElasticsearchBackend`, so that we can access backend-specific APIs.
4. Get a reference to the Elasticsearch REST client.
====
3. Create a new index using the OpenSearch/Elasticsearch rollover API,
which lets us keep using the existing index for read operations
while write operations are sent to the new index:
+
[source, java]
====
----
@Inject
SearchMapping searchMapping; // <1>
// ...
SearchIndexedEntity<?> entity = searchMapping.indexedEntity(MyIndexedEntity.class);
var index = entity.indexManager().unwrap(ElasticsearchIndexManager.class).descriptor(); // <2>
var request = new Request("POST", "/" + index.writeName() + "/_rollover"); // <3>
// "mappings" and "settings" were collected in step 1; "gson" is a Gson instance:
var body = new JsonObject();
body.add("mappings", mappings);
body.add("settings", settings);
body.add("aliases", new JsonObject()); // <4>
request.setEntity(new StringEntity(gson.toJson(body), ContentType.APPLICATION_JSON));
var response = client.performRequest(request); // <5>
// ...
----
1. Inject `SearchMapping` somewhere in your app so that we can use it to access the index manager of the indexed entity.
2. Get the index descriptor so we can read the index aliases from it.
3. Start building the rollover request using the write index alias from the index descriptor.
4. Note that we are including an empty "aliases" object so that the aliases are not copied over to the new index,
except for the write alias.
We don't want the read alias to start pointing to the new index immediately.
5. Perform the rollover API request using the Elasticsearch REST client obtained in the previous step;
the response reports the names of the old and the new index (see the sketch right after these steps).
====
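
The alias-switching requests below need the names of the old and the new index, and the `_rollover` response obtained above reports both.
Here is a minimal sketch of extracting them, assuming the same Gson classes already used for the request body:

[source, java]
====
----
// The _rollover response reports both index names, e.g.:
// {"old_index": "myindexedentity-000001", "new_index": "myindexedentity-000002", ...}
var rolloverResponseBody = JsonParser.parseReader(
        new InputStreamReader(response.getEntity().getContent(), StandardCharsets.UTF_8))
        .getAsJsonObject();
String oldIndexName = rolloverResponseBody.get("old_index").getAsString();
String newIndexName = rolloverResponseBody.get("new_index").getAsString();
----
====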

With this successfully completed, we can start populating the write index. Once indexing is done,
we can either commit or roll back, depending on the results:

image::after-indexing.png[]

Committing the index rollover means that we are happy with the results and are ready to switch to the new index
for both read and write operations, removing the old index. To do that, we need to send a request to the cluster:

[source, java]
====
----
var client = ... // <1>
var request = new Request("POST", "_aliases"); // <2>
request.setEntity(new StringEntity("""
        {
          "actions": [
            {
              "add": { // <3>
                "index": "%s",
                "alias": "%s",
                "is_write_index": false
              }
            },
            {
              "remove_index": { // <4>
                "index": "%s"
              }
            }
          ]
        }
        """.formatted( newIndexName, readAliasName, oldIndexName ), // <5>
        ContentType.APPLICATION_JSON));
var response = client.performRequest(request);
// ...
----
1. Get access to the Elasticsearch REST client as described above.
2. Start creating an `_aliases` API request.
3. Add an action to update the index aliases to use the new index for both read and write operations.
Here, we must make the read alias point to the new index.
4. Add an action to remove the old index.
5. The names of the new/old index can be retrieved from the response of the initial `_rollover` API request,
while the alias names can be retrieved from the index descriptor.
====

Otherwise, if we have encountered an error or decided for any other reason to stop the rollover, we can roll back to using
the initial index:

[source, java]
====
----
var client = ... // <1>
var request = new Request("POST", "_aliases"); // <2>
request.setEntity(new StringEntity("""
        {
          "actions": [
            {
              "add": { // <3>
                "index": "%s",
                "alias": "%s",
                "is_write_index": true
              }
            },
            {
              "remove_index": { // <4>
                "index": "%s"
              }
            }
          ]
        }
        """.formatted( oldIndexName, writeAliasName, newIndexName ), // <5>
        ContentType.APPLICATION_JSON));
var response = client.performRequest(request);
// ...
----
1. Get access to the Elasticsearch REST client as described above.
2. Start creating an `_aliases` API request.
3. Add an action to update the index aliases to use the old index for both read and write operations.
Here, we must make the write alias point back to the old index.
4. Add an action to remove the new index.
5. The names of the new/old index can be retrieved from the response of the initial `_rollover` API request,
while the alias names can be retrieved from the index descriptor.
====
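
In either case, it is worth checking that the cluster acknowledged the alias switch before moving on.
A minimal sketch, reusing the same Gson-based parsing as above:

[source, java]
====
----
// The _aliases API replies with {"acknowledged": true} once the switch is applied:
var aliasesResponseBody = JsonParser.parseReader(
        new InputStreamReader(response.getEntity().getContent(), StandardCharsets.UTF_8))
        .getAsJsonObject();
if (!aliasesResponseBody.get("acknowledged").getAsBoolean()) {
    throw new IllegalStateException("Alias switch was not acknowledged: " + aliasesResponseBody);
}
----
====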

NOTE: Keep in mind that in case of a rollback, your initial index may be out of sync if any write operations were performed
while the write alias was pointing to the new index.

With this knowledge, we can organize the rollover process as follows:

[source, java]
====
----
try (Rollover rollover = Rollover.start(searchMapping)) {
    // Perform the indexing operations ...
    rollover.commit();
}
----
====

The `Rollover` class itself will look as follows:

[source, java]
====
----
class Rollover implements Closeable {

    private boolean done;

    public static Rollover start(SearchMapping searchMapping) {
        // initiate the rollover process by sending the _rollover request ...
        // ...
        return new Rollover( client, rolloverResponse ); // <1>
    }

    @Override
    public void close() {
        if ( !done ) { // <2>
            rollback();
        }
    }

    public void commit() {
        // send the `_aliases` request to switch to the *new* index
        // ...
        done = true;
    }

    public void rollback() {
        // send the `_aliases` request to switch to the *old* index
        // ...
        done = true;
    }
}
----
1. Keep a reference to the Elasticsearch REST client (and to the `_rollover` response) to perform the follow-up API calls.
2. If we haven't successfully committed the rollover, it will be rolled back on close.
====

Once again, for a complete working example of this rollover implementation, check out
link:https://github.com/quarkusio/search.quarkus.io[search.quarkus.io on GitHub].

If you find this feature useful and would like to have it built into your Hibernate Search and Quarkus applications,
feel free to reach out to us, submit feature requests, and discuss your ideas and suggestions.

Stay tuned for more details in the coming weeks as we publish more blog posts
diving into other interesting implementation aspects of this application.
Happy searching and rolling over!