diff --git a/_posts/2024-08-01-search-standalone-mapper.adoc b/_posts/2024-08-01-search-standalone-mapper.adoc
index 9af8fba570..448a4409f1 100644
--- a/_posts/2024-08-01-search-standalone-mapper.adoc
+++ b/_posts/2024-08-01-search-standalone-mapper.adoc
@@ -53,7 +53,6 @@ which will create the documents in the search index that we will later use to pe
 Generally speaking, mass indexing can be as simple as:
 
 [source, java]
-====
 ----
 @Inject
 SearchMapping searchMapping; // <1>
@@ -68,7 +67,6 @@ var future = searchMapping.scope(Object.class) // <2>
 In this case, all indexed entities should be targeted; hence, the `Object.class` can be used to create the scope.
 3. Create a mass indexer with the default configuration.
 4. Start the indexing process. Starting the process returns a future; the indexing happens in the background.
-====
 
 For Hibernate Search to perform this operation, we must tell it how to load the indexed entities.
 We will use an `EntityLoadingBinder` to do that. It is a simple interface providing access to the binding context
@@ -76,7 +74,6 @@ where we can define selection-loading strategies (for search) and mass-loading s
 Since, in our case, we are only interested in the mass indexer, it would be enough only to define the mass loading strategy:
 
 [source, java]
-====
 ----
 public class GuideLoadingBinder implements EntityLoadingBinder {
 
@@ -91,12 +88,10 @@ public class GuideLoadingBinder implements EntityLoadingBinder {
 1. Implement the single `bind(..)` method of the `EntityLoadingBinder`.
 2. Specify the mass loading strategy for the `Guide` search entity.
 We'll discuss the implementation of the strategy later in this post.
-====
 
 And then, with the entity loading binder defined, we can simply reference it within the `@SearchEntity` annotation:
 
 [source, java]
-====
 ----
 @SearchEntity(loadingBinder = @EntityLoadingBinderRef(type = GuideLoadingBinder.class)) // <1>
 @Indexed( ... )
@@ -112,7 +107,6 @@ public class Guide {
 As with many other Hibernate Search components,
 a CDI bean reference can be used here instead by providing the bean name,
 for example, if the loading binder requires access to some CDI beans and is a CDI bean itself.
-====
 
 That is all that is needed to tie things together.
 The only open question is how to implement the mass loading strategy.
@@ -134,7 +128,6 @@ than just pass through the batch of received "identifiers", which are actual ent
 With that in mind, the mass-loading strategy may be implemented as:
 
 [source, java]
-====
 ----
 new MassLoadingStrategy() {
     @Override
@@ -167,7 +160,6 @@ it is slightly trickier than the pass-through entity loader.
 Hence, we would want to take a closer look at it.
 2. An implementation of the pass-through entity loader.
 3. As explained above, we treat the search entities as identifiers and simply pass the entities we receive to the sink.
-====
 
 NOTE: If passing entities as identifiers feels like a hack, it's because it is.
 Hibernate Search will, at some point, provide alternative APIs to achieve this more elegantly: link:https://hibernate.atlassian.net/browse/HSEARCH-5209[HSEARCH-5209]
@@ -178,7 +170,6 @@ We could do this by using the `MassLoadingOptions options`.
 These mass loading options provide access to the context objects passed to the mass indexer by the user.
 
 [source, java]
-====
 ----
 @Inject
 SearchMapping searchMapping; // <1>
@@ -198,6 +189,8 @@ for an example of how such context can be implemented.
 4. Set any other mass indexer configuration options as needed.
 5. Create a mass indexer.
 6. Start the indexing process.
+
+[source, java]
 ----
 public class GuideLoadingContext {
 
@@ -220,14 +213,12 @@ public class GuideLoadingContext {
 2. Read the next batch of the guides from the iterator.
 We are using the batch size limit that we will retrieve from the mass-loading options
 and checking the iterator to see if there are any more entities to pull.
-====
 
 Now, having the way of reading the entities in batches from the stream
 and knowing how to pass it to the mass indexer,
 implementing the identifier loader can be as easy as:
 
 [source, java]
-====
 ----
 @Override
 public MassIdentifierLoader createIdentifierLoader(LoadingTypeGroup includedTypes,
@@ -270,7 +261,6 @@ for the current mass indexer.
 4. If the batch is empty, it means that the stream iterator has no more guides to return.
 Hence, we can notify the mass indexing sink that no more items will be provided by calling `.complete()`.
 5. If there are any guides in the loaded batch, we'll pass them to the sink to be processed.
-====
 
 To sum up, here is a summary of the steps to take to index an unknown number of search entities from a datasource
 while reading each entity only once, and without relying on lookups by identifier: