[pull] main from elastic:main #507
Merged
Enable incremental bulk processing by default.
A few of today's REST handler implementations compute a new set of supported parameters on each request. This is needlessly inefficient since the set never changes. This commit fixes those implementations and adds assertions to verify that we are returning the exact same instance each time.
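As a rough illustration of the fix (the class and parameter names here are hypothetical, not the actual handler code), the supported set becomes a constant so every call returns the identical instance:

```java
import java.util.Set;

// Sketch of a handler that computes its supported parameters once.
// Returning the same instance on every call is what makes the
// "exact same instance each time" assertion from this commit possible.
final class ExampleRestHandler {
    private static final Set<String> SUPPORTED_PARAMS = Set.of("timeout", "pretty");

    Set<String> supportedQueryParameters() {
        return SUPPORTED_PARAMS; // identical instance each request, never a fresh copy
    }
}
```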
* validate index name in parser
…mall_withPlatformAgnosticVariant #113983
REST APIs which declare their supported parameters must consume exactly those parameters: consuming an unsupported parameter means that requests including that parameter will be rejected, whereas failing to consume a supported parameter means that this parameter has no effect and should be removed. This commit adds an assertion to verify that we are consuming the correct parameters. Closes #113854
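A hedged sketch of that contract, using invented interfaces rather than Elasticsearch's actual classes: every declared parameter must be read through the request, and the new assertion compares the declared and consumed sets:

```java
import java.util.Set;

// Hypothetical request abstraction: param(...) marks a parameter as consumed.
interface SketchRestRequest {
    String param(String name);
    Set<String> declaredParams();
    Set<String> consumedParams();
}

final class ParamContractSketch {
    static void handle(SketchRestRequest request) {
        String timeout = request.param("timeout"); // consume what was declared
        // the assertion this commit adds, in spirit: declared == consumed
        assert request.consumedParams().equals(request.declaredParams());
    }
}
```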
It's possible to call `RecoveryTarget#restoreFileFromSnapshot` after the `RecoveryTarget` has been replaced with a new instance due to a retry, but before all its refs have been released. If the recovery was snapshot-based then in this situation we will have already transferred the permits to the new instance, so the assertion that this instance has permits will trip. This commit fixes the problem by using a non-null placeholder value to indicate that the recovery target was at least originally created while holding some snapshot recovery permits. Closes #96018
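A minimal sketch of the placeholder idea, with invented names: instead of nulling the permits reference when it is transferred to the retried target, leave a non-null sentinel so the "was created with permits" assertion still holds:

```java
// Invented names; this only illustrates the sentinel pattern described above.
interface SketchReleasable extends AutoCloseable {
    @Override
    void close();
}

final class RecoveryTargetSketch {
    // non-null placeholder recording that permits existed at creation time
    private static final SketchReleasable TRANSFERRED = () -> {};

    private volatile SketchReleasable snapshotPermits;

    RecoveryTargetSketch(SketchReleasable permits) {
        this.snapshotPermits = permits;
    }

    RecoveryTargetSketch retry() {
        RecoveryTargetSketch replacement = new RecoveryTargetSketch(snapshotPermits);
        snapshotPermits = TRANSFERRED; // stays non-null, so the assertion doesn't trip
        return replacement;
    }

    void restoreFileFromSnapshot() {
        assert snapshotPermits != null : "recovery was never given snapshot permits";
    }
}
```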
…rch.test.rest.ClientYamlTestSuiteIT #114013
…ses size (#113613)

Final (I wish) part of #99815. Also fixes #113916.

## Steps

1. Migrate TDigest classes to use a custom Array implementation. Temporarily use a simple array wrapper (#112810)
2. Implement CircuitBreaking in the `WrapperTDigestArrays` class. Add Releasable/AutoCloseable and ensure everything is closed (#113105)
3. Pass the CircuitBreaker as a parameter to TDigestState from wherever it's being used (#113387)
   - ESQL: Pass a real CB
   - Other aggs: Use the deprecated methods on `TDigestState`, which will use a no-op CB instead
4. Account for the remaining TDigest classes' size ("SHALLOW_SIZE") (this PR)

Every step should be safely mergeable to main:
- The first and second steps should have no impact.
- The third and fourth ones will start increasing the CB count partially.

## Remarks

As TDigests are releasable now, I had to refactor all tests, adding try-with-resources or direct close() calls. That added a lot of changes, but most of them are trivial.

Outside of that, in ESQL, TDigestStates are now closed. Old aggregations don't close them, as it's not trivial; however, since they use the NoopCircuitBreaker, there's no problem with it. There's nothing to be closed.

## Remarks 2

I tried to follow the same pattern in how everything is accounted for. On each TDigest class:

- A static constant `SHALLOW_SIZE` with the object weight
- A field `AtomicBoolean closed` to ensure an idempotent `close()`
- A static `create()` method that accounts for SHALLOW_SIZE and returns a new instance. And the important part: on exception, it discounts SHALLOW_SIZE again
- A `ramBytesUsed()` method (Accountable interface), barely used for anything except some assertions, I believe
- A constructor that closes everything it created on exception (if it creates an array, and the next array surpasses the CB limit, the first one must be closed)
- And a `close()` that will, well, close everything and discount SHALLOW_SIZE

A lot of steps to make sure everything works well in this multi-level structure, but I believe the result is quite clean. (A condensed sketch of this pattern follows below.)
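The accounting pattern enumerated above condenses into a short sketch; every name here is an illustrative stand-in, not the PR's actual classes:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical breaker interface standing in for Elasticsearch's CircuitBreaker.
interface SketchCircuitBreaker {
    void addEstimateBytesAndMaybeBreak(long bytes);
    void addWithoutBreaking(long bytes);
}

final class AccountedDigestSketch implements AutoCloseable {
    static final long SHALLOW_SIZE = 64; // in practice, computed from the object layout

    private final SketchCircuitBreaker breaker;
    private final AtomicBoolean closed = new AtomicBoolean();

    static AccountedDigestSketch create(SketchCircuitBreaker breaker) {
        breaker.addEstimateBytesAndMaybeBreak(SHALLOW_SIZE); // may throw
        try {
            return new AccountedDigestSketch(breaker);
        } catch (Exception e) {
            breaker.addWithoutBreaking(-SHALLOW_SIZE); // discount again on failure
            throw e;
        }
    }

    private AccountedDigestSketch(SketchCircuitBreaker breaker) {
        this.breaker = breaker;
    }

    long ramBytesUsed() {
        return SHALLOW_SIZE;
    }

    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) { // idempotent close
            breaker.addWithoutBreaking(-SHALLOW_SIZE);
        }
    }
}
```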
…st {yaml=rrf/700_rrf_retriever_search_api_compatibility/rrf retriever with top-level collapse} #114019
…ithTrainedModelAndInference #114023
Copy from the 8.x branch to main
* merging
* copy elser service files into elasticsearch service
* Add deprecation log message for elser service
* improve deprecation warning
* change elasticsearch internal service elser case to use elser model
* switch elasticsearch elser tests to use elasticsearch elser
* Update docs/changelog/113216.yaml
* alias elser service to elasticsearch
* delete elser service package now that elasticsearch service supports it and has aliased it
* Add deprecation warning to infer API for elser
* Fix accidentally introduced NPE and retain BWC support for null model ID (with deprecation message)
* change "area" to "REST API" because "Machine Learning" isn't an option for deprecation
* change elser literals to static variable
* change Put and Elasticsearch Internal service to pass the service name if it is elser or elasticsearch; this will allow the elasticsearch service to maintain BWC for null model IDs if the service was elser
* fix up tests to match new elasticsearch service semantics regarding elser
* Move passing of service name
* add persistence for elser models in elasticsearch
* copy elser service files into elasticsearch service
* Add deprecation log message for elser service
* Add deprecation warning to infer API for elser
* fix merge conflicts
* fix merge
This mentions EXPLAIN ANALYZE and EXPLAIN PLAN in the docs for ESQL's `profile` option. Those are things that folks from PostgreSQL and Oracle are used to and might search for. And `profile` is the closest thing we have to them. EXPLAIN PLAN doesn't run the query - it just tells you what the plan is. ESQL's `profile` always runs the query. So that's different. But it's close! EXPLAIN ANALYZE *does* run the query. It's pretty much the same.
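For readers arriving from those databases, here is one illustrative (not canonical) way to send a profiled ES|QL query through the Java low-level REST client; the index name is made up:

```java
import java.io.IOException;

import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

final class ProfileExample {
    // Like EXPLAIN ANALYZE, profile executes the query and reports how it ran.
    static Response runProfiled(RestClient client) throws IOException {
        Request request = new Request("POST", "/_query");
        request.setJsonEntity("""
            { "query": "FROM logs | LIMIT 10", "profile": true }
            """);
        return client.performRequest(request);
    }
}
```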
* start to add deberta-v2 tokenizer classes
* continue to add basic tokenizer stuff
* Finish adding DeBERTa-2 tokenizer. Still need to review & test
* Complete test setup and linting
* Update docs/changelog/111852.yaml
* Add serialization of deberta tokenization
* fix request builder to match model
* debugging
* add balanced truncation
* remove full vocabulary and use tiny vocab for tests
* Remove TODO
* precommit
* Add named writables and known tokenizers
* Add deberta to list of known tokenizers in test
* Add tests for balanced tokenizer and fix errors in tokenizer logic
* fix order of parameters passed to deberta
* Add support for byte_fallback, which is enabled for DeBERTa. byte_fallback decomposes unknown tokens into multiple tokens of one byte each, if those bytes are in the vocabulary.
* precommit
* update tests to account for byte decomposition
* remove sysout
* fix tests for byteFallback, for real this time
* Move defaultSpanForChunking into super class to avoid repetition
* simplify decomposeBytePieces

---------

Co-authored-by: Elastic Machine <[email protected]>
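A simplified sketch of the byte_fallback behavior described above (helper names are ours, not the PR's): an out-of-vocabulary token is decomposed into one `<0xNN>` piece per UTF-8 byte, as long as all of those byte pieces exist in the vocabulary:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ByteFallbackSketch {
    static List<String> decompose(String unknownToken, Set<String> vocab) {
        List<String> pieces = new ArrayList<>();
        for (byte b : unknownToken.getBytes(StandardCharsets.UTF_8)) {
            String bytePiece = String.format("<0x%02X>", b);
            if (vocab.contains(bytePiece) == false) {
                return List.of("<unk>"); // give up: treat the whole token as unknown
            }
            pieces.add(bytePiece);
        }
        return pieces;
    }
}
```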
Enables end-to-end streaming for OpenAI's chat completion API.

Additional changes:
- InferenceServiceResults now returns a wildcard, allowing tests to work with classes that extend ChunkedToXContent.
- StreamingChatCompletionResults now defines the Results structure so that future providers can reuse the same structure.
- DelegatingProcessor now cancels the upstream if `next` throws an exception and forwards the exception downstream.
- Moved the streaming handler from OpenAiChatCompletionResponseHandler to OpenAiResponseHandler so that Azure Open AI can reuse it.
- OpenAiStreamingProcessor now iterates over the returned choices array, handling both OpenAI and Azure response formats.
- SenderService declares a helper `Set<TaskType>` for implementations to reuse when enabling streaming.
- Added an InferenceEventsAssertion test helper to fluently assert responses from the mock webserver.
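The DelegatingProcessor behavior above is roughly the following, sketched on `java.util.concurrent.Flow` rather than the actual inference class, with the subscription plumbing trimmed to focus on the error path:

```java
import java.util.concurrent.Flow;

// Simplified: onSubscribe propagation to the downstream and request
// management are omitted; only the cancel-and-forward error path is shown.
abstract class DelegatingProcessorSketch<T, R> implements Flow.Processor<T, R> {
    private Flow.Subscription upstream;
    private Flow.Subscriber<? super R> downstream;

    protected abstract R transform(T item) throws Exception;

    @Override public void onSubscribe(Flow.Subscription subscription) { this.upstream = subscription; }
    @Override public void subscribe(Flow.Subscriber<? super R> subscriber) { this.downstream = subscriber; }

    @Override public void onNext(T item) {
        try {
            downstream.onNext(transform(item));
        } catch (Exception e) {
            upstream.cancel();     // stop pulling from the provider
            downstream.onError(e); // surface the failure to the client
        }
    }

    @Override public void onError(Throwable throwable) { downstream.onError(throwable); }
    @Override public void onComplete() { downstream.onComplete(); }
}
```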
These were added in #113804 but were missing the `false`/`false` test case.
Let's use a vendor-neutral link.
Closes ES-9623
This PR tracks the total number of query and fetch failures, in addition to the existing metrics for each shard, and exposes them through the stats API.
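Conceptually (invented names, not the PR's classes), the tracking amounts to thread-safe per-shard counters that the stats API response can sum:

```java
import java.util.concurrent.atomic.LongAdder;

// Per-shard failure counters; LongAdder keeps increments cheap under contention.
final class SearchFailureStatsSketch {
    private final LongAdder queryFailures = new LongAdder();
    private final LongAdder fetchFailures = new LongAdder();

    void onQueryFailure() { queryFailures.increment(); }
    void onFetchFailure() { fetchFailures.increment(); }

    long totalQueryFailures() { return queryFailures.sum(); }
    long totalFetchFailures() { return fetchFailures.sum(); }
}
```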
Fixes a validation step that might prevent creating empty aggregations if the output format does not allow negative numbers.
This change exposes query, fetch and indexing metrics for each index mode.
…3712)

This commit is a follow-up to #113151 to better clarify how to deprecate HTTP routes. That change introduced RouteBuilder.deprecateAndKeep to enable deprecating an HTTP API without requiring REST compatibility headers in the next major version.

This commit deprecates the Java method RouteBuilder.deprecated and introduces RouteBuilder.deprecatedForRemoval. The valid options are now RouteBuilder.deprecateAndKeep vs. RouteBuilder.deprecatedForRemoval, where the latter will require compatibility headers to access the route in any REST API version newer than the declared lastFullySupportedVersion. The javadoc should help provide some clarification.

Additionally, the deprecation level should not be configurable. The deprecation level of WARN, when applied to routes, is informational only (and may require compatibility headers in the next version). The deprecation level of CRITICAL means no access whatsoever in the next major version. Generally, CRITICAL is used for any case where compatibility headers are required (which means it is the last version for any access), or where no compatibility is planned.

Some examples:

```
Route.builder(GET, "/foo/bar").build()
  -> no deprecations

Route.builder(GET, "/foo/bar").deprecateAndKeep("my message").build()
  -> deprecated, but removal is not influenced by REST API compatibility

Route.builder(GET, "/foo/bar").deprecatedForRemoval("my message", V_8).build()
  -> available in V_8 but emits a WARN-level deprecation; V_9 requires
     compatibility headers and emits a CRITICAL deprecation; in V_10 the
     route is no longer available

Route.builder(GET, "/foo/bar").replaces(GET, "/foo/baz", V_8).build()
  -> /foo/bar is available in all versions. /foo/baz is available in V_8 but
     emits a WARN-level deprecation; V_9 requires compatibility headers and
     emits a CRITICAL deprecation; in V_10 it is no longer available. This is
     effectively a shortcut to register a new route ("/foo/bar") and
     deprecatedForRemoval the path being replaced.
```

The functionality remains unchanged; this refactoring only provides better contracts and cleans up some of the implementation.
Today there are a handful of integer settings for `repository-s3` repositories whose docs link to the page about numeric field types. Yet these settings are not fields, and do not support floating-point values either. The convention throughout the rest of the docs is to just call these things `integer` without linking to anything. This commit aligns the `repository-s3` docs with this convention.
…operations (#114075)

* Took time and cluster details get updated for coordinator-only query operations

The ComputeService.runCompute pathway for coordinator-only operations (such as `FROM foo | LIMIT 0` or a ROW command) now gets updated with the overall took time. This also includes support for cross-cluster coordinator-only operations, which come about with queries like `FROM foo,remote:foo | LIMIT 0`. The _clusters metadata is now properly updated for those cases as well.

Fixes #114014
This fixes tests so that they can work with multiple shards.
Adds a REVERSE string function
…ulkHighWatermarkBackOff #114073
It is possible in the incremental high watermark test for data to be submitted in a way that corrupts the bulk request. This commit fixes the issue by ensuring we only send new data after it has been requested. Additionally, it adds an assertion to prevent this error from recurring.
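The shape of that fix, as a hedged sketch with invented names: the test only hands the next chunk to the bulk handler after the handler has signalled demand, never speculatively:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Guard ensuring chunks are submitted only after the handler asks for more data.
final class SendAfterRequestSketch {
    private final AtomicBoolean demandSignalled = new AtomicBoolean();

    void onMoreDataRequested() {
        demandSignalled.set(true);
    }

    boolean trySendNextChunk(Runnable sendChunk) {
        if (demandSignalled.compareAndSet(true, false)) {
            sendChunk.run(); // safe: the handler asked for this chunk
            return true;
        }
        return false; // sending now could interleave with an in-flight request
    }
}
```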
… UI (#114042)

* Hide aggregated metrics generated for the APM UI
* Update 30_aggregated_metrics_tests.yml
* Review feedback
  - Introduced templates for 1 minute aggregations
  - Moved dynamic templates `ecs_ip` and `all_strings_to_keywords` into a dedicated file and now pull the file in instead of repeating them
  - Introduced `metrics-[x]m.otel@custom`
  - Added tests with a terms aggregation that assert by default 1 bucket (only 1m) with metricset.interval, and with hidden indices allowed it's 3 buckets (1m, 10m, 60m)
* Update 30_aggregated_metrics_tests.yml: simplify, there is no need for separate tests for the 3-bucket queries
* Rename metrics-otel-fixed@mappings to ecs-tsdb@mappings

---------

Co-authored-by: Elastic Machine <[email protected]>
…ore 8.16 (#114061)

* Fix HuggingFaceMixedIT test sometimes failing when run on version before 8.16
* Fixing typo in expected error message
Re-add the `semantic_text.inner_hits` cluster feature to fix serverless test failures
…ulkLowWatermarkBackOff #114182
… test {yaml=aggregations/stats_metric_fail_formatting/fail formatting} #114187
…ticsearch.xpack.esql.action.EsqlActionBreakerIT #114194
See Commits and Changes for more details.
Created by pull[bot]