[DOC]Add new dot expander processor doc #5631

Merged
merged 37 commits into from
Jan 30, 2024
1febfc9
Add new dot expander processor doc
vagimeli Nov 18, 2023
81eb142
Merge branch 'main' into dot-expander.md
vagimeli Nov 28, 2023
da2d4ad
Draft content for tech review
vagimeli Nov 28, 2023
e9b196f
Merge branch 'main' into dot-expander.md
vagimeli Nov 28, 2023
793b2ba
Merge branch 'main' into dot-expander.md
vagimeli Dec 1, 2023
7a7f886
Merge branch 'main' into dot-expander.md
vagimeli Dec 12, 2023
4406af3
Merge branch 'main' into dot-expander.md
vagimeli Dec 21, 2023
3e977ff
Address tech review feedback
vagimeli Dec 21, 2023
8184c9e
Merge branch 'main' into dot-expander.md
vagimeli Dec 21, 2023
b3f912a
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
5c85986
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
64b1001
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
96e0306
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
c5e32eb
Address doc review feedback
vagimeli Dec 22, 2023
d5042e1
Edit line 227
vagimeli Dec 22, 2023
38c639b
Edit line 227
vagimeli Dec 22, 2023
15006d6
Edit line 227
vagimeli Dec 22, 2023
041423b
Address doc review comments
vagimeli Jan 4, 2024
340f72d
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 4, 2024
7319647
Merge branch 'main' into dot-expander.md
vagimeli Jan 4, 2024
f87d41c
Merge branch 'main' into dot-expander.md
vagimeli Jan 10, 2024
c003b67
Added path parameter and field name conflicts sections
kolchfa-aws Jan 17, 2024
ec4477c
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
dd9118a
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
586f704
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
29f3fb9
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
026af45
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
41dda97
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
feae233
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
53d0d0a
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
3dde50b
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
6a01fba
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
0d5c299
Address editorial review feedback
vagimeli Jan 18, 2024
b201c63
Merge branch 'main' into dot-expander.md
vagimeli Jan 18, 2024
fbb4146
Merge branch 'main' into dot-expander.md
vagimeli Jan 18, 2024
01d81b6
Merge branch 'main' into dot-expander.md
vagimeli Jan 30, 2024
6729218
Update dot-expander.md
vagimeli Jan 30, 2024
373 changes: 373 additions & 0 deletions _ingest-pipelines/processors/dot-expander.md
@@ -0,0 +1,373 @@
---
layout: default
title: Dot expander
parent: Ingest processors
nav_order: 65
---

# Dot expander

The `dot_expander` processor helps you work with hierarchical data by transforming fields whose names contain dots into object fields, making them accessible to other processors in the pipeline. Without this transformation, other processors cannot access fields with dots in their names.

The following is the syntax for the `dot_expander` processor:

```json
{
  "dot_expander": {
    "field": "field.to.expand"
  }
}
```
{% include copy-curl.html %}
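
To make the transformation concrete, the following is a minimal Python sketch of what expanding a dotted field means for a document. It illustrates the behavior described above and is not the processor's actual implementation; the function name `expand_dots` is hypothetical.

```python
import copy


def expand_dots(doc, field):
    """Return a copy of doc with the dotted top-level key `field`
    expanded into nested dictionaries (hypothetical helper)."""
    result = copy.deepcopy(doc)
    value = result.pop(field)  # raises KeyError if the field is absent
    *parents, leaf = field.split(".")
    node = result
    for name in parents:
        # Create each intermediate object if it does not exist yet.
        node = node.setdefault(name, {})
    node[leaf] = value
    return result


doc = {"field.to.expand": "value"}
print(expand_dots(doc, "field.to.expand"))
# {'field': {'to': {'expand': 'value'}}}
```

The sketch only handles a dotted key at the top level of the document and does not cover nested fields or name conflicts.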

## Configuration parameters

The following table lists the required and optional parameters for the `dot_expander` processor.

Parameter | Required/Optional | Description |
|-----------|-----------|-----------|
`field` | Required | The field to be expanded into an object field. |
`path` | Optional | The path to the object that contains the field to be expanded. Required only if the field to be expanded is nested within another object field because the `field` parameter only recognizes leaf fields. |
`description` | Optional | A brief description of the processor. |
`if` | Optional | A condition for running the processor. |
`ignore_failure` | Optional | If set to `true`, failures are ignored. Default is `false`. |
`on_failure` | Optional | A list of processors to run if the processor fails. |
`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. |

## Using the processor

Follow these steps to use the processor in a pipeline.

### Step 1: Create a pipeline

The following query creates a pipeline named `dot-expander-pipeline` that uses two `dot_expander` processors to expand the `user.address.city` and `user.address.state` fields into nested objects:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "description": "Dot expander processor",
  "processors": [
    {
      "dot_expander": {
        "field": "user.address.city"
      }
    },
    {
      "dot_expander": {
        "field": "user.address.state"
      }
    }
  ]
}
```
{% include copy-curl.html %}
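
Because each `dot_expander` processor expands a single field, a pipeline that expands many fields repeats the same processor entry. As a sketch, the request body above could be generated from a list of field names. The helper name `build_dot_expander_pipeline` is hypothetical, and the code only builds the JSON body; it does not send the request.

```python
import json


def build_dot_expander_pipeline(fields, description="Dot expander processor"):
    """Build a pipeline request body containing one dot_expander
    processor per dotted field name (hypothetical helper)."""
    return {
        "description": description,
        "processors": [{"dot_expander": {"field": name}} for name in fields],
    }


body = build_dot_expander_pipeline(["user.address.city", "user.address.state"])
# Print the body that would follow PUT /_ingest/pipeline/dot-expander-pipeline.
print(json.dumps(body, indent=2))
```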

### Step 2 (Optional): Test the pipeline

It is recommended that you test your pipeline before you ingest documents.
{: .tip}

To test the pipeline, run the following query:

```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user.address.city": "New York",
        "user.address.state": "NY"
      }
    }
  ]
}
```
{% include copy-curl.html %}

#### Response

The following example response confirms that the pipeline is working as expected:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "user": {
            "address": {
              "city": "New York",
              "state": "NY"
            }
          }
        },
        "_ingest": {
          "timestamp": "2024-01-17T01:32:56.501346717Z"
        }
      }
    }
  ]
}
```
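
If you check simulate responses in a script, you can confirm the expansion by walking the nested `_source` object. The following sketch assumes the response has already been parsed into a Python dictionary (the timestamp is omitted for brevity); the helper name `get_by_path` is hypothetical.

```python
def get_by_path(source, dotted_path):
    """Follow a dotted path through nested dictionaries (hypothetical helper)."""
    node = source
    for name in dotted_path.split("."):
        node = node[name]
    return node


# The simulate response shown above, parsed from JSON.
response = {
    "docs": [
        {
            "doc": {
                "_index": "testindex1",
                "_id": "1",
                "_source": {
                    "user": {"address": {"city": "New York", "state": "NY"}}
                },
            }
        }
    ]
}

source = response["docs"][0]["doc"]["_source"]
# The literal dotted key is gone, and the value is reachable as a nested field.
print("user.address.city" in source)              # False
print(get_by_path(source, "user.address.city"))   # New York
```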

### Step 3: Ingest a document

The following query ingests a document into an index named `testindex1`:

```json
PUT testindex1/_doc/1?pipeline=dot-expander-pipeline
{
  "user.address.city": "Denver",
  "user.address.state": "CO"
}
```
{% include copy-curl.html %}

### Step 4 (Optional): Retrieve the document

To retrieve the document, run the following query:

```json
GET testindex1/_doc/1
```
{% include copy-curl.html %}

#### Response

The following response confirms that the specified fields were expanded into nested fields:

```json
{
  "_index": "testindex1",
  "_id": "1",
  "_version": 1,
  "_seq_no": 3,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "user": {
      "address": {
        "city": "Denver",
        "state": "CO"
      }
    }
  }
}
```

## The `path` parameter

You can use the `path` parameter to specify the path to a dotted field within an object. For example, the following pipeline specifies the `address.city` and `address.state` fields, which are located within the `user` object:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "description": "Dot expander processor",
  "processors": [
    {
      "dot_expander": {
        "field": "address.city",
        "path": "user"
      }
    },
    {
      "dot_expander": {
        "field": "address.state",
        "path": "user"
      }
    }
  ]
}
```
{% include copy-curl.html %}

You can simulate the pipeline as follows:

```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user": {
          "address.city": "New York",
          "address.state": "NY"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

The `dot_expander` processor transforms the document into the following structure:

```json
{
  "user": {
    "address": {
      "city": "New York",
      "state": "NY"
    }
  }
}
```
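
The role of `path` can be sketched in Python as follows: the path selects the containing object, and the dotted field is then expanded relative to that object. This is an illustration of the behavior shown above under that assumption, not the processor's implementation; the function name `expand_dots_at_path` is hypothetical.

```python
import copy


def expand_dots_at_path(doc, field, path=None):
    """Expand the dotted key `field` found inside the object at `path`
    (hypothetical helper). With no path, the key is looked up at the top level."""
    result = copy.deepcopy(doc)
    container = result
    if path:
        # Walk down to the object that holds the dotted key.
        for name in path.split("."):
            container = container[name]
    value = container.pop(field)
    *parents, leaf = field.split(".")
    node = container
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return result


doc = {"user": {"address.city": "New York", "address.state": "NY"}}
doc = expand_dots_at_path(doc, "address.city", path="user")
doc = expand_dots_at_path(doc, "address.state", path="user")
print(doc)
# {'user': {'address': {'city': 'New York', 'state': 'NY'}}}
```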

## Field name conflicts

If a field already exists at the path to which the `dot_expander` processor expands the value, the processor merges the two values into an array.

Consider the following pipeline that expands the field `user.name`:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "description": "Dot expander processor",
  "processors": [
    {
      "dot_expander": {
        "field": "user.name"
      }
    }
  ]
}
```
{% include copy-curl.html %}

You can simulate the pipeline with a document containing two values with the exact same path `user.name`:

```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user.name": "John",
        "user": {
          "name": "Steve"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

The response confirms that the values were merged into an array:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "user": {
            "name": [
              "Steve",
              "John"
            ]
          }
        },
        "_ingest": {
          "timestamp": "2024-01-17T01:44:57.420220551Z"
        }
      }
    }
  ]
}
```
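
The merge can be modeled with a short Python sketch. It assumes, based on the response above, that the value already present at the target path comes first in the array and the expanded value is appended. The function name `expand_with_merge` is hypothetical, and the code is not the processor's implementation.

```python
import copy


def expand_with_merge(doc, field):
    """Expand a dotted top-level key; if the target path already holds
    a value, combine both values into a list (hypothetical helper)."""
    result = copy.deepcopy(doc)
    value = result.pop(field)
    *parents, leaf = field.split(".")
    node = result
    for name in parents:
        node = node.setdefault(name, {})
    if leaf in node:
        existing = node[leaf]
        existing_values = existing if isinstance(existing, list) else [existing]
        # Existing value(s) first, then the value from the dotted field.
        node[leaf] = existing_values + [value]
    else:
        node[leaf] = value
    return result


doc = {"user.name": "John", "user": {"name": "Steve"}}
print(expand_with_merge(doc, "user.name"))
# {'user': {'name': ['Steve', 'John']}}
```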

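If a field has the same name as part of the dotted path but is located at a different path, then the field needs to be renamed. For example, in the following `_simulate` call, `user` holds a string value rather than an object, so `user.name` cannot be expanded into it and the call returns a parse exception: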
Collaborator: Should "simulate" be in code font (my brain wants to incorrectly change it to "simulated")?


```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user": "John",
        "user.name": "Steve"
      }
    }
  ]
}
```

To avoid the parse exception, first rename the field by using the `rename` processor:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "processors": [
    {
      "rename": {
        "field": "user",
        "target_field": "user.name"
      }
    },
    {
      "dot_expander": {
        "field": "user.name"
      }
    }
  ]
}
```
{% include copy-curl.html %}

Now you can simulate the pipeline:

```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user": "John",
        "user.name": "Steve"
      }
    }
  ]
}
```
{% include copy-curl.html %}

The response confirms that the fields were merged:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "user": {
            "name": [
              "John",
              "Steve"
            ]
          }
        },
        "_ingest": {
          "timestamp": "2024-01-17T01:52:12.864432419Z"
        }
      }
    }
  ]
}
```
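
The following Python sketch models why renaming first resolves the conflict. It is a simplified illustration, not the processors' implementation: it assumes that `rename` moves the string value of `user` to the nested path `user.name` and that expansion fails when a parent in the path is not an object. The helper names are hypothetical, and `ValueError` stands in for the parse exception.

```python
import copy


def set_by_path(doc, dotted_path, value):
    """Set a value at a nested path, merging into a list on conflict."""
    *parents, leaf = dotted_path.split(".")
    node = doc
    for name in parents:
        node = node.setdefault(name, {})
        if not isinstance(node, dict):
            # Stand-in for the parse exception described above.
            raise ValueError(f"cannot expand [{dotted_path}]: [{name}] is not an object")
    if leaf in node:
        existing = node[leaf]
        node[leaf] = (existing if isinstance(existing, list) else [existing]) + [value]
    else:
        node[leaf] = value


def expand(doc, field):
    """Model of dot_expander for a dotted top-level key."""
    result = copy.deepcopy(doc)
    set_by_path(result, field, result.pop(field))
    return result


def rename(doc, field, target_field):
    """Model of rename: move a top-level value to a nested target path."""
    result = copy.deepcopy(doc)
    set_by_path(result, target_field, result.pop(field))
    return result


doc = {"user": "John", "user.name": "Steve"}

try:
    expand(doc, "user.name")
except ValueError as error:
    print("expansion failed:", error)

# Renaming first turns `user` into an object, so the expansion can merge.
print(expand(rename(doc, "user", "user.name"), "user.name"))
# {'user': {'name': ['John', 'Steve']}}
```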