[DOC]Add new dot expander processor doc #5631

Merged
merged 37 commits into from
Jan 30, 2024
1febfc9
Add new dot expander processor doc
vagimeli Nov 18, 2023
81eb142
Merge branch 'main' into dot-expander.md
vagimeli Nov 28, 2023
da2d4ad
Draft content for tech review
vagimeli Nov 28, 2023
e9b196f
Merge branch 'main' into dot-expander.md
vagimeli Nov 28, 2023
793b2ba
Merge branch 'main' into dot-expander.md
vagimeli Dec 1, 2023
7a7f886
Merge branch 'main' into dot-expander.md
vagimeli Dec 12, 2023
4406af3
Merge branch 'main' into dot-expander.md
vagimeli Dec 21, 2023
3e977ff
Address tech review feedback
vagimeli Dec 21, 2023
8184c9e
Merge branch 'main' into dot-expander.md
vagimeli Dec 21, 2023
b3f912a
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
5c85986
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
64b1001
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
96e0306
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Dec 21, 2023
c5e32eb
Address doc review feedback
vagimeli Dec 22, 2023
d5042e1
Edit line 227
vagimeli Dec 22, 2023
38c639b
Edit line 227
vagimeli Dec 22, 2023
15006d6
Edit line 227
vagimeli Dec 22, 2023
041423b
Address doc review comments
vagimeli Jan 4, 2024
340f72d
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 4, 2024
7319647
Merge branch 'main' into dot-expander.md
vagimeli Jan 4, 2024
f87d41c
Merge branch 'main' into dot-expander.md
vagimeli Jan 10, 2024
c003b67
Added path parameter and field name conflicts sections
kolchfa-aws Jan 17, 2024
ec4477c
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
dd9118a
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
586f704
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
29f3fb9
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
026af45
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
41dda97
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
feae233
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
53d0d0a
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
3dde50b
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
6a01fba
Update _ingest-pipelines/processors/dot-expander.md
vagimeli Jan 18, 2024
0d5c299
Address editorial review feedback
vagimeli Jan 18, 2024
b201c63
Merge branch 'main' into dot-expander.md
vagimeli Jan 18, 2024
fbb4146
Merge branch 'main' into dot-expander.md
vagimeli Jan 18, 2024
01d81b6
Merge branch 'main' into dot-expander.md
vagimeli Jan 30, 2024
6729218
Update dot-expander.md
vagimeli Jan 30, 2024
373 changes: 373 additions & 0 deletions _ingest-pipelines/processors/dot-expander.md
@@ -0,0 +1,373 @@
---
layout: default
title: Dot expander
parent: Ingest processors
nav_order: 65
---

# Dot expander

The `dot_expander` processor helps you work with hierarchical data. It transforms fields containing dots into object fields, making them accessible to other processors in the pipeline. Without this transformation, other processors cannot access fields whose names contain dots.

The following is the syntax for the `dot_expander` processor:

```json
{
  "dot_expander": {
    "field": "field.to.expand"
  }
}
```
{% include copy-curl.html %}
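
As a sketch of why expansion matters for downstream processors, the following hypothetical pipeline expands `user.address.state` and then passes the resulting object field to an `uppercase` processor. The pipeline name `dot-expander-uppercase-pipeline` is an assumption made for illustration, and the example assumes that the incoming document contains a literal `user.address.state` key:

```json
PUT /_ingest/pipeline/dot-expander-uppercase-pipeline
{
  "description": "Illustrative only: expand a dotted field, then uppercase the expanded value",
  "processors": [
    {
      "dot_expander": {
        "field": "user.address.state"
      }
    },
    {
      "uppercase": {
        "field": "user.address.state"
      }
    }
  ]
}
```
{% include copy-curl.html %}

In this sketch, the `uppercase` processor is listed second because it addresses `user.address.state` as a path into nested objects, and that path exists only after the expansion.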

## Configuration parameters

The following table lists the required and optional parameters for the `dot_expander` processor.

Parameter | Required/Optional | Description |
|-----------|-----------|-----------|
`field` | Required | The field to expand into an object field. |
`path` | Optional | The path to the object that contains the field to expand. Required only if the field to be expanded is nested within another object field because the `field` parameter only recognizes leaf fields. |
`description` | Optional | A brief description of the processor. |
`if` | Optional | A condition for running this processor. |
`ignore_failure` | Optional | If set to `true`, failures are ignored. Default is `false`. |
`on_failure` | Optional | A list of processors to run if the processor fails. |
`tag` | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
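
The optional parameters in the preceding table can be combined in a single processor definition. The following is a minimal sketch rather than a recommended configuration: the pipeline name `dot-expander-optional-params` and the tag `expand-city` are hypothetical, and the `if` condition is an assumed script that runs the processor only when the document contains a literal `user.address.city` key:

```json
PUT /_ingest/pipeline/dot-expander-optional-params
{
  "description": "Illustrative only: dot expander with optional parameters",
  "processors": [
    {
      "dot_expander": {
        "field": "user.address.city",
        "description": "Expand the city field",
        "if": "ctx.containsKey('user.address.city')",
        "ignore_failure": true,
        "tag": "expand-city"
      }
    }
  ]
}
```
{% include copy-curl.html %}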

## Using the processor

Follow these steps to use the processor in a pipeline.

### Step 1: Create a pipeline

The following query creates a pipeline named `dot-expander-pipeline` that uses two `dot_expander` processors to expand the `user.address.city` and `user.address.state` fields into nested objects:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "description": "Dot expander processor",
  "processors": [
    {
      "dot_expander": {
        "field": "user.address.city"
      }
    },
    {
      "dot_expander": {
        "field": "user.address.state"
      }
    }
  ]
}
```
{% include copy-curl.html %}

### Step 2 (Optional): Test the pipeline

It is recommended that you test your pipeline before you ingest documents.
{: .tip}

To test the pipeline, run the following query:

```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user.address.city": "New York",
        "user.address.state": "NY"
      }
    }
  ]
}
```
{% include copy-curl.html %}

#### Response

The following example response confirms that the pipeline is working as expected:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "user": {
            "address": {
              "city": "New York",
              "state": "NY"
            }
          }
        },
        "_ingest": {
          "timestamp": "2024-01-17T01:32:56.501346717Z"
        }
      }
    }
  ]
}
```
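
Because this pipeline contains two processors, it can be useful to see what each one contributes. As a sketch, assuming the `verbose` query parameter of the Simulate Pipeline API is available in your version, the same test document can be submitted as follows so that the response lists the result of each processor separately:

```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate?verbose=true
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user.address.city": "New York",
        "user.address.state": "NY"
      }
    }
  ]
}
```
{% include copy-curl.html %}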

### Step 3: Ingest a document

The following query ingests a document into an index named `testindex1`:

```json
PUT testindex1/_doc/1?pipeline=dot-expander-pipeline
{
  "user.address.city": "Denver",
  "user.address.state": "CO"
}
```
{% include copy-curl.html %}
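
Passing `?pipeline=` on every request is one option. As an alternative sketch, assuming `testindex1` already exists and that you want every document indexed into it to pass through the pipeline, the pipeline can be set as the index default using the `index.default_pipeline` setting:

```json
PUT testindex1/_settings
{
  "index.default_pipeline": "dot-expander-pipeline"
}
```
{% include copy-curl.html %}

With this setting in place, index requests to `testindex1` that omit the `pipeline` parameter are routed through `dot-expander-pipeline`.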

### Step 4 (Optional): Retrieve the document

To retrieve the document, run the following query:

```json
GET testindex1/_doc/1
```
{% include copy-curl.html %}

#### Response

The following response confirms that the specified fields were expanded into nested fields:

```json
{
  "_index": "testindex1",
  "_id": "1",
  "_version": 1,
  "_seq_no": 3,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "user": {
      "address": {
        "city": "Denver",
        "state": "CO"
      }
    }
  }
}
```

## The `path` parameter

You can use the `path` parameter to specify the path to a dotted field within an object. For example, the following pipeline specifies the `address.city` and `address.state` fields that are located within the `user` object:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "description": "Dot expander processor",
  "processors": [
    {
      "dot_expander": {
        "field": "address.city",
        "path": "user"
      }
    },
    {
      "dot_expander": {
        "field": "address.state",
        "path": "user"
      }
    }
  ]
}
```
{% include copy-curl.html %}

Simulate the pipeline as follows:

Collaborator

"You can simulate..."?


```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user": {
          "address.city": "New York",
          "address.state": "NY"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

The `dot_expander` processor transforms the document into the following structure:

Collaborator

"into the following [noun]"?

```json
{
  "user": {
    "address": {
      "city": "New York",
      "state": "NY"
    }
  }
}
```
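
Once the simulation looks right, a document with the same shape can be ingested through the pipeline. The following sketch mirrors the earlier ingest step; the document ID `2` is an arbitrary choice made for illustration:

```json
PUT testindex1/_doc/2?pipeline=dot-expander-pipeline
{
  "user": {
    "address.city": "Denver",
    "address.state": "CO"
  }
}
```
{% include copy-curl.html %}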

## Field name conflicts

If a field already exists at the path to which the `dot_expander` processor should expand the value, the processor merges the two values into an array.

Collaborator

Should "where" be "to which"? Unsure of this one.

Contributor Author

Revised


Consider the following pipeline that expands the field `user.name`:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "description": "Dot expander processor",
  "processors": [
    {
      "dot_expander": {
        "field": "user.name"
      }
    }
  ]
}
```
{% include copy-curl.html %}

Simulate the pipeline with a document that contains two values at the same path, `user.name`:

Collaborator

"You can simulate..."?


```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user.name": "John",
        "user": {
          "name": "Steve"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

The response shows that the values were merged into an array:

Collaborator

"indicates" instead of "shows"? (optional suggestion)


```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "user": {
            "name": [
              "Steve",
              "John"
            ]
          }
        },
        "_ingest": {
          "timestamp": "2024-01-17T01:44:57.420220551Z"
        }
      }
    }
  ]
}
```

If a field outside of the leaf field conflicts with an existing field of the same name, then that field needs to be renamed first. For example, the following simulate call returns a parse exception because the top-level `user` field holds a string rather than an object:

Collaborator

"If a field contains the same name but a different path, then the field needs to be renamed"?


```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user": "John",
        "user.name": "Steve"
      }
    }
  ]
}
```

To avoid the parse exception, rename the field first using the `rename` processor:

```json
PUT /_ingest/pipeline/dot-expander-pipeline
{
  "processors": [
    {
      "rename": {
        "field": "user",
        "target_field": "user.name"
      }
    },
    {
      "dot_expander": {
        "field": "user.name"
      }
    }
  ]
}
```
{% include copy-curl.html %}

Now simulate the pipeline:

Collaborator

"Now you can simulate the pipeline"?


```json
POST _ingest/pipeline/dot-expander-pipeline/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user": "John",
        "user.name": "Steve"
      }
    }
  ]
}
```
{% include copy-curl.html %}

The response confirms that the fields are merged:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "user": {
            "name": [
              "John",
              "Steve"
            ]
          }
        },
        "_ingest": {
          "timestamp": "2024-01-17T01:52:12.864432419Z"
        }
      }
    }
  ]
}
```
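
When renaming is not an option, the `on_failure` parameter listed in the configuration table offers another way to deal with a conflict. The following is a minimal sketch, assuming the conflict surfaces as a processor failure: the pipeline name `dot-expander-on-failure-pipeline` and the `error_message` field are hypothetical, and the `set` processor records the failure message from the `_ingest.on_failure_message` metadata field instead of rejecting the document:

```json
PUT /_ingest/pipeline/dot-expander-on-failure-pipeline
{
  "description": "Illustrative only: dot expander with failure handling",
  "processors": [
    {
      "dot_expander": {
        "field": "user.name",
        "on_failure": [
          {
            "set": {
              "field": "error_message",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}
```
{% include copy-curl.html %}

In this sketch, the field is left unexpanded when the processor fails, so the recorded message is what tells you which documents still need attention.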