Add new documentation IP2Geo processor automatic updating feature #4095

Closed
wants to merge 80 commits into from
Changes from 55 commits
8a49f53
Content planning
vagimeli May 17, 2023
398eae6
Content planning
vagimeli May 17, 2023
74dd61d
Writing
vagimeli May 22, 2023
dbcec4e
Writing
vagimeli May 23, 2023
daee116
Writing
vagimeli May 23, 2023
2c1bf28
Writing
vagimeli May 23, 2023
989aa7c
Writing
vagimeli May 23, 2023
0822c04
Writing
vagimeli May 23, 2023
a659296
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
0742e27
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
185179c
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
fd4b645
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
a460a6e
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
a5cf0e0
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
a5d96dc
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
9b292ea
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
9655bb8
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
a448e40
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
d225a16
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
720c244
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
aba49db
Update _api-reference/ingest-apis/geoip.md
vagimeli May 24, 2023
1affbe6
Writing
vagimeli May 24, 2023
91cd80a
Writing
vagimeli May 24, 2023
85b845b
Writing
vagimeli May 24, 2023
23d8cd4
Writing
vagimeli May 24, 2023
edbb76f
Writing
vagimeli May 24, 2023
ed5ac30
Writing
vagimeli May 25, 2023
037dff5
Writing
vagimeli May 25, 2023
0b9b5e2
Address tech review feedback
vagimeli May 26, 2023
823f77a
Update processors.md
vagimeli May 26, 2023
7c0474f
Update processors.md
vagimeli May 26, 2023
c32a6b6
Writing
vagimeli May 30, 2023
baa6a38
Add processor index page
vagimeli Jun 2, 2023
28f5b6b
Update front matter
vagimeli Jun 22, 2023
7a5bdc8
Created new file under Ingest Processors TOC
vagimeli Jun 22, 2023
bed4f18
Update ip2geo.md
vagimeli Jun 28, 2023
75572e9
Update ip2geo.md
vagimeli Jun 28, 2023
3c2c1df
Update ip2geo.md
vagimeli Jun 28, 2023
985e111
Update ip2geo.md
vagimeli Jul 5, 2023
b718810
Update ip2geo.md
vagimeli Jul 5, 2023
8a48e7b
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
7e1b3af
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
85bbaa6
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
c5e30f4
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
9c42c20
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
cc86348
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
72c2091
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
8abfc1c
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
99b0600
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
280f71b
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
0ca4e2d
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
b6f406b
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
3fbeb91
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
71ae9d7
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 3, 2023
99b117b
Update ip2geo.md
vagimeli Aug 3, 2023
a6dc976
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 18, 2023
e381d1e
Address doc review feedback
vagimeli Aug 18, 2023
c64e3cc
Update ip2geo.md
vagimeli Aug 18, 2023
008a52b
Update ip2geo.md
vagimeli Aug 18, 2023
59ef808
Add copy labels
vagimeli Aug 22, 2023
e83bbd8
Update ip2geo.md
vagimeli Aug 22, 2023
e43415a
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
512cb4f
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
0af7ee4
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
073980c
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
167468f
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
8fee903
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
9730088
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
e2c2e01
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
933a0c4
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
1a357f3
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
ff19e23
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
17cc8a6
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
c84be08
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
3e3743a
Update _api-reference/ingest-apis/ip2geo.md
vagimeli Aug 23, 2023
936b757
Copy edits
vagimeli Aug 23, 2023
48d88cf
Address editorial feedback
vagimeli Aug 23, 2023
2322ee0
Address editorial feedback
vagimeli Aug 23, 2023
2e4eac0
Update ip2geo.md
vagimeli Aug 23, 2023
42ab53a
Copy edit to align format to processors template
vagimeli Sep 5, 2023
247 changes: 247 additions & 0 deletions _api-reference/ingest-apis/ip2geo.md
@@ -0,0 +1,247 @@
---
layout: default
title: IP2Geo
parent: Ingest processors
grand_parent: Ingest APIs
nav_order: 130
---

# IP2Geo
Introduced 2.9
{: .label .label-purple }

The `ip2geo` processor adds information about the geographical location of an IPv4 or IPv6 address. The `ip2geo` processor uses IP geolocation (GeoIP) data from an external endpoint and therefore requires an additional component, `datasource`, that defines where to download the GeoIP data from and how frequently to update it.

The `ip2geo` processor maintains the GeoIP data mapping in system indexes. During data ingestion, the processor retrieves the GeoIP mapping from these indexes to convert IP addresses in the incoming data to geolocations. For optimal performance, use a node that has both the ingest and data roles. This configuration avoids internode calls, which reduces latency. Because the `ip2geo` processor searches the GeoIP mapping data in these indexes, it also affects search performance.
{: .note}

## Getting started

To use the `ip2geo` processor, you must first install the `opensearch-geospatial` plugin. Learn more at [Installing plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/).

## Creating the IP2Geo data source

Create the IP2Geo data source by defining the endpoint from which to download GeoIP data and specifying the update interval.

OpenSearch provides the following endpoints for GeoLite2 City, GeoLite2 Country, and GeoLite2 ASN databases from [MaxMind](http://dev.maxmind.com/geoip/geoip2/geolite2/), shared under the CC BY-SA 4.0 license:

* GeoLite2 City: https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json
* GeoLite2 Country: https://geoip.maps.opensearch.org/v1/geolite2-country/manifest.json
* GeoLite2 ASN: https://geoip.maps.opensearch.org/v1/geolite2-asn/manifest.json

If an OpenSearch cluster cannot update a data source from its endpoint within 30 days, the cluster stops adding GeoIP data to documents and instead adds `"error":"ip2geo_data_expired"`.

The following table lists the IP2Geo data source options.

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| endpoint | no | https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json | The endpoint for downloading the GeoIP data. |
| update_interval_in_days | no | 3 | The frequency in days for updating the GeoIP data; minimum value is 1. |

The following code example shows how to create an IP2Geo data source.

#### Example: PUT request

```json
PUT /_plugins/geospatial/ip2geo/datasource/my-datasource
{
"endpoint" : "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json",
"update_interval_in_days" : 3
}
```

The following code example shows the response to the preceding request. An `acknowledged` value of `true` means that the request was successful and the server was able to process it. If the value is `false`, check that the request is valid and that the URL is correct, and then try again.

#### Example: Successful response

```json
{
"acknowledged":true
}
```
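
The create request can also be issued from a script. The following is a minimal Python sketch, using only the standard library, that builds the same `PUT` request without sending it. The cluster address `https://localhost:9200` and the helper name `build_create_datasource_request` are assumptions made for illustration, and authentication and TLS settings are omitted.

```python
import json
import urllib.request

DEFAULT_ENDPOINT = "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json"


def build_create_datasource_request(base_url, name, endpoint=DEFAULT_ENDPOINT,
                                    update_interval_in_days=3):
    """Build (but do not send) the PUT request that creates an IP2Geo data source."""
    # The options table gives 1 as the minimum update interval.
    if update_interval_in_days < 1:
        raise ValueError("update_interval_in_days must be at least 1")
    body = {"endpoint": endpoint, "update_interval_in_days": update_interval_in_days}
    return urllib.request.Request(
        url=f"{base_url}/_plugins/geospatial/ip2geo/datasource/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical local cluster address; replace it with your own.
request = build_create_datasource_request("https://localhost:9200", "my-datasource")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Validating the interval on the client side is a convenience of this sketch and does not replace the server's own validation.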

## Sending a GET request

To get information about one or more IP2Geo data sources, send a GET request.

#### Example: GET request

```json
GET /_plugins/geospatial/ip2geo/datasource/my-datasource
```

#### Example: Response

```json
{
"datasources": [
{
"name": "my-datasource",
"state": "AVAILABLE",
"endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json",
"update_interval_in_days": 3,
"next_update_at_in_epoch_millis": 1685125612373,
"database": {
"provider": "maxmind",
"sha256_hash": "0SmTZgtTRjWa5lXR+XFCqrZcT495jL5XUcJlpMj0uEA=",
"updated_at_in_epoch_millis": 1684429230000,
"valid_for_in_days": 30,
"fields": [
"country_iso_code",
"country_name",
"continent_name",
"region_iso_code",
"region_name",
"city_name",
"time_zone",
"location"
]
},
"update_stats": {
"last_succeeded_at_in_epoch_millis": 1684866730192,
"last_processing_time_in_millis": 317640,
"last_failed_at_in_epoch_millis": 1684866730492,
"last_skipped_at_in_epoch_millis": 1684866730292
}
}
]
}
```
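
The epoch-millisecond fields in the response can be turned into readable timestamps. The following Python sketch summarizes one entry of the `datasources` array. It assumes that the validity window (`valid_for_in_days`) is counted from `updated_at_in_epoch_millis`; this is an interpretation of the field names, not confirmed plugin behavior. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def from_epoch_millis(value):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


def summarize_datasource(datasource, now):
    """Summarize state, next update time, and assumed expiry for one data source entry."""
    database = datasource["database"]
    # Assumption: the validity window starts at the database's last update time.
    expires_at = from_epoch_millis(database["updated_at_in_epoch_millis"]) + timedelta(
        days=database["valid_for_in_days"]
    )
    return {
        "state": datasource["state"],
        "next_update_at": from_epoch_millis(datasource["next_update_at_in_epoch_millis"]),
        "expires_at": expires_at,
        "expired": now >= expires_at,
    }


# Values copied from the example response above.
sample = {
    "name": "my-datasource",
    "state": "AVAILABLE",
    "next_update_at_in_epoch_millis": 1685125612373,
    "database": {"updated_at_in_epoch_millis": 1684429230000, "valid_for_in_days": 30},
}
summary = summarize_datasource(sample, now=datetime.now(timezone.utc))
print(summary["state"], summary["expires_at"].isoformat(), summary["expired"])
```

A monitoring script could run a check like this periodically and raise an alert before the assumed expiry time is reached.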

## Updating an IP2Geo data source

To update an IP2Geo data source successfully, the GeoIP database at the new endpoint must contain all of the fields in the current database. Otherwise, the update fails.

#### Example: Update request

```json
PUT /_plugins/geospatial/ip2geo/datasource/my-datasource/_settings
{
"endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json",
"update_interval_in_days": 10
}
```

#### Example: Response

```json
{
"acknowledged":true
}
```
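
The field requirement for updates amounts to a subset check: every field in the current database must also appear in the new one. The following Python sketch illustrates the rule on the client side. It assumes that both field lists have already been obtained separately, for example, from the `fields` array returned by a GET request. The `candidate` list is hypothetical and is not taken from a real manifest.

```python
def missing_fields(current_fields, new_fields):
    """Return fields in the current database that the new database lacks, in original order."""
    available = set(new_fields)
    return [field for field in current_fields if field not in available]


# Fields reported for the GeoLite2 City data source in the GET response example.
current = ["country_iso_code", "country_name", "continent_name", "region_iso_code",
           "region_name", "city_name", "time_zone", "location"]
# Hypothetical field list for a smaller database.
candidate = ["country_iso_code", "country_name", "continent_name"]

lost = missing_fields(current, candidate)
if lost:
    print("Update would fail; missing fields:", lost)
```

Going the other way, from a smaller field set to a larger one, leaves nothing missing, so that direction satisfies the rule.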

## Deleting the IP2Geo data source

To delete the IP2Geo data source, you must delete all processors associated with the data source first. Otherwise, the DELETE request fails.

#### Example: DELETE request

```json
DELETE /_plugins/geospatial/ip2geo/datasource/my-datasource
```

#### Example: Response

```json
{
"acknowledged": true
}
```
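
Before sending the DELETE request, it can help to find which pipelines still reference the data source. The following Python sketch scans a mapping of pipeline IDs to pipeline definitions, the shape returned by `GET /_ingest/pipeline`, for `ip2geo` processors that use a given data source. The function name and the sample pipelines are hypothetical, and processors nested inside other processors (for example, in `on_failure` blocks) are not inspected.

```python
def pipelines_using_datasource(pipelines, datasource_name):
    """Return IDs of pipelines with a top-level ip2geo processor that uses the data source."""
    matches = []
    for pipeline_id, definition in pipelines.items():
        for processor in definition.get("processors", []):
            config = processor.get("ip2geo")
            if config is not None and config.get("datasource") == datasource_name:
                matches.append(pipeline_id)
                break  # One match is enough to flag this pipeline.
    return matches


# Hypothetical pipeline definitions, keyed by pipeline ID.
pipelines = {
    "my-pipeline": {
        "description": "convert ip to geo",
        "processors": [{"ip2geo": {"field": "ip", "datasource": "my-datasource"}}],
    },
    "lowercase-only": {"processors": [{"lowercase": {"field": "name"}}]},
}
print(pipelines_using_datasource(pipelines, "my-datasource"))
```

Any pipeline that this check reports must be deleted, or changed so that it no longer uses the data source, before the data source itself can be deleted.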

## Creating the processor

Once the IP2Geo data source is created, you can create the `ip2geo` processor.

#### Example: Create processor request

```json
PUT /_ingest/pipeline/my-pipeline
{
"description":"convert ip to geo",
"processors":[
{
"ip2geo":{
"field":"ip",
"datasource":"my-datasource"
}
}
]
}
```

#### Example: Response

```json
{
"acknowledged": true
}
```
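
The pipeline body can also be assembled programmatically, which is convenient when optional settings such as `target_field` or `ignore_missing` vary between pipelines. The following Python sketch builds the request body only. The function name is hypothetical, and `properties` is assumed to accept a list of field names.

```python
import json


def build_ip2geo_pipeline(field, datasource, description="convert ip to geo",
                          properties=None, target_field=None, ignore_missing=None):
    """Build the body of a pipeline that contains a single ip2geo processor."""
    processor = {"field": field, "datasource": datasource}
    # Optional settings are included only when given, so server defaults apply otherwise.
    if properties is not None:
        processor["properties"] = list(properties)  # Assumed to be a list of field names.
    if target_field is not None:
        processor["target_field"] = target_field
    if ignore_missing is not None:
        processor["ignore_missing"] = ignore_missing
    return {"description": description, "processors": [{"ip2geo": processor}]}


body = build_ip2geo_pipeline("ip", "my-datasource")
print(json.dumps(body, indent=2))
```

Leaving unset options out of the body, rather than sending explicit defaults, keeps the request identical to the minimal example above when no options are given.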

## Creating the IP2Geo pipeline

The following table lists the `ip2geo` field options for creating an IP2Geo pipeline.

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| field | yes | - | The field that contains the IP address for the geographical lookup. |
| datasource | yes | - | The name of the data source used to look up geographical information. |
| properties | no | All fields in `datasource`. | The field that controls which properties are added to `target_field` from `datasource`. |
| target_field | no | ip2geo | The field that holds the geographical information looked up from the data source. |
| ignore_missing | no | false | If `true` and `field` does not exist, the processor quietly exits without modifying the document. |

The following code is an example of using the `ip2geo` processor to add the geographical information to the `ip2geo` field based on the `ip` field.

```json
PUT /_ingest/pipeline/ip2geo
{
"description":"convert ip to geo",
"processors":[
{
"ip2geo":{
"field":"ip",
"datasource":"my-datasource"
}
}
]
}

PUT /my-index/_doc/my-id?pipeline=ip2geo
{
"ip": "172.0.0.1"
}

GET /my-index/_doc/my-id
{
"_index":"my-index",
"_id":"my-id",
"_version":1,
"_seq_no":0,
"_primary_term":1,
"found":true,
"_source":{
"ip":"172.0.0.1",
"ip2geo":{
"continent_name":"North America",
"region_iso_code":"AL",
"city_name":"Calera",
"country_iso_code":"US",
"country_name":"United States",
"region_name":"Alabama",
"location":"33.1063,-86.7583",
"time_zone":"America/Chicago"
}
}
}
```
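
In the example document, `location` is stored as a single string. The following Python sketch splits that string into numeric coordinates for client-side use. It assumes the order is latitude followed by longitude, which is consistent with the sample values for Calera, Alabama. The function name is hypothetical.

```python
def parse_location(location):
    """Split a "latitude,longitude" string into a pair of floats."""
    latitude_text, longitude_text = location.split(",")
    latitude, longitude = float(latitude_text), float(longitude_text)
    # Reject values outside the valid coordinate ranges.
    if not (-90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0):
        raise ValueError(f"coordinates out of range: {location!r}")
    return latitude, longitude


# Value copied from the example document above.
latitude, longitude = parse_location("33.1063,-86.7583")
print(latitude, longitude)
```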

## Cluster settings

The IP2Geo data source and `ip2geo` processor node settings are listed in the following table.

| Key | Description | Default |
|--------------------|-------------|---------|
| plugins.geospatial.ip2geo.datasource.endpoint | Default endpoint used by the create data source API. | Defaults to https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json. |
| plugins.geospatial.ip2geo.datasource.update_interval_in_days | Default update interval used by the create data source API. | Defaults to 3. |
| plugins.geospatial.ip2geo.datasource.batch_size | Maximum number of documents to ingest in a bulk request during the IP2Geo data source creation process. | Defaults to 10,000. |
| plugins.geospatial.ip2geo.processor.cache_size | Maximum number of results that can be cached. A single cache is shared by all IP2Geo processors on each node. | Defaults to 1,000. |