Apply suggestions from code review
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: David Venable <[email protected]>
dlvenable and natebower committed Oct 15, 2024
1 parent 135262a commit 98fe5b7
Showing 1 changed file with 23 additions and 23 deletions.
46 changes: 23 additions & 23 deletions _posts/2024-10-15-Announcing-Data-Prepper-2.10.0.md
@@ -7,42 +7,42 @@ authors:
date: 2024-10-15 12:30:00 -0600
categories:
- releases
excerpt: Data Prepper 2.10.0 offers an OpenSearch _bulk API and reads from Amazon Kinesis
excerpt: Data Prepper 2.10.0 offers an OpenSearch _bulk API and reads from Amazon Kinesis.
meta_keywords: Data Prepper, _bulk, Amazon Kinesis Data Streams, OTel Logs, OTLP JSON
meta_description: Data Prepper 2.10.0 offers a source that simulates the OpenSearch _bulk API and another source for reading from Amazon Kinesis Data Streams.
---

## Introduction

Data Prepper 2.10 is now available for the community to use.
Two major features include a source to send data to Data Prepper using an API mimicking the OpenSearch `_bulk` API from OpenSearch and reading from Amazon Kinesis Data Streams.
Data Prepper 2.10 is now available!
Two major features include a source that sends data to Data Prepper using an API mimicking the OpenSearch `_bulk` API and the ability to read from Amazon Kinesis Data Streams.


## OpenSearch API source

Many existing OpenSearch clients that perform ingestion directly to OpenSearch can now send that data to Data Prepper first.
With this, you can use Data Prepper's buffering and rich processor set before sending data to OpenSearch without having to change clients when they are using the OpenSearch `_bulk` API.
A new source has been added in Data Prepper named `opensearch_api` that accepts [OpenSearch Document API Bulk operation](https://opensearch.org/docs/latest/api-reference/document-apis/bulk/) requests from clients using REST and ingests data into OpenSearch.
With this, you can use Data Prepper's buffering and rich processor set before sending data to OpenSearch without having to change clients that are using the OpenSearch `_bulk` API.
A new Data Prepper source named `opensearch_api` has been added that accepts [OpenSearch Document API bulk operation](https://opensearch.org/docs/latest/api-reference/document-apis/bulk/) requests from clients using REST and ingests data into OpenSearch.
The behavior of this source is also quite similar to the existing `http` source.
It supports industry-standard encryption in the form of TLS/HTTPS and HTTP basic authentication.
It parses incoming requests and create Data Prepper events and associated event metadata making it compatible with the `opensearch` sink.
The request body should be compatible with OpenSearch Document API Bulk Operation and will also support all the actions like index, create, delete and update.
It also parses incoming requests and creates Data Prepper events and associated event metadata, making it compatible with the `opensearch` sink.
The request body should be compatible with the OpenSearch Document API bulk operation and will also support all actions: index, create, delete, and update.

The two HTTP methods supported now are the following:
The following two HTTP methods are now supported:

```
POST _bulk
POST <index>/_bulk
```

The second API which specifies the index in the path means you dont need to include it in the request body.
The second API specifies the index in the path, so you don't need to include it in the request body.

The following query parameters from the OpenSearch Document API bulk operation are also supported:

* pipeline
* routing

An example of using the source
The following example demonstrates how to use the source:

```
version: "2"
@@ -62,28 +62,28 @@ opensearch-api-pipeline:
pipeline: "${getMetadata(\"opensearch_pipeline\")}"
```

Consider the sample:
Consider the following example request:

```
POST _bulk
{ "index": { "_index": "movies", "_id": "tt1979320" } }
{ "title": "Rush", "year": 2013 }
```

The above request will be ingested into OpenSearch and a new document will be created under the index movies with a document id `tt1979320` with the document `{ "title": "Rush", "year": 2013 }`.
This request will be ingested into OpenSearch, and a new document will be created under the index `movies` with the document ID `tt1979320` with the document `{ "title": "Rush", "year": 2013 }`.
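
As a rough illustration of what a client sends, the request body above is newline-delimited JSON: one action line followed by an optional document line. The following is a minimal sketch in Python that builds that body using only the standard library. The helper name `build_bulk_body` is hypothetical and is not part of Data Prepper or any OpenSearch client.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, document) pairs into a newline-delimited JSON body.

    `build_bulk_body` is a hypothetical helper for illustration only.
    `document` may be None for actions, such as delete, that have no
    document line.
    """
    lines = []
    for action, document in operations:
        lines.append(json.dumps(action))
        if document is not None:
            lines.append(json.dumps(document))
    # The bulk format expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


# Rebuild the example request body shown above.
body = build_bulk_body([
    (
        {"index": {"_index": "movies", "_id": "tt1979320"}},
        {"title": "Rush", "year": 2013},
    ),
])
print(body, end="")
```

A client would then send `body` in a `POST` request to the `_bulk` path exposed by the `opensearch_api` source. The host, port, and any path prefix depend on your pipeline configuration and are not assumed here.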

Over time, the Data Prepper maintainers are interested in expanding this source to support other indexing APIs to allow it to stand-in for an OpenSearch cluster for ingestion workloads.
To learn more or express interest see [Provide an OpenSearch API source #4180](https://github.com/opensearch-project/data-prepper/issues/4180).
The Data Prepper maintainers are interested in further expanding this source to support other indexing APIs, allowing it to stand in for an OpenSearch cluster in ingestion workloads.
To learn more or provide feedback, see [Provide an OpenSearch API source #4180](https://github.com/opensearch-project/data-prepper/issues/4180).


## Kinesis source

[Amazon Kinesis Data Streams](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) is a high speed streaming data service.
Data Prepper is introducing a new source named `kinesis` which can be used to ingest stream records data from multiple Kinesis data streams into OpenSearch clusters.
[Amazon Kinesis Data Streams](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) is a high-speed streaming data service.
Data Prepper has also introduced a new source named `kinesis` that can be used to ingest stream record data from multiple Kinesis data streams into OpenSearch clusters.
You can configure it to read stream records from the beginning or from the latest record.
Moreover, if you enable end to end acknowledgements, Kinesis data streams will be checkpointed to prevent duplicate processing of records.
Moreover, if you enable end-to-end acknowledgements, Kinesis data streams will be checkpointed to prevent duplicate processing of records.

Sample pipeline
The following is an example pipeline:

```
version: "2"
@@ -109,18 +109,18 @@ kinesis-pipeline:

## Other features and improvements

Data Prepper 2.10 has a number of other changes to make it more powerful for the community.
Data Prepper 2.10 has introduced a number of other improvements:

* The kafka source now supports authenticating with an Apache Kafka cluster using SASL/SCRAM in addition to the SASL/PLAIN authentication provided in previous versions.
* Data Prepper can now parse OpenTelemetry logs from sources such as Amazon S3. The new `otel_logs` codec parses data from OpenTelemetry Protocol (OTLP) JSON formatted files. Now you can write OpenTelemetry logs from [AWS S3 Exporter for OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/awss3exporter/README.md) and read these from Data Prepper.
* Additionally, the maintainers have worked to improve the performance through the addition of an internal cache for event keys. Data Prepper administrators can configure this cache as necessary.
* The `kafka` source now supports authentication with an Apache Kafka cluster using SASL/SCRAM in addition to the SASL/PLAIN authentication provided in previous versions.
* Data Prepper can now parse OpenTelemetry logs from sources such as Amazon Simple Storage Service (Amazon S3). The new `otel_logs` codec parses data from OpenTelemetry Protocol (OTLP) JSON-formatted files. Now you can write OpenTelemetry logs from [AWS S3 Exporter for OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/awss3exporter/README.md) and read these using Data Prepper.
* Additionally, the maintainers have worked to improve performance through the addition of an internal cache for event keys. Data Prepper administrators can configure this cache as necessary.


## Next steps

* To download Data Prepper, visit the [OpenSearch downloads](https://opensearch.org/downloads.html) page.
* For instructions on how to get started with Data Prepper, see [Getting started with Data Prepper](https://opensearch.org/docs/latest/data-prepper/getting-started/).
* To learn more about the work in progress for Data Prepper 2.11 and other releases, see the [Data Prepper roadmap](https://github.com/orgs/opensearch-project/projects/221).
* To learn more about the work in progress for Data Prepper 2.11 and other releases, see the [Data Prepper Project Roadmap](https://github.com/orgs/opensearch-project/projects/221).

## Thanks to our contributors!
