
Added CompactSplitter #60

Open
innonagle wants to merge 2 commits into master
Conversation

innonagle

This is a solution for #57.

Java is not my forte, so I am unsure how to run the test suite. If instructions can be provided on how to do so, I can update the PR to include test coverage.

@chaochenq
Contributor

chaochenq commented Dec 7, 2017

Hi, sorry for the delay. We are releasing an AggregationSplitter that does something similar to this PR. I will be pushing out the changes soon. Please let me know if that fits your requirement; if not, we can discuss how to improve it.

The changes we made are pretty similar to what's in here, except that we have added tests as well.

@innonagle
Author

@chaochenq Can I get a commit/branch/fork reference?

@chaochenq
Contributor

@lennynyktyk I haven't pushed out the changes yet. But I will let you know once it's done. Thanks!

@chaochenq
Contributor

@lennynyktyk I have released a newer version with AggregationSplitter, which does a similar thing to this one:

https://github.com/awslabs/amazon-kinesis-agent/blob/master/src/com/amazon/kinesis/streaming/agent/tailing/AggregationSplitter.java

Please have a look and let me know of your thoughts. Thanks!

@innonagle
Author

Hello @chaochenq ,

I have gotten around to checking this out and can say it is equivalent.

In my testing, the only issue I found was that the aggregatedRecordSizeBytes value in agent.json does not seem to get parsed unless it is a JSON string; a JSON int does not work for me. I am only able to test with version 1.1.3 on Java 1.7, as that is the version my Elastic Beanstalk environment can find in yum.
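
For reference, this is roughly the workaround described above. The filePattern and deliveryStream values are illustrative placeholders; the point is that quoting the value as a string ("102400") parses, while a bare int (102400) did not in my testing:

```json
{
  "flows": [
    {
      "filePattern": "/var/log/app/*.log",
      "deliveryStream": "my-delivery-stream",
      "aggregatedRecordSizeBytes": "102400"
    }
  ]
}
```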

@innonagle innonagle closed this Dec 6, 2018
@innonagle
Author

Hello @chaochenq ,

After digging into this more I am unable to replicate the results I would expect.

In the attached graph you can see that after 17:00 there are only dots and not lines. Prior to 17:00 is the stock AWSKinesisAgent JAR v1.1.3, which I tried to configure with aggregatedRecordSizeBytes. After 17:00 is the modified JAR with the CompactSplitter in this PR. Also, after 17:00 the IncomingRecords series has a consistent value of 1, whereas prior to that it has the same shape as the IncomingBytes series.

I understand there may need to be a time component, e.g. flush at aggregatedRecordSizeBytes or after X milliseconds, whichever comes first, but the AggregationSplitter does not seem to be reducing the number of IncomingRecords at all.
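
To illustrate the size-or-time idea, here is a minimal sketch; SizeOrTimeFlushPolicy, maxBytes, and maxLatencyMillis are hypothetical names, not part of the agent's code base:

```java
// Hypothetical sketch of a size-or-time flush policy.
public class SizeOrTimeFlushPolicy {
    private final int maxBytes;          // e.g. the aggregatedRecordSizeBytes setting
    private final long maxLatencyMillis; // the "X milliseconds" component

    public SizeOrTimeFlushPolicy(int maxBytes, long maxLatencyMillis) {
        this.maxBytes = maxBytes;
        this.maxLatencyMillis = maxLatencyMillis;
    }

    // Flush when the buffer is full OR the oldest buffered line has
    // waited longer than the latency budget, whichever comes first.
    public boolean shouldFlush(int bufferedBytes, long oldestEntryTimestampMillis) {
        long age = System.currentTimeMillis() - oldestEntryTimestampMillis;
        return bufferedBytes >= maxBytes || age >= maxLatencyMillis;
    }
}
```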

[Screenshot: CloudWatch graph, 2018-12-06 3:43 PM]

@innonagle innonagle reopened this Dec 6, 2018
@nitzanav

nitzanav commented Jan 3, 2019

@chaochenq @lennynyktyk
I wanted to ask if there is a solution to aggregate and compress lines of files into one Kinesis stream message.

Will this solution work for Kinesis Streams?
If not, is anyone willing to get paid to develop such a solution? Or I can contribute... Please email me about this, [email protected]

@innonagle
Author

innonagle commented Jan 4, 2019

@nitzanav While I have only tested the aggregation for Kinesis Firehose, I see no reason why the way I implemented this solution would not also aggregate for Kinesis Streams. Please understand this PR only aggregates the data; it does not compress it. I have not attempted to compress the data prior to sending it. The solution provided in release 1.1.3 only supports Kinesis Firehose.

I believe it is possible to write compressed data to the file under observation as long as each "blob" of compressed data written to the file is terminated by a newline "\n".
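
A minimal sketch of that idea (Java 8+; the class and helper names here are hypothetical). Base64-encoding each gzip blob keeps the raw compressed bytes, which may themselves contain '\n', from splitting the record:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

public class CompressedLineWriter {
    // Gzip a record and append it to the tailed file as a single
    // newline-terminated line, so the agent's splitter sees one record.
    static void appendCompressedRecord(Path file, String record) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(record.getBytes(StandardCharsets.UTF_8));
        }
        String line = Base64.getEncoder().encodeToString(buf.toByteArray()) + "\n";
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```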

@nitzanav

nitzanav commented Jan 4, 2019

@lennynyktyk It seems that you are suggesting compressing each row separately. I guess you want to avoid recompressing the entire batch/chunk every time a row is added in order to verify that it doesn't exceed aggregatedRecordSizeBytes.

In any case, I need technical assistance for a few hours of work, so if any of you can help me code the things I need here, it would be much appreciated. [email protected]
