
Failing to parse [requestParameters.filter] [requestParameters.startTime] and probably others #24

Open
ghost opened this issue Jun 22, 2018 · 1 comment


ghost commented Jun 22, 2018

Another couple of Elasticsearch mapping errors. They look similar to earlier reports, but each involves a different field. In these cases: `filter` produced a "can't get text on a START_OBJECT" warning, and `startTime` failed to parse as an integer (it was a date string).

CloudTrail has such a high number of fields (I see your mapping limit of 1000 being broken and raise you a mapping limit of 2000 being broken!) that adding custom logic for every problem field feels like a losing battle. But if you're interested, I'm sure I can find another half-dozen or so errors like this and provide more details to help fix them.

It would be excellent to share a good mapping that dynamically maps the fields that are likely required and maps everything else as a string (or rather as a text/keyword multi-field, since Elasticsearch 2.x is long gone).

I have managed to find this template which looks promising, but would love to know if there were any more efforts floating around that come recommended by anybody here?
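For concreteness, here is a minimal sketch of the kind of dynamic template I have in mind, which maps any unknown string field as a text/keyword multi-field. The template name, index pattern, and `ignore_above` value are just placeholders, and this uses the type-less mapping form (Elasticsearch 7+); on 6.x the mappings would need a document-type wrapper:

```json
PUT _template/cloudtrail
{
  "index_patterns": ["logstash-cloudtrail-*"],
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_multifields": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "fields": {
              "keyword": { "type": "keyword", "ignore_above": 256 }
            }
          }
        }
      }
    ]
  }
}
```

This wouldn't solve the type-conflict fields like `requestParameters.filter`, but it would at least stop every new string field from needing explicit mapping.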

  • Version: 3.0.0
  • Operating System: Ubuntu server
  • Config File (if you have sensitive info, please remove it):
input {
  s3 {
    bucket => "xxxxxxxxxxxxxxx"
    type => "cloudtrail"
    tags => ["cloudtrail"]
    region => "eu-west-1"
    prefix => "xxxxxxxxxx"
    interval => 60
    codec => "cloudtrail"
    backup_to_bucket => "xxxxxxxx"
    backup_add_prefix => "ingested/"
    delete => true
    exclude_pattern => ".*/CloudTrail\-Digest/.*"
  }
}

output {
  elasticsearch {
    flush_size => 1000
    hosts => ["xxxxxxxxxxx"]
    index => "logstash-cloudtrail-%{+YYYY.MM}"
    user => "xxxxxxxxx"
    password => "xxxxxxxxxx"
  }
}
@fatalglitch

This is more an issue with how the data is formatted by CloudTrail than with this codec. The requestParameters.filter field in some CloudTrail JSON (IAM, and maybe others) is a concrete value, not an object; but in other CloudTrail JSON, requestParameters.filter is an object with nested fields.

There is no datatype in Elasticsearch that I'm aware of that can handle this, so the conflict would need to be identified at parse time, and a new sub-field created via a document modification (Logstash or similar).
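As a rough illustration of such a modification (untested; the `filterObject` target field is a made-up name), a Logstash ruby filter could detect the object form and move it aside, keeping a string copy in the original field so the mapping stays consistent:

```
filter {
  if [requestParameters][filter] {
    ruby {
      code => '
        f = event.get("[requestParameters][filter]")
        if f.is_a?(Hash)
          # Object form: stash it under a separate sub-field and
          # replace the original with a JSON string representation.
          event.set("[requestParameters][filterObject]", f)
          event.set("[requestParameters][filter]", f.to_json)
        end
      '
    }
  }
}
```

Every conflicting field would need its own clause like this, which is why doing it in the codec for all of CloudTrail's fields seems impractical.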

So with that, I don't believe this is something that can be fixed in the codec. I could be wrong, but the above is my observation.
