Logstash Azure Blob plugin doesn't see all the files #222

Open
katatohuk opened this issue Jun 2, 2020 · 6 comments

Comments

@katatohuk

Hi folks,
I recently started using the Logstash Azure Blob plugin. Everything seemed to work well in TEST, but in PROD, with thousands of JSON files, I'm running into trouble.
The issue is that Logstash can't read ALL the JSON files stored in a blob storage account; it picks up only a few of them, seemingly at random. I'm running Logstash in debug mode and can't see any errors except this one, which appears from time to time:
(412) A lease ID was specified, but the lease for the blob has expired
However, Logstash is still running and getting some of the files into Elasticsearch.
Here is my config:

input
{
    azureblob
    {
        storage_account_name => "errorlogsstorage"
        storage_access_key => "mykeyhere"
        container => "errorlogs"
        codec => "json"
        interval => "1"
        registry_create_policy => "start_over"
    }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["http://10.78.0.75:9200"]
    index => "errorlogs"
    codec => json
  }
}

Has anyone experienced the same and could share a solution? Any input is much appreciated.
Thanks !

@pinochioze

pinochioze commented Jun 16, 2020

Hi Katatohuk,
(412) A lease ID was specified, but the lease for the blob has expired
This error happens when reading a JSON file takes too long (I mean the lease is held longer than its timeout). You can try setting "registry_lease_duration":

Sets the value for registry file lock duration in seconds. It must be set to -1, or between 15 to 60 inclusively.

config :registry_lease_duration, :validate => :number, :default => 15
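
For illustration, a minimal sketch of how that option might be added to the input block from the original post. The value 60 is an assumption (the upper end of the allowed range quoted above), not a tested recommendation, and whether it resolves the 412 error is unverified; every other setting is copied from the config above.

```
input {
    azureblob {
        storage_account_name => "errorlogsstorage"
        storage_access_key => "mykeyhere"
        container => "errorlogs"
        codec => "json"
        interval => "1"
        registry_create_policy => "start_over"
        # Assumed value for illustration. Per the plugin source, this must be -1,
        # or between 15 and 60 inclusive; the default is 15.
        registry_lease_duration => 60
    }
}
```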

@pinochioze

I wonder why you use the option registry_create_policy => "start_over".

It will read the JSON logs from the beginning every time you restart the Logstash service. This may cause many files not to be read.

About your output configuration: it should use only stdout or elasticsearch.
If the logs are output to STDOUT, they will create the index in Elasticsearch.
One more thing: this setting does not seem necessary, so you can remove it:

codec => json
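
Put together, the suggestions in this comment might look like the sketch below. It is only an illustration of the advice, not a verified fix: the "resume" value for registry_create_policy is an assumption about the plugin's alternative to "start_over", and the remaining settings are copied from the original post.

```
input {
    azureblob {
        storage_account_name => "errorlogsstorage"
        storage_access_key => "mykeyhere"
        container => "errorlogs"
        codec => "json"
        interval => "1"
        # Assumption: "resume" continues from the saved registry
        # instead of re-reading everything after a restart.
        registry_create_policy => "resume"
    }
}

output {
  # stdout output dropped and no codec set, following the suggestions above.
  elasticsearch {
    hosts => ["http://10.78.0.75:9200"]
    index => "errorlogs"
  }
}
```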

@katatohuk
Author

> Hi Katatohuk,
> (412) A lease ID was specified, but the lease for the blob has expired
> This error happens when reading a JSON file takes too long (I mean the lease is held longer than its timeout). You can try setting "registry_lease_duration":
>
> Sets the value for registry file lock duration in seconds. It must be set to -1, or between 15 to 60 inclusively.
>
> config :registry_lease_duration, :validate => :number, :default => 15

Where did you find the registry_lease_duration parameter? I can't see it documented here:
https://github.com/Azure/azure-diagnostics-tools/tree/master/Logstash/logstash-input-azureblob

@katatohuk
Author

> I wonder why you use the option registry_create_policy => "start_over".
>
> It will read the JSON logs from the beginning every time you restart the Logstash service. This may cause many files not to be read.
>
> About your output configuration: it should use only stdout or elasticsearch.
> If the logs are output to STDOUT, they will create the index in Elasticsearch.
> One more thing: this setting does not seem necessary, so you can remove it:
>
> codec => json

Regarding start_over, I just tried all possible variations :) I know what the parameter implies, of course.

@pinochioze

pinochioze commented Jun 16, 2020

I found it in the source code at this link, on line 83:
https://github.com/Azure/azure-diagnostics-tools/blob/master/Logstash/logstash-input-azureblob/lib/logstash/inputs/azureblob.rb

Sets the value for registry file lock duration in seconds. It must be set to -1, or between 15 to 60 inclusively.

The default, 15 means the registry file will be locked for at most 15 seconds. This should usually be sufficient to
read the content of registry. Having this configuration here to allow lease expired in case the client crashed that
never got a chance to release the lease for the registry.

config :registry_lease_duration, :validate => :number, :default => 15

@pinochioze

There are many things to consider when you deploy this to a Prod environment.
I have read and modified the source code to make it suit my system.
In my system, I export logs from Application Insights to Azure Blob, then Logstash reads the log files from Azure Blob.
So if you have the same scenario, I am glad to help you.
