I have one instance of Logstash reading data from Azure Blob Storage. Although the logs are all in the same container, there are two major folder structures for logs from two different processes. The blob structure is something like this:
Blob
I run only one instance of Logstash. The issue is that logs from folder2 are processed much faster than logs from folder1; folder2 is days ahead of folder1. (This is a catch-up scenario: I am reading logs from the start of this month.) How do I debug this?
Hi Arun, I think the difference comes from the number of blobs in each folder (you can get this number with the CLI or Microsoft Azure Storage Explorer). The plugin works as follows:
1. Get the list of all the blobs in the container.
2. Compare the list against "path_filters" and keep the blobs that match.
3. Pick one blob from the list of matched blobs, based on the generation algorithm and the offset of the blob.
So many of the blobs in the matched list have to wait for the next loop of the process.
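To make the effect of the blob count concrete, here is a minimal Python sketch of the loop described above. It is a simplified model under stated assumptions, not the plugin's actual code: the blob names, the `build_container` and `simulate` helpers, and the "first pending blob in sorted order" selection rule are all hypothetical, `fnmatchcase` only stands in for the plugin's own path matching, and a whole blob is consumed per loop instead of one chunk at an offset.

```python
from fnmatch import fnmatchcase


def build_container(blobs_per_day, days=5):
    """Build a made-up container listing from {folder: blobs per day}."""
    return [
        f"{folder}/2020/06/{day:02d}/log_{i:03d}.csv"
        for folder, per_day in blobs_per_day.items()
        for day in range(1, days + 1)
        for i in range(per_day)
    ]


def simulate(all_blobs, path_filter, loops):
    """Run `loops` iterations of the simplified loop; return blobs handled."""
    done = []
    seen = set()
    for _ in range(loops):
        # Steps 1-2: list the whole container, keep blobs matching the filter.
        matched = sorted(b for b in all_blobs if fnmatchcase(b, path_filter))
        # Step 3 (assumed rule): take the first matched blob not yet handled.
        pending = [b for b in matched if b not in seen]
        if not pending:
            break
        seen.add(pending[0])
        done.append(pending[0])
    return done


# Assumption: folder1 receives ten times as many blobs per day as folder2.
container = build_container({"folder1": 100, "folder2": 10})
for folder in ("folder1", "folder2"):
    done = simulate(container, f"{folder}/2020/**/*.csv", loops=30)
    print(folder, "last blob reached:", done[-1])
```

With the same number of loops, the input whose folder holds fewer blobs per day gets through more days, which would match folder2 running ahead of folder1. Comparing the real blob counts per folder is a quick way to check whether this is what is happening.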
Issue with Logstash input for Azure blob
My Logstash blob config looks like this:
```
azureblob
{
    storage_account_name => 'folder1'
    storage_access_key => ''
    container => 'logs'
    id => 'jobs1'
    blob_list_page_size => 150
    file_chunk_size_bytes => 8088608
    registry_create_policy => 'resume'
    path_filters => 'folder1/2020 /**/*.csv'
}
azureblob
{
    storage_account_name => 'folder2'
    storage_access_key => ''
    container => 'logs'
    id => 'jobs1'
    blob_list_page_size => 150
    file_chunk_size_bytes => 8088608
    registry_create_policy => 'resume'
    path_filters => 'folder2/2020 /**/*.csv'
}
```
Heap is around 3 GB and CPU usage is at 70-80%.