lsperfm getting stuck when file output is specified #22

Open
debraj-manna opened this issue Jul 14, 2015 · 12 comments


@debraj-manna

It seems lsperfm gets stuck when the file output plugin is used. It just keeps printing
.............................

Below is my suite:

conf/simple.conf

input {
  stdin {}
}

filter {
  clone {}
}

output {
  file {
    path => "/tmp/test.log"
  }
}

input/simple.txt

test 01
test 02
test 03

suite/suite.rb

[
  {:name => "simple line in/out", :config => "config/simple.conf", :input => "input/simple.txt", :time => 5},
]
@purbon
Contributor

purbon commented Jul 14, 2015

Interesting. I never tested with the file output; it is actually on my TODO list, so thanks for reporting. All the default configurations I've been using use stdin/stdout.

It actually might make sense that it is not working properly with the file output, as the code relies on the messages written to stdout to control the workflow. Can you test with two outputs, one to stdout and another to the file output?

@debraj-manna
Author

Yes, it works if I specify two outputs.
conf/simple.conf

input {
  stdin {}
}

filter {
  clone {}
}

output {
  stdout { codec => line }
  file {
    path => "/tmp/test.log"
  }
}

@purbon
Contributor

purbon commented Jul 14, 2015

Yes, for now you should always use stdin and stdout :-(. Support for other input and output plugins will come in future versions.

@debraj-manna
Author

Thanks purbon. So in the output section, can I use other output plugins along with stdout, or should I use only stdout and no other output plugins?

@purbon
Contributor

purbon commented Jul 14, 2015

For now it is better to use only stdin and stdout; the event counting is done only on data coming out of stdout.

@debraj-manna
Author

Ok :)

@colinsurprenant
Contributor

As noted, the current system spawns an external logstash process and relies on stdin to feed events and stdout to read the resulting output.
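
A rough Ruby sketch of that stdin/stdout control loop, to show why a config with no stdout output stalls the harness. This is not lsperfm's actual code: `count_stdout_events`, `ECHO_CMD`, and `SILENT_CMD` are hypothetical names, small child Ruby processes stand in for logstash, and the timeout is an assumption added here so the sketch terminates instead of hanging.

```ruby
require "open3"
require "rbconfig"
require "timeout"

# Hypothetical stand-ins for a logstash process (NOT real logstash):
# ECHO_CMD copies every stdin line to stdout, like a stdout output would.
ECHO_CMD = [RbConfig.ruby, "-e",
            '$stdout.sync = true; STDIN.each_line { |l| puts l }'].freeze
# SILENT_CMD writes every line to a file only, so nothing reaches stdout.
SILENT_CMD = [RbConfig.ruby, "-e",
              'File.open(File::NULL, "w") { |f| STDIN.each_line { |l| f.write(l) } }'].freeze

# Feed `lines` to the child on stdin and count the lines it writes to stdout.
# Stops waiting after `wait` seconds; without that limit, a child that never
# writes to stdout would block the reader forever.
def count_stdout_events(command, lines, wait: 5)
  Open3.popen3(*command) do |stdin, stdout, _stderr, _wait_thr|
    lines.each { |line| stdin.puts(line) }
    stdin.flush
    seen = 0
    begin
      Timeout.timeout(wait) do
        seen += 1 while seen < lines.size && stdout.gets
      end
    rescue Timeout::Error
      # No more output arrived in time: this is the "stuck" case.
    end
    stdin.close # lets the child see EOF and exit
    seen
  end
end

lines = ["test 01", "test 02", "test 03"]
puts "stdout child: #{count_stdout_events(ECHO_CMD, lines)} events seen"
puts "file-only child: #{count_stdout_events(SILENT_CMD, lines, wait: 1)} events seen"
```

With the file-only child the counter never advances, which matches the symptom reported above: the harness keeps waiting for events that are never written to stdout.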

For now, there isn't much we can do on the input side: stdin must be used. But as suggested, you can add multiple outputs. In logstash, when there are multiple outputs, each event goes through the list of outputs in sequence (unless you specify multiple output workers), so if you want to benchmark a configuration with a specific output, just add stdout as well and the resulting metrics should be valid for that output. stdout (and stdin, for that matter) is never the limiting factor in performance for any non-trivial configuration.

@colinsurprenant
Contributor

This was originally designed mainly for benchmarking the pipeline, codecs, and filters, without any introspection API in logstash. This definitely raises the question of how to monitor logstash performance without relying on something we can control, like stdin and stdout. I think we will have to add a performance metrics API to logstash to be able to measure the performance of configurations without stdin and stdout.

@purbon
Contributor

purbon commented Jul 14, 2015

+1 @colinsurprenant, this is actually on the roadmap of tasks for LS and the benchmarking efforts.

@debraj-manna
Author

@colinsurprenant If I add stdout and some other output, will the resulting metrics take into account only stdout, or both stdout and the other output?

@purbon
Contributor

purbon commented Jul 14, 2015

Only stdout, as that is how it was first designed. For now, if you want to see the closest numbers, I recommend using multiple outputs with stdout last. Future versions should provide features to avoid this.

/purbon


@colinsurprenant
Contributor

@debraj-manna @purbon Both stdout and the other output will be taken into account, and the performance hit of adding stdout alongside your other output should be negligible, so I believe it should give you usable metrics. Here's why:

When 2 outputs are defined, say output-1 and output-2 (without additional output workers, and this is important), every event resulting from the filter stage is pushed into an intermediate queue so the output stage can pick it up. The output stage pops an event from the queue and passes it to output-1, then to output-2. Once output-2 finishes and returns, the output stage goes back, pops another event from the queue, and repeats the process.

This means that the overall TPS (transactions, or events, per second) your logstash setup achieves includes the cost of outputting each event to both outputs. But as I said, outputting to stdout should be "almost free" in terms of processing, so the TPS you get should still be relevant.

Also, it won't actually make a difference where you put the stdout output in the config, first or last...
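
The single-worker output stage described above can be modelled in a few lines of Ruby. This is only an illustrative sketch, not logstash's implementation: `run_output_stage` and the two lambdas are hypothetical names, the `sleep` stands in for a slow output such as a file or network plugin, and the cheap lambda stands in for stdout.

```ruby
# Hypothetical model of a single-worker output stage: each event popped from
# the queue is passed to every output in order before the next event is popped.
def run_output_stage(events, outputs)
  queue = Queue.new
  events.each { |event| queue << event }
  queue.close # pop returns nil once the closed queue is empty

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  processed = 0
  while (event = queue.pop)
    outputs.each { |output| output.call(event) }
    processed += 1
  end
  seconds = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  { processed: processed, seconds: seconds, tps: processed / seconds }
end

calls = []
# Stand-in for a slow output (assumed cost of 2 ms per event).
slow_output = ->(event) { calls << [:slow, event]; sleep 0.002 }
# Stand-in for stdout: records the event and does almost no work.
stdout_like = ->(event) { calls << [:stdout, event] }

events = (1..50).to_a
stdout_last  = run_output_stage(events, [slow_output, stdout_like])
stdout_first = run_output_stage(events, [stdout_like, slow_output])

puts format("stdout last:  %.0f events/s", stdout_last[:tps])
puts format("stdout first: %.0f events/s", stdout_first[:tps])
```

In this model the measured rate is dominated by the slow output in both runs, since every event pays the cost of both outputs regardless of their order. The real numbers for logstash will of course depend on the actual plugins involved.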
