Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add plugin execution duration metric #5046

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 9 additions & 2 deletions cli-plugins/manager/cobra.go
Original file line number Diff line number Diff line change
Expand Up @@ -90,11 +90,18 @@ func AddPluginCommandStubs(dockerCli command.Cli, rootCmd *cobra.Command) (err e
cargs = append(cargs, args...)
cargs = append(cargs, toComplete)
os.Args = cargs
runCommand, runErr := PluginRunCommand(dockerCli, p.Name, cmd)
pluginRunCommand, runErr := PluginRunCommand(dockerCli, p.Name, cmd)
if runErr != nil {
return nil, cobra.ShellCompDirectiveError
}
runErr = runCommand.Run()
if verifiedDockerCli, ok := dockerCli.(*command.DockerCli); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just shadow dockerCli in this case instead of calling it verifiedDockerCli? Not a huge fan, sorry (I was wondering what made it "verified", and then realized it was just the typecast cli)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thing is if we shadow dockerCli here, then in the else condition we can't call dockerCli.Err() as it'd be a zero value and not the original value of dockerCli. I tried avoiding this where possible (see the other places where i do this cast), but here since we need the else i wasn't sure how to proceed if not renaming the var.

Maybe there are better ways?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a (can be un-exported) pluginInstrumenter interface and match that?

runCommand := verifiedDockerCli.InstrumentPluginCommand(pluginRunCommand)
runErr = runCommand.TimedRun()
Comment on lines +98 to +99
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InstrumentPluginCommand returns a non-exported type, and we always execute TimedRun() afterwards, except for the tryPluginRun function, where we construct it, and then later execute TimedRun.

It's unfortunate that exec.Cmd doesn't implement an interface, otherwise we could've made that the return, and we can't just define an interface that only has a func Run() error

Perhaps we should either make this function perform the TimedRun; puny-code;

if cli is a pluginInstrumenter {
    cli.RunInstrumented(pluginRunCommand)
} else {
    pluginRunCommand.Run()
}

But that could even be a function, because effectively it's not a method of the CLI;

func RunPluginCommandInstrumented(cli, pluginCommand)

Hmm.. but then we get to startPluginCommandTimer, which is even more generic, but missing the baseAttrs 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate that exec.Cmd doesn't implement an interface, otherwise we could've made that the return, and we can't just define an interface that only has a func Run() error

My thoughts exactly, it's kinda annoying.

We can surely find a better way to do this, my original goal with the first implementation was to keep the approach as close as possible to that used to instrument the cobra commands, but maybe that's not a great idea all things considered. The dockerCli struct / Cli interface situation in the CLI (since we don't always pass around just the interface or just the struct) also caused some minor headaches and ugliness that would be nice to avoid if possible.

} else {
// This should not happen. continue without instrumenting the cmd if it does
fmt.Fprint(dockerCli.Err(), "Warning: Unexpected error, the plugin command will not have OTEL metrics")
runErr = pluginRunCommand.Run()
}
if runErr == nil {
os.Exit(0) // plugin already rendered complete data
}
Expand Down
98 changes: 91 additions & 7 deletions cli/command/telemetry_utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ package command
import (
"context"
"fmt"
"os/exec"
"strings"
"time"

Expand All @@ -14,8 +15,8 @@ import (
"go.opentelemetry.io/otel/metric"
)

// BaseCommandAttributes returns an attribute.Set containing attributes to attach to metrics/traces
func BaseCommandAttributes(cmd *cobra.Command, streams Streams) []attribute.KeyValue {
// baseCommandAttributes returns an attribute.Set containing attributes to attach to metrics/traces
func baseCommandAttributes(cmd *cobra.Command, streams Streams) []attribute.KeyValue {
return append([]attribute.KeyValue{
attribute.String("command.name", getCommandName(cmd)),
}, stdioAttributes(streams)...)
Expand Down Expand Up @@ -69,7 +70,7 @@ func (cli *DockerCli) InstrumentCobraCommands(ctx context.Context, cmd *cobra.Co
// It should be called immediately before command execution, and returns a stopInstrumentation function
// that must be called with the error resulting from the command execution.
func (cli *DockerCli) StartInstrumentation(cmd *cobra.Command) (stopInstrumentation func(error)) {
baseAttrs := BaseCommandAttributes(cmd, cli)
baseAttrs := baseCommandAttributes(cmd, cli)
return startCobraCommandTimer(cli.MeterProvider(), baseAttrs)
}

Expand All @@ -89,7 +90,7 @@ func startCobraCommandTimer(mp metric.MeterProvider, attrs []attribute.KeyValue)
defer cancel()

duration := float64(time.Since(start)) / float64(time.Millisecond)
cmdStatusAttrs := attributesFromError(err)
cmdStatusAttrs := attributesFromCommandError(err)
durationCounter.Add(ctx, duration,
metric.WithAttributes(attrs...),
metric.WithAttributes(cmdStatusAttrs...),
Expand All @@ -100,6 +101,66 @@ func startCobraCommandTimer(mp metric.MeterProvider, attrs []attribute.KeyValue)
}
}

// basePluginCommandAttributes returns a slice of attribute.KeyValue to attach to metrics/traces
func basePluginCommandAttributes(plugincmd *exec.Cmd, streams Streams) []attribute.KeyValue {
pluginPath := strings.Split(plugincmd.Path, "-")
pluginName := pluginPath[len(pluginPath)-1]
vvoland marked this conversation as resolved.
Show resolved Hide resolved
return append([]attribute.KeyValue{
attribute.String("plugin.name", pluginName),
}, stdioAttributes(streams)...)
}

// wrappedCmd is used to wrap an exec.Cmd in order to instrument the
// command with otel by using the TimedRun() func
type wrappedCmd struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating a wrapped exec.Cmd type, how hard would it be to just have pluginmanager.PluginRunCommand return a func that starts a timer, runs the command, and ends the timer?

Okay if we end up going with a wrapper, I just try hard to avoid those/introducing more state since now it has a pointer to a cli, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure that'd work..in the sense that we do things with the exec.Cmd that pluginmanager.PluginRunCommand returns aside from just running it (see the tryPluginRun() func as an example). So i wanted to keep the return value of InstrumentPluginCommand something close to what pluginmanager.PluginRunCommand currently returns. I guess we could rework all that but i'm not sure it'd be worth the extra work.

I'm not a fan of the wrapped type here either, but it seemed like a way to keep code changes relatively minimal on the usage side of things, and to allow code that doesn't need the otel bits to continue to function as usual. Happy to change the impl if we can find some simpler way to handle this. The pointer to DockerCli is really only needed to get the meter provider and the startPluginCommandTimer func atm.

The main idea of this impl was to try and keep the approach as close as possible to the one used by InstrumentCobraCommand

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Let me take a closer look, but if this is indeed the least invasive way that's okay by me. I also don't love having to remember to call a new method TimedRun, which is why I'd have rather returned a func or something.

*exec.Cmd

baseAttrs []attribute.KeyValue
cli *DockerCli
}

// TimedRun measures the duration of the command execution using an otel meter
func (c *wrappedCmd) TimedRun() error {
stopPluginCommandTimer := c.cli.startPluginCommandTimer(c.cli.MeterProvider(), c.baseAttrs)
err := c.Cmd.Run()
stopPluginCommandTimer(err)
return err
}

// InstrumentPluginCommand instruments the plugin's exec.Cmd to measure its execution time
// Execute the returned command with TimedRun() to record the execution time.
func (cli *DockerCli) InstrumentPluginCommand(plugincmd *exec.Cmd) *wrappedCmd {
baseAttrs := basePluginCommandAttributes(plugincmd, cli)
newCmd := &wrappedCmd{Cmd: plugincmd, baseAttrs: baseAttrs, cli: cli}
return newCmd
}

func (cli *DockerCli) startPluginCommandTimer(mp metric.MeterProvider, attrs []attribute.KeyValue) func(err error) {
durationCounter, _ := getDefaultMeter(mp).Float64Counter(
"plugin.command.time",
metric.WithDescription("Measures the duration of the plugin execution"),
metric.WithUnit("ms"),
)
start := time.Now()

return func(err error) {
// Use a new context for the export so that the command being cancelled
// doesn't affect the metrics, and we get metrics for cancelled commands.
ctx, cancel := context.WithTimeout(context.Background(), exportTimeout)
defer cancel()

duration := float64(time.Since(start)) / float64(time.Millisecond)
pluginStatusAttrs := attributesFromPluginError(err)
durationCounter.Add(ctx, duration,
metric.WithAttributes(attrs...),
metric.WithAttributes(pluginStatusAttrs...),
)
if mp, ok := mp.(MeterProvider); ok {
mp.ForceFlush(ctx)
}
}
}

func stdioAttributes(streams Streams) []attribute.KeyValue {
// we don't wrap stderr, but we do wrap in/out
_, stderrTty := term.GetFdInfo(streams.Err())
Expand All @@ -110,7 +171,9 @@ func stdioAttributes(streams Streams) []attribute.KeyValue {
}
}

func attributesFromError(err error) []attribute.KeyValue {
// Used to create attributes from an error.
// The error is expected to be returned from the execution of a cobra command
func attributesFromCommandError(err error) []attribute.KeyValue {
attrs := []attribute.KeyValue{}
exitCode := 0
if err != nil {
Expand All @@ -129,6 +192,27 @@ func attributesFromError(err error) []attribute.KeyValue {
return attrs
}

// Used to create attributes from an error.
// The error is expected to be returned from the execution of a plugin
func attributesFromPluginError(err error) []attribute.KeyValue {
attrs := []attribute.KeyValue{}
exitCode := 0
if err != nil {
exitCode = 1
if stderr, ok := err.(statusError); ok {
// StatusError should only be used for errors, and all errors should
// have a non-zero exit status, so only set this here if this value isn't 0
if stderr.StatusCode != 0 {
exitCode = stderr.StatusCode
}
}
attrs = append(attrs, attribute.String("plugin.error.type", otelErrorType(err)))
}
attrs = append(attrs, attribute.Int("plugin.status.code", exitCode))

return attrs
}

// otelErrorType returns an attribute for the error type based on the error category.
func otelErrorType(err error) string {
name := "generic"
Expand All @@ -149,7 +233,7 @@ func (e statusError) Error() string {
}

// getCommandName gets the cobra command name in the format
// `... parentCommandName commandName` by traversing it's parent commands recursively.
// `... parentCommandName commandName` by traversing its parent commands recursively.
// until the root command is reached.
//
// Note: The root command's name is excluded. If cmd is the root cmd, return ""
Expand All @@ -163,7 +247,7 @@ func getCommandName(cmd *cobra.Command) string {
}

// getFullCommandName gets the full cobra command name in the format
// `... parentCommandName commandName` by traversing it's parent commands recursively
// `... parentCommandName commandName` by traversing its parent commands recursively
// until the root command is reached.
func getFullCommandName(cmd *cobra.Command) string {
if cmd.HasParent() {
Expand Down
2 changes: 1 addition & 1 deletion cli/command/telemetry_utils_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,7 @@ func TestAttributesFromError(t *testing.T) {
tc := tc
t.Run(tc.testName, func(t *testing.T) {
t.Parallel()
actual := attributesFromError(tc.err)
actual := attributesFromCommandError(tc.err)
assert.Check(t, reflect.DeepEqual(actual, tc.expected))
})
}
Expand Down
32 changes: 24 additions & 8 deletions cmd/docker/docker.go
Original file line number Diff line number Diff line change
Expand Up @@ -136,9 +136,15 @@ func setupHelpCommand(dockerCli command.Cli, rootCmd, helpCmd *cobra.Command) {
helpCmd.Run = nil
helpCmd.RunE = func(c *cobra.Command, args []string) error {
if len(args) > 0 {
helpcmd, err := pluginmanager.PluginRunCommand(dockerCli, args[0], rootCmd)
helpRunCmd, err := pluginmanager.PluginRunCommand(dockerCli, args[0], rootCmd)
if err == nil {
return helpcmd.Run()
if dockerCli, ok := dockerCli.(*command.DockerCli); ok {
helpcmd := dockerCli.InstrumentPluginCommand(helpRunCmd)
return helpcmd.TimedRun()
}
// This should not happen. continue without instrumenting the cmd if it does
fmt.Fprint(dockerCli.Err(), "Warning: Unexpected error, the help command will not have OTEL metrics")
return helpRunCmd.Run()
}
if !pluginmanager.IsNotFound(err) {
return errors.Errorf("unknown help topic: %v", strings.Join(args, " "))
Expand All @@ -159,11 +165,17 @@ func tryRunPluginHelp(dockerCli command.Cli, ccmd *cobra.Command, cargs []string
if err != nil {
return err
}
helpcmd, err := pluginmanager.PluginRunCommand(dockerCli, cmd.Name(), root)
helpRunCmd, err := pluginmanager.PluginRunCommand(dockerCli, cmd.Name(), root)
if err != nil {
return err
}
return helpcmd.Run()
if dockerCli, ok := dockerCli.(*command.DockerCli); ok {
helpcmd := dockerCli.InstrumentPluginCommand(helpRunCmd)
return helpcmd.TimedRun()
}
// This should not happen. continue without instrumenting the cmd if it does
fmt.Fprint(dockerCli.Err(), "Warning: Unexpected error, the plugin help command will not have OTEL metrics")
return helpRunCmd.Run()
}

func setHelpFunc(dockerCli command.Cli, cmd *cobra.Command) {
Expand Down Expand Up @@ -224,12 +236,14 @@ func setValidateArgs(dockerCli command.Cli, cmd *cobra.Command) {
})
}

func tryPluginRun(dockerCli command.Cli, cmd *cobra.Command, subcommand string, envs []string) error {
plugincmd, err := pluginmanager.PluginRunCommand(dockerCli, subcommand, cmd)
func tryPluginRun(dockerCli *command.DockerCli, cmd *cobra.Command, subcommand string, envs []string) error {
pluginRunCmd, err := pluginmanager.PluginRunCommand(dockerCli, subcommand, cmd)
if err != nil {
return err
}

plugincmd := dockerCli.InstrumentPluginCommand(pluginRunCmd)

// Establish the plugin socket, adding it to the environment under a
// well-known key if successful.
srv, err := socket.NewPluginServer(nil)
Expand Down Expand Up @@ -279,7 +293,7 @@ func tryPluginRun(dockerCli command.Cli, cmd *cobra.Command, subcommand string,
}
}()

if err := plugincmd.Run(); err != nil {
if err := plugincmd.TimedRun(); err != nil {
statusCode := 1
exitErr, ok := err.(*exec.ExitError)
if !ok {
Expand All @@ -304,6 +318,8 @@ func runDocker(ctx context.Context, dockerCli *command.DockerCli) error {
return err
}

cmd.SetContext(ctx)

if err := tcmd.Initialize(); err != nil {
return err
}
Expand Down Expand Up @@ -357,7 +373,7 @@ func runDocker(ctx context.Context, dockerCli *command.DockerCli) error {
// We've parsed global args already, so reset args to those
// which remain.
cmd.SetArgs(args)
err = cmd.ExecuteContext(ctx)
err = cmd.Execute()

// If the command is being executed in an interactive terminal
// and hook are enabled, run the plugin hooks.
Expand Down
Loading