Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] detect incorrect global assertions in Eventually #127

Open
aecay opened this issue Dec 27, 2023 · 19 comments · May be fixed by #178
Open

[Enhancement] detect incorrect global assertions in Eventually #127

aecay opened this issue Dec 27, 2023 · 19 comments · May be fixed by #178
Assignees
Labels
enhancement New feature or request

Comments

@aecay
Copy link

aecay commented Dec 27, 2023

Is your feature request related to a problem? Please describe.

There are two ways of passing functions to gomega's Eventually. One is to pass a function that returns values, and assert on the return values. The other is to pass a function that does assertions itself (link). In the latter case, the function should take g Gomega as an argument and call g.Expect in order to assert.

Failing to do this will assert against the "global" gomega object -- which will only pass (or fail) the first time and will not continue to poll the assertions. This is a bug, and it can be quite timing dependent at runtime (e.g. if your local dev machine is fast enough that the async assertion passes the first time, but the machine where CI runs is slower and so the first call to the assertion fails).

Describe the solution you'd like

We can catch this in the linter, by looking for code of the form:

Eventually(func () {
    // ...
    Expect(...) // (or Ω) <-- this is what the linter should alert on
}).Should(...)

This should even be auto-fixable (if the func is 0 arity at least) by adding the g Gomega argument and converting all the calls to Expect to g.Expect.

I don't know if the linter can see the source of a function that is passed by name to Eventually (like Eventually(someFuncThatMightCallExpect)) -- if it can, we can find erroneous global calls to Expect in that function as well. But even catching the error when the function is a literal, as in the earlier example, would be helpful.

Describe alternatives you've considered

I don't know of any other potential way to prevent or detect this error before runtime.

Additional context

n/a

@aecay aecay added the enhancement New feature or request label Dec 27, 2023
@nunnatsa
Copy link
Owner

Thanks you so much for your suggestion, @aecay!

This idea was discussed before (#85).

Using the general Expect is a valid usage for real use-cases - for cases we don't wan't to retry, but to exit with error immediately, e.g. in case of a non-recoverable error. The ginkgolinter has no way to distinguish between these cases.

I tend to close the issue. WDYT?

@aecay
Copy link
Author

aecay commented Dec 28, 2023

Apologies, I missed the prior discussion.

I have never needed to bail out early from an Eventually in the way that you describe. But I can see how the code you describe would have that effect.

But gomega provides StopTrying(...).Now() to accomplish the same thing: link. I would tend to say that if you need to bail out early, you should use StopTrying since it is more explicit, and can't be mistaken for a bug (calling Expect when you should call g.Expect).

I guess this is getting into stylistic territory (rather than strictly finding bugs), since global Expect is equally valid as a way of signaling an early exit. But IMO it's worthwhile to provide a way for the linter to catch global Expect bugs, even if it requires stylistic concessions elsewhere in the code. Potentially the check could not be enabled by default though, to avoid annoyingly triggering lint errors in codebases that use another style. WDYT?

@pohly
Copy link

pohly commented Nov 23, 2024

I was about to file an issue about this when I found that there is already one (okay, two...). This misuse of an assertion inside a gomega.Eventually and gomega.Consistently is by far the most common mistake that developers make in Kubernetes. The expectation clearly is that a failed assertion is caught and triggers a retry.

Instead, the test gets marked as failed and ginkgo.Fail raises a panic. I am not sure whether gomega.Eventually/Consistently catches that and then retries. Does it?

I see two solutions:

  • detect this in the linter
  • somehow handle it in Ginkgo + Gomega

Let's discuss here instead of splitting the discussion. cc @onsi

The linter could annotate functions which call ginkgo.Fail. This would cover Gomega and Kubernetes' e2e/framework.ExpectNoError. It then needs to detect functions that get called by gomega.Eventually/Consistently and then check those whether any function annotated as "calls ginkgo.Fail" is called. Non-trivial... If it can be made to work, developers still need to fix their code.

ginkgo.Fail could check for "called by gomega.Eventually/Consistently" during unwinding. Then instead of marking the test as failed, it could panic with enough information for gomega.Eventually/Consistently to capture a nice failure explanation in case that all retries fail. Perhaps a simple string is enough; alternatively, a type implementing a well-known AssertionFailed() string method could be used. Then Gomega can do type assertion with a locally defined interface to check for this special panic, without depending on Ginkgo source code.

@jackfrancis
Copy link

+1, as a dev would love to be able to more easily identify that a test failure was caused by incorrect Eventually usage.

@onsi
Copy link

onsi commented Nov 23, 2024

This should really get fixed in Ginkgo/Gomega. The strict separation was a cute decision 11 years ago but this pain point is pretty terrible.

Gomega captures the panic. I bet there's a way to annotate it so that (a) Gomega knows it came from Ginkgo and (b) can extract a callback from the panic object to tell Ginkgo to ignore the failure.

I'll dig into it - though I'm not confident about timelines. Let me see if I can make time for it soon...

@nunnatsa
Copy link
Owner

nunnatsa commented Nov 24, 2024

I don't think the problem is the implementation, at least not in the linter. This is not trivial, but doable.

The main problem is that the linter, for several reasons, does not know how to read the developer mind, yet. ;)

Global Expects within async gomega methods, are often done intentionally, in order to bail out in case of unrecoverable failures.

The gomega current solution is to replace the assertion (Expect...To), with regular golang code, and call StopTrying(...).Now() if needed. This is hard to make this in the linter, mostly because the test may use any matcher.

WDYT about adding something to gomega, so we'll could have the option to bail out, but using the existing matchers. e.g.

Eventually(func(g Gomega){
    ...
    ExpectOrExit(vm.Status).ToNot(Equal("Failed"))
    ...
    g.Expect(somethingElse).To(BeSomethingElse())
    ...
}).WithPolling(...).WithTimeout(...).Should(Succeed())

or maybe

Eventually(func(g Gomega){
    ...
    Expect(vm.Status).ToNotOrExit(Equal("Failed"), "VM must not be in Failed status")
    ...
    g.Expect(somethingElse).To(BeSomethingElse())
    ...
}).WithPolling(...).WithTimeout(...).Should(Succeed())

@pohly
Copy link

pohly commented Nov 24, 2024

Global Expects within async gomega methods, are often done intentionally, in order to bail out in case of unrecoverable failures.

But that doesn't work, does it? The async Gomega methods keep trying and eventually time out. Only then is the test really done and has failed.

What I and @onsi suggested would change that slightly: if such an assertion fails only once, the next retry may succeed and the test passes. This is a admittedly a change of behavior, but as the former behavior is basically undefined and/or falls under "don't do this" I think it's okay.

@nunnatsa
Copy link
Owner

But that doesn't work, does it? The async Gomega methods keep trying and eventually time out. Only then is the test really done and has failed.

Yes it is working. Using the global omega bails out early, and sometimes this is what we want.

@nunnatsa
Copy link
Owner

@pohly here is an example:

var _ = Describe("global Gomega vs. gomega var", func() {
	It("should bail out immediately", func() {
		Eventually(func() error {
			Expect(2).ToNot(Equal(2))
			return nil
		}).WithTimeout(10 * time.Second).WithPolling(time.Second).Should(Succeed())
	})

	It("should fail after 10 seconds", func() {
		Eventually(func(g Gomega) {
			g.Expect(2).ToNot(Equal(2))
		}).WithTimeout(10 * time.Second).WithPolling(time.Second).Should(Succeed())
	})
})

When running these tests, I can see the timing in the output:

• [FAILED] [0.000 seconds] <===== global gomega
global Gomega vs. gomega var [It] should bail out immediately
.../main_test.go:13

  [FAILED] Expected
      <int>: 2
  not to equal
      <int>: 2
  In [It] at: .../main_test.go:15 @ 11/25/24 08:11:39.646
------------------------------
• [FAILED] [10.000 seconds] <===== gomega var
global Gomega vs. gomega var [It] should fail after 10 seconds
.../main_test.go:20

  [FAILED] Timed out after 10.000s.
  The function passed to Eventually failed at .../main_test.go:22 with:
  Expected
      <int>: 2
  not to equal
      <int>: 2
  In [It] at: .../main_test.go:23 @ 11/25/24 08:11:49.646

@pohly
Copy link

pohly commented Nov 25, 2024

• [FAILED] [0.000 seconds] <===== global gomega

@onsi said that gomega.Eventually would catch the panic and I assumed that it would then retry, but your are right. It just passes "normal" panics through, so this aborts immediately:

https://github.com/onsi/gomega/blob/4c964c6b4e8d0ccab972a25aff74ed28940979e5/internal/async_assertion.go#L320-L322

So as much as I would like gomega.Expect inside gomega.Eventually/Consistently to work as expected (sic!) by most Kubernetes developers, it would be a breaking change for others. Perhaps make it configurable in Gomega?

@pohly
Copy link

pohly commented Nov 25, 2024

If a linter check can be implemented, then it could be kept disabled by default. If enabled, it becomes a per-project policy that global asserts should be avoided due to the ambiguity, with either StopPollingNow, func(g Gomega), or moving assertions to the Eventually matcher as replacement.

@nunnatsa
Copy link
Owner

I guess it's doable, but not so simple with the current code of the linter.

It's a bit challenge to understand the scope of a gomega assertion expression, to understand if it's in a function deceleration that is an argument of a async-gomega expression. And this would be the easy part. because sometimes these functions are declared elsewhere, even in a different package.

Anyway, if someone have some time for it, then some help is welcome.

I have an initial idea how to approach it - not sure it will work.

@pohly
Copy link

pohly commented Nov 25, 2024

You probably know best how to do this 😅

If you don't get to it, then please ping the issue again and I'll check whether I can make some time for it.

@onsi
Copy link

onsi commented Nov 25, 2024

hmm. sorry everyone, I was clearly misremembering. the existing behavior is surprising and unfortunate to me and I would consider it a bug. it's a conversation that's out of scope on this linter issue. in retrospect a better design would have been to avoid the func(g Gomega) pattern and instead capture Gomega failures (really any invocation of ginkgo.Fail) and retry - and encourage the user to explicitly use StopTrying to abort Eventually.

Given that some folks are relying on the current behavior I don't think there's a straightforward way to change things without breaking backward compatibility. I could add a configuration but now we're just layering on complexity instead of fixing the underlying issue. The other alternative is to build a roadmap for Gomega 2.0 and put this breaking change on it.

@pohly if you'd like to explore making this configurable further please open an issue on Gomega.

@pohly
Copy link

pohly commented Nov 25, 2024

@onsi: I agree with your assessment. Let's wait with the configuration option because even if it existed, I would have the problem that I can't be sure whether it is safe to enable without the linter check and going through all code locations that it points out.

@nunnatsa nunnatsa linked a pull request Dec 2, 2024 that will close this issue
15 tasks
@nunnatsa
Copy link
Owner

nunnatsa commented Dec 2, 2024

Hi @pohly , @onsi

I managed to build something, in #178

But there are two limitation I couldn't solve. First, if the Eventually calls a function from another package (e.g. test utils package), the linter can't find it.

Also, if the function is defined as a variable, again, the linter can't catch it, e.g.

...

testSomething := func() error
    Expect(something).To(BeSomething())
    ...
    return nil
}

...
It("could not be detected by the linter", func() {
    Eventually(testSomething).To(Succeed())
}

This is a very common pattern in some projects.

Also, I implemented it by running an internal analyzer first, to gather all the async scopes, and then the regular one that applies the linter rules. This new implementation causes a performance degradation.

WDYT? should I merge it?

@pohly
Copy link

pohly commented Dec 2, 2024

I haven't looked at your implementation, but it sounds like you aren't using the facts API, are you?

My (naive) expectation is that using facts it should be possible to mark functions in one package as "calls ginkgo.Fail" and use that fact when analyzing some other package. It should also be possible to handle function variables by copying that fact from the function into the variable while analyzing assignments.

@nunnatsa
Copy link
Owner

nunnatsa commented Dec 2, 2024

I was looking at facts in the past. I seems not working for me. I'll give it another try. BTW, finding function that calls ginkgo.Fail is very hard. I' looking for assertions that called from the global Gomega, that are used within an async assertion.

@pohly
Copy link

pohly commented Dec 2, 2024

BTW, finding function that calls ginkgo.Fail is very hard

True, it's very indirect when Gomega gets involved. Using gomega.Expect as an approximation and perhaps some a configuration mechanism for custom functions (like k/k/test/e2e/framework.ExpectNoError) and methods might be good enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants