-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RET504 false-positive when variable built up with assignments and returned #2950
Comments
Possibly not respecting overrides, does look like a bug. |
Just noticed this too! def f(y):
x = 1
if y:
x = 2
return x $ ruff tmp.py --isolated --select RET
tmp.py:6:12: RET504 Unnecessary variable assignment before `return` statement
Found 1 error. It seems the RET checks don't handle branches very well (the "else after" ones also are broken when you have a chain). |
It may wants you to do this, even though it's awkward: def f(y):
if y:
return 2
return 1
Is this on |
(I find these rules really tedious lol. Maybe they should just be removed.) |
They will come back again when adding pylint rules. :) I like these in pylint a lot, which I is why I'm happy to see them showing up in a faster pre-commit friendly form, but the pylint version is solid and doesn't trigger falsely, while RET seems to not properly handle branching (I expect it's a problem with the flake8 plugin).
It's not practical in the actual code that caused me to look into this. The fix should be to just only enforce |
I haven't rerun with the most recent release yet for the RET 507 and similar rules, quite possibly fixed! |
Ah yeah, I mostly meant that I have issues with the implementation. It's a direct port of the flake8 plugin, and it just has a bunch of limitations (in addition to being hard to reason about). |
Ahh, I see what you mean. def f(y):
x = 1
if y:
return 2
return x Would pass. I could live with that in some/most cases. |
I think it will still complain about |
In a realistic example (where other things are happening to the x before the final if), it seems pretty happy. I didn't end up with any false positives that I couldn't handle knowing that it was fine just changing the assignment in the if to a return. There were several when converting Plumbum and scikit-hep/particle@10caa90 was the one that prompted me to search for this issue. |
Just to add one more (almost) real life example: I have this pattern that seems reasonable in Django to build a queryset based on different inputs def get_queryset(option_1, option_2):
queryset: Any = None
queryset = queryset.filter(a=1)
if option_1:
queryset = queryset.annotate(b=Value(2))
if option_2:
queryset = queryset.filter(c=3)
return queryset
|
This would be how you’d make it happy: def get_queryset(option_1, option_2):
queryset: Any = None
queryset = queryset.filter(a=1)
if option_1:
queryset = queryset.filter(b=2)
if option_2:
return queryset.filter(c=3)
return queryset Not quite as symmetric, but not really “worse” from a readers standpoint either. I’d probably disable the check if this was a lot of very symmetric conditions (like you have described), though in practice most of them are not that symmetric and are fine written with the extra return. Overriding a variable is a bit ugly, so avoiding an unnecessary one isn’t that bad. |
Personally, I dislike the resulting work-around because:
One idea would be to only flag the single-assignment and return scenarios, e.g.: def get_queryset():
queryset = Model.filter(a=1)
return queryset ...and not flag scenarios where there are multiple assignments to the same variable before returning, e.g.: def get_queryset(option_1, option_2):
queryset = Model.filter(a=1)
if option_1:
queryset = queryset.filter(b=2)
if option_2:
queryset = queryset.filter(c=3)
return queryset Though, I guess that would also mean that something like this would pass not get flagged: def get_queryset():
queryset = Model.filter(a=1)
queryset = queryset.filter(c=3)
return queryset Which raises the question of what is it that this rule is really meant to be targeting? Cases where would save a line of code due to an unnecessary assignment? Should check be simplified to only look for the simple assign/return case that happens at the same nesting level, and ignore if there is more complex branching/looping involved? |
If the rule should only be ignored if the assignment is under |
I agree with @gdub . In cases similar to the one mentioned by him, I usually do Don't know if it is simple for ruff to detect it though =P |
The multiple assignments heuristic seems like it could work... pretty well? In my personal opinion I actually prefer this to inlining def get_queryset():
queryset = Model.filter(a=1)
queryset = queryset.filter(c=3)
return queryset It also has the nice property that if you add another operation to the fluent chain, the diff much clearer. But, it's all a bit subjective of course. I think the intent of the rule is to capture "trivial" errors like: x = 1
return x Or: x = 1
print("Got 1")
return x Part of me is like... is this even a rule worth having? I wish I had data on how many people are using it and how often it's |
Can you run Ruff on a corpus of projects and gather rule hit counts and a random sampling of examples? MyPy does something like that with its MyPy Primer, which they use to evaluate pull requests. It might be nice to see how various changes make rules more or less sensitive, and examples of lines that would be newly flagged, and also examples of lines would no longer be flagged. |
I'd like to do something like this: index projects using Ruff, and get some data on their configurations (and on how often they |
Interesting looking through some results from: Shows up in some ignores with comments, e.g.:
See many examples with similar chaining-style assignment with conditionals that we've touched on already. Here's an interesting case where variable is assigned to, referenced, and then returned: def _create_app() -> Flask:
app = Flask(__name__)
@app.route('/hello')
def hello() -> str:
"""Hello endpoint."""
return 'Hello, World!'
return app # noqa: RET504 Seems like something that shouldn't be flagged. Perhaps instead of an assignment count, heuristic could be reference count. In other words, excluding return statement, flag if the only reference was assignment. |
This PR productionizes that suggestion (avoid RET504 for variables with multiple assignments): #3393. Feedback welcome, I won't be merging tonight. |
(Fixed the decorator thing in #3395, separate bug!) |
Ok, I looked through the GitHub Code Search, and I do see a lot of examples of this kind of "fluent interface"-like assignment style, e.g.: command = self.extract_command(text)
self.validate_prefix(command=command)
await self.validate_mention(bot=bot, command=command)
command = self.validate_command(command)
command = self.do_magic(command=command)
return command # noqa: RET504 This strikes me as very reasonable code, so I'm gonna go ahead and merge that PR. |
Personally, like moving the return up one, reducing the number of variable overwrites there (and that sort of code isn't very good for static typing unless every command returns the same thing!) I don't see any problem with: command = self.extract_command(text)
self.validate_prefix(command=command)
await self.validate_mention(bot=bot, command=command)
command = self.validate_command(command)
return self.do_magic(command=command) In fact, there's only one overwrite now, so I'd actually do: command = self.extract_command(text)
self.validate_prefix(command=command)
await self.validate_mention(bot=bot, command=command)
validated_command = self.validate_command(command)
return self.do_magic(command=validated_command) IMO, this now reads much better. The point of style checks is to force better style, not to try not to bother anyone or conform to what everyone is doing if it's wrong. I think the biggest problem with this check was that it wasn't clear how to "fix" it from the message (and also some false positives, like the function false positive above). And it's totally fine to disable a style check you don't agree with! Whenever I talk about style checking, I emphasize this (especially WRT PyLint). Also, one of the points of this sort of check is to help two people writing the same logic get the same code. Returning without assigning is perfectly valid, so it's reasonable to force this rather than force an assignment always before return. For example, I think this would have read much better without an extra assignment: https://github.com/scikit-build/scikit-build-core/pull/197/files#diff-5a9598893dbb4007601522cfb26b27c92d790c18bf35d7bfe86205ae1955fa0bR47-R48 I'm not sure why Ruff didn't find that, as I'd guess #3393 wasn't in the version of Ruff that ran on that, but I wish it did. |
## Summary The `RET504` rule, which looks for unnecessary assignments before return statements, is a frequent source of issues (#4173, #4236, #4242, #1606, #2950). Over time, we've tried to refine the logic to handle more cases. For example, we now avoid analyzing any functions that contain any function calls or attribute assignments, since those operations can contain side effects (and so we mark them as a "read" on all variables in the function -- we could do a better job with code graph analysis to handle this limitation, but that'd be a more involved change.) We also avoid flagging any variables that are the target of multiple assignments. Ultimately, though, I'm not happy with the implementation -- we just can't do sufficiently reliable analysis of arbitrary code flow given the limited logic herein, and the existing logic is very hard to reason about and maintain. This PR refocuses the rule to only catch cases of the form: ```py def f(): x = 1 return x ``` That is, we now only flag returns that are immediately preceded by an assignment to the returned variable. While this is more limiting, in some ways, it lets us flag more cases vis-a-vis the previous implementation, since we no longer "fully eject" when functions contain function calls and other effect-ful operations. Closes #4173. Closes #4236. Closes #4242.
## Summary This PR extends the new `RET504` implementation to handle cases like: ```py def foo(): with open("foo.txt", "r") as f: x = f.read() return x ``` This was originally suggested in #2950 (comment).
## Summary The `RET504` rule, which looks for unnecessary assignments before return statements, is a frequent source of issues (#4173, #4236, #4242, #1606, #2950). Over time, we've tried to refine the logic to handle more cases. For example, we now avoid analyzing any functions that contain any function calls or attribute assignments, since those operations can contain side effects (and so we mark them as a "read" on all variables in the function -- we could do a better job with code graph analysis to handle this limitation, but that'd be a more involved change.) We also avoid flagging any variables that are the target of multiple assignments. Ultimately, though, I'm not happy with the implementation -- we just can't do sufficiently reliable analysis of arbitrary code flow given the limited logic herein, and the existing logic is very hard to reason about and maintain. This PR refocuses the rule to only catch cases of the form: ```py def f(): x = 1 return x ``` That is, we now only flag returns that are immediately preceded by an assignment to the returned variable. While this is more limiting, in some ways, it lets us flag more cases vis-a-vis the previous implementation, since we no longer "fully eject" when functions contain function calls and other effect-ful operations. Closes #4173. Closes #4236. Closes #4242.
## Summary This PR extends the new `RET504` implementation to handle cases like: ```py def foo(): with open("foo.txt", "r") as f: x = f.read() return x ``` This was originally suggested in #2950 (comment).
Example:
Output:
In this case, there is no unnecessary variable assignment that could be removed and cannot do an earlier
return
as need to get through all the conditionals.That said, the error does not trigger if one were to instead use an in-place operator such as
+=
:Version:
The text was updated successfully, but these errors were encountered: