-
-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[osh] Test ${a[@]:-}
, ${a[*]:-}
, etc. based on the string result
#2215
Conversation
9145e4f
to
fba1079
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thank you, it looks like this improves bash compatibility so it's good
Although now I am sort of interested in "${@:-}" and so forth because it's a POSIX construct
I wonder if you can also improve test case 18, because it looks like we are missing coverage there. I think it may hit these same code paths
Or add another test case below 18
I am a bit surprised by the difference between
set -- ""
set -- "" ""
and also zsh is different
#### $@ and - and +
echo argv=${@-minus}
echo argv=${@+plus}
echo argv=${@:-minus}
echo argv=${@:+plus}
is_falsey = False | ||
break | ||
else: | ||
is_falsey = len(strs) == 0 or (len(strs) == 1 and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm this seems worth documenting in doc/ref/chap-word-lang.md
under op-test
?
I didn't know this !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a combination of $*
/ $@
, double quoting, and IFS
. I just calculated the width of the resulting length without actually calculating the string, which may look non-trivial. I'll add documentation and also a code comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I realized that OSH has another divergence from Bash:
$ bash -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
$ bin/osh -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x23x4
This is actually specified by POSIX, so all shells except OSH is consistent:
$ busybox sh -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
$ dash -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
$ ksh -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
$ mksh -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
$ zsh -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
$ yash -c 'IFS=; set -- "1x2" "3x4"; echo $*'
1x2 3x4
@andychu Is there a way to tell whether the currently evaluated variable substitution is inside double quotations or not? SimpleVarSub
and BracedVarSub
don't seem to contain the necessary context information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added 228ea80.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK interesting, yeah that looks like a bug
Regarding IFS=, we have to rewrite word splitting ... long story, but you can see there are multiple bugs reported over the years in test/spec.sh word-split
This is one of the biggest remaining areas of divergence
I am not sure if this is the same bug
For $*
, is the problem simply that we should ignore IFS?? We should always use a space rather than the join char? Not sure
$ dash -c 'IFS=x; set -- a b; echo $*'
a b
$ dash -c 'IFS=; set -- a b; echo $*'
a b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically I am not sure that we need to see if it's double quoted or not, for this case?
But I know that is an issue with word splitting in general. We have a complex set of data structures like Piece
that know about quoting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically I am not sure that we need to see if it's double quoted or not, for this case?
I think so (edit: I added a TODO and a failing test in commit 7ea2d8b). POSIX says (list formatting is added by me):
$*
Expands to the positional parameters, starting from one, initially producing one field for each positional parameter that is set.
- When the expansion occurs in a context where field splitting will be performed, any empty fields may be discarded and each of the non-empty fields shall be further split as described in 2.6.5 Field Splitting.
- When the expansion occurs in a context where field splitting will not be performed, the initial fields shall be joined to form a single field with the value of each parameter separated by the first character of the IFS variable if IFS contains at least one character, or separated by a if IFS is unset, or with no separation if IFS is set to a null string.
So, IFS
is used only in the context where field splitting (word splitting) is inactive. OSH now uses IFS
regardless of the context, which cause the deviation.
But I know that is an issue with word splitting in general. We have a complex set of data structures like
Piece
that know about quoting
I think we can include the context information in SimpleVarSub
and BracedVarSub
at the parser or lexer stage (instead of analyzing the AST on the evaluation time); i.e., $*
in echo "$*"
and echo $*
would be treated as different constructs in the parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a simple solution may be to convert $*
to $@
when $*
is used in a context where field splitting is active. It seems $*
and $@
are functionally equivalent in the context where field splitting is performed.
# a : 'no-colon' 'with-colon' | ||
# a[0]: 'no-colon' 'with-colon' | ||
# a[@]: 'no-colon' 'with-colon' | ||
# a[*]: 'no-colon' 'with-colon' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be useful to split this test up like the other one, since bash agrees with OSH on the first part
Although it is a bit annoying because you have to share test-hyphen
I suppose that could be source spec/testdata/var-op-test-helper.sh
or something
Or I guess it is OK to copy and paste too, since the function is short
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
361ac8c
to
655d14d
Compare
I added a set of tests in commit 655d14d. |
8ccf487
to
14014d6
Compare
This follows the behavioral change from Bash 4.4 to 5.0.
14014d6
to
7ea2d8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks, this looks good ... I guess the main point is that is an array with 1 value of empty string is considered empty
Thanks for adding the docs
I am going to adjust them a bit, and also adjust the test failures
spec/var-op-test.test.sh
Outdated
argv= | ||
argv=plus | ||
## END | ||
## BUG mksh/osh STDOUT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually let OSH fail and adjust allowed_failures
i.e. we don't usually have BUG osh, which is yellow - we make it RED
This is basically a matter of convention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually wondered if it would be possible to add another type of confirmed failure (e.g. "TODO") that produces red characters (or characters of another color) in the table. With the current strategy,
- When the number of allowed failures is increased by some changes, we cannot identify which failure was the increased one. We need to check out an older commit, build it, and compare the test results manually. Or one needs to open the github-jobs website, looking up the corresponding page, and compare the results.
- Even when the problem is fixed in the process of implementing another feature, the failure can be left to be FAIL if the result does not exactly match the expected result (which probably describes the Bash behavior).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm well technically ALL the ~100 failures now are TODO ! :-) Some of them will be classified as "OK osh", but never "BUG osh" I think ... our bugs are just things we haven't done yet
And notice there are ZERO FAIL on bash/dash/mksh/zsh ... so FAIL is really reserved for OSH
and BUG is reserved for the other shells
At least that is the convention I follow, but we can talk about it more if it causes a problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I could call it "a problem", but at least the following situation can be improved:
- When the number of allowed failures is increased by some changes, we cannot identify which failure was the increased one. We need to check out an older commit, build it, and compare the test results manually. Or one needs to open the github-jobs website, looking up the corresponding page, and compare the results.
- Even when the problem is fixed in the process of implementing another feature, the failure can be left to be FAIL if the result does not exactly match the expected result (which probably describes the Bash behavior).
doc/ref/chap-word-lang.md
Outdated
When the variable substituion is inside double quotations, the `[*]` form joins | ||
the elements by the first character of `IFS`: | ||
|
||
$ printf '<%s>\n' "${a[*]}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these cases can more concisely explained by a table, with 4 entries or 8 entries, like:
$@
"$@"
$*
"$*"
OK thanks I also fixed the ${*:-} bug on top of this - 31ce9f6 Thanks for finding it and writing the tests - that is the hard part! |
Thank you! |
This follows the behavioral change from Bash 4.4 to 5.0.