-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libzutil: allow to display powers of 1000 bytes #16579
base: master
Are you sure you want to change the base?
Conversation
7516b6b
to
89ef8f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should also be a testcase for this new formatting.
This PR works when the value is a byte value, but there are other places where we use ZFS_NICENUM_1024 (perhaps incorrectly) that wont be affected. For example, the JSON code uses ZFS_NICENUM_1024 in an bunch of places: Line 9177 in ca0141f
|
Hello and thanks for the feedback @mcmilk :
Sure. Could you direct me to an existing test case that I could use as a base?
Sorry I missed that.
In this example, it seems weird to express a number of errors in powers of 1024. (I think nobody expects "100K errors" to be 100Ki = 102400 errors). Maybe a new unit like |
@jcassette - Sorry, there is currently no such test case, I thought I had seen such test. So no - it seems that this is not needed. Sorry for the noise by me. |
I suspect we historically went with the ambiguous K/M/G/T prefixes to get a more precision in the One possible solution could consist of:
I'm fine with just number 1 being tacked in this PR, but you can try to fix the other ones if you want. |
In broader context referencing PR #14598 |
@jumbi77 thanks for the heads-up on that PR. I think we may need to be even more careful about
So we may want to rename the envar to something that wouldn't be ambiguous for those cases. Like include the word "display" or "output" in the envar name or something. |
ZFS displays bytes with K/M/G/T/P/E prefixes. They represent powers of 1024 bytes, i.e. KiB, MiB, GiB, TiB, PiB, EiB. Some users may want these prefixes to represent powers of 1000 bytes, i.e. KB, MB, GB, TB, PB, EB. This adds the new unit format and allows to use such display by defining an environment variable. Signed-off-by: Julien Cassette <[email protected]>
c8adac6
to
a8980aa
Compare
I have addressed the requested changes. Ready for review. |
@@ -64,19 +65,22 @@ zfs_nicenum_format(uint64_t num, char *buf, size_t buflen, | |||
uint64_t n = num; | |||
int index = 0; | |||
const char *u; | |||
const char *units[3][7] = { | |||
const char *units[6][7] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be?
- const char *units[6][7]
+ const char *units[4][7]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No because [ZFS_NICENUM_1000] = [5]
[ZFS_NICENUM_1024] = {"", "K", "M", "G", "T", "P", "E"}, | ||
[ZFS_NICENUM_BYTES] = {"B", "K", "M", "G", "T", "P", "E"}, | ||
[ZFS_NICENUM_TIME] = {"ns", "us", "ms", "s", "?", "?", "?"} | ||
[ZFS_NICENUM_TIME] = {"ns", "us", "ms", "s", "?", "?", "?"}, | ||
[ZFS_NICENUM_1000] = {"B", "K", "M", "G", "T", "P", "E"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thinking ahead - can you rename ZFS_NICENUM_1000
to ZFS_NICENUM_BYTES_1000
? That way we could potentially add a SI base 1000 prefix for non-byte values, like:
[ZFS_NICENUM_1000] = {"", "K", "M", "G", "T", "P", "E"}
[ZFS_NICENUM_BYTES_1000] = {"B", "K", "M", "G", "T", "P", "E"}
I can imagine in the future that we make all non-byte values SI powers of 1000 by default. It's kind of silly that we report zpool status error counters in powers of 1024, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the binary 1024 nicenum values to the binary SI prefix?
So [ZFS_NICENUM_1024] = {"", "K", "M", "G", "T", "P", "E"},
becomes [ZFS_NICENUM_1024] = {"", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"},
?
But a lot other code and maybe ZTS will need some changes with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is the 5-char limit for nicenum values. You'd give up precision with 2-char SI names. Like, currently you can report "20.7M" but with the change you could only report "20Mi". I'm sure we could fix the 5-char limit, but I don't know what kind of fallout there would be from it. We've been reporting 5-char nicenum values for a long time...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, then we should work with the 5-char limit currently and introduce the environment variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? You asked the opposite before: #16579 (comment)
.Bl -tag -width "ZFS_OUTPUT_BYTES_SI" | ||
.\" Shared with zfs.8 and zpool.8 | ||
.It Sy ZFS_OUTPUT_BYTES_SI | ||
Make K/M/G/T/P/E prefixes in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about rewording these lines to make it explicitly clear this is for byte values only?:
When printing byte values, make all prefixes (like KB, MB, GB, TB, PB, EB)
represent powers of 1000, not 1024.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do
is there anyone actually asking for such functionality? with what rationale if I may? "may want to use" is at least for me personally not really enough :) what good is it going to be if it's hidden behind an envvar nobody knows about? this is an added code, added maintenance burden one might say, but very little benefit? just because someone "may want"? ;) |
Okay, just let me know what you decide |
@jcassette awesome thank you, I should have said that I'm genuinely interested in knowing that, to see whether we could maybe come up with something better than an env var... also I'm in no position to reject this I would say :)) even if there was noone asking for it... I just have opinions which I'm sharing, up to you to either toss them away or maybe rethink, really but personally I do agree with @ryao 's "It would be best to pick one convention and stick with it" I'd also argue that probably noone presents these prettified ZFS numbers to (uninformed) users, I would dare to generalize that everyone uses some kind of raw value readouts, simply because that just makes the most sense, store measurements in base units... least amount of work now and also going forward seems to be just reverting to how things were, especially if those two tickets are more about fixing problems by change that - really, as far as I can see - nobody asked for? :D I mean this is kinda comical, wouldn't you yourself consider this a rather unwanted drive-by contribution? -> #13363 - I would... reason: absolutely no considerations about downstream effects, probably too low operational familiarity with ZFS (would have known and thought about the 5 chars limitation otherwise) (again, just my personal opinions, I'm not a lead anything here, can also blame self for not showing up for review during that time - or much at all, which I'm trying to improve ;)) |
After having thought about it some more, I don't want to be seen as the "old grumpy guy who won't change his habits so they die with him" by history, if you know what I mean :)) But I think it wouldn't be unreasonable to expect the change to be 100% consistent across the whole codebase. |
If it helps, here's a branch I was working on last year trying to do roughly the same thing: https://github.com/robn/zfs/commits/byte-prefix/. Please feel free to steal from it, at least the first two commits, which fix a bunch of places where I did a selector for three variations:
I too used an envvar, The reason I wanted to keep the traditional format by default, at least for a while, is to not break scripts scraping output. This was before JSON output was available so less of a concern now, but I'd still wait for next release if I was doing it today, to give people time to adapt. The main reason I didn't finish this was because I couldn't see how this made any sense without also changing it for inputs (eg properties), because any UI needs to be consistent; having input in one format and output in another is always going to be confusing. Yet, the 1000s versions make no sense: if we say that in the future, So I dropped it. I was around before |
My opinion is that it's probably long-term the easiest option to go all-in on full compliance with standards, meaning the unambiguous valid short version of I think a new major release is the best opportunity where a whole-sale change like this could be advertised. I'm not afraid of breaking old scripts, it's just that people need a backup option until they fix their stuff - IMHO easily provided by an older release branch, maybe this could drive demand/expectations of longer support of such. How about a module option (and envvar as a fallback for userspace only builds) as to which is the preferred form of displayed/rendered nicenums towards userspace? Whether to divide by 1024 (Ki/Gi..) or 1000 (k/G..). OpenZFS 2.4 or 3.0 as a target for the big bang? Why I think all-in approach is the easiest option long-term:
I'm willing to help out (reachable on irc) |
I guess mostly I don't see what the actual gain is. IME "standards compliance" is rarely by itself a good reason to do anything. If we were starting from scratch, then yes, no brainer, but there's scripts out there, there's published books, years of documentation, blog posts, videos, etc, that we'd be invalidating in one fell swoop. For me, that's too much without a clear benefit. But I also know that I err towards the status quo far too often, waiting for supporting data that never comes. Which is why I surround myself with people who will say "nah, you're worrying too much" :) Assuming you're saying "nah, you're worrying too much", and thinking on it more, the all-in is probably the way to do it.
(And On the input, I can get used to (I wouldn't go near |
Me neither but I could see an implementation that could allow for this, just a few well though-out shared functions.
something like that, but also that I could think of an implementation of this which would be nice to use and maintain, both, that's why I'd go for a module option + envvar, b/c I think it otherwise won't be that much code, maybe not even that many changed lines in the end - to get both decimal and binary prefixes supported and in tune of what I said earlier, I don't have the motivation to drive it - but I'd help if someone else did :) |
Aside of script compatibility I would not like to ask users each time about units they used in whatever they sent me and switch my brain accordingly. Also, does it mean switching memory units too to match disk units? Measuring memory in 1000 byte units would be even weirder than disks, while not doing it would create another source of confusions. To be short: I am against this. I have to regularly explain users why ZFS reports less disk space than disk's marketing, but at least it is consistent and stable for many years. |
the idea behind a global switch is to let the user choose whether they'd like decimal or binary prefixes more - but then when you get a paste, you'd see "Ki" or "k" accordingly, the main point of a breaking change (away from the 5 characters limit) would be to make ZFS use the prefixes correctly so it's obvious from a random paste which is which without any "tribal knowledge" |
It probably would be useful if the nicenum formatting function got a hint whether those are memory or disk related numbers, that would eliminate the need to stick always to 1 variant for everything |
Motivation and Context
ZFS displays bytes with K/M/G/T/P/E prefixes. They represent powers of 1024 bytes, i.e. KiB, MiB, GiB, TiB, PiB, EiB.
Some users may want these prefixes to represent powers of 1000 bytes, i.e. KB, MB, GB, TB, PB, EB.
Description
This adds the new unit format and allows to use such display by defining an environment variable.
How Has This Been Tested?
Not tested. If this draft gathers interest then I will add tests.
Types of changes
Checklist:
Signed-off-by
.