libzutil: allow to display powers of 1000 bytes #16579

Open
wants to merge 1 commit into
base: master
from


@jcassette jcassette commented Sep 28, 2024

Motivation and Context

ZFS displays bytes with K/M/G/T/P/E prefixes. They represent powers of 1024 bytes, i.e. KiB, MiB, GiB, TiB, PiB, EiB.
Some users may want these prefixes to represent powers of 1000 bytes, i.e. KB, MB, GB, TB, PB, EB.

Description

This adds the new unit format and allows it to be enabled by defining an environment variable.
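
A minimal sketch of the idea, not the PR's code. The helpers `nice_bytes` and `byte_base_from_env` are hypothetical names; `ZFS_OUTPUT_BYTES_SI` is the variable name that appears in the man page hunk reviewed later in this thread, and treating "set to anything" as "enabled" is an assumption.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: 1000 if the variable is set at all, else 1024. */
static uint64_t
byte_base_from_env(void)
{
	return (getenv("ZFS_OUTPUT_BYTES_SI") != NULL ? 1000 : 1024);
}

/*
 * Hypothetical formatter: divide by base until the value is below it,
 * then print one decimal and the matching K/M/G/T/P/E prefix.
 */
static void
nice_bytes(uint64_t num, uint64_t base, char *buf, size_t buflen)
{
	static const char *const units[] = {"B", "K", "M", "G", "T", "P", "E"};
	double val = (double)num;
	int index = 0;

	while (val >= (double)base && index < 6) {
		val /= (double)base;
		index++;
	}
	if (index == 0) {
		/* Plain byte counts are printed without decimals. */
		(void) snprintf(buf, buflen, "%llu%s",
		    (unsigned long long)num, units[0]);
	} else {
		(void) snprintf(buf, buflen, "%.1f%s", val, units[index]);
	}
}
```

With these definitions, 1000000 bytes formats as 1.0M with base 1000 and as 976.6K with base 1024; a caller would pass `byte_base_from_env()` as the base.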

How Has This Been Tested?

Not tested. If this draft gathers interest then I will add tests.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@jcassette jcassette marked this pull request as draft September 28, 2024 13:18
@jcassette jcassette force-pushed the kbyte-1000 branch 3 times, most recently from 7516b6b to 89ef8f6 Compare September 28, 2024 14:31
Contributor

@mcmilk mcmilk left a comment


There should also be a testcase for this new formatting.

lib/libzutil/zutil_nicenum.c (outdated, resolved)
lib/libzutil/zutil_nicenum.c (outdated, resolved)
@tonyhutter
Contributor

This PR works when the value is a byte value, but there are other places where we use ZFS_NICENUM_1024 (perhaps incorrectly) that won't be affected. For example, the JSON code uses ZFS_NICENUM_1024 in a bunch of places:

cb->cb_literal, cb->cb_json_as_int, ZFS_NICENUM_1024);

@jcassette
Author

jcassette commented Oct 9, 2024

Hello and thanks for the feedback

@mcmilk :

There should also be a testcase for this new formatting.

Sure. Could you direct me to an existing test case that I could use as a base?

@tonyhutter :

This PR works when the value is a byte value, but there are other places where we use ZFS_NICENUM_1024 (perhaps incorrectly) that won't be affected. For example, the JSON code uses ZFS_NICENUM_1024 in a bunch of places:

Sorry I missed that.
I can see that ZFS_NICENUM_BYTES is used directly there. This PR does not work in that case currently, and I am not sure how to fix that.
If there are places where ZFS_NICENUM_1024 is used to represent bytes, could that be fixed in another PR by using ZFS_NICENUM_BYTES?

zfs/cmd/zpool/zpool_main.c

Lines 9176 to 9177 in ca0141f

nice_num_str_nvlist(vds, "read_errors", vs->vs_read_errors,
cb->cb_literal, cb->cb_json_as_int, ZFS_NICENUM_1024);

In this example, it seems weird to express a number of errors in powers of 1024. (I think nobody expects "100K errors" to be 100Ki = 102400 errors). Maybe a new unit like ZFS_NICENUM_1000 should be used instead?

@mcmilk
Contributor

mcmilk commented Oct 9, 2024

There should also be a testcase for this new formatting.

@jcassette - Sorry, there is currently no such test case; I thought I had seen such a test. So no, it seems that this is not needed. Sorry for the noise on my part.

@tonyhutter
Contributor

I suspect we historically went with the ambiguous K/M/G/T prefixes to get more precision in the zpool iostat|status 5-char columns. That is, you could print "500.4M" rather than "500MB" or "500MiB" (which wouldn't even fit in 5 chars...)

One possible solution could consist of:

  1. ZFS_KB_IS_1000 envar - Default to 1000 instead of 1024 for byte values
  2. ZFS_USE_IEC_PREFIX envar - Print KB/MB/GB/TB (base 10) or KiB/MiB/GiB/TiB (base 2) instead of K/M/G/T. This might be a pain to implement due to the 5-char columns, but I suspect it would still be doable.
  3. Use ZFS_NICENUM_1000 for error counters in JSON output (and anywhere else it's needed).

I'm fine with just number 1 being tackled in this PR, but you can try to fix the other ones if you want.

@jumbi77
Contributor

jumbi77 commented Oct 9, 2024

In broader context referencing PR #14598

@tonyhutter
Contributor

@jumbi77 thanks for the heads-up on that PR. I think we may need to be even more careful about ZFS_KB_IS_1000 since we've only talked about it for displaying numbers, not setting them. For example, this gets a little ambiguous:

export ZFS_KB_IS_1000=1
zfs set recordsize=8K tank

So we may want to rename the envar to something that wouldn't be ambiguous for those cases. Like include the word "display" or "output" in the envar name or something.
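
To make the ambiguity concrete, here is a small sketch of what would happen if the same setting also governed input parsing. `parse_size` and `is_power_of_two` are hypothetical helpers for illustration, not ZFS functions, and the parser ignores anything after the first suffix letter.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: "<digits>[K|M|G]" scaled by the given base.
 * Returns 0 for input it does not understand.
 */
static uint64_t
parse_size(const char *s, uint64_t base)
{
	char *end;
	uint64_t val = strtoull(s, &end, 10);

	if (end == s)
		return (0);
	switch (*end) {
	case '\0':
		return (val);
	case 'K':
		return (val * base);
	case 'M':
		return (val * base * base);
	case 'G':
		return (val * base * base * base);
	default:
		return (0);
	}
}

/* A property such as recordsize only accepts power-of-two sizes. */
static bool
is_power_of_two(uint64_t v)
{
	return (v != 0 && (v & (v - 1)) == 0);
}
```

Here `parse_size("8K", 1024)` gives 8192, which is a power of two, while `parse_size("8K", 1000)` gives 8000, which is not. That is why a display-only variable name is the safer choice.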

@amotin amotin added the Status: Work in Progress label Oct 29, 2024
ZFS displays bytes with K/M/G/T/P/E prefixes. They represent powers of
1024 bytes, i.e. KiB, MiB, GiB, TiB, PiB, EiB.
Some users may want these prefixes to represent powers of 1000 bytes,
i.e. KB, MB, GB, TB, PB, EB.
This adds the new unit format and allows to use such display by
defining an environment variable.

Signed-off-by: Julien Cassette <[email protected]>
@jcassette jcassette marked this pull request as ready for review November 11, 2024 22:18
@github-actions github-actions bot added the Status: Code Review Needed label and removed the Status: Work in Progress label Nov 11, 2024
@jcassette
Author

I have addressed the requested changes. Ready for review.

@@ -64,19 +65,22 @@ zfs_nicenum_format(uint64_t num, char *buf, size_t buflen,
  uint64_t n = num;
  int index = 0;
  const char *u;
- const char *units[3][7] = {
+ const char *units[6][7] = {
Contributor

Should this be?

- const char *units[6][7]
+ const char *units[4][7]

Author

No, because ZFS_NICENUM_1000 has the value 5, so the array needs 6 rows.
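
The reply follows from how C sizes an array that uses designated initializers: the first dimension must cover the highest index used. A small sketch with a hypothetical enum (the `FMT_*` names are made up, and values 3 and 4 are assumed to belong to formats that need no unit row):

```c
#include <stddef.h>

/* Hypothetical stand-in for the real enum; only the value 5 is from the thread. */
enum nicenum_format {
	FMT_1024 = 0,
	FMT_BYTES = 1,
	FMT_TIME = 2,
	FMT_1000 = 5
};

/*
 * With the first dimension left empty, the compiler sizes it as the
 * highest designated index plus one. Rows 3 and 4 are zero-filled.
 */
static const char *const units[][7] = {
	[FMT_1024] = {"", "K", "M", "G", "T", "P", "E"},
	[FMT_BYTES] = {"B", "K", "M", "G", "T", "P", "E"},
	[FMT_TIME] = {"ns", "us", "ms", "s", "?", "?", "?"},
	[FMT_1000] = {"B", "K", "M", "G", "T", "P", "E"}
};

/* Number of rows the compiler actually allocated. */
static const size_t units_rows = sizeof (units) / sizeof (units[0]);
```

Here `units_rows` is 6, so declaring the array as `units[4][7]` with an initializer at index 5 would not compile.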

  [ZFS_NICENUM_1024] = {"", "K", "M", "G", "T", "P", "E"},
  [ZFS_NICENUM_BYTES] = {"B", "K", "M", "G", "T", "P", "E"},
- [ZFS_NICENUM_TIME] = {"ns", "us", "ms", "s", "?", "?", "?"}
+ [ZFS_NICENUM_TIME] = {"ns", "us", "ms", "s", "?", "?", "?"},
+ [ZFS_NICENUM_1000] = {"B", "K", "M", "G", "T", "P", "E"}
Contributor

Just thinking ahead - can you rename ZFS_NICENUM_1000 to ZFS_NICENUM_BYTES_1000? That way we could potentially add a SI base 1000 prefix for non-byte values, like:

	    [ZFS_NICENUM_1000] = {"", "K", "M", "G", "T", "P", "E"},
	    [ZFS_NICENUM_BYTES_1000] = {"B", "K", "M", "G", "T", "P", "E"}

I can imagine in the future that we make all non-byte values SI powers of 1000 by default. It's kind of silly that we report zpool status error counters in powers of 1024, for example.

Contributor

Should we change the binary 1024 nicenum values to the binary (IEC) prefixes?

So [ZFS_NICENUM_1024] = {"", "K", "M", "G", "T", "P", "E"},
becomes [ZFS_NICENUM_1024] = {"", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"}, ?

But a lot of other code, and maybe the ZTS, would need changes for that.

Contributor

The problem is the 5-char limit for nicenum values. You'd give up precision with 2-char IEC names. Like, currently you can report "20.7M" but with the change you could only report "20Mi". I'm sure we could fix the 5-char limit, but I don't know what kind of fallout there would be from it. We've been reporting 5-char nicenum values for a long time...
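
The precision trade-off can be sketched as follows. `fit_number` is a hypothetical helper that keeps the most decimals that still fit in a given width; it is not the actual zfs_nicenum code, and it relies on printf rounding, so the two-character case comes out as "21Mi" rather than the truncated "20Mi" mentioned above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: try 2, 1, then 0 decimals and keep the first
 * result whose length (number plus unit) is at most max_chars.
 */
static void
fit_number(double val, const char *unit, size_t max_chars,
    char *buf, size_t buflen)
{
	for (int prec = 2; prec >= 0; prec--) {
		(void) snprintf(buf, buflen, "%.*f%s", prec, val, unit);
		if (strlen(buf) <= max_chars)
			break;
	}
}
```

With a 5-character budget, 20.7 with unit "M" keeps one decimal ("20.7M"), while the same value with unit "Mi" has to drop the decimal entirely.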

Contributor

Ah okay, then we should work with the 5-char limit currently and introduce the environment variable.

Author

Are you sure? You asked the opposite before: #16579 (comment)

.Bl -tag -width "ZFS_OUTPUT_BYTES_SI"
.\" Shared with zfs.8 and zpool.8
.It Sy ZFS_OUTPUT_BYTES_SI
Make K/M/G/T/P/E prefixes in
Contributor

What do you think about rewording these lines to make it explicitly clear that this is for byte values only?

When printing byte values, make all prefixes (like KB, MB, GB, TB, PB, EB) 
represent powers of 1000, not 1024.

Author

Sure, will do

@snajpa
Contributor

snajpa commented Nov 21, 2024

is there anyone actually asking for such functionality? with what rationale if I may? "may want to use" is at least for me personally not really enough :)

what good is it going to be if it's hidden behind an envvar nobody knows about? this is an added code, added maintenance burden one might say, but very little benefit? just because someone "may want"? ;)

@jcassette
Author

jcassette commented Nov 25, 2024

@snajpa

is there anyone actually asking for such functionality? with what rationale if I may? "may want to use" is at least for me personally not really enough :)

#14598 and #11046 ?

this is an added code, added maintenance burden one might say, but very little benefit?

Okay, just let me know what you decide

@snajpa
Contributor

snajpa commented Nov 26, 2024

@jcassette awesome thank you, I should have said that I'm genuinely interested in knowing that, to see whether we could maybe come up with something better than an env var...

also I'm in no position to reject this I would say :)) even if there was no one asking for it... I just have opinions which I'm sharing, up to you to either toss them away or maybe rethink, really

but personally I do agree with @ryao 's "It would be best to pick one convention and stick with it"

I'd also argue that probably no one presents these prettified ZFS numbers to (uninformed) users, I would dare to generalize that everyone uses some kind of raw value readouts, simply because that just makes the most sense, store measurements in base units...

least amount of work now and also going forward seems to be just reverting to how things were, especially if those two tickets are more about fixing problems by change that - really, as far as I can see - nobody asked for? :D I mean this is kinda comical, wouldn't you yourself consider this a rather unwanted drive-by contribution? ->

#13363 - I would... reason: absolutely no considerations about downstream effects, probably too low operational familiarity with ZFS (would have known and thought about the 5 chars limitation otherwise)

(again, just my personal opinions, I'm not a lead anything here, can also blame self for not showing up for review during that time - or much at all, which I'm trying to improve ;))

@snajpa
Contributor

snajpa commented Nov 26, 2024

After having thought about it some more, I don't want to be seen as the "old grumpy guy who won't change his habits so they die with him" by history, if you know what I mean :))

But I think it wouldn't be unreasonable to expect the change to be 100% consistent across the whole codebase.

#14598 (comment)

@robn
Member

robn commented Nov 26, 2024

If it helps, here's a branch I was working on last year trying to do roughly the same thing: https://github.com/robn/zfs/commits/byte-prefix/. Please feel free to steal from it, at least the first two commits, which fix a bunch of places where nicenum is called when it should be nicebytes.

I did a selector for three variations:

  • trad: powers of 1024 with SI prefixes
  • iec: powers of 1024 with IEC prefixes
  • si: powers of 1000 with SI prefixes

I too used an envvar, ZFS_BYTE_PREFIX, to set it. I was also going to add a --units or similar arg to commands to make it selectable there. My intention was that we'd use the traditional format as default for now, but an operator could override it per-run or globally via the envvar. At some future point (major release maybe), we switch the default to iec, and anyone who wants the old behaviour can set the global var. (si is there for people that don't think about it much beyond "1000G" on the back of the box at the hard drive shop).
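
A rough sketch of such a three-way selector, under stated assumptions: the type and function names are hypothetical, and the accepted strings "trad", "iec" and "si" are taken from the list labels above rather than from the branch itself.

```c
#include <stdint.h>
#include <string.h>

enum byte_prefix { PREFIX_TRAD, PREFIX_IEC, PREFIX_SI };

struct prefix_style {
	uint64_t base;		/* divisor between successive prefixes */
	const char *kilo;	/* how the first prefix is printed */
};

/* Hypothetical: unknown or unset values fall back to the traditional mode. */
static enum byte_prefix
byte_prefix_parse(const char *value)
{
	if (value != NULL && strcmp(value, "iec") == 0)
		return (PREFIX_IEC);
	if (value != NULL && strcmp(value, "si") == 0)
		return (PREFIX_SI);
	return (PREFIX_TRAD);
}

/* Map each mode to its divisor and its spelling of the first prefix. */
static struct prefix_style
byte_prefix_style(enum byte_prefix mode)
{
	switch (mode) {
	case PREFIX_IEC:
		return ((struct prefix_style){1024, "Ki"});
	case PREFIX_SI:
		return ((struct prefix_style){1000, "K"});
	default:
		return ((struct prefix_style){1024, "K"});
	}
}
```

A command would call `byte_prefix_parse(getenv("ZFS_BYTE_PREFIX"))` once at startup (with `<stdlib.h>` included) and let a per-run option such as the proposed --units override the result.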

The reason I wanted to keep the traditional format by default, at least for a while, is to not break scripts scraping output. This was before JSON output was available so less of a concern now, but I'd still wait for next release if I was doing it today, to give people time to adapt.

The main reason I didn't finish this was because I couldn't see how this made any sense without also changing it for inputs (eg properties), because any UI needs to be consistent; having input in one format and output in another is always going to be confusing. Yet, the 1000s versions make no sense: if we say that in the future, K is always 1000 and Ki is 1024, then you cannot ever write recordsize=128K, because it has to be power-of-2. So it either always has to be recordsize=128Ki, or, we allow the K to mean 1024 in those places. I could never find a way that didn't add confusion, which this entire thing is supposed to remove!

So I dropped it. I was around before KiB was cool, so for me there's no ambiguity anyway - if it's bytes, K means 1024, if not, 1000. The raw numbers are there if you need accuracy, otherwise it's a ballpark anyway.

@snajpa
Contributor

snajpa commented Nov 27, 2024

My opinion is that it's probably long-term the easiest option to go all-in on full compliance with standards, meaning the unambiguous valid short version of recordsize=131072 should be recordsize=128Ki, I think. It could also, especially if it makes implementation/integration in wider ecosystem easier, take in values like recordsize=131.072k, that is valid too, isn't it? Don't know how many decimals it handles currently and how much do we want from it, but there are also valid variants with Mi/M and so on, aren't there? :)

I think a new major release is the best opportunity where a wholesale change like this could be advertised. I'm not afraid of breaking old scripts, it's just that people need a backup option until they fix their stuff - IMHO easily provided by an older release branch, maybe this could drive demand/expectations of longer support of such.

How about a module option (and envvar as a fallback for userspace only builds) as to which is the preferred form of displayed/rendered nicenums towards userspace? Whether to divide by 1024 (Ki/Gi..) or 1000 (k/G..).

OpenZFS 2.4 or 3.0 as a target for the big bang?

Why I think all-in approach is the easiest option long-term:

  • well, it can be confusing for some - and it'll be so for an ever increasing fraction of the userbase, simply b/c this is now being taught in schools :)
  • then it'll be reflected in the codebase eventually one way or another - and it already is in the documentation (which I still think is probably best to be reverted in current branches)
  • so, if it's unavoidable, how about we do it right - consistently so across the whole codebase?

I'm willing to help out (reachable on irc)

@robn
Member

robn commented Nov 27, 2024

I guess mostly I don't see what the actual gain is. IME "standards compliance" is rarely by itself a good reason to do anything. If we were starting from scratch, then yes, no brainer, but there's scripts out there, there's published books, years of documentation, blog posts, videos, etc, that we'd be invalidating in one fell swoop.

For me, that's too much without a clear benefit. But I also know that I err towards the status quo far too often, waiting for supporting data that never comes. Which is why I surround myself with people who will say "nah, you're worrying too much" :)

Assuming you're saying "nah, you're worrying too much", and thinking on it more, the all-in is probably the way to do it. -p exists, and can be combined with numfmt to get a good-enough approximation of the old behaviour if you want that:

$ zfs list -p | numfmt --header --field 2-4 --to=iec --round=down --format=%.1f
NAME                        USED        AVAIL         REFER  MOUNTPOINT
crayon                    404.4G        26.1G        192.0K  none
crayon/dump                46.8M        26.1G         45.5M  /dump
crayon/home               395.8G        26.1G        200.0K  /home
crayon/home/robn          394.2G        26.1G        394.2G  /home/robn
crayon/home/root            1.6G        26.1G          1.6G  /root
crayon/root                 8.4G        26.1G        192.0K  none
crayon/root/debian          8.4G        26.1G          8.4G  /
crayon/var                 23.5M        26.1G        192.0K  /var
crayon/var/cache           15.0M        26.1G         15.0M  /var/cache
crayon/var/log              7.5M        26.1G          7.5M  /var/log
crayon/var/tmp            780.0K        26.1G        780.0K  /var/tmp

(And -j exists if you really want that).

On the input, I can get used to recordsize=128Ki, and if your suffix makes no sense, then I'd error, and if it's one of the classics, I'd say "K? You probably meant Ki". Gently force relearning.
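
The "gently force relearning" idea could look roughly like this. It is only a sketch: `parse_strict` and the result codes are hypothetical, and only a bare number and the "Ki" suffix are handled to keep it short.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum parse_result { PARSE_OK, PARSE_HINT_USE_KI, PARSE_BAD_SUFFIX };

/*
 * Hypothetical strict parser: accept "<digits>" and "<digits>Ki",
 * answer the classic "K" with a hint, reject everything else.
 */
static enum parse_result
parse_strict(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long val = strtoull(s, &end, 10);

	if (end == s)
		return (PARSE_BAD_SUFFIX);
	if (*end == '\0') {
		*out = val;
		return (PARSE_OK);
	}
	if (strcmp(end, "Ki") == 0) {
		*out = val * 1024;
		return (PARSE_OK);
	}
	if (strcmp(end, "K") == 0)
		return (PARSE_HINT_USE_KI);	/* "K? You probably meant Ki" */
	return (PARSE_BAD_SUFFIX);
}
```

A caller would print the hint for `PARSE_HINT_USE_KI` and a plain error for `PARSE_BAD_SUFFIX`, so "128Ki" is accepted as 131072 while "128K" is refused with a suggestion.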

(I wouldn't go near recordsize=131.072K, for at least nine reasons, one for each circle of hell, where this clearly came from 😅).

@snajpa
Contributor

snajpa commented Nov 27, 2024

I wouldn't go near recordsize=131.072K, for at least nine reasons, one for each circle of hell, where this clearly came from 😅

Me neither, but I could see an implementation that could allow for this, just a few well thought-out shared functions.

nah, you're worrying too much

something like that, but also that I could think of an implementation of this which would be nice to use and maintain, both, that's why I'd go for a module option + envvar, b/c I think it otherwise won't be that much code, maybe not even that many changed lines in the end - to get both decimal and binary prefixes supported

and in tune of what I said earlier, I don't have the motivation to drive it - but I'd help if someone else did :)

@amotin
Member

amotin commented Nov 27, 2024

I could think of an implementation of this which would be nice to use and maintain, both, that's why I'd go for a module option + envvar

Aside from script compatibility, I would not like to have to ask users each time about the units they used in whatever they sent me and switch my brain accordingly. Also, does it mean switching memory units too to match disk units? Measuring memory in 1000 byte units would be even weirder than disks, while not doing it would create another source of confusion. To be short: I am against this. I have to regularly explain to users why ZFS reports less disk space than the disk's marketing, but at least it has been consistent and stable for many years.

@snajpa
Contributor

snajpa commented Nov 27, 2024

Aside from script compatibility, I would not like to have to ask users each time about the units they used in whatever they sent me and switch my brain accordingly. Also, does it mean switching memory units too to match disk units? Measuring memory in 1000 byte units would be even weirder than disks, while not doing it would create another source of confusion. To be short: I am against this. I have to regularly explain to users why ZFS reports less disk space than the disk's marketing, but at least it has been consistent and stable for many years.

the idea behind a global switch is to let the user choose whether they'd like decimal or binary prefixes more - but then when you get a paste, you'd see "Ki" or "k" accordingly, the main point of a breaking change (away from the 5 characters limit) would be to make ZFS use the prefixes correctly so it's obvious from a random paste which is which without any "tribal knowledge"

@snajpa
Contributor

snajpa commented Nov 27, 2024

Also, does it mean switching memory units too to match disk units?

It probably would be useful if the nicenum formatting function got a hint whether those are memory or disk related numbers, that would eliminate the need to stick always to 1 variant for everything

Labels
Status: Code Review Needed Ready for review and testing
7 participants