Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend and expose API for {:x?} and {:X?} formatting #99138

Closed
wants to merge 1 commit into from

Conversation

clarfonthey
Copy link
Contributor

@clarfonthey clarfonthey commented Jul 11, 2022

RFC 2226, originally tracked in #48584, added the x? and X? debug formats, which allow printing integers as hexadecimal even in debug contexts.

There was some discussion on exposing this to downstream APIs in both the tracking issue and the RFC, and ultimately, the tracking issue was closed without any progress on this. In fact, the code still references the now-closed tracking issue when asking how this should be exposed.

This takes the approach of expanding the RFC to all formatting traits, such that o?, b?, e?, E?, and p? are all possible. It uses these as applicable for all the types in the library crates, assuming I didn't miss any.

This is just a hint, meaning that it can safely be ignored by any debug implementations. For example, format!("{:x?}", Some("hello")) is still allowed, and it simply ignores the x specifier.

This implementation replaces the existing flags used for DebugLowerHex and DebugUpperHex with a four-bit bitfield on the formatting flags. It exposes this via a new DebugVariant field, which states what variant is used in Debug implementations, defaulting to Display if none is specified. For non-debug flags, it will always show Display, but we could potentially change that to show the current format.


Currently, this insta-stabilises the ability to use these variants in formatting (format!("{var:p?}") now works) but still leaves the Formatter::debug_variant method and DebugVariant enum as unstable under the debug_variants feature. This is mostly because I have no idea how we'd actually gate the parsing under the feature; if this is desired, I'd need some guidance on how to accomplish that.


Side note, this actually changes the layout of the flags for the hex-debug variants, which may be undesired. I assume this is okay since the flags method is deprecated, but will modify it to be backwards-compatible if this is a requirement. My thought process was that it would be better to add some padding to separate out the flags a bit better (allow 8 boolean flags before moving onto bitfield for variant), but this is honestly a weak justification. I'm not sure how much code out in the wild is using flags due to the lack of API for the existing hex-debug variants.

@rustbot rustbot added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 11, 2022
@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rustbot
Copy link
Collaborator

rustbot commented Jul 11, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 11, 2022
@rust-log-analyzer

This comment has been minimized.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Jul 11, 2022

I keep having issues testing this locally, since src/test/codegen/mem-replace-direct-memcpy.rs keeps failing even though this is almost certainly not affected by my change. Hopefully running the tests on stage 2 instead of stage 1 fixes this.

Edit: Oh, yeah, nope, this doesn't fix it:

---- [codegen] src/test/codegen/mem-replace-direct-memcpy.rs stdout ----

error: verification with 'FileCheck' failed
status: exit status: 1
command: "/home/ltdk/rust/build/x86_64-unknown-linux-gnu/ci-llvm/bin/FileCheck" "--input-file" "/home/ltdk/rust/build/x86_64-unknown-linux-gnu/test/codegen/mem-replace-direct-memcpy/mem-replace-direct-memcpy.ll" "/home/ltdk/rust/src/test/codegen/mem-replace-direct-memcpy.rs" "--allow-unused-prefixes" "--check-prefixes" "CHECK,NONMSVC"
stdout: none
--- stderr -------------------------------
/home/ltdk/rust/src/test/codegen/mem-replace-direct-memcpy.rs:20:11: error: CHECK: expected string not found in input
// CHECK: call void @llvm.memcpy.{{.+}}({{i8\*|ptr}} align 1 %{{.*}}, {{i8\*|ptr}} align 1 %dest, i{{.*}} 1, i1 false)
          ^
/home/ltdk/rust/build/x86_64-unknown-linux-gnu/test/codegen/mem-replace-direct-memcpy/mem-replace-direct-memcpy.ll:52:21: note: scanning from here
; core::mem::replace
                    ^
/home/ltdk/rust/build/x86_64-unknown-linux-gnu/test/codegen/mem-replace-direct-memcpy/mem-replace-direct-memcpy.ll:138:2: note: possible intended match here
 call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %_4, i8* align 1 %src, i64 1, i1 false)
 ^

Input file: /home/ltdk/rust/build/x86_64-unknown-linux-gnu/test/codegen/mem-replace-direct-memcpy/mem-replace-direct-memcpy.ll
Check file: /home/ltdk/rust/src/test/codegen/mem-replace-direct-memcpy.rs

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           47: define internal i8 @"_ZN4core3mem13manually_drop21ManuallyDrop$LT$T$GT$10into_inner17h35f74d8c84d7bcc5E"(i8 %slot) unnamed_addr #0 {
           48: start:
           49:  ret i8 %slot
           50: }
           51:
           52: ; core::mem::replace
check:20'0                         X error: no match found
           53: ; Function Attrs: inlinehint nonlazybind uwtable
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           54: define internal i8 @_ZN4core3mem7replace17hbec7fbe13492cafeE(i8* noalias noundef align 1 dereferenceable(1) %dest, i8 %src) unnamed_addr #1 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           55: start:
check:20'0     ~~~~~~~
           56:  %0 = alloca { i8*, i32 }, align 8
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           57:  %_7 = alloca i8, align 1
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
            .
          133: ; call core::mem::maybe_uninit::MaybeUninit<T>::as_mut_ptr
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          134:  %_4 = call i8* @"_ZN4core3mem12maybe_uninit20MaybeUninit$LT$T$GT$10as_mut_ptr17h4bc92292b08ff7a1E"(i8* noalias noundef align 1 dereferenceable(1) %tmp)
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          135:  br label %bb2
check:20'0     ~~~~~~~~~~~~~~~
          136:
check:20'0     ~
          137: bb2: ; preds = %bb1
check:20'0     ~~~~~~~~~~~~~~~~~~~~
          138:  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %_4, i8* align 1 %src, i64 1, i1 false)
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:20'1      ?                                                                                         possible intended match
          139:  %_6 = load i8, i8* %tmp, align 1
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          140: ; call core::mem::maybe_uninit::MaybeUninit<T>::assume_init
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          141:  %1 = call i8 @"_ZN4core3mem12maybe_uninit20MaybeUninit$LT$T$GT$11assume_init17ha7013b104b271e83E"(i8 %_6, %"core::panic::location::Location"* noalias noundef readonly align 8 dereferenceable(24) bitcast (<{ i8*, [16 x i8] }>* @alloc21 to %"core::panic::location::Location"*))
check:20'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          142:  br label %bb3
check:20'0     ~~~~~~~~~~~~~~~
          143:
check:20'0     ~
            .
            .
            .
>>>>>>
------------------------------------------



failures:
    [codegen] src/test/codegen/mem-replace-direct-memcpy.rs

@rust-log-analyzer

This comment has been minimized.

@Urgau
Copy link
Member

Urgau commented Jul 11, 2022

I keep having issues testing this locally, since src/test/codegen/mem-replace-direct-memcpy.rs keeps failing even though this is almost certainly not affected by my change. Hopefully running the tests on stage 2 instead of stage 1 fixes this.

This is because the LLVM version used in CI is older or newer than the version you're using. I also don't think the failure matters for you. You can (most probably) ignore it.

You have however some compilation failures, those failing tests should probably be gated behind #[cfg(not(bootstrap))] (since they not are expected to compile without your changes in the compiler).

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@clarfonthey clarfonthey force-pushed the debug_variants branch 3 times, most recently from c6489d7 to a38651e Compare July 12, 2022 05:02
@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

Currently, this insta-stabilises the ability to use these variants in formatting (format!("{var:p?}") now works) but still leaves the Formatter::debug_variant method and DebugVariant enum as unstable under the debug_variants feature. This is mostly because I have no idea how we'd actually gate the parsing under the feature; if this is desired, I'd need some guidance on how to accomplish that.

you can check if tcx.features().your_feature_name { ... } in any of the compiler code, since this is implemented as a built-in proc macro. See https://rustc-dev-guide.rust-lang.org/implementing_new_features.html#stability-in-code

@clarfonthey
Copy link
Contributor Author

you can check if tcx.features().your_feature_name { ... } in any of the compiler code, since this is implemented as a built-in proc macro. See https://rustc-dev-guide.rust-lang.org/implementing_new_features.html#stability-in-code

My main issue is that the parsing code doesn't have any access to any of the compiler code, and ditto for the core library code. I'm not sure where in the code the two are linked together, and had trouble finding it.

Might take some more time over the weekend to poke around more, though.

@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

ah, sure - I think there's a gate_feature macro or something like that? If nothing else you can trace back from where sess creates features_untracked.

@pnkfelix
Copy link
Member

So, looking at this, I'm pretty confident that this should get some review from at least T-libs-api, and maybe also T-lang, since based on the description, this is insta-stably extending the format system to allow special treatment for all of o?, b?, e?, E?, and p? along with x? and X?

I'm tagging with both of those teams, and nominating it for both of those teams so that they can explicitly untag themselves if they do not think this is in their scope.

@rustbot label: I-lang-nominated T-lang T-libs-api I-libs-api-nominated

@rustbot rustbot added I-lang-nominated Nominated for discussion during a lang team meeting. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Aug 11, 2022
@joshtriplett joshtriplett removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Aug 16, 2022
@joshtriplett
Copy link
Member

We discussed this in today's lang meeting, and agreed that it was the purview of libs-api.

@joshtriplett joshtriplett removed the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Aug 16, 2022
@joshtriplett joshtriplett removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Aug 16, 2022
@m-ou-se m-ou-se assigned m-ou-se and unassigned kennytm Aug 23, 2022
@m-ou-se m-ou-se removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Aug 23, 2022
@m-ou-se
Copy link
Member

m-ou-se commented Aug 23, 2022

We discussed this last week in the library api meeting, and we're not sure if we want this. We're a bit worried about the inconsistency between hex/octal/decimal/etc. as flags, vs as different traits, and unsure about the practical value.

I've assigned this to myself for me to review and analyse the impact of this change and potential downsides.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Aug 23, 2022

So, it's worth mentioning that the trait system wouldn't even be able to capture the nuance of these flags because there are cases where one field might be able to display one way, but not the others. For example, (u32, &str) couldn't impl a theoretical DebugHex trait since &str: ! Debug, and there's no way to make (T, U) implement Trait for T: Trait or U: Trait, only T: Trait and U: Trait.

It's unfortunate that we can't detect whether a flag would have no effect, but that feels like something that a clippy lint might be able to handle.

At least from the perspective of a user, it's weird that hex formatting is given an elevated status over octal and binary, since especially in my case, I needed binary but not hex. I can see the Pointer trait being confusing, but also, it'd be nice to make it work for stuff like Option<NonNull<T>> and would be good to include.

Basically: I agree that there may be a better design, but the existing design has been stabilised and I'd like to try and make it work for more than just hex formatting, and this to me seems like the only reasonable way to do it.

@bors
Copy link
Contributor

bors commented Aug 23, 2022

☔ The latest upstream changes (presumably #100920) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@clarfonthey
Copy link
Contributor Author

Resolved the merge conflicts anyway, whether review deems this worth merging or not.

@clarfonthey
Copy link
Contributor Author

Wondering what the status of this should be. I could try and investigate feature gating this, open up an ACP, or just close it. I would like to see this available, but understand the apprehension in merging it as-is.

@m-ou-se
Copy link
Member

m-ou-se commented Jan 14, 2023

Wondering what the status of this should be.

This is stuck waiting for me, sorry about that. I've finally been able to focus a bit on format_args!() again this week, so I'll probably get to this PR soon.

@clarfonthey
Copy link
Contributor Author

Understandable; I remember that the last comment mentioned apprehension about including this, so, it makes sense that it got lost in the pool of other issues. I think that variants other than hex makes sense to include in one way or another, although I think the loosest justification is {:p?} since it's unclear when things should be printed as a pointer for nested types -- for example, if a type happens to include a reference inside it, I don't think this usually means the user would like to see that reference as a pointer instead of its data.

@m-ou-se
Copy link
Member

m-ou-se commented Jan 17, 2023

I dumped some of my thoughts on Zulip. Copied here:


it's a bit of a weird situation now: in {:#x?}, the # flag applies to both ? (use multi-line debug format) and to x (prepend 0x):

    assert_eq!(format!("{:x?}", (10, 20)), "(a, 14)");
    assert_eq!(format!("{:#x?}", (10, 20)), "(\n    0xa,\n    0x14,\n)");

(other flags are also passed on to the individual elements:

    assert_eq!(format!("{:04x?}", (1, 2, 3)), "(0001, 0002, 0003)");

but at least those only apply to the elements, not also to the outer layer.)

this can be useful sometimes. but it feels like a bit of a dirty hack. we don't have any way of specifying flags separately for the elements/fields or for the tuple/array/struct/whatever itself, like some other languages do. and now it's feels a bit random what exactly a flag applies to

we teach people that {:50} means that whatever will be formatted will have a width of 50. but for ? it means every single element will have a width of 50. (and for many custom types the width is just silently ignored)

(and it's also not great for code size in tiny programs, because <Option<u32> as Debug>::fmt will pull in both the decimal and hex formatting code. also supporting b and o will make that even worse. (but it already pulls in the code needed to support other options like width, so that problem is not specific to just this))

so it feels to me like the current situation is not great, and adding {:o?} and {:b?} will make it worse. but i'm not sure what we should to to improve the current situation. should we just leave it as is and accept that this remains a bit weird for historical reasons? or can we think of a way to make things less odd? one could argue that the output of Debug is not covered by our stability guarantees, but i wonder how much would break in practice if we were to change anything.

It looks like this feature never went through an unstable period. the RFC went through FCP, but then #48978 was merged directly without feature gate or FCP.

@m-ou-se
Copy link
Member

m-ou-se commented Jan 17, 2023

So tl;dr: I don't think we're feeling quite sure of the current situation and what to do with it. We should figure that out before extending it to {:o?} {:b?} {:e?} {:E?} and {:p?}.

@bors
Copy link
Contributor

bors commented Jan 27, 2023

☔ The latest upstream changes (presumably #107372) made this pull request unmergeable. Please resolve the merge conflicts.

@clarfonthey
Copy link
Contributor Author

I think I'm going to close this for now since @m-ou-se is doing a lot of work on format_args! right now and rust-lang/libs-team#165 exists to help work out some of the finer details of this. Would be happy to help implement whatever the result of that discussion is, though!

@m-ou-se
Copy link
Member

m-ou-se commented Jan 31, 2023

@clarfonthey I've posted a summary of the libs-api discussion about {:x?} here: rust-lang/libs-team#165 (comment)

@clarfonthey clarfonthey deleted the debug_variants branch September 16, 2023 05:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.