-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
No good alternative to decode_config()
in 0.20
#205
Comments
First off, thanks for filing feedback. I released the 0.20 alpha precisely to gather feedback, but that didn't produce any... oh well. If 0.20's role is to be the lightning rod, that's ok! I hadn't moved things to be methods on Engine to avoid perturbing the API too much, but I'm not opposed to it. I went back and forth during development about if My intent with the jump from 0.13 to 0.20 was to signal the user that it's a major change in the API, which it obviously is given you filing this issue. ;) |
Right now in most of my usage sites I don't need to import anything, which is an aspect of the previous API I appreciated. The names
My 12 uses are in 7 files across 4 different crates, so reusing it would need a bunch of different infrastructure.
A one number minor version bump is used by every other 0.x crate out there, so it's a well-established standard. I don't recall any other crate having done a more-than-one minor bump. If you want to get early feedback on new API, I'd recommend instead adding it in a compatible release and deprecating the old API. |
If "not needing to import anything" is how you judge your libraries, I think we are not going to see eye to eye. The language gives you the ability to rename types at use-time, so while I'm open to suggestions for better names, in the meantime I encourage to make use of that ability.
How is that any different than sharing a custom 0.13
Complaining about incrementing a vaguely-meaningful number by 7 instead of 1 is making it hard to take you seriously. |
With 0.13 I didn't need a custom |
I think what @djc is trying to say is that going from
Because the
The problem isn't really the 7 vs 1 jump, it's the breaking of the semver convention. When I saw the jump I though I missed a large number of releases. |
That's not really viable now that there is an additional axis of choice. I want users to understand specifically what their API calls are doing. |
I don't necessarily want to understand the inner workings of this crate if all I want is a base64 url-encoded unpadded string. |
Lines of code is a poor metric for quality. I agree that all other things being equal, shorter is better, but all other things are not equal. |
Hyperbole much? You're welcome to provide concrete suggestions for different API. |
What specifically do you think is so important that users understand? I would like to do url-safe base64 encoding/decoding without padding. Why do you want me to understand what
Given that
I'm honestly not being adversarial, but I fail to see how things are not equal right now. |
I didn't mean to say this in a way that upsets you, and generally I appreciate you making this crate as versatile as possibly, allowing different engines and whatnot. However, the way I've been using this crate is as a simple utility to create standard base64 encoded strings and url-safe base64 encoded strings. This was easy as all I had to do was calling encode or calling encode_config with the url-safe config you provided in this crate. With the new API, this gets a lot more complex. I now need to create my own engine with the alphabet that I want, and a custom config that specifies whether I want padding or not. This makes the API more complex, harder to learn (as I now need to learn about the differences between engines, alphabet and config (and I'm not criticizing their existence)), and more verbose on my side.
I'd appreciate an API that lets me choose the alphabet and padding without creating my own engine. For example fn decode_config(input: &str, alphabet: &Alphabet, padding: &Padding) -> Result<Vec<u8>>; Alternatively, you could also add more pre-defined engines with popular combinations, like url-safe or unpadded. Additionally, I suggest adding an example for url-safe base64 en/decoding to the documentation. |
I am totally agree with the discussion above. For most users, they don't actually need to choose which I understand the necessary of the abstraction of
Most users should want to use those standardized formats:
I prefer having an API that @msrd0 proposed. |
Related: + #28 + marshallpierce/rust-base64#205
Related: + #28 + marshallpierce/rust-base64#205
I don't think this is true. Implicit defaults have been harmful in a number of circumstances because users don't realize there is a choice they have to make, and are implicitly making. Consider the case of default file charsets, a "convenience" in a number of language ecosystems. It's now clear that that was an error: reading a file should always result in the same code points regardless of OS, and having two ways of reading a file (one with the default, one with a specified charset) increases the cognitive load of learning the API and the chance of mistakenly using the "default" one. Time zones are another such case. Convenient, right up until they become a constant trickle of bugs as each developer who touches a codebase has to learn not to do the "easy" way with the default time zone. When we have a constant-time engine (which I'll get to, unless someone beats me to it), it will likely not have as good performance as the current one, but for anyone touching cryptographic material, it's the engine they should use. In such a context, the error of accidentally using the convenient, but incorrect, method of decoding because the IDE happened to autocomplete it, etc is actually a nontrivial problem -- something I hope resonates, given the link to an issue in rustls/pemfile!
I'm not necessarily opposed to it, but with the proliferation of wrinkles that base64 has, it's difficult to guess what users would want, especially when it's so easy for them to create their own tailored consts.
Good idea, I will.
Look at all the alphabets I've added over the years, all due to user requests, and the desire for custom alphabets as well.
If that's the API you want for your circumstance, isn't that literally a one line function you could write in your own project? I'm not saying it's a bad API for you, but I'm not convinced it's a good API for everyone. There's nothing wrong with tailoring an API to suit your purposes. So far, what I'm planning on doing:
|
Yes, I agree there must be users want to have customized alphabets, the current 0.20 API could fulfill their requests. But for most use cases, we only need to encode/decode standardized base64 formats!! We don't need any customization! Standard: base64, base64_url with padding, base64_url without padding. The current API design brings the complexity to all users just because a relatively small group of them want customization. As for the implementation detail of
So in those popular platforms, the library could set a SIMD |
Yes, implicit defaults can be harmful. But I don't think this applies for this particular case, because the choices here so far don't seem to be about correctness. As long as the default choice is correct, it won't be harmful. Rust doesn't make you pick the encoding for every string explicitly, right? It sets UTF-8 as a reasonable default and enables API that works around it. That feels more analogous to what the current API is doing. I don't think constant-time implementations are a great comparison because by and large developers who touch constant-time sensitive code know that what they're doing is a special case and they need to be careful to get it right. It would be great to expose a In my experience good API design will make the easy things simple and the hard things possible. The 0.20 API focuses very hard on the latter to the point that it basically doesn't do the former, at all.
That there are a number of different users asking for different alphabets does not mean users using a different alphabet constitute a majority of your users. I'm pretty confident that 80% or more of base64 users stick to the default and URL-safe alphabets.
rustls-pemfile doesn't actually need constant-time base64 -- it's only used in an input configuration path. |
This comment was marked as off-topic.
This comment was marked as off-topic.
A quick solution: Just add some constants in every engine module. pub const STANDARD: FastPortable = FastPortable::from(alphabet::STANDARD, PAD); base64::decode_engine("aGVsbG8gd29ybGR+Cg==", &base64::engine::fast_portable::STANDARD) Different base64 crates in Rust ecosystem:
|
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
I think this is a pretty good idea. Perhaps even the I think part of the disconnect here is that our goals aren't 100% aligned. You all want quick solutions to the problems in front of you, understandably. I want to create a structure that's going to age gracefully and not turn the API into confusing grab bag of different shortcuts that individually make certain use cases easier but together are an incoherent mess. I think there is a general theme here of mistaking "easy" for "simple". They are not the same. It is quite possible for a multi step process to be simple and an easy process to conceal or otherwise incur a complexity cost. If I included a shortcut method or function or constant for every individual's use case request, this library would have long since devolved into a heinous cross product of complexity. That would be a poor user experience, and much more work to maintain. |
Take a look at #207. I've released that as |
Thanks for updating the API, it looks much better now! There is however one thing that I think could be improved: In your new example, you use |
The problem with having them at the root is that it can be difficult for, say, |
Good point. Another idea: Maybe you could add a pub mod prelude {
pub use crate::engine::{
Engine as Base64Engine,
STANDARD as BASE64,
STANDARD_NO_PAD as BASE64_NO_PAD,
// ...
};
} That way, users could choose which naming scheme they prefer. |
A |
I think your issue with exporting the constants at the root of the crate also applies to the re-exports in engine. In my opinion, a consumer of the crate could easily interpret Personally, I think the best way to solve this currently is removing the re-export of general_purpose in engine and renaming the constants to have
|
I'm not sure I can do much to help a user who thinks I think |
@marshallpierce I might have worded my ideas badly, sorry about that. What I think could be confusing for users is, when seeing Concerning your last paragraph, I personally still think that my suggestions reads a bit better, even with your preference for keeping the module path attached. Here are two examples for how I imagine the base64 crate would be used in different contexts.
use base64::{engine::general_purpose::BASE64_URL, Engine as _};
let orig = b"data";
let encoded: String = BASE64_URL.encode(orig); Here, the user, when importing, has to make the choice to use the
use base64::{engine::{general_purpose, constant_time}, Engine as _};
let secret = b"c2VjcmV0";
let decoded: Vec<u8> = constant_time::BASE64_URL.decode(orig)?;
let orig = b"data";
let encoded: String = general_purpose::BASE64_URL.encode(orig); Here it is also really important to always be able to tell which engine is used. By keeping the module prefix included, this is possible at a glance. |
I generally never leave more than one module in the path, and always prefer to just use the type name if it is expressive enough (which While I'd still prefer an api like |
I'm not sure I follow what user population it is that is both so knowledgeable about implementation details of this crate that there's a concern about which implementation of I also don't think a BASE64 prefix is appropriate. The crate name is already there in the module path, and if people want
If rust-analyzer doesn't support |
When I manually scroll to the top of the file, manually type However, I cannot type
Frankly, I don't think that |
But you can do I am not optimizing for the convenience of rust-analyzer users working in their editor with its specific limitations. I am optimizing for the overall comprehensibility of the library. While I may well not have the optimal arrangement for that stated goal, "my editor makes this hard" is not a persuasive argument to change. |
I would design the API like this. use base64::Engine; // auto imported by rust-analyzer
let b1 = base64::gp::STANDARD.decode(s1)?;
let b2 = base64::ct::STANDARD.decode(s2)?;
let b3 = base64::simd::STANDARD.decode(s3)?;
The equivalent in upcoming let b1 = base64_simd::STANDARD.decode_to_vec(s1)?; |
It does read well, but I don't think it's easy to understand necessarily. A reader who isn't in the know about engine implementations isn't going to get much from |
After working on other things for a while and coming back to this with fresh-ish eyes, I think on balance @msrd0's |
Thanks for updating the API. I believe you should also re-export |
Ok. I'll add that and call it |
There's been no new feedback for a while so I'm closing this and releasing 0.21.0. |
@marshallpierce the 0.21 API is substantially better in this regard than the 0.20 one, thanks for the iteration here. |
I'm unhappy about the API changes in 0.20. In my code base at work we have 12 use cases of base64 0.13 encoding and decoding, 9 of which pass a config (mostly
URL_SAFE_NO_PAD
, someURL_SAFE
). Unless I misunderstand the new API (I don't see any examples of this use case in the README or the top-level documentation), in order to obtain the same result with 0.20 I would have to instantiate a customFastPortable
(which, despite being the only engine shipped with 0.20, I have to import frombase64::engine::fast_portable::FastPortable
) and initialize withbase64::alphabet::URL_SAFE
andbase64::engine::fast_portable::NO_PAD
.Instead of offering a simpler way to do this through a top-level function, 6 of the 9 top-level functions now take an
Engine
; I would argue that, in case you already have an engine, these could have been methods attached to it.I understand that the
Engine
abstraction is useful if you want to introduce a SIMD-backed implementation. I even like the introduction of a separateAlphabet
type. I think it doesn't really make sense to have aConfig
type that doesn't includeAlphabet
, keeping it separate instead -- the whole point of having aConfig
type is to make it easy to wrap all the contextual/usually stable input for your main API calls into a single type. But to not offer a top-level API that makes it easy to use different alphabets/config seems to me like a lack of empathy for how this crate is used.(FWIW, I also think jumping from 0.13 to 0.20 doesn't have much merit.)
The text was updated successfully, but these errors were encountered: