Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move base64 encoding/decoding to Rust side #13845

Closed
wants to merge 1 commit into from

Conversation

aapoalas
Copy link
Collaborator

@aapoalas aapoalas commented Mar 5, 2022

Attempt to at least somewhat fix #13838.

Before:

base64_roundtrip:       n = 10, dt = 4.157s, r = 2/s, t = 415700000ns/op

After:

base64_roundtrip:       n = 10, dt = 1.936s, r = 5/s, t = 193600000ns/op

@@ -43,8 +43,8 @@ declare namespace globalThis {
result: string;
position: number;
};
forgivingBase64Encode(data: Uint8Array): string;
forgivingBase64Decode(data: string): Uint8Array;
forgivingBase64Encode(data: string): string;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit of a shame that the strings (presumably) have to be copied. I wonder if it would be possible to have an optional &str serde?

if char::is_ascii_whitespace(&c) {
ws = true;
} else if c == '=' {
if i != len - 1 && i != len - 2 {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bit of a possibility for false negative here: Even if there are '=' characters at the end it does not mean that they'll be removed below since that depends on whitespace removal etc.

As such, it would be more appropriate to track '=' markers at end with an enum: None, One, Two. Then compare to those when '=' is being removed and reset the enum appropriately. If reset does not happen, it means that the '=' at the end were invalid characters.

});

// "Remove all ASCII whitespace from data"
let input = if ws { input.replace(|c| char::is_ascii_whitespace(&c), "") } else { input };
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be preferable if the replacement could be done on the same iter as the invalid characters but that's probably not possible.

_: (),
) -> Result<String, AnyError> {
let char_count = s.chars().count();
let s = s.into_bytes();
if s.len() != char_count {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Properly this should be s.len() > char_count but it doesn't really matter.

@@ -190,14 +205,21 @@ fn op_base64_decode(
err
))
})?;
Ok(ZeroCopyBuf::from(out))
Ok(String::from_utf8(out).unwrap())
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if .unwrap() here is valid.

@aapoalas
Copy link
Collaborator Author

aapoalas commented Mar 5, 2022

Ooops, a better perf optimization had already been posted.

@aapoalas aapoalas closed this Mar 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

atob and btoa are slower than node
1 participant