Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move base64 encoding/decoding to Rust side #13845

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions ext/web/00_infra.js
Original file line number Diff line number Diff line change
Expand Up @@ -233,7 +233,7 @@
}

/**
* @param {Uint8Array} data
* @param {string} data
* @returns {string}
*/
function forgivingBase64Encode(data) {
Expand All @@ -242,7 +242,7 @@

/**
* @param {string} data
* @returns {Uint8Array}
* @returns {string}
*/
function forgivingBase64Decode(data) {
return core.opSync("op_base64_decode", data);
Expand Down
22 changes: 2 additions & 20 deletions ext/web/05_base64.js
Original file line number Diff line number Diff line change
Expand Up @@ -37,12 +37,7 @@
context: "Argument 1",
});

const uint8Array = forgivingBase64Decode(data);
const result = ArrayPrototypeMap(
[...new SafeArrayIterator(uint8Array)],
(byte) => StringFromCharCode(byte),
);
return ArrayPrototypeJoin(result, "");
return forgivingBase64Decode(data);
}

/**
Expand All @@ -56,20 +51,7 @@
prefix,
context: "Argument 1",
});
const byteArray = ArrayPrototypeMap(
[...new SafeArrayIterator(data)],
(char) => {
const charCode = StringPrototypeCharCodeAt(char, 0);
if (charCode > 0xff) {
throw new DOMException(
"The string to be encoded contains characters outside of the Latin1 range.",
"InvalidCharacterError",
);
}
return charCode;
},
);
return forgivingBase64Encode(TypedArrayFrom(Uint8Array, byteArray));
return forgivingBase64Encode(data);
}

window.__bootstrap.base64 = {
Expand Down
4 changes: 2 additions & 2 deletions ext/web/internal.d.ts
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,8 @@ declare namespace globalThis {
result: string;
position: number;
};
forgivingBase64Encode(data: Uint8Array): string;
forgivingBase64Decode(data: string): Uint8Array;
forgivingBase64Encode(data: string): string;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit of a shame that the strings (presumably) have to be copied. I wonder if it would be possible to have an optional &str serde?

forgivingBase64Decode(data: string): string;
};

declare var domException: {
Expand Down
38 changes: 30 additions & 8 deletions ext/web/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -149,8 +149,25 @@ fn op_base64_decode(
_state: &mut OpState,
input: String,
_: (),
) -> Result<ZeroCopyBuf, AnyError> {
let mut input: &str = &input.replace(|c| char::is_ascii_whitespace(&c), "");
) -> Result<String, AnyError> {
let mut ws = false;
let len = input.len();
let invalid = input.char_indices().any(|(i, c)| {
if char::is_ascii_whitespace(&c) {
ws = true;
} else if c == '=' {
if i != len - 1 && i != len - 2 {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bit of a possibility for false negative here: Even if there are '=' characters at the end it does not mean that they'll be removed below since that depends on whitespace removal etc.

As such, it would be more appropriate to track '=' markers at end with an enum: None, One, Two. Then compare to those when '=' is being removed and reset the enum appropriately. If reset does not happen, it means that the '=' at the end were invalid characters.

return true;
}
} else if c != '+' && c != '/' && !c.is_alphanumeric() {
return true;
}
false
});

// "Remove all ASCII whitespace from data"
let input = if ws { input.replace(|c| char::is_ascii_whitespace(&c), "") } else { input };
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be preferable if the replacement could be done on the same iter as the invalid characters but that's probably not possible.

let mut input: &str = &input;
// "If the length of input divides by 4 leaving no remainder, then:
// if input ends with one or two U+003D EQUALS SIGN (=) characters,
// remove them from input."
Expand All @@ -170,10 +187,8 @@ fn op_base64_decode(
);
}

if input
.chars()
.any(|c| c != '+' && c != '/' && !c.is_alphanumeric())
{
// "If data contains a code point that is not '+', '/', or ASCII alphanumeric then return failure."
if invalid {
return Err(
DomExceptionInvalidCharacterError::new(
"Failed to decode base64: invalid character",
Expand All @@ -190,14 +205,21 @@ fn op_base64_decode(
err
))
})?;
Ok(ZeroCopyBuf::from(out))
Ok(String::from_utf8(out).unwrap())
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if .unwrap() here is valid.

}

fn op_base64_encode(
_state: &mut OpState,
s: ZeroCopyBuf,
s: String,
_: (),
) -> Result<String, AnyError> {
let char_count = s.chars().count();
let s = s.into_bytes();
if s.len() != char_count {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Properly this should be s.len() > char_count but it doesn't really matter.

// If the byte vec length is longer than the string length, it means that some characters were not respresentable as a single
// byte, ie. some characters were not Latin1.
return Err(DomExceptionInvalidCharacterError::new("The string to be encoded contains characters outside of the Latin1 range.").into());
}
let cfg = base64::Config::new(base64::CharacterSet::Standard, true)
.decode_allow_trailing_bits(true);
let out = base64::encode_config(&s, cfg);
Expand Down