Experiment with simd_masked_load to read beyond without undefined behavior #98

Draft
wants to merge 30 commits into
base: main
from

Conversation

ogxd
Owner

@ogxd ogxd commented Nov 6, 2024

No description provided.

@ogxd ogxd self-assigned this Nov 6, 2024

let indices = vld1q_s8([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15].as_ptr());
let mask = vreinterpretq_s8_u8(vcgtq_s8(vdupq_n_s8(len as i8), indices));
std::intrinsics::simd::simd_masked_load(mask, data as *const i8, vdupq_n_s8(len as i8))
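For readers unfamiliar with masked loads, here is a scalar sketch of what the three lines above appear to compute. It is only a model, not the PR's implementation: the function name `masked_load_model` is hypothetical, and it assumes `len` is at most 16 so that `len as i8` does not wrap. Lanes below `len` are read from memory; every other lane takes the fallback value, which in the snippet is `len` itself.

```rust
/// Scalar model of a 16-lane masked load (hypothetical helper, for illustration).
/// Lane `i` is read from `data` only when `i < len`; all other lanes take the
/// fallback value, so no byte past `len` is ever touched.
fn masked_load_model(data: &[i8], len: usize) -> [i8; 16] {
    // Assumption: this path only handles partial vectors of at most 16 bytes.
    assert!(len <= 16 && len <= data.len());

    // Mirrors `vdupq_n_s8(len as i8)`, which the snippet passes as the fallback vector.
    let fill = len as i8;
    let mut out = [fill; 16];

    for i in 0..16 {
        // Mirrors the mask lane `len > indices[i]` built with `vcgtq_s8`.
        if (len as i8) > (i as i8) {
            out[i] = data[i];
        }
    }
    out
}
```

Called with `&[10, 20, 30]` and `len = 3`, the model yields `10, 20, 30` in the first three lanes and `3` in the remaining thirteen. The point of the intrinsic is that this result is defined without reading the bytes past `len`, whereas a plain 16-byte load would read them.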

So do I understand correctly that it would be valid for LLVM to basically emit the same assembly as before, but somehow it doesn't?

I'm not a codegen expert, so I have no idea whether this is just hard for LLVM to do or something they could reasonably fix. Maybe it'd be worth filing an LLVM bug report about this? (We can try to find some people who could help nail down the core issue here, if you are interested.)

Development

Successfully merging this pull request may close these issues.

Consider using simd_masked_load for the Read Beyond of Death trick
2 participants