Possible Overalignment Issue for aarch64 (Apple M1): CachePadded Aligned to 128 Bytes #131019

xpepermint · 2024-09-29T13:25:56Z

While working on a project involving my own channel implementation, I noticed an issue where data was being loaded incorrectly from atomic variables wrapped in CachePadded. This problem arose specifically on an aarch64 platform, particularly on Apple M1. After troubleshooting for a while, I began to suspect that the alignment of CachePadded could be contributing to this behavior.

In this case, the issue did not seem to be related to any specific Rust library object. I have used CachePadded in other parts of my code without encountering problems, but in this particular case, I was working with an Arc<Vec<Commit>>, where each Commit structure included fields like Weak<AtomicUsize> and AtomicUsize. Additionally, the vector had a fairly large capacity, which might have exposed the issue.

Currently, CachePadded aligns data to 128 bytes on aarch64, likely due to the assumption that prefetchers on modern CPUs, including ARM chips, can fetch multiple cache lines at once. However, as far as I know, the actual cache line size on Apple M1 is 64 bytes. This over-alignment could be introducing unnecessary padding, leading to inefficiencies or even the incorrect behavior I’ve observed. Reducing the alignment to 64 bytes, matching the actual cache line size, resolved this issue.
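
To make the padding cost concrete, here is a minimal sketch comparing a 128-byte-aligned wrapper with a 64-byte-aligned one. `Padded128`, `Padded64`, and `all_aligned` are hypothetical names made up for this illustration; they are not the actual `CachePadded` definition. The sketch assumes a target where `AtomicUsize` is no larger than 64 bytes. It suggests that the wider alignment doubles the per-element footprint, while elements stored in a `Vec` should still be correctly aligned either way.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a wrapper aligned to 128 bytes.
#[repr(align(128))]
struct Padded128<T>(T);

/// Hypothetical stand-in for a wrapper aligned to 64 bytes.
#[repr(align(64))]
struct Padded64<T>(T);

/// Returns true if every element starts at an address that is a
/// multiple of the element type's alignment.
fn all_aligned<T>(items: &[T]) -> bool {
    items
        .iter()
        .all(|item| (item as *const T as usize) % align_of::<T>() == 0)
}

fn main() {
    // The size of a type is always rounded up to a multiple of its alignment.
    println!("Padded128: size {}, align {}",
        size_of::<Padded128<AtomicUsize>>(),
        align_of::<Padded128<AtomicUsize>>());
    println!("Padded64:  size {}, align {}",
        size_of::<Padded64<AtomicUsize>>(),
        align_of::<Padded64<AtomicUsize>>());

    // A fairly large vector of padded atomics, as in the report above.
    let wide: Vec<Padded128<AtomicUsize>> =
        (0..1024).map(|i| Padded128(AtomicUsize::new(i))).collect();
    let narrow: Vec<Padded64<AtomicUsize>> =
        (0..1024).map(|i| Padded64(AtomicUsize::new(i))).collect();

    // Vec allocates with the element type's alignment, so both checks hold.
    assert!(all_aligned(&wide));
    assert!(all_aligned(&narrow));

    // Atomic operations behave the same regardless of the padding.
    wide[7].0.fetch_add(1, Ordering::Relaxed);
    narrow[7].0.fetch_add(1, Ordering::Relaxed);
    assert_eq!(wide[7].0.load(Ordering::Relaxed), 8);
    assert_eq!(narrow[7].0.load(Ordering::Relaxed), 8);
}
```

If a check like `all_aligned` ever fails in real code, that points to an allocation or pointer-casting problem elsewhere rather than to the alignment value itself.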

Unfortunately, I’m unable to share the full code due to a signed policy. However, I believe the alignment strategy for CachePadded on aarch64 is worth investigating, especially in situations like mine where high-capacity vectors and atomic operations are involved. Adjusting the alignment for aarch64 to 64 bytes may help prevent similar issues in other cases.

I’d appreciate it if someone could take a look or clarify whether this behavior is expected. From my testing, it seems that reducing the alignment to 64 bytes mitigates the problem on M1, though I’m not entirely certain if this is the root cause.

Ref: crossbeam-rs/crossbeam#1139


workingjubilee · 2024-09-29T17:06:22Z

@xpepermint The Apple M1 has an asymmetric cache line size across its different cores and caches. It in fact uses a 128-byte cache line on non-efficiency cores.

However, this repository does not expose CachePadded, so you have filed the issue in the wrong place.

thomcc · 2024-09-29T22:11:16Z

See also the output of sysctl hw.cachelinesize on my M1 mac:

hw.cachelinesize: 128
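
The same value can be read programmatically. The following is only a sketch: it shells out to the `sysctl` command shown above and parses a line in the `hw.cachelinesize: 128` format. The helper names `parse_cache_line_size` and `query_cache_line_size` are hypothetical, and the query is assumed to return `None` on systems where `sysctl` is missing or does not know this key.

```rust
use std::process::Command;

/// Parses a line such as "hw.cachelinesize: 128" into the numeric value.
/// Returns None if the key or the number does not match that format.
fn parse_cache_line_size(line: &str) -> Option<usize> {
    let (key, value) = line.trim().split_once(':')?;
    if key.trim() != "hw.cachelinesize" {
        return None;
    }
    value.trim().parse().ok()
}

/// Runs `sysctl hw.cachelinesize` and parses its output.
/// Returns None if the command cannot be run, fails, or prints
/// something unexpected.
fn query_cache_line_size() -> Option<usize> {
    let output = Command::new("sysctl")
        .arg("hw.cachelinesize")
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    parse_cache_line_size(&String::from_utf8_lossy(&output.stdout))
}

fn main() {
    match query_cache_line_size() {
        Some(size) => println!("reported cache line size: {size} bytes"),
        None => println!("hw.cachelinesize is not available on this system"),
    }
}
```

The value is reported for the machine as a whole, so it does not distinguish between the different core types mentioned earlier in the thread.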

rustbot added the needs-triage label Sep 29, 2024

workingjubilee closed this as not planned Sep 29, 2024

jieyouxu removed the needs-triage label Sep 29, 2024
