While working on a project involving my own channel implementation, I noticed an issue where data was being loaded incorrectly from atomic variables wrapped in CachePadded. This problem arose specifically on an aarch64 platform, particularly on Apple M1. After troubleshooting for a while, I began to suspect that the alignment of CachePadded could be contributing to this behavior.
In this case, the issue did not seem to be related to any specific Rust library object. I have used CachePadded in other parts of my code without encountering problems, but in this particular case, I was working with an Arc<Vec<Commit>>, where each Commit structure included fields like Weak<AtomicUsize> and AtomicUsize. Additionally, the vector had a fairly large capacity, which might have exposed the issue.
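To make the described layout concrete, here is a minimal sketch of a structure matching that description. The field names and the vector capacity are illustrative assumptions, not taken from the original code.

```rust
use std::sync::{Arc, Weak};
use std::sync::atomic::AtomicUsize;

// Hypothetical reconstruction of the Commit structure described above:
// an atomic counter plus a weak reference to another atomic.
struct Commit {
    counter: AtomicUsize,
    parent: Weak<AtomicUsize>,
}

fn main() {
    // A "fairly large capacity" vector of Commit values, shared via Arc.
    let commits: Arc<Vec<Commit>> = Arc::new(
        (0..1024)
            .map(|_| Commit {
                counter: AtomicUsize::new(0),
                parent: Weak::new(), // dangling weak; upgrades to None
            })
            .collect(),
    );
    assert_eq!(commits.len(), 1024);
    assert!(commits[0].parent.upgrade().is_none());
}
```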
Currently, CachePadded aligns data to 128 bytes on aarch64, likely due to the assumption that prefetchers on modern CPUs, including ARM chips, can fetch multiple cache lines at once. However, as far as I know, the actual cache line size on Apple M1 is 64 bytes. This over-alignment could be introducing unnecessary padding, leading to inefficiencies or even the incorrect behavior I’ve observed. Reducing the alignment to 64 bytes, matching the actual cache line size, resolved this issue.
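The footprint difference between the two alignments can be sketched with plain repr(align) wrappers. These are illustrative stand-ins, not crossbeam's actual CachePadded implementation: each wrapper pads an 8-byte AtomicUsize out to its full alignment, so the 128-byte variant doubles the per-element size in a Vec compared to the 64-byte one.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::AtomicUsize;

// Hypothetical wrappers mimicking the two alignment strategies.
#[repr(align(128))]
struct Padded128(AtomicUsize);

#[repr(align(64))]
struct Padded64(AtomicUsize);

fn main() {
    // Alignment forces the size up to a multiple of the alignment,
    // so each element occupies a full 128 or 64 bytes respectively.
    assert_eq!(align_of::<Padded128>(), 128);
    assert_eq!(size_of::<Padded128>(), 128);
    assert_eq!(align_of::<Padded64>(), 64);
    assert_eq!(size_of::<Padded64>(), 64);
}
```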
Unfortunately, I’m unable to share the full code due to a confidentiality policy I have signed. However, I believe the alignment strategy for CachePadded on aarch64 is worth investigating, especially in situations like mine where high-capacity vectors and atomic operations are involved. Adjusting the alignment for aarch64 to 64 bytes may help prevent similar issues from arising in other cases.
I’d appreciate it if someone could take a look or clarify whether this behavior is expected. From my testing, it seems that reducing the alignment to 64 bytes mitigates the problem on M1, though I’m not entirely certain if this is the root cause.
@xpepermint The Apple M1 has asymmetric cache line sizes across its different cores and caches. In fact, it uses 128-byte cache lines on its non-efficiency (performance) cores.
However, this repository does not expose CachePadded, so you have filed the issue in the wrong place.
Ref: crossbeam-rs/crossbeam#1139