Publication: Free Atomics: Hardware Atomic Operations without Fences
Authors
Asgharzadeh, Ashkan ; Cebrian, Juan M. ; Perais, Arthur ; Kaxiras, Stefanos ; Ros, Alberto
Publisher
Association for Computing Machinery
DOI
https://doi.org/10.1145/3470496.3527385
Type
info:eu-repo/semantics/article
Description
© 2022. The authors. This document is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0
This document is the accepted version of a published work that appeared in final form in ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York.
To access the final work, see DOI: https://doi.org/10.1145/3470496.3527385
Abstract
Atomic Read-Modify-Write (RMW) instructions are primitive synchronization operations implemented in hardware that provide the building blocks for higher-abstraction synchronization mechanisms to programmers. According to publicly available documentation, current x86 implementations serialize atomic RMW operations, i.e., the store buffer is drained before issuing atomic RMWs and subsequent memory operations are stalled until the atomic RMW commits. This serialization, carried out by memory fences, incurs a performance cost which is expected to increase with deeper pipelines.
This work proposes Free atomics, a lightweight, speculative, deadlock-free implementation of atomic operations that removes the need for memory fences, thus improving performance, while preserving atomicity and consistency. Free atomics is, to the best of our knowledge, the first proposal to enable store-to-load forwarding for atomic RMWs. Free atomics only requires simple modifications and incurs a small area overhead (15 bytes). Our evaluation using gem5-20 shows that, for a 32-core configuration, Free atomics improves performance by 12.5%, on average, for a large range of parallel workloads and 25.2%, on average, for atomic-intensive parallel workloads over a fenced atomic RMW implementation.
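For readers unfamiliar with the software side of atomic RMWs, the following is a minimal C++ sketch of how such instructions serve as building blocks for higher-level synchronization. It is not taken from the paper: the names `SpinLock`, `count_with_lock`, and `count_with_fetch_add` are hypothetical and chosen only for illustration. On x86, compilers typically lower `exchange` and `fetch_add` to locked instructions (for example `xchg` or `lock xadd`), which are the operations the abstract describes as being serialized.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical illustration (not from the paper): a test-and-set spinlock
// built on a single atomic RMW operation, std::atomic::exchange.
class SpinLock {
public:
    void lock() {
        // exchange atomically writes true and returns the previous value;
        // the lock is acquired only when the previous value was false.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // back off while another thread holds the lock
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Increment a plain counter from several threads, protected by the spinlock.
long count_with_lock(int num_threads, int iters) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}

// Increment a shared counter directly with an atomic RMW (fetch_add).
long count_with_fetch_add(int num_threads, int iters) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Calling `count_with_lock(4, 1000)` or `count_with_fetch_add(4, 1000)` should return 4000 in both cases, since every increment is performed atomically. Programs that issue many such operations correspond to the atomic-intensive workloads mentioned in the abstract.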
Citation
ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, Pages 14–26
This item is subject to a Creative Commons license. http://creativecommons.org/licenses/by-nc-nd/4.0/