Publication:
Enhanced system-level coherence for heterogeneous unified memory architectures

dc.contributor.author: Nataraja, Anoop Mysore
dc.contributor.author: Fernández Pascual, Ricardo
dc.contributor.author: Ros Bardisa, Alberto
dc.contributor.department: Ingeniería y Tecnología de Computadores
dc.date.accessioned: 2024-12-18T07:47:16Z
dc.date.available: 2024-12-18T07:47:16Z
dc.date.issued: 2024-11-28
dc.description: © 2024 IEEE. This document is the Submitted version of a Published Work that appeared in final form in 2024 IEEE International Symposium on Workload Characterization (IISWC). To access the final edited and published work see https://doi.org/10.1109/IISWC63097.2024.00032
dc.description.abstract: Heterogeneous Unified Memory Architectures (HUMA) provide a unified memory space for on-die CPUs, GPUs, and other hardware accelerators. Such architectures improve performance and energy efficiency by obviating explicit data transfers between processors. An important feature of such architectures is Heterogeneous System Coherence (HSC), which simplifies the programming model by reducing the explicit synchronizations otherwise expected of the programmers of such systems. However, due to differences in the memory models and bandwidth requirements of CPUs and GPUs, hardware implementation of coherence for such systems is often complex and comes at high power, performance, and area trade-offs. This paper optimizes the existing heterogeneous coherence mechanism in early AMD Accelerated Processing Units, approximately modeled in the gem5 simulator. It introduces precise sharing information in the system-level directory, which monitors both CPU and GPU cache lines, and implements a new write-back shared last-level cache (LLC). The original implementation consisted of a stateless system-level directory and a write-through LLC. Our evaluation results with a set of collaborative heterogeneous benchmarks reveal, on average, a 14.4% performance improvement and 80.8% and 50.4% reduced probing traffic and main-memory interactions, respectively. Through optimizations and adaptation of the evaluated benchmarks, this work aims to reduce the barriers to entry into HSC research.
dc.embargo.terms: 1-ene-2999
dc.format: application/pdf
dc.format.extent: 11
dc.identifier.doi: https://doi.org/10.1109/IISWC63097.2024.00032
dc.identifier.eissn: 979-8-3503-5603-8
dc.identifier.uri: http://hdl.handle.net/10201/147600
dc.language: eng
dc.publisher: IEEE Computer Society
dc.relation: This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (ECHO, grant agreement No 819134), from the MCIN/AEI/10.13039/501100011033/ and the “ERDF A way of making Europe”, EU (grant PID2022-136315OB-I00), and from the MCIN/AEI/10.13039/501100011033/ and the European Union NextGenerationEU/PRTR (grant TED2021-130233BC33).
dc.relation.ispartof: IEEE International Symposium on Workload Characterization (IISWC), 2024, 15-17, Vancouver, pp. 273--283
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/10763885
dc.rights.accessRights: info:eu-repo/semantics/restrictedAccess
dc.subject: Heterogeneous system coherence
dc.subject: Collaborative heterogeneous applications
dc.subject: Architectural simulator
dc.title: Enhanced system-level coherence for heterogeneous unified memory architectures
dc.type: info:eu-repo/semantics/article
dspace.entity.type: Publication
Files
Original bundle
Name:
anataraja-iiswc24.pdf
Size:
593.28 KB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
2.26 KB
Format:
Item-specific license agreed upon to submission