Experiment with CUDA/ROCm implementation #28
2 participants
Reference: TOPIO/BigWhoop#28
As a first step towards running BigWhoop on accelerator cards, we should experiment with an implementation of BigWhoop that uses the programming models provided for the Nvidia/AMD hardware in use at HLRS.
I started with HIP/ROCm on the AMD platform while I had limited access to the corresponding hardware (MI250 and MI300A). This was only partly successful: I did not get CMakeLists.txt to the point where the shared library libbwc could be used in the bwccmdl target.
With CUDA 11.8 / NVHPC 23.11 I successfully adapted the CMakeLists.txt for the project, so porting to CUDA can start now.
The respective structs, such as bwc_field and bwc_tile, will gain device pointers where needed, so that data can be copied from host to device and operated on by GPU kernels.
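
To make the device-pointer idea concrete, here is a minimal sketch of what such a struct extension could look like. It is not BigWhoop code: `demo_tile`, its fields, the `demo_tile_*` helpers and `scale_kernel` are all hypothetical stand-ins, and the real layout of bwc_tile or bwc_field is not reproduced. Only the CUDA runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`, `cudaGetLastError`, `cudaDeviceSynchronize`) are actual API.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical stand-in for a struct such as bwc_tile (assumption: the real
// struct holds a host buffer; the device mirror is the proposed addition).
struct demo_tile {
    size_t  n;       // number of samples in the tile
    double *data;    // host buffer, owned by the caller
    double *d_data;  // device mirror of `data`, nullptr until uploaded
};

// Placeholder kernel: scales every sample in place on the GPU.
__global__ void scale_kernel(double *d, size_t n, double factor)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        d[i] *= factor;
    }
}

// Allocate the device mirror and copy host -> device.
cudaError_t demo_tile_to_device(demo_tile *t)
{
    cudaError_t err = cudaMalloc((void **)&t->d_data, t->n * sizeof(double));
    if (err != cudaSuccess) {
        t->d_data = nullptr;
        return err;
    }
    return cudaMemcpy(t->d_data, t->data, t->n * sizeof(double),
                      cudaMemcpyHostToDevice);
}

// Launch the kernel on the device mirror and wait for it to finish.
cudaError_t demo_tile_scale(demo_tile *t, double factor)
{
    if (t->n == 0) {
        return cudaSuccess;  // nothing to do; a zero-block launch is invalid
    }
    const unsigned threads = 256;
    const unsigned blocks  = (unsigned)((t->n + threads - 1) / threads);
    scale_kernel<<<blocks, threads>>>(t->d_data, t->n, factor);
    cudaError_t err = cudaGetLastError();  // catches launch errors
    if (err != cudaSuccess) {
        return err;
    }
    return cudaDeviceSynchronize();        // catches execution errors
}

// Copy device -> host so the host buffer sees the kernel's result.
cudaError_t demo_tile_to_host(demo_tile *t)
{
    return cudaMemcpy(t->data, t->d_data, t->n * sizeof(double),
                      cudaMemcpyDeviceToHost);
}

// Release the device mirror; the host buffer is left untouched.
void demo_tile_free_device(demo_tile *t)
{
    cudaFree(t->d_data);
    t->d_data = nullptr;
}
```

A caller would fill `data` on the host, call `demo_tile_to_device`, run one or more kernels, and call `demo_tile_to_host` before reading the results. Keeping the host and device pointers side by side in the same struct means existing host code can stay unchanged while kernels are introduced step by step. The HIP runtime offers equivalent calls (`hipMalloc`, `hipMemcpy`, `hipFree`), so the same pattern should carry over to the AMD platform.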