Port tier1 to GPU. #30

Open
opened 2024-05-03 13:06:33 +00:00 by Gregor Weiss · 0 comments
Collaborator

Tier 1 is currently the most time-consuming step. Porting it to the GPU would allow individual code blocks to be encoded and decoded in parallel, following the same pattern as the existing OpenMP threading hooks.

GPU versions of encode_codeblock(s) and decode_codeblock(s) would be implemented as `__global__` kernels launched from the CPU, replacing the loops in tier1_encode and tier1_decode.
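To make the proposal concrete, here is a minimal CUDA sketch of how a host-side loop over code blocks could become a kernel launch. Everything in it is an assumption for illustration: `encode_codeblock_stub`, `encode_codeblocks_kernel`, `tier1_encode_sketch`, and the flat coefficient layout are hypothetical and are not the BigWhoop API, and the stub merely sums coefficients in place of real Tier 1 coding. It also maps one thread to one code block, which is the simplest option but not necessarily the best one.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the per-code-block encoder: it only sums the
// block's coefficients so that the result is easy to verify.
__device__ int encode_codeblock_stub(const int *coeffs, int n) {
    int acc = 0;
    for (int i = 0; i < n; ++i) acc += coeffs[i];
    return acc;
}

// One thread per code block; this replaces the host loop over code blocks.
__global__ void encode_codeblocks_kernel(const int *coeffs, int block_size,
                                         int num_blocks, int *results) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < num_blocks)
        results[idx] = encode_codeblock_stub(coeffs + (size_t)idx * block_size,
                                             block_size);
}

// Hypothetical host wrapper playing the role of tier1_encode: copy the input
// to the device, launch the kernel, and copy the results back.
cudaError_t tier1_encode_sketch(const int *h_coeffs, int num_blocks,
                                int block_size, int *h_results) {
    int *d_coeffs = nullptr, *d_results = nullptr;
    size_t coeff_bytes  = (size_t)num_blocks * block_size * sizeof(int);
    size_t result_bytes = (size_t)num_blocks * sizeof(int);

    cudaError_t err = cudaMalloc(&d_coeffs, coeff_bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_results, result_bytes);
    if (err != cudaSuccess) { cudaFree(d_coeffs); return err; }

    err = cudaMemcpy(d_coeffs, h_coeffs, coeff_bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        int grid = (num_blocks + threads - 1) / threads;  // round up
        encode_codeblocks_kernel<<<grid, threads>>>(d_coeffs, block_size,
                                                    num_blocks, d_results);
        err = cudaGetLastError();  // catches launch configuration errors
    }
    if (err == cudaSuccess)  // this blocking copy also waits for the kernel
        err = cudaMemcpy(h_results, d_results, result_bytes,
                         cudaMemcpyDeviceToHost);

    cudaFree(d_coeffs);
    cudaFree(d_results);
    return err;
}

int main() {
    const int num_blocks = 4, block_size = 8;
    std::vector<int> coeffs(num_blocks * block_size);
    for (size_t i = 0; i < coeffs.size(); ++i) coeffs[i] = (int)i;

    std::vector<int> results(num_blocks, 0);
    cudaError_t err = tier1_encode_sketch(coeffs.data(), num_blocks,
                                          block_size, results.data());
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare against the sequential host loop that the kernel replaces.
    for (int b = 0; b < num_blocks; ++b) {
        int ref = 0;
        for (int i = 0; i < block_size; ++i) ref += coeffs[b * block_size + i];
        assert(results[b] == ref);
    }
    std::printf("GPU results match the host loop\n");
    return 0;
}
```

Because Tier 1 coding within a single code block is largely sequential, a real port might instead assign one thread block per code block, or use other finer-grained parallelism; the sketch only shows the loop-to-kernel transformation.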

Gregor Weiss added this to the GPU implementation project 2024-05-03 13:06:33 +00:00
Gregor Weiss self-assigned this 2024-05-03 13:18:31 +00:00
Reference: TOPIO/BigWhoop#30