
Read back data written in cuda shared memory in python #816

Open
MaximeDebarbat opened this issue Dec 16, 2024 · 0 comments

I am developing a microservices app that relies on Triton Inference Server. I send data to Triton by writing it directly into a CUDA shared-memory region, but now I want to read it back in another process, exactly the way Triton retrieves it.
Here is an example of how I write my data:

import torch
from tritonclient.utils.cuda_shared_memory import create_shared_memory_region, set_shared_memory_region_from_dlpack, get_raw_handle
import time
import os


if __name__ == "__main__":
    shm_name = "tensor_shm"
    device_id = 0

    # Example tensor
    tensor = torch.randn((3, 3), dtype=torch.float32).cuda(device_id)
    byte_size = tensor.element_size() * tensor.numel()

    handle = create_shared_memory_region(shm_name, byte_size=byte_size, device_id=device_id)
    set_shared_memory_region_from_dlpack(handle, [tensor])

    ser = get_raw_handle(handle)

    with open(os.path.join("/shm-dir", "address"), "wb") as f:
        f.write(ser)
        
    time.sleep(100)

Is there an example anywhere, that I perhaps couldn't find, explaining how to read this handle back in another process? Thanks a lot!
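For reference, a minimal reader-side sketch, not an official example. It assumes `get_raw_handle()` returns the serialized `cudaIpcMemHandle` base64-encoded (as the `tritonclient` CUDA shared-memory utilities do) and uses PyCUDA's `IPCMemoryHandle` to map the exporting process's allocation; the shape and dtype must be communicated out of band, since the handle carries neither:

```python
import base64

import numpy as np


def decode_raw_handle(raw: bytes) -> bytes:
    # get_raw_handle() base64-encodes the serialized cudaIpcMemHandle,
    # so the reader must decode it before opening the IPC handle.
    return base64.b64decode(raw)


def read_tensor(handle_path: str, shape, dtype=np.float32):
    # PyCUDA imports are kept local: this part requires a GPU and the
    # CUDA driver, unlike the handle decoding above.
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
    import pycuda.driver as cuda

    with open(handle_path, "rb") as f:
        ipc_handle = decode_raw_handle(f.read())

    # Map the exporting process's device allocation into this process,
    # then copy the bytes back to host memory.
    mem = cuda.IPCMemoryHandle(ipc_handle)
    out = np.empty(shape, dtype=dtype)
    cuda.memcpy_dtoh(out, int(mem))
    return out
```

Calling `read_tensor("/shm-dir/address", (3, 3))` should then return the tensor written by the first process, provided that process is still alive (the `time.sleep(100)` in the writer keeps the allocation valid while the reader runs).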
