
Read back data written in cuda shared memory in python #816

Open
MaximeDebarbat opened this issue Dec 16, 2024 · 0 comments

I am developing a microservices app that relies on Triton Inference Server. I send data to Triton by writing it directly into a CUDA shared-memory region, but now I want to read it back in another process, exactly the way Triton retrieves it.
Here is an example of how I write my data:

import torch
from tritonclient.utils.cuda_shared_memory import create_shared_memory_region, set_shared_memory_region_from_dlpack, get_raw_handle
import time
import os


if __name__ == "__main__":
    shm_name = "tensor_shm"
    device_id = 0

    # Example tensor
    tensor = torch.randn((3, 3), dtype=torch.float32).cuda(device_id)
    byte_size = tensor.element_size() * tensor.numel()

    handle = create_shared_memory_region(shm_name, byte_size=byte_size, device_id=device_id)
    set_shared_memory_region_from_dlpack(handle, [tensor])

    ser = get_raw_handle(handle)

    with open(os.path.join("/shm-dir", "address"), "wb") as f:
        f.write(ser)
        
    time.sleep(100)

Is there an example anywhere, that I perhaps couldn't find, explaining how to read this handle back in another process? Thanks a lot!
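For reference, a minimal reader-side sketch, not an official example. It assumes `get_raw_handle()` returns the serialized `cudaIpcMemHandle` base64-encoded (as the `tritonclient` CUDA shared-memory utilities do) and uses PyCUDA's `IPCMemoryHandle` to map the exporting process's allocation; the shape and dtype must be communicated out of band, since the handle carries neither:

```python
import base64

import numpy as np


def decode_raw_handle(raw: bytes) -> bytes:
    # get_raw_handle() base64-encodes the serialized cudaIpcMemHandle,
    # so the reader must decode it before opening the IPC handle.
    return base64.b64decode(raw)


def read_tensor(handle_path: str, shape, dtype=np.float32):
    # PyCUDA imports are kept local: this part requires a GPU and the
    # CUDA driver, unlike the handle decoding above.
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
    import pycuda.driver as cuda

    with open(handle_path, "rb") as f:
        ipc_handle = decode_raw_handle(f.read())

    # Map the exporting process's device allocation into this process,
    # then copy the bytes back to host memory.
    mem = cuda.IPCMemoryHandle(ipc_handle)
    out = np.empty(shape, dtype=dtype)
    cuda.memcpy_dtoh(out, int(mem))
    return out
```

Calling `read_tensor("/shm-dir/address", (3, 3))` should then return the tensor written by the first process, provided that process is still alive (the `time.sleep(100)` in the writer keeps the allocation valid while the reader runs).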
