[Core] Mixing "async" with "blocking" methods in Actor changes threaded concurrency behaviour #49869

Open
baughmann opened this issue Jan 15, 2025 · 1 comment
Labels
core Issues that should be addressed in Ray Core P1 Issue that should be fixed within a few weeks question Just a question :)

Comments

baughmann commented Jan 15, 2025

What happened + What you expected to happen

I was playing around in a Jupyter notebook trying to learn Ray and came across some strange behaviour regarding threaded vs asyncio concurrency in Actors. Perhaps this behaviour is expected, but I didn't glean that from the docs pages.

The simple actor

@ray.remote
class MyTestActor:
    async def do_stuff_async(self) -> str: # an async non-blocking function that sleeps
        await asyncio.sleep(1)
        return "hello"
    
    def do_stuff_sync(self) -> str: # a synchronous blocking function that sleeps
        sleep(1)
        return "hello"

actor = MyTestActor.options(max_concurrency=3).remote()

The experiment and results

Running do_stuff_async three times in my notebook takes one second, just as one might expect:

futures = [actor.do_stuff_async.remote() for _ in range(3)]
ray.get(futures)

However, and bizarrely, running do_stuff_sync three times takes three seconds, meaning that the calls are not running concurrently:

futures = [actor.do_stuff_sync.remote() for _ in range(3)]
ray.get(futures) 

Even stranger behaviour upon experimenting

Obviously, the above behaviour is quite strange. What is even stranger, however, is that by simply commenting out the async method:

@ray.remote
class MyTestActor:
#    async def do_stuff_async(self) -> str:
#        await asyncio.sleep(1)
#        return "hello"
    
    def do_stuff_sync(self) -> str:
        sleep(1)
        return "hello"

actor = MyTestActor.options(max_concurrency=3).remote()

and re-running that notebook cell, then re-running the cell that calls do_stuff_sync, the invocations all completed in one second, meaning they ran concurrently as expected.

Attempted resolutions

I tried:

  1. Using a named concurrency group for both functions rather than the default
  2. Giving the actor two distinct concurrency groups: one for the async method and one for the synchronous one.

In the end, the only way to get the example actor working correctly was to split it into two actors: one with only the async method, and one with only the sync method.

Expected behaviour

As a user of Ray, I would expect that an actor could have both async and non-async functions, and that the concurrency group(s) apply correctly to all of them.

If this is not possible, I would expect a warning or an error: if not when the actor is created with default options, then at least when the default concurrency parameters are changed.
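
To make the requested warning concrete, here is a minimal sketch of a check that could be run on a class before it is decorated with @ray.remote. It is not part of Ray: the helper name `warn_if_mixed` and the warning text are hypothetical, and it only relies on the standard-library `inspect` and `warnings` modules. It treats every public function defined on the class as an actor method, which is an assumption.

```python
import inspect
import warnings


def warn_if_mixed(cls: type) -> bool:
    """Warn when a class defines both coroutine and regular public methods.

    Hypothetical helper for illustration only; Ray does not provide it.
    """
    methods = [
        fn
        for name, fn in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    ]
    has_async = any(inspect.iscoroutinefunction(fn) for fn in methods)
    has_sync = any(not inspect.iscoroutinefunction(fn) for fn in methods)
    if has_async and has_sync:
        warnings.warn(
            f"{cls.__name__} mixes async and sync methods; "
            "the sync methods may not run concurrently in threads.",
            stacklevel=2,
        )
        return True
    return False


class MyTestActor:  # plain class here; the check runs before any Ray decoration
    async def do_stuff_async(self) -> str:
        return "hello"

    def do_stuff_sync(self) -> str:
        return "hello"


# Emits a UserWarning because both kinds of method are present.
mixed = warn_if_mixed(MyTestActor)
```

A check like this only inspects the class definition, so it cannot tell whether max_concurrency was changed; that part of the expectation would have to live where the actor options are applied.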

Versions / Dependencies

Ray 2.40.0
Python 3.12.3

Reproduction script

import timeit
import ray
from time import sleep
import asyncio

@ray.remote
class MyTestActor:
    async def do_stuff_async(self) -> str:
        await asyncio.sleep(1)
        return "hello"
    
    def do_stuff_sync(self) -> str:
        sleep(1)
        return "hello"
    
actor = MyTestActor.options(max_concurrency=3).remote()

# execute and time the async function call
async_futures = [actor.do_stuff_async.remote() for _ in range(3)]
start = timeit.default_timer()
ray.get(async_futures)
end = timeit.default_timer()
async_taken = end - start

# execute and time the sync function call
sync_futures = [actor.do_stuff_sync.remote() for _ in range(3)]
start = timeit.default_timer()
ray.get(sync_futures)
end = timeit.default_timer()
sync_taken = end - start

# print time taken
print(f"async: {async_taken}s")
print(f"sync: {sync_taken}s")

Issue Severity

Medium: It is a significant difficulty but I can work around it.

EDIT:
I guess it is explained in the docs:

Instead, you can use the max_concurrency Actor options without any async methods, allowing you to achieve threaded concurrency (like a thread pool).

Though it could be worded a bit more prominently. Happy to take a stab at a PR for the docs if someone else thinks this is a valid issue with them.

@baughmann baughmann added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jan 15, 2025
@baughmann baughmann changed the title [<Ray component: Core] Mixing "async" with "blocking" methods in Actor changes threaded concurrency behaviour [Core] Mixing "async" with "blocking" methods in Actor changes threaded concurrency behaviour Jan 15, 2025
@jcotant1 jcotant1 added the core Issues that should be addressed in Ray Core label Jan 15, 2025
Collaborator

jjyao commented Jan 22, 2025

Hi @baughmann, what you describe is valid behavior. According to the docs:

In async actors, only one task can be running at any point in time (though tasks can be multi-plexed). There will be only one thread in AsyncActor! See [Threaded Actors](https://docs.ray.io/en/latest/ray-core/actors/async_api.html#threaded-actors) if you want a threadpool.
Setting concurrency in Async Actors
You can set the number of “concurrent” tasks running at once using the max_concurrency flag. By default, 1000 tasks can be running concurrently.

An async actor has ONLY one thread. Setting max_concurrency=3 won't create three threads for you; it only allows three tasks (coroutines) to run at the same time on that one thread.
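
The single-thread effect can be reproduced with plain asyncio, without Ray. The following is a minimal sketch that only models the scheduling, on the assumption that all methods of an async actor share one event loop thread; it is not Ray's implementation. The names `async_work`, `blocking_work`, and `run_three` and the 0.2 s delay are illustrative.

```python
import asyncio
import time

DELAY = 0.2  # illustrative; shorter than the 1 s used in the issue


async def async_work() -> str:
    # Yields to the event loop while waiting, so other coroutines can run.
    await asyncio.sleep(DELAY)
    return "hello"


async def blocking_work() -> str:
    # time.sleep never yields, so the single event loop thread is stuck here.
    time.sleep(DELAY)
    return "hello"


async def run_three(work) -> float:
    # Schedule three calls on one event loop and time how long they take.
    start = time.perf_counter()
    await asyncio.gather(*(work() for _ in range(3)))
    return time.perf_counter() - start


async_taken = asyncio.run(run_three(async_work))
blocking_taken = asyncio.run(run_three(blocking_work))
print(f"async:    {async_taken:.2f}s")     # roughly DELAY
print(f"blocking: {blocking_taken:.2f}s")  # roughly 3 * DELAY
```

The three non-blocking calls overlap because each one hands control back to the loop, while the three blocking calls run back to back, which matches the one-second versus three-second timings reported above.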

@jjyao jjyao added question Just a question :) P1 Issue that should be fixed within a few weeks and removed bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jan 22, 2025