-
Notifications
You must be signed in to change notification settings - Fork 333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bittensor/axon.py: thread and exception handling #2227
base: staging
Are you sure you want to change the base?
bittensor/axon.py: thread and exception handling #2227
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments for changes. Also it required ruff formatting.
But looks pretty good!
bittensor/axon.py
Outdated
_thread: threading.Thread = None | ||
_started: bool = False | ||
|
||
def set_exception(self, ex): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls use more descriptive variable name instead of ex
.
Also, pls add type annotation.
bittensor/axon.py
Outdated
with self._lock: | ||
return self._exception | ||
|
||
def set_thread(self, thread): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add type annotation.
bittensor/axon.py
Outdated
with self._lock: | ||
return self._thread | ||
|
||
def set_started(self, started): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add type annotation.
bittensor/axon.py
Outdated
@property | ||
def exception(self): | ||
# for future use: setting self._exception to signal an exception | ||
e = getattr(self,'_exception',None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls use more descriptive variable name instead of e
bittensor/axon.py
Outdated
""" | ||
thread = threading.Thread(target=self.run, daemon=True) | ||
thread.start() | ||
try: | ||
while not self.started: | ||
t0 = time.time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls use more descriptive variable name instead of t0
bittensor/axon.py
Outdated
# Our instantiator should be able to test axon.is_running() to see if all | ||
# required threads etc are running. | ||
def is_running(self): | ||
t = self.fast_server.get_thread() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls use more descriptive variable name instead of t
.
bittensor/axon.py
Outdated
return e | ||
return self.fast_server.get_exception() | ||
|
||
# Our instantiator should be able to test axon.is_running() to see if all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to keep it in docstring. Then it will help not only the developer who reads the code, but also the one who calls help(function/method/property)
bittensor/axon.py
Outdated
@@ -405,6 +458,24 @@ def info(self) -> "bittensor.AxonInfo": | |||
placeholder2=0, | |||
) | |||
|
|||
# Our instantiator should be able to test axon.exception to see if any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to keep it in docstring. Then it will help not only the developer who reads the code, but also the one who calls help(function/method/property)
7bb5ae9
to
25a7a47
Compare
Pushed fixes for first round of feedback, for now as a separate commit, plus some further tweaks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with these changes. Just rebase, and we'll merge it in.
Various issues were encountered trying to run and understand e2e tests: - if uvicorn fails to start, an uncaught exception is emitted to stderr - axon keeps spinning waiting for self.started, indefinitely - exceptions are not propagated from threads - there is no way to (simply) test from the outside whether an axon started and/or runs - axon creates a thread that only creates another thread, which seems redundant This patch addresses some of these issues, in FastAPIThreadedServer: - add thread safe set/get_exception() to set/get exceptions - run_in_thread() yields the created thread, so that the code using it can check whether the thread is alive - uvicorn.Server.startup() is wrapped to set a thread-safe flag using self.set_started(True) to indicate startup succeeded - run_in_thread() times out after one second to prevent infinite loop in case self.get_started() never becomes True - run_in_thread() raises an exception if it fails to start the thread - _wrapper_run() tests whether the thread is still alive and in class axon, the following are added: - @Property axon.exception(), returning any exception - axon.is_running(), returning True when the axon is operational The seemingly redundant thread is left in until feedback is received on the reasons for including it.
- ruff formatting - docstrings - variable names - type annotations - don't return ret = startup() which is annotated to return None - moved time sleep intervals to singular global, for clarity
…after pip install Although it seems common to have types.py (e.g. scalecodec, torch, substrateinterface) this may lead to issues after applying pip install -e to a package, as is suggested for bittensor (see git grep 'pip install -e.* bittensor') Issues were observed where circular imports would arise as Python's native typing.py would include bittensor's types.py.
825145e
to
1c73994
Compare
f13f5f3
to
73d9fd4
Compare
The merge-base changed after approval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed merge conflicts in #2459
Bug
Various issues were encountered trying to run and understand e2e tests:
self.started
, indefinitelyDescription of the Change
This patch addresses some of these issues, in
FastAPIThreadedServer
:set
/get_exception()
to set/get exceptionsrun_in_thread()
yields the created thread, so that the code using it can check whether the thread is aliveuvicorn.Server.startup()
is wrapped to set a thread-safe flag usingself.set_started(True)
to indicate startup succeededrun_in_thread()
times out after one second to prevent infinite loop in caseself.get_started()
never becomesTrue
run_in_thread()
raises an exception if it fails to start the thread_wrapper_run()
tests whether the thread is still aliveand in
class axon
, the following are added:@property axon.exception()
, returning any exceptionaxon.is_running()
, returningTrue
when the axon is operationalThe seemingly redundant thread is left in until feedback is received on the reasons for including it.
Alternate Designs
It is a deliberate choice to add status calls on
class axon
, and not expect the user instantiating an axon to look into implementation details such asaxon.fast_server
, even though they are not explicitly marked private by the naming convention used.Possible Drawbacks
Not expected. Perhaps issues that are currently buried will suddenly pop up. This is then intended and an expected effect of improving error handling.
Verification Process
These changes are part of an effort to improve e2e tests, and as such they are part of various other developments and tested along with other changes.
Release Notes
N/A