-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding timeout functionality to action nodes for ineffectual work functions #62
Conversation
Closes #45 |
573ab0b
to
dcefc01
Compare
d49a5ee
to
8df553d
Compare
8df553d
to
a2c1edb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a lot of questions, but maybe I'm misunderstanding things.
beams/sequencer/helpers/Timer.py
Outdated
self.is_periodic = is_periodic | ||
self.auto_start = auto_start | ||
if (self.auto_start): | ||
self.timer_start_time = time.time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and throughout we should use time.monotonic
, since we don't actually care about the wall clock.
volatile_status.set_value(py_trees.common.Status.RUNNING) | ||
# Start timer | ||
work_loop_timeout_timer.start_timer() | ||
while not completion_condition() and not work_loop_timeout_timer.is_elapsed(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only check the timer at the top of the while loop, if a work function never completes / hangs forever, this won't actually timeout will it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, at present this doesn't have a mechanism to send a kill signal to a hung function. Within Worker we can tie these functionalities together... it is a good, deeper idea. The intention here was if futile work was being done, i.e. the function is returning but having no effect on the success condition, to time out that work
beams/behavior_tree/ActionWorker.py
Outdated
work_loop_timeout_timer = Timer(name=name, | ||
timer_period_seconds=work_function_timeout_period_sec, | ||
auto_start=False, | ||
is_periodic=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the intent behind the periodic timer here? It looks like we just reset the start time if the elapsed time is greater than the timer period. But the way the timer is called, the loop will never repeat within one work wrapper call (while loop starts -> work loops a couple times -> timer elapses -> while loop broken (even though timer started again))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we want the (do_work) while loop, the one that launches this "thread" as a persistent process across the life cycle of the program to persist.
I was adding a timer to the actual "behavior tree load" this can be seen as, "I got initialized, I did my task for [loop_timer] amount of time, I never found my success condition to be true, let me signal fail and wait for the BT to try me again should it do so"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's no disagreement that there needs to be a timer
The question here is: why is the timer periodic?
After all, on line 63 we manually restart the timer anyway. What does periodicity do here?
Status.FAILURE, | ||
): | ||
time.sleep(0.01) | ||
action.tick_once() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this testing the right thing? The timeout is supposed to trigger within a tick right? Should the sleep be within the work function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test is correct, the sleep here is more like the tree's tick rate
if (self.is_periodic): | ||
self.timer_start_time = time.time() | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this periodic clause would be used for?
It's unintuitive that a timer would restart itself when checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great point. It was more out of habit from previously implemented timers that I found the pattern of "one shot" vs "periodic" timer distinction helpful. Here I am not sure if there is much (or may even be negative) value add. Let me reflect for a moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay on further reflection. You are 100% right that the timer within ActionWorker does not need to be periodic. I would however like to keep the notion of periodic timer in its current implementation for this use case:
A periodic timer when checked if elapsed (and it is) will emit a rising edge signal that is lowered thereafter which is convenient for breaking out of loops at fixed frequency, if not set as periodic the elapsed signal will stay high
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it I might advocate for adding this when we need it, if we later discover we need it.
Is the distinction between a stays-high signal and a periodic-break signal even relevant here in python land? We should design our "signals" in the way that works for our use case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pep8 says module names should be lowercase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you don't have to change it because we need to do a mass update later to bring the package into compliance
import time | ||
|
||
|
||
class Timer(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixed feelings about this class name because it collides with threading.Timer
from the standard library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to approve because the core seems fully correct to me and I'm quibbling on some secondary things
a2c1edb
to
331cd0f
Compare
beams/tests/test_timer.py
Outdated
class TestTimer(): | ||
def test_elapsed(self): | ||
t = Timer(name="test_elapsed", | ||
timer_period_seconds=0.1, | ||
is_periodic=False) | ||
t.start_timer() | ||
assert t.is_elapsed() is False | ||
time.sleep(0.5) | ||
assert t.is_elapsed() is True | ||
|
||
def test_timer_error(self): | ||
t = Timer(name="test_error_not_started", | ||
timer_period_seconds=0.1, | ||
is_periodic=False) | ||
with pytest.raises(RuntimeError): | ||
t.get_elapsed() | ||
with pytest.raises(RuntimeError): | ||
t.is_elapsed() | ||
|
||
def test_periodic(self): | ||
t = Timer(name="test_error_not_started", | ||
timer_period_seconds=0.1, | ||
auto_start=True, | ||
is_periodic=True) | ||
time.sleep(0.2) | ||
assert t.is_elapsed() is True | ||
assert t.is_elapsed() is False | ||
time.sleep(0.1) | ||
assert t.is_elapsed() is True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 layer Nitpick:
- This class structure appears unnecessary to me
- The assert statements are not pythonic.
is True
can be entirely omitted for truthiness checks andis False
should usenot
instead.
class TestTimer(): | |
def test_elapsed(self): | |
t = Timer(name="test_elapsed", | |
timer_period_seconds=0.1, | |
is_periodic=False) | |
t.start_timer() | |
assert t.is_elapsed() is False | |
time.sleep(0.5) | |
assert t.is_elapsed() is True | |
def test_timer_error(self): | |
t = Timer(name="test_error_not_started", | |
timer_period_seconds=0.1, | |
is_periodic=False) | |
with pytest.raises(RuntimeError): | |
t.get_elapsed() | |
with pytest.raises(RuntimeError): | |
t.is_elapsed() | |
def test_periodic(self): | |
t = Timer(name="test_error_not_started", | |
timer_period_seconds=0.1, | |
auto_start=True, | |
is_periodic=True) | |
time.sleep(0.2) | |
assert t.is_elapsed() is True | |
assert t.is_elapsed() is False | |
time.sleep(0.1) | |
assert t.is_elapsed() is True | |
def test_elapsed(self): | |
t = Timer(name="test_elapsed", | |
timer_period_seconds=0.1, | |
is_periodic=False) | |
t.start_timer() | |
assert not t.is_elapsed() | |
time.sleep(0.5) | |
assert t.is_elapsed() | |
def test_timer_error(self): | |
t = Timer(name="test_error_not_started", | |
timer_period_seconds=0.1, | |
is_periodic=False) | |
with pytest.raises(RuntimeError): | |
t.get_elapsed() | |
with pytest.raises(RuntimeError): | |
t.is_elapsed() | |
def test_periodic(self): | |
t = Timer(name="test_error_not_started", | |
timer_period_seconds=0.1, | |
auto_start=True, | |
is_periodic=True) | |
time.sleep(0.2) | |
assert t.is_elapsed() | |
assert not t.is_elapsed() | |
time.sleep(0.1) | |
assert t.is_elapsed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left my nit and approving again
331cd0f
to
f069edb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-approval post nit
I just went down a small rabbit hole, our settings are probably sufficient and err in the correct direciton |
… ActioNodeWorker, added functional timeout ability, tested
I think the diff visualization is broken, it includes most of the elements from the refactor PR. |
3a9badf
to
277f99c
Compare
Description
Want to allow work function of
ActionNode
'sActionWorker
to timeout.Specifically if the action performed in the work function is ineffectual. I.e. the work done in the work function never satisfies the completion condition. This does not guard against the case in which the work function itself hangs.
This motivated:
wrapped_action_work
to theActionWorker
, keeps file sizes and contents reasonable. This did lead to what may be considered a pythonic sin of a pass-through definition in ActionNode; however for generally cleanliness of code base I would strongly argue this is best.Timer
class. One can imagine this will be used elsewhere. This can also extend thetime.perf_counter
method to be a code timer across the codebaseMotivation and Context
We want our work functions to be able to timeout. Notably this does not address the very real issue #56, but it is critical our trees be intelligent enough to stop work on a meaningful time horizon and report failure should they be unable to execute their task . This does not address #66
How Has This Been Tested?
Only in unit tests
Where Has This Been Documented?
We should do that more. I will do a little here.
Our work wrapper's wrapper now takes an additional, optional
work_function_timeout_period_sec
BEAMS/beams/behavior_tree/ActionWorker.py
Line 32 in d49a5ee
We initialize the
Timer
object (within the spawned process), using this new parameterBEAMS/beams/behavior_tree/ActionWorker.py
Lines 51 to 54 in d49a5ee
We use the timer as another mechanism to exit the work loop:
BEAMS/beams/behavior_tree/ActionWorker.py
Line 64 in d49a5ee
The final completion condition check is now more than perfunctory
BEAMS/beams/behavior_tree/ActionWorker.py
Lines 78 to 84 in d49a5ee
Pre-merge checklist
docs/pre-release-notes.sh
and created a pre-release documentation page