-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Confusing or buggy hash implementation of instantaneous actions #567
Comments
Hi @fteicht, we implemented an BTW, the problem with the This is why the 2 actions are different, because the basic expressions are semantically equivalent but belong to a different environment (which is a copy of the original one, but it's not the same; indeed if you use simple I don't know if we want the |
Thanks @Framba-Luca for your explanation, it makes sense. IMHO the issue is that the environment is then part of the action semantics when you say that 2 equal actions are different just because their environments are. It means that 2 processes which host different environments, and which exchange the same action via pipes for instances, will consider that 2 actions are different. It is an issue when storing, for instance, a policy in one process and asking in the other process what is the action recommending by the policy in a given state (it is excatly the issue I am encountering in the scikit-decide backend of UP btw). Another view to this semantic issue is that the problem arises if we create the same semantic action twice, and then try to find it in a dictionary or set of actions. It is typically the case when you want to search in a dictionary : you create the key on the heap (an UP action in this case), then try to find it in the dictionary. It is actually how the bug appeared on my side, and I understand that we cannot store UP actions in sets or dictionaries, and then to search for them later on. Am I correct? |
Hi @Framba-Luca ! I have digged further into the issue and I now believe there is a bug in UP when computing the hash value and equality test of grounded actions with simulated effects. At the moment it is preventing the up-skdecide engine from working properly, especially with the scikit-decide's RL engine which allows UP problems to be solved by using Reinforcement Learning. Indeed, this engine relies on grounded actions. Below is a minimal code example that you can run and that shows that UP simulated action effects prevent grounded actions from being properly compared: from unified_planning.shortcuts import *
from unified_planning.environment import get_environment
from unified_planning.engines.compilers.grounder import GrounderHelper
from unified_planning.engines.sequential_simulator import UPSequentialSimulator
Location = UserType("Location")
Robot = UserType("Robot")
at = Fluent("at", Location, robot=Robot)
battery_charge = Fluent("battery_charge", IntType(0, 100), robot=Robot)
move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(GE(battery_charge(robot), 10))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)
def fun(problem, state, actual_params):
value = state.get_value(battery_charge(actual_params.get(robot))).constant_value()
return [Int(value - 10)]
# COMMENT IN THE FOLLOWING LINE AND THE LAST ASSERT WILL WORK
move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun))
l1 = Object("l1", Location)
l2 = Object("l2", Location)
r1 = Object("r1", Robot)
problem = Problem("robot_with_simulated_effects")
problem.add_fluent(at)
problem.add_fluent(battery_charge)
problem.add_action(move)
problem.add_object(l1)
problem.add_object(l2)
problem.add_object(r1)
problem.set_initial_value(at(r1), l1)
problem.set_initial_value(battery_charge(r1), 100)
simulator = UPSequentialSimulator(problem)
all_actions = set(
[(a[0], a[1]) for a in GrounderHelper(problem).get_grounded_actions()]
)
app_actions = set(
[a for a in simulator.get_applicable_actions(simulator.get_initial_state())]
)
# FOLLOWING ASSERT ALWAYS WORKS WITH OR WITHOUT SIMULATED EFFECT
assert app_actions.issubset(all_actions)
all_grounded_actions = set(
[a[2] for a in GrounderHelper(problem).get_grounded_actions()]
)
app_grounded_actions = set(
[
simulator._ground_action(a[0], a[1])
for a in simulator.get_applicable_actions(simulator.get_initial_state())
]
)
# FOLLOWING ASSERT ONLY WORKS WITHOUT SIMULATED EFFECT
assert app_grounded_actions.issubset(all_grounded_actions) If you comment in the line Do you confirm this bug, and if so, would it be please possible to correct it as soon as you can so that I can push working up-skdecide RL-based engine in theAIPlan4EU project? Thank you in advance for looking into this! |
I have created another issue #585 with a more accurate title since now I believe the bug is different from what I originally thought it would be. Feel free to close this issue and answer in the new one if you prefer. Thank you. |
I am closing this issue since it is now tracked in issue #585 |
Describe the bug
I don't know if it only concerns instantaneous actions but I noticed that the hashes of semantically equal actions are different (and similarly for the equality test). I would expect 2 semantically equal actions to have the same hash, but it seems that the hash is rather based on the object instance ID, which is weird (for instance, 2 equal ints or floats, even if created at different times, have the same hash and are equal...).
Is it really the intended the behavior?
To Reproduce
Expected behavior
We should have
hash(move) == hash(move_bis)
andmove == move_bis
The text was updated successfully, but these errors were encountered: