Comparing Crowd Response Prediction

This repository contains a comparison of various crowd motion models in terms of their ability to predict the response of the crowd to the planned motion of a robot, or controlled agent.

Description

Recent works have demonstrated that deep-learning-based trajectory predictors, specifically RNN-based models, can be conditioned on the known or planned path of a controlled agent in order to improve prediction accuracy for other nearby agents [1][2][3].

These works claim that this conditioning allows the model to learn the likely response of an agent to a robot's planned action. Applying the model as a state transition function of the form S' = P(S, a), where the environment's next state S' is predicted from the current state S and the next action a, would allow simulation of 'hypothetical rollouts' to compare potential actions during path planning in a model-based predictive control approach.
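A minimal sketch of how a transition function of the form S' = P(S, a) could drive hypothetical rollouts. Everything here is illustrative rather than taken from this repository: the state layout (robot at index 0, pedestrians after it), the `rollout` and `min_clearance` helpers, and `toy_predict`, which stands in for a learned or hand-crafted model and simply keeps pedestrians stationary.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
State = List[Point]   # assumed layout: index 0 is the robot, the rest are pedestrians
Action = Point        # assumed action: the robot's next planned position


def rollout(predict: Callable[[State, Action], State],
            state: State,
            actions: Sequence[Action]) -> List[State]:
    """Apply S' = P(S, a) repeatedly along one candidate action sequence."""
    states = [state]
    for action in actions:
        state = predict(state, action)
        states.append(state)
    return states


def min_clearance(states: Sequence[State]) -> float:
    """Toy cost: smallest robot-to-pedestrian distance seen during a rollout."""
    return min(math.dist(s[0], p) for s in states for p in s[1:])


def toy_predict(state: State, action: Action) -> State:
    """Placeholder model: the robot moves to `action`, pedestrians do not react."""
    return [action] + list(state[1:])


start = [(0.0, 0.0), (2.0, 0.0)]
candidates = {
    "straight": [(1.0, 0.0), (2.0, 0.0)],
    "detour": [(1.0, 1.0), (2.0, 2.0)],
}
# Pick the candidate plan that keeps the largest clearance from pedestrians.
best = max(candidates,
           key=lambda name: min_clearance(rollout(toy_predict, start, candidates[name])))
```

With a real predictor substituted for `toy_predict`, the same loop would compare candidate plans by the predicted crowd response rather than by a static scene.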

However, a comprehensive analysis of the ability of these proposed predictive models to accurately learn the response of an agent to a robot's planned action, and so to serve as state transition functions, has not yet been undertaken. Similarly, whilst many deep RL based approaches to crowd navigation make use of traditional pedestrian motion models such as ORCA [4][5][6][7], these motion models are also yet to be evaluated in terms of their ability to be used as state transition functions.

This work compares two traditional and two deep learning based approaches on their ability to effectively predict the response of nearby agents to the future motion of a controlled agent.

This section compares the use of both the ground truth future of the controlled agent and a planned future. The planned future is based only on the intended goal of the agent. It aims to remove any dependency that the robot's future might have on the non-controlled agents' futures, and so any possible information leakage. By comparing the accuracy of predictions when conditioned on the known future or intended goal of a controlled agent against the same models when no future is known, it is possible to determine to what degree the models are effectively learning the response, and so to validate their use as state transition models during path planning.
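The text does not specify how the planned future is generated from the goal. One simple possibility, shown purely as an assumption, is a constant-speed straight line from the current position to the goal. The function name `straight_line_plan` and its parameters are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def straight_line_plan(start: Point, goal: Point, steps: int) -> List[Point]:
    """Assumed planner: evenly spaced waypoints from `start` to `goal`.

    The plan depends only on the start and goal, so it cannot leak any
    information about how other agents actually moved.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    dx, dy = goal[0] - start[0], goal[1] - start[1]
    return [(start[0] + dx * k / steps, start[1] + dy * k / steps)
            for k in range(1, steps + 1)]


plan = straight_line_plan((0.0, 0.0), (4.0, 2.0), steps=4)
```

The last waypoint coincides with the goal, and no waypoint uses the ground truth trajectory of any agent.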

Comparisons

Compared Methods

SRLSTM [8]
GRNN [9]
ORCA [10]
Social Force Model (SFM) [11]

Methodology

Each compared model has been tested in terms of the predictive error in three different ways:

1. Not conditioned: No future information about the controlled agent is known.
2. Conditioned - Whole path: The ground truth future position of the controlled agent is known for the entire predictive period.
3. Conditioned - Goal only: The ground truth future position of the controlled agent is known only for the final predictive timestep.
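The three settings above differ only in how much of the controlled agent's future is visible to the model. A minimal sketch follows, assuming the future is a list of positions and that hidden timesteps are marked with `None`. How each compared model actually consumes this information is not described here, and the function and mode names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def conditioning_input(future: List[Point], mode: str) -> List[Optional[Point]]:
    """Return the part of the controlled agent's future exposed to the model."""
    if mode == "not_conditioned":
        # No future information is exposed.
        return [None] * len(future)
    if mode == "whole_path":
        # Every future position is exposed.
        return list(future)
    if mode == "goal_only":
        # Only the position at the final predictive timestep is exposed.
        if not future:
            return []
        return [None] * (len(future) - 1) + [future[-1]]
    raise ValueError(f"unknown mode: {mode}")


future = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
visible = {m: conditioning_input(future, m)
           for m in ("not_conditioned", "whole_path", "goal_only")}
```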

Results

The results illustrated below demonstrate that only the RNN based approaches are able to learn any response of a crowd to a controlled agent's planned path, with the traditional ORCA and SFM based approaches showing no significant improvement in prediction accuracy when supplied even with the ground truth future motion of the controlled agent.

As expected, both SRLSTM and GRNN perform better in method 2 (Conditioned - Whole path) than in method 3 (Conditioned - Goal only). In both methods, however, they remain significantly more accurate than in method 1 (Not conditioned). Additionally, even without any knowledge of the controlled agent's goal, the learnt RNN models outperform both traditional models in all metrics, except for the final error of ORCA at a very close proximity of just 1 m from the controlled agent.

The figure below illustrates how the accuracy improvement from conditioning on a known future path increases at more distant timesteps for SRLSTM. A comparison of the results from methods 1 and 2 on SRLSTM shows that, at the closest proximity of 1 m to the controlled agent, a prediction at timestep 10 (4.0 s) conditioned on a known path achieves a similar accuracy to a non-conditioned prediction two timesteps earlier, at 3.2 s. The accuracy improvement as a percentage of the total non-conditioned error at a proximity of 1 m increases approximately linearly with each timestep, from a negligible improvement at the first timestep (0.32%) to a 27.5% error improvement at the final timestep.
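The percentage improvement described above can be computed per timestep as 100 × (non-conditioned error − conditioned error) / non-conditioned error. The sketch below assumes the error is the Euclidean distance between predicted and ground truth positions at each timestep. The helper names and the toy trajectories are illustrative only and are not results from this comparison.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def displacement_errors(pred: Sequence[Point], truth: Sequence[Point]) -> List[float]:
    """Euclidean error at each predicted timestep."""
    if len(pred) != len(truth):
        raise ValueError("prediction and ground truth must have equal length")
    return [math.dist(p, t) for p, t in zip(pred, truth)]


def improvement_percent(err_not_conditioned: Sequence[float],
                        err_conditioned: Sequence[float]) -> List[float]:
    """Per-timestep improvement as a percentage of the non-conditioned error."""
    return [100.0 * (u - c) / u if u > 0 else 0.0
            for u, c in zip(err_not_conditioned, err_conditioned)]


# Toy values chosen for easy arithmetic, not measured data.
truth = [(0.0, 0.0), (1.0, 0.0)]
pred_not_conditioned = [(0.0, 1.0), (1.0, 2.0)]
pred_conditioned = [(0.0, 0.5), (1.0, 1.0)]

gain = improvement_percent(
    displacement_errors(pred_not_conditioned, truth),
    displacement_errors(pred_conditioned, truth),
)
```

The last entry of each error list corresponds to the final error, and averaging the list gives a mean error over the predictive period.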