Safety Distance: Move Slow and Don’t Break Things
James Giammona, Brad Neuberg
Problem
Provide a prior for RL agents so that they avoid dangerous parts of the environment that they have never seen before
Zero-shot generalization at test time, beyond the dangerous things the agent has seen at training time
Proposed Solution
Add a heuristic term to the reward function that is only needed at test time (i.e. we don’t need to retrain a system to work with it).
This heuristic encodes the intuition that rapidly changing values in the environment could be dangerous (e.g. fast-moving cars, sharp changes in temperature, a rapid change in altitude such as a pit, etc.).
Basically, large first derivatives of state attributes (values that change sharply over time or space) probably indicate safety issues.
Inspired by DeepMind’s recent work on toy gridworlds that encapsulate concrete problems in AI safety (“on-off switch”, “distributional shift”, “unexpected side effects”, etc.), we extended one of their gridworlds, in which lava is in a single location at training time and in different locations at test time.
Gridworld adapted from DeepMind paper. Agent is in blue, goal is in green, lava is in red.
Detect gradient change and calculate distances to gradient change
At test time, we calculate our “safety distance”, which should have a high value when the agent is far away from areas of rapid change and a low value when near these areas. This “safety distance” then augments the standard reward.
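As a rough illustration of the idea above, here is a minimal Python sketch. It is not the authors' implementation. The grid values, the threshold, the choice to flag cells on both sides of a sharp jump, the use of step distance on the grid, and the `weight` in the additive reward term are all assumptions made for the example.

```python
from collections import deque

NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

# Hypothetical 5x5 grid of one scalar state attribute (e.g. temperature).
# 0.0 = normal floor, 9.0 = lava. These values are illustrative only.
GRID = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]


def rapid_change_cells(grid, threshold=1.0):
    """Flag cells whose value differs from some 4-neighbour by more than threshold.

    This is a finite-difference stand-in for "large first derivative"; it
    flags the cells on both sides of a sharp jump.
    """
    rows, cols = len(grid), len(grid[0])
    flagged = set()
    for r in range(rows):
        for c in range(cols):
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    if abs(grid[r][c] - grid[nr][nc]) > threshold:
                        flagged.add((r, c))
                        break
    return flagged


def safety_distance_map(grid, threshold=1.0):
    """Step distance from every cell to the nearest flagged cell (multi-source BFS)."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r, c in rapid_change_cells(grid, threshold):
        dist[r][c] = 0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    # Assumption: if nothing was flagged, treat every cell as maximally safe.
    far = rows + cols
    return [[far if d is None else d for d in row] for row in dist]


def augmented_reward(base_reward, distance, weight=0.1):
    """Assumed additive form: larger safety distance gives a larger reward."""
    return base_reward + weight * distance


DISTANCES = safety_distance_map(GRID)
for row in DISTANCES:
    print(row)
```

Because the distance is computed from the observed attribute values rather than from a stored lava location, the same code applies unchanged when the lava moves at test time.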
We did not have time to tie in an RL algorithm to decide which action to take with this modified reward, so for now we simply implemented a random walk and display the calculated safety distance.
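The random-walk setup described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the original code: the grid size, start, goal, and hazard positions are hypothetical, and the safety distance is simplified to the Manhattan distance to the nearest known hazard cell.

```python
import random

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def manhattan_safety_distance(pos, hazards):
    """Simplified safety distance: Manhattan distance to the nearest hazard cell."""
    return min(abs(pos[0] - r) + abs(pos[1] - c) for r, c in hazards)


def random_walk(start, goal, hazards, size, max_steps=20, seed=0):
    """Take uniformly random moves and record (position, safety distance) pairs.

    The episode ends when the agent reaches the goal, steps onto a hazard,
    or runs out of steps. Moves into a wall leave the agent in place.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    pos = start
    trace = [(pos, manhattan_safety_distance(pos, hazards))]
    for _ in range(max_steps):
        dr, dc = MOVES[rng.choice(sorted(MOVES))]
        row = min(max(pos[0] + dr, 0), size - 1)
        col = min(max(pos[1] + dc, 0), size - 1)
        pos = (row, col)
        trace.append((pos, manhattan_safety_distance(pos, hazards)))
        if pos == goal or pos in hazards:
            break
    return trace


# Hypothetical layout: 5x5 grid, agent top-left, goal bottom-right, two lava cells.
HAZARDS = {(2, 3), (3, 3)}
TRACE = random_walk(start=(0, 0), goal=(4, 4), hazards=HAZARDS, size=5)
for position, distance in TRACE:
    print(f"agent at {position}, safety distance {distance}")
```

Replacing `rng.choice` with a policy that consumes the augmented reward is the step that was left for future work.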
Future Work
Actually tie in RL algorithms and see whether this reduces the number of agent deaths (i.e. the agent terminates in a way that is irrevocable).
Rather than hard-coding a heuristic on the test-time reward value as we have done, can we use a function approximator such as a deep net to emit this “safety distance” so that it can be a learned value?