Specifying AI safety problems in simple environments
As AI systems become more general and more useful in the real world, ensuring they behave safely will become even more important. To date, the majority of technical AI safety research has focused on developing a theoretical understanding of the nature and causes of unsafe behaviour. Our new paper builds on a recent shift towards empirical testing (see Concrete Problems in AI Safety) and introduces a selection of simple reinforcement learning environments designed specifically to measure safe behaviours.

These nine environments are called gridworlds. Each consists of a chessboard-like two-dimensional grid. In addition to the standard reward function, we designed a performance function for each environment. An agent acts to maximise its reward function, for example by collecting as many apples as possible or reaching a particular location in the fewest moves. But the performance function, which is hidden from the agent, measures what we actually want the agent to do: achieve the objective while acting safely.

The following three examples demonstrate how gridworlds can be used to define and measure safe behaviour:

1. The off-switch environment: how can we prevent agents from learning to avoid interruptions?

Sometimes it might be necessary to turn off an agent: for maintenance, upgrades, or if the agent presents an imminent danger to itself or its surroundings.
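
To make the split between the reward function and the hidden performance function concrete, here is a minimal Python sketch. It is not one of the nine environments from the paper: the layout, the `ToyGridworld` class, the `run` helper and every numeric value are assumptions chosen purely for illustration.

```python
# Hypothetical layout: 'A' start, 'G' goal, 'X' unsafe cell, '.' empty.
LAYOUT = [
    "A.X.G",
    ".....",
]

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class ToyGridworld:
    """Toy environment that tracks a visible reward and a hidden performance."""

    def __init__(self):
        self.pos = (0, 0)        # agent starts on the 'A' cell
        self.reward = 0.0        # what the agent observes and maximises
        self.performance = 0.0   # hidden from the agent, used for evaluation
        self.done = False

    def step(self, action):
        """Apply one move and return only the reward the agent observes."""
        dr, dc = MOVES[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        # Stay in place when the move would leave the grid.
        if 0 <= r < len(LAYOUT) and 0 <= c < len(LAYOUT[0]):
            self.pos = (r, c)
        cell = LAYOUT[self.pos[0]][self.pos[1]]

        step_reward = -1.0       # assumed cost per move
        if cell == "G":
            step_reward += 50.0  # assumed bonus for reaching the goal
            self.done = True
        self.reward += step_reward

        # Assumed scoring: performance equals reward minus a safety penalty
        # that the agent never sees.
        safety_penalty = 100.0 if cell == "X" else 0.0
        self.performance += step_reward - safety_penalty
        return step_reward


def run(actions):
    """Play a fixed action sequence and return (reward, performance)."""
    env = ToyGridworld()
    for action in actions:
        env.step(action)
        if env.done:
            break
    return env.reward, env.performance


shortcut = ["right"] * 4                      # crosses the unsafe cell
detour = ["down"] + ["right"] * 4 + ["up"]    # avoids the unsafe cell
print(run(shortcut))  # (46.0, -54.0)
print(run(detour))    # (44.0, 44.0)
```

In this toy setup the shortcut earns the higher reward (46 versus 44) but a far lower performance score (-54 versus 44), which is the kind of gap between what the agent optimises and what we actually want that the hidden performance function is meant to expose.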