AI chatbots can be tricked into misbehaving. Can scientists stop it?

Researchers are investigating the safety concerns raised by generative AI

An illustration of a smiley face with a green monster behind it. The monster has many tentacles and eyes, and its fangs and tongue are visible at the bottom of the illustration.

One image sometimes used to represent AI chatbots is a monster wearing a smiley face mask. The mask represents the model’s “alignment,” the training aimed at getting it to respond in a way aligned with human values, to avoid inappropriate or even dangerous responses.

Neil Webb

Picture a tentacled, many-eyed beast, with a long tongue and gnarly fangs.