A New Trick Uses AI to Jailbreak AI Models—Including GPT-4

Adversarial algorithms can systematically probe large language models like OpenAI’s GPT-4 for weaknesses that can make them misbehave.