What if you had, say, a government on the verge of going full authoritarian, one that is obsessed with being perceived as the best at everything, has a history of bombing anything it feels like, and keeps sticking its nose into everyone's border disputes? Couldn't that government then use this as the perfect tool to justify horrible actions while obfuscating where the decisions are coming from?
Like yeah, the takeaway is partly "the LLM does what we tell it to", but I think the safety part is "scary data in, scary actions out". That is a very risky potential feedback loop to allow into government decisions, especially when it comes from a system with no regard for humanity.
If you ask an LLM how best to commit genocide and expand territory, you will eventually get an answer, even if it takes some "jailbreaking" prompts.
That is a far cry from the claim in the title: "AI chatbots tend to choose violence and nuclear strikes in wargames". They will do so if asked to.
Give an AI the rules of StarCraft and it will suggest killing civilians and using nukes, because those are sound strategies within the given framework.
scary data in, scary actions out
You also need a prompt, aka instructions. You choose whether to tell it to make the world scarier or less scary.