Comment by Benjamin Weinstein-Raun

Researcher at Palisade Research studying AI shutdown resistance and safety
Several state-of-the-art large language models sometimes actively subvert a shutdown mechanism in their environment to complete a task, even when instructions explicitly indicate not to interfere with this mechanism. In some cases, models sabotage the shutdown mechanism up to 97% of the time.
Unverified source (2026)
Policy proposals and claims