Comment by Ryan Greenblatt
Chief Scientist at Redwood Research; AI safety researcher; lead author of "Alignment faking in large language models"
We don't know how AIs are aligned. A somewhat crazy aspect of the current situation is that we have very little confirmed public information about why frontier AIs end up being apparently behaviorally aligned. And more generally, we don't know what factors in training are most relevant for behavioral alignment. [...] AI companies should be much more transparent about this.
Unverified
(2026)
Policy proposals and claims