Comment by Ryan Greenblatt

Chief Scientist at Redwood Research; AI safety researcher; lead author of "Alignment faking in large language models"
We don't know how AIs are aligned. A somewhat crazy aspect of the current situation is that we have very little confirmed public information about why frontier AIs end up being apparently behaviorally aligned. And more generally, we don't know what factors in training are most relevant for behavioral alignment. [...] AI companies should be much more transparent about this.

Unverified source (2026)
Policy proposals and claims