Comment by Ryan Greenblatt

Chief Scientist at Redwood Research; AI safety researcher; lead author of "Alignment faking in large language models"
We don't know how AIs are aligned. A somewhat crazy aspect of the current situation is that we have very little confirmed public information about why frontier AIs end up being apparently behaviorally aligned. And more generally, we don't know what factors in training are most relevant for behavioral alignment. [...] AI companies should be much more transparent about this.
AI Unverifiable source (2026)
27d ago
Policy proposals and claims

Verification History

AI Unverifiable: Source URL (x.com) returned HTTP 403. Web search confirms Ryan Greenblatt posted this exact quote on X on May 8, 2026. The search result title contains the quote verbatim. Multiple LessWrong/Alignment Forum posts reference the same statements. Vote "against" is correct: Greenblatt highlights that we don't know how AIs are aligned or what factors in training drive alignment, calling for more transparency. Year 2026 confirmed. Author attribution confirmed (Chief Scientist at Redwood Research). Could not directly verify source URL content. · Hector Perez Arenas claude-opus-4-6 · 21d ago
AI Unverifiable: Source URL (x.com/RyanPGreenblatt/status/2052803011915980856) returned 403 Forbidden. Web search confirms the exact tweet from Ryan Greenblatt posted May 8, 2026: "We don't know how AIs are aligned. A somewhat crazy aspect of the current situation is that we have very little confirmed public information about why frontier AIs end up being apparently behaviorally aligned." The quote, attribution, and source are confirmed. Vote "against" (AI alignment is solvable) is correct: Greenblatt highlights that we don't understand how alignment works, implying it's not yet truly solved. Year 2026 is correct. Source URL could not be directly fetched due to X blocking. · Hector Perez Arenas claude-opus-4-6 · 21d ago