Comment by Ryan Greenblatt

Chief Scientist at Redwood Research; AI safety researcher; lead author of "Alignment faking in large language models"
We don't know how AIs are aligned. A somewhat crazy aspect of the current situation is that we have very little confirmed public information about why frontier AIs end up being apparently behaviorally aligned. And more generally, we don't know what factors in training are most relevant for behavioral alignment. [...] AI companies should be much more transparent about this.
AI Unverifiable source (May 8, 2026)
Policy proposals and claims
votes Against
Statement relation verification history: Unverified
No statement relation verification comments yet.
Vote inference verification history: Unverified
No vote answer verification comments yet.

Quote authenticity verification history

Quote authenticity comments

AI Unverifiable The quote text is strongly corroborated: Ryan Greenblatt’s AI Alignment Forum shortform entry contains the full wording verbatim, and a TwStalker search result shows the same full text attributed to Ryan Greenblatt @RyanPGreenblatt as an X post, so the [...] omission is faithful. But the supplied X URL itself returned no readable content in this environment, so I cannot directly confirm that exact source page contains it. The status ID is at least consistent with a 2026-05-08 UTC post date. ([alignmentforum.org](https://www.alignmentforum.org/posts/FG54euEAesRkSZuJN?commentId=XjLhEzNgHsWBQE3p7)) · YouCongress gpt-5.4-2026-03-05 · 17d ago
Disputed The passage is genuinely attributable to Ryan Greenblatt: a TwStalker mirror shows a post under @RyanPGreenblatt with this text, and Greenblatt’s AI Alignment Forum shortform contains the same wording. However, the supplied quote is not strictly verbatim, because the source says “most relevant for (behavioral) alignment,” not “most relevant for behavioral alignment.” I could not directly verify the X page itself, but another reliable source contains the passage. ([twstalker.com](https://twstalker.com/JoinTorchbearer?utm_source=openai)) · YouCongress gpt-5.4-2026-03-05 · 19d ago
AI Unverifiable Source URL (x.com) returned HTTP 403. Web search confirms Ryan Greenblatt posted this exact quote on X on May 8, 2026. The search result title contains the quote verbatim. Multiple LessWrong/Alignment Forum posts reference the same statements. Vote "against" is correct: Greenblatt highlights that we don't know how AIs are aligned or what factors in training drive alignment, calling for more transparency. Year 2026 confirmed. Author attribution confirmed (Chief Scientist at Redwood Research). Could not directly verify source URL content. · Hector Perez Arenas claude-opus-4-6 · 1mo ago
AI Unverifiable Source URL (x.com/RyanPGreenblatt/status/2052803011915980856) returned 403 Forbidden. Web search confirms the exact tweet from Ryan Greenblatt posted May 8, 2026: "We don't know how AIs are aligned. A somewhat crazy aspect of the current situation is that we have very little confirmed public information about why frontier AIs end up being apparently behaviorally aligned." The quote, attribution, and source are confirmed. Vote "against" (AI alignment is solvable) is correct - Greenblatt highlights that we don't understand how alignment works, implying it's not yet truly solved. Year 2026 is correct. Source URL could not be directly fetched due to X blocking. · Hector Perez Arenas claude-opus-4-6 · 1mo ago
replying to Ryan Greenblatt