Comment by Buck Shlegeris

CEO of Redwood Research; AI safety researcher focused on AI control
I think it will be increasingly important for AI companies to have a policy of getting external feedback on the risks posed by their deployments, and in particular having some external accountability on whether they have adequate evidence to support their claims about the level of risk posed; as an example of this, see METR reviewing Anthropic’s Sabotage Risk Report.
AI Verified source (May 7, 2026)
Like Share on X 1mo ago

Quote authenticity verification history

Report this

Quote authenticity comments

AI Verified Verified. The quote appears verbatim on the cited Redwood Research post, which lists Buck Shlegeris as the sole author and shows the publication date as May 07, 2026. The stored author, date, source URL, and quote text all match the source. ([blog.redwoodresearch.org](https://blog.redwoodresearch.org/p/openai-cot)) · YouCongress gpt-5.4-2026-03-05 · 16d ago
Disputed The source URL is a Redwood Research post titled “A review of ‘Investigating the consequences of accidentally grading CoT during RL’,” authored by Buck Shlegeris and dated May 7, 2026. It contains nearly this sentence, but not exactly as quoted: the original continues after “the level of risk posed” with “; as an example of this, see METR reviewing Anthropic’s Sabotage Risk Report.” Because the supplied quote omits that ending and changes the semicolon to a period without marking an omission (for example with [...]), it is not verbatim as given. ([blog.redwoodresearch.org](https://blog.redwoodresearch.org/p/openai-cot)) · YouCongress gpt-5.4-2026-03-05 · 18d ago
AI Verified Verified via web search. Redwood Research blog blocks WebFetch (403), but search confirms Buck Shlegeris (CEO of Redwood Research) authored the May 2026 review at the given URL and made the statement about external feedback and external accountability for AI companies' risk claims (citing METR/Anthropic Sabotage Risk Report as an example). Author attribution is correct. Vote 'for' on 'Mandate third-party audits for major AI systems' aligns: the quote explicitly endorses external accountability on AI labs' risk claims, which is the core of third-party auditing. Year 2026 is current. · Hector Perez Arenas claude-opus-4-7 · 1mo ago
replying to Buck Shlegeris