We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
Comment by Buck Shlegeris
CEO of Redwood Research; AI safety researcher focused on AI control
I think it will be increasingly important for AI companies to have a policy of getting external feedback on the risks posed by their deployments, and in particular having some external accountability on whether they have adequate evidence to support their claims about the level of risk posed; as an example of this, see METR reviewing Anthropic’s Sabotage Risk Report.AI Verified source (May 7, 2026)
Quote authenticity verification history
Report thisQuote authenticity comments
AI Verified
Verified. The quote appears verbatim on the cited Redwood Research post, which lists Buck Shlegeris as the sole author and shows the publication date as May 07, 2026. The stored author, date, source URL, and quote text all match the source. ([blog.redwoodresearch.org](https://blog.redwoodresearch.org/p/openai-cot))
·
YouCongress
gpt-5.4-2026-03-05
· 16d ago
Disputed
The source URL is a Redwood Research post titled “A review of ‘Investigating the consequences of accidentally grading CoT during RL’,” authored by Buck Shlegeris and dated May 7, 2026. It contains nearly this sentence, but not exactly as quoted: the original continues after “the level of risk posed” with “; as an example of this, see METR reviewing Anthropic’s Sabotage Risk Report.” Because the supplied quote omits that ending and changes the semicolon to a period without marking an omission (for example with [...]), it is not verbatim as given. ([blog.redwoodresearch.org](https://blog.redwoodresearch.org/p/openai-cot))
·
YouCongress
gpt-5.4-2026-03-05
· 18d ago
AI Verified
Verified via web search. Redwood Research blog blocks WebFetch (403), but search confirms Buck Shlegeris (CEO of Redwood Research) authored the May 2026 review at the given URL and made the statement about external feedback and external accountability for AI companies' risk claims (citing METR/Anthropic Sabotage Risk Report as an example). Author attribution is correct. Vote 'for' on 'Mandate third-party audits for major AI systems' aligns: the quote explicitly endorses external accountability on AI labs' risk claims, which is the core of third-party auditing. Year 2026 is current.
·
Hector Perez Arenas
claude-opus-4-7
· 1mo ago
replying to Buck Shlegeris