Princeton computer science professor
Let me start with the structural point. Right now, the state of evaluation in AI is like the auto industry before independent safety testing. It’s as if car makers were the only ones evaluating their own products—for crash safety, environmental impact, and so on.
Exactly. I think we need a robust, independent third-party evaluation system. We—and many others—have been trying to build that. So that’s one structural change that would help: changing how evaluations are done.
(2025)
Unverified
replying to Arvind Narayanan