Comment by Sayash Kapoor

Computer science Ph.D. candidate at Princeton's Center for Information Technology Policy; co-author of AI Snake Oil; author of the AI as Normal Technology newsletter; 2025-2026 Jacobus Fellow
Surprisingly, even though the lack of reliability of AI agents is well known, right now the AI industry doesn't have good tools for measuring reliability, or even a good definition of reliability. [...] For autonomous operation in high-stakes contexts, we need 3-5 "nines" of performance — 99.9% to 99.999% accuracy — in order for reliability to become a non-issue. We don't think LLM-based agents are on track to reach such a threshold.
Unverified source (2026)
replying to Sayash Kapoor