Anthropic’s May 28 release of Claude Opus 4.8, featuring adaptive reasoning and max-effort modes, propelled the model to a leading 45.7% on Humanity’s Last Exam, a 2,500-question expert benchmark that spans math, physics, biology, and humanities and is designed to resist saturation. That score edges out Gemini 3.1 Pro Preview at 44.7% and recent GPT-5.5 variants near 44.3%, and it builds on earlier Claude Opus 4.7 and 4.6 thinking-mode results in the mid-30s. With the June 30 resolution deadline weeks away, traders are monitoring potential new Claude iterations, inference optimizations, or leaderboard updates that could extend the narrow lead before the market closes. Human experts average near 90%, underscoring the benchmark’s difficulty for frontier large language models.
Experimental AI summary based on Polymarket data. This is not trading advice and does not affect how this market resolves.
$289,166 volume
45% or above: 25%
50% or above: 15%
55% or above: 8%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Frequently Asked Questions