Frontier AI labs continue releasing iterative model updates that boost performance on live coding arenas, where blind user votes determine Elo-style ratings across tasks like code generation and debugging. Anthropic’s Claude Opus 4 series currently leads most public coding leaderboards with strong results on agentic benchmarks such as SWE-bench Verified, reflecting gains in long-context reasoning and tool use. OpenAI’s GPT-5 variants and Google’s Gemini 3.1 Pro remain close competitors, while newer entrants show rapid catch-up on fresh problems. With June 30 just weeks away, any new high-effort releases, fine-tunes, or capability demonstrations from these labs represent the main near-term catalysts that could shift arena scores before the deadline.
Experimental AI summary based on Polymarket data. This is not trading advice and does not affect how this market is resolved.
1550: 68%
1560: 48%
1570: 17%
$8,622 volume
Results from the "Score" column under the "Text Arena | Coding" Leaderboard tab at https://arena.ai/leaderboard/text/coding-no-style-control with style control off will be used to resolve this market.
The resolution source for this market is the Chatbot Arena LLM Leaderboard found at arena.ai/leaderboard/text. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and will resolve based on the first check after it becomes available. If permanently unavailable, this market will resolve to "No".
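The availability rules above can be sketched as a small decision function. This is only an illustrative sketch: the names `resolve`, `Outcome`, `top_score`, and `threshold` are hypothetical, and the `>=` comparison against a score threshold (for example 1550) is an assumption, since the exact market question is not stated in the rules text here.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    YES = "Yes"
    NO = "No"
    PENDING = "Pending"  # market remains open until the leaderboard is back


def resolve(
    top_score: Optional[float],
    threshold: float,
    permanently_unavailable: bool = False,
) -> Outcome:
    """Apply the stated availability rules to one leaderboard check.

    top_score is the value read from the "Score" column, or None when the
    leaderboard could not be reached at check time.
    """
    # Rule from the text: a permanently unavailable source resolves to "No".
    if permanently_unavailable:
        return Outcome.NO
    # Rule from the text: a temporary outage keeps the market open.
    if top_score is None:
        return Outcome.PENDING
    # ASSUMPTION: the market asks whether the score reaches the threshold.
    return Outcome.YES if top_score >= threshold else Outcome.NO
```

For instance, `resolve(None, 1550)` models a check during an outage: the market stays pending and is resolved on the first check after the leaderboard returns.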
Market opened: Apr 2, 2026, 6:09 PM ET
Resolver
0x65070BE91...