AI Toolkit
📊 Updated Weekly

Model Watch

Weekly LLM rankings, capability comparisons, and benchmarks. Track updates to GPT, Claude, Gemini, and domestic (Chinese) models.

📅 5 models
Updated 2026-06-24

On June 23, users noticed that ChatGPT had quietly launched Bidi 1, a new two-way voice model. It appears in the model selector alongside the standard and advanced voice modes. Its headline feature is real-time 'listen while speaking': users can interrupt mid-response with new instructions, which marks a shift from turn-taking to natural, simultaneous conversation. There has been no official announcement yet, and wider testing is expected this week. The launch reads as a strong response to Google Gemini Live and Anthropic Claude Voice.

📰 IT之家(RSS) · 6/24/2026

Alibaba's Qwen team released Qwen-AgentWorld, a native language world model covering MCP, Search, Terminal, SWE, Web, OS, and Android. Its core innovation is 'predict-then-act': agents simulate an action's outcome before executing it, which reduces trial-and-error costs. The model was trained on more than 10 million real interaction traces through a CPT→SFT→RL pipeline (continued pre-training, supervised fine-tuning, then reinforcement learning). It scored 58.71 on AgentWorldBench, surpassing GPT-5.4 (58.25) and Claude Opus 4.8. The release is fully open-source.

📰 WeChat Official Account: Tongyi Lab (Qwen) · 6/24/2026
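
The 'predict-then-act' idea in the Qwen-AgentWorld item can be illustrated with a minimal sketch. This is not Qwen-AgentWorld's actual interface: the function `predict_then_act`, the `simulate` and `execute` callbacks, and the `min_score` threshold are all hypothetical names chosen for illustration. The sketch assumes a world model that can score a candidate action without running it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass
class Prediction:
    action: str
    predicted_score: float  # hypothetical estimate of how useful the action is


def predict_then_act(
    state: str,
    candidates: Iterable[str],
    simulate: Callable[[str, str], float],  # stand-in for a world model (assumption)
    execute: Callable[[str, str], str],     # stand-in for the real environment (assumption)
    min_score: float = 0.5,
) -> Tuple[str, Optional[str]]:
    """Score every candidate by simulation, then execute only the best one."""
    predictions = [Prediction(a, simulate(state, a)) for a in candidates]
    if not predictions:
        return state, None
    best = max(predictions, key=lambda p: p.predicted_score)
    # Skip execution entirely when even the best predicted outcome looks poor.
    if best.predicted_score < min_score:
        return state, None
    return execute(state, best.action), best.action


# Toy stand-ins: a lookup table plays the world model, string formatting the environment.
def toy_simulate(state: str, action: str) -> float:
    return {"ls": 0.9, "rm -rf build": 0.2}.get(action, 0.0)


def toy_execute(state: str, action: str) -> str:
    return f"{state} -> ran {action}"


new_state, chosen = predict_then_act(
    "start", ["rm -rf build", "ls"], toy_simulate, toy_execute
)
print(chosen, "|", new_state)
```

In this toy run only `ls` is ever executed; the low-scoring action is rejected during simulation, which is where the saving in trial-and-error cost would come from.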

Sky Computing Lab released FastWan-QAD, a high-speed video generation series that uses Quantization-Aware Distillation (QAD) and was trained on FastVideo. Its key selling point is extreme speed: it generates a 5-second 480P video in just 1.8 seconds on a single RTX 5090, which the team describes as tens of times faster than traditional methods. QAD lets large models 'teach' smaller ones to be leaner while preserving quality. The weights, code, and blog posts are open-sourced and free to use.

📰 X:Sky Computing Lab (@haoailab) · 6/24/2026
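
To put the FastWan-QAD speed figure in perspective, the short calculation below derives a real-time factor and an hourly throughput from the two numbers quoted above (a 5-second clip generated in 1.8 seconds). The function names are illustrative, and the throughput assumes back-to-back generation on one GPU with no loading or encoding overhead, which the source does not state.

```python
CLIP_SECONDS = 5.0        # length of the generated video, from the item above
GENERATION_SECONDS = 1.8  # reported generation time on a single RTX 5090


def real_time_factor(clip_seconds: float, generation_seconds: float) -> float:
    """Seconds of video produced per second of compute; above 1 is faster than real time."""
    return clip_seconds / generation_seconds


def clips_per_hour(generation_seconds: float) -> float:
    """Idealized throughput, assuming clips are generated back to back (assumption)."""
    return 3600.0 / generation_seconds


rtf = real_time_factor(CLIP_SECONDS, GENERATION_SECONDS)
hourly = clips_per_hour(GENERATION_SECONDS)
print(f"real-time factor: {rtf:.2f}x, clips per hour: {hourly:.0f}")
```

Under these assumptions the model produces video roughly 2.8 times faster than playback speed, or about 2,000 five-second clips per GPU-hour.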

Popular AI image tool Krea AI released the full technical report for Krea 2, detailing its data strategy, architecture design, and training techniques. Krea is known for real-time image generation and editing, and has a loyal global creator community. The report covers data cleaning pipelines, multimodal alignment methods, and inference optimization, offering valuable engineering insights for AI image developers and researchers.

📰 X:Krea AI (@krea_ai) · 6/24/2026

French AI company Mistral AI released Mistral OCR 4, a next-generation document recognition model with bounding box detection, block classification (titles, tables, equations, signatures), and per-word confidence scores. It supports 170 languages across 10 language families, including Chinese, Japanese, Arabic, and Hebrew. For sensitive documents, it can be self-hosted in a single Docker container. It scored 85.20 on the OlmOCRBench benchmark, with a 72% annotator preference rate. Pricing is $4 per 1,000 pages, with 50% off through the batch API. It is aimed at enterprises that process contracts, invoices, and forms at scale.

📰 Mistral AI official website · 6/23/2026
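
As a rough budgeting aid for the Mistral OCR 4 pricing quoted above, the sketch below estimates cost from a page count. It assumes simple pro-rata billing at $4 per 1,000 pages with a flat 50% batch discount; minimum charges, volume tiers, and rounding rules are not covered by the source, and `ocr_cost_usd` is an illustrative helper, not part of any Mistral SDK.

```python
PRICE_PER_1000_PAGES_USD = 4.00  # list price from the item above
BATCH_DISCOUNT = 0.50            # 50% off through the batch API


def ocr_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimate OCR cost, assuming linear per-page billing (assumption)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cost = pages / 1000 * PRICE_PER_1000_PAGES_USD
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return round(cost, 2)


standard = ocr_cost_usd(250_000)
batched = ocr_cost_usd(250_000, batch=True)
print(f"standard: ${standard:,.2f}, batch: ${batched:,.2f}")
```

For a 250,000-page archive, this works out to $1,000 at the standard rate and $500 through the batch API.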