LLMs & Models

LLMs & Models Articles

Browse 420 articles about LLMs & Models.

May 4, 2026

Kimi K2 Runs 300 Sub-Agents Across 4,000 Steps on 4x H100s — The Story Hermes Found That Everyone Missed

Hermes's content ideation agent surfaced Kimi K2: an open-source system orchestrating 300 sub-agents across 4,000 coordinated steps on 4x H100 GPUs.

Multi-Agent, LLMs & Models, Automation

May 4, 2026

OpenAI's Goblin Problem: How RL Training in Codex Infected GPT-5.4 with Creature References Across Model Generations

GPT started mentioning goblins and gremlins in responses. The cause: RL 'nerdy personality' training in Codex scored creature references highly and bled…

GPT & OpenAI, LLMs & Models, AI Concepts

May 4, 2026

Scott Aaronson's 2029 Warning: Why the World's Top Quantum Skeptic Is Now Sounding the Alarm

Scott Aaronson, historically skeptical of quantum timelines, now says fault-tolerant quantum computers capable of breaking cryptography are expected by ~2029.

Security & Compliance, AI Concepts, LLMs & Models

May 4, 2026

How to Use a Smart Orchestrator Model to Direct Cheaper Sub-Agent Models in Claude Code

Use Claude Opus as an orchestrator to plan and review while DeepSeek or Gemma handle the heavy lifting, cutting token costs by 5-10x without losing quality.

Multi-Agent, Workflows, LLMs & Models

May 4, 2026

What Is the Mistral Medium 3.5 Model? Open-Weight AI Built for Agent Harnesses

Mistral Medium 3.5 is a 128B open-weight model combining reasoning, coding, and instruction-following for agent harnesses like OpenClaw and Hermes.

AI Model Orchestration: How to Use a Smart Model to Direct Cheaper Sub-Agents

Andrej Karpathy on DeepSeek's OCR Paper: Why Pixels May Beat Tokens as AI Inputs

Andrej Karpathy's Verifiability Thesis: Why AI Is Superhuman at Code and Fails at Car Washes

How to Build a Local AI Stack from Scratch: Ollama to vLLM, Step by Step

China Blocks Meta's $2B Manus Acquisition: 4 Reasons the Unwinding Problem Has No Clear Solution

Claude Mythos and GPT-5.5 Pass the 'Last Ones' Cyberattack Benchmark: 6 Things You Need to Know

Cursor SDK + GPT-5.5 Scores 87.2% vs Native Codex's 61.5% — The Harness Is the Bottleneck

DeepSeek V4 Launch: 5 Specs That Threaten Closed Frontier Labs

DeepSeek V4 Vision: 10x Cheaper Multimodal AI for Your Workflows

DeepSeek V4 Vision Model: 10x KV-Cache Efficiency and 67% Maze Navigation vs GPT-5.4's 50%

Google AI Co-clinician vs GPT-5.4 Thinking: Which Medical AI Do Physicians Actually Prefer?

Google DeepMind AI Co-clinician: 6 Benchmark Results That Redefine Medical AI in 2026

Google DeepMind's AI Co-clinician Tops the RXQA Drug Knowledge Benchmark — Beating Every Frontier Model

How to Use OpenRouter with Claude Code: Run Cheaper Models as a Backend

Karpathy's Sequoia Talk: 5 Predictions About Agentic Engineering That Should Change How You Work