LLMs & Models Articles
Browse 140 articles about LLMs & Models.
How to Use Open Router Free Models With Claude Code to Cut AI Costs by 99%
Configure Claude Code to route through Open Router's free model tier instead of Anthropic's paid API. A step-by-step guide with the exact settings.json setup.
Open-Source vs Closed-Source AI Models: Which Should You Use for Agentic Workflows?
Compare open-weight models like Gemma 4 and Qwen 3.6 against closed models like Claude Opus and GPT-5.4 for agentic coding and automation tasks.
Why You Should Use an Agentic Harness With Qwen 3.6 Plus (Not Just Chat Mode)
Qwen 3.6 Plus performs dramatically better inside an agentic harness than in chat mode. Here's why and how to set it up with OpenCode.
Qwen 3.6 Plus vs Claude Opus 4.6: Which Model Is Better for Agentic Coding?
Compare Qwen 3.6 Plus and Claude Opus 4.6 on agentic coding benchmarks, context window, multimodal support, and real-world task performance.
What Is Microsoft MAI Transcribe 1? The Speech Model That Beats Whisper and Gemini
MAI Transcribe 1 is Microsoft's new speech recognition model that outperforms Whisper, Gemini Flash, and Scribe V2 across 25 languages.
What Is the Qwen 3.5 Omni Model? Alibaba's Multimodal AI That Builds Apps From Your Camera
Qwen 3.5 Omni understands text, image, audio, and video, and can build a functional website from a camera description. Here's what it can do.
What Is Qwen 3.6 Plus? Alibaba's Agentic Coding Model With 1M Token Context
Qwen 3.6 Plus is Alibaba's frontier agentic coding model with a 1M token context window, multimodal reasoning, and computer use capabilities.
Why Your AI Agent Needs a Harness: Qwen 3.6 Plus vs Chat Mode Performance
Running Qwen 3.6 Plus in a chat session vs an agentic harness produces dramatically different results. Here's what the difference looks like in practice.
Gemma 4 31B vs Qwen 3.5: Which Open-Weight Model Should You Use for Agentic Workflows?
Compare Gemma 4 31B and Qwen 3.5 on benchmarks, agentic capabilities, and local deployment to find the best open model for your AI workflows.
Gemma 4 for Edge Deployment: How the E2B and E4B Models Run on Phones and Raspberry Pi
Gemma 4's edge models support native audio, vision, and function calling in under 4B effective parameters. Here's what that means for on-device AI apps.
Qwen 3.6 Plus Review: Alibaba's Frontier-Level Agentic Coding Model
Qwen 3.6 Plus is Alibaba's latest proprietary model with 1M context and strong agentic coding. Learn how it performs and when to use it in a harness.
What Is Gemma 4? Google's Open-Weight Model Family With Apache 2.0 License
Gemma 4 is Google's newest open-weight model family with Apache 2.0 licensing, native multimodality, and function calling built in from the ground up.
What Is the Bitter Lesson of Building with LLMs? Why Simpler Prompts Win
As AI models get smarter, over-specified prompts hurt more than they help. Learn why the bitter lesson of LLM development is to simplify, not complexify.
What Is Google TurboQuant? The KV Cache Compression That Crashed Memory Chip Stocks
Google's TurboQuant algorithm compresses the KV cache to 3 bits with zero accuracy loss, delivering an 8x speedup and a 6x memory reduction on H100 GPUs.
ARC AGI 3 Results: GPT-5.4, Claude Opus 4.6, and Gemini 3.1 All Score 0%
Every major AI model scored 0% on ARC AGI 3 while humans score 100%. Here's what the results reveal about the gap between AI capability and generalization.
What Is ARC AGI 3? The Interactive AI Benchmark Humans Solve at 100%
ARC AGI 3 is a video game-style benchmark where humans score 100% and every frontier AI model scores 0%. Here's how it works and why it matters.
What Is Chroma Context-1? The Specialized RAG Model That Beats Frontier Models
Chroma Context-1 is a 20B parameter model trained specifically for retrieval tasks. It beats GPT-5.4 on search benchmarks at a fraction of the cost.
What Is Claude Mythos? Anthropic's Most Powerful AI Model Explained
Claude Mythos is Anthropic's leaked next-gen model tier above Opus, with dramatically higher scores in coding, reasoning, and cybersecurity tasks.
What Is Gemini 3.1 Flash Live? Google's Multimodal Voice AI for Real-Time Conversations
Gemini 3.1 Flash Live is Google's native speech-to-speech model with webcam, screen sharing, and tool-calling support. Here's how to use it for free.