Data & Analytics Articles

Browse 159 articles about Data & Analytics.

May 1, 2026

Mayo Clinic's AI Spotted Pancreatic Cancer 3 Years Early on Routine CT Scans — Here's How It Works

Mayo Clinic's model finds pre-tumor signs on routine CT scans up to 3 years before diagnosis. Here's the back-testing methodology and what it means clinically.

AI Concepts Use Cases Data & Analytics

May 1, 2026

Q1 2026 AI Earnings: 6 Numbers That Confirm the Compute Shortage Is Now the Only Constraint That Matters

Google's backlog nearly doubled to $460B in one quarter. AWS free cash flow collapsed to $1.2B. Six numbers that define the AI infrastructure race.

Enterprise AI LLMs & Models Data & Analytics

April 30, 2026

AI Agent Evaluation: How to Build Custom Benchmarks That Actually Test Intelligence

Public benchmarks are often contaminated by training data. Learn how to build custom AI agent benchmarks using simulation environments and iterative testing.

Multi-Agent AI Concepts Automation

April 29, 2026

How to Use AI Agents for Data Migration: Lessons from Real-World Testing

AI agents can handle messy business data migrations, but they need the right guardrails. Learn what works, what fails, and how to validate outputs safely.

Multi-Agent Automation Data & Analytics

April 7, 2026

ARC AGI 2 vs Pencil Puzzle Bench: The Benchmarks That Expose AI Capability Gaps

These two benchmarks test reasoning you can't fake with training data. See how GPT-5.2, Claude, Gemini, and Chinese models actually compare.

LLMs & Models Comparisons AI Concepts

April 7, 2026

What Is Benchmark Gaming in AI? Why Self-Reported Scores Are Often Inflated

Kimi K2 reported 50% on HLE but independent testing found 29.4%. Learn how benchmark gaming works and how to evaluate AI models honestly.

LLMs & Models AI Concepts Comparisons

April 7, 2026

What Is the China AI Gap? Why Chinese Models Lag on Benchmarks That Can't Be Gamed

ARC AGI 2 and Pencil Puzzle Bench reveal Chinese frontier models score like Western models from 8 months ago. Here's what the data shows.

LLMs & Models Comparisons AI Concepts

April 7, 2026

What Is the Frontier Math Benchmark? Why Open Research Problems Expose True AI Reasoning

Frontier Math uses unpublished problems that take researchers days to solve. Models with full Python access still score under 3%. Here's why it matters.

LLMs & Models AI Concepts Data & Analytics

April 7, 2026

What Is the Humanity's Last Exam Benchmark? How Independent Testing Revealed a 21-Point Score Inflation

Kimi K2 self-reported 50% on HLE. Independent testing found 29.4%. Here's how the HLE benchmark works and why third-party verification matters.

LLMs & Models AI Concepts Data & Analytics

April 7, 2026

What Is the Pencil Puzzle Benchmark? The Test That Measures Pure Multi-Step Logical Reasoning

Pencil Puzzle Bench tests constraint satisfaction problems with no training data contamination. GPT-5.2 scores 56%. Chinese models score under 7%.

LLMs & Models AI Concepts Data & Analytics

April 7, 2026

What Is the SWE-Rebench Benchmark? How Decontaminated Tests Expose Chinese Model Inflation

SWE-Rebench uses fresh GitHub tasks that models haven't seen in training. Chinese models that match Western scores on SWE-bench drop significantly here.

LLMs & Models AI Concepts Comparisons

March 27, 2026

Why Cursor, Claude Code, and Devin Use grep, Not Vectors

Cursor, Claude Code, and Devin lean on grep, find, and direct file reads — not vector search. Why agentic coding tools dropped RAG and where it still wins.

Workflows Automation AI Concepts

March 27, 2026

What Is LiteParse? LlamaIndex's Open-Source Document Parser for AI Agents

LiteParse is a free, GPU-free document parser from LlamaIndex that preserves spatial layout for tables and charts. Here's why it matters for AI workflows.

Workflows Automation AI Concepts

March 22, 2026

What Is the Remote Labor Index? Why AI Agents Complete Only 2.5% of Real Freelance Work

Scale AI's Remote Labor Index tested frontier agents on 240 Upwork projects. The 97.5% failure rate reveals the gap between task execution and real jobs.

AI Concepts Enterprise AI Data & Analytics

March 21, 2026

AI Job Market Impact: What the Data Actually Shows About White-Collar Employment

White-collar job openings hit a 10-year low. Here's what the Anthropic AI Exposure Index, Gartner forecasts, and real layoff data reveal.

Enterprise AI AI Concepts Productivity

March 21, 2026

What Is the Anthropic AI Exposure Index? How to Find Out If Your Job Is at Risk

Anthropic's AI Exposure Index maps 800+ occupations against real Claude usage data. Here's how to read it and what it means for your career.

Claude AI Concepts Data & Analytics

March 18, 2026

How to Use Gemini Deep Research for Competitive Intelligence and Market Reports

Gemini's deep research feature outperforms ChatGPT and Claude for multi-source reports. Here's how to use it for competitive analysis and market research.

Gemini Productivity Data & Analytics

March 16, 2026

Gemini Embedding 2 and the End of Stitched-Together Embeddings

Why Gemini Embedding 2 matters: a primer on embeddings and how a unified vector space replaces the brittle stitching of separate text, image, and audio models.

Gemini AI Concepts Data & Analytics

March 14, 2026

Gemini Embedding 2 vs Qwen3 VL Embeddings: Which Multimodal Model Should You Use?

Compare Gemini Embedding 2 and Qwen3 VL embeddings across supported modalities, embedding dimensions, API access, and real-world search use cases.

Gemini LLMs & Models Comparisons

March 14, 2026

What Is Matryoshka Representation Learning in Gemini Embedding 2?

Gemini Embedding 2 supports flexible embedding sizes from 3,072 down to 768 dimensions. Learn how Matryoshka Representation Learning works and when to use smaller embeddings.

Gemini LLMs & Models AI Concepts