The Price Didn’t Change. The Bill Did.
The Data Product Report: Weekly State of the Market in Data Product Building | Week ending April 20, 2026
This Week
Somebody measured what Opus 4.7’s new tokenizer actually costs in production — and the answer is about 40% more than last month, with no line item to explain it. Meanwhile, practitioners are discovering that autonomous agents fail in the same ways distributed systems have always failed, and the biggest platforms in tech are ignoring your privacy opt-outs at rates that would be funny if they weren’t actionable. The common thread: nobody’s going to audit this for you. Build your own receipts.
Your Tokenizer Is Picking Your Pocket
A developer ran Claude Opus 4.7’s new tokenizer against real workloads and published what they found: English and code inflate by 1.20–1.47x compared to 4.6. Per-token pricing didn’t change. Your bill did. Across 388 comments, the math kept getting worse — factor in the cache TTL downgrade from a few weeks back, and effective session costs are up roughly 40% with zero changelog entries to show for it.
The community response wasn’t just outrage — it was instrumentation. Simon Willison’s Claude Token Counter now supports cross-model comparisons, letting teams see exactly where the inflation hits. System prompt tokens alone run ~1.46x higher on 4.7. Images? 3x.
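The arithmetic behind that "roughly 40%" is worth making concrete. Here is a minimal sketch of how token inflation and a lower cache hit rate compound; the specific factors and the cache discount are illustrative placeholders, not published vendor numbers.

```python
# Sketch: estimate the effective cost change when a tokenizer inflates
# token counts but per-token pricing stays flat. All inputs are
# hypothetical; plug in your own measured values.

def effective_cost_ratio(token_inflation: float,
                         cached_fraction_before: float,
                         cached_fraction_after: float,
                         cache_discount: float = 0.9) -> float:
    """Ratio of new session cost to old session cost.

    cache_discount is the fraction of list price saved on a cache hit
    (0.9 means cached tokens cost 10% of the full rate).
    """
    def blended_unit_cost(cached_fraction: float) -> float:
        # Uncached tokens at full price, cached tokens at the discount.
        return (1 - cached_fraction) + cached_fraction * (1 - cache_discount)

    old = blended_unit_cost(cached_fraction_before)
    new = token_inflation * blended_unit_cost(cached_fraction_after)
    return new / old

# Example: tokens inflate 1.2x while a shorter cache TTL drops the
# cached share of traffic from 50% to 40%.
ratio = effective_cost_ratio(1.2, 0.50, 0.40)
print(f"effective session cost: {ratio:.2f}x")  # prints 1.40x
```

The point of the sketch is that two modest-looking changes multiply: neither a 1.2x tokenizer nor a cache TTL downgrade looks like 40% on its own.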
But the squeeze isn’t just from above. The floor is dropping fast. A benchmark that ran Gemma 2B on CPU for $5/month on Cloudflare Containers and matched GPT-3.5 Turbo suggests that many production workloads are paying frontier prices for commodity-grade tasks. And Introspective Diffusion Language Models just demonstrated 3–4x throughput at equal quality to autoregressive models of the same size — a structural shift, not an incremental improvement.
The bottom line: The era of one-model-fits-all pricing is over. Map your workloads by reasoning requirement — frontier for the hard stuff, efficient models for volume, classical methods where they work. And measure everything, because your vendor isn’t going to tell you when the price changed.
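In code, that mapping can be as boring as a lookup table. A minimal sketch, with entirely hypothetical model names and per-1K-token prices; the default-to-frontier fallback is the part worth copying, since it makes every unmapped workload visible as the most expensive one.

```python
# Sketch: route workloads by reasoning requirement. Model names and
# prices are placeholders, not real vendor SKUs.

ROUTES = {
    "multi_step_reasoning": ("frontier-model", 0.015),
    "summarization":        ("efficient-model", 0.0005),
    "classification":       ("efficient-model", 0.0005),
    "exact_match_lookup":   (None, 0.0),  # classical method, no LLM at all
}

def route(task_type: str) -> tuple:
    """Return (model, price per 1K tokens) for a task.

    Unknown task types fall back to the frontier tier, so anything you
    forgot to map shows up at the top of the cost report, not hidden.
    """
    return ROUTES.get(task_type, ("frontier-model", 0.015))
```

Pair the table with the measurement habit above: the table is only as good as the token counts behind it.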
Your Agent Is a Distributed System. Treat It Like One.
Here’s a sentence that would have sounded ridiculous eighteen months ago: multi-agent software development is a distributed consensus problem. Agents working on underspecified prompts must reach agreement on intent — and they fail in all the ways distributed systems fail. Partial execution. Silent drift. Orphan operations nobody cleans up.
The evidence is piling up. A practitioner running long agent queues documented the decay pattern: early tasks follow explicit rules, then the agent starts inferring urgency, skipping steps, optimizing for speed over correctness. Goal drift isn’t a bug — it’s a convergence failure in an underspecified system. Separately, a vibe-coding failure analysis showed agents issuing one-shot approval prompts over live UIs with no retry logic, leaving orphan tool calls when sessions crash. These are the distributed systems failure modes your platform team already knows how to handle — just in a new costume.
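The classic fix for orphan operations is also the classic distributed-systems fix: a write-ahead ledger with idempotency keys, so a crashed session can be reconciled instead of leaking work. A minimal sketch under assumed names (the ledger file, the record schema, and the helper functions are all hypothetical):

```python
# Sketch: treat agent tool calls like RPCs. Log intent before executing,
# log completion after, and anything stuck in "started" is an orphan.

import json
import pathlib
import uuid

LEDGER = pathlib.Path("tool_calls.jsonl")  # hypothetical ledger location

def record(state: str, call_id: str, tool_name: str) -> None:
    with LEDGER.open("a") as f:
        f.write(json.dumps({"id": call_id, "tool": tool_name, "state": state}) + "\n")

def call_tool(tool, *args):
    call_id = str(uuid.uuid4())
    record("started", call_id, tool.__name__)  # write-ahead intent
    result = tool(*args)
    record("done", call_id, tool.__name__)     # completion marker
    return result

def orphans() -> list:
    """Calls that started but never completed: candidates for cleanup."""
    states = {}
    for line in LEDGER.read_text().splitlines():
        rec = json.loads(line)
        states[rec["id"]] = rec["state"]
    return [cid for cid, state in states.items() if state != "done"]
```

A real implementation would add retries and compensating actions, but even this much turns "the session crashed somewhere" into a queryable list.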
The most illuminating data point came from ALMA, a 60-day experiment that gave an autonomous Claude-based agent $100, internet access, and no instructions. Across 340+ sessions, the experiment produced a reference implementation of everything that goes wrong: memory decay, model-version regressions, cost overruns. The architecture that survived — session isolation, file-based memory, model-as-dependency versioning — reads like an SRE playbook.
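"Model-as-dependency versioning" deserves a beat of explanation: pin the model ID the way a lockfile pins a library version, and fail at deploy time rather than regress silently at runtime. A sketch with invented role names and model IDs, not ALMA's actual scheme:

```python
# Sketch: pin model versions like dependencies. Roles, IDs, and the
# pin table format are all illustrative.

PINNED = {
    "planner": "vendor/model-x-2026-03-01",
    "coder":   "vendor/model-y-2026-02-15",
}

def resolve_model(role: str, available: set) -> str:
    """Return the pinned model for a role, or fail loudly.

    Raising here surfaces a model-version regression as a deploy-time
    error instead of a slow behavioral drift in production.
    """
    model = PINNED[role]
    if model not in available:
        raise RuntimeError(f"pinned model {model!r} for role {role!r} is unavailable")
    return model
```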
And now there’s tooling to match. Kelet ingests OpenTelemetry traces from agent apps, clusters failure patterns, surfaces root causes, and proposes prompt patches — validated against real sessions. It’s RCA-as-a-service for the agent layer.
What to do with this: If you’re deploying agents, stop treating them like smart scripts and start treating them like services. Session isolation, explicit state management, RCA tooling, and — above all — the assumption that they will drift. Your distributed systems playbook already has the answers.
The Opt-Out That Wasn’t
“Google broke its promise. Now ICE has my data.” That’s not an editorial gloss — it’s the title of an EFF complaint published this week, alleging Google disclosed user data to ICE via administrative subpoena without the prior notice it had long promised. The 424-comment discussion wasn’t partisan — it was practitioners asking what their own data exposure looked like.
The answer arrived the same week. An independent webXray audit of Global Privacy Control compliance found Google setting ad cookies despite GPC opt-out 87% of the time. Microsoft: 50%. Meta: 69%. The browser signal that was supposed to be your one-click privacy layer is being ignored by the platforms that promised to honor it.
For data teams, this isn’t abstract policy. You collect data. You feed it to vendors. You send it to LLMs. At each step, your compliance posture depends on promises your vendors are demonstrably not keeping. The practitioner response is already emerging: a DLP proxy for LLM agents evolved through three iterations — from regex redaction (which caused hallucinations) to spaCy NER with realistic pseudonyms to context-aware semantic preservation. Privacy engineering is becoming as essential as data quality testing — not because regulators demand it, but because your vendors can’t be trusted to do it for you.
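The jump from iteration one to iteration two is easy to show in miniature. Opaque placeholders give the model a hole in the text to hallucinate around; stable, realistic pseudonyms keep the text coherent and map each entity back consistently. A toy sketch (the regex and pseudonym scheme are illustrative, not the proxy's actual rules):

```python
# Sketch: placeholder redaction vs consistent realistic pseudonyms.

import re

EMAIL = re.compile(r"[\w.+-]+@[\w.-]+\.\w+")  # toy email matcher

def redact_placeholder(text: str) -> str:
    # Iteration 1: blunt tokens that models tend to hallucinate around.
    return EMAIL.sub("[REDACTED]", text)

_PSEUDONYMS: dict = {}

def redact_pseudonym(text: str) -> str:
    # Iteration 2: the same real address always maps to the same fake
    # one, so references stay consistent across a session.
    def swap(match):
        real = match.group(0)
        if real not in _PSEUDONYMS:
            _PSEUDONYMS[real] = f"user{len(_PSEUDONYMS) + 1}@example.com"
        return _PSEUDONYMS[real]
    return EMAIL.sub(swap, text)

msg = "Contact jane.doe@corp.com; cc jane.doe@corp.com and bob@corp.com."
print(redact_pseudonym(msg))
# Contact user1@example.com; cc user1@example.com and user2@example.com.
```

The third iteration, context-aware semantic preservation, is harder to sketch in ten lines, but it follows the same principle: the model should never be able to tell the text was scrubbed.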
The bottom line: Audit your vendor data flows the way you audit your data pipelines. GPC compliance, data retention, subpoena policies — if you’re not testing these, you’re trusting a promise that three of the biggest platforms in tech aren’t keeping.
The Radar
Quick hits on stories worth knowing about, organized by what you’re building.
If you’re building infrastructure: OpenDuck is a MotherDuck-style open-source stack for DuckDB — remote catalogs, hybrid query execution, differential snapshots, Arrow over gRPC. If you’ve been wanting distributed DuckDB without the managed service, this is your starting point.
If you’re deploying agents: ClawRun provisions agents into sandboxes with lifecycle management — startup, heartbeat, snapshot, resume, wake-on-message. Think of it as systemd for your agent fleet. Pair it with MCP-as-observability-layer, which uses eBPF uprobes to trace agent execution down to kernel and CUDA events.
If you’re evaluating dev tools: Libretto uses AI at dev-time to generate browser automations, then runs them deterministically at runtime. The pattern — LLM as tool-maker, not tool-user — is worth watching even if this specific tool isn’t your stack.
If you care about governance: Kontext CLI brokers credentials for AI coding agents via OIDC and RFC 8693 token exchange. Session-scoped, short-lived, auditable. If your agents are using long-lived API keys, this is the upgrade path.
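For the curious, the RFC 8693 exchange itself is a single form-encoded POST. A sketch of the request body a broker like this would send, using parameter names and token-type URNs straight from the RFC; the endpoint, token values, and audience are placeholders.

```python
# Sketch: build an RFC 8693 token-exchange request body. Parameter
# names come from the RFC; everything else here is hypothetical.

from urllib.parse import urlencode

def token_exchange_body(subject_token: str, audience: str) -> str:
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:id_token",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        # Scope the short-lived credential to one downstream service.
        "audience": audience,
    })
```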
If you’re rethinking roles: Kyle Kingsbury’s inventory of new ML-adjacent jobs — incanters, process engineers, statistical engineers, trainers — is the sharpest take yet on what “AI-augmented teams” actually look like in practice. The job titles are wry. The job descriptions are not.
The Data Product Report is published every Tuesday by RepublicOfData.io.
What’s the most surprising cost change you’ve discovered in your AI stack this year? Reply and tell us — the best responses go in next week’s edition.


