Benchmark charts are one thing; whether a new model actually changes Tuesday-morning marketing work is another. So one day after Claude Fable 5 launched, we ran it through three tasks pulled straight from our own workflow — and we're publishing the actual outputs so you can judge rather than take our word.

The summary: two clear wins and one honest tie. The wins are exactly where Anthropic said they'd be — constraint adherence and multi-step reasoning. The tie is instructive about where model upgrades stop mattering. This is the hands-on entry in our Claude for Marketing & Sales guide.

[Figure: scorecard of the three-task Fable 5 test, showing wins on constraint-compressed copy and campaign diagnosis and a tie on raw short-copy creativity]

Task 1: Ad copy at a brutal spec — clear win

The hardest copy spec in mainstream advertising right now is ChatGPT Ads: headlines of 16–24 characters, descriptions of 32–48 characters. Not "roughly short" — counted. In our experience, previous-generation models writing to hard character limits produce a familiar failure: 7 good variants and 3 that are 26 characters, which you discover at upload time.

The brief we gave Fable 5: "10 ad variants for Soku (an AI ads agent for Meta, Google, TikTok) in the Brand: Benefit format. Headline 16–24 chars, description 32–48 chars, count characters strictly."

A sample of what came back, exactly as generated:

Soku: Ads on Autopilot        (22)  | Your AI agent for Meta, Google, TikTok  (39)
Soku: One Chat, All Ads       (23)  | Audit, create, and scale ads by chat    (37)
Soku: Cut Wasted Spend        (22)  | Finds the leaks your dashboard hides    (36)
Soku: Reports in Seconds      (24)  | Ask your ad account anything, in chat   (38)
Soku: Creative That Ships     (24)  | Variants drafted to every platform spec (40)

All ten variants landed inside both limits on the first pass, with the counts self-reported correctly. That sounds like a small thing; it isn't. Spec compliance is the difference between "generate and paste" and "generate, count, fix, recount" — and it generalizes to every constrained surface: RSA slots, TikTok text overlays, Shorts hooks.
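A self-reported count is still the model's own claim, so a mechanical recount before upload is cheap insurance. Below is a minimal Python sketch of that check, using the limits from the brief. The function name `check_variant` and the two constants are illustrative, and `len()` counts Unicode code points, which is an assumption about how a given ad platform counts characters.

```python
# Limits come from the brief in this post; names here are illustrative only.
HEADLINE_RANGE = (16, 24)
DESCRIPTION_RANGE = (32, 48)


def check_variant(headline, description):
    """Return a list of violations for one variant (empty list if compliant)."""
    problems = []
    for label, text, (low, high) in (
        ("headline", headline, HEADLINE_RANGE),
        ("description", description, DESCRIPTION_RANGE),
    ):
        # Assumption: leading/trailing whitespace is not counted by the platform.
        count = len(text.strip())
        if not low <= count <= high:
            problems.append(f"{label} is {count} chars, expected {low}-{high}: {text!r}")
    return problems


# The five sample rows, copied as printed in this post.
variants = [
    ("Soku: Ads on Autopilot", "Your AI agent for Meta, Google, TikTok"),
    ("Soku: One Chat, All Ads", "Audit, create, and scale ads by chat"),
    ("Soku: Cut Wasted Spend", "Finds the leaks your dashboard hides"),
    ("Soku: Reports in Seconds", "Ask your ad account anything, in chat"),
    ("Soku: Creative That Ships", "Variants drafted to every platform spec"),
]

for headline, description in variants:
    for problem in check_variant(headline, description):
        print(problem)
```

Counted this way, the five sample rows as printed in this post pass on every description, while the fifth headline comes out at 25 characters, one over the limit. That may be a transcription difference rather than the raw output, but it is the kind of discrepancy a recount catches before upload does.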

Why the improvement? Fable 5 follows the same instruction-literalism trajectory Anthropic has documented across recent model generations — and counting characters while writing is exactly the kind of multi-constraint task where more capable models stop dropping one constraint to satisfy another.

Task 2: A messy campaign diagnosis — the real upgrade

Copy is the visible work; diagnosis is the valuable work. We gave Fable 5 a deliberately confounded scenario of the kind every media buyer recognizes:

"DTC skincare brand. Meta CPA up 38% over three weeks. Same period: main creative concept is 11 weeks old, we raised budget 25% two weeks ago, and we're entering late-June (post-Father's-Day lull for our category). The team is arguing about whether to refresh creative or cut budget. What's actually going on and what do we do?"

A previous-generation answer to this prompt is typically a tidy list of "possible causes" that restates the question. Fable 5's answer did three things that list never does:

It refused to pick a single cause without the disambiguating data — and named the data. Frequency trend and first-time-impression ratio to test fatigue; CPM trajectory vs. CTR trajectory to separate auction pressure from creative decay; last year's late-June CPA for the seasonal baseline.
It reasoned about the interaction. The 25% budget raise forces the algorithm deeper into the audience while an 11-week-old concept is decaying — so the two causes compound, and cutting budget alone would mask (not fix) the fatigue.
It sequenced the response. Hold budget, ship two new concepts against the incumbent in a cost-cap test, judge on 7-day cohorts, then re-evaluate the budget raise, with the explicit note that if CPM is flat and CTR fell, the seasonality theory loses.

That's the analysis a good senior buyer writes — including knowing what it can't know. It matches what the "longer task, larger lead" benchmark profile predicts: the gap shows up when the task requires holding several threads at once.
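The disambiguation logic in that answer can be written down as a small triage function. This is a rough sketch of the reasoning, not the model's output and not a production rule set: the function name `triage`, the hypothesis labels, and the 5% threshold are all assumptions made for illustration, and the inputs are relative changes over the comparison window (0.10 means up 10%).

```python
def triage(cpm_change, ctr_change, frequency_change, threshold=0.05):
    """Map three relative metric changes to hypotheses worth testing.

    The threshold is a hypothetical cutoff for "moved meaningfully";
    pick one that matches the normal noise in your own account.
    """
    cpm_up = cpm_change > threshold
    ctr_down = ctr_change < -threshold

    hypotheses = []
    if ctr_down and not cpm_up:
        # Flat CPM with falling CTR points at the creative, not the auction.
        hypotheses.append("creative_decay")
    if cpm_up and not ctr_down:
        # Rising CPM with steady CTR points at auction pressure.
        hypotheses.append("auction_pressure")
    if cpm_up and ctr_down:
        # Both moved: the causes may compound, so test creative before cutting budget.
        hypotheses.append("compounding")
    if frequency_change > threshold:
        # Climbing frequency supports the fatigue reading.
        hypotheses.append("fatigue")
    if not hypotheses:
        # Nothing moved enough: compare against last year's seasonal baseline.
        hypotheses.append("inconclusive")
    return hypotheses


# Hypothetical readings: CPM roughly flat, CTR down 18%, frequency up 22%.
print(triage(cpm_change=0.01, ctr_change=-0.18, frequency_change=0.22))
# prints ['creative_decay', 'fatigue']
```

A function like this only narrows the list of hypotheses. The sequencing step (hold budget, test new concepts, judge on 7-day cohorts) is still what confirms or kills each one.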

Task 3: Raw short-copy creativity — an honest tie

We also asked for unconstrained punchy headlines (no character limits, just "make it good"), and the finding is a tie: the output is strong, but we could not pick it out from previous-generation output with any confidence. Short-copy creativity is bounded by taste, brand voice, and the strategic insight behind the line, not by model reasoning depth.

The practical implication: don't pay the Fable 5 premium ($10/$50 per million tokens vs. $5/$25 for Opus 4.8) for one-shot copy generation. Route the cheap, taste-limited work to cheaper tiers; spend the frontier tokens where the frontier shows up — constraints, diagnosis, and long agentic research.
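To put a number on that routing decision, here is a small cost sketch in Python. It assumes the two figures in each price pair are input and output prices in USD per million tokens, which is the usual convention but is not spelled out above, and the workload (1,000 short-copy requests at 800 tokens in and 300 tokens out) is entirely hypothetical.

```python
# USD per million tokens as quoted in this post.
# Assumption: each pair is (input price, output price).
PRICES = {
    "Fable 5": (10.0, 50.0),
    "Opus 4.8": (5.0, 25.0),
}


def job_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a job given total input and output tokens."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical workload: 1,000 requests, 800 tokens in and 300 tokens out each.
requests = 1_000
total_in = requests * 800
total_out = requests * 300

for model in PRICES:
    print(f"{model}: ${job_cost(model, total_in, total_out):.2f}")
# prints:
# Fable 5: $23.00
# Opus 4.8: $11.50
```

Under those assumptions the cheaper tier costs exactly half, so the question for any task is whether the output quality differs enough to justify doubling the bill.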

Method notes, honestly

Fable 5 outputs above are real and unedited, generated during launch week. The "previous generation" comparison is qualitative: it is based on our daily working experience with those models and on Anthropic's own documented behavioral notes, not on a blinded A/B with n=500. Treat the verdicts as experienced-practitioner judgment, not lab results.
One day of testing is one day of testing. The constraint-adherence result was consistent across reruns; your categories and briefs may land differently.
We tested through chat-style prompting, the way most marketing teams actually use these models — not through a tuned API harness.

FAQ

Is Claude Fable 5 better at writing ad copy?

At writing to spec, yes: noticeably fewer constraint violations on hard character limits in our testing, though that comparison is qualitative rather than measured. At raw short-copy creativity, it's comparable to the previous generation; taste is the bottleneck there, not intelligence.

What's Fable 5 actually worth paying for in marketing work?

Multi-variable campaign diagnosis, long research workflows, and constraint-heavy production. For one-liner generation and routine summaries, cheaper tiers (Opus 4.8, Sonnet 4.6) deliver equivalent results at half the price or less.

How should a marketing team test it?

Pick one messy, real diagnosis question from your own accounts — not a toy prompt — and compare the answer against what your team actually concluded. The reasoning gap is more visible on your real confounded data than on any demo.

Can I use Fable 5 through Soku?

Soku's agent routes work across frontier models by task — spec-compressed creative, campaign diagnosis, and research land on the model tier that earns its cost. Start free.