Ad creative pipeline — PitCrew demo

Default build

$3,902/mo

$260/user/mo

anthropic Haiku 4.5 on every call, no caching

$2 brain + $3,900 generation

PitCrew plan

$390/mo

$26/user/mo

with 2 recommended changes

$0.43 brain + $390 generation

$1,300–$10,533 saved every month ($3,511 most likely)

90% off the default build

Real bill typically runs $469–$586/mo. PitCrew computes steady-state inference; production adds dev/eval/retry overhead.

How PitCrew gets you to $390/mo

Each recommendation below is one change you make at design time, with the dollars it shaves and the running total saved before you ship.

01

Default build

anthropic Haiku 4.5 on every call, no caching, real-time pricing on async work

$3,902/mo

starting point

02

Switch Veo 3 720p → Gen-3 Turbo (Generation)

runwayml Gen-3 Turbo prices its per-second video workload at $0.05/sec vs. your current $0.50/sec.

$392/mo

−$3,510/mo

$3,510 saved before build

03

Switch to M2.7

M2.7 from minimax runs the same workload at lower cost (budget tier, same quality bucket).

$390/mo

−$1/mo

$3,511 saved before build

Action plan

The full reasoning behind each recommendation — copy into your build doc.

01

Switch Veo 3 720p → Gen-3 Turbo (Generation)

low confidence — assumptions vary widely

−$1,300 to −$9,477/mo

runwayml Gen-3 Turbo prices its per-second video workload at $0.05/sec vs. your current $0.50/sec on Veo 3 720p. Same modality, comparable workflow.

Gen-3 Turbo is one tier below Veo 3 720p on quality. Sample-test before scaling.

02

Switch to M2.7

low confidence — assumptions vary widely

−$0 to −$3/mo

M2.7 from minimax runs the same workload at lower cost (budget tier, same quality bucket). Spec lists it as good for: agentic, productivity. Verify quality on a sample of your traffic before fully switching.

Different provider (minimax vs anthropic) — you'll need a separate API key and may see different latency.

Considered, didn’t apply

PitCrew checks every lever — model fit, prompt caching, batch lanes, prompt trimming. Here’s why the rest didn’t make the cut on this build.

Prompt caching
Your system prompt is 77 tokens; caching needs ≥1,024 tokens to amortize the cache-write cost.
Trim system prompt
No redundancy detected — your 77-token prompt is already tight.
Batch API
anthropic doesn't offer a Batch API, and no quality-equivalent provider in our pricing table does either.

Alternative video models

Same modality, priced at the output settings you entered in the wizard. Savings are relative to the default model, Veo 3 720p.

Provider	Model	Best for	Unit cost	Resolution	Monthly cost	vs default
runwayml	Gen-3 Turbo	fast iteration, social-first	$0.05/sec	720p	$390/mo	−$3,510/mo
fal.ai	Multi-Model Router	multi-model, experimentation, flexible routing	$0.05/sec	720p	$390/mo	−$3,510/mo
kling	Kling 1.5 Standard	consistent characters, smooth motion	$0.07/sec	720p	$546/mo	−$3,354/mo
openai	Sora 480p	quick previews, B-roll	$0.10/sec	480p	$780/mo	−$3,120/mo
runwayml	Gen-3 Alpha	cinematic, narrative shots	$0.12/sec	720p	$936/mo	−$2,964/mo
openai	Sora 1080p	hero shots, cinematic	$0.30/sec	1080p	$2,340/mo	−$1,560/mo
luma	Dream Machine	dream-like, surreal visuals	$0.40/sec	720p	$3,120/mo	−$780/mo
google	Veo 3 720p (default)	realistic motion, photorealism, physics-accurate	$0.50/sec	720p	$3,900/mo	—

What we assumed

These are the inputs we used. If anything looks off, re-run the audit with better numbers.

System prompt

77 tokens (estimated)

Avg user input

1,500 tokens

Avg output

800 tokens

Calls per month

300

Batch share

30%

Pricing as of

Apr 28, 2026

Output per call (sec)

5

Generations per agent call

4

Regeneration rate

1.30×

Resolution

720p
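
The generation figures in this report can be reproduced from the inputs above. The following is a minimal sketch, assuming monthly video cost is calls × generations per call × seconds per output × regeneration rate × per-second price. That formula is inferred from the numbers shown here, not taken from PitCrew's implementation, and the function name is hypothetical.

```python
def monthly_generation_cost(calls_per_month, generations_per_call,
                            seconds_per_output, regeneration_rate,
                            price_per_second):
    """Estimated monthly video-generation spend in dollars (assumed formula)."""
    billed_seconds = (calls_per_month * generations_per_call
                      * seconds_per_output * regeneration_rate)
    return billed_seconds * price_per_second

# Inputs copied from the "What we assumed" section.
inputs = dict(calls_per_month=300, generations_per_call=4,
              seconds_per_output=5, regeneration_rate=1.30)

default_cost = monthly_generation_cost(**inputs, price_per_second=0.50)  # Veo 3 720p
plan_cost = monthly_generation_cost(**inputs, price_per_second=0.05)     # Gen-3 Turbo

print(round(default_cost), round(plan_cost))  # prints: 3900 390
```

Under these assumptions the workload comes to about 7,800 billed seconds per month, which matches the $3,900 and $390 generation lines in the summary.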

How precise is this?

Savings band spans 263% of the central estimate. Top sources of uncertainty:

Call volume is your guess — typical pre-deploy estimates land within ±50% of actual.
Conversation length is a coarse bucket — actual tokens vary by ±40% per call.
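
As a quick check on the 263% figure, here is a sketch that assumes the band width is defined as (high − low) ÷ central estimate, using the savings range from the report summary. The definition is inferred from the numbers, not confirmed by PitCrew.

```python
# Monthly savings in dollars, from the summary at the top of the report.
low, central, high = 1_300, 3_511, 10_533

# Assumed definition: width of the band relative to the most likely value.
band_width = (high - low) / central

print(f"{band_width:.0%}")  # prints: 263%
```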

Real-bill expectation

PitCrew forecasts steady-state inference cost: the dollars the LLM provider bills for the deterministic, no-extras workload your wizard described. Real production bills typically run 1.2–1.5× that figure (20–50% higher) because the steady-state model excludes:

Dev / eval loops (often 10-30% of total spend)
Retries, error recovery, idempotency replays
Background batch jobs (summaries, classification of past data)
A/B traffic on alternate models
Embeddings + fine-tunes that ride alongside the agent

Scenario	Steady-state (PitCrew)	Expected real bill
Default build	$3,902/mo	$4,682–$5,853/mo
PitCrew plan	$390/mo	$469–$586/mo

The 20–50% uplift (a 1.2–1.5× multiplier) comes from public engineering postmortems and the validation cases in docs/accuracy-validation.md. If your team has tight eval loops and minimal retry traffic, target the low end.
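
The expected-bill ranges in the table can be reproduced with a sketch like the one below. It assumes the range is simply the steady-state forecast multiplied by 1.2 and 1.5 and rounded to the nearest dollar, and that the PitCrew plan figure is the unrounded $0.43 brain + $390 generation. The function name is hypothetical.

```python
def real_bill_range(steady_state, low_mult=1.2, high_mult=1.5):
    """Return (low, high) expected monthly bill in whole dollars."""
    return round(steady_state * low_mult), round(steady_state * high_mult)

default_range = real_bill_range(3902)        # default build
plan_range = real_bill_range(0.43 + 390)     # PitCrew plan, unrounded

print(default_range, plan_range)  # prints: (4682, 5853) (469, 586)
```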

How sensitive is this forecast?

Pre-deploy estimates are guesses. Here’s how the savings shift if the volume or conversation length you guessed turns out to be off.

If your volume is different

Scenario	Volume	Default monthly	PitCrew monthly	Savings
Half the volume	5 calls/day (0.5×)	$1,951	$196	−$1,755/mo
As you estimated	10 calls/day	$3,902	$390	−$3,511/mo
Double the volume	20 calls/day (2×)	$7,803	$781	−$7,022/mo
5× the volume	50 calls/day (5×)	$19,508	$1,952	−$17,556/mo
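
The volume scenarios appear to scale linearly with call count. The sketch below assumes exactly that, starting from the rounded headline figures, so its results agree with the table only to within a few dollars. PitCrew presumably scales unrounded values. The function name is hypothetical.

```python
def scale_forecast(default_monthly, plan_monthly, volume_multiplier):
    """Scale both monthly forecasts linearly with volume (assumed behaviour)."""
    default = default_monthly * volume_multiplier
    plan = plan_monthly * volume_multiplier
    return default, plan, default - plan

# Rounded headline figures from the report summary.
BASE_DEFAULT, BASE_PLAN = 3902, 390

for label, mult in [("half", 0.5), ("estimated", 1), ("double", 2), ("5x", 5)]:
    default, plan, savings = scale_forecast(BASE_DEFAULT, BASE_PLAN, mult)
    print(f"{label:>9}: default ${default:,.0f}  plan ${plan:,.0f}  saves ${savings:,.0f}")
```

Because nearly all of the cost is per-second video generation, the percentage saved stays at roughly 90% at every volume; only the dollar amount changes.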

If conversations run shorter or longer

Scenario	Conversation length	Default monthly	PitCrew monthly	Savings
One bucket shorter	Medium (5-15 turns)	$3,901	$391	−$3,510/mo
As you estimated	Long (15+ turns)	$3,902	$390	−$3,511/mo

