Pick the closest match — we’ll seed the rest of the wizard with sensible defaults you can override. Not sure? Pick “Custom” and fill in every field by hand.
Support chatbot
Tier-1 customer support, FAQ deflection, ticket triage
Medium-length conversations, real-time, with web search and RAG common. ~1,000 conversations/day is a typical mid-market starting point.
RAG / docs Q&A
Internal knowledge base, policy lookup, doc search
Short conversations dominated by retrieval context — input-token-heavy. Caching pays off when the same retrieved chunks repeat across users.
Long-context coding agent — team
Team Claude Code / Cursor / Cody deployment
Heavy multi-user coding-agent deployment — full file-tree context, agentic loops, ~80K tokens per call. Frontier models (Sonnet/Opus) typical. Pick this when your team shares a deployment, runs Claude Code at scale, or operates an agentic coding harness.
Long-context coding agent — solo
Individual Cursor / Aider / Windsurf user
Solo dev with an IDE-integrated coding agent — Cursor, Aider, Windsurf, or similar. Long sessions with file context but single-user volume (~25K tokens/call typical). Frontier or mid-tier models. Pick this when it's just you, not a team deployment.
Coding agent
Code review, refactor, snippet help — not full IDE
Lightweight code questions with occasional review or refactor, without full IDE integration. For Cursor / Claude Code / Cody / Aider, see the Long-context coding agent presets above. Model tier is flexible; mid-tier models often suffice.
Batch summarizer
Overnight email digests, meeting summaries, daily reports
Single-turn, scheduled: the canonical Batch API use case. 50% off list pricing on both input and output tokens, with no real-time pressure.
Classifier / extractor
Intent detection, sentiment, structured extraction
High-volume single-turn. Mid- or budget-tier models often win — quality difference is small for narrow tasks. Caching the system prompt is the biggest lever. If the workload is a multi-step extractor (reasoning LLM picking which sub-model to run), set 'inner LLM calls per task' on Step 5.
Content generator
Blog posts, social copy, marketing content — multi-stage drafting
Each user task fans out to several LLM calls (draft → revise → style-check → polish). Set 'tasks per day' on Step 5; PitCrew multiplies by callsPerTask (default 5) to compute the real LLM call count. Caching pays off because the system prompt is reused across the fan-out.
Batched aggregator
Hourly / daily cron — process many items per call
Cron-style workload that batches many items per LLM call (e.g. 500 comments per hourly run, daily feedback summary). Token shape: high input (items concatenated into prompt) + high output (per-item analysis). Batch API typically saves 50%. If each run internally fires multiple LLM calls (e.g. an agent-eval harness that executes a target agent end-to-end), set 'inner LLM calls per task' on Step 5.
Voice agent (phone bot)
Phone bot, AI receptionist, voice sales agent — real-time conversations
The agent listens to a caller and replies in real time. Common stack: OpenAI gpt-realtime or ElevenLabs Conversational doing the speech-in / language model / speech-out pipeline. Audio (the spoken words) dominates cost — typically ~$0.17 per audio-minute end-to-end. Phone-line cost (Twilio) and call-routing infrastructure (LiveKit) are tracked separately on Step 5 → Advanced.
Search index (no chat)
Match documents by meaning — recommendation, dedup, semantic search
The agent turns text into searchable vectors but does NOT call a chat LLM afterward. Pick this for: recommendation systems ("customers who viewed X also viewed…"), search-by-meaning over documents, finding near-duplicates, clustering, or any pure embedding workload. Cost is per-MTok of embedding input, plus any vector-database infrastructure (Pinecone / Weaviate / pgvector), which you enter separately on Step 5 → Advanced. (This archetype is also called "embedding-only" if you're familiar with the term.)
Video creator
TikTok / Reels / Shorts pipeline, ad creatives, B-roll
Long conversations to plan, script, write per-scene prompts, and retry. Generation cost dominates: Veo or Sora at 720p, ~4 clips × 8s per finished video, ~10 videos/day. The brain LLM is a small share of the bill.
Image pipeline
Bulk product photos, social tiles, ad variations
Single-turn (or short) brain calls, with one image generated per call at 1024×1024. ~100 images/day across DALL-E 3 or Flux. Caching the brain prompt and sharing infrastructure across calls are where the savings hide.
Voice narrator
Audiobook, podcast, IVR, accessibility narration
Brain LLM drafts a script (~5,000 chars), then TTS synthesizes it. ~50 scripts/day. ElevenLabs for expressive reads, OpenAI / Google for cost-sensitive volume.
Music creator
Suno / Udio songwriting, jingles, sound design
Brain LLM iterates on prompt + style cues; generation produces full songs. High reject rate: most tracks are regenerated 2-3× before one is kept. ~20 finished tracks/day.
Custom / something else
Skip the preset and fill in every field by hand