Learn how to build an AI agent in n8n using Claude. Skip frameworks—real architecture for founders and ops leads.
Most n8n AI agent tutorials show you how to wire up a chatbot that forgets context after three messages. Here's how to build one that actually remembers, routes decisions, and doesn't hallucinate your customer data across workflows.
Zapier charges per task. Slack Workflow Builder charges per automation. n8n lets you self-host and run 10,000 agent loops for the cost of one Zapier "premium" tier.
But cost isn't the real win. The real win is that n8n gives you HTTP nodes plus native LLM integrations without JavaScript bloat. You get Claude integration via Anthropic API—no LangChain wrapper, no framework opinions getting in the way.
Connecting to Supabase or Postgres means your agent actually sees customer context, not just the current message. Your sales agent can read the last three interactions from your database before Claude decides whether to route an email to the team or send an auto-response. That's the difference between a toy and a tool.
Compare this to frameworks like LangChain or CrewAI: you're fighting version conflicts, debugging token counting logic, and paying for abstraction you don't need. n8n is simpler. It's dumber in a good way. It does exactly what you tell it.
Every agent needs three layers. Input handler. LLM reasoning. Action executor.
Use Claude's native tool_choice and tools parameters. Don't write custom JSON parsers. Claude 3.5 Sonnet is built to call tools—that's the whole point of the API.
Store conversation state in Supabase (2–5ms latency, queryable, costs $25/month for production volume) or a Redis layer if you need sub-second latency. But for most teams, Postgres is enough.
Route decisions with Claude's response, not regex or token counting. Here's the real pattern: inbound email arrives → n8n webhook catches it → you query Supabase for customer history → Claude reads that context → Claude picks a tool (maybe "route_to_sales" or "send_auto_response") → n8n executes that tool → you store the result back in Supabase for the next run.
This is the engine. Everything else is wiring.
Start with a Webhook trigger. Use n8n's HTTP In node. It accepts JSON from your CRM, email parser, or Slack—doesn't matter. One node catches everything.
Add an HTTP Request node to call Claude API directly. Use claude-3-5-sonnet for speed (shorter response times, cheaper). Use claude-3-opus only if you're asking Claude to reason about complex multi-tool chains. Most teams never need Opus.
Define three to five tools as Claude sees them:
fetch_customer_record (query Supabase)create_crm_task (write to your CRM)send_email (call an email API)log_interaction (store metadata)escalate_to_human (fallback)stop_reason: end_turn. This is the agent. It's not magic.
Claude 3.5 Sonnet is faster at instruction-following and costs $3 per 1M tokens. OpenAI's GPT-4o costs $15 per 1M tokens and is better at vision tasks.
For most agentic workflows, Claude wins. In production deployments we've run, we see 40% fewer hallucinations with Claude when the tool set is clear. Your agent asks for a tool that exists. Claude picks it. It doesn't invent a fifth tool because it got confused.
If you're running 100K requests per month with GPT-4o, you're spending $1.50 per 1K requests. Switch to Claude, you're at $0.30. That's $120/month in savings on one workflow alone. Scale that to ten agents, and you're looking at $14.4K/year—enough to justify a mid-level engineer who maintains them.
Mistake 1: No context window management. Your agent reads 50KB of chat history, wastes tokens, forgets the actual question. Truncate to the last 10 message exchanges plus the customer record. Done.
Mistake 2: Tool definitions are vague. Instead of "update_customer," tell Claude exactly: update_customer(field: string, value: string, customer_id: integer) — returns true or error message. Claude is literal. Be literal back.
Mistake 3: No fallback routing. If Claude can't pick a tool, the workflow dies. Add a catch node that routes to a human or a default action. Every agent needs an escape hatch.
Mistake 4: Treating it like a chatbot. Agents need structure: ingest → reason → act → log. Chatbots ramble. Don't blur them. An agent that can't measure its output is just a chatbot pretending to work.
Supabase PostgreSQL gives you 2–5ms latency for customer record lookups. Store messages, user state, previous actions there. That's your agent's memory across separate API calls.
If your CRM is HubSpot, Apollo, or Salesforce, n8n has native connectors. No custom code. Drag the node in, authenticate, pass the data.
For long-running agents (multi-day workflows), store session state with a UUID. Claude doesn't remember across separate API calls. Your database does. Query it every time.
The real pattern: webhook → lookup customer in Supabase → pass JSON to Claude → Claude calls "create_task" → n8n creates task in your CRM → store result back in Supabase for the next run. That's production.
Use n8n's debug mode to trace each Claude API call. Check that tool_choice values match your node names exactly. Claude is literal.
Set up error branches. If Claude returns stop_reason: max_tokens, add a continuation loop or escalation node. If the API times out, retry twice then fail gracefully.
For high-volume agents (100+ requests/day), self-host n8n on a VPS or Vercel. Cloud n8n works but adds latency. You're paying for convenience you don't need on day one.
Monitor: log every Claude call (tokens used, reasoning, response time) to a Supabase table. After 100 runs, you'll see which tool combinations actually work. After 500 runs, you'll know exactly where to optimize.
Narrow scope wins. Don't build a "do everything" agent. Pick one use case: inbound lead triage, customer support routing, or outbound call logging.
Lead triage example: email arrives → Claude reads company name from email plus Apollo enrichment data → decides: sales team, partner channel, or auto-respond with a case number. One decision, three possible actions.
Measurable output: "Agent routed 200 leads this week, 94% routed correctly" beats "Agent is running." Track accuracy. Track volume. Track cost per action. That's how you know if it's working.
Ship with one tool first. Just "create_task." After 500 successful runs, add a second tool. Every tool you add is another way the agent can fail. Start simple.
No. n8n is visual. You drag nodes, wire them together, and handle the rest in the UI. You'll need to know JSON for Claude tool definitions and understand basic SQL if you query Postgres directly. That's it. If you can write a Zapier zap, you can build an n8n agent.
Super variable. n8n self-hosted is free (you pay hosting). Claude costs $3 per 1M input tokens, $15 per 1M output tokens. A 1K-request-per-day agent costs about $0.30/day if you're efficient with your prompts. Call it $10/month. Add $25 for Supabase, $20 for hosting. You're at $55/month for a production agent.
A standard automation does the same thing every time. If an email comes in, it creates a task. Done. An agent reads the email, reasons about it, picks from multiple actions, and adapts based on what it learns. Automations are rigid. Agents think. Both live in n8n. Agents just use Claude.
---
If you want to talk through applying this to your stack, book a strategy call at cognival.co/book.
30-min strategy call. No pitch, real look at your stack.
Book a strategy call →