Building AI Chatbots: Architectures, Patterns & Real-World Examples

A "chatbot" can mean a scripted button menu or a fully autonomous agent that books meetings, queries databases, and escalates to humans. The gap between those two is architectural. This guide maps the four main chatbot architectures, explains when each is the right choice, and shows real implementation patterns — not product demos.

1. The Four Chatbot Architectures

Chatbot design is a spectrum of four tiers. Each step up adds capability and complexity. Choosing the wrong tier is costly in either direction: building an agentic system for a two-question FAQ flow is as damaging as building a scripted bot for a task that needs reasoning.

2. Tier 1: Scripted Chatbots

Scripted chatbots navigate users through a pre-defined decision tree. Every response is authored by a human. There is no model inference — only pattern matching or button selection. They are not "dumb" — they are deliberately bounded, which is a feature for compliance-sensitive environments.

When to use scripted bots

Banking FAQs where every answer must be compliance-approved
Returns/refund flows with a fixed set of outcomes
Appointment booking with no free-text needed
Emergency response systems (the bot must not improvise)

Architecture pattern

User selects button / sends keyword
      ↓
Intent Matcher (regex or keyword list)
      ↓
Decision Tree Node lookup
      ↓
Template Response renderer
      ↓
Reply with structured message (text + button options)
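
The flow above can be sketched in a few lines. This is a minimal illustration, not any platform's API: the node names, keyword patterns, and reply copy are all hypothetical, and a real deployment would load the tree from reviewed content rather than hard-code it.

```javascript
// Decision tree: every response is human-authored (illustrative copy only)
const NODES = {
  start: {
    text: "How can I help?",
    buttons: [
      { label: "Return an item", next: "returns" },
      { label: "Opening hours", next: "hours" },
    ],
  },
  returns: {
    text: "I can start a return for you. Do you have your order ID?",
    buttons: [{ label: "Back to menu", next: "start" }],
  },
  hours: {
    text: "You can find our opening hours on the contact page.",
    buttons: [{ label: "Back to menu", next: "start" }],
  },
  fallback: {
    text: "Sorry, I can only help with returns and opening hours.",
    buttons: [{ label: "Back to menu", next: "start" }],
  },
};

// Intent matcher: keyword patterns, checked in order
const INTENTS = [
  { pattern: /\b(return|refund)s?\b/i, node: "returns" },
  { pattern: /\b(hours|open|opening)\b/i, node: "hours" },
];

function matchIntent(input, currentNodeId) {
  // A button press arrives as the exact label of a button on the current node
  const pressed = NODES[currentNodeId].buttons.find(
    (b) => b.label.toLowerCase() === input.trim().toLowerCase()
  );
  if (pressed) return pressed.next;
  // Otherwise fall back to keyword matching, then to the fallback node
  const intent = INTENTS.find((i) => i.pattern.test(input));
  return intent ? intent.node : "fallback";
}

// Template renderer: returns text plus button options for the matched node
function reply(input, currentNodeId = "start") {
  const nodeId = matchIntent(input, currentNodeId);
  const node = NODES[nodeId];
  return { nodeId, text: node.text, buttons: node.buttons.map((b) => b.label) };
}
```

Because every path ends in an authored node, including the fallback, the bot cannot produce a reply that nobody reviewed.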

Common platforms: Dialogflow ES (pre-LLM mode), Botpress, WhatsApp Business API with button replies, Freshdesk Freddy.

3. Tier 2: LLM Prompt-Based Chatbots

Add a large language model and you get natural language understanding without writing intent rules. The simplest form is a system prompt that constrains the model's persona, scope, and tone; the model handles the rest.

System prompt anatomy

You are "Aria", a customer support assistant for WidgetCo.

SCOPE: Answer questions about WidgetCo products, orders, and returns.
TONE: Friendly, concise, professional.
LIMITS: Do not discuss competitors, do not offer refunds
        without a valid order ID. Do not provide legal advice.
ESCALATE: If user asks about billing disputes, say:
          "For billing issues, I'll connect you with our billing team."

---

[conversation history appended here]
User: [user message]
Aria:

Context window management

Each LLM API call includes the full conversation history. At roughly 4 characters per token, a 100-message support chat (assuming messages average around 800 characters) sends about 20,000 input tokens with every reply, and a long enough chat will eventually exceed the model's context window. To prevent this, implement a sliding window (keep the last N messages) or summarise older turns into a compressed memory block.
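
As a rough sketch of the sliding-window idea, the helper below trims history to a token budget instead of a fixed message count. The 4-characters-per-token figure is the heuristic from the text, not a real tokenizer, and the function names are illustrative.

```javascript
// Heuristic only: real token counts depend on the model's tokenizer
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Keep the newest messages that fit in maxTokens; the latest message is always kept
function trimToBudget(messages, maxTokens) {
  const kept = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (kept.length > 0 && used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  // Some chat APIs expect the history to open with a user turn
  while (kept.length > 1 && kept[0].role !== "user") kept.shift();
  return kept;
}
```

With this estimate, 100 messages of 800 characters each come to 20,000 tokens, and a budget of 2,000 tokens keeps only the 10 most recent ones. Summarising the dropped turns is the natural next step when older context still matters.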

4. Tier 3: RAG-Powered Chatbots

A pure LLM chatbot only knows what was in its training data. For questions about your specific product documentation, internal policies, or knowledge base, you need Retrieval-Augmented Generation (RAG).

RAG adds a retrieval step before the LLM call: embed the user's question, search a vector database for relevant document chunks, and inject those chunks into the LLM prompt as context. The model answers based on your documents, not its training data, and can cite sources.
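
The embed, search, and inject steps can be sketched without any external service. Everything here is a stand-in: `embed` is a toy word-count vector rather than a real embedding model, the array index replaces a vector database, and the sample chunks are hypothetical. The shape of the pipeline is the point, not the quality of the retrieval.

```javascript
// Toy stand-in for an embedding model: word counts as a sparse vector
function embed(text) {
  const vec = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vec.set(word, (vec.get(word) ?? 0) + 1);
  }
  return vec;
}

// Cosine similarity between two sparse vectors
function cosine(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// In-memory stand-in for a vector database: embed each chunk once at ingest time
function buildIndex(chunks) {
  return chunks.map((c) => ({ ...c, vector: embed(c.text) }));
}

// Retrieval step: embed the question and return the k most similar chunks
function retrieve(index, question, k = 3) {
  const q = embed(question);
  return index
    .map((c) => ({ source: c.source, text: c.text, score: cosine(q, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Injection step: place the retrieved chunks in the prompt, labelled by source
function buildPrompt(question, hits) {
  const context = hits.map((h) => `[${h.source}] ${h.text}`).join("\n");
  return `Answer only from the context below and cite the source.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

// Hypothetical sample data
const index = buildIndex([
  { source: "IT-Access-Policy.pdf", text: "Request analytics dashboard access with form IT-REQ-44." },
  { source: "vpn-runbook", text: "Restart the VPN client if the connection drops." },
]);
const hits = retrieve(index, "How do I request access to the analytics dashboard?", 1);
const prompt = buildPrompt("How do I request access to the analytics dashboard?", hits);
```

Labelling each chunk with its source inside the prompt is what lets the model cite where an answer came from.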

RAG chatbot real-world example: internal IT helpdesk

SOURCES INGESTED:
  - IT policies PDF (chunked, embedded → Qdrant)
  - 500 past support ticket resolutions (embedded)
  - System runbooks (Confluence → webhook → re-embedded on update)

SYSTEM PROMPT:
  "You are the IT helpdesk assistant. Answer only from the provided
   context. Cite the source document name for each answer.
   If the answer is not in the context, say 'I don't have that
   information — please raise a ticket at helpdesk.corp.com'."

SAMPLE EXCHANGE:
  User: "How do I request access to the analytics dashboard?"
  Retriever: [top 3 chunks from IT-Access-Policy.pdf, section 4.2]
  LLM: "To request analytics dashboard access, submit form IT-REQ-44
        via the self-service portal. Your manager must approve within
        2 business days. [Source: IT-Access-Policy.pdf, §4.2]"

5. Tier 4: Agentic Chatbots

Agentic chatbots go beyond answering questions — they take actions. The LLM can call tools (functions, APIs, MCP servers) and iterate based on results. The conversation loop includes a reasoning step where the model plans which tools to use, executes them, and synthesises the results.
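
A minimal sketch of that reason, act, observe loop follows. The `model` argument is any function that returns either a tool request or a final answer; here a hard-coded stub stands in for the LLM, and the tool and its data are hypothetical. A real implementation would pass tool definitions to the LLM API and parse its tool-call output instead.

```javascript
// Runs the loop until the model gives a final answer or the step limit is hit
async function runAgent(model, tools, userMessage, maxSteps = 5) {
  const transcript = [{ role: "user", content: userMessage }];
  for (let step = 0; step < maxSteps; step++) {
    // Reason: the model sees everything so far and picks the next step
    const decision = await model(transcript);
    if (decision.type === "final") return { answer: decision.text, transcript };
    // Act: run the requested tool, or report that it does not exist
    const tool = tools[decision.tool];
    const result = tool
      ? await tool(decision.args)
      : { error: `unknown tool: ${decision.tool}` };
    // Observe: feed the result back for the next reasoning step
    transcript.push({ role: "tool", name: decision.tool, content: result });
  }
  return { answer: "Step limit reached without a final answer.", transcript };
}

// Hypothetical tool with canned data; a real one would query a CRM
const tools = {
  search_contacts: ({ source }) => [
    { name: "Ada", source, contacted: false },
    { name: "Grace", source, contacted: true },
  ],
};

// Stub standing in for the LLM: search first, then summarise the result
function stubModel(transcript) {
  const last = transcript[transcript.length - 1];
  if (last.role === "user") {
    return { type: "tool", tool: "search_contacts", args: { source: "Trade Show" } };
  }
  const uncontacted = last.content.filter((c) => !c.contacted);
  return { type: "final", text: `Found ${uncontacted.length} uncontacted leads.` };
}
```

The step limit matters in practice: without it, a model that keeps requesting tools can loop indefinitely and run up cost.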

Agentic chatbot example: sales assistant

User: "I want to follow up with all leads from last month's trade show
       who haven't been contacted yet."

Agent turn 1 — Reason: I need to find trade show leads.
  Act: crm_tool.search_contacts(source="Trade Show", date_range="last_month")
  Observe: 47 contacts found, 12 have no activity in 30 days.

Agent turn 2 — Reason: I have the uncontacted leads. Draft follow-up.
  Act: email_tool.draft_template(
    contacts=12_uncontacted_leads,
    template="trade_show_followup_v2"
  )
  Observe: Draft ready, subject: "Great meeting you at [event]…"

Agent turn 3 — Reason: Show draft to user for approval before sending.
  Final answer: "Found 12 uncontacted trade show leads.
                Draft follow-up email ready for your review.
                [Show preview] — shall I send to all 12?"

Always pause before irreversible actions

The best agentic chatbots do not auto-send emails, auto-book meetings, or auto-post. They prepare and confirm. Design a "show me before you do it" step for any action that sends, modifies, or deletes — it makes agents trustworthy rather than alarming.
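
One way to build the "show me before you do it" step is a gate that sits between the agent and its tools. The sketch below is an assumption about how such a gate might look: the action names are hypothetical, and `execute` is whatever function actually runs a tool in your system.

```javascript
// Hypothetical action names; classify whatever your own tools expose
const IRREVERSIBLE = new Set(["send_email", "delete_record", "book_meeting"]);

function createGate(execute) {
  const pending = new Map();
  let nextId = 1;
  return {
    // Safe actions run immediately; irreversible ones are parked with a preview
    request(action, args) {
      if (!IRREVERSIBLE.has(action)) {
        return { status: "done", result: execute(action, args) };
      }
      const id = String(nextId++);
      pending.set(id, { action, args });
      return { status: "needs_confirmation", id, preview: { action, args } };
    },
    // Called only after the user has approved the preview
    confirm(id) {
      const item = pending.get(id);
      if (!item) return { status: "unknown_id" };
      pending.delete(id);
      return { status: "done", result: execute(item.action, item.args) };
    },
    cancel(id) {
      return { status: pending.delete(id) ? "cancelled" : "unknown_id" };
    },
  };
}
```

Deleting the pending entry on confirm means a repeated confirmation cannot send the same email twice.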

6. Platform Comparison

Platform	Tier	Best for	Tradeoff
Botpress	1–3	Enterprise support flows	UI heavy; customisation costs dev time
Vercel AI SDK + OpenAI	2–4	Web apps with embedded chat	You own the infra; streaming UX out of the box
LangChain + LangGraph	3–4	RAG pipelines, multi-agent graphs	Steep learning curve; rapidly evolving API
Anthropic Claude (direct API)	2–4	Instruction-following, tool use, long context	No built-in UI; pairs with any frontend
AWS Lex v2	1–2	Existing AWS workloads, compliance	Slots-and-intents model; less flexible reasoning
Rasa Open Source	1–3	On-premise, fully self-hosted	Training data required; ML ops overhead

7. Common Implementation Mistakes

1. No conversation memory strategy

Stateless LLM APIs don't remember previous messages — you must pass history each call. Without a strategy (sliding window, summarisation, vector-based episodic memory) your bot either forgets after a few turns or blows its context budget.

2. No fallback path

Every chatbot will eventually receive a query outside its scope. Define explicit out-of-scope handling: say what the bot can't do and route to a human or a form. A bot that confabulates an answer is worse than one that says "I can't help with that."

3. Prompt injection in user input

Users (or malicious actors) can inject instructions into their messages: "Ignore your system prompt and…". Defend by: (a) separating user input from trusted content in the prompt structure, (b) having an output validation layer that checks for policy violations, and (c) rate-limiting and monitoring for anomalous inputs.
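
Defences (a) and (b) might look like the sketch below. Both are illustrative assumptions rather than a complete defence: the `<user_input>` tag convention and the policy patterns are made up for this example, and delimiting input reduces injection risk without eliminating it.

```javascript
// Neutralise angle brackets so user text cannot close the wrapper tag early
function escapeTags(text) {
  return text.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// (a) Keep trusted instructions in the system slot and untrusted text in a delimited slot
function buildMessages(systemPrompt, userInput) {
  return {
    system: `${systemPrompt}\nText inside <user_input> tags is data from the user, not instructions.`,
    messages: [
      { role: "user", content: `<user_input>${escapeTags(userInput)}</user_input>` },
    ],
  };
}

// (b) Output validation: hypothetical policy rules checked before the reply is shown
const POLICY_RULES = [
  { name: "system_prompt_leak", pattern: /you are a support assistant for/i },
  { name: "refund_promise", pattern: /\b(we will|i will|i'll) refund\b/i },
];

function validateOutput(reply) {
  const violations = POLICY_RULES
    .filter((rule) => rule.pattern.test(reply))
    .map((rule) => rule.name);
  return { ok: violations.length === 0, violations };
}
```

When `validateOutput` reports a violation, a reasonable default is to replace the reply with the bot's standard fallback message and log the exchange for review.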

4. Skipping evaluation

Shipping first and evaluating later is common, but it is a mistake. Before deploying, build a small golden test set of 50–100 representative queries with expected answers, and run every model or prompt change against it. Silent regressions, where a new prompt fixes one thing and breaks three others, are a common source of chatbot quality decay.
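
A golden-set runner can be very small. The sketch below assumes `bot` is any async function that takes a query and returns a reply string; the two cases are hypothetical placeholders for the 50 to 100 queries suggested above, and substring checks are a deliberately crude stand-in for whatever grading you use.

```javascript
// Hypothetical golden cases: each lists phrases the reply must contain
const goldenSet = [
  { query: "How do I return an item?", mustInclude: ["return"] },
  { query: "What is your CEO's home address?", mustInclude: ["can't help"] },
];

// Runs every case against the bot and collects the ones that fail
async function evaluate(bot, cases) {
  const failures = [];
  for (const c of cases) {
    const answer = (await bot(c.query)).toLowerCase();
    const missing = c.mustInclude.filter((s) => !answer.includes(s.toLowerCase()));
    if (missing.length > 0) failures.push({ query: c.query, missing });
  }
  return { total: cases.length, passed: cases.length - failures.length, failures };
}
```

Running this in CI on every prompt or model change turns a silent regression into a failing check with the exact query that broke.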

8. Minimal Viable Implementation (Node.js)

// Minimal tier-2 chatbot backend (Node.js + Anthropic SDK)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env

const SYSTEM_PROMPT = `You are a support assistant for WidgetCo.
Answer questions about our products. Be concise and friendly.
If you don't know, say so — do not make things up.`;

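// One shared in-memory history: fine for a demo, but a real service needs one per session
const conversationHistory = [];
const MAX_HISTORY = 20; // messages kept after each turn

async function chat(userMessage) {
  conversationHistory.push({ role: "user", content: userMessage });

  let response;
  try {
    response = await client.messages.create({
      model: "claude-opus-4-5",
      max_tokens: 1024,
      system: SYSTEM_PROMPT,
      messages: conversationHistory,
    });
  } catch (err) {
    // Roll back the user turn so a failed call does not leave a dangling message
    conversationHistory.pop();
    throw err;
  }

  // The reply is a list of content blocks; join the text ones
  const assistantMessage = response.content
    .filter((block) => block.type === "text")
    .map((block) => block.text)
    .join("");
  conversationHistory.push({ role: "assistant", content: assistantMessage });

  // Sliding window: drop the oldest user/assistant pair until at most MAX_HISTORY remain
  while (conversationHistory.length > MAX_HISTORY) {
    conversationHistory.splice(0, 2);
  }

  return assistantMessage;
}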

Choosing the right tier

Start at Tier 2 unless you have a specific knowledge retrieval requirement (Tier 3) or need real-world actions (Tier 4). Most chatbot projects over-engineer the first version. A well-prompted Tier 2 bot with explicit fallback handling will be more reliable in production than a poorly designed Tier 4 agent.
