1. The Four Chatbot Architectures
Chatbot design is a spectrum. Each step up adds capability and complexity. Choosing the wrong tier — building an agentic system for a two-question FAQ flow — is as damaging as building a scripted bot for a task that needs reasoning.
2. Tier 1: Scripted Chatbots
Scripted chatbots navigate users through a pre-defined decision tree. Every response is authored by a human. There is no model inference — only pattern matching or button selection. They are not "dumb" — they are deliberately bounded, which is a feature for compliance-sensitive environments.
When to use scripted bots
- Banking FAQs where every answer must be compliance-approved
- Returns/refund flows with a fixed set of outcomes
- Appointment booking with no free-text needed
- Emergency response systems (the bot must not improvise)
Architecture pattern
User selects button / sends keyword
↓
Intent Matcher (regex or keyword list)
↓
Decision Tree Node lookup
↓
Template Response renderer
↓
Reply with structured message (text + button options)
Common platforms: Dialogflow ES (pre-LLM mode), Botpress, WhatsApp Business API with button replies, Freshdesk Freddy.
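The pipeline above fits in a few lines of JavaScript. This is a minimal sketch, not any platform's API: the tree nodes, keyword patterns, and the `handleMessage` helper are all hypothetical names chosen for illustration.

```javascript
// Hypothetical decision tree: every reply is human-authored text plus buttons.
const TREE = {
  start: { text: "Hi {name}, how can I help?", buttons: ["Track order", "Return item"] },
  track: { text: "Please enter your order number.", buttons: [] },
  returns: { text: "I can help with a return.", buttons: ["Start return", "Main menu"] },
  fallback: { text: "Sorry, I didn't catch that. Please pick an option.", buttons: ["Track order", "Return item"] },
};

// Intent matcher: ordered regex rules, first match wins.
const INTENTS = [
  { pattern: /\b(track|where is)\b/i, node: "track" },
  { pattern: /\b(return|refund)\b/i, node: "returns" },
  { pattern: /\b(hi|hello|menu)\b/i, node: "start" },
];

function matchIntent(message) {
  const rule = INTENTS.find((r) => r.pattern.test(message));
  return rule ? rule.node : "fallback";
}

// Template renderer: fills {placeholders} from a context object.
function render(template, context) {
  return template.replace(/\{(\w+)\}/g, (_, key) => context[key] ?? "");
}

// Decision tree lookup followed by rendering; no model inference anywhere.
function handleMessage(message, context = {}) {
  const node = TREE[matchIntent(message)];
  return { text: render(node.text, context), buttons: node.buttons };
}
```

Because every reachable reply lives in `TREE`, a compliance reviewer can approve the whole bot by reading one object, which is the main appeal of this tier.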
3. Tier 2: LLM Prompt-Based Chatbots
Add a large language model and the bot can interpret free-text input without hand-written intent rules. The simplest form is a system prompt that constrains the model's persona, scope, and tone; the model handles the rest.
System prompt anatomy
You are "Aria", a customer support assistant for WidgetCo.
SCOPE: Answer questions about WidgetCo products, orders, and returns.
TONE: Friendly, concise, professional.
LIMITS: Do not discuss competitors, do not offer refunds
without a valid order ID. Do not provide legal advice.
ESCALATE: If user asks about billing disputes, say:
"For billing issues, I'll connect you with our billing team."
---
[conversation history appended here]
User: [user message]
Aria:
Each LLM API call includes the full conversation history. At roughly 4 characters per token, a 100-message support chat whose messages average about 800 characters costs 20,000+ input tokens per reply, and a long enough chat eventually exceeds the model's context window. To prevent this, implement a sliding window (keep the last N messages) or summarise older turns into a compressed memory block.
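A sliding window can be enforced with a small helper that runs before each API call. The sketch below is one possible approach: `estimateTokens` and `trimHistory` are hypothetical helpers, the default limits are arbitrary, and the 4-characters-per-token figure is only a rule of thumb (use your provider's tokenizer or token-counting endpoint for real budgets).

```javascript
// Rough token estimate using the ~4 characters per token rule of thumb.
function estimateTokens(messages) {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Return the most recent messages that fit within both limits.
// The input array is not modified.
function trimHistory(messages, maxMessages = 20, maxTokens = 4000) {
  let kept = messages.slice(-maxMessages);
  // Drop the oldest messages until the estimate fits the token budget.
  while (kept.length > 1 && estimateTokens(kept) > maxTokens) {
    kept = kept.slice(1);
  }
  // Chat APIs commonly expect the history to open with a user turn.
  while (kept.length > 1 && kept[0].role !== "user") {
    kept = kept.slice(1);
  }
  return kept;
}
```

Summarisation works the same way structurally: instead of discarding the dropped messages, you would send them to the model once, ask for a short summary, and prepend that summary to the kept window.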
4. Tier 3: RAG-Powered Chatbots
A pure LLM chatbot only knows what was in its training data. For questions about your specific product documentation, internal policies, or knowledge base, you need Retrieval-Augmented Generation (RAG).
RAG adds a retrieval step before the LLM call: embed the user's question, search a vector database for relevant document chunks, and inject those chunks into the LLM prompt as context. The model is instructed to answer from your documents rather than from its training data, and it can cite sources.
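The embed, search, and inject steps can be sketched without any external service. Everything here is a stand-in chosen for illustration: `embed` is a toy bag-of-words vector rather than a real embedding model, `retrieve` scans an in-memory array rather than querying a vector database, and the prompt wording in `buildPrompt` is an assumption.

```javascript
// Toy stand-in for an embedding model: a bag-of-words count vector.
// A real system would call an embedding API here instead.
function embed(text) {
  const vector = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector.set(word, (vector.get(word) ?? 0) + 1);
  }
  return vector;
}

function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Stand-in for a vector database query: score every chunk, keep the top k.
function retrieve(question, chunks, k = 3) {
  const q = embed(question);
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(q, embed(chunk.text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Inject the retrieved chunks into the prompt as labelled context.
function buildPrompt(question, retrieved) {
  const context = retrieved
    .map((c) => `[Source: ${c.source}]\n${c.text}`)
    .join("\n\n");
  return `Answer only from the context below and cite the source.\n\nCONTEXT:\n${context}\n\nQUESTION: ${question}`;
}
```

In production the shape stays the same; only `embed` and `retrieve` are swapped for an embedding model and a vector store query.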
RAG chatbot real-world example: internal IT helpdesk
SOURCES INGESTED:
- IT policies PDF (chunked, embedded → Qdrant)
- 500 past support ticket resolutions (embedded)
- System runbooks (Confluence → webhook → re-embedded on update)
SYSTEM PROMPT:
"You are the IT helpdesk assistant. Answer only from the provided
context. Cite the source document name for each answer.
If the answer is not in the context, say 'I don't have that
information — please raise a ticket at helpdesk.corp.com'."
SAMPLE EXCHANGE:
User: "How do I request access to the analytics dashboard?"
Retriever: [top 3 chunks from IT-Access-Policy.pdf, section 4.2]
LLM: "To request analytics dashboard access, submit form IT-REQ-44
via the self-service portal. Your manager must approve within
2 business days. [Source: IT-Access-Policy.pdf, §4.2]"
5. Tier 4: Agentic Chatbots
Agentic chatbots go beyond answering questions — they take actions. The LLM can call tools (functions, APIs, MCP servers) and iterate based on results. The conversation loop includes a reasoning step where the model plans which tools to use, executes them, and synthesises the results.
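The reason, act, observe loop can be sketched with a scripted stand-in for the model. Nothing here is a real SDK call: `fakeModel`, the tool names, and the decision shape (`tool_call` or `final`) are assumptions for illustration, and the functions are synchronous only to keep the sketch short (real model and tool calls would be awaited).

```javascript
// Hypothetical tool registry: plain functions keyed by name.
const tools = {
  search_contacts: ({ source }) => [
    { name: "Ada", source, contacted: false },
    { name: "Lin", source, contacted: true },
  ],
  count_uncontacted: ({ contacts }) => contacts.filter((c) => !c.contacted).length,
};

// Scripted stand-in for the LLM: picks the next step from the transcript.
// A real model would return tool calls or a final answer through its API.
function fakeModel(transcript) {
  const observations = transcript.filter((t) => t.role === "tool");
  if (observations.length === 0) {
    return { type: "tool_call", name: "search_contacts", args: { source: "Trade Show" } };
  }
  if (observations.length === 1) {
    return { type: "tool_call", name: "count_uncontacted", args: { contacts: observations[0].result } };
  }
  return { type: "final", text: `Found ${observations[1].result} uncontacted lead(s).` };
}

// Reason -> act -> observe loop with a hard step limit.
function runAgent(userMessage, model, toolRegistry, maxSteps = 5) {
  const transcript = [{ role: "user", content: userMessage }];
  for (let step = 0; step < maxSteps; step++) {
    const decision = model(transcript);
    if (decision.type === "final") return decision.text;
    const tool = toolRegistry[decision.name];
    // Feed unknown-tool errors back to the model instead of crashing.
    const result = tool ? tool(decision.args) : { error: `Unknown tool: ${decision.name}` };
    transcript.push({ role: "tool", name: decision.name, result });
  }
  return "Stopped: step limit reached without a final answer.";
}
```

The step limit matters in practice: without it, a model that keeps requesting tools can loop indefinitely and run up cost.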
Agentic chatbot example: sales assistant
User: "I want to follow up with all leads from last month's trade show
who haven't been contacted yet."
Agent turn 1 — Reason: I need to find trade show leads.
Act: crm_tool.search_contacts(source="Trade Show", date_range="last_month")
Observe: 47 contacts found, 12 have no activity in 30 days.
Agent turn 2 — Reason: I have the uncontacted leads. Draft follow-up.
Act: email_tool.draft_template(
    contacts=12_uncontacted_leads,
    template="trade_show_followup_v2"
)
Observe: Draft ready, subject: "Great meeting you at [event]…"
Agent turn 3 — Reason: Show draft to user for approval before sending.
Final answer: "Found 12 uncontacted trade show leads.
Draft follow-up email ready for your review.
[Show preview] Shall I send to all 12?"
The best agentic chatbots do not auto-send emails, auto-book meetings, or auto-post. They prepare and confirm. Design a "show me before you do it" step for any action that sends, modifies, or deletes; it makes agents trustworthy rather than alarming.
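One way to build the "show me before you do it" step is a gate between the agent and its tools. The sketch below is an assumption about how such a gate could look: the action type names and the `propose` and `confirm` helpers are hypothetical, and the pending store is in-memory only.

```javascript
// Actions that send, modify, or delete need explicit user approval.
const SIDE_EFFECT_ACTIONS = new Set(["send_email", "update_record", "delete_record"]);

const pending = new Map(); // confirmation id -> proposed action
let nextId = 1;

// Run read-only actions immediately; park side-effecting ones for review.
function propose(action, execute) {
  if (!SIDE_EFFECT_ACTIONS.has(action.type)) {
    return { status: "done", result: execute(action) };
  }
  const id = String(nextId++);
  pending.set(id, { action, execute });
  return { status: "needs_confirmation", id, preview: action };
}

// Called only after the user has seen the preview and replied.
function confirm(id, approved) {
  const entry = pending.get(id);
  if (!entry) return { status: "unknown_id" };
  pending.delete(id); // each confirmation can be used once
  return approved
    ? { status: "done", result: entry.execute(entry.action) }
    : { status: "cancelled" };
}
```

Keeping the gate outside the model means a prompt cannot talk the agent out of it: the side effect simply cannot run until `confirm` is called with the user's answer.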
6. Platform Comparison
| Platform | Tier | Best for | Tradeoff |
|---|---|---|---|
| Botpress | 1–3 | Enterprise support flows | UI heavy; customisation costs dev time |
| Vercel AI SDK + OpenAI | 2–4 | Web apps with embedded chat | You own the infra; streaming UX out of the box |
| LangChain + LangGraph | 3–4 | RAG pipelines, multi-agent graphs | Steep learning curve; rapidly evolving API |
| Anthropic Claude (direct API) | 2–4 | Instruction-following, tool use, long context | No built-in UI; pairs with any frontend |
| AWS Lex v2 | 1–2 | Existing AWS workloads, compliance | Slots-and-intents model; less flexible reasoning |
| Rasa Open Source | 1–3 | On-premise, fully self-hosted | Training data required; ML ops overhead |
7. Common Implementation Mistakes
1. No conversation memory strategy
Stateless LLM APIs don't remember previous messages — you must pass history each call. Without a strategy (sliding window, summarisation, vector-based episodic memory) your bot either forgets after a few turns or blows its context budget.
2. No fallback path
Every chatbot will eventually receive a query outside its scope. Define explicit out-of-scope handling: say what the bot can't do and route to a human or a form. A bot that confabulates an answer is worse than one that says "I can't help with that."
3. Prompt injection in user input
Users (or malicious actors) can inject instructions into their messages: "Ignore your system prompt and…". Defend by: (a) separating user input from trusted content in the prompt structure, (b) having an output validation layer that checks for policy violations, and (c) rate-limiting and monitoring for anomalous inputs.
4. Skipping evaluation
Shipping first and evaluating later is the norm, but it is a mistake. Before deploying, build a small golden test set of 50–100 representative queries with expected answers. Run every model or prompt change against it. Silent regressions, where a new prompt fixes one thing and breaks three others, are the most common source of chatbot quality decay.
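A golden-set check does not need a framework to get started. The sketch below is a minimal illustration under stated assumptions: the case format (`mustInclude` and `mustNotInclude` phrase lists) and the `evaluate` helper are hypothetical, the two sample cases are invented, and `bot` is treated as a synchronous function for brevity (a real bot call would be awaited).

```javascript
// Hypothetical golden set: phrases each reply must or must not contain.
const goldenSet = [
  { query: "How do I return an item?", mustInclude: ["return"], mustNotInclude: ["competitor"] },
  { query: "What is your legal advice?", mustInclude: ["can't help"], mustNotInclude: [] },
];

// Run every case through the bot and collect the failures.
function evaluate(bot, cases) {
  const failures = [];
  for (const testCase of cases) {
    const reply = bot(testCase.query).toLowerCase();
    const missing = testCase.mustInclude.filter((p) => !reply.includes(p.toLowerCase()));
    const forbidden = testCase.mustNotInclude.filter((p) => reply.includes(p.toLowerCase()));
    if (missing.length || forbidden.length) {
      failures.push({ query: testCase.query, missing, forbidden });
    }
  }
  return { total: cases.length, passed: cases.length - failures.length, failures };
}
```

Substring checks are crude, but they are cheap enough to run on every prompt change; stricter graders can replace them later without changing the loop.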
8. Minimal Viable Implementation (Node.js)
// Minimal tier-2 chatbot backend (Node.js + Anthropic SDK)
// Run as an ES module (a .mjs file, or "type": "module" in package.json).
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env

const SYSTEM_PROMPT = `You are a support assistant for WidgetCo.
Answer questions about our products. Be concise and friendly.
If you don't know, say so. Do not make things up.`;

// In-memory history for a single conversation
const conversationHistory = [];

async function chat(userMessage) {
  conversationHistory.push({ role: "user", content: userMessage });

  const response = await client.messages.create({
    model: "claude-opus-4-5",
    max_tokens: 1024,
    system: SYSTEM_PROMPT,
    messages: conversationHistory,
  });

  // With no tools configured, the first content block is the text reply
  const assistantMessage = response.content[0].text;
  conversationHistory.push({ role: "assistant", content: assistantMessage });

  // Sliding window: keep the last 20 messages to control token usage.
  // Dropping a whole user/assistant pair keeps the history starting on a user turn.
  if (conversationHistory.length > 20) {
    conversationHistory.splice(0, 2);
  }
  return assistantMessage;
}
Start at Tier 2 unless you have a specific knowledge retrieval requirement (Tier 3) or need real-world actions (Tier 4). Most chatbot projects over-engineer the first version. A well-prompted Tier 2 bot with good system instructions and explicit fallback handling will usually outperform a poorly designed Tier 4 agent in production reliability.