Build Your Own AI Chatbot: A Complete Guide
Introduction
Welcome to the comprehensive guide on building and managing your own AI-powered chatbot using the LangGraph Knowledge Base module. This system allows you to create intelligent agents that can understand documents, remember past interactions, and escalate complex issues to human agents.
0. Prerequisites (Dec 2025 Update)
Before you begin, ensure you have the necessary API keys and database endpoints. This system relies on OpenAI for intelligence and Qdrant for memory.
A. OpenAI API Key
Required for the LLM (reasoning) and creating vector embeddings (if using standard 1536-dimension models like text-embedding-3-small).
- Create Account: Sign up at platform.openai.com.
- Billing: You must add a payment method (credit card) to enable API access. New accounts may get $5 free credit.
- Generate Key: Go to Dashboard > API Keys > Create new secret key.
- Important: Copy the key immediately (it starts with `sk-...`). You cannot see it again once you close the window.
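A common first step is to read the key from an environment variable rather than pasting it into code. The helper below is a minimal sketch using only the standard library; the `sk-` prefix check is just a sanity guard, not full validation.

```python
import os

def load_openai_key() -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key.startswith("sk-"):
        raise RuntimeError(
            "OPENAI_API_KEY is missing or malformed (keys start with 'sk-')"
        )
    return key
```

Export the key in your shell (`export OPENAI_API_KEY=sk-...`) before starting the app.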
B. Qdrant Vector Database (Free Tier)
Required to store your knowledge base documents as “vectors” for semantic search. We use the Qdrant Cloud service.
- Sign Up: Register at cloud.qdrant.io. You can use your Google or GitHub account.
- Free Tier Specs: As of Dec 2025, the “Free Tier” offers a 1GB cluster (approx. 1M vectors), fully managed, with no credit card required.
- Create Cluster: Click “Create Cluster” and select the Free Tier.
- Get Credentials:
- URL: Copy the Cluster URL (e.g., `https://xyz-example.us-east-1.aws.cloud.qdrant.io`).
- API Key: Go to “Data Access Control” and create/copy an API Key.
Note: You do NOT need to create a collection manually. The system automatically creates a collection named `wizmessage` when you upload your first document.
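To confirm your credentials work, you can hit Qdrant's REST endpoint `GET /collections` (part of Qdrant's public HTTP API) with the `api-key` header. This sketch uses only the standard library and assumes `QDRANT_URL` and `QDRANT_API_KEY` are set in the environment:

```python
import json
import os
import urllib.request

def qdrant_url(base: str, path: str) -> str:
    """Join the cluster URL and an API path without doubling slashes."""
    return base.rstrip("/") + path

def list_collections() -> dict:
    """Call Qdrant's REST API; after your first upload, 'wizmessage' should appear here."""
    req = urllib.request.Request(
        qdrant_url(os.environ["QDRANT_URL"], "/collections"),
        headers={"api-key": os.environ["QDRANT_API_KEY"]},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

On a fresh cluster the collections list will simply be empty; that still proves the URL and key are correct.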
1. Knowledge Base Management
The foundation of your chatbot is the data you provide. The Documents tab is where you manage this knowledge.
Uploading Documents
- Supported Formats: PDF, DOCX, TXT, MD.
- Size Limit: Up to 50MB per file.
- Status Tracking: Files go through Uploaded -> Processing -> Processed states. If a file fails, check the error message and retry.
Testing Your KB
Use the KB Search Panel to verify that your documents are being indexed correctly.
- Search Query: Enter a question to see what chunks are retrieved.
- Relevance Score: Each result is graded (Excellent, Good, Fair, Weak) to help you understand quality.
- Metadata: Inspect the source file and page number for every retrieved snippet.
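Under the hood, a grade like this is typically a banded cut of the raw similarity score. The cutoffs below are illustrative assumptions for this sketch, not the product's actual thresholds:

```python
def grade_relevance(score: float) -> str:
    """Map a 0-1 similarity score to a human-readable grade.

    The band boundaries here are illustrative, not the system's real cutoffs.
    """
    if score >= 0.85:
        return "Excellent"
    if score >= 0.70:
        return "Good"
    if score >= 0.50:
        return "Fair"
    return "Weak"
```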
2. Tuning & Configuration
Fine-tune how your chatbot behaves using the Settings tabs.
Relevance Settings
Control how strict the bot is when matching user queries to documents:
- Top K: Number of document chunks to retrieve (e.g., 5 or 10).
- Relevance Thresholds: Define what counts as a “High” or “Medium” match.
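Conceptually, these two settings combine into a filter-then-cap step over the raw retrieval results. A minimal sketch (the default threshold value is an assumption):

```python
def select_chunks(results, top_k=5, threshold=0.5):
    """Keep chunks scoring at or above the threshold, best first, capped at top_k.

    `results` is a list of (chunk_text, score) pairs; 0.5 is an illustrative
    default, not the product's.
    """
    kept = [r for r in results if r[1] >= threshold]
    kept.sort(key=lambda r: r[1], reverse=True)
    return kept[:top_k]
```

Raising the threshold trades recall for precision; raising Top K does the opposite.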
Answer Quality
Customize the generation process:
- Max Snippets: Limit how many pieces of information the LLM uses to form an answer.
- Context Window: Set the maximum number of characters to feed into the model.
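The two limits interact: snippets are added to the prompt until either the snippet count or the character budget runs out. A minimal sketch of that packing step (limit values are illustrative):

```python
def build_context(snippets, max_snippets=5, max_chars=4000):
    """Concatenate retrieved snippets until either limit is hit."""
    parts, used = [], 0
    for snippet in snippets[:max_snippets]:
        if used + len(snippet) > max_chars:
            break  # character budget exhausted; stop adding snippets
        parts.append(snippet)
        used += len(snippet)
    return "\n\n".join(parts)
```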
3. Advanced Integrations
Power up your chatbot with cutting-edge AI capabilities.
LLM Providers
Choose the brain behind your bot:
- Providers: Switch between OpenAI and Gemini.
- Models: Select specific models like `gpt-4o` for reasoning or `gemini-2.5-flash` for speed.
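When wiring up a provider switch, it helps to validate the provider/model pair before making any API calls. The model lists below are illustrative examples, not an exhaustive catalog; check each provider's docs for current names:

```python
# Illustrative model lists; consult each provider's documentation for current names.
SUPPORTED = {
    "openai": {"gpt-4o", "gpt-4o-mini"},
    "gemini": {"gemini-2.5-flash", "gemini-2.5-pro"},
}

def validate_llm_config(provider: str, model: str) -> None:
    """Fail fast on a typo'd provider or a model that belongs to the other provider."""
    if provider not in SUPPORTED:
        raise ValueError(f"Unknown provider: {provider}")
    if model not in SUPPORTED[provider]:
        raise ValueError(f"{model} is not a known {provider} model")
```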
Vector Database (Qdrant)
Connect to a Qdrant instance for high-performance semantic search.
- Configuration: Enter your Qdrant URL and API Key.
- Collection: The default collection name is `wizmessage`.
Vision & Voice
- Vision (Phase 1): Enable image analysis to let users send photos. Configure monthly cost limits and model selection (e.g., `gpt-4o`).
- Speech (Phase 2): Allow voice notes. The bot transcribes audio using models like `gpt-4o-transcribe` before responding.
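A transcription step typically checks the file type first, then hands the audio to OpenAI's transcription endpoint. This is a sketch, not the product's implementation: the extension list is an assumption, and `transcribe` requires `pip install openai` plus a configured `OPENAI_API_KEY`.

```python
def is_supported_audio(filename: str) -> bool:
    """WhatsApp voice notes typically arrive as Opus/OGG; this list is an assumption."""
    return filename.lower().endswith((".ogg", ".opus", ".mp3", ".m4a", ".wav"))

def transcribe(path: str) -> str:
    """Send an audio file to OpenAI's transcription API and return the text."""
    from openai import OpenAI  # imported lazily so the helper above stays dependency-free
    with open(path, "rb") as f:
        result = OpenAI().audio.transcriptions.create(
            model="gpt-4o-transcribe", file=f
        )
    return result.text
```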
4. Branding & Customization
Make the chatbot feel like part of your team.
- Headers & Footers: Add standard text to the beginning or end of every message.
- CTA Buttons: Append a “Visit Website” or “Contact Sales” button to responses to drive conversion.
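Branding of this kind usually boils down to wrapping each outgoing reply with optional standard text. A minimal sketch, assuming header and footer are plain strings configured in the dashboard:

```python
def brand_message(body: str, header: str = "", footer: str = "") -> str:
    """Wrap an outgoing reply with optional standard header/footer text."""
    parts = [p for p in (header, body, footer) if p]
    return "\n\n".join(parts)
```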
5. Performance & Caching
Optimize for speed and cost efficiency using the 3-tier caching system. This can reduce LLM costs by 40-80%.
- Exact Match Cache: Instant response for identical repeated queries.
- Semantic Cache: Uses vector similarity to answer questions with the same meaning (requires Qdrant).
- KB Results Cache: Caches the document retrieval step to save database processing time.
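The first tier is the simplest to picture: hash the normalized query and look it up. The sketch below illustrates that tier only (case/whitespace normalization before hashing is an assumption); the semantic tier additionally needs vector similarity via Qdrant.

```python
import hashlib

class ExactMatchCache:
    """Tier 1: instant answers for repeated queries that normalize identically."""

    def __init__(self):
        self._store = {}

    def _key(self, query: str) -> str:
        # Normalize case and surrounding whitespace so "Hello?" and " hello? " match.
        return hashlib.sha256(query.strip().lower().encode("utf-8")).hexdigest()

    def get(self, query: str):
        return self._store.get(self._key(query))

    def put(self, query: str, answer: str) -> None:
        self._store[self._key(query)] = answer
```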
6. Escalation Management
When the AI can’t help, seamlessly hand off to a human agent.
Dashboard
Monitor the health of your support system with real-time stats (Total, Active, Resolved) and Average Resolution Time.
Managing Tickets
- Filtering: Sort by Status, Urgency, or Topic (Billing, Technical).
- Review: Read the full conversation history to understand the user’s issue before replying.
- Interaction: Reply directly via WhatsApp from the dashboard.
Resolution & Automation
- Bulk Resolve: Close multiple outdated tickets at once.
- Auto-Resolve: Set a timer (e.g., 24 hours) to automatically close inactive escalations.
- Export: Download CSV reports for offline analysis.
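The auto-resolve rule above reduces to a simple inactivity check. A minimal sketch, with 24 hours as the illustrative default timer:

```python
from datetime import datetime, timedelta

def should_auto_resolve(last_activity: datetime, now: datetime, hours: int = 24) -> bool:
    """An escalation is auto-closed once it has been inactive longer than the timer."""
    return now - last_activity > timedelta(hours=hours)
```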
7. Long Term Memory (LTM) & Satisfaction
Your chatbot learns from its successes to become smarter over time.
How LTM Works
The system stores “successful patterns” (Question + Answer) in a dedicated Long Term Memory. When a new user asks a similar question, the bot recalls the successful answer from the past.
Satisfaction Signals
How does the bot know it did a good job? It listens for:
- Reactions: Positive emojis (👍, ❤️, 🎉) on a bot message.
- Text: Explicit messages like “Thanks!”, “Perfect”, or “That helped.”
Note: LTM entries automatically expire after 30 days or if the underlying knowledge base document is updated, ensuring answers never get stale.
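The two expiry rules can be sketched as a single staleness check: an entry dies after 30 days, or immediately if its source document was updated after the entry was stored.

```python
from datetime import datetime, timedelta
from typing import Optional

def ltm_is_stale(
    stored_at: datetime, now: datetime, doc_updated_at: Optional[datetime]
) -> bool:
    """An LTM entry expires after 30 days, or sooner if its source document changed."""
    if doc_updated_at is not None and doc_updated_at > stored_at:
        return True  # underlying KB document was updated; answer may be outdated
    return now - stored_at > timedelta(days=30)
```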
8. How It Works (Behind the Scenes)
Curious about how the magic happens? Here is a simplified look at how your chatbot learns and thinks.
Phase A: The Learning Process
- Upload: You submit a document (PDF, Word, etc.) via the dashboard.
- Reading: The system reads your file and breaks it down into small, logical pieces (like paragraphs or sections).
- Understanding: Each piece is converted into a digital “fingerprint” (vector) that represents its meaning, not just key words. This allows the AI to understand context.
- Memorizing: These fingerprints are securely stored in your database (Qdrant), creating a searchable library of your knowledge.
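The "Reading" step above, splitting a file into small logical pieces, can be sketched as paragraph-based chunking. This is a simplified illustration, not the system's actual chunker, and the 800-character budget is an assumption:

```python
def chunk_text(text: str, max_chars: int = 800) -> list:
    """Split on blank lines (paragraphs), merging small pieces up to max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)  # budget exceeded; start a new chunk
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the embedding model and stored in Qdrant with its source metadata.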
Phase B: The Thinking Process
When a user sends a message, the chatbot follows a smart decision path:
- Language Detection: First, it instantly identifies the language (e.g., Spanish, Hindi) to ensure it replies correctly.
- Intent Recognition: It figures out what the user wants. Are they just saying “Hi”, or asking a specific question like “How do I reset my password?”
- Smart Search: If it’s a question, the bot searches your library for the most relevant answers. It only looks at your data, ensuring total privacy.
- Drafting the Reply: The AI combines the user’s question with the facts it found to write a natural, helpful response.
- Proactive Help: Even if a user is complaining, the bot quietly checks whether your knowledge base contains a known solution. If it finds one, it interrupts the standard complaint flow to solve the problem immediately!
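The decision path above can be caricatured as a tiny router. In the real system each step is handled by the LLM; the keyword rules here are purely a toy illustration of the branching, and every name below is hypothetical:

```python
GREETINGS = {"hi", "hello", "hey"}  # toy list; real intent detection uses the LLM

def route_message(text: str) -> str:
    """Toy sketch of the decision path: greeting, question, or complaint."""
    cleaned = text.strip().lower().rstrip("!.?")
    if cleaned in GREETINGS:
        return "greet"
    if "?" in text or cleaned.startswith(("how", "what", "why", "where", "can")):
        return "kb_search"
    # Complaints still trigger a quiet KB lookup for a known fix (proactive help).
    return "complaint_with_kb_check"
```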

