Ecommerce AI Series - Built on TechHeaven - Intermediate - Live

AI Customer Support with RAG

Answer customer questions automatically using your product knowledge base

The Problem

TechHeaven’s support queue is full of the same questions

TechHeaven’s support team receives hundreds of questions every week. The answers exist - in the return policy, the product descriptions, the FAQ, and the order database. But finding and writing each answer still takes human time.

“Where is my order #12847?” - Order database

“Can I return opened headphones?” - Return policy

“Does the Sony WH-1000XM5 work with MacBook Air?” - Product attributes

“Is my laptop still under warranty if the screen cracked?” - Warranty terms

“How long does express shipping take to Texas?” - Shipping policy

The answer to each question already exists in TechHeaven’s data. The problem is connecting the question to the right piece of information - automatically, accurately, and without hallucinating details that are not in the source.

The Limitation

Why keyword search is not enough

Traditional search returns documents. It cannot synthesize an answer from those documents or reason about what the customer actually needs.

No semantic understanding

"Send back" and "return" mean the same thing to a customer, but keyword search treats them as different queries.

Returns documents, not answers

Searching "return policy" returns the policy page - but does not answer "Can I return an opened product?"

No context awareness

The customer's order history, product details, and account status are invisible to a keyword search.

Cannot synthesize across sources

An answer about warranty on a specific product requires joining the product record, the category warranty rules, and the general warranty policy.
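
The first limitation can be shown with a small sketch. It assumes a naive keyword matcher; the helper names and the policy sentence are illustrative and are not TechHeaven's search or policy wording.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_overlap(query: str, document: str) -> set[str]:
    """Words that appear in both the query and the document."""
    return tokenize(query) & tokenize(document)

# Illustrative policy sentence, not TechHeaven's actual wording.
policy = "Customers may return opened electronics within the return window."

same_wording = keyword_overlap("Can I return opened headphones?", policy)
different_wording = keyword_overlap("Can I send back headphones I already unboxed?", policy)

print(sorted(same_wording))       # ['opened', 'return']
print(sorted(different_wording))  # []
```

Both questions ask the same thing, but the second shares no words with the policy sentence, so a pure keyword match finds nothing.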

The Solution

Retrieval Augmented Generation

RAG solves this by separating two problems: finding the right information (retrieval) and generating a coherent answer from it (generation). The model does not memorize your data - it looks it up at query time.

This means the AI is instructed to answer only from what it retrieves. If the answer is not in the knowledge base, it is prompted to say so instead of inventing one.

RAG Architecture - TechHeaven Customer Support

1. Customer Question: "Can I return opened headphones?"
2. Embedding Model: Convert question to vector
3. Vector Search: TechHeaven knowledge base - Return policy, Product FAQs, Warranty terms
4. Retrieved Context: Top-k most relevant chunks
5. LLM + Prompt: Question + context - generate grounded answer
6. Grounded Answer: "Based on our return policy, opened electronics..."

Core Concepts

How each piece works

Knowledge Base

The collection of documents the AI can retrieve from. For TechHeaven, this is the return policy, shipping policy, warranty terms, product descriptions, FAQ entries, and structured order data.

TechHeaven: 10 entity types - Products, Policies, FAQ, CMS pages, and more. Each becomes a source of retrievable documents.

Chunking

Whole documents are usually too long and too broad to retrieve as a single unit. Chunking splits them into smaller, overlapping pieces - typically 200-500 tokens each - so the most relevant section can be retrieved without pulling in the entire document.

TechHeaven: The return policy (2,000 words) becomes ~8 chunks. A customer asking about electronics returns retrieves the electronics-specific chunk, not the entire policy.
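
A minimal sketch of overlapping chunking, using the 300-token size and 50-token overlap from the ingestion steps as defaults. The function name `chunk_tokens` is hypothetical, and whitespace-separated words stand in for model tokens, which is an assumption: a real tokenizer would typically produce more tokens than words.

```python
def chunk_tokens(tokens: list[str], size: int = 300, overlap: int = 50) -> list[list[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks

# Whitespace-separated words stand in for model tokens here (an assumption).
words = [f"w{i}" for i in range(2000)]
chunks = chunk_tokens(words)

print(len(chunks))                        # 8
print(chunks[0][-50:] == chunks[1][:50])  # True: neighbours share 50 tokens
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.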

Embeddings

A numerical representation of a chunk's meaning. Text with similar meaning has similar embeddings, even if the words are different. "Return" and "send back" map to nearby points in embedding space.

TechHeaven: Every chunk is embedded once, at index time. Embeddings are stored in a vector database alongside the original text.
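
To make "nearby points in embedding space" concrete, here is a sketch of cosine similarity, one common way to compare embeddings. The three-dimensional vectors are hand-made for illustration; real embedding models return vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy vectors chosen by hand for illustration, not output from a real model.
toy_embeddings = {
    "return": [0.9, 0.1, 0.0],
    "send back": [0.8, 0.2, 0.1],
    "express shipping": [0.1, 0.9, 0.2],
}

similar = cosine_similarity(toy_embeddings["return"], toy_embeddings["send back"])
unrelated = cosine_similarity(toy_embeddings["return"], toy_embeddings["express shipping"])

print(round(similar, 2))    # 0.98
print(round(unrelated, 2))  # 0.21
```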

Vector Search

The customer's question is embedded using the same model as the knowledge base. The vector database returns the k chunks whose embeddings are closest to the question embedding - the most semantically relevant passages.

TechHeaven: "Can I return opened headphones?" retrieves the return policy chunk about opened electronics, the headphone product FAQ, and the general returns FAQ - not the shipping policy.
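
A brute-force sketch of top-k retrieval over an in-memory index. The chunk ids and vectors are invented for illustration, and a production vector database would normally use an approximate nearest-neighbour index rather than scoring every chunk.

```python
import heapq
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 5):
    """Return the k (score, chunk_id) pairs most similar to the query, best first."""
    scored = ((cosine(query_vec, vec), chunk_id) for chunk_id, vec in index)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])

# Hypothetical chunk ids with hand-made toy vectors.
index = [
    ("returns/opened-electronics", [0.9, 0.1, 0.0]),
    ("faq/headphones", [0.7, 0.3, 0.1]),
    ("faq/returns-general", [0.8, 0.1, 0.3]),
    ("shipping/express", [0.1, 0.9, 0.2]),
]
question_vec = [0.85, 0.15, 0.05]  # stands in for the embedded customer question

hits = top_k(question_vec, index, k=3)
print([chunk_id for _, chunk_id in hits])
# ['returns/opened-electronics', 'faq/headphones', 'faq/returns-general']
```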

Prompt Construction

The retrieved chunks are injected into a prompt template alongside the customer's question. The template instructs the LLM to answer only from the provided context.

TechHeaven: Template: "You are TechHeaven's support assistant. Answer using only the following information: [retrieved chunks]. If the answer is not in the context, say so. Question: [question]"
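
One way the template might be filled in, as a sketch. The chunk dictionary shape (`source`, `text`) and the numbered source labels are assumptions added for illustration; the placeholder chunk texts are not real policy content.

```python
PROMPT_TEMPLATE = (
    "You are TechHeaven's support assistant. "
    "Answer using only the following information:\n\n"
    "{context}\n\n"
    "If the answer is not in the context, say so.\n"
    "Question: {question}"
)

def build_prompt(question: str, chunks: list[dict]) -> str:
    """Number each retrieved chunk, tag it with its source, and fill the template."""
    context = "\n\n".join(
        f"[{i}] (source: {chunk['source']})\n{chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return PROMPT_TEMPLATE.format(context=context, question=question)

# Placeholder chunk texts; real ones would come from the vector search step.
retrieved = [
    {"source": "return-policy", "text": "<text of the opened-electronics section>"},
    {"source": "headphones-faq", "text": "<text of the headphone returns FAQ entry>"},
]

print(build_prompt("Can I return opened headphones?", retrieved))
```

Labelling each chunk with its source makes it easier for the answer to cite where a statement came from.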

Hallucination Prevention

Without grounding, LLMs can confidently invent policies that do not exist. RAG reduces this risk by instructing the LLM to answer only from the retrieved context. When the knowledge base does not contain the answer, the system responds with a fallback instead of guessing.

TechHeaven: "Is this product compatible with X?" - if no compatibility data exists in the product records, the system says "I don't have compatibility information for this product" rather than inventing an answer.
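
The prompt instruction is one safeguard. A common complement, sketched here as an assumption rather than something this page prescribes, is to skip the LLM call entirely when no retrieved chunk is similar enough to the question. The threshold value, the fallback wording, and the `generate` callable are all hypothetical.

```python
from typing import Callable

FALLBACK = "I don't have that information in our knowledge base."  # hypothetical wording

def answer_or_fallback(
    question: str,
    scored_chunks: list[tuple[float, str]],
    generate: Callable[[str, list[str]], str],
    min_score: float = 0.75,  # hypothetical threshold; tune it on an evaluation set
) -> str:
    """Call the LLM only when at least one chunk clears the similarity threshold."""
    context = [text for score, text in scored_chunks if score >= min_score]
    if not context:
        return FALLBACK
    return generate(question, context)

def fake_generate(question: str, context: list[str]) -> str:
    """Stand-in for the real LLM call."""
    return f"Answer grounded in {len(context)} chunk(s)."

print(answer_or_fallback("Can I return opened headphones?", [(0.91, "chunk A"), (0.40, "chunk B")], fake_generate))
# Answer grounded in 1 chunk(s).
print(answer_or_fallback("Is this compatible with X?", [(0.32, "chunk C")], fake_generate))
# I don't have that information in our knowledge base.
```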

Evaluation

RAG systems need ongoing measurement. Key metrics: Retrieval precision (did we retrieve the right chunks?), Answer faithfulness (did the LLM stay within the context?), Answer completeness (did it answer the full question?). Frameworks like RAGAS automate this.

TechHeaven: 200 question-answer pairs built from known policy content. Each pipeline change is evaluated against this set before deployment.
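
Retrieval metrics can be computed without an LLM once each test question is labelled with the chunks that should be retrieved. The sketch below is hand-rolled and does not use the RAGAS API; the test-case shape and chunk ids are assumptions. Faithfulness and completeness usually need a judge model or human review and are not covered here.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunks that are labelled relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for chunk_id in top if chunk_id in relevant) / len(top)

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant chunks that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)

# Two hypothetical test cases; a real set would hold the labelled question-answer pairs.
test_cases = [
    {"retrieved": ["a", "b", "c", "d", "e"], "relevant": {"a", "c"}},
    {"retrieved": ["x", "y"], "relevant": {"y", "z"}},
]

mean_precision = sum(precision_at_k(c["retrieved"], c["relevant"], 5) for c in test_cases) / len(test_cases)
mean_recall = sum(recall_at_k(c["retrieved"], c["relevant"], 5) for c in test_cases) / len(test_cases)
print(f"precision@5={mean_precision:.2f} recall@5={mean_recall:.2f}")
```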

Production

Production architecture

A production RAG system for TechHeaven has two pipelines: an offline ingestion pipeline that keeps the knowledge base current, and an online retrieval pipeline that answers questions in real time.

Offline - Ingestion

1. Pull updated content from Bagisto API
2. Parse and clean documents
3. Chunk into 300-token segments with 50-token overlap
4. Embed each chunk using text-embedding model
5. Upsert into vector database
6. Schedule nightly re-index for policy changes
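
Steps 2 to 5 can be sketched as one function. This is a simplified illustration under stated assumptions: a plain dict stands in for the vector database, `embed` is any callable that returns a vector, words stand in for tokens, and fetching from the Bagisto API and scheduling are left out. All names are hypothetical.

```python
from typing import Callable

def chunk_words(text: str, size: int = 300, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of words (words stand in for tokens)."""
    words = text.split()  # also collapses stray whitespace
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks

def ingest(documents: dict[str, str], embed: Callable[[str], list[float]], store: dict[str, dict]) -> int:
    """Re-chunk, embed, and upsert each document; returns the number of chunks written."""
    written = 0
    for doc_id, text in documents.items():
        # Drop chunks from the previous version so a shortened document leaves nothing stale.
        for key in [k for k in store if k.startswith(f"{doc_id}#")]:
            del store[key]
        for i, chunk in enumerate(chunk_words(text)):
            store[f"{doc_id}#{i}"] = {"source": doc_id, "text": chunk, "embedding": embed(chunk)}
            written += 1
    return written

def fake_embed(text: str) -> list[float]:
    """Stand-in for a real embedding model call."""
    return [float(len(text.split()))]

store: dict[str, dict] = {}
docs = {"return-policy": " ".join(["word"] * 700), "shipping-policy": "placeholder shipping text"}
print(ingest(docs, fake_embed, store))  # 4
print(ingest(docs, fake_embed, store))  # 4 (re-running overwrites instead of duplicating)
print(len(store))                       # 4
```

Deterministic chunk ids such as `return-policy#0` are what make the nightly re-index an upsert rather than an append.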

Online - Query

1. Receive customer question
2. Embed the question
3. Retrieve top-5 chunks from vector DB
4. Inject chunks into prompt template
5. Call LLM with context-grounded prompt
6. Return answer + source references
7. Log question + answer for evaluation
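
The query steps can be wired together as a single function with the embedding model, vector store, and LLM passed in as callables. This is a sketch, not a reference implementation: the `SupportAnswer` type, the parameter names, and the stub functions are assumptions made so the example runs without external services.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SupportAnswer:
    text: str
    sources: list[str]

def answer_question(
    question: str,
    embed: Callable[[str], list[float]],
    search: Callable[[list[float], int], list[dict]],
    generate: Callable[[str], str],
    log: list[dict],
    k: int = 5,
) -> SupportAnswer:
    query_vector = embed(question)                       # step 2
    hits = search(query_vector, k)                       # step 3
    context = "\n\n".join(hit["text"] for hit in hits)   # step 4
    prompt = (
        "You are TechHeaven's support assistant. Answer using only the following information:\n\n"
        f"{context}\n\nIf the answer is not in the context, say so.\nQuestion: {question}"
    )
    text = generate(prompt)                              # step 5
    sources = sorted({hit["source"] for hit in hits})
    log.append({"question": question, "answer": text, "sources": sources})  # step 7
    return SupportAnswer(text=text, sources=sources)     # step 6

# Stubs standing in for the real embedding model, vector DB, and LLM.
def stub_embed(text: str) -> list[float]:
    return [float(len(text))]

def stub_search(vector: list[float], k: int) -> list[dict]:
    corpus = [
        {"source": "return-policy", "text": "<opened-electronics section>"},
        {"source": "headphones-faq", "text": "<headphone returns entry>"},
    ]
    return corpus[:k]

def stub_generate(prompt: str) -> str:
    return "<grounded answer>"

query_log: list[dict] = []
result = answer_question("Can I return opened headphones?", stub_embed, stub_search, stub_generate, query_log)
print(result.sources)  # ['headphones-faq', 'return-policy']
```

Passing the three services in as callables keeps the pipeline testable and makes it straightforward to swap one embedding model, vector store, or LLM for another.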

Technology choices

Embedding: OpenAI ada-002, Cohere embed-v3

Vector store: pgvector, Pinecone, Weaviate

LLM: Claude 3.5 Sonnet, GPT-4o

Orchestration: LangChain, custom Python

Interactive

Explore the system

Once TechHeaven data is available, these sections will be interactive. You will be able to browse the knowledge base, see how documents are chunked, test questions, and inspect the retrieved context.

Document Browser

Browse TechHeaven's full knowledge base - policies, FAQs, product descriptions

Chunk Explorer

See how each document is split into retrieval chunks and where overlap occurs

Question Playground

Ask questions and see exactly which chunks are retrieved and why

Prompt Viewer

Inspect the full prompt sent to the LLM, including injected context

Retrieved Context Viewer

Compare the retrieved chunks to the final answer side by side

Evaluation Dashboard

Measure faithfulness, precision, and recall across 200 test questions

Business Impact

What this achieves for TechHeaven

60-80% ticket deflection: Repetitive questions answered automatically, before they reach a human agent

24/7 availability: Support runs continuously without scaling headcount

Consistent answers: Every customer gets the same accurate answer from the same source of truth

Traceable sources: Every answer cites which document it came from - auditable and correctable

Business Applications

Insurance

Answer policy questions, coverage queries, and claims status questions automatically. The knowledge base contains policy documents, FAQs, and state-specific regulations.

SaaS / Software

Support documentation, API references, and troubleshooting guides become the knowledge base. The AI answers feature questions and routes complex issues to engineering.

Healthcare

Patient FAQs, appointment policies, and billing questions can be answered from a structured knowledge base - with hard boundaries on anything requiring medical advice.

Professional Services

Intake questionnaires, service scope documents, and engagement FAQs form the knowledge base. The AI qualifies and routes inquiries before a human responds.