Building a RAG Chatbot from Scratch

In this comprehensive tutorial, we’ll walk through building a complete Retrieval-Augmented Generation (RAG) chatbot using Vellum’s Workflow Builder. RAG systems combine the power of Large Language Models with external knowledge sources to provide more accurate, context-aware responses while reducing hallucinations.

By the end of this tutorial, you’ll have built a chatbot that:

  • Maintains conversation context across multiple turns
  • Searches through your document knowledge base
  • Provides responses grounded in your specific content
  • Gracefully handles questions outside its knowledge scope

What You’ll Learn

Basic Chatbot Setup

Create a foundational chatbot with Chat History support

Document Index Creation

Upload and configure documents for search and retrieval

RAG Implementation

Integrate search functionality with LLM responses

Hallucination Prevention

Constrain responses to available context

Part 1: Building the Foundation

Step 1: Create Your Workflow

Start by creating a new Workflow in Vellum. You’ll begin with an empty canvas containing just an Entrypoint node.

Step 2: Set Up Workflow Inputs

First, configure your workflow to accept chat history as input:

  1. Click on the Inputs tab in the left sidebar
  2. Click Add and select Chat History from the dropdown
  3. Add a test message to simulate user input
[Screenshot: Configure Chat History input for your workflow]

Step 3: Add and Configure the Prompt Node

  1. Drag from the Entrypoint node to create a connection and select Prompt from the node panel
  2. In the Prompt Node configuration, update the system prompt to include context constraints:
Please answer the user's questions, but only use the <context> you're given below. If you can't answer their questions with the <context>, please say "Sorry, I'm unable to answer that."
<context>
</context>
  3. Add a Chat History input variable and connect it to your workflow’s chat history
  4. Connect the Prompt Node to the Final Output Node

The context tags in the prompt are placeholders - we’ll populate them with search results in the next section.
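Conceptually, the Prompt Node will assemble a system message like the one above, with the `<context>` block filled in at run time. The sketch below illustrates this in plain Python; `search_results` is a hypothetical stand-in for the Search Node output we wire up in Part 3, so for now it defaults to empty.

```python
# Sketch of how the Prompt Node's system message is assembled.
# `search_results` is a hypothetical placeholder until Part 3.
SYSTEM_TEMPLATE = (
    "Please answer the user's questions, but only use the <context> "
    "you're given below. If you can't answer their questions with the "
    '<context>, please say "Sorry, I\'m unable to answer that."\n'
    "<context>\n{context}\n</context>"
)

def build_system_prompt(search_results: str = "") -> str:
    """Fill the <context> placeholder with retrieved text (or nothing)."""
    return SYSTEM_TEMPLATE.format(context=search_results)

print(build_system_prompt())  # empty context, so the model should refuse
```

With an empty context block, the instruction leaves the model only one valid move: the refusal phrase, which is exactly what Step 4 verifies.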

Step 4: Test the Basic Setup

Before adding RAG capabilities, test your basic chatbot:

  1. Ensure your Chat History contains only one user message (remove any assistant responses from previous tests)
  2. Run the workflow
[Screenshot: Clean chat history with a single user message]

You should see output similar to this, where the chatbot correctly responds that it cannot answer without context:

[Screenshot: Expected output when no context is provided]

Part 2: Creating Your Knowledge Base

Step 5: Create a Document Index

Now we’ll create a Document Index to store your knowledge base:

  1. Navigate to the Document Indexes page
  2. Click Create Index in the top right corner
[Screenshot: Document Indexes page]
  3. Name your index (e.g., “Countries”) and configure the settings:
[Screenshot: Create Document Index dialog]

You can read more about Document Index configuration options and chunking strategies in our documentation.

  4. Click Save to create your index

Step 6: Upload Documents

After creating your index, you’ll see the upload interface:

[Screenshot: Document upload area]

Vellum supports various file types including:

  • PDF documents
  • Word documents (DOCX)
  • Text files (TXT)
  • CSV files
  • PowerPoint presentations (PPTX)
  • HTML files

Drag and drop your documents or click to select files for upload.

Step 7: Preview Your Documents

Once uploaded, you can preview your documents to ensure they were processed correctly:

[Screenshot: Document preview showing processed content]

Part 3: Implementing RAG Functionality

Step 8: Restructure Your Workflow

Now we’ll modify the workflow to include document search:

  1. Delete the existing connection between Entrypoint and Prompt Node:
    • Click on the edge between nodes
    • Press Backspace or click the Delete icon
[Screenshot: Delete the direct connection to add search functionality]

Step 9: Add a Templating Node

We’ll use a Templating Node to extract the most recent user message for our search query:

  1. Create a Templating Node between the Entrypoint and Prompt Node
[Screenshot: Add a Templating Node to the workflow]
  2. Configure the Templating Node:
    • Connect chat_history as an input variable
    • Use this template to extract the latest user message:
{{ chat_history[-1].text }}
[Screenshot: Templating Node configuration]
  3. Optional: Rename the node to “Current User Message” for clarity

  4. Test the Templating Node by running the workflow:

[Screenshot: Templating Node output showing the extracted user message]
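The template uses Jinja-style syntax, so you can reproduce what the node does locally with the `jinja2` library. The chat history below is a hypothetical example shaped like the workflow input:

```python
from jinja2 import Template  # Templating Nodes use Jinja-style syntax

# Hypothetical chat history shaped like the workflow's Chat History input
chat_history = [
    {"role": "USER", "text": "What is the capital of France?"},
    {"role": "ASSISTANT", "text": "Paris."},
    {"role": "USER", "text": "And its population?"},
]

# Same template as in the node: grab the last message's text
template = Template("{{ chat_history[-1].text }}")
query = template.render(chat_history=chat_history)
print(query)  # -> And its population?
```

Negative indexing (`[-1]`) is why the node always picks up the most recent turn, no matter how long the conversation grows.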

Step 10: Add a Search Node

  1. Drag from the Templating Node to create a new connection and select Search Node
  2. Configure the Search Node:
    • Document Index: Select your “Countries” index
    • Search Query: Connect to the “Current User Message” output
[Screenshot: Search Node configuration]
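Under the hood, the Search Node ranks chunks from your Document Index against the query. The sketch below illustrates that ranking step with bag-of-words cosine similarity; real Document Indexes use learned embeddings, so treat this purely as an illustration of the retrieval idea.

```python
import math
from collections import Counter

# Toy stand-in for the Search Node: rank document chunks against a query.
def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the top-k chunks with a nonzero similarity to the query."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(c.lower().split())), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]

chunks = [
    "France is a country in Western Europe. Its capital is Paris.",
    "Japan is an island country in East Asia. Its capital is Tokyo.",
]
print(search("capital of France", chunks, k=1))
```

The key takeaway is the interface: query text in, ranked chunk text out, which is exactly what we feed into the prompt next.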

Step 11: Connect Search Results to the Prompt

  1. Connect the Search Node to your Prompt Node
  2. In the Prompt Node, add a new input variable of type String
  3. Connect this input to the Search Node output
[Screenshot: Prompt Node with the Search Node input connected]

Step 12: Insert Search Results into the Prompt

  1. Place your cursor between the <context> tags in your prompt
  2. Press the / key to open the variable insertion dropdown
  3. Select the Search Node results to insert them into the context
[Screenshot: Insert search results using the dropdown]
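At this point the whole workflow is wired: extract the latest user message, search the index, and inject the results into the prompt. Here is an end-to-end sketch with hypothetical helper names; the LLM call is stubbed out because it depends on your model and deployment.

```python
# End-to-end sketch of the finished workflow (hypothetical helper names).
def latest_user_message(chat_history: list[dict]) -> str:
    # Templating Node: {{ chat_history[-1].text }}
    return chat_history[-1]["text"]

def run_workflow(chat_history, search_fn, llm_fn) -> str:
    query = latest_user_message(chat_history)      # Templating Node
    results = "\n".join(search_fn(query))          # Search Node
    system = (                                     # Prompt Node system message
        "Please answer the user's questions, but only use the <context> "
        "you're given below. If you can't answer their questions with the "
        '<context>, please say "Sorry, I\'m unable to answer that."\n'
        f"<context>\n{results}\n</context>"
    )
    return llm_fn(system, chat_history)

# Stub LLM: refuses when the context is empty, mimicking the constraint.
def stub_llm(system, chat_history):
    empty = "<context>\n\n</context>" in system
    return "Sorry, I'm unable to answer that." if empty else "(grounded answer)"

history = [{"role": "USER", "text": "Tell me about France"}]
print(run_workflow(history, lambda q: [], stub_llm))
# -> Sorry, I'm unable to answer that.
```

Swapping `search_fn` or `llm_fn` without touching the rest is the same modularity the canvas gives you: each node can be reconfigured independently.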

Part 4: Testing Your RAG Chatbot

Step 13: Run the Complete RAG Workflow

Now test your complete RAG implementation:

  1. Ensure your chat history contains a question that can be answered from your documents
  2. Run the workflow
  3. Observe how the chatbot now provides context-aware responses based on your document content
[Screenshot: RAG chatbot providing context-aware responses]

Step 14: Test Edge Cases

Test your chatbot with various scenarios:

  • In-scope questions: Ask questions that can be answered using your uploaded documents. The chatbot should provide accurate, context-grounded responses.
  • Out-of-scope questions: Ask questions about topics not covered in your documents. The chatbot should respond with “Sorry, I’m unable to answer that.”
  • Multi-turn conversations: Test follow-up questions and conversation continuity using the Chat History tab.

Advanced Enhancements

Adding Source Citations

To make your RAG chatbot even more useful, consider implementing source citations. This will be covered in a future tutorial, but you can explore:

  • Using metadata from search results
  • Formatting citations in responses
  • Providing document references

Optimizing Search Performance

Fine-tune your RAG system by:

  • Adjusting chunking strategies in your Document Index
  • Experimenting with different search weights
  • Implementing metadata filtering for more precise results
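Chunking is often the highest-leverage knob: chunks that are too large dilute relevance, while chunks that are too small lose context. The sketch below shows the common fixed-size-with-overlap strategy; sizes here are word counts for simplicity, though production chunkers usually count tokens.

```python
# Illustrative fixed-size chunking with overlap, one of the strategies
# a Document Index can be configured with.
def chunk(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows of `size`, each sharing `overlap`
    words with the previous window so facts spanning a boundary survive."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

doc = " ".join(f"word{i}" for i in range(120))
chunks = chunk(doc, size=50, overlap=10)
print(len(chunks))  # 120 words with step 40 -> windows start at 0, 40, 80
```

The overlap means a sentence straddling a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated storage.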

Evaluation and Monitoring

Consider setting up evaluations to measure response quality and monitoring to track your chatbot’s behavior in production.

Key Takeaways

Hallucination Prevention

By constraining responses to provided context, RAG systems significantly reduce hallucinations

Modular Architecture

The workflow’s modular design makes it easy to modify and extend functionality

Context Extraction

Using Templating Nodes to extract user queries enables precise document searches

Scalable Knowledge

Document Indexes can be updated independently without changing the workflow

Next Steps

Now that you have a working RAG chatbot, consider exploring the advanced enhancements described above, from source citations to search tuning and evaluation.
