Basic RAG Chatbot

Overview

In this example, we’ll build a basic RAG chatbot that answers questions grounded in the contents of PDF documents (in this case, Vellum’s Trust Center Policies). This is useful if you want to help scale your support team by surfacing quick answers to common customer questions.

Ultimately, we’ll end up with a Workflow that performs the following steps:

  1. ExtractUserMessage: extracts the most recent message from the user
  2. SearchNode: uses the user’s message to find relevant quotes from ingested PDFs
  3. FormatSearchResultsNode: reformats quotes to include the name of the document that they came from
  4. PromptNode: passes the user’s question and the PDF context to the LLM to answer the question
# Graph Definition
class BasicRAGWorkflow(BaseWorkflow[Inputs, BaseState]):
    graph = ExtractUserMessage >> SearchNode >> FormatSearchResultsNode >> PromptNode

    class Outputs(BaseWorkflow.Outputs):
        result = PromptNode.Outputs.text

# Running it
workflow = BasicRAGWorkflow()
terminal_event = workflow.run(
    inputs=Inputs(
        chat_history=[
            ChatMessageRequest(
                role="USER",
                text="How often is employee training?",
            )
        ]
    )
)

# Output:
print(terminal_event.outputs.result)

"""
Employee training, as outlined in the Information Security Policy
occurs on an annual basis. All new hires are required to complete
information security awareness training as part of their new employee
onboarding process and then annually thereafter. This ongoing training
includes security and privacy requirements, the correct use of
information assets and facilities, and, consistent with assigned roles
and responsibilities, incident response and contingency training.
Additionally, individuals responsible for supporting or writing code for
internet-facing applications or internal applications that handle customer
information must complete annual security training specific to secure coding
practices, which includes OWASP secure development principles and OWASP top 10
vulnerability awareness for the most recent year available.

Citation: Policy Information Security Policy - v1.pdf & Policy Software Development Life Cycle Policy - v1.pdf
"""

This corresponds to a Workflow graph like the following:

[Diagram: Basic RAG Chatbot Workflow graph]

Let’s dive in!

Setup

Install Vellum

$ pip install vellum-ai

Create your Project

In this example, we’ll structure our project like this:

basic_rag_chatbot/
├── workflow.py
├── inputs.py
├── __init__.py
└── nodes/
    ├── __init__.py
    ├── extract_user_message.py
    ├── search_node.py
    ├── format_search_results_node.py
    └── prompt_node.py

Folder structure matters! Vellum relies on this structure to convert between UI and code representations of the graph. If you don’t want to use the UI, you can use whatever folder structure you’d like.
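If you follow this layout, one convenient (and entirely optional) pattern is to re-export each node from nodes/__init__.py so other modules can import them from a single place. This is a sketch of that pattern, not something Vellum requires:

# nodes/__init__.py -- optional convenience re-exports (a pattern, not a Vellum requirement)
from .extract_user_message import ExtractUserMessage
from .search_node import SearchNode
from .format_search_results_node import FormatSearchResultsNode
from .prompt_node import PromptNode

__all__ = [
    "ExtractUserMessage",
    "SearchNode",
    "FormatSearchResultsNode",
    "PromptNode",
]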

Define Workflow Inputs

# inputs.py
from typing import List

from vellum import ChatMessageRequest
from vellum.workflows.inputs import BaseInputs

class Inputs(BaseInputs):
    chat_history: List[ChatMessageRequest]

Our chatbot will have a chat history, which is a full list of messages between the user and the bot. If we want, we could use this to answer follow-up questions with context from previous messages.
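For instance, a hypothetical helper like the one below could fold the last few turns into a single search query, so that a follow-up like “How often?” still carries the topic it refers to. This is a sketch only; build_query is not part of the Workflow we build below:

# Sketch: derive a context-carrying search query from the chat history.
# build_query is a hypothetical helper, not part of the Workflow below.
from typing import List

from vellum import ChatMessageRequest

def build_query(chat_history: List[ChatMessageRequest], turns: int = 3) -> str:
    # Join the text of the last few messages so follow-ups keep their context.
    recent = chat_history[-turns:]
    return "\n".join(message.text for message in recent if message.text)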

Build the Nodes

1. Extract User Message

We’ll use the output of this node in the next step to search for relevant document chunks that ground the answer to the user’s question.

# nodes/extract_user_message.py
from vellum.workflows.nodes import TemplatingNode

# Assumes the package layout shown above; adjust the import path to your project.
from ..inputs import Inputs

class ExtractUserMessage(TemplatingNode):
    # Here, we reference the chat_history input that we've connected to this node.
    template = """\
    {{ chat_history[-1]["text"] }}\
    """

    # Here, we define the inputs to _this_ node.
    inputs = {
        "chat_history": Inputs.chat_history,
    }

You can see that we’re subclassing the TemplatingNode class, which allows us to use a Jinja template to extract the user’s query from the chat history.
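To see what that template evaluates to, here it is rendered with the stock jinja2 library against a made-up chat history. This is purely illustrative; inside the node, Vellum renders the template for you:

# Illustration only: rendering the same Jinja expression with plain jinja2.
from jinja2 import Template

chat_history = [
    {"role": "USER", "text": "What is your uptime SLA?"},
    {"role": "ASSISTANT", "text": "99.9% for paid plans."},
    {"role": "USER", "text": "How often is employee training?"},
]

print(Template('{{ chat_history[-1]["text"] }}').render(chat_history=chat_history))
# -> How often is employee training?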

2. Search Node

Specify which document index to search over, and use the user’s query to find relevant chunks of information.

# nodes/search_node.py
from vellum.workflows.nodes import BaseSearchNode

# Assumes the package layout shown above.
from .extract_user_message import ExtractUserMessage

class SearchNode(BaseSearchNode):
    document_index = "vellum-trust-center-policies"
    query = ExtractUserMessage.Outputs.result

Here, we subclass BaseSearchNode, which allows us to specify a document index to search over, and a query to search with. Vellum provides out-of-the-box, scalable vector database and embeddings solutions that make this easy.
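Conceptually, the search step embeds the query and ranks stored document chunks by vector similarity. The sketch below illustrates that idea in plain Python; it is not Vellum’s implementation, and embed() is a stand-in for whatever embedding model the index uses:

# Conceptual sketch of embedding-based retrieval (not Vellum's implementation).
import math
from typing import Callable, List, Tuple

def top_k(
    query: str,
    chunks: List[str],
    embed: Callable[[str], List[float]],  # stand-in for an embedding model
    k: int = 3,
) -> List[Tuple[float, str]]:
    def cosine(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    # Score every chunk against the query embedding and keep the best k.
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, reverse=True)[:k]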

3. Format Search Results Node

This is an optional step, but it can be useful to format the search results in a way that’s optimal for an LLM to consume. You may want to include metadata in a certain format or omit it altogether. Here, we include the name of the document that each chunk came from, so that we can later instruct an LLM to cite its sources.

# nodes/format_search_results_node.py
from vellum.workflows.nodes import TemplatingNode

# Assumes the package layout shown above.
from .search_node import SearchNode

class FormatSearchResultsNode(TemplatingNode):
    template = """\
    {% for result in results -%}
    Policy: {{ result.document.label }}
    ------
    {{ result.text }}
    {% if not loop.last %}
    #####
    {% endif -%}
    {% endfor %}\
    """

    inputs = {
        "results": SearchNode.Outputs.results,
    }
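Given two hypothetical chunks from the policies above, the rendered context that flows into the next node would look roughly like this (the chunk text is invented for illustration):

Policy: Policy Information Security Policy - v1.pdf
------
All new hires complete information security awareness training during onboarding and annually thereafter.

#####

Policy: Policy Software Development Life Cycle Policy - v1.pdf
------
Engineers who write code for internet-facing applications complete annual secure coding training.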
4. Use an LLM to Answer the User’s Question

Pass the user’s question and the formatted context to the LLM so it can answer the question directly for the user.

# nodes/prompt_node.py
from vellum import (
    ChatMessagePromptBlock,
    JinjaPromptBlock,
)
from vellum.workflows.nodes import InlinePromptNode

# Assumes the package layout shown above.
from .extract_user_message import ExtractUserMessage
from .format_search_results_node import FormatSearchResultsNode

class PromptNode(InlinePromptNode):
    ml_model = "gpt-4o"
    prompt_inputs = {
        "question": ExtractUserMessage.Outputs.result,
        "context": FormatSearchResultsNode.Outputs.result,
    }
    blocks = [
        ChatMessagePromptBlock(
            chat_role="SYSTEM",
            blocks=[
                JinjaPromptBlock(
                    block_type="JINJA",
                    template="""\
                    Answer user question based on the context provided below, if you don't know the answer say "Sorry I don't know"

                    **Context**
                    ``
                    {{ context }}
                    ``

                    Limit your answer to 250 words and provide a citation at the end of your answer\
                    """,
                ),
            ],
        ),
        ChatMessagePromptBlock(
            chat_role="USER",
            blocks=[
                JinjaPromptBlock(
                    block_type="JINJA",
                    template="""\
                    {{ question }}\
                    """,
                ),
            ],
        ),
    ]

Instantiate the Graph and Invoke it

1. Define the Graph and its Outputs

# workflow.py
from vellum.workflows import BaseWorkflow
from vellum.workflows.state import BaseState

# Assumes the package layout shown above.
from .inputs import Inputs
from .nodes.extract_user_message import ExtractUserMessage
from .nodes.format_search_results_node import FormatSearchResultsNode
from .nodes.prompt_node import PromptNode
from .nodes.search_node import SearchNode

class BasicRAGWorkflow(BaseWorkflow[Inputs, BaseState]):
    graph = ExtractUserMessage >> SearchNode >> FormatSearchResultsNode >> PromptNode

    class Outputs(BaseWorkflow.Outputs):
        result = PromptNode.Outputs.text
2. Instantiate the Workflow

# From any file / function from which you want to reference the Workflow
workflow = BasicRAGWorkflow()
3. Invoke the Workflow and Output the Answer

# From any file / function from which you want to run the Workflow
terminal_event = workflow.run(
    inputs=Inputs(
        chat_history=[
            ChatMessageRequest(
                role="USER",
                text="How often is employee training?",
            )
        ]
    )
)

# Output:
print(terminal_event.outputs.result)

"""
Employee training, as outlined in the Information Security Policy
occurs on an annual basis. All new hires are required to complete
information security awareness training as part of their new employee
onboarding process and then annually thereafter. This ongoing training
includes security and privacy requirements, the correct use of
information assets and facilities, and, consistent with assigned roles
and responsibilities, incident response and contingency training.
Additionally, individuals responsible for supporting or writing code for
internet-facing applications or internal applications that handle customer
information must complete annual security training specific to secure coding
practices, which includes OWASP secure development principles and OWASP top 10
vulnerability awareness for the most recent year available.

Citation: Policy Information Security Policy - v1.pdf & Policy Software Development Life Cycle Policy - v1.pdf
"""

Conclusion

In under 120 lines of code, we built a RAG chatbot that can answer users’ questions with context from a vector database. Looking forward, we can:

  • Version control the graph with the rest of our project in a git repository
  • Continue building the graph in the Vellum UI
  • Evaluate the pipeline with test data (see Evaluating RAG Pipelines)
  • Host it on our own servers or deploy to Vellum