Inline Prompt Node

Using Workflow Inputs and Upstream Node Outputs
from typing import List

from vellum import (
    ChatMessage,
    ChatMessagePromptBlock,
    JinjaPromptBlock,
    PlainTextPromptBlock,
    RichTextPromptBlock,
    VariablePromptBlock,
)
from vellum.workflows import BaseWorkflow
from vellum.workflows.inputs import BaseInputs
from vellum.workflows.nodes import InlinePromptNode
from vellum.workflows.state import BaseState

from .some_other_node import SomeOtherNode


class Inputs(BaseInputs):
    foo: str
    chat_history: List[ChatMessage]


class MyPrompt(InlinePromptNode):
    ml_model = "gpt-5"
    blocks = [
        ChatMessagePromptBlock(
            chat_role="SYSTEM",
            blocks=[
                RichTextPromptBlock(
                    blocks=[
                        # Prefer RichTextPromptBlock for a nicer UI editing experience
                        PlainTextPromptBlock(text="Answer the user's question: "),
                        VariablePromptBlock(input_variable="question"),
                    ]
                ),
                JinjaPromptBlock(
                    template="Using a templating block to write Jinja templates inline: {{ query | upper | truncate(3) }}"
                ),
            ],
        ),
        # Use VariablePromptBlock at the top level to include chat history in context, for any chatbot / chat agent use-cases
        VariablePromptBlock(input_variable="chat_history"),
    ]
    # Keys are the variable names referenced by the blocks above
    prompt_inputs = {
        "question": Inputs.foo,  # Reference workflow input
        "query": SomeOtherNode.Outputs.bar,  # Reference upstream node output
        "chat_history": Inputs.chat_history,  # List[ChatMessage]
    }


class Workflow(BaseWorkflow[Inputs, BaseState]):
    graph = SomeOtherNode >> MyPrompt

vellum.workflows.nodes.InlinePromptNode

Executes a prompt directly within a workflow, without requiring a prompt deployment.

Attributes

prompt_inputs
EntityInputsInterface

Optional inputs for variable substitution in the prompt. These inputs are used to replace:

  • Variables within Jinja blocks
  • Variable blocks in the blocks attribute

You can reference either Workflow inputs or outputs from upstream nodes.
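
Since a variable with no matching key in prompt_inputs has nothing to substitute, a quick pre-flight check of the names can be useful. The sketch below is plain Python and does not use the SDK: missing_prompt_inputs is a hypothetical helper, the variable names are passed in by hand, and its regex only picks up the first identifier in each {{ ... }} expression.

```python
import re
from typing import Iterable, List, Mapping

# Matches the leading identifier of a Jinja expression such as "{{ query | upper }}".
_JINJA_VAR = re.compile(r"\{\{\s*([A-Za-z_]\w*)")


def missing_prompt_inputs(
    variable_names: Iterable[str],
    jinja_templates: Iterable[str],
    prompt_inputs: Mapping[str, object],
) -> List[str]:
    """Return referenced variable names that have no key in prompt_inputs."""
    referenced = set(variable_names)
    for template in jinja_templates:
        referenced.update(_JINJA_VAR.findall(template))
    return sorted(referenced - set(prompt_inputs))


# Hypothetical data: two variable blocks, one Jinja block, two supplied inputs.
missing = missing_prompt_inputs(
    variable_names=["question", "chat_history"],
    jinja_templates=["{{ query | upper | truncate(3) }}"],
    prompt_inputs={"question": "...", "chat_history": []},
)
print(missing)  # ['query']
```

Here the Jinja block references query, which no key supplies, so the helper reports it.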

blocks
List[PromptBlock] (Required)

The blocks that make up the prompt.

ml_model
str (Required)

The model to use for execution (e.g., “gpt-5”, “claude-4-sonnet”).

functions
Optional[List[FunctionDefinition]]

The function definitions to include in the prompt, which the model can choose to call.

parameters
Optional[PromptParameters]

Model parameters for execution. Defaults to:

  • stop: []
  • temperature: 0.0
  • max_tokens: 4096
  • top_p: 1.0
  • top_k: 0
  • frequency_penalty: 0.0
  • presence_penalty: 0.0
  • logit_bias: None
  • custom_parameters: None
    • This field can be used to pass additional parameters to the LLM, such as json_schema.
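
The defaults listed above can be written out as a plain dictionary, which makes it easy to see what changes when only a few values are overridden. This is a sketch in plain Python, not SDK code: DEFAULT_PARAMETERS and with_overrides are hypothetical names, and whether the SDK itself fills in omitted fields is not stated here, so the sketch spells out every field.

```python
import copy
from typing import Any, Dict

# The documented defaults, written out as a plain dict.
DEFAULT_PARAMETERS: Dict[str, Any] = {
    "stop": [],
    "temperature": 0.0,
    "max_tokens": 4096,
    "top_p": 1.0,
    "top_k": 0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "logit_bias": None,
    "custom_parameters": None,
}


def with_overrides(**overrides: Any) -> Dict[str, Any]:
    """Copy the defaults and apply overrides, rejecting unknown field names."""
    unknown = set(overrides) - set(DEFAULT_PARAMETERS)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    params = copy.deepcopy(DEFAULT_PARAMETERS)  # avoid sharing the mutable "stop" list
    params.update(overrides)
    return params


params = with_overrides(temperature=0.7, max_tokens=1024)
print(params["temperature"], params["max_tokens"], params["top_p"])  # 0.7 1024 1.0
```

Assuming PromptParameters accepts these field names as keyword arguments, the resulting dictionary could be passed to the node as parameters = PromptParameters(**params).
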
expand_meta
Optional[PromptDeploymentExpandMetaRequest]

Expandable execution fields to include in the response.

request_options
RequestOptions

Request-specific options applied when the SDK calls the API. It is used primarily as an optional final parameter for service functions and supports the following fields:

  • timeout_in_seconds: The number of seconds to await an API call before timing out
  • max_retries: The max number of retries to attempt if the API call fails
  • additional_headers: A dictionary containing additional parameters to spread into the request’s header dict
  • additional_query_parameters: A dictionary containing additional parameters to spread into the request’s query parameters dict
  • additional_body_parameters: A dictionary containing additional parameters to spread into the request’s body parameters dict
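
As an illustration of the fields above, the sketch below builds a request options mapping with a timeout, a retry limit, and one extra header. It assumes the options are supplied as a dictionary keyed by the documented field names; RequestOptionsSketch is a local stand-in defined only for this example, and the header name and values are made up.

```python
from typing import Any, Dict, TypedDict


class RequestOptionsSketch(TypedDict, total=False):
    """Local stand-in listing the documented fields; not the SDK's own type."""

    timeout_in_seconds: int
    max_retries: int
    additional_headers: Dict[str, Any]
    additional_query_parameters: Dict[str, Any]
    additional_body_parameters: Dict[str, Any]


# Every field is optional, so only the ones being customized are set.
request_options: RequestOptionsSketch = {
    "timeout_in_seconds": 30,
    "max_retries": 2,
    "additional_headers": {"X-Request-Source": "docs-example"},
}
print(sorted(request_options))
# ['additional_headers', 'max_retries', 'timeout_in_seconds']
```

On a node, a mapping like this would be assigned to the request_options attribute.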

Outputs

text
str

The generated text output from the prompt execution

results
List[PromptOutput]

The array of results from the prompt execution. PromptOutput is a union of the following types:

  • StringVellumValue
  • JsonVellumValue
  • ErrorVellumValue
  • FunctionCallVellumValue
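
Because results is a union, downstream code usually needs to branch on which member it received. The sketch below shows one way to do that in plain Python. OutputSketch is a stand-in class, not an SDK type, and the type tags ("STRING", "JSON", "ERROR", "FUNCTION_CALL") and the dictionary shape of the function call value are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class OutputSketch:
    """Stand-in for a PromptOutput member: a type tag plus a value."""

    type: str  # assumed tags: "STRING", "JSON", "ERROR", "FUNCTION_CALL"
    value: Any


def summarize(results: List[OutputSketch]) -> List[str]:
    """Produce one short description per output, branching on its type tag."""
    lines = []
    for output in results:
        if output.type == "STRING":
            lines.append(f"text: {output.value}")
        elif output.type == "JSON":
            lines.append(f"json keys: {sorted(output.value)}")
        elif output.type == "FUNCTION_CALL":
            lines.append(f"call: {output.value['name']}")
        elif output.type == "ERROR":
            lines.append(f"error: {output.value}")
        else:
            lines.append(f"unhandled: {output.type}")
    return lines


print(summarize([
    OutputSketch("STRING", "hello"),
    OutputSketch("JSON", {"classification": "policy"}),
]))
# ['text: hello', "json keys: ['classification']"]
```

Keeping an explicit fallback branch means a new union member shows up as "unhandled" instead of being silently dropped.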

Examples

JSON Extraction and Ports
from vellum import (
    ChatMessagePromptBlock,
    JinjaPromptBlock,
    PlainTextPromptBlock,
    PromptParameters,
    RichTextPromptBlock,
    VariablePromptBlock,
)
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes.displayable import FinalOutputNode, InlinePromptNode
from vellum.workflows.ports import Port
from vellum.workflows.references import LazyReference
from vellum.workflows.state import BaseState

# Inputs, GetDocument, MyPolicyParserWorkflow, and MyCOIParserWorkflow are
# assumed to be defined elsewhere in the project.


# nodes/router_prompt.py
class RouterPrompt(InlinePromptNode):
    """
    This prompt is used to route to the appropriate handler based on the type of document being parsed.
    """

    ml_model = "gpt-5"
    blocks = [
        ChatMessagePromptBlock(
            chat_role="SYSTEM",
            blocks=[
                RichTextPromptBlock(
                    blocks=[
                        # Prefer RichTextPromptBlock for a nicer UI editing experience
                        PlainTextPromptBlock(text="Classify the following document: "),
                        VariablePromptBlock(input_variable="document_text"),
                    ]
                ),
                JinjaPromptBlock(
                    template="Using a templating block to write Jinja templates inline: {{ document_text | upper | truncate(3) }}"
                ),
            ],
        ),
        # Use VariablePromptBlock at the top level to include chat history in context, for any chatbot / chat agent use-cases
        VariablePromptBlock(input_variable="chat_history"),
    ]
    prompt_inputs = {
        "document_text": Inputs.document_text,  # Reference workflow input
        "chat_history": Inputs.chat_history,  # List[ChatMessage]
    }
    # custom_parameters is a field of PromptParameters, not a node attribute
    parameters = PromptParameters(
        custom_parameters={
            # prefer json_schema over json_mode if strict types are required
            # json_mode is a more flexible way to produce valid JSON through schemas defined in the prompt itself
            "json_mode": True,
            "json_schema": {
                "strict": True,
                "name": "schema",
                "schema": {
                    "type": "object",
                    "properties": {
                        "classification": {
                            "type": "string",
                            "description": "What type of document to classify as",
                            "enum": [
                                "policy",
                                "certificate_of_insurance",
                            ],
                        },
                    },
                    "required": [
                        "classification",
                    ],
                },
            },
        },
    )

    class Ports(InlinePromptNode.Ports):
        # LazyReference defers the lookup because RouterPrompt is still being defined here
        group_1_if_port = Port.on_if(
            LazyReference(lambda: RouterPrompt.Outputs.json)["classification"].equals("policy")
        )
        group_1_else_port = Port.on_else()


# Assumed shape: surface the classification chosen by RouterPrompt
class FinalOutput(FinalOutputNode[BaseState, str]):
    class Outputs(FinalOutputNode.Outputs):
        value = RouterPrompt.Outputs.json["classification"]


# workflow.py
class Workflow(BaseWorkflow[Inputs, BaseState]):
    graph = GetDocument >> {
        RouterPrompt.Ports.group_1_if_port >> MyPolicyParserWorkflow,
        RouterPrompt.Ports.group_1_else_port >> MyCOIParserWorkflow,
    } >> FinalOutput

    class Outputs(BaseWorkflow.Outputs):
        final_output = FinalOutput.Outputs.value