Inline Prompt Node | Vellum

vellum.workflows.nodes.InlinePromptNode

Used to execute a prompt directly within a workflow, without requiring a prompt deployment.

Attributes

prompt_inputs

EntityInputsInterface

Optional inputs for variable substitution in the prompt. These inputs are used to replace:

Variables within Jinja blocks
Variable blocks in the blocks attribute

You can reference either Workflow inputs or outputs from upstream nodes.
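As a rough mental model of the substitution described above, the following plain-Python sketch resolves variable blocks against a prompt_inputs dictionary. It does not use the Vellum SDK: TextBlock, VariableBlock, and render_blocks are hypothetical stand-ins, Jinja rendering is omitted, and the real node performs this resolution for you.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class TextBlock:
    """Hypothetical stand-in for a plain-text prompt block."""

    text: str


@dataclass
class VariableBlock:
    """Hypothetical stand-in for a variable block that names a prompt input."""

    input_variable: str


def render_blocks(
    blocks: List[Union[TextBlock, VariableBlock]],
    prompt_inputs: Dict[str, Any],
) -> str:
    """Replace each variable block with the matching prompt_inputs value."""
    parts: List[str] = []
    for block in blocks:
        if isinstance(block, VariableBlock):
            if block.input_variable not in prompt_inputs:
                raise KeyError(f"Missing prompt input: {block.input_variable}")
            parts.append(str(prompt_inputs[block.input_variable]))
        else:
            parts.append(block.text)
    return "".join(parts)


blocks = [TextBlock("Answer the user's question: "), VariableBlock("question")]
rendered = render_blocks(blocks, {"question": "What is a deductible?"})
print(rendered)
```

The point of the sketch is the naming contract: every variable a block refers to needs a matching key in prompt_inputs, otherwise there is nothing to substitute.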

blocks

List[PromptBlock] (Required)

The blocks that make up the Prompt

ml_model

str (Required)

The model to use for execution (e.g., “gpt-5”, “claude-4-sonnet”)

functions

Optional[List[FunctionDefinition]]

The functions to include in the prompt

parameters

Optional[PromptParameters]

Model parameters for execution. Defaults to:

stop: []
temperature: 0.0
max_tokens: 4096
top_p: 1.0
top_k: 0
frequency_penalty: 0.0
presence_penalty: 0.0
logit_bias: None
custom_parameters: None
- This field can be used to pass additional, model-specific parameters to the LLM, such as json_schema.
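To make the defaults above concrete, here is a minimal sketch that mirrors them in a plain dictionary and layers overrides on top. DEFAULT_PARAMETERS and with_overrides are hypothetical names used only for illustration; they are not part of the SDK, and the sketch assumes nothing about how the SDK merges values internally.

```python
import copy
from typing import Any, Dict

# Hypothetical mirror of the defaults listed above (not an SDK object).
DEFAULT_PARAMETERS: Dict[str, Any] = {
    "stop": [],
    "temperature": 0.0,
    "max_tokens": 4096,
    "top_p": 1.0,
    "top_k": 0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "logit_bias": None,
    "custom_parameters": None,
}


def with_overrides(**overrides: Any) -> Dict[str, Any]:
    """Return a copy of the defaults with the given fields replaced."""
    unknown = set(overrides) - set(DEFAULT_PARAMETERS)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    merged = copy.deepcopy(DEFAULT_PARAMETERS)  # avoid sharing the mutable stop list
    merged.update(overrides)
    return merged


params = with_overrides(temperature=0.2, custom_parameters={"json_mode": True})
print(params["temperature"], params["max_tokens"])
```

In a real node, the same field names would presumably be passed to PromptParameters and assigned to the parameters attribute.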

expand_meta

Optional[PromptDeploymentExpandMetaRequest]

Expandable execution fields to include in the response.

request_options

RequestOptions

Additional options for request-specific configuration when calling APIs via the SDK. This is used primarily as an optional final parameter for service functions.

timeout_in_seconds: The number of seconds to await an API call before timing out
max_retries: The max number of retries to attempt if the API call fails
additional_headers: A dictionary containing additional parameters to spread into the request’s header dict
additional_query_parameters: A dictionary containing additional parameters to spread into the request’s query parameters dict
additional_body_parameters: A dictionary containing additional parameters to spread into the request’s body parameters dict
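The options above can be pictured as a dictionary with optional keys. The sketch below uses a local TypedDict, RequestOptionsSketch, as a hypothetical stand-in for the SDK's RequestOptions type, and apply_request_options is an illustrative helper showing what "spreading" additional headers into a request's header dict might look like. The header name and values are made up for the example.

```python
from typing import Any, Dict, TypedDict


class RequestOptionsSketch(TypedDict, total=False):
    """Hypothetical stand-in listing the option names documented above."""

    timeout_in_seconds: int
    max_retries: int
    additional_headers: Dict[str, Any]
    additional_query_parameters: Dict[str, Any]
    additional_body_parameters: Dict[str, Any]


def apply_request_options(
    headers: Dict[str, Any], options: RequestOptionsSketch
) -> Dict[str, Any]:
    """Spread additional_headers into a copy of the base header dict."""
    return {**headers, **options.get("additional_headers", {})}


request_options: RequestOptionsSketch = {
    "timeout_in_seconds": 30,
    "max_retries": 2,
    "additional_headers": {"X-Request-Source": "docs-example"},
}

merged_headers = apply_request_options({"Accept": "application/json"}, request_options)
print(merged_headers)
```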

Outputs

text

str

The generated text output from the prompt execution

results

List[PromptOutput]

The array of results from the prompt execution. PromptOutput is a union of the following types:

StringVellumValue
JsonVellumValue
ErrorVellumValue
FunctionCallVellumValue
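Because results is a union, downstream code usually needs to branch on the kind of each item. The following sketch shows one way to do that with a hypothetical OutputSketch stand-in rather than the SDK types. The type tags "STRING", "JSON", "ERROR", and "FUNCTION_CALL" are an assumption about how the four variants are discriminated, and the function call payload is invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class OutputSketch:
    """Hypothetical stand-in for one PromptOutput: a type tag plus a value."""

    type: str  # assumed tags: "STRING", "JSON", "ERROR", "FUNCTION_CALL"
    value: Any


def group_results(results: List[OutputSketch]) -> Dict[str, List[Any]]:
    """Bucket results by type so each kind can be handled separately."""
    grouped: Dict[str, List[Any]] = {
        "STRING": [],
        "JSON": [],
        "ERROR": [],
        "FUNCTION_CALL": [],
    }
    for result in results:
        if result.type not in grouped:
            raise ValueError(f"Unexpected output type: {result.type}")
        grouped[result.type].append(result.value)
    return grouped


results = [
    OutputSketch(type="STRING", value="The document is a policy."),
    OutputSketch(type="FUNCTION_CALL", value={"name": "lookup_policy", "arguments": {"id": "123"}}),
]
grouped = group_results(results)
print(grouped["STRING"])
print(grouped["FUNCTION_CALL"])
```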

Examples

JSON Extraction and Ports

from vellum import (
    ChatMessagePromptBlock,
    JinjaPromptBlock,
    PlainTextPromptBlock,
    PromptParameters,
    RichTextPromptBlock,
    VariablePromptBlock,
)
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes.displayable import FinalOutputNode, InlinePromptNode
from vellum.workflows.ports import Port
from vellum.workflows.references import LazyReference
from vellum.workflows.state import BaseState

# Inputs (the workflow's inputs class), GetDocument, MyPolicyParserWorkflow, and
# MyCOIParserWorkflow are assumed to be defined elsewhere in the project and are not shown here.

# nodes/router_prompt.py
class RouterPrompt(InlinePromptNode):
    """
    Routes to the appropriate handler based on the type of document being parsed.
    """
    ml_model = "gpt-5"
    blocks = [
        ChatMessagePromptBlock(
            chat_role="SYSTEM",
            blocks=[
                RichTextPromptBlock(
                    blocks=[
                        # Prefer RichTextPromptBlock for a nicer UI editing experience
                        PlainTextPromptBlock(text="Classify the following document: "),
                        VariablePromptBlock(input_variable="document_text"),
                    ]
                ),
                JinjaPromptBlock(
                    template="Using a templating block to write Jinja templates inline: {{ document_text | upper | truncate(3) }}"
                ),
            ],
        ),
        # Use VariablePromptBlock at the top level to include chat history in context, for any chatbot / chat agent use cases
        VariablePromptBlock(input_variable="chat_history"),
    ]
    # Every variable referenced in blocks needs a matching key here
    prompt_inputs = {
        "document_text": Inputs.document_text,  # Reference workflow input
        "chat_history": Inputs.chat_history,  # List[ChatMessage]
    }
    # custom_parameters is a field of PromptParameters, not a node attribute
    parameters = PromptParameters(
        custom_parameters={
            # Prefer json_schema over json_mode if strict types are required.
            # json_mode is a more flexible way to produce valid JSON through schemas defined in the prompt itself.
            "json_mode": True,
            "json_schema": {
                "strict": True,
                "name": "schema",
                "schema": {
                    "type": "object",
                    "properties": {
                        "classification": {
                            "type": "string",
                            "description": "What type of document to classify as",
                            "enum": [
                                "policy",
                                "certificate_of_insurance",
                            ],
                        },
                    },
                    "required": [
                        "classification",
                    ],
                },
            },
        },
    )

    class Ports(InlinePromptNode.Ports):
        # LazyReference defers the lookup because RouterPrompt is still being defined at this point
        group_1_if_port = Port.on_if(LazyReference(lambda: RouterPrompt.Outputs.json)["classification"].equals("policy"))
        group_1_else_port = Port.on_else()


# Exposes the classification as the workflow's final output
class FinalOutput(FinalOutputNode[BaseState, str]):
    class Outputs(FinalOutputNode.Outputs):
        value = RouterPrompt.Outputs.json["classification"]


# workflow.py
class Workflow(BaseWorkflow[Inputs, BaseState]):
    graph = GetDocument >> {
        RouterPrompt.Ports.group_1_if_port >> MyPolicyParserWorkflow,
        RouterPrompt.Ports.group_1_else_port >> MyCOIParserWorkflow,
    } >> FinalOutput

    class Outputs(BaseWorkflow.Outputs):
        final_output = FinalOutput.Outputs.value
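The routing logic in the example above (parse the JSON output, check the classification, then take the if or else branch) can be sketched in plain Python without the SDK. Here choose_branch and the branch labels "policy_parser" and "coi_parser" are hypothetical names for illustration; in the actual workflow the ports perform this decision.

```python
import json

# Allowed values, matching the enum in the example's json_schema.
ALLOWED_CLASSIFICATIONS = ("policy", "certificate_of_insurance")


def choose_branch(raw_output: str) -> str:
    """Parse the model's JSON output and pick a branch, if/else style."""
    payload = json.loads(raw_output)
    classification = payload["classification"]
    if classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"Unexpected classification: {classification}")
    if classification == "policy":
        return "policy_parser"  # corresponds to the if port
    return "coi_parser"  # corresponds to the else port


branch = choose_branch('{"classification": "policy"}')
print(branch)
```

Validating against the allowed values before branching is a defensive choice for the sketch: with an else port, any value other than "policy" would otherwise fall through to the second handler.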