Core Concepts
Defining a Workflow
All Vellum Workflows extend from the BaseWorkflow class. Workflows define the control flow of your application, orchestrating the order of execution between each Node. Workflows can be invoked via a run method, which returns the final event that was emitted by the Workflow.
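A minimal sketch of such a Workflow might look like the following; the import paths and the NoOpNode name are illustrative assumptions:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode


class NoOpNode(BaseNode):
    """A placeholder Node that does nothing."""
    pass


class MyWorkflow(BaseWorkflow):
    graph = NoOpNode


# Invoke the Workflow and inspect the final event it emitted.
workflow = MyWorkflow()
final_event = workflow.run()
print(final_event.name)
```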
In the example above, final_event has a name of "workflow.execution.fulfilled". This indicates that the Workflow ran to completion successfully. Had the Workflow encountered an error, the name would have been "workflow.execution.rejected".
Workflow Outputs
You can think of a Workflow as a black box that produces values for pre-defined outputs. To specify the outputs of a Workflow, you must define an Outputs class that extends from BaseWorkflow.Outputs.
Here is a very basic Workflow that defines a single output called hello with a hard-coded return value of the string "world".
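A sketch of that Workflow, assuming the import path shown:

```python
from vellum.workflows import BaseWorkflow


class HelloWorkflow(BaseWorkflow):
    class Outputs(BaseWorkflow.Outputs):
        # A hard-coded output value; no Nodes are needed for this example.
        hello = "world"
```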
Defining Nodes
Nodes are the building blocks of a Workflow and are responsible for executing a specific task. All Nodes in a Workflow must extend from the BaseNode class.
Here we define a very simple custom Node called GreetingNode that overrides the run method to print "Hello, world!" to the console. Notably, this Node doesn’t produce any outputs (yet!).
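A sketch of that Node; returning an empty Outputs instance is our assumption about what the base class expects from run:

```python
from vellum.workflows.nodes import BaseNode


class GreetingNode(BaseNode):
    def run(self) -> BaseNode.Outputs:
        # Print a greeting; this Node doesn't produce any outputs (yet!).
        print("Hello, world!")
        return self.Outputs()
```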
Defining Node Outputs
Most Nodes produce Outputs that can be referenced elsewhere in the Workflow. Just like a Workflow, a Node defines its outputs via an Outputs class, this time extending from BaseNode.Outputs.
Here we define a GreetingNode that produces a single output of type str called greeting. The run method returns an instance of GreetingNode.Outputs with the greeting attribute set to "Hello, world!".
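A sketch of that Node, assuming the import path shown:

```python
from vellum.workflows.nodes import BaseNode


class GreetingNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        greeting: str

    def run(self) -> Outputs:
        # Produce a single string output called "greeting".
        return self.Outputs(greeting="Hello, world!")
```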
Using a Node in a Workflow
Nodes are executed as part of a Workflow once they’re added to the Workflow’s graph attribute. Once added, a Node’s output can be used as the Workflow’s output.
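A sketch of wiring the GreetingNode above into a Workflow, under the same assumed import paths:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode


class GreetingNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        greeting: str

    def run(self) -> Outputs:
        return self.Outputs(greeting="Hello, world!")


class GreetingWorkflow(BaseWorkflow):
    # Adding the Node to the graph makes it part of the Workflow's execution.
    graph = GreetingNode

    class Outputs(BaseWorkflow.Outputs):
        # Expose the Node's output as the Workflow's output.
        greeting = GreetingNode.Outputs.greeting
```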
Workflow Inputs
The runtime behavior of a Workflow almost always depends on some set of input values that are provided at the time of execution. You can define a Workflow’s inputs via an Inputs class that extends from BaseInputs, which is then referenced in the Workflow’s parent class as a generic type.
Here’s a Workflow that defines a single input called greeting of type str and simply passes it through as an output.
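A sketch of that Workflow; the import paths, the BaseState second type parameter, and referencing an Input descriptor directly from a Workflow Output are our assumptions here:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.inputs import BaseInputs
from vellum.workflows.state import BaseState


class Inputs(BaseInputs):
    greeting: str


class PassthroughWorkflow(BaseWorkflow[Inputs, BaseState]):
    class Outputs(BaseWorkflow.Outputs):
        # Pass the input straight through as an output.
        greeting = Inputs.greeting


# Inputs are provided at execution time:
# final_event = PassthroughWorkflow().run(inputs=Inputs(greeting="hello"))
```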
Node Attributes
A Workflow’s inputs are usually used to drive the behavior of its Nodes. Nodes can reference these inputs via class attributes that are resolved at runtime.
Below we drive the behavior of a GreetingNode by specifying noun = Inputs.noun as a class attribute, then referencing self.noun in the run method to produce a dynamic greeting.
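A sketch of that pattern, under the same assumed import paths and generic parameters as above:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.inputs import BaseInputs
from vellum.workflows.nodes import BaseNode
from vellum.workflows.state import BaseState


class Inputs(BaseInputs):
    noun: str


class GreetingNode(BaseNode):
    # A descriptor reference; resolved to the actual input value at runtime.
    noun = Inputs.noun

    class Outputs(BaseNode.Outputs):
        greeting: str

    def run(self) -> Outputs:
        # self.noun is the resolved value, not the descriptor.
        return self.Outputs(greeting=f"Hello, {self.noun}!")


class GreetingWorkflow(BaseWorkflow[Inputs, BaseState]):
    graph = GreetingNode

    class Outputs(BaseWorkflow.Outputs):
        greeting = GreetingNode.Outputs.greeting
```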
Descriptors
Inputs.noun is what we call a “descriptor”; it is not a literal value. Think of it like a pointer or reference whose value is resolved at runtime. If you were to call Inputs.noun within a Node’s run method instead of self.noun, an exception would be raised.
Control Flow
Defining Control Flow
Until now, we’ve only defined Workflows that contain a single Node – not very interesting! Most Workflows orchestrate the execution of multiple Nodes in a specific order. This is achieved by defining a graph attribute with a special syntax that describes the control flow between Nodes.
Here we define three Nodes, GreetingNode, EndNode, and AggregatorNode, then define the order of their execution by using the >> operator.
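A sketch of that graph; the particular ordering of the three Nodes and their outputs are illustrative assumptions:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode


class GreetingNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        greeting: str

    def run(self) -> Outputs:
        return self.Outputs(greeting="Hello, world!")


class AggregatorNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        summary: str

    def run(self) -> Outputs:
        return self.Outputs(summary="aggregated")


class EndNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        final_value: str

    def run(self) -> Outputs:
        return self.Outputs(final_value="done")


class ChainedWorkflow(BaseWorkflow):
    # The >> operator declares that each Node runs after the previous one.
    graph = GreetingNode >> AggregatorNode >> EndNode

    class Outputs(BaseWorkflow.Outputs):
        final_value = EndNode.Outputs.final_value
```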
Ports and Conditionals
Nodes contain Ports and use them to determine which Nodes to execute next. Ports are useful for performing branching logic and conditional execution of subsequent Nodes.
We haven’t seen any Ports up until now, but they’re actually present in every Node. By default, a Node has a single Port called default, which is always invoked after the Node’s run method completes.
The following Workflows are equivalent:
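A sketch of that equivalence, assuming the implicit Port is reachable as FirstNode.Ports.default (the Node names here are illustrative):

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode


class FirstNode(BaseNode):
    pass


class SecondNode(BaseNode):
    pass


class ImplicitPortWorkflow(BaseWorkflow):
    # Relies on FirstNode's implicit "default" Port.
    graph = FirstNode >> SecondNode


class ExplicitPortWorkflow(BaseWorkflow):
    # References the same "default" Port explicitly.
    graph = FirstNode.Ports.default >> SecondNode
```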
You can explicitly define a Ports class on a Node and define the conditions under which one Node or another should execute. Below, we define a SwitchNode that has a winner Port and a loser Port.
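A sketch of that SwitchNode; the Port import path and the exact graph wiring for branching are our best understanding rather than a definitive reference:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode
from vellum.workflows.ports import Port


class StartNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        score: int

    def run(self) -> Outputs:
        return self.Outputs(score=7)


class SwitchNode(BaseNode):
    class Ports(BaseNode.Ports):
        # Take the "winner" branch when the score exceeds 5, otherwise "loser".
        winner = Port.on_if(StartNode.Outputs.score.greater_than(5))
        loser = Port.on_else()


class WinnerNode(BaseNode):
    pass


class LoserNode(BaseNode):
    pass


class SwitchWorkflow(BaseWorkflow):
    graph = StartNode >> {
        SwitchNode.Ports.winner >> WinnerNode,
        SwitchNode.Ports.loser >> LoserNode,
    }
```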
Notice that we use the greater_than Expression to define the winner Port; more on Expressions next.
Expressions
Descriptors support a declarative syntax for defining Expressions. Expressions are usually used in conjunction with Ports to define conditional execution of subsequent Nodes, but they can also be used as shorthand for performing simple operations that would otherwise have to be manually defined in a Node’s run method.
Here we define a StartNode that produces a random score between 0 and 10. We then define an EndNode that has a single output called winner that is True if the score is greater than 5.
For example, the longform definition of a Node that relies on StartNode.Outputs.score would look like this:
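A sketch of the longform version, assuming the import path shown:

```python
import random

from vellum.workflows.nodes import BaseNode


class StartNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        score: int

    def run(self) -> Outputs:
        return self.Outputs(score=random.randint(0, 10))


class EndNode(BaseNode):
    # Longform: pull the upstream output in as an attribute...
    score = StartNode.Outputs.score

    class Outputs(BaseNode.Outputs):
        winner: bool

    def run(self) -> Outputs:
        # ...then compare the resolved value inside run.
        return self.Outputs(winner=self.score > 5)
```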
And the shortform using an Expression would look like this:
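A sketch of the shortform, reusing StartNode from the sketch above:

```python
class EndNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        # Shortform: the Expression resolves to a bool at runtime; no run() needed.
        winner = StartNode.Outputs.score.greater_than(5)
```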
Triggers
In some cases, you may want to delay the execution of a Node until a certain condition is met. For example, you may want to wait for multiple upstream Nodes to complete before executing a Node, like when executing Nodes in parallel. This is where Triggers come in.
Just as Nodes define a Ports class implicitly by default, they also define a Trigger class implicitly by default. Here’s what the default Trigger class looks like:
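A sketch of the implicit default; the MergeBehavior import path is our assumption, while the enum values mirror the “Await Any” / “Await All” behaviors described below:

```python
from vellum.workflows.nodes import BaseNode
from vellum.workflows.types.core import MergeBehavior  # assumed import path


class MyNode(BaseNode):
    class Trigger(BaseNode.Trigger):
        # Run as soon as any one upstream Node has fulfilled.
        merge_behavior = MergeBehavior.AWAIT_ANY
```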
This means that by default, a Node will execute as soon as any one of its immediately upstream Nodes has fulfilled. You might instead want to wait until all of its upstream Nodes have fulfilled. To do this, you can explicitly define a Trigger class on a Node like so:
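A sketch of an explicit “Await All” Trigger, reusing the assumed MergeBehavior import above:

```python
class FinalNode(BaseNode):
    class Trigger(BaseNode.Trigger):
        # Wait for every upstream Node to fulfill before running.
        merge_behavior = MergeBehavior.AWAIT_ALL
```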
Here’s a complete example:
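A sketch of a complete example; the fan-out/fan-in graph syntax and the MergeBehavior import path are assumptions:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode
from vellum.workflows.types.core import MergeBehavior  # assumed import path


class StartNode(BaseNode):
    pass


class TopNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        value: str

    def run(self) -> Outputs:
        return self.Outputs(value="top")


class BottomNode(BaseNode):
    class Outputs(BaseNode.Outputs):
        value: str

    def run(self) -> Outputs:
        return self.Outputs(value="bottom")


class MergeNode(BaseNode):
    # Descriptor references to both upstream outputs.
    top = TopNode.Outputs.value
    bottom = BottomNode.Outputs.value

    class Trigger(BaseNode.Trigger):
        # Wait for both branches before running.
        merge_behavior = MergeBehavior.AWAIT_ALL

    class Outputs(BaseNode.Outputs):
        combined: str

    def run(self) -> Outputs:
        return self.Outputs(combined=f"{self.top} / {self.bottom}")


class ParallelMergeWorkflow(BaseWorkflow):
    # Fan out after StartNode, then fan back in at MergeNode.
    graph = StartNode >> {TopNode, BottomNode} >> MergeNode

    class Outputs(BaseWorkflow.Outputs):
        combined = MergeNode.Outputs.combined
```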
It’s usually sufficient to stick with the “Await All” and “Await Any” merge behaviors that are provided out of the box. However, you can also define your own custom merge behaviors by overriding the Trigger class’s should_initiate method. By doing so, you can access any information about the Node’s dependencies or the Workflow’s State (more on State later).
Parallel Execution
You may want to run multiple execution paths in parallel. For example, if you want to run multiple LLM prompts concurrently, or respond to a user while performing background tasks. To do this, you can use “set syntax” as follows:
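A sketch of the set syntax; the Node names here are illustrative placeholders:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.nodes import BaseNode


class StartNode(BaseNode):
    pass


class RespondToUserNode(BaseNode):
    pass


class BackgroundTaskNode(BaseNode):
    pass


class ParallelPathsWorkflow(BaseWorkflow):
    # Set syntax: both branches begin executing concurrently once StartNode fulfills.
    graph = StartNode >> {
        RespondToUserNode,
        BackgroundTaskNode,
    }
```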
State
In most cases it’s sufficient to drive a Node’s behavior based on either inputs to the Workflow or the outputs of upstream Nodes. However, Workflows also support writing to and reading from a global state object that lives for the duration of the Workflow’s execution.
Here’s an example of how to define the schema of a State object and use it in a Workflow.
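A sketch of such a State schema; the import paths and the generic parameters on BaseNode and BaseWorkflow are assumptions:

```python
from vellum.workflows import BaseWorkflow
from vellum.workflows.inputs import BaseInputs
from vellum.workflows.nodes import BaseNode
from vellum.workflows.state import BaseState


class State(BaseState):
    # A Workflow-scoped value that any Node can read or write during execution.
    items_processed: int = 0


class ProcessNode(BaseNode[State]):
    class Outputs(BaseNode.Outputs):
        items_processed: int

    def run(self) -> Outputs:
        # Write to State; downstream Nodes see the updated value.
        self.state.items_processed += 1
        return self.Outputs(items_processed=self.state.items_processed)


class StatefulWorkflow(BaseWorkflow[BaseInputs, State]):
    graph = ProcessNode

    class Outputs(BaseWorkflow.Outputs):
        items_processed = ProcessNode.Outputs.items_processed
```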
Even if no State class is explicitly defined, Workflows use State under the hood to track all information about a Workflow’s execution. This information is stored under the reserved meta attribute on the State class and can be accessed for your own purposes.
Streaming Outputs
Workflow Event Streaming
Until now, we’ve only seen the run() method being invoked on Workflows we’ve defined. run() is a blocking call that waits for the Workflow to complete before returning a terminal fulfilled or rejected event.
In some cases, you may want to stream the events a Workflow produces as they’re being emitted. This is useful when your Workflow produces outputs along the way, and you want to consume them in real-time.
You can do this via the stream() method, which returns a Generator that yields events as they’re produced.
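A sketch of consuming the stream, using a Workflow defined as in the earlier sketches:

```python
workflow = GreetingWorkflow()

# stream() yields events as they're emitted instead of blocking until completion.
for event in workflow.stream():
    print(event.name)

    if event.name == "workflow.execution.fulfilled":
        # The terminal event carries the Workflow's outputs.
        print(event.outputs)
```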
Node Event Streaming
By default, when you call a Workflow’s stream() method, you’ll only receive Workflow-level events. However, you may also opt in to receive Node-level events by specifying the event_types parameter.
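A sketch of opting in to Node-level events; the WorkflowEventType enum name, its import path, and the exact values accepted by event_types are assumptions here, so check the SDK for the precise API:

```python
# NOTE: WorkflowEventType and its import path are assumed for illustration.
from vellum.workflows.events.types import WorkflowEventType

workflow = GreetingWorkflow()

# Ask for both Workflow-level and Node-level events.
for event in workflow.stream(
    event_types={WorkflowEventType.WORKFLOW, WorkflowEventType.NODE},
):
    if event.name.startswith("node."):
        # Inspect Node-level events, e.g. "node.execution.fulfilled".
        print(event.name)
```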
With this, you can receive the events that Nodes in the Workflow produce as they’re emitted. This is useful when you want to inspect the outputs of individual Nodes for debugging purposes.