Building Common LLM architectures with Vellum Workflows

With a large number of supported node types (full details here: Experimenting with Workflows) and few limits on how they can be connected to each other, the range of architectures and applications you can create using Workflows is very large.

The list of architectures below is not exhaustive; we’re continuing to build it out. If you come up with an interesting architecture that you think the community might benefit from, please reach out so we can add it to the list here.

Create a Retrieval Augmented Generation (RAG) system

LLM applications often require specific context from a vector DB, which is added into the prompt. Rather than signing up for multiple systems and getting stuck on various micro-decisions, you can prototype a RAG system in Vellum in minutes.

Walkthrough

1

Create a Document Index and upload your documents

Follow this article for tips: Uploading Documents

2

Add a Search Node in your Workflow

Place this anywhere and connect it to the “entrypoint”

3

Add a Prompt Node

The Prompt Node should take the results of your Search Node as an input variable

4

Set up input variables and hit Run!

Workflow
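The data flow in the walkthrough above (search results feeding a prompt variable) can be sketched in plain Python. This is only an illustration under stated assumptions: `CHUNKS`, `search`, and `build_prompt` are hypothetical stand-ins, the word-overlap ranking replaces the vector search a real Document Index performs, and none of these names come from Vellum.

```python
import re
from string import Template

# Hypothetical stand-in for a Document Index: a few in-memory text chunks.
CHUNKS = [
    "Refunds are issued within 5 business days of approval.",
    "Our support team is available Monday through Friday.",
    "Premium plans include priority email support.",
]

def _words(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def search(query: str, chunks: list, top_k: int = 2) -> list:
    """Toy stand-in for a Search Node: rank chunks by words shared with the query."""
    query_words = _words(query)
    scored = [(len(query_words & _words(chunk)), chunk) for chunk in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep only chunks that share at least one word with the query.
    return [chunk for score, chunk in ranked[:top_k] if score > 0]

# Stand-in for the Prompt Node template; the search results fill $context.
PROMPT = Template(
    "Answer using only this context:\n$context\n\nQuestion: $question"
)

def build_prompt(question: str, chunks: list) -> str:
    """Wire the search output into the prompt as an input variable."""
    context = "\n".join(search(question, chunks))
    return PROMPT.substitute(context=context, question=question)
```

Calling `build_prompt("When are refunds issued?", CHUNKS)` yields the text that a Prompt Node would send to the model, with the retrieved chunk placed ahead of the question.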

Route messages to a Human

If you’re building an agent that answers questions coming from users (e.g., a support chatbot), you may want to set up rules so that any time an incoming message is sensitive (e.g., the user is angry or in a dangerous situation), the LLM automatically escalates it to a human. With Workflows, you can build that out quickly.

Walkthrough

1

Add a classification prompt

Use a Prompt Node to classify incoming messages as needing escalation or not

2

Add a downstream prompt

Use another Prompt Node for the LLM to respond to messages that don’t need to be escalated

3

Add and connect two Final Output Nodes

Connect the classification prompt outputs to two separate Final Output Nodes

4

Set up variables and hit Run!

Workflow
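The branching described in this walkthrough can be sketched as a small routing function. This is a minimal sketch, not Vellum code: `classify`, `respond`, and `route` are hypothetical names, and the keyword check merely stands in for the classification Prompt Node, which in practice would ask an LLM for the label.

```python
def classify(message: str) -> str:
    """Stand-in for the classification Prompt Node (a real one would call an LLM)."""
    sensitive_terms = ("angry", "furious", "emergency", "danger", "lawyer")
    text = message.lower()
    return "escalate" if any(term in text for term in sensitive_terms) else "respond"

def respond(message: str) -> str:
    """Stand-in for the downstream Prompt Node that answers routine messages."""
    return f"Automated reply to: {message}"

def route(message: str) -> dict:
    """Send the message down one of two branches, one per Final Output Node."""
    if classify(message) == "escalate":
        # Branch 1: hand the original message to a human agent.
        return {"handled_by": "human", "output": message}
    # Branch 2: let the LLM answer directly.
    return {"handled_by": "llm", "output": respond(message)}
```

Each branch returns its own result, mirroring the two separate Final Output Nodes in the Workflow.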

Retrying a Prompt Node in case of non-deterministic failure

Prompt Nodes support two selectable outputs: one from the model in case of a valid output, and one in case of a non-deterministic error. Model hosts fail for all sorts of reasons, including timeouts, rate limits, and server overload. You can make your production-grade LLM features resilient to these failures by adding retry logic to your Workflows.

Walkthrough

1

Add a standard Prompt Node

2

Add a Conditional Node (Error Check)

This node will read from the new Error output from the Prompt Node and check to see if it’s not null.

3

Define another Conditional Node (Count Check)

This node will read from the Prompt Node’s Execution Counter and check whether it’s been invoked more than your desired limit (3 in this example).

4

Loop back to the Prompt Node

Loop back to the Prompt Node if it’s under the limit, or exit with some error message if it’s over the limit. In the case that the error is null, exit with the Prompt Node’s response.

Workflow
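The loop formed by the two Conditional Nodes can be sketched as follows. This is a minimal sketch under stated assumptions: `run_with_retries` and `make_flaky_node` are hypothetical helpers, the prompt node is modeled as a callable returning a `(text, error)` pair, and the loop stops once the execution count reaches the limit, which is one reasonable reading of the Count Check.

```python
from typing import Callable, Optional, Tuple

# A prompt node is modeled as a callable returning (text, error);
# exactly one of the two is expected to be None.
PromptNode = Callable[[], Tuple[Optional[str], Optional[str]]]

def run_with_retries(prompt_node: PromptNode, max_attempts: int = 3) -> str:
    """Re-run the prompt node until it succeeds or the attempt limit is reached."""
    execution_count = 0
    while True:
        text, error = prompt_node()
        execution_count += 1
        if error is None:
            # Error Check: no error, so exit with the model's response.
            return text
        if execution_count >= max_attempts:
            # Count Check: limit reached, so exit with an error message.
            return f"Failed after {execution_count} attempts: {error}"
        # Otherwise loop back to the prompt node.

def make_flaky_node(failures: int) -> PromptNode:
    """Build a fake prompt node that fails a fixed number of times, then succeeds."""
    calls = {"n": 0}

    def node() -> Tuple[Optional[str], Optional[str]]:
        calls["n"] += 1
        if calls["n"] <= failures:
            return None, "timeout"
        return "ok", None

    return node
```

With the default limit of 3, a node that fails twice still succeeds on the third call, while a node that keeps failing produces the error message instead of looping forever.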

Summarizing the contents of a PDF file

Vellum Document Indexes are typically used to power RAG systems via Search Nodes. However, they can also be used to operate on the entirety of a single file’s contents. In this example, we use a Vellum Document Index not for search, but to leverage the OCR that it performs and operate on the raw text extracted from a PDF file.

Prerequisites: You need to have …

  • Created a Document Index. Note: it doesn’t matter what embedding model or chunking strategy you choose, since we’re only leveraging the OCR capabilities of the Document Index.
  • Uploaded a PDF file to the Document Index and noted down its ID.
  • Generated a Vellum API Token and saved its value as a Workspace Secret.

Walkthrough

1

Set the input to the workflow

This will be the ID of a Document that was previously uploaded to a Document Index

2

Add a Templating Node (Document API URL)

This will construct the URL of the Vellum API endpoint we want to hit.

3

Add an API Node (Document API)

This will call the Vellum API and retrieve metadata about the Document.

4

Add a Templating Node (Processed Document URL)

This will extract the URL of the processed document from the API response.

5

Add an API Node (Processed Document Contents)

This will retrieve the text contents of the Document.

6

Pass those contents to a Prompt Node that summarizes the text.

Workflow
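The chain of nodes above can be sketched as one function, with each step marked in a comment. Everything specific here is an assumption made for illustration: `DOCUMENT_API_BASE` is a placeholder rather than the real Vellum endpoint, the `Authorization` header scheme and the `processed_document_url` field name are guesses, and `http_get` and `summarize` are injected callables standing in for the API Nodes and the Prompt Node. Check the Vellum API reference for the actual URL, headers, and response fields.

```python
import json
from typing import Callable, Dict

# Placeholder base URL (assumption); replace with the endpoint from the API reference.
DOCUMENT_API_BASE = "https://api.example.com/documents"

def summarize_document(
    document_id: str,
    api_token: str,
    http_get: Callable[[str, Dict[str, str]], str],
    summarize: Callable[[str], str],
) -> str:
    """Follow the walkthrough: build URL, fetch metadata, fetch contents, summarize."""
    # Step 2: Templating Node (Document API URL).
    document_api_url = f"{DOCUMENT_API_BASE}/{document_id}"

    # Step 3: API Node (Document API). The header scheme is an assumption.
    headers = {"Authorization": f"Bearer {api_token}"}
    metadata = json.loads(http_get(document_api_url, headers))

    # Step 4: Templating Node (Processed Document URL). The field name is an assumption.
    processed_url = metadata["processed_document_url"]

    # Step 5: API Node (Processed Document Contents).
    contents = http_get(processed_url, {})

    # Step 6: Prompt Node that summarizes the text.
    return summarize(contents)
```

Passing `http_get` and `summarize` in as arguments keeps the sketch runnable without network access; in a real script they would wrap an HTTP client and a model call.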