Changelog | September 2024
Fireworks Llama 3.2 90B Vision Instruct
September 30th, 2024
Meta’s most recent open source vision model, Llama 3.2 Vision Instruct, is now available in Vellum. This model excels at visual recognition, image reasoning, captioning, and answering diverse questions about images, and it’s a great open source option if you’re looking for a vision model.
Private Models Cost Tracking
September 26th, 2024
Models created through the Custom Model Carousel on the Models page will now have cost tracking in both Prompt Sandboxes and Prompt Deployments. This means you’ll be able to see the dollar cost of LLM calls to model providers, even for your custom models.
Google Gemini 1.5 002 Models
September 24th, 2024
Google Gemini’s newest 002 models, `gemini-1.5-pro-002` & `gemini-1.5-flash-002`, are now available in Vellum! They offer 50% reduced pricing, 2x higher rate limits, and 3x lower latency than the previous Gemini 1.5 models.
Release Tag Column and Filter for Prompt Deployment Execution Table
September 24th, 2024
You can now view and filter on release tags attached to your prompt executions within the Prompt Deployment Execution Table! This addition allows for quick identification of the release version associated with each execution. You can enable this new column in the Columns dropdown.
New Prompt Caching Columns for Prompt Deployment Execution Table
September 23rd, 2024
A while back Anthropic added support for Prompt Caching. With this update, you’ll now see the number of Prompt Cache Read and Cache Creation Tokens used by a Prompt Deployment’s executions if it’s backed by an Anthropic model. This new monitoring data can be used to help analyze your cache hit rate with Anthropic and optimize your LLM spend.
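If you pull these token counts into your own analysis, computing a cache hit rate is straightforward. Here’s a minimal sketch in Python; the per-execution field names below are assumptions for illustration, not Vellum’s exact schema:

```python
# Estimate an Anthropic prompt-cache hit rate from the new monitoring columns.
# The dict keys are hypothetical; map them to however you export the
# Cache Read / Cache Creation token counts from Vellum.
executions = [
    {"cache_read_tokens": 1024, "cache_creation_tokens": 0},
    {"cache_read_tokens": 0, "cache_creation_tokens": 1024},
    {"cache_read_tokens": 2048, "cache_creation_tokens": 0},
]

reads = sum(e["cache_read_tokens"] for e in executions)
creations = sum(e["cache_creation_tokens"] for e in executions)

# Fraction of cacheable prompt tokens served from the cache rather than
# written to it. A low rate suggests your cached prefix changes too often.
hit_rate = reads / (reads + creations) if (reads + creations) else 0.0
print(f"Cache hit rate: {hit_rate:.1%}")  # -> Cache hit rate: 75.0%
```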
Improved Latency Filter and Sorting for Workflow Executions
September 23rd, 2024
You can now sort and filter by the Latency field in the Workflow Executions Table! This update makes it easier to prioritize and identify executions with higher or lower latencies, as well as to target executions within a range of latencies. We believe these improvements will greatly aid in monitoring workflow executions and managing their performance!
Improved Debugging for Map Nodes
September 23rd, 2024
It used to be difficult to debug problematic iterations when a Map Node failed. We now keep track of each iteration’s execution and make it easy to view them. You can page through a Map Node’s iterations one-by-one.
Each of these iterations, including any that failed, is now also shown in a Map Node’s full-screen editor.
The full-screen editor also lets you cycle through each of an executed Map Node’s iterations, making it easy to debug problematic iterations and iterate on the subworkflow that produced them.
Resizable Node Editor Panel
September 20th, 2024
For those of you using the new Workflow Builder, you’ll now be able to resize the Node Editor Panel. This update makes it much easier to edit complex Conditional Node rules, Chat History Messages, JSON values, and more.
Evaluations Performance Improvements
September 17th, 2024
While not as flashy as some of our other updates, we’ve undergone a major overhaul of our Evaluations backend resulting in significant performance improvements to the Evaluations page. Test Suites consisting of thousands of Test Cases used to feel sluggish and sometimes not load, but now load successfully and should feel much more responsive.
Cost Tracking for Prompt Deployment Executions Table
September 17th, 2024
You can now see the cost of each Prompt Execution in the Prompt Executions Table.
This is the next step of many we have planned for improving visibility into LLM costs in Vellum. You might use this to audit expensive calls and optimize your prompts to reduce costs.
Optimized Prompt Deployment Executions Table
September 13th, 2024
This update brings a reduction in load times for filters and sorts, in some instances dropping 2-minute load times to a few seconds.
We’ve achieved this by switching to a more efficient data source, enabling more effective filtering and sorting capabilities. You’ll notice faster page load times across the board, resulting in a smoother, more responsive experience when working with Prompt Deployment Executions.
This optimization sets the stage for exciting new features we have in the works. Stay tuned for more updates that will enhance your ability to analyze and optimize your prompt executions.
External ID Filtering for Workflow Deployment Executions
September 13th, 2024
Previously, when filtering Workflow Deployment executions by external ID, you had to provide an exact string match to retrieve relevant results.
Now, you can filter external IDs using a variety of string patterns. You can specify that the external ID should start with, end with, or contain certain substrings. This enhancement allows for more flexible filtering, making it easier to locate specific workflow deployment executions based on partial matches.
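To make the semantics concrete, here’s a small Python analogy of the three pattern types (the actual filtering happens in Vellum’s UI; the IDs below are made up):

```python
# Illustration only: how "starts with" / "ends with" / "contains" filters
# behave, using plain string operations on hypothetical external IDs.
external_ids = ["order-123-retry", "order-456", "user-789-order"]

starts_with = [i for i in external_ids if i.startswith("order-")]
# -> ["order-123-retry", "order-456"]

ends_with = [i for i in external_ids if i.endswith("-retry")]
# -> ["order-123-retry"]

contains = [i for i in external_ids if "order" in i]
# -> all three IDs
```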
Workflow Execution Timeline View Revamp
September 13th, 2024
We have given the Workflow Execution Timeline View a bit of a facelift. Along with a more modern look, we’ve added a couple of quality-of-life improvements:
- Subworkflows: Instead of needing to navigate to a separate page, you can now expand subworkflows to view their execution details within the same page.
- Node Pages: Instead of cluttering the page with the details of all nodes at once, we now display the details for just one node at a time. Click on a node to view its inputs, outputs, and more. Each node has its own permalink so that you can share the URL with others.
OpenAI Strawberry (o1) Models
September 12th, 2024
OpenAI’s newest Strawberry (o1) models, `o1-preview`, `o1-mini`, `o1-preview-2024-09-12`, & `o1-mini-2024-09-12`, are now available in Vellum and have been added to all workspaces!
Interactive Pages in Single Editor Mode
September 7th, 2024
It used to be that when two people were on the same Prompt/Workflow Sandbox, only one person could edit and interact with the page. If you were a Viewer, you were unable to interact with the page at all and were blocked with a big page overlay.
Now, the page overlay is gone and Viewers can interact with the page in a read-only mode and perform actions that don’t affect the state of the page. This includes things like scrolling, opening modals, copying text, etc.
Expand Cost in Execute Prompt APIs
September 4th, 2024
You can now opt in to receive the cost of a Prompt’s execution in the response of the Execute Prompt and Execute Prompt Stream APIs.
This is helpful if you want to capture the cost of executing a Prompt in your own system or if you want to provide cost transparency to your end users.
To opt in, you can pass the `expand_meta` field in the request body with the `cost` key set to `true`.
You can expect a corresponding value to be included in the `meta` field on the response.
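As a concrete illustration, here’s a minimal sketch of the request and response shapes in Python. Only `expand_meta`, `cost`, and `meta` come from this entry; every other field name and value is illustrative, so consult the API reference for exact shapes:

```python
# Hypothetical request body for the Execute Prompt API; only the
# expand_meta/cost opt-in is taken from this changelog entry.
request_body = {
    "prompt_deployment_name": "my-prompt",  # hypothetical deployment
    "inputs": [
        {"type": "STRING", "name": "query", "value": "Hello!"},
    ],
    "expand_meta": {"cost": True},  # opt in to cost reporting
}

# The response's meta field then carries a corresponding cost value
# (the shape shown here is illustrative):
response = {
    "outputs": ["..."],
    "meta": {
        "cost": 0.00045,  # hypothetical dollar cost of this execution
    },
}
```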
This functionality is available in our SDKs beginning with v0.8.9.
Default Block Type Preference
September 4th, 2024
You can now set a default Block type to use when defining Prompts in Vellum. Whenever you see the “Add Block” or “Add Message” options in a Prompt Editor, your preferred Block type will be used.
By default, the Block type is set to “Rich Text,” the newer option that supports Variable Chips. You can still switch between Block types for individual Blocks within the Prompt Editor.
New and Improved Code Editor
September 3rd, 2024
We now use the Monaco Editor for the code editor used by Workflow Code Nodes and custom Code Evaluation Metrics. Monaco is the same editor that Visual Studio Code uses under the hood.
This brings a number of improvements, including IntelliSense, semantic validation, and syntax validation. Additionally, we now inject Vellum Value types into the editor, so you can have fully typed input values for things such as Chat History. Some of these improvements are currently only available for TypeScript, not Python.
VPC Disable gVisor Option for Code Execution
September 3rd, 2024
VPC customers of Vellum can now disable gVisor sandboxing for code execution in self-hosted environments to significantly improve the performance of Code Nodes in Workflows. gVisor is needed for secure sandboxing in our managed SaaS platform, but in a self-hosted environment where you’re the only organization, it’s not strictly required, so long as you trust that users within your org won’t run malicious code.
Download Original Document from UI
September 2nd, 2024
You can now download a file that was originally uploaded as a Document to a Document Index from the UI. You’ll find a new “Download Original” option in a Document’s ••• More Menu.