API Versioning

Vellum uses date-based API versioning via request headers to give you control over when to enable new API features that may introduce breaking changes. This allows you to upgrade to new features on your own timeline while maintaining backward compatibility.

Why API Versioning Exists

API versioning in Vellum serves several important purposes:

  • Backward Compatibility: Ensures your existing integrations continue to work when new features are released
  • Controlled Upgrades: Allows you to test and adopt new features when you’re ready, rather than being forced to handle breaking changes immediately
  • Feature Gating: Enables access to new capabilities like Reasoning Outputs and enhanced response formats
  • Stable Production: Keeps your production systems running smoothly while new features are being developed

How to Use API Versioning

Setting the API Version

Specify which version of Vellum’s API you want to use by including the X-API-VERSION header in your requests:

curl --request POST \
  --url "https://predict.vellum.ai/v1/execute-prompt" \
  --header "Content-Type: application/json" \
  --header "X-API-KEY: $VELLUM_API_KEY" \
  --header "X-API-VERSION: 2025-07-30" \
  --data '{
    "prompt_deployment_name": "your-prompt-name",
    "release_tag": "LATEST"
  }'
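The same request can be sketched in Python with only the standard library. This is a minimal illustration rather than Vellum's SDK: the helper name build_execute_prompt_request is hypothetical, and the request body simply mirrors the curl example above. The request is built but not sent.

```python
import json
import urllib.request

VELLUM_EXECUTE_PROMPT_URL = "https://predict.vellum.ai/v1/execute-prompt"


def build_execute_prompt_request(api_key, api_version, prompt_deployment_name):
    """Build (but do not send) a POST request pinned to one API version.

    Hypothetical helper for illustration; the body mirrors the curl example.
    """
    body = json.dumps(
        {
            "prompt_deployment_name": prompt_deployment_name,
            "release_tag": "LATEST",
        }
    ).encode("utf-8")
    return urllib.request.Request(
        VELLUM_EXECUTE_PROMPT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-KEY": api_key,
            # The version header is what selects the API behavior.
            "X-API-VERSION": api_version,
        },
    )


request = build_execute_prompt_request(
    "placeholder-key", "2025-07-30", "your-prompt-name"
)
# To send it, pass the object to urllib.request.urlopen(request).
```

Building the request in one place keeps the version header from being forgotten on individual call sites.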

Default Behavior

The default API version behavior differs depending on how you make requests:

  • Raw API requests: If you don’t specify an X-API-VERSION header, Vellum will use the default version (2024-10-25) to ensure backward compatibility with existing integrations.
  • SDK v1.0.0+: The latest API version (2025-07-30) is used by default, giving you access to the newest features automatically. You can override this to use a prior API version if you wish.
  • SDK versions prior to 1.0.0: Defaults to using API version 2024-10-25, but can be overridden.
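The default rules above can be expressed as a small lookup. This is only a sketch of the documented behavior, not code from the SDK: the function name effective_api_version is hypothetical, and it assumes SDK version strings are dotted numbers such as "1.2.0".

```python
LEGACY_API_VERSION = "2024-10-25"
LATEST_API_VERSION = "2025-07-30"


def effective_api_version(explicit_version=None, sdk_version=None):
    """Return the API version a request would use under the rules above.

    explicit_version: an X-API-VERSION override, if one is set.
    sdk_version: SDK version string such as "1.2.0", or None for raw HTTP.
    """
    if explicit_version is not None:
        return explicit_version  # an explicit header always wins
    if sdk_version is None:
        return LEGACY_API_VERSION  # raw requests default to the legacy version
    major = int(sdk_version.split(".")[0])
    return LATEST_API_VERSION if major >= 1 else LEGACY_API_VERSION


print(effective_api_version())                      # raw request, no header
print(effective_api_version(sdk_version="1.0.0"))   # SDK v1.0.0+
print(effective_api_version(sdk_version="0.14.3"))  # pre-1.0.0 SDK
```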

SDK Version Requirements

Different SDK versions support different API versions:

  • SDK versions prior to 1.0.0: Only support API version 2024-10-25
  • SDK versions 1.0.0 and later: Support both 2024-10-25 and 2025-07-30, with 2025-07-30 as the default

Make sure to upgrade your SDK to version 1.0.0 or later to access the latest API features.

Available API Versions

2024-10-25 (Legacy Default)

This is the original API version that maintains full backward compatibility. It’s used by default for raw API requests when no X-API-VERSION header is provided.

Features:

  • Standard prompt execution responses
  • Traditional output formats
  • Full compatibility with all existing integrations

2025-07-30 (Latest, SDK Default)

The latest API version, which includes new features and improvements. This is the default in our SDKs beginning with v1.0.0.

New Features:

  • Reasoning Outputs: Support for thinking/reasoning blocks from compatible models
  • Enhanced Response Format: Differentiated blocks in API responses for reasoning-capable models
  • Future Features: This version will continue to receive new capabilities as they’re developed

Breaking Changes Across Versions

Changes in 2025-07-30

Reasoning Outputs Support

The most significant change in 2025-07-30 is the introduction of Reasoning Outputs for models that support thinking/reasoning capabilities.

Non-Streaming Output Before (2024-10-25):

{
  "execution_id": "3e973c88-86c2-45ae-bd0f-c72e9c43ddb4",
  "state": "FULFILLED",
  "outputs": [
    {
      "type": "STRING",
      "value": "The answer is 42."
    }
  ]
}

Non-Streaming Output After (2025-07-30):

{
  "execution_id": "1b37a0be-d089-4518-9ac5-867135eab960",
  "state": "FULFILLED",
  "outputs": [
    {
      "type": "THINKING",
      "value": {
        "type": "STRING",
        "value": "Let me think about that question. A quick google search claims that the meaning to life is 42."
      }
    },
    {
      "type": "STRING",
      "value": "The answer is 42."
    }
  ]
}
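A parser that tolerates both response shapes above might look like the following. It is a minimal sketch working on the decoded JSON as plain dictionaries; the function name split_outputs is hypothetical, and only the two output types shown in these examples are handled.

```python
def split_outputs(response):
    """Separate reasoning text from final text in a non-streaming response."""
    thinking, final = [], []
    for output in response.get("outputs", []):
        if output["type"] == "THINKING":
            # THINKING wraps a nested object: {"type": "STRING", "value": "..."}
            thinking.append(output["value"]["value"])
        elif output["type"] == "STRING":
            final.append(output["value"])
    return thinking, final


legacy_response = {
    "state": "FULFILLED",
    "outputs": [{"type": "STRING", "value": "The answer is 42."}],
}
latest_response = {
    "state": "FULFILLED",
    "outputs": [
        {"type": "THINKING", "value": {"type": "STRING", "value": "Let me think."}},
        {"type": "STRING", "value": "The answer is 42."},
    ],
}

print(split_outputs(legacy_response))
print(split_outputs(latest_response))
```

Selecting outputs by type rather than by position keeps code that previously read the first output from breaking when a THINKING entry appears ahead of the final answer.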

Streaming outputs also support Reasoning Outputs with the API Version set to 2025-07-30.

Streaming Output Before (2024-10-25):

{
  "state": "INITIATED",
  "execution_id": "1df0213a-9946-47f8-ac8d-e11359748c4e"
}
{
  "state": "STREAMING",
  "output": {
    "type": "STRING",
    "value": "The"
  },
  "output_index": 0,
  "execution_id": "1df0213a-9946-47f8-ac8d-e11359748c4e"
}
{
  "state": "STREAMING",
  "output": {
    "type": "STRING",
    "value": " answer"
  },
  "output_index": 0,
  "execution_id": "1df0213a-9946-47f8-ac8d-e11359748c4e"
}
... # The rest of the streaming deltas
{
  "state": "FULFILLED",
  "outputs": [
    {
      "type": "STRING",
      "value": "The answer is 42."
    }
  ],
  "execution_id": "1df0213a-9946-47f8-ac8d-e11359748c4e"
}

Streaming Output After (2025-07-30):

{
  "state": "INITIATED",
  "execution_id": "e3a40c85-20e1-421c-83da-f3dbbc9dd1e4"
}
{
  "state": "STREAMING",
  "output": {
    "type": "THINKING",
    "value": {
      "type": "STRING",
      "value": "Let me"
    }
  },
  "output_index": 0,
  "execution_id": "e3a40c85-20e1-421c-83da-f3dbbc9dd1e4"
}
{
  "state": "STREAMING",
  "output": {
    "type": "THINKING",
    "value": {
      "type": "STRING",
      "value": " think"
    }
  },
  "output_index": 0,
  "execution_id": "e3a40c85-20e1-421c-83da-f3dbbc9dd1e4"
}
... # The rest of the Reasoning streaming deltas
{
  "state": "STREAMING",
  "output": {
    "type": "STRING",
    "value": "The"
  },
  "output_index": 1,
  "execution_id": "e3a40c85-20e1-421c-83da-f3dbbc9dd1e4"
}
{
  "state": "STREAMING",
  "output": {
    "type": "STRING",
    "value": " answer"
  },
  "output_index": 1,
  "execution_id": "e3a40c85-20e1-421c-83da-f3dbbc9dd1e4"
}
... # The rest of the Final deltas
{
  "state": "FULFILLED",
  "outputs": [
    {
      "type": "THINKING",
      "value": {
        "type": "STRING",
        "value": "Let me think about that question. A quick google search claims that the meaning to life is 42."
      }
    },
    {
      "type": "STRING",
      "value": "The answer is 42."
    }
  ],
  "execution_id": "e3a40c85-20e1-421c-83da-f3dbbc9dd1e4"
}

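If your application parses API responses and expects a specific structure, you may need to update your code when upgrading to 2025-07-30. It must handle the new THINKING output type, and it can no longer assume that the final answer is the first (or only) entry in outputs or that streamed text always arrives at output_index 0.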
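For streaming consumers, one way to cope with the extra output is to accumulate deltas per output_index instead of assuming a single stream of text. The sketch below assumes the events have already been decoded into dictionaries shaped like the examples above; the function name accumulate_stream is hypothetical, and output types other than THINKING and STRING are skipped.

```python
def accumulate_stream(events):
    """Concatenate STREAMING deltas, grouped by output_index.

    Returns {output_index: {"type": <output type>, "text": <joined deltas>}}.
    """
    outputs = {}
    for event in events:
        if event.get("state") != "STREAMING":
            continue  # INITIATED and FULFILLED events carry no delta
        output = event["output"]
        if output["type"] == "THINKING":
            delta = output["value"]["value"]  # nested STRING value
        elif output["type"] == "STRING":
            delta = output["value"]
        else:
            continue  # types not covered by this sketch
        slot = outputs.setdefault(
            event["output_index"], {"type": output["type"], "text": ""}
        )
        slot["text"] += delta
    return outputs


sample_events = [
    {"state": "INITIATED"},
    {"state": "STREAMING", "output_index": 0,
     "output": {"type": "THINKING", "value": {"type": "STRING", "value": "Let me"}}},
    {"state": "STREAMING", "output_index": 0,
     "output": {"type": "THINKING", "value": {"type": "STRING", "value": " think"}}},
    {"state": "STREAMING", "output_index": 1,
     "output": {"type": "STRING", "value": "The"}},
    {"state": "STREAMING", "output_index": 1,
     "output": {"type": "STRING", "value": " answer"}},
]

streamed = accumulate_stream(sample_events)
print(streamed)
```

The same function also works on a 2024-10-25 stream, where every delta is a STRING at output_index 0.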

Models Affected

Reasoning Outputs are currently supported by:

  • All Anthropic Claude models with reasoning capabilities
  • All OpenAI models with reasoning capabilities, when invoked via the Responses API

Only models that actually support reasoning will include the THINKING output in responses. Standard models will continue to return responses in the same format as before.

Best Practices

Testing New Versions

  1. Test in Development: Always test new API versions in your development environment first
  2. Gradual Rollout: Consider rolling out API version changes gradually across your systems
  3. Monitor Responses: Watch for any changes in response structure that might affect your application
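A gradual rollout can be sketched by assigning each caller to a stable bucket and sending only a percentage of buckets to the new version. This is one possible approach rather than a Vellum feature: the function name api_version_for and the choice of a hashed key (a user ID, tenant ID, or service name) are assumptions made for illustration.

```python
import hashlib

LEGACY_API_VERSION = "2024-10-25"
LATEST_API_VERSION = "2025-07-30"


def api_version_for(key, rollout_percent):
    """Pick an API version for `key`, deterministically.

    rollout_percent is the share (0-100) of keys that should use the
    latest version; the same key always lands in the same bucket.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    return LATEST_API_VERSION if bucket < rollout_percent else LEGACY_API_VERSION


# Use the returned value as the X-API-VERSION header for that caller.
print(api_version_for("tenant-a", 25))
```

Because the bucket depends only on the key, raising rollout_percent moves more callers to the new version without flipping those already upgraded back to the old one.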

Version Management

  1. Pin Versions: Explicitly specify the API version in your requests rather than relying on defaults
  2. Document Usage: Keep track of which API versions you’re using across different parts of your application
  3. Plan Upgrades: Review changelog entries for new API versions to understand what changes to expect

Migration Guide

Upgrading to 2025-07-30

  1. Update SDK: Upgrade to SDK version 1.0.0 or later
  2. Test Reasoning Models: If you use reasoning-capable models, test that your application handles the new THINKING output type appropriately
  3. Update Headers: Add X-API-VERSION: 2025-07-30 to your API requests (or rely on SDK v1.0.0+ defaults)
  4. Monitor: Watch for any unexpected behavior after the upgrade

Handling Reasoning Outputs

If you’re upgrading to 2025-07-30 and use reasoning-capable models, you may need to update your response handling:

import os

from vellum.client import Vellum

# Assumes the VELLUM_API_KEY environment variable is set
client = Vellum(api_key=os.environ["VELLUM_API_KEY"])

# Handle both thinking and non-thinking responses
result = client.execute_prompt(
    prompt_deployment_name="your-prompt-name",
    release_tag="LATEST",
)

for output in result.outputs:
    if output.type == "THINKING":
        # THINKING outputs wrap a nested STRING value
        print(f"Thinking Output: {output.value.value}\n")
    elif output.type == "STRING":
        print(f"Final Output: {output.value}\n")

Future Versions

Vellum will continue to release new API versions as new features are developed. Each new version will be documented with:

  • A comprehensive list of new features
  • Breaking changes and migration guidance
  • SDK compatibility requirements
  • Examples of new capabilities

Stay tuned to the changelog for announcements of new API versions and features.