Batching Executions

When you need to process many Workflow executions at once, the async execution endpoint is ideal. Async executions queue automatically once you exceed your concurrency limit, so you can initiate a large batch quickly without waiting for each execution to complete.

Why Use Async Execution for Batch Jobs

Async execution is perfect for batch processing scenarios because:

  • Automatic queuing: Executions automatically queue when you exceed your concurrency limit
  • Non-blocking: You can initiate many executions quickly without waiting for completion
  • Efficient resource usage: Executions process as capacity becomes available
  • Scalable: Handle large batches without overwhelming your system

Basic Batch Job Pattern

The simplest pattern is to initiate all executions at once. They’ll queue automatically if needed:

Python SDK - Basic Batch Job

```python
import vellum

client = vellum.VellumClient(api_key="your-api-key")

# Process a batch of items
items_to_process = [
    {"user_query": "Process item 1"},
    {"user_query": "Process item 2"},
    {"user_query": "Process item 3"},
    # ... many more items
]

# Initiate all executions - they'll queue automatically if needed
execution_ids = []
for i, item in enumerate(items_to_process):
    response = client.execute_workflow_async(
        workflow_deployment_name="your-workflow",
        inputs=[
            vellum.WorkflowRequestStringInput(
                name="user_query",
                value=item["user_query"]
            )
        ],
        external_id=f"batch-item-{i}"  # Track each item
    )
    execution_ids.append(response.execution_id)
    print(f"Initiated execution {i+1}/{len(items_to_process)}: {response.execution_id}")

print(f"\nInitiated {len(execution_ids)} executions. They'll process as capacity becomes available.")
```

Tracking Batch Job Completion

After initiating your batch, you have several options for tracking completion:

Option 1: Webhooks (Recommended)

The most efficient approach is to use webhooks to receive completion notifications. See our Long Running Workflows guide for webhook setup details.

Python SDK - Batch Job with Webhooks
```python
import vellum

client = vellum.VellumClient(api_key="your-api-key")

# Initiate batch executions
items_to_process = [...]  # your items
execution_ids = []

for i, item in enumerate(items_to_process):
    response = client.execute_workflow_async(
        workflow_deployment_name="your-workflow",
        inputs=[...],  # your inputs
        external_id=f"batch-item-{i}"  # Use external_id for webhook correlation
    )
    execution_ids.append(response.execution_id)

# Store execution_ids for tracking
# Webhooks will notify you when each execution completes
print(f"Initiated {len(execution_ids)} executions. Webhooks will notify on completion.")
```

Option 2: Status Polling

Poll the status endpoint to check completion. This is useful when you need to wait for results before proceeding:

Python SDK - Batch Job with Status Polling
```python
import vellum
import time
from typing import Dict

client = vellum.VellumClient(api_key="your-api-key")

# Initiate batch executions
items_to_process = [...]  # your items
execution_ids = []

for i, item in enumerate(items_to_process):
    response = client.execute_workflow_async(
        workflow_deployment_name="your-workflow",
        inputs=[...],  # your inputs
        external_id=f"batch-item-{i}"
    )
    execution_ids.append(response.execution_id)

# Poll for completion
results: Dict[str, dict] = {}
pending = set(execution_ids)

while pending:
    for execution_id in list(pending):
        try:
            status_response = client.check_workflow_execution_status(
                execution_id=execution_id
            )

            if status_response.status == "FULFILLED":
                results[execution_id] = {
                    "status": "completed",
                    "outputs": status_response.outputs
                }
                pending.remove(execution_id)
                print(f"Completed: {execution_id}")
            elif status_response.status == "REJECTED":
                results[execution_id] = {
                    "status": "failed"
                }
                pending.remove(execution_id)
                print(f"Failed: {execution_id}")
        except Exception as e:
            print(f"Error checking {execution_id}: {e}")

    if pending:
        print(f"Still processing {len(pending)} executions...")
        time.sleep(30)  # Poll every 30 seconds

print(f"\nBatch complete! Processed {len(results)} executions.")
```

Option 3: Hybrid Approach

Initiate executions and periodically check status, but rely on webhooks for final notification:

Python SDK - Hybrid Approach
```python
import vellum
import time

client = vellum.VellumClient(api_key="your-api-key")

# Initiate batch executions
items_to_process = [...]  # your items
execution_ids = []

for i, item in enumerate(items_to_process):
    response = client.execute_workflow_async(
        workflow_deployment_name="your-workflow",
        inputs=[...],  # your inputs
        external_id=f"batch-item-{i}"
    )
    execution_ids.append(response.execution_id)

# Optional: Quick status check after a delay
time.sleep(60)  # Wait 1 minute

# Check how many have completed so far
completed = 0
for execution_id in execution_ids:
    try:
        status = client.check_workflow_execution_status(execution_id=execution_id)
        if status.status in ["FULFILLED", "REJECTED"]:
            completed += 1
    except Exception:
        pass  # Treat transient status-check errors as still pending

print(f"Progress: {completed}/{len(execution_ids)} completed")
print("Webhooks will notify when remaining executions complete.")
```

Best Practices

Always include an external_id when initiating batch executions. This allows you to correlate webhook events with your internal records, making it easy to track which item in your batch corresponds to each execution.
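One way to wire up that correlation is to build a lookup table keyed by `external_id` as you initiate executions. The sketch below uses a hypothetical webhook payload shape (`external_id` and `status` fields) purely for illustration; consult the Long Running Workflows guide for the actual event schema.

```python
# Build a correlation map while initiating the batch, then use it to
# resolve webhook events back to the originating item.
items_to_process = [
    {"user_query": "Process item 1"},
    {"user_query": "Process item 2"},
]

records_by_external_id = {
    f"batch-item-{i}": {"item": item, "status": "pending"}
    for i, item in enumerate(items_to_process)
}

def handle_webhook(payload: dict) -> dict:
    """Look up the originating batch item from a completion event."""
    record = records_by_external_id[payload["external_id"]]
    record["status"] = "completed" if payload["status"] == "FULFILLED" else "failed"
    return record

# Hypothetical completion event for the second batch item
event = {"external_id": "batch-item-1", "status": "FULFILLED"}
record = handle_webhook(event)
print(record["item"]["user_query"], record["status"])
```

Because the map is keyed by your own identifier rather than the server-generated `execution_id`, it works even before the initiation response arrives.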

Be aware of your organization’s concurrency limits. While async executions queue automatically, understanding your limits helps you plan batch sizes and processing times.
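A back-of-envelope way to plan around a concurrency limit: executions drain in "waves" of at most the limit at a time. All three numbers below are placeholders; substitute your own limit and measured runtimes.

```python
import math

batch_size = 500            # executions in the batch (placeholder)
concurrency_limit = 20      # your organization's concurrent-execution limit (placeholder)
avg_execution_seconds = 45  # average runtime of one execution (placeholder)

# At most `concurrency_limit` executions run at once, so the batch drains in waves
waves = math.ceil(batch_size / concurrency_limit)
estimated_seconds = waves * avg_execution_seconds

print(f"~{waves} waves, roughly {estimated_seconds / 60:.0f} minutes total")
```

This is only a rough upper-bound estimate; in practice new executions start as soon as slots free up, so real batches finish somewhat faster.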

Some executions in a batch may fail. Use webhooks or status polling to identify failures and implement retry logic or error handling as needed.
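A minimal retry sketch, using a stand-in `run_item` function so the example is self-contained: in a real batch you would call `execute_workflow_async` again for the failed item and track the new `execution_id`.

```python
import time

def run_item(item: dict, attempt: int) -> str:
    # Stand-in for re-running one batch item; here the first attempt
    # "fails" and the retry "succeeds", purely for illustration.
    return "REJECTED" if attempt == 0 else "FULFILLED"

def process_with_retries(items: list, max_retries: int = 2) -> dict:
    outcomes = {}
    for i, item in enumerate(items):
        for attempt in range(max_retries + 1):
            if run_item(item, attempt) == "FULFILLED":
                outcomes[i] = "completed"
                break
            time.sleep(0)  # replace with a real backoff, e.g. 2 ** attempt seconds
        else:
            outcomes[i] = "failed"  # exhausted all retries
    return outcomes

outcomes = process_with_retries([{"user_query": "Process item 1"}])
print(outcomes)
```

Capping retries and recording terminal failures keeps one bad item from stalling the rest of the batch.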

For large batches (hundreds or thousands of executions), webhooks are more efficient than polling. They reduce API calls and provide real-time notifications.
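To see why, compare the API traffic each approach generates. The numbers below are placeholders for illustration:

```python
import math

batch_size = 1000            # executions being tracked (placeholder)
poll_interval_seconds = 30   # polling cadence from the example above
batch_duration_minutes = 20  # how long the batch takes to drain (placeholder)

polls = math.ceil(batch_duration_minutes * 60 / poll_interval_seconds)
polling_calls = batch_size * polls   # worst case: every execution checked each cycle
webhook_deliveries = batch_size      # one completion notification per execution

print(f"Polling: up to {polling_calls} status calls; webhooks: {webhook_deliveries} deliveries")
```

Even with pending-set pruning as in the polling example, call volume grows with both batch size and duration, while webhook traffic stays proportional to batch size alone.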

There’s no hard limit on batch size, but consider:

  • Your organization’s concurrency limits
  • Processing time per execution
  • Webhook endpoint capacity
  • Error handling complexity
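One way to manage those trade-offs for very large batches is to submit in chunks rather than all at once. The sketch below uses an arbitrary `chunk_size` of 100 for illustration; tune it against your concurrency limits.

```python
from typing import Iterator, List

def chunked(items: List[dict], chunk_size: int) -> Iterator[List[dict]]:
    """Yield successive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]

items_to_process = [{"user_query": f"Process item {n}"} for n in range(250)]

for batch_number, chunk in enumerate(chunked(items_to_process, chunk_size=100)):
    # In practice: initiate each item in the chunk with execute_workflow_async,
    # then wait for webhook/polling confirmation before starting the next chunk.
    print(f"Chunk {batch_number}: {len(chunk)} items")
```

Chunking bounds how many executions are queued at a time, keeps webhook bursts manageable, and gives you natural checkpoints for error handling between chunks.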