When using an LLM, you input a prompt and get back a completion. However, in order for a prompt to be reusable, you’ll want to be able to specify certain parts of it dynamically.
That’s where prompt templates come in. Prompt templates contain the fixed parts of a prompt with placeholders for the parts that might change from run to run. When you use a prompt template, you pass in specific values for the placeholders to create the final prompt that’s sent to the LLM.
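For instance, a template might combine fixed instructions with a placeholder (the ticket_text variable name here is just illustrative):

```jinja
Summarize the following support ticket in one sentence:

{{ ticket_text }}
```

At run time you'd pass a value for ticket_text, and the compiled prompt sent to the LLM would contain that value in place of the placeholder.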
Vellum’s Prompt Syntax supports dynamically constructed prompts via variable substitution, Jinja templating, and what we call “blocks.”
Jinja Templating
Jinja is a powerful templating syntax useful for dynamic content. Most frequently, you’ll use it to reference Prompt Variables. Below are the most common things you’re likely to want to do, but you can find Jinja’s complete documentation here.
Variables
Reference variables using double curly braces. For example, if you had a Prompt Variable named question (an illustrative name), you might reference it like so:
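```jinja
{# "question" is an illustrative Prompt Variable name #}
Answer the following question as concisely as you can.

Question: {{ question }}
```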
Note that all prompt variables are treated as strings!
Conditionals
Perform conditional logic based on your input variables using if/else statements. For example (the tone variable below is illustrative):
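```jinja
{% if tone == "formal" %}
Respond in a formal, professional tone.
{% else %}
Respond in a casual, friendly tone.
{% endif %}
```

Since all prompt variables are strings, comparing tone against a string value works without any casting.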
Comments
You can use Jinja to leave comments in your prompt that don’t use up any tokens when compiled and sent to the LLM. For example:
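```jinja
{# TODO: tighten this instruction once we've run more test cases #}
Summarize the text below in two sentences.
```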
JSON Inputs
All Prompt Variables are treated as strings. However, if you want to supply JSON, you can use this trick to access specific key/value pairs.
For example, say you have a variable called traits whose value in a Scenario looked like this (the key names below are illustrative):
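```json
{
  "personality": "happy go lucky",
  "favorite_color": "blue"
}
```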
Then you can access "happy go lucky" with something like the following (a sketch that assumes the Jinja environment exposes json.loads for parsing JSON strings):
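```jinja
{# assumes json.loads is exposed in the Jinja environment #}
{{ json.loads(traits)["personality"] }}
```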
Casting Variable Types
Vellum currently treats all input variables to prompts as strings. However, you may use Jinja filters to convert your variables to specific types and then use them accordingly. For example (using Jinja’s built-in int filter on an illustrative num_items variable):
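```jinja
{% if num_items | int > 3 %}
Summarize only the three most important items.
{% else %}
Summarize every item.
{% endif %}
```

Here the int filter casts the string value of num_items to an integer before the comparison is evaluated.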
Blocks
Vellum uses blocks to separate key pieces of a prompt, such as the System/Assistant/User messages of a Chat model (e.g. gpt-3.5-turbo or claude-v1) or the special $chat_history variable.
Blocks are more noticeable when using a Chat model than when using a Text model.
Chat Model Example
Here’s a rough sketch of what a sequence of blocks might look like for a Chat model (the block contents are illustrative):
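```
[SYSTEM]        You are a helpful assistant that answers questions about our product.
[$chat_history] Expands at runtime into the prior user/assistant messages.
[USER]          {{ question }}
(schematic sketch only - not the exact Playground layout)
```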
Text Model Example
Text models typically use a single block, which might look like this:
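```jinja
{# illustrative single-block prompt for a Text model #}
You are a helpful assistant.

Answer the following question as concisely as you can.

Question: {{ question }}
Answer:
```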
Switching from Chat ↔ Text Models
When switching from a Chat Model to a Text model, blocks are converted as best they can be.
Chat → Text
Here’s an example going from Chat to Text.
Text → Chat
Here’s an example going from Text to Chat.
Function Calling
Function Calling allows you to provide function definitions within your prompts that the model can use in deciding how to respond to each user prompt. It’s best to think of functions as a type of classifier prompt that pushes the model to respond with either:
- The name of one of those functions, with associated parameter values.
- A standard text response.
This definitive response from the model allows developers building LLM features into their applications to know when to call a function and when to return a message to the user. This removes the need to parse JSON or other formats out of the LLM’s text response, leading to more stable user experiences.
To define a function, you first need to choose a model that supports function calling and click the + Add → Function button at the bottom of your prompt:
Then, click the block that’s created and a modal will appear where you can start defining your function! There are three important sections to consider:
- Name - This is a single identifier that the model will use to tell you which function to call next.
- Description - This is a natural language description of what your function does. This is the part the model relies on most when deciding which function to call, and it should be considered as counting towards your token count.
- Parameters - The set of parameters your function accepts. Each parameter also has a Name, Description, & Type that the model uses to decide what values the function should be called with.
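For instance, a definition for the get_current_weather function used in the example below might look like this in OpenAI-style JSON (the description text and parameter details are illustrative):

```json
{
  "name": "get_current_weather",
  "description": "Get the current weather for a given location",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The city and state, e.g. Boston, MA"
      }
    },
    "required": ["location"]
  }
}
```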
When you call the model with a prompt along with these function definitions, it will decide whether it makes sense to call one of the defined functions or to return a standard text response. If it decides to call a defined function, the response will be a JSON object indicating which function to call and with which parameter values (shown here in an OpenAI-style format):
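```json
{
  "role": "assistant",
  "content": null,
  "function_call": {
    "name": "get_current_weather",
    "arguments": "{\"location\": \"Boston, MA\"}"
  }
}
```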
Notice that in this example, the model is directing us to call the get_current_weather function with the location parameter set to Boston, MA. At this point, it is up to the app developer to actually invoke the function - the model itself does not have access to the execution logic. These functions should represent public or private APIs that the app developer supports and could invoke once instructed by the model.
Once the function is called and a response is observed, the response should be fed back to the LLM as a function message so that the model knows what the outcome of calling that function was. The model will then be able to use that function’s response when deciding how to respond next in order to satisfy the original prompt.
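For example, the messages sent back to the model might look like this (OpenAI-style messages; the user question and weather payload are illustrative):

```json
[
  { "role": "user", "content": "What's the weather like in Boston, MA right now?" },
  {
    "role": "assistant",
    "content": null,
    "function_call": {
      "name": "get_current_weather",
      "arguments": "{\"location\": \"Boston, MA\"}"
    }
  },
  {
    "role": "function",
    "name": "get_current_weather",
    "content": "{\"temperature\": 68, \"unit\": \"fahrenheit\", \"conditions\": \"partly cloudy\"}"
  }
]
```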
Notice that we need to specify the original response from the model as an assistant message, before following up with a function message. The final output from the model then represents its understanding of the user’s prompt and the output of the function it had access to.
Once you have reached this point, you’ll have everything you need to add functions to your models! Function calling is best used for incorporating the following types of data into your prompts:
- Runtime or recent data - New data that has become available that the developer’s APIs have access to but the model was not trained on
- Proprietary data - Data the developer has collected that is specialized to their business proposition that gives their application a comparative advantage
- Weighted data - Data that the model already has been trained on but that the developer would like to re-prioritize based on some special insights that could be inferred from their API
Previewing Compiled Prompts
Given the powerful and dynamic nature of Vellum’s prompt syntax, you may want to see the final, compiled payload that’s sent to the model provider after all variable substitutions and Jinja templating have been applied. You can do this in the Playground UI like so: