When evaluating LLM outputs, you may want to check for different criteria using the same type of Metric. Vellum allows you to add the same Metric multiple times to a test suite, each with different configurations and names, enabling more comprehensive evaluation of your outputs.
There are several scenarios where reusing the same Metric with different configurations is valuable:
To add multiple instances of the same Metric to your Test Suite:
When using the same Metric multiple times, it’s important to rename each instance to clearly indicate its purpose:

A common use case is using the LLM-as-Judge Metric multiple times to evaluate different aspects of your outputs:
You might add three instances of the LLM-as-Judge Metric:
Another powerful use case is using the same Code Execution Metric multiple times to evaluate different fields within a structured JSON output.
Imagine your LLM generates a product recommendation in JSON format with multiple nested fields:
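As an illustration, the sketch below builds one such output in Python. The nesting is an assumption inferred from the key paths and expected values used in this guide (`recommendation.product.name`, `.price`, `.features`); a real output may carry additional fields.

```python
import json

# Hypothetical payload: the structure is inferred from the key paths used in
# this guide, and the values come from the validation examples.
sample_output = {
    "recommendation": {
        "product": {
            "name": "Ultra Comfort Mattress",
            "price": 899.99,
            "features": ["memory foam", "cooling gel", "hypoallergenic"],
        }
    }
}

# Assumption: the Metric sees the output as a JSON string, so serialize it.
completion = json.dumps(sample_output, indent=2)
print(completion)
```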
You can create a single Code Execution Metric that extracts and evaluates a specific field based on a provided key path:
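A minimal sketch of such a Metric in Python follows. The `main` entry point, the `{"score": ...}` return shape, and the helper `resolve_key_path` are assumptions made for illustration; check the Code Execution Metric documentation for the exact signature Vellum expects. The sketch also assumes `completion` arrives as a JSON string and `target` is already a parsed value.

```python
import json
from typing import Any


def resolve_key_path(data: Any, key_path: str) -> Any:
    """Walk a dot-separated key path through nested dicts."""
    current = data
    for key in key_path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"Key path '{key_path}' not found at '{key}'")
        current = current[key]
    return current


def main(completion: str, target: Any, key_path: str) -> dict:
    """Return a score of 1.0 when the field at key_path equals target, else 0.0."""
    try:
        actual = resolve_key_path(json.loads(completion), key_path)
    except (json.JSONDecodeError, KeyError):
        # Invalid JSON or a missing field counts as a failed check.
        return {"score": 0.0}
    return {"score": 1.0 if actual == target else 0.0}
```

Because the field to check is passed in as `key_path` rather than hard-coded, the same code can back any number of Metric instances.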
Then, you can add this same Metric multiple times to your test suite with different configurations:
Product Name Validation
completion: output (the Prompt or Workflow output)
target: expected_output (would resolve to “Ultra Comfort Mattress”)
key_path: “recommendation.product.name” (a constant)

Price Validation
completion: output (the Prompt or Workflow output)
target: expected_output (would resolve to 899.99)
key_path: “recommendation.product.price” (a constant)

Features Validation
completion: output (the Prompt or Workflow output)
target: expected_output (would resolve to [“memory foam”, “cooling gel”, “hypoallergenic”])
key_path: “recommendation.product.features” (a constant)

This approach allows you to reuse the exact same code while evaluating different aspects of your JSON output by simply changing the input parameters. It’s a powerful pattern that reduces duplication and makes your evaluation more maintainable.
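The three configurations above can be mimicked locally as plain data driving one shared function. This is only a sketch: `score_field` is a hypothetical stand-in for the Metric's code, `metric_instances` is not a Vellum structure, and the targets are hard-coded here, whereas in a test suite they would come from each test case's expected output.

```python
import json
from functools import reduce


def score_field(completion, target, key_path):
    """Stand-in for the shared Metric code: compare one field against target."""
    try:
        actual = reduce(lambda node, key: node[key],
                        key_path.split("."), json.loads(completion))
    except (ValueError, KeyError, TypeError):
        return 0.0
    return 1.0 if actual == target else 0.0


# Hypothetical output matching the example values in this guide.
completion = json.dumps({"recommendation": {"product": {
    "name": "Ultra Comfort Mattress",
    "price": 899.99,
    "features": ["memory foam", "cooling gel", "hypoallergenic"],
}}})

# One entry per Metric instance: same code, different name and inputs.
metric_instances = [
    {"name": "Product Name Validation", "target": "Ultra Comfort Mattress",
     "key_path": "recommendation.product.name"},
    {"name": "Price Validation", "target": 899.99,
     "key_path": "recommendation.product.price"},
    {"name": "Features Validation",
     "target": ["memory foam", "cooling gel", "hypoallergenic"],
     "key_path": "recommendation.product.features"},
]

results = {m["name"]: score_field(completion, m["target"], m["key_path"])
           for m in metric_instances}
for name, score in results.items():
    print(f"{name}: {score}")
```

Keying the results by instance name shows why distinct, descriptive names matter: they are the only thing that distinguishes one score from another when the underlying code is identical.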
Learn more about setting and using Expected Outputs in Quantitative Evaluation.
When reusing Metrics in your test suites:
Reusing Metrics with different configurations provides a powerful way to perform multi-dimensional evaluation of your LLM outputs. By applying the same Metric type in different ways, you can gain deeper insights into the quality and correctness of your AI-generated content.