Improve Retrieval Results with Metadata Filtering
Some use cases of Vellum Search require you to narrow the search to a subset of documents before matching on keywords or semantic similarity. For example, you might want to search across historical conversations for a specific user, or only across documents that have specific tags.
You can do this through metadata filtering.
Metadata filtering requires that you:
- Provide structured metadata for your documents either upon initial upload or later; and
- Provide filter criteria when performing a search.
Let’s see how to do each.
Specifying Metadata
You can specify metadata for documents through both the UI and API.
Through the UI
You can provide metadata upon initial upload.
You can also view metadata associated with a document and edit it after it’s been uploaded.
Through the API
You can provide metadata as stringified JSON upon initial upload using the upload Documents API here.
You can also update a document’s metadata after the fact using the Document - Partial Update endpoint here.
Note that in this endpoint, you can simply provide a JSON object (rather than a stringified JSON object as is required during initial upload).
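The difference between the two formats can be sketched in Python as follows. This is only an illustration: the `metadata` key and the example fields (`priority`, `tags`) are assumptions, and no real request is sent.

```python
import json

# Hypothetical metadata for a document (field names are illustrative only).
metadata = {"priority": "high", "tags": ["bug", "customer-facing"]}

# Initial upload: the metadata is passed as a stringified JSON object.
# The "metadata" key name is an assumption about the request shape.
upload_fields = {"metadata": json.dumps(metadata)}

# Partial update: the metadata is passed as a plain JSON object.
update_body = {"metadata": metadata}
```

In the first case the value is a string that happens to contain JSON; in the second it is a nested object that the HTTP client serializes along with the rest of the request body.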
Filtering Against Metadata
You use the search endpoint to perform a search against an index (documented here). This endpoint exposes an options.filters.metadata field for filtering against your provided metadata prior to matching on keywords/semantic similarity.
The syntax of the metadata property supports complex boolean logic and was borrowed from React Query Builder. You can use their demo here to get a feel for the query syntax.
Note that values for fields must be JSON-deserializable. If you’re looking to filter against a string, then the value passed in should contain escaped double quotes.
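One way to produce JSON-deserializable values is to run each value through a JSON encoder before placing it in the filter. The helper below is a hypothetical sketch (the name `filter_value` is not part of any API); it shows that a string ends up wrapped in escaped double quotes, while numbers and booleans do not.

```python
import json

def filter_value(value):
    """Encode a Python value as a JSON string suitable for a filter value."""
    return json.dumps(value)

string_value = filter_value("high")  # '"high"' (note the embedded double quotes)
number_value = filter_value(3)       # '3'
bool_value = filter_value(True)      # 'true'
```

When the whole filter is later serialized into the request body, the string value appears as `"\"high\""`, which is the escaped form described above.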
Example
Suppose you have two documents with the following metadata:
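As an illustration, the two documents might carry metadata like the sketch below. The field names (`tags`, `priority`) and their values are hypothetical, chosen only to fit the scenario in this example.

```python
# Hypothetical metadata for two documents (illustrative values only).
document_a_metadata = {
    "tags": ["customer-facing", "needs-triage", "bug"],
    "priority": "low",
}

document_b_metadata = {
    "tags": ["customer-facing", "bug"],
    "priority": "high",
}
```

Under these assumed values, only the second document is both high priority and tagged as a customer-facing bug.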
If you wanted to search across all documents that are marked as high-priority, customer-facing bugs, you would use the following query:
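A minimal sketch of such a query, built in Python. It assumes hypothetical metadata fields named `tags` and `priority`, and follows the combinator/rules/field/operator/value layout of React Query Builder's JSON format. The operator names and the casing of the combinator are assumptions and should be checked against the search endpoint's reference.

```python
import json

# All three rules must match (combinator casing is an assumption).
metadata_filter = {
    "combinator": "AND",
    "rules": [
        # String values are JSON-encoded, so they carry their own double quotes.
        {"field": "priority", "operator": "=", "value": json.dumps("high")},
        {"field": "tags", "operator": "contains", "value": json.dumps("customer-facing")},
        {"field": "tags", "operator": "contains", "value": json.dumps("bug")},
    ],
}

# The filter sits under options.filters.metadata in the search request.
search_options = {"filters": {"metadata": metadata_filter}}
```

The `search_options` dictionary would then be sent as the `options` field of the search request, alongside the query text and the index to search.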