Version Selection
The AI Gateway automatically determines which prompt version to use based on the parameters you provide:Uses the version deployed to that environment (e.g., production, staging, development)
Uses a specific version directly by its ID
Default behavior: If neither parameter is provided, the production version is used. Environment takes precedence over version_id if both are specified.
Parameter Priority
Saved prompts store all the configuration you set in the playground - temperature, max tokens, response format, system messages, and more. At runtime, these saved parameters are used as defaults, but any parameters you specify in your API call will override them.Message Handling
Messages work differently than other parameters. Instead of overriding, runtime messages are appended to the saved prompt messages. This allows you to:- Define consistent system prompts and example conversations in your saved prompt
- Add dynamic user messages at runtime
- Build multi-turn conversations that maintain context
messages parameter becomes optional in API calls when using Helicone prompts. However, if your prompt template is empty or lacks messages, you’ll need to provide them at runtime.
Runtime messages are always appended to the end of your saved prompt messages. Make sure your saved prompt structure accounts for this behavior.
Prompt Partial Resolution
Prompt partials are resolved before variable substitution, allowing you to reference messages from other prompts and control their variables from the main prompt.Resolution Order
The prompt assembly process follows this order:- Prompt Partial Resolution: All
{{hcp:prompt_id:index:environment}}tags are replaced with the corresponding message content - Variable Substitution: All
{{hc:name:type}}variables are replaced with their provided values
Partial Resolution Process
When a prompt partial is encountered:- Version Selection: The system determines which version of the referenced prompt to use based on the
environmentparameter (or defaults to production) - Message Extraction: The message at the specified
indexis extracted from that prompt version - Content Replacement: The partial tag is replaced with the extracted message content (which may contain its own variables)
- Variable Collection: Variables from the resolved partial are collected and made available for substitution
Variable Control
Since partials are resolved before variables, variables within partials can be controlled from the main prompt’s inputs:Variables from prompt partials are automatically extracted and shown in the prompt editor. You only need to provide values for these variables in your main prompt’s inputs - they will be substituted in both the main prompt and any resolved partials.
Override Examples
- Temperature Override
- Max Tokens Override
- Response Format Override
This compilation approach gives you the flexibility to have consistent prompt templates while still allowing runtime customization for specific use cases.