Configure Helicone to automatically retry failed LLM requests, overcoming rate limits and server issues using intelligent exponential backoff.
Retrying requests is a common best practice when dealing with overloaded servers or hitting rate limits. These issues typically manifest as HTTP status codes 429 (Too Many Requests) and 500 (Internal Server Error). For more information on error codes, see the OpenAI API error codes documentation.
Learn About Exponential Backoff
To effectively deal with retries, we use a strategy called exponential backoff. Exponential backoff involves increasing the wait time between retries exponentially, which helps to spread out the request load and gives the server a chance to recover. This is done by multiplying the wait time by a factor (default is 2) for each subsequent retry.
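The wait-time growth described above can be sketched as a small helper. The default minimum and maximum timeouts shown here (1,000 ms and 10,000 ms) are illustrative assumptions, not Helicone's documented defaults; only the factor of 2 comes from the text.

```python
def backoff_delay(attempt, factor=2, min_timeout=1000, max_timeout=10000):
    """Wait time in milliseconds before retry `attempt` (0-indexed).

    The wait grows by `factor` on each retry and is capped at `max_timeout`.
    """
    return min(min_timeout * factor ** attempt, max_timeout)

# With factor=2 the waits double each time: 1000, 2000, 4000, 8000 ms,
# then stay capped at 10000 ms.
delays = [backoff_delay(n) for n in range(5)]
```
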
To get started, set `Helicone-Retry-Enabled` to `true`.
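A minimal sketch of the headers you would attach to a proxied request. The `Helicone-Auth` header and placeholder key are assumptions based on the usual Helicone proxy setup; substitute your own integration details.

```python
# Headers to enable Helicone's automatic retries on a request.
# Note that the value is the string "true", not a boolean.
headers = {
    "Helicone-Auth": "Bearer <your-helicone-api-key>",  # assumed auth header
    "Helicone-Retry-Enabled": "true",
}
```
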
You can customize the behavior of the retries feature by setting additional headers in your request.
| Parameter | Description |
|---|---|
| `helicone-retry-num` | Number of retries |
| `helicone-retry-factor` | The exponential backoff factor used to increase the wait time between subsequent retries. The default is 2. |
| `helicone-retry-min-timeout` | Minimum timeout (in milliseconds) between retries |
| `helicone-retry-max-timeout` | Maximum timeout (in milliseconds) between retries |
Header values must be strings. For example, `"helicone-retry-num": "3"`.
Need more help?
Additional questions or feedback? Reach out to help@helicone.ai or schedule a call with us.