Who can use this feature: Anyone on any plan.

Scores are ingested with a 10-minute delay to ensure there is enough time to process the request.

Introduction

Helicone’s Scores API lets you score your requests and experiments. Use it to evaluate the performance of your prompts and to compare different experiments and datasets. For example, an image classification application might track one score for how accurately the model assigns images to the correct categories, and another for the confidence level of the model’s predictions.

Example: Experiment scores.

Why Scores

Scoring requests allows you to:

  • Evaluate the performance of your prompts.
  • Compare the scores of different experiments and datasets.

Autoscoring is not currently supported, but you can write your own scoring logic and submit scores via our API.
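As a rough illustration of submitting scores from your own logic, the sketch below builds the HTTP request for attaching scores to a single request. The endpoint path (`/v1/request/{requestId}/score`), the `{ scores: ... }` body shape, the Bearer authorization header, and the helper name `buildScoreRequest` are all assumptions that this page does not confirm, so check the API reference before relying on them.

```typescript
// Sketch only: the endpoint path, body shape, and auth header below are
// assumptions, not confirmed by this page. Verify them in the API reference.
type Scores = Record<string, number>;

interface ScoreRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds (but does not send) the request that attaches scores to one request ID.
function buildScoreRequest(
  requestId: string,
  scores: Scores,
  apiKey: string,
  baseUrl: string = "https://api.helicone.ai" // assumed base URL
): ScoreRequest {
  return {
    url: `${baseUrl}/v1/request/${encodeURIComponent(requestId)}/score`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Scores are sent as a flat map of score name to numeric value.
      body: JSON.stringify({ scores }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`. Keeping request construction separate from sending makes the scoring logic easy to test without network access.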

Quick Start

Option 1: Using the Request Page

You can add scores to your requests directly from the request page:

Example: Adding scores on the request page.

Option 2: Setting up your own Scoring Webhook

You can set up your own scoring webhook to score your requests. Here’s an example of how you can do this with Cloudflare Workers:

Step 1: Deploy Worker

First, deploy the Cloudflare Worker template by clicking the button above.

Step 2: Create a Webhook

To receive request and response data, create a webhook that points to your Scoring Worker URL.

Step 3: Customize your scoring logic

You can customize the scoring logic in the index.js file in your Scoring Worker.

// You can customize the scoring function below and add more scores as needed.
function calculateScore(data: HeliconeRequest): Record<string, number> {
  if (data.response_body) {
    return {
      vocabulary_diversity: calculateVocabularyDiversity(data.response_body),
      // Add more scores here
    };
  }

  return {};
}
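The snippet above calls `calculateVocabularyDiversity` without defining it. One plausible way to write such a helper is a type-token ratio (unique words divided by total words), sketched below. This is a hypothetical implementation, not necessarily the one in the Worker template, and it assumes the response body is available as plain text.

```typescript
// Hypothetical helper: the page names this function but does not define it.
// Assumes the response body is plain text.
function calculateVocabularyDiversity(text: string): number {
  // Lowercase and split into word tokens (letters, digits, apostrophes).
  const words: string[] = text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  if (words.length === 0) {
    return 0; // No words: avoid dividing by zero.
  }
  // Type-token ratio: unique words divided by total words.
  return new Set(words).size / words.length;
}
```

A response that repeats the same few words scores close to 0, while one in which every word is distinct scores 1. Other scores can follow the same pattern: a pure function from the response text to a number.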

Local Testing

If you want to test your scoring logic locally, you’ll need to use wrangler secrets to set an appropriate value for HELICONE_AUTH:

$ wrangler secret put HELICONE_AUTH

Then run the Scoring Webhook locally:

$ wrangler dev