Streaming mode with synchronous requests
In this mode, the request is made synchronously, but the response is streamed.

Streaming mode with asynchronous requests
In this mode, the request is made asynchronously and the response is streamed. You’ll need to use the `await` keyword when calling `openai.ChatCompletion.acreate`, and use an `async for` loop to iterate over the response.
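The text above refers to the legacy Python SDK’s `openai.ChatCompletion.acreate`. For consistency with the TypeScript examples later on this page, here is a rough equivalent using the OpenAI Node SDK, where `stream: true` returns an async-iterable stream (the model name is illustrative):

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function main() {
  // With stream: true, the call returns an async-iterable stream of chunks
  const stream = await openai.chat.completions.create({
    model: "gpt-4o-mini", // illustrative model name
    messages: [{ role: "user", content: "Write a haiku about streams." }],
    stream: true,
  });

  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
}

main();
```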
Enhanced Streaming Support
Helicone now provides significantly improved streaming functionality with several key updates.

Stream Fixes and Improvements
We’ve made several improvements to our stream handling across different LLM providers:
- Better handling of stream interruptions and reconnections
- Enhanced error handling for streaming responses
- Improved compatibility with different LLM provider streaming formats
- More reliable token counting for streamed content
- Accurate timing calculations for streamed responses
New Streaming Methods
The `HeliconeManualLogger` class now includes enhanced methods for working with streams:
- `logBuilder`: The recommended method for handling streaming responses, with better error handling and a simplified workflow
- `logStream`: Logs a streaming operation with full control over stream handling
- `logSingleStream`: Simplified method for logging a single ReadableStream
- `logSingleRequest`: Logs a single request with a response body
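All of these methods hang off a single logger instance. A minimal setup sketch, assuming the `@helicone/helpers` package and a `HELICONE_API_KEY` environment variable:

```typescript
import { HeliconeManualLogger } from "@helicone/helpers";

// One logger instance can be shared across requests; the methods above
// (logBuilder, logStream, logSingleStream, logSingleRequest) are called on it
const heliconeLogger = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});
```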
Asynchronous Stream Parser
Our new asynchronous stream parser significantly improves performance when working with streamed responses:
- Processes stream chunks asynchronously for reduced latency
- Provides more reliable token counting for streamed responses
- Accurately captures time-to-first-token metrics
- Efficiently handles multiple concurrent streams
Using the LogBuilder Method (Recommended)
The new `logBuilder` method provides a more streamlined approach to working with streaming responses, with better error handling:
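A sketch of the pattern in a route handler, assuming the `@helicone/helpers` API and the OpenAI Node SDK (handler shape and model name are illustrative):

```typescript
import { HeliconeManualLogger } from "@helicone/helpers";
import OpenAI from "openai";

const heliconeLogger = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});
const openai = new OpenAI();

export async function POST(request: Request) {
  const { question } = await request.json();
  const body = {
    model: "gpt-4o-mini",
    messages: [{ role: "user" as const, content: question as string }],
    stream: true as const,
  };

  // Tie a log builder to this request body before making the LLM call
  const logBuilder = heliconeLogger.logBuilder(body);

  try {
    const response = await openai.chat.completions.create(body);
    // toReadableStream forwards the stream to the client while recording it
    return new Response(logBuilder.toReadableStream(response));
  } catch (error) {
    // Record the failure (and its status code) against this request
    logBuilder.setError(error);
    throw error;
  } finally {
    // Flush the collected request/response data to Helicone
    await logBuilder.sendLog();
  }
}
```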
The `logBuilder` approach offers several advantages:
- Better error handling with the `setError` method
- Simplified stream handling with `toReadableStream`
- More flexible async/await patterns with `sendLog`
- Proper error status code tracking
Using the Enhanced Streaming Features
OpenAI Streaming Example
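A minimal sketch, assuming the logger instance from above and the OpenAI Node SDK (whose stream objects expose `tee()` and `toReadableStream()`):

```typescript
import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const requestBody = {
    model: "gpt-4o-mini",
    messages: [{ role: "user" as const, content: "Count to five." }],
    stream: true as const,
    stream_options: { include_usage: true }, // enables accurate cost calculation (see below)
  };

  const response = await openai.chat.completions.create(requestBody);

  // Split the stream: one branch for the application, one for Helicone
  const [appStream, heliconeStream] = response.tee();

  // Log the stream in the background without blocking our own consumption
  heliconeLogger.logSingleStream(requestBody, heliconeStream.toReadableStream());

  for await (const chunk of appStream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
}

main();
```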
Together AI Streaming Example
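A similar sketch for Together AI, calling its OpenAI-compatible REST endpoint directly with `fetch` so the response body is a standard `ReadableStream` (the model name is illustrative):

```typescript
const requestBody = {
  model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", // illustrative
  messages: [{ role: "user", content: "Tell me a joke." }],
  stream: true,
};

const response = await fetch("https://api.together.xyz/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.TOGETHER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify(requestBody),
});

// response.body is a standard ReadableStream; tee() yields two independent copies
const [appStream, heliconeStream] = response.body!.tee();

// Log one copy to Helicone while the application consumes the other
heliconeLogger.logSingleStream(requestBody, heliconeStream);
```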
Anthropic Streaming Example
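And a sketch for Anthropic, assuming the `@anthropic-ai/sdk` package, whose stream objects also expose `tee()` and `toReadableStream()` (model name is illustrative):

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function main() {
  const requestBody = {
    model: "claude-3-5-sonnet-20241022", // illustrative
    max_tokens: 1024,
    messages: [{ role: "user" as const, content: "Hello, Claude." }],
    stream: true as const,
  };

  const response = await anthropic.messages.create(requestBody);

  // One branch for the application, one for Helicone
  const [appStream, heliconeStream] = response.tee();

  heliconeLogger.logSingleStream(requestBody, heliconeStream.toReadableStream());

  for await (const event of appStream) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}

main();
```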
Calculating Costs with Streaming
For information on how to accurately calculate costs when using streaming features, please refer to our streaming usage guide. You can enable accurate cost calculation by either:
- Including `stream_options: { include_usage: true }` in your request
- Adding the `helicone-stream-usage: true` header to your request
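As a sketch of where each option goes (shown here with the OpenAI SDK pointed at Helicone’s proxy; normally you would use one option or the other):

```typescript
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://oai.helicone.ai/v1", // route requests through Helicone
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "helicone-stream-usage": "true", // Option 2: Helicone estimates stream usage
  },
});

const stream = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hi" }],
  stream: true,
  stream_options: { include_usage: true }, // Option 1: provider reports usage
});
```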
Vercel App Router Integration
When using Next.js App Router with Vercel, you can use the `after` function to log streaming responses without blocking the response to the client:
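A sketch combining Next.js 15’s `after` with the `logBuilder` pattern shown earlier (assuming the same `@helicone/helpers` API):

```typescript
import { after } from "next/server";
import { HeliconeManualLogger } from "@helicone/helpers";
import OpenAI from "openai";

const heliconeLogger = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});
const openai = new OpenAI();

export async function POST(request: Request) {
  const { question } = await request.json();
  const body = {
    model: "gpt-4o-mini",
    messages: [{ role: "user" as const, content: question as string }],
    stream: true as const,
  };

  const logBuilder = heliconeLogger.logBuilder(body);
  const response = await openai.chat.completions.create(body);

  // Defer the log flush until after the response has been sent to the client
  after(async () => {
    await logBuilder.sendLog();
  });

  return new Response(logBuilder.toReadableStream(response));
}
```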