LLM Security
Enable robust security measures in your LLM applications to protect against prompt injections, detect anomalies, and prevent data exfiltration.
Introduction
Generative AI is quickly changing the cybersecurity landscape. Helicone provides built-in security measures powered by Meta’s state-of-the-art security models to protect your LLM applications.
Security Implementation
Helicone’s LLM security is powered by two advanced models from Meta:
- Prompt Guard (86M): A specialized model for detecting:
  - Direct prompt injections
  - Indirect/embedded malicious instructions
  - Jailbreak attempts
  - Multi-language attacks (supports 8 languages)
- Advanced Security Analysis: Optional deeper security analysis using Meta’s Llama Guard (3.8B) for comprehensive threat detection across 14 categories:

| Category | Description |
| --- | --- |
| Violent Crimes | Violence toward people or animals |
| Non-Violent Crimes | Financial crimes, property crimes, cyber crimes |
| Sex-Related Crimes | Trafficking, assault, harassment |
| Child Exploitation | Any content related to child abuse |
| Defamation | False statements harming reputation |
| Specialized Advice | Unauthorized financial/medical/legal advice |
| Privacy | Handling of sensitive personal information |
| Intellectual Property | Copyright and IP violations |
| Indiscriminate Weapons | Creation of dangerous weapons |
| Hate Speech | Content targeting protected characteristics |
| Suicide & Self-Harm | Content promoting self-injury |
| Sexual Content | Adult content and erotica |
| Elections | Misinformation about voting |
| Code Interpreter Abuse | Malicious code execution attempts |
Quick Start
To enable LLM security in Helicone, add the header Helicone-LLM-Security-Enabled: true to your requests. For advanced security analysis using Llama Guard, also add Helicone-LLM-Security-Advanced: true.
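As a minimal sketch of how these headers might be attached, the Python below builds them as a plain dictionary. Only the two header names and their true values come from this page; the function name helicone_security_headers and the merge into an existing header dictionary are illustrative assumptions, not part of any Helicone SDK.

```python
# Header names and values as given on this page.
SECURITY_ENABLED_HEADER = "Helicone-LLM-Security-Enabled"
SECURITY_ADVANCED_HEADER = "Helicone-LLM-Security-Advanced"


def helicone_security_headers(advanced: bool = False) -> dict[str, str]:
    """Return the request headers that switch on LLM security.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    headers = {SECURITY_ENABLED_HEADER: "true"}
    if advanced:
        # Opt in to the Llama Guard analysis in addition to Prompt Guard.
        headers[SECURITY_ADVANCED_HEADER] = "true"
    return headers


# Merge with whatever headers the request already carries.
existing_headers = {"Content-Type": "application/json"}
request_headers = {**existing_headers, **helicone_security_headers(advanced=True)}
```

The resulting dictionary can be passed to whichever HTTP client or LLM SDK you already use, typically through its default-headers or per-request headers option.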
Security Checks
When LLM Security is enabled, Helicone:
- Analyzes each user message using Meta’s Prompt Guard model (86M parameters) to detect:
  - Direct jailbreak attempts
  - Indirect injection attacks
  - Malicious content in 8 languages (English, French, German, Hindi, Italian, Portuguese, Spanish, Thai)
- When advanced security is enabled (Helicone-LLM-Security-Advanced: true), activates Meta’s Llama Guard (3.8B) model for:
  - Deeper content analysis across 14 threat categories
  - Higher accuracy threat detection
  - More nuanced understanding of context and intent
- Blocks detected threats and returns an error response
- Adds minimal latency to ensure a smooth experience for legitimate requests
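
The flow described in this list can be sketched in a few lines of Python. This is an illustration of the control flow only, not Helicone's implementation: the detectors are toy keyword checks standing in for Prompt Guard and Llama Guard, all names (screen_messages, Verdict, toy_base_detector, toy_advanced_detector) are hypothetical, and running the advanced check only after the base check passes is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# A detector takes one user message and returns True when it flags a threat.
Detector = Callable[[str], bool]


@dataclass
class Verdict:
    blocked: bool
    tier: Optional[str] = None  # "base" or "advanced" when blocked


def screen_messages(
    user_messages: Iterable[str],
    base_detector: Detector,
    advanced_detector: Optional[Detector] = None,
) -> Verdict:
    """Screen each user message; block on the first detected threat."""
    for message in user_messages:
        # Base tier: fast screening of every message.
        if base_detector(message):
            return Verdict(blocked=True, tier="base")
        # Advanced tier: runs only when the caller opted in (assumed ordering).
        if advanced_detector is not None and advanced_detector(message):
            return Verdict(blocked=True, tier="advanced")
    return Verdict(blocked=False)


# Toy stand-ins for the real models (assumptions for illustration only).
def toy_base_detector(message: str) -> bool:
    return "ignore previous instructions" in message.lower()


def toy_advanced_detector(message: str) -> bool:
    return "build a weapon" in message.lower()
```

A blocked verdict corresponds to the error response mentioned above, while an unblocked one means the request proceeds to the LLM as usual.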
Advanced Security Features
- Two-Tier Protection:
  - Base tier: Fast screening with Prompt Guard (86M parameters)
  - Advanced tier: Comprehensive analysis with Llama Guard (3.8B parameters)
- Multilingual Support: Detects threats across 8 languages
- Low Base Latency: Initial screening uses the lightweight Prompt Guard model
- High Accuracy:
  - Base: Over 97% detection rate on jailbreak attempts
  - Advanced: Enhanced accuracy with Llama Guard’s larger model
- Customizable: Security thresholds can be adjusted based on your application’s needs