How can I use the OpenAI fine-tuning API?
Learn how to use the OpenAI fine-tuning API for custom model versions
The OpenAI fine-tuning API allows you to create custom versions of their models tailored to your specific use case. Here’s a step-by-step guide on how to use it:
Prepare your training data
Create a JSONL file with your training examples. For chat models such as gpt-3.5-turbo, each example is a short conversation (a list of messages ending with the ideal assistant reply); for legacy completions models, each example is a prompt-completion pair. Aim for at least a few hundred high-quality examples.
You can use Helicone datasets to curate a dataset and export it as JSONL for fine-tuning:
- Within Helicone, create a new dataset or select an existing one.
- Use Helicone’s filtering and tagging features to curate your data.
- Export the dataset in JSONL format, which is compatible with OpenAI’s fine-tuning process.
Example format of the exported JSONL file:
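A minimal, illustrative sketch of the chat-format JSONL that OpenAI's fine-tuning endpoint expects (one training example per line; the content below is made up, and your exported file will reflect your own Helicone requests):

```jsonl
{"messages": [{"role": "system", "content": "You are a helpful support assistant."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "Go to Settings > Account > Reset Password and follow the emailed link."}]}
{"messages": [{"role": "system", "content": "You are a helpful support assistant."}, {"role": "user", "content": "Can I export my data?"}, {"role": "assistant", "content": "Yes - open Settings > Data and click Export to download a CSV."}]}
```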
This approach allows you to leverage your existing data in Helicone to create high-quality training datasets for fine-tuning.
Upload the file and create a fine-tuning job
You have two main options for uploading your file and creating a fine-tuning job:
- Using the OpenAI Fine-tuning UI (Recommended for beginners):
  - Go to platform.openai.com/finetune
  - Click on “Create a fine-tuning job”
  - Upload your JSONL file
  - Select the base model you want to fine-tune
  - Configure any additional settings
  - Start the fine-tuning job

  This method allows you to upload your file and create a job in one streamlined process.
- Using the OpenAI API: first upload your training file, then create a fine-tuning job that references it (a minimal sketch of both steps follows this list).
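Assuming the pre-1.0 openai Python SDK (the same style as the monitoring calls below), a minimal sketch of the two API steps looks like this; the file name is a placeholder:

```python
import openai

openai.api_key = "YOUR_OPENAI_API_KEY"  # or set the OPENAI_API_KEY environment variable

# Step 1: upload the JSONL training file
training_file = openai.File.create(
    file=open("fine_tune_data.jsonl", "rb"),  # placeholder name - use your exported dataset
    purpose="fine-tune",
)

# Step 2: create the fine-tuning job, referencing the uploaded file's ID
job = openai.FineTuningJob.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",  # base model to fine-tune
)

print(job.id)  # keep this ID to check the job's progress later
```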
Choose the method that best suits your needs and level of expertise.
Monitor the job
If you used the UI, you can monitor your job’s progress directly on the OpenAI platform.
If you used the API, you can check the status of your fine-tuning job using `openai.FineTuningJob.retrieve()`. You can also list all your fine-tuning jobs with `openai.FineTuningJob.list()`.
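For example, with the same pre-1.0 openai Python SDK and a placeholder job ID:

```python
import openai

# Check the status of a specific fine-tuning job (placeholder ID)
job = openai.FineTuningJob.retrieve("ftjob-abc123")
print(job.status)  # e.g. "running", "succeeded", or "failed"

# List your fine-tuning jobs
for j in openai.FineTuningJob.list()["data"]:
    print(j["id"], j["status"])
```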
Use the fine-tuned model
Once complete, you can use your fine-tuned model by specifying its name in your API calls.
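For instance, with the pre-1.0 openai Python SDK, the fine-tuned model's name (a placeholder below, copied from the completed job) is passed as the model parameter just like a base model:

```python
import openai

response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:my-org::abc123",  # placeholder fine-tuned model name
    messages=[
        {"role": "system", "content": "You are a helpful support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response.choices[0].message["content"])
```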
Best Practices
When fine-tuning models, consider these best practices:
- Data Quality: Ensure your training data is high-quality, diverse, and representative of the tasks you want the model to perform. Quality often trumps quantity.
- Model Selection: Choose the appropriate base model. For most use cases, “gpt-3.5-turbo” is recommended, but consider your specific needs and budget. See the guides on LLM fine-tuning duration and fine-tuning best practices (chapter 2: models) to make an informed decision.
- Prompt Engineering: Craft clear and consistent prompts. Include relevant context and instructions within your prompts to guide the model effectively.
- Iterative Improvement: Fine-tuning is often an iterative process. Be prepared to refine your dataset and try multiple fine-tuning runs based on performance evaluations.
- Evaluation: Always evaluate your fine-tuned model against a held-out test set to ensure it performs better than the base model for your specific task.
- Version Control: Keep track of different versions of your fine-tuned models, including the datasets and parameters used for each version.
- Consider Alternatives: Depending on your use case, RAG (Retrieval-Augmented Generation) might be a better fit than fine-tuning. Evaluate both approaches for your specific needs.
Remember to consult the official OpenAI documentation for the most up-to-date instructions on fine-tuning.
For more in-depth best practices on fine-tuning, check out OpenPipe’s detailed guides.
For more detailed information on fine-tuning LLMs with Helicone, check out our comprehensive guide.