
How to Fine-Tune an LLM Using OpenAI

Welcome back to our series on fine-tuning large language models (LLMs)! In our previous post, we explored fine-tuning LLMs using Hugging Face. Today, we're shifting our focus to the OpenAI platform. While many associate OpenAI primarily with ChatGPT and API access for integrating AI capabilities, it offers another powerful feature: the ability to fine-tune models for your specific needs. Fine-tuning lets you build on the broad knowledge of a pre-trained model while adapting its behavior to your own dataset.

In this blog, we'll walk you through how to fine-tune your model using the OpenAI API. Whether you're making a chatbot better, creating new types of stories, or setting up a system to answer questions, this post will show you how to make the most out of OpenAI's fine-tuning feature.

# Pre-requisites

Before we start, ensure you have the necessary packages installed. You will need the datasets package for handling our data and openai for interacting with the OpenAI API. Open your terminal and enter the following command.

pip install datasets openai

The datasets library is a versatile tool for loading and manipulating datasets, especially useful for machine learning workflows.


# Loading the Dataset

We'll begin by loading our dataset. For demonstration purposes, we're using a dataset available on Hugging Face. Here's how you can load it:

from datasets import load_dataset

# Load your dataset from Hugging Face
dataset = load_dataset("lamini/lamini_docs")

In this code snippet, we load our dataset named "lamini/lamini_docs". Replace this with your dataset name if different.

# Exploring the Dataset

Understanding your dataset is crucial before proceeding with fine-tuning. Let's take a look at the structure of our dataset:

print(dataset)

This will print results like this:

DatasetDict({
    train: Dataset({
        features: ['question', 'answer', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 1260
    })
    test: Dataset({
        features: ['question', 'answer', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 140
    })
})

The next step is to extract the required data from this dataset and prepare it for training.


# Formatting the Data for Fine-Tuning

The dataset is split into train and test sets. We will only use the training split, so let's extract it.

import pandas as pd
train_dataset = dataset['train']
train_df = pd.DataFrame(train_dataset)
questions_answers = train_df[['question', 'answer']]

Here we keep only the question and answer columns, because these are the two fields we need for fine-tuning.

OpenAI requires the data in a specific JSONL format for fine-tuning. Each line must be a JSON object representing a single training example. Here's how to format your data:

import json

# Write one JSON object per line (JSONL), each holding a single chat example
with open('finetune_data_chat_format.jsonl', 'w') as jsonl_file:
    for index, example in questions_answers.iterrows():
        formatted_data = {
            "messages": [
                {"role": "system", "content": "You're a helpful assistant"},
                {"role": "user", "content": example['question']},
                {"role": "assistant", "content": example['answer']}
            ]
        }
        jsonl_file.write(json.dumps(formatted_data) + '\n')

Note: Our goal is to create a chatbot by fine-tuning gpt-3.5-turbo, which is why we use the conversational chat format. See the chat formats page in the OpenAI documentation for other formats.
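Before uploading, it can help to sanity-check the JSONL file, since a malformed line may cause the upload or the job to be rejected. The sketch below is a minimal check using only the standard library. The function name `validate_chat_jsonl`, the set of allowed roles, and the specific rules are assumptions made for this tutorial's chat format; they are not OpenAI's full validation logic.

```python
import json

# Roles used by the chat format in this tutorial (an assumption of this sketch)
ALLOWED_ROLES = {"system", "user", "assistant"}

def validate_chat_jsonl(path):
    """Return a list of (line_number, problem) tuples; an empty list means no issues were found."""
    problems = []
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((line_number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append((line_number, "missing or empty 'messages' list"))
                continue
            for message in messages:
                if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                    problems.append((line_number, "message with unknown role"))
                    break
                if not isinstance(message.get("content"), str):
                    problems.append((line_number, "message content is not a string"))
                    break
            else:
                # Only reached when every message passed the checks above
                if not any(m["role"] == "assistant" for m in messages):
                    problems.append((line_number, "no assistant message to learn from"))
    return problems
```

Calling `validate_chat_jsonl('finetune_data_chat_format.jsonl')` should return an empty list for the file written above; any tuples it returns point at the line numbers worth inspecting.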

# Uploading Your Dataset to OpenAI

Before fine-tuning, you need to upload your formatted dataset to OpenAI:

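from openai import OpenAI
from pathlib import Path

client = OpenAI(api_key="your_api_key")

# Upload the JSONL file created above; purpose="fine-tune" marks it as training data
response = client.files.create(
    file=Path("finetune_data_chat_format.jsonl"),
    purpose="fine-tune"
)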

Note: Securely store your API key and never expose it in shared or public codebases. The purpose='fine-tune' parameter indicates that the uploaded file is intended for model training.
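One common way to keep the key out of your code is to read it from an environment variable. The helper below is a small sketch of that idea; the function name `load_api_key` is hypothetical, and `OPENAI_API_KEY` is used because the OpenAI Python client also looks for that variable by default when no key is passed.

```python
import os

def load_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from an environment variable and fail early if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before running this script.")
    return api_key
```

You could then create the client with `OpenAI(api_key=load_api_key())` instead of pasting the key into the script.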

# Initiating the Fine-Tuning Process

With your data uploaded, you can now start the fine-tuning process:

fine_tune_response = client.fine_tuning.jobs.create(
  training_file=response.id,  # Use the uploaded file's ID
  model="gpt-3.5-turbo"       # Specify the model you want to fine-tune
)
print("Fine-tuning job started with ID:", fine_tune_response.id)

This will begin the fine-tuning process on the selected model. The returned job ID lets you track the progress of your fine-tuning job.

Note: You’ll get an email with the model name once the training is done. You’ll use that model name in the testing part.


# Monitoring Fine-Tuning Progress

You can monitor the status of your fine-tuning job as follows:

client.fine_tuning.jobs.retrieve("your_fine_tune_job_id")

Replace "your_fine_tune_job_id" with the ID returned from the job creation step. This command provides detailed information about the job's status and performance.
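If you would rather wait in a script than re-run the status check by hand, a simple polling loop is one option. The sketch below is deliberately independent of the OpenAI client: `wait_for_job` is a hypothetical helper that takes any function returning the current status string. It assumes the job ends in one of the statuses `succeeded`, `failed`, or `cancelled`.

```python
import time

# Statuses after which the job will not change any more (assumed for this sketch)
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_job(fetch_status, poll_seconds=30.0, max_polls=120):
    """Call fetch_status() until it returns a terminal status or max_polls is reached.

    Returns the last status seen.
    """
    status = None
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            break
        # Sleep between polls, but not after the final attempt
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    return status
```

With the OpenAI client, `fetch_status` could be `lambda: client.fine_tuning.jobs.retrieve("your_fine_tune_job_id").status`. Keeping the fetch function separate makes the loop easy to try out with a stub before spending API calls.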

# Testing the Fine-Tuned Model

After fine-tuning, it's time to test your model. Here's how you can generate completions using your fine-tuned model:

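completion = client.chat.completions.create(
    model="your_fine_tuned_model_name",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Your message here"}
    ]
)
print(completion.choices[0].message.content)

Replace "your_fine_tuned_model_name" and "Your message here" with your model name and test prompt, respectively.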
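A single prompt only tells you so much. Since the test split was left unused, one possible next step is to spot-check the model on a few held-out questions and read its replies next to the reference answers. The helpers below are a sketch under that assumption: `spot_check` and `print_report` are hypothetical names, and `ask` stands for any function that sends a question to your fine-tuned model and returns the reply text.

```python
def spot_check(ask, examples):
    """Ask the model each question and pair its reply with the reference answer."""
    rows = []
    for example in examples:
        reply = ask(example["question"])
        rows.append({
            "question": example["question"],
            "expected": example["answer"],
            "got": reply,
        })
    return rows

def print_report(rows):
    """Print each question with the reference answer and the model's reply."""
    for row in rows:
        print("Q:", row["question"])
        print("  expected:", row["expected"])
        print("  got:     ", row["got"])
```

Here `examples` could be a handful of rows from `dataset['test']`, and `ask` could wrap the chat completion call and return the reply's text. Reading the pairs side by side is a quick qualitative check, not a formal evaluation.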

By now, you should have a model finely tuned to your needs, but this is just the start with OpenAI. This platform allows you to train even more advanced and complex models, expanding your AI possibilities.

# Comparison of Fine-tuning LLMs Using Hugging Face and OpenAI

Having walked through both techniques in this series, let's compare them.

| Factor | Hugging Face | OpenAI |
| --- | --- | --- |
| Ease of Use | User-friendly interface. Comprehensive documentation but requires strong machine learning background. | Straightforward and requires some machine learning familiarity. |
| Model Availability | Wide range of pre-trained models (BERT, GPT, etc.). | Mainly focuses on GPT variants, with high optimization. Also offers Codex (code generation). |
| Customization | Extensive customization options for fine-tuning. | Simplified customization process, less granular than Hugging Face. |
| Data Privacy | Strong options: allows local or private cloud processing. | Strong, primarily cloud-based. May not suit all data sensitivity needs. |
| Performance | Varies by model and settings, scalable with proper hardware. | High performance, especially in language understanding and generation. |
| Scalability | User-managed scalability, depending on hardware and dataset size. | Managed by OpenAI, less user concern for infrastructure. |
| Cost | Free and paid tiers, cost-effective with good management (especially local processing). | Usage-based pricing, can be expensive at scale. |
| Community & Support | Large, active community with forums, tutorials, and shared projects. | Strong official channels and documentation, less community-driven. |
| Additional Features | TRL library simplifies fine-tuning (SFT, RLHF). | User-friendly API for application integration. |

Both fine-tuning techniques have their pros and cons, and the right choice depends mainly on your use case. If you need data privacy and have some technical knowledge, go with Hugging Face; otherwise, OpenAI is the simpler option.

# Conclusion

Fine-tuning LLMs using the OpenAI API offers you a streamlined, powerful approach to customizing language models for your specific needs. By following the steps outlined in this article, you can fine-tune your model efficiently, ensuring it delivers tailored, high-quality results. Remember, the effectiveness of fine-tuning heavily relies on the quality and relevance of your training data. Invest time in curating and structuring your dataset to ensure the best outcomes from your fine-tuning efforts.

Fine-tuning a model can involve generating embeddings or vectors for input data. In some cases, these embeddings can be stored in a vector database for efficient retrieval or similarity search, for instance when you fine-tune a language model for a specific application like document classification. MyScale, an SQL vector database, is designed for AI applications and achieves fast retrieval or similarity search performance. It is highly accessible for developers, as only SQL is required for interaction.

For feedback or suggestions, you can contact us on the MyScale Discord.