In this lesson, we are going to learn how to use the OpenAI Whisper API to transcribe and translate audio files in Python.

OpenAI Whisper is an automatic speech recognition model. With the OpenAI Whisper API, we can add speech-to-text to our applications, either transcribing audio in its original language or translating it into English.



Step 1: Create an OpenAI account

The first step to getting started with the OpenAI API is to create an account on the OpenAI website. Go to https://openai.com/api/ and sign up for an account.

Step 2: Generate OpenAI API key

To connect to the OpenAI API endpoint, we first need to create a secret key. Click on your user name, then click View API keys. Click the Create new secret key button to generate an API key.



Step 3: Install the OpenAI API package

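The OpenAI API package can be installed with pip, Python's package manager. The scripts in this lesson use the openai.Audio interface, which is only available in versions of the package before 1.0, so open a terminal and run the following command to install a compatible version:

pip install "openai<1.0"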
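
The scripts in this lesson call openai.Audio, which exists in openai releases from 0.27.0 up to, but not including, 1.0. As a rough sketch, the installed version can be checked before running them. The helper names supports_audio_module and installed_openai_version are made up for this example.

```python
from importlib import metadata


def supports_audio_module(version):
    """Return True if the version string falls in the 0.27 to <1.0 range."""
    # Only the major and minor components matter for this check.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (0, 27) <= (major, minor) < (1, 0)


def installed_openai_version():
    """Return the installed openai version string, or None if not installed."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None
```

If installed_openai_version() returns a version for which supports_audio_module() is False, the demo scripts will likely fail when they reach the openai.Audio call.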

Step 4: Connect to OpenAI in Python

To connect to the OpenAI endpoint, we import the openai module and attach the API key:

import openai

API_KEY = '<openAI API key>'
openai.api_key = API_KEY
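
Hard-coding the key in a script makes it easy to leak by accident. Below is a minimal sketch of one alternative, assuming the key has been exported in an environment variable named OPENAI_API_KEY. The variable name and the helper load_api_key are choices made for this example, not something this lesson prescribes.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Raises RuntimeError if the variable is missing or empty, so the
    script fails early instead of sending an unauthenticated request.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

The returned value can then be assigned to openai.api_key in place of the hard-coded string.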




demo_transcription.py

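import openai

# Uses the openai.Audio interface from openai package versions before 1.0.
API_KEY = '<API KEY>'
model_id = 'whisper-1'

media_file_path = "Steve Job's Goodbye Speech.wav"

# Open the audio in binary mode; the with block closes the file afterwards.
with open(media_file_path, 'rb') as media_file:
    response = openai.Audio.transcribe(
        api_key=API_KEY,
        model=model_id,
        file=media_file
    )

# The transcription is returned under the 'text' key.
print(response['text'])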
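
The API rejects uploads it cannot handle, so it can help to check the file before sending it. The sketch below assumes the limits documented by OpenAI at the time of writing: a 25 MB upload cap and the mp3, mp4, mpeg, mpga, m4a, wav, and webm formats. The helper name check_media_file is hypothetical, and the limits should be verified against the current documentation.

```python
from pathlib import Path

# Assumed limits, taken from OpenAI's documentation at the time of writing.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024
SUPPORTED_SUFFIXES = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_media_file(path):
    """Return a list of problems found with the file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    # Compare the extension case-insensitively, e.g. ".WAV" counts as ".wav".
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported file type: {path.suffix or '(none)'}")
    if path.stat().st_size > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 25 MB")
    return problems
```

Calling check_media_file(media_file_path) before opening the file gives a readable message up front rather than an error from the API after the upload.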

demo_translate.py

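import openai

# Uses the openai.Audio interface from openai package versions before 1.0.
API_KEY = '<API KEY>'
model_id = 'whisper-1'

media_file_path = 'video_japanese.mp4'

# Open the media in binary mode; the with block closes the file afterwards.
with open(media_file_path, 'rb') as media_file:
    response = openai.Audio.translate(
        api_key=API_KEY,
        model=model_id,
        file=media_file,
        prompt=''  # optional hint to guide the output; left empty here
    )

# The English translation is returned under the 'text' key.
print(response['text'])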
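
Printing is fine for a quick check, but the text is often worth keeping. Here is a small sketch that writes the returned text next to the media file. save_transcript is a hypothetical helper, and the .txt naming scheme is an assumption of this example.

```python
from pathlib import Path


def save_transcript(media_file_path, text):
    """Write text to a .txt file named after the media file and return its path."""
    out_path = Path(media_file_path).with_suffix(".txt")
    # Trim stray whitespace and end the file with a single newline.
    out_path.write_text(text.strip() + "\n", encoding="utf-8")
    return out_path
```

For example, save_transcript(media_file_path, text), where text is the string printed above, would write video_japanese.txt in the same folder as the video.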