Generate natural-sounding audio from text using OpenAI’s text-to-speech
models.
Problem
You need to convert text content into spoken audio—for accessibility,
content repurposing, or voice applications.
Solution
What’s in this recipe:
- Generate speech with OpenAI TTS
- Choose from multiple voice options
- Store text and audio together
You add a computed column that converts text to audio. The audio is
cached and only regenerated when the source text changes.
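The caching behavior described above can be pictured with a small standard-library sketch. This is a toy model, not Pixeltable's actual implementation: `CachedColumn` and `fake_tts` are hypothetical names invented for illustration, and the real TTS call is replaced by a stand-in so the example runs offline.

```python
import hashlib


class CachedColumn:
    """Toy model of a computed column: one cached value per row,
    recomputed only when the row's source text changes."""

    def __init__(self, compute):
        self._compute = compute
        self._cache = {}  # row_id -> (hash of source text, computed value)
        self.calls = 0    # how many times the expensive function actually ran

    def get(self, row_id, text):
        digest = hashlib.sha256(text.encode('utf-8')).hexdigest()
        cached = self._cache.get(row_id)
        if cached is not None and cached[0] == digest:
            return cached[1]  # source text unchanged: reuse the stored value
        self.calls += 1
        value = self._compute(text)
        self._cache[row_id] = (digest, value)
        return value


def fake_tts(text):
    # Hypothetical stand-in for a real text-to-speech request.
    return f'<audio for {len(text)} chars>'.encode('utf-8')


column = CachedColumn(fake_tts)
column.get(1, 'Hello world')    # first access: computed
column.get(1, 'Hello world')    # same text: served from the cache
column.get(1, 'Hello, world!')  # text changed: recomputed
print(column.calls)  # prints 2
```

Reading the same row twice triggers only one call to the expensive function, which is the property that keeps repeated queries from generating new audio.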
Setup
%pip install -qU pixeltable openai
import os
import getpass
# Prompt for the API key only if it is not already set in the environment
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key: ')
import pixeltable as pxt
from pixeltable.functions.openai import speech
# Create a fresh directory
pxt.drop_dir('tts_demo', force=True)
pxt.create_dir('tts_demo')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory ‘tts_demo’.
<pixeltable.catalog.dir.Dir at 0x17f0d5bd0>
Create text-to-speech pipeline
# Create table for articles
articles = pxt.create_table(
    'tts_demo.articles',
    {'title': pxt.String, 'content': pxt.String}
)
Created table ‘articles’.
# Add audio generation column
articles.add_computed_column(
    audio=speech(
        articles.content,
        model='tts-1',
        voice='alloy'
    )
)
Added 0 column values with 0 errors.
No rows affected.
Generate audio
# Insert sample articles
sample_articles = [
    {
        'title': 'Welcome to AI',
        'content': 'Artificial intelligence is transforming how we work and live. From smart assistants to autonomous vehicles, AI is becoming part of our daily lives.'
    },
    {
        'title': 'Getting Started',
        'content': 'To begin your journey with machine learning, start by understanding the basics of data preparation and model training.'
    },
]
articles.insert(sample_articles)
Inserting rows into `articles`: 2 rows [00:00, 423.90 rows/s]
Inserted 2 rows with 0 errors.
2 rows inserted, 6 values computed.
# View articles with generated audio
articles.select(articles.title, articles.content, articles.audio).collect()
Explanation
OpenAI TTS models:
Voice options:
Tips:
- Use tts-1 for drafts and real-time applications
- Use tts-1-hd for final production audio
- Audio is cached—no regeneration on queries
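One way to apply the model tips is to choose the settings in a single place and pass them on to `speech`. The sketch below is only an illustration: `tts_settings`, `MODEL_BY_STAGE`, and `KNOWN_VOICES` are hypothetical names, and the voice list reflects the voices OpenAI has documented for these models, which may change over time.

```python
# Assumption: model and voice names as documented by OpenAI for its TTS
# endpoint; check the current OpenAI documentation before relying on them.
MODEL_BY_STAGE = {'draft': 'tts-1', 'final': 'tts-1-hd'}
KNOWN_VOICES = {'alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'}


def tts_settings(stage, voice='alloy'):
    """Return keyword arguments for a TTS call, given a pipeline stage."""
    if stage not in MODEL_BY_STAGE:
        raise ValueError(f'unknown stage: {stage!r}')
    if voice not in KNOWN_VOICES:
        raise ValueError(f'unknown voice: {voice!r}')
    return {'model': MODEL_BY_STAGE[stage], 'voice': voice}


print(tts_settings('draft'))
print(tts_settings('final', voice='nova'))
```

The returned dictionary has the same keyword names used in the computed column earlier in this recipe, so it could be unpacked as `speech(articles.content, **tts_settings('final'))`.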
See also