This documentation page is also available as an interactive notebook that can be run in Kaggle or Colab, or downloaded for use with an IDE or a local Jupyter installation.
Generate natural-sounding audio from text using OpenAI’s text-to-speech models.

Problem

You need to convert text content into spoken audio for accessibility, content repurposing, or voice applications.

Solution

What’s in this recipe:
  • Generate speech with OpenAI TTS
  • Choose from multiple voice options
  • Store text and audio together
Add a computed column that converts text to audio. The generated audio is cached and is regenerated only when the source text changes.
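The caching behavior described above can be pictured with a small standalone sketch. This is only an illustration of the idea (recompute when the text changes, reuse otherwise) and is not how Pixeltable implements it internally. The `AudioCache` class and the `fake_tts` function are hypothetical names made up for this example, and `fake_tts` stands in for a real text-to-speech call so the sketch runs without an API key.

```python
import hashlib


class AudioCache:
    """Toy cache: re-synthesize audio only when a row's text changes."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # callable: text -> bytes (stand-in for a TTS call)
        self._entries = {}             # row_id -> (text_digest, audio_bytes)
        self.calls = 0                 # how many times synthesis actually ran

    def get(self, row_id, text):
        digest = hashlib.sha256(text.encode('utf-8')).hexdigest()
        cached = self._entries.get(row_id)
        if cached is not None and cached[0] == digest:
            return cached[1]           # text unchanged: reuse the stored audio
        self.calls += 1                # new or changed text: synthesize again
        audio = self._synthesize(text)
        self._entries[row_id] = (digest, audio)
        return audio


def fake_tts(text):
    # Hypothetical placeholder "synthesis" so the sketch needs no network access.
    return text.upper().encode('utf-8')
```

Calling `get` twice with the same row and text runs the synthesis function once; changing the text for that row triggers one more call.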

Setup

%pip install -qU pixeltable openai

import os
import getpass

# Prompt for the API key only if it is not already set in the environment
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key: ')

import pixeltable as pxt
from pixeltable.functions.openai import speech

# Create a fresh directory (this drops any existing 'tts_demo' directory and its contents)
pxt.drop_dir('tts_demo', force=True)
pxt.create_dir('tts_demo')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory ‘tts_demo’.
<pixeltable.catalog.dir.Dir at 0x17f0d5bd0>

Create text-to-speech pipeline

# Create table for articles
articles = pxt.create_table(
    'tts_demo.articles',
    {'title': pxt.String, 'content': pxt.String}
)
Created table ‘articles’.
# Add audio generation column
articles.add_computed_column(
    audio=speech(
        articles.content,
        model='tts-1',
        voice='alloy'
    )
)
Added 0 column values with 0 errors.
No rows affected.
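To compare voices side by side, one option is to add one computed column per voice. The helper below is a minimal sketch that reuses the same `add_computed_column` and `speech(...)` call pattern shown above. `add_voice_variants` is a hypothetical name, not part of Pixeltable. The table and the speech function are passed in as arguments so the helper makes no assumptions beyond the calls already used in this recipe.

```python
def add_voice_variants(table, speech_fn, voices, model='tts-1'):
    """Add one computed audio column per voice, named audio_<voice>.

    `table` is expected to expose `content` and `add_computed_column`, like the
    `articles` table in this recipe; `speech_fn` is expected to be
    `pixeltable.functions.openai.speech`.
    """
    added = []
    for voice in voices:
        column_name = f'audio_{voice}'
        # The new column is passed as a single keyword argument, as in the recipe.
        table.add_computed_column(
            **{column_name: speech_fn(table.content, model=model, voice=voice)}
        )
        added.append(column_name)
    return added


# Usage with the objects defined in this recipe (not executed here):
# add_voice_variants(articles, speech, ['nova', 'onyx'])
```

Each extra column generates audio for every row, so adding several voices multiplies the number of API calls accordingly.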

Generate audio

# Insert sample articles
sample_articles = [
    {
        'title': 'Welcome to AI',
        'content': 'Artificial intelligence is transforming how we work and live. From smart assistants to autonomous vehicles, AI is becoming part of our daily lives.'
    },
    {
        'title': 'Getting Started',
        'content': 'To begin your journey with machine learning, start by understanding the basics of data preparation and model training.'
    },
]

articles.insert(sample_articles)
Inserting rows into `articles`: 2 rows [00:00, 423.90 rows/s]
Inserted 2 rows with 0 errors.
2 rows inserted, 6 values computed.
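The sample articles are short, but longer content may exceed the per-request input limit of the speech endpoint. The sketch below splits text into chunks before insertion, breaking at sentence ends where possible. It assumes a 4096-character limit, which is what OpenAI documents for the speech endpoint at the time of writing, so check the current API reference before relying on it. `split_for_tts` and `MAX_TTS_CHARS` are hypothetical names introduced for this example.

```python
import re

MAX_TTS_CHARS = 4096  # assumption: documented input limit of the speech endpoint


def split_for_tts(text, max_chars=MAX_TTS_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ''
    for sentence in sentences:
        # A single sentence longer than the limit is cut at fixed offsets.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f'{current} {sentence}'.strip()
        if len(candidate) <= max_chars:
            current = candidate      # sentence still fits in the current chunk
        else:
            chunks.append(current)   # close the current chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be inserted as its own row, for example with the article title plus a part number, so that every row stays within the limit.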
# View articles with generated audio
articles.select(articles.title, articles.content, articles.audio).collect()
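To use the generated audio outside the notebook, the files can be copied to a folder of your choice. The helper below is a sketch under one main assumption: each row of the query result behaves like a dict whose `audio` value is a local file path string. Verify this against your Pixeltable version before using it. `export_audio` is a hypothetical name introduced for this example, and it works on any iterable of dicts.

```python
import shutil
from pathlib import Path


def export_audio(rows, out_dir, title_key='title', audio_key='audio'):
    """Copy each row's audio file into out_dir, named after the row's title."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    exported = []
    for row in rows:
        src = Path(row[audio_key])  # assumption: value is a local file path
        # Build a filesystem-friendly name from the title.
        safe = ''.join(c if c.isalnum() else '_' for c in row[title_key])
        safe = safe.strip('_').lower()
        dest = out / f'{safe}{src.suffix}'
        shutil.copyfile(src, dest)
        exported.append(dest)
    return exported


# Usage with this recipe's table (not executed here):
# rows = articles.select(articles.title, articles.audio).collect()
# export_audio(rows, 'exported_audio')
```

Titles that normalize to the same file name would overwrite each other, so add a row index to the name if titles are not unique.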

Explanation

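OpenAI TTS models: tts-1 (lower latency) and tts-1-hd (higher audio quality).
Voice options: this recipe uses alloy; OpenAI's built-in voices also include echo, fable, onyx, nova, and shimmer.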
Tips:
  • Use tts-1 for drafts and real-time applications
  • Use tts-1-hd for final production audio
  • Audio is cached, so querying the table does not trigger regeneration

See also