Automatically create descriptive captions for images using AI vision
models.
Problem
You have a collection of images that need captions—for accessibility,
SEO, content management, or searchability. Writing captions manually
doesn’t scale.
Solution
What’s in this recipe:
- Generate captions using OpenAI’s vision models
- Customize caption style (short, detailed, SEO-focused)
- Process images in batch automatically
You add a computed column that sends each image to a vision model with a
captioning prompt. New images are captioned automatically on insert.
Setup
%pip install -qU pixeltable openai
import getpass
import os

if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key: ')
import pixeltable as pxt
from pixeltable.functions.openai import chat_completions
Load images
# Create a fresh directory
pxt.drop_dir('caption_demo', force=True)
pxt.create_dir('caption_demo')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory 'caption_demo'.
<pixeltable.catalog.dir.Dir at 0x11fba5840>
# Create table for images
images = pxt.create_table('caption_demo/images', {'image': pxt.Image})
Created table 'images'.
# Insert sample images
image_urls = [
    'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000036.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000090.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000106.jpg',
]

images.insert([{'image': url} for url in image_urls])
Inserted 3 rows with 0 errors in 0.12 s (25.17 rows/s)
3 rows inserted.
# View images
images.collect()
Generate captions
Add a computed column that generates captions using the vision model:
# Add caption using OpenAI vision
messages = [
    {
        'role': 'user',
        'content': [
            {
                'type': 'text',
                'text': 'Write a concise, descriptive caption for this image in one sentence.',
            },
            {'type': 'image_url', 'image_url': images.image},
        ],
    }
]

images.add_computed_column(
    caption=chat_completions(messages, model='gpt-4o-mini')
)
Added 3 column values with 0 errors in 4.62 s (0.65 rows/s)
3 rows updated.
# View images with captions
images.select(
    images.image, images.caption['choices'][0]['message']['content']
).collect()
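The `caption` column stores the full chat-completion response as JSON, which is why the caption text is extracted with the path `['choices'][0]['message']['content']`. A plain-Python sketch of that navigation, using a placeholder response dict (the caption string below is illustrative, not real model output):

```python
# Shape of a chat-completion response; the content value is a placeholder.
response = {
    'choices': [
        {
            'message': {
                'role': 'assistant',
                'content': 'A dog lying on a wooden floor next to a red ball.',
            }
        }
    ]
}

# Same path as the Pixeltable expression: caption['choices'][0]['message']['content']
caption_text = response['choices'][0]['message']['content']
print(caption_text)
```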
Different caption styles
You can generate multiple caption styles for different uses:
# Add alt text for accessibility (brief)
messages = [
    {
        'role': 'user',
        'content': [
            {
                'type': 'text',
                'text': 'Write a brief alt text for this image (under 125 characters) for screen readers.',
            },
            {'type': 'image_url', 'image_url': images.image},
        ],
    }
]

images.add_computed_column(
    alt_text=chat_completions(messages, model='gpt-4o-mini')
)
Added 3 column values with 0 errors in 3.51 s (0.85 rows/s)
3 rows updated.
# Add detailed description
messages = [
    {
        'role': 'user',
        'content': [
            {
                'type': 'text',
                'text': 'Describe this image in detail, including objects, colors, setting, and mood.',
            },
            {'type': 'image_url', 'image_url': images.image},
        ],
    }
]

images.add_computed_column(
    description=chat_completions(messages, model='gpt-4o-mini')
)
Added 3 column values with 0 errors in 11.28 s (0.27 rows/s)
3 rows updated.
# View all caption types
images.select(
    images.image,
    images.caption['choices'][0]['message']['content'],
    images.alt_text['choices'][0]['message']['content'],
    images.description['choices'][0]['message']['content'],
).collect()
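Models don't always respect length constraints, so it can be worth checking generated alt text against the 125-character limit used in the prompt above before publishing. A small hypothetical helper (not part of Pixeltable):

```python
def check_alt_text(text: str, limit: int = 125) -> bool:
    """Return True if the alt text fits within the character limit."""
    return len(text.strip()) <= limit

# Example usage with placeholder strings:
print(check_alt_text('A dog resting on a wooden floor.'))  # True
print(check_alt_text('x' * 200))                           # False
```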
Explanation
Caption prompt patterns:
- Concise caption: one descriptive sentence, suitable as a default caption
- Alt text: brief (under 125 characters), written for screen readers
- Detailed description: objects, colors, setting, and mood, useful for search and content management
Model selection:
- gpt-4o-mini: fast and affordable, good for most captioning tasks
- gpt-4o: higher quality for complex images or detailed descriptions
See also