
Open in Kaggle  Open in Colab  Download Notebook
This documentation page is also available as an interactive notebook. You can launch the notebook in Kaggle or Colab, or download it for use with an IDE or local Jupyter installation, by clicking one of the above links.

Why Pixeltable

Every multimodal AI app needs the same five things: store media, run models, index embeddings, serve endpoints, version everything. Most teams glue together 5-8 services (Postgres + Pinecone + S3 + Airflow + LangChain + …) and spend more time on infrastructure than on the product. Pixeltable is a single system that handles all five. One pip install, one Python API, one place to store, transform, index, retrieve, serve, version, observe, and debug.
For developers and vibe coders: Pixeltable's declarative API means AI assistants generate correct, production-grade code. No glue logic, no orchestrator configs, no serialization code. Experimenting on multimodal data (extract a frame, run a model, draw bounding boxes) is one expression, not a pipeline. Install the Pixeltable Skill and start prompting.

For teams evaluating infrastructure: transaction integrity, async execution, parallelization, caching, retries, and observability are built in. One system to operate, monitor, and maintain. Schema changes are one line. Model upgrades are zero-downtime. Extensible via @pxt.udf, @pxt.uda, and @pxt.query (see the UDF sketch in section 4). 20+ AI providers built in.

Skill | MCP Server | Starter Kit | llms.txt: docs.pixeltable.com/llms.txt
%pip install -qU pixeltable google-genai 'fastapi[standard]'
%pip install -qU torch torchvision transformers  # optional, for object detection
Note: you may need to restart the kernel to use updated packages.
import getpass
import logging
import os
import warnings

warnings.filterwarnings('ignore')
logging.getLogger('asyncio').setLevel(logging.CRITICAL)
logging.getLogger('huggingface_hub').setLevel(logging.CRITICAL)

if (
    'GEMINI_API_KEY' not in os.environ
    and 'GOOGLE_API_KEY' not in os.environ
):
    os.environ['GEMINI_API_KEY'] = getpass.getpass('Gemini API Key: ')

import pixeltable as pxt
from pixeltable.functions import gemini

BASE_URL = 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources'

  1. Store: Multimodal Tables

Video, audio, images, and documents are first-class column types. pip install pixeltable is all you need.
from pixeltable.functions.uuid import uuid7

pxt.drop_dir('demo', force=True, if_not_exists='ignore')
pxt.create_dir('demo')

videos = pxt.create_table(
    'demo/videos',
    {'id': uuid7(), 'video': pxt.Video, 'title': pxt.String},
    primary_key='id',
)
videos
Created directory 'demo'.
Created table 'videos'.

  2. Orchestrate: AI as Computed Columns

Add a computed column; Pixeltable calls Gemini on every insert, caches results, retries failures, keeps embeddings in sync.
videos.add_computed_column(
    response=gemini.generate_content(
        [videos.video, 'Describe this video in detail.'],
        model='gemini-3-flash-preview',
    )
)

videos.add_computed_column(
    description=videos.response.candidates[0]
    .content.parts[0]
    .text.astype(pxt.String)
)

videos.add_embedding_index(
    'description',
    embedding=gemini.embed_content.using(
        model='gemini-embedding-2-preview'
    ),
)
Added 0 column values with 0 errors in 0.01 s
Added 0 column values with 0 errors in 0.03 s
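If a Gemini call fails for a row, the row is not lost: the error is recorded alongside the data. A minimal sketch for inspecting failures, using the errortype and errormsg properties of computed columns:

videos.select(
    videos.title,
    videos.response.errortype,  # None for rows that succeeded
    videos.response.errormsg,
).collect()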

  3. Insert: One Call Triggers the Full Pipeline

insert() downloads videos, runs Gemini, extracts text, computes embeddings. Open the Dashboard to watch in real time.
videos.insert(
    [
        {
            'video': f'{BASE_URL}/bangkok.mp4',
            'title': 'Bangkok Street Tour',
        },
        {
            'video': f'{BASE_URL}/The-Pursuit-of-Happiness-Video-Extract.mp4',
            'title': 'The Pursuit of Happiness',
        },
    ]
)
Inserted 2 rows with 0 errors in 22.85 s (0.09 rows/s)
2 rows inserted.
videos = pxt.get_table('demo/videos')
videos.select(videos.title, videos.description).collect()
Embedding index stays in sync automatically. No separate vector DB.
sim = videos.description.similarity(string='street food')

# Filter + rank in one expression
(
    videos.where(videos.description != None)
    .order_by(sim, asc=False)
    .limit(5)
    .select(videos.title, videos.description, sim)
    .collect()
)

  4. Experiment on Media Data

Extract a frame, run DETR object detection, draw bounding boxes, all in one expression. Change the timestamp and re-run to explore different frames.
from pixeltable.functions import huggingface
from pixeltable.functions.vision import bboxes_draw

frame = videos.video.extract_frame(timestamp=2.0)
detections = huggingface.detr_for_object_detection(
    frame, model_id='facebook/detr-resnet-50'
)

videos.select(
    videos.title,
    annotated=bboxes_draw(
        frame,
        boxes=detections.boxes,
        labels=detections.label_text,
        fill=True,
        fill_alpha=0.15,
        width=2,
        font_size=14,
    ),
).collect()
videos.describe()
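The built-in functions used above are ordinary UDFs; you can register your own with @pxt.udf and use them in any expression, as noted in the intro. A minimal sketch (word_count is an illustrative helper, not a Pixeltable built-in):

@pxt.udf
def word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())

videos.select(videos.title, words=word_count(videos.description)).collect()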

  5. Serve: Queries Become API Endpoints

A @pxt.query function becomes an HTTP endpoint via FastAPIRouter. In production, use pxt serve service.toml. See HTTP Serving.
import fastapi
from fastapi.testclient import TestClient
from pixeltable.serving import FastAPIRouter


@pxt.query
def search_videos(query_text: str, limit: int = 5):
    sim = videos.description.similarity(string=query_text)
    return (
        videos.order_by(sim, asc=False)
        .limit(limit)
        .select(videos.title, videos.description, sim)
    )


app = fastapi.FastAPI()
router = FastAPIRouter()
router.add_query_route(path='/search', query=search_videos)
router.add_insert_route(
    videos,
    path='/ingest',
    inputs=['video', 'title'],
    outputs=['id', 'title', 'description'],
)
router.add_delete_route(videos, path='/delete')
app.include_router(router)

client = TestClient(app)
resp = client.post(
    '/search', json={'query_text': 'street food', 'limit': 2}
)
resp.json()
{'rows': [{'title': 'The Pursuit of Happiness',
   'description': "In this clip from the movie “The Pursuit of Happyness,” Chris Gardner (played by Will Smith) has just finished an interview for a competitive internship at a brokerage firm. Despite his disheveled appearance—wearing a grey work jacket and looking tired—he is approached by Jay Twistle, a senior manager at the firm.\n\nDetailed Scene Breakdown:\n\n* The Approach: The scene opens with Chris looking down, appearing stressed or emotional. A voice calls out “Chris…”, and he turns to see Mr. Twistle walking toward him with a wide, congratulatory smile. They are in a professional office lobby with people moving in the background and a reception desk visible.\n* The Interaction: Mr. Twistle expresses his admiration, saying, “I don't know how you did it dressed as a garbage man, but you really pulled it off in there.” This refers to Chris's impressive performance during the interview despite his unconventional attire (having come straight from a night in a jail cell due to unpaid parking tickets).\n* Building Rapport: Chris politely thanks him, addressing him as “Mr. Twistle.” In a sign of newfound respect and a positive result, Twistle insists, “Hey, now you can call me Jay. We'll talk to you soon.” He gives Chris a friendly pat on the shoulder before walking away.\n* Gardner's Reaction: Chris is left standing in the hallway, a look of immense relief and quiet triumph washing over his face. The scene highlights a pivotal moment where his intelligence and determination overcame his difficult circumstances.\n\nThe video features the “Binge Society” logo in the top left corner and copyright information at the bottom for Columbia Pictures Industries, Inc. and GH One LLC from 2006.",
   'similarity': 0.4487778141613705},
  {'title': 'Bangkok Street Tour',
   'description': "The video is a static, high-angle shot overlooking a busy multi-lane city street in what appears to be Bangkok, Thailand, indicated by the presence of tuk-tuks and brightly colored taxis. The scene captures the constant flow of traffic throughout the entire clip.\n\nIn the foreground on the left, a blue hatchback and a traditional three-wheeled tuk-tuk with a pink delivery bag on its back are either stationary or moving very slowly. Throughout the video, various vehicles, including white sedans, silver SUVs, motorcycles, and the city’s signature pink and green-yellow taxis, navigate the lanes. \n\nThe road is divided by a narrow median with small green bushes. Traffic moves in both directions, with vehicles heading away from the camera and towards it. On the left side of the street, large multi-story buildings feature several prominent billboards, one of which displays a woman’s face. On the right, a row of trees lines the sidewalk, behind which several large, white-roofed structures with pink accents are visible. In the background, a pedestrian overpass crosses the busy road, and taller city buildings can be seen in the distance under a bright, overcast sky. The overall atmosphere is one of a typical, bustling urban afternoon.",
   'similarity': 0.13178936719516832}]}

  6. Version: Automatic History

Every insert, update, and delete is versioned. history() returns the full changelog.
videos.history()
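Updates and deletes create versions too. A hedged sketch: rename one title, check the changelog, then undo the change with revert(), which rolls back the most recent table version (the new title below is illustrative):

# Update one row; this creates a new table version.
videos.where(videos.title == 'Bangkok Street Tour').update(
    {'title': 'Bangkok Street Food Tour'}
)
videos.history()  # the update appears as the newest entry
videos.revert()   # roll back the update (destructive; use with care)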

  7. Agents: Tool Calling and Memory as Computed Columns

An agent is just more computed columns. Define tools from @pxt.query functions, wire tool calling and context assembly as a chain of columns, and every insert triggers the full reasoning pipeline. Memory is a table with an embedding index. This pattern scales to production. See the Starter Kit for a complete implementation with documents, images, video, and cross-modal search.
from pixeltable.functions.anthropic import invoke_tools, messages

# 1. Define tools from existing @pxt.query functions
tools = pxt.tools(search_videos)

# 2. Memory: chat history with embedding index
chat = pxt.create_table(
    'demo/chat',
    {
        'role': pxt.String,
        'content': pxt.String,
        'conversation_id': pxt.String,
    },
    if_exists='ignore',
)
chat.add_embedding_index(
    'content',
    string_embed=gemini.embed_content.using(
        model='gemini-embedding-2-preview'
    ),
    if_exists='ignore',
)

# 3. Agent pipeline: each step is a computed column
agent = pxt.create_table(
    'demo/agent', {'prompt': pxt.String}, if_exists='ignore'
)

# Step 1: LLM decides which tools to call
agent.add_computed_column(
    response=messages(
        model='claude-sonnet-4-20250514',
        messages=[{'role': 'user', 'content': agent.prompt}],
        tools=tools,
        tool_choice=tools.choice(required=True),
        max_tokens=4096,
    ),
    if_exists='ignore',
)

# Step 2: Pixeltable executes the tool calls (runs search_videos)
agent.add_computed_column(
    tool_output=invoke_tools(tools, agent.response), if_exists='ignore'
)

# In production, add more steps: assemble context, call LLM again with results.
# See the Starter Kit for the full multi-step agent pipeline.
agent.describe()
Created table 'chat'.
Created table 'agent'.
Added 0 column values with 0 errors in 0.01 s
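To run the pipeline, insert a prompt: the computed columns fire in order (Claude picks a tool, invoke_tools executes search_videos). A minimal sketch, assuming ANTHROPIC_API_KEY is set; the prompt and chat rows are illustrative:

agent.insert([{'prompt': 'Find videos about street food'}])
agent.select(agent.prompt, agent.tool_output).collect()

# Memory recall is just another embedding search.
chat.insert(
    [
        {
            'role': 'user',
            'content': 'I like street food videos',
            'conversation_id': 'c1',
        }
    ]
)
sim = chat.content.similarity(string='food preferences')
chat.order_by(sim, asc=False).limit(3).select(chat.content, sim).collect()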

Bonus: Cloud Storage (Optional)

Free managed bucket with Pixeltable Cloud. Set two config values and computed media flows to the cloud. See Cloud Services.
# ~/.pixeltable/config.toml
[pixeltable]
api_key = "your-pixeltable-cloud-api-key"
output_media_dest = "pxtfs://yourorg:yourdb/home"

Summary

Store, orchestrate, insert, experiment, serve, version, agents: one pip install, one Python API. Continue with the Starter Kit, or explore the full documentation index at docs.pixeltable.com/llms.txt.
