LLM Judge

Learn how to add automated quality assessment to your AI applications using LLM-based evaluation. The judge works in two phases:

  1. Define your evaluation structure and criteria
  2. Use the judge to assess AI responses

Install Dependencies

pip install pixeltable openai
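
The OpenAI calls in this guide run against your own account. A minimal setup sketch, assuming your key is exposed through the standard OPENAI_API_KEY environment variable (which the OpenAI client reads automatically):

import os

# Assumption: the key is supplied via the environment; "sk-..." is only a
# placeholder if you prefer to set it in code.
os.environ.setdefault("OPENAI_API_KEY", "sk-...")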

Define Your Evaluation Structure

Create table.py:

import pixeltable as pxt
from pixeltable.functions import openai

# Initialize app structure
pxt.drop_dir("evaluations", force=True)
pxt.create_dir("evaluations")

# Define data schema with evaluation criteria
conversations = pxt.create_table(
    "evaluations.conversations",
    {
        "prompt": pxt.String,
        "expected_criteria": pxt.String
    },
    if_exists="ignore"
)

# Configure processing workflow
conversations.add_computed_column(
    messages=[{"role": "user", "content": conversations.prompt}]
)

conversations.add_computed_column(
    response=openai.chat_completions(
        messages=conversations.messages,
        model="gpt-4o-mini",
    )
)

conversations.add_computed_column(
    answer=conversations.response.choices[0].message.content
)
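
# The response column stores the full chat completion payload as JSON, so other
# fields can be extracted the same way as the answer. Optional sketch: also
# surface token usage ('usage_total' is just an illustrative column name).
conversations.add_computed_column(
    usage_total=conversations.response.usage.total_tokens
)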

# Add judge evaluation workflow
judge_prompt_template = """
You are an expert judge evaluating AI responses. Your task is to evaluate the following response based on the given criteria.

Original Prompt: {prompt}
Expected Criteria: {criteria}
AI Response: {response}

Please evaluate the response on a scale of 1-10 and provide a brief explanation.
Format your response as:
Score: [1-10]
Explanation: [Your explanation]
"""

# str.format() would run immediately on the column expressions, so wrap the
# templating in a UDF that formats each row's values instead.
@pxt.udf
def make_judge_prompt(prompt: str, criteria: str, response: str) -> str:
    return judge_prompt_template.format(
        prompt=prompt,
        criteria=criteria,
        response=response
    )

conversations.add_computed_column(
    judge_prompt=make_judge_prompt(
        conversations.prompt,
        conversations.expected_criteria,
        conversations.answer
    )
)

conversations.add_computed_column(
    judge_response=openai.chat_completions(
        messages=[
            {"role": "system", "content": "You are an expert judge evaluating AI responses."},
            {"role": "user", "content": conversations.judge_prompt}
        ],
        model="gpt-4o-mini",
    )
)

conversations.add_computed_column(
    evaluation=conversations.judge_response.choices[0].message.content
)

# Add score extraction
@pxt.udf
def extract_score(evaluation: str) -> float:
    """Parse the numeric value from the judge's 'Score: N' line."""
    try:
        score_line = next(
            line for line in evaluation.split('\n')
            if line.strip().startswith('Score:')
        )
        return float(score_line.split(':', 1)[1].strip())
    except (StopIteration, ValueError):
        return 0.0

conversations.add_computed_column(
    score=extract_score(conversations.evaluation)
)
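
# Optional: pull the qualitative feedback into its own column with the same
# UDF pattern ('extract_explanation' is a hypothetical helper, not a
# Pixeltable built-in).
@pxt.udf
def extract_explanation(evaluation: str) -> str:
    # Return the text after 'Explanation:', or an empty string if missing.
    for line in evaluation.split('\n'):
        if line.strip().startswith('Explanation:'):
            return line.split(':', 1)[1].strip()
    return ''

conversations.add_computed_column(
    explanation=extract_explanation(conversations.evaluation)
)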

Use Your Judge

Create app.py:

import pixeltable as pxt

def run_evaluation():
    # Connect to your app
    conversations = pxt.get_table("evaluations.conversations")

    # Example prompts with evaluation criteria
    test_cases = [
        {
            "prompt": "Write a haiku about dogs.",
            "expected_criteria": """
            The response should:
            1) Follow 5-7-5 syllable pattern
            2) Be about dogs
            3) Use vivid imagery
            """
        },
        {
            "prompt": "Explain quantum computing to a 10-year-old.",
            "expected_criteria": """
            The response should:
            1) Use age-appropriate language
            2) Use relevant analogies
            3) Be engaging and clear
            """
        }
    ]

    # Insert test cases
    conversations.insert(test_cases)

    # Get results with evaluations
    results = conversations.select(
        conversations.prompt,
        conversations.answer,
        conversations.evaluation,
        conversations.score
    ).collect().to_pandas()

    # Print results
    for idx, row in results.iterrows():
        print(f"\nTest Case {idx + 1}")
        print("=" * 50)
        print(f"Prompt: {row['prompt']}")
        print(f"\nResponse: {row['answer']}")
        print(f"\nEvaluation:\n{row['evaluation']}")
        print(f"Score: {row['score']}")
        print("=" * 50)

if __name__ == "__main__":
    run_evaluation()
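
Because every evaluation is stored in the table, you can also summarize results later from a separate script or session; a minimal sketch using the same pandas conversion as above:

import pixeltable as pxt

# Reconnect to the stored evaluations and compute a simple summary statistic.
conversations = pxt.get_table("evaluations.conversations")
df = conversations.select(conversations.score).collect().to_pandas()
print(f"Average score across {len(df)} evaluations: {df['score'].mean():.1f}")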

Key Features

  - Structured Evaluation: Define specific criteria for each prompt to ensure consistent evaluation standards.
  - Numerical Scoring: Get quantitative scores (1-10) along with qualitative feedback.
  - Detailed Feedback: Receive detailed explanations for each evaluation score.
  - Persistent Storage: Automatically store all evaluations for analysis and tracking.

Best Practices

  1. Clear Criteria: Define specific, measurable criteria for each prompt
  2. Consistent Scale: Use a consistent scoring scale across all evaluations
  3. Detailed Feedback: Request specific explanations for scores
  4. Regular Monitoring: Track scores over time to identify patterns (see the sketch after this list)
  5. Iterative Improvement: Use feedback to refine prompts and criteria
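
For the monitoring practice above, one option is to query the stored scores directly and review the weakest cases first; a minimal sketch, assuming the table from this guide and Pixeltable's order_by support:

import pixeltable as pxt

conversations = pxt.get_table("evaluations.conversations")

# List the lowest-scoring prompts first so they can be reviewed and refined.
low_scores = (
    conversations
    .select(conversations.prompt, conversations.score, conversations.evaluation)
    .order_by(conversations.score)
    .collect()
)
print(low_scores)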