Pixeltable is the only open source Python library providing declarative data infrastructure for multimodal AI applications: incremental storage, transformation, indexing, retrieval, and orchestration of data in one system. You define your entire data processing and AI workflow declaratively, as computed columns on tables, and focus on your application logic rather than the data plumbing.

Before Pixeltable

AI teams are building on images, video, audio, and text, but the infrastructure is broken:

Fragmented Data

Data lives across object stores, vector DBs, SQL, and ad-hoc pipelines. No single source of truth.

Costly Iteration

Every model change requires reprocessing. Pipelines are brittle and hard to reproduce.
This creates high engineering cost, slow iteration, and production risk.
Pixeltable solves this. One system for storage, orchestration, and retrieval. Transactions, incremental updates, and automatic dependency tracking built in.

With Pixeltable

Persistent Storage

All data and computed results are automatically stored and versioned.

Incremental Updates

Data transformations run automatically on new data. No orchestration code needed (see the sketch below).

Multimodal-Native

Images, video, audio, and documents integrate seamlessly with structured data.

AI Integration

Built-in support for OpenAI, Anthropic, Gemini, Hugging Face, and dozens more.
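
These properties are easiest to see in code. A minimal sketch (the directory, table, and model names are illustrative, and it assumes an OpenAI API key is configured in the environment):
import pixeltable as pxt
from pixeltable.functions import openai

pxt.create_dir('demo')
docs = pxt.create_table('demo.docs', {'text': pxt.String})

# A computed column is defined once and persisted with the table
docs.add_computed_column(summary=openai.chat_completions(
    messages=[{'role': 'user', 'content': 'Summarize: ' + docs.text}],
    model='gpt-4o-mini'
))

# Inserting rows triggers the computation automatically and incrementally;
# rows that were already computed are never reprocessed
docs.insert([{'text': 'Pixeltable stores raw data and computed results together.'}])
docs.select(docs.text, docs.summary).collect()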

Get started

Many documentation pages are interactive notebooks (marked in the sidebar). Open them in Colab, Kaggle, or locally to follow along.

Core Primitives

Pixeltable provides a small set of primitives that compose into any multimodal AI workflow:
Create tables with native multimodal types
import pixeltable as pxt

pxt.create_dir('myapp')  # tables are organized in directories
t = pxt.create_table('myapp.media', {
    'video': pxt.Video,
    'image': pxt.Image,
    'audio': pxt.Audio,
    'document': pxt.Document,
    'metadata': pxt.Json
})
Declarative computed columns: API calls, LLM inference, local models, vision
# LLM API call
t.add_computed_column(summary=openai.chat_completions(
    messages=[{'role': 'user', 'content': 'Summarize: ' + t.text}]
))

# Local model inference
t.add_computed_column(objects=yolox(t.image, model_id='yolox_s'))

# Vision analysis
t.add_computed_column(desc=openai.vision(prompt="Describe", image=t.image))
Explode rows: video→frames, doc→chunks, audio→segments
# Extract frames from video at 1 fps
frames = pxt.create_view('myapp.frames', t, iterator=FrameIterator(t.video, fps=1))

# Chunk documents for RAG
chunks = pxt.create_view('myapp.chunks', t, iterator=DocumentSplitter(t.document))
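Views behave like tables, so computed columns cascade to them. A short sketch, assuming FrameIterator exposes each extracted image in a frame column:
# Detect objects in every frame; frames from newly inserted videos are processed incrementally
frames.add_computed_column(detections=yolox(frames.frame, model_id='yolox_s'))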
Add embedding indexes for semantic search (indexes are maintained automatically as data changes)
t.add_embedding_index('text', embedding=openai.embeddings.using(model='text-embedding-3-small'))

# Search by similarity
results = t.order_by(t.text.similarity('find relevant docs'), asc=False).limit(10).collect()
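Embeddings don't have to come from an API. A sketch of the same index backed by a local sentence-transformers model (model name illustrative; requires the sentence-transformers package):
from pixeltable.functions.huggingface import sentence_transformer

# Same index API, but embeddings are computed locally
t.add_embedding_index('text', embedding=sentence_transformer.using(model_id='all-MiniLM-L6-v2'))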

Write custom functions with @pxt.udf and @pxt.query
@pxt.udf
def extract_entities(text: str) -> list[str]:
    # Your custom logic; placeholder: return capitalized words
    return [word for word in text.split() if word.istitle()]

@pxt.query
def search_by_topic(topic: str):
    return t.where(t.category == topic).select(t.title, t.summary)
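Once registered, both plug into computed columns like built-ins (column names here are assumed for illustration):
# The UDF runs per row, like any built-in function
t.add_computed_column(entities=extract_entities(t.text))

# A @pxt.query can also back a computed column, keyed on each row's own values
t.add_computed_column(related=search_by_topic(t.category))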

UDFs & Queries

Custom Python functions
Tool calling for AI agents and MCP integration
# Load tools from MCP server, UDFs, and queries
mcp_tools = pxt.mcp_udfs('http://localhost:8000/mcp')
tools = pxt.tools(search_by_topic, extract_entities, *mcp_tools)

# LLM decides which tool to call; Pixeltable executes it
t.add_computed_column(response=openai.chat_completions(
    messages=[{'role': 'user', 'content': t.question}],
    tools=tools
))
t.add_computed_column(result=openai.invoke_tools(tools, t.response))
SQL-like queries + test transformations before committing
# Query data with familiar syntax
results = t.where(t.score > 0.8).order_by(t.timestamp).limit(10).collect()

# Test transformations on sample rows BEFORE adding to table
t.select(t.text, summary=summarize(t.text)).head(3)  # Nothing stored yet
t.add_computed_column(summary=summarize(t.text))      # Now commit to all rows
Time travel, snapshots, and automatic versioning
t.history()                    # View all versions
t.revert(version=5)            # Rollback changes
old_data = pxt.get_table('myapp.media:3')  # Query past version
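
You can also pin a point-in-time state under its own name; a sketch assuming the create_snapshot API (snapshot path illustrative):
# Freeze the table's current state; later inserts don't affect the snapshot
snap = pxt.create_snapshot('myapp.media_snap', t)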

Load from any source, export to ML formats
from torch.utils.data import DataLoader

# Import from files, URLs, S3, Hugging Face
t.insert(pxt.io.import_csv('data.csv'))
t.insert(pxt.io.import_huggingface_dataset(dataset))

# Export to ML/analytics formats
pxt.io.export_parquet(t, 'output.parquet')
loader = DataLoader(t.to_pytorch_dataset(), batch_size=32)
coco_path = t.to_coco_dataset()
Publish and replicate datasets via Pixeltable Cloud
pxt.publish(t, 'my-dataset')              # Share publicly
pxt.replicate('user/dataset', 'local')   # Pull to local


Use Cases

Pixeltable’s primitives are use-case agnostic. They compose into any multimodal AI workflow:

Agents & MCP

Tool-calling agents with persistent memory, MCP server integration, and automatic conversation history.
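A sketch of how the primitives above compose into such an agent (the table name, model, and column names are illustrative; it reuses the tools object defined earlier):
import pixeltable as pxt
from pixeltable.functions import openai

# Conversation turns are rows; history persists across process restarts
chat = pxt.create_table('myapp.chat', {'session_id': pxt.String, 'question': pxt.String})

# Each inserted question is answered automatically, with access to the registered tools
chat.add_computed_column(response=openai.chat_completions(
    messages=[{'role': 'user', 'content': chat.question}],
    model='gpt-4o-mini',
    tools=tools
))
chat.add_computed_column(tool_result=openai.invoke_tools(tools, chat.response))

# Conversation memory is just a query over earlier turns in the same session
@pxt.query
def session_history(session_id: str):
    return chat.where(chat.session_id == session_id).select(chat.question, chat.response)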
Start with the Quick Start to get running in 5 minutes, or explore Cookbooks for hands-on examples covering RAG, video analysis, audio transcription, and more.


Book a Demo

Schedule a call to discuss your use case and see how Pixeltable can help.
