Backend for AI Apps

Who: AI/App Developers
Output: AI-powered application

Add multimodal intelligence to applications with two deployment patterns.

Same foundation, different intent: This workflow uses the same Pixeltable capabilities as Data Wrangling for ML — tables, multimodal types, computed columns, iterators. The difference is the output: training datasets vs. live application intelligence.

Data Lifecycle

1. Store
2. Build
3. Index
4. Query
5. Serve

Create Tables

Define a schema with native multimodal types; Pixeltable handles storage and references.

create_table(), pxt.Image, pxt.Video, pxt.Audio, pxt.Document, pxt.Json

import pixeltable as pxt

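# Create the 'app' directory first if it does not exist yet
pxt.create_dir('app', if_exists='ignore')

# Native multimodal types
t = pxt.create_table('app.docs', {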
    'pdf': pxt.Document,
    'metadata': pxt.Json
})

Tables Guide

Create tables and manage data

Type System

Image, Video, Audio, Document, JSON & more

Ingest Data

Load from any source: local files, URLs, cloud storage, or databases.

insert(), import_csv(), S3/GCS/Azure

# Insert with URLs, local paths, or direct upload
t.insert([
    {'pdf': 'https://example.com/report.pdf'},
    {'pdf': '/local/path/to/doc.pdf'},
    {'pdf': 's3://bucket/documents/spec.pdf'}
])

Import from S3

Load from cloud storage

Cloud Storage Setup

S3, GCS, Azure, R2 configuration

Define Pipelines

Create UDFs and computed columns; they auto-update when data changes.

@pxt.udf, @pxt.query, add_computed_column()
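As a minimal sketch of the pattern described above, the snippet below wraps a plain Python function as a UDF and attaches it as a computed column. The table path `app.notes`, the `text` column, and the `word_count` helper are hypothetical names chosen for illustration, and details may vary between Pixeltable versions.

```python
def word_count(text: str) -> int:
    """Plain Python logic: count whitespace-separated words."""
    return len(text.split())


def add_word_count_column(table_path: str = 'app.notes') -> None:
    """Attach word_count as a computed column (hypothetical table and column names)."""
    import pixeltable as pxt  # imported lazily so word_count stays usable on its own

    # Calling pxt.udf on a function is the same as decorating it with @pxt.udf
    word_count_udf = pxt.udf(word_count)

    t = pxt.get_table(table_path)
    # Evaluated for existing rows now and for new rows as they are inserted
    t.add_computed_column(n_words=word_count_udf(t.text))
```

In application code the decorator would more typically be applied directly above the function definition at module level; it is split out here only so the plain function can be exercised without a database.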

UDF Guide

Write custom functions in Python

Computed Columns

Auto-update derived data

Process Media

Extract frames, transcribe audio, chunk documents.

FrameIterator, DocumentSplitter, AudioSplitter
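One way frame extraction might look, assuming a table at the hypothetical path `app.videos` with a `video` column: an iterator is attached to a view, so each view row holds one frame. The view path `app.frames` and the default sampling rate are illustrative assumptions, and the iterator API may differ by Pixeltable version.

```python
def build_frame_view(video_table_path: str = 'app.videos',
                     view_path: str = 'app.frames',
                     fps: float = 1.0):
    """Create a view with one row per extracted frame (hypothetical paths)."""
    import pixeltable as pxt
    from pixeltable.iterators import FrameIterator

    videos = pxt.get_table(video_table_path)
    # Each view row is one frame sampled from the 'video' column at the given rate
    return pxt.create_view(
        view_path,
        videos,
        iterator=FrameIterator.create(video=videos.video, fps=fps),
    )
```

Because the frames live in a view over the base table, videos inserted later should be expanded into frames without rerunning this function.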

Extract Video Frames

Process video into searchable frames

Transcribe Audio

Audio to text with Whisper

Embedding Index

Add embedding indexes with incremental sync; only new/changed rows are embedded.

add_embedding_index()

# Add index once — auto-updates on insert
docs.add_embedding_index('content', string_embed=e5_embed)
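The snippet above uses `e5_embed` without defining it. A plausible definition, assuming the Hugging Face integration and the `sentence-transformers` package are installed, is sketched below; the model ID, table path, and helper names are assumptions rather than something the page specifies.

```python
def make_e5_embed(model_id: str = 'intfloat/e5-large-v2'):
    """Build an embedding function from a sentence-transformers model (assumed model ID)."""
    from pixeltable.functions.huggingface import sentence_transformer

    # .using() binds the model ID so the result takes only the text to embed
    return sentence_transformer.using(model_id=model_id)


def index_content(table_path: str = 'app.chunks', column: str = 'content') -> None:
    """Add an embedding index on a text column (hypothetical table and column)."""
    import pixeltable as pxt

    docs = pxt.get_table(table_path)
    docs.add_embedding_index(column, string_embed=make_e5_embed())
```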

Embedding Indexes Guide

Configure and query indexes

OpenAI Embeddings

Use OpenAI embedding models

Reusable Queries

Define @pxt.query functions that return data from your tables.

@pxt.query

import PIL.Image
import pixeltable as pxt

images = pxt.get_table('app.images')

@pxt.query
def get_image(image_id: str) -> PIL.Image.Image:
    return (
        images.where(images.uuid == image_id)
        .select(images.image)
        .limit(1)
    )

# Use in computed columns or API endpoints
t.add_computed_column(thumbnail=get_image(t.image_id))

Query Functions

Reusable parameterized queries

Similarity Search

Find relevant content by meaning, not keywords.

.similarity(), .order_by(), .where(), .collect()

sim = images.image.similarity(query)
results = images.order_by(sim, asc=False).select(
    uuid=images.uuid,
    url=images.image.fileurl
).limit(10).collect()
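To make "by meaning, not keywords" concrete, here is a conceptual sketch of what ranking by similarity amounts to: score every stored vector against the query vector and keep the best matches. This is plain Python for illustration only, not Pixeltable's implementation; the toy vectors and the choice of cosine similarity are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, items, k=10):
    """Return the keys of the k vectors most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), key) for key, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first, like order_by(sim, asc=False)
    return [key for _, key in scored[:k]]


# Toy embeddings: 'cat' and 'dog' point in nearly the same direction, 'car' does not
embeddings = {'cat': [1.0, 0.0], 'dog': [0.9, 0.1], 'car': [0.0, 1.0]}
nearest = top_k([1.0, 0.0], embeddings, k=2)
```

An embedding index avoids scoring every row on each query, but the ordering it returns plays the same role as the sorted list here.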

Semantic Text Search

Search documents by meaning

Similar Images

Find visually similar images

Tool Calling

Expose Pixeltable functions as LLM tools for agents.

pxt.tools(), invoke_tools()
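A rough sketch of how these two calls might fit together, assuming the OpenAI integration: a UDF is registered as a tool, the model's response is stored in one computed column, and the tool calls it requests are executed in another. The table path `app.chats`, the `prompt` column, the model name, and the `order_status` lookup are all hypothetical.

```python
ORDER_STATUS = {'A100': 'shipped', 'A101': 'processing'}  # hypothetical sample data


def order_status(order_id: str) -> str:
    """Look up the status of an order.

    Args:
        order_id: The order identifier, for example 'A100'.
    """
    return ORDER_STATUS.get(order_id, 'unknown')


def add_tool_columns(table_path: str = 'app.chats', model: str = 'gpt-4o-mini') -> None:
    """Wire the tool into a chat table (hypothetical table, column, and model names)."""
    import pixeltable as pxt
    from pixeltable.functions import openai

    t = pxt.get_table(table_path)
    tools = pxt.tools(pxt.udf(order_status))
    messages = [{'role': 'user', 'content': t.prompt}]

    # The model decides whether to call the tool
    t.add_computed_column(
        response=openai.chat_completions(messages=messages, model=model, tools=tools)
    )
    # Pixeltable runs the tool calls the model requested
    t.add_computed_column(tool_output=openai.invoke_tools(tools, t.response))
```

The docstring matters here: tool descriptions sent to the model are typically derived from it, so it should say what the function does and what each argument means.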

Tool Calling Guide

LLM agents with function calling

Agent Memory

Persistent conversation context

API Endpoints

Integrate with Flask, FastAPI, or any Python web framework.

pxt.get_table(), .insert(), .select(), .collect()

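import os
import tempfile

from flask import Flask, jsonify, request
from werkzeug.utils import secure_filename
import pixeltable as pxt

app = Flask(__name__)
images = pxt.get_table("app.images")
UPLOAD_DIR = tempfile.mkdtemp()  # assumed staging directory for uploads

@app.route("/api/search", methods=["POST"])
def search():
    query = request.form.get("q")
    sim = images.image.similarity(query)
    results = (
        images.order_by(sim, asc=False)
        .select(uuid=images.uuid, url=images.image.fileurl)
        .limit(10)
        .collect()
    )
    # Return JSON-serializable fields rather than raw image objects
    return jsonify(list(results))

@app.route("/api/upload", methods=["POST"])
def upload():
    # Save the upload to disk, then insert its path (not the Flask file object)
    file = request.files["file"]
    path = os.path.join(UPLOAD_DIR, secure_filename(file.filename))
    file.save(path)
    images.insert([{"image": path}])
    return {"status": "ok"}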

Deployment Guide

Production deployment patterns

Pixelbot Example

Full Flask app with file upload & search

Media URLs

Get pre-signed URLs for media files stored in cloud storage.

.fileurl, pre-signed URLs for S3/GCS/Tigris

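import boto3
from urllib.parse import urlparse

s3 = boto3.client("s3")

# Get file URL from Pixeltable (fileurl is a column expression, so select it)
row = images.select(url=images.image.fileurl).limit(1).collect()[0]

# Split the URL into bucket and key (assumes an s3://bucket/key URL)
parsed = urlparse(row["url"])
bucket, key = parsed.netloc, parsed.path.lstrip("/")

# Generate pre-signed URL for client access (valid for one hour)
presigned = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": key},
    ExpiresIn=3600
)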

Cloud Storage

S3, GCS, Azure, R2, Tigris configuration

Deployment Patterns

Two patterns are available: Orchestration Layer and Full Backend.

Orchestration Layer

When: Keep existing RDBMS + blob storage

Pixeltable processes media, runs models, then exports results to your existing systems.

# Process in Pixeltable with media stored directly to S3/GCS/Azure
videos.add_computed_column(
    thumbnail=videos.frame.resize((256, 256)),
    destination='s3://my-bucket/thumbnails/'  # Direct to blob storage
)

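# Export metadata to external RDBMS
# collect() returns a result set; convert it to a pandas DataFrame before calling to_sql
df = videos.select(videos.video, videos.transcript).collect().to_pandas()
df.to_sql('video_metadata', engine, if_exists='append')  # engine: a SQLAlchemy engine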
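If pandas and SQLAlchemy are not part of the stack, the same export step can be sketched with the standard library alone. The example below writes rows into SQLite as a stand-in for the existing RDBMS; the `export_rows` helper, the table layout, and the sample rows are assumptions, with the rows standing in for what a collected Pixeltable query would yield as dictionaries.

```python
import sqlite3


def export_rows(conn, rows, table='video_metadata'):
    """Append rows (dicts with 'video' and 'transcript' keys) and return the row count.

    The table name is interpolated into SQL, so it must come from trusted code.
    """
    conn.execute(f'CREATE TABLE IF NOT EXISTS {table} (video TEXT, transcript TEXT)')
    conn.executemany(
        f'INSERT INTO {table} (video, transcript) VALUES (:video, :transcript)',
        rows,
    )
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM {table}').fetchone()[0]


# Stand-in for rows collected from Pixeltable (hypothetical values)
sample_rows = [
    {'video': 's3://my-bucket/videos/a.mp4', 'transcript': 'hello world'},
    {'video': 's3://my-bucket/videos/b.mp4', 'transcript': 'second clip'},
]
conn = sqlite3.connect(':memory:')
total = export_rows(conn, sample_rows)
```

Exporting references (URLs or paths) and derived text rather than media bytes keeps the relational side small while the media itself stays in blob storage.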

Orchestration Pattern Guide

Process → Export to your existing infrastructure

Full Backend

When: Need versioning, lineage, and retrieval (RAG) from the same system

Pixeltable persists everything; use it as your primary data backend with automatic versioning.

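# Everything in one place: storage + compute + retrieval
from pixeltable.iterators import DocumentSplitter

# Chunk in a view (one row per chunk), then index the chunk text
splitter = DocumentSplitter.create(document=docs.pdf, separators='page')
chunks = pxt.create_view('app.chunks', docs, iterator=splitter)
chunks.add_embedding_index('text', string_embed=e5_embed)

# Query with full lineage
sim = chunks.text.similarity(query)
results = chunks.order_by(sim, asc=False).limit(10).collect()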

Full Backend Guide

Versioning, lineage, and retrieval in one system

End-to-End Examples

Pixelbot AI Agent

Multimodal AI agent with memory, file search, and image generation

Similarity Search App

Next.js + FastAPI app for text & image search

RAG Pipeline

Retrieval-augmented generation workflow

More sample apps: Check out the sample-apps directory for chat applications, multimodal search, and more.
