Who: AI/App Developers
Output: AI-powered application
Add multimodal intelligence to applications with two deployment patterns.
Same foundation, different intent: This workflow uses the same Pixeltable capabilities as Data Wrangling for ML — tables, multimodal types, computed columns, iterators. The difference is the output: training datasets vs. live application intelligence.
Data Lifecycle
1. Store
2. Build
3. Index
4. Query
5. Serve
Create Tables
Define schema with native multimodal types; Pixeltable handles storage and references.
create_table(), pxt.Image, pxt.Video, pxt.Audio, pxt.Document, pxt.Json
import pixeltable as pxt
# Native multimodal types
t = pxt.create_table('app.docs', {
    'pdf': pxt.Document,
    'metadata': pxt.Json
})
Tables Guide
Create tables and manage data
Type System
Image, Video, Audio, Document, JSON & more
Ingest Data
Load from any source: local files, URLs, cloud storage, or databases.
insert(), import_csv(), S3/GCS/Azure
# Insert from URLs, local paths, or cloud storage URIs
t.insert([
    {'pdf': 'https://example.com/report.pdf'},
    {'pdf': '/local/path/to/doc.pdf'},
    {'pdf': 's3://bucket/documents/spec.pdf'}
])
Import from S3
Load from cloud storage
Cloud Storage Setup
S3, GCS, Azure, R2 configuration
Embedding Index
Add embedding indexes with incremental sync: only new or changed rows are embedded.
add_embedding_index()
# Add the index once; it updates automatically on insert
docs.add_embedding_index('content', string_embed=e5_embed)
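To make "only new or changed rows are embedded" concrete, here is a minimal standard-library sketch of incremental sync. It is an illustration of the idea, not Pixeltable's implementation: `toy_embed` and `IncrementalIndex` are hypothetical names, and the content-hash check is an assumed way of detecting changes.

```python
import hashlib


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model (deterministic, 4 dims)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:4]]


class IncrementalIndex:
    """Toy index that re-embeds a row only when its content has changed."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.vectors = {}       # row_id -> embedding
        self.fingerprints = {}  # row_id -> hash of the content last embedded
        self.embed_calls = 0    # how many times the model was actually invoked

    def sync(self, rows: dict[str, str]) -> list[str]:
        """Embed new or changed rows; return the ids that were (re)embedded."""
        updated = []
        for row_id, content in rows.items():
            fingerprint = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if self.fingerprints.get(row_id) == fingerprint:
                continue  # unchanged row: keep the existing vector
            self.vectors[row_id] = self.embed_fn(content)
            self.fingerprints[row_id] = fingerprint
            self.embed_calls += 1
            updated.append(row_id)
        return updated


index = IncrementalIndex(toy_embed)
first = index.sync({"a": "hello", "b": "world"})
second = index.sync({"a": "hello", "b": "world!", "c": "new row"})
print(first, second, index.embed_calls)
```

The second sync skips row `a` because its content is unchanged, which is the cost saving incremental sync is meant to provide.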
Embedding Indexes Guide
Configure and query indexes
OpenAI Embeddings
Use OpenAI embedding models
Reusable Queries
Define @pxt.query functions that return data from your tables.
@pxt.query
def get_image(image_id: str) -> PIL.Image.Image:
    return (
        images.where(images.uuid == image_id)
        .select(images.image)
        .limit(1)
    )
# Use in computed columns or API endpoints
t.add_computed_column(thumbnail=get_image(t.image_id))
Query Functions
Reusable parameterized queries
Similarity Search
Find relevant content by meaning, not keywords.
.similarity(), .order_by(), .where(), .collect()
sim = images.image.similarity(query)
results = images.order_by(sim, asc=False).select(
    uuid=images.uuid,
    url=images.image.fileurl
).limit(10).collect()
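Conceptually, ordering by a similarity score in descending order and taking a limit is a top-k ranking over embedding vectors. The sketch below shows that idea with plain Python and cosine similarity. The metric, the two-dimensional vectors, and the names `cosine_similarity` and `top_k` are illustrative assumptions; a real embedding index may use a different metric or an approximate search.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], rows: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k rows most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), row_id) for row_id, vec in rows.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # like order_by(sim, asc=False)
    return [row_id for _, row_id in scored[:k]]          # like limit(k)


# Made-up toy embeddings
rows = {"cat": [1.0, 0.0], "kitten": [0.9, 0.1], "car": [0.0, 1.0]}
ranked = top_k([1.0, 0.0], rows, k=2)
print(ranked)
```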
Semantic Text Search
Search documents by meaning
Similar Images
Find visually similar images
Tool Calling
Expose Pixeltable functions as LLM tools for agents.
pxt.tools(), invoke_tools()
Tool Calling Guide
LLM agents with function calling
Agent Memory
Persistent conversation context
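Exposing a function as an LLM tool generally involves two steps: describing its parameters in a schema the model can read, and dispatching the model's call back to the function. The sketch below illustrates that general pattern with the standard library only. It is not how pxt.tools() or invoke_tools() work internally; `describe_tool`, `invoke`, and `search_docs` are hypothetical names.

```python
import inspect

# Map Python annotations to JSON Schema type names (fallback: "string")
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn) -> dict:
    """Build a JSON-Schema-style description from a function's signature."""
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the model must supply it
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def invoke(tools: dict, call: dict):
    """Dispatch a model-produced call of the form {'name': ..., 'arguments': {...}}."""
    return tools[call["name"]](**call["arguments"])


def search_docs(query: str, limit: int = 5) -> list:
    """Hypothetical tool: return up to `limit` placeholder hits for `query`."""
    return [f"{query}-{i}" for i in range(limit)]


spec = describe_tool(search_docs)
result = invoke({"search_docs": search_docs},
                {"name": "search_docs", "arguments": {"query": "pdf", "limit": 2}})
print(spec["parameters"]["required"], result)
```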
Built-in HTTP Serving
Expose tables and queries as HTTP endpoints with a TOML config or a single CLI command.
pxt serve, FastAPIRouter
# service.toml
[[service]]
name = "image-service"
port = 8000
[[service.routes]]
type = "insert"
table = "app/images"
path = "/upload"
uploadfile_inputs = ["image"]
outputs = ["image", "caption"]
[[service.routes]]
type = "query"
path = "/search"
query = "app.queries.search_images"
pxt serve image-service --config service.toml
HTTP Serving Guide
TOML config, CLI, Python API, background jobs
Deployment Overview
Full backend vs. orchestration layer
Custom Endpoints
For custom logic, middleware, or authentication, use Flask, FastAPI, or any Python web framework.
pxt.get_table(), .insert(), .select(), .collect()
from flask import Flask, request
import pixeltable as pxt
app = Flask(__name__)
images = pxt.get_table("app.images")
@app.route("/api/search", methods=["POST"])
def search():
    query = request.form.get("q")
    sim = images.image.similarity(query)
    return images.order_by(sim, asc=False).limit(10).collect()
Production Operations
Concurrency, error handling, sync endpoints
Pixelbot Example
Full Flask app with file upload & search
Media URLs
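Get pre-signed URLs for media files stored in cloud storage.
.fileurl, pre-signed URLs for S3/GCS/Tigris
import boto3
s3 = boto3.client("s3")  # S3 client, inferred from the call below
url = row["image"].fileurl
# bucket and key identify the stored object
presigned = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": key},
    ExpiresIn=3600,  # URL is valid for one hour
)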
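The snippet above needs a `bucket` and a `key`. If the stored media location is an `s3://` URI, like the one used in the ingest example, the two parts can be split out with the standard library. This is a minimal sketch: `split_s3_url` is a hypothetical helper, and it is an assumption that the location you have is in `s3://bucket/key` form.

```python
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple[str, str]:
    """Split 's3://bucket/path/to/object' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    # The bucket is the network location; the key is the path without its leading slash
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_url("s3://bucket/documents/spec.pdf")
print(bucket, key)
```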
Cloud Storage
S3, GCS, Azure, R2, Tigris configuration
Deployment Patterns
Batch Processing
Full Backend
When (Batch Processing): Keep existing RDBMS + blob storage
Pixeltable processes media, runs models, then exports results to your existing systems.
from pixeltable.io.sql import export_sql
# Process in Pixeltable with media stored directly to S3/GCS/Azure
videos.add_computed_column(
    thumbnail=videos.frame.resize((256, 256)),
    destination='s3://my-bucket/thumbnails/'
)
# Export structured results to serving DB
export_sql(
    videos.select(videos.video, videos.transcript),
    'video_metadata',
    db_connect_str='postgresql+psycopg://...',
    if_exists='replace',
)
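The export step follows a common pattern: take structured rows and write them to a table in the serving database, replacing what was there. The sketch below shows that pattern with the standard `sqlite3` module instead of Postgres. It is not export_sql itself: `export_rows` is a hypothetical helper, the sample rows are made up, and treating `if_exists="replace"` as drop-and-recreate is an assumption about the semantics.

```python
import sqlite3


def export_rows(conn, table: str, rows: list[dict], if_exists: str = "replace") -> None:
    """Write dict rows into `table`. Identifiers are trusted in this sketch."""
    if not rows:
        raise ValueError("nothing to export")
    columns = list(rows[0])
    col_list = ", ".join(f'"{c}"' for c in columns)
    if if_exists == "replace":
        # Assumed semantics: drop the old table, then recreate it
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
rows = [
    {"video": "s3://my-bucket/a.mp4", "transcript": "hello"},
    {"video": "s3://my-bucket/b.mp4", "transcript": "world"},
]
export_rows(conn, "video_metadata", rows)
export_rows(conn, "video_metadata", rows)  # a second run replaces rather than appends
row_count = conn.execute("SELECT COUNT(*) FROM video_metadata").fetchone()[0]
print(row_count)
```

With replace semantics the export is idempotent, so a scheduled batch job can rerun without duplicating rows.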
Batch Processing Guide
Process with computed columns, export with export_sql
When (Full Backend): Need versioning, lineage, and retrieval (RAG) from the same system
Pixeltable persists everything; use it as your primary data backend with automatic versioning.
# Everything in one place: storage + compute + retrieval
docs.add_computed_column(chunks=document_splitter(docs.pdf))
docs.add_embedding_index('chunks', string_embed=e5_embed)
# Query with full lineage
# similarity() yields a ranking expression, so order by it before limiting
sim = docs.chunks.similarity(query)
results = docs.order_by(sim, asc=False).limit(10).collect()
Full Backend Guide
Versioning, lineage, and retrieval in one system
End-to-End Examples
Pixelbot AI Agent
Multimodal AI agent with memory, file search, and image generation
Similarity Search App
Next.js + FastAPI app for text & image search
RAG Pipeline
Retrieval-augmented generation workflow
More sample apps: Check out the sample-apps directory for chat applications, multimodal search, and more.