Documentation Index
Fetch the complete documentation index at: https://docs.pixeltable.com/llms.txt
Use this file to discover all available pages before exploring further.
Who: AI/App Developers
Output: AI-powered application
Add multimodal intelligence to applications with two deployment patterns.
Same foundation, different intent: This workflow uses the same Pixeltable capabilities as Data Wrangling for ML — tables, multimodal types, computed columns, iterators. The difference is the output: training datasets vs. live application intelligence.
Data Lifecycle
1. Store
2. Build
3. Index
4. Query
5. Serve
Create Tables
Define schema with native multimodal types; Pixeltable handles storage and references.
create_table(), pxt.Image, pxt.Video, pxt.Audio, pxt.Document, pxt.Json
import pixeltable as pxt
# Native multimodal types
t = pxt.create_table('app.docs', {
    'pdf': pxt.Document,
    'metadata': pxt.Json
})
Tables Guide
Create tables and manage data
Type System
Image, Video, Audio, Document, JSON & more
Ingest Data
Load from any source: local files, URLs, cloud storage, or databases.
insert(), import_csv(), S3/GCS/Azure
# Insert with URLs, local paths, or direct upload
t.insert([
    {'pdf': 'https://example.com/report.pdf'},
    {'pdf': '/local/path/to/doc.pdf'},
    {'pdf': 's3://bucket/documents/spec.pdf'}
])
Import from S3
Load from cloud storage
Cloud Storage Setup
S3, GCS, Azure, R2 configuration
Embedding Index
Add embedding indexes with incremental sync: only new or changed rows are embedded.
add_embedding_index()
# Add the index once; it updates automatically on insert
docs.add_embedding_index('content', string_embed=e5_embed)
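To make "only new or changed rows are embedded" concrete, here is a minimal pure-Python sketch of incremental sync. It is a conceptual illustration only, not Pixeltable's implementation: `IncrementalIndex`, `toy_embed`, and the content-hash bookkeeping are all hypothetical names and choices made for this example.

```python
import hashlib


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:4]]


class IncrementalIndex:
    """Keeps one embedding per row and re-embeds only when content changes."""

    def __init__(self, embed):
        self.embed = embed
        self.vectors = {}     # row_id -> embedding
        self.hashes = {}      # row_id -> hash of the content that was embedded
        self.embed_calls = 0  # counts how often the model was actually invoked

    def sync(self, rows: dict[str, str]) -> list[str]:
        """Embed new or changed rows; return the ids that were (re)embedded."""
        updated = []
        for row_id, text in rows.items():
            content_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.hashes.get(row_id) == content_hash:
                continue  # unchanged row: skip the expensive embedding call
            self.vectors[row_id] = self.embed(text)
            self.hashes[row_id] = content_hash
            self.embed_calls += 1
            updated.append(row_id)
        return updated


index = IncrementalIndex(toy_embed)
first = index.sync({"a": "hello", "b": "world"})
second = index.sync({"a": "hello", "b": "world!", "c": "new"})
```

On the second sync, row `a` is skipped because its content is unchanged, so the embedding function runs only for the edited row `b` and the new row `c`.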
Embedding Indexes Guide
Configure and query indexes
OpenAI Embeddings
Use OpenAI embedding models
Reusable Queries
Define @pxt.query functions that return data from your tables.
@pxt.query
@pxt.query
def get_image(image_id: str) -> PIL.Image.Image:
    return (
        images.where(images.uuid == image_id)
        .select(images.image)
        .limit(1)
    )
# Use in computed columns or API endpoints
t.add_computed_column(thumbnail=get_image(t.image_id))
Query Functions
Reusable parameterized queries
Similarity Search
Find relevant content by meaning, not keywords.
.similarity(), .order_by(), .where(), .collect()
sim = images.image.similarity(query)
results = images.order_by(sim, asc=False).select(
    uuid=images.uuid,
    url=images.image.fileurl
).limit(10).collect()
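The ranking step behind a similarity query can be sketched in plain Python: score every stored vector against the query vector, sort descending, and keep the top results. This is an illustrative sketch under simple assumptions (tiny hand-written vectors, brute-force cosine similarity); `cosine` and `top_k` are hypothetical helpers, and a real index would typically use an approximate nearest-neighbor structure instead.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], rows: dict[str, list[float]], k: int = 10) -> list[str]:
    """Return the ids of the k rows most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), row_id) for row_id, vec in rows.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # like order_by(sim, asc=False)
    return [row_id for _, row_id in scored[:k]]          # like limit(k)


# Hypothetical 2-dimensional embeddings, for illustration only
rows = {"cat": [1.0, 0.0], "kitten": [0.9, 0.1], "car": [0.0, 1.0]}
ranked = top_k([1.0, 0.0], rows, k=2)
```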
Semantic Text Search
Search documents by meaning
Similar Images
Find visually similar images
Tool Calling
Expose Pixeltable functions as LLM tools for agents.
pxt.tools(), invoke_tools()
Tool Calling Guide
LLM agents with function calling
Agent Memory
Persistent conversation context
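As a rough picture of what tool calling involves, the sketch below registers plain Python functions as tools and dispatches tool calls of the shape an LLM typically returns (a tool name plus JSON-encoded arguments). It does not use or reproduce `pxt.tools()` or `invoke_tools()`; the registry, `dispatch_tool_calls`, and the two example tools are hypothetical and exist only to illustrate the pattern.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: returns a canned forecast."""
    return f"Sunny in {city}"


def add(a: float, b: float) -> float:
    """Hypothetical tool: adds two numbers."""
    return a + b


# Registry mapping tool names to callables
TOOLS = {fn.__name__: fn for fn in (get_weather, add)}


def dispatch_tool_calls(tool_calls: list[dict]) -> dict:
    """Run each requested tool and collect results keyed by tool name."""
    results = {}
    for call in tool_calls:
        fn = TOOLS.get(call["name"])
        if fn is None:
            results[call["name"]] = {"error": "unknown tool"}
            continue
        args = json.loads(call["arguments"])  # arguments arrive as a JSON string
        results[call["name"]] = fn(**args)
    return results


# Tool calls in the assumed shape of an LLM response
calls = [
    {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    {"name": "add", "arguments": '{"a": 2, "b": 3}'},
]
outputs = dispatch_tool_calls(calls)
```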
API Endpoints
Integrate with Flask, FastAPI, or any Python web framework.
pxt.get_table(), .insert(), .select(), .collect()
from flask import Flask, request
import pixeltable as pxt

app = Flask(__name__)
images = pxt.get_table("app.images")

@app.route("/api/search", methods=["POST"])
def search():
    query = request.form.get("q")
    sim = images.image.similarity(query)
    return images.order_by(sim, asc=False).limit(10).collect()

@app.route("/api/upload", methods=["POST"])
def upload():
    images.insert([{"image": request.files["file"]}])
    return {"status": "ok"}
Deployment Guide
Production deployment patterns
Pixelbot Example
Full Flask app with file upload & search
Media URLs
Get pre-signed URLs for media files stored in cloud storage.
.fileurl, pre-signed URLs for S3/GCS/Tigris
import boto3

# Get file URL from Pixeltable
url = row["image"].fileurl
# Generate pre-signed URL for client access
# (bucket and key identify the stored object; s3 is a boto3 S3 client)
s3 = boto3.client("s3")
presigned = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": key},
    ExpiresIn=3600
)
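The pre-signing call needs a bucket name and an object key. Assuming the stored file URL has the `s3://bucket/key` form (an assumption; the actual URL format depends on your storage configuration), a small helper like the hypothetical `split_s3_url` below can derive both using only the standard library.

```python
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple[str, str]:
    """Split an s3://bucket/key URL into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    # netloc is the bucket; the path (minus its leading slash) is the object key
    return parsed.netloc, parsed.path.lstrip("/")


# Example URL is illustrative only
bucket, key = split_s3_url("s3://my-bucket/thumbnails/frame_001.jpg")
```

The resulting `bucket` and `key` can then be passed as the `Bucket` and `Key` parameters when generating the pre-signed URL.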
Cloud Storage
S3, GCS, Azure, R2, Tigris configuration
Deployment Patterns
Orchestration Layer
Full Backend
When: Keep existing RDBMS + blob storage
Pixeltable processes media, runs models, then exports results to your existing systems.
# Process in Pixeltable with media stored directly to S3/GCS/Azure
videos.add_computed_column(
    thumbnail=videos.frame.resize((256, 256)),
    destination='s3://my-bucket/thumbnails/'  # Direct to blob storage
)
# Export metadata to external RDBMS
# collect() returns a result set; convert it to a pandas DataFrame before calling to_sql()
df = videos.select(videos.video, videos.transcript).collect().to_pandas()
# engine is a SQLAlchemy engine connected to your database
df.to_sql('video_metadata', engine, if_exists='append')
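If you prefer not to go through pandas, the same export step can be sketched with the standard library alone. The example below assumes the query results are already available as a list of dictionaries (the `rows` values here are made up for illustration) and uses an in-memory SQLite database as a stand-in for your existing RDBMS.

```python
import sqlite3

# Hypothetical rows, standing in for metadata selected from a table
rows = [
    {"video": "s3://my-bucket/videos/a.mp4", "transcript": "hello"},
    {"video": "s3://my-bucket/videos/b.mp4", "transcript": "world"},
]

# In-memory SQLite as a stand-in for the external database
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS video_metadata (video TEXT, transcript TEXT)"
)
# Named placeholders map directly onto the dictionary keys of each row
conn.executemany(
    "INSERT INTO video_metadata (video, transcript) VALUES (:video, :transcript)",
    rows,
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM video_metadata").fetchone()[0]
```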
Orchestration Pattern Guide
Process → Export to your existing infrastructure
When: Need versioning, lineage, and retrieval (RAG) from the same system
Pixeltable persists everything, so you can use it as your primary data backend with automatic versioning.
# Everything in one place: storage + compute + retrieval
docs.add_computed_column(chunks=document_splitter(docs.pdf))
docs.add_embedding_index('chunks', string_embed=e5_embed)
# Query with full lineage: rank rows by similarity, then take the top 10
sim = docs.chunks.similarity(query)
results = docs.order_by(sim, asc=False).limit(10).collect()
Full Backend Guide
Versioning, lineage, and retrieval in one system
End-to-End Examples
Pixelbot AI Agent
Multimodal AI agent with memory, file search, and image generation
Similarity Search App
Next.js + FastAPI app for text & image search
RAG Pipeline
Retrieval-augmented generation workflow
More sample apps: Check out the sample-apps directory for chat applications, multimodal search, and more.