Pixeltable is thread-safe and works with FastAPI, Flask, Django, and other web frameworks out of the box. The key rule: use sync (def) endpoint handlers, not async def.
FastAPI (and Starlette) dispatches sync (def) handlers to a thread pool. Each concurrent request gets its own thread, and Pixeltable automatically creates an isolated database connection per thread. This gives you true parallel request handling with no extra configuration.
```python
from pydantic import BaseModel
from fastapi import FastAPI
import pixeltable as pxt

app = FastAPI()

class SearchResult(BaseModel):
    text: str
    score: float

@app.post("/ingest")
def ingest(text: str):
    t = pxt.get_table('myapp/documents')
    status = t.insert([{'text': text}])
    return {'inserted': status.num_rows}

@app.get("/search")
def search(query: str, limit: int = 10) -> list[SearchResult]:
    t = pxt.get_table('myapp/documents')
    sim = t.text.similarity(string=query)
    results = (
        t.order_by(sim, asc=False)
        .limit(limit)
        .select(t.text, score=sim)
        .collect()
    )
    return list(results.to_pydantic(SearchResult))
```
Do not use async def for endpoints that call Pixeltable. Pixeltable’s API is synchronous. Inside an async def handler, Pixeltable calls block the event loop, serializing all requests and starving other coroutines. With def handlers, FastAPI’s thread pool handles concurrency for you.
table.select(...).collect() returns a ResultSet object, which Pydantic cannot serialize directly. You have two options:

Option 1: to_pydantic() (recommended for FastAPI)

Define a Pydantic model and let Pixeltable validate and convert each row. FastAPI serializes these natively.
```python
class Item(BaseModel):
    name: str
    score: float

@app.get("/rows")
def get_rows() -> list[Item]:
    t = pxt.get_table('myapp/items')
    return list(t.select(t.name, t.score).collect().to_pydantic(Item))
```
Option 2: to_pandas() + to_dict()

Convert via pandas when you don't need a Pydantic model.
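For instance, a small helper along these lines (a sketch; the helper name is ours, and the table and columns follow the earlier examples):

```python
def rows_as_dicts(result_set) -> list[dict]:
    """Convert a collected Pixeltable ResultSet into JSON-serializable dicts."""
    # to_pandas() yields a DataFrame; orient='records' emits one
    # {'column': value} dict per row, which FastAPI serializes natively
    return result_set.to_pandas().to_dict(orient='records')

# Usage inside a sync endpoint (sketch):
#   t = pxt.get_table('myapp/items')
#   return rows_as_dicts(t.select(t.name, t.score).collect())
```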
Pixeltable is compatible with uvloop, the high-performance event loop used by default in many production deployments. No special configuration is needed — sync endpoints work identically whether the server uses the default asyncio loop or uvloop.
```shell
# uvicorn with uvloop (the default when uvloop is installed)
uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1
```
When you add a computed column, Pixeltable executes it immediately for all existing rows. For expensive operations (LLM calls, model inference), validate your logic on a sample first using select(); nothing is stored until you commit with add_computed_column().
```python
# 1. Test transformation on sample rows (nothing stored)
table.select(
    table.text,
    summary=summarize_with_llm(table.text)
).head(3)  # Only processes 3 rows

# 2. Once satisfied, persist to table (processes all rows)
table.add_computed_column(summary=summarize_with_llm(table.text))
```
This “iterate-then-add” workflow lets you catch errors early without wasting API calls or compute on your full dataset.
Pro tip: Save expressions as variables to guarantee identical logic in both steps:
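One way to package the tip is a small helper (hypothetical; `table` and `summarize_with_llm` come from the example above) that receives the expression object once and uses it for both the preview and the commit:

```python
def preview_then_add(table, col_name: str, expr, sample_size: int = 3):
    """Preview a computed-column expression on a few rows, then persist it.
    Passing one `expr` object guarantees both steps run identical logic."""
    table.select(preview=expr).head(sample_size)   # nothing stored
    table.add_computed_column(**{col_name: expr})  # processes all rows

# Usage (sketch):
#   preview_then_add(table, 'summary', summarize_with_llm(table.text))
```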
Destructive operations, such as modifying computed columns with if_exists='replace' or dropping columns, tables, or views, trigger full recomputation or permanent data loss.
Production Safety:
```python
# Use if_exists='ignore' for idempotent schema migrations
import pixeltable as pxt
import config

docs_table = pxt.get_table(f'{config.APP_NAMESPACE}/documents')
docs_table.add_computed_column(
    embedding=embed_model(docs_table.document),
    if_exists='ignore'  # No-op if column exists
)
```
Version control setup_pixeltable.py like database migration scripts.
Rollback via table.revert() (single operation) or Git revert (complex changes).
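A deployment script can pair the two steps (a hypothetical helper; revert() undoes the table's most recent committed operation):

```python
def add_column_or_rollback(table, validate, **cols) -> bool:
    """Apply a schema change; revert it if `validate(table)` raises."""
    table.add_computed_column(**cols)
    try:
        validate(table)   # e.g. spot-check a few computed values
        return True
    except Exception:
        table.revert()    # undoes the single add_computed_column operation
        return False
```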
The most common schema evolution is switching an embedding or LLM model. In a traditional stack this requires a migration script, a compute cluster, reprocessing every row, and a maintenance window. In Pixeltable it's one line: the old column keeps working while the new one backfills.

Traditional approach:
```python
# 1. Write migration script
# 2. Spin up compute to re-embed all rows (hours of downtime)
# 3. Swap the column in application code
# 4. Deploy during maintenance window
# 5. Monitor for consistency issues
data = db.query("SELECT id, content FROM documents")
for row in data:
    new_vec = new_model.encode(row["content"])
    db.execute(
        "UPDATE documents SET embedding = %s WHERE id = %s",
        (new_vec, row["id"]),
    )
```
Pixeltable approach:
```python
# Add a new computed column. The old column still serves queries (zero downtime).
docs.add_computed_column(
    embedding_v2=sentence_transformer(docs.text, model_id='intfloat/e5-large-v2'),
    if_exists='ignore'
)
# Pixeltable backfills in batches, rate-limited, with automatic retries.
# Switch your queries to embedding_v2 when ready.
```
Because both columns coexist, you can A/B test retrieval quality before cutting over — no rollback plan needed.
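For example, a quick agreement metric between the two rankings (a sketch; the `idx=` names are illustrative and assume an embedding index exists per column):

```python
def overlap_at_k(ranking_a: list, ranking_b: list, k: int = 10) -> float:
    """Fraction of the top-k results shared by two retrieval rankings."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k

# Usage (sketch): rank the same query against each embedding index
#   sim_v1 = docs.text.similarity(string=query, idx='emb_v1')
#   sim_v2 = docs.text.similarity(string=query, idx='emb_v2')
#   top_v1 = [r['text'] for r in docs.order_by(sim_v1, asc=False).limit(10).select(docs.text).collect()]
#   top_v2 = [r['text'] for r in docs.order_by(sim_v2, asc=False).limit(10).select(docs.text).collect()]
#   print(f'overlap@10: {overlap_at_k(top_v1, top_v2):.2f}')
```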
Execute setup_pixeltable.py during deployment initialization
Web server processes connect to Pixeltable instance
Pixeltable uses connection pooling internally
Use sync (def) endpoint handlers for concurrent request support
Pixeltable Starter Kit
Clone a production-ready FastAPI + React app with multimodal upload, search, and agent endpoints — plus deployment configs for Docker Compose, Helm, Terraform, and AWS CDK.
Batch Processing / Orchestration Pipeline:
Schedule via cron, Airflow, AWS EventBridge, GCP Cloud Scheduler
Isolate batch workloads from real-time serving (separate containers/instances)
Use Pixeltable’s incremental computation to process only new data
The starter kit includes an orchestration pipeline example that uses export_sql and the destination parameter for ephemeral batch processing
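The scheduled job itself can stay minimal (a sketch; the helper and table name are ours): because computed columns evaluate incrementally, the job only inserts the new rows, and Pixeltable processes just those.

```python
def ingest_batch(table, new_records: list[dict]) -> int:
    """Insert a batch of new rows. Computed columns (embeddings,
    summaries, ...) run only for these rows, not the whole table."""
    if not new_records:
        return 0  # nothing new since the last run
    status = table.insert(new_records)
    return status.num_rows

# Cron/Airflow entry point (sketch):
#   t = pxt.get_table('myapp/documents')
#   ingest_batch(t, fetch_new_records_since_last_run())
```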
Containers:
Docker provides reproducible builds across environments
Full Backend: Mount persistent volume at ~/.pixeltable (or set PIXELTABLE_HOME)
Kubernetes: Use ReadWriteOnce PVC (single-pod write access)
Docker Compose or Kubernetes for multi-container deployments
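A minimal `docker run` sketch of that volume setup (image name, mount path, and port are illustrative; `PIXELTABLE_HOME` points Pixeltable at the mounted path):

```shell
# Persist Pixeltable state (metadata DB + media) across container restarts
docker run -d \
  --name myapp \
  -e PIXELTABLE_HOME=/data/.pixeltable \
  -v pxt_data:/data/.pixeltable \
  -p 8000:8000 \
  myapp:latest
```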
| Topology | Status | Notes |
|---|---|---|
| Single pod + dedicated volume | ✅ Supported | One active pod writes to the volume; failover requires volume detach/reattach |
| Multiple pods + shared volume (NFS/EFS) | ❌ Not supported | Will cause database corruption; do not mount the same pgdata to multiple pods |
| Multi-node HA | 🔜 Coming 2026 | Available in Pixeltable Cloud (serverless scaling, API endpoints); join the waitlist |
Single-Writer Limitation: Pixeltable’s storage layer uses an embedded PostgreSQL instance. Only one process can write to ~/.pixeltable/pgdata at a time.
Test schema changes, UDF updates, and application code changes.
Use representative data (anonymized or synthetic).
```python
# Test environment with isolated namespace
import pixeltable as pxt

TEST_NS = 'test_myapp'
pxt.create_dir(TEST_NS, if_exists='replace')

# Run setup targeting test namespace
# Execute tests
# pxt.drop_dir(TEST_NS, force=True)  # Cleanup
```