Pixeltable is thread-safe and works with FastAPI, Flask, Django, and other web frameworks out of the box. The key rule: use sync (def) endpoint handlers, not async def.
FastAPI (and Starlette) dispatches sync (def) handlers to a thread pool. Each concurrent request gets its own thread, and Pixeltable automatically creates an isolated database connection per thread. This gives you true parallel request handling with no extra configuration.
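The thread-per-request model can be sketched with the standard library alone. This is a hedged illustration, not Pixeltable's actual implementation: `threading.local` stands in for Pixeltable's per-thread connection cache, and a `ThreadPoolExecutor` stands in for FastAPI's worker pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # stand-in for a per-thread connection cache

def get_connection() -> str:
    # Each worker thread lazily creates, then reuses, its own "connection"
    if not hasattr(_local, 'conn'):
        _local.conn = f'conn-{threading.get_ident()}'
    return _local.conn

def handle_request(i: int) -> str:
    # Simulates one sync endpoint invocation running on a pool thread
    return get_connection()

with ThreadPoolExecutor(max_workers=4) as pool:
    conns = set(pool.map(handle_request, range(100)))

print(len(conns))  # at most 4 distinct connections, one per worker thread
```

A hundred requests share at most four connections because isolation is per thread, not per request.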
```python
from pydantic import BaseModel
from fastapi import FastAPI
import pixeltable as pxt

app = FastAPI()

class SearchResult(BaseModel):
    text: str
    score: float

@app.post("/ingest")
def ingest(text: str):
    t = pxt.get_table('myapp/documents')
    status = t.insert([{'text': text}])
    return {'inserted': status.num_rows}

@app.get("/search")
def search(query: str, limit: int = 10) -> list[SearchResult]:
    t = pxt.get_table('myapp/documents')
    sim = t.text.similarity(string=query)
    results = (
        t.order_by(sim, asc=False)
        .limit(limit)
        .select(t.text, score=sim)
        .collect()
    )
    return list(results.to_pydantic(SearchResult))
```
Do not use async def for endpoints that call Pixeltable. Pixeltable’s API is synchronous. Inside an async def handler, Pixeltable calls block the event loop, serializing all requests and starving other coroutines. With def handlers, FastAPI’s thread pool handles concurrency for you.
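If a handler genuinely must be `async def` (for example, it also awaits other I/O), off-load the Pixeltable call to a worker thread rather than calling it inline. A minimal stdlib sketch using `asyncio.to_thread`, with a stand-in function in place of the real Pixeltable query (FastAPI users can equivalently use `starlette.concurrency.run_in_threadpool`):

```python
import asyncio

def blocking_search(query: str) -> list[str]:
    # Stand-in for a synchronous Pixeltable call, e.g.
    # pxt.get_table('myapp/documents').select(...).collect()
    return [f"result for {query!r}"]

async def search(query: str) -> list[str]:
    # Run the blocking call on a worker thread so the event loop
    # stays free to serve other coroutines.
    return await asyncio.to_thread(blocking_search, query)

print(asyncio.run(search("cats")))  # → ["result for 'cats'"]
```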
`table.select(...).collect()` returns a `ResultSet` object, which Pydantic cannot serialize directly. You have two options:

Option 1: `to_pydantic()` (recommended for FastAPI)

Define a Pydantic model and let Pixeltable validate and convert each row. FastAPI serializes these natively.
```python
class Item(BaseModel):
    name: str
    score: float

@app.get("/rows")
def get_rows() -> list[Item]:
    t = pxt.get_table('myapp/items')
    return list(t.select(t.name, t.score).collect().to_pydantic(Item))
```
Option 2: `to_pandas()` + `to_dict()`

Convert via pandas when you don't need a Pydantic model.
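The conversion step itself is plain pandas. In this sketch, a literal `DataFrame` stands in for the frame you would get from `t.select(t.name, t.score).collect().to_pandas()`:

```python
import pandas as pd

# Stand-in for table.select(...).collect().to_pandas()
df = pd.DataFrame([
    {'name': 'doc1', 'score': 0.92},
    {'name': 'doc2', 'score': 0.87},
])

# orient='records' yields a JSON-serializable list of dicts,
# which FastAPI can return directly.
rows = df.to_dict(orient='records')
print(rows)
```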
Pixeltable is compatible with uvloop, the high-performance event loop used by default in many production deployments. No special configuration is needed — sync endpoints work identically whether the server uses the default asyncio loop or uvloop.
```shell
# uvicorn with uvloop (the default when uvloop is installed)
uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1
```
When you add a computed column, Pixeltable executes it immediately for all existing rows. For expensive operations (LLM calls, model inference), validate your logic on a sample first using select(); nothing is stored until you commit with add_computed_column().
```python
# 1. Test the transformation on sample rows (nothing stored)
table.select(
    table.text,
    summary=summarize_with_llm(table.text)
).head(3)  # Only processes 3 rows

# 2. Once satisfied, persist to the table (processes all rows)
table.add_computed_column(summary=summarize_with_llm(table.text))
```
This “iterate-then-add” workflow lets you catch errors early without wasting API calls or compute on your full dataset.
Pro tip: Save expressions as variables to guarantee identical logic in both steps:
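A sketch of that pattern. The UDF and column here are dummies so the snippet is self-contained; the real Pixeltable calls appear in the comments:

```python
def summarize_with_llm(col):
    # Stand-in for a real Pixeltable UDF; in real code this returns an
    # expression object usable in both select() and add_computed_column().
    return f"summarize({col})"

text_col = "table.text"  # stand-in for the column reference

# Define the expression ONCE...
summary_expr = summarize_with_llm(text_col)

# ...then reuse the SAME object in both steps:
#   table.select(table.text, summary=summary_expr).head(3)   # 1. preview
#   table.add_computed_column(summary=summary_expr)          # 2. persist
print(summary_expr)
```

Because both steps reference one object, the previewed logic and the persisted logic cannot drift apart.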
High-risk operations: modifying computed columns (`if_exists='replace'`) and dropping columns, tables, or views. These trigger full recomputation or cause permanent data loss.
Production Safety:
```python
# Use if_exists='ignore' for idempotent schema migrations
import pixeltable as pxt
import config

docs_table = pxt.get_table(f'{config.APP_NAMESPACE}/documents')
docs_table.add_computed_column(
    embedding=embed_model(docs_table.document),
    if_exists='ignore'  # No-op if the column already exists
)
```
Version control setup_pixeltable.py like database migration scripts.
Rollback via table.revert() (single operation) or Git revert (complex changes).
The most common schema evolution is switching an embedding or LLM model. In a traditional stack this requires a migration script, a compute cluster, reprocessing every row, and a maintenance window. In Pixeltable it's one line: the old column keeps working while the new one backfills.

Traditional approach:
```python
# 1. Write a migration script
# 2. Spin up compute to re-embed all rows (hours of downtime)
# 3. Swap the column in application code
# 4. Deploy during a maintenance window
# 5. Monitor for consistency issues
data = db.query("SELECT id, content FROM documents")
for row in data:
    new_vec = new_model.encode(row["content"])
    db.execute("UPDATE documents SET embedding = %s WHERE id = %s",
               (new_vec, row["id"]))
```
Pixeltable approach:
```python
# Add a new computed column. The old column still serves queries: zero downtime.
docs.add_computed_column(
    embedding_v2=sentence_transformer(docs.text, model_id='intfloat/e5-large-v2'),
    if_exists='ignore'
)
# Pixeltable backfills in batches, rate-limited, with automatic retries.
# Switch your queries to embedding_v2 when ready.
```
Because both columns coexist, you can A/B test retrieval quality before cutting over — no rollback plan needed.
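One simple pre-cutover check is to measure how much the two top-k result sets overlap. This sketch uses plain lists standing in for the document IDs returned by querying each embedding column:

```python
def topk_overlap(results_v1: list[str], results_v2: list[str], k: int = 10) -> float:
    """Fraction of top-k documents shared by two retrieval runs."""
    a, b = set(results_v1[:k]), set(results_v2[:k])
    return len(a & b) / k

# Stand-ins for IDs returned by querying embedding vs. embedding_v2
v1 = ['d1', 'd2', 'd3', 'd4']
v2 = ['d2', 'd1', 'd5', 'd3']
print(topk_overlap(v1, v2, k=4))  # → 0.75
```

A low overlap is not necessarily bad (the new model may rank better), but a large shift is a signal to inspect results manually before switching.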
Single-Writer Limitation: Pixeltable's storage layer uses an embedded PostgreSQL instance. Only one process can write to `~/.pixeltable/pgdata` at a time.

| Deployment model | Status | Notes |
| --- | --- | --- |
| Single pod + dedicated volume | ✅ Supported | One active pod writes to its dedicated volume. Failover requires volume detach/reattach. |
| Multiple pods + shared volume (NFS/EFS) | ❌ Not Supported | Will cause database corruption. Do not mount the same pgdata to multiple pods. |
| Multi-node HA | 🔜 Coming 2026 | Available in Pixeltable Cloud (serverless scaling, API endpoints). Join the waitlist. |
Test schema changes, UDF updates, application code changes.
Use representative data (anonymized or synthetic).
```python
# Test environment with an isolated namespace
import pixeltable as pxt

TEST_NS = 'test_myapp'
pxt.create_dir(TEST_NS, if_exists='replace')

# Run setup targeting the test namespace
# Execute tests

# pxt.drop_dir(TEST_NS, force=True)  # Cleanup
```