Who: ML Engineers, Data Scientists
Output: Training/evaluation datasets

Pixeltable is your system of record: all data, cached results, and references stay in sync.

Data Lifecycle

Ingest

Load from any source: import_csv(), import_parquet(), HuggingFace, S3/GCS/Azure, RDBMS via Python DB API

Import from S3

Load images/videos from cloud storage

Import HuggingFace

Load datasets from HuggingFace Hub
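
For the "RDBMS via Python DB API" route listed under Ingest, the sketch below shows one way to pull rows out of a relational database as plain dictionaries using only the DB API 2.0 interface. It uses the standard-library `sqlite3` driver as a stand-in for your database. The `fetch_rows` helper, the `images` table, and the example URLs are hypothetical names introduced for illustration, not part of Pixeltable.

```python
import sqlite3


def fetch_rows(conn, query):
    """Run a query on a DB-API 2.0 connection and return the rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(query)
        # cursor.description holds one 7-item sequence per column; item 0 is the name
        columns = [d[0] for d in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Hypothetical source database, created in memory so the sketch is self-contained
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, url TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO images (url, label) VALUES (?, ?)",
    [
        ("s3://example-bucket/cat.jpg", "cat"),
        ("s3://example-bucket/dog.jpg", "dog"),
    ],
)
conn.commit()

rows = fetch_rows(conn, "SELECT id, url, label FROM images ORDER BY id")
conn.close()

# rows is now a list of dicts, one per record, ready to be handed to a
# table insert; that step depends on your schema and is left out here.
```

Because the helper relies only on `cursor()`, `execute()`, `description`, and `fetchall()`, the same pattern should carry over to other DB API 2.0 drivers with a different `connect()` call.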

Explore

Statistics & sampling: select(), sample(), head()

Data Sampling

Sample and filter large datasets efficiently

End-to-End Examples

Object Detection Pipeline

Complete workflow: ingest video → extract frames → detect objects → export

Audio Transcription Pipeline

Transcribe and analyze audio at scale

Structured Vision Output

Extract structured data from images with GPT-4o

Generate Captions

Auto-generate image descriptions
Last modified on February 24, 2026