Output: Training/evaluation datasets
Pixeltable is your system of record: all data, cached results, and references stay in sync.
Data Lifecycle
- 1. Acquire Data
- 2. Enrich & Annotate
- 3. Curate
Ingest
Load from any source:
import_csv(), import_parquet(), HuggingFace, S3/GCS/Azure, RDBMS via Python DB API
Explore
Statistics & sampling:
select(), .sample(), .head()
Data Sampling
Sample and filter large datasets efficiently
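To make "sample and filter efficiently" concrete, here is a minimal, library-independent sketch of one-pass reservoir sampling combined with a filter. It is not Pixeltable's implementation of `.sample()`. The function name `sample_filtered` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def sample_filtered(
    rows: Iterable[T],
    k: int,
    keep: Callable[[T], bool] = lambda _row: True,
    seed: Optional[int] = None,
) -> List[T]:
    """Return up to k rows chosen uniformly at random from the rows that pass `keep`.

    Uses reservoir sampling (Algorithm R): a single pass over the input and
    O(k) memory, so the full dataset never has to be materialized.
    """
    rng = random.Random(seed)  # local RNG so a seed gives reproducible samples
    reservoir: List[T] = []
    seen = 0  # number of rows that passed the filter so far

    for row in rows:
        if not keep(row):
            continue
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k matching rows.
            reservoir.append(row)
        else:
            # Replace a random slot with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = row
    return reservoir


if __name__ == "__main__":
    # Hypothetical rows standing in for a large table.
    rows = ({"id": i, "label": "cat" if i % 2 == 0 else "dog"} for i in range(100_000))
    picked = sample_filtered(rows, k=5, keep=lambda r: r["label"] == "cat", seed=42)
    for r in picked:
        print(r)
```

Because the input is consumed as a stream, the same function works on a generator that reads from a file or a database cursor. Passing a fixed `seed` makes the sample repeatable across runs.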