Overview
Pixeltable Cloud enables you to:
- Publish your datasets for sharing with teams or the public
- Replicate datasets from the cloud to your local environment
- Share multimodal AI datasets (images, videos, audio, documents) without managing infrastructure
Setup
Data sharing functionality requires Pixeltable version 0.4.24 or later.
Replicating datasets
You can replicate any public dataset from Pixeltable Cloud to your local environment without needing an account or API key.
Replicate a public dataset
Let’s replicate a mini-version of the COCO-2017 dataset from Pixeltable Cloud. You can find this dataset at pixeltable.com/t/pixeltable:fiftyone/coco_mini_2017, or browse for other public datasets.
When calling replicate():
- remote_uri (required): The URI of the cloud dataset you want to replicate
- local_path (your choice): The local directory/table name where you want to store the replica
- Variable name (your choice): The Python variable in your session/script to reference the table (e.g., coco_copy)
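The call described above can be sketched as follows. This is a minimal sketch, not the page's original cell: the wrapper name replicate_coco_mini is hypothetical, while the URI and local path are the ones shown in the output that follows. The import sits inside the function so that defining it has no side effects.

```python
def replicate_coco_mini():
    """Replicate the public COCO mini dataset into a local directory."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    # Make sure the local directory that will hold the replica exists.
    pxt.create_dir('sharing-demo', if_exists='ignore')

    # Public datasets can be replicated without an account or API key.
    return pxt.replicate(
        remote_uri='pxt://pixeltable:fiftyone/coco_mini_2017',
        local_path='sharing-demo/coco-copy',
    )


# Usage: coco_copy = replicate_coco_mini()
```

The returned handle is what the rest of this page refers to as coco_copy.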
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory 'sharing-demo'.
Extracting table data into: /Users/asiegel/.pixeltable/tmp/acad78b1-4a62-483e-a0b1-728ccb5603cf
Created directory '_system'.
Created local replica 'sharing-demo/coco-copy' from URI: pxt://pixeltable:fiftyone/coco_mini_2017
You can check that the replica exists at the local path with
list_tables().
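One way to turn that check into a reusable test is sketched below. The helper name replica_exists is hypothetical, and the sketch assumes list_tables() returns full table paths as strings, as the output that follows suggests.

```python
def replica_exists(path='sharing-demo/coco-copy'):
    """Return True if a table is registered locally under `path`."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    # list_tables() is assumed to return full paths such as
    # 'sharing-demo/coco-copy'.
    return path in pxt.list_tables()
```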
['sharing-demo/coco-copy']
To see the structure of the replicated table:
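A minimal sketch of inspecting the structure, assuming the table handle exposes a describe() method that prints its schema. The wrapper name show_structure is hypothetical, and in a notebook simply evaluating the handle may give a similar view.

```python
def show_structure(table):
    """Print the column names and types of a table or replica."""
    # describe() is assumed to print the schema of the table handle.
    table.describe()
    return table


# Usage: show_structure(coco_copy)
```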
Working with replicas
Replicated datasets are read-only locally, but you can query, explore, and use them in several ways:
1. Query and explore the data
Use list_tables() and get_table() to access your replicas:
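A hedged sketch of looking up a replica by path and peeking at its rows. The helper name open_replica and the default row count are hypothetical. Only calls that do not depend on the dataset's column names are used, since the schema is not listed on this page.

```python
def open_replica(path='sharing-demo/coco-copy', n=5):
    """Look up a local replica by path and return it with its first rows."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    available = pxt.list_tables()
    if path not in available:
        raise LookupError(f'{path!r} not found; available tables: {available}')

    replica = pxt.get_table(path)
    # Replicas are read-only, but read queries such as head() still work.
    return replica, replica.head(n)
```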
['sharing-demo/coco-copy']
Created table 'my-coco-table'.
This copies the values in the source, but drops the computational
definitions and cannot be updated if the source table changes.
Updating replicas with pull
If the upstream table changes, you can update your local replica using pull():
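As a sketch, assuming pull() is called as a method on the replica's table handle (the helper name refresh_replica is hypothetical):

```python
def refresh_replica(path='sharing-demo/coco-copy'):
    """Bring a local replica up to date with its published source."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    replica = pxt.get_table(path)
    # Assumed to be a no-op when the replica is already up to date.
    replica.pull()
    return replica
```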
Replica 'sharing-demo/coco-copy' is already up to date with source: pxt://pixeltable:fiftyone/d699317b-23a4-404b-8f71-6531fd8dc462
This synchronizes your local replica with any updates made to the source
dataset.
Publishing datasets
Requirements:
- A Pixeltable Cloud account (Community Edition includes 1 TB storage - see pricing)
- Your API key from the account dashboard
Configure your API key
Pixeltable looks for your API key in the PIXELTABLE_API_KEY environment variable. Choose one of these methods:
Option 1: In your notebook (secure and convenient)
Run this cell to securely enter your API key (get it from
pixeltable.com/dashboard):
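A minimal sketch of such a cell, using only the Python standard library. The helper name ensure_api_key is hypothetical. getpass hides the key as you type, so it does not end up in the notebook output.

```python
import getpass
import os


def ensure_api_key(prompt=getpass.getpass):
    """Prompt for the API key only if PIXELTABLE_API_KEY is not already set."""
    if not os.environ.get('PIXELTABLE_API_KEY'):
        # The key is stored only in this process's environment.
        os.environ['PIXELTABLE_API_KEY'] = prompt('Pixeltable API key: ')


# Usage: ensure_api_key()
```

The prompt argument exists only so the helper can be exercised without an interactive terminal.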
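Option 2: In your shell profile
Set PIXELTABLE_API_KEY in ~/.zshrc or ~/.bashrc.
Option 3: In your Pixeltable config file
Add your API key to ~/.pixeltable/config.toml.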
Create a sample dataset
Let’s create a table with images from this repository to publish. The comment parameter provides a description that will be visible on Pixeltable Cloud:
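A hedged sketch of this step. The table name photos matches the output that follows, but the directory path, the column name image, the default comment text, and the helper name create_photos_table are assumptions made for illustration. The image locations are left to the caller because the original list is not reproduced here.

```python
def create_photos_table(image_sources, comment='Sample photos for the data sharing demo'):
    """Create a small image table and insert one row per image source."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    pxt.create_dir('sample-images', if_exists='ignore')

    # The comment is the description shown on Pixeltable Cloud after publishing.
    photos = pxt.create_table(
        'sample-images/photos',      # assumed path; adjust to your own layout
        {'image': pxt.Image},        # hypothetical single-column schema
        comment=comment,
    )

    # Each source is a URL or local file path pointing to an image.
    photos.insert([{'image': src} for src in image_sources])
    return photos
```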
Created table 'photos'.
Inserted 3 rows with 0 errors in 0.02 s (169.05 rows/s)
3 rows inserted.
Publish your dataset
Publish your table to Pixeltable Cloud. When calling publish():
- source (required): An existing local table - either a table path string (e.g., 'sample-images.photos') or table handle (e.g., t)
  - If you use a local table path string, it must match a table in your local database (you can verify with pxt.list_tables())
- destination_uri (required): The cloud URI where you want to publish, in the format pxt://orgname/dataset
  - Pixeltable automatically creates any directory structure in the cloud based on this URI
  - Your local directory structure doesn’t need to match the cloud structure
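The two parameters above can be put together as in this sketch. The wrapper name publish_table is hypothetical, and the call assumes PIXELTABLE_API_KEY is already configured.

```python
def publish_table(source, destination_uri):
    """Publish a local table (path string or handle) to Pixeltable Cloud."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    # source may be a table handle or a local table path string.
    pxt.publish(source=source, destination_uri=destination_uri)


# Usage, with placeholder names:
# publish_table(t, 'pxt://orgname/dataset')
```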
Understanding destination URIs
The destination_uri in publish() uses the format:
pxt://org:database/path
URI components:
- org (required): Your organization name
- database (optional): Database name - defaults to main if omitted
- path (required): Directory and table path in the cloud
Examples:
- pxt://orgname/my-dataset → Uses the default main database
- pxt://orgname:main/my-dataset → Explicitly specifies the main database
- pxt://orgname:analytics/my-dataset → Uses the analytics database
Notes:
- Every Pixeltable Cloud account includes a main database by default
- Each database has its own storage bucket
- You can create additional databases in your Pixeltable dashboard
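To make the URI format concrete, here is a small standalone parser that splits a pxt:// URI into its components and applies the main default. It is an illustration of the format described above, not Pixeltable's own parsing code, and the names PxtUri and parse_pxt_uri are hypothetical.

```python
from typing import NamedTuple


class PxtUri(NamedTuple):
    org: str
    database: str
    path: str


def parse_pxt_uri(uri: str) -> PxtUri:
    """Split 'pxt://org[:database]/path' into its three components."""
    prefix = 'pxt://'
    if not uri.startswith(prefix):
        raise ValueError(f'expected a URI starting with {prefix!r}: {uri!r}')

    # Everything before the first '/' is 'org' or 'org:database'.
    location, sep, path = uri[len(prefix):].partition('/')
    if not sep or not path:
        raise ValueError(f'missing table path in {uri!r}')

    org, _, database = location.partition(':')
    if not org:
        raise ValueError(f'missing organization in {uri!r}')

    # The database defaults to 'main' when omitted.
    return PxtUri(org, database or 'main', path)
```

For example, parse_pxt_uri('pxt://orgname/my-dataset') and parse_pxt_uri('pxt://orgname:main/my-dataset') yield the same components.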
Updating published datasets with push
After you’ve published a dataset, you can update the cloud replica with local changes using push():
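As a sketch, assuming push() is called as a method on the handle of the table you published (the helper name push_changes is hypothetical):

```python
def push_changes(table_path):
    """Send local changes from a published table to its cloud replica."""
    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    table = pxt.get_table(table_path)
    # The table must already have been published with publish().
    table.push()
    return table
```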
Note: You cannot use replicate() to create a replica of your own table. This is because the
table already exists in your Pixeltable database. The replicate()
function is intended for pulling datasets published by others into your
environment.
Access control
The access parameter in publish() controls who can replicate your dataset:
- access='private' (default): Only your team members can access the dataset
- access='public': Anyone can replicate your dataset
You can set the access level when publishing with the access parameter, or change it later in the Pixeltable Cloud
UI. You can also manage team members
and permissions in your dashboard.
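A short sketch of choosing the access level at publish time. The wrapper name publish_with_access is hypothetical, and the up-front check simply mirrors the two values listed above.

```python
def publish_with_access(source, destination_uri, access='private'):
    """Publish a table and state explicitly who may replicate it."""
    if access not in ('private', 'public'):
        raise ValueError(f"access must be 'private' or 'public', got {access!r}")

    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    pxt.publish(source=source, destination_uri=destination_uri, access=access)
```

Validating the value before the call avoids starting an upload with a mistyped access level.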
Deleting published tables
If you want to delete a published table, you have two options:
Option 1: Using the Pixeltable SDK
Use drop_table() with your table’s destination URI (the same pxt:// URI you used when publishing):
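A minimal sketch of that call. The wrapper name delete_published_table is hypothetical. The prefix check is a safety measure added for illustration, so that a local table path passed by mistake is rejected rather than dropped.

```python
def delete_published_table(destination_uri):
    """Delete a published table, identified by its pxt:// destination URI."""
    if not destination_uri.startswith('pxt://'):
        raise ValueError('expected the pxt:// URI used when publishing')

    # Imported lazily so that defining this helper has no side effects.
    import pixeltable as pxt

    pxt.drop_table(destination_uri)
```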
Get help
Have questions or need support? Join our community:
- Discord Community: Ask questions, get community support, and share what you build with Pixeltable
- YouTube: Watch tutorials, demos, and feature walkthroughs
- GitHub Issues: Report bugs or request features