Open in Kaggle  Open in Colab  Download Notebook
This documentation page is also available as an interactive notebook. You can launch the notebook in Kaggle or Colab, or download it for use with an IDE or local Jupyter installation, by clicking one of the above links.
When Pixeltable generates media files (thumbnails, extracted frames, processed images), it stores them locally by default. For production workflows, you can configure Pixeltable to upload these files directly to cloud blob storage, including Amazon S3, Google Cloud Storage, Azure Blob Storage, and S3-compatible services like Cloudflare R2, Backblaze B2, and Tigris.
Key features:
  • Computed media (AI-generated outputs) is uploaded to your bucket automatically
  • Input media can optionally be persisted for durability
  • Files are cached locally and downloaded on-demand
Configuration options:
  1. Global defaults in config.toml:
    [pixeltable]
    input_media_dest = "s3://my-bucket/input/"
    output_media_dest = "s3://my-bucket/output/" 
    
  2. Per-column destination (computed columns only):
    t.add_computed_column(
        thumbnail=t.image.thumbnail((128, 128)),
        destination='s3://my-bucket/thumbnails/'
    )
    
In this notebook, you’ll learn how to configure blob storage destinations for your media files.

What you’ll learn

  • Where Pixeltable stores files by default
  • How to specify destinations for individual columns
  • How to configure global destinations for all columns
  • How destination precedence works

How it works

Pixeltable decides where to store media files using this priority:
  1. Column destination (highest priority): the destination parameter in add_computed_column()
  2. Global configuration: input_media_dest / output_media_dest in the config file or the matching environment variables
  3. Pixeltable’s default local storage: used if nothing else is configured
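
The precedence rule can be sketched in plain Python. This is an illustration only, not Pixeltable's internal code: `resolve_destination` and its arguments are hypothetical names, and the fallback is the `~/.pixeltable/media` location described later in this notebook.

```python
from typing import Optional

# Default local media directory described in this notebook
DEFAULT_MEDIA_DIR = '~/.pixeltable/media'


def resolve_destination(
    column_dest: Optional[str] = None,
    global_dest: Optional[str] = None,
) -> str:
    """Hypothetical helper: return the first configured destination in priority order."""
    if column_dest:  # 1. destination= on the column
        return column_dest
    if global_dest:  # 2. input_media_dest / output_media_dest
        return global_dest
    return DEFAULT_MEDIA_DIR  # 3. default local storage


print(resolve_destination('s3://my-bucket/thumbnails/', 's3://my-bucket/output/'))
print(resolve_destination(None, 's3://my-bucket/output/'))
print(resolve_destination())
```

The three calls print the column destination, the global destination, and the local default, in that order.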

Prerequisites

For this notebook, you’ll need:
  • pixeltable and boto3 installed
  • (Optional) Cloud storage credentials if you want to use a cloud provider
%pip install -qU pixeltable boto3

Setup

Let’s set up our demo environment. We’ll create a Pixeltable directory for this demo, set up local destination paths, create a table, and insert a sample image. You can substitute cloud storage URIs (like s3://my-bucket/path/) anywhere you see a local destination path.
import pixeltable as pxt
from pathlib import Path
# Clean slate for this demo
pxt.drop_dir('blob_storage_demo', force=True)
pxt.create_dir('blob_storage_demo')
Now we’ll create a table with an image column and insert a sample image from the web.
# Create table
t = pxt.create_table(
    'blob_storage_demo/media',
    {'source_image': pxt.Image},
    if_exists='replace',
)
Created table ‘media’.
We can inspect the schema before adding images to our table:
t
Let’s insert a single sample image.
sample_image = 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000036.jpg'
t.insert(source_image=sample_image)
Inserted 1 row with 0 errors in 0.77 s (1.29 rows/s)
1 row inserted.
And we can see the image in our table:
t.collect()

Default destinations

By default, Pixeltable keeps media files local, with generated files stored under ~/.pixeltable/media:
  • Input files (files you insert) — If you insert a URL, Pixeltable stores the URL and downloads it to cache on access. If you insert a local file path, Pixeltable just stores the path reference (the file stays where it is).
  • Output files (files Pixeltable generates) — Stored in ~/.pixeltable/media
This works out of the box with no configuration. You can change these defaults, which we’ll cover in the rest of this notebook.
Let’s check where the source image is stored. Since we inserted a URL (not a local file), Pixeltable stores the URL reference and will download it to cache when we access it.
# Let's see where the source_image is stored by default
t.select(t.source_image.fileurl).collect()
Now let’s add a computed column without specifying a destination. This will show us where Pixeltable stores output files by default.
# Add computed column with no destination specified - uses default
t.add_computed_column(
    flipped=t.source_image.transpose(0), if_exists='replace'
)
Added 1 column value with 0 errors in 0.02 s (45.44 rows/s)
1 row updated.
Check the file URL - it points to ~/.pixeltable/media, the default location for generated files.
t.select(t.flipped, t.flipped.fileurl).collect()
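
If you want to open a locally stored file from your own code, a small standard-library helper can turn the URL into a path. This sketch assumes the value returned by `.fileurl` for local storage is a `file://` URL; `fileurl_to_path` and the example file name are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def fileurl_to_path(fileurl: str) -> Path:
    """Convert a file:// URL into a local Path; reject other schemes."""
    parsed = urlparse(fileurl)
    if parsed.scheme != 'file':
        raise ValueError(f'not a local file URL: {fileurl}')
    return Path(url2pathname(parsed.path))


# Hypothetical example URL under the default media directory
media_root = Path.home() / '.pixeltable' / 'media'
example_url = (media_root / 'example.jpg').as_uri()

local_path = fileurl_to_path(example_url)
print(local_path)
print(media_root in local_path.parents)
```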

Per-column destinations

When you create a computed column, you can specify exactly where to store generated files using the destination= parameter. This gives you fine-grained control over outputs, which may be costly and/or difficult to re-generate.
We’ll create a destination directory for storing one of our processed images. For this demo, we’re using a local directory on your Desktop, but you can replace this path with a cloud storage URI (like s3://my-bucket/rotated/).
# Create a local destination directory
# For S3: dest_rotated = "s3://my-bucket/rotated/"
# For GCS: dest_rotated = "gs://my-bucket/rotated/"
base_path = Path.home() / 'Desktop' / 'pixeltable_outputs'
base_path.mkdir(parents=True, exist_ok=True)

dest_rotated = str(base_path / 'rotated')

# Create directory (only needed for local paths)
Path(dest_rotated).mkdir(exist_ok=True)
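
If the same code should accept either a local path or a cloud URI, the directory-creation step can be made conditional. The helper below is a hypothetical sketch built on the standard library; the scheme set covers only the URI formats that appear in this notebook and is an assumption, not a list taken from Pixeltable.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse

# URI schemes shown in this notebook; extend for other providers as needed
CLOUD_SCHEMES = {'s3', 'gs', 'http', 'https'}


def prepare_destination(dest: str) -> str:
    """Hypothetical helper: create the directory for local paths, pass cloud URIs through."""
    if urlparse(dest).scheme in CLOUD_SCHEMES:
        return dest  # only local paths need a directory created up front
    Path(dest).mkdir(parents=True, exist_ok=True)
    return dest


# Local path: the directory is created (a temporary directory is used here)
with tempfile.TemporaryDirectory() as tmp:
    local_dest = prepare_destination(str(Path(tmp) / 'rotated'))
    print(Path(local_dest).is_dir())

# Cloud URI: returned unchanged, nothing is created locally
print(prepare_destination('s3://my-bucket/rotated/'))
```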
Now let’s add a computed column with an explicit destination to see the difference from the default behavior.
# Add column WITH explicit destination
t.add_computed_column(
    rotated=t.source_image.rotate(90),
    destination=dest_rotated,
    if_exists='replace',
)
Added 1 column value with 0 errors in 0.02 s (48.98 rows/s)
1 row updated.
Compare the file URLs. The rotated image uses our explicit destination, while flipped (created earlier) uses the default ~/.pixeltable/media location.
t.select(t.rotated, t.rotated.fileurl).collect()
t.select(t.flipped, t.flipped.fileurl).collect()

Changing global destinations

Instead of setting destination= on every column, you can change the global default for ALL columns.

Output and input destinations

You can configure two types of global destinations:
  • output_media_dest — Changes the default for files Pixeltable generates (computed columns)
  • input_media_dest — Changes the default for files you insert into tables
You can set them to the same bucket or different buckets depending on your needs.

How to configure

You have two options:
Option 1: Configuration file (~/.pixeltable/config.toml)
[pixeltable]
# Where files Pixeltable generates are stored
output_media_dest = "s3://my-bucket/output/"

# Where files you insert are stored  
input_media_dest = "s3://my-bucket/input/"
Option 2: Environment variables
export PIXELTABLE_OUTPUT_MEDIA_DEST="s3://my-bucket/output/"
export PIXELTABLE_INPUT_MEDIA_DEST="s3://my-bucket/input/"
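
In a hosted notebook (Colab or Kaggle) there is often no shell profile to edit, so the same variables can be set from Python. This is a sketch that assumes Pixeltable reads its configuration when it initializes, which is why the variables are set before the import; the bucket names are placeholders.

```python
import os

# Set destinations before Pixeltable is imported and initialized
# (assumption: configuration is read at initialization time)
os.environ['PIXELTABLE_OUTPUT_MEDIA_DEST'] = 's3://my-bucket/output/'
os.environ['PIXELTABLE_INPUT_MEDIA_DEST'] = 's3://my-bucket/input/'

# import pixeltable as pxt  # import only after the variables are set

for name in ('PIXELTABLE_OUTPUT_MEDIA_DEST', 'PIXELTABLE_INPUT_MEDIA_DEST'):
    print(name, '=', os.environ[name])
```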

Supported providers and URI formats

For complete authentication and setup details, see the Cloud Storage documentation.

Overriding global destinations

Even if you configure global destinations, you can still override them for specific columns using the destination= parameter in add_computed_column(). Let’s create a new destination directory and add a thumbnail column that uses it.
# Create a different destination for thumbnails
dest_thumbnails = str(base_path / 'thumbnails')
Path(dest_thumbnails).mkdir(exist_ok=True)

# Add column with explicit destination (overrides any global default)
t.add_computed_column(
    thumbnail=t.source_image.thumbnail((128, 128)),
    destination=dest_thumbnails,
    if_exists='replace',
)
Added 1 column value with 0 errors in 0.02 s (47.89 rows/s)
1 row updated.
Let’s view the thumbnail and its file URL. The explicit destination= parameter always wins, regardless of global configuration.
t.select(t.thumbnail, t.thumbnail.fileurl).collect()

Getting URLs for your files

When your files are in blob storage, you can get URLs that point directly to them. These URLs work in HTML, APIs, or any application that needs to serve media. Use the .fileurl property on a media column to retrieve them.
t.select(
    source=t.source_image.fileurl,
    rotated=t.rotated.fileurl,
    flipped=t.flipped.fileurl,
).collect()
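
As one way to use such a URL in HTML, the sketch below builds an `<img>` element with the standard library. It assumes the URL is reachable over HTTP(S); `img_tag` and the example URL are hypothetical and stand in for a value returned by `.fileurl`.

```python
from html import escape


def img_tag(url: str, alt: str = '') -> str:
    """Build an <img> element, escaping the URL and alt text for use in attributes."""
    return f'<img src="{escape(url, quote=True)}" alt="{escape(alt, quote=True)}">'


# Hypothetical URL standing in for a value returned by .fileurl
print(img_tag('https://example.com/media/rotated.jpg', alt='Rotated image'))
```

Escaping matters mostly for presigned URLs, whose query strings contain `&` characters.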

Generating presigned URLs

Note: This section only applies if you’re using cloud storage (S3, GCS, Azure, R2, B2, Tigris). If you’re following along with local destinations (as in the examples above), you can skip this section or configure cloud storage to try it out.
When your files are in cloud storage, the .fileurl property returns storage URIs like s3://bucket/path/file.jpg. These aren’t directly accessible over HTTP. For private buckets or when you need time-limited HTTP access, use presigned URLs. These are temporary, authenticated URLs that allow anyone to access your files for a limited time without needing credentials.
Presigned URLs are particularly useful for:
  • Sharing files from private buckets without making them public
  • Creating temporary download links with expiration
  • Serving media in web applications without exposing credentials
  • Providing time-limited access to sensitive content
Use the presigned_url function from pixeltable.functions.net:
import os

# Use HTTPS URL format for Backblaze B2
b2_region = 'us-east-005'
b2_bucket = 'pixeltable'
cloud_destination = (
    f'https://s3.{b2_region}.backblazeb2.com/{b2_bucket}/presigned-demo/'
)

# Add the computed column
t.add_computed_column(
    cloud_thumbnail=t.source_image.thumbnail((64, 64)),
    destination=cloud_destination,
    if_exists='replace',
)
Added 1 column value with 0 errors in 0.22 s (4.46 rows/s)
1 row updated.
# Now generate presigned URLs for the cloud-stored files
from pixeltable.functions import net

t.select(
    cloud_thumbnail=t.cloud_thumbnail,
    storage_url=t.cloud_thumbnail.fileurl,
    presigned_url=net.presigned_url(
        t.cloud_thumbnail.fileurl, 3600
    ),  # 1-hour expiration
).collect()
The presigned URLs in the output are fully authenticated HTTP/HTTPS URLs that can be accessed directly in a browser or used in APIs without any credentials.

Common expiration times

Note: Different storage providers have different maximum expiration limits. For example, Google Cloud Storage has a maximum 7-day expiration for presigned URLs.
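
Since the example above passes the expiration as a number of seconds (3600 for one hour), it can help to compute the value from a duration and check it against the provider limit. This is a hypothetical helper, not part of Pixeltable; the default ceiling is the 7-day Google Cloud Storage limit mentioned in the note above.

```python
from datetime import timedelta

# 7-day ceiling mentioned above for Google Cloud Storage
MAX_GCS_EXPIRATION = timedelta(days=7)


def expiration_seconds(
    duration: timedelta, maximum: timedelta = MAX_GCS_EXPIRATION
) -> int:
    """Convert a duration to whole seconds, rejecting values outside (0, maximum]."""
    if duration <= timedelta(0):
        raise ValueError('expiration must be positive')
    if duration > maximum:
        raise ValueError(f'expiration exceeds the maximum of {maximum}')
    return int(duration.total_seconds())


print(expiration_seconds(timedelta(hours=1)))  # 3600, the value used in the example above
print(expiration_seconds(timedelta(days=7)))   # 604800
```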

Troubleshooting presigned URLs

If presigned_url() isn’t working:
  1. Local files: Presigned URLs only work with cloud storage (S3, GCS, Azure, R2, B2, Tigris). If your files are stored locally (default), you’ll get an error. Configure a cloud destination first.
  2. Already HTTP URLs: If .fileurl returns an http:// or https:// URL (not a storage URI like s3://), the file is already publicly accessible and doesn’t need a presigned URL.
  3. Credentials: Ensure your cloud storage credentials are properly configured. See the Cloud Storage documentation for provider-specific setup.
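
The first two checks can be sketched as a small classifier over the URL scheme. `presign_status` is a hypothetical helper, and the scheme set includes only the storage URI prefixes that appear in this notebook (`s3://` and `gs://`); other providers may use different prefixes.

```python
from urllib.parse import urlparse

# Storage URI schemes that appear in this notebook; other providers use their own
STORAGE_SCHEMES = {'s3', 'gs'}


def presign_status(fileurl: str) -> str:
    """Hypothetical helper mirroring the checklist above."""
    scheme = urlparse(fileurl).scheme
    if scheme in STORAGE_SCHEMES:
        return 'presign'  # storage URI: presigned_url() applies
    if scheme in ('http', 'https'):
        return 'already-http'  # item 2: already an HTTP(S) URL
    return 'local'  # item 1: local file, configure a cloud destination first


for url in (
    's3://my-bucket/thumbnails/a.jpg',
    'https://example.com/a.jpg',
    'file:///tmp/a.jpg',
):
    print(url, '->', presign_status(url))
```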

Common patterns

Here are a few real-world patterns you might use:

Pattern 1: All media in one place

If you want everything in the same bucket, configure both input and output destinations in ~/.pixeltable/config.toml:
[pixeltable]
input_media_dest = "s3://my-bucket/media/"
output_media_dest = "s3://my-bucket/media/"
Or set environment variables:
export PIXELTABLE_INPUT_MEDIA_DEST="s3://my-bucket/media/"
export PIXELTABLE_OUTPUT_MEDIA_DEST="s3://my-bucket/media/"

Pattern 2: Separate input and output

Keep source files separate from processed files in ~/.pixeltable/config.toml:
[pixeltable]
input_media_dest = "s3://my-bucket/uploads/"
output_media_dest = "s3://my-bucket/processed/"

Pattern 3: Override for specific columns

Use a global default, but send some columns elsewhere. First, set a global default in your config:
[pixeltable]
output_media_dest = "s3://my-bucket/processed/"
Then in your code, most columns use the global default, but you can override specific ones:
# Uses global default (s3://my-bucket/processed/)
t.add_computed_column(
    thumbnail=t.image.thumbnail((128, 128))
)

# Overrides global default - goes to different location
t.add_computed_column(
    large_thumbnail=t.image.thumbnail((512, 512)),
    destination='s3://my-bucket/thumbnails/'
)

Where do my files go?

Understanding how Pixeltable handles different types of input files helps you make better decisions about storage configuration.
When you configure a cloud destination, Pixeltable populates both the destination and the local cache efficiently during insert(). For URLs, this means downloading once and using that download for both the upload and cache—avoiding wasteful upload→download cycles.
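
The download-once idea can be illustrated with a toy example. Nothing here is Pixeltable code: `fetch` is a stand-in for an HTTP download, the "destination" is a local folder standing in for a bucket, and all names are hypothetical.

```python
import tempfile
from pathlib import Path

fetch_count = 0


def fetch(url: str) -> bytes:
    """Stand-in for an HTTP download; counts how often it is called."""
    global fetch_count
    fetch_count += 1
    return b'fake image bytes for ' + url.encode()


def ingest(url: str, cache_dir: Path, dest_dir: Path) -> None:
    """Download once, then write the same bytes to both the cache and the destination."""
    data = fetch(url)
    name = url.rsplit('/', 1)[-1]
    (cache_dir / name).write_bytes(data)
    (dest_dir / name).write_bytes(data)  # stands in for the cloud upload


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / 'cache'
    dest_dir = Path(tmp) / 'destination'
    cache_dir.mkdir()
    dest_dir.mkdir()
    ingest('https://example.com/images/cat.jpg', cache_dir, dest_dir)
    same = (cache_dir / 'cat.jpg').read_bytes() == (dest_dir / 'cat.jpg').read_bytes()
    print(fetch_count, same)
```

Both copies come from a single fetch, so no second download is needed to fill the cache after the upload.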

What you learned

  • Pixeltable uses local storage by default for all media files
  • You can override the default for specific columns with the destination parameter
  • You can change the global default with input_media_dest and output_media_dest
  • Precedence: column destination > global config > Pixeltable’s default local storage
  • Use .fileurl to get URLs for your stored files
  • Use net.presigned_url() to generate time-limited, authenticated HTTP URLs for cloud storage files
  • With a cloud destination configured, Pixeltable downloads each inserted URL once and reuses that download for both the upload and the local cache

Last modified on February 4, 2026