# Cloud Storage

> Connect Pixeltable to S3, Google Cloud Storage, Azure Blob, and other cloud storage backends to manage media files and external references.

Pixeltable supports storing media files (images, videos, audio, documents) in external cloud storage providers instead of local disk. This is essential for production deployments, enabling scalable storage, team collaboration, and integration with existing data infrastructure.

## Supported providers

<CardGroup cols={3}>
  <Card title="Pixeltable Cloud" icon="cloud">
    Free managed storage, no bucket setup required
  </Card>

  <Card title="Amazon S3" icon="aws">
    Native S3 storage with full feature support
  </Card>

  <Card title="Google Cloud Storage" icon="google">
    GCS buckets with gs\:// URI scheme
  </Card>

  <Card title="Azure Blob Storage" icon="microsoft">
    Azure containers with wasb:// or abfs\:// schemes
  </Card>

  <Card title="Cloudflare R2" icon="cloudflare">
    S3-compatible storage with zero egress fees
  </Card>

  <Card title="Backblaze B2" icon="hard-drive">
    Cost-effective S3-compatible storage
  </Card>

  <Card title="Tigris" icon="database">
    Globally distributed S3-compatible storage
  </Card>
</CardGroup>

## How it works

When you configure a storage destination, Pixeltable automatically:

1. **Uploads computed media** — AI-generated images, extracted video frames, and other computed media files are stored in your bucket
2. **Copies input media** — Optionally persists referenced media files for durability
3. **Manages file lifecycle** — Cleans up files when table data is deleted
4. **Handles caching** — Downloads files on-demand with intelligent local caching

## Configuration

There are two ways to configure cloud storage destinations:

### Global default destinations

Set default destinations for all media columns in your `config.toml` (see [Configuration](/platform/configuration) for details):

```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
[pixeltable]
# For input media (inserted/referenced files)
input_media_dest = "s3://my-bucket/input/"

# For computed media (AI-generated outputs)
output_media_dest = "s3://my-bucket/output/"
```

Or via environment variables:

```bash theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
export PIXELTABLE_INPUT_MEDIA_DEST="s3://my-bucket/input/"
export PIXELTABLE_OUTPUT_MEDIA_DEST="s3://my-bucket/output/"
```

<Tip>
  Configure these before creating tables. All media columns will automatically use the configured destinations.
</Tip>
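
If you prefer to set the destinations from Python (for example in a notebook), the same environment variables can be set through `os.environ`. This is a minimal sketch: it assumes the variables are set before Pixeltable is first imported and initialized, and the bucket paths are placeholders.

```python
import os

# Placeholder destinations; replace with your own bucket and prefixes.
MEDIA_DESTINATIONS = {
    'PIXELTABLE_INPUT_MEDIA_DEST': 's3://my-bucket/input/',
    'PIXELTABLE_OUTPUT_MEDIA_DEST': 's3://my-bucket/output/',
}

for name, value in MEDIA_DESTINATIONS.items():
    # setdefault keeps any value that was already exported in the shell
    os.environ.setdefault(name, value)

# Import Pixeltable only after this point (assumption: the variables are
# read when Pixeltable initializes), e.g.:
# import pixeltable as pxt
```

Using `setdefault` means a value exported in the shell still takes priority over the one hard-coded in the script.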

### Per-column destination (computed columns only)

For **computed columns**, you can override the default with a specific destination:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
import pixeltable as pxt

# Create a table with input media column
# (uses global input_media_dest if configured)
t = pxt.create_table('my_app/images', {'image': pxt.Image})

# Add computed column with explicit destination
t.add_computed_column(
    thumbnail=t.image.resize((128, 128)),
    destination='s3://my-bucket/thumbnails/'
)
```

<Note>
  The `destination` parameter only applies to **stored computed columns**. For input columns, use the global `input_media_dest` configuration.
</Note>

### Precedence rules

Destinations are resolved in this order:

1. **Explicit column destination** — highest priority (computed columns only)
2. **Global default** — `input_media_dest` for input columns, `output_media_dest` for computed columns
3. **Local storage** — fallback if no destination is configured
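
The resolution order above can be sketched as a small pure-Python function. `resolve_destination` is a hypothetical helper written for illustration only; it is not part of the Pixeltable API and does not reflect how Pixeltable implements the lookup internally.

```python
from typing import Optional


def resolve_destination(
    is_computed: bool,
    column_destination: Optional[str] = None,
    input_media_dest: Optional[str] = None,
    output_media_dest: Optional[str] = None,
) -> Optional[str]:
    """Return the destination URI for a media column, or None for local storage."""
    # 1. An explicit destination wins, but only computed columns accept one.
    if is_computed and column_destination is not None:
        return column_destination
    # 2. Otherwise use the global default for this kind of column.
    # 3. If that default is unset (None), media stays on local storage.
    return output_media_dest if is_computed else input_media_dest


# A computed column with an explicit destination ignores the global default.
print(resolve_destination(
    is_computed=True,
    column_destination='s3://my-bucket/thumbnails/',
    output_media_dest='s3://my-bucket/output/',
))
```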

## Provider configuration

### Pixeltable Cloud (home bucket)

Every Pixeltable Cloud account includes a free managed storage bucket. No bucket creation, no credentials file, no cloud provider account needed.

<Tabs>
  <Tab title="URI Format">
    ```
    pxtfs://org-slug:db-slug/home
    ```

    Replace `org-slug` and `db-slug` with your Pixeltable Cloud organization and database names.
  </Tab>

  <Tab title="Authentication">
    Set your Pixeltable API key:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [pixeltable]
    api_key = "your-pixeltable-api-key"
    ```

    Or via environment variable:

    ```bash theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    export PIXELTABLE_API_KEY="your-pixeltable-api-key"
    ```

    Pixeltable automatically fetches and refreshes temporary credentials from the Cloud control plane.
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    t = pxt.create_table('app/images', {'photo': pxt.Image})

    t.add_computed_column(
        thumbnail=t.photo.resize((256, 256)),
        destination='pxtfs://myorg:mydb/home'
    )
    ```

    Or set it as your global default in `config.toml`:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [pixeltable]
    output_media_dest = "pxtfs://myorg:mydb/home"
    ```

    Or as an environment variable:

    ```bash theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    export PIXELTABLE_OUTPUT_MEDIA_DEST="pxtfs://myorg:mydb/home"
    ```
  </Tab>
</Tabs>

<Tip>
  This is the fastest way to get cloud storage working. No AWS/GCP/Azure account required.
</Tip>

You can browse, search, and preview the contents of your home bucket directly from the [Pixeltable Cloud dashboard](https://www.pixeltable.com/dashboard). Navigate to **Storage & Buckets** in the sidebar, then select your **home** bucket to explore files by type (images, docs, video, audio), view metadata, and inspect individual objects.

```
https://www.pixeltable.com/dashboard/{org-slug}/{db-slug}/storage/home/browse
```
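
Both the `pxtfs://` URI and the dashboard URL are derived from the same two slugs, so they can be built programmatically. The helpers below are hypothetical (not part of the Pixeltable API) and simply fill in the two patterns shown above.

```python
def home_bucket_uri(org_slug: str, db_slug: str) -> str:
    """Build the home bucket URI from the pattern pxtfs://org-slug:db-slug/home."""
    return f'pxtfs://{org_slug}:{db_slug}/home'


def home_bucket_browse_url(org_slug: str, db_slug: str) -> str:
    """Build the dashboard URL for browsing the home bucket."""
    return (
        'https://www.pixeltable.com/dashboard/'
        f'{org_slug}/{db_slug}/storage/home/browse'
    )


print(home_bucket_uri('myorg', 'mydb'))         # pxtfs://myorg:mydb/home
print(home_bucket_browse_url('myorg', 'mydb'))
```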

### Amazon S3

<Tabs>
  <Tab title="URI Format">
    ```
    s3://bucket-name/optional/prefix/
    ```
  </Tab>

  <Tab title="Authentication">
    Uses the standard AWS credential chain:

    * Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
    * AWS credentials file (`~/.aws/credentials`)
    * IAM role (when running on AWS)

    Optionally specify a profile in `config.toml`:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [pixeltable]
    s3_profile = "my-aws-profile"
    ```
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    # With global config: output_media_dest = "s3://my-bucket/output/"
    t = pxt.create_table('app/images', {'photo': pxt.Image})

    # Or set destination per computed column
    t.add_computed_column(
        thumbnail=t.photo.resize((256, 256)),
        destination='s3://my-production-bucket/thumbnails/'
    )
    ```
  </Tab>
</Tabs>
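
When checking a destination against other tools (the AWS console, a CLI, a policy document), it helps to split the URI into its bucket and prefix. The helper below is a hypothetical sketch using only the standard library; it is not how Pixeltable parses destinations.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split s3://bucket-name/optional/prefix/ into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError(f'not an S3 URI: {uri!r}')
    # The bucket is the "host" part; the prefix is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip('/')


bucket, prefix = split_s3_uri('s3://my-production-bucket/thumbnails/')
print(bucket)  # my-production-bucket
print(prefix)  # thumbnails/
```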

### Google Cloud Storage

<Tabs>
  <Tab title="URI Format">
    ```
    gs://bucket-name/optional/prefix/
    ```
  </Tab>

  <Tab title="Authentication">
    Uses Google Cloud Application Default Credentials:

    * Service account key file (`GOOGLE_APPLICATION_CREDENTIALS`)
    * gcloud CLI authentication
    * GCE metadata service (when running on GCP)
  </Tab>

  <Tab title="Requirements">
    ```bash theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    pip install google-cloud-storage
    ```
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    # With global config: output_media_dest = "gs://my-gcs-bucket/output/"
    t = pxt.create_table('app/videos', {'video': pxt.Video})

    # Or set destination per computed column
    t.add_computed_column(
        frames=pxt.functions.video.frame_iterator(t.video, fps=1),
        destination='gs://my-gcs-bucket/frames/'
    )
    ```
  </Tab>
</Tabs>
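
Before debugging GCS access, it can help to check which file-based credential source is present on the machine. The sketch below is a hypothetical helper, not part of Pixeltable or the Google client library. It assumes the default location that `gcloud auth application-default login` writes to on Linux and macOS (`~/.config/gcloud/application_default_credentials.json`); on GCE, credentials may still come from the metadata service even when no file is found.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_key_file(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the credentials file ADC would likely use, or None if none is on disk."""
    # 1. Explicit service account key file
    key_path = env.get('GOOGLE_APPLICATION_CREDENTIALS')
    if key_path and Path(key_path).is_file():
        return Path(key_path)
    # 2. File created by the gcloud CLI (Linux/macOS location, an assumption here)
    gcloud_file = (
        (home or Path.home())
        / '.config' / 'gcloud' / 'application_default_credentials.json'
    )
    if gcloud_file.is_file():
        return gcloud_file
    # 3. Nothing on disk; the GCE metadata service may still supply credentials
    return None


print(find_adc_key_file())
```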

### Azure Blob Storage

<Tabs>
  <Tab title="URI Formats">
    Azure supports multiple URI schemes:

    ```
    wasbs://container@account.blob.core.windows.net/prefix/
    abfss://container@account.dfs.core.windows.net/prefix/
    ```
  </Tab>

  <Tab title="Authentication">
    Configure in `config.toml`:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [azure]
    storage_account_name = "myaccount"
    storage_account_key = "your-key-here"
    ```

    Or via environment variables:

    ```bash theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    export AZURE_STORAGE_ACCOUNT_NAME="myaccount"
    export AZURE_STORAGE_ACCOUNT_KEY="your-key-here"
    ```
  </Tab>

  <Tab title="Requirements">
    ```bash theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    pip install azure-storage-blob
    ```
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    # With global config: output_media_dest = "wasbs://mycontainer@myaccount.blob.core.windows.net/output/"
    t = pxt.create_table('app/docs', {'document': pxt.Document})

    # Or set destination per computed column
    t.add_computed_column(
        chunks=pxt.functions.document.document_splitter(t.document),
        destination='wasbs://mycontainer@myaccount.blob.core.windows.net/chunks/'
    )
    ```
  </Tab>
</Tabs>
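
Azure URIs pack the container and the storage account into the authority part (`container@account...`). The following hypothetical helper, built on the standard library only, shows how the pieces map; it is an illustration, not Pixeltable's own parser.

```python
from urllib.parse import urlparse

AZURE_SCHEMES = ('wasb', 'wasbs', 'abfs', 'abfss')


def split_azure_uri(uri: str) -> dict[str, str]:
    """Split scheme://container@account.<endpoint>/prefix/ into its parts."""
    parsed = urlparse(uri)
    if parsed.scheme not in AZURE_SCHEMES or not parsed.username or not parsed.hostname:
        raise ValueError(f'not an Azure storage URI: {uri!r}')
    return {
        'container': parsed.username,              # part before the "@"
        'account': parsed.hostname.split('.')[0],  # first label of the host name
        'prefix': parsed.path.lstrip('/'),
    }


print(split_azure_uri('wasbs://mycontainer@myaccount.blob.core.windows.net/chunks/'))
```

The `account` value should match the `storage_account_name` you configure for authentication.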

### Cloudflare R2

<Tabs>
  <Tab title="URI Format">
    ```
    https://account-id.r2.cloudflarestorage.com/bucket-name/prefix/
    ```
  </Tab>

  <Tab title="Authentication">
    Create an R2 API token and configure AWS-style credentials.

    In `~/.aws/credentials`:

    ```ini theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [r2]
    aws_access_key_id = your-r2-access-key
    aws_secret_access_key = your-r2-secret-key
    ```

    In `config.toml`:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [pixeltable]
    r2_profile = "r2"
    ```
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    t = pxt.create_table('app/images', {'image': pxt.Image})

    t.add_computed_column(
        rotated=t.image.rotate(90),
        destination='https://abc123.r2.cloudflarestorage.com/my-bucket/processed/'
    )
    ```
  </Tab>
</Tabs>
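
A common cause of authentication failures with S3-compatible providers is a profile name in `config.toml` that does not match a section in `~/.aws/credentials`. The sketch below uses the standard-library `configparser` to check that a named profile exists and has both keys. `profile_has_keys` is a hypothetical helper, and the sample text mirrors the placeholder credentials shown above.

```python
import configparser


def profile_has_keys(credentials_text: str, profile: str) -> bool:
    """Return True if the profile section exists and defines both AWS-style keys."""
    # interpolation=None so that "%" characters in secrets are read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return bool(section.get('aws_access_key_id')) and bool(section.get('aws_secret_access_key'))


sample = """
[r2]
aws_access_key_id = your-r2-access-key
aws_secret_access_key = your-r2-secret-key
"""

print(profile_has_keys(sample, 'r2'))      # True
print(profile_has_keys(sample, 'tigris'))  # False
```

To check your real file, pass the contents of `~/.aws/credentials` (for example via `pathlib.Path.home() / '.aws' / 'credentials'`) instead of the sample string.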

### Backblaze B2

<Tabs>
  <Tab title="URI Format">
    ```
    https://s3.region.backblazeb2.com/bucket-name/prefix/
    ```
  </Tab>

  <Tab title="Authentication">
    Create B2 application keys and configure AWS-style credentials.

    In `~/.aws/credentials`:

    ```ini theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [b2]
    aws_access_key_id = your-b2-key-id
    aws_secret_access_key = your-b2-application-key
    ```

    In `config.toml`:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [pixeltable]
    b2_profile = "b2"
    ```
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    t = pxt.create_table('app/audio', {'audio': pxt.Audio})

    t.add_computed_column(
        segments=pxt.functions.audio.audio_splitter(t.audio, duration=30),
        destination='https://s3.us-west-004.backblazeb2.com/my-bucket/segments/'
    )
    ```
  </Tab>
</Tabs>

### Tigris

<Tabs>
  <Tab title="URI Format">
    ```
    https://t3.storage.dev/bucket-name/prefix/
    ```
  </Tab>

  <Tab title="Authentication">
    Configure AWS-style credentials for Tigris.

    In `~/.aws/credentials`:

    ```ini theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [tigris]
    aws_access_key_id = your-tigris-access-key
    aws_secret_access_key = your-tigris-secret-key
    ```

    In `config.toml`:

    ```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    [pixeltable]
    tigris_profile = "tigris"
    ```
  </Tab>

  <Tab title="Example">
    ```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
    import pixeltable as pxt

    t = pxt.create_table('app/media', {'file': pxt.Image})

    t.add_computed_column(
        thumbnail=t.file.resize((128, 128)),
        destination='https://t3.storage.dev/my-bucket/thumbnails/'
    )
    ```
  </Tab>
</Tabs>
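
The three S3-compatible providers (Cloudflare R2, Backblaze B2, Tigris) share the same URL shape: an HTTPS endpoint, followed by the bucket name as the first path segment, followed by an optional prefix. The helper below is a hypothetical sketch that splits such a URL, which can be handy when verifying the endpoint and bucket separately; it is not Pixeltable's own parsing logic.

```python
from urllib.parse import urlparse


def split_s3_compatible_url(url: str) -> tuple[str, str, str]:
    """Split https://endpoint/bucket-name/prefix/ into (endpoint, bucket, prefix)."""
    parsed = urlparse(url)
    if parsed.scheme != 'https' or not parsed.netloc:
        raise ValueError(f'expected an https:// endpoint URL, got {url!r}')
    # The first path segment is the bucket; everything after it is the prefix.
    bucket, _, prefix = parsed.path.lstrip('/').partition('/')
    if not bucket:
        raise ValueError(f'no bucket name in {url!r}')
    return f'https://{parsed.netloc}', bucket, prefix


print(split_s3_compatible_url('https://t3.storage.dev/my-bucket/thumbnails/'))
# ('https://t3.storage.dev', 'my-bucket', 'thumbnails/')
```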

## Complete example

Here's a full example using S3 for both input and computed media.

First, configure your global destinations in `~/.pixeltable/config.toml`:

```toml theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
[pixeltable]
input_media_dest = "s3://my-app-bucket/uploads/"
output_media_dest = "s3://my-app-bucket/generated/"

s3_profile = "my-aws-profile"  # optional, uses default credentials if not set
```

Then create your table and add computed columns:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
import pixeltable as pxt
from pixeltable.functions import openai

# Create a table — input media automatically goes to input_media_dest
t = pxt.create_table('production/photos', {'photo': pxt.Image})

# Add a computed column for thumbnails
# Uses output_media_dest by default, or specify a custom destination
t.add_computed_column(
    thumbnail=t.photo.resize((256, 256)),
    destination='s3://my-app-bucket/thumbnails/'  # override default
)

# Add AI-generated descriptions (text output, so no media destination applies)
messages = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'Describe this image briefly.'},
            {'type': 'image_url', 'image_url': t.photo},
        ],
    }
]
t.add_computed_column(
    description=openai.chat_completions(messages, model='gpt-4o-mini')
)

# Insert data — Pixeltable handles all uploads automatically
t.insert([
    {'photo': 'https://example.com/image1.jpg'},
    {'photo': '/local/path/to/image2.png'},
])

# Query as usual — files are streamed/cached as needed
t.select(t.photo, t.thumbnail, t.description).collect()
```

## Best practices

<AccordionGroup>
  <Accordion title="Use prefixes to organize data">
    Structure your bucket with prefixes that reflect your application:

    ```
    s3://my-bucket/
      ├── production/
      │   ├── uploads/
      │   └── generated/
      └── staging/
          ├── uploads/
          └── generated/
    ```
  </Accordion>

  <Accordion title="Separate input and output destinations">
    Use different prefixes or buckets for input vs computed media:

    * Easier to set different retention policies
    * Clearer cost attribution
    * Simpler backup strategies
  </Accordion>

  <Accordion title="Configure lifecycle policies">
    Set up bucket lifecycle policies to automatically:

    * Transition old data to cheaper storage tiers
    * Delete temporary/staging data after a period

    Separately, consider enabling bucket versioning for critical data.
  </Accordion>

  <Accordion title="Use IAM roles in production">
    When running on cloud infrastructure, use IAM roles instead of access keys:

    * More secure (no key rotation needed)
    * Automatic credential refresh
    * Better audit trails
  </Accordion>
</AccordionGroup>
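
The prefix layout suggested above can be generated per environment rather than typed by hand. This is a minimal sketch; `media_destinations` is a hypothetical helper, and the `uploads/` and `generated/` prefixes follow the example tree, not a layout Pixeltable requires.

```python
def media_destinations(bucket: str, environment: str) -> dict[str, str]:
    """Return input/output destinations for one environment (e.g. production, staging)."""
    base = f's3://{bucket}/{environment}'
    return {
        'input_media_dest': f'{base}/uploads/',
        'output_media_dest': f'{base}/generated/',
    }


for env_name in ('production', 'staging'):
    print(env_name, media_destinations('my-bucket', env_name))
```

The returned keys match the `config.toml` setting names, so the values can be copied directly into the `[pixeltable]` section for each environment.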

## Troubleshooting

<AccordionGroup>
  <Accordion title="Access Denied errors">
    Verify your credentials have the necessary permissions:

    * `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`
    * `s3:ListBucket` for the bucket

    For GCS: `storage.objects.create`, `storage.objects.get`, `storage.objects.delete`
  </Accordion>

  <Accordion title="Bucket not found">
    * Ensure the bucket exists and the name is spelled correctly
    * Check the region matches your credential configuration
    * For S3-compatible providers, verify the endpoint URL is correct
  </Accordion>

  <Accordion title="Slow uploads">
    * Pixeltable uses connection pooling and parallel uploads automatically
    * Consider using a bucket in the same region as your compute
    * Check your network bandwidth and latency
  </Accordion>
</AccordionGroup>
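
For Access Denied errors on S3, the permissions listed above can be expressed as an IAM policy document. The sketch below builds one as a Python dictionary and prints it as JSON. The function name and bucket name are placeholders; object-level actions apply to `arn:aws:s3:::bucket/*`, while `s3:ListBucket` applies to the bucket ARN itself. Narrow the resource to a prefix if your setup allows it.

```python
import json


def s3_media_policy(bucket: str) -> dict:
    """Build a minimal IAM policy covering the S3 actions listed in this guide."""
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                # Object-level actions: read, write, and delete media files
                'Effect': 'Allow',
                'Action': ['s3:GetObject', 's3:PutObject', 's3:DeleteObject'],
                'Resource': f'arn:aws:s3:::{bucket}/*',
            },
            {
                # Bucket-level action: list the bucket contents
                'Effect': 'Allow',
                'Action': ['s3:ListBucket'],
                'Resource': f'arn:aws:s3:::{bucket}',
            },
        ],
    }


print(json.dumps(s3_media_policy('my-bucket'), indent=2))
```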

<Card title="Configuration Reference" icon="gear" href="/platform/configuration">
  See the complete list of storage configuration options including profiles for S3, R2, B2, Tigris, and Azure.
</Card>

<Note>
  Need help setting up cloud storage? Join our [Discord community](https://discord.com/invite/QPyqFYx2UN) for support.
</Note>
