Undo mistakes, audit changes, and create point-in-time snapshots of your data.

Problem

You need to track what changed in your data pipeline, undo accidental modifications, or preserve a specific state for reproducibility.

Solution

What’s in this recipe:
  • View version history with history() and get_versions()
  • Access specific versions with pxt.get_table('table_name:N')
  • Undo changes with revert()
  • Create point-in-time snapshots with pxt.create_snapshot()

Setup

%pip install -qU pixeltable
import pixeltable as pxt

pxt.drop_dir('version_demo', force=True)
pxt.create_dir('version_demo')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory 'version_demo'.
<pixeltable.catalog.dir.Dir at 0x13ce91c00>

Create a table and make some changes

Every data or schema change creates a new version.
# Create table (version 0)
products = pxt.create_table(
    'version_demo.products',
    {'name': pxt.String, 'price': pxt.Float, 'category': pxt.String}
)
Created table 'products'.
# Insert data (version 1)
products.insert([
    {'name': 'Widget', 'price': 9.99, 'category': 'Tools'},
    {'name': 'Gadget', 'price': 24.99, 'category': 'Electronics'},
    {'name': 'Gizmo', 'price': 14.99, 'category': 'Electronics'},
])
Inserted 3 rows with 0 errors.
3 rows inserted, 6 values computed.
# Add a computed column (version 2 - schema change)
products.add_computed_column(price_with_tax=products.price * 1.08)
Added 3 column values with 0 errors.
3 rows updated, 6 values computed.
# Update some data (version 3)
products.update({'price': 19.99}, where=products.name == 'Widget')
1 row updated, 3 values computed.
# Insert more data (version 4)
products.insert([
    {'name': 'Thingamajig', 'price': 49.99, 'category': 'Tools'},
])
Inserted 1 row with 0 errors.
1 row inserted, 3 values computed.

View version history

Use history() for a human-readable summary of all changes.
# View full history (most recent first)
products.history()
# View only the last 3 versions
products.history(n=3)

Programmatic access to version metadata

Use get_versions() to access version data programmatically.
# Get version metadata as a list of dictionaries
versions = products.get_versions()

# Access specific version info
latest = versions[0]
latest['version'], latest['change_type'], latest['inserts']
(4, 'data', 1)
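Because get_versions() returns plain dictionaries, ordinary Python is enough to filter or summarize the history. The sketch below is a minimal illustration, not Pixeltable output: sample_versions is a hand-written stand-in that uses only the three keys shown above, the 'schema' label and the per-version insert counts are assumptions, and the helper names are hypothetical.

```python
# Hand-written stand-in for the list returned by products.get_versions().
# Only the keys used in this recipe appear; the 'schema' value and the
# insert counts for versions 2 and 3 are illustrative assumptions.
sample_versions = [
    {'version': 4, 'change_type': 'data', 'inserts': 1},
    {'version': 3, 'change_type': 'data', 'inserts': 0},
    {'version': 2, 'change_type': 'schema', 'inserts': 0},
    {'version': 1, 'change_type': 'data', 'inserts': 3},
]


def versions_of_type(versions, change_type):
    """Return the version numbers with the given change_type, oldest first."""
    return sorted(v['version'] for v in versions if v['change_type'] == change_type)


def total_inserts(versions):
    """Sum the 'inserts' counts across all listed versions."""
    return sum(v['inserts'] for v in versions)


data_versions = versions_of_type(sample_versions, 'data')
schema_versions = versions_of_type(sample_versions, 'schema')
inserted = total_inserts(sample_versions)
```

With a real table, passing products.get_versions() in place of sample_versions should work the same way, provided the dictionaries carry these keys.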

Access a specific version

Use pxt.get_table('table_name:version') to get a read-only handle to a specific version:
# Get the table at version 1 (after initial insert, before computed column)
products_v1 = pxt.get_table('version_demo.products:1')

# This is a read-only view of the data at that point in time
products_v1.collect()
# Compare data at version 2 (after computed column added) vs version 1
# Note: version 1 doesn't have the price_with_tax column yet
products_v2 = pxt.get_table('version_demo.products:2')
products_v2.collect()
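Once rows from two versions are in hand, a small diff makes the comparison explicit. The following is a sketch under stated assumptions: it works on plain lists of row dictionaries (converting the result of collect() into that shape is left out here), the diff_rows helper is hypothetical, and the sample rows are copied by hand from the inserts and update earlier in this recipe.

```python
def diff_rows(old_rows, new_rows, key):
    """Compare two lists of row dicts, matching rows on the `key` column."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}

    added = sorted(k for k in new_by_key if k not in old_by_key)
    removed = sorted(k for k in old_by_key if k not in new_by_key)

    changed = {}
    for k in old_by_key.keys() & new_by_key.keys():
        old, new = old_by_key[k], new_by_key[k]
        # Only compare columns present in both versions, since the
        # schema may differ between them.
        delta = {
            col: (old[col], new[col])
            for col in old.keys() & new.keys()
            if old[col] != new[col]
        }
        if delta:
            changed[k] = delta
    return added, removed, changed


# Rows as they stood at version 2 and version 4 of this recipe
# (hand-copied; price_with_tax omitted for brevity).
rows_v2 = [
    {'name': 'Widget', 'price': 9.99},
    {'name': 'Gadget', 'price': 24.99},
    {'name': 'Gizmo', 'price': 14.99},
]
rows_v4 = [
    {'name': 'Widget', 'price': 19.99},
    {'name': 'Gadget', 'price': 24.99},
    {'name': 'Gizmo', 'price': 14.99},
    {'name': 'Thingamajig', 'price': 49.99},
]

added, removed, changed = diff_rows(rows_v2, rows_v4, key='name')
```

Matching on a column such as name only works when its values are unique; a table without a natural key would need a different matching rule.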

Revert to previous version

Use revert() to undo the most recent change. A revert is permanent: the reverted version cannot be restored afterward.
# Current state: 4 products
products.count()
4
# Revert the last insert (removes Thingamajig)
products.revert()
products.count()
3
# History no longer lists version 4, because the revert removed it
products.history()
# Can revert multiple times (back to before the update)
products.revert()

# Check the Widget price is back to original
products.where(products.name == 'Widget').select(products.name, products.price).collect()
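Repeated calls to revert() can be wrapped in a helper that walks back to a chosen version. This is a minimal sketch, not part of Pixeltable: revert_to is a hypothetical helper, and FakeTable is a hypothetical in-memory stand-in that mimics only get_versions() (most recent first, as above) and revert(), so the example runs without a database.

```python
class FakeTable:
    """Hypothetical stand-in exposing only get_versions() and revert()."""

    def __init__(self, latest_version):
        self._latest = latest_version

    def get_versions(self):
        # Most recent version first, mirroring the ordering used above.
        return [{'version': v} for v in range(self._latest, -1, -1)]

    def revert(self):
        if self._latest == 0:
            raise RuntimeError('cannot revert past version 0')
        self._latest -= 1


def revert_to(table, target_version):
    """Call revert() until `target_version` is the latest; return the step count."""
    current = table.get_versions()[0]['version']
    if not 0 <= target_version <= current:
        raise ValueError(
            f'target version {target_version} is outside 0..{current}'
        )
    steps = current - target_version
    for _ in range(steps):
        table.revert()
    return steps


demo = FakeTable(latest_version=4)
steps_taken = revert_to(demo, 2)
latest_after = demo.get_versions()[0]['version']
```

Checking the target up front avoids a partial rollback that stops halfway with an error. With a real table, a revert can still fail for other reasons, for example when a snapshot references the version being removed.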

Create point-in-time snapshots

Snapshots freeze a table’s state for reproducibility. Whereas revert() discards the latest version, a snapshot keeps its data unchanged no matter how the source table is modified afterward.
# Create a snapshot of the current state
snapshot_v1 = pxt.create_snapshot('version_demo.products_v1', products)

snapshot_v1.collect()
# Now make changes to the original table
products.insert([
    {'name': 'Doohickey', 'price': 99.99, 'category': 'Premium'},
])
products.update({'price': 29.99}, where=products.name == 'Gadget')

products.collect()
Inserted 1 row with 0 errors.
# Snapshot remains unchanged - still shows original data
snapshot_v1.collect()

Explanation

What creates a new version:
  • insert() - adding rows
  • update() - modifying rows
  • delete() - removing rows
  • add_column() / add_computed_column() - schema changes
  • drop_column() - schema changes
  • rename_column() - schema changes
Version history methods:
  • history() - Human-readable DataFrame showing all changes
  • get_versions() - List of dictionaries for programmatic access
Accessing specific versions:
  • pxt.get_table('table_name:N') - Get read-only handle to version N
  • Useful for comparing data across versions, auditing changes, or recovering specific values
  • Version handles are read-only; you cannot modify historical versions
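The 'table_name:N' form is a plain string, so it can be built or taken apart with ordinary string handling. The two helpers below are hypothetical conveniences, not Pixeltable functions; they only illustrate the path format described above.

```python
def versioned_path(table_path, version):
    """Build a 'table_name:N' string for use with pxt.get_table()."""
    if version < 0:
        raise ValueError('version must be non-negative')
    return f'{table_path}:{version}'


def split_versioned_path(path):
    """Split 'table_name:N' into (table_name, N); N is None if no version is given."""
    base, sep, suffix = path.rpartition(':')
    if not sep:
        return path, None
    return base, int(suffix)


path_v1 = versioned_path('version_demo.products', 1)
parsed = split_versioned_path(path_v1)
unversioned = split_versioned_path('version_demo.products')
```

A string built this way can be passed straight to pxt.get_table(), for instance when looping over several version numbers to audit a change.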
Reverting:
  • revert() undoes the most recent version
  • Can call multiple times to go back further
  • Cannot revert past version 0
  • Cannot revert if a snapshot references that version
Snapshots vs revert:
  • Snapshots are persistent, named, point-in-time copies
  • revert() permanently removes the latest version
  • Use snapshots when you need to preserve state for reproducibility
  • Use revert() to undo mistakes

See also