Undo mistakes, audit changes, and create point-in-time snapshots of your
data.
Problem
You need to track what changed in your data pipeline, undo accidental
modifications, or preserve a specific state for reproducibility.
Solution
What’s in this recipe:
- View version history with history() and get_versions()
- Access specific versions with pxt.get_table('table:N')
- Undo changes with revert()
- Create point-in-time snapshots with pxt.create_snapshot()
Setup
%pip install -qU pixeltable
import pixeltable as pxt
pxt.drop_dir('version_demo', force=True)
pxt.create_dir('version_demo')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory 'version_demo'.
<pixeltable.catalog.dir.Dir at 0x13ce91c00>
Create a table and make some changes
Every data or schema change creates a new version.
# Create table (version 0)
products = pxt.create_table(
'version_demo.products',
{'name': pxt.String, 'price': pxt.Float, 'category': pxt.String}
)
Created table 'products'.
# Insert data (version 1)
products.insert([
{'name': 'Widget', 'price': 9.99, 'category': 'Tools'},
{'name': 'Gadget', 'price': 24.99, 'category': 'Electronics'},
{'name': 'Gizmo', 'price': 14.99, 'category': 'Electronics'},
])
Inserted 3 rows with 0 errors.
3 rows inserted, 6 values computed.
# Add a computed column (version 2 - schema change)
products.add_computed_column(price_with_tax=products.price * 1.08)
Added 3 column values with 0 errors.
3 rows updated, 6 values computed.
# Update some data (version 3)
products.update({'price': 19.99}, where=products.name == 'Widget')
1 row updated, 3 values computed.
# Insert more data (version 4)
products.insert([
{'name': 'Thingamajig', 'price': 49.99, 'category': 'Tools'},
])
Inserted 1 row with 0 errors.
1 row inserted, 3 values computed.
View version history
Use history() for a human-readable summary of all changes.
# View full history (most recent first)
products.history()
# View only the last 3 versions
products.history(n=3)
Use get_versions() to access version data programmatically.
# Get version metadata as a list of dictionaries
versions = products.get_versions()
# Access specific version info
latest = versions[0]
latest['version'], latest['change_type'], latest['inserts']
(4, 'data', 1)
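Because get_versions() returns plain dictionaries, ordinary Python is enough to filter or summarize the history. The following is a minimal sketch, not part of the Pixeltable API: versions_of_type and total_inserts are hypothetical helpers, the sample records are hand-written stand-ins that use only the version, change_type, and inserts keys shown above, and the 'schema' value for change_type is an assumption.

```python
from typing import Any


def versions_of_type(versions: list[dict[str, Any]], change_type: str) -> list[int]:
    """Return the version numbers whose 'change_type' matches, oldest first."""
    matching = [v['version'] for v in versions if v.get('change_type') == change_type]
    return sorted(matching)


def total_inserts(versions: list[dict[str, Any]]) -> int:
    """Sum the 'inserts' counts across all version records."""
    return sum(v.get('inserts') or 0 for v in versions)


# Hand-written, illustrative records shaped like products.get_versions()
# (most recent first). The 'schema' value is an assumption; only 'data'
# appears in the recipe output.
sample_versions = [
    {'version': 4, 'change_type': 'data', 'inserts': 1},
    {'version': 3, 'change_type': 'data', 'inserts': 0},
    {'version': 2, 'change_type': 'schema', 'inserts': 0},
    {'version': 1, 'change_type': 'data', 'inserts': 3},
    {'version': 0, 'change_type': 'schema', 'inserts': 0},
]

data_versions = versions_of_type(sample_versions, 'data')
rows_inserted = total_inserts(sample_versions)
```

With a real table, the same helpers could be called on the list returned by products.get_versions().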
Access a specific version
Use pxt.get_table('table_name:version') to get a read-only handle to a
specific version:
# Get the table at version 1 (after initial insert, before computed column)
products_v1 = pxt.get_table('version_demo.products:1')
# This is a read-only view of the data at that point in time
products_v1.collect()
# Compare data at version 2 (after computed column added) vs version 1
# Note: version 1 doesn't have the price_with_tax column yet
products_v2 = pxt.get_table('version_demo.products:2')
products_v2.collect()
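Once rows from two version handles are available as plain Python dictionaries, a small helper can report what changed between them. This is a sketch under stated assumptions: diff_rows is a hypothetical helper, the rows below are hand-written from the values inserted earlier in this recipe rather than fetched from Pixeltable, and converting a collect() result into a list of dicts is left to the reader. Columns that exist in only one of the two versions are ignored.

```python
from typing import Any


def diff_rows(
    old_rows: list[dict[str, Any]],
    new_rows: list[dict[str, Any]],
    key: str,
) -> dict[str, Any]:
    """Compare two row lists, matching rows on the `key` column."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}

    added = sorted(k for k in new_by_key if k not in old_by_key)
    removed = sorted(k for k in old_by_key if k not in new_by_key)

    changed: dict[Any, dict[str, tuple]] = {}
    for k in old_by_key.keys() & new_by_key.keys():
        old, new = old_by_key[k], new_by_key[k]
        # Only compare columns present in both versions.
        cols = {c: (old[c], new[c]) for c in old.keys() & new.keys() if old[c] != new[c]}
        if cols:
            changed[k] = cols

    return {'added': added, 'removed': removed, 'changed': changed}


# Hand-written rows mirroring version 1 and version 4 of this recipe.
rows_v1 = [
    {'name': 'Widget', 'price': 9.99, 'category': 'Tools'},
    {'name': 'Gadget', 'price': 24.99, 'category': 'Electronics'},
    {'name': 'Gizmo', 'price': 14.99, 'category': 'Electronics'},
]
rows_v4 = [
    {'name': 'Widget', 'price': 19.99, 'category': 'Tools'},
    {'name': 'Gadget', 'price': 24.99, 'category': 'Electronics'},
    {'name': 'Gizmo', 'price': 14.99, 'category': 'Electronics'},
    {'name': 'Thingamajig', 'price': 49.99, 'category': 'Tools'},
]

diff = diff_rows(rows_v1, rows_v4, key='name')
```

Matching on a key column rather than on row position keeps the comparison stable even if the two versions return rows in a different order.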
Revert to previous version
Use revert() to undo the most recent change. A revert is irreversible: the undone version is permanently removed.
# Current state: 4 products
products.count()
4
# Revert the last insert (removes Thingamajig)
products.revert()
products.count()
3
# History now shows version 4 was reverted
products.history()
# Can revert multiple times (back to before the update)
products.revert()
# Check the Widget price is back to original
products.where(products.name == 'Widget').select(products.name, products.price).collect()
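Repeated revert() calls can be wrapped in a helper that rolls back to a chosen version number. The sketch below is hypothetical and not part of Pixeltable: revert_to_version relies only on get_versions() returning the most recent version first (as used earlier) and on each revert() undoing exactly one version, and it assumes version numbers are consecutive. Because reverts are irreversible, every version after the target is discarded.

```python
def revert_to_version(table, target_version: int) -> int:
    """Revert `table` until `target_version` is its latest version.

    `table` is any object exposing get_versions() and revert(), such as
    the `products` table in this recipe. Returns the number of reverts.
    """
    if target_version < 0:
        raise ValueError('target_version must be 0 or greater')

    # get_versions() lists the most recent version first.
    current = table.get_versions()[0]['version']
    if target_version > current:
        raise ValueError(
            f'target_version {target_version} is newer than the current version {current}'
        )

    # Assumption: versions are consecutive, so one revert() per version.
    steps = current - target_version
    for _ in range(steps):
        table.revert()
    return steps
```

For example, revert_to_version(products, 1) would return the table to its state right after the initial insert, provided no snapshot references the versions being removed.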
Create point-in-time snapshots
Snapshots freeze a table's state for reproducibility. Unlike revert(),
which permanently removes the latest version, a snapshot preserves the
data as it was, regardless of later changes to the table.
# Create a snapshot of the current state
snapshot_v1 = pxt.create_snapshot('version_demo.products_v1', products)
snapshot_v1.collect()
# Now make changes to the original table
products.insert([
{'name': 'Doohickey', 'price': 99.99, 'category': 'Premium'},
])
products.update({'price': 29.99}, where=products.name == 'Gadget')
products.collect()
Inserted 1 row with 0 errors.
# Snapshot remains unchanged - still shows original data
snapshot_v1.collect()
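Since snapshots are named, a consistent naming scheme makes it easier to tell later which state each one holds. The helper below is a hypothetical convention, not something Pixeltable prescribes: snapshot_path builds a dotted path such as the 'version_demo.products_v1' used above, but with a UTC timestamp suffix. It assumes that digits and underscores are acceptable in snapshot names.

```python
from datetime import datetime, timezone


def snapshot_path(directory: str, table_name: str, when: datetime) -> str:
    """Build a snapshot path like 'version_demo.products_20240131_120000'.

    `when` is converted to UTC; a naive datetime is interpreted as local time.
    """
    stamp = when.astimezone(timezone.utc).strftime('%Y%m%d_%H%M%S')
    return f'{directory}.{table_name}_{stamp}'


# Example with a fixed timestamp so the result is predictable.
example_path = snapshot_path(
    'version_demo',
    'products',
    datetime(2024, 1, 31, 12, 0, 0, tzinfo=timezone.utc),
)
```

The resulting string could then be passed as the first argument to pxt.create_snapshot(), with the table as the second argument, in the same way as the call above.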
Explanation
What creates a new version:
- insert() - adding rows
- update() - modifying rows
- delete() - removing rows
- add_column() / add_computed_column() - schema changes
- drop_column() - schema changes
- rename_column() - schema changes
Version history methods:
- history() - Human-readable DataFrame showing all changes
- get_versions() - List of dictionaries for programmatic access
Accessing specific versions:
- pxt.get_table('table_name:N') - Get read-only handle to version N
- Useful for comparing data across versions, auditing changes, or recovering specific values
- Version handles are read-only; you cannot modify historical versions
Reverting:
- revert() undoes the most recent version
- Can call multiple times to go back further
- Cannot revert past version 0
- Cannot revert if a snapshot references that version
Snapshots vs revert:
- Snapshots are persistent, named, point-in-time copies
- revert() permanently removes the latest version
- Use snapshots when you need to preserve state for reproducibility
- Use revert() to undo mistakes