Send your Pixeltable data to PostgreSQL, SQLite, MySQL, TigerData, or
Snowflake for use in external applications.
What’s in this recipe:
- Export entire tables or filtered queries to any SQL database
- Select specific columns for export
- Handle existing tables with replace or append options
- Connect to cloud PostgreSQL services (e.g. TigerData)
Problem
You have processed data in your pipeline—cleaned text, generated
embeddings, extracted metadata—and need to send it to a SQL database for
use by other applications or teams.
Solution
You use export_sql() to export tables or queries to any SQL database
via database connection strings. The function automatically maps
Pixeltable types to appropriate SQL types for each database dialect.
Setup
%pip install -qU pixeltable snowflake-sqlalchemy
import tempfile
from pathlib import Path
import pixeltable as pxt
from pixeltable.io.sql import export_sql
Create sample data
# Create a fresh directory
pxt.drop_dir('sql_export_demo', force=True)
pxt.create_dir('sql_export_demo')
Created directory 'sql_export_demo'.
<pixeltable.catalog.dir.Dir at 0x146d66c10>
# Create a table with product data
products = pxt.create_table(
    'sql_export_demo.products',
    {
        'name': pxt.String,
        'price': pxt.Float,
        'in_stock': pxt.Bool,
        'metadata': pxt.Json,
    },
)
Created table 'products'.
# Insert sample products
products.insert([
    {'name': 'Wireless Mouse', 'price': 29.99, 'in_stock': True, 'metadata': {'category': 'electronics', 'rating': 4.5}},
    {'name': 'USB-C Hub', 'price': 49.99, 'in_stock': False, 'metadata': {'category': 'electronics', 'rating': 4.2}},
    {'name': 'Mechanical Keyboard', 'price': 89.99, 'in_stock': True, 'metadata': {'category': 'electronics', 'rating': 4.8}},
    {'name': 'Monitor Stand', 'price': 39.99, 'in_stock': True, 'metadata': {'category': 'accessories', 'rating': 4.0}},
    {'name': 'Webcam', 'price': 59.99, 'in_stock': False, 'metadata': {'category': 'electronics', 'rating': 3.9}},
])
Inserted 5 rows with 0 errors in 0.01 s (566.35 rows/s)
5 rows inserted.
# View the data
products.collect()
Export an entire table
You pass a table and a SQLAlchemy connection string to export all rows
and columns.
# Create a SQLite database for this demo
db_path = Path(tempfile.mkdtemp()) / 'products.db'
connection_string = f'sqlite:///{db_path}'
# Export the full table
export_sql(products, 'products', db_connect_str=connection_string)
# Verify the export with SQLAlchemy
import sqlalchemy as sql
engine = sql.create_engine(connection_string)
with engine.connect() as conn:
    result = conn.execute(sql.text('SELECT * FROM products')).fetchall()
result
[('Wireless Mouse', 29.99, 1, '{"rating": 4.5, "category": "electronics"}'),
 ('USB-C Hub', 49.99, 0, '{"rating": 4.2, "category": "electronics"}'),
 ('Mechanical Keyboard', 89.99, 1, '{"rating": 4.8, "category": "electronics"}'),
 ('Monitor Stand', 39.99, 1, '{"rating": 4.0, "category": "accessories"}'),
 ('Webcam', 59.99, 0, '{"rating": 3.9, "category": "electronics"}')]
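SQLite has no native boolean or JSON types, which is why in_stock comes back as 0/1 and metadata as a JSON string. A minimal stdlib sketch of decoding such a row (the table here is built by hand to mirror the exported schema, so it runs without Pixeltable):

```python
import json
import sqlite3

# Throwaway in-memory table shaped like the exported 'products' table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE products (name TEXT, price REAL, in_stock INTEGER, metadata TEXT)')
conn.execute(
    'INSERT INTO products VALUES (?, ?, ?, ?)',
    ('Wireless Mouse', 29.99, 1, '{"rating": 4.5, "category": "electronics"}'),
)

name, price, in_stock, metadata = conn.execute('SELECT * FROM products').fetchone()
row = {
    'name': name,
    'price': price,
    'in_stock': bool(in_stock),        # 0/1 -> False/True
    'metadata': json.loads(metadata),  # JSON string -> dict
}
print(row)
```

Databases with native types (PostgreSQL, Snowflake) return booleans and JSON objects directly, as the cloud examples below show.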
Export a filtered query
You can export any query result—filter rows, select specific columns, or
apply transformations before export.
# Export only in-stock products
export_sql(
    products.where(products.in_stock == True),
    'in_stock_products',
    db_connect_str=connection_string,
)
# Verify filtered export
with engine.connect() as conn:
    result = conn.execute(sql.text('SELECT name, price FROM in_stock_products')).fetchall()
result
[('Wireless Mouse', 29.99),
 ('Mechanical Keyboard', 89.99),
 ('Monitor Stand', 39.99)]
Export specific columns
You select only the columns you need before exporting. You can also
rename columns in the output.
# Export only name and price columns
export_sql(
    products.select(products.name, products.price),
    'price_list',
    db_connect_str=connection_string,
)
# Export with renamed columns
export_sql(
    products.select(product_name=products.name, unit_price=products.price),
    'renamed_columns',
    db_connect_str=connection_string,
)
# Verify column selection
inspector = sql.inspect(engine)
columns = [col['name'] for col in inspector.get_columns('price_list')]
columns
['name', 'price']
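As an alternative to the SQLAlchemy inspector, SQLite can report a table's columns directly with PRAGMA table_info. A self-contained sketch against a throwaway in-memory table shaped like price_list:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE price_list (name TEXT, price REAL)')

# PRAGMA table_info yields one row per column: (cid, name, type, notnull, dflt_value, pk)
columns = [row[1] for row in conn.execute('PRAGMA table_info(price_list)')]
print(columns)  # ['name', 'price']
```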
Handle existing tables
You control what happens when the target table already exists with the
if_exists parameter: 'insert' appends the exported rows to the existing
table, while 'replace' drops the table and recreates it from the export.
# Append new data to existing table
export_sql(
    products.where(products.price > 50),
    'products',
    db_connect_str=connection_string,
    if_exists='insert',
)
# Check row count after insert
with engine.connect() as conn:
    result = conn.execute(sql.text('SELECT COUNT(*) FROM products')).fetchone()
f'Total rows after insert: {result[0]}'
'Total rows after insert: 7'
# Replace with fresh data
export_sql(
    products.select(products.name, products.price),
    'products',
    db_connect_str=connection_string,
    if_exists='replace',
)
# Check that table was replaced
inspector = sql.inspect(engine)
columns = [col['name'] for col in inspector.get_columns('products')]
with engine.connect() as conn:
    row_count = conn.execute(sql.text('SELECT COUNT(*) FROM products')).fetchone()[0]
f'Columns: {columns}, Row count: {row_count}'
"Columns: ['name', 'price'], Row count: 5"
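Conceptually, if_exists='insert' appends rows with INSERT INTO, while if_exists='replace' drops the table and recreates it. The same semantics in plain SQL, sketched with the stdlib sqlite3 module (independent of export_sql):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE products (name TEXT, price REAL)')
conn.executemany('INSERT INTO products VALUES (?, ?)',
                 [('Wireless Mouse', 29.99), ('Webcam', 59.99)])

# 'insert' semantics: keep the table, append rows
conn.execute('INSERT INTO products VALUES (?, ?)', ('USB-C Hub', 49.99))
count_after_insert = conn.execute('SELECT COUNT(*) FROM products').fetchone()[0]

# 'replace' semantics: drop the table and rebuild it from scratch
conn.execute('DROP TABLE products')
conn.execute('CREATE TABLE products (name TEXT, price REAL)')
conn.execute('INSERT INTO products VALUES (?, ?)', ('Monitor Stand', 39.99))
count_after_replace = conn.execute('SELECT COUNT(*) FROM products').fetchone()[0]

print(count_after_insert, count_after_replace)  # 3 1
```

Note that replace also discards the old schema, which is why the row-count check above shows only the name and price columns afterwards.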
Export to cloud PostgreSQL (TigerData)
You can export directly to cloud-hosted PostgreSQL databases like
TigerData (Timescale Cloud). Get your
credentials from the TigerData dashboard after creating a service.
import getpass
import os
# Skip interactive sections in CI environments
SKIP_CLOUD_TESTS = os.environ.get('CI') or os.environ.get('GITHUB_ACTIONS')
if not SKIP_CLOUD_TESTS:
    # Enter your TigerData credentials interactively
    tigerdata_host = input('TigerData host (e.g., abc123.tsdb.cloud.timescale.com): ')
    tigerdata_port = input('TigerData port (e.g., 38963): ')
    tigerdata_user = input('TigerData username (e.g., tsdbadmin): ')
    tigerdata_password = getpass.getpass('TigerData password: ')
    tigerdata_dbname = input('TigerData database name (e.g., tsdb): ')
    # Build the connection string (use postgresql+psycopg:// for SQLAlchemy compatibility)
    tigerdata_connection = f'postgresql+psycopg://{tigerdata_user}:{tigerdata_password}@{tigerdata_host}:{tigerdata_port}/{tigerdata_dbname}?sslmode=require'
else:
    print('Skipping TigerData section (running in CI)')
if not SKIP_CLOUD_TESTS:
    # Export to TigerData
    export_sql(products, 'pixeltable_products', db_connect_str=tigerdata_connection, if_exists='replace')
if not SKIP_CLOUD_TESTS:
    # Verify the export in TigerData
    tigerdata_engine = sql.create_engine(tigerdata_connection)
    with tigerdata_engine.connect() as conn:
        result = conn.execute(sql.text('SELECT * FROM pixeltable_products')).fetchall()
    result
[('Wireless Mouse', 29.99, True, {'rating': 4.5, 'category': 'electronics'}),
 ('USB-C Hub', 49.99, False, {'rating': 4.2, 'category': 'electronics'}),
 ('Mechanical Keyboard', 89.99, True, {'rating': 4.8, 'category': 'electronics'}),
 ('Monitor Stand', 39.99, True, {'rating': 4.0, 'category': 'accessories'}),
 ('Webcam', 59.99, False, {'rating': 3.9, 'category': 'electronics'})]
Export to Snowflake
You can export directly to Snowflake data
warehouses. Get your account identifier from the Snowflake web interface
under Admin → Accounts.
if not SKIP_CLOUD_TESTS:
    # Enter your Snowflake credentials interactively
    snowflake_account = input('Snowflake account identifier (e.g., WEZMMGC-AIB20064): ')
    snowflake_user = input('Snowflake username: ')
    snowflake_password = getpass.getpass('Snowflake password: ')
    snowflake_warehouse = input('Snowflake warehouse (e.g., COMPUTE_WH): ')
    snowflake_database = input('Snowflake database: ')
    snowflake_schema = input('Snowflake schema (e.g., PUBLIC): ')
    # Build the connection string
    snowflake_connection = f'snowflake://{snowflake_user}:{snowflake_password}@{snowflake_account}/{snowflake_database}/{snowflake_schema}?warehouse={snowflake_warehouse}'
else:
    print('Skipping Snowflake section (running in CI)')
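One pitfall when building connection strings by hand: passwords containing URL-reserved characters such as @, :, or / must be URL-encoded before being embedded, or SQLAlchemy will mis-parse the URL. A quick stdlib sketch (the credentials below are made up):

```python
from urllib.parse import quote_plus

# Hypothetical password containing URL-reserved characters
password = 'p@ss:word/123'
safe_password = quote_plus(password)
print(safe_password)  # p%40ss%3Aword%2F123

# Embed the encoded form in the connection string
connection = f'snowflake://my_user:{safe_password}@MY-ACCOUNT/MY_DB/PUBLIC?warehouse=COMPUTE_WH'
```

The same applies to the PostgreSQL/TigerData connection string above.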
if not SKIP_CLOUD_TESTS:
    # Export to Snowflake (without JSON column)
    export_sql(
        products.select(products.name, products.price, products.in_stock),
        'PIXELTABLE_PRODUCTS',
        db_connect_str=snowflake_connection,
        if_exists='replace',
    )
if not SKIP_CLOUD_TESTS:
    # Verify the export in Snowflake
    snowflake_engine = sql.create_engine(snowflake_connection)
    with snowflake_engine.connect() as conn:
        result = conn.execute(sql.text('SELECT * FROM PIXELTABLE_PRODUCTS')).fetchall()
    result
[('Wireless Mouse', 29.99, True, None),
 ('USB-C Hub', 49.99, False, None),
 ('Mechanical Keyboard', 89.99, True, None),
 ('Monitor Stand', 39.99, True, None),
 ('Webcam', 59.99, False, None)]
For tables containing media types (pxt.Image, pxt.Video, pxt.Audio), you have two options:
- Extract metadata before export - Select only the columns you need (paths, embeddings, extracted text, etc.) and export those to SQL.
- Use Pixeltable destinations - For syncing media files to cloud storage, use Pixeltable's built-in destination support with providers like Tigris.
Example: Export image metadata to SQL
# Create a table with images
images = pxt.create_table(
    'sql_export_demo.images',
    {'image': pxt.Image, 'label': pxt.String}
)
# Add computed columns for metadata
images.add_computed_column(width=images.image.width)
images.add_computed_column(height=images.image.height)
images.add_computed_column(mode=images.image.mode)
Created table 'images'.
Added 0 column values with 0 errors in 0.01 s
Added 0 column values with 0 errors in 0.01 s
Added 0 column values with 0 errors in 0.01 s
No rows affected.
# Insert sample images
base_url = 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images'
images.insert([
    {'image': f'{base_url}/000000000036.jpg', 'label': 'cat'},
    {'image': f'{base_url}/000000000090.jpg', 'label': 'scene'},
])
Inserted 2 rows with 0 errors in 0.03 s (63.85 rows/s)
2 rows inserted.
# Export metadata (not the image itself) to SQL
export_sql(
    images.select(images.label, images.width, images.height, images.mode),
    'image_metadata',
    db_connect_str=connection_string,  # or tigerdata_connection for cloud
)
# Verify the metadata export
with engine.connect() as conn:
    result = conn.execute(sql.text('SELECT * FROM image_metadata')).fetchall()
result
[('cat', 481, 640, 'RGB'), ('scene', 640, 429, 'RGB')]
Explanation
Connection strings:
The function uses SQLAlchemy connection strings. Common formats:
- SQLite: sqlite:///path/to/database.db
- PostgreSQL: postgresql+psycopg://user:password@host:port/dbname
- MySQL: mysql+pymysql://user:password@host:port/dbname
- Snowflake: snowflake://user:password@account/database/schema?warehouse=warehouse_name
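All of these are SQLAlchemy URLs with the shape dialect+driver://user:password@host:port/database. A stdlib sketch that parses the kinds of strings used in this recipe (the credentials and host are placeholders taken from the examples above):

```python
from urllib.parse import urlsplit

examples = [
    'sqlite:///products.db',
    'postgresql+psycopg://tsdbadmin:secret@abc123.tsdb.cloud.timescale.com:38963/tsdb?sslmode=require',
]

for conn_str in examples:
    # urlsplit handles compound schemes like 'postgresql+psycopg'
    parts = urlsplit(conn_str)
    print(parts.scheme, parts.hostname, parts.port, parts.path)
```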
Type mapping:
Pixeltable types map to SQL types automatically, using each dialect's closest equivalent: for example, pxt.String maps to a text type, pxt.Float to a float type, pxt.Bool to BOOLEAN (stored as 0/1 in SQLite), and pxt.Json to the dialect's JSON type where available (plain text in SQLite).
Unsupported types:
Media types like pxt.Image, pxt.Video, and pxt.Audio cannot be
exported directly. Extract the data you need (paths, embeddings,
metadata) before export.
See also