Problem
You have data stored in Parquet format, a common format for analytics, data lakes, and ML pipelines. You need to load this data for processing with AI models or combining with other data sources.
Solution
What’s in this recipe:
- Import Parquet files directly into tables
- Export tables to Parquet for external tools
- Handle schema type overrides
Use pxt.create_table() with a source parameter to create a table
from a Parquet file. Pixeltable infers column types from the Parquet
schema automatically.
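When an inferred type is not the one you want, a type override can be passed at import time. The following is a minimal sketch, assuming that create_table accepts a schema_overrides mapping of column name to Pixeltable type; the column name price and the table name products_typed are hypothetical and do not come from the original recipe.

```python
def import_with_overrides(parquet_path: str,
                          table_path: str = 'parquet_demo.products_typed'):
    """Import a Parquet file, overriding the inferred type of one column."""
    # Imported lazily so this sketch can be loaded without Pixeltable installed.
    import pixeltable as pxt

    # Assumption: `schema_overrides` maps column names to Pixeltable types.
    # `price` is a hypothetical column; columns not listed keep their inferred types.
    # The parent directory (`parquet_demo`) must already exist.
    return pxt.create_table(
        table_path,
        source=parquet_path,
        schema_overrides={'price': pxt.Float},
    )
```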
Setup
Create sample Parquet file
First, create a sample Parquet file to demonstrate the import process:
Import Parquet file
Use create_table with the source parameter to create a table
directly from the Parquet file:
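A minimal sketch of the two steps above: writing a sample file, then importing it. The sample rows and column names are hypothetical (the original data is not shown), and writing the file assumes pandas with a Parquet engine such as pyarrow is installed.

```python
from pathlib import Path

# Hypothetical sample data: five products.
SAMPLE_ROWS = [
    {'product_id': 1, 'name': 'Laptop', 'price': 999.99, 'quantity': 3},
    {'product_id': 2, 'name': 'Mouse', 'price': 19.99, 'quantity': 25},
    {'product_id': 3, 'name': 'Keyboard', 'price': 49.50, 'quantity': 12},
    {'product_id': 4, 'name': 'Monitor', 'price': 229.00, 'quantity': 7},
    {'product_id': 5, 'name': 'Webcam', 'price': 59.95, 'quantity': 9},
]


def write_sample_parquet(directory: str) -> Path:
    """Write SAMPLE_ROWS to <directory>/products.parquet and return the path."""
    import pandas as pd  # requires a Parquet engine such as pyarrow

    path = Path(directory) / 'products.parquet'
    pd.DataFrame(SAMPLE_ROWS).to_parquet(path, index=False)
    return path


def import_products(parquet_path: Path):
    """Create the table `parquet_demo.products` from a Parquet file."""
    # Imported lazily so this sketch can be loaded without Pixeltable installed.
    import pixeltable as pxt

    # Create the directory if needed; reuse it if it already exists.
    pxt.create_dir('parquet_demo', if_exists='ignore')
    # Column types are inferred from the Parquet schema.
    return pxt.create_table('parquet_demo.products', source=str(parquet_path))
```

Calling import_products(write_sample_parquet(tempfile.mkdtemp())) would run both steps; the output that follows comes from the original recipe's run.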
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory ‘parquet_demo’.
<pixeltable.catalog.dir.Dir at 0x17f0ca920>
Created table ‘products’.
Inserting rows into `products`: 0 rows [00:00, ? rows/s]
Inserting rows into `products`: 5 rows [00:00, 653.18 rows/s]
Inserted 5 rows with 0 errors.
Add computed columns
Once the data is imported, you can add computed columns as with any other Pixeltable table:
Added 5 column values with 0 errors.
5 rows updated, 10 values computed.
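The computed-column step above can be sketched as follows. The column names price, quantity, and total_value are hypothetical; the original recipe's expression is not shown.

```python
def add_total_value(products):
    """Add a computed column `total_value` = price * quantity to a table.

    `products` is expected to be a Pixeltable table handle whose columns
    `price` and `quantity` (hypothetical names) are numeric.
    """
    # The expression is evaluated for existing rows and for rows inserted later.
    return products.add_computed_column(
        total_value=products.price * products.quantity
    )
```

The function returns whatever add_computed_column returns, which in Pixeltable is a status object summarizing the update.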
Import with primary key
Specify a primary key when you need upsert behavior or unique constraints:
Created table ‘products_with_pk’.
Inserting rows into `products_with_pk`: 0 rows [00:00, ? rows/s]
Inserting rows into `products_with_pk`: 5 rows [00:00, 1548.97 rows/s]
Inserted 5 rows with 0 errors.
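A minimal sketch of the primary-key import, assuming the key is passed through the primary_key parameter of create_table. The key column product_id is hypothetical; the table name matches the output above.

```python
def import_with_primary_key(parquet_path: str, key_column: str = 'product_id'):
    """Create `parquet_demo.products_with_pk` with a primary key column."""
    # Imported lazily so this sketch can be loaded without Pixeltable installed.
    import pixeltable as pxt

    # `key_column` must name a column present in the Parquet file.
    # The parent directory (`parquet_demo`) must already exist.
    return pxt.create_table(
        'parquet_demo.products_with_pk',
        source=parquet_path,
        primary_key=key_column,
    )
```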
Export table to Parquet
Export your processed data back to Parquet for use with other tools.
Explanation
When to use Parquet import:
Key features:
- Automatic schema inference from Parquet metadata
- Support for partitioned datasets (directory of files)
- Export with pxt.io.export_parquet for interoperability
- Primary key support for upsert workflows
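The export feature listed above can be sketched as follows. This assumes pxt.io.export_parquet takes the table and a destination path as its first two arguments; the helper name and the destination path are hypothetical.

```python
def export_products(table, out_path: str) -> None:
    """Export a Pixeltable table to Parquet at `out_path`."""
    # Imported lazily so this sketch can be loaded without Pixeltable installed.
    import pixeltable as pxt

    # Assumption: the destination is given as the second positional argument.
    pxt.io.export_parquet(table, out_path)
```

The exported data can then be opened by other Parquet-aware tools.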
See also
- Import CSV files - For CSV and Excel imports
- Import JSON files - For JSON data