Tables
Learn the fundamentals of Pixeltable tables and types, and how to build with them.
What are Tables?
Tables are the fundamental data storage units in Pixeltable. They function similarly to SQL database tables but with enhanced capabilities designed specifically for AI and ML workflows. Each table consists of columns with defined data types and can store both structured data and unstructured media assets.
In Pixeltable, tables:
- Persist across sessions, meaning your data remains available even after restarting your environment
- Maintain strong typing for data consistency
- Support operations like filtering, querying, and transformation
- Can handle specialized data types for machine learning and media processing
- Can be grouped into directories (namespaces) for organization
Creating a table requires defining a name and schema that describes its structure:
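As a minimal sketch of that step, the function below creates a directory and a table with a three-column schema. The directory name `tables_demo`, the table name `films`, and the column names are hypothetical, and the calls assume a recent Pixeltable release.

```python
def create_films_table():
    # Imported inside the function so this module still loads where
    # Pixeltable is not installed.
    import pixeltable as pxt

    # Start from a clean, hypothetical directory (namespace).
    pxt.drop_dir('tables_demo', force=True)
    pxt.create_dir('tables_demo')

    # The schema maps each column name to a Pixeltable type.
    schema = {'title': pxt.String, 'year': pxt.Int, 'revenue': pxt.Float}
    return pxt.create_table('tables_demo.films', schema)
```

Calling `create_films_table()` returns a table handle that the later operations (insert, query, update) work against.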
Type System
Column Casting
Pixeltable allows you to explicitly cast column values to ensure they conform to the expected type. This is particularly useful when working with computed columns or transforming data from external sources.
Column casting helps maintain data consistency and prevents type errors when processing your data.
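One way to express a cast is `astype()` on a column expression. The sketch below casts an integer column to a float inside a query; the `cast_demo` directory and the column names are hypothetical, and which conversions are supported may depend on your Pixeltable version.

```python
def cast_year_to_float():
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('cast_demo', force=True)
    pxt.create_dir('cast_demo')
    t = pxt.create_table('cast_demo.films', {'title': pxt.String, 'year': pxt.Int})
    t.insert([{'title': 'Inception', 'year': 2010}])

    # astype() wraps the column expression in an explicit cast to the target type;
    # the keyword argument names the resulting output column.
    return t.select(t.title, year_as_float=t.year.astype(pxt.Float)).collect()
```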
Data Operations
Query Operations
Filter and retrieve data:
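A minimal sketch of a filtered query, assuming a hypothetical `query_demo.films` table with `title` and `year` columns. It chains `where()`, `select()`, and `order_by()`, and runs the query with `collect()`.

```python
# Sample rows used to populate the hypothetical table.
ROWS = [
    {'title': 'Inception', 'year': 2010},
    {'title': 'Jurassic Park', 'year': 1993},
    {'title': 'Titanic', 'year': 1997},
]

def recent_titles(min_year=1995):
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('query_demo', force=True)
    pxt.create_dir('query_demo')
    t = pxt.create_table('query_demo.films', {'title': pxt.String, 'year': pxt.Int})
    t.insert(ROWS)

    # where() filters rows, select() picks columns, order_by() sorts,
    # and collect() executes the query and returns the result set.
    query = t.where(t.year >= min_year).select(t.title, t.year)
    return query.order_by(t.year, asc=False).collect()
```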
String Operations
Manipulate text data:
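String columns expose text functions as methods on the column expression. The sketch below assumes that `contains()`, `upper()`, and `len()` are available on string columns in your Pixeltable version; the `string_demo` directory and the column names are hypothetical.

```python
TITLES = ['Inception', 'Jurassic Park', 'Titanic']

def park_titles():
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('string_demo', force=True)
    pxt.create_dir('string_demo')
    t = pxt.create_table('string_demo.films', {'title': pxt.String})
    t.insert([{'title': title} for title in TITLES])

    # Keep rows whose title contains 'Park', then derive two text features.
    return (
        t.where(t.title.contains('Park'))
        .select(t.title, upper=t.title.upper(), length=t.title.len())
        .collect()
    )
```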
Insert Operations
Add new data:
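A short sketch of the two common insert forms, assuming a hypothetical `insert_demo.films` table: keyword arguments for a single row, and a list of dictionaries for several rows at once.

```python
MORE_ROWS = [
    {'title': 'Jurassic Park', 'year': 1993},
    {'title': 'Titanic', 'year': 1997},
]

def insert_rows():
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('insert_demo', force=True)
    pxt.create_dir('insert_demo')
    t = pxt.create_table('insert_demo.films', {'title': pxt.String, 'year': pxt.Int})

    # A single row can be passed as keyword arguments.
    t.insert(title='Inception', year=2010)
    # Several rows can be passed together as a list of dicts.
    t.insert(MORE_ROWS)
    return t.count()
```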
Import from Source
Create tables or insert data directly from external sources:
Pixeltable supports importing from various data sources:
- CSV files (.csv)
- Excel files (.xls, .xlsx)
- Parquet files (.parquet, .pq, .parq)
- JSON files (.json)
- Pandas DataFrames
- Pixeltable DataFrames
- Hugging Face datasets
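As an illustration of the CSV case, the sketch below writes a small file with the standard library and then creates a table from it. It assumes a Pixeltable release in which `create_table()` accepts a `source` argument (older releases exposed `pxt.io.import_csv()` instead); the file name, directory name, and columns are hypothetical.

```python
import csv
import os
import tempfile

ROWS = [
    {'title': 'Inception', 'year': 2010},
    {'title': 'Titanic', 'year': 1997},
]

def write_sample_csv(directory=None):
    # Write the sample rows to films.csv and return the file path.
    directory = directory or tempfile.mkdtemp()
    path = os.path.join(directory, 'films.csv')
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['title', 'year'])
        writer.writeheader()
        writer.writerows(ROWS)
    return path

def import_csv_table(csv_path):
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('import_demo', force=True)
    pxt.create_dir('import_demo')
    # Assumption: the schema is inferred from the CSV header and values.
    return pxt.create_table('import_demo.films', source=csv_path)
```

A typical call would be `import_csv_table(write_sample_csv())`.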
Update Operations
Modify existing data:
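A minimal sketch of a conditional update, assuming a hypothetical `update_demo.films` table with a `views` counter column. `update()` takes a dictionary of new column values and an optional `where` condition.

```python
ROWS = [
    {'title': 'Inception', 'views': 0},
    {'title': 'Titanic', 'views': 0},
]

def bump_views(title='Inception'):
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('update_demo', force=True)
    pxt.create_dir('update_demo')
    t = pxt.create_table('update_demo.films', {'title': pxt.String, 'views': pxt.Int})
    t.insert(ROWS)

    # The new value may be an expression over existing columns;
    # only rows matching the where condition are changed.
    t.update({'views': t.views + 1}, where=t.title == title)
    return t.select(t.title, t.views).collect()
```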
Delete Operations
Remove data with conditions:
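A short sketch of a conditional delete, assuming a hypothetical `delete_demo.films` table. The `where` argument restricts which rows are removed.

```python
ROWS = [
    {'title': 'Inception', 'year': 2010},
    {'title': 'Jurassic Park', 'year': 1993},
    {'title': 'Titanic', 'year': 1997},
]

def delete_older_than(cutoff_year=1995):
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('delete_demo', force=True)
    pxt.create_dir('delete_demo')
    t = pxt.create_table('delete_demo.films', {'title': pxt.String, 'year': pxt.Int})
    t.insert(ROWS)

    # Remove only the rows released before the cutoff year.
    t.delete(where=t.year < cutoff_year)
    return t.count()
```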
Column Operations
Manage table structure:
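The sketch below walks through adding, renaming, and dropping columns on a hypothetical `columns_demo.films` table. It assumes a recent release that provides `add_computed_column()`; the column names are illustrative.

```python
def reshape_table():
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('columns_demo', force=True)
    pxt.create_dir('columns_demo')
    t = pxt.create_table('columns_demo.films', {'title': pxt.String, 'year': pxt.Int})
    t.insert([{'title': 'Inception', 'year': 2010}])

    # Add a stored column; existing rows have no value for it yet.
    t.add_column(rating=pxt.Float)
    # Add a computed column whose value is derived from another column.
    t.add_computed_column(years_since_1990=t.year - 1990)
    # Rename, then drop, the stored column.
    t.rename_column('rating', 'score')
    t.drop_column('score')
    return t
```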
Versioning
Manage table versions:
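As a sketch of undoing a change, the function below deletes some rows and then calls `revert()` to roll the table back to its previous version. The `version_demo` directory and the sample rows are hypothetical.

```python
ROWS = [
    {'title': 'Inception', 'year': 2010},
    {'title': 'Jurassic Park', 'year': 1993},
    {'title': 'Titanic', 'year': 1997},
]

def delete_then_revert():
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('version_demo', force=True)
    pxt.create_dir('version_demo')
    t = pxt.create_table('version_demo.films', {'title': pxt.String, 'year': pxt.Int})
    t.insert(ROWS)

    # This delete creates a new table version.
    t.delete(where=t.year < 2000)
    # revert() undoes the most recent operation on the table.
    t.revert()
    return t.count()
```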
Export Operations
Extract data for analysis:
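One common route is to collect a query result and convert it to a pandas DataFrame. The sketch assumes that the result set offers `to_pandas()` and that pandas is installed; the `export_demo` directory is hypothetical.

```python
ROWS = [
    {'title': 'Inception', 'year': 2010},
    {'title': 'Titanic', 'year': 1997},
]

def films_as_dataframe():
    # Imported lazily so the module loads without Pixeltable installed.
    import pixeltable as pxt

    pxt.drop_dir('export_demo', force=True)
    pxt.create_dir('export_demo')
    t = pxt.create_table('export_demo.films', {'title': pxt.String, 'year': pxt.Int})
    t.insert(ROWS)

    # collect() runs the query; to_pandas() converts the result set
    # into a pandas DataFrame for further analysis.
    return t.select(t.title, t.year).collect().to_pandas()
```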
Join Tables
Combine data from multiple tables using different join types.
Inner Join
Returns only matching records from both tables.
Left Outer Join
Returns all records from the left table and matching records from the right table.
Right Outer Join
Returns all records from the right table and matching records from the left table.
Cross Join
Returns all possible combinations of records from both tables.
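To make the four join types concrete, here is a plain-Python model of their semantics on two tiny in-memory lists. It is not Pixeltable's implementation or API; the `CUSTOMERS` and `ORDERS` data and all function names are hypothetical.

```python
from itertools import product

CUSTOMERS = [{'id': 1, 'name': 'Ada'}, {'id': 2, 'name': 'Grace'}]
ORDERS = [{'customer_id': 1, 'item': 'book'}, {'customer_id': 3, 'item': 'pen'}]

def matches(customer, order):
    # Join condition: the order belongs to the customer.
    return customer['id'] == order['customer_id']

def inner_join(left, right):
    # Only pairs that satisfy the join condition.
    return [(l, r) for l, r in product(left, right) if matches(l, r)]

def left_outer_join(left, right):
    rows = []
    for l in left:
        matched = [r for r in right if matches(l, r)]
        # Keep unmatched left rows, padding the right side with None.
        rows.extend((l, r) for r in (matched or [None]))
    return rows

def right_outer_join(left, right):
    rows = []
    for r in right:
        matched = [l for l in left if matches(l, r)]
        # Keep unmatched right rows, padding the left side with None.
        rows.extend((l, r) for l in (matched or [None]))
    return rows

def cross_join(left, right):
    # Every combination, with no join condition.
    return list(product(left, right))
```

With this data, the inner join yields one pair, each outer join yields two, and the cross join yields four. In Pixeltable itself the join type is typically chosen through the `how` argument of a table's `join()` method; check the API reference for your version.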
Best Practices
Schema Definition
- Use clear naming for directories and tables
- Document computed column dependencies
Application Code
- Use get_table() to fetch existing tables
- Use batch operations for multiple rows
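The two application-code practices can be sketched together: set up a table once, then fetch it by path with `get_table()` and insert rows in a single batch. The `app_demo.films` path and the sample rows are hypothetical.

```python
BATCH = [
    {'title': 'Inception', 'year': 2010},
    {'title': 'Jurassic Park', 'year': 1993},
    {'title': 'Titanic', 'year': 1997},
]

def setup_table():
    # One-time setup; imported lazily so the module loads without Pixeltable.
    import pixeltable as pxt

    pxt.drop_dir('app_demo', force=True)
    pxt.create_dir('app_demo')
    pxt.create_table('app_demo.films', {'title': pxt.String, 'year': pxt.Int})

def append_batch(rows):
    import pixeltable as pxt

    # Fetch the existing table by path instead of recreating it.
    t = pxt.get_table('app_demo.films')
    # One insert call for the whole batch rather than one call per row.
    t.insert(rows)
    return t.count()
```

A typical sequence is `setup_table()` once, followed by `append_batch(BATCH)` wherever the application needs to add rows.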
Common Patterns
Remember that Pixeltable automatically handles versioning and lineage tracking. Every operation is recorded and can be reverted if needed.