Custom Functions (UDFs)
Create and use custom functions (UDFs) in Pixeltable
What are User-Defined Functions?
User-Defined Functions (UDFs) in Pixeltable allow you to extend the platform with custom Python code. They bridge the gap between Pixeltable’s built-in operations and your specific data processing needs, enabling you to create reusable components for transformations, analysis, and AI workflows.
Pixeltable UDFs offer several key advantages:
- Reusability: Define a function once and use it across multiple tables and operations
- Type Safety: Strong typing ensures data compatibility throughout your pipelines
- Performance: Batch processing and caching capabilities optimize execution
- Integration: Seamlessly combine custom code with Pixeltable’s query system
- Flexibility: Process any data type including text, images, videos, and embeddings
UDFs can be as simple as a basic transformation or as complex as a multi-stage ML pipeline. Pixeltable offers three types of custom functions to handle different scenarios:
- Basic User-Defined Functions (UDFs)
- Tables as UDFs
- User-Defined Aggregates (UDAs)
1. Basic User-Defined Functions (UDFs)
Overview
UDFs allow you to:
- Write custom Python functions for data processing
- Integrate them into computed columns and queries
- Optimize performance through batching
- Create reusable components for your data workflow
All UDFs require type hints for parameters and return values. This enables Pixeltable to validate and optimize your data workflow before execution.
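As a minimal sketch of why the type hints matter, the plain-Python function below carries complete annotations, and `typing.get_type_hints` shows the kind of signature information a framework can inspect before running anything. The function `add_tax` and its parameters are hypothetical. In Pixeltable the function would additionally be registered as a UDF (the Pixeltable docs use the `@pxt.udf` decorator for this); that step is omitted here so the sketch runs without Pixeltable installed.

```python
from typing import get_type_hints


def add_tax(price: float, rate: float) -> float:
    """Return the price with tax applied (hypothetical example function)."""
    return price * (1 + rate)


# The annotations can be read without ever calling the function,
# which is what makes validation before execution possible.
hints = get_type_hints(add_tax)
print(hints)
```

A function with a missing parameter or return annotation would leave a gap in `hints`, which is the situation the type-hint requirement rules out.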
Creating Basic UDFs
UDF Types
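Local UDFs are serialized together with the computed columns that use them. Later changes to the UDF affect only columns created afterwards; existing columns keep the version they were created with.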
Supported Types
Performance Optimization
Batching
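The idea behind batching can be sketched in plain Python: instead of calling a function once per row, the caller hands it a list of inputs and gets a list of outputs back, which amortizes per-call overhead. The names `text_lengths_batch` and `run_in_batches` and the batch size are hypothetical, and this is an illustration of the concept rather than Pixeltable's batching API.

```python
from typing import Iterator, List


def text_lengths_batch(texts: List[str]) -> List[int]:
    """Process a whole batch in one call (stand-in for an expensive model call)."""
    return [len(t) for t in texts]


def run_in_batches(rows: List[str], batch_size: int) -> Iterator[int]:
    """Feed rows to text_lengths_batch in chunks of at most batch_size."""
    for start in range(0, len(rows), batch_size):
        yield from text_lengths_batch(rows[start:start + batch_size])


results = list(run_in_batches(["a", "bb", "ccc", "dddd", "eeeee"], batch_size=2))
print(results)
```

The output order matches the input order, so each result can be written back to the row it came from.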
Caching
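One common caching pattern is to load an expensive resource once and reuse it on later calls. The sketch below uses `functools.lru_cache` from the standard library for this. `load_model` and `classify` are hypothetical stand-ins, and it is an assumption that this pattern matches the caching approach Pixeltable recommends.

```python
from functools import lru_cache
from typing import Dict

load_count = 0  # tracks how often the expensive load actually runs


@lru_cache(maxsize=1)
def load_model() -> Dict[str, str]:
    """Hypothetical expensive setup; the body runs only on the first call."""
    global load_count
    load_count += 1
    return {"good": "positive", "bad": "negative"}


def classify(word: str) -> str:
    """Reuse the cached model on every call."""
    return load_model().get(word, "unknown")


labels = [classify(w) for w in ["good", "bad", "meh"]]
print(labels, load_count)
```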
Best Practices for Basic UDFs
2. Tables as UDFs
Overview
Tables as UDFs allow you to:
- Convert entire tables into reusable functions
- Create modular and complex data processing workflows
- Encapsulate multi-step operations
- Share workflows between different tables and applications
Tables as UDFs are particularly powerful for building AI agents and complex automation workflows that require multiple processing steps.
Creating Table UDFs
Step 1: Create a Specialized Table
Step 2: Convert to UDF
Step 3: Use the Table UDF
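The three steps can be pictured with a conceptual analogy in plain Python. `MiniTable` below is a toy class invented for this illustration, not part of Pixeltable: it holds one input column and a chain of computed columns, and `as_function` exposes the whole chain as a single callable.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


class MiniTable:
    """Toy stand-in for a table with one input column and computed columns."""

    def __init__(self, input_column: str) -> None:
        self.input_column = input_column
        self.computed: Dict[str, Callable[[Row], str]] = {}
        self.rows: List[Row] = []

    def add_computed_column(self, name: str, fn: Callable[[Row], str]) -> None:
        self.computed[name] = fn

    def insert(self, value: str) -> Row:
        row: Row = {self.input_column: value}
        # Computed columns are evaluated in the order they were defined.
        for name, fn in self.computed.items():
            row[name] = fn(row)
        self.rows.append(row)
        return row

    def as_function(self, return_column: str) -> Callable[[str], str]:
        """Expose the whole table as a single-input function."""
        return lambda value: self.insert(value)[return_column]


# Step 1: a specialized "table" with a two-step workflow.
table = MiniTable("prompt")
table.add_computed_column("cleaned", lambda r: r["prompt"].strip().lower())
table.add_computed_column("answer", lambda r: f"echo: {r['cleaned']}")

# Step 2: convert it to a function. Step 3: call it like any other function.
ask = table.as_function(return_column="answer")
print(ask("  Hello  "))
```

The caller sees only the input and the chosen output column; the intermediate `cleaned` column stays an implementation detail of the table.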
Flow Diagram
Figure: Agent Table UDF Flow
Key Benefits of Table UDFs
Modularity
Break complex workflows into reusable components that can be tested and maintained separately.
Encapsulation
Hide implementation details and expose only the necessary inputs and outputs through clean interfaces.
Composition
Combine multiple specialized agents to build more powerful workflows through function composition.
Advanced Techniques
3. User-Defined Aggregates (UDAs)
Overview
UDAs enable you to:
- Create custom aggregation functions
- Process multiple rows into a single result
- Use them in group_by operations
- Build reusable aggregation logic
Creating UDAs
UDA Components
- Initialization (__init__)
  - Sets up initial state
  - Defines parameters
  - Called once at start
- Update Method (update)
  - Processes each input row
  - Updates internal state
  - Must handle all value types
- Value Method (value)
  - Returns final result
  - Called after all updates
  - Performs final calculations
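The lifecycle of the three components can be sketched as a plain Python class. `SumOfSquares` is a hypothetical example. In Pixeltable the class would also be registered as an aggregate (the Pixeltable docs use a `@pxt.uda` decorator on an aggregator class); that registration is omitted so the sketch runs on its own.

```python
from typing import Optional


class SumOfSquares:
    """Hypothetical aggregate following the init / update / value lifecycle."""

    def __init__(self) -> None:
        # Called once at the start: set up the initial state.
        self.total = 0

    def update(self, val: Optional[int]) -> None:
        # Called once per input row; missing values are skipped.
        if val is not None:
            self.total += val * val

    def value(self) -> int:
        # Called after all updates: return the final result.
        return self.total


agg = SumOfSquares()
for v in [1, 2, None, 3]:
    agg.update(v)
print(agg.value())
```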
Using UDAs
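What a grouped aggregation does with an aggregate can be sketched without Pixeltable: each group key gets its own aggregator instance, every row updates the instance for its key, and the final values are read at the end. The `Mean` class and the sample rows are hypothetical, and this illustrates the semantics rather than Pixeltable's query syntax.

```python
from collections import defaultdict


class Mean:
    """Hypothetical aggregate: running mean via init / update / value."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, val: float) -> None:
        self.total += val
        self.count += 1

    def value(self) -> float:
        # Guard the empty case so value() never divides by zero.
        return self.total / self.count if self.count else 0.0


rows = [("a", 1.0), ("b", 10.0), ("a", 3.0), ("b", 20.0)]

# One aggregator instance per group key, created on first use.
groups = defaultdict(Mean)
for key, val in rows:
    groups[key].update(val)

result = {key: agg.value() for key, agg in groups.items()}
print(result)
```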
Best Practices for UDAs
- Manage state carefully
- Handle edge cases and errors
- Optimize for performance
- Use appropriate type hints
- Document expected behavior
Additional Resources
API Documentation
Complete API reference