Changelog
Product Updates
Keep track of changes
Highlights
- Added AWS Bedrock Adapter for expanded LLM integration options
- Built Reddit Agentic Bot
Enhancements
- Improved Table._descriptors() functionality
- Added markdown support when displaying table/dataframe descriptors
- Removed internal column types from pixeltable top level module
- Used source CTE explicitly to generate GROUP BY for query optimization
- Added comprehensive user workflow test script with timing
Fixes
- Improved error reporting in ExprEvalError
- Ensured that extra_fields is properly set in TableDataConduit
Highlights
- Introduced pxt.retrieval_tool() for exposing tabular data as a RAG data source
- Added client-side support for publishing snapshots. Sign up for cloud preview
Enhancements
- Added graceful handling of keyboard interrupts
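As an illustration of what graceful interrupt handling can look like, here is a minimal sketch in plain Python. It is not Pixeltable's implementation; `run_batches` and its arguments are hypothetical names used only to show the pattern of catching `KeyboardInterrupt` and keeping the work finished so far.

```python
def run_batches(batches, process):
    """Process batches in order; stop cleanly if the user presses Ctrl-C."""
    completed = []
    try:
        for batch in batches:
            process(batch)
            completed.append(batch)
    except KeyboardInterrupt:
        # Report progress instead of surfacing a raw traceback.
        print(f"Interrupted after {len(completed)} batch(es); partial work kept.")
    return completed
```

The caller gets back the list of batches that finished, so it can decide whether to resume or roll back.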
Fixes
- Fixed concurrency issues with directory operations
- Fixed grouping aggregation-related bugs
Highlights
- Added support for initializing Pixeltable with database connection strings
- Added support for separate userspaces in the Pixeltable catalog
Enhancements
- Improved file format detection by preferring file extension over puremagic
- Enabled table.select(None) functionality
- Integrated JsonMapper with async expression evaluation
- Widened numpy version compatibility
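The extension-first detection order mentioned above can be sketched as follows. This is an assumption-laden illustration, not Pixeltable's code: it uses the standard `mimetypes` module for the extension lookup, and a tiny hypothetical magic-byte table (`_MAGIC`) stands in for a content sniffer such as puremagic.

```python
import mimetypes
from typing import Optional

# Hypothetical magic-byte table standing in for a content sniffer.
_MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"%PDF-": "application/pdf",
}


def sniff_content(head: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file."""
    for magic, mime in _MAGIC.items():
        if head.startswith(magic):
            return mime
    return None


def detect_format(filename: str, head: bytes = b"") -> Optional[str]:
    """Prefer the file extension; fall back to content sniffing."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is not None:
        return mime
    return sniff_content(head)
```

With this ordering, a file named `report.pdf` is treated as a PDF even if its leading bytes are ambiguous, and sniffing only runs for names without a recognizable extension.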
Fixes
- Fixed add_embedding_index() when embedding function has arity > 1
- Disallowed updates to media columns
Highlights
- Introduced pxtf.map() as a replacement for the >> operator to create JsonMappers
- Added string concatenation operations (+ and *) support in arithmetic expressions
- Incorporated import operations into table_create and insert methods
- Added access method for embedding indices
- Switched to pixeltable-yolox for computer vision functionality
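To show how + and * on string expressions can work in general, here is a toy expression class that overloads both operators. `StrExpr`, `col`, and `_lift` are hypothetical names for this sketch only; it assumes the operators follow Python's own `str` semantics and does not reproduce Pixeltable's expression classes.

```python
class StrExpr:
    """Hypothetical stand-in for a string-valued column expression."""

    def __init__(self, fn):
        self._fn = fn  # callable mapping a row dict to a str

    def __add__(self, other):
        # Concatenate with another expression or a plain string constant.
        rhs = _lift(other)
        return StrExpr(lambda row: self._fn(row) + rhs._fn(row))

    def __mul__(self, count):
        # Repeat the string, mirroring Python's str * int.
        return StrExpr(lambda row: self._fn(row) * count)

    def eval(self, row):
        return self._fn(row)


def _lift(value):
    """Wrap a plain string so it can take part in an expression."""
    return value if isinstance(value, StrExpr) else StrExpr(lambda row: value)


def col(name):
    """Reference a field of the row by name."""
    return StrExpr(lambda row: row[name])


full_name = col("first") + " " + col("last")
row = {"first": "Ada", "last": "Lovelace"}
print(full_name.eval(row))           # Ada Lovelace
print((col("first") * 2).eval(row))  # AdaAda
```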
Enhancements
- Restructured documentation for improved navigation and clarity
- Added table IDs to Pixeltable markdown documentation
- Added create_parents option to create_dir to automatically create missing parent directories
- Improved JsonMapper functionality with new unit tests for QueryTemplateFunction
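The effect of a create_parents option can be pictured with a toy catalog. This sketch is not the Pixeltable API: `toy_create_dir` is a hypothetical helper, the catalog is a plain set, and dot-separated directory paths are an assumption made for the example.

```python
def toy_create_dir(catalog: set, path: str, create_parents: bool = False) -> None:
    """Add `path` to the catalog, optionally creating missing ancestors."""
    parts = path.split(".")
    for i in range(1, len(parts)):
        parent = ".".join(parts[:i])
        if parent not in catalog:
            if not create_parents:
                raise ValueError(f"parent directory {parent!r} does not exist")
            catalog.add(parent)
    catalog.add(path)


catalog = set()
toy_create_dir(catalog, "projects.demo.images", create_parents=True)
print(sorted(catalog))
```

Without `create_parents=True`, the same call on an empty catalog raises an error at the first missing ancestor and leaves the catalog unchanged.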
Fixes
- Fixed event loop debug logging
- Resolved syntax errors in documentation
- Addressed bugs in directory operations, particularly when drop_dir() is the first operation after catalog load
- Fixed issues with chained tool calling
- Corrected bug involving @pxt.query instances with default parameter values
- Improved JsonPath serialization
Highlights
- Introduced linting for improved code quality
- Added just-in-time initialization for spaCy, improving pxt.init() performance
- Made catalog changes to prepare for concurrency support
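Just-in-time initialization generally means deferring an expensive load until the first call that needs it. The sketch below shows the pattern with `functools.lru_cache`; the function names and the placeholder model are hypothetical and are not taken from Pixeltable's code.

```python
import functools

load_log = []  # records each expensive load, for illustration only


@functools.lru_cache(maxsize=None)
def get_nlp_model():
    """Load an expensive resource on first use, then reuse the cached object."""
    load_log.append("loaded")
    return {"name": "stand-in model"}  # placeholder for a real NLP pipeline


def init():
    """Start-up path: fast because it never touches the model."""
    return "ready"


def tokenize(text):
    """First call triggers the load; later calls hit the cache."""
    get_nlp_model()
    return text.split()
```

Start-up cost moves from `init()` to the first `tokenize()` call, which is a good trade when many sessions never use the model at all.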
Enhancements
- Added video index to cookbook
- Updated configurations page to match API reference
- Added MCP to documentation
- Improved documentation with updated vision search examples
Fixes
- Implemented graceful failure handling for backwards incompatibility in computed column UDF calls
- Various bugfixes and improvements
- Updated Label Studio job to Python 3.10 in nightly CI
Highlights
- Enhanced OpenAI/Anthropic integration with support for multiple invocations of the same tool in tool calling logic
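One way to support several invocations of the same tool in a single model response is to collect results in a list per tool name rather than a single value. The sketch below assumes a simplified, hypothetical shape for tool calls (dicts with `name` and `arguments`); real provider responses differ in detail.

```python
from collections import defaultdict


def dispatch_tool_calls(tool_calls, tools):
    """Run every requested call and group the results by tool name."""
    results = defaultdict(list)
    for call in tool_calls:
        fn = tools[call["name"]]
        # Appending keeps every invocation, even when a tool is called twice.
        results[call["name"]].append(fn(**call["arguments"]))
    return dict(results)


def get_price(ticker):
    """Hypothetical tool with canned data."""
    return {"ACME": 10.0, "INIT": 25.5}[ticker]


calls = [
    {"name": "get_price", "arguments": {"ticker": "ACME"}},
    {"name": "get_price", "arguments": {"ticker": "INIT"}},
]
print(dispatch_tool_calls(calls, {"get_price": get_price}))
```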
Highlights
- Added DeepSeek integration
- Implemented data sharing logic for publishing snapshots
- Enhanced UDF handling in computed columns
- Migrated to Mintlify documentation
Enhancements
- Improved test suite with pytest fixtures for Hugging Face embedding models
- Enabled view creation from dataframes with select clause
- Updated PyAV to 14.2 and WhisperX to 3.3.1
- Improved handling of relative pathnames and filenames with unusual characters
Documentation
- Fixed documentation for stored attribute on computed columns
- Added audio file examples
Development & Infrastructure
- Updated llama_cpp version (disabled in non-Linux CI)
- Implemented release script fixes for Poetry 2.0
Highlights
- Added support for OpenAI reasoning models
- Introduced tables as UDFs for more modular workflows
- Implemented AudioSplitter support for audio processing
- Enabled all types of computed columns to be unstored for flexibility
- Added support for variable parameters in query limit() clause
- Enhanced data management with a packager for table data
- Updated PostgreSQL to version 16.8 and pgvector to 0.8.0
Enhancements
- Improved parallel execution capabilities
- Added support for generalized arrays (unparameterized/with only a dtype)
- Allowed numpy.ndarray arrays to be used as Literal constants
- Enhanced type checking for tests package
- Improved handling of collections with all constants as Literals
- Converted more UDFs to async for better performance
- Added verbose system config option for improved debugging
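The benefit of async UDFs is easiest to see with I/O-bound calls that can overlap. This is a generic `asyncio` sketch, not Pixeltable code; `fake_api_call` is a hypothetical stand-in for a remote request.

```python
import asyncio


async def fake_api_call(x, delay=0.05):
    """Simulate an I/O-bound call such as a request to a model endpoint."""
    await asyncio.sleep(delay)
    return x * 2


async def apply_udf(rows):
    """Run the call for all rows concurrently; results keep the input order."""
    return await asyncio.gather(*(fake_api_call(r) for r in rows))


results = asyncio.run(apply_udf([1, 2, 3]))
print(results)  # [2, 4, 6]
```

Because the waits overlap, total time is close to one delay rather than the sum of all delays.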
Fixes
- Fixed FastAPI integration bug
- Resolved issues with AsyncConnectionPool behavior
- Improved test resiliency and reliability
- Fixed tiktoken dependency issue
- Corrected validity of column error properties
- Upgraded httpcore for better compatibility
- Fixed notebook test failures
Development & Infrastructure
- Added archive functionality for Pixeltable logs from every test run
- Improved CI/CD workflow with tmate in detached mode
- Enhanced documentation with updates to numerous guides
- Streamlined API syntax for better developer experience
- Updated example applications to use new query syntax
Highlights
- Enhanced Function Support with multiple signatures capability for Functions, UDFs, and UDAs
- Improved Data Validation with JSON Schema validation for JsonType columns
- Enhanced Database Reliability by changing SQL Engine isolation level to ‘REPEATABLE READ’
Enhancements
- Added ifexists parameter to create* APIs for better control
- Improved DataFrame docstrings for better documentation
- Fixed indexed column loading for views
- Enhanced type validation by preventing bool literals in int columns
- Improved handling of index name conflicts
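The bool-versus-int check needs care in Python because `bool` is a subclass of `int`, so a plain `isinstance(value, int)` accepts `True`. A minimal sketch of a stricter check follows; `validate_int` is a hypothetical helper, not a Pixeltable function.

```python
def validate_int(value):
    """Accept ints but reject bools, which Python treats as a subclass of int."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value
```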
Documentation & Examples
- Updated Discord Bot documentation
- Added Gemini integration examples
Fixes
- Fixed assertion in ReloadTester
- Resolved pgserver-related issues for Docker and Windows setup
Highlights
- Added Python 3.13 Support
- Introduced basic joins functionality for tables
- Added Gemini AI integration
- Implemented Parquet export API
- Extended document support to include .txt files
Enhancements
- Added test utility for query result verification after catalog reload
- Fixed Optional vs. Required handling in astype()
- Updated Ollama integration for version 0.4.0
- Added graceful error handling when using dropped catalog.Tables
- Reorganized docs and examples folders
- Added feature guide for time zones
- Made Tables, DataFrames, and Expressions repr output more user-friendly
Fixes
- Fixed string comparison to use != instead of ‘is not’
- Resolved various development environment configuration issues
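The string comparison fix above reflects a general Python rule: `is not` compares object identity, while `!=` compares values. A short illustration, with a hypothetical `changed` helper:

```python
def changed(old: str, new: str) -> bool:
    """Compare by value; identity (`is not`) can differ for equal strings."""
    return old != new


built = "".join(["pix", "el"])  # equal in value to "pixel", built at runtime
literal = "pixel"

print(changed(built, literal))  # False: the values are equal
# `built is not literal` may evaluate to True here, because the two names
# can refer to distinct objects even though they hold the same text.
```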
New Contributors
- @jacobweiss2305 made his first contribution
Highlights
- Added Context-Aware Discord Bot with Semantic Search Capabilities
- Introduced TileIterator for efficient data processing
- Migrated to torchaudio from librosa for improved audio preprocessing
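Tiling splits a large image into fixed-size boxes, optionally overlapping, so that each piece can be processed separately. The sketch below computes tile coordinates only; `tile_boxes` and its parameters are hypothetical and do not mirror the TileIterator signature.

```python
def _starts(extent, tile, step):
    """Start offsets along one axis, stopping once the axis is covered."""
    starts = []
    pos = 0
    while True:
        starts.append(pos)
        if pos + tile >= extent:
            break
        pos += step
    return starts


def tile_boxes(width, height, tile_w, tile_h, overlap=0):
    """Yield (left, upper, right, lower) boxes that cover the image."""
    step_x, step_y = tile_w - overlap, tile_h - overlap
    if step_x <= 0 or step_y <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    for top in _starts(height, tile_h, step_y):
        for left in _starts(width, tile_w, step_x):
            # Edge tiles are clipped to the image bounds.
            yield (left, top, min(left + tile_w, width), min(top + tile_h, height))


print(list(tile_boxes(4, 2, 2, 2)))  # [(0, 0, 2, 2), (2, 0, 4, 2)]
```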
Enhancements
- Implemented reusable retry script for CI workflows
- Added configuration documentation (config.md)
- Enhanced Function bindings with partial support
- Fixed backwards-incompatible Mistral API changes
- Improved create_insert_plan functionality
- Disabled sentence_transformers tests on Linux ARM for better stability
- Updated README.md with clearer organization
- Added support for table/column handles in APIs
Highlights
- Added support for Ollama, llama_cpp, and Replicate
- Switched FrameIterator to PyAV and added XML document type support
- Added Voxel51 integration for computer vision workflows
- Implemented custom type hints for all Pixeltable types
- Added support for converting aggregate FunctionCalls to SQL
- Streamlined create_view API and enhanced documentation
Development & Infrastructure
- Updated CI/CD configuration and Makefile
- Upgraded GitHub Actions to use macos-13
- Limited ubuntu-arm64 and ubuntu-x64-t4 to scheduled runs
- Added Image.point() to API
- Improved type-checking correctness across packages
- Enhanced documentation and display for new type hint pattern
Fixes
- Fixed issues in working-with-huggingface notebook
- Resolved Replicate notebook compatibility with external URLs
- Ensured correct nullability in FunctionCall return types
- Added exception raising during add_column() errors
- Allowed @query redefinition in notebook scope
- Updated BtreeIndex.str_filter implementation
Enhancements
- Added support for loading Hugging Face datasets containing images
- Implemented LRU eviction in FileCache for improved memory management
- Enhanced JSON path functionality to allow getitem syntax
- Updated iterators to handle None values as input
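LRU (least recently used) eviction discards the entry that has gone longest without being accessed once a cache exceeds its capacity. A minimal sketch with `collections.OrderedDict` follows; the `LRUCache` class is illustrative and counts entries rather than bytes, which may differ from how FileCache measures capacity.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry

    def keys(self):
        return list(self._items)


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used entry
cache.put("c", 3)  # evicts "b"
print(cache.keys())  # ['a', 'c']
```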
Fixes
- Resolved an issue with the Together AI image endpoint
Enhancements
- Initial support for converting FunctionCalls to SQL
- Added comprehensive time zone handling
- Improved type-checking correctness for catalog, functions, and ext packages
- Introduced integration with Mistral AI and Anthropic
- Added a new tutorial on computed columns
Improvements
- Made mistune an optional dependency
Fixes
- Resolved a circularity issue in database migration for schema version 19 -> 20
Enhancements
- Improved type-checking system with groundwork and performance improvements
- Added cross-links to docstrings
- Enhanced create_table to accept DataFrame directly
- Updated Postgres to version 16.4 and pgvector to 0.7.4
- Implemented Notebook CI and Nightly CI
Fixes
- Fixed unit test for Together AI integration
- Resolved notebook regressions
- Updated to psycopg3 as Postgres driver
- Cleaned up Table class namespace
- Fixed JSON serialization and literal handling
Highlights
- Optimized data loading with StoreBase.load_column()
- Added support for lists, dictionaries, and non-numpy datatypes in import_pandas
- Enhanced video frame extraction control in FrameIterator
- Added UDF draw_bounding_boxes() for object detection visualization
- Migrated to Pixeltable-compatible pgserver fork
- Made all column types stored by default
New Features
- Added import_json() and import_rows() functions
- Expanded timestamp functions library
- Added aggregate make_list() function
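A list-building aggregate collects the values of each group into a list instead of reducing them to a single number. The plain-Python sketch below shows the idea; `group_make_list` is a hypothetical helper and does not use the Pixeltable make_list() API.

```python
from collections import defaultdict


def group_make_list(rows, key, value):
    """Group rows by `key` and collect each group's `value` fields in order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return dict(groups)


rows = [
    {"video": "a.mp4", "frame": 0},
    {"video": "b.mp4", "frame": 0},
    {"video": "a.mp4", "frame": 1},
]
print(group_make_list(rows, "video", "frame"))  # {'a.mp4': [0, 1], 'b.mp4': [0]}
```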
Improvements
- Simplified method call syntax
- Enhanced notebook experience
- Improved test coverage and automation
Fixes
- Updated database version
- Removed support for Python datetime.date
- Improved CSV import with nullable types
Features
- Added Label Studio integration with pre-signed URLs for S3 buckets
- Enhanced compatibility with newer versions of label-studio-sdk
- Added new String functions
- Introduced new tutorial about Tables and Data Operations