module  pixeltable

Core Pixeltable API for table operations, data processing, and UDF management.

func  array()

Signature
array(elements: Iterable) -> exprs.Expr

func  configure_logging()

Signature
configure_logging(
    *,
    to_stdout: bool | None = None,
    level: int | None = None,
    add: str | None = None,
    remove: str | None = None
) -> None
Configure logging. Parameters:
  • to_stdout (bool | None): if True, also log to stdout
  • level (int | None): default log level
  • add (str | None): comma-separated list of ‘module name:log level’ pairs; ex.: add=‘video:10’
  • remove (str | None): comma-separated list of module names
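The add string can be assembled from the standard library's numeric log levels; logging.DEBUG is 10, which matches the video:10 example above. The following is only a sketch: build_add_arg is a hypothetical helper that is not part of Pixeltable, and the image module name is an assumption made for illustration.

```python
import logging


def build_add_arg(levels: dict[str, int]) -> str:
    """Hypothetical helper: format {'module': level} as 'module:level,...'."""
    return ','.join(f'{module}:{level}' for module, level in levels.items())


# 'video' comes from the example above; 'image' is an assumed module name.
add_arg = build_add_arg({'video': logging.DEBUG, 'image': logging.WARNING})
print(add_arg)  # video:10,image:30
```

The resulting string could then be passed as configure_logging(add=add_arg).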

func  create_dir()

Signature
create_dir(
    path: str,
    *,
    if_exists: Literal['error', 'ignore', 'replace', 'replace_force'] = 'error',
    parents: bool = False
) -> catalog.Dir | None
Create a directory. Parameters:
  • path (str): Path to the directory.
  • if_exists (Literal['error', 'ignore', 'replace', 'replace_force'], default: 'error'): What to do if the path already exists. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return the existing directory handle
    • 'replace': if the existing directory is empty, drop it and create a new one
    • 'replace_force': drop the existing directory and all its children, and create a new one
  • parents (bool, default: False): Create missing parent directories.
Returns:
  • catalog.Dir | None: A handle to the newly created directory, or to an already existing directory at the path when if_exists='ignore'. Please note the existing directory may not be empty.
Examples: Create a directory:
pxt.create_dir('my_dir')
Create a subdirectory:
pxt.create_dir('my_dir/sub_dir')
Create a subdirectory only if it does not already exist, otherwise do nothing:
pxt.create_dir('my_dir/sub_dir', if_exists='ignore')
Create a directory, replacing it and all its children if it already exists:
pxt.create_dir('my_dir', if_exists='replace_force')
Create a subdirectory along with its ancestors:
pxt.create_dir('parent1/parent2/sub_dir', parents=True)
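The four if_exists directives can be pictured with a small plain-Python model. This is a toy sketch, not Pixeltable's implementation: create_dir_model and its dict-based catalog are hypothetical, and raising an error for 'replace' on a non-empty directory is an assumption (the text above only describes the empty case).

```python
def create_dir_model(catalog: dict[str, set[str]], path: str, if_exists: str = 'error') -> str:
    """Toy model: catalog maps a directory path to the set of its children."""
    if path not in catalog:
        catalog[path] = set()
        return 'created'
    if if_exists == 'error':
        raise FileExistsError(path)
    if if_exists == 'ignore':
        return 'existing'  # the existing directory is left untouched
    if if_exists == 'replace':
        if catalog[path]:
            # Assumption: a non-empty directory is not replaced.
            raise RuntimeError(f'{path} is not empty')
        catalog[path] = set()
        return 'replaced'
    if if_exists == 'replace_force':
        catalog[path] = set()  # children are dropped along with the directory
        return 'replaced'
    raise ValueError(f'unknown if_exists: {if_exists}')


catalog = {'my_dir': {'my_dir/sub_dir'}}
print(create_dir_model(catalog, 'my_dir', 'ignore'))         # existing
print(create_dir_model(catalog, 'my_dir', 'replace_force'))  # replaced
print(catalog['my_dir'])                                     # set()
```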

func  create_snapshot()

Signature
create_snapshot(
    path_str: str,
    base: catalog.Table | Query,
    *,
    additional_columns: Mapping[str, type | ColumnSpec | exprs.Expr] | None = None,
    iterator: func.GeneratingFunctionCall | None = None,
    num_retained_versions: int = 10,
    comment: str | None = None,
    custom_metadata: Any = None,
    media_validation: Literal['on_read', 'on_write'] = 'on_write',
    if_exists: Literal['error', 'ignore', 'replace', 'replace_force'] = 'error'
) -> catalog.Table | None
Create a snapshot of an existing table object (which itself can be a view or a snapshot or a base table). Parameters:
  • path_str (str): A name for the snapshot; can be either a simple name such as my_snapshot, or a pathname such as dir1/my_snapshot.
  • base (catalog.Table | Query): Table (i.e., table or view or snapshot) or Query to base the snapshot on.
  • additional_columns (Mapping[str, type | ColumnSpec | exprs.Expr] | None): If specified, will add these columns to the snapshot once it is created. The format of the additional_columns parameter is identical to the format of the schema parameter in create_table.
  • iterator (func.GeneratingFunctionCall | None): The iterator to use for this snapshot. If specified, then this snapshot will be a one-to-many view of the base table.
  • num_retained_versions (int, default: 10): Number of versions of the snapshot to retain.
  • comment (str | None): Optional comment for the snapshot.
  • custom_metadata (Any): Optional user-defined JSON metadata to associate with the snapshot.
  • media_validation (Literal['on_read', 'on_write'], default: 'on_write'): Media validation policy for the snapshot.
    • 'on_read': validate media files at query time
    • 'on_write': validate media files during insert/update operations
  • if_exists (Literal['error', 'ignore', 'replace', 'replace_force'], default: 'error'): What to do if the path already exists. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return the existing snapshot handle
    • 'replace': if the existing snapshot has no dependents, drop and replace it with a new one
    • 'replace_force': drop the existing snapshot and all its dependents, and create a new one
Returns:
  • catalog.Table | None: A handle to the Table representing the newly created snapshot. If the path already exists and if_exists='ignore', returns a handle to the existing snapshot. Please note the schema or base of the existing snapshot may not match those provided in the call.
Examples: Create a snapshot my_snapshot of a table my_table:
tbl = pxt.get_table('my_table')
snapshot = pxt.create_snapshot('my_snapshot', tbl)
Create a snapshot my_snapshot of a view my_view with additional int column col3, if my_snapshot does not already exist:
view = pxt.get_table('my_view')
snapshot = pxt.create_snapshot(
    'my_snapshot',
    view,
    additional_columns={'col3': pxt.Int},
    if_exists='ignore',
)
Create a snapshot my_snapshot on a table my_table, and replace any existing snapshot named my_snapshot:
tbl = pxt.get_table('my_table')
snapshot = pxt.create_snapshot(
    'my_snapshot', tbl, if_exists='replace_force'
)

func  create_table()

Signature
create_table(
    path: str,
    schema: Mapping[str, type | ColumnSpec | exprs.Expr] | None = None,
    *,
    source: TableDataSource | None = None,
    source_format: Literal['csv', 'excel', 'parquet', 'json'] | None = None,
    schema_overrides: dict[str, Any] | None = None,
    create_default_idxs: bool = True,
    on_error: Literal['abort', 'ignore'] = 'abort',
    primary_key: str | list[str] | None = None,
    num_retained_versions: int = 10,
    comment: str | None = None,
    custom_metadata: Any = None,
    media_validation: Literal['on_read', 'on_write'] = 'on_write',
    if_exists: Literal['error', 'ignore', 'replace', 'replace_force'] = 'error',
    extra_args: dict[str, Any] | None = None
) -> catalog.Table
Create a new base table. Exactly one of schema or source must be provided. If a schema is provided, then an empty table will be created with the specified schema. If a source is provided, then Pixeltable will attempt to infer a data source format and table schema from the contents of the specified data, and the data will be imported from the specified source into the new table. The source format and/or schema can be specified directly via the source_format and schema_overrides parameters. Parameters:
  • path (str): Pixeltable path (qualified name) of the table, such as 'my_table' or 'my_dir/my_subdir/my_table'.
  • schema (Mapping[str, type | ColumnSpec | exprs.Expr] | None): Schema for the new table, mapping column names to Pixeltable types.
  • source (TableDataSource | None): A data source (file, URL, Table, Query, or list of rows) to import from.
  • source_format (Literal['csv', 'excel', 'parquet', 'json'] | None): Must be used in conjunction with a source. If specified, then the given format will be used to read the source data. (Otherwise, Pixeltable will attempt to infer the format from the source data.)
  • schema_overrides (dict[str, Any] | None): Must be used in conjunction with a source. If specified, then columns in schema_overrides will be given the specified types. (Pixeltable will attempt to infer the types of any columns not specified.)
  • create_default_idxs (bool, default: True): If True, creates a B-tree index on every scalar and media column that is not computed, except for boolean columns.
  • on_error (Literal['abort', 'ignore'], default: 'abort'): Determines the behavior if an error occurs while evaluating a computed column or detecting an invalid media file (such as a corrupt image) for one of the inserted rows.
    • If on_error='abort', then an exception will be raised and the rows will not be inserted.
    • If on_error='ignore', then execution will continue and the rows will be inserted. Any cell with an error will have a None value, with information about the error stored in the corresponding tbl.col_name.errortype and tbl.col_name.errormsg fields.
  • primary_key (str | list[str] | None): An optional column name or list of column names to use as the primary key(s) of the table.
  • num_retained_versions (int, default: 10): Number of versions of the table to retain.
  • comment (str | None): An optional comment; its meaning is user-defined.
  • custom_metadata (Any): Optional user-defined metadata to associate with the table. Must be a valid JSON-serializable object [str, int, float, bool, dict, list].
  • media_validation (Literal['on_read', 'on_write'], default: 'on_write'): Media validation policy for the table.
    • 'on_read': validate media files at query time
    • 'on_write': validate media files during insert/update operations
  • if_exists (Literal['error', 'ignore', 'replace', 'replace_force'], default: 'error'): Determines the behavior if a table already exists at the specified path location.
    • 'error': raise an error
    • 'ignore': do nothing and return the existing table handle
    • 'replace': if the existing table has no views or snapshots, drop and replace it with a new one; raise an error if the existing table has views or snapshots
    • 'replace_force': drop the existing table and all its views and snapshots, and create a new one
  • extra_args (dict[str, Any] | None): Must be used in conjunction with a source. If specified, then additional arguments will be passed along to the source data provider.
Returns:
  • catalog.Table: A handle to the newly created table, or to an already existing table at the path when if_exists='ignore'. Please note the schema of the existing table may not match the schema provided in the call.
Examples: Create a table with an int and a string column:
tbl = pxt.create_table(
    'my_table', schema={'col1': pxt.Int, 'col2': pxt.String}
)
Create a table from a select statement over an existing table orig_table (this will create a new table containing the exact contents of the query):
tbl1 = pxt.get_table('orig_table')
tbl2 = pxt.create_table(
    'new_table', source=tbl1.where(tbl1.col1 < 10).select(tbl1.col2)
)
Create a table if it does not already exist, otherwise get the existing table:
tbl = pxt.create_table(
    'my_table',
    schema={'col1': pxt.Int, 'col2': pxt.String},
    if_exists='ignore',
)
Create a table with an int and a float column, replacing any existing table at that path (this raises an error if the existing table has views or snapshots):
tbl = pxt.create_table(
    'my_table',
    schema={'col1': pxt.Int, 'col2': pxt.Float},
    if_exists='replace',
)
Create a table from a CSV file:
tbl = pxt.create_table('my_table', source='data.csv')
Create a table with an auto-generated UUID primary key:
tbl = pxt.create_table(
    'my_table',
    schema={'id': pxt.functions.uuid.uuid4(), 'data': pxt.String},
    primary_key=['id'],
)
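The difference between on_error='abort' and on_error='ignore' can be illustrated with a plain-Python model. This is a sketch under stated assumptions, not Pixeltable's implementation: insert_rows_model and the result column are hypothetical, and each cell is represented as a small dict that mimics the errortype and errormsg fields described above.

```python
from typing import Any, Callable


def insert_rows_model(
    rows: list[dict[str, Any]],
    computed: Callable[[dict[str, Any]], Any],
    on_error: str = 'abort',
) -> list[dict[str, Any]]:
    """Toy model of on_error; 'result' stands in for a computed column."""
    staged = []
    for row in rows:
        try:
            cell = {'value': computed(row), 'errortype': None, 'errormsg': None}
        except Exception as exc:
            if on_error == 'abort':
                raise  # nothing is returned, so no rows are inserted
            # 'ignore': keep the row, store None plus the error details.
            cell = {'value': None, 'errortype': type(exc).__name__, 'errormsg': str(exc)}
        staged.append({**row, 'result': cell})
    return staged


rows = [{'x': 4}, {'x': 0}]
out = insert_rows_model(rows, lambda r: 8 // r['x'], on_error='ignore')
print(out[0]['result']['value'])      # 2
print(out[1]['result']['errortype'])  # ZeroDivisionError
```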

func  create_view()

Signature
create_view(
    path: str,
    base: catalog.Table | Query,
    *,
    additional_columns: Mapping[str, type | ColumnSpec | exprs.Expr] | None = None,
    is_snapshot: bool = False,
    create_default_idxs: bool = False,
    iterator: func.GeneratingFunctionCall | None = None,
    num_retained_versions: int = 10,
    comment: str | None = None,
    custom_metadata: Any = None,
    media_validation: Literal['on_read', 'on_write'] = 'on_write',
    if_exists: Literal['error', 'ignore', 'replace', 'replace_force'] = 'error'
) -> catalog.Table | None
Create a view of an existing table object (which itself can be a view or a snapshot or a base table). Parameters:
  • path (str): A name for the view; can be either a simple name such as my_view, or a pathname such as dir1/my_view.
  • base (catalog.Table | Query): Table (i.e., table or view or snapshot) or Query to base the view on.
  • additional_columns (Mapping[str, type | ColumnSpec | exprs.Expr] | None): If specified, will add these columns to the view once it is created. The format of the additional_columns parameter is identical to the format of the schema parameter in create_table.
  • is_snapshot (bool, default: False): Whether the view is a snapshot. Setting this to True is equivalent to calling create_snapshot.
  • create_default_idxs (bool, default: False): Whether to create default indexes on the view’s columns (the base’s columns are excluded). Cannot be True for snapshots.
  • iterator (func.GeneratingFunctionCall | None): The iterator to use for this view. If specified, then this view will be a one-to-many view of the base table.
  • num_retained_versions (int, default: 10): Number of versions of the view to retain.
  • comment (str | None): Optional comment for the view.
  • custom_metadata (Any): Optional user-defined JSON metadata to associate with the view.
  • media_validation (Literal['on_read', 'on_write'], default: 'on_write'): Media validation policy for the view.
    • 'on_read': validate media files at query time
    • 'on_write': validate media files during insert/update operations
  • if_exists (Literal['error', 'ignore', 'replace', 'replace_force'], default: 'error'): What to do if the path already exists. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return the existing view handle
    • 'replace': if the existing view has no dependents, drop and replace it with a new one
    • 'replace_force': drop the existing view and all its dependents, and create a new one
Returns:
  • catalog.Table | None: A handle to the Table representing the newly created view. If the path already exists and if_exists='ignore', returns a handle to the existing view. Please note the schema or the base of the existing view may not match those provided in the call.
Examples: Create a view my_view of an existing table my_table, filtering on rows where col1 is greater than 10:
tbl = pxt.get_table('my_table')
view = pxt.create_view('my_view', tbl.where(tbl.col1 > 10))
Create a view my_view of an existing table my_table, filtering on rows where col1 is greater than 10, if it does not already exist; otherwise, get the existing view named my_view:
tbl = pxt.get_table('my_table')
view = pxt.create_view(
    'my_view', tbl.where(tbl.col1 > 10), if_exists='ignore'
)
Create a view my_view of an existing table my_table, filtering on rows where col1 is greater than 100, and replace any existing view named my_view:
tbl = pxt.get_table('my_table')
view = pxt.create_view(
    'my_view', tbl.where(tbl.col1 > 100), if_exists='replace_force'
)

func  drop_dir()

Signature
drop_dir(
    path: str,
    force: bool = False,
    if_not_exists: Literal['error', 'ignore'] = 'error'
) -> None
Remove a directory. Parameters:
  • path (str): Name or path of the directory.
  • force (bool, default: False): If True, will also drop all tables and subdirectories of this directory, recursively, along with any views or snapshots that depend on any of the dropped tables.
  • if_not_exists (Literal['error', 'ignore'], default: 'error'): What to do if the path does not exist. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return
Examples: Remove a directory, if it exists and is empty:
pxt.drop_dir('my_dir')
Remove a subdirectory:
pxt.drop_dir('my_dir/sub_dir')
Remove an existing directory if it is empty, but do nothing if it does not exist:
pxt.drop_dir('my_dir/sub_dir', if_not_exists='ignore')
Remove an existing directory and all its contents:
pxt.drop_dir('my_dir', force=True)

func  drop_table()

Signature
drop_table(
    table: str | catalog.Table,
    force: bool = False,
    if_not_exists: Literal['error', 'ignore'] = 'error'
) -> None
Drop a table, view, snapshot, or replica. Parameters:
  • table (str | catalog.Table): Fully qualified name or table handle of the table to be dropped; or a remote URI of a cloud replica to be deleted.
  • force (bool, default: False): If True, will also drop all views and sub-views of this table.
  • if_not_exists (Literal['error', 'ignore'], default: 'error'): What to do if the path does not exist. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return
Examples: Drop a table by its fully qualified name:
pxt.drop_table('subdir/my_table')
Drop a table by its handle:
t = pxt.get_table('subdir/my_table')
pxt.drop_table(t)
Drop a table if it exists, otherwise do nothing:
pxt.drop_table('subdir/my_table', if_not_exists='ignore')
Drop a table and all its dependents:
pxt.drop_table('subdir/my_table', force=True)

func  get_dir_contents()

Signature
get_dir_contents(dir_path: str = '', recursive: bool = True) -> DirContents
Get the contents of a Pixeltable directory. Parameters:
  • dir_path (str, default: ''): Path to the directory. Defaults to the root directory.
  • recursive (bool, default: True): If False, returns only those tables and directories that are directly contained in the specified directory; if True, returns all tables and directories that are descendants of the specified directory, recursively.
Returns:
  • DirContents: A DirContents object representing the contents of the specified directory.
Examples: Get contents of top-level directory:
pxt.get_dir_contents()
Get contents of ‘dir1’:
pxt.get_dir_contents('dir1')

func  get_table()

Signature
get_table(
    path: str,
    if_not_exists: Literal['error', 'ignore'] = 'error'
) -> catalog.Table | None
Get a handle to an existing table, view, or snapshot. Parameters:
  • path (str): Path to the table.
  • if_not_exists (Literal['error', 'ignore'], default: 'error'): What to do if the path does not exist. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return None
Returns:
  • catalog.Table | None: A handle to the Table.
Examples: Get handle for a table in the top-level directory:
tbl = pxt.get_table('my_table')
For a table in a subdirectory:
tbl = pxt.get_table('subdir/my_table')
Handles to views and snapshots are retrieved in the same way:
tbl = pxt.get_table('my_snapshot')
Get a handle to a specific version of a table:
tbl = pxt.get_table('my_table:722')

func  home()

Signature
home() -> Path
Get the path to the user’s home directory in Pixeltable. Returns:
  • Path: The path to the user’s home directory.

func  init()

Signature
init(config_overrides: dict[str, Any] | None = None) -> None
Initializes the Pixeltable environment. Parameters:
  • config_overrides (dict[str, Any] | None): Optional dictionary of configuration overrides.

func  iterator()

Signature
iterator(*args, **kwargs)

func  list_dirs()

Signature
list_dirs(path: str = '', recursive: bool = True) -> list[str]
List the directories in a directory. Parameters:
  • path (str, default: ''): Name or path of the directory.
  • recursive (bool, default: True): If True, lists all descendants of this directory recursively.
Returns:
  • list[str]: List of directory paths.
Examples: List all directories under my_dir, recursively:
pxt.list_dirs('my_dir', recursive=True)

func  list_functions()

Signature
list_functions() -> Styler
Returns information about all registered functions. Returns:
  • Styler: A Pandas Styler wrapping a DataFrame with columns ‘Path’, ‘Name’, ‘Parameters’, ‘Return Type’, ‘Is Agg’, ‘Library’

func  list_tables()

Signature
list_tables(dir_path: str = '', recursive: bool = True) -> list[str]
List the Tables in a directory. Parameters:
  • dir_path (str, default: ''): Path to the directory. Defaults to the root directory.
  • recursive (bool, default: True): If False, returns only those tables that are directly contained in the specified directory; if True, returns all tables that are descendants of the specified directory, recursively.
Returns:
  • list[str]: A list of Table paths.
Examples: List tables in top-level directory:
pxt.list_tables()
List tables in ‘dir1’:
pxt.list_tables('dir1')
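The effect of the recursive flag can be sketched with a plain-Python filter over slash-separated paths. This is a toy model only: list_tables_model and the sample paths are hypothetical, and it assumes results are reported as full paths.

```python
def list_tables_model(tables: list[str], dir_path: str = '', recursive: bool = True) -> list[str]:
    """Toy model: filter slash-separated table paths by containing directory."""
    prefix = f'{dir_path}/' if dir_path else ''
    result = []
    for path in tables:
        if not path.startswith(prefix):
            continue
        remainder = path[len(prefix):]
        # Without recursion, keep only tables directly inside dir_path.
        if recursive or '/' not in remainder:
            result.append(path)
    return result


tables = ['t0', 'dir1/t1', 'dir1/sub/t2', 'dir2/t3']
print(list_tables_model(tables, 'dir1'))                   # ['dir1/t1', 'dir1/sub/t2']
print(list_tables_model(tables, 'dir1', recursive=False))  # ['dir1/t1']
print(list_tables_model(tables, recursive=False))          # ['t0']
```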

func  ls()

Signature
ls(path: str = '') -> pd.DataFrame
List the contents of a Pixeltable directory. This function returns a Pandas DataFrame representing a human-readable listing of the specified directory, including various attributes such as version and base table, as appropriate. To get a programmatic list of the directory’s contents, use get_dir_contents() instead.

func  mcp_udfs()

Signature
mcp_udfs(url: str) -> list['pxt.func.Function']

func  move()

Signature
move(
    path: str,
    new_path: str,
    *,
    if_exists: Literal['error', 'ignore'] = 'error',
    if_not_exists: Literal['error', 'ignore'] = 'error'
) -> None
Move a schema object to a new directory and/or rename a schema object. Parameters:
  • path (str): absolute path to the existing schema object.
  • new_path (str): absolute new path for the schema object.
  • if_exists (Literal['error', 'ignore'], default: 'error'): What to do if a schema object already exists at the new path. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return
  • if_not_exists (Literal['error', 'ignore'], default: 'error'): What to do if the source path does not exist. Must be one of the following:
    • 'error': raise an error
    • 'ignore': do nothing and return
Examples: Move a table to a different directory:
pxt.move('dir1/my_table', 'dir2/my_table')
Rename a table:
pxt.move('dir1/my_table', 'dir1/new_name')

func  publish()

Signature
publish(
    source: str | catalog.Table,
    destination_uri: str,
    bucket_name: str | None = None,
    access: Literal['public', 'private'] = 'private'
) -> None
Publishes a replica of a local Pixeltable table to Pixeltable cloud. A given table can be published to at most one URI per Pixeltable cloud database. Parameters:
  • source (str | catalog.Table): Path or table handle of the local table to be published.
  • destination_uri (str): Remote URI where the replica will be published, such as 'pxt://org_name/my_dir/my_table'.
  • bucket_name (str | None): The name of the bucket used to store the replica’s data. The bucket must be registered with Pixeltable cloud. If no bucket_name is provided, the default storage bucket for the destination database will be used.
  • access (Literal['public', 'private'], default: 'private'): Access control for the replica.
    • 'public': Anyone can access this replica.
    • 'private': Only the host organization can access this replica.

func  query()

Signature
query(*args: Any, **kwargs: Any) -> Any

func  replicate()

Signature
replicate(remote_uri: str, local_path: str) -> catalog.Table
Retrieve a replica from Pixeltable cloud as a local table. This will create a full local copy of the replica in a way that preserves the table structure of the original source data. Once replicated, the local table can be queried offline just as any other Pixeltable table. Parameters:
  • remote_uri (str): Remote URI of the table to be replicated, such as 'pxt://org_name/my_dir/my_table' or 'pxt://org_name/my_dir/my_table:5' (with version 5).
  • local_path (str): Local table path where the replica will be created, such as 'my_new_dir/my_new_tbl'. It can be the same or different from the cloud table name.
Returns:
  • catalog.Table: A handle to the newly created local replica table.

func  retrieval_udf()

Signature
retrieval_udf(
    table: catalog.Table,
    name: str | None = None,
    description: str | None = None,
    parameters: Iterable[str | exprs.ColumnRef] | None = None,
    limit: int | None = 10
) -> func.QueryTemplateFunction
Constructs a retrieval UDF for the given table. The retrieval UDF is a UDF whose parameters are columns of the table and whose return value is a list of rows from the table. The return value of
f(col1=x, col2=y, ...)
will be a list of all rows from the table that match the specified arguments. Parameters:
  • table (catalog.Table): The table to use as the dataset for the retrieval tool.
  • name (str | None): The name of the tool. If not specified, then the name of the table will be used by default.
  • description (str | None): The description of the tool. If not specified, then a default description will be generated.
  • parameters (Iterable[str | exprs.ColumnRef] | None): The columns of the table to use as parameters. If not specified, all data columns (non-computed columns) will be used as parameters. All of the specified parameters will be required parameters of the tool, regardless of their status as columns.
  • limit (int | None, default: 10): The maximum number of rows to return. If not specified, then all matching rows will be returned.
Returns:
  • func.QueryTemplateFunction: A UDF that, when called, returns a list of dictionaries containing data from the table, one per row that matches the input arguments. If there are no matching rows, it returns an empty list.
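The calling convention f(col1=x, col2=y, ...) can be pictured with an in-memory stand-in. This is a sketch, not the Pixeltable implementation: make_retrieval_fn and the sample rows are hypothetical, and matching by equality on each parameter is an assumption.

```python
from typing import Any, Optional


def make_retrieval_fn(rows: list[dict[str, Any]], parameters: list[str], limit: Optional[int] = 10):
    """Toy stand-in for a retrieval UDF over an in-memory list of rows."""
    def retrieve(**kwargs: Any) -> list[dict[str, Any]]:
        # Every listed parameter is required, as described above.
        missing = set(parameters) - kwargs.keys()
        if missing:
            raise TypeError(f'missing required parameters: {sorted(missing)}')
        # Assumption: a row matches when each parameter equals its argument.
        matches = [r for r in rows if all(r[p] == kwargs[p] for p in parameters)]
        return matches if limit is None else matches[:limit]
    return retrieve


rows = [
    {'city': 'Oslo', 'year': 2023, 'sales': 5},
    {'city': 'Oslo', 'year': 2024, 'sales': 7},
    {'city': 'Rome', 'year': 2024, 'sales': 9},
]
by_city = make_retrieval_fn(rows, ['city'])
print(len(by_city(city='Oslo')))  # 2
print(by_city(city='Lima'))       # []
```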

func  tool()

Signature
tool(
    fn: func.Function,
    name: str | None = None,
    description: str | None = None
) -> func.tools.Tool
Specifies a Pixeltable UDF to be used as an LLM tool with customizable metadata. See the documentation for pxt.tools() for more details. Parameters:
  • fn (func.Function): The UDF to use as a tool.
  • name (str | None): The name of the tool. If not specified, then the unqualified name of the UDF will be used by default.
  • description (str | None): The description of the tool. If not specified, then the entire contents of the UDF docstring will be used by default.
Returns:
  • func.tools.Tool: A Tool instance that can be passed to an LLM tool-calling API.
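The name and description defaults described above can be mimicked with a plain Python function. This is only an illustration: tool_metadata is a hypothetical helper, stock_price here is an ordinary function with a placeholder body rather than a Pixeltable UDF, and reading the name and docstring from __name__ and __doc__ is an assumption about how the defaults could be derived.

```python
def tool_metadata(fn, name=None, description=None):
    """Hypothetical helper mirroring the defaults described for tool()."""
    return {
        'name': name if name is not None else fn.__name__,
        'description': description if description is not None else (fn.__doc__ or ''),
    }


def stock_price(ticker: str) -> float:
    """Returns the latest price for the given ticker symbol."""
    return 0.0  # placeholder body; a real UDF would look up the price


print(tool_metadata(stock_price)['name'])                 # stock_price
print(tool_metadata(stock_price, name='quote')['name'])   # quote
print(tool_metadata(stock_price)['description'])
```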

func  tools()

Signature
tools(*args: func.Function | func.tools.Tool) -> func.tools.Tools
Specifies a collection of UDFs to be used as LLM tools. Pixeltable allows any UDF to be used as an input into an LLM tool-calling API. To use one or more UDFs as tools, wrap them in a pxt.tools call and pass the return value to an LLM API. The UDFs can be specified directly or wrapped inside a pxt.tool() invocation. If a UDF is specified directly, the tool name will be the (unqualified) UDF name, and the tool description will consist of the entire contents of the UDF docstring. If a UDF is wrapped in a pxt.tool() invocation, then the name and/or description may be customized. Parameters:
  • args (func.Function | func.tools.Tool): The UDFs to use as tools.
Returns:
  • func.tools.Tools: A Tools instance that can be passed to an LLM tool-calling API or invoked to generate tool results.
Examples: Create a tools instance with a single UDF:
tools = pxt.tools(stock_price)
Create a tools instance with several UDFs:
tools = pxt.tools(stock_price, weather_quote)
Create a tools instance, some of whose UDFs have customized metadata:
tools = pxt.tools(
    stock_price,
    pxt.tool(
        weather_quote,
        description='Returns information about the weather in a particular location.',
    ),
    pxt.tool(traffic_quote, name='traffic_conditions'),
)

func  uda()

Signature
uda(*args, **kwargs)
Decorator for user-defined aggregate functions. The decorated class must inherit from Aggregator and implement the following methods:
  • __init__(self, …) to initialize the aggregator
  • update(self, …) to update the aggregator with a new value
  • value(self) to return the final result
The decorator creates an AggregateFunction instance from the class and adds it to the module where the class is defined. Parameters:
  • requires_order_by: if True, the first parameter to the function is the order-by expression
  • allows_std_agg: if True, the function can be used as a standard aggregate function without a window
  • allows_window: if True, the function can be used with a window
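The three-method protocol can be sketched in plain Python. This is a stand-in rather than a working UDA: SumOfSquares does not inherit from Aggregator and is not decorated, and run_aggregate is a hypothetical driver that imitates how an aggregate might be folded over a column of values.

```python
class SumOfSquares:
    """Plain-Python stand-in; a real UDA would inherit from Aggregator."""

    def __init__(self) -> None:
        # Initialize the running state of the aggregator.
        self.total = 0

    def update(self, val: int) -> None:
        # Fold one new value into the running state.
        self.total += val * val

    def value(self) -> int:
        # Return the final result.
        return self.total


def run_aggregate(agg_cls, values):
    """Hypothetical driver: create, update once per value, then read the result."""
    agg = agg_cls()
    for v in values:
        agg.update(v)
    return agg.value()


print(run_aggregate(SumOfSquares, [1, 2, 3]))  # 14
```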

func  udf()

Signature
udf(*args, **kwargs)
A decorator to create a Function from a function definition. Examples:
@pxt.udf
def my_function(x: int) -> int:
    return x + 1
Last modified on April 11, 2026