Methods
add_column()
Adds an ordinary (non-computed) column to the table.
Signature:
- kwargs (ts.ColumnType | builtins.type | _GenericAlias | exprs.Expr): Exactly one keyword argument of the form col_name=col_type.
- if_exists (Literal['error', 'ignore', 'replace', 'replace_force']) = 'error': Determines the behavior if the column already exists. Must be one of the following:
  - 'error': an exception will be raised.
  - 'ignore': do nothing and return.
  - 'replace' or 'replace_force': drop the existing column and add the new column, if it has no dependents.
Returns:
- UpdateStatus: Information about the execution status of the operation.
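The call shape described by these parameters can be sketched with a small helper. This is an illustration only: `ensure_column` is a hypothetical name, and `tbl` is assumed to be a Pixeltable table handle obtained elsewhere.

```python
def ensure_column(tbl, name, col_type):
    """Add a plain column called `name` unless it already exists.

    Hypothetical helper: `tbl` is assumed to be a Pixeltable table, and the
    only library call used is the add_column() method documented above.
    """
    # add_column() expects exactly one col_name=col_type keyword argument,
    # so a dynamic column name is passed through ** unpacking.
    return tbl.add_column(**{name: col_type}, if_exists='ignore')
```

Under these assumptions, `ensure_column(tbl, 'rating', int)` issues the same call as `tbl.add_column(rating=int, if_exists='ignore')`.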
add_columns()
Adds multiple columns to the table. The columns must be concrete (non-computed) columns; to add computed
columns, use add_computed_column() instead.
The format of the schema argument is a dict mapping column names to their types.
Signature:
- schema (dict[str, ts.ColumnType | builtins.type | _GenericAlias]): A dictionary mapping column names to types.
- if_exists (Literal['error', 'ignore', 'replace', 'replace_force']) = 'error': Determines the behavior if a column already exists. Must be one of the following:
  - 'error': an exception will be raised.
  - 'ignore': do nothing and return.
  - 'replace' or 'replace_force': drop the existing column and add the new column, if it has no dependents.
The if_exists parameter is applied to all columns in the schema. To apply different behaviors to different columns, use add_column() for each column.
Returns:
- UpdateStatus: Information about the execution status of the operation.
Add multiple columns to the table my_table:
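A minimal sketch of such a call follows. The helper name and the column names `reviewer` and `score` are made up for illustration, and `tbl` is assumed to be a Pixeltable table handle.

```python
def add_review_columns(tbl):
    """Add two plain columns in a single add_columns() call.

    Hypothetical helper: `tbl` is assumed to be a Pixeltable table, and the
    column names are illustrative.
    """
    # The schema maps column names to types; built-in Python types are
    # among the accepted type specifications listed above.
    schema = {'reviewer': str, 'score': int}
    # With the default if_exists='error', an exception is raised if a
    # column already exists.
    return tbl.add_columns(schema)
```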
add_computed_column()
Adds a computed column to the table.
Signature:
- kwargs (exprs.Expr): Exactly one keyword argument of the form col_name=expression.
- stored (Optional[bool]): Whether the column is materialized and stored or computed on demand.
- destination (Optional[str | Path]): An object store reference for persisting computed files.
- print_stats (bool) = False: If True, print execution metrics during evaluation.
- on_error (Literal['abort', 'ignore']) = 'abort': Determines the behavior if an error occurs while evaluating the column expression for at least one row.
  - 'abort': an exception will be raised and the column will not be added.
  - 'ignore': execution will continue and the column will be added. Any rows with errors will have a None value for the column, with information about the error stored in the corresponding tbl.col_name.errormsg and tbl.col_name.errortype fields.
- if_exists (Literal['error', 'ignore', 'replace']) = 'error': Determines the behavior if the column already exists. Must be one of the following:
  - 'error': an exception will be raised.
  - 'ignore': do nothing and return.
  - 'replace' or 'replace_force': drop the existing column and add the new column, iff it has no dependents.
Returns:
- UpdateStatus: Information about the execution status of the operation.
For a table with an image column frame, add an image column rotated that rotates the image by 90 degrees:
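One way this could look is sketched below. It assumes that `tbl` is a Pixeltable table with an image column `frame`, and that image column expressions expose a `rotate()` method. The helper name is hypothetical.

```python
def add_rotated_column(tbl):
    """Add a computed column `rotated` derived from the image column `frame`.

    Assumptions: `tbl` is a Pixeltable table with an image column `frame`,
    and `tbl.frame.rotate(90)` builds an expression that rotates the image.
    """
    # Exactly one col_name=expression keyword argument defines the column.
    # on_error='abort' (the default) means a failing row prevents the
    # column from being added.
    return tbl.add_computed_column(rotated=tbl.frame.rotate(90), on_error='abort')
```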
add_embedding_index()
Add an embedding index to the table. Once the index is created, it will be automatically kept up-to-date as new
rows are inserted into the table.
To add an embedding index, one must specify, at minimum, the column to be indexed and an embedding UDF. Only String and Image columns are currently supported. Here’s an example that uses a CLIP embedding to index an image column:
Once the index is created, similarity lookups can be performed using the similarity pseudo-function.
- column (str | ColumnRef): The name of, or reference to, the column to be indexed; must be a String or Image column.
- idx_name (Optional[str]): An optional name for the index. If not specified, a name such as 'idx0' will be generated automatically. If specified, the name must be unique for this table and a valid Pixeltable column name.
- embedding (Optional[pxt.Function]): The UDF to use for the embedding. Must be a UDF that accepts a single argument of type String or Image (as appropriate for the column being indexed) and returns a fixed-size 1-dimensional array of floats.
- string_embed (Optional[pxt.Function]): An optional UDF to use for the string embedding component of this index. Can be used in conjunction with image_embed to construct multimodal embeddings manually, by specifying different embedding functions for different data types.
- image_embed (Optional[pxt.Function]): An optional UDF to use for the image embedding component of this index. Can be used in conjunction with string_embed to construct multimodal embeddings manually, by specifying different embedding functions for different data types.
- metric (str) = 'cosine': Distance metric to use for the index; one of 'cosine', 'ip', or 'l2'. The default is 'cosine'.
- if_exists (Literal['error', 'ignore', 'replace', 'replace_force']) = 'error': Directive for handling an existing index with the same name. Must be one of the following:
  - 'error': raise an error if an index with the same name already exists.
  - 'ignore': do nothing if an index with the same name already exists.
  - 'replace' or 'replace_force': replace the existing index with the new one.
Add an embedding index to the img column of the table my_table:
Alternatively, the img column may be specified by name:
Add a second index to the img column, using the inner product as the distance metric, and with a specific name:
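The sketch below shows two of these call shapes side by side. It assumes `tbl` is a Pixeltable table with an image column `img` and that `embed_fn` is an embedding UDF (such as a CLIP embedding) supplied by the caller. The helper name and the index name `img_ip_idx` are hypothetical.

```python
def index_images(tbl, embed_fn):
    """Create two embedding indexes on the image column `img`.

    Assumptions: `tbl` is a Pixeltable table and `embed_fn` is an embedding
    UDF that accepts an image and returns a fixed-size array of floats.
    """
    # Default settings: cosine metric and an automatically generated name.
    tbl.add_embedding_index('img', embedding=embed_fn)
    # A second, explicitly named index that uses the inner product metric.
    tbl.add_embedding_index('img', idx_name='img_ip_idx', embedding=embed_fn, metric='ip')
```

Passing the embedding function in as an argument keeps the sketch independent of any particular model.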
batch_update()
Update rows in this table.
Signature:
- rows (Iterable[dict[str, Any]]): an Iterable of dictionaries containing values for the updated columns plus values for the primary key columns.
- cascade (bool) = True: if True, also update all computed columns that transitively depend on the updated columns.
- if_not_exists (Literal['error', 'ignore', 'insert']) = 'error': Specifies the behavior if a row to update does not exist:
  - 'error': Raise an error.
  - 'ignore': Skip the row silently.
  - 'insert': Insert the row.
Update the name and age columns for the rows with ids 1 and 2 (assuming id is the primary key). If either row does not exist, this raises an error:
Update the name and age columns for the row with id 1 (assuming id is the primary key) and insert the row with new id 3 (assuming this key does not exist):
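A hedged sketch of the update-or-insert pattern is given below. `upsert_rows` is a hypothetical helper, `tbl` is assumed to be a Pixeltable table whose primary key is a single column (`id` by default here), and the up-front key check is an addition of this sketch rather than library behavior.

```python
def upsert_rows(tbl, rows, key='id'):
    """Update rows that exist and insert those that do not.

    Hypothetical helper: `tbl` is assumed to be a Pixeltable table whose
    primary key is the single column `key`.
    """
    rows = list(rows)
    # Each dictionary must carry the primary key value; check before calling.
    missing = [row for row in rows if key not in row]
    if missing:
        raise ValueError(f'{len(missing)} row(s) lack the primary key column {key!r}')
    # if_not_exists='insert' inserts rows whose key is not yet in the table.
    return tbl.batch_update(rows, if_not_exists='insert')
```

For instance, `upsert_rows(tbl, [{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 3, 'name': 'Carol', 'age': 25}])` would update row 1 and insert row 3 if key 3 does not exist.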
columns()
Return the names of the columns in this table.
Signature:
count()
Return the number of rows in this table.
Signature:
delete()
Delete rows in this table.
Signature:
- where (Optional['exprs.Expr']): a predicate to filter rows to delete.
Delete all rows where column a is greater than 5:
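As a rough sketch, the predicate can be built by comparing a column reference with a literal. The helper name is hypothetical, and `tbl` is assumed to be a Pixeltable table with a numeric column `a`.

```python
def delete_rows_above(tbl, threshold=5):
    """Delete all rows whose column `a` exceeds `threshold`.

    Assumption: `tbl` is a Pixeltable table with a numeric column `a`, and
    comparing the column reference `tbl.a` with a literal yields a predicate.
    """
    return tbl.delete(where=tbl.a > threshold)
```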
describe()
Print the table schema.
Signature:
drop_column()
Drop a column from the table.
Signature:
- column (str | ColumnRef): The name or reference of the column to drop.
- if_not_exists (Literal['error', 'ignore']) = 'error': Directive for handling a non-existent column. Must be one of the following:
  - 'error': raise an error if the column does not exist.
  - 'ignore': do nothing if the column does not exist.
Drop the column col from the table my_table by column name:
Drop the column col from the table my_table by column reference:
Drop the column col from the table my_table if it exists, otherwise do nothing:
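The last variant, which tolerates a missing column, might be wrapped as follows. `drop_if_present` is a hypothetical helper and `tbl` is assumed to be a Pixeltable table handle.

```python
def drop_if_present(tbl, name):
    """Drop the column `name`, doing nothing if it does not exist.

    Hypothetical helper: `tbl` is assumed to be a Pixeltable table.
    """
    # if_not_exists='ignore' turns a missing column into a no-op
    # instead of an error.
    return tbl.drop_column(name, if_not_exists='ignore')
```

Passing a column reference such as `tbl.col` in place of the name should work the same way, since the `column` parameter accepts either form.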
drop_embedding_index()
Drop an embedding index from the table. Either a column name or an index name (but not both) must be
specified. If a column name or reference is specified, it must be a column containing exactly one embedding index; otherwise the specific index name must be provided instead.
Signature:
- column (str | ColumnRef | None): The name of, or reference to, the column from which to drop the index. The column must have only one embedding index.
- idx_name (Optional[str]): The name of the index to drop.
- if_not_exists (Literal['error', 'ignore']) = 'error': Directive for handling a non-existent index. Must be one of the following:
  - 'error': raise an error if the index does not exist.
  - 'ignore': do nothing if the index does not exist.
The if_not_exists parameter applies only when idx_name is specified and that index does not exist, or when column is specified and the column has no index. It does not apply to a non-existent column.
Example:
Drop the embedding index on the img column of the table my_table by column name:
Drop the embedding index on the img column of the table my_table by column reference:
Drop the embedding index idx1 of the table my_table by index name:
Drop the embedding index idx1 of the table my_table by index name, if it exists, otherwise do nothing:
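The "column or index name, but not both" rule can be made explicit in a small wrapper. This is a sketch under assumptions: the helper name is hypothetical, `tbl` is assumed to be a Pixeltable table, and the argument check is done here rather than relying on the library to report it.

```python
def drop_one_embedding_index(tbl, column=None, idx_name=None):
    """Drop an embedding index identified by column or by index name.

    Hypothetical helper: `tbl` is assumed to be a Pixeltable table.
    """
    # Exactly one of the two identifiers must be provided.
    if (column is None) == (idx_name is None):
        raise ValueError('specify exactly one of column or idx_name')
    if column is not None:
        # Valid only when the column carries exactly one embedding index.
        return tbl.drop_embedding_index(column=column)
    # When dropping by name, a missing index is treated as a no-op.
    return tbl.drop_embedding_index(idx_name=idx_name, if_not_exists='ignore')
```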
drop_index()
Drop an index from the table. Either a column name or an index name (but not both) must be
specified. If a column name or reference is specified, it must be a column containing exactly one index; otherwise the specific index name must be provided instead.
Signature:
- column (str | ColumnRef | None): The name of, or reference to, the column from which to drop the index. The column must have only one index.
- idx_name (Optional[str]): The name of the index to drop.
- if_not_exists (Literal['error', 'ignore']) = 'error': Directive for handling a non-existent index. Must be one of the following:
  - 'error': raise an error if the index does not exist.
  - 'ignore': do nothing if the index does not exist.
The if_not_exists parameter applies only when idx_name is specified and that index does not exist, or when column is specified and the column has no index. It does not apply to a non-existent column.
Example:
Drop the index on the img column of the table my_table by column name:
Drop the index on the img column of the table my_table by column reference:
Drop the index idx1 of the table my_table by index name:
Drop the index idx1 of the table my_table by index name, if it exists, otherwise do nothing:
get_metadata()
Retrieves metadata associated with this table.
Signature:
Returns:
- TableMetadata: A TableMetadata instance containing this table’s metadata.
get_versions()
Returns information about versions of this table, most recent first.
get_versions() is intended for programmatic access to version metadata; for human-readable output, use history() instead.
Signature:
- n (Optional[int]): if specified, will return at most n versions.
Returns:
- list[VersionMetadata]: A list of VersionMetadata dictionaries, one per version retrieved, most recent first.
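Retrieve metadata about all versions of the table tbl:
Retrieve metadata about only the most recent versions (limited by n) of the table tbl: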
history()
Returns a human-readable report about versions of this table.
history() is intended for human-readable output of version metadata; for programmatic access, use get_versions() instead.
Signature:
- n (Optional[int]): if specified, will return at most n versions.
Returns:
- pd.DataFrame: A report with information about each version, one per row, most recent first.
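To contrast the two access paths, the sketch below fetches the same number of versions both ways. The helper name is hypothetical and `tbl` is assumed to be a Pixeltable table handle.

```python
def latest_versions(tbl, n=5):
    """Return version metadata in both programmatic and report form.

    Hypothetical helper: `tbl` is assumed to be a Pixeltable table.
    """
    # Programmatic access: a list of dictionaries, most recent first.
    versions = tbl.get_versions(n=n)
    # Human-readable access: a pandas DataFrame with one row per version.
    report = tbl.history(n=n)
    return versions, report
```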
insert()
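Inserts rows into this table. There are two mutually exclusive call patterns: to insert multiple rows at a time, pass a data source as source; to insert a single row, pass the column names and values as keyword arguments.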
- source (Optional[TableDataSource]): A data source from which data can be imported.
- kwargs (Any): (if inserting a single row) Keyword-argument pairs representing column names and values. (if inserting multiple rows) Additional keyword arguments are passed to the data source.
- source_format (Optional[Literal['csv', 'excel', 'parquet', 'json']]): A hint about the format of the source data.
- schema_overrides (Optional[dict[str, ts.ColumnType]]): If specified, then columns in schema_overrides will be given the specified types.
- on_error (Literal['abort', 'ignore']) = 'abort': Determines the behavior if an error occurs while evaluating a computed column or detecting an invalid media file (such as a corrupt image) for one of the inserted rows.
  - If on_error='abort', then an exception will be raised and the rows will not be inserted.
  - If on_error='ignore', then execution will continue and the rows will be inserted. Any cells with errors will have a None value for that cell, with information about the error stored in the corresponding tbl.col_name.errortype and tbl.col_name.errormsg fields.
- print_stats (bool) = False: If True, print statistics about the cost of computed columns.
Returns:
- UpdateStatus: An UpdateStatus object containing information about the update.
Insert rows into the table my_table with three int columns a, b, and c. Column c is nullable:
Insert rows into a table with pxt.Int columns a and b:
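Both call patterns are sketched below. The helper name is hypothetical; `tbl` is assumed to be a Pixeltable table with int columns `a`, `b`, and a nullable `c`, and a list of dictionaries is assumed to be an acceptable data source.

```python
def insert_examples(tbl):
    """Insert rows using both call patterns of insert().

    Assumptions: `tbl` is a Pixeltable table with int columns a, b and a
    nullable int column c; a list of dicts is accepted as a data source.
    """
    # Multiple rows at once; the second row omits the nullable column c.
    tbl.insert([{'a': 1, 'b': 1, 'c': 1}, {'a': 2, 'b': 2}])
    # A single row, given as column-name/value keyword arguments.
    tbl.insert(a=3, b=3, c=3)
```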
list_views()
Returns a list of all views and snapshots of this Table.
Signature:
- recursive (bool) = True: If False, returns only the immediate successor views of this Table. If True, returns all sub-views (including views of views, etc.)
Returns:
- list[str]: A list of view paths.
recompute_columns()
Recompute the values in one or more computed columns of this table.
Signature:
- columns (str | ColumnRef): The names or references of the computed columns to recompute.
- where ('exprs.Expr' | None): A predicate to filter rows to recompute.
- errors_only (bool) = False: If True, only run the recomputation for rows that have errors in the column (i.e., the column’s errortype property indicates that an error occurred). Only allowed for recomputing a single column.
- cascade (bool) = True: if True, also update all computed columns that transitively depend on the recomputed columns.
Recompute computed columns c1 and c2 for all rows in this table, and everything that transitively depends on them:
Recompute computed column c1 for all rows in this table, but don’t recompute other columns that depend on it:
Recompute column c1 and its dependents, but only for rows with c2 == 0:
Recompute column c1 and its dependents, but only for rows that have errors in it:
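Three of these variants are sketched below. The sketch assumes that `tbl` is a Pixeltable table with computed columns `c1` and `c2`, and that the columns to recompute are passed as positional arguments. The helper name is hypothetical.

```python
def recompute_examples(tbl):
    """Issue three recompute_columns() calls with different options.

    Assumptions: `tbl` is a Pixeltable table with computed columns c1 and
    c2, and column names are accepted as positional arguments.
    """
    # c1 and c2 for all rows, including everything that depends on them.
    tbl.recompute_columns('c1', 'c2')
    # c1 only, leaving columns that depend on it untouched.
    tbl.recompute_columns('c1', cascade=False)
    # c1 and its dependents, restricted to rows with errors in c1.
    tbl.recompute_columns('c1', errors_only=True)
```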
rename_column()
Rename a column.
Signature:
- old_name (str): The current name of the column.
- new_name (str): The new name of the column.
Rename the column col1 to col2 of the table my_table:
revert()
Reverts the table to the previous version.
Warning: This operation is irreversible.
Signature:
sync()
Synchronizes this table with its linked external stores.
Signature:
- stores (str | list[str] | None): If specified, will synchronize only the specified named store or list of stores. If not specified, will synchronize all of this table’s external stores.
- export_data (bool) = True: If True, data from this table will be exported to the external stores during synchronization.
- import_data (bool) = True: If True, data from the external stores will be imported to this table during synchronization.
unlink_external_stores()
Unlinks this table’s external stores.
Signature:
- stores (str | list[str] | None): If specified, will unlink only the specified named store or list of stores. If not specified, will unlink all of this table’s external stores.
- ignore_errors (bool) = False: If True, no exception will be thrown if a specified store is not linked to this table.
- delete_external_data (bool) = False: If True, then the external data store will also be deleted. WARNING: This is a destructive operation that will delete data outside Pixeltable, and cannot be undone.
update()
Update rows in this table.
Signature:
- value_spec (dict[str, Any]): a dictionary mapping column names to literal values or Pixeltable expressions.
- where (Optional['exprs.Expr']): a predicate to filter rows to update.
- cascade (bool) = True: if True, also update all computed columns that transitively depend on the updated columns.
Returns:
- UpdateStatus: An UpdateStatus object containing information about the update.
Set column int_col to 1 for all rows:
Set column int_col to 1 for all rows where int_col is 0:
Set column int_col to the value of other_int_col + 1:
Increment int_col by 1 for all rows where int_col is 0:
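The last case combines an expression value with a predicate, and could be sketched as follows. The helper name is hypothetical; `tbl` is assumed to be a Pixeltable table with an int column `int_col`, and arithmetic and comparison on column references are assumed to build Pixeltable expressions.

```python
def increment_zero_rows(tbl):
    """Increment `int_col` by 1 for all rows where it is currently 0.

    Assumptions: `tbl` is a Pixeltable table with an int column `int_col`;
    `tbl.int_col + 1` and `tbl.int_col == 0` build expressions.
    """
    # value_spec maps the column name to an expression over the same column;
    # where limits the update to the matching rows.
    return tbl.update({'int_col': tbl.int_col + 1}, where=tbl.int_col == 0)
```

Replacing the expression with a literal, as in `tbl.update({'int_col': 1})`, would set the column to 1 for all rows.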