class pixeltable.exprs.ColumnRef
A Pixeltable expression that references a column of a table. A ColumnRef is created by column access
on a Table, such as t.col.
method embedding()
Signature
Embedding indices are primarily used for similarity() lookup. Sometimes it is also useful to directly access
the index values (i.e., the embedding vectors themselves). Calling embedding() returns a new ColumnRef
expression of type pxt.Array[(dim,), prec], where dim and prec are the dimensionality and precision
of the column’s embedding index.
If there is more than one embedding index defined on this column, then the idx parameter must be provided to
specify which index to reference. If there is only one index, then idx is optional.
Args:
idx: An optional embedding index name. Required if there is more than one embedding index defined on
this column.
Returns:
A new ColumnRef referencing the values of the specified embedding index on this column.
Raises:
pxt.Error if there is no embedding index defined on this column, if idx is not provided when there are
multiple embedding indices, or if idx does not match any embedding index defined on this column.
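The idx rules above can be modelled in a few lines. This is only an illustrative sketch, not Pixeltable's implementation: resolve_embedding_index is a hypothetical helper, index_names stands in for the names of the embedding indices defined on the column, and ValueError stands in for pxt.Error.

```python
from typing import Optional, Sequence


def resolve_embedding_index(index_names: Sequence[str], idx: Optional[str] = None) -> str:
    """Return the index name that an embedding(idx=...) call would refer to.

    Hypothetical helper that mirrors the documented rules only.
    """
    # No embedding index at all: always an error.
    if not index_names:
        raise ValueError('no embedding index defined on this column')
    if idx is None:
        # idx may be omitted only when the choice is unambiguous.
        if len(index_names) > 1:
            raise ValueError('multiple embedding indices defined; idx is required')
        return index_names[0]
    # An explicit idx must name an existing index.
    if idx not in index_names:
        raise ValueError(f'no embedding index named {idx!r} on this column')
    return idx
```

With a single index the name can be omitted; with several, the caller has to pass one of the existing names.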
Examples:
All of these examples assume that t is a table with an image column t.image.
Add an embedding index to t.image using the clip()
embedding (this only needs to be done once):
>>> from pixeltable.functions.huggingface import clip
>>> t.add_embedding_index(
...     t.image, embedding=clip.using(model_id='openai/clip-vit-base-patch32')
... )
Reference the embedding index values directly:
>>> t.select(t.image, t.image.embedding())
method similarity()
Signature
One of string, image, audio, video, document, or vector must be provided. The item
parameter is deprecated and exists for backward compatibility only.
If string, image, audio, video, or document is provided, then an embedding vector will be computed
for the given input as defined by the embedding index and used to determine similarity. If vector is
provided, then it must be a 1-dimensional array of the same dimensionality as the index, and similarity will
be determined directly against the vector.
The optional idx parameter specifies the name of the embedding index to use. If there is more than one
embedding index defined on this column, then idx must be provided.
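The shape rule for vector can be checked before the expression is built. The following is a minimal sketch under stated assumptions: similarity_to_vector is a hypothetical helper, col is assumed to be a ColumnRef with an embedding index, vector is assumed to be a NumPy array, and dim must be supplied by the caller because the sketch does not read the dimensionality from the index.

```python
def similarity_to_vector(col, vector, dim, idx=None):
    """Build col.similarity(vector=...) after checking the documented shape rule.

    Hypothetical helper: `dim` is the dimensionality of the embedding index,
    passed in explicitly by the caller.
    """
    # The vector must be 1-dimensional ...
    if vector.ndim != 1:
        raise ValueError(f'expected a 1-dimensional vector, got {vector.ndim} dimensions')
    # ... and have as many elements as the index has dimensions.
    if vector.shape[0] != dim:
        raise ValueError(f'expected {dim} elements, got {vector.shape[0]}')
    # Forward idx only when the caller names an index explicitly.
    kwargs = {} if idx is None else {'idx': idx}
    return col.similarity(vector=vector, **kwargs)
```

Checking the shape up front gives an error message in terms of the caller's own data before any query is built.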
Parameters:
string (str | None): A string to compare against the values of this column.
image (str | PIL.Image.Image | None): An image to compare against the values of this column (either a local file path, a URL, or an in-memory PIL.Image.Image).
audio (str | None): An audio file to compare against the values of this column (a local file path or a URL).
video (str | None): A video file to compare against the values of this column (a local file path or a URL).
document (str | None): A document file to compare against the values of this column (a local file path or a URL).
vector (np.ndarray | None): A 1-dimensional NumPy array to compare against the values of this column.
idx (str | None): An optional embedding index name. Required if there is more than one embedding index defined on this column.
item (Any): Deprecated as of version 0.5.7.
Returns:
Expr: A new expression representing the similarity score between the values of this column and the given item.
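The returned expression is typically used to rank rows. The following is a sketch of a top-k query, assuming t is a table whose t.image column has an embedding index that can embed strings (a CLIP index, for instance). top_k_similar and the score output name are hypothetical; order_by(), limit(), and select() are the usual Pixeltable query methods.

```python
from typing import Optional


def top_k_similar(t, query: str, k: int = 5, idx: Optional[str] = None):
    """Return a query for the k rows of `t` whose image best matches `query`.

    Hypothetical helper; assumes `t.image` carries an embedding index.
    """
    # Forward idx only when the caller names an index explicitly.
    kwargs = {} if idx is None else {'idx': idx}
    sim = t.image.similarity(string=query, **kwargs)
    # Highest similarity first, keep k rows, expose the score as `score`.
    return t.order_by(sim, asc=False).limit(k).select(t.image, score=sim)
```

Calling top_k_similar(t, 'a photo of a dog').collect() would then execute the query and return the matching rows.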
Examples:
All of these examples assume that t is a table with an image column t.image.
Add an embedding index to t.image using the clip() embedding (this only needs to be done once):
k=5):