# voyageai

> <a href="https://github.com/pixeltable/pixeltable/blob/main/pixeltable/functions/voyageai.py#L0" id="viewSource" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/View%20Source%20on%20Github-blue?logo=github&labelColor=gray" alt="View Source on GitHub" style={{ display: 'inline', margin: '0px' }} noZoom /></a>

# <span style={{ 'color': 'gray' }}>module</span>  pixeltable.functions.voyageai

Pixeltable UDFs that wrap various endpoints from the Voyage AI API. To use them, you must
first `pip install voyageai` and configure your Voyage AI credentials, as described in
the [Working with Voyage AI](https://docs.pixeltable.com/notebooks/integrations/working-with-voyageai) tutorial.

## <span style={{ 'color': 'gray' }}>udf</span>  embeddings()

```python Signature theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
@pxt.udf
embeddings(
    input: pxt.String,
    *,
    model: pxt.String,
    input_type: pxt.String | None = None,
    truncation: pxt.Bool | None = None,
    output_dimension: pxt.Int | None = None,
    output_dtype: pxt.String | None = None
) -> pxt.Array[(None,), float32]
```

Creates an embedding vector representing the input text.

Equivalent to the Voyage AI `embeddings` API endpoint.
For additional details, see: [https://docs.voyageai.com/docs/embeddings](https://docs.voyageai.com/docs/embeddings)

Request throttling:
Applies the rate limit set in the config (section `voyageai`, key `rate_limit`). If no rate
limit is configured, uses a default of 600 RPM.
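
As a rough illustration of what the rate limit means in practice, the sketch below computes a lower bound on the time a job needs at a given requests-per-minute setting. The helper `min_minutes` is hypothetical and not part of Pixeltable. It counts requests rather than rows, because how many rows are packed into one request is not stated here.

```python
DEFAULT_RPM = 600  # default stated above when no `rate_limit` is configured


def min_minutes(num_requests: int, rpm: int = DEFAULT_RPM) -> float:
    """Lower bound, in minutes, on the time needed to issue `num_requests` at `rpm`.

    Hypothetical helper for illustration; it ignores network latency and retries.
    """
    if rpm <= 0:
        raise ValueError('rpm must be positive')
    return num_requests / rpm


print(min_minutes(6_000))             # 10.0
print(min_minutes(6_000, rpm=1_200))  # 5.0
```

Doubling the configured `rate_limit` halves this bound, provided the Voyage AI account actually permits the higher rate.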

**Requirements:**

* `pip install voyageai`

**Parameters:**

* **`input`** (`pxt.String`): The text to embed.
* **`model`** (`pxt.String`): The model to use for the embedding. Recommended options: `voyage-3-large`, `voyage-3.5`,
  `voyage-3.5-lite`, `voyage-code-3`, `voyage-finance-2`, `voyage-law-2`.
* **`input_type`** (`pxt.String | None`): Type of the input text. Options: `None`, `query`, `document`.
  When `input_type` is `None`, the embedding model directly converts the inputs into numerical vectors.
  For retrieval/search purposes, we recommend setting this to `query` or `document` as appropriate.
* **`truncation`** (`pxt.Bool | None`): Whether to truncate the input texts to fit within the context length. Defaults to `True`.
* **`output_dimension`** (`pxt.Int | None`): The number of dimensions for resulting output embeddings.
  Most models only support a single default dimension. Models `voyage-3-large`, `voyage-3.5`,
  `voyage-3.5-lite`, and `voyage-code-3` support: 256, 512, 1024 (default), and 2048.
* **`output_dtype`** (`pxt.String | None`): The data type for the embeddings to be returned. Options: `float`, `int8`, `uint8`,
  `binary`, `ubinary`. Only `float` is currently supported in Pixeltable.

**Returns:**

* `pxt.Array[(None,), float32]`: The embedding vector computed for `input` by the given model.

**Examples:**

Add a computed column that applies the model `voyage-3.5` to an existing
Pixeltable column `tbl.text` of the table `tbl`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    embed=embeddings(tbl.text, model='voyage-3.5', input_type='document')
)
```

Add an embedding index to an existing column `text`, using the model `voyage-3.5`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_embedding_index(
    'text', string_embed=embeddings.using(model='voyage-3.5')
)
```
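
The `output_dimension` constraints listed under Parameters can be checked locally before a computed column is created. The following is a minimal sketch: `check_output_dimension` is a hypothetical helper, not part of Pixeltable or the Voyage AI SDK, and it encodes only the models and dimensions named on this page. It is deliberately conservative and rejects any explicit dimension for models not listed as flexible.

```python
from typing import Optional

# Models that the parameter list above documents as supporting several dimensions.
FLEXIBLE_MODELS = {'voyage-3-large', 'voyage-3.5', 'voyage-3.5-lite', 'voyage-code-3'}
SUPPORTED_DIMENSIONS = {256, 512, 1024, 2048}


def check_output_dimension(model: str, output_dimension: Optional[int]) -> None:
    """Raise ValueError if the documented constraints rule out this combination.

    Hypothetical helper for illustration only.
    """
    if output_dimension is None:
        return  # the model's default dimension is used
    if model not in FLEXIBLE_MODELS:
        raise ValueError(f'{model} is documented with a single default dimension')
    if output_dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f'output_dimension must be one of {sorted(SUPPORTED_DIMENSIONS)}, '
            f'got {output_dimension}'
        )


check_output_dimension('voyage-3.5', 512)    # passes silently
check_output_dimension('voyage-law-2', None)  # passes: default dimension
```

A check like this fails fast on a typo such as `output_dimension=500`, instead of surfacing the error only when the API request is made.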

## <span style={{ 'color': 'gray' }}>udf</span>  multimodal\_embed()

```python Signatures theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Signature 1:
@pxt.udf
multimodal_embed(
    text: pxt.String,
    model: pxt.String,
    input_type: pxt.String | None,
    truncation: pxt.Bool
) -> pxt.Array[(1024,), float32]

# Signature 2:
@pxt.udf
multimodal_embed(
    image: pxt.Image,
    model: pxt.String,
    input_type: pxt.String | None,
    truncation: pxt.Bool
) -> pxt.Array[(1024,), float32]

# Signature 3:
@pxt.udf
multimodal_embed(
    video: pxt.Video,
    model: pxt.String,
    input_type: pxt.String | None,
    truncation: pxt.Bool
) -> pxt.Array[(1024,), float32]
```

Creates an embedding vector for text, images, or video using Voyage AI's multimodal model.

Equivalent to the Voyage AI `multimodal_embed` API endpoint.
For additional details, see: [https://docs.voyageai.com/docs/multimodal-embeddings](https://docs.voyageai.com/docs/multimodal-embeddings)

Request throttling:
Applies the rate limit set in the config (section `voyageai`, key `rate_limit`). If no rate
limit is configured, uses a default of 600 RPM.

**Requirements:**

* `pip install voyageai`

**Parameters:**

* **`text`** (`pxt.String`): The text to embed.
* **`image`** (`pxt.Image`): The image to embed.
* **`video`** (`pxt.Video`): The video to embed.
* **`model`** (`pxt.String`): The model to use. Currently only `voyage-multimodal-3` is supported.
* **`input_type`** (`pxt.String | None`, default: `None`): Type of the input. Options: `None`, `query`, `document`.
  For retrieval/search, set to `query` or `document` as appropriate.
* **`truncation`** (`pxt.Bool`, default: `True`): Whether to truncate inputs to fit within the context length.

**Returns:**

* `pxt.Array[(1024,), float32]`: An array of 1024 floats representing the embedding.

**Examples:**

Embed a text column `description`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    embed=multimodal_embed(
        tbl.description, model='voyage-multimodal-3', input_type='document'
    )
)
```

Add an embedding index for column `description`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_embedding_index(
    'description',
    embed=multimodal_embed.using(model='voyage-multimodal-3'),
)
```

## <span style={{ 'color': 'gray' }}>udf</span>  rerank()

```python Signature theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
@pxt.udf
rerank(
    query: pxt.String,
    documents: pxt.Json[(String, ...)],
    *,
    model: pxt.String,
    top_k: pxt.Int | None = None,
    truncation: pxt.Bool = True
) -> pxt.Json
```

Reranks documents based on their relevance to a query.

Equivalent to the Voyage AI `rerank` API endpoint.
For additional details, see: [https://docs.voyageai.com/docs/reranker](https://docs.voyageai.com/docs/reranker)

Request throttling:
Applies the rate limit set in the config (section `voyageai`, key `rate_limit`). If no rate
limit is configured, uses a default of 600 RPM.

**Requirements:**

* `pip install voyageai`

**Parameters:**

* **`query`** (`pxt.String`): The query as a string.
* **`documents`** (`pxt.Json[(String, ...)]`): The documents to be reranked, as a list of strings.
* **`model`** (`pxt.String`): The model to use for reranking. Recommended options: `rerank-2.5`, `rerank-2.5-lite`.
* **`top_k`** (`pxt.Int | None`): The number of most relevant documents to return. If not specified, all documents
  will be reranked and returned.
* **`truncation`** (`pxt.Bool`): Whether to truncate the input to satisfy context length limits. Defaults to `True`.

**Returns:**

* `pxt.Json`: A dictionary containing:
  * `results`: List of reranking results with `index`, `document`, and `relevance_score`
  * `total_tokens`: The total number of tokens used

**Examples:**

Rerank similarity search results for better relevance. First, create a table with
an embedding index, then use a query function to retrieve candidates and rerank them:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
import pixeltable as pxt
from pixeltable.functions.voyageai import embeddings, rerank

docs = pxt.create_table('docs', {'text': pxt.String})
docs.add_computed_column(embed=embeddings(docs.text, model='voyage-3.5'))
docs.add_embedding_index('text', embed=docs.embed)


@pxt.query
def get_candidates(query_text: str):
    sim = docs.text.similarity(
        query_text, embed=embeddings.using(model='voyage-3.5')
    )
    return docs.order_by(sim, asc=False).limit(20).select(docs.text)


queries = pxt.create_table('queries', {'query': pxt.String})
queries.add_computed_column(candidates=get_candidates(queries.query))
queries.add_computed_column(
    reranked=rerank(
        queries.query,
        queries.candidates.text,
        model='rerank-2.5',
        top_k=5,
    )
)
```
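
Once a `reranked` value has been retrieved, it can be unpacked with plain Python. The sketch below uses a hand-written, hypothetical dictionary shaped like the Returns description above (it is not real API output), and the helper `top_documents` is likewise an illustration rather than part of Pixeltable.

```python
from typing import List

# Hypothetical value of one `reranked` cell, shaped like the Returns description.
# `index` is assumed here to be the document's position in the input list.
reranked = {
    'results': [
        {'index': 2, 'document': 'doc C', 'relevance_score': 0.91},
        {'index': 0, 'document': 'doc A', 'relevance_score': 0.47},
    ],
    'total_tokens': 42,
}


def top_documents(result: dict, min_score: float = 0.0) -> List[str]:
    """Return document texts, highest relevance first, dropping low scores."""
    ranked = sorted(
        result['results'], key=lambda r: r['relevance_score'], reverse=True
    )
    return [r['document'] for r in ranked if r['relevance_score'] >= min_score]


print(top_documents(reranked))                 # ['doc C', 'doc A']
print(top_documents(reranked, min_score=0.5))  # ['doc C']
```

Sorting explicitly keeps the helper correct even if the results are not already ordered by score, and `min_score` gives a simple way to discard weak matches before they reach a downstream prompt.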