# gemini

> <a href="https://github.com/pixeltable/pixeltable/blob/main/pixeltable/functions/gemini.py#L0" id="viewSource" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/View%20Source%20on%20Github-blue?logo=github&labelColor=gray" alt="View Source on GitHub" style={{ display: 'inline', margin: '0px' }} noZoom /></a>

# <span style={{ 'color': 'gray' }}>module</span>  pixeltable.functions.gemini

Pixeltable UDFs that wrap various endpoints from the Google Gemini API. To use them, you must
first `pip install google-genai` and configure your Gemini credentials, as described in
the [Working with Gemini](https://docs.pixeltable.com/howto/providers/working-with-gemini) tutorial.

The module supports two authentication methods:

* Google AI Studio: set `GOOGLE_API_KEY` or `GEMINI_API_KEY` (or put `api_key` in the `gemini` section of
  the Pixeltable config file).
* Vertex AI: set `GOOGLE_GENAI_USE_VERTEXAI=true` and `GOOGLE_CLOUD_PROJECT` (and optionally
  `GOOGLE_CLOUD_LOCATION`), then authenticate via Application Default Credentials
  (e.g. `gcloud auth application-default login`).
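
To check which of the two methods a given environment would select, a small diagnostic along the following lines may help. It is a minimal sketch, not part of Pixeltable: `describe_gemini_auth` is a hypothetical helper, it only inspects the environment variables named above (not the Pixeltable config file), and the assumption that the Vertex AI flag takes precedence over an API key is made for illustration only.

```python
import os
from typing import Mapping


def describe_gemini_auth(env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: report which Gemini auth method the environment suggests."""
    # Assumption: the Vertex AI flag wins when both methods are configured.
    use_vertex = env.get('GOOGLE_GENAI_USE_VERTEXAI', '').strip().lower() == 'true'
    if use_vertex:
        project = env.get('GOOGLE_CLOUD_PROJECT')
        if not project:
            return 'vertex-ai: GOOGLE_CLOUD_PROJECT is not set'
        # GOOGLE_CLOUD_LOCATION is optional, so a missing value is not an error.
        location = env.get('GOOGLE_CLOUD_LOCATION', '<unset>')
        return f'vertex-ai: project={project}, location={location}'
    if env.get('GOOGLE_API_KEY') or env.get('GEMINI_API_KEY'):
        return 'ai-studio: API key found in environment'
    return 'unconfigured: no API key and no Vertex AI settings found'
```

Calling `describe_gemini_auth()` with no arguments inspects the current process environment; passing a plain dict makes the logic easy to exercise without touching real credentials.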

## <span style={{ 'color': 'gray' }}>func</span>  invoke\_tools()

```python Signature theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
invoke_tools(
    tools: pixeltable.func.tools.Tools,
    response: pixeltable.exprs.expr.Expr
) -> pixeltable.exprs.inline_expr.InlineDict
```

Converts a Gemini response dict to Pixeltable tool invocation format and calls `tools._invoke()`.

## <span style={{ 'color': 'gray' }}>udf</span>  embed\_content()

```python Signatures theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Signature 1:
@pxt.udf
embed_content(
    contents: pxt.String,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 2:
@pxt.udf
embed_content(
    contents: pxt.Image,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 3:
@pxt.udf
embed_content(
    contents: pxt.Audio,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 4:
@pxt.udf
embed_content(
    contents: pxt.Video,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 5:
@pxt.udf
embed_content(
    contents: pxt.Document,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]
```

Generates embeddings for text, images, audio, video, and documents. For more information on the Gemini embeddings API, see:
[https://ai.google.dev/gemini-api/docs/embeddings](https://ai.google.dev/gemini-api/docs/embeddings)

**Requirements:**

* `pip install google-genai`

**Parameters:**

* **`contents`** (`String`): The string, image, audio, video, or document to embed.
* **`model`** (`String`): The Gemini model to use.
* **`config`** (`Json | None`, default: `None`): Configuration for embedding generation, corresponding to keyword arguments of
  `genai.types.EmbedContentConfig`. For details on the parameters, see:
  [https://googleapis.github.io/python-genai/genai.html#genai.types.EmbedContentConfig](https://googleapis.github.io/python-genai/genai.html#genai.types.EmbedContentConfig)

**Returns:**

* `pxt.Array[(None,), float32]`: The corresponding embedding vector.

**Examples:**

Add a computed column with embeddings to an existing table with a `text` column:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
t.add_computed_column(
    embedding=embed_content(t.text, model='gemini-embedding-2')
)
```

Add an embedding index on the `text` column:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
t.add_embedding_index(
    t.text, embedding=embed_content.using(model='gemini-embedding-2')
)
```
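
Because `config` mirrors the keyword arguments of `genai.types.EmbedContentConfig`, it can be assembled as a plain dict. The sketch below uses a hypothetical helper, `embed_config`, and two fields of that class (`task_type` and `output_dimensionality`); whether a particular embedding model honors them is something to confirm in the linked reference.

```python
from typing import Any, Dict, Optional


def embed_config(
    task_type: Optional[str] = None,
    output_dimensionality: Optional[int] = None,
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: build a `config` dict, omitting unset fields."""
    candidate = {
        'task_type': task_type,
        'output_dimensionality': output_dimensionality,
    }
    # Drop unset entries so that only explicit choices are sent.
    config = {key: value for key, value in candidate.items() if value is not None}
    # An empty dict carries no information, so fall back to None.
    return config or None


document_config = embed_config(task_type='RETRIEVAL_DOCUMENT', output_dimensionality=768)
```

The resulting dict would be passed as `config=document_config` alongside `model` in the `embed_content` calls shown above.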

## <span style={{ 'color': 'gray' }}>udf</span>  generate\_content()

```python Signature theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
@pxt.udf
generate_content(
    contents: pxt.Json,
    *,
    model: pxt.String,
    config: pxt.Json | None = None,
    tools: pxt.Json[(Json, ...)] | None = None
) -> pxt.Json
```

Generate content from the specified model.

Request throttling:
Applies the rate limit set in the config (section `gemini.rate_limits`; use the model id as the key). If no rate
limit is configured, uses a default of 600 RPM.
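
The lookup described above can be pictured as follows. This is an illustrative sketch rather than Pixeltable's implementation: `resolve_rpm` is a hypothetical helper, and the nested dict stands in for a parsed config file whose exact on-disk layout is assumed here.

```python
from typing import Any, Dict

DEFAULT_RPM = 600  # default stated in the text above


def resolve_rpm(config: Dict[str, Any], section: str, model: str) -> int:
    """Hypothetical helper: look up `<section>.rate_limits[<model>]`, else the default."""
    rate_limits = config.get(section, {}).get('rate_limits', {})
    return int(rate_limits.get(model, DEFAULT_RPM))


# Stand-in for a parsed config: the model id is the key, the value is requests per minute.
parsed_config = {'gemini': {'rate_limits': {'gemini-2.5-flash': 1000}}}

configured = resolve_rpm(parsed_config, 'gemini', 'gemini-2.5-flash')
fallback = resolve_rpm(parsed_config, 'gemini', 'some-other-model')
```

Here `configured` picks up the per-model entry, while `fallback` uses the 600 RPM default because no entry exists for that model id.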

**Requirements:**

* `pip install google-genai`

**Parameters:**

* **`contents`** (`pxt.Json`): The input content to generate from. Can be a prompt, or a list containing images and text
  prompts, as described in [https://ai.google.dev/gemini-api/docs/text-generation](https://ai.google.dev/gemini-api/docs/text-generation). For image generation, see
  [https://ai.google.dev/gemini-api/docs/image-generation](https://ai.google.dev/gemini-api/docs/image-generation).
* **`model`** (`pxt.String`): The name of the model to use.
* **`config`** (`pxt.Json | None`): Configuration for generation, corresponding to keyword arguments of
  `genai.types.GenerateContentConfig`. For details on the parameters, see:
  [https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig](https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig)
* **`tools`** (`pxt.Json[(Json, ...)] | None`): An optional list of Pixeltable tools to use. It is also possible to specify tools manually via the
  `config['tools']` parameter, but at most one of `config['tools']` or `tools` may be used.

**Returns:**

* `pxt.Json`: A dictionary containing the response and other metadata.

**Examples:**

Add a computed column that applies the model `gemini-2.5-flash`
to an existing Pixeltable column `tbl.prompt` of the table `tbl`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    response=generate_content(tbl.prompt, model='gemini-2.5-flash')
)
```
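
The returned dictionary nests the generated output under `candidates`, then `content`, then `parts`. As a rough sketch of pulling the first text part out of such a dictionary in plain Python, the hypothetical helper `first_text` below walks that path; the sample response is hand-written for illustration and real responses carry additional metadata.

```python
from typing import Any, Dict, Optional


def first_text(response: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: return the first non-empty text part, if any."""
    for candidate in response.get('candidates') or []:
        parts = (candidate.get('content') or {}).get('parts') or []
        for part in parts:
            text = part.get('text')
            if text:
                return text
    return None


# Hand-written sample; not captured from a real API call.
sample_response = {
    'candidates': [
        {'content': {'role': 'model', 'parts': [{'text': 'Hello from Gemini.'}]}}
    ]
}

answer = first_text(sample_response)
```

Inside a table, the equivalent JSON subscript would presumably be `tbl.response.candidates[0].content.parts[0].text`, assuming the first part of the first candidate is a text part.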

Generate an image with a Nano Banana model (Gemini image-generation models such as
`gemini-3.1-flash-image-preview`) and extract the PIL image from the response using JSON
subscripting. Image bytes in `inline_data.data` are decoded into PIL images
automatically. Pass `response_modalities=['IMAGE']` so the response contains a
single image part:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    response=generate_content(
        tbl.prompt,
        model='gemini-3.1-flash-image-preview',
        config={'response_modalities': ['IMAGE']},
    )
)
tbl.add_computed_column(
    image=tbl.response.candidates[0].content.parts[0].inline_data.data
)
```

## <span style={{ 'color': 'gray' }}>udf</span>  generate\_images()

```python Signature theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
@pxt.udf
generate_images(
    prompt: pxt.String,
    *,
    model: pxt.String,
    config: pxt.Json | None = None
) -> pxt.Image
```

Generates images based on a text description and configuration. For additional details, see:
[https://ai.google.dev/gemini-api/docs/imagen](https://ai.google.dev/gemini-api/docs/imagen)

Note: This function is for Imagen models only. For Gemini image-generation models (Nano Banana,
e.g. `gemini-3.1-flash-image-preview`), use [`generate_content`](./gemini#func-generate_content)
instead.

Request throttling:
Applies the rate limit set in the config (section `imagen.rate_limits`; use the model id as the key). If no rate
limit is configured, uses a default of 600 RPM.

**Requirements:**

* `pip install google-genai`

**Parameters:**

* **`prompt`** (`pxt.String`): A text description of the images to generate.
* **`model`** (`pxt.String`): The model to use.
* **`config`** (`pxt.Json | None`): Configuration for generation, corresponding to keyword arguments of
  `genai.types.GenerateImagesConfig`. For details on the parameters, see:
  [https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateImagesConfig](https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateImagesConfig)

**Returns:**

* `pxt.Image`: The generated image.

**Examples:**

Add a computed column that applies the model `imagen-4.0-generate-001`
to an existing Pixeltable column `tbl.prompt` of the table `tbl`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    response=generate_images(tbl.prompt, model='imagen-4.0-generate-001')
)
```

## <span style={{ 'color': 'gray' }}>udf</span>  generate\_speech()

```python Signatures theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Signature 1:
@pxt.udf
generate_speech(
    text: pxt.String,
    model: pxt.String,
    voice: pxt.String,
    config: pxt.Json | None
) -> pxt.Audio

# Signature 2:
@pxt.udf
generate_speech(
    text: pxt.String,
    model: pxt.String,
    voices: pxt.Json,
    config: pxt.Json | None
) -> pxt.Audio
```

Generates speech audio from text using Gemini's text-to-speech capability. For additional details, see:
[https://ai.google.dev/gemini-api/docs/speech-generation](https://ai.google.dev/gemini-api/docs/speech-generation)

Request throttling:
Applies the rate limit set in the config (section `gemini.rate_limits`; use the model id as the key). If no rate
limit is configured, uses a default of 600 RPM.

**Requirements:**

* `pip install google-genai`

**Parameters:**

* **`text`** (`String`): The text to synthesize into speech.
* **`model`** (`String`): The model to use (e.g. `'gemini-2.5-flash-preview-tts'`).
* **`voice`** (`String`): The voice profile to use. Supported voices include `'Kore'`, `'Puck'`, `'Charon'`,
  `'Fenrir'`, `'Aoede'`, `'Leda'`, `'Orus'`, `'Zephyr'`, and others. See the
  [speech generation docs](https://ai.google.dev/gemini-api/docs/speech-generation) for the full list.
  Mutually exclusive with `voices`.
* **`voices`** (`Json`): A mapping from speaker alias (as used in the text) to voice name. For example,
  `{'Alice': 'Kore', 'Bob': 'Puck'}`. Mutually exclusive with `voice`.
* **`config`** (`Json | None`, default: `None`): Additional configuration, corresponding to keyword arguments of
  `genai.types.GenerateContentConfig`. Keys such as `response_modalities` and `speech_config`
  are set automatically and should not be included.

**Returns:**

* `pxt.Audio`: An audio file (WAV, 24 kHz mono 16-bit) containing the synthesized speech.
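
The stated format (24 kHz, mono, 16-bit) can be checked with Python's standard `wave` module. The sketch below is illustrative only: `make_silence` fabricates a stand-in clip with those parameters so the example runs without an API call, and `describe_wav` is a hypothetical helper that reads the header back.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # 24 kHz
CHANNELS = 1              # mono
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples


def make_silence(seconds: float) -> bytes:
    """Build an in-memory WAV clip of silence in the documented format."""
    buffer = io.BytesIO()
    with wave.open(buffer, 'wb') as writer:
        writer.setnchannels(CHANNELS)
        writer.setsampwidth(SAMPLE_WIDTH_BYTES)
        writer.setframerate(SAMPLE_RATE_HZ)
        frame_count = int(SAMPLE_RATE_HZ * seconds)
        writer.writeframes(b'\x00' * SAMPLE_WIDTH_BYTES * frame_count)
    return buffer.getvalue()


def describe_wav(data: bytes) -> dict:
    """Hypothetical helper: read channel count, sample width, rate, and duration."""
    with wave.open(io.BytesIO(data), 'rb') as reader:
        rate = reader.getframerate()
        return {
            'channels': reader.getnchannels(),
            'sample_width_bytes': reader.getsampwidth(),
            'frame_rate_hz': rate,
            'duration_s': reader.getnframes() / rate,
        }


info = describe_wav(make_silence(1.0))
```

At these settings one second of audio occupies 48,000 bytes of sample data (24,000 frames of 2 bytes each), which gives a quick way to estimate storage for generated speech.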

**Examples:**

Add a computed column that generates speech from text:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    audio=generate_speech(
        tbl.text, model='gemini-2.5-flash-preview-tts', voice='Kore'
    )
)
```
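
For the multi-speaker signature, the keys of `voices` must match the speaker aliases that appear in the text. A minimal sketch of preparing such text is shown below; `build_dialogue` is a hypothetical helper, and the `Alias: line` layout is an assumption based on the speech generation docs rather than something this page specifies.

```python
from typing import Dict, List, Tuple


def build_dialogue(turns: List[Tuple[str, str]], voices: Dict[str, str]) -> str:
    """Hypothetical helper: format turns as 'Alias: line' and check every alias has a voice."""
    missing = sorted({speaker for speaker, _ in turns} - set(voices))
    if missing:
        raise ValueError(f'no voice configured for: {", ".join(missing)}')
    return '\n'.join(f'{speaker}: {line}' for speaker, line in turns)


voices = {'Alice': 'Kore', 'Bob': 'Puck'}
script = build_dialogue(
    [('Alice', 'Did the nightly job finish?'), ('Bob', 'Yes, about an hour ago.')],
    voices,
)
```

Text prepared this way would be stored in the table's text column, with the same mapping passed as `voices=voices` in place of `voice`.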

## <span style={{ 'color': 'gray' }}>udf</span>  generate\_videos()

```python Signatures theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Signature 1:
@pxt.udf
generate_videos(
    prompt: pxt.String | None,
    image: pxt.Image | None,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Video

# Signature 2:
@pxt.udf
generate_videos(
    prompt: pxt.String | None,
    images: pxt.Json[(Image, ...)] | None,
    model: pxt.String,
    config: pxt.Json | None,
    reference_types: pxt.Json[(String, ...)] | None
) -> pxt.Video
```

Generates videos based on a text description and configuration. For additional details, see:
[https://ai.google.dev/gemini-api/docs/video](https://ai.google.dev/gemini-api/docs/video)

At least one of `prompt` or `image` must be provided. When a single `image` is given, it is used as the first
frame of the generated video. When a list of `images` is given (second signature), they are used as reference
images to guide the style or asset appearance throughout the video (Veo 3.1+).

Request throttling:
Applies the rate limit set in the config (section `veo.rate_limits`; use the model id as the key). If no rate
limit is configured, uses a default of 600 RPM.

**Requirements:**

* `pip install google-genai`

**Parameters:**

* **`prompt`** (`String | None`, default: `None`): A text description of the videos to generate.
* **`image`** (`Image | None`, default: `None`): A single image to use as the first frame of the video. To pass a list of up to 3
  reference images for Veo 3.1 instead, use the `images` parameter of the overloaded signature.
* **`model`** (`String`): The model to use.
* **`config`** (`Json | None`, default: `None`): Configuration for generation, corresponding to keyword arguments of
  `genai.types.GenerateVideosConfig`. For details on the parameters, see:
  [https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateVideosConfig](https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateVideosConfig)

**Returns:**

* `pxt.Video`: The generated video.

**Examples:**

Add a computed column that applies the model `veo-3.0-generate-001`
to an existing Pixeltable column `tbl.prompt` of the table `tbl`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    response=generate_videos(tbl.prompt, model='veo-3.0-generate-001')
)
```

Use reference images with Veo 3.1 to guide video generation:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
tbl.add_computed_column(
    response=generate_videos(
        tbl.prompt,
        images=[tbl.ref_img1, tbl.ref_img2],
        reference_types=['asset', 'asset'],
        model='veo-3.1-generate-preview',
    )
)
```
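
The input rules for the two signatures can be summarized as a small pre-flight check. This is a sketch under stated assumptions, not the validation Pixeltable performs: `classify_video_request` is a hypothetical helper, and the rules that `image` and `images` are mutually exclusive and that `reference_types` has one entry per image are inferred from the signatures and the example above.

```python
from typing import Any, Optional, Sequence

MAX_REFERENCE_IMAGES = 3  # "up to 3 reference images" per the parameter docs


def classify_video_request(
    prompt: Optional[str] = None,
    image: Optional[Any] = None,
    images: Optional[Sequence[Any]] = None,
    reference_types: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical helper: validate inputs and name the mode they correspond to."""
    if image is not None and images is not None:
        raise ValueError('pass either `image` or `images`, not both')  # assumption
    if prompt is None and image is None and not images:
        raise ValueError('at least one of `prompt` or `image` must be provided')
    if images is not None:
        if not images:
            raise ValueError('`images` must not be empty')
        if len(images) > MAX_REFERENCE_IMAGES:
            raise ValueError(f'at most {MAX_REFERENCE_IMAGES} reference images are allowed')
        # Assumption: one reference type per image, as in the example above.
        if reference_types is not None and len(reference_types) != len(images):
            raise ValueError('`reference_types` must have one entry per image')
        return 'reference-images'
    if reference_types is not None:
        raise ValueError('`reference_types` only applies together with `images`')
    return 'first-frame' if image is not None else 'prompt-only'


mode = classify_video_request(
    prompt='A paper boat on a rainy street',
    images=['ref1.png', 'ref2.png'],  # placeholders standing in for image values
    reference_types=['asset', 'asset'],
)
```

Running a check like this before adding the computed column can surface argument mistakes early, before any video generation requests are issued.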

## <span style={{ 'color': 'gray' }}>udf</span>  transcribe()

```python Signature theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
@pxt.udf
transcribe(
    audio: pxt.Audio,
    *,
    model: pxt.String,
    prompt: pxt.String,
    config: pxt.Json | None = None
) -> pxt.String
```

Transcribes audio to text using Gemini's audio understanding capability. For additional details, see:
[https://ai.google.dev/gemini-api/docs/audio](https://ai.google.dev/gemini-api/docs/audio)

Request throttling:
Applies the rate limit set in the config (section `gemini.rate_limits`; use the model id as the key). If no rate
limit is configured, uses a default of 600 RPM.

**Requirements:**

* `pip install google-genai`

**Parameters:**

* **`audio`** (`pxt.Audio`): The audio file to transcribe.
* **`model`** (`pxt.String`): The model to use (e.g. `'gemini-2.5-flash'`).
* **`prompt`** (`pxt.String`): The instruction prompt sent alongside the audio. For example,
  `'Generate a transcript of the speech.'` or `'Summarize the audio content.'`.
* **`config`** (`pxt.Json | None`): Additional configuration, corresponding to keyword arguments of
  `genai.types.GenerateContentConfig`.

**Returns:**

* `pxt.String`: The transcribed text.

**Examples:**

Add a computed column that transcribes audio:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
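tbl.add_computed_column(
    transcript=transcribe(
        tbl.audio,
        model='gemini-2.5-flash',
        # `prompt` has no default in the signature above, so pass it explicitly
        prompt='Generate a transcript of the speech.',
    )
)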
```
