
module  pixeltable.functions.gemini

Pixeltable UDFs that wrap various endpoints from the Google Gemini API. In order to use them, you must first pip install google-genai and configure your Gemini credentials, as described in the Working with Gemini tutorial. Supports two authentication methods:
  • Google AI Studio: set GOOGLE_API_KEY or GEMINI_API_KEY (or put api_key in the gemini section of the Pixeltable config file).
  • Vertex AI: set GOOGLE_GENAI_USE_VERTEXAI=true and GOOGLE_CLOUD_PROJECT (and optionally GOOGLE_CLOUD_LOCATION), then authenticate via Application Default Credentials (e.g. gcloud auth application-default login).
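For Google AI Studio, credentials can be supplied via an environment variable before Pixeltable is used; a minimal sketch (the key value below is a placeholder, not a real key):

```python
import os

# Placeholder value; substitute your real Google AI Studio key.
# Either GOOGLE_API_KEY or GEMINI_API_KEY is honored.
os.environ['GEMINI_API_KEY'] = 'your-api-key'
```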

func  invoke_tools()

Signature
invoke_tools(
    tools: pixeltable.func.tools.Tools,
    response: pixeltable.exprs.expr.Expr
) -> pixeltable.exprs.inline_expr.InlineDict
Converts a Gemini response dict to Pixeltable tool invocation format and calls tools._invoke().
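A minimal sketch of the kind of extraction this performs, assuming a Gemini-style response dict whose parts may contain function_call entries (illustrative shape, not the exact wire format):

```python
# Hypothetical Gemini-style response containing one tool call
response = {
    'candidates': [{'content': {'parts': [
        {'function_call': {'name': 'get_weather', 'args': {'city': 'Paris'}}}
    ]}}]
}

# Collect every function_call part, in the spirit of invoke_tools()
# (a sketch, not the real implementation)
calls = [
    part['function_call']
    for cand in response['candidates']
    for part in cand['content']['parts']
    if 'function_call' in part
]
```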

udf  embed_content()

Signatures
# Signature 1:
@pxt.udf
embed_content(
    contents: pxt.String,
    model: pxt.String,
    config: pxt.Json | None,
    use_batch_api: pxt.Bool
) -> pxt.Array[(None,), float32]

# Signature 2:
@pxt.udf
embed_content(
    contents: pxt.Image,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 3:
@pxt.udf
embed_content(
    contents: pxt.Audio,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 4:
@pxt.udf
embed_content(
    contents: pxt.Video,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 5:
@pxt.udf
embed_content(
    contents: pxt.Document,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]
Generate embeddings for text, images, audio, video, and documents. For more information on the Gemini embeddings API, see: https://ai.google.dev/gemini-api/docs/embeddings
Requirements:
  • pip install google-genai
Parameters:
  • contents (String): The string, image, audio, video, or document to embed.
  • model (String): The Gemini model to use.
  • config (Json | None, default: Literal(None)): Configuration for embedding generation, corresponding to keyword arguments of genai.types.EmbedContentConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.EmbedContentConfig
  • use_batch_api (Bool, default: Literal(False)): If True, use Gemini’s Batch API, which provides higher throughput at lower cost, at the expense of higher latency.
Returns:
  • pxt.Array[(None,), float32]: The corresponding embedding vector.
Examples: Add a computed column with embeddings to an existing table with a text column:
t.add_computed_column(
    embedding=embed_content(t.text, model='gemini-embedding-001')
)
Add an embedding index on text column:
t.add_embedding_index(
    t.text, embedding=embed_content.using(model='gemini-embedding-001')
)
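Since embed_content returns a plain float32 vector, downstream similarity can be computed with standard tools outside of Pixeltable as well; a sketch with NumPy (the vectors below are made up, standing in for two embed_content() outputs):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Standard cosine similarity between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative vectors standing in for embed_content() outputs
v1 = np.array([1.0, 0.0, 1.0], dtype=np.float32)
v2 = np.array([1.0, 1.0, 0.0], dtype=np.float32)
sim = cosine_similarity(v1, v2)
```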

udf  generate_content()

Signature
@pxt.udf
generate_content(
    contents: pxt.Json,
    *,
    model: pxt.String,
    config: pxt.Json | None = None,
    tools: pxt.Json | None = None
) -> pxt.Json
Generate content from the specified model.
Request throttling: Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
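The rate limit described above lives in the Pixeltable config file (by default ~/.pixeltable/config.toml); a sketch, with the model id and limit chosen for illustration:

```toml
# ~/.pixeltable/config.toml
[gemini.rate_limits]
'gemini-2.5-flash' = 300  # requests per minute
```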
Parameters:
  • contents (pxt.Json): The input content to generate from. Can be a prompt, or a list containing images and text prompts, as described in: https://ai.google.dev/gemini-api/docs/text-generation
  • model (pxt.String): The name of the model to use.
  • config (pxt.Json | None): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateContentConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig
  • tools (pxt.Json | None, default: Literal(None)): An optional list of Pixeltable tools to use. It is also possible to specify tools manually via the config['tools'] parameter, but at most one of config['tools'] or tools may be used.
Returns:
  • pxt.Json: A dictionary containing the response and other metadata.
Examples: Add a computed column that applies the model gemini-2.5-flash to an existing Pixeltable column tbl.prompt of the table tbl:
tbl.add_computed_column(
    response=generate_content(tbl.prompt, model='gemini-2.5-flash')
)
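The returned dict mirrors the google-genai response object; the generated text can typically be pulled out of the first candidate's parts. A sketch over an assumed response shape (in a Pixeltable expression this would read tbl.response['candidates'][0]['content']['parts'][0]['text']):

```python
# Assumed shape of a generate_content() result
response = {'candidates': [{'content': {'parts': [{'text': 'Hello from Gemini'}]}}]}

# Index into the first candidate's first text part
text = response['candidates'][0]['content']['parts'][0]['text']
```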

udf  generate_images()

Signature
@pxt.udf
generate_images(
    prompt: pxt.String,
    *,
    model: pxt.String,
    config: pxt.Json | None = None
) -> pxt.Image
Generates images based on a text description and configuration. For additional details, see: https://ai.google.dev/gemini-api/docs/image-generation
Request throttling: Applies the rate limit set in the config (section imagen.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • prompt (String): A text description of the images to generate.
  • model (String): The model to use.
  • config (Json | None, default: Literal(None)): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateImagesConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateImagesConfig
Returns:
  • pxt.Image: The generated image.
Examples: Add a computed column that applies the model imagen-4.0-generate-001 to an existing Pixeltable column tbl.prompt of the table tbl:
tbl.add_computed_column(
    response=generate_images(tbl.prompt, model='imagen-4.0-generate-001')
)

udf  generate_speech()

Signatures
# Signature 1:
@pxt.udf
generate_speech(
    text: pxt.String,
    model: pxt.String,
    voice: pxt.String,
    config: pxt.Json | None
) -> pxt.Audio

# Signature 2:
@pxt.udf
generate_speech(
    text: pxt.String,
    model: pxt.String,
    voices: pxt.Json,
    config: pxt.Json | None
) -> pxt.Audio
Generates speech audio from text using Gemini’s text-to-speech capability. For additional details, see: https://ai.google.dev/gemini-api/docs/speech-generation
Request throttling: Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • text (String): The text to synthesize into speech.
  • model (String): The model to use (e.g. 'gemini-2.5-flash-preview-tts').
  • voice (String): The voice profile to use. Supported voices include 'Kore', 'Puck', 'Charon', 'Fenrir', 'Aoede', 'Leda', 'Orus', 'Zephyr', and others. See the speech generation docs for the full list. Mutually exclusive with voices.
  • voices (Json): A mapping from speaker alias (as used in the text) to voice name. For example, {'Alice': 'Kore', 'Bob': 'Puck'}. Mutually exclusive with voice.
  • config (Json | None, default: Literal(None)): Additional configuration, corresponding to keyword arguments of genai.types.GenerateContentConfig. Keys such as response_modalities and speech_config are set automatically and should not be included.
Returns:
  • pxt.Audio: An audio file (WAV, 24 kHz mono 16-bit) containing the synthesized speech.
Examples: Add a computed column that generates speech from text:
tbl.add_computed_column(
    audio=generate_speech(
        tbl.text, model='gemini-2.5-flash-preview-tts', voice='Kore'
    )
)

udf  generate_videos()

Signatures
# Signature 1:
@pxt.udf
generate_videos(
    prompt: pxt.String | None,
    image: pxt.Image | None,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Video

# Signature 2:
@pxt.udf
generate_videos(
    prompt: pxt.String | None,
    images: pxt.Json[(Image, ...)] | None,
    model: pxt.String,
    config: pxt.Json | None,
    reference_types: pxt.Json[(String, ...)] | None
) -> pxt.Video
Generates videos based on a text description and configuration. For additional details, see: https://ai.google.dev/gemini-api/docs/video
At least one of prompt or image must be provided. When image is a single image, it is used as the first frame of the generated video. When images is a list of images, they are used as reference images to guide the style or asset appearance throughout the video (Veo 3.1+). See the overloaded signature for details.
Request throttling: Applies the rate limit set in the config (section veo.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • prompt (String | None, default: Literal(None)): A text description of the videos to generate.
  • image (Image | None, default: Literal(None)): A single image to use as the first frame of the video. To supply a list of up to 3 reference images for Veo 3.1, use the images parameter instead (see overloaded signature).
  • model (String): The model to use.
  • config (Json | None, default: Literal(None)): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateVideosConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateVideosConfig
Returns:
  • pxt.Video: The generated video.
Examples: Add a computed column that applies the model veo-3.0-generate-001 to an existing Pixeltable column tbl.prompt of the table tbl:
tbl.add_computed_column(
    response=generate_videos(tbl.prompt, model='veo-3.0-generate-001')
)
Use reference images with Veo 3.1 to guide video generation:
tbl.add_computed_column(
    response=generate_videos(
        tbl.prompt,
        images=[tbl.ref_img1, tbl.ref_img2],
        reference_types=['asset', 'asset'],
        model='veo-3.1-generate-preview',
    )
)

udf  transcribe()

Signature
@pxt.udf
transcribe(
    audio: pxt.Audio,
    *,
    model: pxt.String,
    prompt: pxt.String,
    config: pxt.Json | None = None
) -> pxt.String
Transcribes audio to text using Gemini’s audio understanding capability. For additional details, see: https://ai.google.dev/gemini-api/docs/audio
Request throttling: Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • audio (pxt.Audio): The audio file to transcribe.
  • model (pxt.String): The model to use (e.g. 'gemini-2.5-flash').
  • prompt (pxt.String): The instruction prompt sent alongside the audio. For example, 'Generate a transcript of the speech.' or 'Summarize the audio content.'.
  • config (pxt.Json | None): Additional configuration, corresponding to keyword arguments of genai.types.GenerateContentConfig.
Returns:
  • pxt.String: The transcribed text.
Examples: Add a computed column that transcribes audio:
tbl.add_computed_column(
    transcript=transcribe(tbl.audio, model='gemini-2.5-flash')
)
Last modified on April 17, 2026