
module  pixeltable.functions.gemini

Pixeltable UDFs that wrap various endpoints from the Google Gemini API. In order to use them, you must first pip install google-genai and configure your Gemini credentials, as described in the Working with Gemini tutorial. Supports two authentication methods:
  • Google AI Studio: set GOOGLE_API_KEY or GEMINI_API_KEY (or put api_key in the gemini section of the Pixeltable config file).
  • Vertex AI: set GOOGLE_GENAI_USE_VERTEXAI=true and GOOGLE_CLOUD_PROJECT (and optionally GOOGLE_CLOUD_LOCATION), then authenticate via Application Default Credentials (e.g. gcloud auth application-default login).
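For Google AI Studio, credentials can be supplied via an environment variable before Pixeltable is used; a minimal sketch (the key value below is a placeholder, not a real key):

```python
import os

# Placeholder value; substitute your real Google AI Studio key.
# Either GOOGLE_API_KEY or GEMINI_API_KEY is honored.
os.environ['GEMINI_API_KEY'] = 'your-api-key'
```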

func  invoke_tools()

Signature
invoke_tools(
    tools: pixeltable.func.tools.Tools,
    response: pixeltable.exprs.expr.Expr
) -> pixeltable.exprs.inline_expr.InlineDict
Converts a Gemini response dict to Pixeltable tool invocation format and calls tools._invoke().
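A minimal sketch of the kind of extraction this performs, assuming a Gemini-style response dict whose parts may contain function_call entries (illustrative shape, not the exact wire format):

```python
# Hypothetical Gemini-style response containing one tool call
response = {
    'candidates': [{'content': {'parts': [
        {'function_call': {'name': 'get_weather', 'args': {'city': 'Paris'}}}
    ]}}]
}

# Collect every function_call part, in the spirit of invoke_tools()
# (a sketch, not the real implementation)
calls = [
    part['function_call']
    for cand in response['candidates']
    for part in cand['content']['parts']
    if 'function_call' in part
]
```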

udf  embed_content()

Signatures
# Signature 1:
@pxt.udf
embed_content(
    contents: pxt.String,
    model: pxt.String,
    config: pxt.Json | None,
    use_batch_api: pxt.Bool
) -> pxt.Array[(None,), float32]

# Signature 2:
@pxt.udf
embed_content(
    contents: pxt.Image,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 3:
@pxt.udf
embed_content(
    contents: pxt.Audio,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 4:
@pxt.udf
embed_content(
    contents: pxt.Video,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]

# Signature 5:
@pxt.udf
embed_content(
    contents: pxt.Document,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Array[(None,), float32]
Generate embeddings for text, images, audio, video, and documents. For more information on the Gemini embeddings API, see: https://ai.google.dev/gemini-api/docs/embeddings
Requirements:
  • pip install google-genai
Parameters:
  • contents (String): The string, image, audio, video, or document to embed.
  • model (String): The Gemini model to use.
  • config (Json | None, default: Literal(None)): Configuration for embedding generation, corresponding to keyword arguments of genai.types.EmbedContentConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.EmbedContentConfig
  • use_batch_api (Bool, default: Literal(False)): If True, use Gemini’s Batch API, which provides higher throughput at lower cost, at the expense of higher latency.
Returns:
  • pxt.Array[(None,), float32]: The corresponding embedding vector.
Examples: Add a computed column with embeddings to an existing table with a text column:
t.add_computed_column(
    embedding=embed_content(t.text, model='gemini-embedding-001')
)
Add an embedding index on text column:
t.add_embedding_index(
    t.text, embedding=embed_content.using(model='gemini-embedding-001')
)
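Since embed_content returns a plain float32 vector, downstream similarity can be computed with standard tools outside of Pixeltable as well; a sketch with NumPy (the vectors below are made up, standing in for two embed_content() outputs):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Standard cosine similarity between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative vectors standing in for embed_content() outputs
v1 = np.array([1.0, 0.0, 1.0], dtype=np.float32)
v2 = np.array([1.0, 1.0, 0.0], dtype=np.float32)
sim = cosine_similarity(v1, v2)
```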

udf  generate_content()

Signature
@pxt.udf
generate_content(
    contents: pxt.Json,
    *,
    model: pxt.String,
    config: pxt.Json | None = None,
    tools: pxt.Json | None = None
) -> pxt.Json
Generate content from the specified model.
Request throttling: Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
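The rate limit described above lives in the Pixeltable config file (by default ~/.pixeltable/config.toml); a sketch, with the model id and limit chosen for illustration:

```toml
# ~/.pixeltable/config.toml
[gemini.rate_limits]
'gemini-2.5-flash' = 300  # requests per minute
```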
Parameters:
  • contents (pxt.Json): The input content to generate from. Can be a prompt, or a list containing images and text prompts, as described in: https://ai.google.dev/gemini-api/docs/text-generation
  • model (pxt.String): The name of the model to use.
  • config (pxt.Json | None): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateContentConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig
  • tools (pxt.Json | None, default: Literal(None)): An optional list of Pixeltable tools to use. It is also possible to specify tools manually via the config['tools'] parameter, but at most one of config['tools'] or tools may be used.
Returns:
  • pxt.Json: A dictionary containing the response and other metadata.
Examples: Add a computed column that applies the model gemini-2.5-flash to an existing Pixeltable column tbl.prompt of the table tbl:
tbl.add_computed_column(
    response=generate_content(tbl.prompt, model='gemini-2.5-flash')
)
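The returned dict mirrors the google-genai response object; the generated text can typically be pulled out of the first candidate's parts. A sketch over an assumed response shape (in a Pixeltable expression this would read tbl.response['candidates'][0]['content']['parts'][0]['text']):

```python
# Assumed shape of a generate_content() result
response = {'candidates': [{'content': {'parts': [{'text': 'Hello from Gemini'}]}}]}

# Index into the first candidate's first text part
text = response['candidates'][0]['content']['parts'][0]['text']
```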

udf  generate_images()

Signature
@pxt.udf
generate_images(
    prompt: pxt.String,
    *,
    model: pxt.String,
    config: pxt.Json | None = None
) -> pxt.Image
Generates images based on a text description and configuration. For additional details, see: https://ai.google.dev/gemini-api/docs/image-generation
Request throttling: Applies the rate limit set in the config (section imagen.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • prompt (String): A text description of the images to generate.
  • model (String): The model to use.
  • config (Json | None, default: Literal(None)): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateImagesConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateImagesConfig
Returns:
  • pxt.Image: The generated image.
Examples: Add a computed column that applies the model imagen-4.0-generate-001 to an existing Pixeltable column tbl.prompt of the table tbl:
tbl.add_computed_column(
    response=generate_images(tbl.prompt, model='imagen-4.0-generate-001')
)

udf  generate_speech()

Signatures
# Signature 1:
@pxt.udf
generate_speech(
    text: pxt.String,
    model: pxt.String,
    voice: pxt.String,
    config: pxt.Json | None
) -> pxt.Audio

# Signature 2:
@pxt.udf
generate_speech(
    text: pxt.String,
    model: pxt.String,
    voices: pxt.Json,
    config: pxt.Json | None
) -> pxt.Audio
Generates speech audio from text using Gemini’s text-to-speech capability. For additional details, see: https://ai.google.dev/gemini-api/docs/speech-generation
Request throttling: Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • text (String): The text to synthesize into speech.
  • model (String): The model to use (e.g. 'gemini-2.5-flash-preview-tts').
  • voice (String): The voice profile to use. Supported voices include 'Kore', 'Puck', 'Charon', 'Fenrir', 'Aoede', 'Leda', 'Orus', 'Zephyr', and others. See the speech generation docs for the full list. Mutually exclusive with voices.
  • voices (Json): A mapping from speaker alias (as used in the text) to voice name. For example, {'Alice': 'Kore', 'Bob': 'Puck'}. Mutually exclusive with voice.
  • config (Json | None, default: Literal(None)): Additional configuration, corresponding to keyword arguments of genai.types.GenerateContentConfig. Keys such as response_modalities and speech_config are set automatically and should not be included.
Returns:
  • pxt.Audio: An audio file (WAV, 24 kHz mono 16-bit) containing the synthesized speech.
Examples: Add a computed column that generates speech from text:
tbl.add_computed_column(
    audio=generate_speech(
        tbl.text, model='gemini-2.5-flash-preview-tts', voice='Kore'
    )
)

udf  generate_videos()

Signatures
# Signature 1:
@pxt.udf
generate_videos(
    prompt: pxt.String | None,
    image: pxt.Image | None,
    model: pxt.String,
    config: pxt.Json | None
) -> pxt.Video

# Signature 2:
@pxt.udf
generate_videos(
    prompt: pxt.String | None,
    images: pxt.Json[(Image, ...)] | None,
    model: pxt.String,
    config: pxt.Json | None,
    reference_types: pxt.Json[(String, ...)] | None
) -> pxt.Video
Generates videos based on a text description and configuration. For additional details, see: https://ai.google.dev/gemini-api/docs/video
At least one of prompt or image must be provided. When image is a single image, it is used as the first frame of the generated video. When images is a list of images, they are used as reference images to guide the style or asset appearance throughout the video (Veo 3.1+). See the overloaded signature for details.
Request throttling: Applies the rate limit set in the config (section veo.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • prompt (String | None, default: Literal(None)): A text description of the videos to generate.
  • image (Image | None, default: Literal(None)): A single image to use as the first frame of the video. To supply a list of up to 3 reference images for Veo 3.1, use the images parameter instead (see overloaded signature).
  • model (String): The model to use.
  • config (Json | None, default: Literal(None)): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateVideosConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateVideosConfig
Returns:
  • pxt.Video: The generated video.
Examples: Add a computed column that applies the model veo-3.0-generate-001 to an existing Pixeltable column tbl.prompt of the table tbl:
tbl.add_computed_column(
    response=generate_videos(tbl.prompt, model='veo-3.0-generate-001')
)
Use reference images with Veo 3.1 to guide video generation:
tbl.add_computed_column(
    response=generate_videos(
        tbl.prompt,
        images=[tbl.ref_img1, tbl.ref_img2],
        reference_types=['asset', 'asset'],
        model='veo-3.1-generate-preview',
    )
)

udf  transcribe()

Signature
@pxt.udf
transcribe(
    audio: pxt.Audio,
    *,
    model: pxt.String,
    prompt: pxt.String,
    config: pxt.Json | None = None
) -> pxt.String
Transcribes audio to text using Gemini’s audio understanding capability. For additional details, see: https://ai.google.dev/gemini-api/docs/audio
Request throttling: Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
  • pip install google-genai
Parameters:
  • audio (pxt.Audio): The audio file to transcribe.
  • model (pxt.String): The model to use (e.g. 'gemini-2.5-flash').
  • prompt (pxt.String): The instruction prompt sent alongside the audio. For example, 'Generate a transcript of the speech.' or 'Summarize the audio content.'.
  • config (pxt.Json | None): Additional configuration, corresponding to keyword arguments of genai.types.GenerateContentConfig.
Returns:
  • pxt.String: The transcribed text.
Examples: Add a computed column that transcribes audio:
tbl.add_computed_column(
    transcript=transcribe(tbl.audio, model='gemini-2.5-flash')
)
Last modified on April 17, 2026