module pixeltable.functions.gemini
Pixeltable UDFs that wrap various endpoints of the Google Gemini API. To use them, you must first pip install google-genai and configure your Gemini credentials, as described in the Working with Gemini tutorial.
Supports two authentication methods:
- Google AI Studio: set GOOGLE_API_KEY or GEMINI_API_KEY (or put api_key in the gemini section of the Pixeltable config file).
- Vertex AI: set GOOGLE_GENAI_USE_VERTEXAI=true and GOOGLE_CLOUD_PROJECT (and optionally GOOGLE_CLOUD_LOCATION), then authenticate via Application Default Credentials (e.g. gcloud auth application-default login).
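For example, either environment can be set up as follows (the project and location values are placeholders):

```shell
# Option 1: Google AI Studio
export GEMINI_API_KEY='your-api-key'

# Option 2: Vertex AI
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='my-project'      # placeholder project id
export GOOGLE_CLOUD_LOCATION='us-central1'    # optional
gcloud auth application-default login
```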
func invoke_tools()
Signature
Converts a Gemini response to Pixeltable tool invocations by calling tools._invoke().
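A minimal sketch of the tool-calling flow, assuming a table tbl with a string column prompt; the weather UDF is a hypothetical stub, and the model name is illustrative (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

@pxt.udf
def weather(city: str) -> str:
    """Hypothetical stub tool: return the weather for a city."""
    return 'sunny'

tools = pxt.tools(weather)
tbl = pxt.get_table('tbl')
# Ask the model, allowing it to call the tool ...
tbl.add_computed_column(
    response=gemini.generate_content(tbl.prompt, model='gemini-2.5-flash', tools=tools)
)
# ... then execute any tool calls contained in the response.
tbl.add_computed_column(tool_output=gemini.invoke_tools(tools, tbl.response))
```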
udf embed_content()
Signatures
Requirements:
pip install google-genai
- contents (String): The string, image, audio, video, or document to embed.
- model (String): The Gemini model to use.
- config (Json | None, default: None): Configuration for embedding generation, corresponding to keyword arguments of genai.types.EmbedContentConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.EmbedContentConfig
- use_batch_api (Bool, default: False): If True, use Gemini's Batch API, which provides higher throughput at a lower cost, at the expense of higher latency.
pxt.Array[(None,), float32]: The corresponding embedding vector.
Example: Add a computed column that applies the model to an existing text column:
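A minimal sketch, assuming a table tbl with a string column text; the embedding model name is illustrative (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

tbl = pxt.get_table('tbl')
tbl.add_computed_column(
    embedding=gemini.embed_content(tbl.text, model='gemini-embedding-001')
)
```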
udf generate_content()
Signature
Request throttling:
Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
pip install google-genai
- contents (pxt.Json): The input content to generate from. Can be a prompt, or a list containing images and text prompts, as described in: https://ai.google.dev/gemini-api/docs/text-generation
- model (pxt.String): The name of the model to use.
- config (pxt.Json | None): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateContentConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig
- tools (pxt.Json | None): An optional list of Pixeltable tools to use. It is also possible to specify tools manually via the config['tools'] parameter, but at most one of config['tools'] or tools may be used.
pxt.Json: A dictionary containing the response and other metadata.
Example: Add a computed column that applies the model gemini-2.5-flash to an existing Pixeltable column tbl.prompt of the table tbl:
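A minimal sketch, assuming tbl already exists with a string column prompt (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

tbl = pxt.get_table('tbl')
tbl.add_computed_column(
    response=gemini.generate_content(tbl.prompt, model='gemini-2.5-flash')
)
```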
udf generate_images()
Signature
Request throttling:
Applies the rate limit set in the config (section imagen.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
pip install google-genai
- prompt (pxt.String): A text description of the images to generate.
- model (pxt.String): The model to use.
- config (pxt.Json | None): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateImagesConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateImagesConfig
pxt.Image: The generated image.
Example: Add a computed column that applies the model imagen-4.0-generate-001 to an existing Pixeltable column tbl.prompt of the table tbl:
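A minimal sketch, assuming tbl already exists with a string column prompt (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

tbl = pxt.get_table('tbl')
tbl.add_computed_column(
    image=gemini.generate_images(tbl.prompt, model='imagen-4.0-generate-001')
)
```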
udf generate_speech()
Signatures
Request throttling:
Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
pip install google-genai
- text (String): The text to synthesize into speech.
- model (String): The model to use (e.g. 'gemini-2.5-flash-preview-tts').
- voice (String): The voice profile to use. Supported voices include 'Kore', 'Puck', 'Charon', 'Fenrir', 'Aoede', 'Leda', 'Orus', 'Zephyr', and others. See the speech generation docs for the full list. Mutually exclusive with voices.
- voices (Json): A mapping from speaker alias (as used in the text) to voice name. For example, {'Alice': 'Kore', 'Bob': 'Puck'}. Mutually exclusive with voice.
- config (Json | None, default: None): Additional configuration, corresponding to keyword arguments of genai.types.GenerateContentConfig. Keys such as response_modalities and speech_config are set automatically and should not be included.
pxt.Audio: An audio file (WAV, 24 kHz mono 16-bit) containing the synthesized speech.
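A minimal sketch, assuming a table tbl with a string column text (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

tbl = pxt.get_table('tbl')
tbl.add_computed_column(
    audio=gemini.generate_speech(
        tbl.text, model='gemini-2.5-flash-preview-tts', voice='Kore'
    )
)
```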
udf generate_videos()
Signatures
At least one of prompt or image must be provided. When image is a single image, it is used as the first frame of the generated video. When image is a list of images, they are used as reference images to guide the style or asset appearance throughout the video (Veo 3.1+). See the overloaded signature for details.
Request throttling:
Applies the rate limit set in the config (section veo.rate_limits; use the model id as the key). If no rate
limit is configured, uses a default of 600 RPM.
Requirements:
pip install google-genai
- prompt (String | None, default: None): A text description of the videos to generate.
- image (Image | None, default: None): A single image to use as the first frame of the video. The overloaded signature instead accepts images, a list of up to 3 reference images for Veo 3.1.
- model (String): The model to use.
- config (Json | None, default: None): Configuration for generation, corresponding to keyword arguments of genai.types.GenerateVideosConfig. For details on the parameters, see: https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateVideosConfig
pxt.Video: The generated video.
Example: Add a computed column that applies the model veo-3.0-generate-001 to an existing Pixeltable column tbl.prompt of the table tbl:
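A minimal sketch, assuming tbl already exists with a string column prompt (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

tbl = pxt.get_table('tbl')
tbl.add_computed_column(
    video=gemini.generate_videos(prompt=tbl.prompt, model='veo-3.0-generate-001')
)
```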
udf transcribe()
Signature
Request throttling:
Applies the rate limit set in the config (section gemini.rate_limits; use the model id as the key). If no rate limit is configured, uses a default of 600 RPM.
Requirements:
pip install google-genai
- audio (pxt.Audio): The audio file to transcribe.
- model (pxt.String): The model to use (e.g. 'gemini-2.5-flash').
- prompt (pxt.String): The instruction prompt sent alongside the audio. For example, 'Generate a transcript of the speech.' or 'Summarize the audio content.'
- config (pxt.Json | None): Additional configuration, corresponding to keyword arguments of genai.types.GenerateContentConfig.
pxt.String: The transcribed text.
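A minimal sketch, assuming a table tbl with an audio column audio (running this requires Gemini credentials):

```python
import pixeltable as pxt
from pixeltable.functions import gemini

tbl = pxt.get_table('tbl')
tbl.add_computed_column(
    transcript=gemini.transcribe(
        tbl.audio, model='gemini-2.5-flash',
        prompt='Generate a transcript of the speech.'
    )
)
```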