
module  pixeltable.functions.audio

Pixeltable UDFs for AudioType.

iterator  audio_splitter()

Signature
@pxt.iterator
audio_splitter(
    audio: pxt.Audio,
    duration: pxt.Float,
    *,
    overlap: pxt.Float = 0.0,
    min_segment_duration: pxt.Float = 0.0
)
Iterator over segments of an audio file. The audio file is split into smaller segments, each with the length given by the duration parameter. If the input contains no audio, no segments are yielded.
Outputs: One row per audio segment, with the following columns:
  • segment_start (pxt.Float): Start time of the audio segment in seconds
  • segment_end (pxt.Float): End time of the audio segment in seconds
  • audio_segment (pxt.Audio | None): The audio content of the segment
Parameters:
  • duration (pxt.Float): Audio segment duration in seconds
  • overlap (pxt.Float): Overlap between consecutive segments in seconds
  • min_segment_duration (pxt.Float): Minimum length of the last segment in seconds; the last segment is dropped if it is shorter than this value
Examples: This example assumes an existing table tbl with a column audio of type pxt.Audio. Create a view that splits all audio files into segments of 30 seconds with 5 seconds overlap:
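import pixeltable as pxt
from pixeltable.functions.audio import audio_splitter

# tbl is an existing table with a column `audio` of type pxt.Audio
pxt.create_view(
    'audio_segments',
    tbl,
    iterator=audio_splitter(tbl.audio, duration=30.0, overlap=5.0),
)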
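To make the interaction between duration, overlap, and min_segment_duration concrete, the following is a minimal pure-Python sketch of how segment boundaries could be computed. It is not Pixeltable's implementation: the helper name segment_bounds, its total argument, and the exact boundary rules (consecutive starts spaced by duration minus overlap, the final segment clipped to the end of the clip) are assumptions made for illustration.

```python
def segment_bounds(total, duration, overlap=0.0, min_segment_duration=0.0):
    """Return (segment_start, segment_end) pairs in seconds.

    Hypothetical helper, not part of Pixeltable. Assumes consecutive
    segments start `duration - overlap` seconds apart and that the final
    segment is clipped to the end of the clip.
    """
    if duration <= 0 or not 0 <= overlap < duration:
        raise ValueError('need duration > 0 and 0 <= overlap < duration')

    step = duration - overlap
    bounds = []
    start = 0.0
    while start < total:
        end = min(start + duration, total)
        bounds.append((start, end))
        if end >= total:
            break  # the clip is fully covered
        start += step

    # Drop the last segment if it is shorter than min_segment_duration
    if bounds and bounds[-1][1] - bounds[-1][0] < min_segment_duration:
        bounds.pop()
    return bounds


# A 70-second clip, 30-second segments, 5 seconds of overlap
print(segment_bounds(70.0, 30.0, overlap=5.0))
# [(0.0, 30.0), (25.0, 55.0), (50.0, 70.0)]
```

Under these assumptions, passing min_segment_duration=25.0 to the same call would drop the trailing 20-second segment, and a clip of length 0 yields no segments at all.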

udf  encode_audio()

Signature
@pxt.udf
encode_audio(
    audio_data: pxt.Array[float32],
    *,
    input_sample_rate: pxt.Int,
    format: pxt.String,
    output_sample_rate: pxt.Int | None = None
) -> pxt.Audio
Encodes an audio clip represented as an array into a specified audio format.
Parameters:
  • audio_data (pxt.Array[float32]): An array of sampled amplitudes. The accepted array shapes are (N,) or (1, N) for mono audio or (2, N) for stereo.
  • input_sample_rate (pxt.Int): The sample rate of the input audio data.
  • format (pxt.String): The desired output audio format. The supported formats are 'wav', 'mp3', 'flac', and 'mp4'.
  • output_sample_rate (pxt.Int | None): The desired sample rate for the output audio. Defaults to the input sample rate if unspecified.
Examples: Add a computed column with encoded FLAC audio files to a table with audio data (as arrays of floats) and sample rates:
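from pixeltable.functions.audio import encode_audio

# t is an existing table with columns `audio_data` (float arrays)
# and `sample_rate` (ints)
t.add_computed_column(
    audio_file=encode_audio(
        t.audio_data, input_sample_rate=t.sample_rate, format='flac'
    )
)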
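The shape rules and the output_sample_rate default described in the parameters above can be summarized in a small standalone sketch. The helper describe_audio_array is hypothetical and is not part of Pixeltable; it only mirrors the documented rules ((N,) or (1, N) for mono, (2, N) for stereo, output rate falling back to the input rate) so they can be checked before calling encode_audio.

```python
def describe_audio_array(shape, input_sample_rate, output_sample_rate=None):
    """Interpret an array shape using the rules documented for encode_audio.

    Hypothetical helper for illustration only; it does not encode anything.
    """
    if len(shape) == 1:
        # (N,): mono
        channels, samples = 1, shape[0]
    elif len(shape) == 2 and shape[0] in (1, 2):
        # (1, N): mono, (2, N): stereo
        channels, samples = shape
    else:
        raise ValueError(f'unsupported audio array shape: {shape}')

    # The output rate defaults to the input rate if unspecified
    if output_sample_rate is None:
        output_sample_rate = input_sample_rate

    return {
        'channels': channels,
        'samples': samples,
        'output_sample_rate': output_sample_rate,
        'duration_seconds': samples / input_sample_rate,
    }


# One second of stereo audio sampled at 44.1 kHz
info = describe_audio_array((2, 44100), input_sample_rate=44100)
print(info['channels'], info['output_sample_rate'], info['duration_seconds'])
# 2 44100 1.0
```

A shape such as (3, N) raises ValueError in this sketch, since it matches none of the accepted layouts.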

udf  get_metadata()

Signature
@pxt.udf
get_metadata(audio: pxt.Audio) -> pxt.Json
Gets various metadata associated with an audio file and returns it as a dictionary.
Parameters:
  • audio (pxt.Audio): The audio to get metadata for.
Returns:
  • pxt.Json: A dict such as the following:
    {
        'size': 2568827,
        'streams': [
            {
                'type': 'audio',
                'frames': 0,
                'duration': 2646000,
                'metadata': {},
                'time_base': 2.2675736961451248e-05,
                'codec_context': {
                    'name': 'flac',
                    'profile': None,
                    'channels': 1,
                    'codec_tag': '\x00\x00\x00\x00',
                },
                'duration_seconds': 60.0,
            }
        ],
        'bit_rate': 342510,
        'metadata': {'encoder': 'Lavf61.1.100'},
        'bit_exact': False,
    }
    
Examples: Extract metadata for files in the audio_col column of the table tbl:
tbl.select(tbl.audio_col.get_metadata()).collect()
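Once collected, the returned dictionary can be processed with ordinary Python. The sketch below works on a trimmed copy of the sample dictionary shown under Returns. It assumes that duration is expressed in units of time_base, so that their product gives the length in seconds; the sample values are consistent with that reading (2646000 × 2.2675736961451248e-05 ≈ 60.0), but the relationship is an assumption rather than something this page states. The helper name audio_stream_seconds is hypothetical.

```python
# Trimmed copy of the sample dictionary from the Returns section
metadata = {
    'size': 2568827,
    'streams': [
        {
            'type': 'audio',
            'duration': 2646000,
            'time_base': 2.2675736961451248e-05,
            'codec_context': {'name': 'flac', 'channels': 1},
            'duration_seconds': 60.0,
        }
    ],
    'bit_rate': 342510,
}


def audio_stream_seconds(metadata):
    """Return the length in seconds of each audio stream.

    Hypothetical helper. Assumes `duration` is counted in `time_base`
    units, so duration * time_base is the stream length in seconds.
    """
    return [
        stream['duration'] * stream['time_base']
        for stream in metadata['streams']
        if stream['type'] == 'audio'
    ]


for stream, seconds in zip(metadata['streams'], audio_stream_seconds(metadata)):
    codec = stream['codec_context']
    # Compare the computed value with the reported duration_seconds
    print(codec['name'], codec['channels'], round(seconds, 3), stream['duration_seconds'])
```

Because the product is computed in floating point, compare it with duration_seconds using a tolerance rather than exact equality.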
Last modified on March 1, 2026