module pixeltable.functions.audio
Pixeltable UDFs for AudioType.
iterator audio_splitter()
audio_splitter(
audio: Any,
chunk_duration_sec: float,
*,
overlap_sec: float = 0.0,
min_chunk_duration_sec: float = 0.0
)
Iterator over chunks of an audio file. The audio file is split into smaller chunks,
where the duration of each chunk is determined by chunk_duration_sec.
The iterator yields audio chunks as pxt.Audio, along with the start and end time of each chunk.
If the input contains no audio, no chunks are yielded.
Parameters:
chunk_duration_sec (float): Audio chunk duration in seconds
overlap_sec (float, default: 0.0): Overlap between consecutive chunks in seconds
min_chunk_duration_sec (float, default: 0.0): Drop the last chunk if it is shorter than min_chunk_duration_sec
Examples:
This example assumes an existing table tbl with a column audio of type pxt.Audio. Create a view that splits all audio files into chunks of 30 seconds with 5 seconds overlap:
import pixeltable as pxt
from pixeltable.functions.audio import audio_splitter

pxt.create_view(
    'audio_chunks',
    tbl,
    iterator=audio_splitter(
        tbl.audio, chunk_duration_sec=30.0, overlap_sec=5.0
    ),
)
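To make the interaction of chunk_duration_sec, overlap_sec, and min_chunk_duration_sec concrete, the following is a minimal sketch of how chunk boundaries could be computed. It is not Pixeltable's implementation: the helper chunk_bounds and its total_sec argument are hypothetical, and it assumes that each chunk starts chunk_duration_sec - overlap_sec seconds after the previous one. The real iterator may treat edge cases differently.

```python
def chunk_bounds(total_sec, chunk_duration_sec, overlap_sec=0.0,
                 min_chunk_duration_sec=0.0):
    """Return (start, end) pairs in seconds. Hypothetical helper, for illustration only."""
    if chunk_duration_sec <= 0:
        raise ValueError('chunk_duration_sec must be positive')
    if not 0 <= overlap_sec < chunk_duration_sec:
        raise ValueError('overlap_sec must be in [0, chunk_duration_sec)')

    # Assumption: consecutive chunks are offset by duration minus overlap.
    step = chunk_duration_sec - overlap_sec
    bounds = []
    start = 0.0
    while start < total_sec:
        end = min(start + chunk_duration_sec, total_sec)
        bounds.append((start, end))
        if end >= total_sec:
            break  # reached the end of the audio
        start += step

    # Drop the last chunk if it is shorter than the requested minimum.
    if bounds and (bounds[-1][1] - bounds[-1][0]) < min_chunk_duration_sec:
        bounds.pop()
    return bounds


print(chunk_bounds(70.0, 30.0, overlap_sec=5.0))
# [(0.0, 30.0), (25.0, 55.0), (50.0, 70.0)]
print(chunk_bounds(70.0, 30.0, overlap_sec=5.0, min_chunk_duration_sec=25.0))
# [(0.0, 30.0), (25.0, 55.0)]
```

Under these assumptions, a 70-second file yields three chunks, and raising min_chunk_duration_sec to 25.0 removes the trailing 20-second chunk. An input of zero length yields an empty list, which mirrors the documented behavior for inputs without audio.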
udf encode_audio()
@pxt.udf
encode_audio(
audio_data: pxt.Array[float32],
*,
input_sample_rate: pxt.Int,
format: pxt.String,
output_sample_rate: pxt.Int | None = None
) -> pxt.Audio
Encodes an audio clip represented as an array into a specified audio format.
Parameters:
audio_data (pxt.Array[float32]): An array of sampled amplitudes. The accepted array shapes are (N,) or (1, N) for mono audio or (2, N) for stereo.
input_sample_rate (pxt.Int): The sample rate of the input audio data.
format (pxt.String): The desired output audio format. The supported formats are 'wav', 'mp3', 'flac', and 'mp4'.
output_sample_rate (pxt.Int | None): The desired sample rate for the output audio. Defaults to the input sample rate if unspecified.
Examples:
Add a computed column with encoded FLAC audio files to a table with audio data (as arrays of floats) and sample rates:
from pixeltable.functions.audio import encode_audio

t.add_computed_column(
    audio_file=encode_audio(
        t.audio_data, input_sample_rate=t.sample_rate, format='flac'
    )
)
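The accepted shapes for audio_data are channel-first: (N,) or (1, N) for mono and (2, N) for stereo. A channel-last array such as (N, 2) is not among them. The sketch below shows one way to check a shape before storing data. The helper channel_count is hypothetical and not part of Pixeltable, and how encode_audio itself handles other shapes is not specified here.

```python
def channel_count(shape):
    """Map an array shape to a channel count, following the documented layouts.

    Hypothetical pre-check helper, for illustration only.
    """
    if len(shape) == 1:
        return 1  # (N,): mono
    if len(shape) == 2 and shape[0] in (1, 2):
        return shape[0]  # (1, N): mono, (2, N): stereo
    raise ValueError(f'unsupported audio array shape: {shape}')


print(channel_count((48000,)))    # 1
print(channel_count((2, 48000)))  # 2
```

With NumPy arrays, the shape would come from the array's shape attribute, and a channel-last array could be transposed into the (2, N) layout before being stored.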
udf get_metadata()
@pxt.udf
get_metadata(audio: pxt.Audio) -> pxt.Json
Gets various metadata associated with an audio file and returns it as a dictionary.
Parameters:
audio (pxt.Audio): The audio to get metadata for.
Returns:
pxt.Json: A dict such as the following:
{
'size': 2568827,
'streams': [
{
'type': 'audio',
'frames': 0,
'duration': 2646000,
'metadata': {},
'time_base': 2.2675736961451248e-05,
'codec_context': {
'name': 'flac',
'profile': None,
'channels': 1,
'codec_tag': '\x00\x00\x00\x00',
},
'duration_seconds': 60.0,
}
],
'bit_rate': 342510,
'metadata': {'encoder': 'Lavf61.1.100'},
'bit_exact': False,
}
Examples:
Extract metadata for files in the audio_col column of the table tbl:
tbl.select(tbl.audio_col.get_metadata()).collect()
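In the sample dictionary above, the stream's duration appears to be expressed in units of time_base: 2646000 × 2.2675736961451248e-05 is approximately 60.0, which matches duration_seconds. The sketch below reads the duration from such a dictionary in plain Python. The helper audio_duration_seconds is hypothetical, and the relationship between the fields is inferred from the sample values rather than stated by the documentation.

```python
def audio_duration_seconds(meta):
    """Return the duration of the first audio stream in seconds, or None.

    Hypothetical helper operating on a dict shaped like the get_metadata() sample.
    """
    for stream in meta.get('streams', []):
        if stream.get('type') == 'audio':
            # Assumption: 'duration' is counted in units of 'time_base' seconds.
            return stream['duration'] * stream['time_base']
    return None


# Abbreviated copy of the sample output shown above.
sample = {
    'size': 2568827,
    'streams': [
        {
            'type': 'audio',
            'duration': 2646000,
            'time_base': 2.2675736961451248e-05,
            'duration_seconds': 60.0,
        }
    ],
    'bit_rate': 342510,
}

print(audio_duration_seconds(sample))  # approximately 60.0
```

When duration_seconds is present, reading it directly is simpler. The computation is mainly useful as a cross-check of the two fields.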