# Extract audio from video

<a href="https://kaggle.com/kernels/welcome?src=https://github.com/pixeltable/pixeltable/blob/release/docs/release/howto/cookbooks/audio/audio-extract-from-video.ipynb" id="openKaggle" target="_blank" rel="noopener noreferrer"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open in Kaggle" style={{ display: 'inline', margin: '0px' }} noZoom /></a>  <a href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/howto/cookbooks/audio/audio-extract-from-video.ipynb" id="openColab" target="_blank" rel="noopener noreferrer"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" style={{ display: 'inline', margin: '0px' }} noZoom /></a>  <a href="https://raw.githubusercontent.com/pixeltable/pixeltable/refs/tags/release/docs/release/howto/cookbooks/audio/audio-extract-from-video.ipynb" id="downloadNotebook" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/%E2%AC%87-Download%20Notebook-blue" alt="Download Notebook" style={{ display: 'inline', margin: '0px' }} noZoom /></a>

<Tip>This documentation page is also available as an interactive notebook. You can launch the notebook in
Kaggle or Colab, or download it for use with an IDE or local Jupyter installation, by clicking one of the
above links.</Tip>

export const quartoRawHtml = [`
<table>
<thead>
<tr>
<th>Source</th>
<th>Goal</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">Lecture recordings</td>
<td style="vertical-align: middle;">Transcribe for notes</td>
</tr>
<tr>
<td style="vertical-align: middle;">Meeting videos</td>
<td style="vertical-align: middle;">Extract for speaker ID</td>
</tr>
<tr>
<td style="vertical-align: middle;">Video podcasts</td>
<td style="vertical-align: middle;">Create audio-only version</td>
</tr>
</tbody>
</table>
`, `
<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th">title</th>
<th data-quarto-table-cell-role="th">audio</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">Sample Video</td>
<td style="vertical-align: middle;"><div class="pxt_audio">
<audio controls>
<source src="http://127.0.0.1:52039/Users/pjlb/.pixeltable/media/0441741fa9664272ba20c3ce27346376/71/71a9/0441741fa9664272ba20c3ce27346376_6_2_71a917911b114445917b6301fb1d9091.mp3" type="audio/mpeg">
</source>
</audio>
</div></td>
</tr>
</tbody>
</table>
`, `
<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th">title</th>
<th data-quarto-table-cell-role="th">transcript</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">Sample Video</td>
<td style="vertical-align: middle;">vaporized one .</td>
</tr>
</tbody>
</table>
`, `
<table>
<thead>
<tr>
<th>Format</th>
<th>Use case</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;"><code>mp3</code></td>
<td style="vertical-align: middle;">Compressed, widely compatible</td>
</tr>
<tr>
<td style="vertical-align: middle;"><code>wav</code></td>
<td style="vertical-align: middle;">Uncompressed, for processing</td>
</tr>
<tr>
<td style="vertical-align: middle;"><code>flac</code></td>
<td style="vertical-align: middle;">Lossless compression</td>
</tr>
</tbody>
</table>
`];


Extract the audio track from video files for transcription, analysis, or
processing.

## Problem

You have video files but only need the audio track, whether for
transcription, speaker analysis, or other audio processing. Extracting
audio by hand with ffmpeg means running a separate command per file, and
the output files are not tracked by your data pipeline.

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[0] }} />
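
For comparison, the manual route could look roughly like the sketch below. The helper `ffmpeg_extract_cmd` and the file name `lecture.mp4` are hypothetical and not part of Pixeltable; the sketch only builds the ffmpeg command line and does not run it, so ffmpeg does not need to be installed to try it.

```python
import shlex
from pathlib import Path

# Hypothetical helper, not part of Pixeltable: it builds the ffmpeg command
# you would otherwise type by hand for every video file.
CODECS = {'mp3': 'libmp3lame', 'wav': 'pcm_s16le', 'flac': 'flac'}


def ffmpeg_extract_cmd(video_path: str, fmt: str = 'mp3') -> list[str]:
    """Return the ffmpeg argument list that writes the audio track next to the video."""
    if fmt not in CODECS:
        raise ValueError(f'unsupported format: {fmt}')
    out_path = Path(video_path).with_suffix(f'.{fmt}')
    return [
        'ffmpeg',
        '-y',                    # overwrite an existing output file
        '-i', video_path,        # input video
        '-vn',                   # drop the video stream
        '-acodec', CODECS[fmt],  # audio encoder for the chosen format
        str(out_path),
    ]


cmd = ffmpeg_extract_cmd('lecture.mp4', 'mp3')
print(shlex.join(cmd))
```

You would still have to run this for each file, handle failures, and record where each output landed, which is the bookkeeping a computed column takes over.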

## Solution

**What’s in this recipe:**

* Extract audio from video as a computed column
* Choose audio format (mp3, wav, flac)
* Chain with transcription for automatic video-to-text

Use the `extract_audio` function to create an audio column from a video
column. Because the result is an ordinary audio column, you can pass it
directly to transcription and other audio-processing functions.

### Setup

```python  theme={null}
%pip install -qU pixeltable boto3 'numpy<2.4'
```

```python  theme={null}
import pixeltable as pxt
from pixeltable.functions.video import extract_audio
```

```python  theme={null}
# Create a fresh directory
pxt.drop_dir('audio_extract_demo', force=True)
pxt.create_dir('audio_extract_demo')
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
  Created directory 'audio\_extract\_demo'.
  \<pixeltable.catalog.dir.Dir at 0x1061fc510>
</pre>

### Extract audio from video

```python  theme={null}
# Create table for videos
videos = pxt.create_table(
    'audio_extract_demo/videos', {'title': pxt.String, 'video': pxt.Video}
)
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Created table 'videos'.
</pre>

```python  theme={null}
# Add computed column to extract audio as MP3
videos.add_computed_column(
    audio=extract_audio(videos.video, format='mp3')
)
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Added 0 column values with 0 errors.
  No rows affected.
</pre>

```python  theme={null}
# Insert a sample video with an audio track (from the multimedia-commons S3 bucket)
video_url = 's3://multimedia-commons/data/videos/mp4/ffe/ffb/ffeffbef41bbc269810b2a1a888de.mp4'

videos.insert([{'title': 'Sample Video', 'video': video_url}])
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Inserting rows into \`videos\`: 1 rows \[00:00, 207.52 rows/s]
  Inserted 1 row with 0 errors.
  1 row inserted, 4 values computed.
</pre>

```python  theme={null}
# View results
videos.select(videos.title, videos.audio).collect()
```

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[1] }} />

### Chain with transcription

Add transcription as a follow-up computed column:

```python  theme={null}
# Install whisper for transcription
%pip install -qU openai-whisper
```

```python  theme={null}
from pixeltable.functions import whisper

# Add transcription of the extracted audio
videos.add_computed_column(
    transcription=whisper.transcribe(videos.audio, model='base.en')
)
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Added 1 column value with 0 errors.
  1 row updated, 1 value computed.
</pre>

```python  theme={null}
# Extract the transcript text
videos.add_computed_column(transcript=videos.transcription.text)
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Added 1 column value with 0 errors.
  1 row updated, 1 value computed.
</pre>
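
The `videos.transcription.text` expression above picks the `text` field out of the JSON transcription result. As a rough plain-Python analogy, assuming the result follows the dict shape returned by openai-whisper's `transcribe()` (keys `text`, `language`, `segments`), the lookup amounts to the following. The values are made up for illustration and the segment entries are trimmed to a few keys.

```python
# Hypothetical, trimmed-down result with made-up values; the key names follow
# the dict returned by openai-whisper's transcribe().
transcription = {
    'text': ' Hello and welcome to the lecture.',
    'language': 'en',
    'segments': [
        {'start': 0.0, 'end': 2.4, 'text': ' Hello and welcome'},
        {'start': 2.4, 'end': 3.9, 'text': ' to the lecture.'},
    ],
}

# Equivalent of selecting the `text` field from the JSON column.
transcript = transcription['text']

# Per-segment text is available too, with start and end times in seconds.
segment_texts = [seg['text'] for seg in transcription['segments']]

print(transcript.strip())
```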

```python  theme={null}
# View the full pipeline results
videos.select(videos.title, videos.transcript).collect()
```

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[2] }} />

## Explanation

**Audio format options:**

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[3] }} />
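
To get a feel for the size difference between uncompressed and compressed output, here is a back-of-the-envelope estimate. The numbers are assumptions chosen for illustration (16-bit stereo PCM at 44.1 kHz for `wav`, a constant 128 kbps bitrate for `mp3`), not the settings `extract_audio` necessarily uses, and container headers are ignored. `flac` is left out because its size depends on the audio content.

```python
def pcm_wav_bytes(seconds, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Approximate payload size of uncompressed PCM audio (header excluded)."""
    return int(seconds * sample_rate * channels * bytes_per_sample)


def cbr_bytes(seconds, kbps=128):
    """Approximate size of a constant-bitrate stream such as 128 kbps MP3."""
    return int(seconds * kbps * 1000 / 8)


minute = 60
wav = pcm_wav_bytes(minute)  # 44_100 * 2 * 2 * 60 bytes
mp3 = cbr_bytes(minute)      # 128_000 / 8 * 60 bytes

print(f'wav: {wav / 1e6:.1f} MB, mp3: {mp3 / 1e6:.2f} MB, ratio: {wav / mp3:.1f}x')
```

Under these assumptions one minute of audio takes roughly 10.6 MB as `wav` and about 0.96 MB as `mp3`, which is why `mp3` suits storage and sharing while `wav` suits further signal processing.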

**Pipeline flow:**

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Video → extract\_audio → Audio → whisper.transcribe → Transcript
</pre>

Each step is a computed column. When you insert a new video:

1. Audio is extracted automatically
2. Whisper transcribes the audio
3. All results are cached for future queries
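
The steps above can be pictured with a toy model in plain Python. This is a sketch of the idea only, not how Pixeltable is implemented: `ToyTable`, `fake_extract_audio`, and `fake_transcribe` are hypothetical stand-ins, and the "audio" and "transcript" values are just strings.

```python
from typing import Any, Callable


class ToyTable:
    """Toy stand-in for a table with computed columns (not Pixeltable's implementation)."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []
        self.computed: list[tuple[str, Callable[[dict[str, Any]], Any]]] = []

    def add_computed_column(self, name: str, fn: Callable[[dict[str, Any]], Any]) -> None:
        self.computed.append((name, fn))
        for row in self.rows:  # backfill rows that already exist
            row[name] = fn(row)

    def insert(self, row: dict[str, Any]) -> None:
        row = dict(row)
        for name, fn in self.computed:  # run the chain in definition order
            row[name] = fn(row)
        self.rows.append(row)  # computed values are stored with the row


calls = {'extract': 0, 'transcribe': 0}


def fake_extract_audio(row: dict[str, Any]) -> str:
    calls['extract'] += 1
    return row['video'].rsplit('.', 1)[0] + '.mp3'


def fake_transcribe(row: dict[str, Any]) -> str:
    calls['transcribe'] += 1
    return f"transcript of {row['audio']}"


t = ToyTable()
t.add_computed_column('audio', fake_extract_audio)
t.add_computed_column('transcript', fake_transcribe)
t.insert({'title': 'Sample Video', 'video': 'sample.mp4'})

# Reading the stored values twice does not re-run either step.
first = t.rows[0]['transcript']
second = t.rows[0]['transcript']
print(first, calls)
```

The point of the model is the ordering and the storage: the transcript step sees the audio produced by the earlier step, and each function runs once per inserted row rather than once per query.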

## See also

* [Transcribe
  audio](/howto/cookbooks/audio/audio-transcribe) -
  Audio-only transcription
* [Summarize
  podcasts](/howto/cookbooks/audio/audio-summarize-podcast) -
  Transcribe and summarize
* [Extract video
  frames](/howto/cookbooks/video/video-extract-frames) -
  Work with video frames

