What are Iterators?
Iterators in Pixeltable are specialized tools for processing and transforming media content. They efficiently break down large files into manageable chunks, enabling analysis at different granularities. Iterators work seamlessly with views to create virtual derived tables without duplicating storage.
In Pixeltable, iterators:
Process media files incrementally to manage memory efficiently
Transform single records into multiple output records
Support various media types including documents, videos, images, and audio
Integrate with the view system for automated processing pipelines
Provide configurable parameters for fine-tuning output
Iterators are particularly useful when:
Working with large media files that can’t be processed at once
Building retrieval systems that require chunked content
Creating analysis pipelines for multimedia data
Implementing feature extraction workflows
import pixeltable as pxt
from pixeltable.iterators import DocumentSplitter
# Create a view using an iterator
chunks = pxt.create_view(
    'docs.chunks',
    documents_table,
    iterator=DocumentSplitter.create(
        document=documents_table.document,
        separators='paragraph'
    )
)
Core Concepts
Document Splitting: split documents into chunks by headings, paragraphs, or sentences
Video Processing: extract frames at specified intervals or counts
Image Tiling: divide images into overlapping or non-overlapping tiles
Audio Chunking: split audio files into time-based chunks with configurable overlap
Iterators are powerful tools for processing large media files. They work seamlessly with Pixeltable’s computed columns and versioning system.
Available Iterators
DocumentSplitter
FrameIterator
TileIterator
AudioSplitter
from pixeltable.iterators import DocumentSplitter
# Create view with document chunks
chunks_view = pxt.create_view(
    'docs.chunks',
    docs_table,
    iterator=DocumentSplitter.create(
        document=docs_table.document,
        separators='paragraph,token_limit',
        limit=500,
        metadata='title,heading'
    )
)
Parameters
separators: Choose from ‘heading’, ‘paragraph’, ‘sentence’, ‘token_limit’, ‘char_limit’, ‘page’
limit: Maximum tokens/characters per chunk
metadata: Optional fields like ‘title’, ‘heading’, ‘sourceline’, ‘page’, ‘bounding_box’
overlap: Optional overlap between chunks
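Once the view exists, the chunk text and any requested metadata fields are available as regular columns. A minimal query sketch, assuming the chunks_view defined above exposes text plus the requested title and heading fields as columns:
# Preview the first few chunks with their metadata
# ('text', 'title', and 'heading' are assumed output columns,
#  given metadata='title,heading' above)
chunks_view.select(
    chunks_view.title,
    chunks_view.heading,
    chunks_view.text
).limit(5).collect()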
from pixeltable.iterators import FrameIterator
# Extract frames at 1 FPS
frames_view = pxt.create_view(
    'videos.frames',
    videos_table,
    iterator=FrameIterator.create(
        video=videos_table.video,
        fps=1.0
    )
)

# Extract exact number of frames (evenly spaced)
frames_view = pxt.create_view(
    'videos.sampled',
    videos_table,
    iterator=FrameIterator.create(
        video=videos_table.video,
        num_frames=10  # Extract 10 evenly-spaced frames
    )
)

# Extract only keyframes (I-frames) for efficient processing
keyframes_view = pxt.create_view(
    'videos.keyframes',
    videos_table,
    iterator=FrameIterator.create(
        video=videos_table.video,
        keyframes_only=True
    )
)
Parameters
fps: Frames per second to extract (can be fractional)
num_frames: Exact number of frames to extract
keyframes_only: Extract only keyframes (I-frames) - efficient for quick video scanning
Only one of fps, num_frames, or keyframes_only can be specified
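Before building a pipeline on top of the frames, it can help to check what the iterator actually produced. A short sketch reusing frames_view from above, together with the pos and frame output columns used in the workflows later on:
# Sanity-check the extraction
frames_view.count()  # number of frames produced by the iterator
# Preview a few frames with their position in the source video
frames_view.select(
    frames_view.pos,
    frames_view.frame
).limit(5).collect()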
from pixeltable.iterators import TileIterator
# Create tiles with overlap
tiles_view = pxt.create_view(
    'images.tiles',
    images_table,
    iterator=TileIterator.create(
        image=images_table.image,
        tile_size=(224, 224),  # Width, Height
        overlap=(32, 32)       # Horizontal, Vertical overlap
    )
)
Parameters
tile_size: Tuple of (width, height) for each tile
overlap: Optional tuple for overlap between tiles
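With tile_size=(224, 224) and overlap=(32, 32), consecutive tiles are offset by 192 pixels in each direction. A typical next step is to run a model over each tile through a computed column; the sketch below uses a placeholder UDF and assumes the iterator exposes each tile in a tile image column:
import PIL.Image

# Placeholder classifier; replace the body with a real model call
@pxt.udf
def classify_tile(tile: PIL.Image.Image) -> str:
    return 'unknown'

# Classify every tile ('tile' is the assumed output column name)
tiles_view.add_computed_column(label=classify_tile(tiles_view.tile))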
from pixeltable.iterators import AudioSplitter
# Split audio into chunks
chunks_view = pxt.create_view(
    'audio.chunks',
    audio_table,
    iterator=AudioSplitter.create(
        audio=audio_table.audio,
        chunk_duration_sec=30.0,     # Split into 30-second chunks
        overlap_sec=2.0,             # 2-second overlap between chunks
        min_chunk_duration_sec=5.0   # Drop last chunk if < 5 seconds
    )
)
Parameters
chunk_duration_sec (float): Duration of each audio chunk in seconds
overlap_sec (float, default: 0.0): Overlap duration between consecutive chunks in seconds
min_chunk_duration_sec (float, default: 0.0): Minimum duration threshold - the last chunk will be dropped if it’s shorter than this value
Returns
For each chunk, the iterator yields:
start_time_sec: Start time of the chunk in seconds
end_time_sec: End time of the chunk in seconds
audio_chunk: The audio chunk as pxt.Audio type
Notes
If the input contains no audio, no chunks are yielded
The audio file is processed efficiently with proper codec handling
Supports various audio formats including MP3, AAC, Vorbis, Opus, FLAC
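Because the chunk boundaries are returned as columns, derived values such as the effective chunk duration can be computed directly in a query. A small sketch over the chunks_view defined above:
# Inspect chunk boundaries and their effective durations
chunks_view.select(
    chunks_view.start_time_sec,
    chunks_view.end_time_sec,
    duration=chunks_view.end_time_sec - chunks_view.start_time_sec
).limit(5).collect()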
Common Use Cases
Document Processing
Split documents for:
RAG systems
Text analysis
Content extraction
Video Analysis
Extract frames for:
Object detection
Scene classification
Activity recognition
Image Processing
Create tiles for:
High-resolution analysis
Object detection
Segmentation tasks
Audio Analysis
Split audio for:
Speech recognition
Sound classification
Audio feature extraction
Example Workflows
from pixeltable.functions.huggingface import sentence_transformer

# Create document chunks
chunks = pxt.create_view(
    'rag.chunks',
    docs_table,
    iterator=DocumentSplitter.create(
        document=docs_table.document,
        separators='paragraph',
        limit=500
    )
)

# Add embeddings
chunks.add_embedding_index(
    'text',
    string_embed=sentence_transformer.using(model_id='all-mpnet-base-v2')
)
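With the index in place, the chunks can be ranked against a query string. A minimal retrieval sketch using the similarity() expression on the indexed text column (the query string is illustrative):
# Retrieve the chunks most similar to a query
query = 'How do iterators manage memory?'
sim = chunks.text.similarity(query)
results = (
    chunks.order_by(sim, asc=False)
    .select(chunks.text, similarity=sim)
    .limit(5)
    .collect()
)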
# Extract frames at 1 FPS
frames = pxt.create_view(
    'detection.frames',
    videos_table,
    iterator=FrameIterator.create(
        video=videos_table.video,
        fps=1.0
    )
)

# Add object detection
frames.add_computed_column(detections=detect_objects(frames.frame))
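Here detect_objects stands in for whatever detection function you use. As a concrete version of the line above, a sketch assuming Pixeltable's optional Hugging Face integration provides detr_for_object_detection (function name and signature are an assumption):
from pixeltable.functions import huggingface

# Run DETR object detection on each frame
# (model_id is an example checkpoint)
frames.add_computed_column(
    detections=huggingface.detr_for_object_detection(
        frames.frame,
        model_id='facebook/detr-resnet-50'
    )
)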
# Split long audio files
chunks = pxt.create_view(
    'audio.chunks',
    audio_table,
    iterator=AudioSplitter.create(
        audio=audio_table.audio,
        chunk_duration_sec=30.0
    )
)

# Add transcription
chunks.add_computed_column(text=whisper_transcribe(chunks.audio_chunk))
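Similarly, whisper_transcribe above is a placeholder. One concrete version of that line, assuming Pixeltable's optional Whisper integration (pixeltable.functions.whisper) is installed:
from pixeltable.functions import whisper

# Transcribe each audio chunk with a local Whisper model
chunks.add_computed_column(
    text=whisper.transcribe(chunks.audio_chunk, model='base.en')
)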
from pixeltable.functions.video import make_video

# Extract frames at 1 FPS
frames = pxt.create_view(
    'video.frames',
    videos_table,
    iterator=FrameIterator.create(
        video=videos_table.video,
        fps=1.0
    )
)

# Process frames (e.g., apply a filter)
frames.add_computed_column(processed=frames.frame.filter('BLUR'))

# Create new videos from processed frames
processed_videos = frames.select(
    frames.video_id,
    make_video(frames.pos, frames.processed)  # Default fps is 25
).group_by(frames.video_id).collect()
Best Practices
Memory Management
Use appropriate chunk sizes
Consider overlap requirements
Monitor memory usage with large files
Performance
Balance chunk size vs. processing time
Use batch processing when possible
Cache intermediate results
Tips & Tricks
When using token_limit with DocumentSplitter, ensure the limit accounts for any model context windows in your pipeline.
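For example, if a downstream embedding model truncates input at 512 tokens, keep chunks comfortably below that limit (the view name here is illustrative):
# Chunk by token count, leaving headroom under a 512-token model limit
chunks = pxt.create_view(
    'rag.small_chunks',  # illustrative view name
    docs_table,
    iterator=DocumentSplitter.create(
        document=docs_table.document,
        separators='token_limit',
        limit=384  # leaves room for any prompt or template tokens added later
    )
)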
Additional Resources