Learn about iterators for processing documents, videos, audio, and images
separators
: Choose from ‘heading’, ‘paragraph’, ‘sentence’, ‘token_limit’, ‘char_limit’, ‘page’
limit
: Maximum tokens/characters per chunk
metadata
: Optional fields like ‘title’, ‘heading’, ‘sourceline’, ‘page’, ‘bounding_box’
overlap
: Optional overlap between chunks
RAG Pipeline
Video Object Detection
Audio Transcription
Video Generation
When using ‘token_limit’ with DocumentSplitter, ensure the limit accounts for the context window of every model in your pipeline.
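
To make the relationship between the chunk limit, the overlap, and a model context window concrete, here is a minimal sketch in plain Python. It is not DocumentSplitter's implementation: the helper names `chunk_tokens` and `max_chunk_limit` are hypothetical, whitespace-separated words stand in for real tokenizer output, and the context-window and overhead figures are illustrative assumptions only.

```python
from typing import List


def chunk_tokens(tokens: List[str], limit: int, overlap: int = 0) -> List[List[str]]:
    """Split tokens into windows of at most `limit` tokens.

    Consecutive windows share `overlap` tokens. Hypothetical helper that
    only illustrates the idea of a limit with overlap.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if not 0 <= overlap < limit:
        raise ValueError("overlap must satisfy 0 <= overlap < limit")

    step = limit - overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + limit])
        if start + limit >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


def max_chunk_limit(context_window: int, prompt_overhead: int, chunks_per_prompt: int) -> int:
    """Largest per-chunk token limit so that `chunks_per_prompt` chunks plus
    `prompt_overhead` tokens (instructions, question, answer budget) still
    fit inside `context_window`. Hypothetical helper."""
    available = context_window - prompt_overhead
    if available <= 0 or chunks_per_prompt <= 0:
        raise ValueError("no room left for chunks in the context window")
    return available // chunks_per_prompt


# Whitespace splitting is a stand-in for a real tokenizer.
tokens = "a b c d e f g h i j".split()
chunks = chunk_tokens(tokens, limit=4, overlap=1)

# Assumed numbers: a 4096-token window, 596 tokens of overhead, 5 retrieved chunks.
budget = max_chunk_limit(context_window=4096, prompt_overhead=596, chunks_per_prompt=5)
```

With these assumed numbers, each chunk could hold up to 700 tokens. In a real pipeline the count would come from the tokenizer of the model that consumes the chunks, since token counts differ between tokenizers.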