Welcome to Pixeltable

Pixeltable is AI data infrastructure that provides a declarative, incremental approach to multimodal workloads.

Open Source AI Data Infrastructure

Express complex workflows through simple table operations and computed columns:

  • Data transformations
  • Model inference
  • Custom logic
  • Multimodal data handling

🎯 The Problem

Building AI applications today requires juggling multiple tools and writing complex infrastructure code to:

  • Process and store different types of data (text, images, video, audio)
  • Track changes and maintain data lineage
  • Scale processing efficiently
  • Move from development to production

💡 The Solution

Pixeltable unifies these operations under a single declarative table interface that combines data storage, versioning, indexing, and orchestration. With built-in versioning, lineage tracking, and incremental updates, users can store, transform, index, and iterate on data for their ML workflows, so data scientists and ML engineers can focus on modeling and experimentation rather than data plumbing.

import pixeltable as pxt
from pixeltable.iterators import FrameIterator
# The yolox import path is an assumption and may differ across Pixeltable versions
from pixeltable.ext.functions.yolox import yolox

# Create a video table
videos = pxt.create_table('videos', {'video': pxt.VideoType()})

# Automatic frame extraction
frames = pxt.create_view(
    'frames',
    videos,
    iterator=FrameIterator.create(video=videos.video)
)

# Add AI processing - only runs on new data
frames.add_computed_column(detect_yolox_tiny=yolox(
    frames.frame, model_id='yolox_tiny', threshold=0.25
))
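
To make the "only runs on new data" behavior concrete, here is a minimal pure-Python sketch of an incrementally maintained computed column. It is an illustration only and does not use or mirror Pixeltable's internals; `ToyTable` and its methods are hypothetical names chosen for this example.

```python
from typing import Any, Callable


class ToyTable:
    """Hypothetical in-memory table; not Pixeltable's implementation."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []
        self.computed: dict[str, Callable[[dict[str, Any]], Any]] = {}

    def add_computed_column(self, name: str, fn: Callable[[dict[str, Any]], Any]) -> None:
        # Register the column and backfill it once for rows already stored.
        self.computed[name] = fn
        for row in self.rows:
            row[name] = fn(row)

    def insert(self, new_rows: list[dict[str, Any]]) -> None:
        # Evaluate computed columns for the incoming rows only;
        # previously stored rows are not recomputed.
        for row in new_rows:
            row = dict(row)
            for name, fn in self.computed.items():
                row[name] = fn(row)
            self.rows.append(row)


calls: list[str] = []  # records every invocation of the column function


def word_count(row: dict[str, Any]) -> int:
    calls.append(row["text"])
    return len(row["text"].split())


t = ToyTable()
t.insert([{"text": "a b c"}, {"text": "d e"}])
t.add_computed_column("n_words", word_count)  # backfills the 2 existing rows
t.insert([{"text": "f"}])                     # computes the 1 new row only
print(len(calls), [r["n_words"] for r in t.rows])
```

The column function runs three times in total: twice for the backfill and once for the later insert, rather than once per row on every update.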

🚀 Quick Start

pip install pixeltable

Core Use Cases

1. LLM Development & RAG

Industry Challenge

Organizations implementing LLM applications face significant hurdles in managing document processing, tracking model decisions, and maintaining efficient RAG systems. Traditional approaches lead to:

  • Costly reprocessing of entire document bases for minor changes
  • Lack of transparency in model decision-making
  • Complex management of chunking strategies and embeddings
  • Difficulty comparing performance across different approaches

Pixeltable Solution

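import pixeltable as pxt
from pixeltable.functions import openai
from pixeltable.functions.huggingface import sentence_transformer
from pixeltable.iterators import DocumentSplitter

# Declarative document processing with automatic versioning
docs = pxt.create_table('knowledge_base', {'document': pxt.DocumentType()})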

# Flexible chunking strategies with computed views
chunks = pxt.create_view(
    'chunks',
    docs,
    iterator=DocumentSplitter.create(
        document=docs.document,
        separators='token_limit',
        limit=300
    )
)

# Automatic embedding generation and indexing
chunks.add_embedding_index(
    'text',
    idx_name='minilm_idx',
    string_embed=sentence_transformer.using(model_id='sentence-transformers/all-MiniLM-L12-v2')
)

[...]

# Add a computed column that calls OpenAI
queries_t.add_computed_column(
    response=openai.chat_completions(model='gpt-4o-mini', messages=messages)
)
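
As a rough illustration of what chunking with a token limit and embedding-based lookup involve, the sketch below splits text on whitespace and ranks chunks with a toy bag-of-words similarity. It is not how `DocumentSplitter` or the embedding index work internally: real pipelines count model tokens and call an embedding model, and the helpers `split_by_token_limit`, `embed`, `cosine`, and `top_k` are hypothetical names used only here.

```python
import math
from collections import Counter


def split_by_token_limit(text: str, limit: int) -> list[str]:
    """Split text into chunks of at most `limit` whitespace-delimited tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    tokens = text.split()
    return [" ".join(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; stands in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank chunks by similarity to the query and keep the best k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


doc = "cats chase mice at night dogs chase cats all day birds sing at dawn"
chunks = split_by_token_limit(doc, limit=5)
best = top_k("birds at dawn", chunks, k=1)
print(chunks)
print(best)
```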

Business Impact

  • Cost Reduction: 70%+ reduction in processing costs through incremental updates
  • Quality Improvement: Complete lineage tracking ensures answer accuracy
  • Development Speed: Rapid experimentation with different strategies
  • Operational Efficiency: Built-in versioning eliminates manual tracking

2. Computer Vision Workflows

Industry Challenge

Computer vision teams struggle with:

  • Managing large-scale video and image datasets
  • Tracking model versions and annotations
  • Maintaining consistency between development and production
  • Efficiently processing incremental updates

Pixeltable Solution

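import PIL.Image
import PIL.ImageDraw
import pixeltable as pxt
from pixeltable.iterators import FrameIterator
# The yolox import path is an assumption and may differ across Pixeltable versions
from pixeltable.ext.functions.yolox import yolox

# Unified video processing pipeline
videos = pxt.create_table('videos', {'video': pxt.VideoType()})

# Automatic frame extraction and management
frames = pxt.create_view(
    'frames',
    videos,
    iterator=FrameIterator.create(video=videos.video)
)

# Integrated object detection and annotation
frames.add_computed_column(detect_yolox_tiny=yolox(
    frames.frame, model_id='yolox_tiny', threshold=0.25
))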

@pxt.udf
def draw_boxes(
    img: PIL.Image.Image, boxes: list[list[float]]
) -> PIL.Image.Image:
    result = img.copy()  # Create a copy of `img`
    d = PIL.ImageDraw.Draw(result)
    for box in boxes:
        # Draw bounding box rectangles on the copied image
        d.rectangle(box, width=3)
    return result

# Reassemble the annotated frames into one video per source video
frames.group_by(videos).select(
    pxt.functions.video.make_video(
        frames.pos,
        draw_boxes(
            frames.frame,
            frames.detect_yolox_tiny.bboxes
        )
    )
).show(1)
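
The `threshold=0.25` argument above is a confidence cutoff. The sketch below shows the general idea of dropping low-confidence detections in plain Python. The dictionary layout (`bboxes`, `scores`, `labels` as parallel lists) and the helper `filter_detections` are assumptions made for illustration, not a statement of what the detector returns.

```python
def filter_detections(detections: dict[str, list], threshold: float) -> dict[str, list]:
    """Keep only detections whose score is at least `threshold`."""
    keep = [i for i, score in enumerate(detections["scores"]) if score >= threshold]
    # Apply the same index selection to every parallel list.
    return {key: [values[i] for i in keep] for key, values in detections.items()}


# Hypothetical detector output with three candidate boxes
raw = {
    "bboxes": [[10, 10, 50, 50], [20, 30, 80, 90], [0, 0, 5, 5]],
    "scores": [0.91, 0.18, 0.40],
    "labels": [0, 2, 16],
}
kept = filter_detections(raw, threshold=0.25)
print(kept["bboxes"])
```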

Business Impact

  • Resource Optimization: Lazy evaluation reduces storage and compute costs
  • Quality Assurance: Automatic lineage tracking for all model outputs
  • Development Efficiency: Seamless integration with annotation tools
  • Deployment Confidence: Production parity with development environment

3. Multimodal AI Applications

Industry Challenge

Organizations building multimodal AI applications face:

  • Complex integration of different data types
  • Difficult relationship management between modalities
  • Lack of unified search capabilities
  • Complex pipeline maintenance

Pixeltable Solution

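import pixeltable as pxt
from pixeltable.functions import openai
from pixeltable.functions.audio import get_metadata
from pixeltable.functions.video import extract_audio

# Unified multimodal data management
t = pxt.create_table('content', {
    'video': pxt.Video
})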

# Automated cross-modal processing
t['audio'] = extract_audio(t.video, format='mp3')
t['metadata'] = get_metadata(t.audio)
t['transcription'] = openai.transcriptions(audio=t.audio, model='whisper-1')
t['transcription_text'] = t.transcription.text
[...]
t['response'] = openai.chat_completions(messages=t.message, model='gpt-4o-mini-2024-07-18', max_tokens=500)
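
Each column in the chain above is derived from an earlier one, which is what makes lineage tracking and ordered recomputation possible. The sketch below models those dependencies with Python's standard `graphlib` module. It is a conceptual illustration under the assumption that dependencies form a simple directed acyclic graph, not a description of Pixeltable's scheduler; the `deps` mapping and helper names are hypothetical, and the `response` column is left out because its inputs are elided in the example.

```python
from graphlib import TopologicalSorter

# Column -> columns it is computed from, mirroring the chain above
deps = {
    "audio": {"video"},
    "metadata": {"audio"},
    "transcription": {"audio"},
    "transcription_text": {"transcription"},
}


def evaluation_order(deps: dict[str, set[str]]) -> list[str]:
    # Inputs always appear before the columns derived from them.
    return list(TopologicalSorter(deps).static_order())


def lineage(column: str, deps: dict[str, set[str]]) -> set[str]:
    """All upstream columns that `column` transitively depends on."""
    seen: set[str] = set()
    stack = [column]
    while stack:
        for parent in deps.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


order = evaluation_order(deps)
upstream = lineage("transcription_text", deps)
print(order)
print(sorted(upstream))
```

The relative order of independent columns such as `metadata` and `transcription` is not fixed; only the guarantee that inputs precede outputs matters.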

Business Impact

  • Simplified Architecture: Single interface for all data types
  • Enhanced Search: Unified search across modalities
  • Reduced Complexity: Automated pipeline management
  • Faster Development: Built-in transformations between modalities

First Steps

📚 Popular Tutorials

Computer Vision

Natural Language Processing

Multimodal Applications

πŸ” Next Steps