Building YOLOX Detection Apps

Pixeltable YOLOX apps work in two phases:

  1. Define your detection workflow (once)
  2. Use your app (anytime)

Install Dependencies

pip install pixeltable

Define Your Detection Workflow

Create table.py:

import pixeltable as pxt
from pixeltable.ext.functions.yolox import yolox
import PIL.Image
import PIL.ImageDraw

# Initialize app structure
pxt.drop_dir("detection", force=True)
pxt.create_dir("detection")

# Create tables for different media types
images = pxt.create_table(
    'detection.images', 
    {'image': pxt.ImageType()},
    if_exists="ignore"
)

videos = pxt.create_table(
    'detection.videos',
    {'video': pxt.VideoType()},
    if_exists="ignore"
)

# Create frame extraction view
frames = pxt.create_view(
    'detection.frames',
    videos,
    iterator=pxt.iterators.FrameIterator.create(
        video=videos.video,
        fps=1  # Extract 1 frame per second
    )
)

# Add detection workflow to images
images.add_computed_column(
    detections=yolox(
        images.image,
        model_id='yolox_s',  # Choose model size
        threshold=0.5        # Detection confidence threshold
    )
)

# Add detection workflow to video frames
frames.add_computed_column(
    detections=yolox(
        frames.frame,
        model_id='yolox_m',
        threshold=0.25
    )
)

# Add visualization function
@pxt.udf
def draw_boxes(img: PIL.Image.Image, boxes: list[list[float]]) -> PIL.Image.Image:
    result = img.copy()
    d = PIL.ImageDraw.Draw(result)
    for box in boxes:
        d.rectangle(box, width=3)
    return result

# Add visualization column to both tables
images.add_computed_column(
    visualization=draw_boxes(images.image, images.detections.boxes)
)

frames.add_computed_column(
    visualization=draw_boxes(frames.frame, frames.detections.boxes)
)
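
The draw_boxes UDF above renders unlabeled boxes. A variant that also writes each detection's label text next to its box (using the label_text field of the detection output) might look like the following sketch. It is written as a plain function so it runs standalone; in table.py it would be decorated with @pxt.udf like draw_boxes and passed frames.detections.label_text as its third argument.

```python
import PIL.Image
import PIL.ImageDraw

def draw_labeled_boxes(
    img: PIL.Image.Image,
    boxes: list[list[float]],
    labels: list[str],
) -> PIL.Image.Image:
    """Draw each bounding box with its label text above the top-left corner."""
    result = img.copy()
    d = PIL.ImageDraw.Draw(result)
    for box, label in zip(boxes, labels):
        d.rectangle(box, width=3)
        # Place the label just above the box, clamped to the top image edge
        d.text((box[0], max(box[1] - 12, 0)), label)
    return result
```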

Use Your App

Create app.py:

import pixeltable as pxt

# Connect to your tables
images = pxt.get_table("detection.images")
videos = pxt.get_table("detection.videos")
frames = pxt.get_table("detection.frames")  # views are retrieved with get_table as well

# Process images
images.insert([
    {'image': 'path/to/image1.jpg'},
    {'image': 'path/to/image2.jpg'}
])

# Process videos
videos.insert([
    {'video': 'path/to/video1.mp4'}
])

# Get detection results
image_results = images.select(
    images.image,
    images.detections,
    images.visualization
).collect()

frame_results = frames.select(
    frames.frame,
    frames.detections,
    frames.visualization
).collect()

# Access specific detection information:
# filter frames whose first detection scores above 0.9
high_confidence = frames.where(
    frames.detections.scores[0] > 0.9
).collect()
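
collect() returns rows as dictionaries, so detection output can also be post-processed in plain Python. A minimal sketch, assuming each row's detections field follows the parallel-list shape shown under Rich Results (count_confident is a hypothetical helper, not part of the Pixeltable API):

```python
def count_confident(rows: list[dict], min_score: float = 0.9) -> int:
    """Count detections across all rows whose confidence exceeds min_score."""
    total = 0
    for row in rows:
        det = row["detections"]
        total += sum(1 for s in det["scores"] if s > min_score)
    return total
```

For example, count_confident(list(frame_results)) would tally high-confidence detections across all extracted frames.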

Available Models

Model Variants

Model        Speed       Accuracy    Use Case
yolox_nano   Fastest     Base        Mobile/Edge devices
yolox_tiny   Very Fast   Good        Resource-constrained environments
yolox_s      Fast        Better      Balanced performance
yolox_m      Moderate    High        General purpose
yolox_l      Slower      Very High   High accuracy needs
yolox_x      Slowest     Highest     Maximum accuracy

Key Features

Automatic Processing

The workflow handles model loading, inference, and result storage automatically:

detections=yolox(images.image, model_id='yolox_s')

Integrated Video Support

Built-in frame extraction and processing:

frames = pxt.create_view('detection.frames', videos,
    iterator=pxt.iterators.FrameIterator.create(
        video=videos.video, fps=1
    )
)

Rich Results

Each detection result is a dictionary of parallel lists, one entry per detected object:

{
    "boxes": [[x1, y1, x2, y2], ...],
    "scores": [0.98, ...],
    "labels": [1, ...],
    "label_text": ["person", ...]
}
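
Because boxes, scores, and labels are parallel lists, it is often convenient to zip them into one record per detected object before further processing. A minimal sketch (to_objects is a hypothetical helper, not part of the Pixeltable API):

```python
def to_objects(det: dict) -> list[dict]:
    """Convert the parallel-list detection dict into one record per object."""
    return [
        {"box": box, "score": score, "label": label}
        for box, score, label in zip(
            det["boxes"], det["scores"], det["label_text"]
        )
    ]
```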

Best Practices