Understand when to use bounding boxes versus pixel-level masks for image
analysis.
What’s in this recipe:
- Run object detection to get bounding boxes and labels
- Run panoptic segmentation to get pixel-level masks
- Visualize and compare outputs side-by-side
Problem
You need to analyze objects in images, but there are two approaches: object detection, which returns bounding boxes, and panoptic segmentation, which returns pixel-level masks.
Which should you use? Detection is faster but approximate. Segmentation
is slower but precise.
Solution
Run both approaches on the same images using DETR models and compare the
results.
Setup
%pip install -qU pixeltable torch transformers timm
import numpy as np
import pixeltable as pxt
from pixeltable.functions.huggingface import detr_for_object_detection, detr_for_segmentation
from pixeltable.functions.vision import draw_bounding_boxes, overlay_segmentation
Load images
pxt.drop_dir('detection_vs_seg', force=True)
pxt.create_dir('detection_vs_seg')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory ‘detection_vs_seg’.
<pixeltable.catalog.dir.Dir at 0x145b43f90>
images = pxt.create_table('detection_vs_seg.images', {'image': pxt.Image})
base_url = 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images'
images.insert([
{'image': f'{base_url}/000000000034.jpg'},
{'image': f'{base_url}/000000000049.jpg'},
])
Created table ‘images’.
Inserted 2 rows with 0 errors in 0.22 s (9.21 rows/s)
2 rows inserted.
Run object detection
The detr_for_object_detection function returns bounding boxes, labels,
and confidence scores.
Parameters:
- model_id: DETR variant (facebook/detr-resnet-50 or facebook/detr-resnet-101)
- threshold: Confidence threshold (0.0-1.0). Higher values yield fewer, but more confident, detections.
Output:
{'boxes': [[x1, y1, x2, y2], ...], 'scores': [0.98, ...], 'label_text': ['person', ...]}
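A quick way to get a feel for the threshold parameter is to run the model inline at two settings before committing to a stored column. A sketch (this runs the model twice, and the exact labels depend on your images):
# Compare a loose and a strict confidence cutoff side by side
images.select(
    loose=detr_for_object_detection(
        images.image, model_id='facebook/detr-resnet-50', threshold=0.5
    ).label_text,
    strict=detr_for_object_detection(
        images.image, model_id='facebook/detr-resnet-50', threshold=0.9
    ).label_text,
).collect()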
images.add_computed_column(
detections=detr_for_object_detection(
images.image,
model_id='facebook/detr-resnet-50',
threshold=0.8
)
)
Added 2 column values with 0 errors in 4.09 s (0.49 rows/s)
2 rows updated.
# View detection results
images.select(images.image, images.detections).collect()
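The stored JSON supports path access, so individual fields can be projected directly (the same pattern as images.detections.boxes used below); a minimal sketch:
# Project just the labels and confidence scores from the stored JSON
images.select(
    images.image,
    labels=images.detections.label_text,
    scores=images.detections.scores,
).collect()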
Visualize detections with bounding boxes
Use draw_bounding_boxes to overlay the detection results on the
original image.
images.add_computed_column(
detection_viz=draw_bounding_boxes(
images.image,
boxes=images.detections.boxes,
labels=images.detections.label_text,
fill=True,
width=2
)
)
Added 2 column values with 0 errors in 0.03 s (58.89 rows/s)
2 rows updated.
images.select(images.detection_viz).collect()
Run panoptic segmentation
The detr_for_segmentation function returns pixel-level masks and
segment metadata.
Parameters:
- model_id: Segmentation model (facebook/detr-resnet-50-panoptic)
- threshold: Confidence threshold for filtering segments
Output:
{
'segmentation': np.ndarray, # (H, W) array where each pixel = segment ID
'segments_info': [{'id': 1, 'label_text': 'person', 'score': 0.98}, ...]
}
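Because each pixel in the segmentation array holds a segment ID, per-segment pixel areas reduce to a mask-and-count. A plain-numpy sketch over a single output dict (the function name is illustrative, not part of Pixeltable):
import numpy as np

def segment_areas(seg_output: dict) -> dict:
    # Pixel area per segment, keyed by (segment id, label) so that
    # duplicate labels (e.g. two 'person' segments) stay distinct
    seg_map = np.asarray(seg_output['segmentation'])
    return {
        (info['id'], info['label_text']): int((seg_map == info['id']).sum())
        for info in seg_output['segments_info']
    }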
Note: The full segmentation output contains a numpy array that
can’t be stored as JSON. We store just the segments_info metadata
and compute the pixel-level visualization inline.
# Store just the segments_info (JSON-serializable) as a computed column
# The segmentation array will be computed inline for visualization
seg_expr = detr_for_segmentation(
images.image,
model_id='facebook/detr-resnet-50-panoptic',
threshold=0.5
)
images.add_computed_column(segments_info=seg_expr.segments_info)
# View stored segmentation info
images.select(images.image, images.segments_info).collect()
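If you only need the label names, the stored metadata can be reduced with apply, the same mechanism this recipe uses for counting later; a sketch (assuming apply accepts a lambda when col_type is given):
# Flatten segments_info to a list of labels per image
seg_labels = images.segments_info.apply(
    lambda infos: [s['label_text'] for s in infos], col_type=pxt.Json
)
images.select(images.image, seg_labels=seg_labels).collect()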
Visualize segmentation with colored overlay
Use overlay_segmentation to visualize the pixel masks with colored
regions and contours.
# Compute segmentation visualization inline
# Cast the segmentation array to the proper type for overlay_segmentation
seg_expr = detr_for_segmentation(
images.image,
model_id='facebook/detr-resnet-50-panoptic',
threshold=0.5
)
segmentation_map = seg_expr.segmentation.astype(pxt.Array[(None, None), np.int32])
images.select(
segmentation_viz=overlay_segmentation(
images.image,
segmentation_map,
alpha=0.5,
draw_contours=True,
contour_thickness=2
)
).collect()
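Only the raw segmentation array resists storage; the rendered overlay is an ordinary image, so it can be persisted in a computed column if you'd rather not recompute it per query. A sketch reusing seg_expr and segmentation_map from above (the column name segmentation_viz_stored is arbitrary, chosen not to clash with the inline alias used here):
# Persist the rendered overlay (an image, hence storable) as a column
images.add_computed_column(
    segmentation_viz_stored=overlay_segmentation(
        images.image,
        segmentation_map,
        alpha=0.5,
        draw_contours=True,
        contour_thickness=2
    )
)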
Compare side-by-side
# Side-by-side comparison: original, detection, segmentation
seg_expr = detr_for_segmentation(
images.image,
model_id='facebook/detr-resnet-50-panoptic',
threshold=0.5
)
segmentation_map = seg_expr.segmentation.astype(pxt.Array[(None, None), np.int32])
images.select(
images.image,
images.detection_viz,
segmentation_viz=overlay_segmentation(
images.image,
segmentation_map,
alpha=0.5,
draw_contours=True,
contour_thickness=2
)
).collect()
Count objects per image
# Count objects per image (using stored columns)
images.select(
images.image,
num_detections=images.detections.boxes.apply(len, col_type=pxt.Int),
num_segments=images.segments_info.apply(len, col_type=pxt.Int)
).collect()
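The same count expression also works as a filter; a sketch keeping only images with more than one detected object:
# Filter rows on the computed detection count
num_objects = images.detections.boxes.apply(len, col_type=pxt.Int)
images.where(num_objects > 1).select(images.image).collect()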
Explanation
Detection gives fast, approximate locations. Segmentation gives slower
but precise boundaries.
Capability comparison
- Output: detection returns bounding boxes, labels, and confidence scores; segmentation returns a pixel-level mask plus per-segment metadata
- Boundary precision: detection boxes are approximate rectangles; segmentation masks are pixel-perfect
- Speed: detection is roughly 2x faster than segmentation
When to use each
Choose detection when:
- You need to know what objects are present and where
(approximately)
- Speed matters (detection is 2x faster)
- You need search, filtering, or counting
- Bounding boxes suffice for visualization
Choose segmentation when:
- You need exact object boundaries (pixel-perfect masks)
- You’re doing image editing, compositing, or AR
- You need to measure actual object area/coverage
- You want scene composition analysis (what % is sky vs buildings)
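For the coverage and composition questions above, the mask-and-count trick from the segment_areas sketch extends directly to fractions of the whole image (plain numpy, one output dict; names are illustrative):
import numpy as np

def scene_composition(seg_output: dict) -> dict:
    # Fraction of total pixels covered by each segment
    seg_map = np.asarray(seg_output['segmentation'])
    total = seg_map.size
    return {
        (info['id'], info['label_text']): float((seg_map == info['id']).sum()) / total
        for info in seg_output['segments_info']
    }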