Pixeltable can export data directly from tables and views to the popular Voxel51 frontend, providing a way to visualize and explore image and video datasets. In this tutorial, we’ll learn how to:

- Export data from Pixeltable to Voxel51
- Apply labels from image classification and object detection models to exported data

We begin by installing the necessary libraries for this tutorial.
%pip install -qU pixeltable fiftyone torch transformers

Example 1: An Image Dataset

import fiftyone as fo
import pixeltable as pxt

# Create a Pixeltable directory for the demo. We first drop the directory if it
# exists, in order to ensure a clean environment.

pxt.drop_dir('fo_demo', force=True)
pxt.create_dir('fo_demo')
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory `fo_demo`.
<pixeltable.catalog.dir.Dir at 0x387766700>
# Create a Pixeltable table for our dataset and insert some sample images.

url_prefix = 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images'

urls = [
    f'{url_prefix}/000000000019.jpg',
    f'{url_prefix}/000000000025.jpg',
    f'{url_prefix}/000000000030.jpg',
    f'{url_prefix}/000000000034.jpg',
]

t = pxt.create_table('fo_demo.images', {'image': pxt.Image})
t.insert({'image': url} for url in urls)
t.head()
Created table `images`.
Inserting rows into `images`: 4 rows [00:00, 2775.85 rows/s]
Inserted 4 rows with 0 errors.
image
[4 rows of image thumbnails displayed]
Now we export our new table to a Voxel51 dataset and load it into a new Voxel51 session within our demo notebook. Once it’s been loaded, the images can be interactively navigated as with any other Voxel51 dataset.
fo_dataset = pxt.io.export_images_as_fo_dataset(t, t.image)
session = fo.launch_app(fo_dataset)
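If you’re running this tutorial as a standalone script rather than in a notebook, the process will exit as soon as the App launches. FiftyOne’s Session.wait() blocks until the App window is closed; a minimal sketch:

# In a script, keep the process alive until the App is closed.
# (Not needed in a notebook, where the App renders inline.)
session.wait()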

Adding Labels

We’ll now show how Voxel51 labels can be attached to the exported dataset. Currently, Pixeltable supports only classification and detection labels; other Voxel51 label types may be added in the future. First, let’s generate some labels by applying two models from the Hugging Face transformers library: a ViT model for image classification and a DETR model for object detection.
from pixeltable.functions.huggingface import vit_for_image_classification, detr_for_object_detection

t.add_computed_column(classifications=vit_for_image_classification(
    t.image, model_id='google/vit-base-patch16-224'
))
t.add_computed_column(detections=detr_for_object_detection(
    t.image, model_id='facebook/detr-resnet-50'
))
Computing cells: 100%|████████████████████████████████████████████| 4/4 [00:00<00:00,  5.86 cells/s]
Added 4 column values with 0 errors.
Computing cells: 100%|████████████████████████████████████████████| 4/4 [00:01<00:00,  2.15 cells/s]
Added 4 column values with 0 errors.
UpdateStatus(num_rows=4, num_computed_values=4, num_excs=0, updated_cols=[], cols_with_excs=[])
Both models output JSON containing the model results. Let’s peek at the contents of our table now:
t.head()
image classifications detections
{"labels": [345, 690, 912, 346, 730], "scores": [0.767, 0.019, 0.007, 0.004, 0.004], "label_text": ["ox", "oxcart", "worm fence, snake fence, snake-rail fence, Virginia fence", "water buffalo, water ox, Asiatic buffalo, Bubalus bubalis", "plow, plough"]} {"boxes": [[335.855, 43.414, 339.903, 55.582], [87.309, 152.279, 584.422, 395.81], [241.067, 136.443, 413.393, 205.956]], "labels": [1, 21, 21], "scores": [0.564, 0.999, 0.999], "label_text": ["person", "cow", "cow"]}
{"labels": [340, 353, 386, 9, 352], "scores": [0.325, 0.198, 0.105, 0.049, 0.034], "label_text": ["zebra", "gazelle", "African elephant, Loxodonta africana", "ostrich, Struthio camelus", "impala, Aepyceros melampus"]} {"boxes": [[51.96, 356.187, 181.493, 413.93], [383.233, 58.665, 605.705, 361.374]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]}
{"labels": [883, 738, 708, 725, 716], "scores": [0.636, 0.245, 0.041, 0.01, 0.004], "label_text": ["vase", "pot, flowerpot", "pedestal, plinth, footstall", "pitcher, ewer", "picket fence, paling"]} {"boxes": [[238.074, 155.501, 406.337, 349.963], [201.495, 29.842, 457.037, 350.032]], "labels": [86, 64], "scores": [1., 0.77], "label_text": ["vase", "potted plant"]}
{"labels": [340, 353, 386, 352, 9], "scores": [0.995, 0.002, 0., 0., 0.], "label_text": ["zebra", "gazelle", "African elephant, Loxodonta africana", "impala, Aepyceros melampus", "ostrich, Struthio camelus"]} {"boxes": [[-0.231, 19.502, 439.539, 400.324]], "labels": [24], "scores": [1.], "label_text": ["zebra"]}
Now we need to transform our model data into the format the Voxel51 API expects (see the Pixeltable documentation for pxt.io.export_images_as_fo_dataset for details). We’ll use Pixeltable UDFs to do the appropriate conversions.
@pxt.udf
def vit_to_fo(vit_labels: dict) -> list:
    # Convert ViT classification output into the list-of-dicts format
    # that export_images_as_fo_dataset expects for classification labels.
    return [
        {'label': label, 'confidence': score}
        for label, score in zip(vit_labels['label_text'], vit_labels['scores'])
    ]

@pxt.udf
def detr_to_fo(img: pxt.Image, detr_labels: dict) -> list:
    result = []
    for label, box, score in zip(detr_labels['label_text'], detr_labels['boxes'], detr_labels['scores']):
        # DETR gives us bounding boxes in (x1,y1,x2,y2) absolute (pixel) coordinates.
        # Voxel51 expects (x,y,w,h) relative (fractional) coordinates.
        # So we need to do a conversion.
        fo_box = [
            box[0] / img.width,
            box[1] / img.height,
            (box[2] - box[0]) / img.width,
            (box[3] - box[1]) / img.height,
        ]
        result.append({'label': label, 'bounding_box': fo_box, 'confidence': score})
    return result
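As a quick sanity check of the arithmetic, take the first “cow” box from the table above. The image dimensions aren’t shown in the output, but the fractional coordinates in the next cell’s output are consistent with a 640×427 image (a hypothetical value used here purely for illustration):

# Hand-check of the box conversion for the 'cow' detection above,
# assuming a 640x427 image (dimensions inferred, not from the source)
x1, y1, x2, y2 = 87.309, 152.279, 584.422, 395.81
w_img, h_img = 640, 427
print([x1 / w_img, y1 / h_img, (x2 - x1) / w_img, (y2 - y1) / h_img])
# -> approximately [0.136, 0.357, 0.777, 0.570]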
We can test that our UDFs are working as expected with a select() statement.
t.select(
    t.image,
    t.classifications,
    vit_to_fo(t.classifications),
    t.detections,
    detr_to_fo(t.image, t.detections)
).head()
image classifications vit_to_fo detections detr_to_fo
{"labels": [345, 690, 912, 346, 730], "scores": [0.767, 0.019, 0.007, 0.004, 0.004], "label_text": ["ox", "oxcart", "worm fence, snake fence, snake-rail fence, Virginia fence", "water buffalo, water ox, Asiatic buffalo, Bubalus bubalis", "plow, plough"]} [{"label": "ox", "confidence": 0.767}, {"label": "oxcart", "confidence": 0.019}, {"label": "worm fence, snake fence, snake-rail fence, Virginia fence", "confidence": 0.007}, {"label": "water buffalo, water ox, Asiatic buffalo, Bubalus bubalis", "confidence": 0.004}, {"label": "plow, plough", "confidence": 0.004}] {"boxes": [[335.855, 43.414, 339.903, 55.582], [87.309, 152.279, 584.422, 395.81], [241.067, 136.443, 413.393, 205.956]], "labels": [1, 21, 21], "scores": [0.564, 0.999, 0.999], "label_text": ["person", "cow", "cow"]} [{"label": "person", "bounding_box": [0.525, 0.102, 0.006, 0.028], "confidence": 0.564}, {"label": "cow", "bounding_box": [0.136, 0.357, 0.777, 0.57], "confidence": 0.999}, {"label": "cow", "bounding_box": [0.377, 0.32, 0.269, 0.163], "confidence": 0.999}]
{"labels": [340, 353, 386, 9, 352], "scores": [0.325, 0.198, 0.105, 0.049, 0.034], "label_text": ["zebra", "gazelle", "African elephant, Loxodonta africana", "ostrich, Struthio camelus", "impala, Aepyceros melampus"]} [{"label": "zebra", "confidence": 0.325}, {"label": "gazelle", "confidence": 0.198}, {"label": "African elephant, Loxodonta africana", "confidence": 0.105}, {"label": "ostrich, Struthio camelus", "confidence": 0.049}, {"label": "impala, Aepyceros melampus", "confidence": 0.034}] {"boxes": [[51.96, 356.187, 181.493, 413.93], [383.233, 58.665, 605.705, 361.374]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]} [{"label": "giraffe", "bounding_box": [0.081, 0.836, 0.202, 0.136], "confidence": 0.99}, {"label": "giraffe", "bounding_box": [0.599, 0.138, 0.348, 0.711], "confidence": 0.999}]
{"labels": [883, 738, 708, 725, 716], "scores": [0.636, 0.245, 0.041, 0.01, 0.004], "label_text": ["vase", "pot, flowerpot", "pedestal, plinth, footstall", "pitcher, ewer", "picket fence, paling"]} [{"label": "vase", "confidence": 0.636}, {"label": "pot, flowerpot", "confidence": 0.245}, {"label": "pedestal, plinth, footstall", "confidence": 0.041}, {"label": "pitcher, ewer", "confidence": 0.01}, {"label": "picket fence, paling", "confidence": 0.004}] {"boxes": [[238.074, 155.501, 406.337, 349.963], [201.495, 29.842, 457.037, 350.032]], "labels": [86, 64], "scores": [1., 0.77], "label_text": ["vase", "potted plant"]} [{"label": "vase", "bounding_box": [0.372, 0.363, 0.263, 0.454], "confidence": 1.}, {"label": "potted plant", "bounding_box": [0.315, 0.07, 0.399, 0.748], "confidence": 0.77}]
{"labels": [340, 353, 386, 352, 9], "scores": [0.995, 0.002, 0., 0., 0.], "label_text": ["zebra", "gazelle", "African elephant, Loxodonta africana", "impala, Aepyceros melampus", "ostrich, Struthio camelus"]} [{"label": "zebra", "confidence": 0.995}, {"label": "gazelle", "confidence": 0.002}, {"label": "African elephant, Loxodonta africana", "confidence": 0.}, {"label": "impala, Aepyceros melampus", "confidence": 0.}, {"label": "ostrich, Struthio camelus", "confidence": 0.}] {"boxes": [[-0.231, 19.502, 439.539, 400.324]], "labels": [24], "scores": [1.], "label_text": ["zebra"]} [{"label": "zebra", "bounding_box": [-0., 0.046, 0.687, 0.896], "confidence": 1.}]
Now we pass the converted structures to export_images_as_fo_dataset.
fo_dataset = pxt.io.export_images_as_fo_dataset(
    t,
    t.image,
    classifications=vit_to_fo(t.classifications),
    detections=detr_to_fo(t.image, t.detections)
)
session = fo.launch_app(fo_dataset)
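By default, FiftyOne datasets are not persistent and are deleted when the underlying database shuts down. If you want the exported dataset to survive across sessions, you can mark it persistent using the standard FiftyOne API (a FiftyOne feature, not something Pixeltable-specific):

# Keep the exported dataset across FiftyOne sessions (optional)
fo_dataset.persistent = True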

Adding Multiple Label Sets

You can include multiple label sets of the same type in the same dataset by passing a list or dict of expressions to the classifications and/or detections parameters. If a list is specified, default names will be assigned to the label sets; if a dict is specified, the label sets will be named according to its keys (a sketch of the list form appears at the end of this section). As an example, let’s recompute our detections using a DETR model with the larger ResNet-101 backbone, then load them into the same Voxel51 dataset as the earlier ResNet-50 detections in order to compare them side by side.
t.add_computed_column(detections_101=detr_for_object_detection(
    t.image, model_id='facebook/detr-resnet-101'
))
Computing cells: 100%|████████████████████████████████████████████| 4/4 [00:01<00:00,  2.90 cells/s]
Added 4 column values with 0 errors.
UpdateStatus(num_rows=4, num_computed_values=4, num_excs=0, updated_cols=[], cols_with_excs=[])
fo_dataset = pxt.io.export_images_as_fo_dataset(
    t,
    t.image,
    classifications=vit_to_fo(t.classifications),
    detections={
        'detections_50': detr_to_fo(t.image, t.detections),
        'detections_101': detr_to_fo(t.image, t.detections_101)
    }
)
session = fo.launch_app(fo_dataset)
Exploring the resulting images, we can see that the results are not much different between the two models, at least on our small sample dataset.
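As mentioned above, you can also pass a plain list instead of a dict, in which case the label sets receive default names. A minimal sketch, reusing the expressions from this tutorial:

fo_dataset = pxt.io.export_images_as_fo_dataset(
    t,
    t.image,
    detections=[
        detr_to_fo(t.image, t.detections),
        detr_to_fo(t.image, t.detections_101),
    ],
)
session = fo.launch_app(fo_dataset)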