
Working with Voxel51 in Pixeltable

Pixeltable can export data directly from tables and views to the popular Voxel51 frontend, providing a way to visualize and explore image and video datasets. In this tutorial, we'll learn how to:

  • Export data from Pixeltable to Voxel51
  • Apply labels from image classification and object detection models to exported data

We begin by installing the necessary libraries for this tutorial.

%pip install -qU pixeltable fiftyone transformers

Example 1: An Image Dataset

import fiftyone as fo
import pixeltable as pxt

# Create a Pixeltable directory for the demo. We first drop the directory if it
# exists, in order to ensure a clean environment.

pxt.drop_dir('fo_demo', force=True)
pxt.create_dir('fo_demo')
Connected to Pixeltable database at:
postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory `fo_demo`.

<pixeltable.catalog.dir.Dir at 0x387766700>

# Create a Pixeltable table for our dataset and insert some sample images.

url_prefix = 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/source/data/images'

urls = [
    f'{url_prefix}/000000000019.jpg',
    f'{url_prefix}/000000000025.jpg',
    f'{url_prefix}/000000000030.jpg',
    f'{url_prefix}/000000000034.jpg',
]

t = pxt.create_table('fo_demo.images', {'image': pxt.Image})
t.insert({'image': url} for url in urls)
t.head()
Created table `images`.
Inserting rows into `images`: 4 rows [00:00, 2775.85 rows/s]
Inserted 4 rows with 0 errors.

Now we export our new table to a Voxel51 dataset and load it into a new Voxel51 session within our demo notebook. Once it's been loaded, the images can be interactively navigated as with any other Voxel51 dataset.

fo_dataset = pxt.io.export_images_as_fo_dataset(t, t.image)
session = fo.launch_app(fo_dataset)
 4 [20.0ms elapsed, ? remaining, 199.8 samples/s] 
INFO:eta.core.utils: 4 [20.0ms elapsed, ? remaining, 199.8 samples/s] 

Adding Labels

We'll now show how Voxel51 labels can be attached to the exported dataset. Currently, Pixeltable supports only classification and detection labels; other Voxel51 label types may be added in the future.
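
For reference, each label handed to Voxel51 is just a Python dict. The conversion UDFs we write later in this tutorial produce records shaped like these (the values here are illustrative, not real model output):

# Illustrative record shapes produced by the conversion UDFs below.
example_classification = {'label': 'tabby cat', 'confidence': 0.97}
example_detection = {
    'label': 'cat',
    'bounding_box': [0.1, 0.1, 0.4, 0.4],  # (x, y, w, h), relative coordinates
    'confidence': 0.92,
}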

First, let's generate some labels by applying two models from the Hugging Face transformers library: a ViT model for image classification and a DETR model for object detection.

from pixeltable.functions.huggingface import vit_for_image_classification, detr_for_object_detection

t.add_column(classifications=vit_for_image_classification(
    t.image, model_id='google/vit-base-patch16-224'
))
t.add_column(detections=detr_for_object_detection(
    t.image, model_id='facebook/detr-resnet-50'
))
Computing cells: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00,  5.86 cells/s]
Added 4 column values with 0 errors.
Computing cells: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:01<00:00,  2.15 cells/s]
Added 4 column values with 0 errors.
UpdateStatus(num_rows=4, num_computed_values=4, num_excs=0, updated_cols=[], cols_with_excs=[])

Both models return their results as JSON. Let's peek at the contents of our table now:

t.head()
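
To make the conversions below easier to follow, here is the approximate shape of the stored JSON. Only the fields used by the UDFs are shown, the values are made up, and real model output may contain additional fields:

# Hypothetical sample values; the field names match those used in the UDFs below.
classifications_row = {
    'label_text': ['tabby, tabby cat', 'tiger cat'],
    'scores': [0.82, 0.11],
}
detections_row = {
    'label_text': ['cat', 'couch'],
    'boxes': [[12.0, 45.0, 320.0, 470.0], [0.0, 10.0, 640.0, 480.0]],  # (x1, y1, x2, y2), pixels
    'scores': [0.99, 0.87],
}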

Now we need to transform our model data into the format the Voxel51 API expects (see the Pixeltable documentation for pxt.io.export_images_as_fo_dataset for details). We'll use Pixeltable UDFs to do the appropriate conversions.

@pxt.udf
def vit_to_fo(vit_labels: dict) -> list:
    # Convert the ViT output (parallel lists of labels and scores) into the
    # list-of-dicts format that Voxel51 expects for classification labels.
    return [
        {'label': label, 'confidence': score}
        for label, score in zip(vit_labels['label_text'], vit_labels['scores'])
    ]

@pxt.udf
def detr_to_fo(img: pxt.Image, detr_labels: dict) -> list:
    result = []
    for label, box, score in zip(detr_labels['label_text'], detr_labels['boxes'], detr_labels['scores']):
        # DETR returns bounding boxes as (x1, y1, x2, y2) in absolute pixel
        # coordinates; Voxel51 expects (x, y, w, h) in relative (fractional)
        # coordinates, so we convert here.
        fo_box = [
            box[0] / img.width,
            box[1] / img.height,
            (box[2] - box[0]) / img.width,
            (box[3] - box[1]) / img.height,
        ]
        result.append({'label': label, 'bounding_box': fo_box, 'confidence': score})
    return result
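
As a quick sanity check of the arithmetic, with hypothetical numbers that don't depend on any model:

# A (x1, y1, x2, y2) box of (64, 48, 320, 240) in a 640x480 image should map
# to (x, y, w, h) = (0.1, 0.1, 0.4, 0.4) in relative coordinates.
box, w, h = [64, 48, 320, 240], 640, 480
print([box[0] / w, box[1] / h, (box[2] - box[0]) / w, (box[3] - box[1]) / h])
# -> [0.1, 0.1, 0.4, 0.4]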

We can test that our UDFs are working as expected with a select() statement.

t.select(
    t.image,
    t.classifications,
    vit_to_fo(t.classifications),
    t.detections,
    detr_to_fo(t.image, t.detections)
).head()

Now we pass the converted label structures to export_images_as_fo_dataset.

fo_dataset = pxt.io.export_images_as_fo_dataset(
    t,
    t.image,
    classifications=vit_to_fo(t.classifications),
    detections=detr_to_fo(t.image, t.detections)
)
session = fo.launch_app(fo_dataset)
 4 [36.6ms elapsed, ? remaining, 109.4 samples/s] 
INFO:eta.core.utils: 4 [36.6ms elapsed, ? remaining, 109.4 samples/s] 

Adding Multiple Label Sets

You can include multiple label sets of the same type in the same dataset by passing a list or dict of expressions to the classifications and/or detections parameters. If a list is specified, default names will be assigned to the label sets; if a dict is specified, the label sets will be named according to its keys.

As an example, let's recompute our detections using a more powerful DETR model with a ResNet-101 backbone, then load them into the same Voxel51 dataset as the earlier detections in order to compare the two side by side.

t.add_column(detections_101=detr_for_object_detection(
    t.image, model_id='facebook/detr-resnet-101'
))
Computing cells: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:01<00:00,  2.90 cells/s]
Added 4 column values with 0 errors.
UpdateStatus(num_rows=4, num_computed_values=4, num_excs=0, updated_cols=[], cols_with_excs=[])

fo_dataset = pxt.io.export_images_as_fo_dataset(
    t,
    t.image,
    classifications=vit_to_fo(t.classifications),
    detections={
        'detections_50': detr_to_fo(t.image, t.detections),
        'detections_101': detr_to_fo(t.image, t.detections_101)
    }
)
session = fo.launch_app(fo_dataset)
 4 [53.9ms elapsed, ? remaining, 74.2 samples/s] 
INFO:eta.core.utils: 4 [53.9ms elapsed, ? remaining, 74.2 samples/s] 
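
Equivalently, if the default label-set names are acceptable, the same two detection sets could be passed as a list rather than a dict. A sketch, using the table and UDFs defined above:

# Passing a list instead of a dict; Pixeltable assigns default names
# to the two detection label sets.
fo_dataset = pxt.io.export_images_as_fo_dataset(
    t,
    t.image,
    classifications=vit_to_fo(t.classifications),
    detections=[
        detr_to_fo(t.image, t.detections),
        detr_to_fo(t.image, t.detections_101),
    ],
)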

Exploring the resulting images, we can see that the two models produce broadly similar results, at least on our small sample dataset.