Install Python Packages
First, run the following command to install Pixeltable and the related libraries needed for this tutorial.
Creating a Table
Let’s begin by creating a demo directory (if it doesn’t already exist)
and a table that can hold image data, demo.first. The table will
initially have just a single column to hold our input images, which
we’ll call input_image. We also need to specify a type for the column:
pxt.Image.
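The step just described can be sketched as follows. This is a minimal sketch, assuming a recent Pixeltable release in which `pxt.drop_dir`, `pxt.create_dir`, and `pxt.create_table` accept the arguments shown. The helper name `create_demo_table` and the two constants are illustrative and not part of Pixeltable. Note that dropping the directory first deletes any earlier `demo` tables, which is convenient for a tutorial but is an assumption about what you want.

```python
TABLE_PATH = 'demo.first'     # directory `demo`, table `first`
IMAGE_COLUMN = 'input_image'  # the single column holding our input images


def create_demo_table():
    """Create the `demo` directory and the `demo.first` table; return the table handle."""
    import pixeltable as pxt  # imported lazily; requires Pixeltable to be installed

    dir_name = TABLE_PATH.split('.')[0]
    # Assumption: start from a clean slate by removing any earlier `demo` directory.
    pxt.drop_dir(dir_name, force=True)
    pxt.create_dir(dir_name)
    # The schema maps each column name to its type; here, one image column.
    return pxt.create_table(TABLE_PATH, {IMAGE_COLUMN: pxt.Image})
```

Calling `t = create_demo_table()` would give the table handle `t` that the rest of this tutorial refers to.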
We can use t.describe() to examine the table schema. We see that it
now contains a single column, as expected.
| Column Name | Type | Computed With |
|---|---|---|
| input_image | image | |
We can use t.show() to examine the contents of the table.
| input_image |
|---|
Adding Computed Columns
Great! Now we have a table containing some data. Let’s add an object detection model to our workflow. Specifically, we’re going to use the ResNet-50 object detection model, which runs using the Huggingface DETR (“DEtection TRansformer”) model class. Pixeltable contains a built-in adapter for this model family, so all we have to do is call the detr_for_object_detection Pixeltable function. A nice thing about the
Huggingface models is that they run locally, so you don’t need an
account with a service provider in order to use them.
This is our first example of a computed column, a key concept in
Pixeltable. Recall that when we created the input_image column, we
specified a type, pxt.Image, indicating our intent to populate it with
data in the future. When we create a computed column, we instead
specify a function that operates on other columns of the table. By
default, when we add the new computed column, Pixeltable immediately
evaluates it against all existing data in the table - in this case, by
calling the detr_for_object_detection function on the image.
Depending on your setup, it may take a minute for the function to
execute. In the background, Pixeltable is downloading the model from
Huggingface (if necessary), instantiating it, and caching it for later
use.
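As a rough sketch of the computed column described above: the function and `model_id` come from the schema shown later in this tutorial, while the helper name `add_detections_column` is our own. It assumes `t` is the `demo.first` table handle and that `torch` and `transformers` are installed.

```python
MODEL_ID = 'facebook/detr-resnet-50'


def add_detections_column(t):
    """Add a computed column `detections` that runs DETR on every image in `t`."""
    from pixeltable.functions import huggingface  # lazy import; needs torch + transformers

    # The column is defined by an expression over another column, not by a type.
    # Pixeltable evaluates it for all existing rows as soon as the column is added.
    t.add_computed_column(
        detections=huggingface.detr_for_object_detection(t.input_image, model_id=MODEL_ID)
    )
```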
| input_image | detections |
|---|---|
| | {"boxes": [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]} |
The model returns a JSON structure with the following fields:
- label_text: Descriptions of the objects detected
- boxes: Bounding boxes for each detected object
- scores: Confidence scores for each detection
- labels: The DETR model’s internal IDs for the detected objects
Perhaps this is more than we need, and all we really want are the text
labels. We could add another computed column to extract label_text
from the JSON struct:
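To make the extraction concrete, here is a small sketch. The first part is plain Python on a sample value copied from the detections column. The second part wraps the equivalent Pixeltable expression, `t.detections.label_text`, in a helper whose name (`add_detections_text_column`) is ours, assuming `t` already has the `detections` column.

```python
# Sample value copied from the `detections` column shown above.
detections = {
    'boxes': [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]],
    'labels': [25, 25],
    'scores': [0.99, 0.999],
    'label_text': ['giraffe', 'giraffe'],
}

# In plain Python, extracting the text labels is a dictionary lookup.
detections_text = detections['label_text']
print(detections_text)  # ['giraffe', 'giraffe']


def add_detections_text_column(t):
    """Apply the same lookup to every row, as a computed column on table `t`."""
    # Attribute access on a JSON column builds an expression; nothing runs in Python here.
    t.add_computed_column(detections_text=t.detections.label_text)
```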
| input_image | detections | detections_text |
|---|---|---|
| | {"boxes": [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]} | ["giraffe", "giraffe"] |
| Column Name | Type | Computed With |
|---|---|---|
| input_image | image | |
| detections | json | detr_for_object_detection(input_image, model_id='facebook/detr-resnet-50') |
| detections_text | json | detections.label_text |
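When new rows are inserted, Pixeltable runs the model on each new image
and extracts the labels into the detections_text column. Pixeltable
will orchestrate the execution of any sequence (or DAG) of computed
columns.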
Note how we can pass multiple rows to t.insert with a single
statement, which will insert them more efficiently.
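A sketch of a multi-row insert, assuming `t` is the table handle and that image columns accept file paths or URLs as strings. The helper names and the file names are placeholders for illustration only.

```python
def build_rows(image_sources):
    """Turn a list of image paths or URLs into row dicts for the `input_image` column."""
    return [{'input_image': src} for src in image_sources]


def insert_images(t, image_sources):
    """Insert all images with a single `t.insert` call instead of one call per image."""
    return t.insert(build_rows(image_sources))


# Placeholder file names, for illustration only.
rows = build_rows(['images/zebra.jpg', 'images/dog.jpg'])
print(rows)
```

Because the computed columns are defined on the table, a call such as `insert_images(t, [...])` would also trigger detection and label extraction for the new rows.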
We’ll use t.select to suppress
the display of the detections column, since right now we’re only
interested in the text labels.
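One way this selection might look, assuming `t` has the `input_image` and `detections_text` columns. The helper name `show_labels` is ours.

```python
def show_labels(t):
    """Display only the input image and its text labels, leaving out the raw detections."""
    # `select` picks the columns (or expressions) to return; `show` renders the rows.
    return t.select(t.input_image, t.detections_text).show()
```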
| input_image | detections_text |
|---|---|
| | ["giraffe", "giraffe"] |
| | ["vase", "potted plant"] |
| | ["zebra"] |
| | ["dog", "dog"] |
| | ["person", "person", "bench", "person", "elephant", "elephant", "person"] |
Pixeltable Is Persistent
An important feature of Pixeltable is that everything is persistent. Unlike in-memory Python libraries such as Pandas, Pixeltable is a database: all your data, transformations, and computed columns are stored and preserved between sessions. To see this, let’s clear all the variables in our notebook and start fresh. You can optionally restart your notebook kernel at this point, to demonstrate how Pixeltable data persists across sessions.
| input_image | detections_text |
|---|---|
| | ["giraffe", "giraffe"] |
| | ["vase", "potted plant"] |
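After clearing variables or restarting the kernel, a fresh handle to the table can be obtained by name. This is a minimal sketch using `pxt.get_table`; the helper name `reopen_demo_table` is ours, and it assumes the `demo.first` table was created earlier.

```python
def reopen_demo_table():
    """Fetch a fresh handle to `demo.first` in a new session."""
    import pixeltable as pxt  # lazy import; requires Pixeltable to be installed

    # The rows and computed columns live in Pixeltable's store, not in Python memory,
    # so nothing needs to be recomputed here.
    return pxt.get_table('demo.first')
```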
GPT-4o
For comparison, let’s try running our examples through a generative model, OpenAI’s gpt-4o-mini. For this section, you’ll need an OpenAI
account with an API key. You can use the following command to add your
API key to the environment (just enter your API key when prompted):
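The following sketch covers both steps: setting the key and adding a column that asks the model about each image. The key handling uses only the standard library. The second helper assumes Pixeltable's `openai.vision` function with `prompt`, `image`, and `model` arguments; the prompt text and both helper names are our own choices, not taken from this tutorial.

```python
import getpass
import os


def ensure_openai_api_key(prompt=getpass.getpass):
    """Return OPENAI_API_KEY, asking for it only if it is not already set."""
    if 'OPENAI_API_KEY' not in os.environ:
        os.environ['OPENAI_API_KEY'] = prompt('Enter your OpenAI API key: ')
    return os.environ['OPENAI_API_KEY']


def add_vision_column(t, prompt_text="Describe what's in this image."):
    """Add a computed column `vision` with gpt-4o-mini's description of each image."""
    from pixeltable.functions import openai  # lazy import; needs the `openai` package

    # Assumption: the prompt wording is illustrative; any instruction string works.
    t.add_computed_column(
        vision=openai.vision(prompt=prompt_text, image=t.input_image, model='gpt-4o-mini')
    )
```

In a notebook, `ensure_openai_api_key()` would prompt once and reuse the value afterwards. Each row then costs one API call, so the column may take a little while to fill in.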
| input_image | detections_text | vision |
|---|---|---|
| | ["giraffe", "giraffe"] | The image shows two giraffes in a natural setting, likely in a zoo or wildlife park. One giraffe is prominently featured in the foreground, standing and reaching toward the leaves of a tree. It has a distinctive coat pattern with light brown patches and a long neck. The second giraffe is in the background, partially visible, also appearing to be grazing. The surroundings include greenery, trees, and some fallen branches, providing a natural habitat feel. The lighting suggests a bright, sunny day. |
| | ["vase", "potted plant"] | The image features a white vase with a shell-like design, filled with a colorful bouquet of flowers. The arrangement includes white flowers, pink roses, and green foliage. The vase is placed on a railing, with a blurred green garden backdrop providing a serene and natural setting. The lighting appears bright and soft, suggesting a pleasant, sunny day. |
| | ["dog", "dog"] | The image shows a small, curly-haired dog lying on a shoe rack. The dog is partially obscured by several pairs of shoes, which include a red flip-flop, a couple of sports shoes, and sandals. The setting appears to be indoors, with a simple metal rack holding the shoes, and the floor has a tiled pattern. The dog seems comfortable and cozy among the footwear. |
| | ["zebra"] | The image depicts a zebra grazing on green grass. The zebra is characterized by its distinctive black and white stripes, and its mane is also visible. The background consists of lush greenery, suggesting a natural habitat. The zebra appears to be focused on eating, with its head lowered towards the ground. The vibrant colors and natural setting highlight the animal's features. |
| | ["person", "person", "bench", "person", "elephant", "elephant", "person"] | The image shows a lush, green forest scene where two elephants are walking through dense vegetation. On each elephant, there is a person sitting, likely enjoying an elephant ride. The surrounding area is filled with various tropical plants and trees, creating a vibrant and natural atmosphere. The overall setting appears to be part of a jungle or a wildlife reserve, showcasing the beauty of nature and wildlife interaction. |
| rot_image | rot_vision |
|---|---|
| | The image features a zebra lying on green grass. The zebra is depicted from a top-down perspective, showcasing its distinctive black and white stripes. The position of the zebra suggests it might be resting or relaxing in its natural habitat, with the lush greenery surrounding it. |
| | The image features a giraffe in a natural setting, with lush greenery in the background. The giraffe, characterized by its long neck and spotted coat, is likely feeding or interacting with its environment. There are also trees and foliage surrounding the area, which adds to the natural atmosphere. Another giraffe can be seen in the background, indicating they are in a habitat typical for these animals. |
| | The image shows a pile of shoes and a small, fluffy dog partially hidden among them. The shoes include various styles, such as sneakers, sandals, and open-toed footwear, all stacked on a shoe rack. The dog's curly fur is visible, suggesting it is resting or curled up in the midst of the shoes. The background features a tiled floor. |
| | The image depicts a lush, green forest scene with dense foliage and trees. It appears to show some sort of trail or pathway in the middle, most likely leading further into the forest. The lighting suggests it may be a bright day, with sunlight filtering through the leaves. There are also hints of people or objects slightly obscured by the greenery, which might indicate that someone is present in this natural setting. Overall, the scene conveys a sense of tranquility and rich biodiversity typical of a forest environment. |
| | The image features a white vase hanging upside down from a ceiling or beam, adorned with a cluster of colorful flowers. The flowers include shades of pink, white, and possibly green, creating a vibrant and decorative display. The background is softly blurred, suggesting a natural setting with greenery, enhancing the overall aesthetic of the arrangement. The lighting appears bright and natural, contributing to a fresh and inviting atmosphere. |
UDFs: Enhancing Pixeltable’s Capabilities
Another important principle of Pixeltable is that, although Pixeltable has a built-in library of useful operations and adapters, it will never prescribe a particular way of doing things. Pixeltable is built from the ground up to be extensible. Let’s take a specific example. Recall our use of the ResNet-50 detection model, in which the detections column contains a JSON blob with bounding
boxes, scores, and labels. Suppose we want to create a column containing
the single label with the highest confidence score. There’s no built-in
Pixeltable function to do this, but it’s easy to write our own. In fact,
all we have to do is write a Python function that does the thing we
want, and mark it with the @pxt.udf decorator.
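Here is one way such a function might look. The selection logic is kept in a plain Python function, `top_label`, so it can be tried on its own; the names `top_label` and `top_detection` are ours. The `try`/`except` guard is only there so the sketch still runs where Pixeltable is not installed. The sketch assumes every row has at least one detection, since `max` fails on an empty list.

```python
def top_label(detect: dict) -> str:
    """Return the label of the detection with the highest confidence score."""
    scores = detect['scores']
    # Index of the highest score; on ties, the first occurrence wins.
    i = scores.index(max(scores))
    return detect['label_text'][i]


try:
    import pixeltable as pxt
except ImportError:
    pxt = None  # the plain function above still runs without Pixeltable

if pxt is not None:
    # Marking the function with @pxt.udf makes it usable in computed columns.
    @pxt.udf
    def top_detection(detect: dict) -> str:
        return top_label(detect)


sample = {'scores': [0.99, 0.999], 'label_text': ['giraffe', 'giraffe']}
print(top_label(sample))  # giraffe
```

With a table handle `t`, a call like `t.add_computed_column(top=top_detection(t.detections))` would then apply the UDF to every row.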
| detections_text | top |
|---|---|
| ["giraffe", "giraffe"] | giraffe |
| ["zebra"] | zebra |
| ["dog", "dog"] | dog |
| ["person", "person", "bench", "person", "elephant", "elephant", "person"] | elephant |
| ["vase", "potted plant"] | vase |