Welcome to Pixeltable! In this tutorial, we’ll survey how to create tables, populate them with data, and enhance them with built-in and user-defined transformations and AI operations. If you want to follow along with this tutorial interactively, there are two ways to go:
- Use a Kaggle or Colab container (easiest): Click on one of the badges above.
- Run locally in a self-managed Python environment: You’ll probably want to create your own empty notebook, then copy-paste each command from the website. Be sure your Jupyter kernel is running in a Python virtual environment; you can check out the Getting Started with Pixeltable guide for step-by-step instructions.

Install Python Packages

First run the following command to install Pixeltable and related libraries needed for this tutorial.
%pip install -qU torch transformers openai pixeltable

Creating a Table

Let’s begin by creating a demo directory (if it doesn’t already exist) and a table that can hold image data, demo.first. The table will initially have just a single column to hold our input images, which we’ll call input_image. We also need to specify a type for the column: pxt.Image.
import pixeltable as pxt

# Create the directory `demo` (if it doesn't already exist)
pxt.drop_dir('demo', force=True)  # First drop `demo` to ensure a clean environment
pxt.create_dir('demo')

# Create the table `demo.first` with a single column `input_image`
t = pxt.create_table('demo.first', {'input_image': pxt.Image})
Connected to Pixeltable database at: postgresql://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory `demo`.
Created table `first`.
We can use t.describe() to examine the table schema. We see that it now contains a single column, as expected.
t.describe()
Column Name Type Computed With
input_image image
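The schema is just a dictionary mapping column names to types, so tables with several columns are declared the same way. Here’s a sketch (the table and column names are hypothetical; pxt.String and pxt.Int are two of Pixeltable’s other column types):
# Hypothetical multi-column table: an image plus two scalar columns
pxt.create_table('demo.example', {
    'input_image': pxt.Image,
    'caption': pxt.String,
    'views': pxt.Int
})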
The new table is initially empty, with no rows:
t.count()
0
Now let’s put an image into it! We can add images simply by giving Pixeltable their URLs. The example images in this demo come from the COCO dataset, and we’ll be referencing copies of them in the Pixeltable GitHub repo. But in practice, the images can come from anywhere: an S3 bucket, say, or the local file system. When we add the image, we see that Pixeltable gives us some useful status updates indicating that the operation was successful.
t.insert([{'input_image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000025.jpg'}])
Inserting rows into `first`: 1 rows [00:00, 336.92 rows/s]
Inserted 1 row with 0 errors.
UpdateStatus(num_rows=1, num_computed_values=0, num_excs=0, updated_cols=[], cols_with_excs=[])
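The images don’t have to come from URLs; a path on the local file system works the same way. A minimal sketch (the path here is hypothetical):
# Hypothetical local path - any readable image file can be inserted like this
t.insert([{'input_image': '/path/to/local/image.jpg'}])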
We can use t.show() to examine the contents of the table.
t.show()
input_image

Adding Computed Columns

Great! Now we have a table containing some data. Let’s add an object detection model to our workflow. Specifically, we’re going to use a DETR (“DEtection TRansformer”) model with a ResNet-50 backbone, which runs using the Huggingface transformers model class. Pixeltable contains a built-in adapter for this model family, so all we have to do is call the detr_for_object_detection Pixeltable function. A nice thing about the Huggingface models is that they run locally, so you don’t need an account with a service provider in order to use them.
This is our first example of a computed column, a key concept in Pixeltable. Recall that when we created the input_image column, we specified a type, pxt.Image, indicating our intent to populate it with data in the future. When we create a computed column, we instead specify a function that operates on other columns of the table. By default, when we add the new computed column, Pixeltable immediately evaluates it against all existing data in the table - in this case, by calling the detr_for_object_detection function on each image.
Depending on your setup, it may take a minute for the function to execute. In the background, Pixeltable is downloading the model from Huggingface (if necessary), instantiating it, and caching it for later use.
from pixeltable.functions import huggingface

t.add_computed_column(detections=huggingface.detr_for_object_detection(
    t.input_image, model_id='facebook/detr-resnet-50'
))
Computing cells: 100%|████████████████████████████████████████████| 1/1 [00:02<00:00,  2.03s/ cells]
Added 1 column value with 0 errors.
Let’s examine the results.
t.show()
input_image detections
{"boxes": [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]}
We see that the model returned a JSON structure containing a lot of information. In particular, it has the following fields:
- label_text: Descriptions of the objects detected
- boxes: Bounding boxes for each detected object
- scores: Confidence scores for each detection
- labels: The DETR model’s internal IDs for the detected objects
Perhaps this is more than we need, and all we really want are the text labels. We could add another computed column to extract label_text from the JSON struct:
t.add_computed_column(detections_text=t.detections.label_text)
t.show()
Computing cells: 100%|███████████████████████████████████████████| 1/1 [00:00<00:00, 281.61 cells/s]
Added 1 column value with 0 errors.
input_image detections detections_text
{"boxes": [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]} ["giraffe", "giraffe"]
If we inspect the table schema now, we see how Pixeltable distinguishes between ordinary and computed columns.
t.describe()
Column Name Type Computed With
input_image image
detections json detr_for_object_detection(input_image, model_id='facebook/detr-resnet-50')
detections_text json detections.label_text
Now let’s add some more images to our table. This demonstrates another important feature of computed columns: by default, they update incrementally any time new data shows up on their inputs. In this case, Pixeltable will run the DETR model against each new image that is added, then extract the labels into the detections_text column. Pixeltable will orchestrate the execution of any sequence (or DAG) of computed columns. Note how we can pass multiple rows to t.insert with a single statement, which inserts them more efficiently than one at a time.
more_images = [
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000030.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000034.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000042.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000061.jpg'
]
t.insert({'input_image': image} for image in more_images)
Computing cells:  50%|██████████████████████                      | 4/8 [00:01<00:01,  3.67 cells/s]
Inserting rows into `first`: 4 rows [00:00, 3478.59 rows/s]
Computing cells: 100%|████████████████████████████████████████████| 8/8 [00:01<00:00,  7.32 cells/s]
Inserted 4 rows with 0 errors.
UpdateStatus(num_rows=4, num_computed_values=8, num_excs=0, updated_cols=[], cols_with_excs=[])
Let’s see what the model came up with. (The num_computed_values=8 in the status above reflects the two computed columns evaluated for each of the four new rows.) We’ll use t.select to suppress the display of the detections column, since right now we’re only interested in the text labels.
t.select(t.input_image, t.detections_text).show()
input_image detections_text
["giraffe", "giraffe"]
["vase", "potted plant"]
["zebra"]
["dog", "dog"]
["person", "person", "bench", "person", "elephant", "elephant", "person"]

Pixeltable Is Persistent

An important feature of Pixeltable is that everything is persistent. Unlike in-memory Python libraries such as Pandas, Pixeltable is a database: all your data, transformations, and computed columns are stored and preserved between sessions. To see this, let’s clear all the variables in our notebook and start fresh. You can optionally restart your notebook kernel at this point, to demonstrate how Pixeltable data persists across sessions.
# Clear all variables in the notebook
%reset -f

# Reconnect to Pixeltable and get a handle to our existing table
import pixeltable as pxt
t = pxt.get_table('demo.first')

# Display just the first two rows, to avoid cluttering the tutorial
t.select(t.input_image, t.detections_text).show(2)
input_image detections_text
["giraffe", "giraffe"]
["vase", "potted plant"]

GPT-4o

For comparison, let’s try running our examples through a generative model, OpenAI’s gpt-4o-mini. For this section, you’ll need an OpenAI account with an API key. You can use the following command to add your API key to the environment (just enter your API key when prompted):
import os
import getpass
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('Enter your OpenAI API key:')
Enter your OpenAI API key: ········
Now we can connect to OpenAI through Pixeltable. This may take some time, depending on how long OpenAI takes to process the query.
from pixeltable.functions import openai

t.add_computed_column(vision=openai.vision(
    prompt="Describe what's in this image.",
    image=t.input_image,
    model='gpt-4o-mini'
))
Computing cells: 100%|████████████████████████████████████████████| 5/5 [00:28<00:00,  5.64s/ cells]
Computing cells: 100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 647.69 cells/s]
Added 5 column values with 0 errors.
Let’s see how gpt-4o-mini’s responses compare to those of the traditional discriminative (DETR) model.
t.select(t.input_image, t.detections_text, t.vision).show()
input_image detections_text vision
["giraffe", "giraffe"] The image shows two giraffes in a natural setting, likely in a zoo or wildlife park. One giraffe is prominently featured in the foreground, standing and reaching toward the leaves of a tree. It has a distinctive coat pattern with light brown patches and a long neck. The second giraffe is in the background, partially visible, also appearing to be grazing. The surroundings include greenery, trees, and some fallen branches, providing a natural habitat feel. The lighting suggests a bright, sunny day.
["vase", "potted plant"] The image features a white vase with a shell-like design, filled with a colorful bouquet of flowers. The arrangement includes white flowers, pink roses, and green foliage. The vase is placed on a railing, with a blurred green garden backdrop providing a serene and natural setting. The lighting appears bright and soft, suggesting a pleasant, sunny day.
["dog", "dog"] The image shows a small, curly-haired dog lying on a shoe rack. The dog is partially obscured by several pairs of shoes, which include a red flip-flop, a couple of sports shoes, and sandals. The setting appears to be indoors, with a simple metal rack holding the shoes, and the floor has a tiled pattern. The dog seems comfortable and cozy among the footwear.
["zebra"] The image depicts a zebra grazing on green grass. The zebra is characterized by its distinctive black and white stripes, and its mane is also visible. The background consists of lush greenery, suggesting a natural habitat. The zebra appears to be focused on eating, with its head lowered towards the ground. The vibrant colors and natural setting highlight the animal's features.
["person", "person", "bench", "person", "elephant", "elephant", "person"] The image shows a lush, green forest scene where two elephants are walking through dense vegetation. On each elephant, there is a person sitting, likely enjoying an elephant ride. The surrounding area is filled with various tropical plants and trees, creating a vibrant and natural atmosphere. The overall setting appears to be part of a jungle or a wildlife reserve, showcasing the beauty of nature and wildlife interaction.
In addition to adapters for local models and inference APIs, Pixeltable can perform a range of more basic image operations. These image operations can be seamlessly chained with API calls, and Pixeltable will keep track of the sequence of operations, constructing new images and caching when necessary to keep things running smoothly. Just for fun (and to demonstrate the power of computed columns), let’s see what OpenAI thinks of our sample images when we rotate them by 180 degrees.
t.add_computed_column(rot_image=t.input_image.rotate(180))
t.add_computed_column(rot_vision=openai.vision(
    prompt="Describe what's in this image.",
    image=t.rot_image,
    model='gpt-4o-mini'
))
Added 5 column values with 0 errors.
Computing cells: 100%|████████████████████████████████████████████| 5/5 [00:26<00:00,  5.24s/ cells]
Computing cells: 100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 661.02 cells/s]
Added 5 column values with 0 errors.
t.select(t.rot_image, t.rot_vision).show()
rot_image rot_vision
The image features a zebra lying on green grass. The zebra is depicted from a top-down perspective, showcasing its distinctive black and white stripes. The position of the zebra suggests it might be resting or relaxing in its natural habitat, with the lush greenery surrounding it.
The image features a giraffe in a natural setting, with lush greenery in the background. The giraffe, characterized by its long neck and spotted coat, is likely feeding or interacting with its environment. There are also trees and foliage surrounding the area, which adds to the natural atmosphere. Another giraffe can be seen in the background, indicating they are in a habitat typical for these animals.
The image shows a pile of shoes and a small, fluffy dog partially hidden among them. The shoes include various styles, such as sneakers, sandals, and open-toed footwear, all stacked on a shoe rack. The dog's curly fur is visible, suggesting it is resting or curled up in the midst of the shoes. The background features a tiled floor.
The image depicts a lush, green forest scene with dense foliage and trees. It appears to show some sort of trail or pathway in the middle, most likely leading further into the forest. The lighting suggests it may be a bright day, with sunlight filtering through the leaves. There are also hints of people or objects slightly obscured by the greenery, which might indicate that someone is present in this natural setting. Overall, the scene conveys a sense of tranquility and rich biodiversity typical of a forest environment.
The image features a white vase hanging upside down from a ceiling or beam, adorned with a cluster of colorful flowers. The flowers include shades of pink, white, and possibly green, creating a vibrant and decorative display. The background is softly blurred, suggesting a natural setting with greenery, enhancing the overall aesthetic of the arrangement. The lighting appears bright and natural, contributing to a fresh and inviting atmosphere.
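rotate is one of a family of PIL-style operations available on image expressions; assuming resize is among them, here is a sketch that adds a thumbnail column (the 224x224 size is arbitrary):
# Hypothetical thumbnail column; resize mirrors the PIL method of the same name
t.add_computed_column(thumb=t.input_image.resize([224, 224]))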

UDFs: Enhancing Pixeltable’s Capabilities

Another important principle of Pixeltable is that, although it ships with a built-in library of useful operations and adapters, it will never prescribe a particular way of doing things. Pixeltable is built from the ground up to be extensible.
Let’s take a specific example. Recall our use of the DETR detection model, in which the detections column contains a JSON blob with bounding boxes, scores, and labels. Suppose we want to create a column containing the single label with the highest confidence score. There’s no built-in Pixeltable function to do this, but it’s easy to write our own. In fact, all we have to do is write a Python function that does the thing we want, and mark it with the @pxt.udf decorator.
@pxt.udf
def top_detection(detect: dict) -> str:
    scores = detect['scores']
    label_text = detect['label_text']
    # Get the index of the object with the highest confidence
    i = scores.index(max(scores))
    # Return the corresponding label
    return label_text[i]
t.add_computed_column(top=top_detection(t.detections))
Computing cells: 100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 495.50 cells/s]
Computing cells: 100%|██████████████████████████████████████████| 5/5 [00:00<00:00, 1096.21 cells/s]
Added 5 column values with 0 errors.
t.select(t.detections_text, t.top).show()
detections_text top
["giraffe", "giraffe"] giraffe
["zebra"] zebra
["dog", "dog"] dog
["person", "person", "bench", "person", "elephant", "elephant", "person"] elephant
["vase", "potted plant"] vase
Congratulations! You’ve reached the end of the tutorial. Hopefully, this gives a good overview of the capabilities of Pixeltable, but there’s much more to explore. As a next step, you might check out one of the other tutorials, depending on your interests:
- Object Detection in Videos
- RAG Operations in Pixeltable
- Working with OpenAI in Pixeltable