Computed columns are permanent table columns that automatically calculate values based on expressions involving other columns. They maintain those calculations as your data changes, enabling seamless data transformations without manual updates.
import pixeltable as pxt

# Create a table with population data
pop_t = pxt.io.import_csv(
    'fundamentals.population',
    'https://github.com/pixeltable/pixeltable/raw/main/docs/source/data/world-population-data.csv'
)

# Add a computed column to track population change year over year
pop_t.add_computed_column(yoy_change=(pop_t.pop_2023 - pop_t.pop_2022))

# Display the results
pop_t.select(pop_t.country, pop_t.pop_2022, pop_t.pop_2023, pop_t.yoy_change).head(5)
As soon as the column is added, Pixeltable will (by default) automatically compute its value for all rows in the table, storing the results in the new column.
In traditional data workflows, it is commonplace to recompute entire pipelines when the input dataset is changed or enlarged. In Pixeltable, by contrast, all updates are applied incrementally. When new data appear in a table or existing data are altered, Pixeltable will recompute only those rows that are dependent on the changed data.
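To make the contrast with full recomputation concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not Pixeltable's implementation: the `ToyTable` class, its `evaluations` counter, and the sample rows are all hypothetical and exist only for illustration.

```python
from typing import Callable


class ToyTable:
    """Toy in-memory table with one computed column (not Pixeltable's implementation)."""

    def __init__(self, name: str, expr: Callable[[dict], float]) -> None:
        self.name = name          # name of the computed column
        self.expr = expr          # expression evaluated once per row
        self.rows: list = []
        self.evaluations = 0      # how many times the expression has run

    def _compute(self, row: dict) -> None:
        row[self.name] = self.expr(row)
        self.evaluations += 1

    def insert(self, new_rows: list) -> None:
        # Only the newly inserted rows are evaluated; existing rows are untouched.
        for row in new_rows:
            row = dict(row)
            self._compute(row)
            self.rows.append(row)

    def update(self, index: int, **changes) -> None:
        # Only the modified row is re-evaluated.
        self.rows[index].update(changes)
        self._compute(self.rows[index])


# Hypothetical sample data, chosen only to keep the arithmetic easy to follow.
table = ToyTable('yoy_change', lambda r: r['pop_2023'] - r['pop_2022'])
table.insert([
    {'country': 'A', 'pop_2022': 100, 'pop_2023': 110},
    {'country': 'B', 'pop_2022': 200, 'pop_2023': 190},
])
table.insert([{'country': 'C', 'pop_2022': 50, 'pop_2023': 55}])
table.update(0, pop_2023=120)

print(table.evaluations)                       # 4 evaluations in total
print([r['yoy_change'] for r in table.rows])   # [20, -10, 5]
```

In this sketch the expression runs four times: twice for the initial insert, once for the new row, and once for the updated row. Recomputing the whole table after each of those three operations would have taken eight evaluations (2 + 3 + 3), and that gap widens as the table grows.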
Let’s explore another example that uses computed columns for image processing operations.
# Create a table for image operations
t = pxt.create_table('fundamentals.image_ops', {'source': pxt.Image})

# Extract image metadata (dimensions, format, etc.)
t.add_computed_column(metadata=t.source.get_metadata())

# Create a rotated version of each image
t.add_computed_column(rotated=t.source.rotate(10))

# Create a version with transparency and rotation
t.add_computed_column(rotated_transparent=t.source.convert('RGBA').rotate(10))
Once we insert data, Pixeltable automatically computes the values of these columns for the new rows.
# Insert sample images from a GitHub repository
url_prefix = 'https://github.com/pixeltable/pixeltable/raw/main/docs/source/data/images'
images = ['000000000139.jpg', '000000000632.jpg', '000000000872.jpg']
t.insert({'source': f'{url_prefix}/{image}'} for image in images)

# Display the original and rotated images
t.select(t.source, t.rotated).limit(2)
Pixeltable will automatically manage the dependencies between the columns, so that when the source image is updated, the rotated and rotated_transparent columns are automatically recomputed.
You don’t need to think about orchestration. Our DAG engine will take care of the dependencies for you.
Computed columns can depend on other computed columns:
# First computed column: calculate total population change over 3 years
pop_t.add_computed_column(total_change=pop_t.pop_2023 - pop_t.pop_2020)

# Second computed column: calculate average yearly change using the first column
pop_t.add_computed_column(
    avg_yearly_change=pop_t.total_change / 3
)
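The following sketch shows why chained computed columns need dependency-aware evaluation. It uses Python's standard `graphlib` module, not Pixeltable's engine: the `dependencies` and `expressions` dictionaries, the helper functions, and the sample row are hypothetical and only mirror the column definitions used on this page.

```python
from graphlib import TopologicalSorter

# Each computed column maps to the set of columns its expression reads.
dependencies = {
    'yoy_change': {'pop_2023', 'pop_2022'},
    'total_change': {'pop_2023', 'pop_2020'},
    'avg_yearly_change': {'total_change'},
}

# Plain-Python stand-ins for the column expressions (illustrative only).
expressions = {
    'yoy_change': lambda r: r['pop_2023'] - r['pop_2022'],
    'total_change': lambda r: r['pop_2023'] - r['pop_2020'],
    'avg_yearly_change': lambda r: r['total_change'] / 3,
}


def evaluation_order(deps):
    """Return the computed columns, each one after the columns it depends on."""
    order = TopologicalSorter(deps).static_order()
    # static_order() also yields the stored input columns; keep computed ones only.
    return [col for col in order if col in deps]


def compute_row(row):
    """Fill in every computed column for one row, respecting dependencies."""
    result = dict(row)
    for col in evaluation_order(dependencies):
        result[col] = expressions[col](result)
    return result


# Hypothetical input row.
row = compute_row({'pop_2020': 1000, 'pop_2022': 1040, 'pop_2023': 1060})
print(row['total_change'], row['avg_yearly_change'])  # 60 20.0
```

The topological sort guarantees that `total_change` is evaluated before `avg_yearly_change`, regardless of the order in which the columns were declared. In Pixeltable you do not write this ordering logic yourself, since it is handled for you when you declare the columns.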