@pxt.udf decorator.
Similarly, custom iterators are created by decorating a Python generator
function with @pxt.iterator.
Custom iterators are a relatively advanced
Pixeltable feature. This guide will make the most sense if you’re
already familiar with Pixeltable’s built-in iterators, as well as the
pxt.udf decorator. If you haven’t encountered those
concepts yet, it’s recommended to first read the
Iterators and UDFs tutorial sections.
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory ‘iterators_demo’.
<pixeltable.catalog.dir.Dir at 0x14739e080>
In this tutorial, we’ll be creating an iterator that takes an image as
input, and produces multiple images as output. The output images will be
variations of the input with different characteristics. To start, we’ll
create a base table to store our source images.
Our iterator will produce n different grayscale images of varying brightness.
Creating a functioning iterator is as simple as defining a Python
generator function (a function that yields its output) and then
decorating it with @pxt.iterator.
Note that the function’s return annotation refers to a TypedDict class describing the content of the iterator’s output.
Unlike UDFs, iterators can (and usually do) return multiple outputs.
They will always yield dictionaries, and you must annotate the
return type with a suitable TypedDict. This is how Pixeltable knows
what types to assign to the iterator’s output columns.
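The shape described above can be sketched in plain Python. This is a minimal illustration, not Pixeltable's implementation: the @pxt.iterator decorator is deliberately left off so the snippet runs without Pixeltable, a list of gray levels stands in for a PIL image, and the name grayscale_rows is hypothetical. The last line shows how the TypedDict alone carries enough information to derive output column names and types.

```python
from typing import Iterator, TypedDict, get_type_hints


class GrayscaleOutput(TypedDict):
    brightness: float
    grayscale_image: list[int]  # stand-in for a PIL image: one gray level per pixel


# In Pixeltable this function would carry the @pxt.iterator decorator;
# it is omitted here so the sketch has no third-party dependencies.
def grayscale_rows(pixels: list[int], n: int) -> Iterator[GrayscaleOutput]:
    """Yield n brightness-scaled copies of a grayscale pixel list."""
    for brightness in [0.5 * (i + 1) for i in range(n)]:
        scaled = [min(255, round(p * brightness)) for p in pixels]
        yield {'brightness': brightness, 'grayscale_image': scaled}


rows = list(grayscale_rows([0, 100, 200], n=3))

# The annotation is inspectable at runtime, which is what makes typed output columns possible.
schema = get_type_hints(GrayscaleOutput)
```

Each yielded dictionary becomes one output row, and the keys of schema match the keys of every row.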
Defining a TypedDict for your iterator is not optional. Remember that
Pixeltable is a database system, and everything must be typed!
Now we can apply the iterator to the images table and collect the results.
The resulting view contains the columns brightness and grayscale_image,
which were defined in GrayscaleOutput. In addition, Pixeltable added a
third column pos. Every iterator will automatically output a pos
column, regardless of what shows up in the iterator’s TypedDict. The
pos column simply indicates the integer position of that row in the
original iteration order. If we look at the schema of our new view, we
can see that pos always has type Int.
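As a rough mental model of the pos column, assuming positions are assigned by counting an iterator's outputs from zero, the behavior resembles wrapping the iterator in enumerate. The helper with_pos below is hypothetical and exists only for illustration.

```python
from typing import Any, Iterable, Iterator


def with_pos(outputs: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Attach an integer position to each output row, in iteration order."""
    for pos, row in enumerate(outputs):
        yield {'pos': pos, **row}


rows = list(with_pos({'brightness': 0.5 * (i + 1)} for i in range(3)))
```

The iterator itself never mentions pos; it is added from the outside, which is why it appears regardless of what the TypedDict declares.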
n times;
Pixeltable materializes it in the view by joining against the base
table.)
Parameterizing Iterators
Iterators often contain complex functionality; document_splitter, for
example, has 10 optional parameters to tune its behavior. Like UDFs,
iterators can involve any number of parameters. To illustrate this,
let’s add an optional colorize parameter to our iterator.
Validation
Often it’s desirable to validate an iterator’s inputs as a sanity check. Suppose we want to check that the colorize input is a valid PIL color
name. That’s already being done, in a sense: when ImageOps.colorize is
called in our iterator code, it will raise an exception if the color
name is not valid. The problem is that the iterator code isn’t executed
until our workflow actually runs. There’s nothing stopping us from
creating instances of grayscale_iterator with broken inputs. To
appreciate this distinction, let’s set up an empty table with no rows,
and define an invalid iterator view on it.
Created table ‘images’.
ValueError: unknown color specifier: 'invalid_color_name'
Traceback (most recent call last):
Cell In[10], line 1
----> 1 t.insert({'image': image} for image in images)
[... Pixeltable-internal frames omitted ...]
File ~/Dropbox/workspace/pixeltable/pixeltable/pixeltable/exec/component_iteration_node.py:56, in ComponentIterationNode.__aiter__(self)
---> 56 for pos, component_dict in enumerate(iterator):
Cell In[6], line 10, in grayscale_iterator(image, n, colorize)
---> 10 grayscale_image = ImageOps.colorize(
     11     grayscale_image, black='black', white=colorize
     12 )
File /opt/miniconda3/envs/pxt/lib/python3.10/site-packages/PIL/ImageOps.py:207, in colorize(image, black, white, mid, blackpoint, whitepoint, midpoint)
--> 207 rgb_white = cast(Sequence[int], _color(white, "RGB"))
File /opt/miniconda3/envs/pxt/lib/python3.10/site-packages/PIL/ImageOps.py:48, in _color(color, mode)
---> 48 color = ImageColor.getcolor(color, mode)
File /opt/miniconda3/envs/pxt/lib/python3.10/site-packages/PIL/ImageColor.py:144, in getcolor(color, mode)
--> 144 rgb, alpha = getrgb(color), 255
File /opt/miniconda3/envs/pxt/lib/python3.10/site-packages/PIL/ImageColor.py:125, in getrgb(color)
--> 125 raise ValueError(msg)
ValueError: unknown color specifier: 'invalid_color_name'
It’s more useful to do fail-fast validation, in which the arguments get
checked at the time the iterator is first instantiated. This can be done
in Pixeltable with the @validate decorator.
Created table ‘images’.
ValueError: Invalid color name: invalid_color_name
Traceback (most recent call last):
Cell In[11], line 9, in _(bound_args)
      8 try:
----> 9     ImageColor.getrgb(color)
     10 except ValueError as exc:
File /opt/miniconda3/envs/pxt/lib/python3.10/site-packages/PIL/ImageColor.py:125, in getrgb(color)
--> 125 raise ValueError(msg)
ValueError: unknown color specifier: 'invalid_color_name'
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
Cell In[13], line 4
      1 v = pxt.create_view(
      2     'iterators_demo/grayscale',
      3     t,
----> 4     iterator=grayscale_iterator(
      5         t.input, n=3, colorize='invalid_color_name'
      6     ),
      7 )
File ~/Dropbox/workspace/pixeltable/pixeltable/pixeltable/func/iterator.py:233, in GeneratingFunction.__call__(self, *args, **kwargs)
    231 # Run custom iterator validation on whatever args are bound to literals at this stage
    232 if self._validate is not None:
--> 233     self._validate(literal_args)
Cell In[11], line 11, in _(bound_args)
---> 11 raise ValueError(f'Invalid color name: {color}') from exc
ValueError: Invalid color name: invalid_color_name
The input to
validate(), bound_args, is a dictionary that contains
all constant arguments for a particular instance of the iterator. In
the above example, it contains colorize (because it’s equal to the
constant value 'invalid_color_name'), but not image (which depends
dynamically on the data in the t.input column).
validate() will actually be called twice: once when the iterator is
instantiated, with just the constant arguments present in bound_args;
and again when the iterator is evaluated on each row, this time with
all arguments present.
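The difference between lazy and fail-fast checking, and the two calls to the validator, can be modeled in plain Python. This is only a sketch of the idea under stated assumptions: VALID_COLORS is a made-up stand-in for PIL's color names, and check_args, lazy_rows, and fail_fast_rows are hypothetical helpers, not Pixeltable APIs.

```python
from typing import Any, Callable, Iterator, Optional

VALID_COLORS = {'red', 'green', 'blue'}  # hypothetical stand-in for PIL's color names


def check_args(bound_args: dict[str, Any]) -> None:
    """Validate whichever arguments are present; absent ones are skipped."""
    color = bound_args.get('colorize')
    if color is not None and color not in VALID_COLORS:
        raise ValueError(f'Invalid color name: {color}')


def lazy_rows(image: str, colorize: Optional[str] = None) -> Iterator[dict[str, Any]]:
    # A generator body does not run until iteration starts, so this check is lazy.
    check_args({'image': image, 'colorize': colorize})
    yield {'image': image, 'colorize': colorize}


def fail_fast_rows(colorize: Optional[str] = None) -> Callable[[str], Iterator[dict[str, Any]]]:
    # First call: only the constant arguments are known, and they are checked immediately.
    check_args({'colorize': colorize})
    # Second call happens per row inside lazy_rows, with every argument present.
    return lambda image: lazy_rows(image, colorize)


pending = lazy_rows('a.png', colorize='nope')  # no error yet: nothing has executed

try:
    fail_fast_rows(colorize='nope')
    failed_early = False
except ValueError:
    failed_early = True
```

Creating pending succeeds even though its color is invalid, and the error would only surface on the first next(pending). The fail-fast variant rejects the same argument before any row exists.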
Class-Based Iterators
For complex iterators that need to maintain a lot of state or provide fine-grained control over their iteration mechanism, it can be convenient to define a class rather than a generator function. This can be done by writing a subclass of PxtIterator and decorating the class,
rather than decorating a function. Here’s what grayscale_iterator
looks like if written as a class; it is functionally identical to the
earlier implementation.
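The equivalence between a generator and a class can be seen with Python's ordinary iterator protocol. The sketch below does not use PxtIterator, whose required methods are not shown in this section; it only illustrates how state that a generator keeps implicitly becomes explicit attributes on a class. Both names are hypothetical.

```python
from typing import Any, Iterator


def brightness_gen(n: int) -> Iterator[dict[str, Any]]:
    """Generator form: the loop variable is the only state."""
    for i in range(n):
        yield {'brightness': 0.5 * (i + 1)}


class BrightnessRows:
    """Class form: the same sequence, with the position held as an attribute."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.next_pos = 0

    def __iter__(self) -> 'BrightnessRows':
        return self

    def __next__(self) -> dict[str, Any]:
        if self.next_pos >= self.n:
            raise StopIteration
        row = {'brightness': 0.5 * (self.next_pos + 1)}
        self.next_pos += 1
        return row


from_generator = list(brightness_gen(3))
from_class = list(BrightnessRows(3))
```

The two lists are identical. What the class adds is a place to hang extra state and extra methods.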
Unstored Columns
That’s all you need to know to implement fully functional iterators. But sometimes, depending on the nature of the outputs, a little extra work will help make them more performant. In our example, every input image gets turned into n output images.
Moreover, recreating those output images doesn’t involve a lot of
computation: it’s just a simple color mask. If we store every output
image as a separate file, then when n is large we’ll be using up a lot
of storage without much benefit. Even at n=3, the outputs will consume
three times as much storage as the inputs (maybe a little less since they’re
monochrome now, but you get the idea).
Just as with computed columns, Pixeltable provides an option for
iterator outputs to be unstored - meaning the outputs won’t be saved
to disk, and they’ll instead be dynamically regenerated each time a
client queries them. Unstored columns don’t provide much benefit for
scalar columns (integers or strings, say), where the storage footprint
is small; or for expensive computations (such as generative model
outputs), where we actually do want to persist the output. But for
simple image operations, they can be a lifesaver.
In the Pixeltable library,
frame_iterator and tile_iterator both use an
unstored column for the output images. In the case of
frame_iterator, the stored output would potentially be huge, because
video data is highly compressed compared with individually stored
frame images.
Unstored columns are specified with the unstored_cols
decorator parameter. There is one important caveat:
- If you use unstored columns, you must implement your iterator as a class-based iterator; and
- You must implement a seek() method in your class, as in the example below.
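To see why a seek() method matters for unstored outputs, here is a plain-Python sketch of the idea: if a value is not saved to disk, it has to be recomputed by jumping straight to its position. The seek(pos) signature, the class name, and the read_unstored helper are all assumptions made for illustration, and a list of gray levels again stands in for an image.

```python
from typing import Any


class SeekableBrightnessRows:
    """Toy class-based iterator whose outputs can be regenerated by position."""

    def __init__(self, pixels: list[int], n: int) -> None:
        self.pixels = pixels
        self.n = n
        self.next_pos = 0

    def seek(self, pos: int) -> None:
        # Assumed signature: reposition so the next output is the one at `pos`.
        if not 0 <= pos < self.n:
            raise IndexError(f'position {pos} out of range')
        self.next_pos = pos

    def __iter__(self) -> 'SeekableBrightnessRows':
        return self

    def __next__(self) -> dict[str, Any]:
        if self.next_pos >= self.n:
            raise StopIteration
        brightness = 0.5 * (self.next_pos + 1)
        self.next_pos += 1
        scaled = [min(255, round(p * brightness)) for p in self.pixels]
        return {'brightness': brightness, 'grayscale_image': scaled}


def read_unstored(rows: SeekableBrightnessRows, pos: int) -> dict[str, Any]:
    """Recompute the output at `pos` on demand instead of reading a stored copy."""
    rows.seek(pos)
    return next(rows)


rows = SeekableBrightnessRows([0, 100, 200], n=3)
last = read_unstored(rows, 2)   # jump directly to the final variant
first = read_unstored(rows, 0)  # then back to the first, with no replay in between
```

A plain generator can only move forward, so it cannot support this kind of random access. That is one plausible reading of why unstored columns require the class-based form.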
grayscale_iterator. Let’s check one more time that it all works as
expected.
Inserted 2 rows with 0 errors in 0.03 s (75.79 rows/s)
2 rows inserted.