Documentation Index
Fetch the complete documentation index at: https://docs.pixeltable.com/llms.txt
Use this file to discover all available pages before exploring further.
Note: you may need to restart the kernel to use updated packages.
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/sergeymkhitaryan/.pixeltable/pgdata
Created directory ‘rag_demo’.
<pixeltable.catalog.dir.Dir at 0x30cab33a0>
Next we’ll create a table containing the sample questions we want to
answer. The questions are stored in an Excel spreadsheet, along with a
set of “ground truth” answers to help evaluate our model pipeline. We
can use create_table() with the source parameter to load them. Note
that we can pass the URL of the spreadsheet directly.
Created table ‘queries’.
Inserting rows into `queries`: 8 rows [00:00, 2469.96 rows/s]
Inserted 8 rows with 0 errors.
Outline
There are two major parts to our RAG application:
- Document Indexing: Load the documents, split them into chunks, and index them using a vector embedding.
- Querying: For each question on our list, do a top-k lookup for the most relevant chunks, use them to construct a ChatGPT prompt, and send the enriched prompt to an LLM.
Document Indexing
All data in Pixeltable, including documents, resides in tables. Tables are persistent containers that can serve as the store of record for your data. Since we're starting from scratch, we'll begin with an empty table rag_demo.documents with a single column, document.
Created table ‘documents’.
Next, we’ll insert our first few source documents into the new table.
We’ll leave the rest for later, in order to show how to update the
indexed document base incrementally.
Inserting rows into `documents`: 3 rows [00:00, 491.31 rows/s]
Inserted 3 rows with 0 errors.
In RAG applications, we often decompose documents into smaller units, or
chunks, rather than treating each document as a single entity. In this
example, we’ll use Pixeltable’s built-in document_splitter, but in
general the chunking methodology is highly customizable.
document_splitter has a variety of options for controlling the
chunking behavior, and it’s also possible to replace it entirely with a
user-defined iterator (or an adapter for a third-party document
splitter).
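To make the "user-defined iterator" idea concrete, here is a rough plain-Python sketch of what a custom splitter conceptually does: yield fixed-size, overlapping chunks with a sequence number. (This is illustrative only and does not use Pixeltable's iterator API; document_splitter supports richer strategies such as sentence and paragraph boundaries.)

```python
from typing import Iterator

def split_into_chunks(text: str, size: int = 200, overlap: int = 50) -> Iterator[dict]:
    """Yield fixed-size, overlapping chunks with a sequence number.

    Hypothetical helper for illustration; a real Pixeltable iterator
    would subclass the library's iterator interface instead.
    """
    step = size - overlap
    for pos, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        yield {'pos': pos, 'text': text[start:start + size]}

# 500 characters with size=200, overlap=50 -> chunk starts at 0, 150, 300
chunks = list(split_into_chunks('a' * 500, size=200, overlap=50))
```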
In Pixeltable, operations such as chunking can be automated by creating
views of the base documents table. A view is a virtual derived
table: rather than adding data directly to the view, we define it via a
computation over the base table. In this example, the view is defined by
iteration over the chunks of a document_splitter.
Inserting rows into `chunks`: 41 rows [00:00, 20799.04 rows/s]
Our chunks view now has 3 columns:
- text is the chunk text produced by the document_splitter
- pos is a system-generated integer column, starting at 0, that provides a sequence number for each row
- document, which is simply the document column from the base table documents. We won't need it here, but having access to the base table's columns (in effect a parent-child join) can be quite useful.
The chunks view was automatically
populated with data from the existing documents in our base table. We
can select the first 2 chunks from each document using common query
operations, in order to get a feel for what was extracted:
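Selecting the first 2 chunks per document is ordinary grouped retrieval. As a rough plain-Python sketch of the idea (the notebook itself uses Pixeltable query syntax; the sample rows here are made up):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical chunk rows, already ordered by (document, pos)
rows = [
    {'document': 'a.pdf', 'pos': 0, 'text': 'alpha'},
    {'document': 'a.pdf', 'pos': 1, 'text': 'beta'},
    {'document': 'a.pdf', 'pos': 2, 'text': 'gamma'},
    {'document': 'b.pdf', 'pos': 0, 'text': 'delta'},
]

# Group by document and keep the first 2 chunks of each group
first_two = [
    r
    for _, grp in groupby(rows, key=itemgetter('document'))
    for r in list(grp)[:2]
]
```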
To index the chunks, we'll compute embeddings with the
sentence_transformers library, which runs locally.
The following command creates a vector index on the text column in the
chunks table, using the E5 embedding model. (For details on index
creation, see the Embedding and Vector Indices guide.) Note that
defining the index is sufficient to load it with the existing data
(and to keep it updated when the underlying data changes, as we'll
see later).
Querying
In order to express a top-k lookup against our index, we use Pixeltable's similarity operator in combination with the standard
order_by and limit operations. Before building this into our
application, let’s run a sample query to make sure it works.
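Under the hood, a top-k similarity lookup ranks chunk embeddings by cosine similarity to the query embedding and keeps the k best. A minimal plain-Python sketch of the idea (toy 2-d vectors standing in for real E5 embeddings; not Pixeltable's actual implementation):

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, chunk_vecs, texts, k=2):
    """Return the k chunk texts most similar to the query vector."""
    scored = sorted(
        zip(texts, chunk_vecs),
        key=lambda tv: cosine(query_vec, tv[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]

# Toy stand-ins for embedded chunks
texts = ['about revenue', 'about weather', 'about earnings']
vecs = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]]
result = top_k([1.0, 0.0], vecs, texts, k=2)
```

A real index avoids the full sort by using an approximate-nearest-neighbor structure, but the ranking criterion is the same.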
Next, we'll add the context retrieval to the queries table by
adding it as a computed column. In this case, the operation is a top-k
similarity lookup against the data in the chunks table. To implement
this operation, we'll use Pixeltable's @query decorator to enhance the
capabilities of the chunks table.
Added 8 column values with 0 errors.
8 rows updated, 8 values computed.
Our queries table now looks like this:
The new question_context column now contains the result of executing
the query for each row, formatted as a list of dictionaries:
Asking the LLM
Now it’s time for the final step in our application: feeding the document chunks and questions to an LLM for resolution. In this demo, we’ll use OpenAI for this, but any other inference cloud or local model could be used instead. We start by defining a UDF that takes a top-k list of context chunks and a question and turns them into a ChatGPT prompt, then add it as a computed column to queries:
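The core of such a UDF is plain string assembly. A sketch of the logic (the function name and prompt template are illustrative; the real notebook wraps an equivalent function in Pixeltable's @pxt.udf decorator so it can run as a computed column):

```python
def create_prompt(top_k_list: list, question: str) -> str:
    """Assemble retrieved chunk texts and the question into one prompt.

    top_k_list is the list-of-dicts stored in question_context,
    where each item has a 'text' field with the chunk text.
    """
    context = '\n\n'.join(item['text'] for item in top_k_list)
    return f'PASSAGES:\n\n{context}\n\nQUESTION:\n\n{question}'

prompt = create_prompt(
    [{'text': 'Revenue rose 10%.'}, {'text': 'EPS was $1.20.'}],
    'What was the EPS?',
)
```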
Added 8 column values with 0 errors.
8 rows updated, 16 values computed.
We now have a new string column containing the prompt:
For the chat_completions() call, we need to construct two messages, containing
the instructions to the model and the prompt. For the latter, we can
simply reference the prompt column we just added.
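The two messages follow the standard chat-completions format: a system message carrying the instructions, and a user message carrying the prompt. A sketch of the structure (the instruction text is illustrative, not the notebook's exact wording):

```python
def build_messages(prompt: str) -> list:
    # Standard chat-completions message structure: system + user
    return [
        {
            'role': 'system',
            'content': 'Answer the question based only on the provided passages.',
        },
        {'role': 'user', 'content': prompt},
    ]

messages = build_messages('PASSAGES: ... QUESTION: ...')
```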
Added 8 column values with 0 errors.
8 rows updated, 8 values computed.
Our queries table now contains a JSON-structured column response,
which holds the entire API response structure. At the moment, we’re only
interested in the response content, which we can extract easily into
another computed column:
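Extracting the content is just path navigation over the standard chat-completions response shape (choices → message → content). A sketch using a mock response dictionary (the answer text is made up):

```python
# Mock of the JSON structure returned by the chat-completions API
response = {
    'choices': [
        {'message': {'role': 'assistant', 'content': 'The expected EPS is $1.20.'}}
    ]
}

# In Pixeltable this path expression becomes a computed column over
# the response column; the navigation itself is the same either way.
answer = response['choices'][0]['message']['content']
```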
Added 8 column values with 0 errors.
8 rows updated, 8 values computed.
We now have the following queries schema:
Incremental Updates
Pixeltable’s views and computed columns update automatically in response to new data. We can see this when we add the remaining documents to our documents table. Watch how the chunks view is updated to stay in
sync with documents:
Inserting rows into `documents`: 3 rows [00:00, 569.05 rows/s]
Inserting rows into `chunks`: 67 rows [00:00, 325.91 rows/s]
Inserted 70 rows with 0 errors.
70 rows inserted, 6 values computed.
(Note that while this insert updates documents and chunks, it does
not automatically update the queries table. This is by design: we
don’t want all rows in queries to get automatically re-executed every
time a single new document is added to the document base. However,
newly-added rows will be run over the new, incrementally-updated index.)
To confirm that the chunks index got updated, we’ll re-run the chunks
retrieval query for the question
What is the expected EPS for Nvidia in Q1 2026?
Previously, our most similar chunk had a similarity score of ~0.8. Let’s
see what we get now:
Next, we recompute the question_context column of the queries_t table,
which will automatically recompute the answer column as well.
Inserting rows into `queries`: 8 rows [00:00, 580.60 rows/s]
8 rows updated, 40 values computed.
As a final step, let’s confirm that all the queries now have answers: