Building a Website Search Workflow
Pixeltable website search works in two phases:
- Define your workflow structure (once)
- Query your content database (anytime)
Install Dependencies
Define Your Workflow
Create table.py to define your workflow structure.
Use Your Workflow
Create app.py to query your content database.
What Makes This Different?
Web Scraping
Automatic content extraction.
Smart Chunking
Token-aware content splitting.
Vector Search
Natural language search.
Workflow Components
Web Processing
Advanced web handling:
- HTML content extraction
- Text cleaning and normalization
- Structure preservation
- Automatic encoding detection
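To make the extraction and cleaning steps above concrete, here is a minimal standard-library sketch. It is not Pixeltable's implementation: the names TextExtractor and extract_text are hypothetical, the set of block-level tags is an assumption, and encoding detection is left out (the input is assumed to be an already-decoded string).

```python
import re
import unicodedata
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script>/<style>."""

    SKIPPED = {"script", "style"}
    # Assumed set of block-level tags; a line break is kept around each one
    # so that the rough document structure survives extraction.
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self._parts.append("\n")

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        return "".join(self._parts)


def extract_text(html: str) -> str:
    """Extract visible text, normalize Unicode, and collapse whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # NFKC folds compatibility characters (e.g. non-breaking spaces).
    raw = unicodedata.normalize("NFKC", parser.text())
    lines = [re.sub(r"\s+", " ", line).strip() for line in raw.split("\n")]
    return "\n".join(line for line in lines if line)


page = "<body><h1>Title</h1><script>var x = 1;</script><p>Hello&nbsp;&amp;  world</p></body>"
print(extract_text(page))
```

The heading and the paragraph end up on separate lines, the script body is dropped, and the entity and the doubled space are normalized.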
Content Chunking
Intelligent text splitting:
- Token-aware segmentation
- Configurable chunk sizes
- Context preservation
- Multiple chunking strategies
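The idea behind token-aware segmentation with context preservation can be sketched in a few lines. This is an illustration only, not Pixeltable's splitter: chunk_by_tokens is a hypothetical name, whitespace-separated words stand in for real model tokens, and the overlap parameter is one assumed way of carrying context from one chunk into the next.

```python
from typing import Callable, List


def chunk_by_tokens(
    text: str,
    limit: int = 300,
    overlap: int = 0,
    tokenize: Callable[[str], List[str]] = str.split,
) -> List[str]:
    """Split text into chunks of at most `limit` tokens.

    Consecutive chunks share `overlap` tokens. The default tokenizer splits
    on whitespace; a real tokenizer can be passed in through `tokenize`.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if not 0 <= overlap < limit:
        raise ValueError("overlap must satisfy 0 <= overlap < limit")

    tokens = tokenize(text)
    step = limit - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + limit]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap tokens is emitted.
        if start + limit >= len(tokens):
            break
    return chunks


for chunk in chunk_by_tokens("a b c d e f g h i j", limit=4, overlap=1):
    print(chunk)
```

Each chunk in the example holds at most four tokens and begins with the last token of the previous chunk. A smaller limit gives more precise matches at the cost of more rows to embed and search.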
Vector Search
High-quality search:
- E5 text embeddings
- Fast similarity search
- Natural language queries
- Configurable similarity thresholds
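The ranking and threshold logic behind similarity search can be shown with a toy example. The sketch below deliberately swaps the E5 embeddings for a bag-of-words vector so that it runs with the standard library alone, which means it matches on shared words rather than on meaning. The names embed, cosine, and search, and the default threshold of 0.1, are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(
    query: str, chunks: List[str], top_k: int = 3, threshold: float = 0.1
) -> List[Tuple[float, str]]:
    """Return up to top_k (score, chunk) pairs scoring at least threshold."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    hits = [(score, chunk) for score, chunk in scored if score >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]


chunks = [
    "Pixeltable stores web pages as documents",
    "Chunks are split by token count",
    "The weather is sunny today",
]
for score, chunk in search("how are chunks split", chunks, top_k=2):
    print(f"{score:.2f}  {chunk}")
```

Only the second chunk shares words with the query, so it is the only result. Raising the threshold trades recall for precision; with real text embeddings the same filtering applies, but scores reflect semantic closeness instead of word overlap.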