Learn how to integrate custom embedding models with Pixeltable
Pixeltable provides extensive built-in support for popular embedding models, but you can also easily integrate your own custom embedding models. This guide shows you how to create and use custom embedding functions for any model architecture.
Use Pixeltable's batching capabilities for better performance: by declaring a `batch_size` in the `@pxt.udf` decorator, your function receives a list of inputs and can run a single batched forward pass instead of one model call per row:
```python
import pixeltable as pxt
import tensorflow as tf
import tensorflow_hub as hub
from pixeltable.func import Batch

@pxt.udf(batch_size=32)
def batched_bert_embed(texts: Batch[str]) -> Batch[pxt.Array[(512,), pxt.Float]]:
    """BERT embedding function with batching"""
    # Lazily load the model on first call and cache it on the function object
    if not hasattr(batched_bert_embed, 'model'):
        batched_bert_embed.preprocessor = hub.load(
            'https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3'
        )
        batched_bert_embed.model = hub.load(
            'https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/2'
        )

    # Process the entire batch in a single forward pass
    tensor = tf.constant(list(texts))
    results = batched_bert_embed.model(
        batched_bert_embed.preprocessor(tensor)
    )['pooled_output']
    return [r for r in results.numpy()]
```
Always implement proper error handling in production UDFs: validate inputs up front, and log failures before re-raising so the error surfaces with context:
```python
import logging

import pixeltable as pxt
import tensorflow as tf

logger = logging.getLogger(__name__)

@pxt.udf
def robust_bert_embed(text: str) -> pxt.Array[(512,), pxt.Float]:
    """BERT embedding with error handling"""
    try:
        if not text or len(text.strip()) == 0:
            raise ValueError("Empty text input")
        if not hasattr(robust_bert_embed, 'model'):
            # Model initialization (same lazy-loading pattern as above)...
            pass
        tensor = tf.constant([text])
        result = robust_bert_embed.model(
            robust_bert_embed.preprocessor(tensor)
        )['pooled_output']
        return result.numpy()[0, :]
    except Exception as e:
        logger.error(f"Embedding failed: {str(e)}")
        raise
```