module  pixeltable.functions.llama_cpp

Pixeltable UDFs for llama.cpp models. Provides integration with llama.cpp for running quantized language models locally, supporting chat completions and embeddings with GGUF format models.

udf  create_chat_completion()

create_chat_completion(
    messages: Json,
    *,
    model_path: String | None = None,
    repo_id: String | None = None,
    repo_filename: String | None = None,
    model_kwargs: Json | None = None
) -> Json
Generate a chat completion from a list of messages. The model can be specified either as a local path, or as a repo_id and repo_filename that reference a pretrained model on the Hugging Face model hub. Exactly one of model_path or repo_id must be provided; if repo_id is provided, then an optional repo_filename can also be specified. For additional details, see the llama_cpp create_chat_completions documentation.

Parameters:
  • messages (Json): A list of messages to generate a response for.
  • model_path (String | None): Path to the model (if using a local model).
  • repo_id (String | None): The Hugging Face model repo id (if using a pretrained model).
  • repo_filename (String | None): A filename or glob pattern to match the model file in the repo (optional, if using a pretrained model).
  • model_kwargs (Json | None): Additional keyword args for the llama_cpp create_chat_completions API, such as max_tokens, temperature, top_p, and top_k. For details, see the llama_cpp create_chat_completions documentation.
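A minimal sketch of the inputs this UDF expects. The exactly-one-of check mirrors the documented model_path / repo_id constraint; the Pixeltable workflow at the bottom is commented out because it requires a table and a downloaded GGUF model, and the table and column names there are illustrative assumptions.

```python
# Illustrative sketch of the inputs create_chat_completion() expects.

def check_model_args(model_path=None, repo_id=None):
    """Mirror the documented constraint: exactly one of model_path / repo_id."""
    if (model_path is None) == (repo_id is None):
        raise ValueError('Exactly one of model_path or repo_id must be provided')

# Chat messages follow the standard role/content format:
messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'What is Pixeltable?'},
]

# Sampling options are passed through model_kwargs:
model_kwargs = {'max_tokens': 256, 'temperature': 0.7, 'top_p': 0.95}

# In a Pixeltable workflow (hypothetical table and column names):
# import pixeltable as pxt
# from pixeltable.functions.llama_cpp import create_chat_completion
# t = pxt.get_table('chat')
# t.add_computed_column(response=create_chat_completion(
#     t.messages,
#     repo_id='Qwen/Qwen2-0.5B-Instruct-GGUF',
#     repo_filename='*q5_k_m.gguf',
#     model_kwargs=model_kwargs,
# ))
```

Note that repo_filename may be a glob pattern (e.g. `'*q5_k_m.gguf'`) matching a single quantization in the repo.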