
Tracing txtai

Figure: txtai tracing via autolog in the MLflow UI

txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

MLflow Tracing provides automatic tracing capability for txtai. Auto tracing for txtai can be enabled by calling the mlflow.txtai.autolog function. Once enabled, MLflow captures traces for LLM invocations, embeddings and vector searches, and logs them to the active MLflow Experiment.

To get started, install the MLflow txtai extension:

pip install mlflow-txtai

Then, enable autologging in your Python code:

import mlflow

mlflow.txtai.autolog()
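
Autologging can also be tuned or turned off again. The sketch below assumes the txtai extension follows MLflow's standard autolog keyword arguments (log_traces, disable); the extension itself only documents the bare call, so treat these flags as an assumption:

import mlflow

# Assumed standard autolog flags: log_traces toggles trace capture,
# disable turns the integration off entirely.
mlflow.txtai.autolog(log_traces=True)

# ... run txtai code ...

mlflow.txtai.autolog(disable=True)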

Basic Example

The first example traces a Textractor pipeline, which extracts text from documents and web pages.

import mlflow
from txtai.pipeline import Textractor

# Enable MLflow auto-tracing for txtai
mlflow.txtai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("txtai")

# Define and run a simple Textractor pipeline.
textractor = Textractor()
textractor("https://github.com/neuml/txtai")

Figure: Textractor pipeline trace in the MLflow UI
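
Autologged spans attach to the active trace, so each pipeline call normally produces its own trace. To group several calls under one parent trace, wrap them in a function decorated with MLflow's @mlflow.trace decorator. A minimal sketch; the extract function and second URL are illustrative, not part of the txtai API:

import mlflow
from txtai.pipeline import Textractor

mlflow.txtai.autolog()

textractor = Textractor()


# @mlflow.trace opens a parent span; autologged txtai spans created
# inside the function nest under it, yielding a single combined trace.
@mlflow.trace
def extract(urls):
    return [textractor(url) for url in urls]


extract(["https://github.com/neuml/txtai", "https://neuml.github.io/txtai"])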

Retrieval Augmented Generation (RAG)

The next example traces a RAG pipeline.

import mlflow
from txtai import Embeddings, RAG

# Enable MLflow auto-tracing for txtai
mlflow.txtai.autolog()


# Load a prebuilt Wikipedia embeddings index from the Hugging Face Hub
wiki = Embeddings()
wiki.load(provider="huggingface-hub", container="neuml/txtai-wikipedia-slim")

# Define prompt template
template = """
Answer the following question using only the context below. Only include information
specifically discussed.

question: {question}
context: {context} """

# Create RAG pipeline
rag = RAG(
    wiki,
    "hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    system="You are a friendly assistant. You answer questions from users.",
    template=template,
    context=10,
)

rag("Tell me about the Roman Empire", maxlength=2048)

Figure: RAG pipeline trace in the MLflow UI
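
The example above loads a prebuilt Wikipedia index from the Hugging Face Hub. To trace RAG over your own documents instead, a small index can be built in memory. A sketch, where the embedding model, sample texts and question are illustrative:

import mlflow
from txtai import Embeddings, RAG

mlflow.txtai.autolog()

# content=True stores the original text so it can be returned as RAG context
embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2", content=True)
embeddings.index(
    ["txtai is an all-in-one embeddings database", "MLflow manages the machine learning lifecycle"]
)

rag = RAG(
    embeddings,
    "hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    template="Answer using only the context below.\nquestion: {question}\ncontext: {context}",
)

rag("What is txtai?", maxlength=512)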

Agent

The last example runs a txtai agent designed to research questions on astronomy.

import mlflow
from txtai import Agent, Embeddings

# Enable MLflow auto-tracing for txtai
mlflow.txtai.autolog()


def search(query):
    """
    Searches a database of astronomy data.

    Make sure to call this tool only with a string input, never use JSON.

    Args:
        query: concepts to search for using similarity search

    Returns:
        list of search results with id, text and distance for each match
    """

    return embeddings.search(
        "SELECT id, text, distance FROM txtai WHERE similar(:query)",
        10,
        parameters={"query": query},
    )


# Load a prebuilt astronomy embeddings index from the Hugging Face Hub
embeddings = Embeddings()
embeddings.load(provider="huggingface-hub", container="neuml/txtai-astronomy")

agent = Agent(
    tools=[search],
    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    max_iterations=10,
)

researcher = """
{command}

Do the following.
- Search for results related to the topic.
- Analyze the results
- Continue querying until conclusive answers are found
- Write a Markdown report
"""

agent(
    researcher.format(
        command="""
Write a detailed list with explanations of 10 candidate stars that could potentially be habitable to life.
"""
    ),
    maxlength=16000,
)

Figure: Agent trace in the MLflow UI
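
Once any of these examples has run, logged traces can also be retrieved programmatically instead of through the UI. A sketch using mlflow.search_traces, which returns a pandas DataFrame of traces from the active experiment (exact columns vary by MLflow version):

import mlflow

# Fetch the most recent traces from the active experiment
traces = mlflow.search_traces(max_results=5)
print(traces.head())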

More Information

For more examples and guidance on using txtai with MLflow, please refer to the MLflow txtai extension documentation.