
Zero-Code AI Agent Monitoring with OpenTelemetry

8 min read

You want visibility into your AI agent's behavior, but you don't want to rewrite your code to get it.

Foil's OpenTelemetry integration gives you full observability with three lines of setup. No manual spans, no token counting, no changes to your LLM calls. Initialize once, and every OpenAI, Anthropic, Cohere, Bedrock, and LangChain call is traced automatically.

Why OpenTelemetry?

OTEL auto-instrumentation is the fastest way to get observability. Three lines of setup, zero changes to your agent code, and automatic capture of model name, input, output, token usage, and latency for every LLM call. It's the right choice when you want instant visibility without touching your existing codebase.

Need more control? Foil's Native SDK lets you define custom span types (AGENT, LLM, TOOL, CHAIN, EMBEDDING, RETRIEVER), attach metadata to individual spans, and build precise trace hierarchies for complex multi-step pipelines. Read the Native SDK guide →

Under the hood it uses OpenLLMetry to instrument LLM providers and sends traces to Foil via a custom span processor. You get the same dashboard experience — traces, evaluations, alerts, semantic search — without writing any instrumentation code.

This guide shows you how to set it up and walks through two real scenarios: a customer support bot and a RAG pipeline.


What You'll Build

| Section | What You'll Learn |
| --- | --- |
| Setup | Install and initialize OTEL auto-instrumentation |
| Customer support bot | Tool calling with nested traces |
| RAG pipeline | Auto-traced embeddings + LLM, trade-offs with custom spans |
| Evaluations & feedback | Server-side features that work with any tracing approach |
| Native SDK vs. OTEL | When to use which approach |

Prerequisites

  • A Foil account (free tier works)
  • Node.js 18+ or Python 3.9+
  • An API key from Foil (Settings → API Keys)
  • An OpenAI API key — OTEL auto-instrumentation traces real LLM calls, so a working API key is required for this guide

Install

This guide uses JavaScript throughout. For JavaScript, OTEL dependencies are bundled with the SDK — no extra install needed. (For Python, install the SDK with the openllmetry extra instead.)

npm init -y
npm install @getfoil/foil-js openai
# @opentelemetry/api is included as a dependency — no extra install

Initialize

Three lines of setup. Import the OTEL module, call Foil.init(), and you're done:

const { Foil } = require('@getfoil/foil-js/otel');

Foil.init({
  apiKey: process.env.FOIL_API_KEY,
  agentName: 'my-agent',
});

// Import LLM clients AFTER Foil.init() — this is required
// so auto-instrumentation hooks are active when modules load
const OpenAI = require('openai');

// That's it. Every OpenAI/Anthropic call is now auto-traced.

Foil.init() sets up an OpenTelemetry TracerProvider with a custom span processor that batches and exports traces to Foil. It also initializes OpenLLMetry, which monkey-patches supported LLM client libraries to emit spans automatically.
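To build intuition for why initialization order matters, here is a toy sketch of the hook mechanism — not Foil's or OpenLLMetry's actual code. `fakeClient` stands in for a real LLM SDK; the point is that a wrapped method only emits span records for calls made after the wrapper is installed:

```javascript
// Toy sketch of auto-instrumentation hooks (illustration only).
// Real instrumentation patches methods at module-load time, which is
// why Foil.init() must run before LLM clients are imported.
const recordedSpans = [];

function instrument(client, methodName) {
  const original = client[methodName];
  client[methodName] = async function (...args) {
    const start = Date.now();
    const result = await original.apply(this, args);
    // Each wrapped call emits a span-like record with timing metadata
    recordedSpans.push({ name: methodName, durationMs: Date.now() - start });
    return result;
  };
}

// Toy client standing in for an LLM SDK
const fakeClient = {
  async complete(prompt) {
    return `echo: ${prompt}`;
  },
};

instrument(fakeClient, 'complete');

fakeClient.complete('hello').then((out) => {
  console.log(out, recordedSpans.length); // prints: echo: hello 1
});
```

The real integration does the same thing at import time: if your LLM client module loads before the patch is installed, its calls bypass the wrapper and produce no spans.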

Supported providers out of the box:

  • OpenAI (including Azure OpenAI)
  • Anthropic
  • Cohere
  • Amazon Bedrock
  • LangChain
  • Any OpenTelemetry-compatible instrumentation

Example: Customer Support Bot

Here's a customer support bot that handles order status queries using tool calling. The agent calls OpenAI, which returns a tool_calls response to look up the order, then calls OpenAI again with the result to generate the final answer. We wrap the whole interaction in a manual parent span using @opentelemetry/api to create a nested trace:

const { Foil } = require('@getfoil/foil-js/otel');
// @opentelemetry/api is already installed — it's a dependency of @getfoil/foil-js
const { trace } = require('@opentelemetry/api');

// Initialize Foil BEFORE importing LLM clients
Foil.init({
  apiKey: process.env.FOIL_API_KEY,
  agentName: 'customer-support',
});

// Import OpenAI AFTER Foil.init() so auto-instrumentation hooks are active
const OpenAI = require('openai');

const openai = new OpenAI();
const tracer = trace.getTracer('customer-support');

// Tool definition for looking up order status
const tools = [
  {
    type: 'function',
    function: {
      name: 'lookup_order',
      description: 'Look up the status of an order by order ID',
      parameters: {
        type: 'object',
        properties: {
          order_id: { type: 'string', description: 'The order ID, e.g. ORD-12345' },
        },
        required: ['order_id'],
      },
    },
  },
];

// Mock implementation — replace with your real order lookup
function lookupOrder(orderId) {
  return { order_id: orderId, status: 'shipped', eta: '2025-02-15' };
}

async function handleQuery(question) {
  // Manual parent span — auto-traced OpenAI calls nest inside it
  return tracer.startActiveSpan('handle-query', async (span) => {
    try {
      const messages = [
        { role: 'system', content: 'You are a helpful support agent for TechMart.' },
        { role: 'user', content: question },
      ];

      // First LLM call — may return tool_calls (auto-traced)
      const first = await openai.chat.completions.create({
        model: 'gpt-4o-mini',
        messages,
        tools,
      });

      const choice = first.choices[0];

      if (choice.finish_reason === 'tool_calls') {
        messages.push(choice.message);

        for (const tc of choice.message.tool_calls) {
          const args = JSON.parse(tc.function.arguments);
          const result = lookupOrder(args.order_id);
          messages.push({
            role: 'tool',
            tool_call_id: tc.id,
            content: JSON.stringify(result),
          });
        }

        // Second LLM call — generates final answer (auto-traced)
        const second = await openai.chat.completions.create({
          model: 'gpt-4o-mini',
          messages,
        });

        return second.choices[0].message.content;
      }

      return choice.message.content;
    } finally {
      span.end();
    }
  });
}

(async () => {
  const answer = await handleQuery('Can you check on order ORD-12345?');
  console.log(`Q: Can you check on order ORD-12345?`);
  console.log(`A: ${answer}\n`);

  console.log('Flushing traces to Foil...');
  await Foil.flush();
  console.log('Done! View your traces at https://app.getfoil.ai\n');
})();

Run It

Save the code above to a file and run it:

FOIL_API_KEY=sk_live_xxx OPENAI_API_KEY=sk-xxx node customer_support.js

The tool-calling flow produces a nested trace — a parent handle-query span wrapping two auto-traced openai.chat calls. The first call returns a tool_calls response, and the second generates the final answer with the tool result. Open your Foil dashboard and you'll see the full hierarchy:
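In the trace view, the hierarchy for one query looks roughly like this:

```
handle-query                      (manual parent span)
├── openai.chat  gpt-4o-mini      (first call, returns tool_calls)
└── openai.chat  gpt-4o-mini      (second call, final answer)
```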

[Screenshot: Foil dashboard showing a nested trace with a parent handle-query span containing two auto-traced OpenAI chat spans from the tool-calling flow]

Example: RAG Pipeline

A Retrieval-Augmented Generation pipeline has three steps: embed the query, retrieve documents, and generate an answer. With OTEL, the embedding and LLM calls are auto-traced. The retriever step — your custom vector store logic — is not.

const { Foil } = require('@getfoil/foil-js/otel');

Foil.init({
  apiKey: process.env.FOIL_API_KEY,
  agentName: 'rag-pipeline',
});

// Import OpenAI AFTER Foil.init()
const OpenAI = require('openai');
const openai = new OpenAI();

// Mock vector store: replace with your real retrieval logic
const vectorStore = {
  async search(queryVector, { topK }) {
    const docs = [
      { content: 'To reset your password, go to Settings → Security → Reset Password.' },
      { content: 'Password resets require access to the email on your account.' },
      { content: 'Contact support if you no longer have access to that email.' },
    ];
    return docs.slice(0, topK);
  },
};

async function ragQuery(userQuestion) {
  // Step 1: Embed the query — auto-traced
  const embedding = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: userQuestion,
  });

  // Step 2: Retrieve docs — your code, NOT auto-traced
  const docs = await vectorStore.search(embedding.data[0].embedding, { topK: 3 });

  // Step 3: Generate answer — auto-traced
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: `Answer based on the following context:\n${docs.map(d => d.content).join('\n')}`,
      },
      { role: 'user', content: userQuestion },
    ],
  });

  return response.choices[0].message.content;
}

(async () => {
  const answer = await ragQuery('How do I reset my password?');
  console.log(answer);

  await Foil.flush();
})();

In the dashboard, you'll see two auto-traced spans per query: one for the embedding call and one for the chat completion. The vector store retrieval step won't appear as a span — that's the main trade-off with auto-instrumentation.

If you need visibility into retriever performance, document relevance scores, or custom tool calls, use the Native SDK with SpanKind.RETRIEVER and SpanKind.TOOL spans.


Evaluations and Feedback

Evaluations, feedback signals, semantic search, and alerts are server-side features that run automatically on incoming traces regardless of how those traces were created. Everything from the Native SDK guide — custom evaluations, evaluation templates, A/B testing, data leakage detection — works identically with OTEL-traced spans.

For example, to add a groundedness check to your RAG agent, create the evaluation via the SDK or dashboard:

const { Foil } = require('@getfoil/foil-js');

const foil = new Foil({ apiKey: process.env.FOIL_API_KEY });
const agentId = 'your-agent-id'; // find this in the Foil dashboard

(async () => {
  // Create a custom evaluation — runs on every trace automatically
  await foil.createEvaluation(agentId, {
    name: 'groundedness_check',
    description: 'Checks if the response is grounded in retrieved documents',
    prompt: `Evaluate whether the assistant response is grounded in the provided context.
Return true if grounded, false if it contains hallucinated or unsupported claims.

Input: {input}
Output: {output}`,
    evaluationType: 'boolean',
    enabled: true,
  });

  // Clone a pre-built template
  await foil.cloneEvaluationTemplate(agentId, 'data_leakage');
})();

Once enabled, these evaluations run automatically on every new trace — no changes to your OTEL setup needed.


Native SDK vs. OpenTelemetry

Foil offers two integration paths. Here's how they compare:

| Feature | Native SDK | OpenTelemetry |
| --- | --- | --- |
| Setup | ~10 lines (create tracer + trace callback) | 3 lines (Foil.init()) |
| LLM call changes | Yes — wrap in spans | None |
| Span types | Full control (AGENT, LLM, TOOL, CHAIN, etc.) | Auto-detected LLM spans |
| Token counting | Manual | Automatic |
| Tool/retriever spans | Explicit | Not auto-captured |
| Best for | Complex pipelines, precise control | Quick setup, standard LLM apps |

Our recommendation: Start with OTEL for quick visibility into your agent's behavior. Switch to the Native SDK when you need fine-grained control over span hierarchy, custom tool spans, or retriever-level detail. See the Native SDK guide for the full walkthrough.


Cleanup

Before your process exits, flush pending spans and shut down the provider:

// Flush any pending spans (e.g., before responding to a request)
await Foil.flush();

// Shutdown when your app exits (flushes + closes connections)
await Foil.shutdown();

In long-running servers, you typically only call Foil.shutdown() on graceful shutdown (e.g., SIGTERM). The span processor batches and exports automatically every 5 seconds, so individual requests don't need explicit flush calls.
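A typical wiring for that graceful shutdown is a one-shot signal handler. This is a sketch: in a real app, `Foil` comes from `require('@getfoil/foil-js/otel')`; here a stub stands in so the snippet is self-contained:

```javascript
// Graceful shutdown wiring for a long-running server (sketch).
// Stub standing in for the Foil SDK imported earlier in your app:
const Foil = { shutdown: async () => { /* flushes spans, closes connections */ } };

let shuttingDown = false;

async function gracefulShutdown(signal) {
  if (shuttingDown) return; // ignore repeated signals
  shuttingDown = true;
  console.log(`${signal} received, flushing traces before exit`);
  try {
    await Foil.shutdown(); // flush pending spans and close the exporter
  } finally {
    process.exit(0);
  }
}

process.once('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.once('SIGINT', () => gracefulShutdown('SIGINT'));
```

Using `process.once` ensures the handler fires a single time even if the orchestrator sends the signal repeatedly before the flush completes.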


Summary

With three lines of setup, you now have:

  • Automatic tracing of every LLM call (model, input, output, tokens, latency)
  • Zero code changes to your existing agent logic
  • Full dashboard access — traces, evaluations, alerts, semantic search, and analytics
  • Server-side evaluations that run automatically on every trace

For complex pipelines where you need custom tool spans, retriever-level detail, or explicit span hierarchy, check out the Native SDK guide.


Foil is an AI monitoring platform that gives you visibility into your agents in production. Sign up free at getfoil.ai.
