You want visibility into your AI agent's behavior, but you don't want to rewrite your code to get it.
Foil's OpenTelemetry integration gives you full observability with three lines of setup. No manual spans, no token counting, no changes to your LLM calls. Initialize once, and every OpenAI, Anthropic, Cohere, Bedrock, and LangChain call is traced automatically.
Why OpenTelemetry?
OTEL auto-instrumentation is the fastest way to get observability. Three lines of setup, zero changes to your agent code, and automatic capture of model name, input, output, token usage, and latency for every LLM call. It's the right choice when you want instant visibility without touching your existing codebase.
Need more control? Foil's Native SDK lets you define custom span types (AGENT, LLM, TOOL, CHAIN, EMBEDDING, RETRIEVER), attach metadata to individual spans, and build precise trace hierarchies for complex multi-step pipelines. Read the Native SDK guide →
Under the hood it uses OpenLLMetry to instrument LLM providers and sends traces to Foil via a custom span processor. You get the same dashboard experience — traces, evaluations, alerts, semantic search — without writing any instrumentation code.
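To make "instrument" concrete: auto-instrumentation works by patching client methods in place so every call emits a span-like record. The sketch below is a toy illustration of that idea — it is not OpenLLMetry's actual code, and `instrumentMethod` is a name invented for this example:

```javascript
// Toy illustration of auto-instrumentation (not OpenLLMetry's real code):
// replace a method on the client object so every call records its name,
// duration, and outcome before returning the original result.
function instrumentMethod(obj, methodName, onSpan) {
  const original = obj[methodName].bind(obj);
  obj[methodName] = async (...args) => {
    const start = Date.now();
    try {
      const result = await original(...args);
      onSpan({ name: methodName, durationMs: Date.now() - start, error: null });
      return result;
    } catch (err) {
      onSpan({ name: methodName, durationMs: Date.now() - start, error: err });
      throw err;
    }
  };
}
```

This is also why import order matters later in this guide: the patch has to be applied before your code grabs a reference to the client method.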
This guide shows you how to set it up and walks through two real scenarios: a customer support bot and a RAG pipeline.
What You'll Build
| Section | What You'll Learn |
|---|---|
| Setup | Install and initialize OTEL auto-instrumentation |
| Customer support bot | Tool calling with nested traces |
| RAG pipeline | Auto-traced embeddings + LLM, trade-offs with custom spans |
| Evaluations & feedback | Server-side features that work with any tracing approach |
| Native SDK vs. OTEL | When to use which approach |
Prerequisites
- A Foil account (free tier works)
- Node.js 18+ or Python 3.9+
- An API key from Foil (Settings → API Keys)
- An OpenAI API key — OTEL auto-instrumentation traces real LLM calls, so a working API key is required for this guide
Install
For JavaScript, OTEL dependencies are bundled with the SDK — no extra install needed. (Python users install the SDK with its openllmetry extra instead; this guide uses the JavaScript SDK throughout.) Create a project and install the SDK alongside the OpenAI client:
npm init -y
npm install @getfoil/foil-js openai
# @opentelemetry/api is included as a dependency — no extra install
Initialize
Three lines of setup. Import the OTEL module, call Foil.init(), and you're done:
const { Foil } = require('@getfoil/foil-js/otel');
Foil.init({
apiKey: process.env.FOIL_API_KEY,
agentName: 'my-agent',
});
// Import LLM clients AFTER Foil.init() — this is required
// so auto-instrumentation hooks are active when modules load
const OpenAI = require('openai');
// That's it. Every OpenAI/Anthropic call is now auto-traced.
Foil.init() sets up an OpenTelemetry TracerProvider with a custom span processor that batches and exports traces to Foil. It also initializes OpenLLMetry, which monkey-patches supported LLM client libraries to emit spans automatically.
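A span processor follows OpenTelemetry's SpanProcessor shape: onStart/onEnd callbacks plus forceFlush and shutdown. Here is a simplified, illustrative version of the batching pattern — Foil's actual processor is more involved, and the class and parameter names below are assumptions for the sketch:

```javascript
// Simplified batching span processor in the OpenTelemetry SpanProcessor
// shape: buffer ended spans and hand batches to an exporter, either when
// the buffer fills or on a periodic timer.
class BatchingSpanProcessor {
  constructor(exporter, { maxBatchSize = 100, flushIntervalMs = 5000 } = {}) {
    this.exporter = exporter;
    this.buffer = [];
    this.maxBatchSize = maxBatchSize;
    this.timer = setInterval(() => this.forceFlush(), flushIntervalMs);
    if (this.timer.unref) this.timer.unref(); // don't keep the process alive
  }
  onStart(_span, _parentContext) {} // nothing to do when a span begins
  onEnd(span) {
    this.buffer.push(span);
    if (this.buffer.length >= this.maxBatchSize) this.forceFlush();
  }
  async forceFlush() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer.splice(0, this.buffer.length);
    await this.exporter.export(batch); // e.g. an HTTP POST to the backend
  }
  async shutdown() {
    clearInterval(this.timer);
    await this.forceFlush();
  }
}
```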
Supported providers out of the box:
- OpenAI (including Azure OpenAI)
- Anthropic
- Cohere
- Amazon Bedrock
- LangChain
- Any OpenTelemetry-compatible instrumentation
Example: Customer Support Bot
Here's a customer support bot that handles order status queries using tool calling. The agent calls OpenAI, which returns a tool_calls response to look up the order, then calls OpenAI again with the result to generate the final answer. We wrap the whole interaction in a manual parent span using @opentelemetry/api to create a nested trace:
const { Foil } = require('@getfoil/foil-js/otel');
// @opentelemetry/api is already installed — it's a dependency of @getfoil/foil-js
const { trace } = require('@opentelemetry/api');
// Initialize Foil BEFORE importing LLM clients
Foil.init({
apiKey: process.env.FOIL_API_KEY,
agentName: 'customer-support',
});
// Import OpenAI AFTER Foil.init() so auto-instrumentation hooks are active
const OpenAI = require('openai');
const openai = new OpenAI();
const tracer = trace.getTracer('customer-support');
// Tool definition for looking up order status
const tools = [
{
type: 'function',
function: {
name: 'lookup_order',
description: 'Look up the status of an order by order ID',
parameters: {
type: 'object',
properties: {
order_id: { type: 'string', description: 'The order ID, e.g. ORD-12345' },
},
required: ['order_id'],
},
},
},
];
// Mock implementation — replace with your real order lookup
function lookupOrder(orderId) {
return { order_id: orderId, status: 'shipped', eta: '2025-02-15' };
}
async function handleQuery(question) {
// Manual parent span — auto-traced OpenAI calls nest inside it
return tracer.startActiveSpan('handle-query', async (span) => {
try {
const messages = [
{ role: 'system', content: 'You are a helpful support agent for TechMart.' },
{ role: 'user', content: question },
];
// First LLM call — may return tool_calls (auto-traced)
const first = await openai.chat.completions.create({
model: 'gpt-4o-mini',
messages,
tools,
});
const choice = first.choices[0];
if (choice.finish_reason === 'tool_calls') {
messages.push(choice.message);
for (const tc of choice.message.tool_calls) {
const args = JSON.parse(tc.function.arguments);
const result = lookupOrder(args.order_id);
messages.push({
role: 'tool',
tool_call_id: tc.id,
content: JSON.stringify(result),
});
}
// Second LLM call — generates final answer (auto-traced)
const second = await openai.chat.completions.create({
model: 'gpt-4o-mini',
messages,
});
return second.choices[0].message.content;
}
return choice.message.content;
} finally {
span.end();
}
});
}
(async () => {
const answer = await handleQuery('Can you check on order ORD-12345?');
console.log(`Q: Can you check on order ORD-12345?`);
console.log(`A: ${answer}\n`);
console.log('Flushing traces to Foil...');
await Foil.flush();
console.log('Done! View your traces at https://app.getfoil.ai\n');
})();
Run It
Save the code above to a file and run it:
FOIL_API_KEY=sk_live_xxx OPENAI_API_KEY=sk-xxx node customer_support.js
The tool-calling flow produces a nested trace — a parent handle-query span wrapping two auto-traced openai.chat calls. The first call returns a tool_calls response, and the second generates the final answer with the tool result. Open your Foil dashboard and you'll see the full hierarchy:

Example: RAG Pipeline
A Retrieval-Augmented Generation pipeline has three steps: embed the query, retrieve documents, and generate an answer. With OTEL, the embedding and LLM calls are auto-traced. The retriever step — your custom vector store logic — is not.
const { Foil } = require('@getfoil/foil-js/otel');
Foil.init({
apiKey: process.env.FOIL_API_KEY,
agentName: 'rag-pipeline',
});
// Import OpenAI AFTER Foil.init()
const OpenAI = require('openai');
const openai = new OpenAI();
// Mock vector store (replace with your real retrieval logic)
const vectorStore = {
  async search(queryEmbedding, { topK }) {
    return [
      { content: 'To reset your password, go to Settings → Security → Reset Password.' },
    ].slice(0, topK);
  },
};
async function ragQuery(userQuestion) {
// Step 1: Embed the query — auto-traced
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: userQuestion,
});
// Step 2: Retrieve docs — your code, NOT auto-traced
const docs = await vectorStore.search(embedding.data[0].embedding, { topK: 3 });
// Step 3: Generate answer — auto-traced
const response = await openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `Answer based on the following context:\n${docs.map(d => d.content).join('\n')}`,
},
{ role: 'user', content: userQuestion },
],
});
return response.choices[0].message.content;
}
(async () => {
const answer = await ragQuery('How do I reset my password?');
console.log(answer);
await Foil.flush();
})();
In the dashboard, you'll see two auto-traced spans per query: one for the embedding call and one for the chat completion. The vector store retrieval step won't appear as a span — that's the main trade-off with auto-instrumentation.
If you need visibility into retriever performance, document relevance scores, or custom tool calls, use the Native SDK with SpanKind.RETRIEVER and SpanKind.TOOL spans.
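Short of switching SDKs, you can also give the retrieval step its own span with a manual @opentelemetry/api span, the same technique used for the handle-query parent span in the support-bot example. A sketch — the `tracedSearch` helper and the `retriever.*` attribute keys are my own naming, not a Foil convention:

```javascript
// Wrap the vector store search in a manual OTEL span so it appears in
// the trace. `tracer` is an OpenTelemetry Tracer, e.g. from
// trace.getTracer('rag-pipeline'). Attribute keys here are illustrative.
async function tracedSearch(tracer, vectorStore, queryEmbedding, topK) {
  return tracer.startActiveSpan('vector-store.search', async (span) => {
    try {
      const docs = await vectorStore.search(queryEmbedding, { topK });
      span.setAttribute('retriever.top_k', topK);
      span.setAttribute('retriever.docs_returned', docs.length);
      return docs;
    } finally {
      span.end();
    }
  });
}
```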
Evaluations and Feedback
Evaluations, feedback signals, semantic search, and alerts are server-side features that run automatically on incoming traces regardless of how those traces were created. Everything from the Native SDK guide — custom evaluations, evaluation templates, A/B testing, data leakage detection — works identically with OTEL-traced spans.
For example, to add a groundedness check to your RAG agent, create the evaluation via the SDK or dashboard:
const { Foil } = require('@getfoil/foil-js');
const foil = new Foil({ apiKey: process.env.FOIL_API_KEY });
const agentId = 'your-agent-id'; // find your agent's ID in the Foil dashboard
// Create a custom evaluation — runs on every trace automatically
await foil.createEvaluation(agentId, {
name: 'groundedness_check',
description: 'Checks if the response is grounded in retrieved documents',
prompt: `Evaluate whether the assistant response is grounded in the provided context.
Return true if grounded, false if it contains hallucinated or unsupported claims.
Input: {input}
Output: {output}`,
evaluationType: 'boolean',
enabled: true,
});
// Clone a pre-built template
await foil.cloneEvaluationTemplate(agentId, 'data_leakage');
Once enabled, these evaluations run automatically on every new trace — no changes to your OTEL setup needed.
Native SDK vs. OpenTelemetry
Foil offers two integration paths. Here's how they compare:
| Feature | Native SDK | OpenTelemetry |
|---|---|---|
| Setup | ~10 lines (create tracer + trace callback) | 3 lines (Foil.init()) |
| LLM call changes | Yes — wrap in spans | None |
| Span types | Full control (AGENT, LLM, TOOL, CHAIN, etc.) | Auto-detected LLM spans |
| Token counting | Manual | Automatic |
| Tool/retriever spans | Explicit | Not auto-captured |
| Best for | Complex pipelines, precise control | Quick setup, standard LLM apps |
Our recommendation: Start with OTEL for quick visibility into your agent's behavior. Switch to the Native SDK when you need fine-grained control over span hierarchy, custom tool spans, or retriever-level detail. See the Native SDK guide for the full walkthrough.
Cleanup
Before your process exits, flush pending spans and shut down the provider:
// Flush any pending spans (e.g., before responding to a request)
await Foil.flush();
// Shutdown when your app exits (flushes + closes connections)
await Foil.shutdown();
In long-running servers, you typically only call Foil.shutdown() on graceful shutdown (e.g., SIGTERM). The span processor batches and exports automatically every 5 seconds, so individual requests don't need explicit flush calls.
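One way to wire that up in a server is a small signal handler around Foil.shutdown(). The helper below is a sketch — `registerGracefulShutdown` is a name invented here, and the `exit` parameter exists only so the handler can be exercised without killing the process:

```javascript
// Flush and close the tracer exactly once when the process receives
// SIGTERM or SIGINT. `client` is any object with an async shutdown(),
// e.g. Foil. The handler is returned so it can be invoked directly.
function registerGracefulShutdown(
  client,
  { signals = ['SIGTERM', 'SIGINT'], exit = process.exit } = {}
) {
  let shuttingDown = false;
  const handler = async () => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    try {
      await client.shutdown(); // flushes pending spans, closes connections
    } finally {
      exit(0);
    }
  };
  for (const sig of signals) process.on(sig, handler);
  return handler;
}
```

In an app using Foil, you would call `registerGracefulShutdown(Foil)` once at startup, after Foil.init().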
Summary
With three lines of setup, you now have:
- Automatic tracing of every LLM call (model, input, output, tokens, latency)
- Zero code changes to your existing agent logic
- Full dashboard access — traces, evaluations, alerts, semantic search, and analytics
- Server-side evaluations that run automatically on every trace
For complex pipelines where you need custom tool spans, retriever-level detail, or explicit span hierarchy, check out the Native SDK guide.
Foil is an AI monitoring platform that gives you visibility into your agents in production. Sign up free at getfoil.ai.