Overview
Aptible integrates with Langfuse, allowing you to send traces from the Aptible AI Gateway directly to your Langfuse project for LLM observability, debugging, and evaluation. You can send the following data to your Langfuse project:
- LLM Traces: Send request/response data, token usage, latency, and model metadata for every AI Gateway call in an environment using an LLM Trace Drain.
LLM Trace Drains require your organization to sign a BAA and be enrolled in the Aptible AI Gateway beta. If your organization isn’t enrolled, the LLM Trace Drain option will not appear in the Aptible Dashboard.
Langfuse LLM Trace Integration
On Aptible, you can set up a Langfuse LLM Trace Drain within an environment to forward every AI Gateway request originating from that environment to your Langfuse project. This enables you to use Langfuse’s tracing, prompt management, and evaluation features against real production LLM traffic running through the AI Gateway.
Prerequisites
Before creating a Langfuse LLM Trace Drain, you’ll need:
- An environment enrolled in the AI Gateway beta that has LLM keys in active use.
- A Langfuse project (Langfuse Cloud or a self-hosted deployment) and its Public Key, Secret Key, and Host URL (e.g., https://cloud.langfuse.com).
Hosting Langfuse on Aptible. Aptible doesn’t currently offer managed Langfuse hosting. Langfuse depends on ClickHouse as its analytical backend, which isn’t yet part of our Managed Databases catalog. If Aptible-hosted Langfuse would be useful to you, let us know — customer demand shapes which database types we prioritize adding.
Creating a Langfuse LLM Trace Drain
A Langfuse LLM Trace Drain can be created within the Aptible Dashboard by:
- Navigating to an Environment
- Selecting the Integrations tab
- Selecting New LLM Trace Drain
- Selecting Langfuse as the type
- Providing:
- A Handle to identify the drain
- The Langfuse Base URL (must use https://)
- Your Langfuse project Public Key and Secret Key
- Selecting Save LLM Trace Drain
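The Base URL requirement above can also be checked before you paste values into the Dashboard. The helper below is a hypothetical sketch (it is not part of the Aptible or Langfuse APIs); it mirrors the https:// requirement and adds basic sanity checks based on the conventional `pk-lf-` / `sk-lf-` prefixes of Langfuse API keys:

```python
from urllib.parse import urlparse

def validate_drain_config(base_url: str, public_key: str, secret_key: str) -> list:
    """Return a list of problems with a prospective Langfuse drain config.

    Hypothetical helper: mirrors the Dashboard's requirement that the
    Base URL use https://, plus basic sanity checks on the key formats.
    """
    problems = []
    parsed = urlparse(base_url)
    if parsed.scheme != "https":
        problems.append("Base URL must use https://")
    if not parsed.netloc:
        problems.append("Base URL must include a host, e.g. https://cloud.langfuse.com")
    # Langfuse keys conventionally start with pk-lf- / sk-lf-.
    if not public_key.startswith("pk-"):
        problems.append("Public Key does not look like a Langfuse public key")
    if not secret_key.startswith("sk-"):
        problems.append("Secret Key does not look like a Langfuse secret key")
    return problems
```

For example, `validate_drain_config("http://cloud.langfuse.com", "pk-lf-abc", "sk-lf-def")` reports the missing https://, while a well-formed configuration returns an empty list.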
Interested in other destinations? Langfuse is the first LLM Trace Drain destination Aptible supports. If you’d like to route AI Gateway traces to another platform — or feed them into your existing OpenTelemetry-based observability stack — reach out and tell us what you’re trying to do.
What Gets Sent
Each AI Gateway request is forwarded to Langfuse as a single trace span. Traces follow the CNCF OpenTelemetry GenAI semantic conventions (version 1.40.0) with additional Aptible-specific attributes. Every trace includes:
Request and response content
- Full input messages, system instructions, and tool definitions
- Full output messages (including tool calls) and finish reasons
- Request parameters: model, max_tokens, temperature, top_p, top_k, frequency_penalty, presence_penalty, seed, stop_sequences, response_format, and others
- Token counts — prompt, completion, total, and (where applicable) cache-creation and cache-read input tokens
- Cost and cost breakdown. Note that cost, spend, and budget values currently reflect list pricing from upstream providers and will vary as the AI Gateway’s pricing model evolves and based on customer contracts.
- Latency — total response time, time-to-first-token (completion start time), and gateway overhead
- Cache hit indicator, cache key, and cost savings on cache hits
- Provider rate-limit headers — request and token allowances and remaining budget
- The originating Aptible organization ID, environment ID, API key hash, and key alias
- The resolved model ID and model group
- The API route, request ID, caller IP address, and user-agent
- Cumulative key spend, per-key budget, and budget reset time
- On failed requests, the error class and any error detail captured by the gateway
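To make the list above concrete, here is a sketch of the kind of attribute map one trace span might carry. The `gen_ai.*` keys follow the OpenTelemetry GenAI semantic conventions; the `aptible.*` keys are illustrative placeholders, not the gateway’s actual attribute names:

```python
# Illustrative span attributes for a single AI Gateway request.
# gen_ai.* keys follow the OTel GenAI semantic conventions; the
# aptible.* keys are hypothetical stand-ins for the gateway's
# Aptible-specific attributes, not its real schema.
span_attributes = {
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.request.temperature": 0.2,
    "gen_ai.request.max_tokens": 1024,
    "gen_ai.usage.input_tokens": 812,
    "gen_ai.usage.output_tokens": 164,
    "gen_ai.response.finish_reasons": ["stop"],
    # Aptible-specific (illustrative keys only)
    "aptible.environment_id": "env-1234",
    "aptible.cache_hit": False,
}

# Token counts on the span are what Aptible bills on; the total is
# simply input plus output tokens.
total_tokens = (
    span_attributes["gen_ai.usage.input_tokens"]
    + span_attributes["gen_ai.usage.output_tokens"]
)
```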
Traces arriving in Langfuse reflect the gateway’s view of each request — the model that was actually invoked, the provider latency observed at the gateway, and the token counts Aptible is billing on. If your application also emits its own Langfuse traces, you can correlate gateway traces with application traces using the x-session-id request header (forwarded as gen_ai.conversation.id) or the end_user field.
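Correlation is typically done in the Langfuse UI, but as a sketch, this is how you might join exported gateway and application traces on the shared session identifier. The trace shapes here are illustrative dictionaries, not Langfuse’s export schema; only the gen_ai.conversation.id field name comes from the integration described above:

```python
def correlate(gateway_traces, app_traces):
    """Group gateway and application traces by shared session ID.

    Assumes each gateway trace exposes the session identifier under
    "gen_ai.conversation.id" and each application trace under
    "session_id" -- illustrative field names, not a fixed schema.
    """
    by_session = {}
    for trace in gateway_traces:
        sid = trace.get("gen_ai.conversation.id")
        if sid:
            by_session.setdefault(sid, {"gateway": [], "app": []})["gateway"].append(trace)
    for trace in app_traces:
        sid = trace.get("session_id")
        if sid:
            by_session.setdefault(sid, {"gateway": [], "app": []})["app"].append(trace)
    return by_session
```

Any trace missing its session identifier is simply skipped, so partial instrumentation degrades gracefully rather than raising.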
