
Overview

Aptible integrates with Langfuse, allowing you to send traces from the Aptible AI Gateway directly to your Langfuse project for LLM observability, debugging, and evaluation. The integration can send the following data:
  • LLM Traces: request/response data, token usage, latency, and model metadata for every AI Gateway call in an environment, forwarded using an LLM Trace Drain.
LLM Trace Drains require your organization to sign a BAA and be enrolled in the Aptible AI Gateway beta. If your organization isn’t enrolled, the LLM Trace Drain option will not appear in the Aptible Dashboard.
LLM Trace Drains forward the full content of prompts and completions to the configured Langfuse destination. If your AI Gateway traffic includes PHI, PII, regulated data, or internal secrets, confirm that the Langfuse deployment you’re sending to is HIPAA compliant and that you have a BAA with your provider before enabling a drain. Aptible’s BAA covers the AI Gateway itself; it does not extend to third-party destinations such as Langfuse Cloud.

Langfuse LLM Trace Integration

On Aptible, you can set up a Langfuse LLM Trace Drain within an environment to forward every AI Gateway request originating from that environment to your Langfuse project. This enables you to use Langfuse’s tracing, prompt management, and evaluation features against real production LLM traffic running through the AI Gateway.

Prerequisites

Before creating a Langfuse LLM Trace Drain, you’ll need:
  • An environment enrolled in the AI Gateway beta that has LLM keys in active use.
  • A Langfuse project (Langfuse Cloud or a self-hosted deployment) and its Public Key, Secret Key, and Host URL (e.g., https://cloud.langfuse.com).
Hosting Langfuse on Aptible. Aptible doesn’t currently offer managed Langfuse hosting. Langfuse depends on ClickHouse as its analytical backend, which isn’t yet part of our Managed Databases catalog. If Aptible-hosted Langfuse would be useful to you, let us know — customer demand shapes which database types we prioritize adding.
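Langfuse's public API authenticates with HTTP Basic auth: the project Public Key is the username and the Secret Key is the password. As a quick sanity check that you've copied a matching key pair, you can construct the same Authorization header a client of your Langfuse host would send. A minimal sketch with placeholder keys (in Langfuse's usual pk-lf-/sk-lf- format), not real credentials:

```python
import base64

def langfuse_auth_header(public_key: str, secret_key: str) -> str:
    """Build the HTTP Basic Authorization header value Langfuse expects:
    the project Public Key as the username, the Secret Key as the password."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Basic {token}"

# Placeholder keys, not real credentials.
header = langfuse_auth_header("pk-lf-1234", "sk-lf-5678")
print(header)  # Basic cGstbGYtMTIzNDpzay1sZi01Njc4
```

If your Langfuse host rejects this header, re-check that the Public Key and Secret Key come from the same project before configuring the drain.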
A Langfuse LLM Trace Drain can be created within the Aptible Dashboard by:
  • Navigating to an Environment
  • Selecting the Integrations tab
  • Selecting New LLM Trace Drain
  • Selecting Langfuse as the type
  • Providing:
    • A Handle to identify the drain
    • The Langfuse Base URL (the Host URL from your Langfuse project; must use https://)
    • Your Langfuse project Public Key and Secret Key
  • Selecting Save LLM Trace Drain
Once saved, all AI Gateway requests made with API keys belonging to that environment are forwarded to the configured Langfuse project.
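Once the drain is saved, a quick way to confirm it works is to send a single request through the AI Gateway with a key from that environment and check that a trace appears in your Langfuse project. The sketch below assumes the gateway exposes an OpenAI-compatible chat-completions endpoint; the gateway URL, API key, and model name are placeholders, not documented values:

```python
import json
import urllib.request

# Placeholder values: substitute your real gateway URL and environment LLM key.
GATEWAY_URL = "https://gateway.example.aptible.dev/v1/chat/completions"  # hypothetical URL
API_KEY = "your-environment-llm-key"  # placeholder

payload = {
    "model": "gpt-4o-mini",  # any model group configured in your gateway
    "messages": [{"role": "user", "content": "Say hello."}],
}

request = urllib.request.Request(
    GATEWAY_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(request)  # uncomment to send; a trace should then
#                                  # appear in the configured Langfuse project
```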
Interested in other destinations? Langfuse is the first LLM Trace Drain destination Aptible supports. If you’d like to route AI Gateway traces to another platform — or feed them into your existing OpenTelemetry-based observability stack — reach out and tell us what you’re trying to do.

What Gets Sent

Each AI Gateway request is forwarded to Langfuse as a single trace span. Traces follow the CNCF OpenTelemetry GenAI semantic conventions (version 1.40.0) with additional Aptible-specific attributes. Every trace includes:
Request and response content
  • Full input messages, system instructions, and tool definitions
  • Full output messages (including tool calls) and finish reasons
  • Request parameters: model, max_tokens, temperature, top_p, top_k, frequency_penalty, presence_penalty, seed, stop_sequences, response_format, and others
Per-request metrics
  • Token counts — prompt, completion, total, and (where applicable) cache-creation and cache-read input tokens
  • Cost and cost breakdown — note that cost, spend, and budget values currently reflect list pricing from upstream providers and may vary with customer contracts as the AI Gateway’s pricing model evolves
  • Latency — total response time, time-to-first-token (completion start time), and gateway overhead
  • Cache hit indicator, cache key, and cost savings on cache hits
  • Provider rate-limit headers — request and token allowances and remaining budget
Aptible identity and attribution
  • The originating Aptible organization ID, environment ID, API key hash, and key alias
  • The resolved model ID and model group
  • The API route, request ID, caller IP address, and user-agent
  • Cumulative key spend, per-key budget, and budget reset time
Error information
  • On failed requests, the error class and any error detail captured by the gateway
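To make the field lists above concrete, here is an illustrative sketch of the flat span attributes a single gateway trace might carry under the OTel GenAI semantic conventions. The gen_ai.* keys are standard convention names; the aptible.* keys are hypothetical placeholders for the Aptible-specific attributes, since their exact names aren't documented here:

```python
# Illustrative span attributes for one gateway request. This is a sketch,
# not an exhaustive or authoritative list of what Aptible emits.
trace_attributes = {
    "gen_ai.request.model": "gpt-4o-mini",
    "gen_ai.request.temperature": 0.2,
    "gen_ai.request.max_tokens": 256,
    "gen_ai.usage.input_tokens": 412,
    "gen_ai.usage.output_tokens": 57,
    "gen_ai.response.finish_reasons": ["stop"],
    "gen_ai.conversation.id": "session-1234",     # from the x-session-id header
    "aptible.environment.id": "env-5678",         # hypothetical key name
    "aptible.api_key.alias": "checkout-service",  # hypothetical key name
}
```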
See Langfuse’s tracing documentation for details on how traces are structured, searched, and visualized.
Traces arriving in Langfuse reflect the gateway’s view of each request — the model that was actually invoked, the provider latency observed at the gateway, and the token counts Aptible is billing on. If your application also emits its own Langfuse traces, you can correlate gateway traces with application traces using the x-session-id request header (forwarded as gen_ai.conversation.id) or the end_user field.