
Aug 1, 2023

HIPAA Compliant AI: What developers actually need to know


Henry Hund

Operations


By Henry Hund, Chief Operating Officer

and Mat Steinlin, Head of Information Security



Last updated: March 2, 2026


Using AI in healthcare isn't hard. Making it compliant is.


If you're building digital health products that use LLMs with patient data, you've probably discovered that calling the OpenAI API is the easy part. The hard part is everything around it: BAAs, audit logging, encrypted storage, key management, de-identification, and the ability to prove to auditors that your controls work.


This guide covers what "HIPAA-compliant AI" actually means, the technical requirements for using LLMs with PHI, which tools offer BAA coverage, and the infrastructure decisions you'll face.


We wrote this for developers and technical leads at healthcare companies who want to use AI without creating a compliance problem. We'll include regulatory citations where they matter, but the focus is on what you need to do.

But first, why listen to us?


Aptible has helped thousands of digital health startups keep their cloud infrastructure HIPAA compliant since 2013. When we started building an AI Gateway, we were surprised that no great developer’s guide to HIPAA-compliant AI existed, so we set out to fix that. This guide is based on our research and practical experience with using PHI with LLMs in a safe and secure manner.


Mat and Henry authored this piece. As a former CISO at digital health startups and current Head of Information Security at Aptible, Mat has been through more compliance audits than he can count. Henry has worked with countless customers over the past decade at Aptible ensuring their infrastructure met HIPAA compliance requirements. Between us, we've probably navigated every HIPAA, HITRUST, SOC 2, or PCI complexity or edge case imaginable.

What "HIPAA-Compliant AI" actually means


No AI tool is "HIPAA certified." HIPAA is a federal law, implemented through regulations, and it does not certify or approve products. It sets requirements for how covered entities and business associates must safeguard protected health information (PHI).


Vendors may obtain independent security attestations such as SOC 2 reports or HITRUST certification to demonstrate the effectiveness of their controls, but those are assurance frameworks, not government certifications and not substitutes for HIPAA compliance. Compliance ultimately depends on how the organization implements and governs the system, including having appropriate safeguards and, where required, a valid BAA in place.


When AI touches PHI, HIPAA requires safeguards across three areas: administrative, physical, and technical. For LLM usage specifically, this translates to concrete technical requirements:


Core requirements:

  • Business Associate Agreement (BAA): If an LLM provider processes PHI on your behalf, they're a business associate under HIPAA. You need a signed BAA before sending any PHI to their API. When a BAA is in place, the provider typically implements specific technical safeguards, such as zero data retention, to align with the legal obligations defined in the agreement. This isn't just paperwork; it changes how the provider handles your data.

  • Audit logging: Any activity involving PHI must be logged. When your LLM calls handle patient data, that means logging prompts, responses, timestamps, and who made the request. This creates the audit trail required under 45 CFR 164.312(b).

  • Encryption: Data must be encrypted in transit (TLS) and at rest. This applies to your logs, any cached data, and anything persisted.

  • No training on PHI: You need assurance that your data won't be used to train models. This should be explicit in the BAA or provider agreement, not just a policy statement.


Operational requirements that matter in practice:

  • Key management: How do you manage API keys across teams, applications, and environments? How do you rotate them? How do you know which key made which request?

  • Model access controls: Can you restrict which models developers can use? Different risk profiles for dev vs. production?

  • De-identification: Are you scrubbing PHI before it reaches the LLM? Do you need to re-identify data in responses?

  • Cost controls: LLM costs can spiral. Do you have visibility into usage? Budget limits that actually stop requests?

  • Observability: Can you inspect actual requests to verify your controls are working?


The distinction between "HIPAA compliant" and "HIPAA eligible" matters. A provider being "HIPAA eligible" means they'll sign a BAA. It doesn't mean using their API automatically makes your implementation compliant. That's still on you.

The regulatory gap around AI


Using LLMs with PHI in a compliant manner requires extra effort. Operating in digital health is and has been challenging, and ensuring compliance with AI tools adds a new dimension to the challenge. It’s no surprise that OpenAI’s announcement of ChatGPT Health, its consumer health “experience”, focuses on privacy and security. Sadly, the privacy and security protections built around the consumer-facing ChatGPT Health won’t apply to your company’s use of LLMs.


Here's what you need to build or buy to use LLMs with PHI compliantly.

Business Associates Agreements (BAAs)


If an LLM provider will receive, process, or store PHI, you need a BAA with them. This is non-negotiable under HIPAA. The BAA establishes liability for how they handle your data and requires them to implement appropriate safeguards.


Which providers offer BAAs:


Provider | BAA Available | How to get it
OpenAI API | Yes | Request from OpenAI
OpenAI Enterprise | Yes | Enterprise agreement
Anthropic API | Yes | Request from Anthropic
AWS Bedrock | Yes | Part of AWS BAA
Azure OpenAI | Yes | Enterprise agreement
Google Vertex AI | Yes | Enterprise agreement


The catch: if you're using multiple models (an OpenAI model for some tasks, Claude for others, a Bedrock model for something else), you need BAAs with each provider. This creates contract sprawl and increases the risk of accidentally routing PHI through an uncovered path.


Getting a BAA isn't always quick. Some providers offer self-service BAAs you can sign in minutes. Others require weeks or months of negotiation, especially for enterprise agreements. We've seen customers wait three months for a BAA with a major cloud provider. Plan ahead and don't assume you can start using a new provider immediately.

Audit logging


HIPAA's audit control standard (45 CFR 164.312(b)) requires mechanisms to record and examine activity in systems containing PHI. The requirement applies to the activity itself, not to every LLM call you make. If an LLM interaction handles PHI, you need to log it. If it doesn't touch patient data, HIPAA doesn't require a log entry.


For interactions that do involve PHI, log:

  • The prompt sent to the LLM

  • The response received

  • Timestamps

  • User or system that initiated the request

  • Which model was used


LLM providers don't do this logging for you. You get API responses, not audit trails. You need infrastructure that captures this data for PHI-touching requests.


Where this gets complicated:

  • Storage: These logs contain PHI. They must be encrypted at rest, access-controlled, and retained according to your retention policy.

  • Short-term vs. long-term: You need operational access (debugging, verification) and compliance access (audits, investigations). These have different retention requirements.

  • Log drains: For compliance, you often need to export logs to long-term storage (S3, a SIEM, etc.) with appropriate retention periods.

Encryption requirements


Encryption is usually the easiest requirement to meet:

  • In transit: TLS for all API calls. Most LLM providers enforce this by default, so you're likely already covered.

  • At rest: Your responsibility for logs, cached data, and any persisted prompts or responses. This is where teams sometimes miss requirements.


Key management matters here too. Who has access to encryption keys? How do you rotate them? AWS KMS or a similar managed service handles this well. If you're managing keys yourself, that's additional infrastructure to build and maintain.
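As a sketch of what at-rest encryption for LLM audit data can look like, here is a minimal example using the third-party `cryptography` package. This is illustrative only: in production the key would come from a managed service such as AWS KMS, never be generated inline or hard-coded.

```python
# Sketch: encrypting an audit log entry at rest before writing it anywhere.
# Assumes the third-party `cryptography` package is installed.
import json

from cryptography.fernet import Fernet

# In practice: fetched from your KMS, never generated or stored in code.
key = Fernet.generate_key()
fernet = Fernet(key)

entry = {"user_id": "clinician_123", "messages": "..."}  # contains PHI

# What gets written to disk or shipped to a log drain is ciphertext only.
ciphertext = fernet.encrypt(json.dumps(entry).encode())

# Only code holding the key can read the entry back.
plaintext = json.loads(fernet.decrypt(ciphertext))
```

The same pattern applies to any cached prompts or responses you persist: encrypt before write, and control who can reach the key.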

Data handling


Will your patient data be used to train models?


Major LLM providers have policies about this, but policies aren't contracts. Before sending PHI to an LLM, you should verify:

  1. Explicit opt-out in the BAA or data processing agreement. A blog post saying "we don't train on your data" isn't enforceable. The legal agreement needs to say it.

  2. Data retention terms. How long does the provider keep your prompts and responses? For what purposes? Abuse monitoring? Safety research?

  3. API vs. consumer product distinctions. ChatGPT the product has different data handling than the OpenAI API. Same for Claude.ai versus the Anthropic API. Make sure you're reading the right policy.


OpenAI's current policy says API data is retained for 30 days for abuse monitoring, then deleted, and is not used for training. That's their policy. What matters legally is what your BAA says.

PII/PHI De-identification


The safest PHI is PHI that never reaches the model in identifiable form.


De-identification means scrubbing sensitive data before it goes to the LLM. Names become [PATIENT_1], dates become [DATE_1], and Social Security numbers become tokens. The LLM processes the request without ever seeing the real identifiers.


HIPAA defines two de-identification standards (45 CFR 164.514):

  • Safe Harbor: Remove 18 specific identifiers (names, dates, geographic data, etc.)

  • Expert Determination: A qualified expert determines the risk of re-identification is very small


A common question: if you remove identifiers, send data to an AI system, maintain an internal mapping table, and then re-associate the output back to the individual, does the data still count as de-identified?


Under 45 CFR 164.514(c), HIPAA explicitly permits the use of a code or other means of record identification to allow re-identification by the covered entity or business associate, provided that the code is not derived from or related to the individual, the code is not otherwise capable of being translated to identify the individual, and the re-identification mechanism is not disclosed. If those conditions are met, the data disclosed to the AI system may qualify as de-identified under HIPAA, even though you retain a separate, protected re-identification key. If those conditions are not met, the data would be more accurately described as coded or pseudonymized and would remain identifiable PHI subject to HIPAA.


Building reliable de-identification is hard. You need:

  • NLP that accurately identifies PHI in unstructured text

  • Consistent tokenization so you can re-identify after responses

  • A clean architecture with short mapping lifecycles to reduce risk

  • Handling of edge cases (misspellings, nicknames, context-dependent identifiers)


Re-identification is hard too, and high-stakes. If the LLM responds with [PATIENT_1] should take [MEDICATION_1] twice daily, you need to restore the actual patient name and medication before showing it to a clinician. Obviously, you can’t afford to mess this up. If your re-identification algorithm mixes [PATIENT_1] and [PATIENT_2], your application could end up recommending [MEDICATION_1] to the wrong patient.
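A minimal sketch of the token-mapping approach described above. Real systems need NLP-based PHI detection; the regex here catches only a single hypothetical "Name: ..." pattern and is for illustration only.

```python
# Illustrative de-identification with a reversible token map.
import re


def deidentify(text: str):
    """Replace a hypothetical 'Name: First Last' pattern with tokens."""
    mapping = {}
    counter = {"PATIENT": 0}

    def replace(match):
        counter["PATIENT"] += 1
        token = f"[PATIENT_{counter['PATIENT']}]"
        mapping[token] = match.group(1)  # keep the real value server-side
        return f"Name: {token}"

    scrubbed = re.sub(r"Name: (\w+ \w+)", replace, text)
    return scrubbed, mapping


def reidentify(text: str, mapping: dict) -> str:
    """Restore original values in an LLM response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


scrubbed, mapping = deidentify("Name: Jane Doe, presents with chest pain.")
# The LLM only ever sees "[PATIENT_1]"; the mapping stays on your side.
restored = reidentify("[PATIENT_1] should follow up in two weeks.", mapping)
```

Keeping the mapping short-lived and access-controlled is what makes this architecture reduce risk rather than just relocate it.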

API Key management


LLM providers give you API keys, usually just one or a few. For a small team, this could be fine. For larger organizations, the limitations become clear. How do you scope keys by application, team, or environment? How do you rotate keys without breaking production? How do you track which key made which request for auditing and cost allocation? How do you revoke access for a specific use case without affecting others?
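One common answer to these questions is a layer of scoped, rotatable "virtual" keys in front of the single provider key. The sketch below is illustrative, not any particular product's API; the names (VirtualKey, KeyRegistry) are made up for this example.

```python
# Sketch: scoped virtual keys that can be resolved for audit attribution,
# revoked individually, and rotated without touching other scopes.
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class VirtualKey:
    scope: str  # e.g. "prod/summarizer" or "dev/team-a"
    token: str = field(default_factory=lambda: "vk-" + secrets.token_hex(16))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    revoked: bool = False


class KeyRegistry:
    def __init__(self):
        self._by_token = {}

    def issue(self, scope: str) -> VirtualKey:
        key = VirtualKey(scope=scope)
        self._by_token[key.token] = key
        return key

    def resolve(self, token: str) -> str:
        """Return the scope that made a request, for audit and cost allocation."""
        key = self._by_token.get(token)
        if key is None or key.revoked:
            raise PermissionError("unknown or revoked key")
        return key.scope

    def rotate(self, token: str) -> VirtualKey:
        """Revoke the old key and issue a replacement in the same scope."""
        old = self._by_token[token]
        old.revoked = True
        return self.issue(old.scope)


registry = KeyRegistry()
key = registry.issue("prod/summarizer")
scope = registry.resolve(key.token)
new_key = registry.rotate(key.token)  # old token now fails resolve()
```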


This isn't unique to HIPAA, but it becomes a compliance issue when you can't demonstrate who had access to systems processing PHI.

Model access controls


Different use cases have different risk profiles. A production feature summarizing patient records has different requirements than a developer tool for debugging.


You might want production scopes restricted to approved models only, development scopes with broader access for experimentation, and internal tools with different limits than customer-facing features.


Without a proxy layer, enforcing this requires organizational discipline and code reviews. With a proxy, you can enforce it at the infrastructure level.
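At the proxy layer, per-scope model allowlists can be enforced with something as simple as the following; the scope names and model IDs are illustrative.

```python
# Sketch: per-scope model allowlists enforced before a request is forwarded.
ALLOWED_MODELS = {
    "production": {"gpt-5"},  # approved models only
    "development": {"gpt-5", "gpt-5-mini"},  # broader access for experimentation
}


def check_model_access(scope: str, model: str) -> None:
    """Raise before the request ever reaches a provider."""
    allowed = ALLOWED_MODELS.get(scope, set())
    if model not in allowed:
        raise PermissionError(f"scope {scope!r} may not use model {model!r}")


check_model_access("development", "gpt-5-mini")  # permitted
# check_model_access("production", "gpt-5-mini")  # would raise PermissionError
```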

Compliance verification and observability


Controls are only useful if you can prove they're working. During an audit or customer security review, you need to demonstrate that de-identification is actually scrubbing what it should, that logs are being captured and stored appropriately, and that access controls are enforced.


This requires the ability to inspect actual requests and responses. Not summaries or aggregations, but the real data flowing through your systems. You need short-term access for operational needs and long-term retention for compliance.

Cost controls and budget management


Cost controls aren't a HIPAA requirement, but they're worth mentioning because LLM costs can spiral quickly.

Useful controls include visibility into usage by application, team, and use case, budget limits that stop requests before costs get out of hand, and usage tracking for cost allocation. A runaway loop burning through $10,000 in API calls overnight is a problem you'd rather catch early.
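A per-scope budget gate can be sketched in a few lines; the figures and scope names here are illustrative.

```python
# Sketch: a budget gate that blocks requests once a scope's limit is hit.
class BudgetGuard:
    def __init__(self, limits_usd: dict):
        self.limits = limits_usd
        self.spent = {scope: 0.0 for scope in limits_usd}

    def record(self, scope: str, cost_usd: float) -> None:
        """Accumulate actual spend as requests complete."""
        self.spent[scope] += cost_usd

    def check(self, scope: str) -> None:
        """Call before forwarding a request; raise once the budget is exhausted."""
        if self.spent[scope] >= self.limits[scope]:
            raise RuntimeError(f"budget exhausted for {scope!r}")


guard = BudgetGuard({"prod/summarizer": 500.0})
guard.record("prod/summarizer", 499.0)
guard.check("prod/summarizer")  # still under budget, no exception
guard.record("prod/summarizer", 2.0)
# guard.check("prod/summarizer")  # now raises RuntimeError
```

In practice you would also alert as spend approaches the limit, rather than only failing hard at the ceiling.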


Building HIPAA-Compliant AI: The DIY approach


If you want to use LLMs with PHI and you're building the compliance infrastructure yourself, here's what that actually entails:

The components you need


  1. Proxy layer to intercept all LLM traffic and apply controls consistently

  2. Logging infrastructure to capture every prompt and response with metadata

  3. Encrypted log storage because logs contain PHI

  4. Log drain to long-term storage for retention compliance

  5. Short-term log access for debugging and verification

  6. Key management system to scope keys by application, team, and use case

  7. Model access controls to enforce which models each key can use

  8. De-identification pipeline to scrub PHI before it reaches the LLM

  9. Re-identification logic to restore original values in responses

  10. Budget controls with usage tracking and limits per scope

  11. Alerting when usage approaches limits

  12. Protocol translation if you want to switch providers without code changes

  13. Capacity management for rate limits and failover


That's 13 systems before you write application code.


True, some of these systems are easier to implement than others, and in some cases you can skip a few: the proxy layer, if you're willing to commit to and build integrations for each model you use; the de-identification pipeline and re-identification logic, if your control implementation meets HIPAA compliance requirements and you elect not to de-identify.


Either way, there’s significant work in meeting HIPAA compliance requirements, and model providers aren’t doing that work for you, even if you sign a BAA with them.

Example: DIY audit logging


Here's what a basic logging wrapper looks like:


import json
import logging
from datetime import datetime, timezone
from openai import OpenAI

class HIPAALoggingWrapper:
    def __init__(self, client, logger):
        self.client = client
        self.logger = logger

    def chat_completion(self, user_id: str, **kwargs):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "type": "llm_request",
            "user_id": user_id,  # For audit attribution
            "model": kwargs.get("model"),
            "messages": kwargs.get("messages"),  # Contains PHI - encrypt at rest
        }

        response = self.client.chat.completions.create(**kwargs)

        log_entry["response"] = response.choices[0].message.content
        log_entry["usage"] = response.usage.model_dump()
        log_entry["request_id"] = response.id

        # This logger MUST write to encrypted, access-controlled storage
        self.logger.info(json.dumps(log_entry))

        return response

# Usage
client = OpenAI()
hipaa_client = HIPAALoggingWrapper(client, audit_logger)
response = hipaa_client.chat_completion(
    user_id="clinician_123",
    model="gpt-5",
    messages=[{"role": "user", "content": prompt}]
)


This gives you basic audit logging. What it doesn't give you: encrypted storage (you need to configure your logging backend), retention policy enforcement, UI access for compliance verification, log drains to long-term storage, consistent logging across multiple providers (you'd need separate wrappers for each), key management with scoping, model access controls, de-identification, or budget controls.


That's a partial implementation of one capability. The rest of the list is separate work.


This is why teams look for managed solutions.


Common mistakes we see


After working with hundreds of healthcare companies on compliance, these are the mistakes that come up repeatedly:


Logging without encryption. We've seen teams diligently log every prompt and response, then store those logs in plaintext Elasticsearch or an unencrypted S3 bucket. The logs contain PHI. If they're not encrypted at rest and access-controlled, you've created a compliance gap that's often worse than not logging at all.


Adding providers without BAAs. One company had a BAA with Anthropic but not OpenAI. A developer tried OpenAI for a task where it performed better, not realizing the BAA requirement. They found out when the developer mentioned it in standup. The fix was straightforward, but it required disclosure to their compliance team and documentation of the incident.


Confusing consumer products with APIs. A clinician used ChatGPT Plus for a "quick test" with real patient data because "OpenAI is HIPAA compliant." It's not. The API with a BAA can be, assuming proper implementation. ChatGPT Plus is not. By the time anyone noticed, PHI had been sent to a consumer product with no BAA coverage.


Discovering logging gaps during an audit. A company's logging infrastructure had a silent failure for three weeks. They discovered it during a customer security review when they couldn't produce logs for that period. HIPAA requires you to be able to examine activity. If your logging had gaps or your retention policy wasn't enforced, you have a problem that's hard to fix retroactively.


Relying on policy instead of contract. A SaaS vendor's website says something like "we don't train on your data" or "your data is encrypted." Great. What does your BAA say? We've seen this pattern repeatedly with cloud services: policies can change, marketing pages get updated, but contracts are enforceable. If it's not in the BAA, don't assume it's guaranteed.

The gateway approach


Instead of building compliance infrastructure yourself, it is sometimes possible to route LLM traffic through a managed gateway that can handle controls centrally. That’s where Aptible’s AI Gateway came from: many of our customers asked us to help them ensure HIPAA-compliant AI API calls, the same way we help them ensure HIPAA compliant cloud infrastructure.

Why teams use AI gateways


All LLM requests flow through a single control layer. This provides enormous benefits in visibility and cost control, and there are several production-grade gateways available, such as LiteLLM and Portkey, which we have recommended to customers in cases where HIPAA compliance is not required.


What no other gateway provides, and the entire reason we built Aptible AI Gateway, is meeting HIPAA compliance requirements out of the box.


Aptible AI Gateway gives you:

  • One BAA instead of one per provider

  • Consistent logging regardless of which model you're using

  • Centralized controls for key management, model access, de-identification

  • Model switching without rebuilding compliance infrastructure


The gateway is not intended to replace your application code. You still call LLMs the same way; you only change the base URL for your requests, so calls route through infrastructure that automatically applies controls such as logging, key management, model access, and cost limits.
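For illustration, the base-URL swap amounts to the following; the gateway URL here is a hypothetical placeholder, not a real endpoint.

```python
# Sketch: the only application-side change for a gateway is the base URL.
import os

DIRECT_BASE = "https://api.openai.com/v1"
# Hypothetical placeholder; your gateway's docs give the real endpoint.
GATEWAY_BASE = os.environ.get("LLM_GATEWAY_URL", "https://gateway.example.com/v1")


def chat_completions_url(base_url: str) -> str:
    """Same request path either way; only the host changes."""
    return base_url.rstrip("/") + "/chat/completions"
```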

Comparing AI gateways for HIPAA


As mentioned above, LiteLLM and Portkey are great gateways when HIPAA compliance is not required. Here's how the options compare on key HIPAA capabilities:


Capability | LiteLLM | Portkey | Aptible AI Gateway
BAA available | No (self-hosted) | Enterprise tier | Yes, standard
HITRUST certified | No | No | Yes
Audit logging with PHI encryption | DIY | Requires config | Yes
Log drain to long-term storage | DIY | Yes | Yes
Key scoping by app/team | Limited | Yes | Yes
Model access controls | Limited | Yes | Yes
PII/PHI de-identification | No | Enterprise tier | Coming soon
Re-identification | No | No | Coming soon
Budget controls | Basic | Yes | Coming soon
Built for digital health startups | No | No | Yes
Deployment model | Self-hosted | SaaS/Hybrid/Air-gapped | Managed


LiteLLM is open-source and self-hosted. It handles routing and basic logging, but you're responsible for the HIPAA infrastructure around it. No BAA because there's no vendor relationship. Good for teams with existing compliant infrastructure who just need routing.


Portkey offers HIPAA compliance on their Enterprise tier, including BAA signing and PII anonymization. The Enterprise plan supports SaaS, hybrid, or air-gapped deployment. You'll need to negotiate Enterprise pricing and may need to assemble additional components for full HIPAA coverage.


Aptible AI Gateway is purpose-built for healthcare. Aptible has been helping digital health startups meet compliance requirements since 2013, and is HITRUST R2 certified and SOC 2 Type 2 audited. The AI Gateway gives you all 13 systems out of the box: one BAA covers all supported models, de-identification and re-identification are built in, and audit logging is automatic. No enterprise sales process required.


Learn more about Aptible AI Gateway →

Getting started


You can build HIPAA-compliant AI features without building all the compliance infrastructure from scratch. The requirements are real: BAAs, audit logging, encryption, access controls. How you implement them is up to you.

You can certainly start with a single model provider, get a BAA in place, and build compliance infrastructure such as logging from day one.


But, you might consider using a gateway if you want to enable your engineers to work with multiple models while ensuring HIPAA compliance controls for each, or if you don’t want to manage BAAs, logging, access controls and all the rest on your own.


Aptible AI Gateway gives you the compliance infrastructure out of the box. Learn more or talk to an engineer.

Frequently asked questions

Which AI is HIPAA compliant?

No AI is inherently HIPAA compliant. Compliance depends on implementation. The major LLM providers (OpenAI Enterprise and API, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI) offer BAAs that enable HIPAA-compliant use. You're still responsible for audit logging, encryption, access controls, and other implementation requirements.

Is ChatGPT HIPAA compliant?

It depends which product. ChatGPT Free, Plus, and Team do not offer BAAs and should not be used with PHI. ChatGPT Enterprise offers BAA availability, no training on your data, SSO/SCIM, and encryption at rest. The OpenAI API is a separate product with its own BAA availability. OpenAI's Codex (their coding agent) inherits the compliance posture of your ChatGPT plan. The bottom line: "Is ChatGPT HIPAA compliant?" depends on which product, with what configuration, and what you build around it.

Is Claude HIPAA compliant?

Same structure as OpenAI. Claude.ai (the consumer product) does not offer a BAA. The Claude API offers BAA availability for customers who request it. Claude for Enterprise includes BAA availability and enhanced security. Claude Code (Anthropic's CLI for developers) uses the API, so it's covered under your API agreement if you have a BAA. Anthropic's policy states API data is not used for training. Data is retained briefly for trust and safety, then deleted.

Do I need a BAA with OpenAI to use their models with PHI?

Yes. If you're sending PHI to OpenAI's API, they're acting as a business associate. HIPAA requires a BAA before sharing PHI with a business associate. OpenAI offers BAAs for API and Enterprise customers.

How long does it take to get a BAA?

It varies widely. Some providers offer self-service BAAs that you can review and sign in minutes. Others require weeks or months of back-and-forth, especially for enterprise-tier agreements or if you need custom terms. We've heard from customers who waited three months to finalize a BAA with a major provider. If you're planning to use a new LLM provider with PHI, start the BAA process early. Don't assume you can sign up and start sending data the same day.

What about open-source LLMs like Llama?

Self-hosting open-source models eliminates the need for a provider BAA because you're the only party handling the data. But you're responsible for all security controls on your hosting infrastructure: encryption, access controls, audit logging, physical security if applicable. Self-hosting doesn't simplify compliance. It shifts the entire burden to you.

How do I audit AI usage for HIPAA?

You need to log activity involving PHI. When LLM calls handle patient data, log the prompt, response, timestamps, user/system attribution, and which model was used. These logs must be encrypted at rest, access-controlled, and retained according to your retention policy. Most LLM providers don't create these logs for you, so you need to build or buy logging infrastructure.

Is de-identification required for HIPAA-compliant AI?

No. De-identification is not required if the AI provider is acting as a Business Associate and a valid BAA is in place. In that case, PHI may be processed in compliance with HIPAA safeguards.


De-identification is only required if you do not have a BAA and want to share data without it being treated as PHI. To qualify, the data must meet the HIPAA de-identification standard under 45 CFR 164.514, either by removing all required identifiers under the Safe Harbor method or through a documented Expert Determination. If those standards are not met, the data is still considered PHI under HIPAA.


That said, even with a BAA in place, de-identification can reduce your risk exposure and simplify your compliance posture.

What's the difference between HIPAA compliant and HIPAA eligible?

"HIPAA eligible" typically means a vendor will sign a BAA. "HIPAA compliant" implies the full implementation meets HIPAA requirements. A product being HIPAA eligible doesn't make your use of it compliant. You still need to implement required controls like logging, encryption, and access controls. The term "HIPAA certified" is misleading because HIPAA doesn't certify products. For more on HIPAA basics, see our HIPAA Compliance Guide.


By Henry Hund, Chief Operating Officer

and Mat Steinlin, Head of Information Security



Last updated: March 2, 2026


Using AI in healthcare isn't hard. Making it compliant is.


If you're building digital health products that use LLMs with patient data, you've probably discovered that calling the OpenAI API is the easy part. The hard part is everything around it: BAAs, audit logging, encrypted storage, key management, de-identification, and the ability to prove to auditors that your controls work.


This guide covers what "HIPAA-compliant AI" actually means, the technical requirements for using LLMs with PHI, which tools offer BAA coverage, and the infrastructure decisions you'll face.


We wrote this for developers and technical leads at healthcare companies who want to use AI without creating a compliance problem. We'll include regulatory citations where they matter, but the focus is on what you need to do.

But first, why listen to us?


Aptible has helped thousands of digital health startups keep their cloud infrastructure HIPAA compliant since 2013. When we started building an AI Gateway, we were surprised that no great developer’s guide about HIPAA-compliant AI existed, so we set out to fix that. The guide is based on our research about and practical experience with how to use PHI with LLMs in a safe a secure manner.


Mat and Henry authored this piece. As a former CISO at digital health startups and current Head of Information Security at Aptible, Mat has been through more compliance audits than he can count. Henry has worked with countless customers over the past decade at Aptible ensuring their infrastructure met HIPAA compliance requirements. Between us, we've probably navigated every HIPAA, HITRUST, SOC 2, or PCI complexity or edge case imaginable.

What "HIPAA-Compliant AI" actually means


No AI tool is "HIPAA certified." HIPAA is a federal law, implemented through regulations, and it does not certify or approve products. It sets requirements for how covered entities and business associates must safeguard protected health information (PHI).


Vendors may obtain independent security attestations such as SOC 2 reports or HITRUST certification to demonstrate the effectiveness of their controls, but those are assurance frameworks, not government certifications and not substitutes for HIPAA compliance. Compliance ultimately depends on how the organization implements and governs the system, including having appropriate safeguards and, where required, a valid BAA in place.


When AI touches PHI, HIPAA requires safeguards across three areas: administrative, physical, and technical. For LLM usage specifically, this translates to concrete technical requirements:


Core requirements:

  • Business Associate Agreement (BAA): If an LLM provider processes PHI on your behalf, they're a business associate under HIPAA. You need a signed BAA before sending any PHI to their API. When a BAA is in place, the provider typically implements specific technical safeguards, such as zero data retention, to align with the legal obligations defined in the agreement. This isn't just paperwork; it changes how the provider handles your data.

  • Audit logging: Any activity involving PHI must be logged. When your LLM calls handle patient data, that means logging prompts, responses, timestamps, and who made the request. This creates the audit trail required under 45 CFR 164.312(b).

  • Encryption: Data must be encrypted in transit (TLS) and at rest. This applies to your logs, any cached data, and anything persisted.

  • No training on PHI: You need assurance that your data won't be used to train models. This should be explicit in the BAA or provider agreement, not just a policy statement.


Operational requirements that matter in practice:

  • Key management: How do you manage API keys across teams, applications, and environments? How do you rotate them? How do you know which key made which request?

  • Model access controls: Can you restrict which models developers can use? Different risk profiles for dev vs. production?

  • De-identification: Are you scrubbing PHI before it reaches the LLM? Do you need to re-identify data in responses?

  • Cost controls: LLM costs can spiral. Do you have visibility into usage? Budget limits that actually stop requests?

  • Observability: Can you inspect actual requests to verify your controls are working?


The distinction between "HIPAA compliant" and "HIPAA eligible" matters. A provider being "HIPAA eligible" means they'll sign a BAA. It doesn't mean using their API automatically makes your implementation compliant. That's still on you.

The regulatory gap around AI


Using LLMs with PHI in a compliant manner does require some extra effort. It’s no surprise: operating in digital health is and has been challenging, and layering in ensuring compliance with AI tools adds a new dimension to the challenge. It’s no surprise that OpenAI’s announcement of ChatGPT Health, its consumer health “experience”, focuses on privacy and security. Sadly, the privacy and security protections built around the consumer-facing ChatGPT Health won’t apply to your company’s use of LLMs.


Here's what you need to build or buy to use LLMs with PHI compliantly.

Business Associates Agreements (BAAs)


If an LLM provider will receive, process, or store PHI, you need a BAA with them. This is non-negotiable under HIPAA. The BAA establishes liability for how they handle your data and requires them to implement appropriate safeguards.


Which providers offer BAAs:


| Provider | BAA available | How to get it |
| --- | --- | --- |
| OpenAI API | Yes | Request from OpenAI |
| OpenAI Enterprise | Yes | Enterprise agreement |
| Anthropic API | Yes | Request from Anthropic |
| AWS Bedrock | Yes | Part of AWS BAA |
| Azure OpenAI | Yes | Enterprise agreement |
| Google Vertex AI | Yes | Enterprise agreement |


The catch: if you're using multiple models (an OpenAI model for some tasks, Claude for others, a Bedrock model for something else), you need BAAs with each provider. This creates contract sprawl and increases the risk of accidentally routing PHI through an uncovered path.


Getting a BAA isn't always quick. Some providers offer self-service BAAs you can sign in minutes. Others require weeks or months of negotiation, especially for enterprise agreements. We've seen customers wait three months for a BAA with a major cloud provider. Plan ahead and don't assume you can start using a new provider immediately.

Audit logging


HIPAA's audit control standard (45 CFR 164.312(b)) requires mechanisms to record and examine activity in systems containing PHI. The requirement applies to the activity itself, not to every LLM call you make. If an LLM interaction handles PHI, you need to log it. If it doesn't touch patient data, HIPAA doesn't require a log entry.


For interactions that do involve PHI, log:

  • The prompt sent to the LLM

  • The response received

  • Timestamps

  • User or system that initiated the request

  • Which model was used


LLM providers don't do this logging for you. You get API responses, not audit trails. You need infrastructure that captures this data for PHI-touching requests.


Where this gets complicated:

  • Storage: These logs contain PHI. They must be encrypted at rest, access-controlled, and retained according to your retention policy.

  • Short-term vs. long-term: You need operational access (debugging, verification) and compliance access (audits, investigations). These have different retention requirements.

  • Log drains: For compliance, you often need to export logs to long-term storage (S3, a SIEM, etc.) with appropriate retention periods.
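
To make the short-term vs. long-term split concrete, here's a minimal retention sketch. The field names and the 30-day operational window are our assumptions for illustration, not HIPAA-mandated values, and the in-memory lists stand in for an encrypted operational store and an encrypted archive (S3, a SIEM, etc.):

```python
from datetime import datetime, timedelta, timezone

OPERATIONAL_WINDOW = timedelta(days=30)  # assumed policy, not a HIPAA-mandated value

def drain_old_entries(hot_log, archive, now=None):
    """Move audit entries past the operational window into long-term storage.

    `hot_log` and `archive` are lists of dicts here; in practice they'd be an
    encrypted operational store and an encrypted long-term archive.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - OPERATIONAL_WINDOW
    keep, old = [], []
    for entry in hot_log:
        ts = datetime.fromisoformat(entry["timestamp"])
        (old if ts < cutoff else keep).append(entry)
    archive.extend(old)   # long-term: audits and investigations
    hot_log[:] = keep     # short-term: debugging and verification
    return len(old)
```

Both stores hold PHI, so the encryption and access-control requirements apply to each.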

Encryption requirements


Encryption is usually the easiest requirement to meet:

  • In transit: TLS for all API calls. Most LLM providers enforce this by default, so you're likely already covered.

  • At rest: Your responsibility for logs, cached data, and any persisted prompts or responses. This is where teams sometimes miss requirements.


Key management matters here too. Who has access to encryption keys? How do you rotate them? AWS KMS or a similar managed service handles this well. If you're managing keys yourself, that's additional infrastructure to build and maintain.
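
If you do manage keys yourself, even the bookkeeping is real work. Here's a toy sketch of rotation with key-ID tagging, so data encrypted under older keys stays readable after a rotation; this shows the structure only, not a production key store:

```python
import secrets

class KeyRing:
    """Toy key-rotation bookkeeping: each record gets tagged with the key ID
    that encrypted it, so rotation doesn't orphan older data."""

    def __init__(self):
        self.keys = {}         # key_id -> key material
        self.current_id = None
        self.rotate()

    def rotate(self):
        # New writes use the new key; old keys are kept for decryption only.
        key_id = f"key-{len(self.keys) + 1}"
        self.keys[key_id] = secrets.token_bytes(32)  # 256-bit key
        self.current_id = key_id
        return key_id

    def key_for(self, key_id):
        return self.keys[key_id]
```

A managed service like AWS KMS handles exactly this lifecycle for you, plus access policies and audit trails on key usage.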

Data handling


Will your patient data be used to train models?


Major LLM providers have policies about this, but policies aren't contracts. Before sending PHI to an LLM, you should verify:

  1. Explicit opt-out in the BAA or data processing agreement. A blog post saying "we don't train on your data" isn't enforceable. The legal agreement needs to say it.

  2. Data retention terms. How long does the provider keep your prompts and responses? For what purposes? Abuse monitoring? Safety research?

  3. API vs. consumer product distinctions. ChatGPT the product has different data handling than the OpenAI API. Same for Claude.ai versus the Anthropic API. Make sure you're reading the right policy.


OpenAI's current policy says API data is retained for 30 days for abuse monitoring, then deleted, and is not used for training. That's their policy. What matters legally is what your BAA says.

PII/PHI De-identification


The safest PHI is PHI that never reaches the model in identifiable form.


De-identification means scrubbing sensitive data before it goes to the LLM. Names become [PATIENT_1], dates become [DATE_1], and Social Security numbers become tokens. The LLM processes the request without ever seeing the real identifiers.


HIPAA defines two de-identification standards (45 CFR 164.514):

  • Safe Harbor: Remove 18 specific identifiers (names, dates, geographic data, etc.)

  • Expert Determination: A qualified expert determines the risk of re-identification is very small


A common question: if you remove identifiers, send data to an AI system, maintain an internal mapping table, and then re-associate the output back to the individual, does the data still count as de-identified?


Under 45 CFR 164.514(c), HIPAA explicitly permits the use of a code or other means of record identification to allow re-identification by the covered entity or business associate, provided that the code is not derived from or related to the individual, the code is not otherwise capable of being translated to identify the individual, and the re-identification mechanism is not disclosed. If those conditions are met, the data disclosed to the AI system may qualify as de-identified under HIPAA, even though you retain a separate, protected re-identification key. If those conditions are not met, the data would be more accurately described as coded or pseudonymized and would remain identifiable PHI subject to HIPAA.


Building reliable de-identification is hard. You need:

  • NLP that accurately identifies PHI in unstructured text

  • Consistent tokenization so you can re-identify after responses

  • A clean architecture with short mapping lifecycles to reduce risk

  • Handling of edge cases (misspellings, nicknames, context-dependent identifiers)


Re-identification is hard too, and high-stakes. If the LLM responds with "[PATIENT_1] should take [MEDICATION_1] twice daily," you need to restore the actual patient name and medication before showing it to a clinician. You can't afford to mess this up: if your re-identification logic mixes up [PATIENT_1] and [PATIENT_2], your application could end up recommending [MEDICATION_1] to the wrong patient.
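
To make the mapping concrete, here's a toy tokenize/re-identify round trip. Real de-identification requires NLP over unstructured text, not a hard-coded name list; the names and token format here are illustrative assumptions:

```python
import re

def deidentify(text, known_names):
    """Replace known patient names with tokens; return scrubbed text + mapping."""
    mapping = {}
    for i, name in enumerate(known_names, start=1):
        token = f"[PATIENT_{i}]"
        if re.search(re.escape(name), text):
            text = re.sub(re.escape(name), token, text)
            mapping[token] = name  # mapping table stays on YOUR side, never sent out
    return text, mapping

def reidentify(text, mapping):
    """Restore original identifiers in the LLM's response."""
    for token, name in mapping.items():
        text = text.replace(token, name)
    return text

scrubbed, mapping = deidentify("Jane Doe reports dizziness.", ["Jane Doe"])
# scrubbed == "[PATIENT_1] reports dizziness."
restored = reidentify("[PATIENT_1] should follow up in two weeks.", mapping)
# restored == "Jane Doe should follow up in two weeks."
```

Note the mapping table itself is the re-identification key; under 45 CFR 164.514(c) it must be protected and never disclosed alongside the data.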

API Key management


LLM providers give you API keys, usually just one or a few. For a small team, this could be fine. For larger organizations, the limitations become clear. How do you scope keys by application, team, or environment? How do you rotate keys without breaking production? How do you track which key made which request for auditing and cost allocation? How do you revoke access for a specific use case without affecting others?


This isn't unique to HIPAA, but it becomes a compliance issue when you can't demonstrate who had access to systems processing PHI.
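
One common pattern, whether you build it yourself or a gateway provides it, is a layer of virtual keys scoped to an app, team, and environment, mapped internally to the real provider key. A minimal sketch; the scope fields and key format are illustrative:

```python
import secrets

class KeyRegistry:
    """Virtual keys scoped by app/team/env, so one use case can be
    attributed or revoked without touching the shared provider key."""

    def __init__(self):
        self._keys = {}  # virtual key -> scope metadata

    def issue(self, app, team, env):
        vkey = f"vk-{secrets.token_hex(8)}"
        self._keys[vkey] = {"app": app, "team": team, "env": env, "active": True}
        return vkey

    def attribute(self, vkey):
        # Answers "which key made which request?" for audits and cost allocation.
        return self._keys.get(vkey)

    def revoke(self, vkey):
        self._keys[vkey]["active"] = False
```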

Model access controls


Different use cases have different risk profiles. A production feature summarizing patient records has different requirements than a developer tool for debugging.


You might want production scopes restricted to approved models only, development scopes with broader access for experimentation, and internal tools with different limits than customer-facing features.


Without a proxy layer, enforcing this requires organizational discipline and code reviews. With a proxy, you can enforce it at the infrastructure level.
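
The enforcement check itself is simple; the value of a proxy is that it runs in one place for every request. A sketch, with illustrative scope names and model IDs:

```python
# Illustrative allowlists; the scope names and model IDs are assumptions.
MODEL_ALLOWLISTS = {
    "prod": {"gpt-5", "claude-sonnet-4-5"},
    "dev": {"gpt-5", "claude-sonnet-4-5", "experimental-model"},
}

def enforce_model_access(scope, model):
    """Reject any request whose scope isn't approved for the requested model."""
    allowed = MODEL_ALLOWLISTS.get(scope, set())
    if model not in allowed:
        raise PermissionError(f"scope {scope!r} may not use model {model!r}")
    return True
```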

Compliance verification and observability


Controls are only useful if you can prove they're working. During an audit or customer security review, you need to demonstrate that de-identification is actually scrubbing what it should, that logs are being captured and stored appropriately, and that access controls are enforced.


This requires the ability to inspect actual requests and responses. Not summaries or aggregations, but the real data flowing through your systems. You need short-term access for operational needs and long-term retention for compliance.

Cost controls and budget management


Cost controls aren't a HIPAA requirement, but they're worth mentioning because LLM costs can spiral quickly.


Useful controls include visibility into usage by application, team, and use case; budget limits that stop requests before costs get out of hand; and usage tracking for cost allocation. A runaway loop burning through $10,000 in API calls overnight is a problem you'd rather catch early.
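
A budget limit that actually stops requests can be as simple as a pre-flight check per scope. This sketch tracks estimated spend in memory; a real implementation would persist it and handle concurrency. The scope name and limit are illustrative:

```python
class BudgetGuard:
    """Per-scope spend tracking with a hard pre-flight limit."""

    def __init__(self, limits):
        self.limits = limits                    # scope -> monthly USD limit
        self.spend = {s: 0.0 for s in limits}

    def check(self, scope, estimated_cost):
        # Reject BEFORE the call, so a runaway loop stops at the limit.
        if self.spend[scope] + estimated_cost > self.limits[scope]:
            raise RuntimeError(f"budget exceeded for {scope!r}")

    def record(self, scope, actual_cost):
        self.spend[scope] += actual_cost

guard = BudgetGuard({"intake-summarizer": 500.0})
guard.check("intake-summarizer", 0.02)   # under budget, proceed
guard.record("intake-summarizer", 0.02)
```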


Building HIPAA-Compliant AI: The DIY approach


If you want to use LLMs with PHI and you're building the compliance infrastructure yourself, here's what that actually entails:

The components you need


  1. Proxy layer to intercept all LLM traffic and apply controls consistently

  2. Logging infrastructure to capture every prompt and response with metadata

  3. Encrypted log storage because logs contain PHI

  4. Log drain to long-term storage for retention compliance

  5. Short-term log access for debugging and verification

  6. Key management system to scope keys by application, team, and use case

  7. Model access controls to enforce which models each key can use

  8. De-identification pipeline to scrub PHI before it reaches the LLM

  9. Re-identification logic to restore original values in responses

  10. Budget controls with usage tracking and limits per scope

  11. Alerting when usage approaches limits

  12. Protocol translation if you want to switch providers without code changes

  13. Capacity management for rate limits and failover


That's 13 systems before you write application code.


True, some of these systems are easier to implement than others, and in some cases you can skip a few: the proxy layer, if you're willing to commit to and build integrations for each model you use; the de-identification pipeline and re-identification logic, if your other controls meet HIPAA compliance requirements and you elect not to de-identify.


Either way, there's significant work in meeting HIPAA compliance requirements, and model providers aren't doing that work for you, even if you sign a BAA with them.

Example: DIY audit logging


Here's what a basic logging wrapper looks like:


import json
import logging
from datetime import datetime, timezone

from openai import OpenAI

audit_logger = logging.getLogger("hipaa_audit")

class HIPAALoggingWrapper:
    def __init__(self, client, logger):
        self.client = client
        self.logger = logger

    def chat_completion(self, user_id: str, **kwargs):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "type": "llm_request",
            "user_id": user_id,  # For audit attribution
            "model": kwargs.get("model"),
            "messages": kwargs.get("messages"),  # Contains PHI - encrypt at rest
        }

        response = self.client.chat.completions.create(**kwargs)

        log_entry["response"] = response.choices[0].message.content
        log_entry["usage"] = response.usage.model_dump()
        log_entry["request_id"] = response.id

        # This logger MUST write to encrypted, access-controlled storage
        self.logger.info(json.dumps(log_entry))

        return response

# Usage
client = OpenAI()
hipaa_client = HIPAALoggingWrapper(client, audit_logger)
response = hipaa_client.chat_completion(
    user_id="clinician_123",
    model="gpt-5",
    messages=[{"role": "user", "content": prompt}]
)


This gives you basic audit logging. What it doesn't give you: encrypted storage (you need to configure your logging backend), retention policy enforcement, UI access for compliance verification, log drains to long-term storage, consistent logging across multiple providers (you'd need separate wrappers for each), key management with scoping, model access controls, de-identification, or budget controls.


That's a partial implementation of one capability. The rest of the list is separate work.


This is why teams look for managed solutions.


Common mistakes we see


After working with hundreds of healthcare companies on compliance, these are the mistakes that come up repeatedly:


Logging without encryption. We've seen teams diligently log every prompt and response, then store those logs in plaintext Elasticsearch or an unencrypted S3 bucket. The logs contain PHI. If they're not encrypted at rest and access-controlled, you've created a compliance gap that's often worse than not logging at all.


Adding providers without BAAs. One company had a BAA with Anthropic but not OpenAI. A developer tried OpenAI for a task where it performed better, not realizing the BAA requirement. They found out when the developer mentioned it in standup. The fix was straightforward, but it required disclosure to their compliance team and documentation of the incident.


Confusing consumer products with APIs. A clinician used ChatGPT Plus for a "quick test" with real patient data because "OpenAI is HIPAA compliant." It's not. The API with a BAA can be, assuming proper implementation. ChatGPT Plus is not. By the time anyone noticed, PHI had been sent to a consumer product with no BAA coverage.


Discovering logging gaps during an audit. A company's logging infrastructure had a silent failure for three weeks. They discovered it during a customer security review when they couldn't produce logs for that period. HIPAA requires you to be able to examine activity. If your logging had gaps or your retention policy wasn't enforced, you have a problem that's hard to fix retroactively.


Relying on policy instead of contract. A SaaS vendor's website says something like "we don't train on your data" or "your data is encrypted." Great. What does your BAA say? We've seen this pattern repeatedly with cloud services: policies can change, marketing pages get updated, but contracts are enforceable. If it's not in the BAA, don't assume it's guaranteed.

The gateway approach


Instead of building compliance infrastructure yourself, you can often route LLM traffic through a managed gateway that handles controls centrally. That's where Aptible's AI Gateway came from: many of our customers asked us to help them ensure HIPAA-compliant AI API calls, the same way we help them ensure HIPAA-compliant cloud infrastructure.

Why teams use AI gateways


All LLM requests flow through a single control layer. This provides enormous benefits in visibility and cost control, and there are several production-grade gateways available, such as LiteLLM and Portkey, which we have recommended to customers in cases where HIPAA compliance is not required.


What no other gateway provides, and the entire reason we built the Aptible AI Gateway, is HIPAA compliance requirements met out of the box.


Aptible AI Gateway gives you:

  • One BAA instead of one per provider

  • Consistent logging regardless of which model you're using

  • Centralized controls for key management, model access, de-identification

  • Model switching without rebuilding compliance infrastructure


The gateway is not intended to replace your application code. You still call LLMs the same way; you just change the base URL for your requests, so calls route through infrastructure that automatically applies controls such as logging, key management, model access, and cost limits.
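
In practice, the change looks like pointing the same OpenAI-style request at the gateway host. A stdlib sketch of building such a request; the gateway URL and API key here are hypothetical placeholders:

```python
import json
import urllib.request

GATEWAY_BASE = "https://ai-gateway.example.internal/v1"  # hypothetical gateway endpoint

def build_chat_request(base_url, api_key, model, messages):
    # Identical payload and headers to a direct provider call; only the host differs.
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    GATEWAY_BASE, "scoped-key-abc123", "gpt-5",
    [{"role": "user", "content": "Summarize this visit note."}],
)
```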

Comparing AI gateways for HIPAA


As mentioned above, LiteLLM and Portkey are great gateways when HIPAA compliance is not required. Here's how the options compare on key HIPAA capabilities:


| Capability | LiteLLM | Portkey | Aptible AI Gateway |
| --- | --- | --- | --- |
| BAA available | No (self-hosted) | Enterprise tier | Yes, standard |
| HITRUST certified | No | No | Yes |
| Audit logging with PHI encryption | DIY | Requires config | Yes |
| Log drain to long-term storage | DIY | Yes | Yes |
| Key scoping by app/team | Limited | Yes | Yes |
| Model access controls | Limited | Yes | Yes |
| PII/PHI de-identification | No | Enterprise tier | Coming soon |
| Re-identification | No | No | Coming soon |
| Budget controls | Basic | Yes | Coming soon |
| Built for digital health startups | No | No | Yes |
| Deployment model | Self-hosted | SaaS/Hybrid/Air-gapped | Managed |


LiteLLM is open-source and self-hosted. It handles routing and basic logging, but you're responsible for the HIPAA infrastructure around it. No BAA because there's no vendor relationship. Good for teams with existing compliant infrastructure who just need routing.


Portkey offers HIPAA compliance on their Enterprise tier, including BAA signing and PII anonymization. The Enterprise plan supports SaaS, hybrid, or air-gapped deployment. You'll need to negotiate Enterprise pricing and may need to assemble additional components for full HIPAA coverage.


Aptible AI Gateway is purpose-built for healthcare. Aptible has been helping digital health startups meet compliance requirements since 2013, and is HITRUST R2 certified and SOC 2 Type 2 audited. The AI Gateway gives you all 13 systems out of the box: one BAA covers all supported models, de-identification and re-identification are built in, and audit logging is automatic. No enterprise sales process required.


Learn more about Aptible AI Gateway →

Getting started


You can build HIPAA-compliant AI features without building all the compliance infrastructure from scratch. The requirements are real: BAAs, audit logging, encryption, access controls. How you implement them is up to you.

You can certainly start with a single model provider, get the BAA in place, and build the compliance infrastructure like logging from day one.


But you might consider using a gateway if you want to enable your engineers to work with multiple models while ensuring HIPAA compliance controls for each, or if you don't want to manage BAAs, logging, access controls, and all the rest on your own.


Aptible AI Gateway gives you the compliance infrastructure out of the box. Learn more or talk to an engineer.

Frequently asked questions

Which AI is HIPAA compliant?

No AI is inherently HIPAA compliant. Compliance depends on implementation. The major LLM providers (OpenAI Enterprise and API, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI) offer BAAs that enable HIPAA-compliant use. You're still responsible for audit logging, encryption, access controls, and other implementation requirements.

Is ChatGPT HIPAA compliant?

It depends which product. ChatGPT Free, Plus, and Team do not offer BAAs and should not be used with PHI. ChatGPT Enterprise offers BAA availability, no training on your data, SSO/SCIM, and encryption at rest. The OpenAI API is a separate product with its own BAA availability. OpenAI's Codex (their coding agent) inherits the compliance posture of your ChatGPT plan. The bottom line: "Is ChatGPT HIPAA compliant?" depends on which product, with what configuration, and what you build around it.

Is Claude HIPAA compliant?

Same structure as OpenAI. Claude.ai (the consumer product) does not offer a BAA. The Claude API offers BAA availability for customers who request it. Claude for Enterprise includes BAA availability and enhanced security. Claude Code (Anthropic's CLI for developers) uses the API, so it's covered under your API agreement if you have a BAA. Anthropic's policy states API data is not used for training. Data is retained briefly for trust and safety, then deleted.

Do I need a BAA with OpenAI to use their models with PHI?

Yes. If you're sending PHI to OpenAI's API, they're acting as a business associate. HIPAA requires a BAA before sharing PHI with a business associate. OpenAI offers BAAs for API and Enterprise customers.

How long does it take to get a BAA?

It varies widely. Some providers offer self-service BAAs that you can review and sign in minutes. Others require weeks or months of back-and-forth, especially for enterprise-tier agreements or if you need custom terms. We've heard from customers who waited three months to finalize a BAA with a major provider. If you're planning to use a new LLM provider with PHI, start the BAA process early. Don't assume you can sign up and start sending data the same day.

What about open-source LLMs like Llama?

Self-hosting open-source models eliminates the need for a provider BAA because you're the only party handling the data. But you're responsible for all security controls on your hosting infrastructure: encryption, access controls, audit logging, physical security if applicable. Self-hosting doesn't simplify compliance. It shifts the entire burden to you.

How do I audit AI usage for HIPAA?

You need to log activity involving PHI. When LLM calls handle patient data, log the prompt, response, timestamps, user/system attribution, and which model was used. These logs must be encrypted at rest, access-controlled, and retained according to your retention policy. Most LLM providers don't create these logs for you, so you need to build or buy logging infrastructure.

Is de-identification required for HIPAA-compliant AI?

No. De-identification is not required if the AI provider is acting as a Business Associate and a valid BAA is in place. In that case, PHI may be processed in compliance with HIPAA safeguards.


De-identification is only required if you do not have a BAA and want to share data without it being treated as PHI. To qualify, the data must meet the HIPAA de-identification standard under 45 CFR 164.514, either by removing all required identifiers under the Safe Harbor method or through a documented Expert Determination. If those standards are not met, the data is still considered PHI under HIPAA.


That said, even with a BAA in place, de-identification can reduce your risk exposure and simplify your compliance posture.

What's the difference between HIPAA compliant and HIPAA eligible?

"HIPAA eligible" typically means a vendor will sign a BAA. "HIPAA compliant" implies the full implementation meets HIPAA requirements. A product being HIPAA eligible doesn't make your use of it compliant. You still need to implement required controls like logging, encryption, and access controls. The term "HIPAA certified" is misleading because HIPAA doesn't certify products. For more on HIPAA basics, see our HIPAA Compliance Guide.


By Henry Hund, Chief Operating Officer

and Mat Steinlin, Head of Information Security



Last updated: March 2, 2026


Using AI in healthcare isn't hard. Making it compliant is.


If you're building digital health products that use LLMs with patient data, you've probably discovered that calling the OpenAI API is the easy part. The hard part is everything around it: BAAs, audit logging, encrypted storage, key management, de-identification, and the ability to prove to auditors that your controls work.


This guide covers what "HIPAA-compliant AI" actually means, the technical requirements for using LLMs with PHI, which tools offer BAA coverage, and the infrastructure decisions you'll face.


We wrote this for developers and technical leads at healthcare companies who want to use AI without creating a compliance problem. We'll include regulatory citations where they matter, but the focus is on what you need to do.

But first, why listen to us?


Aptible has helped thousands of digital health startups keep their cloud infrastructure HIPAA compliant since 2013. When we started building an AI Gateway, we were surprised that no great developer’s guide about HIPAA-compliant AI existed, so we set out to fix that. The guide is based on our research about and practical experience with how to use PHI with LLMs in a safe a secure manner.


Mat and Henry authored this piece. As a former CISO at digital health startups and current Head of Information Security at Aptible, Mat has been through more compliance audits than he can count. Henry has worked with countless customers over the past decade at Aptible ensuring their infrastructure met HIPAA compliance requirements. Between us, we've probably navigated every HIPAA, HITRUST, SOC 2, or PCI complexity or edge case imaginable.

What "HIPAA-Compliant AI" actually means


No AI tool is "HIPAA certified." HIPAA is a federal law, implemented through regulations, and it does not certify or approve products. It sets requirements for how covered entities and business associates must safeguard protected health information (PHI).


Vendors may obtain independent security attestations such as SOC 2 reports or HITRUST certification to demonstrate the effectiveness of their controls, but those are assurance frameworks, not government certifications and not substitutes for HIPAA compliance. Compliance ultimately depends on how the organization implements and governs the system, including having appropriate safeguards and, where required, a valid BAA in place.


When AI touches PHI, HIPAA requires safeguards across three areas: administrative, physical, and technical. For LLM usage specifically, this translates to concrete technical requirements:


Core requirements:

  • Business Associate Agreement (BAA): If an LLM provider processes PHI on your behalf, they're a business associate under HIPAA. You need a signed BAA before sending any PHI to their API. When a BAA is in place, the provider typically implements specific technical safeguards, such as zero data retention, to align with the legal obligations defined in the agreement. This isn't just paperwork; it changes how the provider handles your data.

  • Audit logging: Any activity involving PHI must be logged. When your LLM calls handle patient data, that means logging prompts, responses, timestamps, and who made the request. This creates the audit trail required under 45 CFR 164.312(b).

  • Encryption: Data must be encrypted in transit (TLS) and at rest. This applies to your logs, any cached data, and anything persisted.

  • No training on PHI: You need assurance that your data won't be used to train models. This should be explicit in the BAA or provider agreement, not just a policy statement.


Operational requirements that matter in practice:

  • Key management: How do you manage API keys across teams, applications, and environments? How do you rotate them? How do you know which key made which request?

  • Model access controls: Can you restrict which models developers can use? Different risk profiles for dev vs. production?

  • De-identification: Are you scrubbing PHI before it reaches the LLM? Do you need to re-identify data in responses?

  • Cost controls: LLM costs can spiral. Do you have visibility into usage? Budget limits that actually stop requests?

  • Observability: Can you inspect actual requests to verify your controls are working?


The distinction between "HIPAA compliant" and "HIPAA eligible" matters. A provider being "HIPAA eligible" means they'll sign a BAA. It doesn't mean using their API automatically makes your implementation compliant. That's still on you.

The regulatory gap around AI


Using LLMs with PHI in a compliant manner does require some extra effort. It’s no surprise: operating in digital health is and has been challenging, and layering in ensuring compliance with AI tools adds a new dimension to the challenge. It’s no surprise that OpenAI’s announcement of ChatGPT Health, its consumer health “experience”, focuses on privacy and security. Sadly, the privacy and security protections built around the consumer-facing ChatGPT Health won’t apply to your company’s use of LLMs.


Here's what you need to build or buy to use LLMs with PHI compliantly.

Business Associates Agreements (BAAs)


If an LLM provider will receive, process, or store PHI, you need a BAA with them. This is non-negotiable under HIPAA. The BAA establishes liability for how they handle your data and requires them to implement appropriate safeguards.


Which providers offer BAAs:


Provider

BAA Available

How to get it

OpenAI API

Yes

Request from Open AI

OpenAI Enterprise

Yes

Enterprise agreement

Anthropic API

Yes

Request from Anthropic

AWS Bedrock

Yes

Part of AWS BAA

Azure OpenAI

Yes

Enterprise agreement

Google Vertex AI

Yes

Enterprise agreement


The catch: if you're using multiple models (an OpenAI model for some tasks, Claude for others, a Bedrock model for something else), you need BAAs with each provider. This creates contract sprawl and increases the risk of accidentally routing PHI through an uncovered path.


Getting a BAA isn't always quick. Some providers offer self-service BAAs you can sign in minutes. Others require weeks or months of negotiation, especially for enterprise agreements. We've seen customers wait three months for a BAA with a major cloud provider. Plan ahead and don't assume you can start using a new provider immediately.

Audit logging


HIPAA's audit control standard (45 CFR 164.312(b)) requires mechanisms to record and examine activity in systems containing PHI. The requirement applies to the activity itself, not to every LLM call you make. If an LLM interaction handles PHI, you need to log it. If it doesn't touch patient data, HIPAA doesn't require a log entry.


For interactions that do involve PHI, log:

  • The prompt sent to the LLM

  • The response received

  • Timestamps

  • User or system that initiated the request

  • Which model was used


LLM providers don't do this logging for you. You get API responses, not audit trails. You need infrastructure that captures this data for PHI-touching requests.


Where this gets complicated:

  • Storage: These logs contain PHI. They must be encrypted at rest, access-controlled, and retained according to your retention policy.

  • Short-term vs. long-term: You need operational access (debugging, verification) and compliance access (audits, investigations). These have different retention requirements.

  • Log drains: For compliance, you often need to export logs to long-term storage (S3, a SIEM, etc.) with appropriate retention periods.

Encryption requirements


Encryption is usually the easiest requirement to meet:

  • In transit: TLS for all API calls. Most LLM providers enforce this by default, so you're likely already covered.

  • At rest: Your responsibility for logs, cached data, and any persisted prompts or responses. This is where teams sometimes miss requirements.


Key management matters here too. Who has access to encryption keys? How do you rotate them? AWS KMS or a similar managed service handles this well. If you're managing keys yourself, that's additional infrastructure to build and maintain.

Data handling


Will your patient data be used to train models?


Major LLM providers have policies about this, but policies aren't contracts. Before sending PHI to an LLM, you should verify:

  1. Explicit opt-out in the BAA or data processing agreement. A blog post saying "we don't train on your data" isn't enforceable. The legal agreement needs to say it.

  2. Data retention terms. How long does the provider keep your prompts and responses? For what purposes? Abuse monitoring? Safety research?

  3. API vs. consumer product distinctions. ChatGPT the product has different data handling than the OpenAI API. Same for Claude.ai versus the Anthropic API. Make sure you're reading the right policy.


OpenAI's current policy says API data is retained for 30 days for abuse monitoring, then deleted, and is not used for training. That's their policy. What matters legally is what your BAA says.

PII/PHI De-identification


The safest PHI is PHI that never reaches the model in identifiable form.


De-identification means scrubbing sensitive data before it goes to the LLM. Names become [PATIENT_1], dates become [DATE_1], and Social Security numbers become tokens. The LLM processes the request without ever seeing the real identifiers.


HIPAA defines two de-identification standards (45 CFR 164.514):

  • Safe Harbor: Remove 18 specific identifiers (names, dates, geographic data, etc.)

  • Expert Determination: A qualified expert determines the risk of re-identification is very small


A common question: if you remove identifiers, send data to an AI system, maintain an internal mapping table, and then re-associate the output back to the individual, does the data still count as de-identified?


Under 45 CFR 164.514(c), HIPAA explicitly permits the use of a code or other means of record identification to allow re-identification by the covered entity or business associate, provided that the code is not derived from or related to the individual, the code is not otherwise capable of being translated to identify the individual, and the re-identification mechanism is not disclosed. If those conditions are met, the data disclosed to the AI system may qualify as de-identified under HIPAA, even though you retain a separate, protected re-identification key. If those conditions are not met, the data would be more accurately described as coded or pseudonymized and would remain identifiable PHI subject to HIPAA.


Building reliable de-identification is hard. You need:

  • NLP that accurately identifies PHI in unstructured text

  • Consistent tokenization so you can re-identify after responses

  • A clean architecture with short mapping lifecycles to reduce risk

  • Handling of edge cases (misspellings, nicknames, context-dependent identifiers)


Re-identification is hard too, and high-stakes. If the LLM responds with "[PATIENT_1] should take [MEDICATION_1] twice daily," you need to restore the actual patient name and medication before showing the result to a clinician. You can't afford to get this wrong: if your re-identification logic mixes up [PATIENT_1] and [PATIENT_2], your application could recommend [MEDICATION_1] to the wrong patient.
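The tokenization scheme described above can be sketched with a small mapping table. This is illustrative only: real de-identification needs NLP to catch names, nicknames, and free-text identifiers, and the regex patterns, function names, and token format here are assumptions rather than a production design.

```python
import re

# Illustrative patterns only; production de-identification needs NLP to catch
# names, nicknames, and context-dependent identifiers in unstructured text.
PATTERNS = {
    "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
    "DATE": r"\b\d{4}-\d{2}-\d{2}\b",
}

def deidentify(text, patterns=PATTERNS):
    """Replace matches with stable tokens; return scrubbed text plus the mapping."""
    mapping = {}   # token -> original value; store separately and keep short-lived
    counters = {}

    def token_for(kind, value):
        for tok, orig in mapping.items():
            if orig == value and tok.startswith(f"[{kind}_"):
                return tok  # repeated values reuse the same token
        counters[kind] = counters.get(kind, 0) + 1
        tok = f"[{kind}_{counters[kind]}]"
        mapping[tok] = value
        return tok

    for kind, pattern in patterns.items():
        text = re.sub(pattern, lambda m, k=kind: token_for(k, m.group(0)), text)
    return text, mapping

def reidentify(text, mapping):
    """Restore original values; mixing up tokens here is a patient-safety bug."""
    for tok, value in mapping.items():
        text = text.replace(tok, value)
    return text

scrubbed, mapping = deidentify("Visit on 2024-01-15. SSN 123-45-6789.")
# scrubbed == "Visit on [DATE_1]. SSN [SSN_1]."
```

Note that repeated values map to the same token (so the LLM sees consistent references), and the mapping table never travels with the request. Keeping that mapping encrypted and short-lived is what limits the blast radius if anything downstream is compromised.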

API Key management


LLM providers give you API keys, usually just one or a few. For a small team, this could be fine. For larger organizations, the limitations become clear. How do you scope keys by application, team, or environment? How do you rotate keys without breaking production? How do you track which key made which request for auditing and cost allocation? How do you revoke access for a specific use case without affecting others?


This isn't unique to HIPAA, but it becomes a compliance issue when you can't demonstrate who had access to systems processing PHI.

Model access controls


Different use cases have different risk profiles. A production feature summarizing patient records has different requirements than a developer tool for debugging.


You might want production scopes restricted to approved models only, development scopes with broader access for experimentation, and internal tools with different limits than customer-facing features.


Without a proxy layer, enforcing this requires organizational discipline and code reviews. With a proxy, you can enforce it at the infrastructure level.
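As a sketch of what infrastructure-level enforcement looks like, a proxy can keep a scope table mapping virtual keys to model allowlists and check every request against it before forwarding. The key names, model names, and structure below are hypothetical:

```python
# Hypothetical proxy-side scope config: each virtual key is tied to an
# environment and an allowlist of models. Names are illustrative.
SCOPES = {
    "prod-summarizer": {"env": "production", "models": {"gpt-5"}},
    "dev-sandbox": {"env": "development", "models": {"gpt-5", "gpt-5-mini", "claude-sonnet"}},
}

def check_request(virtual_key, model):
    """Reject requests from unknown keys or for models outside the key's scope."""
    scope = SCOPES.get(virtual_key)
    return scope is not None and model in scope["models"]
```

Because the check runs in the proxy rather than in application code, a developer experimenting with a new model in production fails fast instead of silently routing PHI somewhere without coverage.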

Compliance verification and observability


Controls are only useful if you can prove they're working. During an audit or customer security review, you need to demonstrate that de-identification is actually scrubbing what it should, that logs are being captured and stored appropriately, and that access controls are enforced.


This requires the ability to inspect actual requests and responses. Not summaries or aggregations, but the real data flowing through your systems. You need short-term access for operational needs and long-term retention for compliance.

Cost controls and budget management


Cost controls aren't a HIPAA requirement, but they're worth mentioning because LLM costs can spiral quickly.

Useful controls include visibility into usage by application, team, and use case; budget limits that stop requests before costs get out of hand; and usage tracking for cost allocation. A runaway loop burning through $10,000 in API calls overnight is a problem you'd rather catch early.
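A minimal version of such a budget limit is a per-scope spend counter that refuses requests once a cap is hit. This sketch omits persistence, concurrency, and real token-cost accounting; the names and numbers are illustrative:

```python
class BudgetGuard:
    """Per-scope spend tracking with hard caps (sketch; persistence omitted)."""

    def __init__(self, limits):
        self.limits = limits   # scope -> dollar cap
        self.spend = {}        # scope -> dollars spent so far

    def allow(self, scope):
        """Check before forwarding a request; unknown scopes have no cap here."""
        return self.spend.get(scope, 0.0) < self.limits.get(scope, float("inf"))

    def record(self, scope, cost):
        """Record the actual cost after each completed request."""
        self.spend[scope] = self.spend.get(scope, 0.0) + cost

guard = BudgetGuard({"dev-sandbox": 100.0})
```

The important design choice is that `allow` runs before the request is forwarded, so a runaway loop stops at the cap instead of at the end-of-month invoice.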


Building HIPAA-Compliant AI: The DIY approach


If you want to use LLMs with PHI and you're building the compliance infrastructure yourself, here's what that actually entails:

The components you need


  1. Proxy layer to intercept all LLM traffic and apply controls consistently

  2. Logging infrastructure to capture every prompt and response with metadata

  3. Encrypted log storage because logs contain PHI

  4. Log drain to long-term storage for retention compliance

  5. Short-term log access for debugging and verification

  6. Key management system to scope keys by application, team, and use case

  7. Model access controls to enforce which models each key can use

  8. De-identification pipeline to scrub PHI before it reaches the LLM

  9. Re-identification logic to restore original values in responses

  10. Budget controls with usage tracking and limits per scope

  11. Alerting when usage approaches limits

  12. Protocol translation if you want to switch providers without code changes

  13. Capacity management for rate limits and failover


That's 13 systems before you write application code.


True, some of these systems are easier to implement than others, and in some cases you can skip one: the proxy layer, if you're willing to commit to and build integrations for each model provider you use; or the de-identification and re-identification pipeline, if your other controls meet HIPAA requirements and you elect not to de-identify.


Either way, there's significant work involved in meeting HIPAA compliance requirements, and model providers aren't doing that work for you, even if you sign a BAA with them.

Example: DIY audit logging


Here's what a basic logging wrapper looks like:


import json
import logging
from datetime import datetime, timezone
from openai import OpenAI

class HIPAALoggingWrapper:
    def __init__(self, client, logger):
        self.client = client
        self.logger = logger

    def chat_completion(self, user_id: str, **kwargs):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "type": "llm_request",
            "user_id": user_id,  # For audit attribution
            "model": kwargs.get("model"),
            "messages": kwargs.get("messages"),  # Contains PHI - encrypt at rest
        }

        response = self.client.chat.completions.create(**kwargs)

        log_entry["response"] = response.choices[0].message.content
        log_entry["usage"] = response.usage.model_dump()
        log_entry["request_id"] = response.id

        # This logger MUST write to encrypted, access-controlled storage
        self.logger.info(json.dumps(log_entry))

        return response

# Usage
audit_logger = logging.getLogger("hipaa_audit")  # back this with encrypted storage
client = OpenAI()
hipaa_client = HIPAALoggingWrapper(client, audit_logger)
response = hipaa_client.chat_completion(
    user_id="clinician_123",
    model="gpt-5",
    messages=[{"role": "user", "content": prompt}]
)


This gives you basic audit logging. What it doesn't give you: encrypted storage (you need to configure your logging backend), retention policy enforcement, UI access for compliance verification, log drains to long-term storage, consistent logging across multiple providers (you'd need separate wrappers for each), key management with scoping, model access controls, de-identification, or budget controls.


That's a partial implementation of one capability. The rest of the list is separate work.


This is why teams look for managed solutions.


Common mistakes we see


After working with hundreds of healthcare companies on compliance, these are the mistakes that come up repeatedly:


Logging without encryption. We've seen teams diligently log every prompt and response, then store those logs in plaintext Elasticsearch or an unencrypted S3 bucket. The logs contain PHI. If they're not encrypted at rest and access-controlled, you've created a compliance gap that's often worse than not logging at all.


Adding providers without BAAs. One company had a BAA with Anthropic but not OpenAI. A developer tried OpenAI for a task where it performed better, not realizing the BAA requirement. They found out when the developer mentioned it in standup. The fix was straightforward, but it required disclosure to their compliance team and documentation of the incident.


Confusing consumer products with APIs. A clinician used ChatGPT Plus for a "quick test" with real patient data because "OpenAI is HIPAA compliant." It's not. The API with a BAA can be, assuming proper implementation. ChatGPT Plus is not. By the time anyone noticed, PHI had been sent to a consumer product with no BAA coverage.


Discovering logging gaps during an audit. A company's logging infrastructure had a silent failure for three weeks. They discovered it during a customer security review when they couldn't produce logs for that period. HIPAA requires you to be able to examine activity. If your logging had gaps or your retention policy wasn't enforced, you have a problem that's hard to fix retroactively.
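A silent logging failure like this can be caught early with a simple freshness check on the audit pipeline: alert when no log entries have been written for longer than expected. The threshold below is illustrative, and the function name is ours, not from any particular monitoring tool:

```python
from datetime import datetime, timedelta, timezone

def audit_log_silent(last_entry_at, max_silence=timedelta(minutes=15), now=None):
    """True if the audit pipeline has written nothing for longer than expected."""
    now = now or datetime.now(timezone.utc)
    return now - last_entry_at > max_silence
```

Run a check like this on a schedule and page someone when it fires; a three-week gap then becomes a fifteen-minute gap.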


Relying on policy instead of contract. A SaaS vendor's website says something like "we don't train on your data" or "your data is encrypted." Great. What does your BAA say? We've seen this pattern repeatedly with cloud services: policies can change, marketing pages get updated, but contracts are enforceable. If it's not in the BAA, don't assume it's guaranteed.

The gateway approach


Instead of building compliance infrastructure yourself, you can route LLM traffic through a managed gateway that handles controls centrally. That's where Aptible's AI Gateway came from: many of our customers asked us to help them ensure HIPAA-compliant AI API calls, the same way we help them ensure HIPAA-compliant cloud infrastructure.

Why teams use AI gateways


All LLM requests flow through a single control layer. This provides enormous benefits in terms of visibility and cost control, and there are several production-grade gateways available, such as LiteLLM and Portkey, which we have recommended to customers in cases where HIPAA compliance is not required.


What no other gateway provides, and the entire reason we built Aptible AI Gateway, is HIPAA compliance out of the box.


Aptible AI Gateway gives you:

  • One BAA instead of one per provider

  • Consistent logging regardless of which model you're using

  • Centralized controls for key management, model access, de-identification

  • Model switching without rebuilding compliance infrastructure


The gateway is not intended to replace your application code. You still call LLMs the same way; you just change the base URL for your requests, so calls route through infrastructure that automatically applies controls such as logging, key management, model access, and cost controls.

Comparing AI gateways for HIPAA


As mentioned above, LiteLLM and Portkey are great gateways, when HIPAA compliance is not required. Here's how the options compare on key HIPAA capabilities:


| Capability | LiteLLM | Portkey | Aptible AI Gateway |
| --- | --- | --- | --- |
| BAA available | No (self-hosted) | Enterprise tier | Yes, standard |
| HITRUST certified | No | No | Yes |
| Audit logging with PHI encryption | DIY | Requires config | Yes |
| Log drain to long-term storage | DIY | Yes | Yes |
| Key scoping by app/team | Limited | Yes | Yes |
| Model access controls | Limited | Yes | Yes |
| PII/PHI de-identification | No | Enterprise tier | Coming soon |
| Re-identification | No | No | Coming soon |
| Budget controls | Basic | Yes | Coming soon |
| Built for digital health startups | No | No | Yes |
| Deployment model | Self-hosted | SaaS/Hybrid/Air-gapped | Managed |


LiteLLM is open-source and self-hosted. It handles routing and basic logging, but you're responsible for the HIPAA infrastructure around it. No BAA because there's no vendor relationship. Good for teams with existing compliant infrastructure who just need routing.


Portkey offers HIPAA compliance on their Enterprise tier, including BAA signing and PII anonymization. The Enterprise plan supports SaaS, hybrid, or air-gapped deployment. You'll need to negotiate Enterprise pricing and may need to assemble additional components for full HIPAA coverage.


Aptible AI Gateway is purpose-built for healthcare. Aptible has been helping digital health startups meet compliance requirements since 2013, and is HITRUST R2 certified and SOC 2 Type 2 audited. The AI Gateway gives you all 13 systems out of the box: one BAA covers all supported models, de-identification and re-identification are built in, and audit logging is automatic. No enterprise sales process required.


Learn more about Aptible AI Gateway →

Getting started


You can build HIPAA-compliant AI features without building all the compliance infrastructure from scratch. The requirements are real: BAAs, audit logging, encryption, access controls. How you implement them is up to you.

You can certainly start with a single model provider, get the BAA in place, and build the compliance infrastructure like logging from day one.


But you might consider using a gateway if you want to let your engineers work with multiple models while ensuring HIPAA compliance controls for each, or if you don't want to manage BAAs, logging, access controls, and all the rest on your own.


Aptible AI Gateway gives you the compliance infrastructure out of the box. Learn more or talk to an engineer.

Frequently asked questions

Which AI is HIPAA compliant?

No AI is inherently HIPAA compliant. Compliance depends on implementation. The major LLM providers (OpenAI Enterprise and API, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI) offer BAAs that enable HIPAA-compliant use. You're still responsible for audit logging, encryption, access controls, and other implementation requirements.

Is ChatGPT HIPAA compliant?

It depends which product. ChatGPT Free, Plus, and Team do not offer BAAs and should not be used with PHI. ChatGPT Enterprise offers BAA availability, no training on your data, SSO/SCIM, and encryption at rest. The OpenAI API is a separate product with its own BAA availability. OpenAI's Codex (their coding agent) inherits the compliance posture of your ChatGPT plan. The bottom line: "Is ChatGPT HIPAA compliant?" depends on which product, with what configuration, and what you build around it.

Is Claude HIPAA compliant?

Same structure as OpenAI. Claude.ai (the consumer product) does not offer a BAA. The Claude API offers BAA availability for customers who request it. Claude for Enterprise includes BAA availability and enhanced security. Claude Code (Anthropic's CLI for developers) uses the API, so it's covered under your API agreement if you have a BAA. Anthropic's policy states API data is not used for training. Data is retained briefly for trust and safety, then deleted.

Do I need a BAA with OpenAI to use their models with PHI?

Yes. If you're sending PHI to OpenAI's API, they're acting as a business associate. HIPAA requires a BAA before sharing PHI with a business associate. OpenAI offers BAAs for API and Enterprise customers.

How long does it take to get a BAA?

It varies widely. Some providers offer self-service BAAs that you can review and sign in minutes. Others require weeks or months of back-and-forth, especially for enterprise-tier agreements or if you need custom terms. We've heard from customers who waited three months to finalize a BAA with a major provider. If you're planning to use a new LLM provider with PHI, start the BAA process early. Don't assume you can sign up and start sending data the same day.

What about open-source LLMs like Llama?

Self-hosting open-source models eliminates the need for a provider BAA because you're the only party handling the data. But you're responsible for all security controls on your hosting infrastructure: encryption, access controls, audit logging, physical security if applicable. Self-hosting doesn't simplify compliance. It shifts the entire burden to you.

How do I audit AI usage for HIPAA?

You need to log activity involving PHI. When LLM calls handle patient data, log the prompt, response, timestamps, user/system attribution, and which model was used. These logs must be encrypted at rest, access-controlled, and retained according to your retention policy. Most LLM providers don't create these logs for you, so you need to build or buy logging infrastructure.

Is de-identification required for HIPAA-compliant AI?

No. De-identification is not required if the AI provider is acting as a Business Associate and a valid BAA is in place. In that case, PHI may be processed in compliance with HIPAA safeguards.


De-identification is only required if you do not have a BAA and want to share data without it being treated as PHI. To qualify, the data must meet the HIPAA de-identification standard under 45 CFR 164.514, either by removing all required identifiers under the Safe Harbor method or through a documented Expert Determination. If those standards are not met, the data is still considered PHI under HIPAA.


That said, even with a BAA in place, de-identification can reduce your risk exposure and simplify your compliance posture.

What's the difference between HIPAA compliant and HIPAA eligible?

"HIPAA eligible" typically means a vendor will sign a BAA. "HIPAA compliant" implies the full implementation meets HIPAA requirements. A product being HIPAA eligible doesn't make your use of it compliant. You still need to implement required controls like logging, encryption, and access controls. The term "HIPAA certified" is misleading because HIPAA doesn't certify products. For more on HIPAA basics, see our HIPAA Compliance Guide.


By Henry Hund, Chief Operating Officer

and Mat Steinlin, Head of Information Security



Last updated: March 2, 2026


Using AI in healthcare isn't hard. Making it compliant is.


If you're building digital health products that use LLMs with patient data, you've probably discovered that calling the OpenAI API is the easy part. The hard part is everything around it: BAAs, audit logging, encrypted storage, key management, de-identification, and the ability to prove to auditors that your controls work.


This guide covers what "HIPAA-compliant AI" actually means, the technical requirements for using LLMs with PHI, which tools offer BAA coverage, and the infrastructure decisions you'll face.


We wrote this for developers and technical leads at healthcare companies who want to use AI without creating a compliance problem. We'll include regulatory citations where they matter, but the focus is on what you need to do.

But first, why listen to us?


Aptible has helped thousands of digital health startups keep their cloud infrastructure HIPAA compliant since 2013. When we started building an AI Gateway, we were surprised that no great developer’s guide about HIPAA-compliant AI existed, so we set out to fix that. The guide is based on our research about and practical experience with how to use PHI with LLMs in a safe a secure manner.


Mat and Henry authored this piece. As a former CISO at digital health startups and current Head of Information Security at Aptible, Mat has been through more compliance audits than he can count. Henry has worked with countless customers over the past decade at Aptible ensuring their infrastructure met HIPAA compliance requirements. Between us, we've probably navigated every HIPAA, HITRUST, SOC 2, or PCI complexity or edge case imaginable.

What "HIPAA-Compliant AI" actually means


No AI tool is "HIPAA certified." HIPAA is a federal law, implemented through regulations, and it does not certify or approve products. It sets requirements for how covered entities and business associates must safeguard protected health information (PHI).


Vendors may obtain independent security attestations such as SOC 2 reports or HITRUST certification to demonstrate the effectiveness of their controls, but those are assurance frameworks, not government certifications and not substitutes for HIPAA compliance. Compliance ultimately depends on how the organization implements and governs the system, including having appropriate safeguards and, where required, a valid BAA in place.


When AI touches PHI, HIPAA requires safeguards across three areas: administrative, physical, and technical. For LLM usage specifically, this translates to concrete technical requirements:


Core requirements:

  • Business Associate Agreement (BAA): If an LLM provider processes PHI on your behalf, they're a business associate under HIPAA. You need a signed BAA before sending any PHI to their API. When a BAA is in place, the provider typically implements specific technical safeguards, such as zero data retention, to align with the legal obligations defined in the agreement. This isn't just paperwork; it changes how the provider handles your data.

  • Audit logging: Any activity involving PHI must be logged. When your LLM calls handle patient data, that means logging prompts, responses, timestamps, and who made the request. This creates the audit trail required under 45 CFR 164.312(b).

  • Encryption: Data must be encrypted in transit (TLS) and at rest. This applies to your logs, any cached data, and anything persisted.

  • No training on PHI: You need assurance that your data won't be used to train models. This should be explicit in the BAA or provider agreement, not just a policy statement.


Operational requirements that matter in practice:

  • Key management: How do you manage API keys across teams, applications, and environments? How do you rotate them? How do you know which key made which request?

  • Model access controls: Can you restrict which models developers can use? Different risk profiles for dev vs. production?

  • De-identification: Are you scrubbing PHI before it reaches the LLM? Do you need to re-identify data in responses?

  • Cost controls: LLM costs can spiral. Do you have visibility into usage? Budget limits that actually stop requests?

  • Observability: Can you inspect actual requests to verify your controls are working?


The distinction between "HIPAA compliant" and "HIPAA eligible" matters. A provider being "HIPAA eligible" means they'll sign a BAA. It doesn't mean using their API automatically makes your implementation compliant. That's still on you.

The regulatory gap around AI


Using LLMs with PHI in a compliant manner does require some extra effort. It’s no surprise: operating in digital health is and has been challenging, and layering in ensuring compliance with AI tools adds a new dimension to the challenge. It’s no surprise that OpenAI’s announcement of ChatGPT Health, its consumer health “experience”, focuses on privacy and security. Sadly, the privacy and security protections built around the consumer-facing ChatGPT Health won’t apply to your company’s use of LLMs.


Here's what you need to build or buy to use LLMs with PHI compliantly.

Business Associates Agreements (BAAs)


If an LLM provider will receive, process, or store PHI, you need a BAA with them. This is non-negotiable under HIPAA. The BAA establishes liability for how they handle your data and requires them to implement appropriate safeguards.


Which providers offer BAAs:


Provider

BAA Available

How to get it

OpenAI API

Yes

Request from Open AI

OpenAI Enterprise

Yes

Enterprise agreement

Anthropic API

Yes

Request from Anthropic

AWS Bedrock

Yes

Part of AWS BAA

Azure OpenAI

Yes

Enterprise agreement

Google Vertex AI

Yes

Enterprise agreement


The catch: if you're using multiple models (an OpenAI model for some tasks, Claude for others, a Bedrock model for something else), you need BAAs with each provider. This creates contract sprawl and increases the risk of accidentally routing PHI through an uncovered path.


Getting a BAA isn't always quick. Some providers offer self-service BAAs you can sign in minutes. Others require weeks or months of negotiation, especially for enterprise agreements. We've seen customers wait three months for a BAA with a major cloud provider. Plan ahead and don't assume you can start using a new provider immediately.

Audit logging


HIPAA's audit control standard (45 CFR 164.312(b)) requires mechanisms to record and examine activity in systems containing PHI. The requirement applies to the activity itself, not to every LLM call you make. If an LLM interaction handles PHI, you need to log it. If it doesn't touch patient data, HIPAA doesn't require a log entry.


For interactions that do involve PHI, log:

  • The prompt sent to the LLM

  • The response received

  • Timestamps

  • User or system that initiated the request

  • Which model was used


LLM providers don't do this logging for you. You get API responses, not audit trails. You need infrastructure that captures this data for PHI-touching requests.


Where this gets complicated:

  • Storage: These logs contain PHI. They must be encrypted at rest, access-controlled, and retained according to your retention policy.

  • Short-term vs. long-term: You need operational access (debugging, verification) and compliance access (audits, investigations). These have different retention requirements.

  • Log drains: For compliance, you often need to export logs to long-term storage (S3, a SIEM, etc.) with appropriate retention periods.

Encryption requirements


Encryption is usually the easiest requirement to meet:

  • In transit: TLS for all API calls. Most LLM providers enforce this by default, so you're likely already covered.

  • At rest: Your responsibility for logs, cached data, and any persisted prompts or responses. This is where teams sometimes miss requirements.


Key management matters here too. Who has access to encryption keys? How do you rotate them? AWS KMS or a similar managed service handles this well. If you're managing keys yourself, that's additional infrastructure to build and maintain.

Data handling


Will your patient data be used to train models?


Major LLM providers have policies about this, but policies aren't contracts. Before sending PHI to an LLM, you should verify:

  1. Explicit opt-out in the BAA or data processing agreement. A blog post saying "we don't train on your data" isn't enforceable. The legal agreement needs to say it.

  2. Data retention terms. How long does the provider keep your prompts and responses? For what purposes? Abuse monitoring? Safety research?

  3. API vs. consumer product distinctions. ChatGPT the product has different data handling than the OpenAI API. Same for Claude.ai versus the Anthropic API. Make sure you're reading the right policy.


OpenAI's current policy says API data is retained for 30 days for abuse monitoring, then deleted, and is not used for training. That's their policy. What matters legally is what your BAA says.

PII/PHI De-identification


The safest PHI is PHI that never reaches the model in identifiable form.


De-identification means scrubbing sensitive data before it goes to the LLM. Names become [PATIENT_1], dates become [DATE_1], and Social Security numbers become tokens. The LLM processes the request without ever seeing the real identifiers.


HIPAA defines two de-identification standards (45 CFR 164.514):

  • Safe Harbor: Remove 18 specific identifiers (names, dates, geographic data, etc.)

  • Expert Determination: A qualified expert determines the risk of re-identification is very small


A common question: if you remove identifiers, send data to an AI system, maintain an internal mapping table, and then re-associate the output back to the individual, does the data still count as de-identified?


Under 45 CFR 164.514(c), HIPAA explicitly permits the use of a code or other means of record identification to allow re-identification by the covered entity or business associate, provided that the code is not derived from or related to the individual, the code is not otherwise capable of being translated to identify the individual, and the re-identification mechanism is not disclosed. If those conditions are met, the data disclosed to the AI system may qualify as de-identified under HIPAA, even though you retain a separate, protected re-identification key. If those conditions are not met, the data would be more accurately described as coded or pseudonymized and would remain identifiable PHI subject to HIPAA.


Building reliable de-identification is hard. You need:

  • NLP that accurately identifies PHI in unstructured text

  • Consistent tokenization so you can re-identify after responses

  • A clean architecture with short mapping lifecycles to reduce risk

  • Handling of edge cases (misspellings, nicknames, context-dependent identifiers)


Re-identification is hard too, and high-stakes. If the LLM responds with [PATIENT_1] should take [MEDICATION_1] twice daily, you need to restore the actual patient name and medication before showing it to a clinician. Obviously, you can’t afford to mess this up. If your re-identification algorithm mixes [PATIENT_1] and [PATIENT_2] your application could end up recommending [MEDICATION_1] to the wrong patient.

API Key management


LLM providers give you API keys, usually just one or a few. For a small team, this could be fine. For larger organizations, the limitations become clear. How do you scope keys by application, team, or environment? How do you rotate keys without breaking production? How do you track which key made which request for auditing and cost allocation? How do you revoke access for a specific use case without affecting others?


This isn't unique to HIPAA, but it becomes a compliance issue when you can't demonstrate who had access to systems processing PHI.

Model access controls


Different use cases have different risk profiles. A production feature summarizing patient records has different requirements than a developer tool for debugging.


You might want production scopes restricted to approved models only, development scopes with broader access for experimentation, and internal tools with different limits than customer-facing features.


Without a proxy layer, enforcing this requires organizational discipline and code reviews. With a proxy, you can enforce it at the infrastructure level.

Compliance verification and observability


Controls are only useful if you can prove they're working. During an audit or customer security review, you need to demonstrate that de-identification is actually scrubbing what it should, that logs are being captured and stored appropriately, and that access controls are enforced.


This requires the ability to inspect actual requests and responses. Not summaries or aggregations, but the real data flowing through your systems. You need short-term access for operational needs and long-term retention for compliance.

Cost controls and budget management


Cost controls aren't a HIPAA requirement, but they're worth mentioning because LLM costs can spiral quickly.

Useful controls include visibility into usage by application, team, and use case, budget limits that stop requests before costs get out of hand, and usage tracking for cost allocation. A runaway loop burning through $10,000 in


API calls overnight is a problem you'd rather catch early.


Building HIPAA-Compliant AI: The DIY approach


If you want to use LLMs with PHI and you're building the compliance infrastructure yourself, here's what that actually entails:

The components you need


  1. Proxy layer to intercept all LLM traffic and apply controls consistently

  2. Logging infrastructure to capture every prompt and response with metadata

  3. Encrypted log storage because logs contain PHI

  4. Log drain to long-term storage for retention compliance

  5. Short-term log access for debugging and verification

  6. Key management system to scope keys by application, team, and use case

  7. Model access controls to enforce which models each key can use

  8. De-identification pipeline to scrub PHI before it reaches the LLM

  9. Re-identification logic to restore original values in responses

  10. Budget controls with usage tracking and limits per scope

  11. Alerting when usage approaches limits

  12. Protocol translation if you want to switch providers without code changes

  13. Capacity management for rate limits and failover


That's 13 systems before you write application code.


True, some of these systems may be easier to implement than others, and in some cases you might be able to skip some (e.g. Proxy layer if you are able to commit to and build systems for each model you use; De-identification pipeline and Re-identification logic if your control implementation meets HIPAA compliance requirements and you elect not to de-identify).


Either way, there’s a significant work to ensuring you’re meeting HIPAA compliance requirements, and model providers aren’t doing that work for you, even if you sign a BAA with them.

Example: DIY audit logging


Here's what a basic logging wrapper looks like:


import json
import logging
from datetime import datetime
from openai import OpenAI

class HIPAALoggingWrapper:
    def __init__(self, client, logger):
        self.client = client
        self.logger = logger

    def chat_completion(self, user_id: str, **kwargs):
        log_entry = {
            "timestamp": datetime.utcnow().isoformat(),
            "type": "llm_request",
            "user_id": user_id,  # For audit attribution
            "model": kwargs.get("model"),
            "messages": kwargs.get("messages"),  # Contains PHI - encrypt at rest
        }

        response = self.client.chat.completions.create(**kwargs)

        log_entry["response"] = response.choices[0].message.content
        log_entry["usage"] = response.usage.model_dump()
        log_entry["request_id"] = response.id

        # This logger MUST write to encrypted, access-controlled storage
        self.logger.info(json.dumps(log_entry))

        return response

# Usage
client = OpenAI()
hipaa_client = HIPAALoggingWrapper(client, audit_logger)
response = hipaa_client.chat_completion(
    user_id="clinician_123",
    model="gpt-5",
    messages=[{"role": "user", "content": prompt}]
)


This gives you basic audit logging. What it doesn't give you: encrypted storage (you need to configure your logging backend), retention policy enforcement, UI access for compliance verification, log drains to long-term storage, consistent logging across multiple providers (you'd need separate wrappers for each), key management with scoping, model access controls, de-identification, or budget controls.


That's a partial implementation of one capability. The rest of the list is separate work.
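To illustrate how much work remains, here's what just the encrypted-storage piece might look like. This sketch uses the widely used `cryptography` package; the `EncryptedLogHandler` class is our own illustration, and in production the key would come from a KMS rather than living in process memory, with rotation, access controls, and retention enforcement layered on top.

```python
import json
import logging
from cryptography.fernet import Fernet

class EncryptedLogHandler(logging.Handler):
    """Encrypts each log record before writing it to disk (illustrative only)."""

    def __init__(self, path: str, key: bytes):
        super().__init__()
        self._fernet = Fernet(key)
        self._file = open(path, "ab")

    def emit(self, record: logging.LogRecord) -> None:
        ciphertext = self._fernet.encrypt(self.format(record).encode())
        self._file.write(ciphertext + b"\n")
        self._file.flush()

key = Fernet.generate_key()  # in production: load from your KMS, never hardcode
audit_logger = logging.getLogger("hipaa_audit")
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(EncryptedLogHandler("audit.log.enc", key))

audit_logger.info(json.dumps({"type": "llm_request", "user_id": "clinician_123"}))

# Reading the log back requires the key:
with open("audit.log.enc", "rb") as f:
    line = f.readline().strip()
print(Fernet(key).decrypt(line).decode())
```

Even this covers only encryption at rest. Access controls on the key, retention enforcement, and log drains to long-term storage are still separate work.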


This is why teams look for managed solutions.


Common mistakes we see


After working with hundreds of healthcare companies on compliance, these are the mistakes that come up repeatedly:


Logging without encryption. We've seen teams diligently log every prompt and response, then store those logs in plaintext Elasticsearch or an unencrypted S3 bucket. The logs contain PHI. If they're not encrypted at rest and access-controlled, you've created a compliance gap that's often worse than not logging at all.


Adding providers without BAAs. One company had a BAA with Anthropic but not OpenAI. A developer tried OpenAI for a task where it performed better, not realizing the BAA requirement. They found out when the developer mentioned it in standup. The fix was straightforward, but it required disclosure to their compliance team and documentation of the incident.


Confusing consumer products with APIs. A clinician used ChatGPT Plus for a "quick test" with real patient data because "OpenAI is HIPAA compliant." It's not. The API with a BAA can be, assuming proper implementation. ChatGPT Plus is not. By the time anyone noticed, PHI had been sent to a consumer product with no BAA coverage.


Discovering logging gaps during an audit. A company's logging infrastructure had a silent failure for three weeks. They discovered it during a customer security review when they couldn't produce logs for that period. HIPAA requires you to be able to examine activity. If your logging has gaps or your retention policy isn't enforced, you have a problem that's hard to fix retroactively.


Relying on policy instead of contract. A SaaS vendor's website says something like "we don't train on your data" or "your data is encrypted." Great. What does your BAA say? We've seen this pattern repeatedly with cloud services: policies can change, marketing pages get updated, but contracts are enforceable. If it's not in the BAA, don't assume it's guaranteed.

The gateway approach


Instead of building compliance infrastructure yourself, it is sometimes possible to route LLM traffic through a managed gateway that can handle controls centrally. That’s where Aptible’s AI Gateway came from: many of our customers asked us to help them ensure HIPAA-compliant AI API calls, the same way we help them ensure HIPAA compliant cloud infrastructure.

Why teams use AI gateways


All LLM requests flow through a single control layer, which provides enormous benefits for visibility and cost control. There are several production-grade gateways available, such as LiteLLM and Portkey, which we have recommended to customers in cases where HIPAA compliance is not required.


What no other gateway provides, and the entire reason we built Aptible AI Gateway, is HIPAA compliance out of the box.


Aptible AI Gateway gives you:

  • One BAA instead of one per provider

  • Consistent logging regardless of which model you're using

  • Centralized controls for key management, model access, de-identification

  • Model switching without rebuilding compliance infrastructure


The gateway is not intended to replace your application code. You still call LLMs the same way; you just change the base URL for your requests, so calls route through infrastructure that automatically applies controls such as logging, key management, model access, and cost controls.
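The base-URL swap is the whole integration. A stdlib-only sketch of the idea (the gateway hostname below is a hypothetical placeholder, not a real endpoint):

```python
from urllib.parse import urljoin

# Hypothetical endpoints -- substitute your actual gateway host.
DIRECT_BASE = "https://api.openai.com/v1/"
GATEWAY_BASE = "https://ai-gateway.example.com/v1/"

def chat_completions_url(base_url: str) -> str:
    """The request path is identical; only the base URL changes."""
    return urljoin(base_url, "chat/completions")

print(chat_completions_url(DIRECT_BASE))   # https://api.openai.com/v1/chat/completions
print(chat_completions_url(GATEWAY_BASE))  # https://ai-gateway.example.com/v1/chat/completions
```

With the OpenAI Python SDK, the equivalent change is passing `base_url=` when constructing the client; the rest of your application code is unchanged.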

Comparing AI gateways for HIPAA


As mentioned above, LiteLLM and Portkey are great gateways when HIPAA compliance is not required. Here's how the options compare on key HIPAA capabilities:


| Capability                        | LiteLLM          | Portkey                | Aptible AI Gateway |
|-----------------------------------|------------------|------------------------|--------------------|
| BAA available                     | No (self-hosted) | Enterprise tier        | Yes, standard      |
| HITRUST certified                 | No               | No                     | Yes                |
| Audit logging with PHI encryption | DIY              | Requires config        | Yes                |
| Log drain to long-term storage    | DIY              | Yes                    | Yes                |
| Key scoping by app/team           | Limited          | Yes                    | Yes                |
| Model access controls             | Limited          | Yes                    | Yes                |
| PII/PHI de-identification         | No               | Enterprise tier        | Coming soon        |
| Re-identification                 | No               | No                     | Coming soon        |
| Budget controls                   | Basic            | Yes                    | Coming soon        |
| Built for digital health startups | No               | No                     | Yes                |
| Deployment model                  | Self-hosted      | SaaS/Hybrid/Air-gapped | Managed            |


LiteLLM is open-source and self-hosted. It handles routing and basic logging, but you're responsible for the HIPAA infrastructure around it. No BAA because there's no vendor relationship. Good for teams with existing compliant infrastructure who just need routing.


Portkey offers HIPAA compliance on their Enterprise tier, including BAA signing and PII anonymization. The Enterprise plan supports SaaS, hybrid, or air-gapped deployment. You'll need to negotiate Enterprise pricing and may need to assemble additional components for full HIPAA coverage.


Aptible AI Gateway is purpose-built for healthcare. Aptible has been helping digital health startups meet compliance requirements since 2013, and is HITRUST R2 certified and SOC 2 Type 2 audited. The AI Gateway gives you all 13 systems out of the box: one BAA covers all supported models, de-identification and re-identification are built in, and audit logging is automatic. No enterprise sales process required.


Learn more about Aptible AI Gateway →

Getting started


You can build HIPAA-compliant AI features without building all the compliance infrastructure from scratch. The requirements are real: BAAs, audit logging, encryption, access controls. How you implement them is up to you.

You can certainly start with a single model provider, get the BAA in place, and build the compliance infrastructure like logging from day one.


But you might consider using a gateway if you want to enable your engineers to work with multiple models while ensuring HIPAA compliance controls for each, or if you don't want to manage BAAs, logging, access controls, and all the rest on your own.


Aptible AI Gateway gives you the compliance infrastructure out of the box. Learn more or talk to an engineer.

Frequently asked questions

Which AI is HIPAA compliant?

No AI is inherently HIPAA compliant. Compliance depends on implementation. The major LLM providers (OpenAI Enterprise and API, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI) offer BAAs that enable HIPAA-compliant use. You're still responsible for audit logging, encryption, access controls, and other implementation requirements.

Is ChatGPT HIPAA compliant?

It depends which product. ChatGPT Free, Plus, and Team do not offer BAAs and should not be used with PHI. ChatGPT Enterprise offers BAA availability, no training on your data, SSO/SCIM, and encryption at rest. The OpenAI API is a separate product with its own BAA availability. OpenAI's Codex (their coding agent) inherits the compliance posture of your ChatGPT plan. The bottom line: "Is ChatGPT HIPAA compliant?" depends on which product, with what configuration, and what you build around it.

Is Claude HIPAA compliant?

Same structure as OpenAI. Claude.ai (the consumer product) does not offer a BAA. The Claude API offers BAA availability for customers who request it. Claude for Enterprise includes BAA availability and enhanced security. Claude Code (Anthropic's CLI for developers) uses the API, so it's covered under your API agreement if you have a BAA. Anthropic's policy states API data is not used for training. Data is retained briefly for trust and safety, then deleted.

Do I need a BAA with OpenAI to use their models with PHI?

Yes. If you're sending PHI to OpenAI's API, they're acting as a business associate. HIPAA requires a BAA before sharing PHI with a business associate. OpenAI offers BAAs for API and Enterprise customers.

How long does it take to get a BAA?

It varies widely. Some providers offer self-service BAAs that you can review and sign in minutes. Others require weeks or months of back-and-forth, especially for enterprise-tier agreements or if you need custom terms. We've heard from customers who waited three months to finalize a BAA with a major provider. If you're planning to use a new LLM provider with PHI, start the BAA process early. Don't assume you can sign up and start sending data the same day.

What about open-source LLMs like Llama?

Self-hosting open-source models eliminates the need for a provider BAA because you're the only party handling the data. But you're responsible for all security controls on your hosting infrastructure: encryption, access controls, audit logging, physical security if applicable. Self-hosting doesn't simplify compliance. It shifts the entire burden to you.

How do I audit AI usage for HIPAA?

You need to log activity involving PHI. When LLM calls handle patient data, log the prompt, response, timestamps, user/system attribution, and which model was used. These logs must be encrypted at rest, access-controlled, and retained according to your retention policy. Most LLM providers don't create these logs for you, so you need to build or buy logging infrastructure.

Is de-identification required for HIPAA-compliant AI?

No. De-identification is not required if the AI provider is acting as a Business Associate and a valid BAA is in place. In that case, PHI may be processed in compliance with HIPAA safeguards.


De-identification is only required if you do not have a BAA and want to share data without it being treated as PHI. To qualify, the data must meet the HIPAA de-identification standard under 45 CFR 164.514, either by removing all required identifiers under the Safe Harbor method or through a documented Expert Determination. If those standards are not met, the data is still considered PHI under HIPAA.


That said, even with a BAA in place, de-identification can reduce your risk exposure and simplify your compliance posture.

What's the difference between HIPAA compliant and HIPAA eligible?

"HIPAA eligible" typically means a vendor will sign a BAA. "HIPAA compliant" implies the full implementation meets HIPAA requirements. A product being HIPAA eligible doesn't make your use of it compliant. You still need to implement required controls like logging, encryption, and access controls. The term "HIPAA certified" is misleading because HIPAA doesn't certify products. For more on HIPAA basics, see our HIPAA Compliance Guide.


Build Your Product.
Not Product Infra.

548 Market St #75826 San Francisco, CA 94104

© 2025. All rights reserved. Privacy Policy
