HIPAA-Compliant AI: What developers need to know
By Henry Hund, Chief Operating Officer
and Mat Steinlin, Head of Information Security

Last updated: March 2, 2026
Using AI in healthcare isn't hard. Making it compliant is.
If you're building digital health products that use LLMs with patient data, you've probably discovered that calling the OpenAI API is the easy part. The hard part is everything around it: BAAs, audit logging, encrypted storage, key management, de-identification, and the ability to prove to auditors that your controls work.
This guide covers what "HIPAA-compliant AI" actually means, the technical requirements for using LLMs with PHI, which tools offer BAA coverage, and the infrastructure decisions you'll face.
We wrote this for developers and technical leads at healthcare companies who want to use AI without creating a compliance problem. We'll include regulatory citations where they matter, but the focus is on what you need to do.
But first, why listen to us?
Aptible has helped thousands of digital health startups keep their cloud infrastructure HIPAA compliant since 2013. When we started building an AI Gateway, we were surprised that no great developer guide to HIPAA-compliant AI existed, so we set out to fix that. This guide is based on our research and practical experience with using PHI with LLMs in a safe and secure manner.
Mat and Henry authored this piece. As a former CISO at digital health startups and current Head of Information Security at Aptible, Mat has been through more compliance audits than he can count. Henry has worked with countless customers over the past decade at Aptible ensuring their infrastructure met HIPAA compliance requirements. Between us, we've probably navigated every HIPAA, HITRUST, SOC 2, or PCI complexity or edge case imaginable.
What "HIPAA-Compliant AI" actually means
No AI tool is "HIPAA certified." HIPAA is a federal law, implemented through regulations, and it does not certify or approve products. It sets requirements for how covered entities and business associates must safeguard protected health information (PHI).
Vendors may obtain independent security attestations such as SOC 2 reports or HITRUST certification to demonstrate the effectiveness of their controls, but those are assurance frameworks, not government certifications and not substitutes for HIPAA compliance. Compliance ultimately depends on how the organization implements and governs the system, including having appropriate safeguards and, where required, a valid BAA in place.
When AI touches PHI, HIPAA requires safeguards across three areas: administrative, physical, and technical. For LLM usage specifically, this translates to concrete technical requirements:
Core requirements:
Business Associate Agreement (BAA): If an LLM provider processes PHI on your behalf, they're a business associate under HIPAA. You need a signed BAA before sending any PHI to their API. When a BAA is in place, the provider typically implements specific technical safeguards, such as zero data retention, to align with the legal obligations defined in the agreement. This isn't just paperwork; it changes how the provider handles your data.
Audit logging: Any activity involving PHI must be logged. When your LLM calls handle patient data, that means logging prompts, responses, timestamps, and who made the request. This creates the audit trail required under 45 CFR 164.312(b).
Encryption: Data must be encrypted in transit (TLS) and at rest. This applies to your logs, any cached data, and anything persisted.
No training on PHI: You need assurance that your data won't be used to train models. This should be explicit in the BAA or provider agreement, not just a policy statement.
Operational requirements that matter in practice:
Key management: How do you manage API keys across teams, applications, and environments? How do you rotate them? How do you know which key made which request?
Model access controls: Can you restrict which models developers can use? Different risk profiles for dev vs. production?
De-identification: Are you scrubbing PHI before it reaches the LLM? Do you need to re-identify data in responses?
Cost controls: LLM costs can spiral. Do you have visibility into usage? Budget limits that actually stop requests?
Observability: Can you inspect actual requests to verify your controls are working?
The distinction between "HIPAA compliant" and "HIPAA eligible" matters. A provider being "HIPAA eligible" means they'll sign a BAA. It doesn't mean using their API automatically makes your implementation compliant. That's still on you.
The regulatory gap around AI
Using LLMs with PHI in a compliant manner does require extra effort. That's no surprise: operating in digital health has always been challenging, and AI tooling adds a new compliance dimension. It's telling that OpenAI's announcement of ChatGPT Health, its consumer health "experience," focuses on privacy and security. Unfortunately, the privacy and security protections built around the consumer-facing ChatGPT Health won't apply to your company's use of LLMs.
Here's what you need to build or buy to use LLMs with PHI compliantly.
Business Associates Agreements (BAAs)
If an LLM provider will receive, process, or store PHI, you need a BAA with them. This is non-negotiable under HIPAA. The BAA establishes liability for how they handle your data and requires them to implement appropriate safeguards.
Which providers offer BAAs:
| Provider | BAA Available | How to get it |
|---|---|---|
| OpenAI API | Yes | Request from OpenAI |
| OpenAI Enterprise | Yes | Enterprise agreement |
| Anthropic API | Yes | Request from Anthropic |
| AWS Bedrock | Yes | Part of AWS BAA |
| Azure OpenAI | Yes | Enterprise agreement |
| Google Vertex AI | Yes | Enterprise agreement |
The catch: if you're using multiple models (an OpenAI model for some tasks, Claude for others, a Bedrock model for something else), you need BAAs with each provider. This creates contract sprawl and increases the risk of accidentally routing PHI through an uncovered path.
Getting a BAA isn't always quick. Some providers offer self-service BAAs you can sign in minutes. Others require weeks or months of negotiation, especially for enterprise agreements. We've seen customers wait three months for a BAA with a major cloud provider. Plan ahead and don't assume you can start using a new provider immediately.
Audit logging
HIPAA's audit control standard (45 CFR 164.312(b)) requires mechanisms to record and examine activity in systems containing PHI. The requirement applies to the activity itself, not to every LLM call you make. If an LLM interaction handles PHI, you need to log it. If it doesn't touch patient data, HIPAA doesn't require a log entry.
For interactions that do involve PHI, log:
The prompt sent to the LLM
The response received
Timestamps
User or system that initiated the request
Which model was used
LLM providers don't do this logging for you. You get API responses, not audit trails. You need infrastructure that captures this data for PHI-touching requests.
Where this gets complicated:
Storage: These logs contain PHI. They must be encrypted at rest, access-controlled, and retained according to your retention policy.
Short-term vs. long-term: You need operational access (debugging, verification) and compliance access (audits, investigations). These have different retention requirements.
Log drains: For compliance, you often need to export logs to long-term storage (S3, a SIEM, etc.) with appropriate retention periods.
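A log drain can be sketched as a small buffering layer that batches audit records and ships compressed batches to long-term storage. This is an illustrative sketch, not a production implementation: the `upload` callable is a placeholder for whatever actually writes to your encrypted store (e.g. a thin wrapper around S3 `put_object` with server-side encryption enabled).

```python
import gzip
import json
import time


class LogDrain:
    """Buffer audit records and ship compressed batches to long-term storage.

    `upload(key, data)` is pluggable; in production it must write to
    encrypted, access-controlled storage with your retention policy applied.
    """

    def __init__(self, upload, batch_size: int = 100):
        self._upload = upload
        self._batch_size = batch_size
        self._buffer: list[dict] = []

    def write(self, record: dict) -> None:
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        # One object per batch, keyed by timestamp for easy retention sweeps
        key = f"audit/{int(time.time())}.jsonl.gz"
        payload = "\n".join(json.dumps(r) for r in self._buffer)
        self._upload(key, gzip.compress(payload.encode()))
        self._buffer = []
```

A real drain also needs failure handling: if `upload` fails silently, you get exactly the kind of logging gap auditors ask about.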
Encryption requirements
Encryption is usually the easiest requirement to meet:
In transit: TLS for all API calls. Most LLM providers enforce this by default, so you're likely already covered.
At rest: Your responsibility for logs, cached data, and any persisted prompts or responses. This is where teams sometimes miss requirements.
Key management matters here too. Who has access to encryption keys? How do you rotate them? AWS KMS or a similar managed service handles this well. If you're managing keys yourself, that's additional infrastructure to build and maintain.
Data handling
Will your patient data be used to train models?
Major LLM providers have policies about this, but policies aren't contracts. Before sending PHI to an LLM, you should verify:
Explicit opt-out in the BAA or data processing agreement. A blog post saying "we don't train on your data" isn't enforceable. The legal agreement needs to say it.
Data retention terms. How long does the provider keep your prompts and responses? For what purposes? Abuse monitoring? Safety research?
API vs. consumer product distinctions. ChatGPT the product has different data handling than the OpenAI API. Same for Claude.ai versus the Anthropic API. Make sure you're reading the right policy.
OpenAI's current policy says API data is retained for 30 days for abuse monitoring, then deleted, and is not used for training. That's their policy. What matters legally is what your BAA says.
PII/PHI De-identification
The safest PHI is PHI that never reaches the model in identifiable form.
De-identification means scrubbing sensitive data before it goes to the LLM. Names become [PATIENT_1], dates become [DATE_1], and Social Security numbers become tokens. The LLM processes the request without ever seeing the real identifiers.
HIPAA defines two de-identification standards (45 CFR 164.514):
Safe Harbor: Remove 18 specific identifiers (names, dates, geographic data, etc.)
Expert Determination: A qualified expert determines the risk of re-identification is very small
A common question: if you remove identifiers, send data to an AI system, maintain an internal mapping table, and then re-associate the output back to the individual, does the data still count as de-identified?
Under 45 CFR 164.514(c), HIPAA explicitly permits the use of a code or other means of record identification to allow re-identification by the covered entity or business associate, provided that the code is not derived from or related to the individual, the code is not otherwise capable of being translated to identify the individual, and the re-identification mechanism is not disclosed. If those conditions are met, the data disclosed to the AI system may qualify as de-identified under HIPAA, even though you retain a separate, protected re-identification key. If those conditions are not met, the data would be more accurately described as coded or pseudonymized and would remain identifiable PHI subject to HIPAA.
Building reliable de-identification is hard. You need:
NLP that accurately identifies PHI in unstructured text
Consistent tokenization so you can re-identify after responses
A clean architecture with short mapping lifecycles to reduce risk
Handling of edge cases (misspellings, nicknames, context-dependent identifiers)
Re-identification is hard too, and high-stakes. If the LLM responds with [PATIENT_1] should take [MEDICATION_1] twice daily, you need to restore the actual patient name and medication before showing it to a clinician. Obviously, you can't afford to mess this up. If your re-identification algorithm mixes up [PATIENT_1] and [PATIENT_2], your application could end up recommending [MEDICATION_1] to the wrong patient.
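To make the tokenization and re-identification round trip concrete, here is a minimal sketch. The regex patterns are toy stand-ins (a production pipeline would use a clinical NLP model, not regexes), but the structure — scrub, keep a mapping, restore — is the important part.

```python
import re

# Toy patterns for illustration only; real de-identification needs NLP
# that handles names, dates, addresses, and the other Safe Harbor identifiers.
PATTERNS = {
    "PATIENT": re.compile(r"\b(?:Jane|John)\s+Doe\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def deidentify(text: str) -> tuple[str, dict[str, str]]:
    """Replace identifiers with tokens; return scrubbed text plus a mapping."""
    mapping: dict[str, str] = {}
    counters: dict[str, int] = {}

    def substitute(kind: str, match: re.Match) -> str:
        value = match.group(0)
        # Reuse the same token for repeated occurrences of the same value,
        # so the LLM sees consistent references.
        for token, original in mapping.items():
            if original == value:
                return token
        counters[kind] = counters.get(kind, 0) + 1
        token = f"[{kind}_{counters[kind]}]"
        mapping[token] = value
        return token

    for kind, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: substitute(k, m), text)
    return text, mapping


def reidentify(text: str, mapping: dict[str, str]) -> str:
    """Restore original values in the LLM response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

The mapping itself is PHI and must live in encrypted, access-controlled storage with a short lifecycle, never alongside the scrubbed data you send to the model.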
API Key management
LLM providers give you API keys, usually just one or a few. For a small team, this could be fine. For larger organizations, the limitations become clear. How do you scope keys by application, team, or environment? How do you rotate keys without breaking production? How do you track which key made which request for auditing and cost allocation? How do you revoke access for a specific use case without affecting others?
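One way to answer those questions is a scoped-key registry that sits in front of the provider key: you issue per-application keys, store only their hashes, and attach scope metadata for auditing and revocation. This is a minimal in-memory sketch under those assumptions; a real system would persist records and log every lookup.

```python
import hashlib
import secrets
from dataclasses import dataclass


@dataclass
class KeyRecord:
    key_hash: str       # store a hash, never the raw key
    app: str
    environment: str    # e.g. "dev" or "prod"
    revoked: bool = False


class KeyRegistry:
    """Illustrative scoped-key registry backing a proxy layer."""

    def __init__(self):
        self._keys: dict[str, KeyRecord] = {}

    def issue(self, app: str, environment: str) -> str:
        raw = secrets.token_urlsafe(32)
        h = hashlib.sha256(raw.encode()).hexdigest()
        self._keys[h] = KeyRecord(h, app, environment)
        return raw  # shown to the caller once; only the hash is stored

    def lookup(self, raw: str):
        record = self._keys.get(hashlib.sha256(raw.encode()).hexdigest())
        if record is None or record.revoked:
            return None
        return record

    def revoke(self, raw: str) -> None:
        record = self._keys.get(hashlib.sha256(raw.encode()).hexdigest())
        if record:
            record.revoked = True
```

Because every request authenticates with a scoped key, you can attribute each LLM call to an application and environment, and revoke one use case without rotating everything.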
This isn't unique to HIPAA, but it becomes a compliance issue when you can't demonstrate who had access to systems processing PHI.
Model access controls
Different use cases have different risk profiles. A production feature summarizing patient records has different requirements than a developer tool for debugging.
You might want production scopes restricted to approved models only, development scopes with broader access for experimentation, and internal tools with different limits than customer-facing features.
Without a proxy layer, enforcing this requires organizational discipline and code reviews. With a proxy, you can enforce it at the infrastructure level.
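Enforced at the proxy, a model access control can be as simple as an allowlist checked before the request leaves your infrastructure. The scope names and model names below are illustrative examples, not a recommendation of specific models.

```python
# Hypothetical per-scope allowlists enforced in a proxy layer.
ALLOWED_MODELS = {
    "prod": {"gpt-4o"},
    "dev": {"gpt-4o", "gpt-4o-mini"},
}


def check_model_access(scope: str, model: str) -> None:
    """Reject the request before it ever reaches the provider."""
    if model not in ALLOWED_MODELS.get(scope, set()):
        raise PermissionError(f"scope {scope!r} may not use model {model!r}")
```

The point of doing this at the infrastructure level is that it cannot be bypassed by a forgotten code review: a dev-only model simply doesn't work with a production key.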
Compliance verification and observability
Controls are only useful if you can prove they're working. During an audit or customer security review, you need to demonstrate that de-identification is actually scrubbing what it should, that logs are being captured and stored appropriately, and that access controls are enforced.
This requires the ability to inspect actual requests and responses. Not summaries or aggregations, but the real data flowing through your systems. You need short-term access for operational needs and long-term retention for compliance.
Cost controls and budget management
Cost controls aren't a HIPAA requirement, but they're worth mentioning because LLM costs can spiral quickly.
Useful controls include visibility into usage by application, team, and use case, budget limits that stop requests before costs get out of hand, and usage tracking for cost allocation. A runaway loop burning through $10,000 in API calls overnight is a problem you'd rather catch early.
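A budget limit that actually stops requests can be sketched as a small per-scope accumulator checked before each call. This is an illustrative sketch; production systems track spend durably and estimate cost from token counts rather than trusting callers to report it.

```python
import threading


class BudgetLimiter:
    """Per-scope spend cap: once the budget is hit, requests are refused."""

    def __init__(self, limits_usd: dict[str, float]):
        self._limits = limits_usd
        self._spent: dict[str, float] = {}
        self._lock = threading.Lock()  # proxies handle concurrent requests

    def record(self, scope: str, cost_usd: float) -> None:
        with self._lock:
            self._spent[scope] = self._spent.get(scope, 0.0) + cost_usd

    def allow(self, scope: str) -> bool:
        with self._lock:
            limit = self._limits.get(scope, float("inf"))
            return self._spent.get(scope, 0.0) < limit
```

Pair the hard stop with alerting at a lower threshold (say, 80% of budget) so the first signal isn't a refused production request.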
Building HIPAA-Compliant AI: The DIY approach
If you want to use LLMs with PHI and you're building the compliance infrastructure yourself, here's what that actually entails:
The components you need
Proxy layer to intercept all LLM traffic and apply controls consistently
Logging infrastructure to capture every prompt and response with metadata
Encrypted log storage because logs contain PHI
Log drain to long-term storage for retention compliance
Short-term log access for debugging and verification
Key management system to scope keys by application, team, and use case
Model access controls to enforce which models each key can use
De-identification pipeline to scrub PHI before it reaches the LLM
Re-identification logic to restore original values in responses
Budget controls with usage tracking and limits per scope
Alerting when usage approaches limits
Protocol translation if you want to switch providers without code changes
Capacity management for rate limits and failover
That's 13 systems before you write application code.
True, some of these systems are easier to implement than others, and in some cases you can skip a few: the proxy layer, if you commit to a single provider and build controls directly into each model integration; the de-identification pipeline and re-identification logic, if your other controls meet HIPAA requirements and you elect not to de-identify.
Either way, there's significant work involved in meeting HIPAA compliance requirements, and model providers aren't doing that work for you, even if you sign a BAA with them.
Example: DIY audit logging
Here's what a basic logging wrapper looks like:
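This is a minimal sketch, provider-agnostic for clarity: `call_model` stands in for your actual client (e.g. a thin function wrapping the OpenAI SDK), and `audit_sink` stands in for your storage layer, which must be encrypted and access-controlled.

```python
import time
import uuid


def logged_completion(call_model, model: str, prompt: str,
                      user_id: str, audit_sink):
    """Call an LLM and write an audit record for the request.

    `call_model(model=..., prompt=...)` is your provider client;
    `audit_sink(record)` must write to encrypted, access-controlled
    storage -- this sketch just hands it the record as a dict.
    """
    request_id = str(uuid.uuid4())
    started = time.time()
    response_text = call_model(model=model, prompt=prompt)
    audit_sink({
        "request_id": request_id,
        "timestamp": started,
        "duration_s": time.time() - started,
        "user_id": user_id,          # who initiated the request
        "model": model,              # which model was used
        "prompt": prompt,            # contains PHI: encrypt at rest
        "response": response_text,   # contains PHI: encrypt at rest
    })
    return response_text
```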
This gives you basic audit logging. What it doesn't give you: encrypted storage (you need to configure your logging backend), retention policy enforcement, UI access for compliance verification, log drains to long-term storage, consistent logging across multiple providers (you'd need separate wrappers for each), key management with scoping, model access controls, de-identification, or budget controls.
That's a partial implementation of one capability. The rest of the list is separate work.
This is why teams look for managed solutions.
Common mistakes we see
After working with hundreds of healthcare companies on compliance, these are the mistakes that come up repeatedly:
Logging without encryption. We've seen teams diligently log every prompt and response, then store those logs in plaintext Elasticsearch or an unencrypted S3 bucket. The logs contain PHI. If they're not encrypted at rest and access-controlled, you've created a compliance gap that's often worse than not logging at all.
Adding providers without BAAs. One company had a BAA with Anthropic but not OpenAI. A developer tried OpenAI for a task where it performed better, not realizing the BAA requirement. They found out when the developer mentioned it in standup. The fix was straightforward, but it required disclosure to their compliance team and documentation of the incident.
Confusing consumer products with APIs. A clinician used ChatGPT Plus for a "quick test" with real patient data because "OpenAI is HIPAA compliant." It's not. The API with a BAA can be, assuming proper implementation. ChatGPT Plus is not. By the time anyone noticed, PHI had been sent to a consumer product with no BAA coverage.
Discovering logging gaps during an audit. A company's logging infrastructure had a silent failure for three weeks. They discovered it during a customer security review when they couldn't produce logs for that period. HIPAA requires you to be able to examine activity. If your logging had gaps or your retention policy wasn't enforced, you have a problem that's hard to fix retroactively.
Relying on policy instead of contract. A SaaS vendor's website says something like "we don't train on your data" or "your data is encrypted." Great. What does your BAA say? We've seen this pattern repeatedly with cloud services: policies can change, marketing pages get updated, but contracts are enforceable. If it's not in the BAA, don't assume it's guaranteed.
The gateway approach
Instead of building compliance infrastructure yourself, it is sometimes possible to route LLM traffic through a managed gateway that can handle controls centrally. That’s where Aptible’s AI Gateway came from: many of our customers asked us to help them ensure HIPAA-compliant AI API calls, the same way we help them ensure HIPAA compliant cloud infrastructure.
Why teams use AI gateways
All LLM requests flow through a single control layer. This provides enormous benefits in terms of visibility and cost control, and there are several production-grade gateways available, such as LiteLLM and Portkey, which we have recommended to customers in cases where HIPAA compliance is not required.
What no other gateway provides, and the entire reason we built Aptible AI Gateway, is HIPAA compliance out of the box.
Aptible AI Gateway gives you:
One BAA instead of one per provider
Consistent logging regardless of which model you're using
Centralized controls for key management, model access, de-identification
Model switching without rebuilding compliance infrastructure
The gateway is not intended to replace your application code. You still call LLMs the same way; you just change the base URL for your requests so calls route through infrastructure that automatically applies controls such as logging, key management, model access, and cost limits.
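For an OpenAI-compatible gateway, the switch can be as small as a configuration change. The gateway URL below is a placeholder, not a real endpoint; check your gateway's documentation for the actual value.

```shell
# The OpenAI SDKs read these environment variables, so no code changes
# are needed -- point the client at the gateway instead of api.openai.com.
export OPENAI_BASE_URL="https://your-gateway.example.com/v1"  # placeholder
export OPENAI_API_KEY="your-gateway-scoped-key"               # placeholder
```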
Comparing AI gateways for HIPAA
As mentioned above, LiteLLM and Portkey are great gateways, when HIPAA compliance is not required. Here's how the options compare on key HIPAA capabilities:
| Capability | LiteLLM | Portkey | Aptible AI Gateway |
|---|---|---|---|
| BAA available | No (self-hosted) | Enterprise tier | Yes, standard |
| HITRUST certified | No | No | Yes |
| Audit logging with PHI encryption | DIY | Requires config | Yes |
| Log drain to long-term storage | DIY | Yes | Yes |
| Key scoping by app/team | Limited | Yes | Yes |
| Model access controls | Limited | Yes | Yes |
| PII/PHI de-identification | No | Enterprise tier | Coming soon |
| Re-identification | No | No | Coming soon |
| Budget controls | Basic | Yes | Coming soon |
| Built for digital health startups | No | No | Yes |
| Deployment model | Self-hosted | SaaS/Hybrid/Air-gapped | Managed |
LiteLLM is open-source and self-hosted. It handles routing and basic logging, but you're responsible for the HIPAA infrastructure around it. No BAA because there's no vendor relationship. Good for teams with existing compliant infrastructure who just need routing.
Portkey offers HIPAA compliance on their Enterprise tier, including BAA signing and PII anonymization. The Enterprise plan supports SaaS, hybrid, or air-gapped deployment. You'll need to negotiate Enterprise pricing and may need to assemble additional components for full HIPAA coverage.
Aptible AI Gateway is purpose-built for healthcare. Aptible has been helping digital health startups meet compliance requirements since 2013, and is HITRUST R2 certified and SOC 2 Type 2 audited. The AI Gateway gives you all 13 systems out of the box: one BAA covers all supported models, de-identification and re-identification are built in, and audit logging is automatic. No enterprise sales process required.
Learn more about Aptible AI Gateway →
Getting started
You can build HIPAA-compliant AI features without building all the compliance infrastructure from scratch. The requirements are real: BAAs, audit logging, encryption, access controls. How you implement them is up to you.
You can certainly start with a single model provider, get the BAA in place, and build the compliance infrastructure like logging from day one.
But, you might consider using a gateway if you want to enable your engineers to work with multiple models while ensuring HIPAA compliance controls for each, or if you don’t want to manage BAAs, logging, access controls and all the rest on your own.
Aptible AI Gateway gives you the compliance infrastructure out of the box. Learn more or talk to an engineer.
Frequently asked questions
Which AI is HIPAA compliant?
No AI is inherently HIPAA compliant. Compliance depends on implementation. The major LLM providers (OpenAI Enterprise and API, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI) offer BAAs that enable HIPAA-compliant use. You're still responsible for audit logging, encryption, access controls, and other implementation requirements.
Is ChatGPT HIPAA compliant?
It depends which product. ChatGPT Free, Plus, and Team do not offer BAAs and should not be used with PHI. ChatGPT Enterprise offers BAA availability, no training on your data, SSO/SCIM, and encryption at rest. The OpenAI API is a separate product with its own BAA availability. OpenAI's Codex (their coding agent) inherits the compliance posture of your ChatGPT plan. The bottom line: "Is ChatGPT HIPAA compliant?" depends on which product, with what configuration, and what you build around it.
Is Claude HIPAA compliant?
Same structure as OpenAI. Claude.ai (the consumer product) does not offer a BAA. The Claude API offers BAA availability for customers who request it. Claude for Enterprise includes BAA availability and enhanced security. Claude Code (Anthropic's CLI for developers) uses the API, so it's covered under your API agreement if you have a BAA. Anthropic's policy states API data is not used for training. Data is retained briefly for trust and safety, then deleted.
Do I need a BAA with OpenAI to use their models with PHI?
Yes. If you're sending PHI to OpenAI's API, they're acting as a business associate. HIPAA requires a BAA before sharing PHI with a business associate. OpenAI offers BAAs for API and Enterprise customers.
How long does it take to get a BAA?
It varies widely. Some providers offer self-service BAAs that you can review and sign in minutes. Others require weeks or months of back-and-forth, especially for enterprise-tier agreements or if you need custom terms. We've heard from customers who waited three months to finalize a BAA with a major provider. If you're planning to use a new LLM provider with PHI, start the BAA process early. Don't assume you can sign up and start sending data the same day.
What about open-source LLMs like Llama?
Self-hosting open-source models eliminates the need for a provider BAA because you're the only party handling the data. But you're responsible for all security controls on your hosting infrastructure: encryption, access controls, audit logging, physical security if applicable. Self-hosting doesn't simplify compliance. It shifts the entire burden to you.
How do I audit AI usage for HIPAA?
You need to log activity involving PHI. When LLM calls handle patient data, log the prompt, response, timestamps, user/system attribution, and which model was used. These logs must be encrypted at rest, access-controlled, and retained according to your retention policy. Most LLM providers don't create these logs for you, so you need to build or buy logging infrastructure.
Is de-identification required for HIPAA-compliant AI?
No. De-identification is not required if the AI provider is acting as a Business Associate and a valid BAA is in place. In that case, PHI may be processed in compliance with HIPAA safeguards.
De-identification is only required if you do not have a BAA and want to share data without it being treated as PHI. To qualify, the data must meet the HIPAA de-identification standard under 45 CFR 164.514, either by removing all required identifiers under the Safe Harbor method or through a documented Expert Determination. If those standards are not met, the data is still considered PHI under HIPAA.
That said, even with a BAA in place, de-identification can reduce your risk exposure and simplify your compliance posture.
What's the difference between HIPAA compliant and HIPAA eligible?
"HIPAA eligible" typically means a vendor will sign a BAA. "HIPAA compliant" implies the full implementation meets HIPAA requirements. A product being HIPAA eligible doesn't make your use of it compliant. You still need to implement required controls like logging, encryption, and access controls. The term "HIPAA certified" is misleading because HIPAA doesn't certify products. For more on HIPAA basics, see our HIPAA Compliance Guide.