HIPAA AI Security
Data residency for healthcare AI: what the regulations actually require
By Mat Steinlin, Head of Information Security
Last updated: April 2026
"Data residency" is one of the most misapplied terms in healthcare compliance. US healthcare companies sometimes believe they need their data to stay in the US because of HIPAA. They don't. European and Australian healthcare companies may have genuine regional requirements, but "data residency" can mean different things depending on which regulation you're reading and who's interpreting it.
This chapter defines what data residency actually means at the infrastructure level, which regulatory frameworks require it, what each framework actually demands (and doesn't), and what "data residency for LLM infrastructure" looks like in practice. The goal is to help engineering teams make accurate assessments: neither over-building regional infrastructure they don't need, nor under-building for a requirement they genuinely have.
What data residency means for AI infrastructure
"Data residency" encompasses at least three distinct requirements that are often conflated:
Storage residency
The data at rest (logs, stored records, backups) must reside on infrastructure physically located in a specific region. This is the most common form of data residency requirement. Under this standard, your audit logs containing patient data must be stored in EU-region S3 buckets, AU-region databases, or equivalent regional infrastructure. Model inference can happen anywhere; the stored data must stay in region.
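As a concrete illustration of storage residency, the sketch below builds the parameters for pinning an S3 audit-log bucket to a specific region using boto3's create_bucket API. The bucket name is illustrative; note the S3 quirk that us-east-1 is the default location and rejects an explicit LocationConstraint.

```python
# Sketch: build create_bucket parameters that pin a bucket to a region.
# us-east-1 is S3's default location and rejects an explicit
# LocationConstraint, so it's special-cased.
def regional_bucket_params(bucket_name: str, region: str) -> dict:
    params = {"Bucket": bucket_name}
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params

# Usage with boto3 (requires AWS credentials; names are illustrative):
# import boto3
# s3 = boto3.client("s3", region_name="eu-central-1")
# s3.create_bucket(**regional_bucket_params("audit-logs-eu", "eu-central-1"))
```

Creating the bucket in the right region is necessary but not sufficient; replication rules and backup targets must also stay in region.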
Processing residency
The data must also be processed (including inference) on infrastructure in the specified region. For LLM workloads, this means the model endpoint itself must be in the required region. This is more restrictive than storage residency because it limits your model options to whatever the provider makes available in that region, and not all models are available in all regions.
End-to-end residency
Every component in the data flow (the API gateway, the model endpoint, the log storage, the key management infrastructure) must be in the required region. No data leaves that region at any point. This is the most architecturally complex form, and it's what strict regulatory interpretations sometimes require.
Most regulatory requirements specify storage residency at minimum. Some (particularly under strict GDPR interpretations for cross-border transfer) effectively require processing residency as well. End-to-end residency is typically required only for government or national security adjacent workloads, not for standard healthcare applications.
Understanding which tier your regulatory situation actually demands is the first step before any infrastructure decisions.
US-only operations: what HIPAA does and doesn't require
HIPAA doesn't impose data residency requirements. The HIPAA Security Rule requires appropriate physical, technical, and administrative safeguards for electronic PHI, but it doesn't specify which country or region those safeguards must operate in. A BAA with a US-based LLM provider is sufficient for HIPAA compliance regardless of where that provider's datacenters are located, as long as the BAA and the underlying security practices meet the standard.
US-based digital health companies serving only US patients and operating under HIPAA don't need to implement data residency controls for HIPAA compliance. If customers or contracts require specific regional hosting (common for enterprise health system customers), that's a contractual requirement, not a regulatory one, and should be scoped to what the contract actually specifies.
If your product also serves EU, Australian, or Canadian users, read on. The frameworks below have substantive requirements that differ from HIPAA.
GDPR (EU/EEA)
If your product processes personal data of EU or EEA residents (including as a US-based company serving EU patients), GDPR applies. GDPR doesn't prescribe where data must be stored, but it restricts the transfer of personal data outside the EU/EEA to countries that don't provide an equivalent level of protection.
The US doesn't have a general adequacy decision from the European Commission for all data transfers. The EU-US Data Privacy Framework (the successor to Privacy Shield) provides an adequacy mechanism for transfers to certified US organizations, but it has been challenged legally and its long-term stability is uncertain.
For most healthcare AI teams, the practical implication of GDPR on LLM usage is:
Your LLM provider is a data processor and must sign a Data Processing Agreement (DPA): a GDPR construct, not a US-style BAA
If the provider's infrastructure is outside the EU/EEA, the transfer requires a valid legal mechanism, typically Standard Contractual Clauses (SCCs)
Log retention, deletion, and access must comply with GDPR obligations, including the right to erasure
Major LLM providers (Anthropic, OpenAI, Google) offer DPAs and SCC mechanisms for EU customers. Verify that your provider's DPA covers healthcare/special category data specifically (not just personal data in general), since health data has heightened requirements under GDPR Article 9.
If your EU regulatory situation requires that EU patient data not leave the EU at all (rather than relying on SCCs), you need a provider with genuine EU-region inference endpoints, which typically means AWS Bedrock, Azure OpenAI, or Google Vertex AI in EU regions, not the direct Anthropic or OpenAI APIs, which route to US infrastructure by default.
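One way to make the EU-only requirement hard to violate in code is to refuse to construct a model client for any region outside the residency boundary. A minimal sketch, assuming AWS Bedrock; the region set is an assumption to adjust to your legal scope, and the commented model ID is illustrative:

```python
# Regions treated as inside the EU residency boundary (assumption:
# confirm the exact set with your privacy counsel).
EU_INFERENCE_REGIONS = {"eu-central-1", "eu-west-1", "eu-west-3"}

def bedrock_client_kwargs(region: str) -> dict:
    """Return kwargs for a Bedrock runtime client, refusing any region
    outside the EU residency boundary."""
    if region not in EU_INFERENCE_REGIONS:
        raise ValueError(f"{region} is outside the EU residency boundary")
    return {"service_name": "bedrock-runtime", "region_name": region}

# Usage (requires AWS credentials; model ID is illustrative):
# import boto3
# client = boto3.client(**bedrock_client_kwargs("eu-central-1"))
# client.invoke_model(modelId="anthropic.claude-3-5-sonnet-20240620-v1:0", ...)
```

Centralizing client construction like this means a misconfigured region fails loudly at startup rather than silently routing patient data out of the EU.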
DPA requirements
Under GDPR, any vendor that processes personal data on your behalf is a data processor and must sign a DPA before processing begins. The DPA must specify the subject matter, duration, nature and purpose of processing, the type of personal data, and the rights and obligations of both parties, per GDPR Article 28.
For LLM providers: if you're sending EU patient data to an LLM provider, they're a data processor. Check whether the provider offers a GDPR-compliant DPA, whether that DPA covers special category data (health data falls under Article 9), and whether they will sign it for your account tier. Enterprise and API tiers from major providers typically offer DPAs; consumer products typically don't.
What to look for in an LLM provider's DPA:
Data processing purpose limitations (they can only process your data for the purpose of providing the service)
Sub-processor obligations (they must list sub-processors and notify you of changes)
Data deletion commitments aligned with your retention requirements
Breach notification timeline (GDPR requires notification within 72 hours)
Your right to audit or receive audit reports
Cross-border transfer mechanisms
If your LLM provider processes EU patient data on infrastructure outside the EU/EEA, a transfer mechanism is required. The principal options:
Standard Contractual Clauses (SCCs): The most common mechanism. The European Commission publishes standard contractual clauses that, when incorporated into a DPA, provide the legal basis for cross-border transfer. Most enterprise LLM provider agreements include SCCs in their DPA packages. Verify that the SCCs reference the current (2021) version, not the pre-2021 version which is no longer valid for new agreements.
EU-region inference: If you require that EU patient data not leave the EU at all, you need a provider with EU-region model endpoints: AWS Bedrock (EU regions), Azure OpenAI (EU regions), or Google Vertex AI (EU regions). This is architecturally more complex and limits your model options to what's available in those regions, but it eliminates the cross-border transfer question entirely.
EU-US Data Privacy Framework: For US providers certified under the DPF, this provides an adequacy mechanism. However, the DPF has been subject to legal challenges since its predecessors (Safe Harbor and Privacy Shield) were both invalidated by the Court of Justice of the EU. Teams relying solely on DPF certification should have a fallback plan.
Log retention and right to erasure
GDPR's right to erasure (Article 17) gives data subjects the right to request deletion of their personal data. For LLM audit logs that contain patient data in prompts and responses, this creates a tension with HIPAA's six-year retention requirement.
The practical resolution:
HIPAA retention requirements are a legal obligation that typically takes precedence for covered entities; document this tension and note that logs are retained for legal compliance
For non-HIPAA-covered data in logs (European patient data from non-US contexts), deletion requests must be handled and documented
Implementing the de-identification architecture from the de-identification chapter reduces this tension: de-identified logs contain tokens rather than direct personal data, which may limit the scope of erasure obligations; but get a privacy legal opinion on whether token-mapped data qualifies as de-identified under GDPR's standard before relying on this
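To make the pseudonymization point concrete, here's a minimal sketch of the token-mapping pattern referenced above. Names and structure are illustrative; the de-identification chapter covers the full design. The key property for erasure requests: destroying the mapping entry severs the link between the logged token and the person.

```python
import secrets

# Sketch: replace direct identifiers with opaque tokens before logging.
# Because the mapping is retained, this is pseudonymization under GDPR,
# not anonymization; erasing the mapping entry severs the link.
class TokenMap:
    def __init__(self):
        self._forward = {}   # identifier -> token
        self._reverse = {}   # token -> identifier

    def tokenize(self, identifier: str) -> str:
        if identifier not in self._forward:
            token = f"TOK_{secrets.token_hex(8)}"
            self._forward[identifier] = token
            self._reverse[token] = identifier
        return self._forward[identifier]

    def erase(self, identifier: str) -> None:
        """Handle an erasure request by destroying the mapping entry."""
        token = self._forward.pop(identifier, None)
        if token is not None:
            del self._reverse[token]

tm = TokenMap()
log_line = f"Patient {tm.tokenize('jane.doe@example.com')} reported side effects"
tm.erase("jane.doe@example.com")  # the log line now maps to no identifier
```

Whether erasing the mapping satisfies an Article 17 request for the log line itself is exactly the question to put to privacy counsel.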
The European Data Protection Board (EDPB) has published guidelines on AI and data protection that are the authoritative reference for GDPR interpretation in AI contexts.
Australian Privacy Act
The Australian Privacy Act 1988 and the Australian Privacy Principles (APPs) impose obligations on organizations handling health information of Australian residents. APP 8 restricts cross-border disclosure of personal information unless the recipient is subject to a law or binding scheme that provides substantially similar privacy protection, or the individual has consented.
For healthcare companies operating in Australia (particularly those covered by the My Health Records Act 2012), there's increasing customer and regulatory expectation that health data stays in Australia. This isn't always a legal requirement in all contexts, but it's frequently a contractual requirement from Australian health system customers and often a de facto requirement for participation in national health data programs.
AWS Bedrock has an Asia Pacific (Sydney) region that supports a subset of available models. If you need AU-region processing, Bedrock with a Sydney-region deployment is the primary path for enterprise-grade compliance. Model availability in the Sydney region is more limited than US regions; verify specific model availability in the Bedrock model documentation before committing to a model choice.
PIPEDA and Quebec Law 25 (Canada)
PIPEDA (the federal framework) applies to commercial organizations handling personal information in Canada, including health information outside provincially regulated contexts. Quebec's Law 25 (Bill 64) significantly strengthened Quebec's privacy requirements and brought them closer to GDPR in scope, including data transfer restrictions and privacy impact assessment requirements.
For cross-border data transfers under PIPEDA, the organization transferring data remains accountable for ensuring the recipient provides comparable protection. Under Law 25, transfers to recipients in jurisdictions with "adequate" protection levels are permitted, but the transferring organization must conduct and document a privacy impact assessment.
AWS Bedrock has a Canada (Central) region with a subset of available models. For Canadian health data with strict residency requirements, a Bedrock Canada-region deployment is the appropriate path, with the same caveat about verifying specific model availability.
What data residency requires at the infrastructure level
For teams with a genuine regional residency requirement (EU, Australia, or Canada), the infrastructure implications are substantial and often underestimated.
Gateway infrastructure in region. An LLM API gateway sitting in US-East that routes to EU-region model endpoints still processes the request (reads prompts, writes logs) in the US. If processing residency is required, the gateway itself must be deployed in the required region.
Model endpoints in region. Provider-specific:
AWS Bedrock supports regional deployments in EU (Frankfurt, Ireland), AP (Sydney, Tokyo, Seoul), and Canada (Central). Supported models vary by region.
Azure OpenAI is available in multiple EU regions (Sweden Central, France Central, UK South, others). Model availability and versions differ from US regions.
Google Vertex AI is available in EU regions (Netherlands, Belgium, Finland). Gemini model availability in EU regions may lag US availability.
Direct Anthropic API and direct OpenAI API route to US infrastructure by default. Regional routing is available in some enterprise configurations; verify with the provider.
Log storage in region. S3 buckets, CloudWatch log groups, and similar must be created in the required region. Data written to a US-region bucket doesn't satisfy EU or AU storage residency requirements even if the bucket has appropriate encryption and access controls.
KMS key management in region. If you use AWS KMS for log encryption (as described in the audit logging chapter), the KMS keys must also be in the required region. Cross-region KMS key usage for data encrypted in a different region creates both a compliance gap and operational complexity.
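A lightweight guard against the storage and key-management gaps above is a deploy-time check that every component's region matches the required one. A sketch, assuming AWS ARN conventions (region is the fourth ARN field); ARNs and regions below are illustrative:

```python
def kms_key_region(key_arn: str) -> str:
    # ARN format: arn:aws:kms:<region>:<account>:key/<id>
    return key_arn.split(":")[3]

def assert_in_region(key_arn: str, bucket_region: str, required_region: str) -> None:
    """Fail deployment if the log bucket or its KMS key sits outside
    the required residency region."""
    checks = [("KMS key", kms_key_region(key_arn)), ("log bucket", bucket_region)]
    for name, region in checks:
        if region != required_region:
            raise ValueError(f"{name} is in {region}; residency requires {required_region}")

# Usage (illustrative ARN):
# assert_in_region(
#     "arn:aws:kms:eu-central-1:123456789012:key/abc-123",
#     bucket_region="eu-central-1",
#     required_region="eu-central-1",
# )
```

Running a check like this in CI or at service startup turns a silent compliance gap into an immediate failure.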
This isn't a configuration change to an existing deployment. It requires either a regional parallel deployment or a multi-region architecture that routes requests based on data subject location. Both have meaningful engineering and operational overhead.
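The multi-region routing variant can be as simple as a lookup from data subject residency to deployment region, applied before any request leaves your infrastructure. A sketch; the residency labels and region mappings are illustrative assumptions, not a legal determination of where any given user's data must go:

```python
# Sketch: route each request to the deployment region matching the data
# subject's residency. Mappings are illustrative assumptions.
RESIDENCY_REGIONS = {
    "EU": "eu-central-1",     # GDPR-scoped subjects
    "AU": "ap-southeast-2",   # Australian Privacy Act / APP 8
    "CA": "ca-central-1",     # PIPEDA / Quebec Law 25
}
DEFAULT_REGION = "us-east-1"  # HIPAA imposes no residency requirement

def route_region(data_subject_residency: str) -> str:
    return RESIDENCY_REGIONS.get(data_subject_residency, DEFAULT_REGION)
```

The hard part isn't the lookup; it's reliably determining residency per request and keeping every downstream component (gateway, model endpoint, logs, keys) consistent with the routed region.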
Model availability constraints by region
Not all models are available in all regions, and regional model availability lags behind US availability as providers roll out new models. This is a practical constraint that should factor into model selection decisions for teams with residency requirements.
AWS Bedrock regional availability
The AWS Bedrock model documentation is the authoritative reference for which models are available in which regions. Key constraints as of this writing:
The AP Sydney region (ap-southeast-2) has limited model availability compared to US regions. Anthropic Claude models have been available in Sydney; verify specific model versions.
EU regions (eu-central-1 Frankfurt, eu-west-1 Ireland) support a broader model set but may not have the latest model versions at launch.
Bedrock cross-region inference allows routing to other regions when a model isn't available in your primary region, but this defeats regional data residency if the fallback region is outside your required geography.
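Bedrock cross-region inference profiles carry a geography prefix in their IDs (for example "eu." or "us."); checking that prefix is a cheap guard against a profile that could route outside your required geography. A sketch, assuming that ID convention (verify it against the Bedrock documentation for your account); the profile ID below is illustrative:

```python
def profile_geography(inference_profile_id: str) -> str:
    """Extract the geography prefix from a cross-region inference profile
    ID, e.g. 'eu.anthropic.claude-...' -> 'eu'. Assumes the documented
    prefix convention; verify for your account."""
    return inference_profile_id.split(".", 1)[0]

def check_profile(profile_id: str, required_geography: str) -> None:
    geo = profile_geography(profile_id)
    if geo != required_geography:
        raise ValueError(
            f"profile routes within '{geo}'; residency requires '{required_geography}'"
        )

# Usage (illustrative profile ID):
# check_profile("eu.anthropic.claude-3-5-sonnet-20240620-v1:0", "eu")
```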
Azure OpenAI EU regions
Azure OpenAI's regional availability documentation covers which GPT and other models are available in EU regions. Azure typically has strong EU coverage (Sweden Central, France Central, UK South are common deployment targets), though specific model versions and quota availability may differ from US regions. Azure has existing healthcare compliance infrastructure and BAA-equivalent agreements that make it a common choice for EU healthcare AI.
Google Vertex AI EU regions
Vertex AI's regional documentation covers Gemini and other model availability. EU regions in Belgium (europe-west1) and Netherlands (europe-west4) support Gemini models, but availability and versions may lag US regions.
The general constraint to plan around: if you design a healthcare AI feature around a specific model in a US region, verify that the same model is available in your required region before committing to that model. A feature built around GPT-4o-2024-11-20 that can only run in US-East can't be deployed to an EU region without confirming that Azure OpenAI offers the same version in your target EU region.
A note on Aptible's data residency support
We work with customers who have regional data residency requirements like the ones described in this chapter (one in the EU, one in Australia). Aptible AI Gateway is built to support those deployment patterns: deploying the gateway stack in EU and AP regions to satisfy storage and processing residency requirements. Regional support hasn't been released publicly yet. If you have immediate EU or AP regional requirements, request beta access and we'll work with you to enable it for your account.
Teams that can't wait for regional gateway support should implement the DIY regional architecture described above, or evaluate AWS Bedrock regional deployments with their own gateway and logging infrastructure.
FAQs
Does HIPAA require data residency?
No. HIPAA doesn't specify where PHI must be stored or processed geographically. The HIPAA Security Rule requires appropriate safeguards for electronic PHI, but those safeguards are about access controls, encryption, and audit trails, not geography. A covered entity can use a US-based LLM provider whose infrastructure runs in any AWS region and be fully HIPAA compliant, as long as a BAA is in place and the underlying security practices meet the standard.
The misconception is common because "compliance" and "HIPAA" are often used interchangeably, and some enterprise healthcare customers require US-region hosting in their contracts. That is a contractual requirement, not a HIPAA requirement.
What's the difference between a BAA and a DPA?
A BAA (Business Associate Agreement) is a US HIPAA construct. It establishes the legal relationship between a covered entity and a vendor that handles PHI, specifying the vendor's obligations under HIPAA: permitted uses, breach notification, security requirements.
A DPA (Data Processing Agreement) is a GDPR construct. It establishes the terms under which a data processor may handle personal data on behalf of a data controller, per the requirements of GDPR Article 28. A DPA must address purpose limitation, sub-processors, security measures, audit rights, and data subject rights obligations.
They aren't interchangeable. A US company processing EU patient data needs both a BAA (if HIPAA applies) and a DPA (for GDPR compliance). A provider may offer one without the other. Verify both when evaluating LLM providers for healthcare use cases with EU patient populations.
Do I need a DPA with every LLM provider I use?
If any of those providers processes personal data of EU residents in the course of providing their service: yes. Each provider is an independent data processor, and each requires its own DPA.
If you route all LLM traffic through a single gateway that handles the provider relationship, the DPA situation may simplify: you have a DPA with the gateway operator, and the gateway operator's sub-processor obligations require them to have DPAs with the underlying model providers. Verify that your gateway operator's DPA explicitly covers the sub-processors they use; this is a standard DPA requirement, but not all providers' standard agreements address it adequately.
How does de-identification interact with GDPR obligations?
Under GDPR, truly anonymized data is outside the scope of the regulation: if data is anonymized such that the data subject can't be re-identified by any means reasonably likely to be used, it's no longer personal data and GDPR doesn't apply (GDPR Recital 26). Pseudonymized data — where a key exists that could re-identify the subject — is still personal data under GDPR. This is confirmed by EDPB Guidelines 01/2025 on pseudonymisation: pseudonymized data remains personal data even in the hands of a recipient who cannot re-identify it, as long as re-identification is technically possible anywhere.
The token mapping approach described in the de-identification chapter produces pseudonymized data: the tokens can be re-identified using the mapping you retain. GDPR still applies to de-identified prompts. The practical benefit is reduced exposure if logs are compromised, not elimination of GDPR obligations.
There is a second, separate obligation to understand: de-identifying data before sending it to an LLM doesn't affect your GDPR obligations as a data controller for the original personal data you continue to hold. GDPR's full requirements — lawful basis for processing, security obligations under Article 32, data subject rights, and retention limits — apply to your original records regardless of what version you send to the LLM provider. De-identification reduces risk for the data in the LLM's hands; it doesn't reduce your obligations for the data in your own systems.
Get a privacy legal opinion on the specific de-identification approach you implement before drawing compliance conclusions from it, particularly on whether your approach meets GDPR's anonymization standard rather than pseudonymization.
Next steps
This is the final chapter in the guide. If you've worked through from the beginning, you have the full picture: key architecture, audit logging, de-identification, shadow AI governance, prompt injection mitigations, agentic security controls, and regional compliance requirements.
HIPAA AI security: guide index: the full chapter list
How to build a secure AI stack in digital health: the overview chapter with the pre-launch checklist, if you want to review what to prioritize
HIPAA-compliant AI: what developers need to know: the compliance baseline that this guide builds on