Every quarter, we host a webinar to share everything that’s new with Enclave and Gridiron.
In case you missed it, you can watch a recording of our January webinar below. You can also grab the transcript and the slide deck in our resources section. We provide a full recap of the event in this post.
January 2018 Quarterly Product Update Webinar
Meltdown and Spectre
We kicked off the webinar with a panel on Meltdown and Spectre to discuss how we protect our customers when vulnerabilities are disclosed. We also did a deep dive into the ways that Enclave is architected for security more generally, with an emphasis on isolation based on trust levels. CTO Frank Macreery and Lead Enclave Developer Thomas Orozco discussed these topics and more in a conversation moderated by Web Security Advocate Elissa Shevinsky.
Meltdown and Spectre aren’t the worst vulnerabilities, but they stand out because of their wide impact. Both stem from design flaws in CPUs, affecting modern Intel processors as well as other processors used in everything from laptops to mobile devices. Meltdown uses speculative execution to leak kernel data. Spectre can also be used to leak information from the kernel; it’s trickier to exploit but also more difficult to mitigate.
Meltdown can be easy to exploit, and those exploits are difficult to detect. It opens up privilege escalation paths, which makes it particularly relevant for cloud computing infrastructure like Aptible.
There are two general pathways to exploit Meltdown. The first is that an attacker can gain access to data belonging to your peers (customers running on the same instance as you). Aptible protects against this on Enclave by requiring that all sensitive workloads be isolated on dedicated-tenancy stacks, which are not shared with other customers.
The other way is to attack Enclave itself. Anyone on the internet can potentially open an account on a shared environment. We limit that risk by separating our riskiest systems from the most sensitive environment variables, the ones required to administer Aptible.
We’ve put together a diagram and slide deck showing how we think about trust level, isolation, and threats. We isolate the riskiest components from the most sensitive and privileged data. Our CTO Frank Macreery wrote a blog post on how the Aptible team responded to Meltdown and Spectre. For an even deeper dive into these details, we recommend watching the webinar or reading the webinar transcript.
The steps for the Meltdown mitigation were fairly extensive. We began our patching process with the highest-risk instances, which are shared-tenancy instances. (We love our customers but treat all applications as potentially untrusted by nature.) Then we moved on to dedicated-tenancy stacks. As a final step, we scheduled maintenance windows with customers to restart databases and make sure all instances were patched against Meltdown. We completed the work on January 9th, five days after we began.
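The rollout order described above can be sketched as a simple sort. This is an illustrative sketch only, not Aptible's actual tooling: the `Instance` type, its fields, and the instance names are all hypothetical. It patches shared-tenancy instances first and dedicated-tenancy instances next, and sets aside anything hosting a database for a customer-scheduled maintenance window.

```python
from dataclasses import dataclass

# Lower rank is patched earlier; this mirrors the order described in the post.
TENANCY_RANK = {"shared": 0, "dedicated": 1}


@dataclass(frozen=True)
class Instance:
    """Hypothetical record for one host; not an Aptible data model."""
    name: str
    tenancy: str          # "shared" or "dedicated"
    hosts_database: bool  # True if patching requires a database restart


def plan_rollout(instances):
    """Return (patch_now, needs_window), each ordered shared-first."""
    def order(instance):
        return (TENANCY_RANK[instance.tenancy], instance.name)

    patch_now = sorted((i for i in instances if not i.hosts_database), key=order)
    needs_window = sorted((i for i in instances if i.hosts_database), key=order)
    return patch_now, needs_window


if __name__ == "__main__":
    fleet = [
        Instance("dedicated-app-1", "dedicated", False),
        Instance("shared-db-1", "shared", True),
        Instance("shared-app-1", "shared", False),
    ]
    now, later = plan_rollout(fleet)
    print("patch immediately:", [i.name for i in now])
    print("schedule a window:", [i.name for i in later])
```

Keeping the database hosts in a separate list reflects the final step above: those restarts were coordinated with customers rather than rolled out unannounced.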
There’s a lot to be learned from these vulnerabilities, and there are important lessons that our customers can and should apply to any potential security risk.
You always want to start your security process by identifying abstract threats, i.e., where you are vulnerable and how you can architect to protect against those vulnerabilities.
Threats are individual. The threats to your company may be different than the threats that we face at Aptible.
You want to figure out which of the services you depend on pose the biggest threats, and how to architect to isolate and protect against those threats.
Threats aren’t always abstract. Whenever a new major vulnerability emerges (Meltdown, for example) you want to assess how that fits into your security model and how you need to respond to it. And this is a process that should be ongoing over time.
The Enclave infrastructure wasn’t built overnight–it was an iterative process. The diagram that we’ve shown in the slide deck is the result of an accumulation of updates and improvements since 2013.
Enclave Metric Drains
We released a new feature: Metric Drains. Metric Drains help you monitor the performance of your containers. Container Metrics are captured every 30 seconds. They are routed every 15 seconds to the destination of your choice. Currently, that includes InfluxDB (hosted either on Enclave or by InfluxData) and Datadog. We plan to add more third-party providers in the future, and feedback is welcome.
We’ve had a lot of requests around metrics retention. Using Metric Drains, you can retain the metrics for as long as you want, and you gain ownership of your metrics. You can incorporate them into dashboards, for example, to better understand your applications.
Metric Drains are functionally similar to Log Drains, except that they carry metrics instead of logs. What’s captured?
Memory Usage & Limit
Disk Usage & Limit (DB Only.)
If you want to start capturing performance metrics, check out our documentation on Metric Drains.
Other Enclave Updates
Managed HIDS is now generally available. On a weekly basis, you get audit-ready PDF and CSV reports. This satisfies compliance requirements for intrusion detection. It is free for shared-tenancy stacks and costs $0.02 / GB / hour for dedicated-tenancy stacks.
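To get a feel for the dedicated-tenancy rate, here is a small worked calculation. It is only a sketch: the recap doesn't spell out exactly what "GB" measures or how hours are billed, so the function name, the 8 GB figure, and the 720-hour month are assumptions chosen for illustration.

```python
from decimal import Decimal

# Dedicated-tenancy rate quoted in this post, in dollars per GB per hour.
HIDS_RATE_PER_GB_HOUR = Decimal("0.02")


def hids_cost(gb, hours, dedicated=True):
    """Estimate the Managed HIDS charge in dollars (shared tenancy is free)."""
    if not dedicated:
        return Decimal("0")
    return HIDS_RATE_PER_GB_HOUR * Decimal(gb) * Decimal(hours)


if __name__ == "__main__":
    # Assumed example: 8 GB billed across a 30-day month (720 hours).
    print(hids_cost(8, 720))
```

Under those assumptions, 0.02 × 8 × 720 works out to $115.20 for the month. `Decimal` is used so the result doesn't pick up floating-point rounding noise.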
VPC Peers and VPN Tunnels
Connect to any AWS VPC via a peering connection. No maintenance and 100% free. If you have your own VPC, this is convenient. The downside is this only works for AWS, which is where VPN tunnels come in as another option. VPN tunnels connect to any VPN network. Requires a VPN gateway. $99 / VPN connection.
For setup, just contact Aptible support at contact.aptible.com. Once set up, connection details are visible in the dashboard.
Additional Enclave Features
The Enclave CLI now supports JSON output. JSON output provides enhanced scriptability.
The db:create command now supports picking a version.
Restored instances of a MongoDB replica set will no longer attempt to join the existing replica set.
Databases now optimize their configuration according to their container size. This lets you get the most out of your resources, and it’s easier to experiment with new footprints.
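As a rough illustration of what sizing configuration to the container can look like, the sketch below derives one PostgreSQL memory setting from the container's memory. It is not Aptible's actual tuning logic, which this recap doesn't describe: the function name is hypothetical, and the 25% figure for `shared_buffers` is just a commonly cited starting point from PostgreSQL guidance.

```python
def suggested_postgres_settings(container_memory_mb):
    """Derive a memory setting from container size (illustrative only)."""
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    # Assumption: give shared_buffers about a quarter of the container's memory.
    shared_buffers_mb = container_memory_mb // 4
    return {"shared_buffers": f"{shared_buffers_mb}MB"}


if __name__ == "__main__":
    for size_mb in (1024, 4096):
        print(size_mb, suggested_postgres_settings(size_mb))
```

Because the settings are computed from the container size rather than fixed, resizing a database to a new footprint picks up matching values without hand-tuning.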
Register for April 2018 Aptible Product Update Webinar
We’ll host our next product update webinar April 25, 2018 at 11 a.m. PT (2 p.m. ET).
As an Aptible customer, here’s what you need to know:
Enclave is architected to mitigate vulnerabilities like Meltdown and Spectre.
The Aptible Security Team immediately responded to the disclosure to further remediate the issue.
We provided a real-time account of our response efforts on our status page. This blog post will provide additional context on our response. We’ll also share some of the ways our architecture is designed to protect against these sorts of vulnerabilities.
How these vulnerabilities impact cloud infrastructure
Meltdown in particular (more on Spectre later in this post) allows processes to read memory they should normally not have access to. By extension, in a PaaS environment running untrusted customer code, it allows customers to read memory they shouldn’t normally be allowed to read.
The vulnerability isn’t trivial to exploit at scale, but in theory, it allows for:
Escalation: one customer reads data (e.g. credentials) belonging to the PaaS provider, and uses that to compromise the PaaS provider itself, and by extension other customers.
Lateral compromise: one customer reads data belonging to another customer whose apps are deployed on the same underlying instance.
In other words, Meltdown is a critically important vulnerability for any PaaS provider. However, as an Aptible Enclave customer, you’re protected by the intrinsic architecture of Enclave, as well as an active Security Team. Here’s how.
Aptible Enclave is architected to protect against attacks like Meltdown and Spectre
In fact, this exact type of vulnerability, where a customer gets access to memory they shouldn’t normally be able to read, is part of our threat model, and Enclave is architected to protect against it.
Here’s how this plays out in terms of the escalation and lateral compromise attacks explained earlier:
Escalation: instances running customer containers on Enclave are unprivileged by design. All privileged access to e.g. AWS or Aptible APIs is orchestrated through isolated “coordinator” instances, which do not host customer containers.
Lateral compromise: for sensitive data, Enclave requires that customers deploy on dedicated-tenancy stacks, which host a single customer’s containers.
In other words: the container boundary is our first line of defense, but it’s not the only one.
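The two rules above can be expressed as a small policy check. This is a sketch under stated assumptions rather than a description of how Enclave enforces them: the `Host` type, its fields, and the rule wording are hypothetical and exist only to make the isolation model concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical description of one instance; not an Aptible data model."""
    name: str
    role: str                        # "customer" or "coordinator"
    tenancy: str                     # "shared" or "dedicated"
    has_privileged_credentials: bool
    runs_customer_containers: bool
    handles_sensitive_data: bool


def isolation_violations(host):
    """Return the isolation rules from the post that this host would break."""
    problems = []
    # Escalation: customer code must never sit next to privileged credentials.
    if host.runs_customer_containers and host.has_privileged_credentials:
        problems.append("customer containers share a host with privileged credentials")
    if host.role == "coordinator" and host.runs_customer_containers:
        problems.append("coordinator instance hosts customer containers")
    # Lateral compromise: sensitive data belongs on dedicated tenancy only.
    if host.handles_sensitive_data and host.tenancy != "dedicated":
        problems.append("sensitive data on a shared-tenancy stack")
    return problems


if __name__ == "__main__":
    ok = Host("app-1", "customer", "dedicated", False, True, True)
    bad = Host("app-2", "customer", "shared", False, True, True)
    print(isolation_violations(ok))
    print(isolation_violations(bad))
```

If both checks hold, a memory-disclosure bug on a customer-facing host has no privileged credentials to find and, on dedicated tenancy, no other customer's data either.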
Aptible’s Meltdown remediation efforts
As soon as the Meltdown vulnerability was publicized, we acted immediately to deploy patches across our infrastructure to restore the integrity of the container boundary before public exploits were available. These patches needed to be applied to the Linux Kernel, and are known as the “Kernel Page-Table-Isolation” patch set (or “KPTI”).
Here, our remediation was made more difficult by the fact that the Ubuntu Linux distribution, which we rely on for Enclave, was taken by surprise by the unanticipated early release of the vulnerability on January 3rd, and did not have patched Kernels available yet.
As a result, hours after the vulnerability was announced, we started working on a contingency plan, which consisted of building our own patched Kernels targeting Linux 4.14.11. On January 4th, we understood that Ubuntu was unlikely to be able to provide patched Kernels before January 9th (which turned out to be correct), and made the decision to roll out our own Kernels instead. Other providers have since announced that they followed a similar approach.
Once we validated our newly-minted Kernel through Enclave’s suite of integration tests, we published our plans on our status page and contacted customers with scheduled maintenance windows. Over the course of a few days, we replaced thousands of instances with minimal disruption. Ultimately, our patching of Meltdown completed early in the morning of January 9th, before public Meltdown PoCs were available and before Ubuntu had released patched Kernels.
|January 3, 2018||We posted to our status page indicating that the Security Team was monitoring the expected release of information about an upcoming vulnerability.|
|January 4, 2018||Once the details of the vulnerability were released, we published our response plan to our status page, and prioritized response around patching Shared Stacks (which are inherently vulnerable to Meltdown) and otherwise vulnerable Dedicated Stacks. We completed kernel patching for Shared and Dedicated Stacks. We used a bespoke kernel because an official kernel patch was not yet released. We began to contact each customer to coordinate a scheduled maintenance window during which we could restart databases, as needed.|
|January 9, 2018||We completed all patching and database restarts needed for all remediation efforts related to Meltdown.|
Looking ahead and Spectre remediation
As of now Aptible has fully remediated Meltdown for Enclave Stacks.
Going forward, we are continuing to assess the impact of the Spectre vulnerabilities and the development of mitigations in the Linux Kernel to protect against them. Once these mitigations evolve, we’ll likely follow a similar approach (albeit with less urgency) to deploy mitigations for Spectre.
The stakes continue to get higher: the threat environment keeps escalating just as the consequences of a data breach grow. The Aptible Security Team will continue to be aggressive about protecting our customers’ environments from these and all critical vulnerabilities.
- The meltdownattack.com site provides useful information, recommendations and links to security advisories that describe Meltdown for a context broader than this blog post. You may find useful information there related to how to appropriately respond to Meltdown in your own cloud or personal data environments.
- Some additional fixes to the KPTI patch series were included in the subsequent Linux 4.14.12 release. 4.14.12 hadn’t been released yet when we started rolling out our Linux 4.14.11-based Kernel, but we did backport the relevant patches onto our 4.14.11 tree ahead of time.
- It’s worth noting that the reason we were able to move faster than Ubuntu was because we had fewer constraints. Indeed, Ubuntu guarantees a stable Kernel version for a given Ubuntu release, which means they had to backport the KPTI patches onto older Kernels. That’s a lot of work, which they had to complete on short notice. In comparison, we had the flexibility to choose to upgrade to a newer Kernel instead, which we did.
Every quarter, we host a webinar to share everything that’s new with Enclave and Gridiron.
In case you missed it, you can watch a recording of our October webinar below. You can also grab the transcript and the slide deck in our resources section. And, we provide a full recap of the event in this post.
October 2017 Quarterly Product Update Webinar
Achieving ISO 27001 Certification
In September, we earned our ISO 27001 certification, covering both Enclave and Gridiron.
ISO 27001 is a cross-industry, international standard of security. It prescribes security controls for use across an organization, not just technical safeguards. Becoming ISO 27001 certified helps communicate your commitment to security to customers and auditors.
Aptible’s ISO 27001 certification is great news for our customers. You can use our certificate to show that your cloud infrastructure meets international standards of security.
As an aside: we used Gridiron to help us achieve our ISO 27001 certification. Don’t hesitate to let us know if you’d like to discuss attaining your own cert. We built Gridiron to make the process of meeting organization-wide security and compliance requirements straightforward.
Enclave: Easier to Audit (and Easier to Use)
This past quarter we released an array of features to make Enclave easier to audit. Of course, we also launched features that make it easier to use Enclave.
Sneak Preview: Managed HIDS
In the coming weeks, you’ll hear more about Enclave Managed Host-level Intrusion Detection System (Managed HIDS). This is an exciting upgrade to the security of your hosts.
With Managed HIDS, the Aptible Security Team collects, monitors, investigates, and responds to security events–such as sudo logins, file integrity changes, rootkit detection–within your infrastructure. Aptible manages the entire process on your behalf, and notifies you of the results.
Managed HIDS provides an additional level of security for your infrastructure, automatically enabled for all Stacks.
Aptible will also offer a weekly digest of Managed HIDS activity. The Enclave Intrusion Detection Report will be available for an additional subscription. It’ll be prepared automatically, so you can provide customers and auditors evidence that your Stack is monitored for host-level intrusions.
Other Audit-Ready Enclave Features
We added SSH Session Logging so you can capture SSH session activity. This is important: auditors and customers will want to ensure access to your prod data is audited. In particular, this is often a requirement for HITRUST.
Activity Reports enable you to review every operation within your Stack, attributed to individual users. Your auditors will want confirmation that you are monitoring for suspicious activity.
Making Enclave Easier to Use
Part of making Enclave the best place to deploy regulated and sensitive projects is ensuring that it is as easy as possible to use and deploy to.
This quarter, we released the following improvements:
Gridiron: Enhancing your Information Security Management System
Gridiron is the easiest and fastest way to create and manage your information security management system (ISMS).
This quarter, we focused on:
Helping you to achieve certifications (such as ISO 27001, SOC 2) and pass customer audits with new reporting
Managing and auditing internal compliance obligations, including your agreements with customers and vendors
Updating the Gridiron Risk Model
Improved Audit and Certification Prep with Gridiron Reports
We launched a collection of reports designed to meet audit requirements. Gridiron prepares these reports automatically so you can share them with your auditors (and use them for internal audits), shortcutting the audit process.
Training History shows all security and compliance training activity. Asset Inventory contains all details about assets covered in your ISMS. Business Continuity allows you to implement and execute on business continuity plans faster. And, the Audit Log Report shows details about all audit logs captured for each part of your ISMS.
Other Gridiron Enhancements
Customer and Vendor Management - meet audit (such as ISO 27001) requirements by creating an index of all legal and regulatory requirements you’re bound to by agreements with customers and vendors.
ISMS Asset Management - track all information security assets, such as networks, devices, and third-party systems.
Gridiron Risk Model - perform deep risk analysis across all aspects of your internal ISMS
There’s much more about all the changes to Enclave and Gridiron in the webinar recording.
Register for January 2018 Aptible Product Update Webinar
We’ll host our next product update webinar January 25, 2018 at 11 a.m. PT (2 p.m. ET).
All registrants will receive a webinar recap and recording shortly after the conclusion of the webinar.
For more information, review our documentation on Implicit Services.
I am happy to announce that Aptible has earned ISO 27001 certification for our Enclave and Gridiron products! This is the result of a lot of hard work by the Aptible team, and is good news for you if you’re an Aptible customer: You can use Aptible’s ISO 27001 certification to show your customers that your cloud computing stack meets an international standard for security.
What is ISO 27001?
ISO is an organization. In English, the name of the organization is the “International Organization for Standardization,” but usually people just call it ISO, like International Business Machines Corporation is just IBM.
ISO produces “standards:” documents that outline requirements, specifications, and guidelines.
Requirements, specifications, and guidelines for what? Lots of things. There are over 20,000 standards, and they can be very specific.
You can play around and search the ISO site. This can be strangely fascinating: pick a random noun and search for it.
“Avocado?” Boom: ISO 2295 is a guide for the storage and transport of avocados. ISO 3659 has instructions on how to ripen avocados after cold storage. And so on.
ISO standards also cover more abstract concepts. One of the best-known standards is ISO 9001, which sets out criteria for a quality management “system”, or set of principles and business processes.
ISO 27001 is also a “system” standard. It defines requirements for information security management systems. The main body of the standard outlines a governance structure that you have to adopt: requirements for determining what counts as in-scope or out-of-scope for your “system,” assigning security roles and responsibilities, security planning activities, risk management activities, monitoring/metrics, and improving the system itself.
ISO 27001 also has an annex of reference controls relating to areas like cryptography, operations security, asset management, incident management, and more. The reference controls are normative, in the sense that if you don’t implement a given control, you need to be able to convince your auditor that your decision was reasonable, or otherwise explain yourself.
What does ISO 27001 mean for software development teams?
Think of ISO 27001 as a baseline for good security management processes. “We take security seriously” is a cliche. Many developer teams know they would benefit from an organized approach to security, but don’t know where to start. Hiring someone full-time for security is a stretch for small teams, and managing security just gets more complex as you scale.
Teams seeking ISO 27001 certification need to be organized. Like most of the major information security frameworks (SOC 2, HIPAA, PCI, etc.), ISO 27001 requires:
Proactive risk management, instead of just reacting to bad things as they happen
Planning ahead for security and setting appropriate security improvement goals
Writing down the rules for how security is supposed to work for your system (in policies and procedures)
Training your workforce on those rules, with advanced training for those with more security responsibilities
Training for and responding to security and availability incidents, including breaches
Most teams will end up investing in secure software development practices, such as test coverage, continuous integration/continuous deployment, code review, vulnerability scanning, penetration testing. On a practical level, you’ll probably get serious about MFA, require everyone to use a password manager, start using mobile device management to secure laptops and phones, do criminal background screenings, stuff like that.
What does ISO 27001 “certification” mean?
ISO standards are voluntary. Unlike the Department of Health and Human Services with HIPAA enforcement or the PCI Security Standards Council, the ISO organization itself doesn’t have any ability to enforce the standards. In fact, anyone can claim they “comply” or are “consistent” with any of the ISO standards.
The gold standard is a certification performed by an “accredited” certification body, or auditor. Being “accredited” means the auditors have themselves been audited against an ISO standard for how they conduct audits and certifications.
Aptible has been certified by Coalfire ISO, an ISO/IEC 27001 Certification Body accredited by the ANSI-ASQ National Accreditation Board (ANAB).
How does Aptible’s ISO 27001 certification benefit you?
Getting organized about security helps us protect your data. ISO 27001 lays out clear best practices for security management. With developer teams, huge problems can come from seemingly little things like not sanitizing inputs, not patching vulns, accidentally pushing sensitive data to the wrong system. ISO 27001 certification means we’ve spent time thinking systematically about risk, and have strong controls in place to manage it.
In turn, you can use Aptible’s ISO 27001 certification to show your customers that your cloud computing stack meets an international standard for security.
How can you get your own ISO 27001 certification?
The traditional way to prepare is to use consultants or full-time hires. This usually involves a lot of Word documents and Excel spreadsheets, takes a long time, is extremely expensive, and makes you feel slightly let down, like you just spent all that time and money and not much really changed. You may have this nagging feeling that you’re not actually that much more secure, but at least you have antivirus on everyone’s laptops.
I think there’s a better way. At Aptible, we make Gridiron, a set of tools for managing security, designed specifically for software development teams. Let us know if you want to get ready for ISO 27001, HIPAA, SOC 2, PCI, NIST 800-53, 21 CFR Part 11, or any other security framework.