Changelog

Introducing Container Recovery

Thomas Orozco on June 22, 2017

We’re proud to announce that as of this week, Enclave automatically restarts application and database containers when they crash.

Thanks to this new feature, you no longer need to use process supervisors or shell while loops in your containers to ensure that they stay up no matter what: Enclave will take care of that for you.

How does it work?

Container Recovery functions similarly to Memory Management: if one of your containers crashes (or, in the case of Memory Management, exceeds its memory allocation), Enclave automatically restores your container to a pristine state, then restarts it.

You don’t have to do anything: it just works.
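
To make the crash-then-restart cycle concrete, here is a minimal Python sketch of the same idea at the level of a single process. It is an illustration only, not how Enclave implements Container Recovery: the function name `run_with_recovery` and the `max_restarts` safety limit are hypothetical, and launching a fresh process merely stands in for restoring a container to a pristine state.

```python
import subprocess
import sys


def run_with_recovery(command, max_restarts=3):
    """Run `command`, starting it again whenever it exits with a non-zero status.

    Returns the number of restarts that were needed before a clean exit.
    Raises RuntimeError once `max_restarts` (a hypothetical safety limit) is reached.
    """
    restarts = 0
    while True:
        # Every iteration launches a brand-new process; this stands in for
        # "restore the container to a pristine state, then restart it".
        result = subprocess.run(command)
        if result.returncode == 0:
            return restarts  # clean exit, nothing left to recover
        if restarts >= max_restarts:
            raise RuntimeError(
                f"{command!r} still failing after {restarts} restarts"
            )
        restarts += 1


if __name__ == "__main__":
    # A command that exits cleanly on the first try needs no restarts.
    print(run_with_recovery([sys.executable, "-c", "pass"]))
```

In this sketch a zero exit status ends the loop and a restart cap prevents endless retries. Both choices are assumptions made for the example, so refer to the Enclave documentation for the actual behavior.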

Why does this matter?

Enclave provides a number of features to ensure high availability for your apps at the infrastructure level, including:

  • Automatically distributing your app containers across instances located in distinct EC2 availability zones.

  • Implementing health checks to automatically divert traffic away from crashed app containers.

These controls effectively protect you against infrastructure failures, but they can’t help you when your app containers all crash due to a bug affecting your app itself. Here are a few examples of the latter, which we’ve seen affect customer apps deployed on Enclave, and which are now mitigated by Container Recovery:

  • Apps that crash when their database connection is interrupted due to temporary network unavailability, a timeout, or simply downtime (for example, during a database resizing operation).

  • Background processors that crash, for example when all your Sidekiq workers exit with an unrecoverable error such as a segfault caused by a faulty native dependency.

If you’d like to learn more about this feature, please find a full overview of Container Recovery in the Enclave documentation.