The current coronavirus pandemic is a global public-health issue that will affect all businesses, from physical retail outlets to fully cloud-based technology companies. Cities and countries across the world have mandated or recommended that nonessential businesses cease their operations (at least in physical locations) and prepare for long stretches of quarantining. As a result, many organizations that are used to working out of a central headquarters must now quickly adapt to a team that works entirely from home.
Because Aptible is experienced in security planning (and has allowed team members to work from home since day one), we wrote this blog post to help other organizations with their transition. Specifically, this post includes three recommended actions that you should take in the coming weeks in order to best prepare for the changes brought by the coronavirus pandemic:
- Review your cloud services
- Update your risk assessment
- Plan for contingencies
Review your cloud services
As your team starts working from home, it’s going to start relying more and more on the cloud services you use. For most organizations, this includes services like G Suite, Slack, and Zoom, to name the obvious ones.
With this increased reliance, it’s going to become extremely important to ensure that your team members interact with your cloud services in the right way. Any disruptions or security issues with your cloud services will have a much larger impact now than they would have just two months ago. In order to ensure that you’re using cloud services the right way, we have four recommendations: (1) require multi-factor authentication for sensitive systems; (2) switch to single sign-on; (3) make sure your authorization lists are up to date; and (4) perform an access control review to ensure that access is properly provisioned.
Require multi-factor authentication
Our first recommendation is something you’re probably already doing: require multi-factor authentication (MFA) for any systems that store or process sensitive company data. The concept of MFA is simple: instead of authenticating a user with just one piece of evidence (e.g., a password), require at least two (e.g., a password plus a six-digit code generated by an authenticator app).
MFA is important because not all passwords are alike: some are easy to guess because they are short, predictable, or reused in multiple places. And despite policies that require complex passwords for certain systems, not everyone will follow along. In these cases, MFA provides a second layer of protection: even if a password is guessable, a six-digit code that changes every 30 seconds or so is not.
Because team members working from home will now access systems, including those that store or process sensitive data, from new locations, it is important to make sure that only authenticated users can log in. Thus, we highly recommend requiring MFA for all systems that support it. (And while SMS-based verification isn’t ideal, it is better than nothing.)
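For readers curious about where an authenticator app's six-digit value comes from, here is a minimal sketch of the time-based one-time password (TOTP) algorithm described in RFC 6238, using only the Python standard library. The function name `totp` and its parameters are illustrative, and the 30-second step with HMAC-SHA-1 is the RFC's default rather than a statement about any particular app. In practice you would rely on your identity provider or a vetted library instead of hand-rolled code.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Illustrative TOTP (RFC 6238) built on HOTP (RFC 4226)."""
    if at_time is None:
        at_time = time.time()
    # The moving factor is the number of whole time steps since the Unix epoch.
    counter = int(at_time // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Shared secret from the RFC test vectors; a real secret is random and kept private.
print(totp(b"12345678901234567890", at_time=59))  # prints 287082
```

Because the code depends on both the shared secret and the current time, an attacker who guesses the password alone still cannot produce it.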
Switch to single sign-on
Second, we recommend using a single sign-on (SSO) provider. SSO helps ensure that only the right people are accessing certain systems. With SSO, you put one service (the “identity provider”) in charge of authenticating a user before she can access any system configured with SSO. For example, suppose that you configure Slack with SSO. Now, if a user wants to use Slack, she can’t do so by logging in to it directly. Instead, she must authenticate with the identity provider, which in turn gives her access to Slack.
SSO has a number of benefits, including centralized access management and less friction when you enforce strong authentication policies like MFA. The most important one, though, is that it puts many of your services behind a single strong authentication process. For example, if your identity provider requires MFA, all of the services tied to it are behind MFA.
Each system and service you use will likely have different SSO policies. The strongest protections come from those, like Slack, that allow you to require the use of SSO for access to your systems.
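One way to keep track of these differing policies is a small inventory that records, for each service, whether it supports SSO and whether SSO is actually required. The sketch below is only an illustration: the `Service` class, the `sso_gaps` function, and the service names and flags are all hypothetical placeholders, not statements about real products.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Service:
    name: str
    supports_sso: bool
    sso_required: bool  # True when direct password login has been disabled


def sso_gaps(services: List[Service]) -> Tuple[List[str], List[str]]:
    """Return (services where SSO could be enforced but is not, services without SSO)."""
    not_enforced = [s.name for s in services if s.supports_sso and not s.sso_required]
    unsupported = [s.name for s in services if not s.supports_sso]
    return not_enforced, unsupported


# Hypothetical inventory for illustration only.
inventory = [
    Service("chat", supports_sso=True, sso_required=True),
    Service("video-calls", supports_sso=True, sso_required=False),
    Service("legacy-wiki", supports_sso=False, sso_required=False),
]

not_enforced, unsupported = sso_gaps(inventory)
print("Enforce SSO next:", not_enforced)
print("Needs another safeguard (e.g., its own MFA):", unsupported)
```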
Update authorization lists
Third, make sure that you’ve only authorized the right people to access company information systems.
Now that your team members can access systems from home, you’ll have less control over accidental disclosures. To minimize this risk, you should review your lists of users authorized to access systems that store sensitive data, and make sure they’re accurate. Keep in mind the standard considerations when authorizing system access—including the principles of least privilege (only users who need access to a system to perform work are authorized to access the system) and separation of duties (no one person or team has too much access). And if you can utilize role-based authorizations, now is the best time to do so!
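To make these principles concrete, here is a minimal sketch of role-based authorization with a basic separation-of-duties check. Every role, system, and user name below is hypothetical, and the data would normally live in your identity provider or access-management tool rather than in a script.

```python
# Hypothetical mapping of roles to the systems each role needs for its work.
ROLE_SYSTEMS = {
    "engineer": {"source-control", "staging"},
    "sre": {"staging", "production"},
    "finance": {"payroll"},
    "finance-approver": {"payment-approvals"},
}

# Hypothetical role assignments.
USER_ROLES = {
    "alice": {"engineer"},
    "bob": {"sre"},
    "carol": {"finance", "finance-approver"},
}

# Pairs of roles that one person should not hold at the same time.
CONFLICTING_ROLES = [("finance", "finance-approver")]


def is_authorized(user: str, system: str) -> bool:
    """Least privilege: access exists only if one of the user's roles grants it."""
    return any(system in ROLE_SYSTEMS[role] for role in USER_ROLES.get(user, set()))


def separation_of_duties_violations():
    """List (user, role_a, role_b) for every user holding a conflicting pair."""
    return sorted(
        (user, a, b)
        for user, roles in USER_ROLES.items()
        for a, b in CONFLICTING_ROLES
        if a in roles and b in roles
    )


print(is_authorized("alice", "production"))  # prints False
print(separation_of_duties_violations())
```

Granting access through roles rather than to individuals means a review only has to answer two questions: does each role cover just what the job needs, and does each person hold the right roles?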
Review access controls
Finally, now is a great time to perform an access control review (ACR). During an ACR, your job is simple. You’ve already created a list of the users or teams that are authorized to access systems (see above); now you need to make sure that those are the only individuals or teams that actually have access. In other words, does your list of users who are permitted to access AWS look exactly like the list of users who have an AWS account? If not, something needs to change: either the authorization list needs to be updated, or you need to change your access grants.
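At its core, an ACR is a comparison of two sets per system. The following sketch assumes you can export both lists as simple mappings from system name to usernames; the function name `access_control_review` and the sample data are hypothetical.

```python
def access_control_review(authorized, actual):
    """Compare who should have access with who actually does, per system."""
    findings = {}
    for system in sorted(set(authorized) | set(actual)):
        should_have = set(authorized.get(system, ()))
        does_have = set(actual.get(system, ()))
        unapproved = sorted(does_have - should_have)     # has an account, not on the list
        unprovisioned = sorted(should_have - does_have)  # on the list, has no account
        if unapproved or unprovisioned:
            findings[system] = {
                "unapproved": unapproved,
                "unprovisioned": unprovisioned,
            }
    return findings


# Hypothetical exports for illustration only.
authorized = {"aws": {"alice", "bob"}, "payroll": {"carol"}}
actual = {"aws": {"alice", "bob", "mallory"}, "payroll": set()}

for system, result in access_control_review(authorized, actual).items():
    print(system, result)
```

An empty result means the two lists match. Anything under "unapproved" calls for either revoking access or updating the authorization list, and anything under "unprovisioned" suggests the list is stale or a grant is missing.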
Update your risk assessment
In addition to reviewing and updating how your team members interact with cloud services, we also recommend reassessing your risks in light of changes brought about by the coronavirus pandemic. This means identifying any new risks you face, and thinking about how you might respond to them.
What is risk assessment?
Put simply, risk assessment is the process of identifying and ordering the risks that you face. Your risks should include any threatening event that could compromise the security or privacy of your data—everything from “team member falls for phishing attack and improperly discloses sensitive information” to “our hosting provider goes offline and we lose access to our information systems.”
The output of your assessment should be your risks ordered by priority, which gives you a way of triaging and appropriately responding to them: risks with the highest overall value (e.g., “very high” or $20 million, depending on your model) should be mitigated before those lower down on the list.
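As a rough illustration of that ordering step, the sketch below scores each risk as likelihood multiplied by impact on 1-to-5 scales and sorts the results. This scoring model is an assumption chosen for simplicity (your own model might use qualitative labels or dollar values instead), and the scores assigned to the sample risks are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Risk:
    description: str
    likelihood: int  # 1 (rare) to 5 (almost certain); assumed scale
    impact: int      # 1 (minor) to 5 (severe); assumed scale

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def prioritize(risks: List[Risk]) -> List[Risk]:
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Illustrative scores only.
risks = [
    Risk("Productivity drops while the team works from home", likelihood=3, impact=2),
    Risk("Team member falls for a phishing attack", likelihood=4, impact=4),
    Risk("Hosting provider goes offline", likelihood=2, impact=5),
]

for risk in prioritize(risks):
    print(risk.score, risk.description)
```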
Why reassess your risks now?
Organizations should reassess their risks at least annually and whenever there are significant changes to their systems or operating environments. These changes can include cases where you identify new threats or vulnerabilities. Because of the coronavirus pandemic’s impact on how businesses operate, we believe that changes related to working from home warrant an update to organizations’ risk assessments.
What risks should you think about now?
While the particular risks that organizations face in light of the coronavirus pandemic will differ, there are a few that likely apply universally:
- Your reliability team becomes unable to work continuously, resulting in the loss of availability of your services
- A vendor in your supply chain is impacted by the coronavirus pandemic, resulting in the loss of availability of your services
- You experience a loss of productivity due to your team members working from home
Once identified, what should you do?
Once you’ve identified your biggest risks, the next step is to respond to them. Most or all of your risks related to the coronavirus pandemic can be mitigated, at least in part, by virtue of planning for them. (More on that below.) Additionally, if you can implement other safeguards to mitigate them (such as implementing MFA or SSO), we recommend creating a plan to do so. Proactively implementing these safeguards now can help you avoid larger fallout in the future.
To read more about how we addressed some of our own coronavirus-pandemic-related risks, check out our recent blog post, Aptible’s Response to COVID-19.
Plan for contingencies
The last operational task we recommend is contingency planning. Put simply, contingency planning is the process of preparing to respond to—and recover from—events that cause service outages. These include, for example, outages resulting from your hosting provider going offline, as well as the inability to run your business due to team members being out sick. The ultimate goal of contingency planning is—as the name suggests—to have plans in place to ensure the continuity of your operations.
We recommend two broad approaches to contingency planning: (1) complete tabletop exercises by walking through the steps you’ll take in response to identified disruptions, and (2) test your actual ability to respond to disruptions. More information about contingency planning and preparatory exercises can be found in NIST’s Contingency Planning Guide for Federal Information Systems (SP 800-34) and Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities (SP 800-84), respectively.
Complete tabletop exercises and document plans
A tabletop exercise is a discussion-based meeting where your reliability team talks through its plan for responding to a particular disruption. For example, the team might discuss what to do if the AWS region you rely on goes offline for a day, or if your Lead Service Reliability Engineer gets sick and spends two weeks offline recovering.
The output of these exercises should be a written “playbook” that your team can follow on occurrence of the event—including the specific steps that need to be taken, by whom, and using which tools. These playbooks are living documents that can (and should) be updated over time.
To see an example of what a Business Continuity Playbook could look like, check out Aptible’s template here.
Test your ability to continue operations
Additionally, you should test your ability to respond to disruptions. It’s one thing to write in a playbook, “restore a backup of our production database.” But it’s another to make sure members of the team actually know how to do this. Thus, we also recommend completing functional exercises to ensure that your playbooks are realistic and can be followed by anyone on your reliability team—for example, actually practice restoring a database backup, and make sure your team knows how to do it and how long it will take.
Completing these exercises will not only help you identify gaps in your current response practices but also give you a realistic preview of the time and resources that will actually be required if a system is disrupted.
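If you want to capture how long a functional exercise really takes, a small timing harness can help. The sketch below is hypothetical throughout: `run_drill`, the step names, and the target are placeholders, and the `time.sleep` calls stand in for the real actions your playbook describes (for example, restoring a backup into a scratch database).

```python
import time


def run_drill(steps, target_seconds):
    """Run each (name, action) step in order, time it, and compare the total to a target."""
    timings = []
    for name, action in steps:
        started = time.monotonic()
        action()
        timings.append((name, time.monotonic() - started))
    total = sum(elapsed for _, elapsed in timings)
    return {"timings": timings, "total": total, "within_target": total <= target_seconds}


# Stand-in steps; replace the lambdas with the real actions from your playbook.
steps = [
    ("locate latest backup", lambda: time.sleep(0.01)),
    ("restore to scratch database", lambda: time.sleep(0.02)),
]

report = run_drill(steps, target_seconds=5)
for name, elapsed in report["timings"]:
    print(f"{name}: {elapsed:.2f}s")
print("Within target:", report["within_target"])
```

Recording the measured times back into the playbook keeps its estimates honest for the next person who has to follow it.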
The coronavirus pandemic is going to change how we all operate for at least the next few months. And for many organizations, these changes will impact their information security. But by following a few best practices related to cloud services, risks, and contingency planning, we think that most organizations will be able to operate at as close to 100% as possible.
If you have any questions about how the coronavirus pandemic might affect you, or about how Aptible can help you prepare, please don’t hesitate to reach out.