
Developer Virtual Machines as Bastion Hosts

ACM.76 Why you might want to move development to cloud VMs

This is a continuation of my series on Automating Cybersecurity Metrics.

Up to this point in this series, I’ve been logging into a VM on AWS to deploy CloudFormation scripts. I’ve been using a host I previously deployed in an automated fashion for another project, but now I want to deploy a host for deployments into the developer network we created. I want to use that VM to test access to CloudFormation through an endpoint, as I explained in a prior post.

Automated deployment of virtual machines (EC2 instances)

If you were starting from scratch in an AWS account and you wanted to deploy CloudFormation scripts as I have been doing, you might be manually deploying a host or running scripts from your local laptop (something I never do anymore).

It would be better if we had a bit more automation and governance around how these hosts get created. We want to ensure the hosts use zero-trust networking and run with security best practices, from the authentication method to the configuration of the operating system and logging. You can follow the virtual machine (EC2) and operating system security best practices provided by the OS vendor, the cloud vendor, and the CIS benchmark guidance specific to Amazon Linux.

CIS Amazon Linux Benchmarks

This is where an automated mechanism to create VMs that developers can use will be helpful. Automate the creation of approved VM configurations. If developers have an automated, defined process for deploying hosts on AWS, you can tell who created which hosts and who is logged into them. You can also ensure that all the logs from hosts created in your account flow to a central repository.
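As a minimal sketch of what that automated, defined process might look like, the example below deploys a hypothetical approved template and tags the stack with the developer who owns it. The template name, stack name, parameter, and tag keys are placeholders, not the actual templates used in this series.

```bash
#!/bin/bash
# Hypothetical sketch: deploy an approved developer VM template and tag it
# with the developer who owns it so you can later tell who created which host.
DEVELOPER="jane.doe"

aws cloudformation deploy \
  --stack-name "developer-vm-${DEVELOPER}" \
  --template-file developer-vm.yaml \
  --parameter-overrides DeveloperName="${DEVELOPER}" \
  --tags Owner="${DEVELOPER}" Purpose="developer-bastion"
```

CloudFormation propagates stack-level tags to resources that support tagging, so the owner tag ends up on the EC2 instance itself.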

There are many aspects to creating a secure VM for developers to use and I won’t be covering all of them in this post. Let’s consider some of the benefits of performing application development on a cloud VM.

Bastion host (or jump host)

I’ll share an analogy I used in a presentation at Capital One, where I was tasked with deploying bastion hosts for 11,000 developers to log in and access cloud resources. Not scary at all, right? And I’m not talking about the presentation.

When you visit a highly restricted area at the Pentagon or some other high-security building, you don’t just walk through the front door up to the super secret room and swipe a badge. Before you even get to that point, you might have to check in at a gate when you drive in to park. Then you probably have to check in at a front desk. You may need to use a badge in the elevator to get to a sensitive floor, or have an escort. Only then do you get to the super secret room, where you have to swipe your badge again.

There are obviously reasons for all these security controls. If you are trying to break in, the security staff monitoring for break-ins has multiple points at which they may catch you. The same is true when you use a bastion host on your network.

By the way, do you know how those security controls I just mentioned are usually bypassed? Social engineering. By tricking people. The more you can automate the process of granting access, the less chance that you will have a data breach resulting from human error. Of course, you need to ensure that your automation is well-designed to prevent an attacker from abusing that as well.

When people say identity is all you need to secure your cloud resources, that is essentially the same as letting someone enter the building unrestricted and march all the way up to your super secret room before you attempt to catch them breaking in. By the time that happens and you get there to respond, it might be too late. Your sensitive documents, jewels, or whatever you were protecting in that room got swiped and the thief is long gone.

Using a bastion host to access hosts running applications or housing sensitive data on AWS is a security best practice. You should not be logging in directly to application servers, web servers, or database servers hosted on AWS. Exposing those resources directly to the Internet is one of the key weaknesses attackers exploit. They try to brute force credentials, use stolen credentials, or leverage vulnerabilities to break into those hosts and steal data.

Create a bastion host so people must log into that host before they can reach other resources. Restrict access to the bastion host to your VPN network or specific IP addresses so that remote logins come only from authorized networks and attackers cannot leverage vulnerabilities or stolen credentials directly from the Internet.
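As a minimal sketch of that restriction, assuming a placeholder VPC ID and VPN CIDR (in this series the equivalent rules live in CloudFormation templates), a security group like this accepts SSH only from the VPN network and has no rule allowing 0.0.0.0/0:

```bash
#!/bin/bash
# Hypothetical sketch: allow SSH to the bastion host only from the VPN CIDR.
VPC_ID="vpc-0123456789abcdef0"   # placeholder
VPN_CIDR="10.10.0.0/16"          # placeholder VPN network

SG_ID=$(aws ec2 create-security-group \
  --group-name bastion-ssh \
  --description "SSH to bastion from VPN only" \
  --vpc-id "$VPC_ID" \
  --query 'GroupId' --output text)

# Inbound SSH (port 22) only from the VPN network; no Internet-wide rule exists.
aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" \
  --protocol tcp --port 22 \
  --cidr "$VPN_CIDR"
```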

Benefits and downsides of fixed bastion hosts

At one company I worked at, I was supposed to deploy two load-balanced EC2 instances to use as bastion hosts. (The only problem was that AWS did not have load balancers that supported SSH and RDP at the time. That was another one of our feature requests that got implemented later.) All developers were supposed to log into those hosts to reach other hosts in high-security environments on the network.

The benefit of the bastion host was that we could create a network with very limited Internet access. Everything else resided in private networks. The bastion host was behind a VPN and only accessible from private IP addresses. None of the resources in AWS were exposed to direct inbound traffic from the Internet.

What’s the downside of an approach that uses fixed bastion hosts? You will need to ensure the credentials exist on those hosts for everyone who needs to log into them. Let’s say you use SSH to access Linux instances. You’ll need to get the public keys for every user onto the bastion host. How will you automate that? It’s possible, but it’s not simple.

What happens when people come and go from your organization? Don’t forget to remove their keys from the bastion host. Forgetting is easy, and every time you make a change there’s a chance for error.

Deploying additional SSH keys has to be handled outside the AWS process that lets you pass an SSH key into a host via CloudFormation. When you want to know what SSH keys are deployed on a host, you potentially won’t be able to find that in the AWS console or the AWS logs. You’ll need to think through that logging to make sure you are not susceptible to, or affected by, an attack I will describe in the next post.
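To see the gap, consider that AWS only records the key pair name passed at launch. A query like the hedged sketch below (the tag is the placeholder from the earlier example) returns that single KeyName; any public keys someone later appends to ~/.ssh/authorized_keys on the host never appear in the console or in CloudTrail.

```bash
# Hypothetical sketch: list bastion hosts and the one key pair AWS knows about.
aws ec2 describe-instances \
  --filters "Name=tag:Purpose,Values=developer-bastion" \
  --query 'Reservations[].Instances[].{Id:InstanceId,Key:KeyName}' \
  --output table
```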

You can also use an Active Directory domain join to allow users to log into the host. That gets around some of the SSH key issues, but now you’ve got numerous people using the same host. How will you segregate their workspaces? What files are they allowed to pull down and execute from that host? Anything? Will they have to log into the bastion host and then into their own workspace? That’s probably more secure, but it’s also a hassle for developers. They hate it.

If you need to inspect the logs to see what actions a developer took, you’ll need to sort those actions out from all the other logs on the shared system.

Also, your bastion hosts become a bottleneck and a single point of failure. You have to keep them running all the time and pay for a load balancer to support them. If too many people access them at once, they might run out of connections. If malware infects a bastion host, you’ve got to block everyone’s access while you fix it. One of the biggest downsides is that you have a port open 24x7, waiting for someone to connect, even when the bastion host is not in use.

Separate VMs for each developer for AWS access

In a development environment you can use a developer VM as a type of bastion host for accessing AWS services. Each developer can have their own VM with networking and credentials specific to that developer, which is helpful if you have people working in remote locations with different IP addresses, though you can also use a VPN as your single point of entry, as explained in a prior post.

You might opt for limited bastion hosts in a production environment, where a bastion host is only used in very rare circumstances. Perhaps each person could instead have a VM they log into to access production resources. Perhaps you have a two-step process to grant just-in-time access to a production operations support professional. If you have a fully automated deployment, you could deploy a host for the on-call user and stop it. Then you’re not paying for it unless you need it. Start the instance when it is required.
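A minimal sketch of that just-in-time pattern, assuming a placeholder instance ID, looks like this; the approval step that decides whether to run it is up to you:

```bash
#!/bin/bash
# Hypothetical sketch: keep the on-call VM stopped (no compute charges, no
# listening port) and start it only when access has been approved.
INSTANCE_ID="i-0123456789abcdef0"   # placeholder

# Grant access: start the instance and wait until it is running.
aws ec2 start-instances --instance-ids "$INSTANCE_ID"
aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"

# ... on-call work happens here ...

# Revoke access: stop the instance when the work is done.
aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
```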

In very rare circumstances I have not been able to restart a VM that I was working with and had previously stopped. It was a larger and perhaps less widely used VM type. If your workload is that critical, you might start the VM and leave it running. But when a host is running with ports available to which someone can connect, that is also a risk, as already mentioned. If the host is only open when someone is actively using it, you’ll have a better chance of noticing when an unauthorized user tries to use it at the same time.

You can lock down your developer VMs so they can only access the development environment and never have access to production. You should be able to deny Internet access if you think through how developers get packages and code onto their workstations in the cloud. You could limit developers so that they can deploy software onto other machines, but not onto the developer machine itself, depending on how stringent you want to be about preventing malware on developer workstations.
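One way to enforce that lockdown, sketched below with placeholder IDs, is to strip the default allow-all egress rule from the developer VM’s security group and permit outbound traffic only to destinations you have approved, such as an internal package repository:

```bash
#!/bin/bash
# Hypothetical sketch: remove the default allow-all egress rule and allow
# outbound HTTPS only to an internal repository network.
SG_ID="sg-0123456789abcdef0"   # placeholder
REPO_CIDR="10.20.30.0/24"      # placeholder internal package mirror

# Security groups are created with an allow-all egress rule; revoke it.
aws ec2 revoke-security-group-egress \
  --group-id "$SG_ID" \
  --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

# Allow outbound HTTPS only to the approved network.
aws ec2 authorize-security-group-egress \
  --group-id "$SG_ID" \
  --ip-permissions "[{\"IpProtocol\":\"tcp\",\"FromPort\":443,\"ToPort\":443,\"IpRanges\":[{\"CidrIp\":\"$REPO_CIDR\"}]}]"
```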

If you automate the creation and backup of developer machines, they can be easily restored if something happens to one of them. You can also potentially leverage these backups in an incident response scenario. You know exactly who owns and was using the machine, and the logs related to that machine will be more limited, making it easier to identify problems and attack paths.
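A minimal sketch of the backup piece, with a placeholder instance ID, is to image each developer VM on a schedule (AWS Backup or Data Lifecycle Manager can automate the schedule itself):

```bash
#!/bin/bash
# Hypothetical sketch: take an image of a developer VM so it can be restored
# quickly, or preserved as evidence during incident response.
INSTANCE_ID="i-0123456789abcdef0"   # placeholder
TODAY=$(date +%Y-%m-%d)

aws ec2 create-image \
  --instance-id "$INSTANCE_ID" \
  --name "developer-vm-backup-${TODAY}" \
  --no-reboot
```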

A bastion host also simplifies the network design because we don’t have to allow the local network to reach AWS services that should only be used from within AWS. For example, if you have a proxy host protecting your organization’s network, you can restrict access to the AWS fully qualified domain names (FQDNs) and IP addresses used for AWS deployments and configuration.

That exact problem caused developers grief at Capital One when they first started using AWS. They ran the AWS CLI on their local laptops, and when they tried to deploy things on AWS, nothing worked. All the AWS domain names were blocked by the corporate proxy. If you want to continue blocking those domains on the corporate network, you can give developers access to VMs within AWS without changing the proxy rules.

AWS (and just about every other cloud service) has been abused by attackers in data breaches, so the more we can limit access to these services outside of where they are actively being used, the better. If we only access AWS services from defined networks within AWS, and are careful to create zero-trust networking and IAM access, there is a better chance we won’t allow an attacker to co-mingle traffic to C2 channels on AWS with valid AWS traffic. If you’re not familiar with command and control (C2) hosts, I explain them in my book linked at the bottom of the post.

The complication is all the websites that use AWS, and the fact that websites fronted by CloudFront show up with a generic domain name not specific to the website or application. That complicates network security and the identification of rogue traffic, as I explain in this post. I wish CDN providers could find a way to fix this. However, it is not a problem related to AWS deployments or development; CloudFront is not used for deployments, it serves hosted applications through distributed points of presence for accessing content. It just means you likely won’t be blocking all of AWS on your corporate network if people use it to browse the web.

CDN Security Wishlist

But you can still restrict access from outside of AWS to only the AWS services your organization requires. Services used to deploy resources on AWS (like CloudFormation and SSM) can be limited to AWS networks.
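One way to do that, sketched below with placeholder IDs and an assumed us-east-1 region, is to create interface VPC endpoints for the deployment services so the developer VM reaches CloudFormation and SSM without ever leaving the AWS network:

```bash
#!/bin/bash
# Hypothetical sketch: keep CloudFormation and SSM traffic inside the VPC
# via interface endpoints so those services never traverse the Internet.
VPC_ID="vpc-0123456789abcdef0"        # placeholder
SUBNET_ID="subnet-0123456789abcdef0"  # placeholder private subnet
SG_ID="sg-0123456789abcdef0"          # placeholder endpoint security group

for SERVICE in cloudformation ssm; do
  aws ec2 create-vpc-endpoint \
    --vpc-id "$VPC_ID" \
    --vpc-endpoint-type Interface \
    --service-name "com.amazonaws.us-east-1.${SERVICE}" \
    --subnet-ids "$SUBNET_ID" \
    --security-group-ids "$SG_ID"
done
```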

The one downside of this approach is cost. I initially looked into using AWS Workspaces for this purpose, but the cost is a lot higher than using an EC2 instance, so I opt for the latter. I use different instance types depending on what I am doing to try to optimize spending. However, this series is about moving some of those workloads to AWS Batch, where I may be able to save money with spot instances. I wrote about that in this post:

How Batch Jobs Can Help Cybersecurity

Why would you ever need to log into a VM running an application in the cloud? You shouldn’t.

For the architecture in this series, we will not be running applications directly on the developer VM or logging into VMs that host applications; we will be using serverless technologies. We don’t have to worry about developers, QA, or operations logging directly into hosts that run applications at this point.

If you are running applications on EC2 instances and someone has to log into an instance on deployment night to get the application up and running, you have a security risk. Either the deployment is not designed correctly or the software you are deploying doesn’t fully support automation. In the latter case, consider using a different vendor. In the beginning, when organizations started using the cloud, vendors did not understand the concept of fully automated deployments. (That included a security vendor I worked for; I had to explain it to them by showing them how to deploy their product automatically and at what points it did not work.) We should be past that point now.

Application deployments should never require manual “tweaking” to make them work or manually logging in to configure something. If that is happening in your environment, the deployments were not designed and tested correctly or your architecture was not designed with deployments in mind. Fixing that is one of the most important steps towards preventing a data breach, along with making sure attackers cannot get into your automation and leverage that to make unauthorized changes in your environment.

Logs should be shipped off the host to an alternate location for inspection. I once faced an issue at Capital One where, although I had completely automated the deployment, it wouldn’t work. I had to log into a production resource to figure out why. The problem in that case was that we hadn’t implemented a standard mechanism for shipping logs so we would be able to inspect the logs without logging into the host. Consider how you will troubleshoot issues like that and plan accordingly. I’ll try to cover some logging topics in later posts.
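As a hedged sketch of shipping logs off the host on Amazon Linux, the CloudWatch agent can forward a log file such as /var/log/secure to a central log group; the file path, log group name, and config location below are placeholders:

```bash
#!/bin/bash
# Hypothetical sketch: forward the SSH auth log off the host with the
# CloudWatch agent so it can be inspected without logging into the instance.
cat > /opt/aws/amazon-cloudwatch-agent/etc/config.json <<'EOF'
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/secure",
            "log_group_name": "developer-vm-auth-logs",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
EOF

# Load the configuration and start the agent.
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/config.json -s
```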

By the way, I covered other considerations for a developer network here. Which rules you require and whether you need a NAT or Internet Gateway depends on where you host your code repository and other tools that can help manage Internet packages.

Network Design: Developer Network

Stay tuned and follow for updates.

Teri Radichel

If you liked this story please clap and follow:

Medium: Teri Radichel or Email List: Teri Radichel
Twitter: @teriradichel or @2ndSightLab
Request services via LinkedIn: Teri Radichel or IANS Research

© 2nd Sight Lab 2022

All the posts in this series:

Automating Cybersecurity Metrics (ACM)

____________________________________________

Author:

Cybersecurity for Executives in the Age of Cloud on Amazon

Need Cloud Security Training? 2nd Sight Lab Cloud Security Training

Is your cloud secure? Hire 2nd Sight Lab for a penetration test or security assessment.

Have a Cybersecurity or Cloud Security Question? Ask Teri Radichel by scheduling a call with IANS Research.

Cybersecurity & Cloud Security Resources by Teri Radichel: Cybersecurity and Cloud security classes, articles, white papers, presentations, and podcasts
