ACM.67 Creating Zero Trust rulesets or security groups on AWS

This is a continuation of my series of posts on Automating Cybersecurity Metrics.

Back when I worked on the network team at Capital One, developers had to submit requests outlining the network requirements for their applications. The security team reviewed those requests, and then they were implemented. I happened to be one of the people implementing those requests in the middle of the night in production.

What often happened was the development team would come in the next day and their application didn’t work as expected. Sometimes I had made a mistake but more often than not, the network rules were not sufficient to cover all of the application’s needs because the developers didn’t fully understand the connection requirements for their application.

This whole networking process was very frustrating for developers, especially if they didn’t understand how networking works. In a lot of cases they just wanted to build and run their applications and didn’t want to be network administrators. Some of us like the geekiness of designing zero trust networks. Other people can’t be bothered because they want to build cool applications that do cutting edge things for customers.

Preventing Errors in Cloud Networking Implementations

One of the problems in this environment was that the dev, QA, and prod networking environments did not work the same way. I would highly suggest working towards a consistent architecture, though you will have some differences most likely due to QA and security tools that you’ll probably run in QA but not in production. If you can’t exactly mirror the environments then try to create a staging environment to test deployments that mirrors production.

Beyond that, for production environments, I started asking the development and QA team to show up at the time I deployed their network so they could test it. That way any mistakes I made would be discovered and fixed within the open window I had to make the change. Additionally, if the team made a mistake in their network request they would know right away and not start sending messages to upper management about how the network team of five people was blocking the deployments of 11,000 developers when it wasn’t our fault.

As I’ve been doing throughout this series, leverage abstraction to simplify your network design and your code. Don’t write the same code over and over. Reuse what you can for repeated projects. Create acceptable network designs. Consider when you can consolidate into a single VPC or subnet and when things should be segregated for proper trust boundaries. Also, there are newer constructs that didn’t exist at the time, like Transit Gateways and shared VPCs, which may help simplify network designs.

Determining your application network requirements with netstat

Besides helping developers with networking requests during daily office hours, I also wrote a tool that developers could run on their hosts to see what traffic their application was sending and to help them understand the IP addresses and ports in use by their application. At the time Lambda functions did not exist. Flow Logs didn’t either when I first started on that team, and they weren’t implemented on our VPCs initially when they were introduced. For lack of something better, I essentially wrote a script that ran a netstat command (I think; it may have been tcpdump) and parsed out all the IP addresses and ports with open connections.

Essentially you can run something like this on a host and see what is connected or what ports are open and listening:

netstat -an | grep -E 'tcp|udp'

I added a bit more to it to formulate a list that could be submitted in a network request to our team. I am trying to remember if I used that or tcpdump. It’s on some blog post at Capital One if it’s still published. Netstat gives you a point-in-time snapshot of connections, while tcpdump captures traffic for a period of time and might catch something netstat misses, since it runs and inspects traffic until you stop it. If you have a connection that periodically comes and goes, you’d need to run netstat multiple times or use something that captures traffic over time.
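The kind of parsing described above can be sketched as follows. This is not the original tool, only a minimal illustration. It assumes the Linux `netstat -an` column layout (macOS and Windows format addresses differently), and the function name `summarize_connections` is hypothetical.

```shell
#!/usr/bin/env bash
# Summarize established TCP connections from `netstat -an` output.
# Assumes the Linux column layout:
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State
# Reads netstat output on stdin and prints one unique
# "remote_ip remote_port" pair per line.
summarize_connections() {
  awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED" {
         n = split($5, parts, ":")   # IPv6 addresses contain extra colons
         port = parts[n]             # the port is always the last piece
         ip = substr($5, 1, length($5) - length(port) - 1)
         print ip, port
       }' | sort -u
}

# Hypothetical usage on a host:
#   netstat -an | summarize_connections
```

Because this is still a snapshot, running it repeatedly and merging the results gives a better picture than a single run.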

This list may or may not be complete. If you are using load balancers, for example, the system you run the script on might be connected to only one of them while another load balancer exists that the host will try to connect to later. You still need to understand your architecture and not rely on this script alone, but the script helped. In addition, when developers ran this script, sometimes they found connections they did not expect and services they needed to turn off that were generating unwanted traffic.

Leveraging Flow Logs to determine application networking requirements

It’s not quite as simple to run netstat on a Lambda function, though you certainly could. Now that we have VPC Flow Logs, we can query the logs to see what connections an application is making and use that to whittle down networking rules to a zero trust approach (presuming everything is logged, which it should be).

I explained how you can ensure every VPC has Flow Logs enabled here:

[LINK HERE]

Once you have VPC Flow Logs enabled you can query your logs to see what your system components are connecting to, presuming it gets logged. I show how to look at VPC Flow Logs in this Lambda Networking post:

Lambda Networking

We’ll look more at Lambda Flow Logs in the next post, but essentially you can figure out what network interface is associated with your Lambda Function and look at the traffic related to it.
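As a rough illustration of looking at the traffic for one network interface, here is a minimal sketch that filters flow log records by ENI. It assumes records in the default (version 2) flow log format, already exported as space-separated text, where field 3 is the interface ID, fields 4 and 5 are source and destination addresses, field 7 is the destination port, field 8 is the protocol number, field 13 is the action, and field 14 is the log status. The function name `flows_for_eni` is hypothetical.

```shell
#!/usr/bin/env bash
# List the unique flows seen on one network interface.
# Input: VPC Flow Log records in the default format on stdin.
# Output: "srcaddr dstaddr dstport protocol action", one unique flow per line.
flows_for_eni() {
  local eni="$1"
  awk -v eni="$eni" '$3 == eni && $14 == "OK" {
         print $4, $5, $7, $8, $13
       }' | sort -u
}

# Hypothetical usage against a file of exported records:
#   flows_for_eni eni-0abc < flowlogs.txt
```

Flow logs record both directions of a conversation, so return traffic shows up with ephemeral destination ports. Those lines are not rules you need to write.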

Once you know what rules you need to implement you can create your security group rules.

Security group rules are stateful

One of the mistakes I made early on was thinking that I needed to allow traffic both ways in a security group the same way you do for NACLs. I didn’t realize that security groups are stateful, while NACLs are stateless. That means that the networking components deciding whether traffic should be allowed track connections: if a connection has been established in one direction, the return traffic is allowed in the other direction. In other words, if you allow an IP address from the Internet to connect to a web server on port 443 with an inbound rule, the return traffic is allowed, regardless of what outbound rules exist in the security group.

Outbound security group rules

By default, a new AWS security group allows any outbound traffic. In my book and in earlier blog posts I’ve written about my experiences trying to get a data center, where I had a managed server, to block outbound traffic for me, and about why that matters.

How network traffic got me into cybersecurity

Do not leave your outbound traffic rules completely open. They too should be specific, to reduce the blast radius if a host or compute resource is compromised. Leaving these outbound rules open greatly increases your exposure to data exfiltration and lateral movement within a subnet.

Remember that NACLs work between networks, meaning at the subnet level. They will not block traffic between hosts within a subnet. That’s what security groups will do for you.

Unfortunately, if you do not deploy any outbound security group rules using CloudFormation, you will get a default rule that allows outbound access to anywhere. You need to use a workaround to override it. I’ve seen some suggestions that are not good, such as a rule that allows anything in the same security group to send traffic to each other. We need to prevent all outbound traffic, and that approach doesn’t work unless you only ever have one thing in your security group and no one ever accidentally or maliciously assigns that group to something else.
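One commonly cited workaround, shown here only as a sketch and not necessarily the one this series ends up using, is to declare a single explicit egress rule that points at the loopback address. Because an egress rule is specified, CloudFormation does not add the allow-all default, and traffic to 127.0.0.1/32 never leaves the host. The resource name and the `VpcId` parameter are placeholders.

```yaml
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id

Resources:
  NoEgressSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group with no usable outbound access
      VpcId: !Ref VpcId
      # An explicit egress rule replaces the default allow-all rule.
      # The loopback destination means nothing can actually be reached.
      SecurityGroupEgress:
        - CidrIp: 127.0.0.1/32
          IpProtocol: "-1"
```

After deploying something like this, check the outbound rules in the console to confirm the any/any rule is gone.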

Endpoints or hosts assigned to the same security group cannot automatically communicate with each other

A security group is not a group of hosts or endpoints or cloud resources. It is a group of network rules. When you apply that group of rules to the host, it can only communicate on the ports and protocols you open to the hosts and networks you specify in those rules. Unless you explicitly create a rule that says the security group can communicate with other hosts in the same security group, that traffic is blocked.

Creating rules between security groups

One of my favorite things about security groups is that you can create rules using security group IDs. This is convenient because the IP addresses assigned to hosts often change in the cloud. In addition, when subnet and VPC CIDRs change, you don’t have to change your security groups. You might want to allow a security group to communicate with other resources in the same security group, and you can use the security group ID for that.
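A rule that references a security group ID rather than a CIDR might look like the following sketch. The parameter names, the resource name, and the PostgreSQL port are illustrative assumptions. The properties themselves (`GroupId`, `SourceSecurityGroupId`, and so on) belong to the `AWS::EC2::SecurityGroupIngress` resource type.

```yaml
Parameters:
  AppSecurityGroupId:
    Type: AWS::EC2::SecurityGroup::Id
  DatabaseSecurityGroupId:
    Type: AWS::EC2::SecurityGroup::Id

Resources:
  AppToDatabaseIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      # The group receiving the rule
      GroupId: !Ref DatabaseSecurityGroupId
      Description: Allow the application tier to reach the database
      IpProtocol: tcp
      FromPort: 5432
      ToPort: 5432
      # Traffic is allowed from anything assigned this group, whatever its IP
      SourceSecurityGroupId: !Ref AppSecurityGroupId
```

To allow members of one group to talk to each other, set `GroupId` and `SourceSecurityGroupId` to the same security group ID.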

Other Rules and Restrictions

The rules in the cloud are always changing and there are a few other rules you’ll want to be aware of when creating security groups. Check out the latest documentation for more such as naming conventions and limits on the number of rules in a group or a set of groups applied to a resource:

Security group rules

Security Groups and CloudFormation

Let’s create our security groups. We’ll use CloudFormation:

AWS::EC2::SecurityGroup

GroupName: We can use this instead of tags for security group names.

GroupDescription: We can provide a useful description for the group so people know its purpose.

VpcId: We must associate a security group with a VPC, but we may end up using the same group configuration in different VPCs so we’ll want to pass this value in as a parameter.

Next we have SecurityGroupIngress and SecurityGroupEgress. The documentation says this is an embedded type in a security group, but we can still separate out our rules using these types the same way we did for NACL rules as shown in this template:

We don’t want an inflexible, hardcoded solution like the one above. Separating the rules from the group gives us more flexibility when creating rulesets. Check out the solution below.

Security Group Template

First I’m going to create a generic, reusable security group template with no rules associated with it. This offers more flexibility and requires less code overall to implement. Here’s the standalone security group template:
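The original listing is not reproduced here. A minimal sketch of a rule-free, reusable security group template, assuming three parameters named after the properties discussed above, could look like this. The parameter, resource, and output names are assumptions.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Reusable security group with no rules defined in this template

Parameters:
  GroupName:
    Type: String
  GroupDescription:
    Type: String
  VpcId:
    Type: AWS::EC2::VPC::Id

Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Ref GroupName
      GroupDescription: !Ref GroupDescription
      VpcId: !Ref VpcId

Outputs:
  SecurityGroupId:
    # Rules templates can attach ingress and egress rules to this ID
    Value: !GetAtt SecurityGroup.GroupId
```

Because no egress rules are declared here, a group created from this sketch still receives the default outbound rule until a rules template replaces it.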

I’ll add a common function for security group deployments like this:
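The original function is not reproduced here. As a rough sketch of what such a helper might look like, the following wraps `aws cloudformation deploy`. The function name, stack naming scheme, template file name, and parameter names are all assumptions. `AWS_CLI` is a hypothetical override that lets you preview the command with `echo` instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: deploy one security group stack from a shared template.
# Arguments: group name, VPC ID, description, optional template file.
# Set AWS_CLI=echo to print the command instead of running it.
deploy_security_group() {
  local name="$1" vpc_id="$2" description="$3"
  local template="${4:-SecurityGroup.yaml}"   # assumed template file name

  # Each Key=Value override is quoted as a single argument so that
  # values containing spaces are not split by the shell.
  "${AWS_CLI:-aws}" cloudformation deploy \
    --stack-name "SecurityGroup-${name}" \
    --template-file "$template" \
    --parameter-overrides \
      "GroupName=${name}" \
      "VpcId=${vpc_id}" \
      "GroupDescription=${description}"
}

# Hypothetical call for a first group:
#   deploy_security_group SSH "$vpc_id" "Remote SSH access"
```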

And call it like this for the first security group:

Note that I still haven’t resolved the problem of spaces in parameters passed around in bash. I could change my bash parameters to switches but really I don’t want to use bash at all for this so I’m leaving it this way for now until I get everything working.

Then I checked the security group to see what rules got created.

No inbound:

Outbound rules — notice the default any/any rule has been added:

We’ll fix the rules in a minute. For now we can add the RDP group.

And two batch job security groups. I don’t know if I really need two yet.

Rules Templates

First we need to create our templates which define the ingress and egress rules. For SSH and RDP we are going to pass in one CIDR here but you can modify as needed for your purposes. If you use a VPN, this is where your network design may be easier. You allow all your developers to connect to the VPN and, once authenticated at the network layer, they can reach an individual host and log into it. If you use this design you can restrict SSH and RDP traffic to your VPN CIDRs rather than opening it to the IP addresses of potentially hundreds or thousands of developers working remotely, or to the entire Internet.

SSH
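The original SSH template is not reproduced here. A minimal sketch of a rules template that attaches one SSH ingress rule to an existing group, taking a single CIDR as described above, might look like this. The parameter and resource names are assumptions.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: SSH ingress rule for an existing security group

Parameters:
  SecurityGroupId:
    Type: AWS::EC2::SecurityGroup::Id
  AllowCidr:
    Type: String
    Description: IPv4 CIDR allowed to connect, such as a VPN range or a single /32

Resources:
  SshIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref SecurityGroupId
      Description: SSH from the allowed CIDR
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      CidrIp: !Ref AllowCidr
```

An RDP version would be identical apart from the port, which is 3389 for RDP.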

RDP

No Access: a template with no rules, to use temporarily for our Lambda functions since we don’t know what rules we will need yet.

Modify the deployment script to obtain a CIDR

To obtain the CIDR for now I’m going to pass it into my command line script using echo and read.

If you’re creating this for your own use and you only want to allow your IP address, you can use your public IP address. For a single IP address you have to put /32 at the end to make it proper CIDR notation. You could add some code to use a third-party service to look up your IP address but I’ve had mixed results with those services. I have another solution in mind which I’ll try to write about later.
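The /32 handling can be sketched as a small helper that the deployment script calls after reading the value. This is an illustration only. The function name `to_cidr` is hypothetical, and the check covers only the shape of an IPv4 CIDR, not whether each octet is 255 or less.

```shell
#!/usr/bin/env bash
# Turn user input into CIDR notation: a bare IPv4 address gets /32 appended.
# Prints the CIDR on success and returns non-zero for anything that does not
# look like an IPv4 address with a prefix length of 0 to 32.
to_cidr() {
  local input="$1"
  case "$input" in
    */*) ;;                      # already has a prefix length
    *)   input="${input}/32" ;;  # single address
  esac
  if printf '%s\n' "$input" |
     grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$'; then
    printf '%s\n' "$input"
  else
    return 1
  fi
}

# Interactive use with echo and read:
#   echo "Enter the IP address or CIDR allowed to connect:"
#   read -r answer
#   cidr="$(to_cidr "$answer")" || { echo "Not a valid IPv4 CIDR" >&2; exit 1; }
```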

If you don’t know what your external, Internet-facing IP address is, search “what is my ip” in Google. That’s the IP address you need to use in your network rules on AWS.

You do not want to use the local IP address of your laptop, which you can get using ipconfig or ifconfig depending on your operating system. These addresses should only be accessible to devices on your local network.

Modify the security group function to call the rules template (passed in) and set the CIDR if provided as a parameter.
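The modified function is not reproduced here. One way to pass the CIDR only when it is provided is sketched below. The function name, stack naming scheme, and the `SecurityGroupId` and `AllowCidr` parameter names are assumptions. `AWS_CLI` is a hypothetical override for previewing the command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: deploy a rules template against an existing group.
# Arguments: name, security group ID, rules template file, optional CIDR.
# Set AWS_CLI=echo to print the command instead of running it.
deploy_rules() {
  local name="$1" group_id="$2" rules_template="$3" cidr="${4:-}"

  # Rebuild the argument list as the set of parameter overrides.
  set -- "SecurityGroupId=${group_id}"
  # Templates without a CIDR parameter (such as a no-access ruleset)
  # are called without the fourth argument.
  if [ -n "$cidr" ]; then
    set -- "$@" "AllowCidr=${cidr}"
  fi

  "${AWS_CLI:-aws}" cloudformation deploy \
    --stack-name "SecurityGroupRules-${name}" \
    --template-file "$rules_template" \
    --parameter-overrides "$@"
}

# Hypothetical calls:
#   deploy_rules SSH "$ssh_group_id" SSHRules.yaml "$cidr"
#   deploy_rules NoAccess "$lambda_group_id" NoAccessRules.yaml
```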

Now test deploying our rules.

./deploy.sh

After the script runs, check to see that all your security groups exist with the proper rules.

As I was creating these templates I was thinking about future use cases. I’m not sure the templates will stay exactly like this, but they work pretty well for the immediate use case.

Now for final testing, I can deploy two EC2 instances in the remote access VPC, one Windows and one Linux, with the respective security groups, and test to make sure I can log in. If I have any problems there are a couple of things I can do.

Look at VPC Flow Logs. Search for the IP address from which I’m trying to connect. Search for rejected packets to determine if anything from the Internet is getting rejected. Search for the private address of the host in AWS and see if that has any rejected packets.

Run the AWS Reachability tool. I don’t usually use this as I tend to look at logs but you may find it useful.
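For the Flow Logs option, the search for rejected packets can be sketched as a small filter. It assumes records in the default flow log format, exported as space-separated text, where fields 4 and 5 are the source and destination addresses and field 13 is the action. The function name `rejected_for_ip` is hypothetical.

```shell
#!/usr/bin/env bash
# Print flow log records that were rejected and involve the given address,
# either as the source or as the destination.
# Input: VPC Flow Log records in the default format on stdin.
rejected_for_ip() {
  local ip="$1"
  awk -v ip="$ip" '$13 == "REJECT" && ($4 == ip || $5 == ip)'
}

# Hypothetical usage: check both my public IP and the private IP of the host.
#   rejected_for_ip 203.0.113.7 < flowlogs.txt
#   rejected_for_ip 10.0.1.5    < flowlogs.txt
```

A rejected record whose destination port is 22 or 3389 usually points at a missing or mismatched ingress rule.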

What is VPC Reachability Analyzer?

Now if we need to add new security groups we can use our Security Group template and use an existing or new rule template. If we have multiple remote users with different IP addresses, we can create a separate rule for each user.

We still have some cleanup to do. Follow for updates.

Teri Radichel


All the posts in this series:

Automating Cybersecurity Metrics (ACM)

____________________________________________

Author:

Cybersecurity for Executives in the Age of Cloud on Amazon


Automated Creation of Security Groups on AWS was originally published in Cloud Security on Medium, where people are continuing the conversation by highlighting and responding to this story.
