ACM.103 Provide access to S3 (and yum) in network rules without adding every S3 CIDR to maintain zero-trust networking
This is a continuation of my series on Automating Cybersecurity Metrics.
In the last post, I showed you how to access Git using a static IP address (otherwise known as an EIP on AWS) to add network protection to your GitHub repository.
Limiting Access to an AWS EIP in GitHub
Now we need to install Git. I’m going to show you how to do that while maintaining a zero-trust network.
Every time I think I am going to write a short blog post it turns out to have some complexity that ends up taking me a lot of time to work around. Hopefully this one is quick. I’m not going to go deep into secure deployment of software packages — that’s a topic I often cover on calls with IANS Clients.
I don’t recommend deploying straight from the Internet at a security-conscious organization, but when you are just learning and testing from home, you have to start somewhere. So we’re going to use yum to download and install Git from the Internet in this post.
Install Git
The first thing we need to do to get the code is install Git, a tool used to interact with GitHub and some other source code repositories. Git allows you to run commands to upload and download code (depending on your permissions for a particular repository). There are a number of ways to install it depending on your operating system.
On a Linux EC2 instance it’s this simple:
sudo yum install git
“Could not retrieve mirrorlist” on an AWS EC2 Instance
Like I said, every time I think I’m going to write a simple blog post, there’s something I forgot about that I need to stop and tell you. Of course it doesn’t work from our Developer VM:
Could not retrieve mirrorlist https://amazonlinux-2-repos-xxxx.s3.dualstack.xxx.amazonaws.com/2/core/latest/aarch64/mirror.list error was 12: Timeout on https://amazonlinux-2-repos-xxxx.s3.dualstack.us-east-2.amazonaws.com/2/core/latest/aarch64/mirror.list: (28, 'Failed to connect to amazonlinux-2-repos-us-east-2.s3.dualstack.xxxx.amazonaws.com port 443 after 2702 ms: Connection timed out')
Do you know why? Because our network doesn’t allow outbound access to S3 to get the software. AWS serves the Amazon Linux yum repositories from S3, and S3 requires HTTPS to use the service. We need to open up port 443 TCP outbound to the AWS S3 IP address ranges.
If you want proof that is the problem or if you were troubleshooting this issue, you could go look at VPC Flow Logs, which we enabled on our VPC earlier in this series.
We added Flow Logs to our Remote Access VPC here:
Troubleshooting Network Access with VPC Flow Logs
Click on your VPC as we did before and navigate to Flow Logs:
As mentioned before, you’ll see the private IP address of your EC2 instance in the logs. In this case we also see the public IP address of the host the EC2 instance is trying to reach, which is S3.
The next column indicates the protocol — 6 — which is TCP.
The traffic is rejected.
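A rejected flow toward S3 in the default flow log record format might look something like the line below. The account ID, ENI, and addresses are placeholders of mine; the fields to focus on are the protocol (6 = TCP) and the action (REJECT):

```shell
# Illustrative VPC Flow Log record (default format, placeholder values):
# version account-id interface-id srcaddr dstaddr srcport dstport protocol
# packets bytes start end action log-status
record='2 123456789012 eni-0abc12345 10.0.1.25 52.216.1.10 44321 443 6 3 180 1670000000 1670000060 REJECT OK'

# Pull out the protocol (field 8) and the action (field 13)
echo "$record" | awk '{print "protocol=" $8, "action=" $13}'
# prints: protocol=6 action=REJECT
```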
Now the problem is that we can’t just allow this one IP address because AWS uses many IP addresses to run the S3 service. Recall you can look this up on the AWS IP list:
https://ip-ranges.amazonaws.com/ip-ranges.json
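To see the scale of the problem, you can pull the S3 entries out of that JSON file. Here is a sketch using a small inline sample; the real file lists far more prefixes, and in practice you would fetch it with `curl -s https://ip-ranges.amazonaws.com/ip-ranges.json`:

```shell
# Abbreviated, illustrative sample of the AWS ip-ranges.json feed
cat > /tmp/ip-ranges-sample.json <<'EOF'
{"prefixes":[
  {"ip_prefix":"52.216.0.0/15","region":"us-east-1","service":"S3"},
  {"ip_prefix":"3.5.128.0/21","region":"us-east-2","service":"S3"},
  {"ip_prefix":"52.95.110.0/24","region":"us-east-2","service":"AMAZON"}
]}
EOF

# Extract just the S3 prefixes
grep '"service":"S3"' /tmp/ip-ranges-sample.json |
  sed 's/.*"ip_prefix":"\([^"]*\)".*/\1/'
# prints:
# 52.216.0.0/15
# 3.5.128.0/21
```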
It would be time-consuming and error-prone to add all those IP addresses. Not to mention, if they change, we would need to ensure that we update our list.
AWS-managed prefix lists
Luckily, AWS offers something called a prefix list that partially solves our problem. This is SO much better than trying to keep up with the AWS IP ranges and I wish it had been around when I was implementing networking for Capital One.
We can use the prefix list in a network rule instead of all the IP addresses associated with the service. When the IP addresses for the service change, we’ll automatically get those changes via the prefix list. It essentially acts as an alias for the service’s IP ranges, much like a domain name does for a host.
Work with AWS-managed prefix lists
When looking at the AWS console, you can see the prefix lists, sort of. On my screen the column isn’t wide enough to show the full prefix list name, so I have no idea which one of these is for S3. You also can’t use that name to add the prefix list to the rule; it gives you an error. Hopefully they fix that. #awswishlist.
You can use the AWS CLI to query the list to get the ID starting with pl- for the list you want or look up the prefix lists in the AWS console. Click Managed prefix lists on the VPC dashboard. We are going to use the prefix list ID in the first column in our security group rule for the services we want to access from our EC2 instance.
Create a Developer Security Group
Now I need to determine if I am going to create a new group or add the rule to my existing SSH group. I’m looking ahead and I think I might need to reuse the SSH group for different roles in my organization.
For example, I might have a penetration tester that needs SSH access, a Security team member, a DevOps or ProdOps user, plus our Developer user. All these different roles will likely need SSH access but beyond that their permissions and network requirements may be different. So I’m going to create a new Developer security group. We can apply both the Developer and the SSH security groups to our Developer EC2 instance.
We’ll need to add an egress rule to allow S3 access. Our instance is trying to initiate outbound traffic to S3. Recall that security group rules are stateful, so we only need to add the outbound rule; the return traffic is allowed automatically. Instead of a CIDR we will specify a DestinationPrefixListId.
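As a rough sketch, such an egress rule in CloudFormation might look like the following. The logical IDs and parameter names here are placeholders of mine, not the exact template from this series:

```yaml
# Sketch: security group egress rule using a prefix list instead of a CIDR
S3EgressRule:
  Type: AWS::EC2::SecurityGroupEgress
  Properties:
    GroupId: !Ref DeveloperSecurityGroup    # placeholder reference
    Description: HTTPS outbound to S3 for yum
    IpProtocol: tcp
    FromPort: 443
    ToPort: 443
    DestinationPrefixListId: !Ref S3PrefixListId   # passed in by the deploy script
```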
Get the ID for the S3 prefix list to use in the rule.
Now here’s the problem. I need to write a template that works for anyone, but this prefix list ID is specific to the current AWS region. How can I make this template generic? I’ll need to pass in the ID. I can look up the ID for the current region in my deploy script like this:
aws ec2 describe-managed-prefix-lists --filters Name=owner-id,Values=AWS --output text | grep s3 | cut -f5
I can put that into a reusable function:
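A hypothetical version of that function might look like this. The function names are my own, and the parsing step is split out so it can be exercised without calling AWS; the sample output line (with the ID in field 5, matching the one-liner above) is illustrative, not real AWS output:

```shell
# Pulls the pl- ID out of the text output (field 5 on the line naming s3)
extract_s3_prefix_list_id() {
  grep 's3' | cut -f5
}

# Resolves the S3 managed prefix list ID for the current region
get_s3_prefix_list_id() {
  aws ec2 describe-managed-prefix-lists \
    --filters Name=owner-id,Values=AWS \
    --output text | extract_s3_prefix_list_id
}

# Abbreviated, illustrative text output:
printf 'PREFIXLISTS\tIPv4\t60\tAWS\tpl-7ba54012\tcom.amazonaws.us-east-2.s3\n' |
  extract_s3_prefix_list_id
# prints: pl-7ba54012
```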
My developer security group is different enough that I don’t want to muck up my existing security group function so I’m going to create a new function. There is still a lot in common so I’ll continue thinking about how I can use abstraction to reduce lines of code but for now it’s a separate function:
Here’s my security group rules template:
NACL Rules for S3 access
Now the problem we have is that NACLs don’t support prefix lists, and even if they do at some point, I would most likely exhaust the rule limit for a NACL, since each underlying CIDR counts toward that limit. Recall from a prior post that NACLs are stateless. They don’t reassemble the packets in a request but instead inspect one packet at a time. They pretty much only have access to the source and destination IP ranges, ports, and protocols. That makes them faster, but they offer less functionality.
What can we do about our NACL rules? Recall that I mentioned I generally use NACLs for broad access rather than super fine-grained access. I’m not going to try to figure out every S3 IP range and try to keep that up to date. But I am going to only allow port 443 outbound and ephemeral ports inbound to support this rule, rather than simply allowing all traffic. Let’s add those two rules to our remote access VPC.
I’m going to keep my existing template around because it might come in handy later if we need to allow only SSH access. I’m not sure about that yet. For now I am going to add a new set of NACLs for my remote access (in other words, developer) VPC. I might even rename that VPC later, but as we know, renaming breaks a lot of things, so I’m not doing it now. I put those rules in a Developer.yaml NACL rules file.
Our rules are getting a bit more complex so I added a comment for each rule. I recommend doing that so you can remember why all those rules exist later. Trust me, it will help you. I also made a clearer delineation between ingress and egress rules.
Parameters have not changed:
Ingress rules:
Egress rules:
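As an illustration, NACL entries for this use case might look roughly like the following. The rule numbers, logical IDs, and the NACL reference are placeholders, not the exact rules from the Developer.yaml file:

```yaml
# Ingress: return traffic from S3 arrives on ephemeral ports
DeveloperNACLIngressEphemeral:
  Type: AWS::EC2::NetworkAclEntry
  Properties:
    NetworkAclId: !Ref DeveloperNACL   # placeholder reference
    RuleNumber: 203
    Protocol: 6            # TCP
    RuleAction: allow
    CidrBlock: 0.0.0.0/0
    PortRange:
      From: 1024
      To: 65535

# Egress: outbound HTTPS to S3
DeveloperNACLEgressHTTPS:
  Type: AWS::EC2::NetworkAclEntry
  Properties:
    NetworkAclId: !Ref DeveloperNACL
    RuleNumber: 204
    Protocol: 6
    RuleAction: allow
    Egress: true
    CidrBlock: 0.0.0.0/0
    PortRange:
      From: 443
      To: 443
```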
Next I need to deploy the NACL and switch the remote VPC subnets to use our new ruleset.
I’m going to change this:
To this:
Now upon trying to deploy, I got this error, which is a bit frustrating:
“The network acl entry identified by 200 already exists. (Service: Ec2xxx, Status Code: 400, Request ID: )” (RequestToken: xxxx, HandlerErrorCode: GeneralServiceException)
I feel like CloudFormation should handle this properly since we are simply adding new rules. To get around it, I renumbered my rules starting with 203, 204, and so on, so that none of the numbers overlap with the existing rule numbers in my NACL.
A network implementation that is commonly abused by attackers
Now notice above we had to allow ephemeral ports in both directions for this to work. That’s a LOT of open ports. What if we had to open ephemeral ports both directions to the entire Internet and no other security controls? We’d have more open than closed ports and not much in the way of network security.
A particular traffic pattern that I can see attackers looking for in their scan attempts is a request and response on two high ports. I imagine attackers understand that some people simply open ephemeral ports both ways to the entire Internet. In some cases, it’s hard to do otherwise if systems, products, and networks are not designed with network security in mind. (I’m looking at you Google QUIC).
Scanners are sending request and response traffic to two ports in the ephemeral range. What good does that do them? Well they would get a report back from their scanner that they have two open ports to work with if they are trying to carry out an attack that requires a request and a response. What do I mean exactly?
Generally you have to write software to connect on one port and send the data back on another. That’s why we have to open up 443 inbound and all the ephemeral ports outbound to send data back to the individual requesting IP addresses. There are a few protocols that send and receive traffic on the same port (NTP on 123 UDP) but typically not when using TCP. If an attacker finds two open ports, they can attack a system and set up their malware to use those two open ports to send and receive information from the infected host.
I created a rule on my local firewall to block this traffic. I call the rule “two high ports” and I block any traffic that comes from and goes to an ephemeral port — because most of the time that’s not normal. It’s definitely not a requirement on my network. I block it right out of the gate before I do any other traffic inspection and that reduces the load on my firewall — because I see a LOT of that traffic. So there’s a tip for you that might help you improve your network performance if you can do it right out of the gate and better yet with a stateless packet inspection device. Ditch that noise!
We can’t really do that in an AWS NACL exactly. But we also are not allowing two ephemeral port ranges in both directions. In this case, the ephemeral ports in one direction are only open to one IP address. So that doesn’t do attackers much good if they are trying to use two open high ports to carry out their dirty work.
What if we had a network design that required ephemeral ports in both directions to the entire Internet? That’s where you want to think about how you can break up your architecture with services that communicate behind the scenes — one for the inbound traffic and one for the outbound traffic. But that’s getting a bit beyond what this blog post is about.
In addition to the above design considerations, our security group has zero trust rules if the attacker gets past our NACL. We cannot define both sides of the connection like we can with a NACL or use deny rules but we can be very specific about which CIDRs our developer machine can communicate with — thanks to the S3 prefix list.
Test installing Git again…
Now that we have updated those rules, we can test installing git again.
And…it doesn’t work. Do you know why? The first thing I did was to try to go look at my security group rules. Navigate to your instance in the EC2 dashboard. Click Security. Here you can see your security groups and network rules.
I only see the Remote Access security group here. We need to add the other security group to our EC2 instance.
If I needed to I could also click on Networking and then click on the subnet to review those rules for any issues.
You can also try to use the Reachability Analyzer but for some reason, the last time I tried it — it didn’t really help me as much as just looking at the rules. Let’s take a look.
Ah yes… now I remember. My source is external to AWS and there’s no option here for that. This doesn’t help me. It would be pretty simple to analyze the rules for an external source. But you can also just look at the network rules, and once you understand how they work you can troubleshoot that way. If you design your network well, you can reduce the number of rules in each network segment to keep things less confusing. Unless you’re using a product like Active Directory. ;-D
Anyway, I know I need to add a security group so let’s do that now.
Adding multiple security groups to an EC2 instance
Head over to the EC2 template we created. I’m going to change my security group parameter to a list and pass in security group IDs instead of exported values:
The reason I’m not going to use the export values is that Fn::ImportValue doesn’t operate on lists, as far as I can tell from the documentation.
Instead we will reference our list parameter:
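A minimal sketch of what that might look like in the template, with parameter and logical resource names that are illustrative and the other required instance properties omitted:

```yaml
Parameters:
  SecurityGroupIds:
    Type: List<AWS::EC2::SecurityGroup::Id>

Resources:
  DeveloperVM:
    Type: AWS::EC2::Instance
    Properties:
      # ...ImageId, InstanceType, etc. omitted...
      SecurityGroupIds: !Ref SecurityGroupIds
```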
Because this VM is getting pretty specific to developers, I’m going to create a new function named deploy_developer_vm:
I’m going to add a new argument to my function that deploys a VM that takes the list of security group IDs (sgids in the code below) so I can pass it into the template above.
OK now I need to get the security group IDs from my exports in my new deploy_developer_vm function and pass those into the function that deploys a VM with the other relevant arguments.
Search on security groups to get my stack names.
Get the output name for each stack:
Add the deploy_developer_vm function:
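A hypothetical sketch of that function follows. The export names, the `deploy_vm` helper, and the `join_sgids` helper are assumptions of mine based on the surrounding text, not the exact code from this series:

```shell
# Join security group IDs into a comma-separated, CLI-friendly list
join_sgids() { local IFS=','; echo "$*"; }

deploy_developer_vm() {
  # Look up each security group ID from its stack's exported output
  # (export names here are placeholders)
  sg_ssh=$(aws cloudformation list-exports \
    --query "Exports[?Name=='RemoteAccessSecurityGroupId'].Value" --output text)
  sg_dev=$(aws cloudformation list-exports \
    --query "Exports[?Name=='DeveloperSecurityGroupId'].Value" --output text)

  # Pass the list into the generic function that deploys a VM
  deploy_vm "DeveloperVM" "$(join_sgids "$sg_ssh" "$sg_dev")"
}
```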
Try to deploy our VM using the deploy.sh file in the same directory.
A couple of errors to resolve.
Since I’m now passing in a list I can remove the dash that is used for an individual item in a list in YAML and just reference the list.
One of my security groups and the subnet belong to different networks. Hmm… how did that happen when I’m using the same security group template for each?
I’m passing in the VPC ID to the Security Group template:
I had added my new developer security group at the end of the list of security groups for my Application VPC:
I need to move it under the remote access VPC and redeploy it.
Not only that, I realized I was referencing the incorrect stack name. These are the kinds of things you might hit when deploying networking, and now you know how to fix them.
The next error I got:
Export Developer-ami-08f1b667d4bd99bd1 cannot be updated as it is in use by Network-EIP-RemoteAccessEIP
Now we have a problem. We cannot redeploy the VM because the EIP stack we created is referencing it.
Deploying an AWS Elastic IP Address
We could delete the EIP stack but then we would lose our IP address. If the IP address changes then we have to go back and change the local network firewall rules we created and update them to the new address.
Local Firewall Rules to Connect to an AWS EIP via SSH
How can we solve this problem without losing our EIP?
Here’s our EIP stack:
We can remove the dependency on the export name parameter because the EIP CloudFormation resource does not require an InstanceId in order to deploy. We can simply remove the InstanceId reference, and then we end up with this:
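The resulting resource might look roughly like this; the logical ID is illustrative:

```yaml
# EIP with no InstanceId reference, so the instance can be redeployed
# independently of the address
RemoteAccessEIP:
  Type: AWS::EC2::EIP
  Properties:
    Domain: vpc
```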
Redeploy the template and the same IP address still exists, but it is not associated with our EC2 instance.
Now try to redeploy our EC2 instance to apply the new security group.
That works. There’s no dependency on the EIP but now our EC2 instance is not associated with the EIP that is allowed through our firewall either.
Check out the next post where I fix that problem.
Follow for updates.
Teri Radichel
If you liked this story please clap and follow:
Medium: Teri Radichel or Email List: Teri Radichel
Twitter: @teriradichel or @2ndSightLab
Request services via LinkedIn: Teri Radichel or IANS Research
© 2nd Sight Lab 2022
All the posts in this series:
Automating Cybersecurity Metrics (ACM)
____________________________________________
Author:
Cybersecurity for Executives in the Age of Cloud on Amazon
Need Cloud Security Training? 2nd Sight Lab Cloud Security Training
Is your cloud secure? Hire 2nd Sight Lab for a penetration test or security assessment.
Have a Cybersecurity or Cloud Security Question? Ask Teri Radichel by scheduling a call with IANS Research.
Cybersecurity & Cloud Security Resources by Teri Radichel: Cybersecurity and Cloud security classes, articles, white papers, presentations, and podcasts
Prefix Lists in Network Rules to Access AWS Services Without CIDRs was originally published in Cloud Security on Medium, where people are continuing the conversation by highlighting and responding to this story.