
Creating an AWS Security Group rule to Access GitHub with a Customer-Managed Prefix List

ACM.105 Limiting outbound access to a list of IP addresses represented by a single rule

This is a continuation of my series on Automating Cybersecurity Metrics.

In the last post, we looked at how an EIPAssociation could resolve some dependency issues with deployment and deletion scripts that we ran across while trying to deploy a new security group with an AWS-managed prefix list to an EC2 instance.

How an EIPAssociation in CloudFormation can Help Prevent Dependency Issues

We created the security group with the AWS-managed prefix lists to allow network access to an AWS service in the post before that.

Prefix Lists in Network Rules to Access AWS Services Without CIDRs

But what if you want to get to another type of service that, like S3, requires access to many different IP addresses, but is not an AWS service?

You can create your own prefix list on AWS to point to a set of IP addresses. It’s almost as if you can create and use a domain name, but not quite. AWS doesn’t support domain names in security group rules (or NACL rules, though you can use domain names with AWS Network Firewall).

By the way, I preface this with …I really hope AWS and GitHub (Microsoft) make this easier soon, without charging more money.

If you read all the way to the end (it’s long because of all the issues getting this working) you will see that although this is a good solution, it doesn’t solve all our problems with GitHub — topics I cover with customers on IANS Research consulting calls.

Why restrict access to GitHub in outbound networking?

What if we want to allow our developer machine to access GitHub to download code, but we don’t want to allow access to the entire Internet? Before I answer that question, let me answer the question of WHY you would want to do this. It’s so complicated. Networking is a pain. We have IAM and authentication, so why bother?

Tell me how authentication or IAM helps you in the event that malware gets onto your EC2 instance and is communicating with a C2 channel on another network, or even on another AWS host controlled by an attacker? It doesn’t. If you don’t know what a C2 channel is, it’s how attackers “call home,” or communicate with the attacker’s server. I wrote about that topic, among others, in my book linked at the bottom of this post.

Restricting network access helps us limit the damage that malware on our systems can do. It also hopefully triggers alerts when the rejected network traffic hits our logs in unexpected places (excluding the noise you’re going to get from hosts directly connected to the Internet). I’m talking about the traffic coming out of your host triggering rejects. Either something is misconfigured or someone is trying to do something they shouldn’t.

Again, I will mention that this solution is usually not good enough for large organizations that want to protect their IP in their source code repositories and prevent code exfiltration, but it is better than what a lot of companies and developers are doing at the moment. So it is a step in the right direction.

GitHub IP Ranges

The first thing we need to figure out is the list of GitHub IP addresses to which we need to provide access. Luckily, GitHub provides this to us. If they did not, it might be impossible to figure out. Sometimes companies have IP ranges allocated to them, but they use cloud services and content delivery networks (CDNs) that make it impossible to create networking rules. Thank you, GitHub.

Here’s how you can get the GitHub IP addresses:

About GitHub’s IP addresses – GitHub Docs

It’s accessible from this URL, similar to the list of AWS IP ranges:

https://api.github.com/meta

Currently the documentation includes the following warning:

For our purposes this list should be good enough.

Here is one of the challenges when monitoring a vendor IP range to add to your firewall. Github does not warn you if they change their IP ranges:

We make changes to our IP addresses from time to time. We do not recommend allowing by IP address, however if you use these IP ranges we strongly encourage regular monitoring of our API.

What happens if the IP ranges change and you are not monitoring them? Well, if someone else came along and used a dropped IP for something nefarious, that would be the worst scenario. The other is that your connections to GitHub would fail.

Let’s consider the first scenario. Take a look at the GitHub IP ranges. I already know just by looking at them that most of them are owned by Microsoft:

You can look up owners of IP addresses on ARIN and related regional registries as I’ve written about in other blog posts:

This one I didn’t immediately recognize. It’s owned by GitHub (acquired by Microsoft):

I’m pretty sure that every IP in this list is going to be owned by Microsoft. If you are at a large organization and need to be very specific you can vet that. You could write your code to ensure no IP addresses are added to the list that are not owned by Microsoft. In that case, it is highly likely that even if an IP is switched to some other purpose internally at Microsoft it won’t be something malicious. I think we’re pretty safe there.

In terms of not being able to access services, it will be important to monitor for changes and failures and update our prefix if we detect a change.

The other problem we need to consider is if we end up pointing to the wrong list of IP addresses that we are adding to our prefix list. You’ll want to ensure you use TLS to access the list. TLS (the successor of SSL) ensures your traffic is encrypted and validates that you get to the correct host for a domain name.

Prefix List properties

How can we use the GitHub list to create our prefix list? First of all I need a list of IP addresses. Let’s see what format it needs to be to pass into our prefix list and limits we may have.

AWS::EC2::PrefixList

AddressFamily — should really be “version” — and as I’ve explained before I’m only using IPv4.

Entries — these are our IP addresses. There’s a limit of 100. We’ll have to consider that in our design. An entry consists of a required CIDR and an optional description.

MaxEntries — It is not clear why this is required but it is. It seems like the max entries should be 100, no? We’ll just set it to 100.

PrefixListName — We can name our list without using tags thankfully. The name cannot start with com.amazonaws. It would probably make sense to name our list starting with com.github.
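Putting those four properties together, here is a minimal sketch of the resource, written out with a heredoc since everything else in this series is driven from shell scripts. The resource name, list name, and CIDR are illustrative:

```shell
# Minimal AWS::EC2::PrefixList sketch; name and CIDR are illustrative.
cat > PrefixListSketch.yaml <<'EOF'
Resources:
  GithubPrefixList:
    Type: 'AWS::EC2::PrefixList'
    Properties:
      PrefixListName: github-git
      AddressFamily: IPv4
      MaxEntries: 100
      Entries:
        - Cidr: 140.82.112.0/20
          Description: github-git
EOF
cat PrefixListSketch.yaml
```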

Get the list of GitHub CIDRs

Back to the GitHub list, how many entries do we have? Over 100? Could it increase to over 100? Let’s grab the IPs and count them. I can use curl to grab the list:

curl https://api.github.com/meta

Now, I don’t want the headings or the IPv6 addresses. How can I fix that?

Let’s use grep to only get things with a “.” and a “/” in them.

To grep things with a /:

curl https://api.github.com/meta | grep "/"

The period is a bit more complicated as it is a special character so we’ll need to escape it:

curl https://api.github.com/meta | grep "/" | grep "\."

Closer, but we don’t want quotes, commas, or spaces.

We can use sed to remove those:

curl https://api.github.com/meta | grep "/" | grep "\." | sed 's/"//g' | sed 's/ //g' | sed 's/,//g'

I’m sure there’s a way to make that shorter, but this makes it easier to see what we’re doing: choose every line with a slash, choose every line out of those results with a period, and replace double quotes, spaces, and commas with nothing.
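As an aside, a single extended-regex grep can do the same extraction in one pass. Here it runs against a small saved sample of the response (values illustrative) so we don’t burn API calls; swap the file for the live curl to use it for real:

```shell
# Save a small sample of the /meta response (values illustrative).
cat > /tmp/github-meta-sample.json <<'EOF'
{
  "git": [
    "140.82.112.0/20",
    "185.199.108.0/22"
  ],
  "packages": [
    "140.82.121.33/32"
  ]
}
EOF

# Pull out anything shaped like an IPv4 CIDR in one pass.
grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/[0-9]+' /tmp/github-meta-sample.json
```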

We can write a quick script to count the values:

Ouch. Over 2000 CIDRs.

Maybe there are some duplicates; add sort and uniq.

Not many. There are still over 2000 CIDRs.
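The counting and deduplication steps above can be sketched like this, run here against a saved sample (with a duplicate planted) rather than the live API:

```shell
# Sample response with one duplicate CIDR planted (values illustrative).
cat > /tmp/github-meta-dupes.json <<'EOF'
{
  "git": [
    "140.82.112.0/20",
    "185.199.108.0/22",
    "140.82.112.0/20"
  ]
}
EOF

# Same cleanup pipeline as before, then count total vs unique lines.
ips=$(grep "/" /tmp/github-meta-dupes.json | grep "\." \
  | sed 's/"//g' | sed 's/ //g' | sed 's/,//g')
echo "$ips" | wc -l               # total CIDRs
echo "$ips" | sort | uniq | wc -l # unique CIDRs
```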

Microsoft really has not made this easy on us, have they?

Reducing the number of GitHub IPs we require

Well, we could try to consolidate these CIDRs into fewer CIDRs. I wrote about that here:

Concatenating IP Ranges And Other Firewall Rule Tricks

We could also try to reduce the number by excluding services that we are not going to use. Run this command to get the list of GitHub services:

$ curl https://api.github.com/meta | grep '": \['

Well, I know that I need the git IPs. I’m not sure about the rest but I know I definitely don’t need or want actions right now. I’ll just start with git and add more ranges when I need them. We can choose all the IPs after “git” and before “packages” to get the list we want.

Now we’re down to 20 IPs.
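That range selection can be sketched with sed’s address ranges: print everything between the “git” key and the “packages” key, then reuse the cleanup pipeline. This runs against sample data (values illustrative); check the key order in the real response before relying on line ranges like this:

```shell
# Sample with keys before and after "git" (values illustrative).
cat > /tmp/github-meta-keys.json <<'EOF'
{
  "hooks": [
    "192.30.252.0/22"
  ],
  "git": [
    "140.82.112.0/20",
    "185.199.108.0/22"
  ],
  "packages": [
    "140.82.121.33/32"
  ]
}
EOF

# Keep only the lines from the "git" key through the "packages" key,
# then clean them up as before.
sed -n '/"git":/,/"packages":/p' /tmp/github-meta-keys.json \
  | grep "/" | grep "\." | sed 's/"//g' | sed 's/ //g' | sed 's/,//g'
```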

Be aware, by the way, that there is a rate limit when using the Github API. 🙂 I may have hit it…

Now we need to convert our values to a list that works with CloudFormation. Initially I thought I could just create a list but that doesn’t work as we have to create each entry in the format above. Unfortunate because it makes things much more complicated.

Here’s how I grepped a comma separated list of IPs.

Remove the count. Replace end of line with a comma and space.

I’ve got two problems left — the curl output and an extraneous comma.

Add -s to the curl command to suppress the progress meter. Add one more sed at the end to remove the last character, the trailing comma.

And now we have a comma separated list of CIDRs.
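The whole transformation can be sketched like this (sample CIDRs; in the real script the list comes from the pipeline above): tr swaps each newline for a comma, and a final sed trims the trailing comma that tr leaves behind.

```shell
# Sample cleaned CIDR list, one per line (values illustrative).
cat > /tmp/git-cidrs.txt <<'EOF'
140.82.112.0/20
185.199.108.0/22
192.30.252.0/22
EOF

# Join lines with commas, then strip the trailing comma left by tr.
csv=$(tr '\n' ',' < /tmp/git-cidrs.txt | sed 's/,$//')
echo "$csv"
```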

Using our IP list

We can move this code into a function that we can call when creating our customer-managed prefix list:

Now create a function to create the prefix list.

Create the CloudFormation template:

Update the deploy.sh script to deploy the prefix list.

Well, my prefix list failed to deploy.

It is at this point I realized I had missed the Entry format I posted above.

Now I’m stuck. We might be able to do something like this with Transform:

Taking JSON type as parameter for CloudFormation template

AWS::Include transform

But as I already mentioned I am trying to avoid S3. I would have to automate an S3 bucket deployment and go over all the security settings and options, some of which I haven’t even gotten to yet in this series. It’s like putting the cart before the horse as we say in the US.

It was at this point when I stepped away from my computer for a bit and when I got back my stack was still stuck in this state:

That’s a bug. Something I did caused a problem. I don’t remember what it was.

Just out of curiosity I tried passing this value in as a parameter to the CloudFormation template: "Cidr: 1.1.1.1/32". After overcoming some issues with bash spaces again, I got this error:

['1.1.1.1/32,'] value passed to --parameter-overrides must be of format Key=Value

I didn’t really expect that to work but it was worth a shot. I don’t really want to explain S3 bucket security and automation. That is coming later. What are my options? I can write a script to generate the CIDR portion of my template for now and use that in a hard-coded template and fix it later.

I could also just use the AWS CLI:

create-managed-prefix-list – AWS CLI 1.25.97 Command Reference

Let’s try that.

I always like to skip to and look at the examples:

Back to getting the GitHub IPs. We want to end up with this format:

Cidr=10.0.0.0/16,Description=vpc-a Cidr=10.2.0.0/16,Description=vpc-b

We will just add the same description for all the IPs for now: “github-git” so for each CIDR we’ll need this line with a space between each entry.

Cidr=x.x.x.x/xx,Description=github-git

Back to our get_github_ips function. We can pretty easily alter this to return the string we want. We need to restore the end-of-line characters so we can loop through each line, then concatenate the proper value for each line.
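Here is a sketch of the altered function. The CIDRs are hardcoded samples so testing it doesn’t burn API calls; the real version feeds in the pipeline output, and the function name here is illustrative:

```shell
# Build the space-separated Cidr=...,Description=... entry string that
# create-managed-prefix-list expects. CIDRs are hardcoded samples here.
get_github_entries() {
  cidrs="140.82.112.0/20
185.199.108.0/22"
  entries=""
  # CIDRs contain no spaces, so unquoted expansion splits on newlines.
  for cidr in $cidrs; do
    entries="$entries Cidr=$cidr,Description=github-git"
  done
  # Trim the leading space added on the first iteration.
  echo "${entries# }"
}

get_github_entries
```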

I then ran the get_github_ips alone to see the output and it looks correct:

Now we can easily concatenate the command string we want to execute using that:

c="aws ec2 create-managed-prefix-list
--address-family IPv4
--max-entries 10
--entries $entries
--prefix-list-name $listname"

Then execute the string stored in the variable (note that it is $c, not $(c), which would try to run a command named c):

$c

But what problems will we have when we try to use this approach? The next time we go to execute this script the list will already exist. We have to take some additional steps:

1. Determine if the list already exists.
2. Delete the list if it exists.
3. Then run our create command.
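Those steps might look like the following dry-run sketch, which only prints the lookup command instead of running it. The describe and delete subcommands are real AWS CLI calls, but the list name is illustrative and actually running them requires AWS credentials:

```shell
listname="github-git"  # illustrative name

# Look up the prefix list ID by name; the CLI prints "None" if no
# list matches the filter.
lookup="aws ec2 describe-managed-prefix-lists \
  --filters Name=prefix-list-name,Values=$listname \
  --query 'PrefixLists[0].PrefixListId' --output text"
echo "$lookup"

# To actually run it:
#   id=$(eval "$lookup")
#   if [ "$id" != "None" ]; then
#     aws ec2 delete-managed-prefix-list --prefix-list-id "$id"
#   fi
#   ...then run the create-managed-prefix-list command from above.
```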

That’s one of the nice things CloudFormation does for us. Also if we create this list outside of CloudFormation we can’t easily audit it along with all our other AWS resources. Enforcing everything to be created through CloudFormation when possible will make your security and probably DevOps teams’ jobs easier.

Creating a CloudFormation template on the fly

One other option I have would be to generate the whole CloudFormation template on the fly. I just generated code on the command line to generate a command. This is kind of a painful approach.

I could have the template with placeholders in a tmp file and then replace the values to create the CloudFormation template using sed.

To create the entries we can create a new function that looks like this:

Running the function independently we can confirm we get the results we want:

I make a copy of PrefixList.yaml for now and call it PrefixList.tmp. Notice I have placeholders for name and ips.

Create a new function to replace those values. While testing I’m not going to repeatedly call the Github list function and use up my quota. I can test with hardcoded values initially and just print the output to the screen.

My variables are placed in the proper locations:

OK, let’s replace them with the actual values, output the result to the template, and then deploy it. I’m going to move all this into one function. We need to add some extra slashes for sed to work properly and then replace our temp values with our variables. Make sure to put quotes around $entries or the new lines will not process correctly. Also, in the sed command below I replaced the default delimiter (/) with * because the entries string has slashes in it.
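Here is a self-contained sketch of that generate-from-tmp-file approach. The placeholder tokens and file names are illustrative (note that a short token like NAME would also match inside PrefixListName, so use something distinctive). One wrinkle: each newline embedded in a sed replacement must be escaped with a backslash, which is why the entries string below is built with trailing backslashes inside single quotes.

```shell
# Template copy with distinctive placeholders (names illustrative).
cat > PrefixList.tmp <<'EOF'
Resources:
  GithubPrefixList:
    Type: 'AWS::EC2::PrefixList'
    Properties:
      PrefixListName: LISTNAME_PLACEHOLDER
      AddressFamily: IPv4
      MaxEntries: 50
      Entries:
ENTRIES_PLACEHOLDER
EOF

listname="github-git"
# Trailing backslashes keep the newlines legal inside sed's replacement.
entries='        - Cidr: 140.82.112.0/20\
          Description: github-git\
        - Cidr: 185.199.108.0/22\
          Description: github-git'

# * as the s-command delimiter because the entries contain slashes;
# quotes around $entries so the embedded newlines survive.
sed "s*LISTNAME_PLACEHOLDER*$listname*g" PrefixList.tmp \
  | sed "s*ENTRIES_PLACEHOLDER*$entries*g" > GithubPrefixList.yaml
cat GithubPrefixList.yaml
```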

That worked but I noticed my entries are misaligned.

We can resolve this by moving the placeholder to the start of the line in the template:

Add the necessary spaces when we concatenate our entries:

A bit hokey, but it works:

Now output the results to a template and execute the template. Note that I renamed the template to be specific to Github at the top and use a variable to avoid typos. I remove the template file if it exists so I know my changes are getting deployed. If the file doesn’t exist something went wrong.

Finally! That works and the template is deployed. We can see the ID for the prefix list below that we can use in our security group rules to allow access to GitHub.

We can head over to the VPC dashboard and find our prefix list there. I didn’t end up using a name with a “.” in it because we can’t use a “.” in some of our other names, due to inconsistent AWS naming conventions. I wrote about that in a much earlier post in this series. We chose a naming format that works across services and in our CloudFormation stack names.

Add the prefix list to our security group rules

Add the new Egress rule to the Developer security group.
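The new rule references the prefix list by ID instead of a CIDR. A sketch of what that might look like (resource and parameter names are illustrative; port 443 because we clone over HTTPS later in the post):

```shell
# Security group sketch with an egress rule pointing at the prefix list.
cat > GithubSecurityGroupSketch.yaml <<'EOF'
Parameters:
  GithubPrefixListId:
    Type: String
  VpcId:
    Type: 'AWS::EC2::VPC::Id'
Resources:
  GithubSecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: Outbound HTTPS to GitHub git IP ranges only
      VpcId: !Ref VpcId
      SecurityGroupEgress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          DestinationPrefixListId: !Ref GithubPrefixListId
EOF
cat GithubSecurityGroupSketch.yaml
```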

Redeploy the developer security group.

Next problem. Even though I am only adding two rules to this security group, we are still facing the limit on rules per security group.

Apparently, even though I only have two rules, the number of rules is calculated by the quantity of CIDRs the prefix list represents. Based on experience, this could definitely cause a problem later if we want to continue to create a zero-trust network. The source of this problem is not zero-trust networking; it has to do with vendors designing services that require you to open up your network to so many different IP ranges and ports.

We can get around this problem in various ways by redesigning our network and how developers access source control systems and deployment pipelines. I’m not going to go into all the options here but for now I’m going to create two different security groups — one for GitHub and one for S3 — to resolve this problem.

Run these two commands:

mv Developer.yaml Github.yaml
cp Github.yaml S3.yaml

Modify the Github.yaml file to remove the S3 rule and parameter:

Modify the S3.yaml file to only include the S3 rule:

For right now I will just modify the function that deploys the developer security group to deploy both sets of rules.

There is probably a way to make this more generic, but for right now I’m going to change the function that deploys the developer security group into a function to deploy the S3 security group, since it needs an extra parameter for the S3 prefix list:

I can use the generic security group deployment for the GitHub group. I’ll add code to deploy both security groups to deploy.sh.

Now what I realized is that I’m still getting the same error — maximum number of rules — for the GitHub security group. I only have 20 rules in my Prefix List. You can add 60 inbound and 60 outbound rules (total of 120) to a security group. What is going on??

Amazon VPC quotas

The only thing I can guess is that I set the maximum rules in my prefix list to 100. Let’s change that to 50 and see what happens.

One thing I forgot here is that you have to change the .tmp template file used to generate the template above, or your change will simply be overwritten:

I personally find this error message particularly annoying. This error message requires a bunch of extra work for the customer that AWS should handle behind the scenes.

In my case I just deleted the stack and started over.

And… are you serious? That fixed the problem.

First of all, why does the max entries property exist at all? Is it supposed to allow a customer to limit entries to 20 for NACLs and 60 for security groups or something? OK, let’s say that is a valid property to add. Don’t use it to calculate whether a prefix list exceeds the rule limit! Use the actual number of CIDRs in the prefix list. Please fix. #awswishlist

I also deleted the Developer SG Rules stack as that is no longer required.

This is taking way too long, but that’s not because it is “networking.” It is because of the way all this is implemented, and because vendors who don’t consider firewall rules when designing systems force customers to add too many rules.

And, they will keep doing it until customers push them to change these practices.

Update the developer EC2 instance to use the new security groups

Next we have to update our EC2 instance to use the new security groups.

Head over to our EC2 code and update it to add the two new security groups to our VM.
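The instance change is small: swap the old developer group reference for the two new groups in the instance’s SecurityGroupIds list. A sketch (resource and parameter names are illustrative):

```shell
# Fragment of the instance resource with both new security groups.
cat > DeveloperInstanceSketch.yaml <<'EOF'
  DeveloperInstance:
    Type: 'AWS::EC2::Instance'
    Properties:
      SecurityGroupIds:
        - !Ref GithubSecurityGroupId
        - !Ref S3SecurityGroupId
EOF
cat DeveloperInstanceSketch.yaml
```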

Redeploy.

Now, unfortunately, that change removed the EIP. We need to redeploy the EIP association now (not the EIP, only the association).

And, delete the known_hosts file as we did before.

Test Github Clone again

When you try to test the git clone command again, you will notice that it fails. That’s because the instance had to be recreated and all data on it was lost during that redeployment. EC2 instances are ephemeral. If you want to install software and keep it around you’ll need to create a custom Amazon Machine Image (AMI) or store data on an EBS volume — two topics for another day.

Reinstall git. Add -y so you don’t have to confirm. Also update the OS to get any security updates.

sudo yum update -y
sudo yum install git -y

Now try your git clone command again.

git clone https://github.com/tradichel/SecurityMetricsAutomation

And…it works. Finally!!

Potential CloudFormation fixes for the above complications

This post took WAY longer than I expected. This is how it goes when deploying to cloud environments a lot. 🙂 I’m used to it. But I wish things were easier.

It is unfortunate that we cannot pass in a list of CIDRs (without using an S3 bucket) to CloudFormation to create this list. Perhaps someone at AWS will read this. There are a few potential solutions:

1. Pass in a comma-separated list of IPs if you don’t have descriptions.
2. Pass in a comma-separated list of descriptions and a comma-separated list of IPs, but that sounds error-prone and has a “code smell.” It’s just not right.
3. Perhaps a new “type” for CloudFormation is needed that takes a YAML or JSON snippet. That’s complicated, though, because it would be harder to validate and could lead to security problems. Maybe if it was a well-defined type.
4. Alternatively, don’t force people to use an S3 bucket but rather allow them to incorporate a list from a local file, as an “include” CloudFormation template. I could easily generate the portion of the template that defines the entries.

If you’re thinking what about custom functions in CloudFormation — I wrote about that before. CloudFormation is really data that defines a configuration or describes resources. It’s not executable code. Mixing the two is messy and not something a seasoned programmer generally does. You keep those two things separate. Adding executable code to CloudFormation is like adding executable code to XML or JSON. Not recommended. It lets pentesters and attackers like me do bad things. 🙂

Fixes by Vendors to reduce network rules

Rather than blame networking for being hard, let’s direct the blame to the appropriate source. Vendors need to design networking to minimize the number of rules customers have to add to firewalls to access their services. Period. This requires thoughtful network design.

If a customer truly has to access services all over the world, perhaps they need many different IP ranges. However, if I only need to access S3 in one region, there should be one, maybe two CIDRs for that and I should be able to point to only those through a domain name or prefix list.

Same for GitHub. Why do I need 20 different CIDRs to access GitHub, and that’s just for git? Then for all the other services I have to add 2,000 CIDR ranges? Come on. That’s ridiculous. Reduce those CIDRs for the customer instead of making me do the calculations to reduce that number.

I hope vendors will start being more considerate of their customers when it comes to networking, instead of telling customers they don’t need firewall rules anymore, when those rules and the associated network logging are among the best ways to tell if you have been breached. They also prevent a myriad of attacks better than authentication alone.

This post is a perfect example of the problems caused by too many network rules and overly complicated networking requirements.

Problems this solution doesn’t solve

As you can see in this post, I was able to pull down code from a public GitHub repository. Although you can restrict access to GitHub you can’t restrict access to a specific repository this way. This is why larger companies or companies concerned about developers either exfiltrating intellectual property or downloading random code with malware use alternate solutions. But for a small company concerned with malware establishing C2 channels, this can help (presuming they cannot establish a C2 channel using Github!)

Alright…this post took way too long.

Stay tuned as we try to add a VPC endpoint …finally… for this host to access CloudFormation and let’s see what happens to our existing access.

Follow for updates.

Teri Radichel

If you liked this story please clap and follow:

Medium: Teri Radichel or Email List: Teri Radichel
Twitter: @teriradichel or @2ndSightLab
Requests services via LinkedIn: Teri Radichel or IANS Research

© 2nd Sight Lab 2022

All the posts in this series:

Automating Cybersecurity Metrics (ACM)

____________________________________________

Author:

Cybersecurity for Executives in the Age of Cloud on Amazon

Need Cloud Security Training? 2nd Sight Lab Cloud Security Training

Is your cloud secure? Hire 2nd Sight Lab for a penetration test or security assessment.

Have a Cybersecurity or Cloud Security Question? Ask Teri Radichel by scheduling a call with IANS Research.

Cybersecurity & Cloud Security Resources by Teri Radichel: Cybersecurity and Cloud security classes, articles, white papers, presentations, and podcasts

Creating an AWS Security Group rule to Access GitHub with a Customer-Managed Prefix List was originally published in Cloud Security on Medium, where people are continuing the conversation by highlighting and responding to this story.