ACM.65 Yes, you need a VPC.
This is a continuation of my series on Automating Cybersecurity Metrics.
I used to say when I was a lead developer at Capital One that communication was the hardest part of my job. Writing the code was the easy part. Sometimes I think I explain things clearly and then people make comments about what I said and it wasn’t clear at all or I need to re-explain or clarify a point. What I had in my mind did not translate properly to the other person’s brain. That is apparently the case with networking on AWS.
Someone came away from reading my book (listed at the bottom of this post) and pulled out a single sentence to say, though not in so many words: see, she said networking is too complicated, therefore you shouldn’t bother with it. That is far from the point made in the chapter where that particular sentence exists. I suppose I need to revisit and clarify the three or four chapters that attempt to explain why networking is crucial for security. Maybe this blog post will help because I’m still seeing comments that miss the point of why we use network security at all.
It’s too complicated?
I’m honestly baffled by the argument that “networking is too complicated and people make mistakes therefore we shouldn’t do it.” I’ve heard this argument before. I also read it in the context of bastion hosts. They are often misconfigured so we shouldn’t bother. Is that a problem with the bastion host or the level of knowledge of the person who implemented it incorrectly? I’m sure they could learn to do it properly with enough time and effort.
We could put more thought into making networking easier to implement or use, as Ben Kehoe aptly pointed out in a tweet yesterday regarding VPC networking for Lambda functions. I agree. That is part of the reason for this blog series. I’m trying to show people how to do things they consider hard. But just because something is hard doesn’t mean it’s not worth implementing.
By the way, the same comment applies to encryption keys and IAM, which have been taking me way too long to demonstrate in this series due to cryptic error messages, implementation complexity, and in some cases what seem to be flaws in the logic. KMS has been the most troublesome for me personally, due to some strange behavior, implementation choices, and inconsistencies. I am working through it and hopefully making it easier for others to avoid these pitfalls along the way.
I used to tell the DevOps team I managed when I helped a security vendor move to AWS: If people are complaining, we aren’t doing it right. Either we failed to properly explain it (i.e., documentation, training, and helpful error messages) or we need to redesign it to work with the developer workflow. Cloud platforms can do the same with their security controls: make them easier for customers to use so they don’t get stuck and skip them altogether.
Since when is something being complicated a justification for not doing it when it prevents a disaster?
As I am writing this, Savannah is watching a hurricane on its way up the coast from Florida. I’m reminded of a building in Florida that was not properly constructed or updated after multiple hurricanes. It crashed down and killed people in the process.
What we know about the Surfside condo collapse
Would someone building a skyscraper say the foundation is complicated, time-consuming, or expensive, so we’re not going to do it? Apparently the owner of the building above made that choice and it wasn’t a good one. There’s a reason you need a proper foundation and engineering when you build a skyscraper: to keep it standing. If you live in an area prone to earthquakes or hurricanes you need to plan accordingly.
Your cloud systems exist in an environment susceptible to cyber attacks. Architect accordingly.
Security controls implemented incorrectly don’t help you
When I said a VPC won’t help you if you don’t configure it properly, the context was not at all that therefore you should not implement one. What I said was, if you add a VPC or a security group to your cloud resource but you fail to configure the network rules properly, it isn’t helping you. The point is — you need to learn to configure your networking properly by understanding how attacks work — not that you shouldn’t use networking at all because you don’t know how to do it.
I spent multiple chapters explaining how attackers break into networks and how a lack of network security gives them free rein to repeatedly bombard your Internet-exposed resources with attacks, brute force passwords, and exfiltrate data. I went on to explain how attackers can use your open network ports and proxy through network security controls to perform data exfiltration. I explained how a lack of Internet network controls and of network segregation on internal networks allowed attackers to carry out two of the most devastating ransomware attacks to date. Basic network controls would have prevented both those attacks.
Internal networks matter too
I once read a poll of penetration testers that asked them what security control would make their job hardest:
Prevention of lateral movement.
In other words: network segregation or, better yet, zero trust networks, to prevent attackers who have accessed one resource from pivoting to another. If you don’t prevent lateral movement in your cloud environment you’re making an attacker’s job much easier.
Zero trust networks severely limit what an attacker can do once they have breached a system. That is why everyone and anyone in security is talking about zero trust networks and IAM these days. And this, my friends, is one of the key benefits I saw when I revisited AWS and suggested that we could use it at Capital One. It is easier to implement zero-trust everything and segregate duties in the cloud than in a traditional data center or on-premises environment.
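To make the lateral movement point concrete, here is a minimal sketch of a default-deny check in Python. The tiers, CIDR blocks, and ports are made up for illustration, and this is not how AWS evaluates security groups or NACLs internally. It only shows the idea: traffic is rejected unless a rule explicitly allows it.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowRule:
    """One explicit allow rule; anything without a matching rule is denied."""
    source_cidr: str
    dest_cidr: str
    port: int


# Hypothetical rules for an imaginary three-tier application.
RULES = [
    AllowRule("10.0.1.0/24", "10.0.2.0/24", 443),     # web tier -> app tier over HTTPS
    AllowRule("10.0.2.0/24", "10.0.3.10/32", 5432),   # app tier -> one database host
]


def is_allowed(source, dest, port, rules=RULES):
    """Return True only if some rule explicitly allows this connection."""
    src = ip_address(source)
    dst = ip_address(dest)
    return any(
        src in ip_network(rule.source_cidr)
        and dst in ip_network(rule.dest_cidr)
        and port == rule.port
        for rule in rules
    )


print(is_allowed("10.0.1.5", "10.0.2.8", 443))     # web -> app: True
print(is_allowed("10.0.1.5", "10.0.3.10", 5432))   # web -> database directly: False
```

With rules like these, an attacker who compromises a web host cannot connect straight to the database, because no rule allows that path.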
Zero trust networking reveals security problems: an Azure case in point
I explained in another post related to problems I was having in Azure how I created a zero trust network. At one point Azure was telling me my IP address did not have permission to access the resource I was trying to access. The only problem was — it was not my IP address! The address was a 20.x.x.x address belonging to Microsoft. So why was a Microsoft IP address trying to access my private resources when I was directly logging in and trying to access those resources from my own laptop?
I reported the problem to Azure support. Someone responded to me later that there had been an “internal incident” and they couldn’t tell me about it because it was discovered by an internal secret system of some kind but the problem was resolved.
Here’s the thing. If I had not created a zero trust network for that resource I was trying to access — I would have never known that problem existed. And potentially Microsoft would not have either. Proper networking not only blocks unauthorized access, it helps you uncover security problems you might not otherwise know exist. You will be able to tell when someone is accessing something they shouldn’t based on rejects in your network logs — even when the person is using valid (possibly stolen) credentials.
AWS zero trust networking reveals yum updates coming from China
Here’s an example of how zero-trust networking alerted me to another interesting factoid. I was trying to run yum updates but they kept failing. I opened up various CIDR blocks on the AWS network.
Finally I figured out that my yum updates were coming from China and I had that network blocked. I contacted AWS support and they said that was expected because if a region was having issues it would fail over to some other region. But China? Aren’t there enough US regions? There was a way to configure the EC2 instance to only get updates from a specific region. I think the plan was to ensure that all yum updates were coming from a local region and that was a while ago so hopefully that is not happening anymore.
How would you spot that happening on your cloud resources without a zero trust network and especially if you don’t have any network logs at all for outbound traffic?
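One way to answer that question, assuming you already collect the destination addresses of outbound connections, is to compare them against the networks you expect to talk to. The sketch below uses placeholder CIDR blocks from private and documentation ranges, not real AWS or yum repository addresses, and the function name is made up for illustration.

```python
from ipaddress import ip_address, ip_network

# Hypothetical allowlist: the only networks this host is expected to reach.
EXPECTED_OUTBOUND = [
    ip_network("10.0.0.0/16"),       # internal VPC range (placeholder)
    ip_network("198.51.100.0/24"),   # approved update mirror (placeholder)
]


def unexpected_destinations(dest_ips, allowed=EXPECTED_OUTBOUND):
    """Return the destination IPs that fall outside every allowed network."""
    flagged = []
    for ip in dest_ips:
        addr = ip_address(ip)
        if not any(addr in net for net in allowed):
            flagged.append(ip)
    return flagged


observed = ["10.0.4.7", "198.51.100.20", "203.0.113.9"]
print(unexpected_destinations(observed))  # ['203.0.113.9']
```

A check like this only works if outbound traffic is logged in the first place, which is the point of the question above.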
Lack of host-based security controls in a serverless environment
I’ve also often explained in my book and elsewhere that host-based security controls can sometimes be turned off or bypassed by malware on a host. Your network controls cannot be affected by malware on a host and vice versa.
Most of the time you want to use both host-based and network-based controls. Running host-based agents on a Lambda function isn’t really feasible. In a serverless environment, networking is even more crucial due to the lack of host-based controls. Secure code, logging, and deployment processes are also critical since we can’t easily (though it may be theoretically possible) capture memory from a Lambda function after a security incident.
An investigation of a Lambda security incident will largely be based on application and network logs which will not provide insight into attacks carried out in memory. Although you may not be able to capture the memory, at some point, the attackers need to communicate on the network for their attacks to be useful. And that is where you will capture the evidence of a “fileless” malware attack, for example.
If you aren’t using a VPC, you won’t have network logs. If you don’t have network logs, you might not have any way to tell your system is compromised. If you don’t use zero-trust networking, you might not be aware that someone is trying to access something they shouldn’t.
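As a rough illustration of what those network logs give you, the sketch below assumes they are AWS VPC Flow Logs in the default (version 2) space-separated format and pulls out the rejected connections. The sample records are illustrative, and the function names are made up for this example.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one space-separated flow log record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected(lines):
    """Return only the records for traffic the network rules rejected."""
    return [rec for rec in map(parse_record, lines) if rec["action"] == "REJECT"]


# Illustrative records: one accepted SSH connection, one rejected RDP attempt.
SAMPLE = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]

for rec in rejected(SAMPLE):
    print(rec["srcaddr"], "->", rec["dstaddr"], "port", rec["dstport"])
```

The rejected record is the kind of evidence described above: something tried to reach a port it had no business reaching, whether or not it held valid credentials.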
Learn how to implement proper networking — and automate it
Creating zero trust networking and zero trust IAM is exactly what I have been showing you how to do in these blog posts, and I’m giving you the code! For free! You don’t have to figure it all out yourself. But you will need to learn some networking. I can’t determine from afar how your applications work or how your network needs to be constructed. If you need help with that you could schedule a call with me through IANS Research, the same way I used to help developers at Capital One.
Information Security Insights and CISO Guidance | IANS Research
By the way, yes, Capital One had a data breach. I wrote a white paper based on my experiences while there and how I would have done things differently. As I already mentioned in other posts, the Capital One breach seems like an architecture flaw. I would not have been involved in that decision even if I had still worked there at the time, but if I had been, I would have recommended an alternate approach. Why a firewall had access to every S3 bucket makes me curious. I’ve heard conflicting stories as to why it was configured that way. I have a friend who may write some blog posts on it but last time I spoke to that person, it sounded like that may or may not happen. And as always, security is hard and hindsight is 20/20.
I am confident that any solid software engineer or architect has the capacity and capability to design proper networks. It just takes some time and effort to properly architect your cloud environment, deployment systems, and security controls over and above the time and effort you put into “making an application work.”
If you want to know how to do that, I’m laying it out in this series. I hope to organize it all a bit more once I’m done but you can see the entire thought process and what I’ve done to date here — including how to deploy basic network controls on AWS with templates you can use to do it. And more on the way…I’m not done.
I’m halfway through showing you how to construct a basic network. I’ve already written posts pondering the overall work architecture. We need to get our Lambda functions deployed in a VPC with private network access and get developers making AWS calls on private AWS networks instead of sending all that traffic over the Internet, where the traffic is subject to man-in-the-middle attacks, credential abuse, and all the types of attacks that become impossible if an attacker cannot connect to the resource and the resource cannot connect to the attacker’s network, even if they have valid credentials.
So many topics — so little time.
Next up: how and why to create NACLs for a subnet and how they are different from Security Group rules, a question I get frequently.
Follow for updates.
Teri Radichel
© 2nd Sight Lab 2022
All the posts in this series:
Automating Cybersecurity Metrics (ACM)
____________________________________________
Author:
Cybersecurity for Executives in the Age of Cloud on Amazon
Why You Need a VPC was originally published in Cloud Security on Medium.