Cyber Defense Advisors

Authentication Flow for Batch Jobs

Considering the threat model and attack surface

This is a continuation of my series on Automating Cybersecurity Metrics.

I’m very happy to get back to my security metrics automation series in 2023, along with a bunch of other things I want to finish. I had a great Azure Security class at the end of the year, but I’m so tired of all the half-finished projects on this blog. We did finish one major house renovation and repair project that dragged on all last year. Now that distraction is out of the way (for a bit), I can hopefully focus and get some things done quickly. I hope!

Since it’s been a while I had to think through where I left off and what’s next. All the code is here on GitHub so far. Updates will continue with future blog posts.

GitHub – tradichel/SecurityMetricsAutomation

For now, let’s think through our objectives and how our system might be breached in order to design a secure batch job authentication flow.

As I mentioned at the start of this series, I want to figure out if I can require MFA for a batch job.

Creating an AWS Batch Job That Requires MFA

Making something work is not the same as making something secure. It’s not simply a process of creating the batch job. We also need to ensure we can deploy it securely. That objective led to a series of blog posts to create a secure foundation for our cloud environment, step by step.

Automating Cybersecurity Metrics (ACM)

My last post was about using multiple MFA devices to ensure you don’t get locked out of your account if your primary key is lost, stolen or broken.

Multiple MFA Devices for AWS IAM

We’ve got a per-user EC2 instance set up and we are accessing CloudFormation to deploy resources via a private network using a VPC endpoint. I’m using this instance currently to execute the CloudFormation templates and scripts I’ve checked into GitHub.

User-Specific EC2 Instance

Now I think I’m ready to finally start deploying the application and related infrastructure.

To set up our authentication workflow we’re going to have credentials and sessions at some point. We already looked at storing credentials in AWS Secrets Manager and Systems Manager Parameter Store:

Limiting Access to KMS Keys via Secrets Manager

We’re going to use what we learned to securely handle credentials and secrets used in the process.

We also considered Lambda networking security and how a lack thereof could lead to security problems:

Lambda Networking

And why you need a VPC:

Why You Need a VPC

I went on to deploy the network resources which exist in the GitHub code and are described in the related posts. We’ll use and possibly alter that networking for a Lambda function and a Batch job.

We used VPC endpoints to keep traffic private between our EC2 instance and CloudFormation.

AWS PrivateLink and VPC Endpoints

We’ll use that same concept to keep traffic private between AWS Lambda and AWS Secrets Manager or AWS Parameter Store.

Now we need to think through the architecture of our authentication process a bit more.

Triggering the authentication process

Something will need to trigger the authentication process.

We could have a scheduled batch job that runs once per day.

A user might want to manually start a batch job.

A job might be triggered as a result of some other event.

We might also have a batch job that is running via spot and fails due to price changes or fails for some other reason and needs to be restarted.

Our authentication process needs to handle all these scenarios.
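One way to keep those scenarios from turning into four separate code paths is to funnel every trigger into the same kind of request. The sketch below is only an illustration of that idea. The names TriggerType, JobRequest, and new_request are hypothetical and do not come from the GitHub repository.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TriggerType(Enum):
    """The trigger scenarios listed above (names are illustrative)."""
    SCHEDULED = "scheduled"  # a job that runs once per day
    MANUAL = "manual"        # a user decides to start a job
    EVENT = "event"          # some other event kicks off the job
    RESTART = "restart"      # a failed job (e.g. spot interruption) runs again


@dataclass(frozen=True)
class JobRequest:
    job_name: str
    trigger: TriggerType
    requested_by: Optional[str] = None  # only known up front for manual triggers
    authenticated: bool = False         # every request starts unauthenticated


def new_request(job_name: str, trigger: TriggerType,
                requested_by: Optional[str] = None) -> JobRequest:
    """Create a request; no trigger type gets to skip authentication."""
    if trigger is TriggerType.MANUAL and requested_by is None:
        raise ValueError("a manual trigger must identify the user")
    return JobRequest(job_name, trigger, requested_by)
```

Whatever the trigger, the request starts out unauthenticated, so the same authentication step applies to all four cases.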

Initially I thought about texting a user and having them respond with an MFA code.

After thinking it through, I’m going to separate the process for sending an alert that a batch job needs to be executed from the authentication process that kicks off execution of a batch job. There are a few reasons for this.

First of all, when an application requests an MFA code, you have a limited time to enter that code before the request expires. If the process to start authenticating and creating a session for a batch job is not user-initiated and times out, the user does not have a way to initiate the login process and start the job.
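The timing problem can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical 60-second window (MFA_WINDOW) and a hypothetical MfaChallenge record. The real expiry depends on the MFA mechanism in use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical window; the real value depends on the MFA mechanism.
MFA_WINDOW = timedelta(seconds=60)


@dataclass(frozen=True)
class MfaChallenge:
    job_name: str
    issued_at: datetime

    def is_expired(self, now: datetime) -> bool:
        """A challenge is only usable for a short time after it is issued."""
        return now - self.issued_at > MFA_WINDOW


def start_challenge(job_name: str, now: datetime) -> MfaChallenge:
    """Called when the user starts the login, so the clock starts with them."""
    return MfaChallenge(job_name, now)
```

If the system started the challenge on its own schedule, a user who arrived a few minutes late would find it expired with no way to retry. When the user starts it, an expired challenge simply means starting another one.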

I also reviewed various SMS security issues and don’t want to send codes, links, or possibly any secrets via SMS. More on the reasons for that decision in an upcoming post where I take a deeper look at the Oktapus breach.

The process will probably look something like this:

Send the user an alert if they need to allow a job to execute or restart a job.

Alternatively the user decides they want to trigger a new job and initiates the process if they are allowed to do that.

Users cannot directly generate or access the session used by a batch job to carry out actions in AWS. They will not have access to the batch job credentials. However, a user will have to enter the MFA code for the job to proceed.

In order to enter the MFA code or trigger a job the user will first have to log in with their own user-specific credentials to identify who triggered the job.

Segregating the users from sessions and limiting access to the active sessions will hopefully reduce the chance of someone obtaining and leveraging a session used by an active production batch job since all that will occur within a private network with no inbound access. We’ll have to take a look at outbound access on a per job basis.

Users cannot directly start batch jobs because they don’t have the required credentials. They have to go through a (hopefully) secure portal that can be closely monitored and trigger alerts if something suspicious occurs.
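The steps above can be sketched roughly as follows. This is not the implementation from the series, just an outline under stated assumptions: JobPortal, verify_login, and verify_mfa are hypothetical names, the verifiers are injected placeholders, and the random token stands in for whatever session the batch job would actually use.

```python
import secrets
from typing import Callable, Dict, List, Tuple


class JobPortal:
    """Hypothetical portal: users approve jobs but never see the job session."""

    def __init__(self,
                 verify_login: Callable[[str, str], bool],
                 verify_mfa: Callable[[str, str], bool]) -> None:
        self._verify_login = verify_login  # checks the user's own credentials
        self._verify_mfa = verify_mfa      # checks the MFA code for the job
        self._job_sessions: Dict[str, str] = {}  # job -> session, never returned
        self.audit_log: List[Tuple[str, str, str]] = []

    def approve_job(self, user: str, password: str,
                    job_name: str, mfa_code: str) -> str:
        # Step 1: the user logs in so we know who triggered the job.
        if not self._verify_login(user, password):
            self.audit_log.append((user, job_name, "login_failed"))
            raise PermissionError("authentication failed")
        # Step 2: the user supplies the MFA code the job needs to proceed.
        if not self._verify_mfa(job_name, mfa_code):
            self.audit_log.append((user, job_name, "mfa_failed"))
            raise PermissionError("authentication failed")
        # Step 3: the session is created and kept on the portal side only.
        self._job_sessions[job_name] = secrets.token_urlsafe(32)
        self.audit_log.append((user, job_name, "job_started"))
        return "started"  # a status, never the session itself
```

The caller only ever gets a status string back, and each attempt lands in an audit log that monitoring could alert on.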

We’re actually going to use multiple forms of MFA to limit the potential for stolen MFA codes.

This will all get fleshed out as I work through it. It’s like when I used to do oil painting in college: I put the paint on the paper and pushed it around. I wasn’t as much of a fan of watercolor, where everything had to be perfect at the start and wasn’t easy to adjust later. I have an idea in mind but it will morph as we go.

Let’s do some threat modeling. What kind of attacks do we need to consider as we build out this authentication system?

The impact of authentication and authorization on cybersecurity

I always review the latest breaches and breach statistics before each class and I’ve recently been reviewing some of the biggest breaches in 2022. As I explain in classes, it’s clear that stolen and abused credentials are one of the most common attack vectors in successful data breaches.

How we handle credentials will be critical to keeping cloud accounts secure. Attackers are moving beyond simple cross-site scripting (XSS) and other basic web attacks to more complex, targeted, and automated attacks. Even when application vulnerabilities are involved, the end goal is generally some form of credentials.

Phishing is one of the most dangerous problems. This past year attackers used phishing, or smishing (phishing via SMS), to get a user to enter credentials and an MFA code into a web site that impersonates a valid web site. Attackers used that approach in the Oktapus phishing campaign that affected a number of large companies. More thoughts on that breach in the next post. As I go through this implementation of batch job authentication to run a job, I want to figure out how we can prevent a similar fate.

Before giving away all my thoughts (and I do not claim to have all the answers, but I do have some ideas), I’ll let you ponder how you would prevent your users from entering MFA credentials in a bogus web form. Follow me for updates as I work through this problem.

The risk of existing session credentials

After a user has authenticated, the user name and password are not passed around anymore. The user obtains some sort of session id, token, or ticket that the system uses to verify the user has already been authenticated and determine what permissions they have. If an attacker can obtain the session identifier, they no longer have to bypass MFA. That’s part of the authentication process that the user previously completed.
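A toy example makes the point concrete. The SessionStore class below is hypothetical and far simpler than any real session mechanism. It only shows that the check after login looks at the token and nothing else.

```python
import secrets
from typing import Dict, Optional


class SessionStore:
    """Toy session store: MFA is checked once, at login, and never again."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}  # token -> user name

    def login(self, user: str, password_ok: bool, mfa_ok: bool) -> str:
        # Password and MFA are only evaluated here.
        if not (password_ok and mfa_ok):
            raise PermissionError("authentication failed")
        token = secrets.token_urlsafe(32)
        self._sessions[token] = user
        return token

    def whoami(self, token: str) -> Optional[str]:
        # Later requests present only the token. Whoever holds it is "the user".
        return self._sessions.get(token)
```

Anyone who copies the token returned by login gets the same answer from whoami as the legitimate user, without ever touching a password or an MFA device.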

I already wrote about how you might expose those credentials inadvertently here.

AWS Credentials in Boto3 and CLI Debug Output

How are we going to ensure an attacker cannot obtain access to an active session and perform actions that our batch jobs are allowed to perform? In my case, I’m trying to protect the session credentials from access by separating the trigger from the execution and the authentication of a user from the authentication of a batch process.

Secure and helpful error handling

We’ll need to make sure that our error messages are helpful but do not expose information useful to attackers that might help them access the system.

Thoughtful Error Handling

Proper error handling will help us resolve problems quickly and ensure an attacker cannot break a system and obtain information from the output we don’t want them to obtain.
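One common way to balance those two goals is to log the detail internally and hand the caller a generic message plus a reference ID. The sketch below assumes a hypothetical safe_error_response helper and a logger named batch_auth. Note that the internal log line itself still needs care, since exception text can contain secrets.

```python
import logging
import uuid
from typing import Dict

logger = logging.getLogger("batch_auth")  # hypothetical logger name


def safe_error_response(exc: Exception) -> Dict[str, str]:
    """Log the detail internally; return only a generic message and an ID."""
    error_id = uuid.uuid4().hex
    # The detail stays in our logs. Review what ends up here too, because
    # exception text can include resource names or credentials.
    logger.error("error_id=%s type=%s detail=%s",
                 error_id, type(exc).__name__, exc)
    return {
        "message": "The request could not be completed.",
        "error_id": error_id,  # lets support find the matching log entry
    }
```

The user can quote the error ID when asking for help, and whoever investigates can find the full detail in the logs without it ever appearing in a response.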

Automation that prevents access to credentials

This whole series has been covering secure automation, starting with deployment security that helps reduce blast radius and prevent misuse of credentials.

I haven’t explained everything I would do to secure an automated deployment pipeline (something I often cover with clients on IANS security consulting calls) but I’ve explained a lot of the groundwork.

We want to prevent attackers from obtaining credentials or access that lets them deploy code, or access that lets them deploy their own credentials and access. Both of those happened in the SolarWinds breach.

SolarWinds Hack: Retrospective

As I implement the batch job authentication flow I’m thinking about how I can securely deploy credentials without allowing external access to the credentials that perform the deployments. In the end, you may find this is a discussion that comes full circle if I can get to the point where I hope to end up.

Server-Side Request Forgery (SSRF)

However we implement authentication and authorization, we want to prevent SSRF such as what the attacker used in the Capital One Breach:

What’s in your cloud?

I wrote a blog post for IANS Research (that never got published) on use of cloud credentials in a way that facilitates SSRF attacks. That is essentially what the attacker in the Capital One case may have done, and it has occurred in many successful bug bounties. I might write more about that if time allows. This risk applies to pretty much all infrastructure as a service cloud providers.

I’ve also written and spoken about related attacks that can lead up to or contribute to SSRF attacks:

Teri Radichel

We’ll want to ensure our site is not susceptible to any of those.
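As one illustration of a defensive layer, any code that fetches a URL on a user's behalf can refuse destinations that point back into the infrastructure, such as the instance metadata address 169.254.169.254. The is_blocked_destination function below is a hypothetical helper, not a complete SSRF defense. It assumes the caller has already resolved the host name and connects to the same IP address it checked.

```python
import ipaddress


def is_blocked_destination(ip_text: str) -> bool:
    """Return True if a resolved IP address should never be fetched."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return True  # not a valid IP address: refuse rather than guess
    # Link-local covers the metadata service; loopback and private ranges
    # cover the host itself and the rest of the internal network.
    return ip.is_link_local or ip.is_loopback or ip.is_private
```

A check like this only helps alongside other controls, such as network rules and tightly scoped credentials, because input validation alone is easy to get wrong.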

Caching attacks

Another topic I’ve written and spoken about is the various types of caching attacks that could expose credentials or provide access to sensitive data. James Kettle of PortSwigger has spoken on this topic extensively. Improper configuration of caches, CDNs, and use of multiple servers with different methods of evaluating requests can lead to various types of attacks we’ll need to avoid.
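One small piece of this is telling caches not to keep responses that carry sensitive data. The sketch below uses a hypothetical with_no_store helper to add the standard Cache-Control: no-store header. It addresses only cached sensitive responses, not cache poisoning or the request-parsing discrepancies mentioned above.

```python
from typing import Dict, Mapping


def with_no_store(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the response headers that forbids caching."""
    merged = dict(headers)
    # no-store asks browsers and shared caches not to keep the response.
    merged["Cache-Control"] = "no-store"
    return merged
```

This would apply to any response that includes session data or anything else an attacker could reuse.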

DOM-Based XSS

I explained how another common attack can be pretty much eliminated using proper security controls — DOM XSS — in a paper and presentation for IANS Research. Unfortunately, the link to that presentation is no longer available but you may be able to find it in the IANS portal if you are a customer. I will need to make sure my web pages are not susceptible to those attacks, which I often find on cloud and application penetration tests when the proper controls are not in place.
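This post does not say which controls the paper recommended. As an assumption for illustration only, one widely used control is a Content Security Policy header, optionally with Trusted Types in browsers that support it. The build_csp function is a hypothetical helper and the directive values are examples, not a recommended policy for any particular site.

```python
from typing import Sequence


def build_csp(script_sources: Sequence[str] = ("'self'",)) -> str:
    """Build an example Content-Security-Policy header value."""
    directives = {
        "default-src": ["'self'"],
        "script-src": list(script_sources),  # where scripts may load from
        "object-src": ["'none'"],
        "base-uri": ["'none'"],
        # Trusted Types: supporting browsers reject raw strings passed to
        # DOM injection sinks such as innerHTML.
        "require-trusted-types-for": ["'script'"],
    }
    return "; ".join(
        f"{name} {' '.join(values)}" for name, values in directives.items()
    )
```

The resulting string would be sent as the value of the Content-Security-Policy response header.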

And more…

These are just a few of the things I look for on cloud and application penetration tests for customers. We’ll think through how we can defend against all of this as we work through the solution I’ve outlined at a high level, and will explain in much more detail in future posts.

Follow for updates.

Teri Radichel


© 2nd Sight Lab 2022

All the posts in this series:

Automating Cybersecurity Metrics (ACM)

GitHub – tradichel/SecurityMetricsAutomation


Authentication Flow for Batch Jobs was originally published in Cloud Security on Medium, where people are continuing the conversation by highlighting and responding to this story.