Introduction

During a security assessment, it is critical for the consultant or penetration tester to identify the true root cause of each vulnerability found, as this generally allows the most appropriate remediation advice to be given. For well-known vulnerabilities, such as SQL Injection, this can be a simple matter, but it becomes considerably more difficult the more unique a vulnerability is. This article will explore what a root cause is, why it is important to identify it accurately, and the common mistakes that people, including security professionals, make when identifying and understanding the root causes of vulnerabilities.

What is a Root Cause?

The root cause of a vulnerability is the specific thing that went wrong in architecting or developing a system or application. Identifying it enables a consultant to recommend a remedial action that fixes the vulnerability, rather than a mitigation that only reduces the likelihood or impact of exploitation (reducing either is useful, but only a remediation solves the issue in and of itself). This can be difficult in complex applications and environments with many moving parts, where it is hard to drill down to the specific cause of an exploitable behaviour. It can also be difficult, or impossible, for a consultant working within a fixed assessment window to understand the system well enough to pinpoint the precise root cause of an issue. In these cases, the consultant and their client (which may include the developers of the application or the administrators of the system) often need to work together to determine the root cause.

Case Study: Tracking Down the Root Cause of SQLi

To help understand why it can be challenging to identify the root cause of a vulnerability, we will investigate this through a case study of SQL Injection (SQLi). SQLi is a well-known vulnerability that has existed for decades; it is caused by the concatenation of untrusted input into a SQL query that allows a malicious actor to modify the query syntax to execute their own SQL commands. The most common exploitation payload used against a SQLi vulnerability ('or 1=1 --) makes a conditional statement always return true, which passes checks that would otherwise fail, or extracts more information from the statement than originally intended.

Consider the following SQL query, where user input is concatenated with a query string before being sent to the database for execution:

"SELECT * FROM products WHERE ProductID = '" + user_input + "';"

If a user entered a product id — such as AB101 — as the application expects, the query would return all the information about the specified product. However, if a payload such as AB101' or 1=1 -- were submitted, the query would evaluate the WHERE condition as true and return the data of all of the products in the products table.
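The behaviour described above can be reproduced with a minimal Python sketch using the standard library sqlite3 module. The in-memory database, the sample rows and the function name find_product_vulnerable are assumptions made purely for illustration; the article does not state which language or database the application uses.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ProductID TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("AB101", "Widget"), ("AB102", "Gadget"), ("AB103", "Gizmo")],
)


def find_product_vulnerable(user_input):
    # Untrusted input is concatenated directly into the query text,
    # so it can change the syntax of the statement itself.
    query = "SELECT * FROM products WHERE ProductID = '" + user_input + "';"
    return conn.execute(query).fetchall()


expected = find_product_vulnerable("AB101")
injected = find_product_vulnerable("AB101' or 1=1 --")

print(len(expected))  # 1 row: only product AB101
print(len(injected))  # 3 rows: every product in the table
```

With the payload, the statement the database receives ends in `'AB101' or 1=1 --';`, so the trailing quote and semicolon are commented out and the WHERE clause is true for every row.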

Analysing a root cause statement

The reason I am using SQLi as an example is that I have found its root cause is often confused, especially by people who are new to the cybersecurity space. This is not entirely their fault: when I performed a quick Google search for the “root cause of SQL Injection”, the following was my top result (an extract from https://hwang.cisdept.cpp.edu/swanew/text/SQL-Injection.htm):

The three root causes of SQL injection vulnerabilities are the combining of data and code in dynamic SQL statement, error revealation, and the insufficient input validation.

Your search results may vary, but more examples that have a similar issue (that appear high in search results) are provided in the footnotes at the end of this article.

The root cause statement above is incorrect, as only the first portion of the statement is the true root cause of SQLi. Let’s investigate why this is the case.

Analysing potential root cause 1: Error revealation

“Error revealation” is when an application encounters database errors and displays these to the user, typically in response to malicious or malformed user input. This behaviour is certainly extremely useful to malicious actors in identifying and exploiting a SQLi vulnerability. However, even if a developer prevents all system errors from being displayed to a user, a SQLi attack is still possible through an inference attack. An inference attack (or blind SQL injection) is where a malicious actor forces the server to respond differently between the success and failure of a query to infer the contents of the database. Most commonly, these attacks fall into the following categories:

  • Error-based SQLi: Causing a failure condition to be triggered in the application based on the success or failure of a query. The presence (or absence) of the failure condition is then used to infer if the queried data exists in the database or not.
  • Time-based SQLi: Causing a time delay on the success of a query to infer if the queried data exists in the database or not.

This is just a high-level summary of inference attacks and further explanation can be found here or here.
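As a rough illustration of why hiding errors does not remove the vulnerability, the sketch below uses Python's sqlite3 module. The lookup returns only a yes/no answer and swallows every database error, yet the difference between "found" and "not found" is enough to recover a value one character at a time. The users table, the stored value and all function names are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ProductID TEXT, Name TEXT)")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO products VALUES ('AB101', 'Widget')")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def product_exists(user_input):
    """Vulnerable lookup that never reveals database errors to the caller."""
    query = "SELECT 1 FROM products WHERE ProductID = '" + user_input + "'"
    try:
        return conn.execute(query).fetchone() is not None
    except sqlite3.Error:
        return False  # generic failure, no error detail is shown


def infer_admin_password(max_length=16):
    """Recover the stored value using only the yes/no responses."""
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    recovered = ""
    for position in range(1, max_length + 1):
        for candidate in alphabet:
            payload = (
                "AB101' AND (SELECT substr(password, {}, 1) FROM users "
                "WHERE username = 'admin') = '{}' --"
            ).format(position, candidate)
            if product_exists(payload):
                recovered += candidate
                break
        else:
            break  # no candidate matched, so the end of the value was reached
    return recovered


print(infer_admin_password())  # s3cret
```

No error message is ever returned here, which suggests that suppressing errors slows an attacker down but leaves the concatenated query just as exploitable.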

Analysing potential root cause 2: Insufficient input validation

Input validation is a strong mitigation against all injection attacks; however, it is not a remedial action in and of itself. Even if the most stringent input validation is implemented in an application, the underlying query itself is still vulnerable to SQLi. Input validation may ensure that the attack is not realistically exploitable via user-supplied data sent to the application, but achieving this level of security may require a significant investment of resources to ensure that all user input and data consumed by a SQL query has been validated and does not contain a malicious payload. If even one instance of input being consumed by a vulnerable query is missed, an attacker could exploit SQLi in the application and potentially gain the same access to, or control over, the database contents that the application itself has. Furthermore, this mitigation is not applicable to any application that needs to accept potentially dangerous characters, such as apostrophes or quotation marks, in the data used by a SQL query.
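The limits of validation can be sketched as follows, again in Python with sqlite3. The product ID format (two capital letters followed by three digits), the table contents and the function names are assumptions for illustration only. One code path validates its input and blocks the payload; a second path, where validation was missed, remains exploitable because both queries are still built by concatenation.

```python
import re
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ProductID TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("AB101", "Widget"), ("AB102", "Gadget"), ("AB103", "Gizmo")],
)

# Assumed allow-list format for product IDs such as AB101.
PRODUCT_ID_PATTERN = re.compile(r"[A-Z]{2}[0-9]{3}")


def is_valid_product_id(value):
    return PRODUCT_ID_PATTERN.fullmatch(value) is not None


def find_product_validated(user_input):
    # Mitigation: reject anything outside the allow-list...
    if not is_valid_product_id(user_input):
        raise ValueError("invalid product id")
    # ...but the query underneath is still built by concatenation.
    query = "SELECT * FROM products WHERE ProductID = '" + user_input + "'"
    return conn.execute(query).fetchall()


def find_product_by_name(user_input):
    # A second code path where validation was missed.
    query = "SELECT * FROM products WHERE Name = '" + user_input + "'"
    return conn.execute(query).fetchall()


print(is_valid_product_id("AB101' or 1=1 --"))         # False: payload rejected
print(is_valid_product_id("O'H01"))                    # False: apostrophe rejected too
print(len(find_product_by_name("x' or 1=1 --")))       # 3: the missed path leaks every row
```

The second result also shows the trade-off mentioned above: an allow-list strict enough to block the payload rejects any legitimate value that contains an apostrophe.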

Analysing potential root cause 3: Concatenation of untrusted user data into dynamically constructed SQL queries

This then leaves the concatenation of untrusted input into dynamically constructed SQL queries as the root cause of SQLi.

Why is this the case? And why does addressing this remediate the vulnerability when addressing the other candidates does not? The issue with the previous cases is that even if you completely and effectively fix the candidate “root cause”, the vulnerability can still be exploited. In contrast, the fix for the concatenation of untrusted input into dynamically constructed SQL queries is to use parameterised queries. A parameterised query defines the query text with placeholders for variable user input. The query and the input values are sent to the database separately, and the database pre-compiles the query before the input is bound to it. This prevents the input from manipulating the query syntax and executing unintended actions. Therefore, it is impossible to exploit SQLi against a properly constructed parameterised query, no matter what input is received by the server or what errors the server shows the user, unless there is a flaw in the underlying database logic (i.e. a vulnerability in the database server software itself).

Addressing the one thing that went wrong in the development process, by using parameterised queries for every SQL query, secures your application against SQLi attacks.
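A parameterised version of the product lookup might look like the sketch below, which uses the `?` placeholder style of Python's sqlite3 module. The table contents, the apostrophe-containing product ID and the function name find_product are hypothetical; other database drivers use different placeholder syntax, so treat this as one possible shape of the fix and not as the only one.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ProductID TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("AB101", "Widget"), ("AB102", "Gadget"), ("O'H01", "Gasket")],
)


def find_product(user_input):
    # The query text is fixed; the value is bound separately through the
    # placeholder, so it is only ever treated as data.
    query = "SELECT * FROM products WHERE ProductID = ?"
    return conn.execute(query, (user_input,)).fetchall()


print(find_product("AB101"))             # [('AB101', 'Widget')]
print(find_product("AB101' or 1=1 --"))  # []
print(find_product("O'H01"))             # [("O'H01", 'Gasket')]
```

The payload is compared against ProductID as a literal string and matches nothing, while a legitimate value containing an apostrophe works without any escaping or validation.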

Identifying Root Causes

How can we better identify the root causes of a detected vulnerability? The most important exercise is critical discussion and questioning of each candidate root cause. Several questions may be useful in this analysis process:

  • What is the problem posed by the vulnerability and why is this a problem?
  • Can we pinpoint exactly where the vulnerability is in a code base or environment?
  • How would we address the vulnerability?
  • If we implemented that fix, would the vulnerability still be exploitable in any way?

It is also important to ignore aspects that are tangential to the root cause. There may be several of these:

  • Difficulty of exploitation. For example, input validation would increase the difficulty of exploitation but not remediate the issue.
  • Symptomatic issues. A vulnerability may cause something like Error Revealation but that is a symptom of the problem and not the root cause itself.
  • Bureaucracy/business processes around the exploitation of a vulnerability. Even though an organisation may have well-defined processes for executing an action, it is important to remember that a malicious actor in a position to exploit the vulnerability would not undergo those checks, or may not care about them. (If an attacker is in a position to exploit a vulnerability, it is prudent to assume that they would, and to build defences that prevent or detect such actions.)

Conclusion

I hope this article has shown why, during security testing, it is important to correctly identify the root causes of vulnerabilities, and how a mistake in the identification of the root cause can result in inappropriate or incomplete remedial advice being offered. This could easily result in the vulnerability not being fixed and the application remaining vulnerable after testing is completed and the recommended fixes are subsequently implemented.

SQLi is a well-known example and its true root cause can be quickly identified; however, even with this commonly known vulnerability, mistakes are possible if the necessary critical thinking is not applied. A complex system only makes the identification of the root cause more difficult. As such, MWR aims to consistently work with our clients to gain a fuller understanding of the applications and environments that we test, so that we can accurately identify the root causes of vulnerabilities and use this to communicate the most appropriate remedial actions, i.e. those that have the highest likelihood of truly resolving vulnerabilities when implemented.

Footnote 1:

The root cause of an SQL injection attack is letting the user input to directly be executed. This goes against the tenets of secure input handling which requires all user input to be –

2. Validated against a finite, known range of expected characters.

3. Sanitized and escaped – the apostrophe character is notorious for being at the heart of SQL injections, if there is a reason to accept this character as input (for instance if a person’s name is O’Hara), then it needs to be converted to O&#39Hara where &#39 is the HTML code for the ‘apostrophe’.

4. Limited to a reasonable maximum size.

Footnote 2:

This footnote is an extract from https://www.ijert.org/a-review-on-sql-injection:

The root cause of SQL injection vulnerabilities is insufficient input validation.

Encoding of inputs: Injection into a string parameter is often accomplished through the use of meta-characters that trick the SQL parser into interpreting user input as SQL tokens. While it is possible to prohibit any usage of these meta-characters, doing so would restrict a non-malicious users ability to specify legal inputs that contain such characters. A better solution is to use functions that encode a string in such a way that all meta-characters are specially encoded and interpreted by the database as normal characters.

Positive pattern matching: Developers should establish input validation routines that identify good input as opposed to bad input. This approach is generally called positive validation, as opposed to negative validation, which searches input for forbidden patterns or SQL tokens. Because developers might not be able to envision every type of attack that could be launched against their application, but should be able to specify all the forms of legal input, positive validation is a safer way to check inputs.

Identification of all input sources: Developers must check all input to their application.