Previously on this blog, we’ve talked about how MSRC automates the root cause analysis of vulnerabilities reported and found. After doing this, our next step is variant analysis: finding and investigating any variants of the vulnerability. It’s important that we find all such variants and patch them simultaneously, otherwise we bear the risk of these being exploited in the wild. In this post, I’d like to explain the automation we use in variant finding.
For the past year or so, we’ve been augmenting our manual code review processes with Semmle, a third-party static analysis environment. It compiles code to a relational database (the snapshot database – a combination of database and source code), which is queried using Semmle QL, a declarative, object-oriented query language designed for program analysis.
The basic workflow is that, after root cause analysis, we write queries to find code patterns that are semantically similar to the original vulnerability. Any results are triaged as usual and provided to our engineering teams for a fix to be developed. Also, the queries are placed in a central repository to be re-run periodically by MSRC and other security teams. This way, we can scale our variant finding over time and across multiple codebases.
In addition to variant analysis, we’ve been using QL proactively, in our security reviews of source code. This will be the topic of a future blog post. For now, let’s look at some real-world examples inspired by MSRC cases.
Incorrect integer overflow checks
This first case is a bug that’s straightforward to define, but would be tedious to find variants of in a large codebase.
The code below shows a common idiom for detecting overflow on the addition of unsigned integers:
Unfortunately, this does not work properly when the width of the integer type is small enough to be subject to integral promotion. For example, if x and y were both unsigned short, when compiled, the above code would end up being equivalent to (unsigned int)x + y < x, making this overflow check ineffective.
Here’s a QL query that matches this code pattern:
In the from clause, we define the variables, and their types, to be used in the rest of the query. AddExpr, Variable, and RelationalOperation are QL classes representing various sets of entities in the snapshot database, e.g. RelationalOperation covers every expression with a relational operation (less than, greater than, etc.)
The where clause is the meat of the query, using logical connectives and quantifiers to define how to relate the variables. Breaking it down, this means that the addition expression and the relational operation need the same variable as one of their operands (x in the example code above):
The other operand of the relational operation must be the addition:
Both operands of the addition must have a width less than 32 bits (4 bytes):
But if there is an explicit cast on the addition expression, we’re not interested if it’s less than 32 bits:
(This is so a check like (unsigned short)(x + y) < x doesn’t get flagged by the query.)
Finally, the select clause defines the result set.
Unsafe use of SafeInt
An alternative to rolling your own integer overflow checks is to use a library with such checks built in. SafeInt is a C++ template class that overrides arithmetic operators to throw an exception where overflow is detected.
Here’s an example of how to use it correctly:
But here is an example of how it can be unintentionally misused – the expression passed to the constructor may already have overflowed:
How to write a query to detect this? In the previous example, our query only used built-in QL classes. For this one, let’s start by defining our own. For this, we choose one or more QL classes to subclass from (with extends), and define a characteristic predicate which specifies those entities in the snapshot database that are matched by the class:
The QL class Type represents the set of all types in the snapshot database. For the QL class SafeInt, we subset this to just types with a name that begins with “SafeInt<”, thus indicating that they are instantiations of the SafeInt template class. The getUnspecifiedType() predicate is used to disregard typedefs and type specifiers such as const.
Next, we define the subset of expressions that could potentially overflow. Most arithmetic operations can overflow, but not all; this uses QL’s instanceof operator to define which ones. We use a recursive definition because we need expressions such as (x+1)/y to be included, even though x/y is not.
Finally, we relate these two classes in a query:
.(Call) and .(Constructor) are examples of an inline cast, which, similar to instanceof, is another way of restricting which QL classes match. In this case we are saying that, given an expression that may overflow, we’re only interested if its parent expression is some sort of call. Furthermore, we only want to know if the target of that call is a constructor, and if it’s a constructor for some SafeInt.
Like the previous example, this was a query that provided a number of actionable results across multiple codebases.
For the array buffer pointer, we match on the calls to GetTypedArrayBuffer, where the second argument (getArgument of Call is zero-indexed) is an address-of expression (&), and take its operand:
The exists logical quantifier is used here to introduce a new variable (c) into the scope.
The “+” suffix of the calls predicate specifies a transitive closure – it applies the predicate to itself until there is a match. This permits a concisely defined exploration of the function call graph.
Finally, this query brings these QL class definitions together, relating them by control flow:
The predicate getASuccessor() specifies the next statement or expression in the control flow. Therefore, using e.g. call.getASuccessor+() = use will follow the control flow graph from call until there is a match to use. The diagram below illustrates this:
This query uncovered four variants in addition to the originally reported vulnerability, all of them assessed as critical severity.
That’s all for now. The next instalment will cover using QL for data flow analysis and taint tracking, with examples from our security review of an Azure firmware component.
Steven Hunter, MSRC Vulnerabilities & Mitigations team