For the examples in this presentation, we will be analyzing `ChakraCore <https://github.com/microsoft/ChakraCore>`__.
16
16
17
-
More resources:
17
+
We recommend you download `this historic snapshot <https://downloads.lgtm.com/snapshots/cpp/microsoft/chakracore/ChakraCore-revision-2017-April-12--18-13-26.zip>`__ to analyze in QL for Eclipse.
18
18
19
-
- To learn more about the main features of QL, try looking at the `QL language handbook <https://help.semmle.com/QL/ql-handbook/>`__.
20
-
- For further information about writing queries in QL, see `Writing QL queries <https://help.semmle.com/QL/learn-ql/ql/writing-queries/writing-queries.html>`__.
19
+
Alternatively, you can query the project in `the query console <https://lgtm.com/query/project:2034240708/lang:cpp/>`__ on LGTM.com.
21
20
22
21
.. note::
23
22
24
-
To run the queries featured in this training presentation, we recommend you download the free-to-use `QL for Eclipse plugin <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/getting-started.html>`__.
25
-
26
-
This plugin allows you to locally access the latest features of QL, including the standard QL libraries and queries. It also provides standard IDE features such as syntax highlighting, jump-to-definition, and tab completion.
27
-
28
-
A good project to start analyzing is `ChakraCore <https://github.com/microsoft/ChakraCore>`__–a suitable snapshot to query is available by visiting the link on the slide.
29
-
30
-
Alternatively, you can query any project (including ChakraCore) in the `query console on LGTM.com <https://lgtm.com/query/project:2034240708/lang:cpp/>`__.
31
-
32
-
Note that results generated in the query console are likely to differ to those generated in the QL plugin. LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is for an historical version of the code base.
33
-
23
+
Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
- To learn more about the main features of QL, try looking at the `QL language handbook <https://help.semmle.com/QL/ql-handbook/>`__.
19
-
- For further information about writing queries in QL, see `Writing QL queries <https://help.semmle.com/QL/learn-ql/ql/writing-queries/writing-queries.html>`__.
10
+
.. include:: ../slide-snippets/info.rst
20
11
21
-
.. note::
12
+
QL snapshot
13
+
===========
22
14
23
-
To run the queries featured in this training presentation, we recommend you download the free-to-use `QL for Eclipse plugin <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/getting-started.html>`__.
15
+
For the examples in this presentation, we will be analyzing `ChakraCore <https://github.com/microsoft/ChakraCore>`__.
24
16
25
-
This plugin allows you to locally access the latest features of QL, including the standard QL libraries and queries. It also provides standard IDE features such as syntax highlighting, jump-to-definition, and tab completion.
17
+
We recommend you download `this historic snapshot <https://downloads.lgtm.com/snapshots/cpp/microsoft/chakracore/ChakraCore-revision-2017-April-12--18-13-26.zip>`__ to analyze in QL for Eclipse.
26
18
27
-
A good project to start analyzing is `ChakraCore <https://github.com/microsoft/ChakraCore>`__–a suitable snapshot to query is available by visiting the link on the slide.
19
+
Alternatively, you can query the project in `the query console <https://lgtm.com/query/project:2034240708/lang:cpp/>`__ on LGTM.com.
28
20
29
-
Alternatively, you can query any project (including ChakraCore) in the `query console on LGTM.com <https://lgtm.com/query/project:2034240708/lang:cpp/>`__.
21
+
.. note::
30
22
31
-
Note that results generated in the query console are likely to differ to those generated in the QL plugin. LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
23
+
Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
32
24
33
25
Agenda
34
26
======
@@ -116,7 +108,7 @@ Find calls to free that are reachable from an allocation on the same variable:
116
108
117
109
.. note::
118
110
119
-
Predicates allocationCall and freeCall are defined in the standard library and model a number of standard alloc/free-like functions.
111
+
Predicates ``allocationCall`` and ``freeCall`` are defined in the standard library and model a number of standard alloc/free-like functions.
120
112
121
113
Exercise: use after free
122
114
========================
@@ -125,12 +117,12 @@ Based on this query, write a query that finds accesses to the variable that occu
125
117
126
118
.. rst-class:: build
127
119
128
-
- What do you find? What problems occur with this approach to detecting use-after-free vulnerabilities?
120
+
- What do you find? What problems occur with this approach to detecting use-after-free vulnerabilities?
For the examples in this presentation, we will be analyzing `dotnet/coreclr <https://github.com/dotnet/coreclr>`__.
19
18
20
-
- To learn more about the main features of QL, try looking at the `QL language handbook <https://help.semmle.com/QL/ql-handbook/>`__.
21
-
- For further information about writing queries in QL, see `Writing QL queries <https://help.semmle.com/QL/learn-ql/ql/writing-queries/writing-queries.html>`__.
19
+
We recommend you download `this historic snapshot <http://downloads.lgtm.com/snapshots/cpp/dotnet/coreclr/dotnet_coreclr_fbe0c77.zip>`__ to analyze in QL for Eclipse.
22
20
23
-
.. note::
24
-
25
-
To run the queries featured in this training presentation, we recommend you download the free-to-use `QL for Eclipse plugin <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/getting-started.html>`__.
26
-
27
-
This plugin allows you to locally access the latest features of QL, including the standard QL libraries and queries. It also provides standard IDE features such as syntax highlighting, jump-to-definition, and tab completion.
21
+
Alternatively, you can query the project in `the query console <https://lgtm.com/query/projects:1505958977333/lang:cpp/>`__ on LGTM.com.
28
22
29
-
A good project to start analyzing is `ChakraCore <https://github.com/dotnet/coreclr>`__–a suitable snapshot to query is available by visiting the link on the slide.
30
-
31
-
Alternatively, you can query any project (including ChakraCore) in the `query console on LGTM.com <https://lgtm.com/query/projects:1505958977333/lang:cpp/>`__.
23
+
.. note::
32
24
33
-
Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
25
+
Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
34
26
35
27
Agenda
36
28
======
@@ -68,7 +60,7 @@ Let’s write a query to identify instances of `CWE-134 <https://cwe.mitre.org/d
68
60
69
61
In this case, we have one more format specifier than we have arguments. In a managed language such as Java or C#, this simply leads to a runtime exception. However, in C/C++, the formatting functions are typically implemented by reading values from the stack without any validation of the number of arguments. This means a mismatch in the number of format specifiers and format arguments can lead to information disclosure.
70
62
71
-
Of course, in practice this happens rarely with *constant* formatting strings. Instead, it’s most problematic when the formatting string can be specified by the user, allowing an attacker to provide a formatting string with the wrong number of format specifiers. Furthermore, if an attacker can control the format string, they may be able to provide the %n format specifier, which causes ``printf`` to write the number of characters in the generated output string to a specified location.
63
+
Of course, in practice this happens rarely with *constant* formatting strings. Instead, it’s most problematic when the formatting string can be specified by the user, allowing an attacker to provide a formatting string with the wrong number of format specifiers. Furthermore, if an attacker can control the format string, they may be able to provide the ``%n`` format specifier, which causes ``printf`` to write the number of characters in the generated output string to a specified location.
72
64
73
65
See https://en.wikipedia.org/wiki/Uncontrolled_format_string for more background.
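As a rough starting point for the CWE-134 query discussed in this note, the sketch below flags every call to a ``printf``-like function whose format argument is not a string literal. It assumes the standard C/C++ library's ``FormattingFunction`` class is available through ``semmle.code.cpp.commons.Printf``; the alert message is illustrative only.

```ql
import cpp
import semmle.code.cpp.commons.Printf

// Sketch only: a purely syntactic check with no data flow.
// Any format argument that is not a literal is reported.
from Call c, FormattingFunction ff, Expr format
where
  c.getTarget() = ff and
  format = c.getArgument(ff.getFormatParameterIndex()) and
  not format instanceof StringLiteral
select format, "Non-constant format string."
```

A check like this is expected to be noisy, because it also reports format arguments that are variables only ever assigned constant strings.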
Here, ``DMLOut`` and ``ExtOut`` are macros that expand to formatting calls. The format specifier is not constant, in the sense that the format argument is not a string literal. However, it is clearly one of two possible constants, both with the same number of format specifiers.
118
110
119
-
What we need is a way to determine whether the format argument is ever set to something that is not constant.
111
+
What we need is a way to determine whether the format argument is ever set to something that is, not constant.
120
112
121
113
Data flow analysis
122
114
==================
123
115
124
116
- Models flow of data through the program.
125
117
- Implemented in the module ``semmle.code.cpp.dataflow.DataFlow``.
126
118
- Class ``DataFlow::Node`` represents program elements that have a value, such as expressions and function parameters.
119
+
127
120
- Nodes of the data flow graph.
121
+
128
122
- Various predicates represent flow between these nodes.
129
-
Edges of the data flow graph.
123
+
124
+
- Edges of the data flow graph.
130
125
131
126
.. note::
132
127
@@ -183,8 +178,7 @@ Local vs global data flow
183
178
- Local (“intra-procedural”) data flow models flow within one function; feasible to compute for all functions in a snapshot
184
179
- Global (“inter-procedural”) data flow models flow across function calls; not feasible to compute for all functions in a snapshot
185
180
- Different APIs, so discussed separately
186
-
187
-
This slide deck focuses on the former.
181
+
- This slide deck focuses on the former.
188
182
189
183
.. note::
190
184
@@ -212,14 +206,14 @@ To use the data flow library, add the following import:
212
206
.. code-block:: ql
213
207
214
208
module DataFlow {
215
-
class Node extends … { … }
209
+
class Node extends ... { ... }
216
210
predicate localFlow(Node source, Node sink) {
217
211
localFlowStep*(source, sink)
218
212
}
219
-
…
213
+
...
220
214
}
221
215
222
-
So all references will need to be qualified (that is ``DataFlow::Node``)
216
+
So all references will need to be qualified (that is, ``DataFlow::Node``)
223
217
224
218
.. note::
225
219
@@ -248,7 +242,7 @@ Data flow graph
248
242
249
243
The ``DataFlow::Node`` class is shared between both the local and global data flow graphs–the primary difference is the edges, which in the “global” case can link different functions.
250
244
251
-
``localFlowStep`` is the “single step” flow relation–that is it describes single edges in the local data flow graph. ``localFlow`` represents the `transitive <https://help.semmle.com/QL/ql-handbook/recursion.html#transitive-closures>`__ closure of this relation–in other words, it contains every pair of nodes where the second node is reachable from the first in the data flow graph.
245
+
``localFlowStep`` is the “single step” flow relation–that is, it describes single edges in the local data flow graph. ``localFlow`` represents the `transitive <https://help.semmle.com/QL/ql-handbook/recursion.html#transitive-closures>`__ closure of this relation–in other words, it contains every pair of nodes where the second node is reachable from the first in the data flow graph.
252
246
253
247
The data flow graph is separate from the `AST <https://en.wikipedia.org/wiki/Abstract_syntax_tree>`__, to allow for flexibility in how data flow is modeled. There are a small number of data flow node types–expression nodes, parameter nodes, uninitialized variable nodes, and definition by reference nodes. Each node provides mapping functions to and from the relevant AST (for example ``Expr``, ``Parameter`` etc.) or symbol table (for example ``Variable``) classes.
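To make the node-to-AST mapping and the ``localFlow`` relation described in this note concrete, here is a minimal sketch that lists string literals reaching the format argument of a ``printf``-like call through local flow. It assumes the ``FormattingFunction`` class from ``semmle.code.cpp.commons.Printf``; the variable names are hypothetical.

```ql
import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.commons.Printf

// Sketch only: asExpr() maps a data flow node back to its AST expression.
// localFlow is the reflexive, transitive closure of localFlowStep, so a
// literal passed directly as the format argument is also matched.
from
  Call c, FormattingFunction ff, StringLiteral lit,
  DataFlow::Node source, DataFlow::Node sink
where
  c.getTarget() = ff and
  source.asExpr() = lit and
  sink.asExpr() = c.getArgument(ff.getFormatParameterIndex()) and
  DataFlow::localFlow(source, sink)
select c, lit
```

Because only local flow is used, literals that reach the call through another function's parameters or return values are not reported.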
254
248
@@ -306,7 +300,7 @@ Define a subclass of ``DataFlow::Node`` representing “source” nodes, that is
306
300
Revisiting non-constant format strings
307
301
======================================
308
302
309
-
Refine the query to find calls to ``printf``-like functions where the format argument derives from a local source that is not a constant string.
303
+
Refine the query to find calls to ``printf``-like functions where the format argument derives from a local source that is, not a constant string.
310
304
311
305
.. rst-class:: build
312
306
@@ -320,6 +314,7 @@ Audit the results and apply any refinements you deem necessary.
320
314
Suggestions:
321
315
322
316
- Replace ``DataFlow::localFlowStep`` with a custom predicate that includes steps through global variable definitions.
317
+
323
318
**Hint**: Use class ``GlobalVariable`` and its member predicates ``getAnAssignedValue()`` and ``getAnAccess()``.
324
319
325
320
- Exclude calls in wrapper functions that just forward their format argument to another ``printf``-like function; instead, flag calls to those functions.
@@ -330,4 +325,4 @@ Beyond local data flow
330
325
- Results are still underwhelming.
331
326
- Dealing with parameter passing becomes cumbersome.
332
327
- Instead, let’s turn the problem around and find user-controlled data that flows into a ``printf`` format argument, potentially through calls.