Commit a9a0b9a

author
james
committed
docs: fix some errors picked up by vale linter
1 parent c97f582 commit a9a0b9a

5 files changed

+38
-34
lines changed

docs/language/ql-training-rst/cpp/bad-overflow-guard.rst

Lines changed: 6 additions & 6 deletions
@@ -29,7 +29,7 @@ More resources:

 Alternatively, you can query any project (including ChakraCore) in the `query console on LGTM.com <https://lgtm.com/query/project:2034240708/lang:cpp/>`__.

-Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
+Note that results generated in the query console are likely to differ to those generated in the QL plugin. LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.


 Checking for overflow in C
@@ -53,7 +53,7 @@ Where might this go wrong?
 - In C/C++ we often need to check for whether an operation `overflows <https://en.wikipedia.org/wiki/Integer_overflow>`__.
 - An overflow is when an arithmetic operation, such as an addition, results in a number which is too large to be stored in the type.
 - When an operation overflows, the value “wraps” around.
-- A typical way to check for overflow of an addition, therefore, is whether the result is less than one of the arguments - i.e. the result has wrapped.
+- A typical way to check for overflow of an addition, therefore, is whether the result is less than one of the arguments - that is the result has **wrapped**.

 Integer promotion
 =================
@@ -174,7 +174,7 @@ We can get the size (in bytes) of a type using the ``getSize()`` method.

 - An important part of the query is to determine whether a given expression has a “small” type that is going to trigger integer promotion.
 - We therefore write a helper predicate for small expressions.
-- This predicate effectively represents the set of all expressions in the database where the size of the type of the expression is less than 4 bytes, i.e. less than 32 bits.
+- This predicate effectively represents the set of all expressions in the database where the size of the type of the expression is less than 4 bytes, that is less than 32 bits.

 QL query: bad overflow guards
 =============================
@@ -191,7 +191,7 @@ Now our query becomes:
 .. note::

 - Recall from earlier that what makes an overflow check a “bad” check is that all the arguments to the addition are integers smaller than 32 bits.
-- We could write this by using our helper predicate ``isSmall`` to specify that each individual operand to the addition ``isSmall`` (i.e. under 32 bits):
+- We could write this by using our helper predicate ``isSmall`` to specify that each individual operand to the addition ``isSmall`` (that is under 32 bits):

 .. code-block:: ql

@@ -206,12 +206,12 @@ Now our query becomes:
 - In our case:
 - The declaration introduces a variable for Expressions, called ``op``. At this stage, this variable represents all the expressions in the program.
 - The “range” part, ``op = a.getAnOperand()``, restricts ``op`` to being one of the two operands to the addition.
-- The “condition” part, ``isSmall(op)``, says that the ``forall`` holds only if the condition - that the ``op`` is small - holds for everything in the range - i.e. both the arguments to the addition
+- The “condition” part, ``isSmall(op)``, says that the ``forall`` holds only if the condition - that the ``op`` is small - holds for everything in the range - that is both the arguments to the addition

 QL query: bad overflow guards
 =============================

-In some cases the result of the addition is cast to a small type of size less than 4 bytes, preventing automatic widening. We don’t want our query to flag these instances.
+Sometimes the result of the addition is cast to a small type of size less than 4 bytes, preventing automatic widening. We don’t want our query to flag these instances.

 We can use predicate ``Expr.getExplicitlyConverted()`` to reason about casts that are applied to an expression, adding this restriction to our query:

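
The kind of overflow guard that the bad-overflow-guard.rst text above describes can be sketched in C as follows. This is a minimal illustration, not code from the training material: the function names are hypothetical, and it assumes a platform where ``int`` is wider than 16 bits, so that ``uint16_t`` operands are promoted to ``int`` before the addition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example of a "bad" guard: both operands are smaller than
 * 32 bits, so they are promoted to int before the addition. The sum is
 * computed in int, never wraps, and the comparison is always false. */
bool bad_overflow_guard(uint16_t v, uint16_t b) {
    return v + b < v;
}

/* Hypothetical example of a working guard: the explicit cast converts the
 * sum back to the small type, so the wrapped value is compared and the
 * check can succeed. This is the cast case the query should not flag. */
bool good_overflow_guard(uint16_t v, uint16_t b) {
    return (uint16_t)(v + b) < v;
}
```

Under the stated assumption, calling both functions with ``65535`` and ``1`` shows the difference: only the version with the explicit cast reports the overflow.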

docs/language/ql-training-rst/cpp/control-flow-cpp.rst

Lines changed: 6 additions & 3 deletions
@@ -28,7 +28,7 @@ More resources:

 Alternatively, you can query any project (including ChakraCore) in the `query console on LGTM.com <https://lgtm.com/query/project:2034240708/lang:cpp/>`__.

-Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.
+Note that results generated in the query console are likely to differ to those generated in the QL plugin. LGTM.com analyzes the most recent revisions of each project that has been added–the snapshot available to download above is based on an historical version of the code base.

 Agenda
 ======
@@ -79,7 +79,10 @@ Control flow graphs

 .. note::

-The control flow graph is a static over-approximation of possible control flow at runtime. Its nodes are program elements such as expressions and statements. If there is an edge from one node to another, then it means that the semantic operation corresponding to the first node may be immediately followed by the operation corresponding to the second node. Some nodes (such as conditions of “if” statements or loop conditions) have more than one successor, representing conditional control flow at runtime.
+The control flow graph is a static over-approximation of possible control flow at runtime.
+Its nodes are program elements such as expressions and statements.
+If there is an edge from one node to another, then it means that the semantic operation corresponding to the first node may be immediately followed by the operation corresponding to the second node.
+Some nodes (such as conditions of “if” statements or loop conditions) have more than one successor, representing conditional control flow at runtime.

 Modeling control flow
 =====================
@@ -101,7 +104,7 @@ The control-flow graph is *intra-procedural* - in other words, only models paths

 The control flow graph is similar in concept to data flow graphs. In contrast to data flow, however, the AST nodes are directly control flow graph nodes.

-The predecessor/successor predicates are prime examples of member predicates with results that are used in functional syntax, but that are not actually functions, since a control flow node may have any number of predecessors and successors (including zero or more than one).
+The predecessor/successor predicates are prime examples of member predicates with results that are used in functional syntax, but that are not actually functions. This is because a control flow node may have any number of predecessors and successors (including zero or more than one).

 Example: malloc/free pairs
 ==========================

docs/language/ql-training-rst/cpp/data-flow-cpp.rst

Lines changed: 10 additions & 10 deletions
@@ -44,7 +44,7 @@ Agenda
 Motivation
 ==========

-Let’s write a query to identify instances of `CWE-134 <https://cwe.mitre.org/data/definitions/134.html>`__ Use of externally controlled format string.
+Let’s write a query to identify instances of `CWE-134 <https://cwe.mitre.org/data/definitions/134.html>`__ **Use of externally controlled format string**.

 .. code-block:: cpp

@@ -60,7 +60,7 @@ Let’s write a query to identify instances of `CWE-134 <https://cwe.mitre.org/d

 printf("Name: %s, Age: %d", "Freddie", 2);

-would produce the output Name: Freddie, Age: 2”. So far, so good. However, problems arise if there is a mismatch between the number of formatting specifiers, and the number of arguments. For example:
+would produce the output ``"Name: Freddie, Age: 2”``. So far, so good. However, problems arise if there is a mismatch between the number of formatting specifiers, and the number of arguments. For example:

 .. code-block:: cpp

@@ -123,14 +123,14 @@ Data flow analysis

 - Models flow of data through the program.
 - Implemented in the module ``semmle.code.cpp.dataflow.DataFlow``.
-- Class ``DataFlow::Node`` represents program elements that have a value, such as expressions and fucntion parameters.
+- Class ``DataFlow::Node`` represents program elements that have a value, such as expressions and function parameters.
 - Nodes of the data flow graph.
 - Various predicated represent flow between these nodes.
 Edges of the data flow graph.

 .. note::

-The solution here is to use *data flow*. Data flow is, as the name suggests, about tracking the flow of data through the program. It helps answers questions likedoes this expression ever hold a value that originates from a particular other place in the program”.
+The solution here is to use *data flow*. Data flow is, as the name suggests, about tracking the flow of data through the program. It helps answers questions like: *does this expression ever hold a value that originates from a particular other place in the program*?

 We can visualize the data flow problem as one of finding paths through a directed graph, where the nodes of the graph are elements in program, and the edges represent the flow of data between those elements. If a path exists, then the data flows between those two edges.

@@ -225,7 +225,7 @@ So all references will need to be qualified (that is ``DataFlow::Node``)

 A **query library** is file with the extension ``.qll``. Query libraries do not contain a query clause, but may contain modules, classes, and predicates. For example, the `C/C++ data flow library <https://help.semmle.com/qldoc/cpp/semmle/code/cpp/dataflow/DataFlow.qll/module.DataFlow.html>`__ is contained in the ``semmle/code/cpp/dataflow/DataFlow.qll`` QLL file, and can be imported as shown above.

-A **module** is a way of organizing QL code by grouping together related predicates, classes and (sub-)modules; either explicitly declared or implicit. A query library implicitly declares a module with the same name as the QLL file.
+A **module** is a way of organizing QL code by grouping together related predicates, classes, and (sub-)modules. They can be either explicitly declared or implicit. A query library implicitly declares a module with the same name as the QLL file.

 For further information on libraries and modules in QL, see the chapter on `Modules <https://help.semmle.com/QL/ql-handbook/modules.html>`__ in the QL language handbook.

@@ -250,7 +250,7 @@ Data flow graph

 ``localFlowStep`` is the “single step” flow relation–that is it describes single edges in the local data flow graph. ``localFlow`` represents the `transitive <https://help.semmle.com/QL/ql-handbook/recursion.html#transitive-closures>`__ closure of this relation–in other words, it contains every pair of nodes where the second node is reachable from the first in the data flow graph.

-The data flow graph is completely separate from the `AST <https://en.wikipedia.org/wiki/Abstract_syntax_tree>`__, to allow for flexibility in how data flow is modeled. There are a small number of data flow node types–expression nodes, parameter nodes, uninitialized variable nodes, and definition by reference nodes. Each node provides mapping functions to and from the relevant AST (for example ``Expr``, ``Parameter`` etc.) or symbol table (e.g. ``Variable``) classes.
+The data flow graph is separate from the `AST <https://en.wikipedia.org/wiki/Abstract_syntax_tree>`__, to allow for flexibility in how data flow is modeled. There are a small number of data flow node types–expression nodes, parameter nodes, uninitialized variable nodes, and definition by reference nodes. Each node provides mapping functions to and from the relevant AST (for example ``Expr``, ``Parameter`` etc.) or symbol table (for example ``Variable``) classes.

 Taint-tracking
 ==============
@@ -270,9 +270,9 @@ Taint-tracking

 Taint tracking can be thought of as another type of data flow graph. It usually extends the standard data flow graph for a problem by adding edges between nodes where one one node influences or *taints* another.

-The `API <https://help.semmle.com/qldoc/cpp/semmle/code/cpp/dataflow/TaintTracking.qll/module.TaintTracking.html>`__ is almost identical to that of the local data flow; all we need to do to switch to taint tracking is ``import semmle.code.cpp.dataflow.TaintTracking`` instead of ``semmle.code.cpp.dataflow.DataFlow``, and instead of using ``localFlow``, we use ``localTaint``.
+The `API <https://help.semmle.com/qldoc/cpp/semmle/code/cpp/dataflow/TaintTracking.qll/module.TaintTracking.html>`__ is almost identical to that of the local data flow. All we need to do to switch to taint tracking is ``import semmle.code.cpp.dataflow.TaintTracking`` instead of ``semmle.code.cpp.dataflow.DataFlow``, and instead of using ``localFlow``, we use ``localTaint``.

-Exercise: Source Nodes
+Exercise: source nodes
 ======================

 Define a subclass of ``DataFlow::Node`` representing “source” nodes, that is, nodes without a (local) data flow predecessor.
@@ -329,5 +329,5 @@ Beyond local data flow

 - Results are still underwhelming.
 - Dealing with parameter passing becomes cumbersome.
-- Instead, let’s turn the problem around and find user-controlled data that flows into a printf format argument, potentially through calls.
-- This needs global data flow.
+- Instead, let’s turn the problem around and find user-controlled data that flows into a ``printf`` format argument, potentially through calls.
+- This needs **global data flow**.
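
The format-string problem (CWE-134) that motivates the data-flow-cpp.rst material above can be sketched in C. This is an illustrative sketch only: ``format_name`` is a hypothetical helper, not part of the training material. It shows the safe pattern, where user-controlled text is passed as an argument to a constant format string, with the vulnerable call left in a comment.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes user-controlled text into out.
 *
 * The vulnerable form would pass the input as the format itself:
 *     snprintf(out, size, user_input);
 * Any specifiers in the input (such as %s or %d) would then be
 * interpreted, with no matching arguments to read from.
 *
 * The safe form below uses a constant format string, so the input is
 * copied literally. */
int format_name(char *out, size_t size, const char *user_input) {
    return snprintf(out, size, "%s", user_input);
}
```

With the constant format string, an input such as ``"%d %s"`` is copied into the buffer unchanged rather than being interpreted as formatting specifiers.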

docs/language/ql-training-rst/cpp/global-data-flow-cpp.rst

Lines changed: 11 additions & 10 deletions
@@ -59,20 +59,20 @@ Global data flow and taint tracking
 - Global (“inter-procedural”) data flow models flow across function calls; not feasible to compute for all functions in a snapshot

 - For global data flow (and taint tracking), we must therefore provided restrictions to ensure the problem is tractable.
-- Typically, this involves specifying the source and sink.
+- Typically, this involves specifying the *source* and *sink*.

 .. note::

-As we mentioned in the previous slide deck, while local dataflow is feasible to compute for all functions in a snapshot, global dataflow is not. This is because the number of paths becomes exponentially larger for global dataflow.
+As we mentioned in the previous slide deck, while local data flow is feasible to compute for all functions in a snapshot, global data flow is not. This is because the number of paths becomes exponentially larger for global data flow.

-The global dataflow (and taint tracking) avoids this problem by requiring that the query author specifies which ``sources`` and ``sinks`` are applicable. This allows the implementation to compute paths between the restricted set of nodes, rather than the full graph.
+The global data flow (and taint tracking) avoids this problem by requiring that the query author specifies which ``sources`` and ``sinks`` are applicable. This allows the implementation to compute paths between the restricted set of nodes, rather than the full graph.

 Global taint tracking library
 =============================

-The semmle.code.cpp.dataflow.TaintTracking library provides a framework for implementing solvers for global taint tracking problems:
+The ``semmle.code.cpp.dataflow.TaintTracking`` library provides a framework for implementing solvers for global taint tracking problems:

-#. Subclass TaintTracking::Configuration following this template:
+#. Subclass ``TaintTracking::Configuration`` following this template:

 .. code-block:: ql

@@ -82,7 +82,7 @@ The semmle.code.cpp.dataflow.TaintTracking library provides a framework for impl
 override predicate isSink(DataFlow::Node nd) { … }
 }

-#. Use Config.hasFlow(source, sink) to find inter-procedural paths.
+#. Use ``Config.hasFlow(source, sink)`` to find inter-procedural paths.

 .. note::

@@ -96,7 +96,7 @@ Finding tainted format strings (outline)

 .. note::

-Here’s the outline for a inter-procedural (i.e. “global”) version of the tainted formatting strings query we saw in the previous slide deck. The same template will be applicable for most taint tracking problems.
+Here’s the outline for a inter-procedural (that is “global”) version of the tainted formatting strings query we saw in the previous slide deck. The same template will be applicable for most taint tracking problems.

 Defining sources
 ================
@@ -118,7 +118,7 @@ The library class ``SecurityOptions`` provides a (configurable) model of what co

 .. note::

-We first define what it means to be a ``source`` of tainted data for this particular problem. In this case, what we care about is whether the format string can be provided by an external user to our application or service. As there are many such ways external data could be introduced into the system, the standard QL libraries for C/C++ include an extensible API for modelling user input. In this case, we will simply use the pre-defined set of user inputs, which includes arguments provided to command line applications.
+We first define what it means to be a *source* of tainted data for this particular problem. In this case, what we care about is whether the format string can be provided by an external user to our application or service. As there are many such ways external data could be introduced into the system, the standard QL libraries for C/C++ include an extensible API for modelling user input. In this case, we will simply use the predefined set of *user inputs*, which includes arguments provided to command line applications.


 Defining sinks (exercise)
167167
Path queries
168168
============
169169

170-
Provide information about the identified paths from sources to sinks; can be examined in Path Explorer view.
170+
Path queries provide information about the identified paths from sources to sinks. Paths can be examined in Path Explorer view.
171+
171172
Use this template:
172173

173174
.. code-block:: ql
@@ -186,7 +187,7 @@ Use this template:

 .. note::

-In order to see the paths between the source and the sinks, we can convert the query to a path problem query. There are a few minor changes that need to be made for this to work - we need an additional import, to specify ``PathNode`` rather than ``Node``, and to add the source/sink to the query output (so that we can automatically determine the paths).
+To see the paths between the source and the sinks, we can convert the query to a path problem query. There are a few minor changes that need to be made for this to work - we need an additional import, to specify ``PathNode`` rather than ``Node``, and to add the source/sink to the query output (so that we can automatically determine the paths).

 Defining additional taint steps
 ===============================
