docs/language/learn-ql/python/functions.rst: 7 additions & 5 deletions
@@ -58,19 +58,21 @@ We can modify the query further to include only methods whose body consists of a
 Finding a call to a specific function
 -------------------------------------
 
-This query uses ``Call`` and ``Name`` to find calls to the function ``input`` - which might potentially be a security hazard (in Python 2).
+This query uses ``Call`` and ``Name`` to find calls to the function ``eval`` - which might potentially be a security hazard.
 
 .. code-block:: ql
 
    import python
 
    from Call call, Name name
-   where call.getFunc() = name and name.getId() = "input"
-   select call, "call to 'input'."
+   where call.getFunc() = name and name.getId() = "eval"
+   select call, "call to 'eval'."
 
-➤ `See this in the query console <https://lgtm.com/query/686330029/>`__. Some of the demo projects on LGTM.com use this function.
+➤ `See this in the query console <https://lgtm.com/query/6718356557331218618/>`__. Some of the demo projects on LGTM.com use this function.
 
-The ``Call`` class represents calls in Python. The ``Call.getFunc()`` predicate gets the expression being called. ``Name.getId()`` gets the identifier (as a string) of the ``Name`` expression. Due to the dynamic nature of Python, this query will select any call of the form ``input(...)`` regardless of whether it is a call to the built-in function ``input`` or not. In a later tutorial we will see how to use the type-inference library to find calls to the built-in function ``input`` regardless of name of the variable called.
+The ``Call`` class represents calls in Python. The ``Call.getFunc()`` predicate gets the expression being called. ``Name.getId()`` gets the identifier (as a string) of the ``Name`` expression.
+Due to the dynamic nature of Python, this query will select any call of the form ``eval(...)`` regardless of whether it is a call to the built-in function ``eval`` or not.
+In a later tutorial we will see how to use the type-inference library to find calls to the built-in function ``eval`` regardless of the name of the variable called.
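
To make the purely syntactic nature of this query concrete, here is a rough Python analogue built on the standard ``ast`` module. It is only a sketch and not part of the QL library: the helper ``find_named_calls`` and the sample ``SOURCE`` are hypothetical names made up for illustration. Like the QL query, it matches any call whose callee is a bare name with the given identifier, and knows nothing about what that name is bound to.

```python
import ast

# Hypothetical sample input: one bare-name call and one method call.
SOURCE = '''\
result = eval("1 + 1")
other = parser.eval("1 + 1")
'''


def find_named_calls(source, name):
    """Return the line numbers of calls of the form name(...)."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)           # analogue of the Call class
        and isinstance(node.func, ast.Name)     # analogue of call.getFunc() = name
        and node.func.id == name                # analogue of name.getId() = "eval"
    )


print(find_named_calls(SOURCE, "eval"))
```

Only the first line is reported: ``parser.eval(...)`` is an attribute access rather than a bare name, so it does not match.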
docs/language/learn-ql/python/pointsto-type-infer.rst: 61 additions & 56 deletions
@@ -3,21 +3,21 @@ Tutorial: Points-to analysis and type inference
 
 This topic contains worked examples of how to write queries using the standard QL library classes for Python type inference.
 
-The ``Object`` class
+The ``Value`` class
 --------------------
 
-The ``Object`` class and its subclasses ``FunctionObject``, ``ClassObject`` and ``ModuleObject`` represent the values an expression may hold at runtime.
+The ``Value`` class and its subclasses ``FunctionValue``, ``ClassValue`` and ``ModuleValue`` represent the values an expression may hold at runtime.
predicate pointsTo(Context context, Value object, ControlFlowNode origin)
 
-``object`` is an object that the control flow node refers to, ``origin`` is where the object comes from, which is useful for displaying meaningful results, and ``cls`` is the inferred class of the ``object``.
+``object`` is an object that the control flow node refers to, ``origin`` is where the object comes from, which is useful for displaying meaningful results.
+The third form includes the ``context`` in which the control flow node refers to ``object``. This form can usually be ignored.
 
 .. pull-quote::
 
    Note
 
-   ``ControlFlowNode.refersTo()`` cannot find all objects that a control flow node might point to as it impossible to be accurate and find all possible values. We prefer precision (no incorrect values) over recall (finding as many values as possible). We do this because queries based on points-to analysis have fewer false positives and are thus more useful.
+   ``ControlFlowNode.pointsTo()`` cannot find all objects that a control flow node might point to as it is impossible to be accurate and find all possible values. We prefer precision (no incorrect values) over recall (finding as many values as possible). We do this because queries based on points-to analysis have fewer false positives and are thus more useful.
 
-For complex data flow analyses, involving multiple stages, the ``ControlFlowNode`` version is more precise, but for simple use cases the ``Expr`` based version is easier to use. For convenience, the ``Expr`` class also has the same three predicates. ``Expr.refersTo(...)`` also has three variants:
+For complex data flow analyses, involving multiple stages, the ``ControlFlowNode`` version is more precise, but for simple use cases the ``Expr`` based version is easier to use. For convenience, the ``Expr`` class also has the same three predicates. ``Expr.pointsTo(...)`` also has three variants:
predicate pointsTo(Context context, Value object, AstNode origin)
 
 Using points-to analysis
 ------------------------
@@ -84,19 +85,20 @@ The results of this query need to be filtered to return only results where ``ex1
 
 .. code-block:: ql
 
-   exists(ClassObject cls1, ClassObject cls2 |
-       ex1.getType().refersTo(cls1) and
-       ex2.getType().refersTo(cls2) |
+   exists(ClassValue cls1, ClassValue cls2 |
+       ex1.getType().pointsTo(cls1) and
+       ex2.getType().pointsTo(cls2) |
+       not cls1 = cls2 and
        cls1 = cls2.getASuperType()
    )
 
 The line:
 
 ::
 
-   ex1.getType().refersTo(cls1)
+   ex1.getType().pointsTo(cls1)
 
-ensures that ``cls1`` is a ``ClassObject`` that the ``except`` block would handle.
+ensures that ``cls1`` is a ``ClassValue`` that the ``except`` block would handle.
 
 Combining the parts of the query we get this:
 
@@ -112,9 +114,10 @@ Combining the parts of the query we get this:
       ex1 = t.getHandler(i) and ex2 = t.getHandler(j) and i < j
    )
    and
-   exists(ClassObject cls1, ClassObject cls2 |
-       ex1.getType().refersTo(cls1) and
-       ex2.getType().refersTo(cls2) |
+   exists(ClassValue cls1, ClassValue cls2 |
+       ex1.getType().pointsTo(cls1) and
+       ex2.getType().pointsTo(cls2) |
+       not cls1 = cls2 and
        cls1 = cls2.getASuperType()
    )
    select t, ex1, ex2
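
As an illustration of the kind of Python code this query appears to target (an earlier handler whose class is a strict superclass of a later handler's class), the following sketch may help. The function ``classify`` is a hypothetical example, not taken from the tutorial or from any analyzed project.

```python
def classify(text):
    """Convert text to an int, reporting which handler ran on failure."""
    try:
        return int(text)
    except Exception:
        # Exception is a superclass of ValueError, so this handler wins.
        return "handled by Exception"
    except ValueError:
        # Never reached: the handler above already catches ValueError.
        return "handled by ValueError"


print(issubclass(ValueError, Exception))
print(classify("not a number"))
```

Python accepts this code without complaint, which is why a static query is useful here. In terms of the query, ``ex1`` would correspond to the ``except Exception`` handler and ``ex2`` to the ``except ValueError`` handler.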
@@ -136,48 +139,50 @@ First of all find what object is used in the ``for`` loop:
 
 .. code-block:: ql
 
-   from For loop, Object iter
-   where loop.getIter().refersTo(iter)
+   from For loop, Value iter
+   where loop.getIter().pointsTo(iter)
    select loop, iter
 
-Then we need to determine if a ``ClassObject`` is iterable. ``ClassObject`` provides the predicate ``isIterable()`` which we can combine with the longer form of ``ControlFlowNode.refersTo()`` to get the class of the loop iterator, giving us this:
+Then we need to determine if the object ``iter`` is iterable. We can test ``ClassValue`` to see if it has the ``__iter__`` attribute.
 
 **Find non-iterable object used as a loop iterator**
 
 .. code-block:: ql
 
    import python
 
-   from For loop, Object iter, ClassObject cls
-   where loop.getIter().refersTo(iter, cls, _)
-     and not cls.isIterable()
+   from For loop, Value iter, ClassValue cls
+   where loop.getIter().pointsTo(iter) and
+     cls = iter.getClass() and
+     not cls.hasAttribute("__iter__")
    select loop, cls
 
 ➤ `See this in the query console <https://lgtm.com/query/670720182/>`__. Many projects use a non-iterable as a loop iterator.
 
-Many of the results shown will have ``cls`` as ``NoneType``. It is more informative to show where these ``None`` values may come from. To do this we use the final field of ``refersTo``, as follows:
+Many of the results shown will have ``cls`` as ``NoneType``. It is more informative to show where these ``None`` values may come from. To do this we use the final field of ``pointsTo``, as follows:
 
 **Find non-iterable object used as a loop iterator 2**
 
 .. code-block:: ql
 
    import python
 
-   from For loop, Object iter, ClassObject cls, AstNode origin
-   where loop.getIter().refersTo(iter, cls, origin)
-     and not cls.isIterable()
+   from For loop, Value iter, ClassValue cls, AstNode origin
+   where loop.getIter().pointsTo(iter, origin) and
+     cls = iter.getClass() and
+     not cls.hasAttribute("__iter__")
    select loop, cls, origin
 
-➤ `See this in the query console <https://lgtm.com/query/672230046/>`__. This reports the same results, but with a third column showing the source of the ``None`` values.
+➤ `See this in the query console <https://lgtm.com/query/6718356557331218618/>`__. This reports the same results, but with a third column showing the source of the ``None`` values.
 
-Finding calls to functions using call-graph analysis
The ``FunctionObject`` class is a subclass of ``Object`` and corresponds to function objects in Python, in much the same way as the ``ClassObject`` class corresponds to class objects in Python.
+The ``Value`` class has a method ``getACall()`` which allows us to find calls to a particular function (including builtin functions).
 
-The ``FunctionObject`` class has a method ``getACall()`` which allows us to find calls to a particular function (including builtin functions).
+If we wish to restrict the callables to actual functions we can use the ``FunctionValue`` class, which is a subclass of ``Value`` and corresponds to function objects in Python, in much the same way as the ``ClassValue`` class corresponds to class objects in Python.
 
-Returning to an example from :doc:`Tutorial: Functions <functions>`, we wish to find calls to the ``input`` function.
+Returning to an example from :doc:`Tutorial: Functions <functions>`, we wish to find calls to the ``eval`` function.
 
 The original query looked like this:
 
@@ -186,38 +191,38 @@ The original query looked like this:
    import python
 
    from Call call, Name name
-   where call.getFunc() = name and name.getId() = "input"
-   select call, "call to 'input'."
+   where call.getFunc() = name and name.getId() = "eval"
+   select call, "call to 'eval'."
 
-➤ `See this in the query console <https://lgtm.com/query/690010037/>`__. Two of the demo projects on LGTM.com have calls that match this pattern.
+➤ `See this in the query console <https://lgtm.com/query/6718356557331218618/>`__. Two of the demo projects on LGTM.com have calls that match this pattern.
 
 There are two problems with this query:
 
-- It assumes that any call to something named "input" is a call to the builtin ``input`` function, which may result in some false positive results.
-- It assumes that ``input`` cannot be referred to by any other name, which may result in some false negative results.
+- It assumes that any call to something named "eval" is a call to the builtin ``eval`` function, which may result in some false positive results.
+- It assumes that ``eval`` cannot be referred to by any other name, which may result in some false negative results.
 
-We can get much more accurate results using call-graph analysis. First, we can precisely identify the ``FunctionObject`` for the ``input`` function, by using the ``builtin_object`` QL predicate as follows:
+We can get much more accurate results using call-graph analysis. First, we can precisely identify the ``FunctionValue`` for the ``eval`` function, by using the ``Value::named`` QL predicate as follows:
 
 .. code-block:: ql
 
    import python
 
-   from FunctionObject input
-   where input = builtin_object("input")
-   select input
+   from Value eval
+   where eval = Value::named("eval")
+   select eval
 
-Then we can use ``FunctionObject.getACall()`` to identify calls to the ``input`` function, as follows:
+Then we can use ``Value.getACall()`` to identify calls to the ``eval`` function, as follows:
 
 .. code-block:: ql
 
    import python
 
-   from ControlFlowNode call, FunctionObject input
-   where input = builtin_object("input") and
-     call = input.getACall()
-   select call, "call to 'input'."
+   from ControlFlowNode call, Value eval
+   where eval = Value::named("eval") and
+     call = eval.getACall()
+   select call, "call to 'eval'."
 
-➤ `See this in the query console <https://lgtm.com/query/670490037/>`__. This accurately identifies calls to the builtin ``input`` function even when they are referred to using an alternative name. Any false positive results with calls to other ``input`` functions, reported by the original query, have been eliminated. It finds one result in files referenced by the *saltstack/salt* project.
+➤ `See this in the query console <https://lgtm.com/query/535131812579637425/>`__. This accurately identifies calls to the builtin ``eval`` function even when they are referred to using an alternative name. Any false positive results with calls to other ``eval`` functions, reported by the original query, have been eliminated. It finds one result in files referenced by the *saltstack/salt* project.
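
The two problems with the name-based query can be reproduced in a few lines of ordinary Python. The snippet below is a hypothetical illustration (the names ``run``, ``shadowed`` and ``aliased`` are invented here): a user-defined ``eval`` would be a false positive for the name-based query, and the alias ``run`` would be a false negative.

```python
import builtins


def eval(text):
    """A user-defined function that merely happens to be called eval."""
    return text.upper()


# The real builtin, reachable under a different name.
run = builtins.eval

shadowed = eval("abc")    # looks like eval(...), but is not the builtin
aliased = run("1 + 1")    # is the builtin, but is not spelled eval(...)

print(shadowed, aliased)
print(eval is builtins.eval, run is builtins.eval)
```

A query that matches on the identifier alone would flag the first call and miss the second, which is the gap the points-to based query above is meant to close.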