JS: Do not taint whole array when storing into ArrayElement#18790
JS: Do not taint whole array when storing into ArrayElement#18790asgerf merged 16 commits intogithub:mainfrom
Conversation
No changes in actual alerts
This flow was lost since the existing model of concat() boxes its return value in ArrayElement. There is no explicit model of Buffer.concat.
Not all of lodash, just the callbacks we already modeled plus a few easy ones
getAPropertyWrite() contains getALocalSource() under the the hood. Don't rely on that to find the successor of a mutation.
erik-krogh
left a comment
There was a problem hiding this comment.
Looks good, just two minor questions.
| } | ||
|
|
||
| class ToString extends SummarizedCallable { | ||
| ToString() { this = "Object#toString / Array#toString" } |
There was a problem hiding this comment.
Is this a model of Object#toString? It only handles array elements.
There was a problem hiding this comment.
I was a bit torn about what to name the summary. It applies to all method calls to .toString() so I thought it would look confusing if someone was to look at the call graph and see all toString calls resolving to Array#toString. It makes it clear that this summary is responsible for toString calls in general.
The "correct" model for Object#toString is an empty set of flow summaries so there's just nothing to do for that case.
|
|
||
| override predicate propagatesFlow(string input, string output, boolean preservesValue) { | ||
| input = "Argument[0].ArrayElement" and | ||
| output = ["Argument[1].Parameter[1]", "ReturnValue"] and |
There was a problem hiding this comment.
| output = ["Argument[1].Parameter[1]", "ReturnValue"] and | |
| output = ["Argument[1].Parameter[0]", "ReturnValue"] and |
I found this usage example: _.sortBy(users, [function(o) { return o.user; }]);
Seems the relevant value is the first parameter.
Also, you can specify an array of functions, but I'm not sure we need to model that.
There was a problem hiding this comment.
Good catch, fixed the parameter index. I was just trying to reach parity with the old model, but without jump steps.
We need a more scalable way to make tests for libraries like lodash. I actually started writing tests for these but realised I was basically writing the same code twice in two different forms and would be likely to repeat the same mistakes in both places. I tried a rather simple use of copilot auto completion to generate tests, but it did not work well.
When a flow summary uses
ArrayElementin its output column, we'd previously generate a taint step in addition to a store step into the array. This PR removes that rule, so that reading fromArrayElementgenerates a taint step, but storing intoArrayElementdoes not.The rule was there in order to be consistent with how things worked with the old data flow library, where arrays were generally considered tainted if they contained a tainted element, but we don't really do that anymore. This PR makes a couple of semi-related changes in order to recover from observed FPs/FNs resulting from the first commit.
The DCA run looks good:
ExceptionXsswhich is due to that query having an AP limit of 1, which is no longer enough to find the flow. The result was not exploitable; it seemed like the sort of FP some users might have appreciated and fixed anyway, but on the whole it's not critical to preserve it.