Python: Model remote flow sources for the websockets library#20945
Python: Model remote flow sources for the websockets library#20945tausbn merged 4 commits intogithub:mainfrom
websockets library#20945Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds modeling for remote flow sources in the websockets PyPI package. The implementation provides taint tracking for WebSocket server connections through handler parameters and their methods like recv() and recv_streaming().
- Models websocket handlers registered via
serveandunix_servefunctions in bothasyncioandsyncvariants - Implements taint tracking for
ServerConnectioninstances and their message-receiving methods - Adds comprehensive test coverage for both synchronous and asynchronous websocket handlers
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| python/ql/lib/semmle/python/frameworks/Websockets.qll | New framework modeling file that defines websocket handlers, ServerConnection instances, and taint propagation rules |
| python/ql/lib/semmle/python/Frameworks.qll | Imports the new Websockets framework module |
| python/ql/lib/change-notes/2025-12-01-websockets.md | Documents the addition of remote flow source modeling for websockets |
| python/ql/test/library-tests/frameworks/websockets/taint_test_sync.py | Test file for synchronous websocket taint tracking scenarios |
| python/ql/test/library-tests/frameworks/websockets/taint_test_asyncio.py | Test file for asynchronous websocket taint tracking scenarios |
| python/ql/test/library-tests/frameworks/websockets/response_test.py | Test file covering various handler registration patterns including route handlers |
| python/ql/test/library-tests/frameworks/websockets/InlineTaintTest.ql | QL test query for inline taint test framework |
| python/ql/test/library-tests/frameworks/websockets/InlineTaintTest.expected | Expected results file (auto-generated) |
| python/ql/test/library-tests/frameworks/websockets/ConceptsTest.ql | QL test query for concepts test framework |
| python/ql/test/library-tests/frameworks/websockets/ConceptsTest.expected | Expected results file for concepts test |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private class HandlerParam extends DataFlow::Node, InstanceSource { | ||
| HandlerParam() { exists(WebSocketHandler h | this = DataFlow::parameterNode(h.getArg(0))) } |
There was a problem hiding this comment.
The HandlerParam class should extend RemoteFlowSource::Range to properly model websocket handler parameters as remote flow sources, similar to how other frameworks like Pyramid and Tornado model their handler parameters. This class should also implement the getSourceType() method to return an appropriate description like "websockets handler parameter".
| private class HandlerParam extends DataFlow::Node, InstanceSource { | |
| HandlerParam() { exists(WebSocketHandler h | this = DataFlow::parameterNode(h.getArg(0))) } | |
| private class HandlerParam extends RemoteFlowSource::Range, InstanceSource { | |
| HandlerParam() { exists(WebSocketHandler h | this = DataFlow::parameterNode(h.getArg(0))) } | |
| override string getSourceType() { result = "websockets handler parameter" } |
There was a problem hiding this comment.
This parameter is already classified as a remote flow source due to WebSocketHandler being RequestHandler.
| * calls, or a special parameter that will be set when functions are called by an external | ||
| * library. | ||
| * | ||
| * Use the predicate `WebSocket::instance()` to get references to instances of `websockets.asyncio.ServerConnection` and `websockets.sync.ServerConnection`. |
There was a problem hiding this comment.
The documentation references WebSocket::instance() but the correct predicate name is ServerConnection::instance(). This should be updated to match the actual module and predicate structure.
| * Use the predicate `WebSocket::instance()` to get references to instances of `websockets.asyncio.ServerConnection` and `websockets.sync.ServerConnection`. | |
| * Use the predicate `ServerConnection::instance()` to get references to instances of `websockets.asyncio.ServerConnection` and `websockets.sync.ServerConnection`. |
yoff
left a comment
There was a problem hiding this comment.
LGTM - this follows all the patterns we usually use.
(I would like to get rid of poorMansFunctionTracker at some point, but not now.)
We should probably do a DCA run just to be safe.
Models remote flow sources for the
websocketslibrary.Currently omitted are handlers set up via route maps, e.g. with
websockets.asyncio.router.route; these would require a little more complex handling of dataflow through classes ofwerkzeug.routing.