Python: Prompt injection in OpenAI clients #21141
base: main
Conversation
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Add testcase and coverage for agents sdk runner run with input param
- Rename agent sdk module for clarity
- Add case for unnamed param use in runner run from agent sdk
QHelp previews: python/ql/src/experimental/Security/CWE-1427/PromptInjection.qhelp

Prompt injection

Prompts can be constructed to bypass the original purposes of an agent and lead to sensitive data leak or operations that were not intended.

Recommendation

Sanitize user input and also avoid using user input in developer or system level prompts.

Example

In the following examples, the cases marked GOOD show secure prompt construction; whereas in the case marked BAD they may be susceptible to prompt injection.

from pathlib import Path

from flask import Flask, request
from agents import Agent
from guardrails import GuardrailAgent

app = Flask(__name__)

@app.route("/parameter-route")
def get_input():
    input = request.args.get("input")
    goodAgent = GuardrailAgent(  # GOOD: Agent created with guardrails automatically configured.
        config=Path("guardrails_config.json"),
        name="Assistant",
        instructions="This prompt is customized for " + input)
    badAgent = Agent(
        name="Assistant",
        instructions="This prompt is customized for " + input  # BAD: user input in agent instruction.
    )

References
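As a complement to the qhelp example, here is a minimal sketch of the "sanitize user input" recommendation. The route, regex, and length cap are illustrative assumptions, not code from the PR, and whether the query recognizes a given check as a sanitizer depends on its customizations module.

import re

from flask import Flask, request
from agents import Agent

app = Flask(__name__)

@app.route("/sanitized-route")
def get_sanitized_input():
    raw = request.args.get("input", "")
    # Illustrative allowlist: restrict the character set and length so the
    # value cannot smuggle new instructions into the agent prompt.
    if not re.fullmatch(r"[A-Za-z0-9 ,.'-]{1,100}", raw):
        return "Invalid input", 400
    agent = Agent(
        name="Assistant",
        instructions="This prompt is customized for " + raw,
    )
    return "Configured agent " + agent.name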
Pull request overview
This pull request introduces a new experimental CodeQL query to detect prompt injection vulnerabilities in Python code that uses AI/LLM APIs, specifically targeting the openai and agents libraries. The implementation adds security analysis capabilities for identifying where user-controlled input flows into AI prompts without proper sanitization.
Key changes:
- New experimental query `py/prompt-injection` with dataflow configuration to track tainted data from remote sources to AI prompt sinks (see the sketch below)
- Framework models for OpenAI and the agents SDK to identify prompt construction patterns
- New `AIPrompt` concept in the core library to model AI prompting operations
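For illustration, the source-to-sink flow this query targets looks roughly like the following. This is a sketch assuming the openai v1 Python client; the route, model name, and prompt text are invented.

from flask import Flask, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()

@app.route("/unsafe")
def unsafe():
    user_input = request.args.get("input", "")  # remote flow source
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            # Untrusted input reaches a system-level prompt, so a crafted
            # request can override the agent's original instructions.
            {"role": "system", "content": "Site policy: " + user_input},
        ],
    )
    return completion.choices[0].message.content or ""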
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| python/ql/src/experimental/Security/CWE-1427/PromptInjection.ql | Main query definition using taint tracking to detect prompt injection |
| python/ql/src/experimental/Security/CWE-1427/PromptInjection.qhelp | Documentation and examples for the query |
| python/ql/src/experimental/Security/CWE-1427/examples/example.py | Example code demonstrating good and bad practices |
| python/ql/lib/semmle/python/security/dataflow/PromptInjectionQuery.qll | Dataflow configuration for prompt injection detection |
| python/ql/lib/semmle/python/security/dataflow/PromptInjectionCustomizations.qll | Sources, sinks, and sanitizers for prompt injection |
| python/ql/lib/semmle/python/frameworks/OpenAI.qll | Models for OpenAI and agents SDK APIs |
| python/ql/lib/semmle/python/frameworks/openai.model.yml | MaD model for OpenAI sink and type definitions |
| python/ql/lib/semmle/python/frameworks/agent.model.yml | MaD model for agents SDK sink definitions |
| python/ql/lib/semmle/python/Frameworks.qll | Integration of OpenAI framework into main frameworks module |
| python/ql/lib/semmle/python/Concepts.qll | New AIPrompt concept for modeling AI prompting operations |
| python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py | Test cases for OpenAI prompt injection patterns |
| python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/agent_instructions.py | Test cases for agents SDK prompt injection patterns |
| python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/PromptInjection.qlref | Test query reference configuration |
| python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/PromptInjection.expected | Expected test results |
| python/ql/lib/change-notes/2026-01-02-prompt-injection.md | Release notes for the new feature |
Comments suppressed due to low confidence (11)
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/agent_instructions.py:35
- There is inconsistent spacing in the inline test annotation. The annotation should be `# $Alert[py/prompt-injection]` with a space after `#`, consistent with the annotations on other lines in the file.
"content": input, # $Alert[py/prompt-injection]
python/ql/lib/semmle/python/frameworks/OpenAI.qll:4
- The comment text "openAI" should use consistent capitalization. The official product name is "OpenAI" (capital O and capital AI).
* As well as the regular openai python interface.
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/agent_instructions.py:25
- There is inconsistent spacing in the inline test annotation. The annotation should be `# $Alert[py/prompt-injection]` with a space after `#`, consistent with the annotations on other lines in the file.
"content": input, # $Alert[py/prompt-injection]
python/ql/src/experimental/Security/CWE-1427/PromptInjection.qhelp:16
- The phrase "the case marked BAD they may be" is missing a conjunction. It should be "the case marked BAD, they may be" (with comma) or "the cases marked BAD are" for better grammatical flow.
<p>In the following examples, the cases marked GOOD show secure prompt construction; whereas in the case marked BAD they may be susceptible to prompt injection.</p>
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/agent_instructions.py:30
- Variable result2 is not used.
result2 = Runner.run_sync(
python/ql/src/experimental/Security/CWE-1427/examples/example.py:14
- Variable badAgent is not used.
badAgent = Agent(
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py:21
- Variable response2 is not used.
response2 = client.responses.create(
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py:40
- Variable response3 is not used.
response3 = await async_client.responses.create(
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py:59
- Variable completion1 is not used.
completion1 = client.chat.completions.create(
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py:76
- Variable completion2 is not used.
completion2 = azure_client.chat.completions.create(
python/ql/test/experimental/query-tests/Security/CWE-1427-PromptInjection/openai_test.py:89
- Variable assistant is not used.
assistant = client.beta.assistants.create(
…ptInjection/openai_test.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
yoff left a comment
Sorry for the long response time.
This looks generally quite good. It fits our current structure, including the tests.
There are a few things I would change:
- If it is an experimental query, then its associated models and concepts should also be in the experimental directory. There should be appropriate places for everything, but shout if it looks like there aren't, and I can help.
- #21134 is now merged, so you can probably move a lot of the modeling into models-as-data. (If you do, you can probably ignore the code comments.)
- It might be nice to clean up the commit history 😅
/** Gets a reference to the `openai.OpenAI` class. */
API::Node classRef() {
  result =
    API::moduleImport("openai").getMember(["OpenAI", "AsyncOpenAI", "AzureOpenAI"]).getReturn()
}
It looks like you already have these in MaD, can you not just reuse those?
`classRef` is used in `getContentNode` so that we can use some logic to mark only the innermost element in the object structure as a sink.
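To illustrate that last point with an invented Python call (not code from the PR): given a nested messages argument, the intent is that only the innermost string value is flagged, not the enclosing list or dict.

from openai import OpenAI

client = OpenAI()
user_input = "..."  # stand-in for a value from a remote source

client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[                      # not a sink: the outer list
        {                           # not a sink: the message dict
            "role": "user",
            "content": user_input,  # sink: the innermost content value
        }
    ],
)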
This pull request introduces a new CodeQL query for detecting prompt injection vulnerabilities in Python code targeting AI prompting APIs such as `agents` and `openai`. The changes include a new experimental query, new taint flow and type models, a customizable dataflow configuration, documentation, and comprehensive test coverage.