Romain/test-ok #3717
base: develop
Conversation
chat = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)
# proruleid: prompt-injection-fastapi
chat.invoke([HumanMessage(content=user_chat)])
Semgrep identified an issue in your code:
A prompt is created and user-controlled data reaches that prompt. This can lead to prompt injection. Make sure the user inputs are properly segmented from the system's instructions in your prompts.
Dataflow graph
flowchart LR
classDef invis fill:white, stroke: none
classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none
subgraph File0["<b>python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py</b>"]
direction LR
%% Source
subgraph Source
direction LR
v0["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
end
%% Intermediate
subgraph Traces0[Traces]
direction TB
v2["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
v3["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L15 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 15] user_chat</a>"]
end
v2 --> v3
%% Sink
subgraph Sink
direction LR
v1["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L45 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 45] user_chat</a>"]
end
end
%% Class Assignment
Source:::invis
Sink:::invis
Traces0:::invis
File0:::invis
%% Connections
Source --> Traces0
Traces0 --> Sink
To resolve this comment:
✨ Commit Assistant Fix Suggestion
- Avoid passing untrusted user input directly into LLM prompts. Instead, validate and sanitize the `user_name` parameter before using it in your prompt.
- Use input validation to restrict `user_name` to a safe character set, such as alphanumerics and basic punctuation, using a function like `def sanitize_username(name): return re.sub(r'[^a-zA-Z0-9_\- ]', '', name)` (after `import re`). Then use `sanitized_user_name = sanitize_username(user_name)`.
- Replace usages of `user_chat = f"ints are safe {user_name}"` with `user_chat = f"ints are safe {sanitized_user_name}"`.
- For all calls to LLM APIs (OpenAI, HuggingFace, ChatOpenAI, etc.), ensure only sanitized or trusted data is used when building prompt or message content.
Alternatively, if you want to reject invalid usernames altogether, raise an error if the input doesn't match your allowed pattern.
Input sanitization reduces the risk of prompt injection by removing unexpected control characters or instructions that a malicious user could provide.
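To make the suggested flow concrete, here is a minimal sketch of the sanitization applied end to end in a FastAPI handler. The endpoint shape and the `sanitize_username` helper are illustrative assumptions based on the test file, not the exact code under review, and the import paths assume the current `langchain-openai`/`langchain-core` package split:

```python
import re

from fastapi import FastAPI
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

app = FastAPI()

def sanitize_username(name: str) -> str:
    # Keep only a conservative allow-list of characters.
    return re.sub(r"[^a-zA-Z0-9_\- ]", "", name)

@app.get("/chat")
def chat_endpoint(user_name: str):
    sanitized_user_name = sanitize_username(user_name)
    # Build the prompt from the sanitized value only.
    user_chat = f"ints are safe {sanitized_user_name}"
    chat = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)
    return chat.invoke([HumanMessage(content=user_chat)]).content
```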
💬 Ignore this finding
Reply with Semgrep commands to ignore this finding.
- `/fp <comment>` for false positive
- `/ar <comment>` for acceptable risk
- `/other <comment>` for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by prompt-injection-fastapi.
You can view more details about this finding in the Semgrep AppSec Platform.
messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": user_chat},
],
Semgrep identified an issue in your code:
A prompt is created and user-controlled data reaches that prompt. This can lead to prompt injection. Make sure the user inputs are properly segmented from the system's instructions in your prompts.
Dataflow graph
flowchart LR
classDef invis fill:white, stroke: none
classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none
subgraph File0["<b>python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py</b>"]
direction LR
%% Source
subgraph Source
direction LR
v0["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
end
%% Intermediate
subgraph Traces0[Traces]
direction TB
v2["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
v3["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L15 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 15] user_chat</a>"]
end
v2 --> v3
%% Sink
subgraph Sink
direction LR
v1["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L20 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 20] [<br> {"role": "system", "content": "You are a helpful assistant."},<br> {"role": "user", "content": user_chat},<br> ]</a>"]
end
end
%% Class Assignment
Source:::invis
Sink:::invis
Traces0:::invis
File0:::invis
%% Connections
Source --> Traces0
Traces0 --> Sink
To resolve this comment:
✨ Commit Assistant Fix Suggestion
- Never insert user-controlled values directly into prompts. For the OpenAI and HuggingFace calls, replace `user_chat = f"ints are safe {user_name}"` with code that validates or sanitizes `user_name`.
- If you expect `user_name` to be a plain name, restrict it to allowed characters using a regex or manual check. Example: `import re` and use `if not re.match(r"^[a-zA-Z0-9_ -]{1,32}$", user_name): raise ValueError("Invalid user name")`.
- Alternatively, if the input could contain dangerous characters, escape or neutralize control characters before using it in prompts. Example: `user_name = user_name.replace("{", "").replace("}", "")`.
- After validation/sanitization, use the safe value when building the prompt: `user_chat = f"ints are safe {user_name}"`.
- Use the sanitized `user_chat` for all calls instead of the raw one. For example, in your OpenAI and HuggingFace requests, replace the user message content parameter with the sanitized version.
- Avoid allowing users to inject prompt instructions (like `"\nSystem: ..."` or similar) by keeping formatting simple and validated.
Only allow trusted or validated input to reach the LLM prompt, since prompt injection can result in loss of control over the model's outputs or leakage of system information.
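A minimal sketch of the reject-on-invalid variant, raising an HTTP 400 instead of silently stripping characters (the helper name and status handling are assumptions, not part of the rule):

```python
import re

from fastapi import HTTPException

USERNAME_PATTERN = re.compile(r"^[a-zA-Z0-9_ -]{1,32}$")

def validate_username(user_name: str) -> str:
    # Reject anything outside the expected username format outright.
    if not USERNAME_PATTERN.match(user_name):
        raise HTTPException(status_code=400, detail="Invalid user name")
    return user_name
```

Calling `validate_username(user_name)` before building `user_chat` ensures only conforming input ever reaches the `messages` list.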
💬 Ignore this finding
Reply with Semgrep commands to ignore this finding.
- `/fp <comment>` for false positive
- `/ar <comment>` for acceptable risk
- `/other <comment>` for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by prompt-injection-fastapi.
You can view more details about this finding in the Semgrep AppSec Platform.
huggingface = InferenceClient()
# proruleid: prompt-injection-fastapi
res = huggingface.text_generation(user_chat, stream=True, details=True)
Semgrep identified an issue in your code:
A prompt is created and user-controlled data reaches that prompt. This can lead to prompt injection. Make sure the user inputs are properly segmented from the system's instructions in your prompts.
Dataflow graph
flowchart LR
classDef invis fill:white, stroke: none
classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none
subgraph File0["<b>python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py</b>"]
direction LR
%% Source
subgraph Source
direction LR
v0["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
end
%% Intermediate
subgraph Traces0[Traces]
direction TB
v2["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
v3["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L15 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 15] user_chat</a>"]
end
v2 --> v3
%% Sink
subgraph Sink
direction LR
v1["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L37 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 37] user_chat</a>"]
end
end
%% Class Assignment
Source:::invis
Sink:::invis
Traces0:::invis
File0:::invis
%% Connections
Source --> Traces0
Traces0 --> Sink
To resolve this comment:
✨ Commit Assistant Fix Suggestion
- Validate or sanitize the `user_name` input before using it to build prompts. For example, allow only a limited set of safe characters (such as alphanumerics and a few accepted symbols) using a regular expression: `import re` and then `if not re.fullmatch(r"[a-zA-Z0-9_\- ]{1,64}", user_name): raise ValueError("Invalid user name")`.
- Alternatively, if you cannot strictly limit allowed characters, escape or segment user input clearly in prompts so it's obvious to the language model which parts are from the user, such as: `{"role": "user", "content": f"USER_INPUT_START {user_name} USER_INPUT_END"}`.
- Update all instances where `user_chat = f"ints are safe {user_name}"` to use the validated and/or clearly segmented version of `user_name` in the prompt.
- Use the sanitized/escaped input when calling language model APIs, for example `res = huggingface.text_generation(safe_user_chat, ...)`, and only insert trusted or sanitized data in the prompt contents.
Prompt injection is possible when user-controlled input is included in the prompt for an LLM without validation, escaping, or clear segmentation, allowing users to "break out" of the intended structure. Input validation reduces the risk of unexpected prompt alteration.
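A sketch of the delimiter-based segmentation idea applied to the `InferenceClient` call; the `USER_INPUT_START`/`USER_INPUT_END` tokens and the `build_segmented_prompt` helper are illustrative choices, not a fixed API:

```python
from huggingface_hub import InferenceClient

def build_segmented_prompt(user_name: str) -> str:
    # Delimiters make the boundary between trusted template text
    # and user-supplied data explicit to the model.
    return f"ints are safe USER_INPUT_START {user_name} USER_INPUT_END"

user_name = "alice"  # e.g. taken from a FastAPI query parameter
huggingface = InferenceClient()
safe_user_chat = build_segmented_prompt(user_name)
res = huggingface.text_generation(safe_user_chat, stream=True, details=True)
```

Note that delimiters are a mitigation, not a guarantee; combining them with the character validation above is stronger.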
💬 Ignore this finding
Reply with Semgrep commands to ignore this finding.
- `/fp <comment>` for false positive
- `/ar <comment>` for acceptable risk
- `/other <comment>` for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by prompt-injection-fastapi.
You can view more details about this finding in the Semgrep AppSec Platform.
huggingface = InferenceClient()
# proruleid: prompt-injection-fastapi
res = huggingface.text_generation(user_chat, stream=True, details=True)
Semgrep identified an issue in your code:
A prompt is created and user-controlled data reaches that prompt. This can lead to prompt injection. Make sure the user inputs are properly segmented from the system's instructions in your prompts.
Dataflow graph
flowchart LR
classDef invis fill:white, stroke: none
classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none
subgraph File0["<b>python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py</b>"]
direction LR
%% Source
subgraph Source
direction LR
v0["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
end
%% Intermediate
subgraph Traces0[Traces]
direction TB
v2["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L13 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 13] user_name</a>"]
v3["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L15 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 15] user_chat</a>"]
end
v2 --> v3
%% Sink
subgraph Sink
direction LR
v1["<a href=https://github.com/semgrep/semgrep-rules/blob/c2f7953f3ffe21cf3532da7aa3dc6052d919b2cf/python/fastapi/ai/prompt-injection-fastapi/prompt-injection-fastapi.py#L41 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 41] user_chat</a>"]
end
end
%% Class Assignment
Source:::invis
Sink:::invis
Traces0:::invis
File0:::invis
%% Connections
Source --> Traces0
Traces0 --> Sink
To resolve this comment:
✨ Commit Assistant Fix Suggestion
- Avoid passing direct user input like `user_name` to LLMs or text generation APIs, as this allows prompt injection attacks.
- If you must use `user_name`, strictly validate and escape it before use:
  - Allow only safe characters: `import re` then `user_name = re.sub(r'[^a-zA-Z0-9_ -]', '', user_name)`
  - Alternatively, if you expect a specific format (such as usernames), use a stricter regex: `^[a-zA-Z0-9_-]+$`
- If the LLM prompt must reference the username, clearly segment user data in the prompt. For example: `user_chat = f"ints are safe. User name (not command): {user_name}"`
- When calling APIs like `huggingface.text_generation` or passing messages to LLMs, use the sanitized and segmented value instead of the raw input. For example, replace `huggingface.text_generation(user_chat, ...)` with your sanitized and segmented prompt.
- Prefer only including trusted or controlled data where possible, and consider dropping user-controlled input from system prompts if not strictly required.
Using strong input validation and separating user input contextually in prompts helps prevent attackers from injecting harmful instructions into LLM queries.
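A sketch combining both steps from this suggestion, strict format validation followed by contextual labeling, before the text-generation call (the `build_prompt` helper is a hypothetical name):

```python
import re

from huggingface_hub import InferenceClient

STRICT_USERNAME = re.compile(r"^[a-zA-Z0-9_-]+$")

def build_prompt(user_name: str) -> str:
    # Enforce the strict username format first; reject everything else.
    if not STRICT_USERNAME.match(user_name):
        raise ValueError("Invalid user name")
    # Label the value so the model treats it as data, not as a command.
    return f"ints are safe. User name (not command): {user_name}"

huggingface = InferenceClient()
res = huggingface.text_generation(build_prompt("alice"), stream=True, details=True)
```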
💬 Ignore this finding
Reply with Semgrep commands to ignore this finding.
- `/fp <comment>` for false positive
- `/ar <comment>` for acceptable risk
- `/other <comment>` for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by prompt-injection-fastapi.
You can view more details about this finding in the Semgrep AppSec Platform.