include_thoughts=False does not suppress thought parts for image gen models on Vertex AI #2239
Description
When using gemini-3.1-flash-image-preview (NB2) on Vertex AI with ThinkingConfig(include_thoughts=False), thought parts (both text and images) are still returned in the response. The include_thoughts=False setting has no effect — all thought text and interim "thought images" (drafts) remain in the response.
The part.thought flag IS correctly set to True on thought parts, so client-side filtering is possible. But per the docs, include_thoughts=False should suppress thought parts from the response entirely.
Environment
- SDK: google-genai==1.70.0
- API: Vertex AI
- Model: gemini-3.1-flash-image-preview
- Platform: Python 3.10
Steps to Reproduce
from google import genai
from google.genai import types
client = genai.Client(vertexai=True, project="YOUR_PROJECT", location="us-central1")
response = client.models.generate_content(
    model="gemini-3.1-flash-image-preview",
    contents="Draw a red car",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],
        thinking_config=types.ThinkingConfig(include_thoughts=False),
    ),
)

for i, part in enumerate(response.candidates[0].content.parts):
    print(f"[{i}] thought={part.thought}, has_text={bool(part.text)}, has_image={bool(part.inline_data)}")
Expected Behavior
Only non-thought parts should be returned (the final image):
[0] thought=None, has_text=False, has_image=True
Actual Behavior
All thought parts (text and interim draft images) are still present:
[0] thought=True, has_text=True, has_image=False # thinking text
[1] thought=True, has_text=True, has_image=False # thinking text
[2] thought=True, has_text=True, has_image=False # thinking text
[3] thought=True, has_text=True, has_image=False # thinking text
[4] thought=True, has_text=False, has_image=True # draft image (thought image)
[5] thought=True, has_text=True, has_image=False # self-critique text
[6] thought=None, has_text=False, has_image=True # final image
thoughts_token_count: 799 — thinking tokens are billed regardless (documented), but the parts themselves should be hidden from the response when include_thoughts=False.
Additional Context
- This only affects Vertex AI. On the Developer API, NB2 has no thinking at all (thoughts_token_count is None).
- gemini-3-pro-image-preview rejects ThinkingConfig entirely with a 400: "Thinking_config.include_thoughts is only enabled when thinking is enabled" — despite reporting thoughts_token_count (~220-240) in every response. Pro's thinking is locked to HIGH and not configurable.
- The part.thought boolean is correctly set, so client-side filtering works as a workaround.
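The client-side workaround can be sketched as follows. This is a minimal sketch, not SDK-provided behavior: the filter only relies on the part.thought flag described above, and the SimpleNamespace objects stand in for the SDK's Part objects so the example is self-contained.

```python
from types import SimpleNamespace

def filter_thought_parts(parts):
    """Keep only non-thought parts (thought is None/False on final outputs)."""
    return [p for p in parts if not getattr(p, "thought", None)]

# Stand-ins mimicking the response shape reported above
# (real code would pass response.candidates[0].content.parts).
parts = [
    SimpleNamespace(thought=True, text="thinking text", inline_data=None),
    SimpleNamespace(thought=True, text=None, inline_data=b"draft-image"),
    SimpleNamespace(thought=None, text=None, inline_data=b"final-image"),
]

final_parts = filter_thought_parts(parts)
# Only the final (non-thought) image part remains.
```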