include_thoughts=False does not suppress thought parts for image gen models on Vertex AI #2239

@DrChrisLevy

Description

When using gemini-3.1-flash-image-preview (NB2) on Vertex AI with ThinkingConfig(include_thoughts=False), thought parts (both text and images) are still returned in the response. The include_thoughts=False setting has no effect: all thought text and interim "thought images" (drafts) remain in the response.

The part.thought flag IS correctly set to True on thought parts, so client-side filtering is possible. But per the docs, include_thoughts=False should suppress thought parts from the response entirely.

Environment

  • SDK: google-genai==1.70.0
  • API: Vertex AI
  • Model: gemini-3.1-flash-image-preview
  • Platform: Python 3.10

Steps to Reproduce

from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="YOUR_PROJECT", location="us-central1")

response = client.models.generate_content(
    model="gemini-3.1-flash-image-preview",
    contents="Draw a red car",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],
        thinking_config=types.ThinkingConfig(include_thoughts=False),
    ),
)

for i, part in enumerate(response.candidates[0].content.parts):
    print(f"[{i}] thought={part.thought}, has_text={bool(part.text)}, has_image={bool(part.inline_data)}")

Expected Behavior

Only non-thought parts should be returned (the final image):

[0] thought=None, has_text=False, has_image=True

Actual Behavior

All thought parts (text and interim draft images) are still present:

[0] thought=True, has_text=True, has_image=False    # thinking text
[1] thought=True, has_text=True, has_image=False    # thinking text
[2] thought=True, has_text=True, has_image=False    # thinking text
[3] thought=True, has_text=True, has_image=False    # thinking text
[4] thought=True, has_text=False, has_image=True    # draft image (thought image)
[5] thought=True, has_text=True, has_image=False    # self-critique text
[6] thought=None, has_text=False, has_image=True    # final image

thoughts_token_count is 799. Thinking tokens are billed regardless (as documented), but the parts themselves should be hidden from the response when include_thoughts=False.

Additional Context

  • This only affects Vertex AI. On the Developer API, NB2 has no thinking at all (thoughts_token_count is None).
  • gemini-3-pro-image-preview rejects ThinkingConfig entirely with a 400: "Thinking_config.include_thoughts is only enabled when thinking is enabled" — despite reporting thoughts_token_count (~220-240) in every response. Pro's thinking is locked to HIGH and not configurable.
  • The part.thought boolean is correctly set, so client-side filtering works as a workaround.
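Until the setting is honored server-side, the filtering workaround is a one-line predicate on part.thought (which is True on thought parts and None on regular parts). A minimal sketch of that filter, using a stand-in dataclass rather than the real google.genai.types.Part so it runs without the SDK:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Part:
    # Stand-in for google.genai.types.Part for illustration only:
    # thought is True on thought parts and None (not False) otherwise.
    thought: Optional[bool] = None
    text: Optional[str] = None
    inline_data: Optional[bytes] = None

def drop_thought_parts(parts):
    """Keep only non-thought parts (thought is None or False)."""
    return [p for p in parts if not p.thought]

# Mirrors the shape of the actual-behavior output above:
# thinking text, a draft "thought image", then the final image.
parts = [
    Part(thought=True, text="thinking text"),
    Part(thought=True, inline_data=b"draft-image-bytes"),
    Part(thought=None, inline_data=b"final-image-bytes"),
]
final = drop_thought_parts(parts)
print(len(final))  # only the final image survives
```

With real responses, the same predicate applies directly to response.candidates[0].content.parts.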

Metadata

Labels

  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
