Describe the bug
When invoking a prompt function via kernel.InvokeAsync(function, ...) with the Ollama connector and a thinking-enabled model (Qwen3.5 in my case), the result is an empty string.
The model generates a correct response, but it lands in Ollama's thinking stream rather than message.content, and OllamaPromptExecutionSettings provides no way to set think=false to prevent this. FunctionResult therefore returns empty.
To Reproduce
Steps to reproduce the behavior:
- Configure a kernel with the Ollama connector pointing to a thinking-enabled model (e.g. qwen3.5:9b)
- Create a prompt function: var function = kernel.CreateFunctionFromPrompt(promptTemplate);
- Invoke it with no execution settings: var response = await kernel.InvokeAsync(function, new KernelArguments { ["extractedText"] = extractedText });
- Call response.ToString() — result is an empty string, despite the function completing successfully and the model taking the full generation time (~70s in our case)
Expected behavior
response.ToString() should return the model's generated content. Either OllamaPromptExecutionSettings should expose a Think property (mapping to Ollama's top-level think request field) so users can set think=false, or ChatMessageContent should surface the thinking field content as a fallback or as a separate property so it isn't silently dropped.
Platform
- Language: C#
- Source: NuGet package Microsoft.SemanticKernel.Connectors.Ollama 1.77.0-alpha (latest available)
- AI model: Ollama: qwen3.5:9b (local)
- IDE: Visual Studio Code
- OS: macOS (client), Windows (Ollama host)
Additional context
I confirmed the root cause by inspecting OllamaPromptExecutionSettings: it exposes NumPredict, Temperature, TopK, TopP and Stop, but no Think property, and ExtensionData does not map to Ollama's top-level think request field.
Log output from broken path (kernel.InvokeAsync, no execution settings):

Bypassing kernel.InvokeAsync and calling OllamaApiClient.ChatAsync directly with Think = false, Stream = false does work:
Resulting in the following (in 23.4s):
Finally, the known workarounds for this issue all rely on reading raw response payloads or writing custom handlers, which appears to be the norm for reasoning models generally and is not specific to Ollama; see #13889 and related discussion.
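
For anyone who wants to check the behaviour without Semantic Kernel or OllamaSharp in the loop, here is a minimal sketch of the same idea against Ollama's REST endpoint /api/chat, using only System.Text.Json. It is not the code used in this report: the class and method names (OllamaThinkSketch, BuildChatRequest, ExtractReply) are hypothetical, and the fallback to message.thinking simply mirrors the second option under "Expected behavior". The host URL in the usage comment is also an assumption.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Nodes;

// Hypothetical helper, for illustration only (not part of Semantic Kernel or OllamaSharp).
public static class OllamaThinkSketch
{
    // Builds a non-streaming /api/chat request body with the top-level
    // "think" field set explicitly, which is what the connector settings
    // currently give no way to do.
    public static string BuildChatRequest(string model, string prompt, bool think)
    {
        var body = new JsonObject
        {
            ["model"] = model,
            ["stream"] = false,
            ["think"] = think,
            ["messages"] = new JsonArray(
                new JsonObject { ["role"] = "user", ["content"] = prompt })
        };
        return body.ToJsonString();
    }

    // Returns message.content; if that is missing or empty, falls back to
    // message.thinking so the generated text is not silently dropped.
    public static string ExtractReply(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        var message = doc.RootElement.GetProperty("message");

        var content = message.TryGetProperty("content", out var c) ? c.GetString() : null;
        if (!string.IsNullOrEmpty(content))
        {
            return content;
        }

        return message.TryGetProperty("thinking", out var t) ? t.GetString() ?? "" : "";
    }
}

// Usage sketch (assumes an Ollama server reachable at http://localhost:11434):
//
//   using var http = new System.Net.Http.HttpClient();
//   var json = OllamaThinkSketch.BuildChatRequest("qwen3.5:9b", "Say hello", think: false);
//   var payload = new System.Net.Http.StringContent(json, System.Text.Encoding.UTF8, "application/json");
//   var resp = await http.PostAsync("http://localhost:11434/api/chat", payload);
//   Console.WriteLine(OllamaThinkSketch.ExtractReply(await resp.Content.ReadAsStringAsync()));
```

With think set to false the reply should arrive in message.content, so the fallback branch only matters when thinking is left enabled and content comes back empty.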