Support Image results in FunctionExecutionResult #7131

xukefaker · 2025-11-29T14:43:26Z

xukefaker
Nov 29, 2025

I am developing an LLM-based agent designed for paper analysis. The agent's core workflow is as follows:

The agent receives a markdown document (representing the paper's sections) as input.
The LLM may invoke the read_paper_images() tool to perform a detailed analysis of the paper's figures.
The read_paper_images() tool then returns the image data to the LLM."

The result of the read_paper_images() tool is:

        result = {
            "status": "success",
            "images": [f"data:{mime_type_1};base64,{base64_str_1}", "data:{mime_type_2};base64,{base64_str_2}"...],
            "image_paths": [img["path"] for img in result_images if img.get("found", False)],
        }

However, I've observed that the $\text{FunctionExecutionResult}$ object flattens this dictionary into a single string (specifically, $\text{FunctionExecutionResult.content}$). Given this, how can the LLM effectively read and process the images returned by the function/tool?

This is also related to #5250

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support Image results in FunctionExecutionResult #7131

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

Support Image results in FunctionExecutionResult #7131

Uh oh!

xukefaker Nov 29, 2025

Replies: 0 comments

xukefaker
Nov 29, 2025