I tested the open-source `IDEA-FinAI/chartmoe` model using the `quickstart.py` script. When I use different instructions, the outputs vary significantly:
- The results from the script's original instructions are consistent with those prompts, which confirms there is no issue with the model I downloaded.
- When I used an instruction from `chart2json.json` in the `Coobiw/ChartMoE-Data` dataset, "Directly output table information and corresponding matplotlib plot properties into a JSON file," the model misinterpreted the instruction and produced output that diverged from the test image.
Response:
Attention Implementation: flash_attention_2
Set max length to 4096
Loading checkpoint shards: 100%|██████████| 2/2 [00:16<00:00, 8.23s/it]
/home/miniconda3/envs/chartmoe/lib/python3.9/site-packages/transformers/generation/utils.py:1417: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation )
warnings.warn(
This task requires generating a JSON file with the table information and corresponding matplotlib plot properties. The output would be a JSON object with keys representing the table columns and values representing the corresponding data points. The JSON object would also include the matplotlib plot properties such as the x-axis label, y-axis label, title, and legend.
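(As an aside: the UserWarning in both runs comes from generation parameters being set on the model config. The replacement that transformers suggests is an explicit generation config; below is a minimal sketch, assuming `model` is the underlying Hugging Face model inside the quickstart wrapper, which may not be directly exposed.)

```python
from transformers import GenerationConfig

# Attach an explicit generation config instead of editing model.config;
# this is the replacement the warning above recommends.
# `model` is assumed to be the underlying HF model object.
model.generation_config = GenerationConfig(max_length=4096, do_sample=False)
```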
- When I used another instruction from `chart2json.json` in the `Coobiw/ChartMoE-Data` dataset, "Pull table information and associated matplotlib plot configurations, then directly format as JSON," the model correctly interpreted the image and produced output relevant to the visual content.
Response:
Attention Implementation: flash_attention_2
Set max length to 4096
Loading checkpoint shards: 100%|██████████| 2/2 [00:15<00:00, 7.71s/it]
/home/miniconda3/envs/chartmoe/lib/python3.9/site-packages/transformers/generation/utils.py:1417: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation )
warnings.warn(
{
    "data": [
        {
            "Year": 2016,
            "Student A.Average": 3.25
        },
        {
            "Year": 2017,
            "Student A.Average": 3.35
        },
        {
            "Year": 2018,
            "Student A.Average": 3.6
        },
        {
            "Year": 2019,
            "Student A.Average": 3.8
        },
        {
            "Year": 2020,
            "Student A.Average": 3.7
        },
        {
            "Year": 2021,
            "Student A.Average": 3.5
        },
        {
            "Year": 2022,
            "Student A.Average": 3.9
        }
    ],
    "labels": ["2016", "2017", "2018", "2019", "2020", "2021", "2022"],
    "title": "Student Performance"
}
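For reference, this second response is directly machine-readable. A minimal sketch of how it can be checked against the test image (plain matplotlib; the line-plot choice is my assumption, since the JSON does not name a chart type):

```python
import json
import matplotlib.pyplot as plt

# The model's JSON response from the second run, pasted verbatim.
response_json = """
{
  "data": [
    {"Year": 2016, "Student A.Average": 3.25},
    {"Year": 2017, "Student A.Average": 3.35},
    {"Year": 2018, "Student A.Average": 3.6},
    {"Year": 2019, "Student A.Average": 3.8},
    {"Year": 2020, "Student A.Average": 3.7},
    {"Year": 2021, "Student A.Average": 3.5},
    {"Year": 2022, "Student A.Average": 3.9}
  ],
  "labels": ["2016", "2017", "2018", "2019", "2020", "2021", "2022"],
  "title": "Student Performance"
}
"""

parsed = json.loads(response_json)
years = [row["Year"] for row in parsed["data"]]
values = [row["Student A.Average"] for row in parsed["data"]]

# The JSON does not specify a chart type, so a line plot is assumed here.
plt.plot(years, values, marker="o")
plt.xticks(years, parsed["labels"])
plt.title(parsed["title"])
plt.xlabel("Year")
plt.ylabel("Student A.Average")
plt.show()
```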
I'm puzzled by how strongly the model's comprehension deviates with these Align Stage instructions. Is this due to an operational error on my part, or an inherent issue with the model itself?
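For completeness, here is roughly what I ran. It follows the `quickstart.py` pattern from the ChartMoE repo (the `ChartMoE_Robot` API here is taken from the repo's README and may differ from the latest version; the image path is a placeholder for my test chart, and only the question string was swapped between the two runs):

```python
import torch
from chartmoe import ChartMoE_Robot

robot = ChartMoE_Robot()

image_path = "./test_chart.png"  # placeholder for my test image
question = (
    "Pull table information and associated matplotlib plot configurations, "
    "then directly format as JSON"
)

# Only `question` changes between the two runs above; everything else is identical.
with torch.cuda.amp.autocast():
    response, history = robot.chat(image_path=image_path, question=question, history="")

print(response)
```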