
[megatron] support language_model_only #9496

Merged
Jintao-Huang merged 4 commits into modelscope:main from Jintao-Huang:support_language_model_only
Jun 4, 2026

Conversation


@Jintao-Huang Jintao-Huang commented Jun 4, 2026

Collaborator

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new command-line parameter, language_model_only, which allows training, loading, and saving only the language model portion of a multimodal model. This parameter is documented in both Chinese and English and is added to the MegatronArguments class. The reviewer suggested persisting this argument in load_args_config to ensure it is restored correctly when resuming training, and adding a validation check in __post_init__ to restrict its usage to multimodal models.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.


# other
megatron_extra_kwargs: Optional[Union[dict, str]] = None
language_model_only: bool = False


medium

When resuming training or loading a checkpoint, the configuration is restored using the load_args_config method (defined around line 654). Since language_model_only is a critical model-structure-related argument that determines whether only the language model part is loaded/saved, it should be persisted and restored.

Please consider adding 'language_model_only' to the keys list in load_args_config so that it is automatically loaded from args.json when resuming.

Additionally, you might want to add a validation check in __post_init__ to ensure language_model_only is only set to True when self.is_multimodal is True.
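A minimal sketch of the two suggestions above, not the actual ms-swift code. SketchArgs, PERSISTED_KEYS, and restore_persisted_args are hypothetical names used only for illustration; the real MegatronArguments class and load_args_config method have their own fields and key list, and is_multimodal is modeled here as a plain field.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical key list; the real load_args_config keeps its own list of keys.
PERSISTED_KEYS: List[str] = ['language_model_only']


@dataclass
class SketchArgs:
    """Stand-in for an arguments dataclass (an assumption, not MegatronArguments)."""
    is_multimodal: bool = False
    language_model_only: bool = False

    def __post_init__(self):
        # Suggested validation: the flag only makes sense for multimodal models.
        if self.language_model_only and not self.is_multimodal:
            raise ValueError('language_model_only=True requires a multimodal model.')


def restore_persisted_args(args: SketchArgs, args_json_text: str) -> None:
    """Copy persisted keys from the text of an args.json file back onto args."""
    saved = json.loads(args_json_text)
    for key in PERSISTED_KEYS:
        if key in saved:
            setattr(args, key, saved[key])
```

With the flag in the persisted key list, a resumed run would pick up the saved value instead of silently falling back to the default of False.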

@Jintao-Huang

Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new language_model_only parameter to support training only the language model component of multimodal models. It includes documentation updates, argument definitions, dependency checks, and initialization logic. A review comment suggests adding a validation check to raise a ValueError if language_model_only is enabled alongside tuner_type="lora_llm", as this combination is contradictory.
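The suggested check could look roughly like the sketch below. The function name check_language_model_only is hypothetical, and where such a check would live in the real code (for example inside __post_init__) is an assumption; only the language_model_only and tuner_type="lora_llm" names come from the review.

```python
from typing import Optional


def check_language_model_only(language_model_only: bool, tuner_type: Optional[str]) -> None:
    """Reject the combination the review describes as contradictory."""
    if language_model_only and tuner_type == 'lora_llm':
        raise ValueError(
            "language_model_only=True cannot be combined with tuner_type='lora_llm'.")
```

Any other pairing of the two values passes through without error.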


Comment thread swift/megatron/arguments/megatron_args.py
@Jintao-Huang Jintao-Huang merged commit a5aa39f into modelscope:main Jun 4, 2026
3 checks passed