[megatron] support language_model_only #9496
Code Review
This pull request introduces a new command-line parameter, language_model_only, which allows training, loading, and saving only the language model portion of a multimodal model. This parameter is documented in both Chinese and English and is added to the MegatronArguments class. The reviewer suggested persisting this argument in load_args_config to ensure it is restored correctly when resuming training, and adding a validation check in __post_init__ to restrict its usage to multimodal models.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
# other
megatron_extra_kwargs: Optional[Union[dict, str]] = None
language_model_only: bool = False
When resuming training or loading a checkpoint, the configuration is restored using the load_args_config method (defined around line 654). Since language_model_only is a critical model-structure-related argument that determines whether only the language model part is loaded/saved, it should be persisted and restored.

Please consider adding 'language_model_only' to the keys list in load_args_config so that it is automatically loaded from args.json when resuming.

Additionally, you might want to add a validation check in __post_init__ to ensure language_model_only is only set to True when self.is_multimodal is True.
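A minimal sketch of the two suggestions above, not the actual MegatronArguments code. The class SketchArgs, the ckpt_dir field, and the body of load_args_config are hypothetical stand-ins, and is_multimodal is modeled as a plain field here purely for illustration. Only the names language_model_only, load_args_config, args.json, and __post_init__ come from the review comment.

```python
import json
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchArgs:
    # Hypothetical stand-in for MegatronArguments.
    is_multimodal: bool = False  # assumption: modeled as a plain field
    language_model_only: bool = False
    ckpt_dir: Optional[str] = None  # assumption: directory holding args.json

    def load_args_config(self, ckpt_dir: str) -> None:
        # Keys restored from args.json when resuming; the suggestion is
        # to include 'language_model_only' in this list.
        keys = ['language_model_only']
        path = os.path.join(ckpt_dir, 'args.json')
        if not os.path.exists(path):
            return
        with open(path, encoding='utf-8') as f:
            saved = json.load(f)
        for key in keys:
            if key in saved:
                setattr(self, key, saved[key])

    def __post_init__(self) -> None:
        # Restore persisted values first, then validate the result.
        if self.ckpt_dir is not None:
            self.load_args_config(self.ckpt_dir)
        if self.language_model_only and not self.is_multimodal:
            raise ValueError(
                'language_model_only=True requires a multimodal model.')
```

In this sketch the value saved in args.json overrides whatever was passed on construction; whether the real method should behave that way is a design choice for the PR author.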
/gemini review
Code Review
This pull request introduces a new language_model_only parameter to support training only the language model component of multimodal models. It includes documentation updates, argument definitions, dependency checks, and initialization logic. A review comment suggests adding a validation check to raise a ValueError if language_model_only is enabled alongside tuner_type="lora_llm", as this combination is contradictory.
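The suggested check could look roughly like the following. This is a sketch under assumptions: the helper name validate_language_model_only is hypothetical, and the real check would presumably live wherever the PR validates its arguments. Only language_model_only and tuner_type="lora_llm" are taken from the review.

```python
from typing import Optional


def validate_language_model_only(language_model_only: bool,
                                 tuner_type: Optional[str]) -> None:
    # Hypothetical helper: reject the combination the review flags
    # as contradictory.
    if language_model_only and tuner_type == 'lora_llm':
        raise ValueError(
            'language_model_only=True cannot be combined with '
            'tuner_type="lora_llm".')
```

Raising at argument-parsing time surfaces the conflict before any model weights are loaded.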
modelscope/mcore-bridge#112