
140GB is needed on hard disk instead of 2.62GB for downloading a dataset for puzzletron algorithm #1658

@danielkorzekwa

Description

see: https://github.com/NVIDIA/Model-Optimizer/blob/main/examples/puzzletron/README.md#compress-the-model (v.0.44.0)

The Nemotron-Post-Training-Dataset-v2 dataset is first downloaded to hf_home, where it takes up 136GB (du -ms reports megabytes):

.../experiments/6_5_qwen_35_moments_lab$ du -ms ./hf_home
136234  ./hf_home

The final dataset is then created in a separate folder and takes up only 2.6GB.

Could you please:

  • clarify the disk space requirement in the docs
  • ideally, require downloading only the 2.6GB that is actually used instead of the full 136GB dataset
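
One possible direction for the second point, as a rough sketch only: the Hugging Face `datasets` library supports `streaming=True`, which reads records on demand instead of first caching every file under hf_home. I have not checked how the Puzzletron data preparation script selects its samples, so the helper names below (`take_samples`, `stream_subset`) are hypothetical, and the repo id and split in the usage comment are assumptions.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def take_samples(stream: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Return at most the first n records of an iterable, reading no further."""
    return list(islice(stream, n))


def stream_subset(repo_id: str, split: str, n: int) -> List[Dict[str, Any]]:
    """Fetch the first n records of a Hub dataset split via streaming."""
    # Imported lazily so take_samples works without `datasets` installed.
    from datasets import load_dataset

    # streaming=True returns an IterableDataset that fetches records lazily
    # rather than downloading the whole dataset into the local cache first.
    ds = load_dataset(repo_id, split=split, streaming=True)
    return take_samples(ds, n)


# Example call (assumed repo id and split name, not taken from the README):
# rows = stream_subset("nvidia/Nemotron-Post-Training-Dataset-v2", "chat", 1000)
```

This only helps if the final 2.6GB dataset can be built from a prefix or a filtered pass over the stream. If the script needs random access to the full dataset, streaming alone would not be enough.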

Thank you.

Labels

bug
