Wdesalvador/add ccws deploy script #74
edwardsp left a comment
approved, just need to fix conflicts
65e4088 to c080b1b
Pull Request Overview
This PR introduces a comprehensive deployment automation script (deploy-ccws.sh) for Azure CycleCloud Workspace for Slurm, replacing the previous manual template-based approach. The changes include:
- A new 888-line Bash deployment script with extensive parameter handling, validation, and interactive/non-interactive modes
- Removal of the old template file (`large-ai-training-cluster-parameters.template`)
- Comprehensive README documentation updates with detailed usage examples and parameter references
- Git configuration updates (`.gitignore` for generated artifacts, `.gitattributes` for shell script line endings)
- Minor documentation corrections standardizing "Azure CycleCloud Workspace for Slurm" naming
Reviewed Changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| infrastructure_references/azure_cyclecloud_workspace_for_slurm/scripts/deploy-ccws.sh | New deployment automation script with SKU validation, zone discovery, database auto-creation, and parameter generation |
| infrastructure_references/azure_cyclecloud_workspace_for_slurm/large-ai-training-cluster-parameters.template | Removed old template file in favor of automated generation |
| infrastructure_references/azure_cyclecloud_workspace_for_slurm/README.md | Comprehensive documentation update with script usage, examples, and parameter reference |
| examples/megatron-lm/GPT3-175B/slurm/README.md | Minor naming correction: "Azure CycleCloud Slurm Workspace" → "Azure CycleCloud Workspace for Slurm" |
| examples/llm-foundry/slurm/README.md | Minor naming correction: "Azure CycleCloud Slurm Workspace" → "Azure CycleCloud Workspace for Slurm" |
| README.md | Minor naming correction in infrastructure references catalog |
| .gitignore | Added generated artifacts: output.json and cyclecloud-slurm-workspace directory |
| .gitattributes | Enforced LF line endings for shell scripts |
34be821 to de5baf8
This pull request modernizes and simplifies the deployment workflow for Azure CycleCloud Workspace for Slurm environments. It replaces manual parameter file editing and environment variable management with a comprehensive, interactive deployment script (`deploy-ccws.sh`). The documentation is extensively updated to guide users through both interactive and automated deployments, including support for advanced features like availability zones and optional database integration.

Deployment workflow modernization:
- Added the `scripts/deploy-ccws.sh` automation script, which interactively collects deployment parameters, generates the required `output.json` file, and optionally performs the deployment. This replaces manual editing of parameter templates and environment variable setup.

Template and documentation cleanup:
- Removed the `large-ai-training-cluster-parameters.template` file, which relied on environment variable substitution. All parameterization is now handled by `deploy-ccws.sh`, streamlining the process and reducing the risk of misconfiguration.

Database integration improvements:
- Database settings are supplied through dedicated options (`--db-name`, `--db-user`, `--db-password`, `--db-id`). The script validates these parameters and disables accounting if not all are provided, improving security and usability.

These changes make deployments more robust, reproducible, and user-friendly, while reducing manual steps and potential errors.
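The all-or-nothing handling of the database options described above could be sketched roughly as follows. This is an illustration only, not the code in `deploy-ccws.sh`: the function name `resolve_accounting`, the `DB_*` variables, and the `ACCOUNTING_ENABLED` flag are hypothetical, and only the four `--db-*` option names come from the PR description.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of all-or-nothing validation for the database options.
# Function and variable names are illustrative assumptions, not taken from
# deploy-ccws.sh; only the four --db-* option names appear in the PR.

resolve_accounting() {
  DB_NAME="" DB_USER="" DB_PASSWORD="" DB_ID=""

  # Collect the four database options; skip anything else.
  while [ $# -gt 0 ]; do
    case "$1" in
      --db-name)     DB_NAME="${2:-}" ;;
      --db-user)     DB_USER="${2:-}" ;;
      --db-password) DB_PASSWORD="${2:-}" ;;
      --db-id)       DB_ID="${2:-}" ;;
      *)             shift; continue ;;
    esac
    # Drop the option and its value (or just the option if the value is missing).
    if [ $# -gt 1 ]; then shift 2; else shift; fi
  done

  # Accounting stays enabled only when every database value is present.
  if [ -n "$DB_NAME" ] && [ -n "$DB_USER" ] && [ -n "$DB_PASSWORD" ] && [ -n "$DB_ID" ]; then
    ACCOUNTING_ENABLED=true
  else
    ACCOUNTING_ENABLED=false
    echo "warning: incomplete database settings, disabling accounting" >&2
  fi
}

# Example: the password and resource ID are missing, so accounting is disabled.
resolve_accounting --db-name slurmdb --db-user dbadmin
echo "accounting enabled: $ACCOUNTING_ENABLED"   # prints "accounting enabled: false"
```

Treating a partial set of values as "disabled" rather than as a fatal error matches the behavior the description reports; a stricter script might instead exit with a non-zero status.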