[fix] Fix placement group bundle ordering for inference engines #1308

Open

SumanthRH wants to merge 17 commits into main from fix-placement
Conversation

@SumanthRH (Member) commented Mar 10, 2026

Fixes #1306

Ray placement groups don't guarantee bundle order - bundles on the same node may not have consecutive indices.

This PR introduces a ResolvedPlacementGroup wrapper that computes GPU-aware reordered bundle indices (sorted by node_id, gpu_id) as a cached property, and uses them for all bundle indexing in both inference engine and trainer placement.
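The reordering idea can be sketched roughly as follows (a minimal illustration, not the PR's actual implementation; `reorder_bundle_indices` and the tuple format are assumptions made here for clarity):

```python
# Hypothetical sketch of the reordering described above: probe each bundle
# for its node and GPU, then sort bundle indices by (node_id, gpu_id) so
# that bundles on the same node get consecutive logical positions.

def reorder_bundle_indices(bundle_placements):
    """bundle_placements: list of (bundle_idx, node_id, gpu_id) tuples."""
    ordered = sorted(bundle_placements, key=lambda b: (b[1], b[2]))
    return [idx for idx, _node, _gpu in ordered]

# Example: Ray may hand back bundles interleaved across two nodes.
placements = [
    (0, "node-a", 0),
    (1, "node-b", 0),
    (2, "node-a", 1),
    (3, "node-b", 1),
]
print(reorder_bundle_indices(placements))  # [0, 2, 1, 3]
```

With this ordering, consumers that assume "consecutive indices share a node" (e.g. tensor-parallel rank assignment) see a consistent view even when the raw bundle indices are interleaved.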



SumanthRH and others added 6 commits March 10, 2026 20:16
Ray placement groups don't guarantee bundle order - bundles on the same
node may not have consecutive indices. Introduce SkyRLPlacementGroup
wrapper that computes GPU-aware reordered bundle indices (sorted by
node_id, gpu_id) as a cached property, and use them for all bundle
indexing in both inference engine and trainer placement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
- Update ServerGroup to accept SkyRLPlacementGroup consistently
- Change VLLMServerActor to accept explicit bundle_indices list instead
  of computing contiguous range from start_bundle_idx
- Fix type annotation in trainer.py for colocate_pg
- Update tests to wrap raw PlacementGroups in SkyRLPlacementGroup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
…ping

All callers now wrap raw PlacementGroups in SkyRLPlacementGroup at
creation time. PPORayActorGroup, create_ray_wrapped_inference_engines,
and flash_rl_engine assert the type instead of silently auto-wrapping,
which avoids redundant reordering computation when the same PG is
shared across multiple actor groups.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
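The "assert instead of auto-wrap" pattern in the commit above can be sketched as follows (illustrative only; the wrapper body and `use_placement_group` are assumptions, not the PR's actual code):

```python
# Hypothetical sketch: callers wrap the raw placement group exactly once at
# creation time; shared consumers validate the type instead of silently
# re-wrapping, which would recompute the cached bundle reordering for each
# actor group sharing the same PG.

class SkyRLPlacementGroup:
    def __init__(self, pg):
        self.pg = pg
        self._reordered = None  # lazily computed and cached on first use

def use_placement_group(pg):
    # Fail loudly on a raw PG rather than auto-wrapping it.
    assert isinstance(pg, SkyRLPlacementGroup), (
        f"expected SkyRLPlacementGroup, got {type(pg).__name__}"
    )
    return pg
```

The design choice here is that a shared cache only stays shared if exactly one wrapper exists per placement group, so consumers assert rather than construct.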
@SumanthRH SumanthRH marked this pull request as ready for review March 11, 2026 00:16
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
devin-ai-integration[bot] commented (comment marked as resolved)

…_offload

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist bot (Contributor) left a comment:
Code Review

This pull request introduces a SkyRLPlacementGroup wrapper class to manage Ray placement groups, ensuring GPU-aware ordering of resource bundles. The change addresses the issue that Ray's native placement groups do not guarantee bundle order, which is crucial for consistent GPU resource allocation in distributed training and inference. The code updates replace direct PlacementGroup usage with SkyRLPlacementGroup across various modules, including inference engines, trainers, and server groups. This involves extracting the raw Ray placement group and its reordered bundle indices from the wrapper, and modifying scheduling strategies and bundle index assignments to utilize these reordered indices. Additionally, server actor initialization now accepts a list of bundle_indices instead of a starting index, providing more precise control over resource allocation.

Note: Security Review did not run due to the size of the PR.

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH requested a review from erictang000 March 11, 2026 01:21
@erictang000 (Collaborator) commented:
@SumanthRH this pretty much lgtm, but let's merge #1300 first since there's some overlap/merge conflicts to resolve?

@SumanthRH SumanthRH marked this pull request as draft March 11, 2026 23:28
SumanthRH and others added 3 commits March 11, 2026 23:28
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
…ndant GPU probing

SkyRLPlacementGroup now stores the full (bundle_idx, node_id, gpu_id) mapping
via lazy _get_bundle_placement(), exposing bundle_gpu_ids and bundle_node_ids
properties. This eliminates get_gpu_ids_for_pg_bundles and
get_reordered_bundle_indices (dead code), and simplifies mp backend GPU ID
lookup in both old (ray_wrapped_inference_engine) and new (ServerGroup) stacks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
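A rough sketch of the lazy placement mapping described in the commit above (the probe callable and class internals are assumptions for illustration; only the property names come from the commit message):

```python
# Illustrative sketch: compute the (bundle_idx, node_id, gpu_id) mapping
# exactly once via a cached property, then expose derived views, so the
# expensive GPU probe is never repeated.
from functools import cached_property

class ResolvedPlacementGroup:
    def __init__(self, pg, probe):
        self.pg = pg
        self._probe = probe  # callable: () -> list[(bundle_idx, node_id, gpu_id)]

    @cached_property
    def _bundle_placement(self):
        # Runs the (expensive) probe exactly once; result is cached.
        return self._probe()

    @property
    def bundle_node_ids(self):
        return [node for _idx, node, _gpu in self._bundle_placement]

    @property
    def bundle_gpu_ids(self):
        return [gpu for _idx, _node, gpu in self._bundle_placement]
```

Both old and new stacks can then read `bundle_gpu_ids` directly instead of re-probing GPUs per consumer.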
    num_servers: int,
    start_port: int = 8000,
-   placement_group: Optional[PlacementGroup] = None,
+   placement_group: Optional[SkyRLPlacementGroup] = None,
A collaborator commented:
Can we call this OrderedPlacementGroup?

SumanthRH (Member, Author) replied:
Hm, I don't think that name would be right, given that the underlying pg object is still "unordered" - the wrapper class here is simply resolving the physical ordering of the bundles. OrderedPlacementGroup makes it sound like we have additional guarantees on the pg object.

I've renamed it to ResolvedPlacementGroup - this should be better.

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH marked this pull request as ready for review March 13, 2026 21:53
@gemini-code-assist bot (Contributor) left a comment:
Code Review

This pull request introduces a ResolvedPlacementGroup wrapper to address incorrect bundle ordering in Ray placement groups, which is a critical fix for ensuring correct GPU-aware resource allocation. The new wrapper cleverly caches bundle placement information, improving both correctness and efficiency. The changes are consistently applied across the codebase, including inference engines, trainers, and tests. The overall implementation is robust and well-designed. I have one minor suggestion to improve code clarity in server_group.py.

Comment on lines 90 to +100

        self._external_pg = placement_group

        # Extract the raw PG, reordered indices, and GPU IDs from ResolvedPlacementGroup.
        if placement_group is not None:
            self._external_pg = placement_group.pg
            self._reordered_bundle_indices = placement_group.reordered_bundle_indices
            self._bundle_gpu_ids = placement_group.bundle_gpu_ids
        else:
            self._external_pg = None
            self._reordered_bundle_indices = None
            self._bundle_gpu_ids = None
gemini-code-assist bot comment (severity: medium):

This block can be simplified to improve clarity and remove redundancy. The assignment self._external_pg = placement_group is immediately overwritten within the if block, and the else block repeats the default None assignments. Initializing these attributes to None before the if block would make the logic more straightforward.

Suggested change

Before:
        self._external_pg = placement_group
        # Extract the raw PG, reordered indices, and GPU IDs from ResolvedPlacementGroup.
        if placement_group is not None:
            self._external_pg = placement_group.pg
            self._reordered_bundle_indices = placement_group.reordered_bundle_indices
            self._bundle_gpu_ids = placement_group.bundle_gpu_ids
        else:
            self._external_pg = None
            self._reordered_bundle_indices = None
            self._bundle_gpu_ids = None

After:
        self._external_pg = None
        self._reordered_bundle_indices = None
        self._bundle_gpu_ids = None
        if placement_group is not None:
            self._external_pg = placement_group.pg
            self._reordered_bundle_indices = placement_group.reordered_bundle_indices
            self._bundle_gpu_ids = placement_group.bundle_gpu_ids

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[vllm] Make sure vLLM inference engine allocates tp ranks contiguously with ray backend

3 participants