feat(configurator): add ObsLeafDescriptor + structured-observation protocol#928
feat(configurator): add ObsLeafDescriptor + structured-observation protocol#928rutayan-nv wants to merge 17 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds DSE domain-randomization infrastructure: a ChangesDSE Domain Randomization and Error Handling
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)
145-170:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConstraint-failure path breaks
env.csv↔trajectory.csvstep alignment.Line 146 triggers
observer.before_step(...)(which persistsenv.csv) before constraints are checked, but Line 170 returns early without appending a trajectory row. For env-param-enabled runs, this creates unmatchedenv.csvrows.Proposed fix
if not self.test_run.test.constraint_check(self.test_run, self.runner.system): logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + observation = [-1.0] + reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=reward, + observation=observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, observation, reward) + return observation, reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The constraint check failure path breaks the alignment between env.csv and trajectory.csv because observer.before_step() is called before the constraint validation, writing an env.csv entry that never gets a matching trajectory.csv row when the function returns early. Move the constraint check (the if not self.test_run.test.constraint_check() block) to execute before calling observer.before_step() at the beginning of the step. This ensures that if constraints fail, no env.csv entry is written, keeping both CSV files aligned.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The run loop captures the `done` flag from the
env.step(action) call but never checks it, causing the loop to continue
executing after the environment signals termination. Add a conditional check
after the update_policy call to break out of the loop when `done` evaluates to
True, ensuring the agent stops running trials when the episode terminates as
intended.
In `@src/cloudai/configurator/env_params.py`:
- Around line 87-97: The ObsLeafDescriptor class is missing a dtype field to
expose the box data type contract. Add an optional dtype field to
ObsLeafDescriptor that defaults to "float32" for box leaves, allowing adapters
to consume dtype metadata from the descriptor schema. Update the _validate
method to ensure dtype is appropriately set for box-type leaves and validate
that the dtype value is valid when specified.
- Around line 113-115: The `encode_observation()` method signature in
StructuredObservation diverges from the documented public contract. The
documented contract specifies a zero-arg hook, but the current implementation
includes an `observation: list` parameter. Remove the `observation` parameter
from the `encode_observation()` method signature to align it with the documented
public contract and ensure compatibility with duck-typed environments and
adapters that expect a zero-arg method.
- Line 2: The copyright header on line 2 of
src/cloudai/configurator/env_params.py does not match the CI header-check test
expectations for the year format. Update the copyright year string in the header
from the current "2024-2026" format to match the exact year string that the
repository's header-check test expects for this file (which should be "2026"
based on the CI requirement). Ensure the format precisely matches what the
enforced header check validates.
In `@tests/test_action_space.py`:
- Around line 43-50: Add type-check suppressions to the test functions that
intentionally pass invalid arguments to validate runtime error handling. For the
test_continuous_space_rejects_unknown_dtype function, suppress the type-check
error on the line where ContinuousSpace is instantiated with the invalid
"double" dtype argument. For the test_continuous_space_forbids_extra_fields
function, suppress the type-check error on the line where ContinuousSpace is
instantiated with the unexpected step parameter. These suppressions allow the
runtime validation assertions to execute without being blocked by static type
checkers like pyright, which would otherwise reject these intentionally invalid
arguments.
In `@tests/test_env_params.py`:
- Line 2: The copyright header on line 2 contains an incorrect year range that
fails the CI header check. Update the copyright year in the NVIDIA CORPORATION &
AFFILIATES copyright line from the current range (2025-2026) to only 2026 to
match the CI expectation and unblock the header validation.
- Around line 143-148: Replace the three direct ObsLeafDescriptor constructor
calls with model_validate() calls to satisfy pyright type checking. For the
first validation test (with the dim=0 case and match pattern), call
model_validate() instead of the constructor directly. For the second test (with
the unexpected=1 parameter), similarly use model_validate(). For the third test,
use model_validate() AND correct the test data from kind="categorical" to a
valid literal value that ObsLeafDescriptor accepts, which should be either "box"
or "discrete" based on the kind validation constraints.
---
Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 145-170: The constraint check failure path breaks the alignment
between env.csv and trajectory.csv because observer.before_step() is called
before the constraint validation, writing an env.csv entry that never gets a
matching trajectory.csv row when the function returns early. Move the constraint
check (the if not self.test_run.test.constraint_check() block) to execute before
calling observer.before_step() at the beginning of the step. This ensures that
if constraints fail, no env.csv entry is written, keeping both CSV files
aligned.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: ed2eee75-0fae-450f-a244-ee6a41e8986a
📒 Files selected for processing (13)
src/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/cli/handlers.pysrc/cloudai/configurator/base_agent.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_handlers.pytests/test_test_scenario.py
| observation, reward, done, *_ = self.env.step(action) | ||
| self.update_policy( | ||
| { | ||
| "trial_index": step, | ||
| "value": reward, | ||
| "observation": observation, | ||
| "prev_observation": prev_observation, | ||
| "action": action, | ||
| "done": done, | ||
| } | ||
| ) | ||
| logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}") | ||
| return 0 |
There was a problem hiding this comment.
Honor terminal done in the default run loop.
Line 149 receives done from env.step(action), but the loop continues regardless. This can execute extra trials after termination and skew agent behavior.
Proposed fix
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+ if done:
+ break
return 0📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| observation, reward, done, *_ = self.env.step(action) | |
| self.update_policy( | |
| { | |
| "trial_index": step, | |
| "value": reward, | |
| "observation": observation, | |
| "prev_observation": prev_observation, | |
| "action": action, | |
| "done": done, | |
| } | |
| ) | |
| logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}") | |
| return 0 | |
| observation, reward, done, *_ = self.env.step(action) | |
| self.update_policy( | |
| { | |
| "trial_index": step, | |
| "value": reward, | |
| "observation": observation, | |
| "prev_observation": prev_observation, | |
| "action": action, | |
| "done": done, | |
| } | |
| ) | |
| logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}") | |
| if done: | |
| break | |
| return 0 |
🧰 Tools
🪛 Ruff (0.15.17)
[warning] 160-160: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/cloudai/configurator/base_agent.py` around lines 149 - 161, The run loop
captures the `done` flag from the env.step(action) call but never checks it,
causing the loop to continue executing after the environment signals
termination. Add a conditional check after the update_policy call to break out
of the loop when `done` evaluates to True, ensuring the agent stops running
trials when the episode terminates as intended.
…ole mutator The trial index lives on TestRun and is advanced exactly once per CloudAIGymEnv.step(), before output_path and the trajectory row are computed. Step artifacts and trajectory rows are therefore tagged with the advanced index rather than the pre-step value.
handle_dse_job calls agent.run() uniformly; agents (e.g. RLlib-based) implement their own training loop in run(). Failure handling follows a hybrid contract: a non-zero rc accumulates via `err |= rc` and the sweep continues, while an unexpected exception hard-fails the scenario after reports are generated and the aborting error is documented.
7c6a1ab to
5657f1e
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)
145-170:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConstraint-failure early return skips trajectory persistence and
after_stephooks.After Line 145-147, observers run
before_step. But Line 168-170 returns immediately on failed constraints, so no trajectory row is written andafter_stepis never called. With env-param observers, this can leaveenv.csvandtrajectory.csvout of sync for that step.Proposed fix
if not self.test_run.test.constraint_check(self.test_run, self.runner.system): logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + observation = [-1.0] + reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=reward, + observation=observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, observation, reward) + return observation, reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The constraint-failure return path at lines 168-170 skips calling write_trajectory and observer.after_step, creating a mismatch with the before_step call at line 146-147. Before the early return statement that handles the failed constraint check, add a trajectory entry write using write_trajectory with appropriate step, action, and constraint-failure reward values, then call observer.after_step for all observers in self.observers (similar to the pattern used in the cached_result branch at lines 164-169) to ensure trajectory persistence and observer symmetry are maintained.
♻️ Duplicate comments (7)
src/cloudai/configurator/env_params.py (3)
113-115:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
StructuredObservationsignatures diverge from the declared public contract.Line 113-115 currently uses
Optional[...]for descriptors and requires anobservationarg inencode_observation, but the documented contract is zero-arg hooks returning a descriptor dict and encoded leaves.Proposed fix
- def structured_observation_descriptors(self) -> Optional[Dict[str, ObsLeafDescriptor]]: ... - def encode_observation(self, observation: list) -> Dict[str, Any]: ... + def structured_observation_descriptors(self) -> Dict[str, ObsLeafDescriptor]: ... + def encode_observation(self) -> Dict[str, Any]: ...🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/env_params.py` around lines 113 - 115, The method signatures for structured_observation_descriptors and encode_observation in the StructuredObservation class do not match the documented public contract. Fix structured_observation_descriptors to return Dict[str, ObsLeafDescriptor] (removing the Optional wrapper) to indicate it always returns a descriptor dict, and remove the observation parameter from encode_observation to make it a zero-arg hook that returns encoded leaves, aligning both methods with the declared contract for these zero-arg hook methods.
2-2:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix the SPDX copyright year string to unblock CI.
Line 2 still uses
2024-2026, but CI expects the exact2026string for this file.Proposed fix
-# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.Based on learnings, header year formatting must match the repository’s enforced checker exactly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/env_params.py` at line 2, The copyright header on line 2 uses the year range "2024-2026" but the CI checker requires the exact year "2026" for this file. Update the copyright year in the first file header comment from the range format to just "2026" to match the repository's enforced header format checker requirements.Sources: Learnings, Pipeline failures
87-97:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
ObsLeafDescriptordoes not expose the documented box dtype contract.Line 87-90 defines
kind/dim/nonly; the PR objective says box leaves default tofloat32. Without a descriptordtype, adapters cannot consume that schema metadata.Proposed fix
class ObsLeafDescriptor(BaseModel): @@ kind: Literal["box", "discrete"] dim: int = 1 n: Optional[int] = None + dtype: Literal["float32"] = "float32"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/env_params.py` around lines 87 - 97, The ObsLeafDescriptor class is missing a dtype field that should be exposed as part of the schema metadata. Add a dtype field to ObsLeafDescriptor alongside the existing kind, dim, and n fields, with a default value of float32 for box type observations, so that adapters can access the data type information from the descriptor schema.tests/test_action_space.py (1)
45-45:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUnblock pyright for intentionally-invalid constructor tests.
Line 45 and Line 50 are valid runtime-validation tests, but static typing rejects them and currently breaks CI lint.
Proposed fix
+from typing import Any, cast @@ def test_continuous_space_rejects_unknown_dtype() -> None: with pytest.raises(ValidationError): - ContinuousSpace(low=0.0, high=1.0, dtype="double") + ContinuousSpace(low=0.0, high=1.0, dtype=cast(Any, "double")) @@ def test_continuous_space_forbids_extra_fields() -> None: with pytest.raises(ValidationError): - ContinuousSpace(low=0.0, high=1.0, step=0.1) + ContinuousSpace(low=0.0, high=1.0, **cast(Any, {"step": 0.1}))Also applies to: 50-50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_action_space.py` at line 45, The ContinuousSpace constructor calls at lines 45 and 50 in tests/test_action_space.py are intentionally-invalid runtime validation tests but violate static typing contracts checked by pyright, causing CI lint failures. Add a pyright ignore directive comment to each of these lines (both the line 45 ContinuousSpace call with dtype="double" and the line 50 similar construct) to allow the static type checker to skip validation while preserving the runtime validation test behavior.Source: Pipeline failures
tests/test_env_params.py (2)
146-148:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse
model_validate(...)for these negative constructor tests so pyright passes.Line 146 and Line 148 intentionally pass invalid payloads, but direct
ObsLeafDescriptor(...)calls are rejected by the typed constructor before runtime validation, causing CI lint failure. Keep the invalid test data, but route it throughmodel_validate({...}).Proposed fix
def test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields() -> None: with pytest.raises(ValidationError, match="dim must be"): ObsLeafDescriptor(kind="box", dim=0) with pytest.raises(ValidationError): - ObsLeafDescriptor(kind="box", dim=1, unexpected=1) + ObsLeafDescriptor.model_validate({"kind": "box", "dim": 1, "unexpected": 1}) with pytest.raises(ValidationError): - ObsLeafDescriptor(kind="categorical", dim=1) + ObsLeafDescriptor.model_validate({"kind": "categorical", "dim": 1})🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_env_params.py` around lines 146 - 148, The direct constructor calls to ObsLeafDescriptor on lines 146 and 148 are rejected by the type checker before runtime validation can occur, causing CI lint failures. Convert both constructor calls to use model_validate({...}) instead, wrapping the invalid test data (the keyword arguments) into a dictionary payload. This allows the invalid data to pass through type checking and be caught by the runtime validation in the pytest.raises blocks.Source: Pipeline failures
2-2:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix the copyright year on Line 2 to unblock CI.
Line 2 still uses
2025-2026, but CI expects this file’s second header line to be2026only.Proposed fix
-# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.Based on learnings, copyright year formatting must match the repository’s header-check logic in
tests/test_check_copyright_headers.py.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_env_params.py` at line 2, Update the copyright year in the header line to match the CI validation requirements. Change the copyright year format from the current `2025-2026` to `2026` only in the copyright header comment. This aligns with the copyright header check logic that validates the formatting across the repository, ensuring CI will pass the header validation.Sources: Learnings, Pipeline failures
src/cloudai/configurator/base_agent.py (1)
149-161:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHonor terminal
donein the default run loop.Line 149 reads
done, but the loop continues throughmax_steps. This can run extra steps after episode termination and feed invalid transitions intoupdate_policy.Proposed fix
self.update_policy( { "trial_index": step, "value": reward, "observation": observation, "prev_observation": prev_observation, "action": action, "done": done, } ) logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}") + if done: + break return 0🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/base_agent.py` around lines 149 - 161, The run loop in the base_agent.py file reads the done flag from the environment but does not honor it to exit early. After calling update_policy with the step data (which includes the done flag), add a check: if done is True, break out of the loop immediately. This prevents extra steps from being executed after the episode has terminated and avoids feeding invalid transitions into update_policy beyond the natural episode boundary.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 145-170: The constraint-failure return path at lines 168-170 skips
calling write_trajectory and observer.after_step, creating a mismatch with the
before_step call at line 146-147. Before the early return statement that handles
the failed constraint check, add a trajectory entry write using write_trajectory
with appropriate step, action, and constraint-failure reward values, then call
observer.after_step for all observers in self.observers (similar to the pattern
used in the cached_result branch at lines 164-169) to ensure trajectory
persistence and observer symmetry are maintained.
---
Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The run loop in the base_agent.py file reads the done
flag from the environment but does not honor it to exit early. After calling
update_policy with the step data (which includes the done flag), add a check: if
done is True, break out of the loop immediately. This prevents extra steps from
being executed after the episode has terminated and avoids feeding invalid
transitions into update_policy beyond the natural episode boundary.
In `@src/cloudai/configurator/env_params.py`:
- Around line 113-115: The method signatures for
structured_observation_descriptors and encode_observation in the
StructuredObservation class do not match the documented public contract. Fix
structured_observation_descriptors to return Dict[str, ObsLeafDescriptor]
(removing the Optional wrapper) to indicate it always returns a descriptor dict,
and remove the observation parameter from encode_observation to make it a
zero-arg hook that returns encoded leaves, aligning both methods with the
declared contract for these zero-arg hook methods.
- Line 2: The copyright header on line 2 uses the year range "2024-2026" but the
CI checker requires the exact year "2026" for this file. Update the copyright
year in the first file header comment from the range format to just "2026" to
match the repository's enforced header format checker requirements.
- Around line 87-97: The ObsLeafDescriptor class is missing a dtype field that
should be exposed as part of the schema metadata. Add a dtype field to
ObsLeafDescriptor alongside the existing kind, dim, and n fields, with a default
value of float32 for box type observations, so that adapters can access the data
type information from the descriptor schema.
In `@tests/test_action_space.py`:
- Line 45: The ContinuousSpace constructor calls at lines 45 and 50 in
tests/test_action_space.py are intentionally-invalid runtime validation tests
but violate static typing contracts checked by pyright, causing CI lint
failures. Add a pyright ignore directive comment to each of these lines (both
the line 45 ContinuousSpace call with dtype="double" and the line 50 similar
construct) to allow the static type checker to skip validation while preserving
the runtime validation test behavior.
In `@tests/test_env_params.py`:
- Around line 146-148: The direct constructor calls to ObsLeafDescriptor on
lines 146 and 148 are rejected by the type checker before runtime validation can
occur, causing CI lint failures. Convert both constructor calls to use
model_validate({...}) instead, wrapping the invalid test data (the keyword
arguments) into a dictionary payload. This allows the invalid data to pass
through type checking and be caught by the runtime validation in the
pytest.raises blocks.
- Line 2: Update the copyright year in the header line to match the CI
validation requirements. Change the copyright year format from the current
`2025-2026` to `2026` only in the copyright header comment. This aligns with the
copyright header check logic that validates the formatting across the
repository, ensuring CI will pass the header validation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 4dd48574-dda1-4f02-8138-c5510caa80a9
📒 Files selected for processing (13)
src/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/cli/handlers.pysrc/cloudai/configurator/base_agent.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_handlers.pytests/test_test_scenario.py
|
@coderabbitai review |
✅ Action performedReview finished.
|
5657f1e to
29867a8
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
src/cloudai/configurator/env_params.py (2)
100-115:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
StructuredObservation.encode_observationsignature diverges from the documented contract.The PR objective documents a zero-arg hook
encode_observation() -> dict[str, Any], but Line 115 includes anobservation: listparameter. This breaks the duck-typing contract for environments implementing the protocol.Proposed fix
- def encode_observation(self, observation: list) -> Dict[str, Any]: ... + def encode_observation(self) -> Dict[str, Any]: ...🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/env_params.py` around lines 100 - 115, The encode_observation method signature in the StructuredObservation Protocol class on line 115 includes an observation: list parameter, which diverges from the documented contract that specifies a zero-argument method. Remove the observation: list parameter from the encode_observation method definition to align with the documented contract and maintain proper duck-typing compatibility with environments implementing this protocol.
74-97:⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff
ObsLeafDescriptoris missing the boxdtypecontract.The PR objective states box leaves default to
float32, but this model exposes nodtypefield, so adapters cannot consume dtype metadata from the descriptor schema. Add adtypefield with appropriate validation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/env_params.py` around lines 74 - 97, The ObsLeafDescriptor class is missing a dtype field needed to expose data type metadata for box leaves to adapters. Add a dtype field to the class (appropriate type should be determined based on your dtype representation strategy, likely Optional[str] with a default of "float32" for box kind), then update the _validate method to enforce that dtype is present and valid when kind equals "box", ensuring adapters can properly consume the dtype contract from the descriptor schema.tests/test_env_params.py (1)
142-148:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix negative-constructor tests to satisfy type checking.
Line 148 uses
kind="categorical"which is not a validLiteral["box", "discrete"]value. This fails type checking before Pydantic validation runs. Usemodel_validate()for negative tests or choose a valid literal value that triggers the intended validation error.Proposed fix
with pytest.raises(ValidationError, match="dim must be"): ObsLeafDescriptor(kind="box", dim=0) with pytest.raises(ValidationError): - ObsLeafDescriptor(kind="box", dim=1, unexpected=1) # type: ignore + ObsLeafDescriptor.model_validate({"kind": "box", "dim": 1, "unexpected": 1}) with pytest.raises(ValidationError): - ObsLeafDescriptor(kind="categorical", dim=1) # type: ignore + ObsLeafDescriptor.model_validate({"kind": "invalid", "dim": 1})🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_env_params.py` around lines 142 - 148, The test case on lines 147-148 in the test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields function uses kind="categorical" which is not a valid Literal value (only "box" and "discrete" are valid). This causes type checking to fail before Pydantic validation runs. Either use model_validate() with a dictionary to bypass type checking for this negative test, or replace the invalid kind value with a valid literal ("box" or "discrete") that still triggers the validation error you are testing for (likely related to dim=1 being invalid for that kind).tests/test_action_space.py (1)
43-50:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winPyright errors still present (duplicate of past review).
The
# type: ignorecomments don't suppress pyright-specific errors. CI is still failing withreportArgumentTypeon line 45 andreportCallIssueon line 50. Use pyright-specific suppressions as noted in the previous review.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_action_space.py` around lines 43 - 50, The generic `# type: ignore` comments are not suppressing pyright-specific errors in the test functions test_continuous_space_rejects_unknown_dtype and test_continuous_space_forbids_extra_fields. Replace the `# type: ignore` comment on the ContinuousSpace call at line 45 with a pyright-specific suppression for the reportArgumentType error, and replace the `# type: ignore` comment on the ContinuousSpace call at line 50 with a pyright-specific suppression for the reportCallIssue error. Use the format `# pyright: ignore[error_code]` for each respective error code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_env_params.py`:
- Around line 87-88: The pytest.raises context manager for ValueError at the
sink.write call lacks a match parameter. Add a match parameter with an
appropriate regex pattern to the pytest.raises(ValueError) call that matches the
expected error message thrown when sink.write is called with drop_rate of 0.0,
which will improve test specificity and provide better failure diagnostics.
---
Duplicate comments:
In `@src/cloudai/configurator/env_params.py`:
- Around line 100-115: The encode_observation method signature in the
StructuredObservation Protocol class on line 115 includes an observation: list
parameter, which diverges from the documented contract that specifies a
zero-argument method. Remove the observation: list parameter from the
encode_observation method definition to align with the documented contract and
maintain proper duck-typing compatibility with environments implementing this
protocol.
- Around line 74-97: The ObsLeafDescriptor class is missing a dtype field needed
to expose data type metadata for box leaves to adapters. Add a dtype field to
the class (appropriate type should be determined based on your dtype
representation strategy, likely Optional[str] with a default of "float32" for
box kind), then update the _validate method to enforce that dtype is present and
valid when kind equals "box", ensuring adapters can properly consume the dtype
contract from the descriptor schema.
In `@tests/test_action_space.py`:
- Around line 43-50: The generic `# type: ignore` comments are not suppressing
pyright-specific errors in the test functions
test_continuous_space_rejects_unknown_dtype and
test_continuous_space_forbids_extra_fields. Replace the `# type: ignore` comment
on the ContinuousSpace call at line 45 with a pyright-specific suppression for
the reportArgumentType error, and replace the `# type: ignore` comment on the
ContinuousSpace call at line 50 with a pyright-specific suppression for the
reportCallIssue error. Use the format `# pyright: ignore[error_code]` for each
respective error code.
In `@tests/test_env_params.py`:
- Around line 142-148: The test case on lines 147-148 in the
test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields function uses
kind="categorical" which is not a valid Literal value (only "box" and "discrete"
are valid). This causes type checking to fail before Pydantic validation runs.
Either use model_validate() with a dictionary to bypass type checking for this
negative test, or replace the invalid kind value with a valid literal ("box" or
"discrete") that still triggers the validation error you are testing for (likely
related to dim=1 being invalid for that kind).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 10cda741-a2d5-4271-a790-dac9d27e8a32
📒 Files selected for processing (9)
src/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.py
|
@coderabbitai review |
✅ Action performedReview finished.
|
29867a8 to
1072426
Compare
run() already treats a None return from select_action() as a stop signal, but the abstract method declared only tuple[int, dict[str, Any]]. Widen the return annotation to tuple[int, dict[str, Any]] | None and document the termination semantics so the declared contract matches run()'s usage.
CloudAIGymEnv.get_cached_trajectory_result(action) keys the trajectory cache on action alone. When a workload declares env_params (e.g. drop_rate) and the agent re-selects the same action under a different env_params sample, the cache returns a stale reward measured under a different env, silently invalidating any domain-randomization workflow. Four tests pin the contract; today two FAIL (bug-exposing), two PASS (back-compat sanity). All 27 pre-existing tests are untouched. FAIL test_cache_miss_when_env_params_differ FAIL test_step_reruns_workload_when_env_params_change PASS test_cache_hit_when_action_and_env_params_match PASS test_cache_hit_when_neither_has_env_params Driver for the follow-up PR that adds env_params as a first-class field on TestDefinition + TrajectoryEntry and rewrites the cache key as (action, env_params). Both unit and integration shapes are covered so the fix can be validated end-to-end through env.step().
…ge signature Satisfy ruff-format on the env_params cache-key TDD test.
1072426 to
a7a2c01
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (2)
185-201:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve sampled
current_env_paramsin the fallback restore path.When the output path is missing, Line 188 resets from
original_test_runand only restoresstep/output_path; by Line 201 the trajectory row can record empty env params even though a sample existed for this step.Proposed fix
else: self.test_run = copy.deepcopy(self.original_test_run) self.test_run.step = new_tr.step self.test_run.output_path = new_tr.output_path + self.test_run.current_env_params = dict(new_tr.current_env_params)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 185 - 201, When the fallback restore path is taken (when test output path does not exist), the code resets from original_test_run but only restores the step and output_path attributes from new_tr. This causes the current_env_params to be empty when writing the trajectory entry, even though new_tr contains sampled environment parameters. In the else block after line 188, after setting self.test_run.output_path from new_tr, also restore self.test_run.current_env_params from new_tr to preserve the sampled parameters for the trajectory record.
145-170:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConstraint-failure exits leave
env.csvandtrajectory.csvout of sync.At Line 146,
before_stepcan persist a row toenv.csv, but the early return at Line 170 skipswrite_trajectory, so the same trial step exists only in one file.Proposed fix
- if not self.test_run.test.constraint_check(self.test_run, self.runner.system): - logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + if not self.test_run.test.constraint_check(self.test_run, self.runner.system): + logging.info("Constraint check failed. Skipping step.") + observation = [-1.0] + reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=reward, + observation=observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, observation, reward) + return observation, reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The constraint-failure early return at the end of the code snippet skips the write_trajectory call, while the before_step observer has already potentially written to env.csv, causing the files to be out of sync. Before the early return statement in the constraint_check failure case, add a write_trajectory call with a TrajectoryEntry that mirrors the structure used in the cached result handling, including the current step, action, the constraint_failure reward, the current observation, and the current env_params dictionary, to ensure both csv files remain synchronized.
♻️ Duplicate comments (1)
src/cloudai/configurator/base_agent.py (1)
150-162:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStop the default run loop when
env.stepreturnsdone=True.Line 150 captures
done, but the loop never checks it. This keeps stepping after terminal state and can skew DSE trial flow and persisted step-aligned artifacts.Proposed fix
self.update_policy( { "trial_index": step, "value": reward, "observation": observation, "prev_observation": prev_observation, "action": action, "done": done, } ) + if done: + break logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}") return 0🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/base_agent.py` around lines 150 - 162, The loop captures the done flag from self.env.step(action) on line 150 but never checks it to terminate the loop when the environment reaches a terminal state. Add a check after the self.update_policy() call that breaks out of the loop when done is True. This ensures the default run loop stops stepping once the environment signals completion, preventing artifacts misalignment and incorrect trial flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/configurator/env_params.py`:
- Around line 66-70: The `EnvParamSpec._validate_weights()` method currently
only validates that weights are non-negative and have a positive sum, but it
does not reject non-finite values like infinity or NaN. Add validation to check
that all weights in self.weights are finite numbers before the existing positive
sum check. Raise a ValueError with a descriptive message if any non-finite
weights are detected, using a utility function like math.isfinite() to test each
weight value. This ensures malformed configurations are rejected at schema
validation time rather than failing later at runtime when random.choices() is
called.
In `@tests/test_cloudaigym.py`:
- Around line 567-615: Add a new test function alongside
test_env_csv_is_step_aligned_with_trajectory that validates env.csv and
trajectory.csv step alignment when a constraint failure causes early exit. This
test should follow a similar structure to the existing test but configure the
CloudAIGymEnv with constraint parameters that will be violated during step
execution, triggering an early termination. Verify that even when the
environment exits due to constraint failure, the step columns in env.csv and
trajectory.csv remain perfectly aligned (same number of rows, matching step
numbers), ensuring the corpus-friendly merge contract holds across both normal
execution and constraint-failure paths.
---
Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 185-201: When the fallback restore path is taken (when test output
path does not exist), the code resets from original_test_run but only restores
the step and output_path attributes from new_tr. This causes the
current_env_params to be empty when writing the trajectory entry, even though
new_tr contains sampled environment parameters. In the else block after line
188, after setting self.test_run.output_path from new_tr, also restore
self.test_run.current_env_params from new_tr to preserve the sampled parameters
for the trajectory record.
- Around line 145-170: The constraint-failure early return at the end of the
code snippet skips the write_trajectory call, while the before_step observer has
already potentially written to env.csv, causing the files to be out of sync.
Before the early return statement in the constraint_check failure case, add a
write_trajectory call with a TrajectoryEntry that mirrors the structure used in
the cached result handling, including the current step, action, the
constraint_failure reward, the current observation, and the current env_params
dictionary, to ensure both csv files remain synchronized.
---
Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 150-162: The loop captures the done flag from
self.env.step(action) on line 150 but never checks it to terminate the loop when
the environment reaches a terminal state. Add a check after the
self.update_policy() call that breaks out of the loop when done is True. This
ensures the default run loop stops stepping once the environment signals
completion, preventing artifacts misalignment and incorrect trial flow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 64a943dc-eea4-4d96-8190-7db0c7da3c25
📒 Files selected for processing (12)
src/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/cli/handlers.pysrc/cloudai/configurator/base_agent.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_handlers.py
CodeRabbit (Ruff PT018): `assert a and b` obscures which clause failed. Split the two `assert result is not None and result.step == 1` checks in the env_params cache-key tests into separate assertions for clearer pytest introspection. Test-only change, no behavior impact.
Domain randomization broke under the trajectory cache because the cache key was action-only: an agent revisiting the same action under a different env_params sample (e.g. drop_rate) got the *first* trial's reward served back, silently corrupting the reward signal. env.csv was also workload-owned and skipped on cache hits, so it was sparse and mis-aligned with trajectory.csv. This change makes env_params first-class in cloudai so the design works the same way for every agent (PPO, BO, GA, MAB) and every workload: - TestDefinition.env_params: dict[str, EnvParamSpec] - sibling to cmd_args, not part of the action space. - TestRun.current_env_params: dict[str, Any] - per-trial sample attached by the env so the cache key and downstream sinks see it. - TrajectoryEntry gains env_params; the cache key is now (action, current_env_params). - CloudAIGymEnv builds an EnvParamsObserver when env_params is declared. before_step() samples deterministically by (seed, name, trial), stashes the sample on test_run, and appends to env.csv *before* the cache lookup - so env.csv is written on every trial (cache hit or miss) and aligns 1:1 with trajectory.csv by step. - Workloads with no [env_params.*] block pay zero overhead: no observer, no env.csv, identical behaviour to today. Tests: turns the four TDD-red tests from the parent PR green and adds nine unit tests for EnvParamSpec / EnvParamsSampler / CsvSink / EnvParamsObserver plus two integration tests pinning the env.csv ↔ trajectory.csv contract.
Closes the most important gap in the env_params test matrix: the
cache-HIT path under an observer-driven TestDefinition.env_params.
The previous tests covered each axis separately - cache key unit tests
for env_params, a step()-level cache-miss integration with manually
set current_env_params, and a step-alignment test where all trials
were cache misses. None of them proved that a cache HIT still drives
the observer (and therefore env.csv) - which is precisely the
sparsity contract the fix was about.
The new test:
- declares env_params on the TestDefinition (real observer wired in),
- pre-seeds the trajectory with an entry whose action AND deterministic
env_params sample match what the observer will produce for step 1,
- calls env.step() once and asserts:
(a) runner.run is NOT called (cache short-circuit honored),
(b) env.csv gains a row (observer fired before the cache lookup),
(c) the new trajectory entry carries the per-trial env_params.
This is the end-to-end proof that the env.csv sparsity bug cannot
recur, and that env.csv ↔ trajectory.csv stay step-aligned on the
cache-hit path too.
…s files env_params.py and test_env_params.py were first authored in 2026; the copyright-header check derives the expected year from git history.
env_params are sampled only during DSE (by CloudAIGymEnv); on a non-DSE run they would be silently ignored. is_dse_job is a property of the fully prepped config, so validate it there rather than at parse time. Add validate_dse_env_params(test_scenario), which raises when a test run declares env_params while is_dse_job is False. Enforce it at run/dry-run dispatch (handle_dry_run_and_run, fail fast after setup) and mandatorily in verify_test_scenarios, so a config declaring env_params without a DSE sweep (list-valued cmd_args/extra_env_vars or num_nodes) is rejected instead of silently dropped.
a7a2c01 to
a7cb274
Compare
Sink env.csv from the same TrajectoryEntry as trajectory.csv inside
write_trajectory, instead of eagerly in the observer's before_step. A
constraint-failed step returns before write_trajectory, so it now records
no row in either file and the two stay step-aligned (a downstream
pd.merge(traj, env, on="step") loses no rows).
Also restore current_env_params after the run-result fallback in step():
that branch rebuilt test_run from the pristine copy and reset the sampled
env_params to {}, recording empty env_params in the trajectory entry and
weakening the (action, env_params) cache key.
EnvParamsObserver is reduced to sample-and-stash; CloudAIGymEnv owns
persistence of both files.
a7cb274 to
f44b1ea
Compare
random.choices treats inf/nan weights inconsistently (they slip past the existing non-negative + positive-sum checks), so validate finiteness in EnvParamSpec and reject inf/nan/-inf. Also add a match= to the CsvSink zero-step pytest.raises for clarity (CodeRabbit, NVIDIA#901/NVIDIA#928).
Introduce ContinuousSpace, a closed-interval [low, high] action-space
dimension with a dtype ("int" | "float") that declares how consumers
should quantize decoded samples. CloudAI action spaces can now express
continuous tunables alongside discrete (list) candidate domains; agents
and adapters read low/high/dtype to build their own representation.
Validated low < high; exported via cloudai.core.
…ejection tests These negative tests pass invalid dtype/extra kwargs to assert pydantic ValidationError; mark the deliberate type violations with type: ignore.
f44b1ea to
a85f886
Compare
GymnasiumAdapter._decode_continuous clamps to [low, high] then rounds, so a dtype="int" space with non-integer bounds (e.g. [0.2, 0.8]) can decode to a value outside the declared range (1). Validate that int-quantized spaces declare integer bounds (CodeRabbit, NVIDIA#932/NVIDIA#927).
…otocol Add ObsLeafDescriptor (a self-describing observation leaf: "box" of width dim, or "discrete" of size n) and a StructuredObservation Protocol that documents the optional env hooks structured_observation_descriptors() and encode_observation(). These let an env expose a named, per-leaf observation so adapters (e.g. GymnasiumAdapter) can build the matching gymnasium spaces.Dict; the hooks are duck-typed, so envs need not subclass. Both exported via cloudai.core.
…ejection tests Negative tests pass an extra kwarg and an out-of-Literal kind to assert ValidationError; mark the deliberate type violations with type: ignore.
a85f886 to
24b8b99
Compare
Issue
Fix
ObsLeafDescriptor(kind="box"|"discrete", dim, n)and aStructuredObservationProtocol documenting the optional, duck-typed env hooksstructured_observation_descriptors()/encode_observation(). Adapters read these to build a matchinggymnasium.spaces.Dict; envs need not subclass. Both exported viacloudai.core.Testing
tests/test_env_params.py(+3 tests): box defaults,discreterequiresn >= 1, bad-dim / extra-field / unknown-kind rejection. ruff + pyright + vulture clean.Stack: #901 ← #927 (ContinuousSpace) ← this. Second of the gymnasium-adapter upstreaming; PR3 ports the adapter that consumes both primitives.