Skip to content

feat(configurator): add ObsLeafDescriptor + structured-observation protocol#928

Open
rutayan-nv wants to merge 17 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/obs-leaf-descriptor
Open

feat(configurator): add ObsLeafDescriptor + structured-observation protocol#928
rutayan-nv wants to merge 17 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/obs-leaf-descriptor

Conversation

@rutayan-nv

Copy link
Copy Markdown
Contributor

Issue

  • CloudAI envs only expose a flat observation vector; there is no way to describe a named, per-leaf observation so an adapter can build a structured (Dict) observation space.

Fix

  • Add ObsLeafDescriptor(kind="box"|"discrete", dim, n) and a StructuredObservation Protocol documenting the optional, duck-typed env hooks structured_observation_descriptors() / encode_observation(). Adapters read these to build a matching gymnasium.spaces.Dict; envs need not subclass. Both exported via cloudai.core.

Testing

  • tests/test_env_params.py (+3 tests): box defaults, discrete requires n >= 1, bad-dim / extra-field / unknown-kind rejection. ruff + pyright + vulture clean.

Stack: #901#927 (ContinuousSpace) ← this. Second of the gymnasium-adapter upstreaming; PR3 ports the adapter that consumes both primitives.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds DSE domain-randomization infrastructure: a ContinuousSpace action-space model, EnvParamSpec/EnvParamsSampler/CsvSink/EnvParamsObserver primitives in env_params.py, observer hooks in CloudAIGymEnv.step() and cache-key logic, a default BaseAgent.run() loop with termination control via select_action() returning None, and structured failure reporting in handle_dse_job via _record_run_failure and dse_failure.txt.

Changes

DSE Domain Randomization and Error Handling

Layer / File(s) Summary
Action space and env-params schemas and infrastructure
src/cloudai/_core/action_space.py, src/cloudai/configurator/env_params.py
Defines ContinuousSpace for action-space bounds with strict validation (low < high), EnvParamSpec for categorical parameter specs with optional validated weights, ObsLeafDescriptor for observation leaf structure, StructuredObservation protocol for observation encoding, EnvParamsSampler for deterministic per-trial categorical sampling with RNG isolation, EnvParamsSink protocol and CsvSink for step-aligned CSV persistence, and StepObserver protocol with EnvParamsObserver for per-step sampling and recording.
TestRun and TestDefinition extensions for env-params
src/cloudai/_core/test_scenario.py, src/cloudai/models/workload.py
Extends TestRun with current_env_params: dict[str, Any] field and increment_step() -> int method. Extends TestDefinition with env_params: dict[str, EnvParamSpec] field defaulting to empty dict, documenting sampling per trial, persistence to env.csv, and trajectory cache-key incorporation.
Public type exports via core.py
src/cloudai/core.py
Re-exports ContinuousSpace, ObsLeafDescriptor, and StructuredObservation from internal modules by adding imports and updating the __all__ list.
Test coverage for action-space and env-params primitives
tests/test_action_space.py, tests/test_env_params.py, tests/test_test_scenario.py
Comprehensive pytest coverage for ContinuousSpace dtype defaults and bounds validation, EnvParamSpec weight validation, EnvParamsSampler determinism and per-parameter independence, CsvSink file I/O and step validation, EnvParamsObserver current_env_params updates and sink persistence, ObsLeafDescriptor box/discrete validation rules, and TestRun.increment_step() monotonic advancement.
CloudAIGymEnv observer wiring, TrajectoryEntry env_params, and cache key change
src/cloudai/configurator/cloudai_gym.py
Adds env_params field to TrajectoryEntry, wires EnvParamsObserver initialization and before_step/after_step hooks into CloudAIGymEnv.step() for both cached and non-cached branches, increments test run step before cache lookup, and updates get_cached_trajectory_result to require exact match of both action and current_env_params with back-compat for missing env_params.
Test coverage for CloudAIGymEnv observer integration and cache behavior
tests/test_cloudaigym.py
Tests step advancement before output_path computation, cached trajectory row step coupling to advance ordering, cache-miss when env_params differ, cache-hit when action and current_env_params match, back-compat cache-hit when neither has env_params, workload re-execution on env_params change, env.csv/trajectory.csv step alignment, observer execution on cache-hit with declared env_params, and observer absence when env_params not declared.
BaseAgent.run() default exploration loop
src/cloudai/configurator/base_agent.py
Updates abstract select_action signature to optionally return None for early termination. Adds logging import and concrete run() method that resets the environment, iterates select_action/env.step/update_policy up to max_steps, breaks when select_action returns None, logs step progress, and returns 0 on completion. Docstring defines failure contract distinguishing recoverable return codes from unexpected exceptions.
handle_dse_job refactor: agent.run() delegation and failure reporting
src/cloudai/cli/handlers.py
Refactors handle_dse_job to instantiate the configured agent and call agent.run() per TestRun, accumulates recoverable non-zero return codes via bitwise OR while capturing unexpected exceptions in run_error. Adds _record_run_failure helper to write dse_failure.txt with formatted traceback and OSError handling. Updates generate_reports signature to accept optional error parameter and calls _record_run_failure before normal report generation. Adds validate_dse_env_params to reject non-DSE test runs declaring env_params, wired into handle_dry_run_and_run and verify_test_scenarios.
Test coverage for handle_dse_job, validate_dse_env_params, and error reporting
tests/test_handlers.py
Introduces CustomRunStubAgent with configurable run() behavior, custom_run_agent_name fixture for agent registry management, and comprehensive tests covering agent.run() invocation, non-zero return code propagation and accumulation, exception propagation as hard failures aborting remaining runs, dse_failure.txt generation on hard-fail with exception type/message, and validate_dse_env_params/verify_test_scenarios enforcement that env_params are allowed only for DSE jobs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • srivatsankrishnan
  • jeffnvidia
  • podkidyshev

🐰 Hoppity-hop through the action space wide,
env_params sampled per trial — what a ride!
Observers before_step, CsvSink writes the row,
Cache keys match action AND env, don't you know.
On hard fail, dse_failure.txt tells the tale —
A rabbit reports errors without fail! 🌿

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the primary change: adding ObsLeafDescriptor and StructuredObservation protocol for structured observation support, which is the core feature of this changeset.
Description check ✅ Passed The PR description clearly explains the issue (flat observation vectors lack structure), the fix (new descriptor and protocol with env hooks), testing approach, and context within the upstreaming stack. It is directly related to the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

145-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure path breaks env.csvtrajectory.csv step alignment.

Line 146 triggers observer.before_step(...) (which persists env.csv) before constraints are checked, but Line 170 returns early without appending a trajectory row. For env-param-enabled runs, this creates unmatched env.csv rows.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The
constraint check failure path breaks the alignment between env.csv and
trajectory.csv because observer.before_step() is called before the constraint
validation, writing an env.csv entry that never gets a matching trajectory.csv
row when the function returns early. Move the constraint check (the if not
self.test_run.test.constraint_check() block) to execute before calling
observer.before_step() at the beginning of the step. This ensures that if
constraints fail, no env.csv entry is written, keeping both CSV files aligned.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The run loop captures the `done` flag from the
env.step(action) call but never checks it, causing the loop to continue
executing after the environment signals termination. Add a conditional check
after the update_policy call to break out of the loop when `done` evaluates to
True, ensuring the agent stops running trials when the episode terminates as
intended.

In `@src/cloudai/configurator/env_params.py`:
- Around line 87-97: The ObsLeafDescriptor class is missing a dtype field to
expose the box data type contract. Add an optional dtype field to
ObsLeafDescriptor that defaults to "float32" for box leaves, allowing adapters
to consume dtype metadata from the descriptor schema. Update the _validate
method to ensure dtype is appropriately set for box-type leaves and validate
that the dtype value is valid when specified.
- Around line 113-115: The `encode_observation()` method signature in
StructuredObservation diverges from the documented public contract. The
documented contract specifies a zero-arg hook, but the current implementation
includes an `observation: list` parameter. Remove the `observation` parameter
from the `encode_observation()` method signature to align it with the documented
public contract and ensure compatibility with duck-typed environments and
adapters that expect a zero-arg method.
- Line 2: The copyright header on line 2 of
src/cloudai/configurator/env_params.py does not match the CI header-check test
expectations for the year format. Update the copyright year string in the header
from the current "2024-2026" format to match the exact year string that the
repository's header-check test expects for this file (which should be "2026"
based on the CI requirement). Ensure the format precisely matches what the
enforced header check validates.

In `@tests/test_action_space.py`:
- Around line 43-50: Add type-check suppressions to the test functions that
intentionally pass invalid arguments to validate runtime error handling. For the
test_continuous_space_rejects_unknown_dtype function, suppress the type-check
error on the line where ContinuousSpace is instantiated with the invalid
"double" dtype argument. For the test_continuous_space_forbids_extra_fields
function, suppress the type-check error on the line where ContinuousSpace is
instantiated with the unexpected step parameter. These suppressions allow the
runtime validation assertions to execute without being blocked by static type
checkers like pyright, which would otherwise reject these intentionally invalid
arguments.

In `@tests/test_env_params.py`:
- Line 2: The copyright header on line 2 contains an incorrect year range that
fails the CI header check. Update the copyright year in the NVIDIA CORPORATION &
AFFILIATES copyright line from the current range (2025-2026) to only 2026 to
match the CI expectation and unblock the header validation.
- Around line 143-148: Replace the three direct ObsLeafDescriptor constructor
calls with model_validate() calls to satisfy pyright type checking. For the
first validation test (with the dim=0 case and match pattern), call
model_validate() instead of the constructor directly. For the second test (with
the unexpected=1 parameter), similarly use model_validate(). For the third test,
use model_validate() AND correct the test data from kind="categorical" to a
valid literal value that ObsLeafDescriptor accepts, which should be either "box"
or "discrete" based on the kind validation constraints.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 145-170: The constraint check failure path breaks the alignment
between env.csv and trajectory.csv because observer.before_step() is called
before the constraint validation, writing an env.csv entry that never gets a
matching trajectory.csv row when the function returns early. Move the constraint
check (the if not self.test_run.test.constraint_check() block) to execute before
calling observer.before_step() at the beginning of the step. This ensures that
if constraints fail, no env.csv entry is written, keeping both CSV files
aligned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ed2eee75-0fae-450f-a244-ee6a41e8986a

📥 Commits

Reviewing files that changed from the base of the PR and between 1b0e8cc and 7c6a1ab.

📒 Files selected for processing (13)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

Comment on lines +149 to +161
observation, reward, done, *_ = self.env.step(action)
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
return 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor terminal done in the default run loop.

Line 149 receives done from env.step(action), but the loop continues regardless. This can execute extra trials after termination and skew agent behavior.

Proposed fix
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
             logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+            if done:
+                break
         return 0
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
observation, reward, done, *_ = self.env.step(action)
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
return 0
observation, reward, done, *_ = self.env.step(action)
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
if done:
break
return 0
🧰 Tools
🪛 Ruff (0.15.17)

[warning] 160-160: Logging statement uses f-string

(G004)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 149 - 161, The run loop
captures the `done` flag from the env.step(action) call but never checks it,
causing the loop to continue executing after the environment signals
termination. Add a conditional check after the update_policy call to break out
of the loop when `done` evaluates to True, ensuring the agent stops running
trials when the episode terminates as intended.

Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/env_params.py
Comment thread src/cloudai/configurator/env_params.py
Comment thread tests/test_action_space.py Outdated
Comment thread tests/test_env_params.py Outdated
Comment thread tests/test_env_params.py Outdated
…ole mutator

The trial index lives on TestRun and is advanced exactly once per
CloudAIGymEnv.step(), before output_path and the trajectory row are computed.
Step artifacts and trajectory rows are therefore tagged with the advanced
index rather than the pre-step value.
handle_dse_job calls agent.run() uniformly; agents (e.g. RLlib-based) implement
their own training loop in run(). Failure handling follows a hybrid contract: a
non-zero rc accumulates via `err |= rc` and the sweep continues, while an
unexpected exception hard-fails the scenario after reports are generated and the
aborting error is documented.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

145-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure early return skips trajectory persistence and after_step hooks.

After Line 145-147, observers run before_step. But Line 168-170 returns immediately on failed constraints, so no trajectory row is written and after_step is never called. With env-param observers, this can leave env.csv and trajectory.csv out of sync for that step.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The
constraint-failure return path at lines 168-170 skips calling write_trajectory
and observer.after_step, creating a mismatch with the before_step call at line
146-147. Before the early return statement that handles the failed constraint
check, add a trajectory entry write using write_trajectory with appropriate
step, action, and constraint-failure reward values, then call
observer.after_step for all observers in self.observers (similar to the pattern
used in the cached_result branch at lines 164-169) to ensure trajectory
persistence and observer symmetry are maintained.
♻️ Duplicate comments (7)
src/cloudai/configurator/env_params.py (3)

113-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

StructuredObservation signatures diverge from the declared public contract.

Line 113-115 currently uses Optional[...] for descriptors and requires an observation arg in encode_observation, but the documented contract is zero-arg hooks returning a descriptor dict and encoded leaves.

Proposed fix
-    def structured_observation_descriptors(self) -> Optional[Dict[str, ObsLeafDescriptor]]: ...
-    def encode_observation(self, observation: list) -> Dict[str, Any]: ...
+    def structured_observation_descriptors(self) -> Dict[str, ObsLeafDescriptor]: ...
+    def encode_observation(self) -> Dict[str, Any]: ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 113 - 115, The method
signatures for structured_observation_descriptors and encode_observation in the
StructuredObservation class do not match the documented public contract. Fix
structured_observation_descriptors to return Dict[str, ObsLeafDescriptor]
(removing the Optional wrapper) to indicate it always returns a descriptor dict,
and remove the observation parameter from encode_observation to make it a
zero-arg hook that returns encoded leaves, aligning both methods with the
declared contract for these zero-arg hook methods.

2-2: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the SPDX copyright year string to unblock CI.

Line 2 still uses 2024-2026, but CI expects the exact 2026 string for this file.

Proposed fix
-# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Based on learnings, header year formatting must match the repository’s enforced checker exactly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` at line 2, The copyright header on
line 2 uses the year range "2024-2026" but the CI checker requires the exact
year "2026" for this file. Update the copyright year in the first file header
comment from the range format to just "2026" to match the repository's enforced
header format checker requirements.

Sources: Learnings, Pipeline failures


87-97: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

ObsLeafDescriptor does not expose the documented box dtype contract.

Line 87-90 defines kind/dim/n only; the PR objective says box leaves default to float32. Without a descriptor dtype, adapters cannot consume that schema metadata.

Proposed fix
 class ObsLeafDescriptor(BaseModel):
@@
     kind: Literal["box", "discrete"]
     dim: int = 1
     n: Optional[int] = None
+    dtype: Literal["float32"] = "float32"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 87 - 97, The
ObsLeafDescriptor class is missing a dtype field that should be exposed as part
of the schema metadata. Add a dtype field to ObsLeafDescriptor alongside the
existing kind, dim, and n fields, with a default value of float32 for box type
observations, so that adapters can access the data type information from the
descriptor schema.
tests/test_action_space.py (1)

45-45: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unblock pyright for intentionally-invalid constructor tests.

Line 45 and Line 50 are valid runtime-validation tests, but static typing rejects them and currently breaks CI lint.

Proposed fix
+from typing import Any, cast
@@
 def test_continuous_space_rejects_unknown_dtype() -> None:
     with pytest.raises(ValidationError):
-        ContinuousSpace(low=0.0, high=1.0, dtype="double")
+        ContinuousSpace(low=0.0, high=1.0, dtype=cast(Any, "double"))
@@
 def test_continuous_space_forbids_extra_fields() -> None:
     with pytest.raises(ValidationError):
-        ContinuousSpace(low=0.0, high=1.0, step=0.1)
+        ContinuousSpace(low=0.0, high=1.0, **cast(Any, {"step": 0.1}))

Also applies to: 50-50

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_action_space.py` at line 45, The ContinuousSpace constructor calls
at lines 45 and 50 in tests/test_action_space.py are intentionally-invalid
runtime validation tests but violate static typing contracts checked by pyright,
causing CI lint failures. Add a pyright ignore directive comment to each of
these lines (both the line 45 ContinuousSpace call with dtype="double" and the
line 50 similar construct) to allow the static type checker to skip validation
while preserving the runtime validation test behavior.

Source: Pipeline failures

tests/test_env_params.py (2)

146-148: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use model_validate(...) for these negative constructor tests so pyright passes.

Line 146 and Line 148 intentionally pass invalid payloads, but direct ObsLeafDescriptor(...) calls are rejected by the typed constructor before runtime validation, causing CI lint failure. Keep the invalid test data, but route it through model_validate({...}).

Proposed fix
 def test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields() -> None:
     with pytest.raises(ValidationError, match="dim must be"):
         ObsLeafDescriptor(kind="box", dim=0)
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="box", dim=1, unexpected=1)
+        ObsLeafDescriptor.model_validate({"kind": "box", "dim": 1, "unexpected": 1})
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="categorical", dim=1)
+        ObsLeafDescriptor.model_validate({"kind": "categorical", "dim": 1})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` around lines 146 - 148, The direct constructor
calls to ObsLeafDescriptor on lines 146 and 148 are rejected by the type checker
before runtime validation can occur, causing CI lint failures. Convert both
constructor calls to use model_validate({...}) instead, wrapping the invalid
test data (the keyword arguments) into a dictionary payload. This allows the
invalid data to pass through type checking and be caught by the runtime
validation in the pytest.raises blocks.

Source: Pipeline failures


2-2: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the copyright year on Line 2 to unblock CI.

Line 2 still uses 2025-2026, but CI expects this file’s second header line to be 2026 only.

Proposed fix
-# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Based on learnings, copyright year formatting must match the repository’s header-check logic in tests/test_check_copyright_headers.py.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` at line 2, Update the copyright year in the header
line to match the CI validation requirements. Change the copyright year format
from the current `2025-2026` to `2026` only in the copyright header comment.
This aligns with the copyright header check logic that validates the formatting
across the repository, ensuring CI will pass the header validation.

Sources: Learnings, Pipeline failures

src/cloudai/configurator/base_agent.py (1)

149-161: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor terminal done in the default run loop.

Line 149 reads done, but the loop continues through max_steps. This can run extra steps after episode termination and feed invalid transitions into update_policy.

Proposed fix
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
             logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+            if done:
+                break
         return 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 149 - 161, The run loop
in the base_agent.py file reads the done flag from the environment but does not
honor it to exit early. After calling update_policy with the step data (which
includes the done flag), add a check: if done is True, break out of the loop
immediately. This prevents extra steps from being executed after the episode has
terminated and avoids feeding invalid transitions into update_policy beyond the
natural episode boundary.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 145-170: The constraint-failure return path at lines 168-170 skips
calling write_trajectory and observer.after_step, creating a mismatch with the
before_step call at line 146-147. Before the early return statement that handles
the failed constraint check, add a trajectory entry write using write_trajectory
with appropriate step, action, and constraint-failure reward values, then call
observer.after_step for all observers in self.observers (similar to the pattern
used in the cached_result branch at lines 164-169) to ensure trajectory
persistence and observer symmetry are maintained.

---

Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The run loop in the base_agent.py file reads the done
flag from the environment but does not honor it to exit early. After calling
update_policy with the step data (which includes the done flag), add a check: if
done is True, break out of the loop immediately. This prevents extra steps from
being executed after the episode has terminated and avoids feeding invalid
transitions into update_policy beyond the natural episode boundary.

In `@src/cloudai/configurator/env_params.py`:
- Around line 113-115: The method signatures for
structured_observation_descriptors and encode_observation in the
StructuredObservation class do not match the documented public contract. Fix
structured_observation_descriptors to return Dict[str, ObsLeafDescriptor]
(removing the Optional wrapper) to indicate it always returns a descriptor dict,
and remove the observation parameter from encode_observation to make it a
zero-arg hook that returns encoded leaves, aligning both methods with the
declared contract for these zero-arg hook methods.
- Line 2: The copyright header on line 2 uses the year range "2024-2026" but the
CI checker requires the exact year "2026" for this file. Update the copyright
year in the first file header comment from the range format to just "2026" to
match the repository's enforced header format checker requirements.
- Around line 87-97: The ObsLeafDescriptor class is missing a dtype field that
should be exposed as part of the schema metadata. Add a dtype field to
ObsLeafDescriptor alongside the existing kind, dim, and n fields, with a default
value of float32 for box type observations, so that adapters can access the data
type information from the descriptor schema.

In `@tests/test_action_space.py`:
- Line 45: The ContinuousSpace constructor calls at lines 45 and 50 in
tests/test_action_space.py are intentionally-invalid runtime validation tests
but violate static typing contracts checked by pyright, causing CI lint
failures. Add a pyright ignore directive comment to each of these lines (both
the line 45 ContinuousSpace call with dtype="double" and the line 50 similar
construct) to allow the static type checker to skip validation while preserving
the runtime validation test behavior.

In `@tests/test_env_params.py`:
- Around line 146-148: The direct constructor calls to ObsLeafDescriptor on
lines 146 and 148 are rejected by the type checker before runtime validation can
occur, causing CI lint failures. Convert both constructor calls to use
model_validate({...}) instead, wrapping the invalid test data (the keyword
arguments) into a dictionary payload. This allows the invalid data to pass
through type checking and be caught by the runtime validation in the
pytest.raises blocks.
- Line 2: Update the copyright year in the header line to match the CI
validation requirements. Change the copyright year format from the current
`2025-2026` to `2026` only in the copyright header comment. This aligns with the
copyright header check logic that validates the formatting across the
repository, ensuring CI will pass the header validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4dd48574-dda1-4f02-8138-c5510caa80a9

📥 Commits

Reviewing files that changed from the base of the PR and between 7c6a1ab and 5657f1e.

📒 Files selected for processing (13)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

@rutayan-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from 5657f1e to 29867a8 Compare June 16, 2026 15:06

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
src/cloudai/configurator/env_params.py (2)

100-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

StructuredObservation.encode_observation signature diverges from the documented contract.

The PR objective documents a zero-arg hook encode_observation() -> dict[str, Any], but Line 115 includes an observation: list parameter. This breaks the duck-typing contract for environments implementing the protocol.

Proposed fix
-    def encode_observation(self, observation: list) -> Dict[str, Any]: ...
+    def encode_observation(self) -> Dict[str, Any]: ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 100 - 115, The
encode_observation method signature in the StructuredObservation Protocol class
on line 115 includes an observation: list parameter, which diverges from the
documented contract that specifies a zero-argument method. Remove the
observation: list parameter from the encode_observation method definition to
align with the documented contract and maintain proper duck-typing compatibility
with environments implementing this protocol.

74-97: ⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

ObsLeafDescriptor is missing the box dtype contract.

The PR objective states box leaves default to float32, but this model exposes no dtype field, so adapters cannot consume dtype metadata from the descriptor schema. Add a dtype field with appropriate validation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 74 - 97, The
ObsLeafDescriptor class is missing a dtype field needed to expose data type
metadata for box leaves to adapters. Add a dtype field to the class (appropriate
type should be determined based on your dtype representation strategy, likely
Optional[str] with a default of "float32" for box kind), then update the
_validate method to enforce that dtype is present and valid when kind equals
"box", ensuring adapters can properly consume the dtype contract from the
descriptor schema.
tests/test_env_params.py (1)

142-148: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix negative-constructor tests to satisfy type checking.

Line 148 uses kind="categorical" which is not a valid Literal["box", "discrete"] value. This fails type checking before Pydantic validation runs. Use model_validate() for negative tests or choose a valid literal value that triggers the intended validation error.

Proposed fix
     with pytest.raises(ValidationError, match="dim must be"):
         ObsLeafDescriptor(kind="box", dim=0)
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="box", dim=1, unexpected=1)  # type: ignore
+        ObsLeafDescriptor.model_validate({"kind": "box", "dim": 1, "unexpected": 1})
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="categorical", dim=1)  # type: ignore
+        ObsLeafDescriptor.model_validate({"kind": "invalid", "dim": 1})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` around lines 142 - 148, The test case on lines
147-148 in the test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields
function uses kind="categorical" which is not a valid Literal value (only "box"
and "discrete" are valid). This causes type checking to fail before Pydantic
validation runs. Either use model_validate() with a dictionary to bypass type
checking for this negative test, or replace the invalid kind value with a valid
literal ("box" or "discrete") that still triggers the validation error you are
testing for (likely related to dim=1 being invalid for that kind).
tests/test_action_space.py (1)

43-50: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Pyright errors still present (duplicate of past review).

The # type: ignore comments don't suppress pyright-specific errors. CI is still failing with reportArgumentType on line 45 and reportCallIssue on line 50. Use pyright-specific suppressions as noted in the previous review.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_action_space.py` around lines 43 - 50, The generic `# type:
ignore` comments are not suppressing pyright-specific errors in the test
functions test_continuous_space_rejects_unknown_dtype and
test_continuous_space_forbids_extra_fields. Replace the `# type: ignore` comment
on the ContinuousSpace call at line 45 with a pyright-specific suppression for
the reportArgumentType error, and replace the `# type: ignore` comment on the
ContinuousSpace call at line 50 with a pyright-specific suppression for the
reportCallIssue error. Use the format `# pyright: ignore[error_code]` for each
respective error code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_env_params.py`:
- Around line 87-88: The pytest.raises context manager for ValueError at the
sink.write call lacks a match parameter. Add a match parameter with an
appropriate regex pattern to the pytest.raises(ValueError) call that matches the
expected error message thrown when sink.write is called with drop_rate of 0.0,
which will improve test specificity and provide better failure diagnostics.

---

Duplicate comments:
In `@src/cloudai/configurator/env_params.py`:
- Around line 100-115: The encode_observation method signature in the
StructuredObservation Protocol class on line 115 includes an observation: list
parameter, which diverges from the documented contract that specifies a
zero-argument method. Remove the observation: list parameter from the
encode_observation method definition to align with the documented contract and
maintain proper duck-typing compatibility with environments implementing this
protocol.
- Around line 74-97: The ObsLeafDescriptor class is missing a dtype field needed
to expose data type metadata for box leaves to adapters. Add a dtype field to
the class (appropriate type should be determined based on your dtype
representation strategy, likely Optional[str] with a default of "float32" for
box kind), then update the _validate method to enforce that dtype is present and
valid when kind equals "box", ensuring adapters can properly consume the dtype
contract from the descriptor schema.

In `@tests/test_action_space.py`:
- Around line 43-50: The generic `# type: ignore` comments are not suppressing
pyright-specific errors in the test functions
test_continuous_space_rejects_unknown_dtype and
test_continuous_space_forbids_extra_fields. Replace the `# type: ignore` comment
on the ContinuousSpace call at line 45 with a pyright-specific suppression for
the reportArgumentType error, and replace the `# type: ignore` comment on the
ContinuousSpace call at line 50 with a pyright-specific suppression for the
reportCallIssue error. Use the format `# pyright: ignore[error_code]` for each
respective error code.

In `@tests/test_env_params.py`:
- Around line 142-148: The test case on lines 147-148 in the
test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields function uses
kind="categorical" which is not a valid Literal value (only "box" and "discrete"
are valid). This causes type checking to fail before Pydantic validation runs.
Either use model_validate() with a dictionary to bypass type checking for this
negative test, or replace the invalid kind value with a valid literal ("box" or
"discrete") that still triggers the validation error you are testing for (likely
related to dim=1 being invalid for that kind).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 10cda741-a2d5-4271-a790-dac9d27e8a32

📥 Commits

Reviewing files that changed from the base of the PR and between 5657f1e and 29867a8.

📒 Files selected for processing (9)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py

Comment thread tests/test_env_params.py Outdated
@rutayan-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from 29867a8 to 1072426 Compare June 16, 2026 21:17
run() already treats a None return from select_action() as a stop signal, but
the abstract method declared only tuple[int, dict[str, Any]]. Widen the return
annotation to tuple[int, dict[str, Any]] | None and document the termination
semantics so the declared contract matches run()'s usage.
CloudAIGymEnv.get_cached_trajectory_result(action) keys the trajectory
cache on action alone. When a workload declares env_params (e.g.
drop_rate) and the agent re-selects the same action under a different
env_params sample, the cache returns a stale reward measured under a
different env, silently invalidating any domain-randomization workflow.

Four tests pin the contract; today two FAIL (bug-exposing), two PASS
(back-compat sanity). All 27 pre-existing tests are untouched.

  FAIL  test_cache_miss_when_env_params_differ
  FAIL  test_step_reruns_workload_when_env_params_change
  PASS  test_cache_hit_when_action_and_env_params_match
  PASS  test_cache_hit_when_neither_has_env_params

Driver for the follow-up PR that adds env_params as a first-class field
on TestDefinition + TrajectoryEntry and rewrites the cache key as
(action, env_params). Both unit and integration shapes are covered so
the fix can be validated end-to-end through env.step().
…ge signature

Satisfy ruff-format on the env_params cache-key TDD test.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from 1072426 to a7a2c01 Compare June 16, 2026 22:15

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (2)

185-201: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve sampled current_env_params in the fallback restore path.

When the output path is missing, Line 188 resets from original_test_run and only restores step/output_path; by Line 201 the trajectory row can record empty env params even though a sample existed for this step.

Proposed fix
         else:
             self.test_run = copy.deepcopy(self.original_test_run)
             self.test_run.step = new_tr.step
             self.test_run.output_path = new_tr.output_path
+            self.test_run.current_env_params = dict(new_tr.current_env_params)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 185 - 201, When the
fallback restore path is taken (when test output path does not exist), the code
resets from original_test_run but only restores the step and output_path
attributes from new_tr. This causes the current_env_params to be empty when
writing the trajectory entry, even though new_tr contains sampled environment
parameters. In the else block after line 188, after setting
self.test_run.output_path from new_tr, also restore
self.test_run.current_env_params from new_tr to preserve the sampled parameters
for the trajectory record.

145-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure exits leave env.csv and trajectory.csv out of sync.

At Line 146, before_step can persist a row to env.csv, but the early return at Line 170 skips write_trajectory, so the same trial step exists only in one file.

Proposed fix
-        if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
-            logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+        if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
+            logging.info("Constraint check failed. Skipping step.")
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The
constraint-failure early return at the end of the code snippet skips the
write_trajectory call, while the before_step observer has already potentially
written to env.csv, causing the files to be out of sync. Before the early return
statement in the constraint_check failure case, add a write_trajectory call with
a TrajectoryEntry that mirrors the structure used in the cached result handling,
including the current step, action, the constraint_failure reward, the current
observation, and the current env_params dictionary, to ensure both csv files
remain synchronized.
♻️ Duplicate comments (1)
src/cloudai/configurator/base_agent.py (1)

150-162: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop the default run loop when env.step returns done=True.

Line 150 captures done, but the loop never checks it. This keeps stepping after terminal state and can skew DSE trial flow and persisted step-aligned artifacts.

Proposed fix
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
+            if done:
+                break
             logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
         return 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 150 - 162, The loop
captures the done flag from self.env.step(action) on line 150 but never checks
it to terminate the loop when the environment reaches a terminal state. Add a
check after the self.update_policy() call that breaks out of the loop when done
is True. This ensures the default run loop stops stepping once the environment
signals completion, preventing artifacts misalignment and incorrect trial flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/env_params.py`:
- Around line 66-70: The `EnvParamSpec._validate_weights()` method currently
only validates that weights are non-negative and have a positive sum, but it
does not reject non-finite values like infinity or NaN. Add validation to check
that all weights in self.weights are finite numbers before the existing positive
sum check. Raise a ValueError with a descriptive message if any non-finite
weights are detected, using a utility function like math.isfinite() to test each
weight value. This ensures malformed configurations are rejected at schema
validation time rather than failing later at runtime when random.choices() is
called.

In `@tests/test_cloudaigym.py`:
- Around line 567-615: Add a new test function alongside
test_env_csv_is_step_aligned_with_trajectory that validates env.csv and
trajectory.csv step alignment when a constraint failure causes early exit. This
test should follow a similar structure to the existing test but configure the
CloudAIGymEnv with constraint parameters that will be violated during step
execution, triggering an early termination. Verify that even when the
environment exits due to constraint failure, the step columns in env.csv and
trajectory.csv remain perfectly aligned (same number of rows, matching step
numbers), ensuring the corpus-friendly merge contract holds across both normal
execution and constraint-failure paths.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 185-201: When the fallback restore path is taken (when test output
path does not exist), the code resets from original_test_run but only restores
the step and output_path attributes from new_tr. This causes the
current_env_params to be empty when writing the trajectory entry, even though
new_tr contains sampled environment parameters. In the else block after line
188, after setting self.test_run.output_path from new_tr, also restore
self.test_run.current_env_params from new_tr to preserve the sampled parameters
for the trajectory record.
- Around line 145-170: The constraint-failure early return at the end of the
code snippet skips the write_trajectory call, while the before_step observer has
already potentially written to env.csv, causing the files to be out of sync.
Before the early return statement in the constraint_check failure case, add a
write_trajectory call with a TrajectoryEntry that mirrors the structure used in
the cached result handling, including the current step, action, the
constraint_failure reward, the current observation, and the current env_params
dictionary, to ensure both csv files remain synchronized.

---

Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 150-162: The loop captures the done flag from
self.env.step(action) on line 150 but never checks it to terminate the loop when
the environment reaches a terminal state. Add a check after the
self.update_policy() call that breaks out of the loop when done is True. This
ensures the default run loop stops stepping once the environment signals
completion, preventing artifacts misalignment and incorrect trial flow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 64a943dc-eea4-4d96-8190-7db0c7da3c25

📥 Commits

Reviewing files that changed from the base of the PR and between 1072426 and a7a2c01.

📒 Files selected for processing (12)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py

Comment thread src/cloudai/configurator/env_params.py
Comment thread tests/test_cloudaigym.py
CodeRabbit (Ruff PT018): `assert a and b` obscures which clause failed.
Split the two `assert result is not None and result.step == 1` checks in the
env_params cache-key tests into separate assertions for clearer pytest
introspection. Test-only change, no behavior impact.
Domain randomization broke under the trajectory cache because the cache
key was action-only: an agent revisiting the same action under a
different env_params sample (e.g. drop_rate) got the *first* trial's
reward served back, silently corrupting the reward signal. env.csv was
also workload-owned and skipped on cache hits, so it was sparse and
mis-aligned with trajectory.csv.

This change makes env_params first-class in cloudai so the design works
the same way for every agent (PPO, BO, GA, MAB) and every workload:

- TestDefinition.env_params: dict[str, EnvParamSpec] - sibling to
  cmd_args, not part of the action space.
- TestRun.current_env_params: dict[str, Any] - per-trial sample
  attached by the env so the cache key and downstream sinks see it.
- TrajectoryEntry gains env_params; the cache key is now
  (action, current_env_params).
- CloudAIGymEnv builds an EnvParamsObserver when env_params is
  declared. before_step() samples deterministically by
  (seed, name, trial), stashes the sample on test_run, and appends to
  env.csv *before* the cache lookup - so env.csv is written on every
  trial (cache hit or miss) and aligns 1:1 with trajectory.csv by step.
- Workloads with no [env_params.*] block pay zero overhead: no
  observer, no env.csv, identical behaviour to today.

Tests: turns the four TDD-red tests from the parent PR green and adds
nine unit tests for EnvParamSpec / EnvParamsSampler / CsvSink /
EnvParamsObserver plus two integration tests pinning the env.csv ↔
trajectory.csv contract.
Closes the most important gap in the env_params test matrix: the
cache-HIT path under an observer-driven TestDefinition.env_params.

The previous tests covered each axis separately - cache key unit tests
for env_params, a step()-level cache-miss integration with manually
set current_env_params, and a step-alignment test where all trials
were cache misses. None of them proved that a cache HIT still drives
the observer (and therefore env.csv) - which is precisely the
sparsity contract the fix was about.

The new test:
  - declares env_params on the TestDefinition (real observer wired in),
  - pre-seeds the trajectory with an entry whose action AND deterministic
    env_params sample match what the observer will produce for step 1,
  - calls env.step() once and asserts:
      (a) runner.run is NOT called (cache short-circuit honored),
      (b) env.csv gains a row (observer fired before the cache lookup),
      (c) the new trajectory entry carries the per-trial env_params.

This is the end-to-end proof that the env.csv sparsity bug cannot
recur, and that env.csv ↔ trajectory.csv stay step-aligned on the
cache-hit path too.
…s files

env_params.py and test_env_params.py were first authored in 2026; the
copyright-header check derives the expected year from git history.
env_params are sampled only during DSE (by CloudAIGymEnv); on a non-DSE run
they would be silently ignored. is_dse_job is a property of the fully prepped
config, so validate it there rather than at parse time.

Add validate_dse_env_params(test_scenario), which raises when a test run
declares env_params while is_dse_job is False. Enforce it at run/dry-run
dispatch (handle_dry_run_and_run, fail fast after setup) and mandatorily in
verify_test_scenarios, so a config declaring env_params without a DSE sweep
(list-valued cmd_args/extra_env_vars or num_nodes) is rejected instead of
silently dropped.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from a7a2c01 to a7cb274 Compare June 17, 2026 18:33
Sink env.csv from the same TrajectoryEntry as trajectory.csv inside
write_trajectory, instead of eagerly in the observer's before_step. A
constraint-failed step returns before write_trajectory, so it now records
no row in either file and the two stay step-aligned (a downstream
pd.merge(traj, env, on="step") loses no rows).

Also restore current_env_params after the run-result fallback in step():
that branch rebuilt test_run from the pristine copy and reset the sampled
env_params to {}, recording empty env_params in the trajectory entry and
weakening the (action, env_params) cache key.

EnvParamsObserver is reduced to sample-and-stash; CloudAIGymEnv owns
persistence of both files.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from a7cb274 to f44b1ea Compare June 17, 2026 20:26
random.choices treats inf/nan weights inconsistently (they slip past the
existing non-negative + positive-sum checks), so validate finiteness in
EnvParamSpec and reject inf/nan/-inf. Also add a match= to the CsvSink
zero-step pytest.raises for clarity (CodeRabbit, NVIDIA#901/NVIDIA#928).
Introduce ContinuousSpace, a closed-interval [low, high] action-space
dimension with a dtype ("int" | "float") that declares how consumers
should quantize decoded samples. CloudAI action spaces can now express
continuous tunables alongside discrete (list) candidate domains; agents
and adapters read low/high/dtype to build their own representation.

Validated low < high; exported via cloudai.core.
…ejection tests

These negative tests pass invalid dtype/extra kwargs to assert pydantic
ValidationError; mark the deliberate type violations with type: ignore.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from f44b1ea to a85f886 Compare June 17, 2026 22:03
GymnasiumAdapter._decode_continuous clamps to [low, high] then rounds, so
a dtype="int" space with non-integer bounds (e.g. [0.2, 0.8]) can decode
to a value outside the declared range (1). Validate that int-quantized
spaces declare integer bounds (CodeRabbit, NVIDIA#932/NVIDIA#927).
…otocol

Add ObsLeafDescriptor (a self-describing observation leaf: "box" of width
dim, or "discrete" of size n) and a StructuredObservation Protocol that
documents the optional env hooks structured_observation_descriptors() and
encode_observation(). These let an env expose a named, per-leaf observation
so adapters (e.g. GymnasiumAdapter) can build the matching gymnasium
spaces.Dict; the hooks are duck-typed, so envs need not subclass.

Both exported via cloudai.core.
…ejection tests

Negative tests pass an extra kwarg and an out-of-Literal kind to assert
ValidationError; mark the deliberate type violations with type: ignore.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from a85f886 to 24b8b99 Compare June 17, 2026 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant