Conversation

@rostan-t (Collaborator)

Category: Bug fix (non-breaking change which fixes an issue)

Description:

This PR enables arithmetic operations between tensors/batches and scalars. Previously, x + n worked if x was a CPU tensor and n a scalar, but not if x was a GPU tensor.

Arithmetic operations between a GPU tensor/batch and a Python list or tuple are also supported. Tensor types still need to be copied explicitly.
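
To illustrate, a minimal sketch of the usage this PR enables; the module alias ndd (nvidia.dali.experimental.dynamic) follows the tests added in this PR:

    import nvidia.dali.experimental.dynamic as ndd

    x = ndd.tensor([1, 2, 3], device="gpu")
    y = x + 5            # scalar broadcast on the GPU (previously raised)
    z = x * [2, 2, 2]    # Python lists/tuples are converted as well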

Additional information:

Affected modules and functionalities:

Dynamic mode tensors and batches.

Key points relevant for the review:

Do arithmetic operations work the way we intend them to?

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-4545

greptile-apps bot commented Dec 19, 2025

Greptile Summary

This PR enables arithmetic operations between GPU tensors/batches and scalars by consolidating the _arithm_op implementation from _tensor.py and _batch.py into a new shared _arithmetic.py module. The key improvements are:

  • Fixed GPU tensor + scalar operations: Previously only worked for CPU tensors
  • Improved device detection logic: Uses a two-pass approach to determine the target device before converting scalars, ensuring consistent device placement (sketched after this list)
  • Added type safety: Validates that non-implicitly-convertible types cannot be used with GPU tensors
  • Fixed backend bug: Corrected TensorListGPU.broadcast to use GPUBackend instead of CPUBackend
  • Comprehensive test coverage: Added tests for scalar operations with both tensors and batches, plus device compatibility validation
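
A minimal, self-contained sketch of the two-pass device detection mentioned above (illustrative only, not the PR's exact code; the helper names and the as_tensor parameter are hypothetical, and tensor-like arguments are assumed to expose a .device attribute):

    def target_device(args):
        # Pass 1: scan every argument first, so "5 + gpu_tensor" picks the GPU
        # even though the scalar precedes the tensor.
        return "gpu" if any(getattr(a, "device", None) == "gpu" for a in args) else "cpu"

    def convert_args(args, as_tensor):
        # Pass 2: convert plain scalars/lists to tensors on the chosen device,
        # then reject any remaining CPU/GPU mix.
        dev = target_device(args)
        new_args = [a if hasattr(a, "device") else as_tensor(a, device=dev) for a in args]
        if len({a.device for a in new_args}) > 1:
            raise ValueError("Cannot mix GPU and CPU inputs")
        return new_args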

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are well-structured with clear improvements over the previous implementation. The two-pass device detection algorithm is more robust than the old approach, the backend fix is obviously correct, and comprehensive tests cover the new functionality including edge cases like device mismatches.
  • No files require special attention

Important Files Changed

  • dali/python/backend_impl.cc: Fixed TensorListGPU.broadcast to use GPUBackend instead of CPUBackend
  • dali/python/nvidia/dali/experimental/dynamic/_arithmetic.py: New centralized arithmetic-operation handler with improved GPU/CPU device handling
  • dali/test/python/experimental_mode/test_arithm_ops.py: Added comprehensive tests for scalar operations and device-compatibility checking

Sequence Diagram

sequenceDiagram
    participant User
    participant Tensor/Batch
    participant _arithm_op
    participant as_tensor
    participant _arithmetic_generic_op

    User->>Tensor/Batch: gpu_tensor + scalar
    Tensor/Batch->>_arithm_op: __add__(gpu_tensor, scalar)
    
    Note over _arithm_op: Check all args for GPU tensors
    _arithm_op->>_arithm_op: gpu = any(arg.device == "gpu"<br/>for Tensor/Batch args)
    
    alt Scalar argument found
        _arithm_op->>_arithm_op: Check if implicitly convertible
        alt GPU mode & not convertible
            _arithm_op-->>User: ValueError: not implicitly copyable
        else Convertible
            _arithm_op->>as_tensor: as_tensor(scalar, device="gpu")
            as_tensor-->>_arithm_op: GPU tensor
        end
    end
    
    Note over _arithm_op: Verify no CPU/GPU mixing
    _arithm_op->>_arithm_op: Check all new_args devices match
    
    alt Device mismatch
        _arithm_op-->>User: ValueError: Cannot mix GPU and CPU
    else All same device
        _arithm_op->>_arithmetic_generic_op: Execute operation
        _arithmetic_generic_op-->>User: Result tensor
    end

@greptile-apps bot left a comment

Additional Comments (3)

  1. dali/python/nvidia/dali/experimental/dynamic/_arithmetic.py, line 36

    syntax: Typo: "implictly" should be "implicitly"

  2. dali/python/nvidia/dali/experimental/dynamic/_arithmetic.py, lines 30-46

    logic: Logic bug: when a scalar appears before a GPU tensor (e.g., 3 + gpu_tensor), the scalar gets converted to CPU before detecting the GPU tensor, causing the final check to fail with "Cannot mix GPU and CPU inputs."

    The algorithm needs two passes:

    1. First pass: scan all args to detect if any GPU tensor exists
    2. Second pass: convert scalars to appropriate device

    Example failure case:

    gpu_tensor = ndd.tensor([1, 2, 3], device="gpu")
    result = 5 + gpu_tensor  # Will raise ValueError
  3. dali/test/python/experimental_mode/test_arithm_ops.py, line 128

    style: Test only covers tensor + scalar but not scalar + tensor. Add reverse operation tests:

    # Also test scalar + tensor
    result_reversed = ndd.as_tensor(apply_bin_op(op, scalar, x))
    ref_reversed = apply_bin_op(op, scalar, tensor)
    if not np.allclose(result_reversed.cpu(), ref_reversed):
        msg = f"{scalar} {op} {tensor} = \n{result_reversed}\n!=\n{ref_reversed}"
        raise AssertionError(msg)

5 files reviewed, 3 comments


@mzient self-assigned this Dec 19, 2025

@greptile-apps bot left a comment

Additional Comments (1)

  1. dali/python/nvidia/dali/experimental/dynamic/_arithmetic.py, line 20

    syntax: using | operator in isinstance() requires Python 3.10+, but pyproject.toml targets Python 3.8+. Use tuple syntax instead for compatibility.
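
    For reference, a quick demonstration of the compatible spelling (int and float here are placeholder types, not necessarily the ones checked on line 20):

    x = 3
    assert isinstance(x, (int, float))   # tuple syntax works on Python 3.8+
    # isinstance(x, int | float)         # PEP 604 union: TypeError before 3.10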

5 files reviewed, 1 comment


