Validation loss via LossEvaluator logs 0 (head skips loss computation in eval mode) #538

**Claude Opus 4.8 note** (filed via Claude Code on the maintainer's behalf):

When training with a `type: loss` evaluator, the validation `lm_head_loss` is logged as exactly **0** at every eval step, while `training.lm_head_loss` is correct. Observed across 8 independent training runs, in both the console logs and W&B (`evaluations.<name>.lm_head_loss = 0`).

### Likely cause

In `LanguageModelHead._logits_loss_forward_backward` (`fast_llm/layers/language_model/head.py:203-206` on current `main`), the eval-mode branch computes only the logits and returns `None` for the loss. It never runs the cross-entropy and never appends to `losses`:

```python
if not self.training:
    logits, _ = self._logits_loss_forward_backward_partial(input_, kwargs, return_logits=True)
    self._debug(logits, "logits", (kwargs[LanguageModelKwargs.hidden_token_dim], self._vocab_dim), kwargs)
    return None, None
```

The `LossEvaluator` runs the model in eval mode, so the head produces no loss and the aggregated eval metric is 0.

### Open question

I haven't determined the intended fix. Either the evaluator should run the forward pass with loss computation enabled, or the head should compute the loss in eval mode when targets are present. Either way, validation loss appears to be unusable through this path right now.
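To make the second option concrete, here is a minimal, dependency-free sketch. Nothing in it is Fast-LLM code: `ToyHead`, its `loss_in_eval` flag, and `reduce_loss` are hypothetical names, and the assumption that an empty loss list reduces to 0 is my guess at why the metric shows exactly 0, not something I have confirmed in the evaluator.

```python
import math
from typing import Dict, List, Optional, Sequence


def cross_entropy(logits: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Mean token-level cross-entropy, using a numerically stable log-sum-exp."""
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[target]
    return total / len(targets)


class ToyHead:
    """Hypothetical stand-in for a language-model head (not the real class)."""

    def __init__(self, loss_in_eval: bool) -> None:
        self.training = True
        self.loss_in_eval = loss_in_eval  # hypothetical switch for option 2

    def forward(
        self,
        logits: Sequence[Sequence[float]],
        targets: Optional[Sequence[int]],
        losses: Dict[str, List[float]],
    ) -> None:
        compute_loss = self.training or self.loss_in_eval
        if not compute_loss or targets is None:
            # Mirrors the reported eval branch: nothing is recorded.
            return
        losses.setdefault("lm_head_loss", []).append(cross_entropy(logits, targets))


def reduce_loss(losses: Dict[str, List[float]]) -> float:
    """Assumed aggregation: an empty list of losses reduces to 0."""
    values = losses.get("lm_head_loss", [])
    return sum(values) / len(values) if values else 0.0


def eval_loss(loss_in_eval: bool) -> float:
    """Run one eval-mode forward on uniform logits and return the reduced loss."""
    head = ToyHead(loss_in_eval=loss_in_eval)
    head.training = False  # eval mode, as the evaluator would set it
    losses: Dict[str, List[float]] = {}
    head.forward([[0.0, 0.0], [0.0, 0.0]], [0, 1], losses)
    return reduce_loss(losses)
```

With `loss_in_eval=False` the toy reproduces the symptom (a reduced loss of exactly 0), while `loss_in_eval=True` yields the expected cross-entropy for uniform two-class logits, `math.log(2)`. Whether the real fix belongs in the head or in the evaluator is still the open question above.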

### Repro

Run any `fast-llm train` job configured with a `loss` evaluator, then inspect `evaluations.<name>.lm_head_loss`: it logs as 0 at every eval step.

