Validation loss via LossEvaluator logs 0 (head skips loss computation in eval mode) #538

@jlamypoirier

Description

Claude Opus 4.8 note (filed via Claude Code on the maintainer's behalf):

When training with a type: loss evaluator, the validation lm_head_loss is logged as exactly 0 at every eval step, while training.lm_head_loss is correct. Observed across 8 independent training runs, in both the console logs and W&B (evaluations.<name>.lm_head_loss = 0).

Likely cause

In LanguageModelHead._logits_loss_forward_backward (fast_llm/layers/language_model/head.py:203-206 on current main), the eval-mode branch computes logits only and returns None for the loss. It never runs the cross-entropy or appends to losses:

if not self.training:
    logits, _ = self._logits_loss_forward_backward_partial(input_, kwargs, return_logits=True)
    self._debug(logits, "logits", (kwargs[LanguageModelKwargs.hidden_token_dim], self._vocab_dim), kwargs)
    return None, None

The LossEvaluator runs the model in eval mode, so the head produces no loss and the aggregated eval metric is 0.
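
To make the suspected mechanism concrete, here is a minimal toy sketch in plain Python. Nothing in it is Fast-LLM code: ToyHead, reduce_losses and the sample batch are hypothetical stand-ins, and the assumption that an empty loss list reduces to 0 (rather than raising) is a guess at how the eval metric ends up as exactly 0.

```python
# Toy model of the reported behaviour; none of these names are Fast-LLM APIs.
import math


class ToyHead:
    """Hypothetical stand-in for a language-model head with a train/eval switch."""

    def __init__(self):
        self.training = True

    def forward(self, logits, target, losses):
        if not self.training:
            # Mirrors the eval branch quoted above: logits only, no loss recorded.
            return None
        # Cross-entropy for a single token: -log softmax(logits)[target].
        log_norm = math.log(sum(math.exp(x) for x in logits))
        loss = log_norm - logits[target]
        losses.append(loss)
        return loss


def reduce_losses(losses):
    # Assumption: an empty list of losses reduces to 0 instead of raising.
    return sum(losses) / len(losses) if losses else 0.0


head = ToyHead()
# Made-up (logits, target) pairs, for illustration only.
batch = [([2.0, 0.5, -1.0], 0), ([0.1, 0.2, 0.3], 2)]

train_losses = []
for logits, target in batch:
    head.forward(logits, target, train_losses)

head.training = False  # what an evaluator would do before running validation
eval_losses = []
for logits, target in batch:
    head.forward(logits, target, eval_losses)

train_metric = reduce_losses(train_losses)
eval_metric = reduce_losses(eval_losses)
```

In this toy, train_metric is positive while eval_losses stays empty and eval_metric is 0.0, which matches the symptom described above. Whether the real aggregation behaves the same way has not been verified here.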

Open question

I haven't pinned down the intended fix: either the evaluator should run the forward pass with loss computation enabled, or the head should compute the loss in eval mode when targets are present. Either way, validation loss appears to be unusable through this path right now.
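
For illustration only, the second option (compute the loss in eval mode whenever targets are present) could look roughly like the toy below. This is not a proposed patch: ToyHead, cross_entropy and backward_calls are hypothetical names, and a real change would have to fit the head's actual kwargs handling and loss bookkeeping.

```python
# Toy sketch of "compute the loss in eval mode when targets are present".
# All names are hypothetical; this is not Fast-LLM code.
import math


def cross_entropy(logits, target):
    # Numerically stable -log softmax(logits)[target] for one token.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


class ToyHead:
    def __init__(self):
        self.training = True
        self.backward_calls = 0  # counts how often the gradient path would run

    def forward(self, logits, target=None, losses=None):
        if target is None:
            # Pure inference: there is nothing to score.
            return None
        # The loss is computed regardless of train/eval mode.
        loss = cross_entropy(logits, target)
        if losses is not None:
            losses.append(loss)
        if self.training:
            # Placeholder for the gradient computation, skipped in eval mode.
            self.backward_calls += 1
        return loss


head = ToyHead()
head.training = False

eval_losses = []
eval_loss = head.forward([2.0, 0.5, -1.0], target=0, losses=eval_losses)
inference_only = head.forward([2.0, 0.5, -1.0])
```

Here the eval-mode call with a target records a positive loss without touching the gradient path, while the call without a target still returns None. The first option (having the evaluator enable loss computation itself) would leave the head unchanged and move the decision to the caller.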

Repro

Any fast-llm train run configured with a loss evaluator; inspect evaluations.<name>.lm_head_loss (logs as 0).
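
When checking a run, a small helper like the following can flag the symptom in a flat dictionary of logged metrics. It is a sketch under assumptions: the helper name, the flat "dotted key to value" layout, the evaluator name "validation" and the sample numbers are all made up for illustration and are not taken from Fast-LLM or W&B output.

```python
# Hypothetical helper for spotting eval losses that are logged as exactly 0.
from fnmatch import fnmatchcase


def zero_eval_losses(metrics, pattern="evaluations.*.lm_head_loss"):
    """Return the metric names matching `pattern` whose value is exactly 0."""
    return sorted(
        name
        for name, value in metrics.items()
        if fnmatchcase(name, pattern) and value == 0
    )


# Made-up sample values, shaped like the metric names mentioned in this issue.
sample_metrics = {
    "training.lm_head_loss": 2.31,
    "evaluations.validation.lm_head_loss": 0.0,
}

suspicious = zero_eval_losses(sample_metrics)
```

With the sample above, suspicious contains only the evaluation key; the training loss is left alone because it does not match the pattern.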
