
fix: pre-download sage_attention kernel before applying backend, remove pinned fa3 kernel version #578

Open
Marius-Graml wants to merge 2 commits into main from fix/SageAttn

Conversation

Contributor

@Marius-Graml commented Mar 16, 2026

Description

Currently, there is a bug in the sageattn backend. Diffusers has two set_attention_backend methods: one for the whole model and one for submodules. The submodule-level set_attention_backend does not trigger the kernel download, leaving kernel_fn as None and causing a TypeError at dispatch time. This PR adds an explicit _maybe_download_kernel_for_backend call before the backend is applied.
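The fix can be sketched roughly as follows. This is a minimal illustration, not the actual diffusers code: the helper name `_maybe_download_kernel_for_backend` comes from this PR, while the stub bodies and the `set_submodule_attention_backend` wrapper are invented here purely to show the control flow.

```python
# Minimal sketch of the fix, with stubs standing in for diffusers internals.
_kernels = {}  # backend name -> kernel function, populated on download

def _maybe_download_kernel_for_backend(backend):
    # Stub for the real kernel download (e.g. fetching sage_attention).
    _kernels.setdefault(backend, lambda q, k, v: q)  # dummy kernel_fn

def set_submodule_attention_backend(submodule, backend):
    # Fix: download the kernel explicitly BEFORE applying the backend,
    # so the submodule's kernel_fn is never left as None.
    _maybe_download_kernel_for_backend(backend)
    submodule["kernel_fn"] = _kernels[backend]

submodule = {"kernel_fn": None}
set_submodule_attention_backend(submodule, "sage_attention")
print(submodule["kernel_fn"] is not None)  # True: no TypeError at dispatch
```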
Additionally, the pinned fa3 kernel version is removed so that fa3 now works with torch 2.10. Note that some kernel builds return (out, lse) while others return just out, depending on the torch and CUDA versions; the registered torch-op function must therefore handle both return shapes.
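The varying return shapes can be handled with a small normalization step inside the registered torch op. A hedged sketch: `normalize_kernel_output` is a name invented here for illustration, and plain Python values stand in for tensors.

```python
def normalize_kernel_output(result):
    """Return only the attention output, whether the kernel build
    returned (out, lse) or just out."""
    if isinstance(result, tuple):
        out, _lse = result  # some torch/CUDA builds also return log-sum-exp
        return out
    return result  # other builds return the output tensor alone

# Dummy values stand in for real tensors:
print(normalize_kernel_output(("out", "lse")))  # out
print(normalize_kernel_output("out"))           # out
```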

Related Issue

/

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Run in notebook

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

/

The submodule-level set_attention_backend in diffusers does not trigger
the kernel download, leaving kernel_fn as None and causing a TypeError.
This adds an explicit _maybe_download_kernel_for_backend call.
Marius-Graml changed the title from "Bug fix: pre-download sage_attention kernel before applying backend" to "fix: pre-download sage_attention kernel before applying backend" on Mar 16, 2026
Marius-Graml changed the title from "fix: pre-download sage_attention kernel before applying backend" to "fix: pre-download sage_attention kernel before applying backend, remove pinned fa3 kernel version" on Mar 18, 2026
@Marius-Graml Marius-Graml requested a review from begumcig March 18, 2026 13:51