lkdvos
left a comment
It seems a bit much to force a copy of ind across the board, even in a non-GPU context. Is there any way we can do this only selectively, or find a different approach?
I'm also slightly confused by the collect. I thought we did our best to keep everything on the GPU precisely so the GPU kernels can do their work, and that combining ind::Vector{Int} with values::CuVector was bad?
good_indS = findall(i -> i <= r, indS)
bad_indS = setdiff(1:length(indS), good_indS)
ΔS₁[indS[good_indS]] = real.(ΔS[good_indS])
Δgauge = max(Δgauge, mapreduce(abs, max, ΔS[bad_indS]))
Would it help if we re-use the machinery we have for the TruncationIntersection, which I think we already made work, and then something like this? (untested)
bad_indS = _ind_intersect(r+1:length(ΔS), indS)
ΔS₁ = real(ΔS)
badΔS₁ = view(ΔS₁, bad_indS)
Δgauge = max(Δgauge, maximum(abs, badΔS₁))
badΔS₁ .= 0
ΔS₁ ends up with the wrong length with this, I'll try to fix...
Not sure this is much nicer but it's passing again at least 😓
I think in general we want to keep the |
Moved away from |
svd_pullback! GPU-compatible
  @testset "trunctol" begin
  S = svd_vals(A, alg)
- trunc = trunctol(atol = S[1] / 2)
+ trunc = trunctol(atol = maximum(S) / 2)
I did have some final suggestions so I disabled auto-merge.
Co-authored-by: Lukas Devos <ldevos98@gmail.com>
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
OK, the failures on windows-latest seem unrelated, so I'll merge.
Should fix #228
We need to collect the inds for the truncated tests because otherwise we trigger scalar indexing :(
The solution for the pullback itself is a bit rough so if someone has a nicer list comprehension, feel free to suggest!
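
For anyone reading along, here is a minimal CPU-only sketch of the index split discussed in the review, written with plain Vectors. The values of r, indS and ΔS are made up for illustration, and this is not the code that ended up in the PR. The init keyword is an addition in this sketch so the reduction stays defined when there are no "bad" indices. On a GPU the same indexing pattern is what can trigger scalar indexing, which the sketch does not try to solve.

```julia
# Hypothetical stand-in data (assumptions, not taken from the PR):
r = 3                                          # number of entries kept in ΔS₁
indS = [1, 2, 4, 5]                            # indices selected by the truncation
ΔS = ComplexF64[0.1 + 0.5im, 0.2, 0.3, -0.4]   # one cotangent entry per selected index

ΔS₁ = zeros(Float64, r)
Δgauge = 0.0

# Positions in indS whose index falls inside 1:r ("good") and the rest ("bad").
good_indS = findall(i -> i <= r, indS)
bad_indS = setdiff(1:length(indS), good_indS)

# Good entries are scattered into ΔS₁ at their original indices.
ΔS₁[indS[good_indS]] = real.(ΔS[good_indS])

# Bad entries only contribute to the gauge check; init guards the empty case.
Δgauge = max(Δgauge, mapreduce(abs, max, ΔS[bad_indS]; init = 0.0))
```

With these inputs, good_indS picks positions 1 and 2 and bad_indS picks positions 3 and 4, so only the first two entries of ΔS₁ are written.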