fix(rgbled): double-buffer WS2812 strip to eliminate races#7304
fix(rgbled): double-buffer WS2812 strip to eliminate races#7304
Conversation
The WS2812 LED strip was driven from two contexts without any serialisation: the FreeRTOS refresh timer (50ms) and Lua's applyRGBLedColors. Both called ws2812_update directly, racing on the DMA arming sequence, and the DMA ISR read the same _led_colors buffer that callers were mid-writing — producing torn pixels and partial-frame flicker. Switch to a lock-free single-driver design: - Board (rgb_leds.cpp) now owns a back buffer (writes) and a front buffer (DMA). Setters touch only the back buffer; the timer task is the sole arm point for ws2812_update at runtime. - A seqlock-style counter (_commit_seq) publishes batches: setters mark the LSB to flag mid-batch; rgbLedColorApply rounds up to the next even value to publish; the timer copies back→front only when the seq is even, differs from the last consumed value, and DMA is idle. Compiler barriers prevent reordering of plain back-buffer writes around the volatile seq updates. - Pre-OS callers (charging UI) bypass the timer: rgbLedColorApply detects scheduler_is_running() == false and flushes synchronously. To make the board the sole entry point, the no-buf ws2812_set_color / get_color / get_state are replaced by buffer-parameterised variants; all direct callers in horus/taranis/pl18/pa01/rm-h750/jumper-h750 boards and the generic_stm32 fsLedRGB now go through rgbSetLedColor. fsLedRGB and pa01's status-LED helpers gain explicit rgbLedColorApply calls to preserve their fire-and-forget semantics. pa01's separate _led_charge_colors buffer and ad-hoc ws2812_init are removed; rgbLedHwInit is exposed so the charging UI can bring up the WS2812 hardware before the scheduler is running while still sharing the rgb_leds.cpp front buffer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
This may still have issues on V14 & V16 radios which define SIXPOS_SWITCH_INDEX and also have bling LED's. The LED's for the 6POS switch are set in the getSixPosAnalogValue function which I think gets called from the mixer. For scripts that use a two stage update pattern (turn off all, turn on those needed, call applyRGBLedColors) it can still cause LED flickering if the mixer interrupts at the wrong time, |
getSixPosAnalogValue() ran in the mixer task and wrote directly to the WS2812 back buffer. On V12/V14/V16 that made the mixer a second writer racing with Lua's setRGBLedColor / applyRGBLedColors batches — the double-buffer seqlock only serialises publication, not concurrent back-buffer writes. Split the function: keep the sticky position tracker in the ADC hot path (a single volatile uint8_t write, read by other tasks as an atomic byte), and move the LED paint into a new weak rgbLedOnUpdate() hook called by the rgb_leds.cpp refresh timer. The 6POS LEDs (indices 0..5) now have the LED timer task as their sole writer, while Lua addresses map through BLING_LED_STRIP_START=6 onto the remaining range — so the two writable ranges are disjoint and cannot race. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Right, the mixer has no business touching WS2812 buffers. 793e0de should solve this, though I cannot test if it even works since I don't have any V14 or V16. |
|
Follow-up idea (not part of this PR): the sticky 6POS state update could move out of The tracker would rewrite This ties the state update to actual ADC events instead of to arbitrary callers (cli, diaganas, etc. currently re-trigger the tracker on every read). Keeping it out of this PR since it's orthogonal to the race fix and touches the public ADC read path. See #7309 |
|
@raphaelcoeffic i have tested this on my V16 and it seems to work as before! However, I have something that is reproducible. If i press e.g. button 4 multiple times (the same using 6), it sometimes happens that it is showing 1 as active. This seems to be a bad reading from ADC? Maybe because its measuring the "transition" while a cap is loaded somewhere? edit: sometimes it is also 2 instead of 1 but it is always lower than the button pressed (it is never 6 when pressing 3) |
|
With #7309 on top, I am not able to reproduce the issue from above... looks good! |
Ok, so we need both PRs cherry-picked to 2.12. |
@raphaelcoeffic :Thank you so much for helping me solve these problems with this device. Could you please send me your address so I can send you some samples for testing? info@helloradiosky.com |
I've also encountered this issue. It requires using |
|
This seems to be working ok on V16 and TX16SMK3 was fine. |
The WS2812 LED strip was driven from two contexts without any serialisation: the FreeRTOS refresh timer (50ms) and Lua's applyRGBLedColors. Both called ws2812_update directly, racing on the DMA arming sequence, and the DMA ISR read the same _led_colors buffer that callers were mid-writing — producing torn pixels and partial-frame flicker.
Switch to a lock-free single-driver design:
To make the board the sole entry point, the no-buf ws2812_set_color / get_color / get_state are replaced by buffer-parameterised variants; all direct callers in horus/taranis/pl18/pa01/rm-h750/jumper-h750 boards and the generic_stm32 fsLedRGB now go through rgbSetLedColor. fsLedRGB and pa01's status-LED helpers gain explicit rgbLedColorApply calls to preserve their fire-and-forget semantics.
pa01's separate _led_charge_colors buffer and ad-hoc ws2812_init are removed; rgbLedHwInit is exposed so the charging UI can bring up the WS2812 hardware before the scheduler is running while still sharing the rgb_leds.cpp front buffer.
V12 / V14 / V16 6POS LED fix
On radios that define SIXPOS_SWITCH_INDEX, getSixPosAnalogValue() ran in the mixer task and wrote directly to the WS2812 back buffer on every switch change. That made the mixer a second writer racing with Lua's setRGBLedColor / applyRGBLedColors batches — the seqlock above only serialises publication, not concurrent back-buffer writes.
The function is split: