
CI improvemens: General improvements #3984

Draft
hdiethelm wants to merge 25 commits into LinuxCNC:master from hdiethelm:ci_improvemens

Conversation

@hdiethelm
Contributor

This is the follow-up to #3983.

These will be more in-depth changes, so riskier, and they will take some time.

There will be a few pushes, including some that should fail on purpose to test if everything works as desired. Just tell me if I abuse the CI too much and I will find a different solution.

It is experimental for now, but I need a PR so the CI runs.

@hdiethelm hdiethelm changed the title from "Ci improvemens: General cleanup" to "Ci improvemens: General improvements" on Apr 30, 2026
@hdiethelm hdiethelm changed the title from "Ci improvemens: General improvements" to "CI improvemens: General improvements" on Apr 30, 2026
@hdiethelm
Contributor Author

Removing eatmydata, tested:
Before: 4h 42m 44s
After: 4h 34m 41s + an estimated 15m for a stalled process -> 4h 49m 41s
So no big impact, and the workflow is more readable.

@hdiethelm hdiethelm force-pushed the ci_improvemens branch 3 times, most recently from f59978a to 1921e21 on April 30, 2026 07:31
@BsAtHome
Contributor

Did you see this message at the "Complete job" stage:

Warning: Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v2. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

@hdiethelm
Contributor Author

Did you see this message at the "Complete job" stage:

Warning: Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v2. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

Yes, the commit is already there; I'm just waiting for the last CI run to pass before the next push.

@hdiethelm
Contributor Author

@BsAtHome Feel free to cancel any pipeline that has failed jobs. I don't think I have the access rights to do so.

@hdiethelm
Contributor Author

hdiethelm commented Apr 30, 2026

So, first success: with 4 CPUs, down to 3h 32m.
Before: 4h 38m usage / 38m runtime
After: 3h 42m usage / 24m runtime

@hdiethelm
Contributor Author

hdiethelm commented Apr 30, 2026

Now the bigger change will be to run everything in prepared Docker containers or runners. It probably won't reduce the runtime a lot, an estimated 3-5 min, but it is simply nicer towards Debian not to hammer their package repo for nothing. Let's see how well that works.

@BsAtHome Debian is the main target, right? So running the rip-* jobs in a Debian container is also fine? At the moment, they run on an ubuntu-24.04 runner.
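A minimal sketch of what I mean, using the standard Actions container: key (the image name is illustrative; later it would be a prebuilt image):

  jobs:
    rip-and-test:
      runs-on: ubuntu-24.04
      container:
        image: debian:trixie
      steps:
        - uses: actions/checkout@v4
        # ... build and test steps now run inside the container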

@BsAtHome
Contributor

Debian is the primary target, yes.

Nice to see the improvement; it is primarily in the independent packages building the documentation. Those have always been the slowest.

The tests are not going to be significantly faster because they run in sequence. We've been discussing parallel execution, but that requires #2722 to be fully implemented. There are a significant number of issues not addressed in that PR (some noted, others implied, still others to be discovered).

@andypugh
Collaborator

#3983 is in.

@hdiethelm
Contributor Author

#3983 is in.

Thanks, rebased on top of master.

Debian is the primary target, yes.

Nice to see the improvement; it is primarily in the independent packages building the documentation. Those have always been the slowest.

The tests are not going to be significantly faster because they run in sequence. We've been discussing parallel execution, but that requires #2722 to be fully implemented. There are a significant number of issues not addressed in that PR (some noted, others implied, still others to be discovered).

It depends on what the target is. For local usage, #2722 is the most comfortable way for users. But it can also be done another way:

  • One build
  • Run multiple tests in parallel, each testing one part. This can be separated using:
    • Multiple runners (this needs an artifact; the way LinuxCNC is built right now, this can be cumbersome, but it might work)
    • Possibly multiple namespaces in the same runner. If this works, it could also be used locally. Namespaces are one of the mechanisms Docker uses to isolate itself from the host, but they can also be used directly. I would have to try it out; see the untested sketch below.

Since a parallel test runner will only shorten the CI once package-indep gets below 12 min, I see this as low priority for the CI.
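For the namespace idea, roughly this (untested; the shard paths are illustrative):

  # run two test shards in separate PID/IPC/UTS namespaces
  sudo unshare --pid --fork --mount-proc --ipc --uts scripts/runtests tests/shard-a &
  sudo unshare --pid --fork --mount-proc --ipc --uts scripts/runtests tests/shard-b &
  wait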

@andypugh Any objections if I push Docker images to the LinuxCNC GitHub? They will appear somewhere here: https://github.com/LinuxCNC/linuxcnc/packages I can probably do that from CI without any additional rights. However, I might need you if I mess something up and packages have to be deleted. I will try it in my account first with the free credits, but to do something meaningful, I have to do it in the LinuxCNC CI.

@BsAtHome
Contributor

Maybe the curl progress meter can be turned off for the downloads.

@NTULINUX, when do these kernels move to linuxcnc.org? Is there a procedure?

@grandixximo
Contributor

Opened #3992 as a small stopgap for the firefox snap flake (issue #3991). It adds one apt remove firefox line before each upgrade. If you would prefer the larger CI cleanup in #3984 to land first, I can close mine; otherwise, rebasing #3984 over my change should be a one-line conflict at most. Whichever lands first works for me.

@hdiethelm
Contributor Author

hdiethelm commented May 1, 2026

@grandixximo This is fine with me. The main goal here is to not need all these installs at all, but that will take some time.

I could extract the --cpu improvements (25% faster package build) and, if desired:

  • Actions update (a bit risky, I cannot test the release target, but I don't expect an issue)
  • Split into dependency / build / test (I don't see any risks)
  • Remove apt-get upgrade
  • Remove eatmydata

into a separate PR, so these can already be merged while I am busy here.

@hdiethelm
Contributor Author

hdiethelm commented May 1, 2026

So, I have a container-based prototype running in my GitHub fork:
https://github.com/hdiethelm/linuxcnc-fork/tree/ci_docker_prototyping
hdiethelm/linuxcnc-fork@ci_improvemens...hdiethelm:linuxcnc-fork:ci_docker_prototyping
The containers are visible here and on the project page; it will look similar on the original repo:
https://github.com/hdiethelm?tab=packages&repo_name=linuxcnc-fork
Docker build:
https://github.com/hdiethelm/linuxcnc-fork/actions/runs/25216626662
LinuxCNC rip-and-test:
https://github.com/hdiethelm/linuxcnc-fork/actions/runs/25216626671

Do you think such an approach is viable? If yes, I will work out how to automatically update the Docker images from time to time and then port all the other targets.

Advantages:

  • Faster: rip-and-test takes 3m 30s less; the gain will be even bigger for the Debian package build
  • No packages are downloaded, unless new dependencies are added, until the Docker images are updated
  • The rip-and-test targets could also use a matrix to test ARM / all Debian variants
  • The built Docker images can also be used locally

Disadvantages:

  • Due to preinstalled packages, missing dependencies might be discovered later
  • Possibly a bit more difficult to maintain
  • Uses storage for the images, but depending on the contract, this is free

I will not merge this into this PR yet. As soon as I do, packages will most probably appear in the original LinuxCNC repo.

@hdiethelm
Contributor Author

@grandixximo

might want to look at: ROS2, Yocto, buildroot for inspiration

Let's continue the discussion here. LinuxCNC is a bit special because it needs a ton of dependencies to build. The container needed is around 3.4 GB in size.

If you find something, just tell me. I took some inspiration from https://github.com/open-webui/open-webui, since I know they build containers from CI, plus many different manuals / articles.

@BsAtHome
Contributor

BsAtHome commented May 1, 2026

The size of the container should be acceptable and manageable if it is in a local repo for quick download. If there need to be many different versions, that could be a problem.

However, I have no clue what deals have been made with github. @andypugh, do you know of any deals with github?

BTW, you need to rebase after I merged the snapped firefox killer.

hdiethelm added 7 commits May 2, 2026 00:38
It fails anyway; let's see if there is any difference in time
Error is: ERROR: ld.so: object 'libeatmydata.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
Scripts are easier to maintain and can be tested locally
apt upgrade should not be needed: Runner / docker should be up to date
There is no reason to fetch all for test builds
There are no submodules, so deactivated
The old ones are outdated and will be removed as mentioned by the
warning in the actions tab.
@hdiethelm
Contributor Author

hdiethelm commented May 2, 2026

While you are restructuring the CI flow, would it be in scope to add a working-tree-clean gate after the build steps?

Done, and it was not so bad: only one man page was missing, so I was able to enforce it. But I had to add an exclude option so I can skip the auto-updated VERSION / debian/changelog files. The regexp is not perfect (it would also exclude VERSION123), but better than nothing. Any good ideas for doing this?

The doc step still fails with changed .po files. Should I ignore these?

@grandixximo
Contributor

For the regex false-positives (VERSION123 etc.), git pathspec excludes give exact-path matching with no regex anchoring needed:

git status -u --porcelain -- ':(exclude)VERSION' ':(exclude)debian/changelog'

That replaces the grep -v -E step entirely. If you want to keep the regex form for flexibility, anchoring against the porcelain XY filename format works: ^.{3}(VERSION|debian/changelog)$.

For the .po files, those get touched by msgmerge during the doc build (line-number refs in source comments shift whenever any translatable string moves), so failing CI on them would basically mean any docs change blocks the gate. I'd lean toward excluding them too, but that's a doc/i18n call. @BsAtHome any preference, or should we ping the translation maintainers (Hans Unzner / Steffen Möller from recent .po history)?

@hdiethelm
Contributor Author

For the regex false-positives (VERSION123 etc.), git pathspec excludes give exact-path matching with no regex anchoring needed:

Thanks for the hint about pathspec; I tend towards this solution. Regexps are the kind of thing many people don't really understand and get wrong (me included).
For the docs, ':(exclude)docs/po/*.po' ':(exclude)docs/po/documentation.pot' does the trick. For now, I just allow failure until we know what to do.
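Putting it together, the gate now looks roughly like this (a sketch, not the exact step from ci.yml):

  # working-tree-clean gate after the build
  dirty=$(git status -u --porcelain -- \
      ':(exclude)VERSION' \
      ':(exclude)debian/changelog' \
      ':(exclude)docs/po/*.po' \
      ':(exclude)docs/po/documentation.pot')
  if [ -n "$dirty" ]; then
      echo "Working tree not clean after build:"
      echo "$dirty"
      exit 1
  fi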

@hdiethelm
Contributor Author

hdiethelm commented May 3, 2026

BTW: With the containers from my repo, you can easily test parts of the CI locally by running this command from the local source directory and then copy-pasting snippets from ci.yml:
podman run --rm -it -v .:/linuxcnc ghcr.io/hdiethelm/linuxcnc/build-container:trixie-amd64-latest

I prefer podman over docker because podman runs without setuid, so the created files are not owned by root.
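For example, a typical RIP build session inside the container (the standard LinuxCNC build steps; adjust the configure flags as needed):

  cd /linuxcnc/src
  ./autogen.sh
  ./configure --with-realtime=uspace
  make -j"$(nproc)"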

@BsAtHome
Contributor

BsAtHome commented May 3, 2026

For the .po files, those get touched by msgmerge during the doc build (line-number refs in source comments shift whenever any translatable string moves), so failing CI on them would basically mean any docs change blocks the gate. I'd lean toward excluding them too, but that's a doc/i18n call. @BsAtHome any preference, or should we ping the translation maintainers (Hans Unzner / Steffen Möller from recent .po history)?

The docs build, integration and translations are fragile. Failing on translations/.po can lead to some self-inflicted wounds. That said, the most common problem is newline consistency, where translations are missing or adding trailing newlines. Most of that gets caught now because msgmerge fails, and that is good. This is not normally merged into the main tree; it is just the weblate branch failing CI. That failure is a Good Thing™.

The default man-pages should all function and there should not be any non-.gitignored remains in docs/man/*. That bug has bitten many times. I think, in general, there should be no non-.gitignored remains anywhere after the build and test. That also goes for some of the junk left-over in config/* when, f.ex. running RIP qtdragon from a sim config.

@grandixximo
Contributor

I think, in general, there should be no non-.gitignored remains anywhere after the build and test.

If I understand correctly, the new CI enforces this already.

That also goes for some of the junk left-over in config/* when, f.ex. running RIP qtdragon from a sim config.

There are no tests for the UIs; since they are never run, no leftovers can be caught. We'd have to write tests that run the UIs and then check for leftovers. @hdiethelm I could work on a separate PR for UI tests, would that be OK?

@BsAtHome
Contributor

BsAtHome commented May 3, 2026

I think, in general, there should be no non-.gitignored remains anywhere after the build and test.

If I understand correctly, the new CI enforces this already.

Great!

That also goes for some of the junk left-over in config/* when, f.ex. running RIP qtdragon from a sim config.

There are no tests for the UIs; since they are never run, no leftovers can be caught. We'd have to write tests that run the UIs and then check for leftovers. @hdiethelm I could work on a separate PR for UI tests, would that be OK?

As I understand it, there has been talk about test-running some/all/any of the sims in xvfb and doing some basic functional testing. That would be a great test target because the UIs also break. Some sim UIs haven't been tested in ages (one ini file has a very old version tag).

See #3756.
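A rough idea of what such a smoke run could look like (untested sketch; the config path and timeout are illustrative):

  # run a sim config headless for one minute, then check for leftovers
  xvfb-run -a timeout --preserve-status 60 scripts/linuxcnc configs/sim/axis/axis.ini
  git status -u --porcelain -- configs/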

@grandixximo
Contributor

@hdiethelm I'm planning to start on #3756 (xvfb-based UI smoke tests) once this lands, or in parallel. A quick coordination question on the dependency side before I start: my tests will need at minimum xvfb, xauth (already added by you, thanks), xdotool, and python3-pytest-qt. Possibly mesa-utils / libgl1-mesa-dri for GLcanon under llvmpipe.

I think this is a different situation from your rtapi cleanup V2 (#3919) where I rebased: that one shares files in src/rtapi/* and conflicts were unavoidable. UI tests live under tests/ui/* and only touch .github/ for the dep install list, so the only real shared edit is 3-5 package names. No code conflict, no rebase requirement on either side.

Two ways to handle the deps:

  1. You add the extras to your dep list in this PR while it's open. Keeps the install step in one place.
  2. I add them in the UI test PR. Lets this one land cleanly without scope creep.

Either works for me. Which way do you lean?

I'll branch from master so neither PR blocks the other. Whoever lands second adjusts the dep list (one-line edit).

@hdiethelm
Contributor Author

...so the only real shared edit is 3-5 package names. No code conflict, no rebase requirement on either side.

Two ways to handle the deps:

1. You add the extras to your dep list in this PR while it's open. Keeps the install step in one place.
2. I add them in the UI test PR. Lets this one land cleanly without scope creep.

You probably need option 2 so the CI still runs on your branch.

I have to rebase onto master anyway before merging, to check that everything is still fine. Just do your thing; as long as it is only a few more packages, I will see them, no issue. And if mine gets in first, it should also be no issue for you to rebase before merging.

@hdiethelm
Contributor Author

So, I finally managed to get all packages inside the Docker container:
https://github.com/hdiethelm/linuxcnc-fork/actions/runs/25278237458
https://github.com/hdiethelm/linuxcnc-fork/actions/runs/25278693726
To do this, I just compile the debs, install them, and remove linuxcnc_* afterwards so the dependencies stay:
https://github.com/hdiethelm/linuxcnc-fork/blob/ci_docker_prototyping/.github/docker/Dockerfile#L34

I would not call it elegant, but it does the job. The Docker build goes up from 7 to 12 min, but since it only needs to run from time to time to stay up to date, I don't see any issue.

Or does anyone have an idea how to install these packages without building / installing linuxcnc?

The LinuxCNC pipeline stays at ~21m duration / 3h usage. The advantage is that the Debian archives are not hit for nothing, and if they are down, the build still works.
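Condensed, the trick has roughly this shape (my sketch, not a copy of the linked Dockerfile; exact paths, arguments and flags differ):

  # build the debs, install them to pull in all dependencies,
  # then remove the linuxcnc packages themselves so only the dependencies remain
  RUN cd /tmp/linuxcnc \
   && debian/configure uspace \
   && dpkg-buildpackage -b -uc -us \
   && apt-get install -y /tmp/*.deb \
   && dpkg --remove $(dpkg-query -W -f '${Package}\n' 'linuxcnc*')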

@hdiethelm
Contributor Author

hdiethelm commented May 3, 2026

I think, in general, there should be no non-.gitignored remains anywhere after the build and test.

If I understand correctly, the new CI enforces this already

Actually, I check after the build. Should I move the check to after the tests? It's easy to do; I can even check after the build and after the tests, it takes no time at all.

@BsAtHome What to do with the untracked po files? At the moment, I don't ignore them, but I made this build step non-failing, similar to cppcheck / shellcheck:
https://github.com/LinuxCNC/linuxcnc/actions/runs/25276731551
https://github.com/LinuxCNC/linuxcnc/actions/runs/25276731551/job/74107710597#step:6:17

@BsAtHome
Contributor

BsAtHome commented May 3, 2026

@BsAtHome What to do with the untracked po files? At the moment, I don't ignore them, but I made this build step non-failing, similar to cppcheck / shellcheck: https://github.com/LinuxCNC/linuxcnc/actions/runs/25276731551 https://github.com/LinuxCNC/linuxcnc/actions/runs/25276731551/job/74107710597#step:6:17

The po-files should just be ignored. There is no point in failing on them. The whole i18n and l10n stuff needs to be cleaned up anyway. But that is subordinate to fixing the docs build in general (docs build is quite fragile). And all of that is subordinate to reorganizing the docs structure.

Currently, cppcheck and shellcheck must be ignored because the build is still not clean. FWIW, the ability to add -Werror took many, many months before all the kinks were ironed out.
The cppcheck part requires the remote scheduler to be fixed (well, rewritten). Then that could become terminal on any warning or error.
Shellcheck is even harder because the code relies on a lot of dubious constructs working, and it needs a lot more effort. It was a lot worse, but there is still a long way to go to get it clean.

@hdiethelm
Contributor Author

The po-files should just be ignored.

Done:
0cce4ae
https://github.com/LinuxCNC/linuxcnc/actions/runs/25287962668/job/74135296641#step:6:9

Currently, cppcheck and shellcheck must be ignored because the build is still not clean. FWIW, the ability to add -Werror took many, many months before all the kinks were ironed out. The cppcheck part requires the remote scheduler to be fixed (well, rewritten). Then that could become terminal on any warning or error. Shellcheck is even harder because the code relies on a lot of dubious constructs working, and it needs a lot more effort. It was a lot worse, but there is still a long way to go to get it clean.

I know; introducing this kind of check takes a lot of patience. Best, in my opinion, is to warn only for certain files but make the test fail for everything that is already clean. That way at least no new issues are introduced, and we can go on from there step by step.

@hdiethelm
Contributor Author

Any clue why this fails? Maybe a broken package just went into sid? I did not make any relevant changes...
https://github.com/LinuxCNC/linuxcnc/actions/runs/25287962668

@hdiethelm
Contributor Author

@BsAtHome Thinking about these checks, of course I found some issues I created myself.

One option would be to do something like this once and commit the file:
scripts/shellcheck.sh | grep -Po '^In \K[^ ]+' | sort -u > shellcheck_warn_only.txt
Similar for cppcheck.

  • Then warn only for all files in this list and hard-fail for everything else, so at least no new issues come in (see the sketch below).
  • Remove corrected files from these lists, to avoid new issues creeping into already-fixed files.
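A sketch of how the gate could evaluate that list (assuming shellcheck_warn_only.txt holds one path per line, matching the output format above):

  #!/bin/bash
  # fail only on findings in files that are not on the warn-only list
  scripts/shellcheck.sh > shellcheck.log 2>&1 || true
  fail=0
  for file in $(grep -Po '^In \K[^ ]+' shellcheck.log | sort -u); do
      if grep -qxF "$file" shellcheck_warn_only.txt; then
          echo "warn-only (known issues): $file"
      else
          echo "new findings in: $file"
          fail=1
      fi
  done
  exit $fail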
