Make build_wrapper.sh a bit more robust and support LLVM toolchain #455
Open
krinkinmu wants to merge 1 commit into intel:develop
There is a minor problem with the current version of the script: it can fail silently, resulting in linker errors down the line when linking against the built hyperscan library.

How could it happen?

If I only have the LLVM toolchain on the system, my nm and objcopy tools are likely to be called llvm-nm and llvm-objcopy rather than nm and objcopy. When that happens, build_wrapper.sh will try to call nm to produce SYMSFILE. Naturally, because nm does not exist on the system, that call fails, but it is not the last command of the script, so the script continues its execution.

At the end of the script there is an if statement that checks whether SYMSFILE exists and is not empty. In our case the file exists but is empty, so the objcopy in the if body never gets called and the script finishes with a successful status, thus failing silently.

When building the fat runtime, this results in a library that is missing a bunch of symbol definitions needed by the dispatcher code.

To make the failure more explicit I added `set -e`; this way the script fails on the first failing command, which makes the issue visible.

Additionally, given the scenario described above, I changed the script to allow overriding the nm and objcopy tools it uses via environment variables. By default, when the environment variables are not provided, we still fall back to the default nm and objcopy tools, so it should not affect anybody who relies on the current behavior.

One last change: for the format flag (`-f`), GNU nm only looks at the first letter of the format, so providing `posix` as the flag value is the same as providing just `p`. However, the LLVM version of nm is a bit more strict and does not recognize `p` as a valid value. Given that both tools accept `posix` but `p` is not accepted by the LLVM tools, I changed the script to spell out `posix` - this should work for both GNU (and any toolchain built on top of the GNU codebase) and LLVM.

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
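The three changes described above can be sketched roughly as follows. This is only an illustration of the pattern, not the actual diff to build_wrapper.sh: the helper names `pick_tool` and `require_tool` and the variables `NM_TOOL` and `OBJCOPY_TOOL` are made up for this example, and the assumption is that overrides arrive in the `NM` and `OBJCOPY` environment variables.

```shell
#!/bin/sh
set -e   # stop at the first failing command instead of carrying on

# Hypothetical helper: print the override if it is non-empty,
# otherwise print the default tool name.
pick_tool() {
    printf '%s\n' "${1:-$2}"
}

# Hypothetical helper: report a missing tool with a clear message.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "error: required tool '$1' not found" >&2
        return 1
    }
}

# Assumed variable names: NM and OBJCOPY carry the overrides.
NM_TOOL=$(pick_tool "${NM:-}" nm)
OBJCOPY_TOOL=$(pick_tool "${OBJCOPY:-}" objcopy)

# The format is spelled out as "posix" rather than abbreviated to "p".
echo "symbol lister: $NM_TOOL -f posix; object rewriter: $OBJCOPY_TOOL"
```

Running it as `NM=llvm-nm OBJCOPY=llvm-objcopy sh sketch.sh` would select the LLVM tools, while running it with neither variable set keeps the traditional names. A wrapper built this way could call `require_tool "$NM_TOOL"` before doing any work, so that a missing tool stops the build immediately.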
Author
There is an overlap with #448, but that PR is almost a year old at this point, which makes me think that maybe external PRs are not considered in this repo?
krinkinmu added a commit to krinkinmu/envoy that referenced this pull request Feb 13, 2026
When using a non-hermetic LLVM toolchain, the paths to the nm and objcopy tools passed via environment variables to the hyperscan build script are not correct. With a non-hermetic toolchain, the C++ toolchain make variables NM and OBJCOPY hold absolute paths to the host tools, so prefixing them turns correct paths into incorrect ones. So one change in this PR is to check whether we are using a non-hermetic toolchain and, if so, just pass in $(NM) and $(OBJCOPY) without modifications.

The other part of this PR modifies hyperscan slightly to make build_wrapper.sh fail when one of the commands in the script fails. The reason for this change is that currently, if the script cannot find the nm and objcopy tools (i.e., when you don't have them installed - which could happen if you use a non-hermetic compiler or, before this PR, because we added a prefix to the host tool paths that wasn't needed), it does not fail, but it also does not produce a correct result. What you get in the end is a failure during envoy-contrib linking, because the linker cannot find definitions for a bunch of symbols.

Let me try to explain what is going on there...

Hyperscan, specifically when we use the fat runtime option, does something quite clever and quite terrible - it builds the same library source code several times with different optimizations enabled and links all of those together into the final hyperscan library. It will build a version of the library assuming that the AVX512 instruction set is available, another version that assumes that the AVX512VBMI instruction set is available, and so on. Thus we have multiple versions of the same library built with different optimizations.

Then it takes each of these versions and, using nm and objcopy, modifies the names of the exported symbols by adding a prefix to them, to avoid name collisions later when it links them together (remember, all of them are built from the same source).

Finally, it links all those versions together into one library and provides a dispatcher function - at runtime this dispatcher function detects which instruction sets are actually available on the machine and calls the appropriate implementation for that instruction set.

The [build_wrapper.sh](https://github.com/intel/hyperscan/blob/master/cmake/build_wrapper.sh) script is what takes the compiled object files and renames the symbols in them to avoid name collisions.

The build_wrapper.sh script is written in such a way that when nm or objcopy is not available, it does not fail. You can do a simple experiment for yourself and run the following script:

```
blahblah > /tmp/test.file
if test -s /tmp/test.file
then
    blahblah-again
fi
```

Even though neither `blahblah` nor `blahblah-again` exists (you will see an error about `blahblah` in the output; `blahblah-again` is never reached because the file is empty), the overall status code of the script will be 0 (e.g. `echo $?` will print 0).

A similar thing happens in build_wrapper.sh when it cannot find the objcopy or nm tool - the script does not fail, but it does not produce the object file we expect it to produce.

When CMake later links `libhs.a` out of the available object files, it links everything that exists and produces a `libhs.a` that is missing a bunch of symbol definitions but is otherwise still a valid static library.

Because the `libhs.a` build didn't fail properly, we eventually proceed to linking envoy-contrib (that's where hyperscan is used), and at that point the linker discovers that not all symbol definitions are available.

NOTE: I have a PR for the hyperscan library to change their build_wrapper.sh to be a bit more robust (intel/hyperscan#455), but judging by the history of the repository, it appears that they do not accept external PRs (or at least have not accepted them for a while), so I'm not hopeful.

Why not just use a hermetic toolchain? I'd be happy to, but the available published LLVM toolchains are quite limited and only support a few OSes (note that hyperscan is specifically limited to the x86 architecture, so architecture is not the issue here). So if you're building on a Linux other than RedHat or Ubuntu, you're basically out of luck at the moment, unfortunately.

Many companies and communities publish custom-built LLVM toolchains, but when they do, it is still typically done in the form of a package for whatever the platform's package manager is (e.g., deb, rpm, etc.), and toolchains_llvm, which we use to download the hermetic LLVM toolchain, does not support those.

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
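The symbol-prefixing step described in the commit message above can be illustrated with a small sketch. It is not taken from build_wrapper.sh: the function name `make_rename_list`, the `avx2` prefix and the sample symbols are all made up, and it assumes the rename list is later handed to objcopy through its `--redefine-syms=<file>` option, which takes one "old new" pair per line.

```shell
#!/bin/sh
# Sketch only: turns `nm -f posix`-style lines ("name type value size")
# into "old new" pairs, one per line.
# Usage: make_rename_list <prefix>   (reads nm-style output on stdin)
make_rename_list() {
    awk -v prefix="$1" 'NF > 0 { print $1, prefix "_" $1 }'
}

# Sample input standing in for real nm output; the symbols are illustrative.
printf '%s\n' \
    'hs_scan T 0000000000000000 0000000000000040' \
    'hs_open_stream T 0000000000000040 0000000000000020' |
    make_rename_list avx2
```

Each output line pairs the original name with its prefixed form, for example `hs_scan avx2_hs_scan`. If the tool that produces the input is missing, the list comes out empty, which is exactly the situation in which the renaming is silently skipped.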
phlax pushed a commit to envoyproxy/envoy that referenced this pull request Feb 13, 2026
Additional Description:
NOTE: There are a few other problems with using non-hermetic toolchains as well, but I want to discuss those separately from hyperscan. The hyperscan issue is a bit more straightforward, and the other issues with non-hermetic toolchains may require more discussion and consideration of various alternatives.
Risk Level: Low
Testing: Manually verified that envoy-contrib builds successfully and that if nm and objcopy aren't found, the build fails early; +ci
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
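The "fail early" behaviour mentioned under Testing relies on the shell stopping at the first failed command. A minimal sketch of the difference, reusing the experiment from the commit message: the function name `run_experiment` and the temporary file are assumptions made for this example, and the command names are deliberately nonexistent.

```shell
#!/bin/sh
# Runs the same snippet in two child shells, once as-is and once with
# `set -e` prepended, and reports the exit status of each run.
tmp=$(mktemp)

# Usage: run_experiment <prologue>   (prologue is "" or "set -e")
run_experiment() {
    sh -c "$1
blahblah > '$tmp'
if test -s '$tmp'
then
    blahblah-again
fi" 2>/dev/null
    echo $?
}

without=$(run_experiment "")
with=$(run_experiment "set -e")

echo "without set -e: exit status $without"
echo "with set -e:    exit status $with"

rm -f "$tmp"
```

Without `set -e` the child shell reports success even though the first command could not be found; with `set -e` it stops at that command and returns a non-zero status. Note that `set -e` only reacts to the status of a whole pipeline, so a failure in an earlier stage of a `a | b` pipeline would still need separate handling.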
krinkinmu added a commit to krinkinmu/envoy that referenced this pull request Mar 4, 2026
…nvoyproxy#43479)
phlax pushed a commit to envoyproxy/envoy that referenced this pull request Mar 4, 2026
…43479)
Testing:
I built hyperscan with both the default nm and objcopy and with the LLVM versions provided via environment variables.
I ran the unit tests for both builds; in both cases the tests passed.
I also verified that if build_wrapper.sh tries to use an nm or objcopy binary that does not exist, the script fails explicitly and the build fails with it.