
feat: 2.6.0 inference #174

Merged: calvinleng-science merged 20 commits into main from calvin/app-inference on Apr 24, 2026

Conversation

@calvinleng-science
Contributor

Summary

This introduces the deploy-model command, which lets users deploy either a .pt or an .onnx model onto the device.

If the user specifies --quantize and --input-list, the model can be used by the DSP runtime. To enable this, the user must supply the root directory of v2.34 of Qualcomm's QAIRT so that their model can be converted to .dlc; we cannot ship the toolkit ourselves, since Qualcomm does not allow redistribution of their software.

If the user does not specify --quantize and --input-list, the model is simply uploaded as .onnx (we convert .pt models to .onnx) and served with onnxruntime's CPU runtime.

Changes

  • Added a Docker container that the SDK directory gets mounted into to perform model conversion. This was needed because the conversion tooling requires a specific Python version.
  • Added model conversion scripts.
  • Added scripts to deploy the converted model to the target device via SFTP (see the sketch after this list).
  • Updated 'apps build' to also pull over the vcpkg .so's that users may have added to their vcpkg.json; without them, the app cannot resolve the symbols it needs at runtime. The example app now pulls scifi-headstage-shared-libraries, which contains all of the .so's already in our company-wide vcpkg, and 'apps build' uses it to determine which .so's NOT to pull. Copying those .so's onto the device would break updates of scifi-headstage-shared-libraries, since two .debs would then conflict over the same file paths (also sketched below).
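
A rough sketch of the SFTP push and the .so exclusion rule described above. All hostnames, paths, and directory layouts here are illustrative, not the actual script contents:

```bash
# Push the converted model to the device over SFTP (host and remote path are placeholders):
echo "put model.dlc /opt/models/" | sftp admin@headstage

# 'apps build' exclusion rule: bundle only the vcpkg .so's that
# scifi-headstage-shared-libraries does not already ship, so that two .debs
# never claim the same file path on the device.
provided=$(basename -a shared-libs/lib/*.so* 2>/dev/null)   # .so's the shared-libraries .deb owns
for so in vcpkg_installed/*/lib/*.so*; do
  if ! printf '%s\n' $provided | grep -qxF "$(basename "$so")"; then
    cp "$so" pkgroot/usr/lib/                               # app-specific: include in the app .deb
  fi
done
```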

Testing

Students at Neurohack 2026 were able to use our inference pipeline with Synapse Apps and the deploy-model utility. As for the DSP runtime, I have tested it with my own apps and confirmed that we can perform inference through it.

Example DSP runtime user-flow:

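A hedged sketch of what such a session might look like; the flag names come from this PR, while the entry point, argument order, model path, input list, QAIRT location, and device address are illustrative:

```bash
deploy-model ./model.pt \
  --quantize \
  --input-list ./calibration_inputs.txt \
  --snpe-root /opt/qcom/qairt/2.34 \
  --username admin \
  --uri <device-address>
```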

Example CPU runtime user-flow:
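A similarly hedged sketch for the CPU path; with no --quantize/--input-list, the model is uploaded as .onnx and served by onnxruntime:

```bash
deploy-model ./model.pt \
  --username admin \
  --uri <device-address>
```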

Calvin Leng and others added 20 commits April 8, 2026 15:43
Replace host-side SNPE converter invocation with a Docker-based
approach. The container (Python 3.10 + pinned deps) eliminates
Python version and numpy compatibility issues on the host.

- Add model-converter/ with Dockerfile and self-contained convert.py
- Rewrite onnx_to_dlc.py to orchestrate Docker (auto-builds image)
- Bind-mount SNPE SDK at runtime (Qualcomm license compliant)
- Add --snpe-root CLI arg to deploy-model
- Remove unused onnx_transforms.py (logic moved into container)
- Fix -u shorthand conflict between --username and global --uri
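
The docker invocation that onnx_to_dlc.py orchestrates has roughly this shape; the image tag, mount points, and converter arguments are assumptions, not the actual values:

```bash
# Build the pinned Python 3.10 converter image (auto-built on first use):
docker build -t model-converter model-converter/
# Run the conversion with the SNPE SDK bind-mounted read-only at runtime,
# so no Qualcomm software is ever baked into (or redistributed with) the image:
docker run --rm \
  -v "$PWD":/workspace \
  -v "$SNPE_ROOT":/snpe:ro \
  model-converter python convert.py /workspace/model.onnx -o /workspace/model.dlc
```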
…runtime, updated build to package onnxruntime into the deb
…in scifi-headstage-shared-libraries into the resulting app .deb, as that blocks installations of scifi-headstage-shared-libraries. Additionally, tap names will now wrap instead of truncating in the rich UI
calvinleng-science merged commit 4dad3dc into main on Apr 24, 2026
2 checks passed
calvinleng-science deleted the calvin/app-inference branch on April 24, 2026