-
Notifications
You must be signed in to change notification settings - Fork 36
feat(runtime): update to v1 API for leanvec ood #262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
7d978f3
c80430d
bd45345
55390be
bc4a70f
6ea7e2e
4082d1a
27cc1b4
2c0ba47
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -27,6 +27,15 @@ namespace v0 { | |
|
|
||
| struct SVS_RUNTIME_API LeanVecTrainingData { | ||
| virtual ~LeanVecTrainingData(); | ||
|
|
||
| /* Build LeanVec training data (compression matrices) from the provided | ||
| * data. | ||
| * @param training_data Output parameter to the created training data object | ||
| * @param dim Dimensionality of the input data and queries | ||
| * @param n Number of data points and queries | ||
| * @param x Pointer to the input data | ||
| * @param leanvec_dims Number of dimensions in the resulting LeanVec data | ||
| */ | ||
| static Status build( | ||
| LeanVecTrainingData** training_data, | ||
| size_t dim, | ||
|
|
@@ -42,5 +51,37 @@ struct SVS_RUNTIME_API LeanVecTrainingData { | |
| }; | ||
|
|
||
| } // namespace v0 | ||
|
|
||
| namespace v1 { | ||
|
|
||
| struct SVS_RUNTIME_API LeanVecTrainingData : public v0::LeanVecTrainingData { | ||
| using v0::LeanVecTrainingData::destroy; | ||
| using v0::LeanVecTrainingData::save; | ||
|
|
||
| /* Build LeanVec training data (compression matrices) from the provided | ||
| * data. | ||
| * Accepts optional training queries for out-of-distribution training. | ||
| * @param training_data Output parameter to the created training data object | ||
| * @param dim Dimensionality of the input data and queries | ||
| * @param n Number of data points and queries | ||
| * @param x Pointer to the input data | ||
| * @param n_train Number of training queries (can be 0) | ||
| * @param q Pointer to the training queries (can be nullptr) | ||
| * @param leanvec_dims Number of dimensions in the resulting LeanVec data | ||
| */ | ||
| static Status build( | ||
| LeanVecTrainingData** training_data, | ||
| size_t dim, | ||
| size_t n, | ||
| const float* x, | ||
| size_t n_train, | ||
| const float* q, | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we keep these two new arguments in the end and default initialize to 0 and nullptr? In that case, the older v0 calls to build will still work without any modifications. Also "n_train" is kind of confusing as both data/queries are used for training? How about we explicitly say "n_queries"?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's exactly the discussion I wanted to have. Your suggestion would indeed allow us to stick with v0 (only that we'd need a copy of the function and we can't use default values, due to ABI compability, but that's just a detail). In my opinion, the order Your preference would be to stick with |
||
| size_t leanvec_dims | ||
| ) noexcept; | ||
|
|
||
| static Status load(LeanVecTrainingData** training_data, std::istream& in) noexcept; | ||
| }; | ||
|
|
||
| } // namespace v1 | ||
| } // namespace runtime | ||
| } // namespace svs | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call "Leanvec transformation matrices" instead of training data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LeanVecTrainingData was purposefully chosen to be generic. In hindsight, I think we could have been more specific. But changing this would require an API update and therefore conflicts with your suggestion in the other comment.