feat: Add Boost API #2030
Codecov Report: ❌ Patch coverage is …

    @@            Coverage Diff            @@
    ##             dev/1.38    #2030   +/-   ##
    ===========================================
      Coverage            ?   86.57%
    ===========================================
      Files               ?      300
      Lines               ?    23051
      Branches            ?        0
    ===========================================
      Hits                ?    19957
      Misses              ?     3094
      Partials            ?        0
    )

    @staticmethod
    def property(  # noqa: A003
    - def property(  # noqa: A003
    + def numeric_property(  # noqa: A003
suggestion: if numeric_decay has the "numeric_" prefix to highlight that it's only applicable to numeric properties, then I'd argue this method should have it as well
We won't make a timestamp_property operator as Timestamps need a sensible scale. I'm open to changing it to numeric_property but it feels a bit more confusing to users.
> numeric_property but it feels a bit more confusing to users

My speculation is that property suggests any property (bool, text, blob) can be used, whereas numeric_property sort of constrains that ==> less confusing.
Ofc that is unless we are planning to extend Boost in a way that, say, a text property can be boosted too, and that'll use Boost.property.
Maybe it makes more sense like this:
- `Boost.property` is for values that have meaningful ranking value and is intended as the simplest operator (boost by popularity). Currently this supports `numeric` but ideally later on I'd like to support `bool` as well.
- `Boost.filter` is for categorical or text values. You provide a standard filter object.
- `Boost.time_decay` and `Boost.numeric_decay` are more advanced operators. They are separated by type because the parameters (origin = now vs origin = 0) and scale ("7d" vs 7) are different enough that we have separate gRPC messages and parsing logic.

So in this case there won't ever be a `Boost.text_property`; you would just use `Boost.filter` instead.
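To make the split between operators concrete, here is a toy model in plain Python. It is only a sketch: the class names, the `score` method, the callable predicate standing in for a filter object, and the exponential decay curve are all assumptions made for illustration, not the client's classes or the server's scoring logic.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class PropertyBoost:
    """Hypothetical stand-in: boost by a numeric value as-is (e.g. popularity)."""
    name: str
    weight: float = 1.0

    def score(self, obj: Dict[str, Any]) -> float:
        return self.weight * float(obj[self.name])


@dataclass
class FilterBoost:
    """Hypothetical stand-in: boost objects that match a categorical/text condition."""
    predicate: Callable[[Dict[str, Any]], bool]  # assumption: replaces a real filter object
    weight: float = 1.0

    def score(self, obj: Dict[str, Any]) -> float:
        return self.weight * (1.0 if self.predicate(obj) else 0.0)


@dataclass
class NumericDecayBoost:
    """Hypothetical stand-in: boost fades with distance from an origin."""
    name: str
    origin: float = 0.0
    scale: float = 1.0
    weight: float = 1.0

    def score(self, obj: Dict[str, Any]) -> float:
        distance = abs(float(obj[self.name]) - self.origin)
        # Assumption: exponential curve, chosen only to keep the sketch short.
        return self.weight * math.exp(-distance / self.scale)
```

In this model a text property never needs its own operator: a condition such as `lambda o: o["category"] == "news"` passed to `FilterBoost` covers it.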
Agreed 👌
> but ideally later on I'd like to support bool as well

Out of curiosity: is my understanding correct that boosting a bool property should have the same mechanics under the hood as boosting a "categorical" (Boost.filter) property? Given you can turn filter results into boolean [true, false, true, true].
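One way to read the question, as a sketch only: if a filter is reduced to a per-object boolean mask, a bool property yields the same kind of mask, so both could feed the same boost step. The function names and the additive `base + weight * match` combination below are assumptions for illustration and say nothing about the actual implementation.

```python
from typing import Any, Callable, Dict, List


def bool_property_mask(objects: List[Dict[str, Any]], prop: str) -> List[bool]:
    """Mask taken directly from a bool property."""
    return [bool(obj[prop]) for obj in objects]


def filter_mask(
    objects: List[Dict[str, Any]], predicate: Callable[[Dict[str, Any]], bool]
) -> List[bool]:
    """Mask obtained by evaluating a filter condition on each object."""
    return [predicate(obj) for obj in objects]


def apply_boost(base_scores: List[float], mask: List[bool], weight: float) -> List[float]:
    """Assumed combination: add `weight` to every object whose mask entry is True."""
    return [s + weight * (1.0 if m else 0.0) for s, m in zip(base_scores, mask)]


objects = [{"in_stock": True}, {"in_stock": False}, {"in_stock": True}, {"in_stock": True}]
from_property = bool_property_mask(objects, "in_stock")
from_filter = filter_mask(objects, lambda o: o["in_stock"] is True)
```

Here `from_property` and `from_filter` are the same list, so `apply_boost` cannot tell them apart.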
Adds Boost features (weaviate/weaviate#11103) to the Python client.

Near Vector/Object/XYZ operators now support a `boost` parameter.

- `Boost.property` - for property boosts
- `Boost.filter` - for filter boosts
- `Boost.numeric_decay` - for numeric decay boosts
- `Boost.time_decay` - for time decay boosts
- `Boost.blend` - for combining multiple boosts together

All allow for an optional depth and weight parameter to control boosting impact. Time decay has a `_decay_value_to_str` helper for supporting Python time values.

Property modifiers and decay curves have constants.
Integration test added with 1.38 gate.
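
For readers wondering what a helper like `_decay_value_to_str` might do, here is a minimal sketch of turning a Python `timedelta` into a duration string such as "7d". The function name `decay_value_to_str`, the accepted types, and the `d`/`h`/`m`/`s` suffixes are assumptions; only the "7d" form appears in this PR's discussion, and the real helper may behave differently.

```python
from datetime import timedelta
from typing import Union


def decay_value_to_str(value: Union[str, timedelta]) -> str:
    """Hypothetical sketch: normalise a time scale to a compact duration string."""
    if isinstance(value, str):
        # Assume strings are already in duration form, e.g. "7d".
        return value
    if not isinstance(value, timedelta):
        raise TypeError(f"expected str or timedelta, got {type(value).__name__}")

    total_seconds = int(value.total_seconds())  # sub-second precision is dropped
    if total_seconds <= 0:
        raise ValueError("duration must be at least one second")

    # Pick the largest unit that divides the duration exactly.
    for unit_seconds, suffix in ((86400, "d"), (3600, "h"), (60, "m")):
        if total_seconds % unit_seconds == 0:
            return f"{total_seconds // unit_seconds}{suffix}"
    return f"{total_seconds}s"
```

Choosing the largest exact unit keeps common inputs readable (`timedelta(days=7)` becomes "7d") without rounding anything away.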