diff --git a/docs/indexing/index.mdx b/docs/indexing/index.mdx
index 39c33b9..5922fa9 100644
--- a/docs/indexing/index.mdx
+++ b/docs/indexing/index.mdx
@@ -47,7 +47,7 @@ Vector indexes can use different quantization methods to compress vectors and im
 | :----------- | :------- | :---------- |
 | `PQ` (Product Quantization) | Default choice for most vector search scenarios. Use when you need to balance index size and recall. | Divides vectors into subvectors and quantizes each subvector independently. Provides a good balance between compression ratio and search accuracy. |
 | `SQ` (Scalar Quantization) | Use when you need faster indexing or when vector dimensions have consistent value ranges. | Quantizes each dimension independently. Simpler than PQ but typically provides less compression. |
-| `RQ` (RabitQ Quantization) | Use when you need maximum compression or have specific per-dimension requirements. | Per-dimension quantization using a RabitQ codebook. Provides fine-grained control over compression per dimension. |
+| `RQ` (RabitQ Quantization) | Use when you need maximum compression or have specific per-dimension requirements. | Per-dimension quantization using a RabitQ codebook. Provides fine-grained control over compression per dimension. For `IVF_RQ`, vector dimensions must be divisible by `8`. |
 | `None/Flat` | Use for binary vectors (with `hamming` distance) or when you need maximum recall and have sufficient storage. | No quantization—stores raw vectors. Provides the highest accuracy but requires more storage and memory. |
 ## Understanding the IVF-PQ Index
@@ -149,4 +149,3 @@ Then the greedy search routine operates as follows:
 * At the top layer (using an arbitrary vertex as an entry point), use the greedy local search routine on the k-ANN graph to get an approximate nearest neighbor at that layer.
 * Using the approximate nearest neighbor found in the previous layer as an entry point, find an approximate nearest neighbor in the next layer with the same method.
 * Repeat until the bottom-most layer is reached. Then use the entry point to find multiple nearest neighbors (e.g. top 10).
-
diff --git a/docs/indexing/quantization.mdx b/docs/indexing/quantization.mdx
index eeca114..15ccac0 100644
--- a/docs/indexing/quantization.mdx
+++ b/docs/indexing/quantization.mdx
@@ -15,7 +15,7 @@ Use quantization when:
 LanceDB currently exposes multiple quantized vector index types, including:
 - `IVF_PQ` -- Inverted File index with Product Quantization (default). See the [vector indexing guide](/indexing/vector-index) for `IVF_PQ` examples.
-- `IVF_RQ` -- Inverted File index with **RaBitQ** quantization (binary, 1 bit per dimension). See [below](#rabitq-quantization) for details.
+- `IVF_RQ` -- Inverted File index with **RaBitQ** quantization (binary, 1 bit per dimension). Requires vector dimensions divisible by `8`. See [below](#rabitq-quantization) for details.
 `IVF_PQ` is the default indexing option in LanceDB and works well in many cases. However, in cases where more drastic compression is needed, RaBitQ is also a reasonable option.
@@ -42,6 +42,11 @@ For a deeper dive into the theory and some benchmark results, see the blog post:
 ### Using RaBitQ
 You can create an RaBitQ-backed vector index by setting `index_type="IVF_RQ"` when calling `create_index`.
+
+
+When using `IVF_RQ`, vector dimensions must be divisible by `8`.
+
+
 `num_bits` controls how many bits per dimension are used:
 ## API Reference
@@ -62,4 +67,3 @@ The full list of parameters to the algorithm are listed below.
 Number of samples per partition during training. Higher values may improve accuracy but increase training time.
 - `target_partition_size`: Optional[int], defaults to None
 Target number of vectors per partition. Adjust to control partition granularity and memory usage.
-
diff --git a/docs/tables/namespaces.mdx b/docs/tables/namespaces.mdx
index daa941e..386e930 100644
--- a/docs/tables/namespaces.mdx
+++ b/docs/tables/namespaces.mdx
@@ -71,7 +71,7 @@ and `drop_namespace`.
-In TypeScript, namespace lifecycle and namespace-scoped table operations are not currently exposed on `Connection`. In practice, namespaces in TypeScript are managed through a namespace-aware admin surface (for example [REST](/api-reference/rest/namespace/create-a-new-namespace)/admin tooling), and the Connection APIs operate at the root namespace.
+In TypeScript, namespace lifecycle and namespace-scoped table operations are not currently exposed on `Connection`. In practice, namespaces in TypeScript are managed through a namespace-aware admin surface (for example [REST](/api-reference/rest/index)/admin tooling), and the Connection APIs operate at the root namespace.
 ## Namespaces in LanceDB Enterprise
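
The `IVF_RQ` changes above add a constraint that vector dimensions must be divisible by `8`. As a rough illustration of what that means for callers, the sketch below checks a dimension before index creation and zero-pads a vector up to the next multiple of 8. The helper names `is_valid_rq_dim` and `pad_to_rq_dim` are hypothetical and not part of LanceDB. Appending zeros leaves dot products and Euclidean distances unchanged, but whether padding suits a given RaBitQ workload is an assumption the docs in this diff do not address.

```python
from typing import List, Sequence

# From the docs change above: IVF_RQ requires dimensions divisible by 8.
RQ_DIM_MULTIPLE = 8


def is_valid_rq_dim(dim: int) -> bool:
    """Return True if `dim` is a positive multiple of 8 (hypothetical helper)."""
    return dim > 0 and dim % RQ_DIM_MULTIPLE == 0


def pad_to_rq_dim(vector: Sequence[float]) -> List[float]:
    """Zero-pad `vector` to the next multiple of 8 (hypothetical helper).

    Vectors whose length is already a multiple of 8 are returned unchanged.
    """
    remainder = len(vector) % RQ_DIM_MULTIPLE
    padding = (RQ_DIM_MULTIPLE - remainder) % RQ_DIM_MULTIPLE
    return list(vector) + [0.0] * padding
```

A check like this would run before calling `create_index` with `index_type="IVF_RQ"`, so that an unsupported dimension is caught in application code and not at index build time.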